You are not refactoring

And maybe you shouldn't be

Bill Kidwell

I hear a lot of talk about refactoring code to make it better. Unfortunately, I see a lot of developers re-writing code to make it better. There is a difference.

Refactoring has a specific definition. When you refactor, you change the code in a way that is behavior preserving. That means that you aren’t adding error handling. That is a behavior change. Some performance changes might be behavior preserving, but some aren’t.

In his book Refactoring, Martin Fowler catalogs a number of refactorings: behavior-preserving transformations. Many of these transformations are automated in modern IDEs, which makes them easier to apply. One major reason they can be automated is that they are strictly structural changes to the code.
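To make the distinction concrete, here is a small sketch in Python. The function names are hypothetical and exist only for illustration. The second version applies an Extract Function refactoring and keeps every observable behavior, including the exception raised on empty input. The third version adds error handling, so by the definition above it is a re-write, not a refactoring.

```python
def average_original(values):
    # Original code: computes the mean inline.
    return sum(values) / len(values)


def _total(values):
    # Helper produced by an Extract Function refactoring.
    return sum(values)


def average_refactored(values):
    # Structure changed, behavior unchanged: an empty list
    # still raises ZeroDivisionError, exactly as before.
    return _total(values) / len(values)


def average_rewritten(values):
    # Error handling added: an empty list now returns 0.0
    # instead of raising. Callers can observe the difference,
    # so this is a behavior change.
    if not values:
        return 0.0
    return sum(values) / len(values)


def outcome(func, values):
    # Record what a caller would observe: a return value
    # or the type of exception raised.
    try:
        return ("ok", func(values))
    except Exception as exc:
        return ("raised", type(exc).__name__)
```

Comparing outcome(average_original, []) with the other two makes the point: the refactored version matches the original on every input, while the re-written one does not.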

So what?

Now, besides this being a pet peeve, why should we care whether we call it re-writing or refactoring? I can think of two reasons.

The technical reason we should care is that the process for handling the change should be different. If you are refactoring code, you should have adequate tests to ensure that you don’t break anything. After all, a behavior-preserving change that causes a test to fail is probably due to a typo or a similarly obvious problem. Refactoring is fairly safe in that way. The red-green-refactor approach works because you got tests to pass, and now you can worry about structure, speed, and non-behavioral properties of the software.
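As a rough sketch of what "adequate tests" can look like, the example below pins the current behavior of a hypothetical format_name function with a table of cases, then runs the same cases against a restructured version. The function, the cases, and the check_behavior helper are all assumptions made up for illustration, not a prescribed testing approach.

```python
def format_name_before(first, last):
    # Hypothetical code as it exists today.
    return last.strip().upper() + ", " + first.strip().title()


def format_name_after(first, last):
    # Same logic after an Extract Variable refactoring.
    surname = last.strip().upper()
    given = first.strip().title()
    return f"{surname}, {given}"


# Cases that record what the code does today, including
# edge cases such as stray whitespace and empty strings.
CASES = [
    (("ada", "lovelace"), "LOVELACE, Ada"),
    (("  grace ", " hopper"), "HOPPER, Grace"),
    (("", ""), ", "),
]


def check_behavior(func):
    # True only if func reproduces every recorded result.
    return all(func(*args) == expected for args, expected in CASES)
```

If check_behavior passes before the change and fails after it, the change was not behavior preserving, and the likely culprit is a small mistake rather than a design problem.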

On the other hand, if you are re-writing code, the risk that you are going to break something is much larger. In code that you wrote an hour ago, the difference might not be a big deal. However, if we are talking about existing code that you want to change before adding new functionality, you might be wreaking havoc. How good is the test coverage in that code? Is it defect prone? It might be a good idea to add a code review and/or some extra QA before you make any changes.

The organizational reason we should care is that mislabeling re-writes gives refactoring a bad name. I have heard competent managers complain about refactoring efforts. They cite changes that result in major regression fallout and cause schedule delays. As these problems recur, you hear questions about why we should be refactoring at all. We just wrote that code, why should we have to refactor it? Why didn’t we do it right the first time? This is obviously the wrong attitude to have about refactoring. We should be refactoring continuously. The cause, however, is not just a lack of education on the managers’ part, but also a lack of clarity about the nature of the change.

Obviously, the title has a click-bait aspect to it. I strongly believe that refactoring should be done on a continuous basis. It is equally important that we identify when we are re-designing, re-architecting, or re-writing the code and make that distinction.