Refactoring for dummies

Yes, I’m a refactoring dummy. Many moons ago, I was ranting about the constant need for engineers to refactor their code at work and got a homework assignment to read Refactoring: Improving the Design of Existing Code written by some fancy software engineer prof. Well, I’ve procrastinated my homework for well over a year now and need to get this book back to Nick. So, here’s what I learned.

Refactoring is risky. It requires changes to working code that can introduce subtle bugs. Refactoring, if not done properly, can set you back days, even weeks.

This has been my experience with big refactoring projects. Bugs, bugs, bugs. And that’s with top notch engineers doing the work. Now to be fair, that’s what QA’s for – – to find bugs, right? Sure. But I haven’t yet met an engineer that looked at another’s code and didn’t think they could do a better job writing it. While I believe that refactoring is certainly a necessity from time to time, refactoring for refactoring sake has never shown positives in my relatively short history in high tech. Refactoring to do something new, that’s the only time I’ve seen refactoring work out OK.

Refactoring is the process of changing a software system in such a way that it does not alter the external behaviour of the code yet improves its internal structure. It is a disciplined way to clean up code that minimizes the chances of introducing bugs. In essence when you refactor you are imporving the design of the code after it has been written.

This is the crux of the problem with refactoring for me is that it can introduce NO new functionality. “Let’s clean up the code to make it easier to maintain.” I guess that’s a legitimate reason given maintenance costs at companies can be high. What’s more ideal in my mind is to launch some new incremental functionality and refactor in process. Kill two birds with one stone rather than just fixing software to reduce maintenance costs x%.

When you find you have to add a feature to a program, and the program’s code is not structured in a convenient way to add the feature, first refactor the program to make it easy to add the feature, then add the feature.

Now this sounds reasonable – – the author agrees with me! Maybe there’s something to refactoring. 🙂

Before you start refactoring, check that you have a solid suite of tests. These tests must be self-checking.

It looks like unit tests are critical part of the refactoring process. Since whatever you refactor you have to test, solid tests are critical to verifying you haven’t introduced new bugs. Nothing surprising here but I’ve seen even sr. software engineers ignore unit tests. “I don’t need tests because I’m a programming god and QA can pick up any bugs.” Ick.

When you use refactoring to develop software, you divide your time between two distinct activities: adding function and refactoring.

The author goes on to point out that when you’re adding new function, you shouldn’t be changing any existing code; you are just adding new capability. This is an interesting notion for me (not being a software engineer) because I always assumed that building something new required touching old code to make it possible to introduce new code. Maybe this is just a more theoretical statement versus something that happens in practice? In any event, adding new functionality seems to absolutely be part of refactoring. So, I’ve learned something new here. Refactoring doesn’t just mean massaging old code with nothing new attached to it. It could mean that, but it certainly isn’t the only meaning. Well this is positive.

In almost all cases, I’m opposed to setting aside time for refactoring. In my view refactoring is not an activity you set aside time to do. Refactoring is something you do all the time in little bursts. You don’t decide to refactor, you refactor because you want to do something else, and refactoring helps you do that other thing.

Amen brother! The refactoring god agrees with the basis of my rant. Refactoring for refactoring sake is wrong, wrong, wrong.

There are times when you should not refactor at all. The principle example is when you should rewrite from scratch instead. There are times when the existing code is such a mess that although you could refactor it, it would be easier to start from the beginning. This decision is not an easy one to make, and I admit that I don’t really have good guidelines for it.

Here I come back to my “all engineers think they can do better than their colleagues” observation. How can you tell when when an engineer tells you “the code is a total mess, we need to completely rewrite it.” Since I’m not an engineer, I can’t call fooy very easily. Then again, if I was an engineer, I’d probably believe since I could write it better myself, refactoring must be OK. 😉 This is a tough call!

One argument is that refactoring can be an alternative to upfront design. In this scenario you don’t do any design at all. You just code the first approach that comes into your head, get it working, and then refactor it into shape. …Those who support Extreme Programming often are protrayed as advocating this approach.

Blah! This sounds terrible to me. Say I’m the architect for your new dream house and I tell you, “I’ll have the contractors here tomorrow to start work. We’ll figure out the design as we go along and fix any problems along the way.” What a terrible way to manage a project. You can most certainly take planning too far and never ship anything. But I rarely see this as a problem. I more frequently see software that doesn’t do what is was supposed to do which to me signals a lack of forthought.

My main take away from the refactoring book is that refactoring is about making small incremental changes to code to make it easier to manage. Every change should be testable and introducing new code shouldn’t effect old working code (unless it needs to be restructured to make the new code possible). Sure there are large refactoring projects where you effectively bulldoze everything you have and start over. But it seems the author doesn’t believe this is the usual. Writing code is about constantly making changes to software that makes it better over time. Managing software projects for a while, I don’t get the sense that this is standard operating procedure in most companies I’ve worked at. Projects run from more of a “is it done yet?” mentality – – good code or not. I’ll bet if code were as easy to see as say a user interface for a new application, management would be all over making sure their engineers were focused on writing good code.

In the end, it looks like to me that organizations need to advocate an environment of excellence for software quality. Sure you need to balance time to market, etc. But since quality is usually the first thing thrown out the window in high tech, management should go out of their way to make it clear that expectations are as high at the code level for projects as the external customer facing level.


3 Comments on “Refactoring for dummies”

  1. Greg Linden says:

    Great review, Andrej. Absolutely true that refactoring for refactoring sake is dangerous. The basic problem is that you have no goal, no metric, no way to optimize. How do you know when you’re done? How do you know you’ve done a good job refactoring? You don’t because there’s no objective, just a vague mandate to “clean things up.” Instead, as you said, it’s better to make refactoring part of every project. It takes discipline to allow time for refactoring in every little software project, but it pays off long term.
    I think your analogy between coding and building a house is a little off though. Building a house is a well understood problem, something that has been done millions of times before, and you pretty much can plan the entire thing from the beginning. Changes during or after construction are extremely costly, so they need to be minimized. By contrast, software projects usually involve developing and designing something new, a custom project that has never been done before. And changes, while costly, are not as costly as on physical projects.
    A better analogy might be writing a book. It’s a creative project, building something new. A high level outline of the book is a good place to start, but, in the process of writing the book, the work will have to be continuously refined as you get more ink on the page and see how things evolve.

  2. Eric Vadon says:

    I strongly disagree with bundling refactor time with new feature time. Every time I’ve done this, the project invariably takes longer than if we did the refactor and new feature separately. Besides, if you find yourself doing pork barrel project management – either you’ve completely lost control over your team (e.g. the tail is wagging the dog) or your software is an absolute friggin mess (in which case you need to **stop** doing pork barrel software development and **start** fixing your software). Take your pick.
    I absolutely disagree with the author that you shouldn’t set aside time for refactoring and that you should do it a little at a time. Refactoring requires thought and planning. You have to figure out where you want to go with the software and the product. That’s best done after careful consideration and thought. Attempting to bang something out in an afternoon is a waste of time – especially if you don’t know where you’re going. It’s throw away work.
    All refactoring needs business justification – either in terms of reduced maintenence or faster development. Never refactor for refactoring’s sake – that’s a waste of time. However, knowing when to refactor and when not to refactor is a skill you acquire over time. You don’t need to be an SDE.
    Refactoring can certainly be a macho thing – e.g. “I don’t understand what this code is doing because I didn’t write this code. Obviously the code is wrong because I can’t be wrong. If I rewrote the thing, I would understand it and know what’s going on.” If you find the same SDE coming to you over and over again saying that the code is a mess and that it needs to be rewritten – something is a miss – especially if it’s on different projects that don’t share the same code base at all.
    Do you carry a pager? How often do you look at ticket reports? If the answer to these questions is “not very often” – then maybe you should start. You can’t possibly agree to a refactor if you don’t know the maintenence associated with it. Badly written code will often exhibit buggy behavior. If you ever start saying to yourself, “where in the world are all these tickets coming from? this operation just doesn’t seem to be that difficult especially when compared to project X” – then you should consider refactoring.
    Ever hear the adage – “if you’re working too hard at something, then you’re probably doing it wrong”. Same thing applies. If you ask an SDE to do something that you think is trivial and find that it takes weeks to do and that the person has to write thousands of lines of code – than something is going on. Consider a refactor. If the software is written well, you should be able to do simple things easily.
    Anyway, that’s my 2 cents.

  3. Selly says:

    Good thoughts here. A couple cases you miss with the need for “direct, quantifiable justification of re-factoring”:
    1. Incorrect code design is affecting estimates of (and observed) new feature development. In this case, there may not be a lot of bugs (for example a legacy feature). What you will have consistently is if you ask >1 engineer who knows the code how long something will take, they’ll say “the code is a mess” meaning it wasn’t designed to do this.
    2. Refactoring is needed to support shared functionality. In a small codebase, this is observed as numerous implementations of the same functionality. In a larger codebase, you’ll have numerous packages replicating behavior.
    This is counter to the single-bug, single-fix principle, and although the absolute count of bugs will be low, *if* it can be anticipated that this will be common functionality, it should be abstracted and refactored out of the application space.
    As I’m Jewish, you only get 0.5 of my cents. But take it and be happy with it.