
80/20 Refactoring

Sunday, 13 November 2022 | Adriaan de Groot


I have found a new bugbear. Something to be creatively annoyed about. I’m going to call it 80/20 refactoring, to express the idea that a refactoring is started, but then not finished. Probably because doing all of the edge cases in a refactoring is hard.

Premise

Suppose we have a large-ish codebase, with a repeated pattern of code. Thanks to the magic of copy-and-paste, that pattern will be, say, a half dozen lines of code, repeated over and over, but also sometimes subtly changed. Something like this:

SerializingBuffer request;
request.start(ID_RESET);
request.add(0);
request.add("timed-out");
connection.send(&request);

You don’t have to understand what this is doing, just that there’s a pattern of code. Variations will include:

  • having this in a function, with request passed in as a pointer,
  • same, but as a reference,
  • variables renamed rqst and conn because someone didn’t watch Kate Gregory’s Naming is Hard: Let’s Do Better talk,
  • an extra 0 somewhere.

Refactoring

My gut feeling says there’s a refactoring to be done here: make a function ReportTimeout() that takes a connection reference and does all the things. It hides the buffer, it hides the exact details of what is being sent, and it has a nicer name.
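A minimal sketch of what that function could look like. The post never shows the real SerializingBuffer or connection classes, so the MessageId enum, the value of ID_RESET, and both class bodies here are hypothetical stand-ins that just record what gets sent. Only the five-line body of ReportTimeout() comes from the pattern above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins: the real types are not shown in this post.
enum MessageId : int { ID_RESET = 1 };  // the value 1 is made up

class SerializingBuffer {
public:
    // Begin a new message with the given ID, discarding earlier fields.
    void start(MessageId id) {
        fields_.clear();
        fields_.push_back(std::to_string(static_cast<int>(id)));
    }
    void add(int value) { fields_.push_back(std::to_string(value)); }
    void add(const std::string& text) { fields_.push_back(text); }
    const std::vector<std::string>& fields() const { return fields_; }

private:
    std::vector<std::string> fields_;
};

class Connection {
public:
    // Records the message instead of putting it on a wire.
    void send(const SerializingBuffer* request) {
        sent_.push_back(request->fields());
    }
    const std::vector<std::vector<std::string>>& sent() const { return sent_; }

private:
    std::vector<std::vector<std::string>> sent_;
};

// The refactored function: the buffer and the exact fields are hidden,
// and callers only see a name that says what happens.
void ReportTimeout(Connection& connection) {
    SerializingBuffer request;
    request.start(ID_RESET);
    request.add(0);
    request.add("timed-out");
    connection.send(&request);
}
```

A call site then shrinks to a single line, ReportTimeout(connection);, which is the easy 80%.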

This is the point where 80/20 refactoring comes in, and I am absolutely guilty of doing just this: the 80% of cases that are easy to deal with are done in no time, the refactoring lands, there’s a net reduction of lines-of-code, lovely. Interest turns elsewhere, and once the here-and-now knowledge about the refactoring is lost, the codebase is in many ways worse off than before.

There’s one case, one particular function, used a lot, and then there’s a handful (20%) of cases that sorta-kinda look like they would fit that function, but they don’t. Chesterton’s fence? Is there something special going on? Removed from the here-and-now of the refactoring, re-discovering what those remaining 20% of copied chunks of code mean, and how they can be shimmed into the new function, is so much harder.

So what happens is the nice bits get nicer, and the ugly bits get uglier, until there’s nothing left to do with the nasty bits but throw them away and start over – hopefully this time, using the newer functions from the outset.

Takeaway

Don’t put away the refactoring tool until it’s really done. Cover all of the edge cases. Anything that you could refactor but don’t: comment the heck out of why it doesn’t fit the new stuff. Then you won’t end up asking the question “why have we got 15 flavors of shit?” (ObXKCD 927)
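As an illustration of that kind of comment, here is a sketch of one leftover call site, the "extra 0" variation from the list earlier. The stand-in types, the value of ID_RESET, the function name ReportTimeoutWithExtraField, and above all the stated reason are invented for the example. The point is only that the reason is written down next to the code.

```cpp
#include <string>
#include <vector>

// Hypothetical, minimal stand-ins so the sketch compiles on its own.
enum MessageId : int { ID_RESET = 1 };  // the value 1 is made up

struct SerializingBuffer {
    std::vector<std::string> fields;
    void start(MessageId id) {
        fields.clear();
        fields.push_back(std::to_string(static_cast<int>(id)));
    }
    void add(int value) { fields.push_back(std::to_string(value)); }
    void add(const std::string& text) { fields.push_back(text); }
};

struct Connection {
    std::vector<std::vector<std::string>> sent;
    void send(const SerializingBuffer* request) { sent.push_back(request->fields); }
};

// NOT REFACTORED to use ReportTimeout(), on purpose.
// Why (reason invented for this example): this call site sends an extra 0
// before the message text, and nobody has confirmed whether the receiving
// side depends on it. ReportTimeout() sends only one 0.
// What to do later: if the extra field turns out to be unnecessary,
// replace this whole body with ReportTimeout(conn).
void ReportTimeoutWithExtraField(Connection& conn) {
    SerializingBuffer rqst;
    rqst.start(ID_RESET);
    rqst.add(0);
    rqst.add(0);  // the extra 0 that ReportTimeout() does not send
    rqst.add("timed-out");
    conn.send(&rqst);
}
```

Whoever finds this a year later at least knows the difference is deliberate, and what would have to be true before it can go.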