Taming Technical Debt

Shannon Wells · May 22nd, 2019

You, fellow software engineer, have probably felt the same as I have: while coding some big feature, you find a thing that is inefficient, unreadable, deprecated, confusing, or just buggy. Maybe it’s not bad code at all — you just realize that several packages are out of date, or your framework needs to be upgraded. You roll your eyes, you sigh, and you think, “Ugh, this has got to be fixed.” It’s technical debt.

You may also have experienced this: you go to your manager, or your project manager, and complain about the Technical Debt. “We really need to upgrade to Rails 5!” Yet you can’t seem to get the go-ahead to fix it. “We have a hard deadline!” “We don’t have time for that!” they say.

Those in charge insist you have more important things to do. How do we convince them otherwise?

Getting Buy-In

Technical debt activities should be documented, scheduled, and tracked as stories, if for no other reason than that they take time, which must be accounted for. This is not just because we’re paid for that time, but because what we do affects business priorities. Management needs that knowledge to understand how long anything will take, even if it’s just an estimate. A given technical debt story may even block a feature or a product release, and management needs to know that too. Sometimes those in charge are right: you do have more important things to do than, say, reorganizing a bunch of files or renaming a controller to reflect a name change in the UI. But often they’re not. When that’s the case, here are some ways to convince them otherwise.

Turn Technical Debt into a Business Case

If you want your manager or PM to be receptive to paying down technical debt, make the business case for putting it on the schedule. It’s pretty easy to make this case with security updates, but harder for upgrading major versions of your architecture or core libraries in your application, or even smaller things like keeping packages up to date — which can sometimes cause widespread problems if they’re not done regularly.

Developer productivity – Will it help your team develop faster? Better yet, has this issue already caused N developer hours of wasted time? Multiply N by how much those devs cost. There’s your business case. Numbers always beat no numbers.

Customer happiness – Will this story result in a more stable code base and happier customers? Improved customer retention and reduced support calls are certainly business cases.

Housekeeping – Simple housekeeping is a chore that we all need to do. We still have to wash our dishes and do the laundry, even though they’ll just get dirty all over again. Call it another manifestation of the 2nd Law of Thermodynamics. Tell your PM it’s the same with code. We have to keep doing these things because keeping things up to date, following best practices, and improving old code to meet new needs will prevent worse headaches down the road: the kind of headaches that could endanger important deadlines, lose revenue, or cause PR problems.

Here’s an example: one of our former clients creates monthly stories just for updating all their packages. Around the first of the month, whoever is ready to start a story takes that monthly chore. Updating packages is almost always easy and fast. Occasionally it isn’t, but it’d definitely be worse if one didn’t keep up with it regularly. This also gives the opportunity to:

re-evaluate packages used by an application
make sure they are still needed
determine whether there is anything better, or if the pinned versions could be safely upgraded yet
keep better track of needed security updates. Better security is a business case.

Don’t Say “Refactoring”!

Refactoring can be a dirty word to non-coders. Sometimes it’s because of a bad experience in the past. Perhaps they had to deal with fallout from bad choices made by engineers who didn’t put the business first, or with engineers who didn’t know how to talk about refactoring with non-coders. It could even have been you who made those decisions.

A trigger response to “refactoring” may mean you need to avoid using that term for a while. You’re going to have to earn some trust first, so spend some time honestly looking at why there is dislike for it. Make sure that you are truly exercising good judgment about your team’s coding efforts before any significant refactoring effort; anything that takes longer than a couple of hours is significant. If you don’t believe me, multiply your hourly rate by 1.5 for employee overhead, then by 2 for those couple of hours, and then ask yourself how often your manager would be okay with you spending that much of the company’s money on things that may or may not help the business. Once in a while, that’s fine, but definitely not regularly.
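The back-of-the-envelope calculation above can be sketched in a few lines of Go. This is only an illustration: the function name refactorCost and the hourly rate of 60 are made up for the example, and the 1.5 overhead multiplier is the rough figure from the text, not a universal constant.

```go
package main

import "fmt"

// refactorCost estimates what a block of engineering time costs the company.
// overhead is the employee-overhead multiplier (1.5 in the text above).
// The function name and signature are hypothetical, chosen for illustration.
func refactorCost(hourlyRate, overhead, hours float64) float64 {
	return hourlyRate * overhead * hours
}

func main() {
	// Assumed hourly rate of 60; two hours is the "couple of hours"
	// threshold for a significant refactor.
	fmt.Printf("%.2f\n", refactorCost(60, 1.5, 2)) // prints 180.00
}
```

The same multiplication covers the developer productivity case: plug in the N hours already lost to an issue instead of the hours you plan to spend fixing it.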

Martin Fowler writes that the need to refactor isn’t necessarily due to flawed code. It’s also a recognition that product needs, dependencies, and understandings change, too:

As I build the system, I learn how to improve the design. The result of this interaction is a program whose design stays good as development continues.
Martin Fowler. Refactoring: Improving the Design of Existing Code, 2nd Ed.

How do you earn the trust?

Pick Your Battles

Before butting heads with management, ask yourself, do you really want to die on this hill? Do you really need to achieve this possibly insignificant coding standard? Maybe not, even if it hurts your eyes looking at it. Keep reminding yourself, “everything in its time.” Wait for the right time to fix it. Accept that the right time might never come.

Time Boxing

Spending days refactoring to achieve marginal performance improvements is not ethical; ask yourself what the return on investment for this story might be. I’m not suggesting that you use the hourly wage calculator all the time, but for something complicated, try making a ballpark estimate and set a deadline for solving this issue. Then spend a little time up front coming up with the best way to investigate the issue.

Friends Don’t Lie

I hope this needn’t be said, but one thing not to do is exaggerate the importance or effects of paying down some technical debt in order to get approval. As ethical employees paid to help our company succeed, we should be acting in the best interests of the company. Undercutting business goals to work on a pet story is not ethical. Be realistic about your time estimates and the benefits of any given technical debt story. When everyone has all the best information available, they can make the best decisions.

While these principles are also general guidelines for working on any story, applying them to technical debt is key to earning the trust you will need when you ask to put technical debt stories on the schedule.

Killing Snakes

“If you see a snake, kill it.”

I picked this up from Bob Moss, who was a Director of Engineering at Red Hat. It means if it’s small enough for you to take care of on your own, and quickly, just do it; don’t make a federal case out of it, wait for someone else to take care of it, or let it bite someone.

If you were a Scout like me, you have probably been told, “leave things better than you found them.” This is my basis for deciding which snakes to kill. When I come across a small thing that needs changing, such as unsorted imports, a naming inconsistency, or dead code, and I am already changing that file, I’ll fix it as part of my pull request. I used to get irritated and fix it everywhere, but that habit polluted the change set with a bunch of code irrelevant to the story I was working on. I learned to be more disciplined and keep my pull requests “brain friendly” (more on that term later). This potentially leaves inconsistent patterns around everywhere, so you need to use your best judgment here. Alternatively, you could create a small PR just for this kind of thing.

There may be a good reason for such a PR. In one project that used Go, I was working on a feature and kept running into import cycles, which prevent compilation. This was in part because of an established testing pattern that put all tests in the package under test, rather than in an external <package>_test package, so any test utility that imported that package in turn completed a cycle. I had two choices: track down and figure out how to separate all the test utilities that created the import cycle, or move a bunch of tests into their own test packages.

I knew that the vast majority of the tests exercised exported functions, meaning that they did not need access to the package internals. I spent about 20 minutes seeing how much would fail if I just moved the tests to a test package. The answer was: not much, although it did touch a lot of files. This problem was blocking the story, so I posted two small PRs, one for each new test package, and noted in the pull requests that the import cycles were blocking this story. Since it was all test code, and since CI passed, it was a pretty easy approval. The refactors unblocked me, while also addressing a technical debt issue that we’d already discussed.

This also follows Kent Beck’s advice on refactoring: “for each desired change, make the change easy (warning: this may be hard), then make the easy change.”

When You Can’t Get Buy-In

You say nothing you’ve tried has convinced management that officially reducing technical debt is a good idea? This leaves you with very difficult choices, which boil down to:

do it anyway and be upfront about it, and deal with the fallout,
do it anyway and hide it among other things by somehow padding your efforts on other stories,
don’t do it at all, be miserable and maybe the product fails miserably,
quit and find a place that supports better engineering habits.

Quitting is probably not on the table, and the situation is likely not quite so dire. Not addressing technical debt at all — option three — is eventually going to make it impossible to get your job done. Realistically, your choices are between one and two — you either openly address technical debt or do it on the sly. Two is not sustainable, but may be necessary in the short term. At some point, you will have to let management know what you are doing. I believe this situation is rare anyway, but I’d say start with two, limiting it to the smaller, shorter tasks, and wait for a really great, impossible-to-argue-with technical debt story for one, where you are certain there is a benefit. See Pick Your Battles.

I have actually been in this situation. It was on a project with an engineering director infamous for micromanaging with actually impossible deadlines. Fortunately, I was not alone in recognizing the need to address our technical debt; my teammates determined that option two was what we had to do, and we limited ourselves only to tasks that had measurable benefit compared to the amount of time they’d require. Yes, it was unsustainable, but once we got out of “crunch time,” we were more open about working on technical debt. Once we finished our release, the pressure was off, and management was more receptive to discussing these stories that were not directly related to our release.

Getting Stuff Done

Now that I’ve laid out the higher level ideas about technical debt, I want to talk about some ways to manage and track it.

First, I can’t overemphasize the importance of undertaking these things as a group. When I refer to “you” and advise “you” how to approach these problems, I mean “you” in the plural, as in an engineering team. None of these strategies and tactics should be adopted unilaterally. Instead, they should be socialized within the team and adapted for your needs and working style. Otherwise, they will be ineffective at best and counterproductive at worst.

Extracting

TODO comments in your code indicate technical debt. Don’t leave TODOs in your code. Your code base is a poor story tracking system. TODOs are really stories, so why put TODOs in your code base? If you think the TODO is too small to be a story, it should have been fixed already (see “Killing Snakes”). If it’s not too small to be a story, pull it out of your code base and make a story instead. If you feel you must leave them, however, file a story anyway.

In the TODO comment, include the story number and a short description. This leaves no doubt as to the status of this TODO, and gives the person who takes the story a way to search for the starting point. Nobody is going to come across this and wonder whether it’s relevant and worth fixing, or have to ask about it. (“Hey! Who put that there?” [runs git blame] “Oh... I did... Why did I put that there?”)
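As a rough illustration of that convention, here is a small Go sketch that flags TODO comments lacking a story reference. The "TODO(PROJ-123): description" format, the PROJ prefix, and the untrackedTODOs helper are all assumptions made up for this example; adapt the pattern to whatever IDs your tracker produces.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// trackedTODO matches comments such as "// TODO(PROJ-123): short description".
// The story-ID shape (uppercase key, dash, number) is a hypothetical convention.
var trackedTODO = regexp.MustCompile(`TODO\(([A-Z]+-\d+)\): \S`)

// untrackedTODOs returns the 1-based line numbers of lines in src that
// mention TODO without a story ID and description in the expected format.
func untrackedTODOs(src string) []int {
	var lines []int
	for i, line := range strings.Split(src, "\n") {
		if strings.Contains(line, "TODO") && !trackedTODO.MatchString(line) {
			lines = append(lines, i+1)
		}
	}
	return lines
}

func main() {
	src := "// TODO(PROJ-123): replace linear scan with an index\n" +
		"x := 1\n" +
		"// TODO: clean this up\n"
	fmt.Println(untrackedTODOs(src)) // prints [3]
}
```

Because the story number sits in a fixed position, whoever picks up the story can search the code base for its ID and land directly on the starting point.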

Tracking

When you log these TODOs and any other technical debt stories in your story tracking system, mark them as technical debt in whatever way makes sense; in an epic, with a label, whatever. Include as complete a description as possible, explain why it needs to be done, and where to start. Acceptance criteria are extremely helpful; these will guide unit testing.

The Snacklog

One of my coworkers, Dylan Clark, coined the term “snacklog”, and used it to label these small technical debt stories that can be worked on independently. They’re ready to go and are small enough not to require planning. They’re the “sand” in the “stone, pebbles and sand” analogy for time management. The Snacklog approach is helpful for categorizing and sizing technical debt effort, and has a detailed writeup by Eric Fung: “In Support of the Snacklog”.

Brain-Friendly Pull Requests

“Brain-friendly” is a term I learned from a seminar on giving feedback. I really like this term because it doesn’t carry any emotional baggage; instead, it frames the request positively, as a simple recognition of the limitations of human attention. Brain-friendly PRs are those which are small and simple enough for one person to review without cognitive overload. They should be reviewable within about 20 minutes. I strongly encourage the explicit adoption of a Brain-Friendly PR policy.

Your team must decide what the parameters for this are. I like PRs with fewer than ten files changed and fewer than about 400 total lines added, deleted, or changed, excluding test code.
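If a team wants to make limits like these explicit, one option is a small check along the following lines. This is a minimal sketch, not an established tool: the fileChange type and isBrainFriendly function are hypothetical, the limits are simply the preferences stated above, and it assumes that test files still count as changed files while their lines are left out of the total.

```go
package main

import (
	"fmt"
	"strings"
)

// fileChange is a hypothetical summary of one file touched by a pull request.
type fileChange struct {
	Path         string
	LinesChanged int // lines added, deleted, or modified
}

// Limits taken from the preferences above; a team should choose its own.
const (
	maxFiles = 10  // fewer than ten files changed
	maxLines = 400 // fewer than about 400 lines, excluding test code
)

// isBrainFriendly reports whether a PR stays under both limits.
// Assumption: Go test files (ending in _test.go) count toward the file
// limit, but their lines are excluded from the line total.
func isBrainFriendly(changes []fileChange) bool {
	lines := 0
	for _, c := range changes {
		if strings.HasSuffix(c.Path, "_test.go") {
			continue
		}
		lines += c.LinesChanged
	}
	return len(changes) < maxFiles && lines < maxLines
}

func main() {
	pr := []fileChange{
		{Path: "billing/invoice.go", LinesChanged: 120},
		{Path: "billing/invoice_test.go", LinesChanged: 500},
		{Path: "billing/tax.go", LinesChanged: 53},
	}
	fmt.Println(isBrainFriendly(pr)) // prints true
}
```

A check like this is a conversation starter for the team rather than a gate; a large PR made up mostly of generated files, for instance, may deserve an exception.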

What does this have to do with taming technical debt? Inadequately reviewed code can increase technical debt. PR reviewers must consider several factors all at once when looking at a PR: compliance with established style and patterns, good coding practice, error conditions and bugs, and adequate, valid testing, in addition to fulfilling the acceptance requirements of the story. This is already asking a lot of one’s brain. If a PR is too long, there is no way one person can be expected to consider all these things while looking at your code and do it well, at least not without either spending hours and hours on it or sitting there with you explaining it to them. Nobody has time for that! One way to guarantee that a team will absolutely hate doing PR review is to make the process painful, tedious, and long. We want people to be motivated to review PRs, not to see it as a hateful chore that they have to be begged (or ordered) to do.

I’m still guilty of not making my PRs brain-friendly enough, even when I think I am. I was recently asked to break up one PR into multiple smaller ones. It was a spike, and while only 11 files were changed, over 1000 lines were. While this line count was weighted heavily by auto-generated files, there was still too much code to review. I followed up with a PR of 173 changed lines across 5 files. The response? “Your PR is small and it makes me happy.” Making your coworkers happy is also good for team cohesion, which leads to increased productivity. Make it easy for your teammates to review your code.

Takeaways

Technical debt is unavoidable, and not always a sign of bad code. Sometimes it may be reasonable to accept a certain amount of technical debt when business needs outweigh the cost of the debt. Regardless, it must be managed and scheduled, which means those in charge of the product need to understand what it is and why it’s important to address it. If it is properly assessed, scheduled, and completed on a frequent, if not regular, basis, the technical debt in your code base becomes manageable, rather than a dark thunderhead growing taller over the product.