Legacy code

A few weeks ago I stumbled across a video from Karoline Klever titled “What is the actual life expectancy of your code?” and it challenged everything I thought I knew about legacy code. It also showed me just how important it is to question what I think I know about the terms I fling around the office on a day-to-day basis. It changed my attitude towards code which keeps the business running, and towards whether or not code should keep being “revived” as if it were a living organism.

Although the talk focuses on the value and lifetime of code before it’s considered dead and retired, I thought I’d concentrate on the “legacy code” part because it has a lot of stigma attached to it and is all too often used as an excuse to decommission systems or parts of systems.

Legacy code is something I thought I knew so well that I never stopped to ask what it actually is. When I hear “legacy” I think of “old code”: code which is a few years old, is not well understood by many, is something developers are generally too scared to touch, and doesn’t follow modern development practice for the platform on which it is being built. I’ve put some real thought into this and, although most developers can agree on some ideas about what “it” is, I’d like to put my spin on things based on my experience and on ideas which I’ve found on the web.

Wikipedia’s entry on legacy code states that legacy code refers to a technology stack which is no longer supported. This might include the operating system or language/framework the system is built on. It also mentions some modern interpretations such as code which was inherited from someone else or source code inherited from an older system presumably working in a newer system.

Code which no longer compiles on modern operating systems or is built on unsupported frameworks/libraries fits the definition nicely, but it’s only part of the definition.

I don’t think inheriting code from someone else makes it legacy. Developers come and go during projects, and even after a project is complete and running in production there are usually enhancements and bug fixes, which are a normal part of maintenance. It’s hardly legacy, and there’s no clear definition of what “old” code is. At what point will that team consider their code base old, especially when it’s constantly being enhanced and engineered? 2 weeks, 2 years, 10 years?

There are also posts on Stack Overflow and StackExchange Programmers and a definition on Techopedia. Here’s a summary of some of the popular points of view.

  • Lack of testing (Made famous by Michael Feathers)
  • Code which is no longer engineered or enhanced but frequently patched
  • Code you’d rather replace than work with
  • A lot of effort is required to extend it
  • Code which is orphaned in some way
  • Too business critical to be easily replaced

I generally agree with most of these points but there are always grey areas. It’s great to see developers put a lot more value in the code they write. That’s a positive.

Lack of testing is a tricky one. I don’t think it makes code legacy but it certainly tells you that the code being worked on doesn’t follow modern development practices. I think some developers would have a hard time believing the code they are writing is legacy as it leaves their fingertips. As someone who is a strong believer in TDD, I still think it’s OK to write code which doesn’t have tests, such as a small bash, PowerShell or LINQPad script, or perhaps a proof-of-concept or idea.

Code which you’d rather replace than work with is subjective. Each developer will have a different opinion based on the code they are looking at, and we all solve problems differently. To me this isn’t part of the definition.

One of the highest voted answers on StackExchange Programmers states that being used in production “is precisely what makes it legacy”, and I have a hard time understanding the logic of that. Take a code base which has been in production for 20 years, is still being enhanced with new features and is constantly iterated on, and throw in unit tests while you’re at it. Not legacy. There are many systems like this and you might even be working on one. Sure, there might be parts which haven’t been touched in months or even years, but it’s a living, breathing system constantly being iterated on. Developers not following modern practices, or the code being over n years old, still doesn’t make it legacy.

Where to go from here

There’s always a counter argument to these points and I feel like I’ve only come a little closer to an understanding. There are so many corner cases that it makes a concrete definition hard.

I think the following four points should determine if a piece of code (not the whole code base) should be considered “legacy” with all the stigma that goes along with it.

  • Code which is no longer engineered or enhanced but frequently patched
  • Code which is orphaned in some way. Undocumented and the original author(s) unavailable
  • The compiler or interpreter for the code no longer runs on a supported operating system
  • Code which can be compiled on a supported operating system but is not compatible with it or cannot run on it due to outdated APIs
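
To make the four points a little more concrete, here’s a minimal Python sketch of how I read them: a piece of code counts as legacy if any one of the points applies. The class and field names are made up purely for illustration, and each flag is still a judgement call someone on the team has to make. Age and missing tests are deliberately not part of the check.

```python
from dataclasses import dataclass


@dataclass
class CodeTraits:
    """Hypothetical flags for one piece of code, not the whole code base."""
    patched_but_not_enhanced: bool = False    # point 1: only patched, never engineered
    orphaned: bool = False                    # point 2: undocumented, authors unavailable
    toolchain_unsupported: bool = False       # point 3: compiler/interpreter needs an unsupported OS
    cannot_run_on_supported_os: bool = False  # point 4: compiles, but outdated APIs stop it running


def is_legacy(traits: CodeTraits) -> bool:
    """Return True if at least one of the four points applies."""
    return any((
        traits.patched_but_not_enhanced,
        traits.orphaned,
        traits.toolchain_unsupported,
        traits.cannot_run_on_supported_os,
    ))
```

For example, `is_legacy(CodeTraits(orphaned=True))` is true, while a module with none of the flags set is not legacy no matter how old it is.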

I would have liked to add a fifth point stating that code which is hard to change/approach or difficult to understand is legacy, but that’s an opinion. If you’re working with code like that then that’s simply a challenge you’ll need to face. If it’s hard then do it more, and keep touching it until you’re confident you know what you’re doing. Whether code is hard to change or difficult to understand is subjective, as every developer solves problems differently. A lack of understanding doesn’t make code bad in any way. It’s not legacy if you need to enhance it with more functionality and features; you’ve simply got the task of documenting what you’re doing to give that code more life.

I’d like to sum these points up into a clean definition but I don’t think that’s possible. If the code you’re working with matches one of these points then you can safely consider it legacy code and use that in whatever argument you see fit with your team. It’s not, however, a green light to consider code dead and rewrite it.