Understanding Technical Debt

From Wikipedia: “Technical debt (also known as design debt or code debt) is ‘a concept in programming that reflects the extra development work that arises when code that is easy to implement in the short run is used instead of applying the best overall solution’.

Technical debt can be compared to monetary debt. If technical debt is not repaid, it can accumulate ‘interest’, making it harder to implement changes later on. Unaddressed technical debt increases software entropy. Technical debt is not necessarily a bad thing, and sometimes (e.g., as a proof-of-concept) technical debt is required to move projects forward. On the other hand, some experts claim that the “technical debt” metaphor tends to minimize the impact, which results in insufficient prioritization of the necessary work to correct it.”

The concept of technical debt comes from the software engineering world, but it applies to the world of IT and business infrastructure just as much.  As in software engineering, we design our systems and our networks, and taking shortcuts in those designs produces technical debt.  Those shortcuts include working with less than ideal designs, incorporating existing hardware and other bad design practices.  One of the more significant forms of this comes from investing in the “past” rather than in the “future” and is quite often triggered by the sunk cost fallacy (a.k.a. throwing good money after bad).

It is easy to see this happening in businesses every day.  New plans are made for the future, but before they are implemented, investments are made to keep an old system design working, to make it work better or to expand it.  This investment then either turns into a nearly immediate financial loss or, more often, becomes an incentive to not invest in the future designs as quickly, as thoroughly or, possibly, at all.  The investment in the past can become crippling in the worst cases.

This happens in numerous ways and is generally unintentional.  Often investments are needed to keep an existing system running properly and, under normal conditions, would simply be made.  But when a future change is needed or potentially planned, this investment can be problematic.  In many cases, though, better cost analysis and triage planning can remedy this.

In a non-technical example, imagine owning an older car that has served well but is due for retirement in three months.  In three months you plan to invest in a new car because the old one is no longer cost effective due to continuous maintenance needs, lower efficiency and so forth.  But before your three-month plan to buy a new car comes around, the old car suffers a failure and now requires a significant investment to keep it running.  Putting money into the old car would be a new investment in the technical debt.  Rather than spending a large amount of money to make an old car run for a few months, moving up the timetable to buy the new one is obviously drastically more financially sound.  With cars, we see this easily (in most cases).  We save money, potentially a lot of it, by quickly buying a new car.  If we were to invest heavily in the old one, we either lose that investment in a few months or we risk changing the solid financial plan that was already made for the purchase of a new car.  Both cases are bad financially.

IT works the same way.  Spending a large sum of money to maintain an old email system six months before a planned migration to a hosted email system would likely be very foolish.  The investment is either lost nearly immediately when the old system is decommissioned or it undermines our good planning processes and leads us to not migrate as planned and do a sub-par job for our businesses because we allowed technical debt to drive our decision making rather than proper planning.

Often a poor triage operation, or a failure to give proper authority to the people doing the triage, can be the factor that causes emergency technical debt investments rather than rapid, forward-looking investments.  This is only one area where major improvements may address issues, but it is a major one.  This can also be mitigated, in some cases, through “what if” planning: having investment plans in place that are contingent on common or expected emergencies.  Such an emergency may be as simple as a need for capacity expansion, due to growth, that arises before systems planning comes into play.

Another great example of common technical debt is server storage capacity expansion.  This is a scenario that I see with some frequency and one that demonstrates technical debt well.  It is common for a company to purchase servers that lack large internal storage capacity.  Either immediately or sometime down the road, more capacity is needed.  If this happens immediately, we can see that the server purchased was a form of technical debt through improper design, and it obviously represents a flaw in the planning and purchasing process.

But a more common example is needing to expand storage two or three years after a server has been purchased.  Common expansion choices include adding an external storage array to attach to the server or modifying the server to accept more local storage.  Both of these approaches tend to be large investments in an already old server, one that is easily forty percent or more of the way through its useful lifespan.  In many cases the same or only a slightly higher investment in a completely new server buys new hardware, faster CPUs, more RAM, the needed storage in a purpose-designed and purpose-built form, an aligned and refreshed support lifespan, a smaller datacenter footprint, lower power consumption, newer technologies and features, better vendor relationships and more, all while retaining the original server to reuse, retire or resell.  One way spends money supporting the past; the other can often spend comparable money on the future.
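
To make the comparison concrete, here is a minimal sketch in Python that spreads each investment over the months of service it actually buys.  Every figure in it (the five-year useful life, the server's age, both prices) is a hypothetical number chosen only for illustration, and the function name is likewise an assumption; real quotes and lifespans will vary.

```python
def cost_per_remaining_month(investment: float, remaining_months: float) -> float:
    """Spread an investment evenly over the months of service it will buy."""
    if remaining_months <= 0:
        raise ValueError("remaining_months must be positive")
    return investment / remaining_months


# Hypothetical figures, for illustration only.
USEFUL_LIFE_MONTHS = 60      # assumed five-year useful lifespan
AGE_MONTHS = 24              # two years old, i.e. 40% through that lifespan
UPGRADE_COST = 8_000.0       # assumed cost of expanding storage on the old server
NEW_SERVER_COST = 10_000.0   # assumed cost of a new server with the storage built in

# The upgrade only pays off for as long as the old server stays in service.
upgrade_monthly = cost_per_remaining_month(
    UPGRADE_COST, USEFUL_LIFE_MONTHS - AGE_MONTHS
)
# The new server starts a fresh useful lifespan.
replace_monthly = cost_per_remaining_month(NEW_SERVER_COST, USEFUL_LIFE_MONTHS)

print(f"upgrade old server: {upgrade_monthly:.2f} per month of service")
print(f"buy new server:     {replace_monthly:.2f} per month of service")
```

Under these assumed numbers, the larger purchase is the cheaper one per month of service, and that is before counting the side benefits of new hardware or the resale value of the old server.  The same arithmetic applies to the car and email examples: the shorter the remaining life of the old system, the more expensive each month bought by investing in it becomes.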

Technical debt is a crippling factor for many businesses.  It increases the cost of IT, sometimes significantly, and can lead to high levels of risk through a lack of planning and most spending being emergency based.

