We loathe downtime and failing services. We often talk about how to keep our code stable and how to avoid instability. However, certain types of instability should be embraced. We want our software to be experienced as stable while it runs, while making sure the code behind it all evolves continuously.
I'd like to dive into what we mean when we talk about stability, and to make a few important distinctions.
Service delivery is the service the software is meant to provide, whether that's running a website or controlling machinery in a factory. This is what we usually mean when we talk about stable software. As long as a website is available to users when they need it, all is well. The software delivers the service it is meant to.
Code stability is how often or how much your code base changes. It should not be measured by the number of bugs or by unstable service delivery. It might very well affect the running software, but having stable code relates to rate of change. Code already deployed won't change. It may be bug-ridden, but it's not going to change by itself while running in production: it's stable but full of bugs. Bugs are another topic altogether and may indeed arise from an unstable code base, but not necessarily.
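If you want to put a number on rate of change, one rough way is to sum the lines added and deleted per file over a period. The sketch below assumes a Git repository and reads the output of `git log --numstat`; the function names `parse_numstat` and `churn_since` and the 90-day window are my own illustration, not an established metric.

```python
import subprocess
from collections import Counter


def parse_numstat(text: str) -> Counter:
    """Sum added + deleted lines per path from `git log --numstat` output."""
    churn = Counter()
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separator lines between commits
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # git reports binary files as "-", skip them
        churn[path] += int(added) + int(deleted)
    return churn


def churn_since(repo_dir: str, since: str = "90 days ago") -> Counter:
    """Run git in repo_dir and return churn per file (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "--numstat", "--format=", f"--since={since}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_numstat(result.stdout)


if __name__ == "__main__":
    # Print the ten most frequently changed files in the current repository.
    for path, lines in churn_since(".").most_common(10):
        print(f"{lines:6d}  {path}")
```

A file with high churn is unstable in the sense used here, whether or not it has ever caused an outage.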
Differentiating between these two is critical when talking about stability.
Build for evolution
As preached by a lot of literature nowadays, modern organisations must be built for change - built to evolve. This means that their code must inherently be unstable, while at the same time providing stable service delivery.
To achieve this, you need to meet some preconditions. In brief, the first is to move as far right on the Continuous Delivery Maturity Model as you can: automate processes, deliver continuously, monitor your application landscape and adapt to a changing world every day.
You also need to keep your libraries, tooling and infrastructure up to date. It's easy to forget in the midst of pushing features, and it's often ignored for long periods of time. It might indeed work for a while, all the way until it suddenly doesn't. Suddenly a manager wants to build a shiny new feature based on news from a conference, and you'll have to tell them that you're eight major versions behind, and that there's a huge data migration path to work through first. Or you're trying to recruit those top n% people everyone insists on hiring, but all you can brag about is pre-2000 Java deployed on some Windows Server 2003 boxes in the back room. Fresh graduates who have grown up with npm, React, Vue and Angular won't even bother looking at you.
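To make "eight major versions behind" something you notice before that meeting, a small check can flag dependencies that lag too far. This is a minimal sketch, assuming version strings of the form `major.minor.patch`; the function names, the threshold of two majors and the example package names are all hypothetical, and the two input dictionaries would have to come from your own package manager.

```python
def major_versions_behind(installed: str, latest: str) -> int:
    """Return how many major versions `installed` lags behind `latest`."""
    def major(version: str) -> int:
        # Accept an optional leading "v", as in "v2.3.1".
        return int(version.lstrip("v").split(".")[0])

    return max(0, major(latest) - major(installed))


def stale_dependencies(installed: dict, latest: dict, threshold: int = 2) -> dict:
    """Map each dependency lagging by `threshold` or more majors to its lag."""
    stale = {}
    for name, version in installed.items():
        if name not in latest:
            continue  # no upstream information, nothing to compare
        lag = major_versions_behind(version, latest[name])
        if lag >= threshold:
            stale[name] = lag
    return stale


if __name__ == "__main__":
    # Made-up package names and versions, for illustration only.
    installed = {"framework-x": "2.3.1", "logger-y": "4.2.0"}
    latest = {"framework-x": "10.0.0", "logger-y": "5.0.0"}
    print(stale_dependencies(installed, latest))
```

Run on a schedule, a check like this turns a vague sense of being behind into a list you can prioritise.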
Thirdly, keep your code base as lean as you can. Dead code and unused features cost a lot to keep around over time. And the more code there is, the more vectors there are for introducing bugs when you change it. A large code base also increases cognitive load while working with it, making everything more complex and expensive than it needs to be. Keeping the code base lean and clean also makes it easy to code fearlessly.
These points might seem obvious, or overly simple, or repeated by others many times over. All of that is true. Done correctly, though, they make it easy to modify your code with low risk. High test coverage and good routines for automated testing, along with safety nets along the build and release pipelines, should reduce the risk of deploying bugs to production. When changing code becomes safe and easy, you can do it more often, which in turn should lead to new improvements in process, deployment or testing. Keeping the code base unstable becomes safe.
I preach this so that you'll embrace instability as a force for good, not as a metric to keep at bay. Just keep in mind which form of instability you're embracing. And let that be an active choice you maintain over time, not an accident of design. See opportunity, not obstacle. Make sure to keep your code base unstable.
I'd even argue that, over time, an unstable code base should lead to more reliable service delivery.
Forcing the code to be modified frequently over time helps spread knowledge in the team, too, reducing the risk when people are unavailable for whatever reason. And when that 13-year-old website the current team has never touched inevitably needs a big, very important new feature, you're in for a ride like no other.
There are always going to be downsides. Updating your frameworks just to update them is, in itself, pointless. Deleting code from a stable code base that'll be put out of service in six months is, in itself, pointless. Spending a day setting up CI/CD for a one-off tool might, in itself, be pointless. Find your balance. But embrace instability, and preach it to your team. It'll reward both yourself and the business in the long run.