Stabilising a Codebase You Inherited
- technical rescue
- architecture
- leadership
Someone leaves, an agency hands over, a startup's first hire moves on — and you're holding a system that runs the business and that nobody fully understands. I've been the person brought in for exactly this more than once. Panic is the wrong first move. Sequence is everything.
Stop the bleeding before you understand the patient
The first job is stability, not comprehension. What's actively on fire? What wakes someone at 3am? Put monitoring and alerting, however crude, on the parts that hurt, so you're reacting to data instead of vibes. You cannot fix what you can't see.
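To make "crude" concrete, here is a minimal sketch of the kind of alarm I mean: count errors in a sliding time window and fire once the count crosses a threshold. The class name `ErrorRateAlarm`, its parameters, and the numbers in the usage note are illustrative assumptions, not part of any particular monitoring tool.

```python
import time
from collections import deque


class ErrorRateAlarm:
    """Crude sliding-window alarm (hypothetical helper, not a real library).

    Fires when at least `threshold` errors land inside `window_seconds`.
    """

    def __init__(self, threshold, window_seconds, clock=time.monotonic):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the alarm can be tested without sleeping
        self._events = deque()

    def record_error(self):
        """Record one error; return True if the alarm should fire now."""
        now = self.clock()
        self._events.append(now)
        # Forget errors that have aged out of the window.
        while self._events and now - self._events[0] > self.window_seconds:
            self._events.popleft()
        return len(self._events) >= self.threshold
```

In practice you would call `record_error()` from whatever exception handler already exists on the painful path and page someone when it returns `True`. Something like five errors in sixty seconds is a reasonable guess for a starting threshold, and you tune it once real data comes in.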
Make it observable, then make it boring
Structured logging, error tracking, a basic dashboard of the handful of metrics that actually predict an outage. Once you can see the system behaving, the scary unknown shrinks into a list of known issues. Known issues are just work.
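As a rough illustration of structured logging, the sketch below uses only Python's standard `logging` and `json` modules to emit one JSON object per log line. The `JsonFormatter` class, the `build_logger` helper, and the convention of passing fields under a `context` key are my own assumptions for the example; an inherited system may already have a logging setup you should extend instead.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (illustrative sketch)."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Assumed convention: callers pass extra={"context": {...}},
        # which logging attaches to the record as `record.context`.
        context = getattr(record, "context", None)
        if context:
            payload.update(context)
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, sort_keys=True)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]  # replace any handlers from earlier setup
    logger.propagate = False     # avoid duplicate lines via the root logger
    return logger
```

A call such as `build_logger("billing").warning("charge failed", extra={"context": {"order_id": "A1"}})` then produces a line you can filter by `order_id`, which is what turns a pile of text into something a dashboard can count.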
Map the load-bearing walls
Now read the code — but selectively. You don't need to understand all of it. You need to find the load-bearing walls: the auth path, the money path, the data the business can't lose. Document those first. Everything else can wait.
Change one thing at a time
The temptation with an inherited mess is the grand rewrite. Resist it. A rewrite of a system you don't understand tends to fail, because the old code encodes requirements nobody wrote down. Stabilise, add tests around the load-bearing walls, then refactor in small, reversible steps, each with a way to roll back. Cut deploy times from hours to minutes and adopt blue-green deployment so a bad release isn't a catastrophe. These unglamorous changes buy you the confidence to do everything else.
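One way to put tests around a load-bearing wall before touching it is a characterisation test: run the old and the new implementation on the same inputs and treat any disagreement as a regression, even if the old behaviour looks wrong. The sketch below assumes a made-up money-path function; `legacy_total_cents`, `refactored_total_cents`, and the sample cases are hypothetical stand-ins for whatever your system actually does.

```python
def find_regressions(old, new, cases):
    """Run both implementations on each case and collect every disagreement."""
    mismatches = []
    for args in cases:
        expected = old(*args)
        actual = new(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches


# Hypothetical legacy code on the money path, quirks and all.
def legacy_total_cents(prices_cents, discount_percent):
    total = 0
    for price in prices_cents:
        total = total + price
    # Integer division truncates the discount; the business may rely on that.
    return total - (total * discount_percent) // 100


# Candidate refactor: it has to match the legacy output before it ships.
def refactored_total_cents(prices_cents, discount_percent):
    total = sum(prices_cents)
    return total - (total * discount_percent) // 100


# Illustrative inputs; in a real rescue, record these from production traffic.
CASES = [
    ([], 0),
    ([999], 10),
    ([1050, 250], 15),
    ([1, 1, 1], 33),
]

if __name__ == "__main__":
    for args, expected, actual in find_regressions(
        legacy_total_cents, refactored_total_cents, CASES
    ):
        print(f"MISMATCH for {args}: legacy={expected} refactor={actual}")
```

The point of comparing against the old code rather than against a spec is that there usually is no spec. Once the harness reports no mismatches on realistic inputs, swapping in the refactor is a small step you can reverse by switching back.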
The goal is a system that doesn't need a hero
You've succeeded not when you understand the codebase, but when the team can ship it safely without you in the room. That's the whole point of stabilisation: turning a hostage situation back into ordinary engineering.