Chapter 30 · Part 5

Recovery and Rollback Infrastructure

A focused chapter on recovery and rollback infrastructure, with practical delivery concerns, trade-offs, and the operational questions behind CI/CD work.

30-1

Why Rolling Back Infrastructure Is Nothing Like Rolling Back an Application

You push a bad application update. Users start seeing errors. Your team swaps the load balancer back to the previous version, or the pipeline redeploys

5 min

30-2

When Infrastructure Changes Go Wrong: Recovery Options From Reapply to Failover

You just ran terraform apply on your production infrastructure. The output looks clean. No errors. Then your monitoring alert fires: users can't connect

6 min

30-3

Blast Radius: How to Decide Which Recovery Strategy You Actually Need

Every infrastructure change carries risk. Some risks are tiny. Some can take down your entire business. The question is not whether you should make

5 min

30-4

Recovery Plans for High-Risk Infrastructure Changes

You have a change coming up that could break production. Maybe it's a network architecture overhaul, a database migration, or a security group

6 min

30-5

Why Your Recovery Plan Will Fail Without Practice

A recovery plan sitting in a shared folder, approved by management, and never touched again is not a recovery plan. It is a security blanket. The first

5 min

30-6

When Infrastructure Changes Break: A Step-by-Step Recovery Walkthrough

The pipeline turned red. A Terraform apply that should have taken two minutes has been running for fifteen. Your monitoring dashboard shows five resources

6 min

30-7

What Happens After Recovery: Turning Infrastructure Failures Into Process Improvements

The monitoring dashboard is green again. The team breathes a collective sigh of relief. The incident is resolved, the service is back, and everyone can

5 min