Why Releasing in Stages Matters More Than You Think

Your team just spent two weeks finishing a new feature. Code was reviewed. Tests passed in staging. Everything looked good. You run the pipeline, the new version goes to all servers, and every user immediately gets the new feature.

Five minutes later, error rates start climbing. Ten minutes later, users report they can't access the main page. You panic, hit rollback, and everyone goes back to the old version. But the damage is done: some users lost data temporarily, others hit errors, and trust in your application takes a hit.

The problem wasn't bad code. The problem was how you shipped it. When every change goes to every user at the same time, you have no window to see if the new version actually works in production. You get no chance to check the impact before everyone is affected.

This is called a big bang release. All changes go out at once, to all users, with no pause for observation. It carries risks that no amount of pre-release testing can eliminate.

Why Staging Isn't Enough

Staging environments never perfectly match production. Configuration differences, traffic patterns, and real user data create gaps that only show up in production. Your tests might be solid, but they run against synthetic data and simulated behavior. Real users do unexpected things.

Different users also have different usage patterns. A feature that works well for one group might break for another. Power users hit edge cases that casual users never trigger. Mobile users have different network conditions than desktop users. A single release treats everyone the same, but your users aren't the same.

When something goes wrong with a big bang release, the blast radius is massive. Every user is exposed. Every session is affected. The pressure to fix it immediately is intense, which leads to rushed decisions and more mistakes.

The Alternative: Progressive Delivery

Instead of sending a new version to everyone at once, you send it gradually. Start with a small percentage of users. Watch what happens. If things look good, expand the reach. If something goes wrong, only a small group is affected.

Progressive delivery isn't one technique. It's a combination of practices:

  • Controlling what percentage of traffic gets the new version
  • Deciding which users see changes first
  • Toggling features on or off for specific groups
  • Monitoring metrics in real time
  • Making automated decisions based on collected data

The following flowchart illustrates how a staged release works, with automated checks at each step:

flowchart TD
    A[Start: 1% of users] --> B[Monitor metrics]
    B --> C{All green?}
    C -->|Yes| D[Increase to 5%]
    D --> E[Monitor metrics]
    E --> F{All green?}
    F -->|Yes| G[Increase to 10%]
    G --> H[Monitor metrics]
    H --> I{All green?}
    I -->|Yes| J[Increase to 25%]
    J --> K[Monitor metrics]
    K --> L{All green?}
    L -->|Yes| M[Increase to 50%]
    M --> N[Monitor metrics]
    N --> O{All green?}
    O -->|Yes| P[100% rollout]
    C -->|No| Q[Pause or roll back]
    F -->|No| Q
    I -->|No| Q
    L -->|No| Q
    O -->|No| Q

The goal is simple: reduce risk by limiting exposure. When a bad release only hits 5% of users, the problem is manageable. You have time to analyze, decide whether to fix or roll back, and act without panic.

What You Control During a Staged Release

Progressive delivery gives you several levers to pull during a release. Understanding each one helps you design a strategy that fits your situation.

Traffic shifting controls how much user traffic reaches the new version. You might start with 1% of traffic, then move to 5%, 20%, 50%, and finally 100%. Each step gives you data before increasing exposure.
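One common way to implement percentage-based shifting at the application layer is deterministic bucketing: hash a stable identifier into a bucket from 0 to 99 and compare it against the current rollout percentage. A minimal sketch, assuming user IDs are available as strings (the salt and bucket count here are illustrative choices, not from any particular tool):

```python
import hashlib

def in_rollout(user_id: str, percent: int, salt: str = "v2-rollout") -> bool:
    """Deterministically map a user to a bucket 0-99; users whose
    bucket falls below the rollout percentage get the new version.
    The salt keeps bucketing independent across different rollouts."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# The same user always lands in the same bucket, so raising the
# percentage from 5 to 20 only adds users; nobody flips back to
# the old version mid-rollout.
```

Because the mapping is deterministic, a user who saw the new version at 5% keeps seeing it at 20%, which avoids confusing back-and-forth experiences as you expand.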

User targeting lets you choose who gets the new version first. Internal users, beta testers, or users in a specific region can be early adopters. This gives you feedback from a controlled group before wider release.
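Targeting is often expressed as an ordered list of rules checked before any percentage logic runs. A sketch with invented rules and field names (the email domain, group name, and region are placeholders for whatever attributes your user records actually carry):

```python
# Hypothetical early-adopter rings, checked in priority order:
# internal staff first, then beta testers, then one region.
EARLY_ADOPTER_RINGS = [
    {"name": "internal", "match": lambda u: u.get("email", "").endswith("@example.com")},
    {"name": "beta",     "match": lambda u: "beta" in u.get("groups", [])},
    {"name": "region",   "match": lambda u: u.get("region") == "eu-west"},
]

def rollout_ring(user: dict) -> str:
    """Return the first ring a user qualifies for, or 'general'."""
    for ring in EARLY_ADOPTER_RINGS:
        if ring["match"](user):
            return ring["name"]
    return "general"
```

A release might then go to the "internal" ring for a day, then "beta", then "region", before percentage-based rollout to "general" begins.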

Feature flags separate deployment from release. You can deploy code with new features turned off, then enable them gradually. If something goes wrong, you flip the flag off without rolling back the entire deployment.
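The core of a feature flag system is a lookup that the new code path consults at runtime. A minimal in-memory sketch (real systems back this with a database or a flag service; the flag name and group names here are invented for the example):

```python
# Illustrative in-memory flag store. "enabled" releases the feature
# to everyone; "allow_groups" grants early access while it's off.
FLAGS = {
    "new-checkout": {"enabled": False, "allow_groups": {"internal"}},
}

def flag_on(flag_name: str, user_groups: set) -> bool:
    flag = FLAGS.get(flag_name)
    if flag is None:
        return False                  # unknown flag: fail closed
    if flag["enabled"]:
        return True                   # released to everyone
    return bool(flag["allow_groups"] & user_groups)  # early access only

# The new checkout code ships to production dark. Flipping "enabled"
# releases it without a deploy; flipping it back is the kill switch.
```

Note the fail-closed default for unknown flags: if the flag store and the code ever disagree, users get the old, known-good behavior.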

Environment gating means you don't jump straight to production. You might release to a canary environment first, then a small production subset, then wider production. Each environment adds confidence.

Metrics That Matter During Progressive Delivery

Releasing in stages only helps if you're watching the right signals. Without monitoring, you're flying blind.

Error rates are the most obvious metric. A spike in 5xx errors or client-side exceptions means something is wrong. But don't just watch the overall rate. Compare error rates between the new version and the old version. A 0.5% error rate might look fine until you see the old version running at 0.05%.
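That comparison can be automated directly. A sketch of a relative health check, assuming you can query error and request counts per version from your monitoring system (the 3x ratio and minimum sample size are example thresholds, not recommendations):

```python
def error_rate(errors: int, requests: int) -> float:
    return errors / requests if requests else 0.0

def canary_healthy(canary_errors: int, canary_reqs: int,
                   base_errors: int, base_reqs: int,
                   max_ratio: float = 3.0, min_requests: int = 500) -> bool:
    """Judge the canary against the baseline's error rate rather
    than a fixed absolute number."""
    if canary_reqs < min_requests:
        return True                   # not enough data yet; keep collecting
    canary = error_rate(canary_errors, canary_reqs)
    baseline = error_rate(base_errors, base_reqs)
    if baseline == 0.0:
        return canary == 0.0
    return canary <= baseline * max_ratio

# 0.5% looks fine in isolation, but next to a 0.05% baseline it is
# a 10x regression:
print(canary_healthy(5, 1000, 5, 10000))   # False
```

The relative check catches exactly the case in the text: an absolute threshold of, say, 1% would have waved the 0.5% canary through.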

Response times often degrade before errors appear. If the new version is slower, users might not complain immediately, but they'll notice. Track p95 and p99 latency, not just averages.
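A tiny nearest-rank percentile sketch makes the average-versus-tail gap concrete (monitoring backends and libraries like numpy offer interpolated variants; the sample latencies below are invented):

```python
import math

def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile: the value at rank ceil(p/100 * n)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [12, 14, 15, 15, 16, 18, 20, 24, 90, 950]
print(sum(latencies_ms) / len(latencies_ms))   # 117.4 — the mean hides the tail
print(percentile(latencies_ms, 50))            # 16 — typical request is fast
print(percentile(latencies_ms, 95))            # 950 — the tail users actually feel
```

One request in twenty taking nearly a second is invisible in the median and muddled in the mean, but unmistakable at p95.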

Business metrics tell you if the feature is actually working. Conversion rates, sign-ups, purchases, or engagement metrics show whether the change delivers value. A technically perfect release that hurts business metrics is still a bad release.

User-reported issues are slower but valuable. Monitor support tickets, social media mentions, and internal reports. Sometimes users notice problems that automated monitoring misses.

Building a Pipeline That Decides for Itself

The most effective progressive delivery pipelines don't wait for humans to make every decision. They automate the go/no-go check based on metrics.

Here's how it works in practice:

  1. Pipeline deploys the new version to a small subset of servers or users
  2. Monitoring system collects metrics for a defined period (say 10 minutes)
  3. Automated checks compare metrics against thresholds
  4. If metrics are healthy, pipeline increases the rollout percentage
  5. If metrics cross thresholds, pipeline pauses or rolls back automatically

Here's a concrete example using Argo Rollouts, a Kubernetes controller that automates this process:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-service
spec:
  replicas: 10
  strategy:
    canary:
      steps:
        - setWeight: 5
        - pause: {duration: 10m}
        - analysis:
            templates:
            - templateName: success-rate
            args:
            - name: service-name
              value: my-service
        - setWeight: 20
        - pause: {duration: 10m}
        - analysis:
            templates:
            - templateName: success-rate
            args:
            - name: service-name
              value: my-service
        - setWeight: 50
        - pause: {duration: 10m}
        - analysis:
            templates:
            - templateName: success-rate
            args:
            - name: service-name
              value: my-service
        - setWeight: 100

This configuration gradually shifts traffic from 5% to 20% to 50% and finally 100%, pausing after each step to run automated analysis checks. If the analysis fails, for example because the success rate drops below its threshold, the rollout aborts and traffic shifts back to the stable version.
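The decision loop above fits in a few lines. A sketch in Python, where the health check is a stand-in for querying your real monitoring system and the step percentages mirror the staged rollout (injectable callbacks keep it testable):

```python
import time

STEPS = [5, 20, 50, 100]      # rollout percentages, smallest first
BAKE_TIME_S = 600             # observe each step for 10 minutes

def metrics_healthy(percent: int) -> bool:
    """Stand-in for querying a monitoring system and comparing
    error rates, latency, and business metrics to thresholds."""
    return True               # replace with real checks

def run_rollout(set_weight, healthy=metrics_healthy, bake=BAKE_TIME_S):
    for percent in STEPS:
        set_weight(percent)   # shift this much traffic to the new version
        time.sleep(bake)      # let metrics accumulate for the bake period
        if not healthy(percent):
            set_weight(0)     # roll back: all traffic to the old version
            return "rolled-back"
    return "complete"

# Simulated run with a failure injected at the 50% step:
history = []
result = run_rollout(history.append, healthy=lambda p: p < 50, bake=0)
print(result, history)        # rolled-back [5, 20, 50, 0]
```

No human sits between "metrics crossed a threshold" and "traffic went back to the old version"; the loop acts the moment a check fails.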

This removes the human delay between detecting a problem and acting on it. When you're asleep or in a meeting, the pipeline keeps protecting your users.

The thresholds need to be set carefully. Too tight, and you'll get false positives that block releases. Too loose, and you'll miss real problems. Start with conservative thresholds and adjust based on experience.

A Practical Checklist for Your Next Staged Release

Before you implement progressive delivery, run through this checklist:

  • Can you route traffic to specific versions of your application?
  • Do you have real-time monitoring for error rates, latency, and business metrics?
  • Have you defined clear thresholds for each metric?
  • Can you roll back a partial release without affecting users on the old version?
  • Do you have a way to target specific user groups (internal, beta, region-based)?
  • Is your pipeline capable of pausing or rolling back automatically?
  • Have you tested the progressive delivery process in a non-production environment?

The Takeaway

Big bang releases turn every deployment into a gamble. You're betting that staging tests caught everything, that production will behave exactly like your test environment, and that all your users will have the same experience. Those bets fail more often than teams admit.

Progressive delivery changes the equation. Instead of betting everything on one release, you place small bets and check the results before betting more. When something goes wrong, the damage is contained. Your team stays calm, your users stay happy, and your release process becomes something you trust instead of something you fear.

Start small. Pick one service or one feature. Set up traffic shifting and monitoring. Run your next release through a staged process. The confidence you gain will make you wonder why you ever shipped any other way.