After Deployment: What to Check Before You Call It Done
The pipeline turns green. Build artifacts are uploaded. The deployment script finishes without errors. Many teams stop there, assuming the new version is live and working. That assumption is dangerous.
A green pipeline only means the delivery process ran without technical errors. It does not mean the new version actually works in production. Between a successful pipeline and a working deployment, there is a gap that needs active checking.
Why Pipeline Success Is Not Enough
Production environments are different from staging or test environments. In production, your new version faces real data, real traffic patterns, and real network conditions. Things happen that no pipeline can detect:
- A database connection that slows down because the production dataset is ten times larger than staging
- A cache configuration that works fine with test queries but misses on actual user access patterns
- A misconfigured server that the deployment script didn't catch
- A third-party API that responds differently under real load
The pipeline runs scripts and checks code. It does not know how the system behaves when it meets reality. That is why you need a separate step after deployment: verification.
The Difference Between Deployment and Verification
Deployment is the act of placing a new version into an environment. Verification is the act of confirming that the new version works as expected in that environment.
These are two different activities. Deployment is about machines and scripts. Verification is about behavior and signals. Many teams treat them as one thing, or skip verification entirely because the pipeline said everything was fine.
Deployment and verification run as two separate tracks: the deployment track ends when the script finishes, and the verification track ends when the new version is confirmed healthy. Both must succeed before a deployment is considered complete.
Treating a deployment as complete the moment the script finishes means gambling that nothing unexpected happens in production. That gamble might pay off for simple changes. For anything involving database migrations, configuration changes, or infrastructure updates, the odds are against you.
Start With a Smoke Test
The most basic verification step is a smoke test. The term comes from hardware engineering: when you power on a new device for the first time, you check whether smoke comes out. No smoke means the device at least did not catch fire.
In software deployment, a smoke test is a quick check to see if the new version is alive and responding. It answers one question: can this version accept requests and return reasonable responses?
A practical smoke test might include:
- Hitting the main page and checking for a 200 response
- Calling a simple API endpoint and verifying the response structure
- Confirming the database connection is alive
- Checking that static assets load correctly
Here is a minimal bash script that runs a smoke test against a deployed endpoint and exits with a non-zero code on failure:
#!/bin/bash
# smoke-test.sh - Quick check that the deployed version is alive
URL="https://your-app.example.com/health"
EXPECTED_STATUS=200
# --max-time keeps the check from hanging if the endpoint never responds
HTTP_STATUS=$(curl -s -o /dev/null --max-time 10 -w "%{http_code}" "$URL")
if [ "$HTTP_STATUS" -ne "$EXPECTED_STATUS" ]; then
  echo "Smoke test failed: expected status $EXPECTED_STATUS, got $HTTP_STATUS"
  exit 1
fi
echo "Smoke test passed: $URL returned $HTTP_STATUS"
exit 0
Smoke tests do not need to be deep. They are fast, shallow checks that catch obvious failures. If the smoke test fails, you know something is seriously wrong and need to stop further traffic or roll back. If it passes, you can move to more detailed checks.
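The same pattern extends to several endpoints when a single health check is not enough. Here is a sketch that loops over a few shallow checks; the paths are placeholders for whatever your application actually serves:
#!/bin/bash
# smoke-test-multi.sh - Shallow checks against several endpoints (paths are placeholders)
BASE="https://your-app.example.com"
FAILED=0
for path in "/" "/health" "/api/v1/status"; do
  status=$(curl -s -o /dev/null --max-time 10 -w "%{http_code}" "$BASE$path")
  if [ "$status" -ne 200 ]; then
    echo "FAIL  $BASE$path returned $status"
    FAILED=1
  else
    echo "OK    $BASE$path returned $status"
  fi
done
exit $FAILED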
Check the Basic Signals
Once the smoke test passes, look at the system's operational signals. These are the metrics that tell you whether the new version is running normally or causing problems.
The signals you need depend on what you deployed, but some are universal:
- Error rate: Is the percentage of failed requests higher than before the deployment?
- Latency: Are response times within acceptable bounds? A sudden spike often indicates a problem.
- Resource usage: Did CPU, memory, or disk usage change significantly after the deployment?
- Traffic volume: Is the system receiving the expected volume of requests? A sudden drop might mean users cannot reach the new version.
You do not need complex analysis for this. Compare the current values against the same time window before the deployment. If you have a monitoring dashboard, this comparison takes a few minutes.
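If your metrics live in a system with a query API, the before-and-after comparison can be scripted. The sketch below assumes a Prometheus server and a hypothetical http_requests_total metric with a status label; adjust the query to whatever your monitoring stack actually exposes:
#!/bin/bash
# check-error-rate.sh - Compare the current 5xx rate against one hour before
# Assumes Prometheus; the metric and label names are placeholders, not a given.
PROM="http://prometheus.internal:9090"
QUERY='sum(rate(http_requests_total{status=~"5.."}[5m]))'
now=$(curl -s "$PROM/api/v1/query" --data-urlencode "query=$QUERY" | jq -r '.data.result[0].value[1] // "0"')
before=$(curl -s "$PROM/api/v1/query" --data-urlencode "query=$QUERY offset 1h" | jq -r '.data.result[0].value[1] // "0"')
echo "5xx rate now: $now req/s, one hour ago: $before req/s"
# Fail if the current rate is more than double the pre-deployment rate
# (the small constant keeps a zero baseline from tripping on noise)
if awk -v n="$now" -v b="$before" 'BEGIN { exit !(n > 2 * b + 0.01) }'; then
  echo "Error rate check failed: 5xx rate roughly doubled since before the deployment"
  exit 1
fi
exit 0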
The key is to make this check a standard part of your deployment process, not something you do when you remember or when someone reports a problem.
Verification Is Part of the Deployment
Here is the mindset shift: verification is not a separate activity that happens after deployment. Verification is part of the deployment itself. The deployment is not complete until you have enough confidence that the new version is running normally.
This has a practical consequence for your pipeline. The pipeline should not mark the deployment as successful when the script finishes. It should wait until verification passes. If verification fails, the pipeline should report the deployment as failed, even if the script ran without errors.
Some teams implement this by having the pipeline pause after deployment and wait for manual confirmation. Others automate the smoke test and basic signal checks, and only mark success when those checks pass. Either approach is better than assuming everything is fine.
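In a script-driven pipeline, the gating can be as simple as chaining the verification step to the deploy step and letting its exit code decide the result. A minimal sketch, assuming deploy.sh and rollback.sh are your own scripts:
#!/bin/bash
# deploy-and-verify.sh - The deployment only counts as successful if verification passes
set -e  # abort immediately if the deploy script itself fails
./deploy.sh
# The script finished, but the deployment is not done yet:
# run verification and roll back on failure.
if ! ./smoke-test.sh; then
  echo "Verification failed: rolling back"
  ./rollback.sh
  exit 1
fi
echo "Deployment verified and complete"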
Different Types of Changes Need Different Checks
Not all deployments are the same. An application update, a database migration, and an infrastructure change each have different risks and different signals to check.
For an application update, the main risks are around request handling, response correctness, and integration with existing services. Smoke tests and error rate checks are usually sufficient.
For a database migration, the risks are different. You need to check that the migration ran correctly, that data integrity is preserved, and that query performance has not degraded. Signal checks should include database connection pool usage, query latency, and replication lag if applicable.
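Parts of that check can be scripted as well. The sketch below assumes PostgreSQL and a schema_migrations table, which is a common convention rather than a given, and the probe query targets a hypothetical users table; swap in whatever your migration tool and schema actually use:
#!/bin/bash
# check-migration.sh - Verify the migration landed and a hot query still responds quickly
DB_URL="postgres://app:secret@db.internal:5432/app"
EXPECTED_VERSION="20240101120000"
applied=$(psql "$DB_URL" -tAc "SELECT version FROM schema_migrations ORDER BY version DESC LIMIT 1")
if [ "$applied" != "$EXPECTED_VERSION" ]; then
  echo "Migration check failed: latest applied version is '$applied', expected '$EXPECTED_VERSION'"
  exit 1
fi
# Rough performance probe: time a known-hot query against a fixed budget
start=$(date +%s%N)
psql "$DB_URL" -tAc "SELECT count(*) FROM users WHERE active" > /dev/null
elapsed_ms=$(( ($(date +%s%N) - start) / 1000000 ))
echo "Migration $applied applied; probe query took ${elapsed_ms}ms"
[ "$elapsed_ms" -lt 500 ] || { echo "Probe query exceeded the 500ms budget"; exit 1; }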
For an infrastructure change, the risks are around connectivity, resource availability, and configuration correctness. Signal checks should include network latency, certificate validity, and service discovery status.
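Some of these infrastructure signals are one command away. Certificate validity, for example, can be checked with openssl; the host name here is a placeholder:
#!/bin/bash
# check-cert.sh - Fail if the TLS certificate expires within 30 days
HOST="your-app.example.com"
# -checkend takes seconds: 30 days = 2592000
if echo | openssl s_client -connect "$HOST:443" -servername "$HOST" 2>/dev/null \
    | openssl x509 -noout -checkend 2592000 > /dev/null; then
  echo "Certificate for $HOST is valid for at least 30 more days"
else
  echo "Certificate check failed: $HOST expires within 30 days (or the handshake failed)"
  exit 1
fi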
The principle is the same: identify what could go wrong for this specific type of change, and check those things before calling the deployment done.
A Practical Post-Deployment Checklist
If you want something concrete to start with, here is a minimal checklist that works for most web applications:
- Smoke test passes: main page, critical API endpoints, database connection
- Error rate is not higher than before deployment
- Latency is within normal range
- CPU and memory usage are stable
- No unusual errors or warnings in the application logs
- If database migration was involved: migration status is successful, query performance is normal
This checklist is not exhaustive, but it covers the basics. You can expand it as you learn what signals matter most for your specific system.
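Once the individual checks are scripts, the checklist itself can become executable. A sketch that runs each check in order and stops at the first failure, assuming the scripts from the earlier sections sit alongside it:
#!/bin/bash
# post-deploy-checklist.sh - Run every verification check in order, stop on first failure
CHECKS=(
  ./smoke-test.sh
  ./check-error-rate.sh
  ./check-cert.sh
)
for check in "${CHECKS[@]}"; do
  echo "==> $check"
  if ! "$check"; then
    echo "Post-deployment verification failed at $check"
    exit 1
  fi
done
echo "All post-deployment checks passed. The deployment is done."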
The Takeaway
A green pipeline does not mean a successful deployment. The pipeline handles the mechanics of delivery. Verification handles the reality of whether the new version actually works. Do not treat deployment as finished until you have checked that the new version is running normally in production. That single habit will catch problems before your users do.