How to Check If Your New Version Actually Works
You just finished a deployment. The pipeline is green. The new version is live. Someone on the team opens a browser, loads the homepage, and says "Looks fine." Everyone breathes a sigh of relief and moves on.
That feeling is familiar, but it is also dangerous. One person checking one page is not a repeatable process. It does not tell you whether other features work. It does not tell you whether the API responds correctly. And it certainly does not tell you whether users will have a good experience.
If your deployment verification relies on someone's gut feeling, you are essentially using your users as your first detection system. By the time they report a problem, the damage has already spread.
The Problem With Manual Checks
Manual checks after deployment have three fundamental flaws.
First, they are inconsistent. The person who checks today might look at different things than the person who checks tomorrow. One person might check the login page. Another might check the search feature. You cannot compare results across deployments because the checks themselves keep changing.
Second, manual checks are shallow. A human can only check a handful of pages or endpoints in a reasonable time. Complex workflows that involve multiple steps are rarely tested manually after every deployment. The things that break most often are exactly those multi-step flows that nobody checks.
Third, manual checks are slow. By the time someone finishes checking a few pages, the deployment might already be serving thousands of users. If something is broken, those users are already affected.
This is where verification comes in as a structured process. Verification is the act of checking whether a newly deployed version actually runs correctly, not just from the server side but from the user's perspective. It answers one question: "Can people use this version without hitting obvious problems?"
Start With Smoke Tests
The simplest form of verification is the smoke test. The term comes from electronics: when you power on a new circuit board for the first time, you check whether any smoke comes out. No smoke means no component is burning, and you can proceed with deeper testing.
In deployment terms, a smoke test runs basic checks to confirm the new version is not immediately broken. These checks are simple and fast. They do not verify business logic. They only catch the most obvious failures.
A typical smoke test might include:

- Loading the homepage and confirming it returns HTTP 200
- Checking that the login page renders without errors
- Verifying that a critical API endpoint responds within a reasonable time
- Confirming that static assets like CSS and JavaScript files load correctly

Here is a concrete example of a smoke test script you can run after every deployment:

```shell
#!/bin/bash
# smoke-test.sh - Run basic smoke checks after deployment

BASE_URL="http://localhost:8080"
FAILED=0

check_endpoint() {
    local url="$1"
    local description="$2"
    local status
    status=$(curl -s -o /dev/null -w "%{http_code}" "$url")
    if [ "$status" -eq 200 ]; then
        echo "PASS: $description"
    else
        echo "FAIL: $description (HTTP $status)"
        FAILED=1
    fi
}

check_endpoint "$BASE_URL/" "Homepage returns 200"
check_endpoint "$BASE_URL/login" "Login page renders"
check_endpoint "$BASE_URL/api/health" "Health endpoint responds"
check_endpoint "$BASE_URL/static/css/main.css" "CSS asset loads"

if [ "$FAILED" -eq 1 ]; then
    echo "Smoke tests failed. Rollback recommended."
    exit 1
else
    echo "All smoke tests passed."
fi
```
The key requirement is consistency. You need to run the same checks every single time you deploy. If the checks change depending on who is running them, you lose the ability to compare results across deployments. A smoke test that passes today but failed last week tells you something useful. A smoke test that changes every time tells you nothing.
Smoke tests should complete in under a minute. If they take longer, they are not smoke tests anymore. They are something else. Keep them fast and shallow.
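One way to enforce that budget is to wrap the suite in a hard timeout, so a hanging check counts as a failure rather than a delay. This is a minimal sketch, assuming GNU coreutils `timeout` is available; the `smoke-test.sh` name matches the example script above:

```shell
#!/bin/bash
# run_with_budget SECONDS CMD [ARGS...] - fail when the command does not
# finish within the budget. GNU `timeout` exits with status 124 on timeout.
run_with_budget() {
    local budget="$1"; shift
    if timeout "$budget" "$@"; then
        echo "PASS: completed within ${budget}s"
    else
        local rc=$?
        if [ "$rc" -eq 124 ]; then
            echo "FAIL: exceeded ${budget}s budget"
        else
            echo "FAIL: exited with status $rc"
        fi
        return 1
    fi
}

# Example: give the smoke suite one minute, no more.
# run_with_budget 60 ./smoke-test.sh
```

If the suite regularly bumps against the limit, that is a sign some of its checks belong in the deeper verification stage, not the smoke stage.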
Go Deeper With Synthetic Transactions
Once smoke tests pass, you can move to a more thorough check: synthetic transactions. These are automated simulations of what real users do inside your application.
A synthetic transaction is not just checking whether a page loads. It walks through an actual user flow. For example:
- Open the homepage
- Click the login button
- Enter a test username and password
- Submit the login form
- Navigate to a specific feature
- Fill out a form with test data
- Submit the form
- Verify that the expected result appears on the next page
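A flow like this can be scripted with the same tools as the smoke test. The sketch below logs in with a cookie jar (so the session carries between steps, like a real browser) and then checks the resulting page for expected content. The `/login` and `/dashboard` paths, the form field names, and the test credentials are placeholders; substitute your application's real ones:

```shell
#!/bin/bash
# synthetic-login.sh - simulate a login flow and verify the result.
BASE_URL="${BASE_URL:-http://localhost:8080}"
COOKIE_JAR="$(mktemp)"

# Steps 1-4: open the login form, then submit test credentials.
login() {
    local user="$1" pass="$2"
    curl -s -c "$COOKIE_JAR" -o /dev/null "$BASE_URL/login"
    curl -s -b "$COOKIE_JAR" -c "$COOKIE_JAR" -o /dev/null -w "%{http_code}" \
        -d "username=$user" -d "password=$pass" "$BASE_URL/login"
}

# Steps 5-8: load the dashboard and check it shows the expected content,
# not just that it returns 200.
verify_dashboard() {
    local expected="$1"
    local body
    body=$(curl -s -b "$COOKIE_JAR" "$BASE_URL/dashboard")
    if echo "$body" | grep -q "$expected"; then
        echo "PASS: dashboard shows '$expected'"
    else
        echo "FAIL: dashboard missing '$expected'"
        return 1
    fi
}

# Example run (commented out; needs a reachable deployment):
# login "testuser" "testpass" && verify_dashboard "Welcome, testuser"
```

The important detail is the final `grep`: the check asserts on page content, not just on the HTTP status code.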
Synthetic transactions differ from smoke tests in two important ways.
First, they are longer and more realistic. A smoke test checks whether the door opens. A synthetic transaction checks whether you can walk in, sit down, order food, pay, and leave with a receipt.
Second, synthetic transactions verify data correctness, not just page availability. A smoke test confirms the login page loads. A synthetic transaction confirms that after logging in, the user sees their dashboard with the correct data. That is a much stronger signal.
Synthetic transactions should run immediately after the deployment completes. They are not meant to replace long-term monitoring. They are meant to give you a fast answer: "Is this version healthy enough to keep serving users, or should we roll back?"
Run Both Immediately After Deployment
Smoke tests and synthetic transactions work best as a two-step verification pipeline.
Step one: run smoke tests. If they fail, stop. Something is fundamentally broken. Roll back or fix immediately. Do not proceed to deeper checks until the basics work.
Step two: run synthetic transactions. If they fail, you have a more nuanced problem. The application is running, but specific user flows are broken. You need to decide whether to roll back or hotfix, depending on the severity.
Both checks should complete within a few minutes after deployment. The goal is to know whether the new version is safe before most users encounter it.
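The two steps can be wired together in a small orchestrator. In this sketch, each step is any command that exits non-zero on failure; the two script names in the example are the hypothetical ones used earlier in this article:

```shell
#!/bin/bash
# verify-deployment.sh - two-step verification pipeline.
# Step one gates step two: no point testing user flows on a broken build.
verify_deployment() {
    local smoke_cmd="$1" synthetic_cmd="$2"

    if ! $smoke_cmd; then
        echo "Smoke tests failed. Roll back or fix before anything else."
        return 1
    fi

    if ! $synthetic_cmd; then
        echo "Synthetic transactions failed. Decide: rollback or hotfix."
        return 2
    fi

    echo "Verification passed. Version looks safe to keep serving."
}

# Example: verify_deployment ./smoke-test.sh ./synthetic-login.sh
```

The distinct return codes let the surrounding pipeline react differently: an automatic rollback for a smoke failure, a page to a human for a synthetic failure.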
A Practical Verification Checklist
Here is a simple checklist you can adapt for your own deployments:
- Smoke test: homepage returns 200
- Smoke test: login page renders without errors
- Smoke test: critical API endpoint responds in under 2 seconds
- Smoke test: static assets load correctly
- Synthetic transaction: user can log in with test credentials
- Synthetic transaction: user can complete a primary workflow (e.g., create an order, submit a form, view a report)
- Synthetic transaction: data created during the test is visible in the expected location
This checklist is not exhaustive. You will need to adjust it based on your application's specific flows. But the structure is universal: start fast and shallow, then go deeper.
What This Looks Like in Practice
Imagine you deploy a new version of an e-commerce application. The smoke test runs and confirms the homepage loads, the search API responds, and the product images appear. Good.
Then the synthetic transaction runs. It simulates a user searching for a product, adding it to the cart, proceeding to checkout, and completing payment. The transaction fails at the payment step. The test reveals that the new version broke the payment integration.
Without the synthetic transaction, you might not discover this until users start complaining. With it, you know within two minutes of deployment. You can roll back before more than a handful of users hit the broken flow.
That is the difference between structured verification and hoping for the best.
Not Everything Can Be Verified the Same Way
Smoke tests and synthetic transactions work well for applications. But they do not work the same way for databases or infrastructure. A database migration cannot be verified by loading a page. An infrastructure change cannot be verified by simulating a user login.
Different types of deployments need different verification signals. The principle stays the same: check early, check consistently, and check from the user's perspective. But the specific checks will look different depending on what you are deploying.
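For a database migration, for example, the signal might be a query rather than a page load. This sketch assumes a `schema_version` table (a placeholder; use whatever your migration tool maintains) and takes the query command as a parameter so it works with any database client:

```shell
#!/bin/bash
# check_migration QUERY_CMD EXPECTED - verify a migration actually landed.
# QUERY_CMD is any command or function that prints the current schema
# version, e.g. a wrapper around your database client querying the
# (hypothetical) schema_version table.
check_migration() {
    local query_cmd="$1" expected="$2"
    local actual
    actual=$($query_cmd)
    if [ "$actual" = "$expected" ]; then
        echo "PASS: schema at version $expected"
    else
        echo "FAIL: expected version $expected, got '$actual'"
        return 1
    fi
}
```

The shape is the same as the application checks: a fast, repeatable pass/fail signal run immediately after the change, just measured against the database instead of the browser.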
The Concrete Takeaway
Stop relying on one person opening a browser after deployment. Build a two-step verification pipeline: smoke tests for basic sanity, then synthetic transactions for realistic user flows. Run both immediately after every deployment. If either step fails, you know before your users do.
Your users should never be the first ones to discover that your deployment broke something. Structured verification is how you keep it that way.