When Your Database Schema Needs Version Control Too
Imagine this: your team has a solid CI/CD pipeline for the application code. Every pull request triggers automated tests, builds a container image, and deploys to staging. Then comes the production deployment. The pipeline runs, the application starts, and immediately crashes with a column-not-found error. Someone forgot to run the database migration that adds the phone_number column. The deployment fails, users see errors, and the team scrambles to figure out what went wrong.
This scenario plays out in teams every day. The application code is versioned, tested, and deployed through a pipeline. But the database schema changes are treated as an afterthought, something someone runs manually before or after deployment. The disconnect between code deployment and schema changes creates a gap where errors slip through.
The Problem: How Does the Pipeline Know What Has Run?
When you have a directory of migration scripts like V001_create_users.sql, V002_add_phone.sql, and V003_add_index.sql, the pipeline needs to know which ones have already been applied to the database. You cannot run all files from the beginning every time you deploy. The production database already has real data. Running V001 again would either fail because the table already exists or, worse, drop and recreate tables, destroying customer data.
Without a tracking mechanism, teams resort to manual checks. Someone logs into the database, runs \dt or SHOW TABLES, and tries to remember what was deployed last week. Or they rely on a shared spreadsheet that nobody updates. Or they just run the migration and hope for the best.
None of these approaches scale. They introduce human error, slow down deployments, and create fear around every database change.
The Solution: A Migration Table in the Database
The answer is surprisingly simple: let the database track its own migration history. Create a special table, commonly named schema_migrations or migration_history, that records every migration script that has been executed.
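As a sketch, such a table needs only a couple of columns. The shape below is illustrative (real tools like Flyway or Liquibase use their own column sets), using SQLite for brevity:

```python
import sqlite3

# Minimal shape of a migration-tracking table (names are illustrative;
# each migration tool defines its own variant of this table).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS schema_migrations (
        version     TEXT PRIMARY KEY,   -- e.g. 'V001_create_users.sql'
        applied_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
""")

# After V001 runs successfully, the tool records it:
conn.execute("INSERT INTO schema_migrations (version) VALUES (?)",
             ("V001_create_users.sql",))
conn.commit()

print([row[0] for row in conn.execute("SELECT version FROM schema_migrations")])
# prints ['V001_create_users.sql']
```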
Here is how it works in practice:
- The first time a migration tool runs against an empty database, it creates the migration table.
- After each migration script executes successfully, the tool inserts a row with the script name and execution timestamp.
- On subsequent deployments, the pipeline reads the migration table, compares it against the list of available migration files, and runs only the scripts that are not yet recorded.
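The three steps above can be sketched in a few lines of Python against SQLite. The function name, file layout, and table name are all assumptions for illustration, not any particular tool's API:

```python
import sqlite3

def run_migrations(conn, migrations):
    """Apply only the migrations not yet recorded in schema_migrations.

    `migrations` is an ordered list of (name, sql) pairs standing in for
    the V001_..., V002_... files on disk (an illustrative sketch).
    """
    # Step 1: bootstrap the tracking table on an empty database.
    conn.execute("""CREATE TABLE IF NOT EXISTS schema_migrations (
                        version    TEXT PRIMARY KEY,
                        applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)""")
    # Step 3: read what has already run and diff against the files on disk.
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    ran = []
    for name, sql in migrations:
        if name in applied:
            continue  # already recorded: never run it twice
        conn.execute(sql)
        # Step 2: record the script only after it executes successfully.
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (name,))
        conn.commit()
        ran.append(name)
    return ran

conn = sqlite3.connect(":memory:")
files = [("V001_create_users.sql", "CREATE TABLE users (id INTEGER PRIMARY KEY)"),
         ("V002_add_phone.sql",    "ALTER TABLE users ADD COLUMN phone_number TEXT")]
print(run_migrations(conn, files))  # first deploy: runs both scripts
print(run_migrations(conn, files))  # second deploy: nothing pending
```

Running it twice is the point: the second call finds both scripts already recorded and does nothing, which is exactly the safety the pipeline needs.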
For example, after V001_create_users.sql runs, the migration table contains one row: V001_create_users.sql. When the next deployment includes V002_add_phone.sql, the pipeline checks the table, sees that V002 is missing, and runs it. After success, it adds a new row. The database itself becomes the single source of truth for what version of the schema is currently running.
Why This Matters for Your Pipeline
This mechanism is sometimes called version locking: the database holds the authoritative record of its own state. There is no need for a separate configuration file, an environment variable that might get out of sync, or a manual checklist that someone forgets to update.
For a CI/CD pipeline, this is critical. The pipeline can now make an objective decision: "Based on what the database tells me, I need to run these three migration files." No guessing, no manual checks, no fear of running the same migration twice.
The flow, step by step: the pipeline queries the migration table, diffs the recorded entries against the migration files on disk, executes each pending script in order, and records every success back into the table.
Different migration tools implement this slightly differently. Some record a checksum of each migration file to detect if someone modified a script that already ran. Others use sequential version numbers instead of file names. Some tools store the migration history in a separate schema or database. But the core principle remains the same: the database tracks its own history, and the pipeline reads that history to determine the next steps.
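A checksum check of that kind can be sketched with hashlib. The `checksum` column and the function names are assumptions for illustration, not any specific tool's schema:

```python
import hashlib
import sqlite3

def checksum(sql: str) -> str:
    # A stable fingerprint of a migration script's content.
    return hashlib.sha256(sql.encode("utf-8")).hexdigest()

def modified_after_apply(conn, migrations):
    """Return already-applied scripts whose content has changed on disk.

    Assumes schema_migrations stores a checksum alongside each version
    name (illustrative; real tools differ in the details).
    """
    recorded = dict(conn.execute("SELECT version, checksum FROM schema_migrations"))
    return [name for name, sql in migrations
            if name in recorded and recorded[name] != checksum(sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schema_migrations (version TEXT PRIMARY KEY, checksum TEXT)")
original = "CREATE TABLE users (id INTEGER PRIMARY KEY)"
conn.execute("INSERT INTO schema_migrations VALUES (?, ?)",
             ("V001_create_users.sql", checksum(original)))

# Someone edits V001 after it already ran in production:
edited = original + " -- edited after the fact"
print(modified_after_apply(conn, [("V001_create_users.sql", edited)]))
# prints ['V001_create_users.sql']
```

A tool that detects this typically refuses to proceed, because the database state no longer matches what the files claim produced it.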
Bootstrapping: The First Migration
There is a chicken-and-egg problem here. The migration table itself needs to exist before any other migration can be recorded. How do you create it?
Most migration tools handle this automatically. When you run the tool against an empty database for the first time, it creates the migration table as part of its bootstrap process. Some tools even record this bootstrap action as the first entry in the migration history.
If you are adopting migration scripts for an existing database that already has tables and data, you need a different approach. This is where a baseline migration comes in. Instead of trying to recreate every historical change, you create a single migration script that captures the current state of the database schema. You mark this as the baseline, and the migration tool records it as already applied. From that point forward, you add new migration scripts only for new changes.
A baseline migration is a pragmatic solution. It acknowledges that you cannot rewrite history, but you can start tracking changes from today. The alternative would be to reverse-engineer every schema change ever made, which is impractical for most teams.
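Marking a baseline as applied without executing it can be sketched like this; the function name and table layout are assumptions for illustration:

```python
import sqlite3

def mark_baseline(conn, name="V000_baseline.sql"):
    """Record a baseline migration as already applied, without running it.

    Used when adopting migrations on an existing database: the baseline
    script captures today's schema, but its DDL must never execute here
    because the tables already exist. (Illustrative sketch.)
    """
    conn.execute("""CREATE TABLE IF NOT EXISTS schema_migrations (
                        version    TEXT PRIMARY KEY,
                        applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)""")
    conn.execute("INSERT OR IGNORE INTO schema_migrations (version) VALUES (?)",
                 (name,))
    conn.commit()

# An existing production database that already has a users table:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, phone_number TEXT)")

mark_baseline(conn)
print([row[0] for row in conn.execute("SELECT version FROM schema_migrations")])
# prints ['V000_baseline.sql']
```

From here, V001 and later scripts contain only new changes, and the tool treats the baseline row as proof that everything before it is already in place.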
What the Migration Table Does Not Solve
The migration table solves one specific problem: knowing which scripts have already run. It gives the pipeline a reliable way to determine the current schema version and apply pending changes.
But it does not solve everything. The migration table works well for additive changes like adding tables, columns, or indexes. These changes do not break existing application code that was written for the old schema. The problems start when you need to remove or rename columns, change data types, or restructure tables. These destructive or transformative changes can break running applications, cause downtime, or corrupt data.
The migration table tells you what has run, but it does not tell you whether the running application is compatible with the new schema. That requires a different set of practices around backward-compatible migrations, phased rollouts, and careful coordination between code deployment and schema changes.
Practical Checklist for Migration Tracking
- Choose a migration tool that supports automatic tracking via a migration table.
- Ensure the migration table is created during the first deployment, not manually.
- Never modify a migration script that has already been applied to production.
- If adopting migrations for an existing database, create a baseline migration first.
- Include migration execution as a step in your deployment pipeline, not a manual process.
- Test migrations in a staging environment that mirrors production data volume.
The Concrete Takeaway
Your database schema is as important as your application code, and it deserves the same level of version control and automation. A migration table gives your pipeline a reliable, database-backed way to know what has been applied and what still needs to run. Without it, you are guessing. With it, you have a single source of truth that eliminates the most common source of deployment failures related to database changes. Start tracking your schema versions today, and your future self will thank you during the next deployment.