Building Docker Images in CI/CD Pipelines
You have a Dockerfile that works perfectly on your laptop. You run docker build, the image builds, and your application runs. But when you push that same Dockerfile into a pipeline, things start to behave differently.
The build that took two minutes on your machine now takes fifteen. The image that worked yesterday fails today for no obvious reason. And when you need to rebuild an old version for a rollback, the resulting image behaves differently than the original.
These problems are not random. They come from the gap between building images on a developer machine and building them in an automated pipeline. Understanding that gap is the first step to fixing it.
What Changes When You Move to a Pipeline
When you build an image in a pipeline, three things change fundamentally.
First, the build must run on someone else's machine. That machine might have different resources, different network conditions, and different file systems. Your Dockerfile needs to work regardless of where it executes.
Second, the pipeline must rebuild the image every time code changes. This is not optional. If you skip rebuilds, your deployment will run stale code. But rebuilding every time means every commit triggers the full build process, and that process needs to be fast enough to keep development moving.
Third, the build must be reproducible. The same source code must produce the same image, regardless of when or where the build runs. Without reproducibility, you cannot trust that rolling back to an old commit will restore the exact same application behavior.
The diagram below shows the typical stages of a Docker build pipeline, from source code to registry push.
Control Your Build Context
The build context is the folder you send to the Docker daemon when you run docker build. On your laptop, this is usually your project folder. In a pipeline, the same thing happens, but the consequences of a large context are worse.
Every file in the build context gets sent to the Docker daemon. If your repository contains node_modules, Python virtual environments, or compiled binaries, those files get transferred even though they are not needed for the build. This slows down every single pipeline run.
The fix is a .dockerignore file. It works like .gitignore but for Docker builds. List everything that is not needed for the image: dependency folders, cache directories, .git history, test fixtures, and documentation. A lean build context means faster builds and less network traffic.
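As an illustration, a .dockerignore for a Node.js project might look like this (the entries are assumptions; adjust them to your own repository layout):

```
# Dependency and cache folders
node_modules
.npm

# Version control history
.git

# Not needed inside the image
docs
*.md
test
```

Patterns follow the same glob syntax as .gitignore, and the file lives at the root of the build context.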
Make Cache Work for You
Docker builds images in layers. Each instruction in your Dockerfile produces a layer (strictly, only instructions like RUN, COPY, and ADD create filesystem layers; the rest record metadata). When you rebuild, Docker checks whether each instruction and its inputs have changed. If nothing changed, Docker reuses the cached result from the previous build.
This caching mechanism is your biggest ally for fast builds. But it only works if the cache survives between pipeline runs.
In local development, the cache lives on your machine. In a pipeline, the cache disappears when the runner finishes unless you explicitly save it. Some CI systems provide built-in Docker layer caching. If yours does not, you need to configure it manually or accept that every build starts from scratch.
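One common way to persist the cache when your CI system has no built-in layer caching is to store it in the registry itself using BuildKit. A sketch, assuming a buildx builder is available and registry.example.com/myapp is writable (the :buildcache tag name is an arbitrary choice):

```
# Push the image and export the layer cache to the registry;
# the next run imports that cache via --cache-from.
docker buildx build \
  --cache-from type=registry,ref=registry.example.com/myapp:buildcache \
  --cache-to type=registry,ref=registry.example.com/myapp:buildcache,mode=max \
  --tag registry.example.com/myapp:${COMMIT_SHA} \
  --push .
```

The mode=max option exports cache for all layers, including intermediate stages, at the cost of a larger cache artifact.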
Even with caching working, the order of instructions in your Dockerfile determines how much cache you actually use. The golden rule is: copy things that change less frequently first.
For a Node.js application, this means copying package.json and package-lock.json before the rest of the source code. Run npm install right after copying these files. Then copy the application code. With this order, dependency installation only reruns when your dependencies actually change, not when you modify a single line of application code.
The same principle applies to any language. Python projects should copy requirements.txt or pyproject.toml first. Go projects should copy go.mod and go.sum first. The pattern is universal: separate stable dependencies from changing application code.
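For the Node.js case described above, the ordering looks like this (a sketch; the base image, paths, and entry point are assumptions to adapt to your project):

```
FROM node:20-alpine

WORKDIR /app

# Copy only the dependency manifests first, so this layer's cache
# survives changes to application code.
COPY package.json package-lock.json ./
RUN npm ci

# Application code changes often, so it comes last.
COPY . .

CMD ["node", "server.js"]
```

With this layout, editing a source file invalidates only the final COPY layer; the npm ci layer is served from cache.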
Use Build Arguments for Flexibility
Your Dockerfile should not hardcode values that change between environments. The version of a base image, the environment name, or an access token for a private registry should come from outside the Dockerfile.
Docker provides ARG for this purpose. You define a placeholder in your Dockerfile, and the pipeline fills it in during the build.
```
ARG BASE_IMAGE_VERSION=20.04
FROM ubuntu:${BASE_IMAGE_VERSION}
```
In your pipeline, you pass the actual value:
```
docker build --build-arg BASE_IMAGE_VERSION=22.04 .
```
This keeps your Dockerfile generic and reusable. One Dockerfile can serve development, staging, and production builds without duplication.
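One subtlety worth knowing: an ARG declared before FROM is in scope only for the FROM line itself. To use the same value inside the build stage, redeclare it after FROM, where it inherits the value passed at build time:

```
ARG BASE_IMAGE_VERSION=20.04
FROM ubuntu:${BASE_IMAGE_VERSION}

# Redeclare without a default to bring the value into this stage's scope.
ARG BASE_IMAGE_VERSION
LABEL org.example.base-version=${BASE_IMAGE_VERSION}
```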
Ensure Reproducible Builds
Reproducibility means that building the same source code at different times produces the same image. Without this, you cannot trust your rollback strategy, your audit trail, or your security scanning.
Three things commonly break reproducibility.
First, using latest tags for base images. The latest tag changes over time. Today's latest is not tomorrow's latest. Pin your base image to a specific version tag like ubuntu:22.04 or node:20-alpine.
Second, not locking dependency versions. Your package.json might specify a version range like ^4.0.0. That range resolves to different versions over time. Use lock files like package-lock.json or yarn.lock to freeze exact versions.
Third, embedding build timestamps or build metadata into the image. If your build process writes the current date into a file inside the image, that file will differ between builds even when the source code is identical. Avoid this unless you have a specific operational reason for it.
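For base images specifically, there are degrees of pinning. The lines below contrast the options (shown as alternatives, not one Dockerfile; the digest is a placeholder to copy from your registry or from docker images --digests):

```
# Not reproducible: resolves to a different image over time
FROM node:latest

# Better: pinned to a specific version tag
FROM node:20-alpine

# Strictest: pinned to an immutable content digest
FROM node:20-alpine@sha256:...
```

A tag can be re-pushed by the image publisher; a digest cannot, so digest pinning gives the strongest reproducibility guarantee.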
Store the Image in a Registry
Once the pipeline builds the image, it needs a place to live where servers and Kubernetes clusters can pull it. That place is a container registry.
Your pipeline should tag the image with meaningful identifiers. A common pattern is to use the commit SHA as the primary tag, with additional tags for branches or semantic versions.
```
docker tag myapp:${COMMIT_SHA} registry.example.com/myapp:${COMMIT_SHA}
docker push registry.example.com/myapp:${COMMIT_SHA}
```
This gives you a permanent reference to every image ever built. You can always go back to any commit and pull the exact image that was built from it.
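In the pipeline, the commit SHA typically comes from the CI environment or from git directly. A sketch of the build-tag-push step (the registry name and variable names are assumptions):

```
# Derive the primary tag from the commit being built.
COMMIT_SHA=$(git rev-parse --short HEAD)

docker build -t registry.example.com/myapp:${COMMIT_SHA} .
docker push registry.example.com/myapp:${COMMIT_SHA}

# Optional extra tag for a human-friendly reference to the branch head.
docker tag registry.example.com/myapp:${COMMIT_SHA} registry.example.com/myapp:main
docker push registry.example.com/myapp:main
```

The SHA tag is immutable and permanent; branch tags like :main move with each build and should never be used for rollbacks.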
Practical Checklist
Before you push your Docker build into a pipeline, check these items:
- .dockerignore excludes dependency folders, cache, and .git
- Dockerfile copies dependency files before application code
- Base image tags are pinned to specific versions, not latest
- Dependency versions are locked with lock files
- Build arguments are used for environment-specific values
- Pipeline saves Docker layer cache between runs
- Images are tagged with commit SHA and pushed to a registry
What This Means for Your Pipeline
Building images in a pipeline is not just about running docker build on a different machine. It is about designing your Dockerfile and your pipeline to work together. A well-structured Dockerfile that respects layer ordering and build context will build faster. A pipeline that preserves cache and tags images properly will give you reliable, reproducible artifacts.
The image you build today should be the same image you can rebuild six months later from the same commit. That consistency is what makes deployments predictable and rollbacks safe.