When Your Container Image Is Ready, Where Does It Actually Run?

You have built the image. You have scanned it for vulnerabilities. You have pushed it to a registry. Now comes the moment that separates a working pipeline from a real deployment: actually running that container somewhere that users can reach.

The way you run that container depends entirely on where it is going. A single server and a Kubernetes cluster look similar on paper -- both run containers -- but the operational experience is completely different. The choice affects how you update, how you recover from failures, and how much manual coordination your team needs to do every time a new version goes out.

Running Containers on a Single Server

A single server deployment looks straightforward. You SSH into the machine, run docker run with the image tag you just promoted, and the application starts. In a demo environment, that is the end of the story.

In practice, a single server is rarely running just one container. You usually have an application container, a database container, a cache, maybe a queue worker. These containers need to start in the right order, talk to each other over the right network, and handle the case where one of them crashes. This is where docker-compose becomes useful. You define all the services, their dependencies, their ports, and their restart policies in a single file. One command brings everything up in the correct sequence.
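Here is a minimal docker-compose sketch of that setup. The service names, images, ports, and the placeholder password are illustrative, not prescriptive:

services:
  app:
    image: my-registry/my-app:v1.2.3
    ports:
      - "8080:8080"
    depends_on:                 # compose starts db and cache before the app
      - db
      - cache
    restart: unless-stopped     # restart on crash, but not after an explicit stop
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: change-me   # placeholder; inject a real secret in practice
    volumes:
      - db-data:/var/lib/postgresql/data
    restart: unless-stopped
  cache:
    image: redis:7
    restart: unless-stopped
volumes:
  db-data:

One caveat: depends_on controls start order, not readiness. If the application needs the database to be accepting connections before it boots, it still has to retry on its own, or you can add a healthcheck to the db service and use depends_on with condition: service_healthy.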

The real challenge shows up when you need to update the application version. On a single server, you stop the old container and start the new one. During that window, the application cannot serve requests. If the application is used by real people, that downtime matters.

The simplest way to reduce downtime is to run two containers side by side. Keep the old version running while the new version starts up. Once the new container is ready to accept connections, switch the traffic to it, then stop the old container. This is a blue-green switch in its simplest form, and it is the same motion a rolling update performs when there is only one instance to replace. You can do it manually with a script, or you can use a reverse proxy like Nginx or Traefik to handle the traffic switch.
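Here is one way that pattern can look with docker-compose and Nginx. The service names, tags, and config path are illustrative; the actual switch happens by editing the proxy config to point at the new service:

services:
  app-old:
    image: my-registry/my-app:v1.2.2   # current version, keeps serving during the switch
  app-new:
    image: my-registry/my-app:v1.2.3   # new version, started alongside the old one
  proxy:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      # the proxied upstream in this file points at app-old or app-new
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro

Once app-new is confirmed healthy, point the upstream at it and reload the proxy (nginx -s reload applies the change without dropping established connections), then remove app-old.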

But even with a rolling update pattern, a single server has a hard limit. If the server itself goes down, the application goes down. If you need to apply a security patch to the host operating system, you need to schedule downtime. For internal tools used by a small team, that trade-off is often acceptable. For customer-facing applications, it is usually not.

Running Containers on Kubernetes

Kubernetes treats the problems of single-server deployments as solved problems and builds on top of them. You do not manage containers directly. You define a Deployment object that describes the desired state: which image to run, how many replicas, what health checks to use, and how to perform updates.

When you update the image tag in a Deployment, Kubernetes does not stop everything and restart. It creates new pods with the new image, waits for them to pass their health checks, then gradually terminates the old pods. During the entire process, there is always at least one pod serving traffic. Users do not see a service interruption.

A pod is the smallest unit in Kubernetes. It can run one or more containers, but the key idea is that a pod is ephemeral. Kubernetes creates pods, destroys them, and moves them to different nodes as needed. You never think about which specific server a pod is running on. The cluster handles that.

The difference between a single server and Kubernetes is not just about scaling to more traffic. It is about who owns the coordination. On a single server, you decide the startup order, the restart policy, and the failure handling. You write scripts or use docker-compose to enforce those decisions. On Kubernetes, the orchestrator owns that coordination. It checks pod health periodically, restarts failed pods, and redistributes pods to healthy nodes when a node goes down.
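On the Kubernetes side, even the health-and-restart logic is declared rather than scripted. As a minimal sketch, assuming the same hypothetical /health endpoint used later in this section, a livenessProbe fragment on a container tells the kubelet to restart that container when its checks keep failing:

livenessProbe:
  httpGet:
    path: /health        # hypothetical health endpoint, same as the readiness check below
    port: 8080
  periodSeconds: 10      # check every ten seconds
  failureThreshold: 3    # restart the container after three consecutive failures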

This shift in ownership changes how your team operates. You stop writing scripts that handle container lifecycle. You start writing Deployment manifests that describe the desired state, and you let the cluster figure out how to reach that state.

Here is what a minimal Deployment manifest looks like in practice:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3                    # three identical pods share the traffic
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1          # at most one pod down during an update
      maxSurge: 1                # at most one extra pod above the replica count
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: my-registry/my-app:v1.2.3
        ports:
        - containerPort: 8080
        readinessProbe:          # gate traffic until the app reports healthy
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5 # wait five seconds before the first check
          periodSeconds: 10      # then check every ten seconds

This manifest tells Kubernetes to run three replicas, update them one at a time, and only route traffic to a pod after its /health endpoint responds successfully.
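One detail the manifest does not show: a Deployment by itself does not expose the pods. Traffic reaches them through a Service that selects the same labels. A minimal sketch, matching the labels above:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app          # matches the pod labels from the Deployment
  ports:
  - port: 80             # port the Service exposes inside the cluster
    targetPort: 8080     # containerPort the app listens on

Readiness ties the two together: a pod that fails its readinessProbe is removed from the Service's endpoints, which is how "only route traffic to a ready pod" is actually enforced.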

How to Choose Between the Two

The choice between a single server and Kubernetes is not a technical purity test. It is a decision based on operational requirements.

The following flowchart can help you decide which path fits your situation:

flowchart TD
    A[Start] --> B{High traffic or zero-downtime needed?}
    B -- Yes --> C{Multiple services to manage?}
    B -- No --> D{Small team, simple app?}
    C -- Yes --> E[Use Kubernetes]
    C -- No --> F[Consider K3s or MicroK8s]
    D -- Yes --> G[Use single server with docker-compose]
    D -- No --> H{Operational maturity for cluster?}
    H -- Yes --> E
    H -- No --> G

Use a single server with docker-compose when:

  • The application is used by a small internal team.
  • Downtime for updates or maintenance is acceptable.
  • You have one or two services to manage.
  • You do not need to scale horizontally.
  • Your team size is small and you want minimal infrastructure complexity.

Use Kubernetes when:

  • The application must be available even during updates.
  • You need to scale services independently based on traffic.
  • You run multiple services that need to be deployed and updated separately.
  • You want automated recovery from node failures.
  • Your team has the operational maturity to manage a cluster.

There is a middle ground. Some teams run a small Kubernetes cluster with a single node using tools like K3s or MicroK8s. This gives you the rolling update and health check features of Kubernetes without the full complexity of a multi-node cluster. It is worth considering if you want the deployment patterns but do not yet need the scale.

The One Rule That Never Changes

Regardless of where you deploy, one rule stays the same: the image that runs in production must be exactly the same image that passed all tests and scans in the pipeline.

Never rebuild the image on the server. Never pull a different tag because "it should be the same." Never let anyone SSH into the server and run a container with a locally modified image. If the image in the registry is not the image that is running, you have lost the ability to reproduce, audit, and roll back.

This is why image tagging and promotion matter. When you promote an image from staging to production, you are not rebuilding anything. You are simply changing which environment is allowed to pull that specific tag. The bytes are identical.
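If you want to make that guarantee mechanical rather than procedural, reference the image by digest instead of by tag in the production manifest. A tag can be moved to point at different bytes; a digest cannot. A sketch, with a placeholder digest:

containers:
- name: app
  # placeholder digest; use the sha256 value recorded by your pipeline
  image: my-registry/my-app@sha256:<digest-from-the-pipeline>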

Practical Checklist for Container Deployment

Before you deploy a container to any environment, confirm these points:

  • The image tag in the deployment matches the tag that passed the pipeline.
  • The container has a health check endpoint that tells the orchestrator when it is ready.
  • The environment variables and secrets are set correctly for the target environment.
  • The update strategy is defined: rolling update for zero-downtime, recreate for simple cases (see the fragment after this list).
  • You have a way to see which version of the image is currently running.
  • You have a rollback plan: either a previous image tag or a previous deployment manifest.
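For the recreate case mentioned above, the strategy block is the only thing that changes. A minimal fragment, reusing the my-app Deployment from earlier:

spec:
  strategy:
    type: Recreate   # stop all old pods first, then start the new ones; accepts brief downtime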

What Comes Next

Running the container is only half the work. Once it is running, you need to know which version is actually serving traffic, whether it is healthy, and what to do when the new version has a problem. That is where image version tracking and rollback come in. Those are the topics for the next part of this discussion.

For now, the important thing is to pick the deployment target that matches your operational reality, and to make sure the image you run is the image you tested. Everything else follows from that.