32-5 · Chapter 32 · 6 min read

Rotating Secrets: Why, When, and How to Do It Without Breaking Your System

You have your secrets stored safely in a vault. Your pipeline injects them at deploy time. Everything looks solid. But there is a problem you might not have considered: the same secret, used for months or years, is a ticking time bomb.

Imagine a developer accidentally includes an API key in a screenshot during a company presentation. Or a log file from six months ago still contains a database password, and someone with unauthorized access finds it. The longer a secret lives, the higher the chance it leaks without you ever knowing. Rotation is your safety net: if a secret does leak, there is a good chance it has already been replaced by the time anyone tries to exploit it.

Why Rotate Secrets?

The most fundamental reason for rotation is reducing your window of vulnerability. If an API key is valid for a year and leaks in month three, an attacker has nine months of access. But if you rotate that key every week, the maximum damage window is seven days.
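The arithmetic behind that comparison can be written down directly. The sketch below is only an illustration; the function name `exposure_days` and the day counts are assumptions chosen to match the example above.

```shell
# Days an attacker can use a leaked secret: from the leak until the secret expires.
# Arguments: how long the secret stays valid (days) and the day on which it leaked.
exposure_days() {
  local lifetime_days="$1"
  local leak_day="$2"
  echo $(( lifetime_days - leak_day ))
}

# A one-year key leaked on day 90 (roughly month three):
exposure_days 365 90   # prints 275, about nine months

# A weekly key leaked on its very first day (the worst case):
exposure_days 7 0      # prints 7
```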

Rotation also matters for compliance. Standards like PCI DSS call for periodic rotation of certain credentials and keys, and SOC 2 audits commonly look for a documented rotation policy. But even without regulatory pressure, rotation is a practical defense mechanism. It limits blast radius, forces you to verify that your secret management pipeline actually works, and builds operational muscle memory for handling credential changes under calm conditions rather than during an incident.

When Should You Rotate?

There are three main scenarios that trigger rotation.

Scheduled rotation. This is the routine, predictable cycle: every 30, 60, or 90 days, depending on the sensitivity of the secret. A database password for a production system might rotate every 30 days. A less critical internal API key might rotate every 90 days. The schedule should match the risk profile of what the secret protects.
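A schedule like this boils down to comparing a secret's age against its interval. The following is a minimal sketch, assuming you track the last rotation time yourself as Unix epoch seconds; `is_rotation_due` is a hypothetical helper, not part of any vault tooling.

```shell
# Succeeds (exit 0) when a secret is due for rotation.
# Arguments: last rotation time (epoch seconds), interval in days,
# and the current time (epoch seconds). All three come from the caller.
is_rotation_due() {
  local last_rotated="$1"
  local interval_days="$2"
  local now="$3"
  local age_days="$(( (now - last_rotated) / 86400 ))"
  [ "$age_days" -ge "$interval_days" ]
}

# Hypothetical usage: a production database password on a 30-day schedule,
# last rotated 31 days ago.
now=$(date +%s)
last=$(( now - 31 * 86400 ))
if is_rotation_due "$last" 30 "$now"; then
  echo "rotation due"
fi
```

Passing the current time in as an argument keeps the check easy to test and avoids depending on any particular `date` implementation for date parsing.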

Incident-driven rotation. Something happened. Maybe you spotted suspicious access in audit logs. Maybe a secret appeared in a public GitHub repository. Maybe an employee's laptop was compromised. In these cases, you rotate immediately. Do not wait for the next scheduled window. Speed matters more than ceremony.

Personnel changes. When someone who had access to secrets leaves the team, changes roles, or is terminated, rotate any secrets they could have accessed. This is not about trust. It is about removing the possibility that credentials they memorized, saved locally, or wrote down somewhere are still valid after their access should have ended.
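Working out which secrets a departing person could have accessed is a lookup against whatever access inventory you keep. As a rough sketch, assume a plain-text inventory with one `<secret> <user>` pair per line; that format and the `secrets_for_user` helper are both illustrative assumptions.

```shell
# Prints every secret a given user could access, one per line, without duplicates.
# Reads a hypothetical inventory on stdin with lines of the form "<secret> <user>".
secrets_for_user() {
  awk -v user="$1" '$2 == user { print $1 }' | sort -u
}

# Hypothetical inventory; in practice this would be derived from access policies.
inventory='db/prod-password alice
api/payments-key alice
api/payments-key bob
db/staging-password bob'

# Everything alice could read, and therefore everything to rotate when she leaves.
printf '%s\n' "$inventory" | secrets_for_user alice
```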

How to Rotate Without Breaking Applications

Here is the hard part. If you rotate a database password and all your services suddenly lose connection, you have made things worse. The goal is to rotate without downtime. The most reliable strategy is dual secret rotation, also called rotation with a transition period.

Dual Secret Rotation

The idea is simple: your application accepts two valid secrets at the same time during the transition. Here is the step-by-step:

Generate a new secret in your vault or secret store. Do not delete the old one.
Update your application configuration so it knows both the old and new secrets are valid.
Deploy the updated configuration. All running instances now accept both secrets.
Wait until every instance has picked up the new configuration and is using the new secret.
Remove the old secret from the configuration and deploy again.

The following sequence diagram illustrates the dual secret rotation process across a vault, a configuration service, and multiple application instances:

sequenceDiagram
    participant Vault
    participant Config
    participant ServiceA
    participant ServiceB
    Note over Vault,ServiceB: Step 1: Generate new secret
    Vault->>Vault: Generate new secret (keep old)
    Note over Vault,ServiceB: Step 2: Deploy new secret to all services
    Vault->>Config: Provide old + new secret
    Config->>ServiceA: Update config (both secrets valid)
    Config->>ServiceB: Update config (both secrets valid)
    Note over Vault,ServiceB: Step 3: Verify all services use new secret
    ServiceA->>Vault: Connect using new secret
    ServiceB->>Vault: Connect using new secret
    Note over Vault,ServiceB: Step 4: Deactivate old secret
    Vault->>Config: Mark old secret as invalid
    Config->>ServiceA: Remove old secret from config
    Config->>ServiceB: Remove old secret from config
    Note over Vault,ServiceB: Step 5: Remove old secret from vault
    Vault->>Vault: Delete old secret

Here is how you might implement this with HashiCorp Vault and a JSON configuration file:

# Assumes a KV v2 mount (vault kv patch requires it) and that the live
# password is currently stored in the "password" field of secret/db-password.

# Step 1: Generate a new secret version (keep old)
vault kv put secret/db-password \
  old_password="$(vault kv get -field=password secret/db-password)" \
  new_password="$(openssl rand -base64 32)"

# Step 2: Update application config with both secrets
umask 077  # the config file holds credentials, so keep it owner-readable only
cat > /etc/myapp/config.json <<EOF
{
  "db": {
    "old_password": "$(vault kv get -field=old_password secret/db-password)",
    "new_password": "$(vault kv get -field=new_password secret/db-password)"
  }
}
EOF

# Step 3: Reload application to pick up new config
systemctl reload myapp

# Step 4: After all instances use the new secret, remove the old one
# (this blanks the field; earlier versions of the secret still hold the
# old value until those versions are destroyed)
vault kv patch secret/db-password old_password=""

# Step 5: Update config to only use the new secret
cat > /etc/myapp/config.json <<EOF
{
  "db": {
    "password": "$(vault kv get -field=new_password secret/db-password)"
  }
}
EOF
systemctl reload myapp

During this process, your application never loses connectivity. If an instance has not yet received the new configuration, it still works with the old secret. Once it picks up the update, it can use either one. After the old secret is removed, only the new one works.
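The step of waiting until every instance uses the new secret is safer as an explicit check than as a guess. The sketch below assumes each instance can report which secret version it is using, one `<instance> <version>` pair per line; how that report is collected (health endpoint, metrics, deploy tool) is left open, and `all_on_version` is a hypothetical helper.

```shell
# Succeeds only when every reported instance is on the expected secret version.
# Reads lines of the form "<instance> <version>" on stdin.
# An empty report counts as "not safe", since it proves nothing.
all_on_version() {
  awk -v want="$1" '
    $2 != want { print "still on old secret: " $1; stale = 1 }
    END { exit (stale || NR == 0) }
  '
}

# Hypothetical report from two instances, one of which has not switched yet.
report='service-a v2
service-b v1'

if printf '%s\n' "$report" | all_on_version v2; then
  echo "safe to remove the old secret"
else
  echo "keep the old secret for now"
fi
```

Treating missing data as a failure is a deliberate choice here: removing the old secret too early breaks instances, while removing it late only extends the transition.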

This approach works well for secrets that are consumed directly by applications: database passwords, API keys, service-to-service tokens. The key requirement is that your application code or middleware supports multiple valid credentials simultaneously. Most database drivers and HTTP clients take a single credential, so in practice this usually means a thin wrapper that tries the new secret first and falls back to the old one.
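One way an application could support two valid credentials is to try the new one and fall back to the old one. This is a minimal sketch of that logic only: `try_login` is a stand-in that simulates a backend, not a real client call, and `ACCEPTED_PASSWORD` exists purely to drive the simulation.

```shell
# Stand-in for a real connection attempt (a database client call or an
# authenticated HTTP request). Here it simulates a backend that accepts
# exactly one password, held in ACCEPTED_PASSWORD.
try_login() {
  [ "$1" = "$ACCEPTED_PASSWORD" ]
}

# Tries the new secret first, then the old one, and reports which one worked.
# It never prints the secret values themselves.
connect_with_fallback() {
  local new_password="$1"
  local old_password="$2"
  if try_login "$new_password"; then
    echo "connected with new secret"
  elif try_login "$old_password"; then
    echo "connected with old secret"
  else
    echo "both secrets rejected" >&2
    return 1
  fi
}

# Simulated transition: the backend has not switched to the new password yet.
ACCEPTED_PASSWORD='old-value'
connect_with_fallback 'new-value' 'old-value'   # prints "connected with old secret"
```

Reporting which secret succeeded is useful during the transition, because it tells you when the fallback path has stopped being exercised.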

Coordinating Rotation Across Multiple Services

When a single secret is shared across many services, rotation becomes more complex. Imagine a database password used by ten microservices. You cannot rotate it one service at a time without coordination. If service A switches to the new password while service B still uses the old one, and the database only accepts the new password, service B breaks.

One solution is to use a service mesh or sidecar proxy that manages database connections centrally. The sidecar handles authentication to the database. Your services connect to the sidecar, not directly to the database. When you rotate the database password, you only update the sidecar configuration. The services do not even know a rotation happened.

Another approach is to use a dynamic secret system, which we will cover shortly. But for static secrets shared across many consumers, a service mesh or a dedicated connection pooler is the most practical pattern.

What Else Matters in Rotation

Rotation is not just about changing a value. It is a process that needs supporting practices.

Audit logging. Every rotation should be recorded: who triggered it, when, which secret was rotated, and what the outcome was. This is essential for incident investigation and compliance audits.
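A rotation audit record can be as simple as one structured line per event. The sketch below is an assumption about shape, not a standard: the field names, the `log_rotation` helper, and the log location are all illustrative, and it assumes the values contain no double quotes since it does no JSON escaping.

```shell
# Appends one JSON line per rotation to an audit log.
# Records the secret's name or path only, never its value.
log_rotation() {
  local log_file="$1"
  local secret="$2"
  local operator="$3"
  local outcome="$4"
  printf '{"timestamp":"%s","secret":"%s","operator":"%s","outcome":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$secret" "$operator" "$outcome" >> "$log_file"
}

# Hypothetical usage, writing to a temporary file for demonstration.
audit_log=$(mktemp)
log_rotation "$audit_log" secret/db-password alice success
cat "$audit_log"
```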

Test in staging first. Never rotate a production secret without verifying the process in a staging environment. Staging should mirror production's secret consumption patterns. If the rotation breaks something, you want it to break in staging, not production.

Have a rollback plan. Sometimes a rotation causes unexpected problems. Maybe the new secret does not propagate correctly. Maybe a service fails to pick up the configuration change. Your rotation procedure should include a way to revert to the old secret quickly. This means keeping the old secret valid until you are confident the new one works everywhere.
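For file-based configuration, a rollback path can be as small as keeping the previous file next to the new one. This is a sketch under that assumption; `apply_config` and `rollback_config` are hypothetical helpers, and reloading the application after either step is not shown.

```shell
# Writes new config content, saving the previous version alongside it first.
apply_config() {
  local config="$1"
  local new_content="$2"
  if [ -f "$config" ]; then
    cp -p "$config" "$config.previous"
  fi
  printf '%s\n' "$new_content" > "$config"
}

# Restores the version saved by apply_config; fails if there is none.
rollback_config() {
  local config="$1"
  if [ ! -f "$config.previous" ]; then
    echo "no previous config to restore" >&2
    return 1
  fi
  mv "$config.previous" "$config"
}

# Hypothetical usage in a scratch directory.
workdir=$(mktemp -d)
apply_config "$workdir/config.json" '{"db": {"password": "old-value"}}'
apply_config "$workdir/config.json" '{"db": {"password": "new-value"}}'
rollback_config "$workdir/config.json"
cat "$workdir/config.json"   # shows the old-value config again
```

Restoring the file only helps while the old secret is still accepted, which is why the old secret should stay valid until the new one is confirmed everywhere.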

Practical Checklist for Secret Rotation

Identify which secrets need rotation and their risk level
Define rotation schedules based on risk (30/60/90 days)
Implement dual secret support in application code or middleware
Test rotation procedure in staging environment
Document rollback steps before rotating production secrets
Log every rotation with timestamp, operator, and outcome
Automate scheduled rotations where possible
Establish an incident response process for emergency rotation

The Takeaway

A secret that never changes is a vulnerability waiting to be exploited. Rotation limits the damage window, helps meet compliance requirements, and forces you to keep your secret management practices sharp. The dual secret pattern gives you a safe way to rotate without downtime. But rotation is only part of the picture. The next question is whether you can move beyond static secrets entirely, to secrets that are generated on demand and expire automatically after a short lease. That is where dynamic secrets come in.