Rotate Everything in Production (Part 1): The Order That Keeps You Out of Trouble
This series documents a full production-style rotation workflow and why the order matters more than the individual commands.
What this post focuses on:
- dependency chain (what breaks what)
- minimal checks that keep you safe
- sequencing to avoid lockouts
Scope here: one production environment, multiple workloads, one edge proxy layer, and one Vault listener that all clients depend on.
TL;DR
- Rotate the admin client cert first so vault status stays possible.
- Vault server TLS is the center of gravity; everything else must be ready for it.
- Agents before apps; proxy chain last; healthcheck gates the end.
One-line order of operations
- Rotate the admin client cert (mTLS).
- Rotate Vault server TLS (listener cert).
- Rotate agent login mTLS certs + cert-auth mappings.
- Rotate app leaf certs (nginx/app frontends).
- Refresh proxy CA chain + trust material.
- Restart proxy stack and Vault server container (if required).
- Run a Vault TLS healthcheck (seal status / s_client).
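The ordering above can be treated as a gated sequence: each step runs only if every earlier one succeeded. Here is a minimal sketch of that idea. The step names mirror the list, but run_in_order and the per-step callables are hypothetical placeholders, not the actual rotation script.

```python
from typing import Callable, List, Tuple

# A step is a label plus a callable that returns True on success.
# The callables are hypothetical stand-ins for the real rotation commands.
Step = Tuple[str, Callable[[], bool]]

# Labels taken from the order of operations above.
ROTATION_ORDER = [
    "admin client cert",
    "vault server tls",
    "agent login certs + cert-auth mappings",
    "app leaf certs",
    "proxy ca chain + trust",
    "restart proxy stack / vault container",
    "vault tls healthcheck",
]


def run_in_order(steps: List[Step]) -> List[str]:
    """Run steps in sequence and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            print(f"STOP: '{name}' failed; later steps were not attempted")
            break
        completed.append(name)
    return completed
```

Stopping at the first failure is the point: a broken admin cert should never be followed by a listener rotation.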
Why this order matters
1) Don’t lock yourself out
If you rotate server TLS before you have a working admin client cert and trust chain, you can lose the only reliable way to run vault status.
Start with the admin cert so you always have a known-good identity to validate the rest of the pipeline.
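A rough sketch of that first gate follows, assuming the vault CLI is on PATH and the listener requires client certificates. The environment variable names are the standard Vault CLI ones; the function names are illustrative, and the address and file paths are whatever your environment uses.

```python
import os
import subprocess
from typing import Dict, Optional


def vault_status_env(addr: str, ca_cert: str, client_cert: str,
                     client_key: str,
                     tls_server_name: Optional[str] = None) -> Dict[str, str]:
    """Build an environment that points the vault CLI at the new admin cert."""
    env = dict(os.environ)
    env.update({
        "VAULT_ADDR": addr,
        "VAULT_CACERT": ca_cert,
        "VAULT_CLIENT_CERT": client_cert,
        "VAULT_CLIENT_KEY": client_key,
    })
    if tls_server_name:
        env["VAULT_TLS_SERVER_NAME"] = tls_server_name
    return env


def admin_cert_works(env: Dict[str, str]) -> bool:
    """Return True if vault status got an answer over TLS.

    vault status exits 0 when unsealed, 2 when sealed, and 1 on error.
    Both 0 and 2 mean the handshake and the client cert were accepted.
    """
    result = subprocess.run(["vault", "status"], env=env,
                            capture_output=True, text=True)
    return result.returncode in (0, 2)
```

Treating exit code 2 as a pass is deliberate: a sealed Vault still proves the new identity can reach the listener.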
2) Vault server TLS is the center of gravity
Once the listener changes, every client path must still work:
- Vault Agent mTLS login certs must match their cert-auth mappings
- CA bundles used by clients/proxies must include the right chain
- SNI must match the hostname clients validate for that environment
If clients are not ready, you will see:
- connection reset by peer
- OpenSSL errno=104
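To tell a reset apart from a trust or SNI mismatch, a small handshake probe helps. This is a sketch using only the Python standard library; probe_listener, classify_tls_error, and the result labels are made up for illustration, and the host, port, and file paths are assumptions you would fill in.

```python
import errno
import socket
import ssl
from typing import Optional


def classify_tls_error(exc: BaseException) -> str:
    """Map a connection failure to a short label."""
    if isinstance(exc, ssl.SSLCertVerificationError):
        return "verify-failed"      # wrong CA bundle or hostname mismatch
    if isinstance(exc, ConnectionResetError):
        return "reset-by-peer"      # ECONNRESET, which is errno 104 on Linux
    if isinstance(exc, ssl.SSLError):
        return "tls-error"
    if getattr(exc, "errno", None) == errno.ECONNRESET:
        return "reset-by-peer"
    return "other"


def probe_listener(host: str, port: int, server_name: str, ca_file: str,
                   cert_file: Optional[str] = None,
                   key_file: Optional[str] = None,
                   timeout: float = 5.0) -> str:
    """Attempt one TLS handshake with the given SNI and trust bundle."""
    ctx = ssl.create_default_context(cafile=ca_file)
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # server_hostname sets SNI and is the name the cert is checked against.
            with ctx.wrap_socket(raw, server_hostname=server_name):
                return "ok"
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return classify_tls_error(exc)
```

One caveat: with TLS 1.3 a rejected client certificate may only surface on the first read or write after the handshake, so an "ok" here is necessary but not sufficient for mTLS.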
3) Agents before apps
Apps depend on agents to authenticate and renew. If cert-auth is broken, leaf rotation becomes flaky or fails silently.
4) Proxy chain last (but before restarts)
Refresh proxy trust after internal rotations so you do not churn the public edge while core pieces are still changing.
Preflight checklist (short and honest)
- Confirm the target environment and listener address you intend to rotate.
- Confirm the TLS server name clients are expected to validate.
- Confirm which app workloads are in scope.
- Know exactly which user services or container units you may need to restart:
  - Vault Agent units per app
  - proxy container unit
  - Vault container unit
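One way to keep the checklist honest is to write the answers down before the run and refuse to start while any are blank. Here is a minimal sketch; the RotationPlan class and its field names are hypothetical and simply mirror the bullets above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RotationPlan:
    """Answers to the preflight questions; empty values mean 'not confirmed'."""
    environment: str = ""
    listener_address: str = ""
    tls_server_name: str = ""
    apps_in_scope: List[str] = field(default_factory=list)
    restart_units: List[str] = field(default_factory=list)

    def unanswered(self) -> List[str]:
        """Return the names of checklist items that are still blank."""
        return [name for name, value in vars(self).items() if not value]
```

A run would start only when unanswered() returns an empty list.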
Minimal “I feel safe” checks during the run
After rotating admin client cert:
- vault status works with the new cert/key
After rotating Vault server TLS + restart:
- Listener accepts TLS with expected SNI
- vault status works again
After agent/login + app leaf rotation:
- *.fullchain.pem files exist with sane expiry
At the end:
- sys/seal-status responds without resets
- openssl s_client completes cleanly
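The "sane expiry" check on the *.fullchain.pem files can be scripted. The sketch below shells out to openssl x509 -enddate and parses the result with the standard library. The function names and the 7-day threshold are assumptions, not values from the rotation script.

```python
import ssl
import subprocess
import time
from typing import Optional


def seconds_until_expiry(enddate_line: str, now: Optional[float] = None) -> float:
    """Parse a line like 'notAfter=Jan  5 09:34:43 2030 GMT' from openssl."""
    not_after = enddate_line.strip().split("=", 1)[1]
    expires_at = ssl.cert_time_to_seconds(not_after)
    return expires_at - (time.time() if now is None else now)


def read_enddate(pem_path: str) -> str:
    """Ask openssl for the notAfter date of the first cert in the file."""
    result = subprocess.run(
        ["openssl", "x509", "-enddate", "-noout", "-in", pem_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def has_sane_expiry(pem_path: str, min_days: int = 7) -> bool:
    """True if the cert is valid for at least min_days more (assumed threshold)."""
    return seconds_until_expiry(read_enddate(pem_path)) >= min_days * 86400
```

openssl x509 reads only the first certificate in a file, so this checks the leaf only when the leaf comes first in the fullchain, which is the usual layout.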
Wrapper command pattern
Run the environment wrapper around the full rotation script for the target environment.
Next up
Part 2 covers the first two rotations:
- admin client cert (permissions, exports, trust chain)
- Vault server TLS (owner, chain files, restart implications)