The Problem: mTLS Everywhere, No Way In

One of the first real-world problems I hit with my PKI/Vault setup was simple and brutal:

  • Vault listeners require mTLS (tls_require_and_verify_client_cert = true),
  • my admin client certificates expired,
  • suddenly even my root token was useless.

TLS failed before Vault ever saw my token. The error was just:

remote error: tls: expired certificate

This post is about how I think about that failure mode and what I want my recovery story to look like.

Two Layers: TLS First, Vault Second

The key lesson:

  • TLS layer: verifies certificates during the handshake. If that fails, no HTTP request ever reaches Vault.
  • Vault auth layer: checks tokens, policies, capabilities.

If client certificate validation fails during the TLS handshake, the connection is torn down and Vault never gets a chance to validate any token. So even a perfect root token doesn’t help if your mTLS client certs are dead.
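
To make the ordering concrete, here is a toy model of the two layers. It is only a sketch of how I picture the flow, not how Vault is implemented: the Request type, the handle function, VALID_TOKENS and the returned strings are all names I made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Request:
    cert_not_after: datetime  # expiry time of the client certificate
    token: str                # Vault token carried inside the HTTP request


# Hypothetical stand-in for Vault's token store (illustration only).
VALID_TOKENS = {"root-token"}


def handle(req: Request, now: datetime) -> str:
    # Layer 1: TLS handshake. An expired client cert ends the connection
    # here, so the token below is never even looked at.
    if now >= req.cert_not_after:
        return "tls: expired certificate"

    # Layer 2: Vault auth. Only reached once the TLS handshake succeeded.
    if req.token not in VALID_TOKENS:
        return "vault: permission denied"

    return "ok"


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
expired_cert = Request(datetime(2024, 12, 31, tzinfo=timezone.utc), "root-token")
print(handle(expired_cert, now))  # the valid root token does not matter
```

The point of the model is the early return: a request with an expired cert and a valid root token is rejected exactly like one with an expired cert and no token at all.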

What I Want in My Design

Going forward, I want:

  • Short-lived admin certs with a clear rotation plan.
  • At least one controlled way to reach Vault that does not depend on mTLS client certs, enabled only during maintenance.
  • Monitoring or scripts that warn me before the important certs expire.
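
For the last point, a minimal sketch of what such a warning script could look like. It assumes the expiry date comes from the output of openssl x509 -enddate -noout -in <cert>, which prints a line like notAfter=Jun  1 12:00:00 2025 GMT. The 14-day threshold and the function names are my own placeholders, not anything Vault provides.

```python
import ssl
from datetime import datetime, timedelta, timezone

# Assumed threshold: start warning two weeks before expiry.
WARN_BEFORE = timedelta(days=14)


def parse_not_after(line: str) -> datetime:
    """Parse a 'notAfter=Jun  1 12:00:00 2025 GMT' line into a UTC datetime."""
    value = line.strip().removeprefix("notAfter=")
    # ssl.cert_time_to_seconds understands the certificate time format.
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(value), tz=timezone.utc)


def expiry_status(not_after: datetime, now: datetime,
                  warn_before: timedelta = WARN_BEFORE) -> str:
    """Classify a certificate as OK, WARN (expiring soon) or EXPIRED."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "EXPIRED"
    if remaining <= warn_before:
        return "WARN"
    return "OK"


if __name__ == "__main__":
    not_after = parse_not_after("notAfter=Jun  1 12:00:00 2025 GMT")
    print(expiry_status(not_after, datetime.now(timezone.utc)))
```

Run from cron or a systemd timer against every admin cert, something like this would have flagged the problem days before the handshake started failing.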

Everything else in my Vault design hangs on taking this failure mode seriously.