This part covers the workload-facing rotations:

  • agent login mTLS certs (Vault Agent via auth/cert)
  • app leaf cert rotation (per-app nginx-* roles + reload)

Apps in scope: ncprd1, resume2me.


TL;DR

  • Rotate agent mTLS certs and cert-auth mappings first.
  • Then issue app leaf certs and reload via post-hook.
  • Verify *.fullchain.pem plus a manual cert login.

3) Agent mTLS login certs

What the run did (per app)

ncprd1

  • CN: agent-ncprd1.prod.privsec.ch
  • role: pki-prod/roles/agent-mtls-ncprd1
  • mapping: auth/cert/certs/agent-ncprd1
  • policies: pki-issue-ncprd1, marker-cert-auth

Files written:

  • /home/ncprd1/vault/mtls/agent.key (0600)
  • /home/ncprd1/vault/mtls/agent.crt (0644)
  • /home/ncprd1/vault/ca/ca.pem (0644)

Validity (from run):

  • notBefore: Dec 22 08:12:20 2025 GMT
  • notAfter: Jan 21 08:12:50 2026 GMT

resume2me

  • CN: agent-resume2me.prod.privsec.ch
  • role: pki-prod/roles/agent-mtls-resume2me
  • mapping: auth/cert/certs/agent-resume2me
  • policies: pki-issue-resume2me, marker-cert-auth

Files written:

  • /home/resume2me/vault/mtls/agent.key (0600)
  • /home/resume2me/vault/mtls/agent.crt (0644)
  • /home/resume2me/vault/ca/ca.pem (0644)

Validity:

  • notBefore: Dec 22 08:12:21 2025 GMT
  • notAfter: Jan 21 08:12:51 2026 GMT

Why this step exists

Agent rotation is more than “renew a file”. In this setup it means:

  • issue a client cert under a per-app role
  • ensure auth/cert is enabled
  • upsert the mapping at auth/cert/certs/<agent-name>
  • ensure policies are attached (including your marker policy)

If mappings or policies drift, Vault Agent can look “healthy” but fail to auth or renew.

Quick verification: manual cert login

For ncprd1 (from the run):

VAULT_ADDR="https://127.0.0.1:22400" VAULT_CACERT="/home/ncprd1/vault/ca/ca.pem" \
vault login -method=cert -client-cert "/home/ncprd1/vault/mtls/agent.crt" -client-key "/home/ncprd1/vault/mtls/agent.key"

This single command exercises TLS trust, cert-auth mapping, and policy attachment.


5) App leaf cert rotation (nginx roles + reload)

ncprd1

The run:

  • upserted PKI role: nginx-ncprd1
  • allowed domains: int.privsec.ch,prod.privsec.ch
  • removed debug-policy from mapping
  • enabled KV access (--with-kv-access)
  • detected KV v2 on mount kv
  • attached KV policy: secret-agent-ncprd1-policy
  • strict auth: Auth=cert (mTLS) only
  • detected kv/ncprd1 exists, so seed not overwritten
  • installed post-hook:
    • /home/ncprd1/.vault-agent-ncprd1/bin/vault-agent-post-leaf.sh
  • restarted user service:
    • vault-agent-ncprd1.service (v5.0)

Leaf file check:

  • /home/ncprd1/tls/ncprd1.fullchain.pem
  • subject: CN = ncprd1.int.privsec.ch
  • issuer: PrivSec Intermediate CA (prod)
  • notAfter: Dec 23 08:12:57 2025 GMT

resume2me

The run:

  • upserted PKI role: nginx-resume2me
  • allowed domains: int.privsec.ch,prod.privsec.ch
  • removed debug-policy from mapping
  • enabled KV access (--with-kv-access)
  • KV v2 already active on kv
  • attached KV policy: secret-agent-resume2me-policy
  • strict auth: cert only
  • detected kv/resume2me missing (no seed because --with-kv-seed was not set)
  • installed post-hook:
    • /home/resume2me/.vault-agent-resume2me/bin/vault-agent-post-leaf.sh
  • restarted user service:
    • vault-agent-resume2me.service (v5.0)

Leaf file check:

  • /home/resume2me/tls/resume2me.fullchain.pem
  • subject: CN = resume2me.int.privsec.ch
  • issuer: PrivSec Intermediate CA (prod)
  • notAfter: Dec 22 23:43:22 2025 GMT

Reload mechanism (post-hook)

The logs show Vault Agent executing a post-hook that triggers reloads by label via Podman:

  • RELOAD_TLS_LABEL='tls=true'
  • PROXY_RELOAD='podman:proxyprod'

This keeps the sequence clean: issue -> write -> reload.


Watch item: agent/server version mismatch

Both vault-agent-* services reported:

  • Vault Agent: 1.21.1
  • Vault server: 1.20.4

That mismatch is not automatically broken, but it is a real operational signal. After upgrades, re-test auth, renewal, and template rendering.


Tiny post-rotation checklist (per app)

  • [ ] *.fullchain.pem exists where expected
  • [ ] expiry is sane
  • [ ] subject CN matches expected domain
  • [ ] Vault Agent service is active (running)
  • [ ] cert-auth login works with generated agent cert

Next up

Part 4 covers the edge and the failure you saw:

  • proxy CA chain refresh (proxyprod)
  • proxy + Vault container restarts
  • why the final healthcheck returned “connection reset by peer” and what to do right after