Current Source Of Truth

The current implementation is documented in:

This post is the blog summary version.

Big picture (Vault + env split + per-app identities)

[ offline root CA ]
        |
        v
[ Vault test ]     [ Vault prod ]
    |                   |
    v                   v
[ env proxy ]       [ env proxy ]
    |                   |
    +----> apps/users with their own Vault identities
    +----> agents render secrets, leaf certs, or CA chains

The operational shape now is:

  • offline root outside Vault,
  • one Vault environment per trust domain,
  • HTTPS + mTLS on the Vault API,
  • one Unix user per app/proxy,
  • one scoped Vault identity per app/proxy,
  • agents handling issuance, renewal, and render steps.

1) Repo layout

The repo is organized around stable entry points:

  • infra/ for wrapper scripts you actually call,
  • infra/versions/ for versioned implementations,
  • infra/scripts/ for lower-level hooks,
  • infra/config/apps.example.yaml for the tracked environment/app template,
  • infra/config/apps.yaml for the local untracked private config.

This matters because older posts referred to raw versioned script names like ...config2.sh. The current active entry points are the wrapper names without those suffixes.

2) One-time base initialization (per environment)

  1. Create the offline root CA
    • Script: 01_make_offline_root_ca.sh --env test
    • Purpose: create the offline root CA key/cert.
  2. Create the Vault intermediate & sign it with the offline root
    • Script: 02_intermediate_in_vault_sign_with_root.sh --env test --config ./config/apps.yaml
    • Purpose: enable PKI mount in Vault, sign the intermediate, set CA URLs.
  3. Issue the Vault server certificate
    • Script:

      03_issue_vault_server_cert.sh --env test --config ./config/apps.yaml \
        --cn vault.test.local \
        --dns "vault.test.local,localhost,host.containers.internal" \
        --ips "127.0.0.1,::1,<public-ip>"
      
      
    • Important: the name that clients verify (and send as SNI) must be covered by the certificate, so --cn and --dns have to include it.
      If your agents connect via IP, keep address=https://<public-ip>:22300 and set tls_server_name="vault.test.local" so that the hostname check still runs against the cert.

  4. Switch Vault to HTTPS
    • Script: 04_enable_https_in_compose.sh --env test
    • Result: Vault config uses tls_cert_file=.../fullchain.crt, tls_key_file=.../server.key.

(Optional) 5. Admin mTLS client cert

  • Script: 05_issue_admin_client_cert.sh --env test
  • For the vault CLI: set VAULT_CLIENT_CERT / VAULT_CLIENT_KEY.
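
For the vault CLI, the connect-by-IP pattern from step 3 and the admin client certificate from step 5 come down to a handful of environment variables. The following is a minimal sketch: the VAULT_* names are the standard Vault CLI variables, but the function name and the ./tls/test file layout are assumptions for illustration, not the paths the scripts actually write.

```shell
#!/usr/bin/env bash
# Print export lines for the vault CLI: connect by IP, verify by hostname.
# The tls_dir layout (ca.pem, admin.crt, admin.key) is a hypothetical example.
# Note: an IPv6 address would need brackets in VAULT_ADDR.
vault_cli_env() {
  local ip="$1" server_name="${2:-vault.test.local}" tls_dir="${3:-./tls/test}"
  cat <<EOF
export VAULT_ADDR=https://${ip}:22300
export VAULT_TLS_SERVER_NAME=${server_name}
export VAULT_CACERT=${tls_dir}/ca.pem
export VAULT_CLIENT_CERT=${tls_dir}/admin.crt
export VAULT_CLIENT_KEY=${tls_dir}/admin.key
EOF
}

# Usage: eval "$(vault_cli_env 203.0.113.10)" && vault status
```

Printing the exports instead of setting them directly keeps the function free of side effects, so the values can be inspected before they are evaluated.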

3) Connect environment proxies to Vault (CA chain only)

Use this for your environment-specific NGINX instances so that they keep the CA chain up to date.

  • Script: setup-vault-agent-proxy-config.sh --env test --app <edge-proxy>
  • Produces:
    • mTLS client (agent.crt/key + ca.pem)
    • Agent config that renders only the chain (no leaf certs)
    • Post-hook scripts/vault-agent-post.sh can trigger nginx -s reload on updates (via label tls=true).

When?

  • Once during proxy setup.
  • After that, rotation runs automatically (agent renews auth and re-renders the chain).
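
To make the chain-only setup more concrete, here is a rough sketch of the kind of agent config such a setup could produce. It is not the output of setup-vault-agent-proxy-config.sh: the PKI mount name "pki", the cert-auth role name, all file paths, and the way the post-hook is invoked are assumptions. The stanza names (vault, auto_auth, template) are standard Vault Agent configuration.

```shell
#!/usr/bin/env bash
# Emit a Vault Agent config that logs in with a client certificate and
# renders only the CA chain (no leaf certificate).
# Assumptions: PKI mount "pki", cert role named after the app, example paths.
render_proxy_agent_config() {
  local app="$1" mtls_dir="$2" chain_out="$3"
  cat <<EOF
vault {
  address         = "https://vault.test.local:22300"
  ca_cert         = "${mtls_dir}/ca.pem"
  client_cert     = "${mtls_dir}/agent.crt"
  client_key      = "${mtls_dir}/agent.key"
  tls_server_name = "vault.test.local"
}

auto_auth {
  method "cert" {
    config = {
      name        = "${app}"
      client_cert = "${mtls_dir}/agent.crt"
      client_key  = "${mtls_dir}/agent.key"
    }
  }
}

template {
  contents    = "{{ with secret \"pki/cert/ca_chain\" }}{{ .Data.certificate }}{{ end }}"
  destination = "${chain_out}"
  exec {
    command = ["scripts/vault-agent-post.sh"]
  }
}
EOF
}

# Usage (paths are placeholders):
#   render_proxy_agent_config edge-proxy "$HOME/vault/mtls" /path/to/ca-chain.pem > agent.hcl
```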

4) Onboard a new app (secrets and/or leaf certs)

Two typical paths:

A) Container app with a sidecar agent (e.g. Nextcloud + DB password)

  1. Create AppRole + KV/policies
    • Script: bootstrap-secret-agent.sh --env test --app examplekv
    • Result:
      • Policies (KV-read, optional PKI-Issue)
      • AppRole credentials stored in the app's private credential path
  2. Start the sidecar agent in Compose
    • Add the agent as a service in your docker-compose.yml (for example vault-agent-app-test).
    • The agent reads role_id/secret_id, logs in, and renders secrets into /vault/secrets/... (tmpfs volume).
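
As an illustration of what the sidecar does, the sketch below emits a small AppRole-based agent config. The credential paths under /vault/creds and the /vault/secrets target come from this post; the KV v2 mount name "kv", the field name db_password, the CA path, and the function name are assumptions.

```shell
#!/usr/bin/env bash
# Emit a Vault Agent config for a container sidecar using AppRole.
# Assumptions: KV v2 mounted at "kv", one secret per app, field "db_password".
render_sidecar_agent_config() {
  local app="$1" vault_addr="${2:-https://vault.test.local:22300}"
  cat <<EOF
vault {
  address = "${vault_addr}"
  ca_cert = "/vault/tls/ca.pem"
}

auto_auth {
  method "approle" {
    config = {
      role_id_file_path                   = "/vault/creds/role_id"
      secret_id_file_path                 = "/vault/creds/secret_id"
      remove_secret_id_file_after_reading = false
    }
  }
}

template {
  contents    = "{{ with secret \"kv/data/${app}\" }}{{ .Data.data.db_password }}{{ end }}"
  destination = "/vault/secrets/db_password"
  perms       = "0400"
}
EOF
}

# Usage: render_sidecar_agent_config examplekv > agent.hcl
```

Setting remove_secret_id_file_after_reading to false keeps the SecretID file in place so the agent can log in again after a restart; whether that trade-off fits depends on how the credentials are mounted.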

When?

  • Every time you set up a new container app.
  • SecretID rotation: either re-run bootstrap-secret-agent.sh with a new SecretID option or run vault write -f auth/approle/role/<role>/secret-id.
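
If you rotate by hand, the new SecretID still has to land in the app's credential file with tight permissions. A minimal sketch, assuming a helper named install_secret_id (hypothetical, not part of the repo) that reads the value from stdin and swaps the file in atomically:

```shell
#!/usr/bin/env bash
# Read a new SecretID from stdin and install it atomically with mode 0400.
install_secret_id() {
  local target="$1" tmp
  # mktemp creates the temp file next to the target with mode 0600.
  tmp="$(mktemp "${target}.XXXXXX")" || return 1
  if cat > "$tmp" && chmod 0400 "$tmp" && mv -f "$tmp" "$target"; then
    return 0
  fi
  rm -f "$tmp"   # clean up on any failure so no partial file is left behind
  return 1
}

# Usage (role name and target path are placeholders):
#   vault write -f -field=secret_id auth/approle/role/<role>/secret-id \
#     | install_secret_id /path/to/creds/secret_id
```

Writing to a temp file in the same directory and renaming it means the agent never sees a half-written secret_id.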

B) Host app (Unix service) needs a leaf certificate (mTLS)

  1. mTLS client for the agent
    • Script: setup-vault-agent-mtls-client-config.sh --env test --app exampleapp
    • Result: ~/vault/mtls/{agent.key,agent.crt,ca.crt}
  2. App agent that renders leaf certificates
    • Script: setup-vault-agent-app-config.sh --env test --app exampleapp
    • Result: a systemd user unit that fetches and rotates the leaf certificate for that workload.

When?

  • Once during onboarding.
  • Rotation runs automatically via the agent (renew + re-render).
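
A quick sanity check after onboarding is to confirm that the mTLS client files from step 1 are actually in place. This is a small sketch with a hypothetical helper name; the file names and default directory are the ones listed above.

```shell
#!/usr/bin/env bash
# Verify that the agent's mTLS client files exist and are non-empty.
# Defaults to ~/vault/mtls, the result path of the mTLS client setup step.
check_mtls_dir() {
  local dir="${1:-$HOME/vault/mtls}" f rc=0
  for f in agent.key agent.crt ca.crt; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage: check_mtls_dir && echo "mTLS client files look complete"
```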

5) Which scripts for which event?

| Event | Goal | Scripts / action | Order | Key checks |
|---|---|---|---|---|
| Fresh Vault setup (per env) | PKI + HTTPS ready | 01 → 02 → 03 → 04 (→ 05 optional) | 1→2→3→4 | vault status, agent login works, CN/SAN correct |
| Connect environment proxy | CA chain stays current | setup-vault-agent-mtls-client-config.sh → setup-vault-agent-proxy-config.sh | 1→2 | Agent logs “rendered chain…”, NGINX reload ok |
| New container app | Secrets + optional PKI | bootstrap-secret-agent.sh → Compose sidecar | 1→2 | Agent logs “authentication successful”, secret files present |
| New host app (mTLS leaf) | Leaf cert + rotation | setup-vault-agent-mtls-client-config.sh → setup-vault-agent-app-config.sh | 1→2 | Leaf files exist, systemd unit active, app starts |
| SecretID rotation | Refresh AppRole SecretID | bootstrap-secret-agent.sh (new option) or vault write auth/approle/.../secret-id | – | New secret_id in app path, reload agent |
| Vault server cert expiring | New server cert | 03_issue_vault_server_cert.sh (same CN/SAN) → reload/restart Vault | 1→(2) | Clients connect, tls_server_name matches |
| Intermediate/root rotation | New CA hierarchy | 02 (new intermediate) + redeploy chains | 1 | Proxies/apps receive new chain (agents handle it) |
| DNS broken in container | Fix connectivity | In agent config: use address=https://<ip>:22300 and keep tls_server_name=vault.test.local | – | No “no such host” in logs |
| “no known role ID” | Fix AppRole login | Check /vault/creds/role_id and secret_id (0400, readable, correct) | – | Agent shows “authentication successful” |
| x509 CN mismatch | Fix TLS hostname | Re-issue cert or adjust tls_server_name (must match CN/SAN) | – | No “certificate is valid for … not …” |
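
The three troubleshooting rows can be turned into a tiny triage helper. This is only a sketch: the function name is hypothetical, and the matched fragments are the log snippets quoted in this post, so real log lines may need broader patterns.

```shell
#!/usr/bin/env bash
# Map a single agent log line to the matching fix from the event table.
suggest_fix() {
  case "$1" in
    *"no such host"*)
      echo "dns: use address=https://<ip>:22300 and keep tls_server_name" ;;
    *"no known role ID"*)
      echo "approle: check the role_id and secret_id files (0400, readable, correct)" ;;
    *"certificate is valid for"*)
      echo "tls: re-issue the cert or adjust tls_server_name to match CN/SAN" ;;
    *)
      echo "unknown: no matching entry in the event table" ;;
  esac
}

# Usage: suggest_fix "$(podman logs <agent-container> 2>&1 | tail -n 1)"
```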

6) Which auth is used where?

  • Container sidecar (secrets): AppRole (RoleID/SecretID from bootstrap-secret-agent.sh).
  • Proxy agent (CA chain only): mTLS client cert (from setup-vault-agent-mtls-client-config.sh).
  • Host app agent (leaf): mTLS client cert plus PKI role to issue a leaf cert.

You can replace AppRole with mTLS if you want client-cert distribution everywhere. In containers, AppRole is often more convenient (no long-lived private key files). Mixed mode is fine.


7) Minimal commands (copy/paste, test environment)

Base setup (one-time):

./01_make_offline_root_ca.sh --env test
VAULT_TOKEN=hvs.<admin> ./02_intermediate_in_vault_sign_with_root.sh --env test --config ./config/apps.yaml
./03_issue_vault_server_cert.sh --env test --config ./config/apps.yaml \
  --cn vault.test.local \
  --dns "vault.test.local,localhost,host.containers.internal" \
  --ips "127.0.0.1,::1,<public-ip>"
./04_enable_https_in_compose.sh --env test
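
Because the base steps must run in order, a small wrapper that stops at the first failure can help. This is a sketch, not part of the repo: run_base_setup and the DRY_RUN switch are hypothetical, and it defaults to only printing the commands. Step 02 still expects VAULT_TOKEN in the environment.

```shell
#!/usr/bin/env bash
# Run (or, by default, only print) the one-time base setup steps in order.
# Set DRY_RUN=0 to execute; the loop stops at the first failing step.
run_base_setup() {
  local env="$1" cn="$2" public_ip="$3" step
  local -a steps=(
    "./01_make_offline_root_ca.sh --env $env"
    "./02_intermediate_in_vault_sign_with_root.sh --env $env --config ./config/apps.yaml"
    "./03_issue_vault_server_cert.sh --env $env --config ./config/apps.yaml --cn $cn --dns $cn,localhost,host.containers.internal --ips 127.0.0.1,::1,$public_ip"
    "./04_enable_https_in_compose.sh --env $env"
  )
  for step in "${steps[@]}"; do
    echo "+ $step"
    if [ "${DRY_RUN:-1}" != "1" ]; then
      # Word-splitting is intentional: no argument here contains spaces.
      $step || { echo "step failed: $step" >&2; return 1; }
    fi
  done
}

# Usage: DRY_RUN=1 run_base_setup test vault.test.local 203.0.113.10
```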

Attach the test proxy to Vault (CA chain only):

./setup-vault-agent-mtls-client-config.sh --env test --app <edge-proxy>
./setup-vault-agent-proxy-config.sh --env test --app <edge-proxy>
# check the resulting user service if one was created

New container app (e.g. app-test):

./bootstrap-secret-agent.sh --env test --app examplekv
# then start your docker/podman compose with the vault-agent-app-test sidecar

New host app (leaf cert):

./setup-vault-agent-mtls-client-config.sh --env test --app exampleapp
./setup-vault-agent-app-config.sh --env test --app exampleapp


8) Logging notes (short)

  • Container best practice: log to stdout/stderr; avoid writing log files under /var/log/... inside containers (permission problems). For MariaDB: disable file logging in .cnf or route to stderr; debug via podman logs.
  • Agents: everything is visible in the logs; key lines:
    • authentication successful
    • rendered "<template>" => "<destination>"
    • renewed auth token

That puts the key events and the matching scripts in one place.