Overview
Current Source Of Truth
The current implementation is documented in:
- GitHub: https://github.com/blade-34242/vault-ops
- README: https://github.com/blade-34242/vault-ops/blob/main/README.md
This post is the blog summary version.
Big picture (Vault + env split + per-app identities)
        [ offline root CA ]
                 |
                 v
[ Vault test ]       [ Vault prod ]
       |                    |
       v                    v
[ env proxy ]        [ env proxy ]
       |                    |
       +----> apps/users with their own Vault identities
       +----> agents render secrets, leaf certs, or CA chains
The operational shape now is:
- offline root outside Vault,
- one Vault environment per trust domain,
- HTTPS + mTLS on the Vault API,
- one Unix user per app/proxy,
- one scoped Vault identity per app/proxy,
- agents handling issuance, renewal, and render steps.
1) Repo layout
The repo is organized around stable entry points:
- `infra/` for wrapper scripts you actually call,
- `infra/versions/` for versioned implementations,
- `infra/scripts/` for lower-level hooks,
- `infra/config/apps.example.yaml` for the tracked environment/app template,
- `infra/config/apps.yaml` for the local untracked private config.
This matters because older posts referred to raw versioned script names like ...config2.sh. The current active entry points are the wrapper names without those suffixes.
2) One-time base initialization (per environment)
1. Create the offline root CA
   - Script: `01_make_offline_root_ca.sh --env test`
   - Purpose: create the offline root CA key/cert.
2. Create the Vault intermediate and sign it with the offline root
   - Script: `02_intermediate_in_vault_sign_with_root.sh --env test --config ./config/apps.yaml`
   - Purpose: enable the PKI mount in Vault, sign the intermediate, set the CA URLs.
3. Issue the Vault server certificate
   - Script: `03_issue_vault_server_cert.sh --env test --config ./config/apps.yaml --cn vault.test.local --dns "vault.test.local,localhost,host.containers.internal" --ips "127.0.0.1,::1,<public-ip>"`
   - Important: `--cn` must match the name clients verify (the name sent via SNI). Keep that name in `--dns` as well, since current TLS clients check the SANs rather than the CN.
   - If your agents connect via IP, keep `address=https://<public-ip>:22300` and set `tls_server_name="vault.test.local"` so the hostname check runs against the cert.
4. Switch Vault to HTTPS
   - Script: `04_enable_https_in_compose.sh --env test`
   - Result: the Vault config uses `tls_cert_file=.../fullchain.crt` and `tls_key_file=.../server.key`.
5. (Optional) Admin mTLS client cert
   - Script: `05_issue_admin_client_cert.sh --env test`
   - For the `vault` CLI: set `VAULT_CLIENT_CERT` / `VAULT_CLIENT_KEY`.
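For the admin CLI, the TLS settings above come down to a handful of environment variables. The following is a minimal sketch, not something taken from the repo: the function name and the file names `admin.crt`, `admin.key`, and `ca.pem` are assumptions, while the `VAULT_*` variables are the standard Vault CLI settings.

```shell
# Print export lines for the Vault CLI when connecting by IP with mTLS.
# Args: <address> <tls-server-name> <cert-dir>
# ASSUMPTION: admin.crt / admin.key / ca.pem are hypothetical file names;
# use whatever your 05 script actually wrote.
vault_cli_env() {
  cat <<EOF
export VAULT_ADDR="$1"
export VAULT_TLS_SERVER_NAME="$2"
export VAULT_CACERT="$3/ca.pem"
export VAULT_CLIENT_CERT="$3/admin.crt"
export VAULT_CLIENT_KEY="$3/admin.key"
EOF
}

# Example (needs a running Vault, so it is left commented out):
#   eval "$(vault_cli_env "https://<public-ip>:22300" vault.test.local "$HOME/vault/admin")"
#   vault status
```

Setting `VAULT_TLS_SERVER_NAME` is the CLI counterpart of `tls_server_name` in the agent config: the connection goes to the IP, but the certificate is checked against the hostname.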
3) Connect environment proxies to Vault (CA chain only)
This step is for your environment-specific NGINX instances, so that they keep the CA chain up to date.
- Script: `setup-vault-agent-proxy-config.sh --env test --app <edge-proxy>`
- Produces:
  - mTLS client material (`agent.crt`/`agent.key` + `ca.pem`)
  - Agent config that renders only the chain (no leaf certs)
  - Post-hook: `scripts/vault-agent-post.sh` can trigger `nginx -s reload` on updates (via label `tls=true`).
When?
- Once during proxy setup.
- After that, rotation runs automatically (agent renews auth and re-renders the chain).
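To make "renders only the chain" concrete, here is a rough sketch of what such an agent config could look like. It is not the file the setup script generates: the PKI mount name `pki_int`, the default `auth/cert` mount path, the `.Data.certificate` field, and the function name are all assumptions for illustration.

```shell
# Write a Vault Agent config that logs in with a client certificate
# and renders only the CA chain (no leaf certificate).
# Args: <vault-address> <tls-server-name> <mtls-dir> <chain-destination>
# ASSUMPTIONS: PKI mount "pki_int", cert auth at "auth/cert",
# chain returned in the "certificate" field.
render_proxy_agent_config() {
  cat <<EOF
vault {
  address         = "$1"
  tls_server_name = "$2"
  ca_cert         = "$3/ca.pem"
  client_cert     = "$3/agent.crt"
  client_key      = "$3/agent.key"
}

auto_auth {
  method "cert" {
    mount_path = "auth/cert"
  }
}

template {
  contents    = "{{ with secret \"pki_int/cert/ca_chain\" }}{{ .Data.certificate }}{{ end }}"
  destination = "$4"
}
EOF
}
```

The cert auto-auth method reuses the client certificate from the `vault` stanza, so no extra credentials appear in the file. The reload after a re-render stays with the post-hook rather than the template.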
4) Onboard a new app (secrets and/or leaf certs)
Two typical paths:
A) Container app with a sidecar agent (e.g. Nextcloud + DB password)
1. Create AppRole + KV/policies
   - Script: `bootstrap-secret-agent.sh --env test --app examplekv`
   - Result:
     - Policies (KV read, optional PKI issue)
     - AppRole credentials stored in the app's private credential path
2. Start the sidecar agent in Compose
   - In your `docker-compose.yml` (like `vault-agent-app-test`)
   - The agent reads `role_id`/`secret_id`, logs in, and renders secrets into `/vault/secrets/...` (tmpfs volume).
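A sidecar agent config for this path might look roughly like the sketch below. It assumes a KV v2 mount (hence `.Data.data`), a field called `password`, and a CA file at `/vault/creds/ca.pem`; none of these are confirmed by the repo, and the function name is hypothetical.

```shell
# Write a sidecar Vault Agent config: AppRole login, one KV secret rendered to a file.
# Args: <vault-address> <tls-server-name> <kv-path> <destination>
# ASSUMPTIONS: KV v2 path (e.g. kv/data/examplekv), a "password" field,
# CA bundle at /vault/creds/ca.pem.
render_sidecar_agent_config() {
  cat <<EOF
vault {
  address         = "$1"
  tls_server_name = "$2"
  ca_cert         = "/vault/creds/ca.pem"
}

auto_auth {
  method "approle" {
    config = {
      role_id_file_path                   = "/vault/creds/role_id"
      secret_id_file_path                 = "/vault/creds/secret_id"
      remove_secret_id_file_after_reading = false
    }
  }
}

template {
  contents    = "{{ with secret \"$3\" }}{{ .Data.data.password }}{{ end }}"
  destination = "$4"
  perms       = "0400"
}
EOF
}
```

Keeping `remove_secret_id_file_after_reading = false` lets the agent log in again after a restart without a fresh SecretID. If your Vault listener requires client certificates, the `vault` stanza would also need `client_cert` and `client_key`.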
When?
- Every time you set up a new container app.
- SecretID rotation: either re-run `bootstrap-secret-agent.sh` with a new SecretID option or run `vault write -f auth/approle/role/<role>/secret-id`.
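If you rotate by hand, the new SecretID still has to land where the agent reads it. A small sketch of that step, assuming you already have the new value in a variable; the function name and the target path are hypothetical.

```shell
# Store a freshly issued SecretID where the agent expects it.
# Args: <secret-id-value> <target-file>
# Writes to a temp file first, then renames, so the agent never
# sees a half-written file. The result is readable by the owner only.
install_secret_id() {
  tmp="$2.tmp.$$"
  ( umask 077 && printf '%s' "$1" > "$tmp" ) || return 1
  chmod 0400 "$tmp" && mv -f "$tmp" "$2"
}

# Example (needs a reachable Vault and a suitable token; <role> stays a placeholder):
#   new_id="$(vault write -f -field=secret_id auth/approle/role/<role>/secret-id)"
#   install_secret_id "$new_id" /path/to/app/creds/secret_id
```

After installing the file, reload or restart the agent so it logs in with the new SecretID.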
B) Host app (Unix service) needs a leaf certificate (mTLS)
1. mTLS client for the agent
   - Script: `setup-vault-agent-mtls-client-config.sh --env test --app exampleapp`
   - Result: `~/vault/mtls/{agent.key,agent.crt,ca.crt}`
2. App agent that renders leaf certificates
   - Script: `setup-vault-agent-app-config.sh --env test --app exampleapp`
   - Result: a systemd user unit that fetches and rotates the leaf certificate for that workload.
When?
- Once during onboarding.
- Rotation runs automatically via the agent (renew + re-render).
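Before starting the app agent, it can help to confirm that the mTLS material from the first step is actually in place. The check below is a sketch built only on the file names listed above; the function name is hypothetical and the permission rule (key not accessible by group or others) is my assumption about what you want.

```shell
# Verify that the agent's mTLS material exists and the key is private.
# Args: <mtls-dir>  (e.g. "$HOME/vault/mtls")
# Prints one line per problem; returns non-zero if anything is wrong.
check_mtls_dir() {
  rc=0
  for f in agent.key agent.crt ca.crt; do
    if [ ! -s "$1/$f" ]; then
      echo "missing or empty: $1/$f"
      rc=1
    fi
  done
  # Characters 5-10 of the ls mode string are the group and other bits.
  if [ -f "$1/agent.key" ]; then
    mode="$(ls -l "$1/agent.key" | cut -c5-10)"
    case "$mode" in
      ------) ;;
      *) echo "agent.key is accessible by group/others"; rc=1 ;;
    esac
  fi
  return "$rc"
}
```

A typical call would be `check_mtls_dir "$HOME/vault/mtls" && systemctl --user status <unit>`, where the unit name depends on what the setup script created.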
5) Which scripts for which event?
| Event | Goal | Scripts / action | Order | Key checks |
|---|---|---|---|---|
| Fresh Vault setup (per env) | PKI + HTTPS ready | 01 → 02 → 03 → 04 (→ 05 optional) | 1→2→3→4 | vault status, agent login works, CN/SAN correct |
| Connect environment proxy | CA chain stays current | setup-vault-agent-mtls-client-config.sh → setup-vault-agent-proxy-config.sh | 1→2 | Agent logs “rendered chain…”, NGINX reload ok |
| New container app | Secrets + optional PKI | bootstrap-secret-agent.sh → Compose sidecar | 1→2 | Agent logs “authentication successful”, secret files present |
| New host app (mTLS leaf) | Leaf cert + rotation | setup-vault-agent-mtls-client-config.sh → setup-vault-agent-app-config.sh | 1→2 | Leaf files exist, systemd unit active, app starts |
| SecretID rotation | Refresh AppRole SecretID | bootstrap-secret-agent.sh (new option) or vault write auth/approle/.../secret-id | – | New secret_id in app path, reload agent |
| Vault server cert expiring | New server cert | 03_issue_vault_server_cert.sh (same CN/SAN) → reload/restart Vault | 1→(2) | Clients connect, tls_server_name matches |
| Intermediate/root rotation | New CA hierarchy | 02 (new intermediate) + redeploy chains | 1 | Proxies/apps receive new chain (agents handle it) |
| DNS broken in container | Fix connectivity | In agent config: use address=https://<ip>:22300 and keep tls_server_name=vault.test.local | – | No “no such host” in logs |
| “no known role ID” | Fix AppRole login | Check /vault/creds/role_id and secret_id (0400, readable, correct) | – | Agent shows “authentication successful” |
| x509 CN mismatch | Fix TLS hostname | Re-issue cert or adjust tls_server_name (must match CN/SAN) | – | No “certificate is valid for … not …” |
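The three troubleshooting rows at the end of the table can be turned into a quick log filter. This is only a sketch: it matches the message fragments quoted in the table, and the function name is hypothetical.

```shell
# Map known Vault Agent error messages (read from stdin) to the fixes in the table.
# Lines that match none of the patterns are ignored.
diagnose_agent_log() {
  while IFS= read -r line; do
    case "$line" in
      *"no such host"*)
        echo "DNS: use address=https://<ip>:22300 and keep tls_server_name" ;;
      *"no known role ID"*)
        echo "AppRole: check role_id and secret_id files (0400, readable, correct)" ;;
      *"certificate is valid for"*)
        echo "TLS: re-issue the cert or adjust tls_server_name to match CN/SAN" ;;
    esac
  done
}
```

For a container agent, something like `podman logs vault-agent-app-test 2>&1 | diagnose_agent_log` prints one hint per matching log line.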
6) Which auth is used where?
- Container sidecar (secrets): AppRole (RoleID/SecretID from `bootstrap-secret-agent.sh`).
- Proxy agent (CA chain only): mTLS client cert (from `setup-vault-agent-mtls-client-config.sh`).
- Host app agent (leaf): mTLS client cert plus PKI role to issue a leaf cert.
Yes: you can replace AppRole with mTLS if you want client-cert distribution everywhere. In containers AppRole is often more convenient (no long-lived private key files). Mixed mode is fine.
7) Minimal commands (copy/paste, test environment)
Base setup (one-time):
./01_make_offline_root_ca.sh --env test
VAULT_TOKEN=hvs.<admin> ./02_intermediate_in_vault_sign_with_root.sh --env test --config ./config/apps.yaml
./03_issue_vault_server_cert.sh --env test --config ./config/apps.yaml \
--cn vault.test.local \
--dns "vault.test.local,localhost,host.containers.internal" \
--ips "127.0.0.1,::1,<public-ip>"
./04_enable_https_in_compose.sh --env test
Attach the test proxy to Vault (CA chain only):
./setup-vault-agent-mtls-client-config.sh --env test --app <edge-proxy>
./setup-vault-agent-proxy-config.sh --env test --app <edge-proxy>
# check the resulting user service if one was created
New container app (e.g. app-test):
./bootstrap-secret-agent.sh --env test --app examplekv
# then start your docker/podman compose with the vault-agent-app-test sidecar
New host app (leaf cert):
./setup-vault-agent-mtls-client-config.sh --env test --app exampleapp
./setup-vault-agent-app-config.sh --env test --app exampleapp
8) Logging notes (short)
- Container best practice: log to stdout/stderr; avoid writing log files under `/var/log/...` inside containers (permission problems). For MariaDB: disable file logging in `.cnf` or route to stderr; debug via `podman logs`.
- Agents: everything is visible in the logs; key lines:
  - `authentication successful`
  - `rendered "<template>" => "<destination>"`
  - `renewed auth token`
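To check for those key lines without scrolling through the log, a small summary helper is enough. The sketch below assumes the log arrives on stdin and only looks for the three fragments named above; the function name is hypothetical.

```shell
# Report which of the key agent log lines appear in the log read from stdin.
# Returns non-zero if "authentication successful" was never seen.
summarize_agent_log() {
  log="$(cat)"
  for marker in "authentication successful" "rendered" "renewed auth token"; do
    if printf '%s\n' "$log" | grep -qF "$marker"; then
      echo "seen:    $marker"
    else
      echo "missing: $marker"
    fi
  done
  printf '%s\n' "$log" | grep -qF "authentication successful"
}
```

Usage could be `podman logs vault-agent-app-test 2>&1 | summarize_agent_log`. A missing `renewed auth token` line right after startup is normal, since renewal only happens later in the token's lifetime.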
That covers the key events and the matching scripts in one place.