Secrets — OpenBao

Two independent OpenBao instances (sa-bao-01, sb-bao-01) on the edge E200s: secret tiers, cross-site transit auto-unseal, AppRole consumers, and the break-glass fallback.

One OpenBao instance (the Vault OSS fork) runs per site, hosted on the edge E200s. It suits the E200 well: a single small Go binary in a ~2 GB VM. Decided 2026-07-02. There is deliberately no stretched Raft cluster: OpenBao has no cross-site replication, a 2-node Raft cluster over WireGuard loses quorum as soon as either node or the tunnel goes down, and the design mirrors the one-cluster-per-site rule.

Topology

              | Site A                                   | Site B
VM            | sa-bao-01 on sa-edge-01                  | sb-bao-01 on sb-edge-01
Final address | 10.10.30.40/24 (VLAN 30)                 | 10.20.30.40/24 (VLAN 30)
Specs         | 2 vCPU / 2 GB / 20 GB, Ubuntu noble      | same
Storage       | single-node integrated Raft              | same
Version       | 2.5.4 (pinned tarball, sha256-verified)  | same

Cross-site disaster recovery uses scheduled bao operator raft snapshot save runs, shipped to the other site over the backup path (VLAN 90, PBS pattern), so each site's secrets are restorable at the other. The snapshot timer and the cross-site shipping are not in place yet (see Still Deferred).
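As a rough illustration of the snapshot half of this path, the Python sketch below wraps bao operator raft snapshot save and records a SHA-256 checksum so the receiving site can verify the copy. The file-naming scheme, the output directory, and all helper names are assumptions made for the example. The shipping step over VLAN 90 is left out because its mechanism is not specified here, and the CLI is assumed to pick up its address and token from the environment.

```python
import hashlib
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def snapshot_path(site: str, out_dir: Path, now: datetime) -> Path:
    """Build a timestamped file name (hypothetical naming scheme)."""
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return out_dir / f"{site}-{stamp}.snap"


def snapshot_command(target: Path) -> list[str]:
    """Return the argv for a Raft snapshot written to `target`."""
    return ["bao", "operator", "raft", "snapshot", "save", str(target)]


def sha256_of(path: Path) -> str:
    """Hash the snapshot in 1 MiB chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def take_snapshot(site: str, out_dir: Path) -> tuple[Path, str]:
    """Save one snapshot and return its path and checksum."""
    target = snapshot_path(site, out_dir, datetime.now(timezone.utc))
    # check=True raises CalledProcessError if the instance is sealed or unreachable.
    subprocess.run(snapshot_command(target), check=True)
    return target, sha256_of(target)
```

A scheduled job could call take_snapshot("site-a", Path("/var/backups/bao")), where the directory is hypothetical, and ship the file together with its checksum.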

Seal / Unseal

Each instance auto-unseals via the other site's transit engine over the WireGuard tunnel: sa-bao-01 points at sb-bao-01 and vice versa. Because the tunnel is the unseal path, the OPNsense VM must start before the bao VM on each E200 (Proxmox startup order: OPNsense order=1, bao order=2 plus delay).
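The startup ordering can be sketched as a small Python helper that builds the qm set commands for one E200. The VM IDs and the wait time are hypothetical placeholders. One detail is worth checking against the Proxmox documentation: the up value is the pause after a guest starts before the next one is started, so this sketch attaches the delay to the OPNsense entry so that the tunnel has time to come up before bao boots.

```python
def startup_option(order: int, up_delay: int = 0) -> str:
    """Format a Proxmox startup string such as 'order=1,up=60'."""
    if order < 0 or up_delay < 0:
        raise ValueError("order and up_delay must be non-negative")
    return f"order={order},up={up_delay}" if up_delay else f"order={order}"


def qm_commands(opnsense_vmid: int, bao_vmid: int, tunnel_wait: int) -> list[list[str]]:
    """Return argv lists that start OPNsense first, wait, then start bao.

    The VM IDs are placeholders; `tunnel_wait` is the number of seconds
    Proxmox pauses after starting OPNsense before it starts the next guest.
    """
    return [
        ["qm", "set", str(opnsense_vmid), "--onboot", "1",
         "--startup", startup_option(1, tunnel_wait)],
        ["qm", "set", str(bao_vmid), "--onboot", "1",
         "--startup", startup_option(2)],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for argv in qm_commands(opnsense_vmid=101, bao_vmid=102, tunnel_wait=60):
        print(" ".join(argv))
```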

Cold-start deadlock — documented break-glass

If both sites are down at once, neither can auto-unseal. Recovery: seal-migrate one instance back to Shamir (temporary seal stanza swap plus recovery keys), unseal it manually, let the other auto-unseal against it, then migrate back. Both sites' recovery keys live in the password manager — this is the documented recovery path, not an improvisation.
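To make the deadlock condition concrete, here is a minimal Python sketch that reads each instance's unauthenticated sys/seal-status endpoint and decides whether the break-glass path is needed. The function names and the three result labels are illustrative assumptions, not part of any runbook.

```python
import json
import urllib.request
from typing import Optional


def fetch_sealed(addr: str, timeout: float = 3.0) -> Optional[bool]:
    """Return the 'sealed' flag from sys/seal-status, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{addr}/v1/sys/seal-status", timeout=timeout) as resp:
            return bool(json.load(resp)["sealed"])
    except (OSError, ValueError, KeyError):
        # Network errors, malformed JSON, or a missing field all count as "unknown".
        return None


def recovery_action(site_a_sealed: Optional[bool], site_b_sealed: Optional[bool]) -> str:
    """Classify the pair: only an instance that is reachable and unsealed can unseal its peer."""
    a_up = site_a_sealed is False
    b_up = site_b_sealed is False
    if a_up and b_up:
        return "healthy"
    if a_up or b_up:
        # The unsealed side can serve transit requests once the tunnel is up.
        return "wait-for-auto-unseal"
    # Neither side can unseal the other: fall back to the Shamir seal migration.
    return "break-glass"
```

A sealed instance and an unreachable one are treated the same way here, because in both cases its transit engine cannot serve the peer.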

Interim — Site A only (2026-07-02)

Site B is non-operational, so sa-bao-01 runs standalone on Shamir manual unseal (3 of 5 keys, entered by hand after the rare VM reboot). This is bootstrap step 1, not throwaway work: the transit auto-unseal steps execute when Site B comes online. The site-b Pulumi stack stays parked at enabled: false.

Secret Tiers

Tier          | Store                              | Contents
0 - bootstrap | repo-root .env.local (gitignored)  | Proxmox creds, PULUMI_CONFIG_PASSPHRASE, ANSIBLE_VAULT_PASSWORD: everything needed to (re)build bao itself
1 - runtime   | OpenBao KV v2 (per site)           | app/service credentials, API keys, cert material

Tier 0 can never migrate into bao: bao cannot hold the secrets that build bao.

Consumers (bao-first, executed 2026-07-01)

The KV v2 mount is homelab, with paths opnsense/site-a, unifi/site-a, and pulumi/proxmox; Site B mirrors under */site-b on sb-bao-01 later. Access is per-consumer read-only via AppRole (opnsense-config, unifi-config, pulumi-provision), with role_id/secret_id in root .env.local.

  • Ansible (opnsense/config, unifi/config): secret vars are KV v2 lookups that fall back to the encrypted vault-credentials.yml files when bao is unreachable or sealed — the files stay in place as documented break-glass, re-synced via make openbao-config-seed.
  • Pulumi: make targets overlay PROXMOX_VE_* from homelab/pulumi/proxmox, silently falling back to .env.local on any failure. Stack encryption stays on the local passphrase provider — a bao-backed provider cannot fall back when bao is sealed, and aorxi-openbao could never use it anyway.
  • Unsetting BAO_ADDR forces pure fallback mode everywhere.
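The bao-first pattern with fallback can be sketched in Python as follows. This is an illustration under stated assumptions, not the make targets themselves: the variable names BAO_ROLE_ID and BAO_SECRET_ID, the default approle auth mount, and the idea that the KV entry stores keys already named PROXMOX_VE_* are all hypothetical. The HTTP paths follow the Vault-compatible API (AppRole login, then a KV v2 read under homelab/data/...).

```python
import json
import urllib.request
from pathlib import Path
from typing import Mapping, Optional


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines; comments and blank lines are skipped."""
    values: dict[str, str] = {}
    if not path.exists():
        return values
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def _request(url: str, token: Optional[str] = None, body: Optional[dict] = None) -> dict:
    """Send a GET (no body) or POST (JSON body) and decode the JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if token:
        req.add_header("X-Vault-Token", token)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def read_proxmox_secret(env: Mapping[str, str]) -> dict:
    """Log in with AppRole, then read homelab/pulumi/proxmox from KV v2."""
    addr = env["BAO_ADDR"].rstrip("/")
    login = _request(
        f"{addr}/v1/auth/approle/login",
        body={"role_id": env["BAO_ROLE_ID"], "secret_id": env["BAO_SECRET_ID"]},
    )
    token = login["auth"]["client_token"]
    secret = _request(f"{addr}/v1/homelab/data/pulumi/proxmox", token=token)
    return secret["data"]["data"]  # KV v2 nests the payload under data.data


def proxmox_env(env: Mapping[str, str], fallback: Mapping[str, str]) -> dict[str, str]:
    """Overlay PROXMOX_VE_* from bao; fall back silently on any failure."""
    base = {k: v for k, v in fallback.items() if k.startswith("PROXMOX_VE_")}
    if not env.get("BAO_ADDR"):
        return base  # BAO_ADDR unset: pure fallback mode
    try:
        secret = read_proxmox_secret(env)
    except (OSError, ValueError, KeyError, TypeError):
        return base  # unreachable, sealed, bad credentials, or unexpected reply
    overlay = {k: str(v) for k, v in secret.items() if k.startswith("PROXMOX_VE_")}
    return {**base, **overlay}
```

A caller would pass os.environ and parse_env_file(Path(".env.local")). Because every failure path returns the fallback values unchanged, a sealed or unreachable bao never blocks a run.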

Still Deferred

  • Transit engines and seal migration (needs Site B).
  • TLS via Let's Encrypt DNS-01 (sa-bao-01.core.aorxi.io). Never expose port 8200 beyond the local segment while TLS is off.
  • The Raft snapshot timer with cross-site shipping.
  • The api_addr correction at the VLAN 30 cutover.

Init and recovery-key handling stay a manual runbook.

  • Initial Site Bootstrap — where bao provisioning and the bao-first migration sit in the build order
  • DNS VMs — the internal zone that will name the bao endpoints
  • Decisions Log — dated entries for the 2026-07-01/02 secrets decisions
