AORXI Homelab

Build Phases

End-to-end build sequence for the two-site homelab: seven phases from flat-network bootstrap to a full Kubernetes stack on Proxmox, OPNsense, Ceph, and WireGuard.

Each phase below describes the topology at that step, what changes from the previous phase, and the condition that marks the phase complete.

Two phase numbering schemes in this project

Two views of the same build coexist in the documentation. vault/13-build-phases.md (the primary build reference and the source for this page) defines seven phases numbered 0–6: an explicit Phase 0 for initial bootstrapping, separate phases for each OPNsense site insertion, and distinct phases for Proxmox clustering, storage, and Kubernetes. CLAUDE.md § "OPNsense Migration Approach" presents a condensed six-phase migration view numbered 1–6: it omits the bootstrap step that precedes Phase 1, merges final-IP migration and cluster formation into one phase, and combines storage and Kubernetes into a single Phase 6. Both sequences cover the same work and differ only in granularity. This page follows the seven-phase sequence from vault/13-build-phases.md as the canonical build reference.

Site Building Blocks

| Tier | Site A | Site B |
|---|---|---|
| Edge router (bootstrap / fallback) | UniFi Gateway Max | USG Pro |
| Firewall / router | sa-fw-01 OPNsense VM on sa-edge-01 | sb-fw-01 OPNsense VM on sb-edge-01 |
| 10 Gb L2 core | sa-sw-01 (Netgear XS716T) | sb-sw-01 (Netgear XS748T) |
| Access / IPMI / AP | sa-sw-02 / sa-sw-03 | sb-sw-02 |
| Servers | sa-edge-01 (E200), sa-stor-01 (5049A-T), sa-cmp-01 (P51), sa-cmp-02 (P52) | sb-edge-01 (E200), sb-cmp-01 / sb-cmp-02 (FN8TP), sb-cmp-03 / sb-cmp-04 / sb-cmp-05 (FN4T) |

Phase Summary

| # | Phase | Key change |
|---|---|---|
| 0 | Bootstrap | All nodes on temp IPs, flat behind UniFi |
| 1 | OPNsense A | Insert sa-fw-01, demote UniFi Gateway Max |
| 2 | OPNsense B | Insert sb-fw-01, demote USG Pro |
| 3 | WireGuard | Site-to-site VPN, 10.255.0.0/24 transit |
| 4 | Clusters | Final VLAN 20 / 25 IPs, sa-pve + sb-pve |
| 5 | Storage | ZFS + PBS-A (Site A), Ceph size 3 (Site B), PBS DR |
| 6 | Kubernetes | VLAN 40 nodes, Cilium / MetalLB / ArgoCD stack |

Phase 0 — Bootstrap

All nodes are racked and imaged behind the existing UniFi routers on a flat network. No VLANs, no OPNsense yet. This is the staging state before any segmentation.

Topology: ISP → UniFi router → Netgear core → all servers (both sites independent, flat)

| | Site A | Site B |
|---|---|---|
| Temporary subnet | 192.168.1.0/24 | 192.168.16.0/24 |

Deliverables:

  • Proxmox installed on all servers
  • Netgear and UniFi switches cabled but not yet VLAN-configured
  • Both sites independent and flat

Phase complete when: all nodes are reachable on their temporary IPs and the Proxmox web UI is accessible at each host.
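The completion check above can be scripted. The following is a minimal sketch, assuming the Proxmox web UI listens on its default port 8006 and that a successful TCP connect is an acceptable stand-in for "the web UI is accessible". The function names and any host addresses passed in are hypothetical and not part of the build reference.

```python
import socket
from typing import Callable, Dict, Iterable

PROXMOX_UI_PORT = 8006  # default Proxmox VE web UI port (assumed unchanged)


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_bootstrap(
    hosts: Iterable[str],
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> Dict[str, bool]:
    """Probe the web UI port on every host and report the result per host."""
    return {host: probe(host, PROXMOX_UI_PORT) for host in hosts}


def phase0_complete(results: Dict[str, bool]) -> bool:
    """Phase 0 is done only if at least one host was probed and all responded."""
    return bool(results) and all(results.values())
```

Calling `check_bootstrap([...])` with the temporary addresses of every server at both sites (inside 192.168.1.0/24 and 192.168.16.0/24) and passing the result to `phase0_complete` gives a single yes/no answer. The `probe` parameter exists so the check can be exercised without touching the network.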


Phase 1 — OPNsense at Site A

OPNsense comes online upstream of the UniFi Gateway Max at Site A. VLANs 10–120 begin on the Netgear switch and OPNsense. Site B remains in bootstrap.

Topology: ISP → OPNsense (sa-fw-01) → UniFi WAN → existing users

Deliverables:

  • sa-fw-01 OPNsense running as VM on sa-edge-01
  • UniFi Gateway Max demoted to users / Wi-Fi (double-NAT — intentional)
  • VLANs 10, 20, 25, 30, and above tagged on sa-sw-01

Phase complete when: sa-fw-01 is handling WAN, VLAN gateways are active on sa-sw-01, and existing users behind UniFi continue to work.

See Migration Phases for the full OPNsense migration detail.


Phase 2 — OPNsense at Site B

The same OPNsense insertion is repeated at Site B. Both sites now route through OPNsense. Both UniFi routers are demoted to users / Wi-Fi.

Topology: ISP → OPNsense (sb-fw-01) → USG Pro WAN → existing users

Deliverables:

  • sb-fw-01 OPNsense running as VM on sb-edge-01
  • USG Pro demoted to users / Wi-Fi (double-NAT — intentional)
  • Both sites fully on OPNsense edge
  • Internal DNS served by OPNsense Unbound (core.aorxi.io) — Technitium DNS VMs take over in Phase 5

Phase complete when: sb-fw-01 is handling WAN at Site B; both sites are independently running OPNsense edge routers.


Phase 3 — WireGuard Site-to-Site VPN

A WireGuard tunnel between the two OPNsense instances links the sites with routed connectivity only — no stretched L2. This carries cross-site management traffic and, later, Proxmox Backup Server (PBS) replication.

Never stretch L2 between sites

Inter-site connectivity is routed only. Never bridge L2 across sites. The WireGuard tunnel carries 10.10.0.0/16 ↔ 10.20.0.0/16 as routed subnets.

| Parameter | Value |
|---|---|
| Transit subnet | 10.255.0.0/24 |
| OPNsense-A address | 10.255.0.1 |
| OPNsense-B address | 10.255.0.2 |
| Routes carried | 10.10.0.0/16 ↔ 10.20.0.0/16 |

Phase complete when: 10.10.0.0/16 and 10.20.0.0/16 are mutually reachable over the WireGuard tunnel with routed traffic only; no L2 bridging exists between sites.
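Before configuring the tunnel, the address plan can be sanity-checked offline. The sketch below uses only the values listed for this phase and Python's standard `ipaddress` module. The function names are hypothetical, and the check covers the plan on paper only, not live routing.

```python
import ipaddress
from itertools import combinations

TRANSIT = ipaddress.ip_network("10.255.0.0/24")
SITE_A = ipaddress.ip_network("10.10.0.0/16")
SITE_B = ipaddress.ip_network("10.20.0.0/16")
TUNNEL_ENDPOINTS = {
    "OPNsense-A": ipaddress.ip_address("10.255.0.1"),
    "OPNsense-B": ipaddress.ip_address("10.255.0.2"),
}


def validate_plan() -> list:
    """Return a list of problems in the address plan (empty if none)."""
    problems = []
    networks = {"transit": TRANSIT, "site-a": SITE_A, "site-b": SITE_B}
    # Routed-only design: no two of these networks may overlap.
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")
    # Both tunnel endpoints must live inside the transit subnet.
    for name, addr in TUNNEL_ENDPOINTS.items():
        if addr not in TRANSIT:
            problems.append(f"{name} address {addr} is outside {TRANSIT}")
    return problems


def remote_routes(local_site: str) -> list:
    """Networks a site sends through the tunnel: the other site's supernet."""
    routes = {"A": [SITE_B], "B": [SITE_A]}
    return routes[local_site]
```

An empty list from `validate_plan()` means the transit subnet and the two site supernets are disjoint, which is the precondition for routing them over the tunnel without any L2 bridging.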


Phase 4 — Final IPs and Proxmox Clusters

Proxmox management moves to final VLAN 20 IPs and Corosync moves to VLAN 25 (no gateway). One Proxmox cluster per site is formed — never stretched across the VPN. WAN cuts over from the UniFi routers to OPNsense.

One cluster per site — no cross-WAN Proxmox clusters

sa-pve at Site A and sb-pve at Site B are never stretched across WAN or WireGuard. Do not cluster until every node in the site has its final 10.x.20.x IP and /etc/hosts is correct.

| Cluster | Create on | Join count |
|---|---|---|
| sa-pve | sa-stor-01 | 3 (remaining Site A nodes) |
| sb-pve | sb-cmp-01 | 5 (remaining Site B nodes) |
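The rule about final IPs and /etc/hosts lends itself to a pre-flight check before creating or joining a cluster. This is a rough sketch: it assumes the management network is a /24 per site (for example 10.10.20.0/24 at Site A), which the 10.x.20.x notation suggests but does not state, and the host addresses in the usage note are hypothetical.

```python
import ipaddress


def parse_hosts(text: str) -> dict:
    """Map hostname -> IP from /etc/hosts-style text, ignoring comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name] = ip
    return mapping


def preflight(nodes: list, hosts_text: str, mgmt_net: str) -> list:
    """Return problems that should block cluster formation (empty if none)."""
    net = ipaddress.ip_network(mgmt_net)
    hosts = parse_hosts(hosts_text)
    problems = []
    for node in nodes:
        ip = hosts.get(node)
        if ip is None:
            problems.append(f"{node}: missing from hosts file")
        elif ipaddress.ip_address(ip) not in net:
            problems.append(f"{node}: {ip} is not in {net}")
    return problems
```

For Site A, `preflight(["sa-stor-01", "sa-edge-01", "sa-cmp-01", "sa-cmp-02"], open("/etc/hosts").read(), "10.10.20.0/24")` would flag any node still resolving to a temporary bootstrap address. Running it on every node of the site catches hosts files that have drifted apart.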

Deliverables:

  • Proxmox Management VLAN 20 — all nodes at final 10.x.20.x IPs
  • Corosync heartbeat VLAN 25 — no gateway, dedicated links
  • sa-pve cluster formed at Site A; sb-pve cluster formed at Site B

Phase complete when: both clusters are formed, all nodes show online in Proxmox, and Corosync quorum is stable on each site.
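To make "quorum is stable" concrete, the majority arithmetic for each cluster can be worked out from the node counts above (one creating node plus the joining nodes). This sketch assumes the default of one vote per node and no external QDevice. Neither assumption is confirmed by the build reference.

```python
def quorum_votes(total_nodes: int) -> int:
    """Votes needed for a strict majority, assuming one vote per node."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def tolerable_failures(total_nodes: int) -> int:
    """Nodes that can be offline while the rest still hold a majority."""
    return total_nodes - quorum_votes(total_nodes)


# Creating node + joining nodes, taken from the cluster table.
CLUSTERS = {"sa-pve": 1 + 3, "sb-pve": 1 + 5}

for name, nodes in CLUSTERS.items():
    print(
        f"{name}: {nodes} nodes, quorum {quorum_votes(nodes)}, "
        f"survives {tolerable_failures(nodes)} node failure(s)"
    )
```

Under these assumptions sa-pve (4 nodes) needs 3 votes and tolerates one node down, while sb-pve (6 nodes) needs 4 votes and tolerates two. Both clusters have an even node count, so each tolerates no more failures than a cluster with one node fewer would.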

See Proxmox Clusters for cluster formation and storage configuration.


Phase 5 — Storage and Backups

Site A builds ZFS mirror vdevs on sa-stor-01 for PBS-A. Site B builds a local Ceph cluster across five nodes with replication size 3. PBS replicates across the WireGuard VPN for cross-site disaster recovery.

No stretched Ceph

Site B Ceph stays local. Do not stretch Ceph across the WireGuard tunnel. Use PBS replication over WireGuard for cross-site DR instead.

| Item | Site A | Site B |
|---|---|---|
| Storage type | ZFS mirror vdevs | Ceph, replication size 3 |
| Primary host | sa-stor-01 (5049A-T) | sb-cmp-01 through sb-cmp-05 |
| PBS mgmt IP | 10.10.30.20 | 10.20.30.20 |
| PBS backup-data IP | 10.10.90.40 | 10.20.90.40 |
| Cross-site DR | PBS replication over WireGuard | PBS replication over WireGuard |

  • Ceph public network: VLAN 60 (10.20.60.0/24)
  • Ceph cluster network: VLAN 65 (10.20.65.0/24, no GW)

Phase complete when: ZFS pools are healthy on sa-stor-01, Site B Ceph reaches health OK at replication size 3, and PBS replication jobs are running across the WireGuard VPN.
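Replication size 3 means every object is stored three times, so usable capacity is roughly one third of raw capacity. The sketch below works through that arithmetic. The per-node raw capacities are hypothetical placeholders, and the 85% fill ceiling is an assumed planning margin rather than a value from the build reference.

```python
def ceph_usable_tb(
    raw_tb_per_node: list,
    replication_size: int = 3,
    fill_ceiling: float = 0.85,
) -> float:
    """Estimate usable capacity of a replicated pool in TB.

    raw_tb_per_node: raw OSD capacity contributed by each node.
    replication_size: copies kept of every object (3 for Site B).
    fill_ceiling: fraction of capacity to plan for before the pool
        is considered too full (assumed margin, tune as needed).
    """
    if replication_size < 1:
        raise ValueError("replication size must be at least 1")
    if not 0 < fill_ceiling <= 1:
        raise ValueError("fill ceiling must be in (0, 1]")
    raw_total = sum(raw_tb_per_node)
    return raw_total / replication_size * fill_ceiling


# Hypothetical example: five nodes with 4 TB of raw OSD capacity each.
example_nodes = [4.0] * 5
print(f"raw: {sum(example_nodes):.1f} TB")
print(f"usable at size 3: {ceph_usable_tb(example_nodes):.2f} TB")
```

With the placeholder figures, 20 TB raw yields about 5.67 TB of planned usable space. Substituting the real OSD sizes for sb-cmp-01 through sb-cmp-05 gives the figure to compare against expected VM and backup footprints.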


Phase 6 — Kubernetes and Full Stack

Kubernetes / OpenShift deploys on VLAN 40 node networks. Site B is the worker-heavy compute lab; Site A runs lighter clusters. GitOps, ingress, DNS, and monitoring complete the build.

| Parameter | Value |
|---|---|
| Node network | VLAN 40 (10.x0.40.0/22) |
| LB / VIPs | VLAN 50 (10.x0.50.0/24) |
| Pod CIDR | 10.128.0.0/14 |
| Service CIDR | 172.30.0.0/16 |
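The four ranges in the table above must not overlap at either site, or pod and service traffic can be misrouted. A minimal sketch of that check follows. It reads the `10.x0` notation as 10.10 for Site A and 10.20 for Site B, matching the site supernets used for the WireGuard routes, and the function names are hypothetical.

```python
import ipaddress
from itertools import combinations

# Second octet per site, following the 10.10.0.0/16 and 10.20.0.0/16 supernets.
SITE_OCTETS = {"A": 10, "B": 20}


def site_plan(site: str) -> dict:
    """Expand the 10.x0 placeholders into concrete networks for one site."""
    octet = SITE_OCTETS[site]
    return {
        "nodes": ipaddress.ip_network(f"10.{octet}.40.0/22"),
        "vips": ipaddress.ip_network(f"10.{octet}.50.0/24"),
        "pods": ipaddress.ip_network("10.128.0.0/14"),
        "services": ipaddress.ip_network("172.30.0.0/16"),
    }


def overlapping_pairs(plan: dict) -> list:
    """Return the names of every pair of networks that overlap."""
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(plan.items(), 2)
        if net_a.overlaps(net_b)
    ]


for site in SITE_OCTETS:
    plan = site_plan(site)
    print(site, plan["nodes"], "overlaps:", overlapping_pairs(plan))
```

Note that the /22 node network spans third octets 40 through 43 and offers 1022 usable host addresses per site, so the VIP range on 50 stays clear of it.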

Stack: Cilium (CNI, overlay mode) · MetalLB · cert-manager · external-dns · ArgoCD · ingress-nginx or Traefik

Deliverables:

  • K8s nodes running on VLAN 40; MetalLB assigning VIPs from VLAN 50
  • Site B = worker-heavy compute; Site A = lighter / CI clusters
  • ArgoCD as GitOps control plane

Phase complete when: K8s / OpenShift nodes are running on VLAN 40, MetalLB is issuing VIPs from VLAN 50, and ArgoCD is serving as the GitOps control plane.

See Kubernetes Planning for node networks, machine CIDRs, and the CNI stack.


Node State Legend

| State | Meaning |
|---|---|
| Existing | In place from a prior phase |
| Added / changed | New or modified this phase |
| Demoted | UniFi router demoted to users-only |
| Storage | Storage role (ZFS / Ceph / PBS) |
| K8s | Kubernetes node |
| VPN | WireGuard site-to-site tunnel |
