Site B Ceph Storage

Site B Ceph plan: five-node cluster across FN8TP and FN4T hardware, replication size 3, public (VLAN 60) and cluster (VLAN 65) network split, Micron drive recommendations, and Proxmox pveceph deployment.

Site B runs a Ceph cluster distributed across all five compute nodes, providing block storage for Kubernetes and VM workloads. This page covers node roles, OSD sizing, network layout, drive recommendations, and deployment via pveceph.

No stretched Ceph

Site B Ceph stays local to Site B. Never stretch Ceph across the WireGuard link to Site A. Cross-site disaster recovery is handled by Proxmox Backup Server (PBS) replication — not Ceph replication.

Node Roles

All five Site B nodes participate in the Ceph cluster. The two FN8TP nodes (sb-cmp-01, sb-cmp-02) run Ceph MON and MGR processes alongside Kubernetes control-plane workloads. The three FN4T nodes (sb-cmp-03 through sb-cmp-05) serve as high-core OSD and Kubernetes worker nodes. Additional MON/MGR instances may be deployed on FN4T nodes as appropriate.

| Node | Hardware | Ceph Role |
|---|---|---|
| sb-cmp-01 | SYS-5019D-4C-FN8TP | MON, MGR, OSD |
| sb-cmp-02 | SYS-5019D-4C-FN8TP | MON, MGR, OSD |
| sb-cmp-03 | SYS-5018D-FN4T | OSD, K8s worker |
| sb-cmp-04 | SYS-5018D-FN4T | OSD, K8s worker |
| sb-cmp-05 | SYS-5018D-FN4T | OSD, K8s worker |
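
When deciding how many MON instances to run, it helps to remember that Ceph monitors need a strict majority to form a quorum. The sketch below works through that arithmetic for a few monitor counts; the function name `mon_quorum` is illustrative and not part of any Ceph tooling.

```python
def mon_quorum(mon_count: int) -> tuple[int, int]:
    """Return (quorum_size, tolerated_failures) for a given number of MONs."""
    if mon_count < 1:
        raise ValueError("need at least one monitor")
    quorum = mon_count // 2 + 1          # strict majority
    return quorum, mon_count - quorum    # MONs that can fail while quorum holds


for count in (2, 3, 5):
    quorum, spare = mon_quorum(count)
    print(f"{count} MONs: quorum {quorum}, tolerates {spare} MON failure(s)")
```

Two monitors tolerate no monitor failure, while three tolerate one and five tolerate two, which is the usual reason for running an odd number of MONs.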

OSD Plan

Each node targets 4–6 enterprise 1.92 TB SSDs as OSDs, for a planned cluster total of 20–30 OSDs. Replication size 3 is the fixed policy: every object is written to three OSDs on three distinct nodes.

Tentative — OSD counts not yet finalized

The per-node OSD count (4–6) and cluster total (20–30) are planning targets. Exact counts depend on drive procurement and per-chassis slot availability. Confirm slot counts before ordering drives.
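
To put the planning targets above in perspective, the following sketch estimates raw and usable capacity at both ends of the 4 to 6 OSDs-per-node range. It assumes decimal gigabytes (1.92 TB = 1920 GB), ignores filesystem and BlueStore overhead, and uses Ceph's default nearfull warning ratio of 85%. The names `capacity_gb` and `NEARFULL_PERCENT` are illustrative.

```python
DRIVE_GB = 1920         # one 1.92 TB enterprise SSD, decimal gigabytes
NODES = 5               # all five Site B nodes carry OSDs
REPLICA_SIZE = 3        # fixed replication policy
NEARFULL_PERCENT = 85   # Ceph's default nearfull warning threshold


def capacity_gb(osds_per_node: int) -> dict[str, int]:
    """Estimate cluster capacity for a uniform per-node OSD count."""
    osds = NODES * osds_per_node
    raw = osds * DRIVE_GB
    usable = raw // REPLICA_SIZE  # every object is stored three times
    return {
        "osds": osds,
        "raw_gb": raw,
        "usable_gb": usable,
        "usable_before_nearfull_gb": usable * NEARFULL_PERCENT // 100,
    }


for per_node in (4, 6):
    print(per_node, capacity_gb(per_node))
```

With 4 OSDs per node this gives 38,400 GB raw and 12,800 GB usable; with 6 OSDs per node, 57,600 GB raw and 19,200 GB usable. Headroom for re-replication after a node failure reduces the practical figure further.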

Boot drive policy

Always use a 512 GB M.2 drive for the Proxmox OS. Never consume a 1.92 TB enterprise SSD as a boot device.

Drive Recommendations

| Use Case | Drive | Rationale |
|---|---|---|
| Write-heavy OSDs (databases, high-ingest workloads) | Micron 5200 MAX | High endurance; write-optimized |
| VM disk images and Kubernetes persistent volumes | Micron 5300 Pro | Balanced read/write for general storage |

Ceph Networks

Ceph uses a two-network split: a public network for client I/O, monitor traffic, and OSD heartbeats, and a separate cluster network for OSD-to-OSD replication and recovery traffic. VLAN 65 has no gateway by design, because replication traffic must not leave the local site.

| Network | VLAN | Subnet | Gateway | NIC |
|---|---|---|---|---|
| Storage / Ceph public | 60 | 10.20.60.0/24 | yes | X710-T4 port 3 |
| Ceph cluster | 65 | 10.20.65.0/24 | no | onboard 10GBASE-T |

No gateway on VLAN 65

VLAN 65 (Ceph cluster) has no OPNsense gateway. Do not assign a default gateway to any host interface on this VLAN. Ceph replication traffic must remain local and must never traverse WireGuard.
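
One way to guard against an accidental gateway on VLAN 65 is to scan a node's ifupdown-style network configuration for `gateway` lines under any interface addressed in 10.20.65.0/24. The sketch below is a simplified parser, not a full implementation of the `/etc/network/interfaces` grammar. The interface names `ceph-pub` and `ceph-clu` and the sample gateway address are hypothetical.

```python
import ipaddress

CLUSTER_NET = ipaddress.ip_network("10.20.65.0/24")


def ifaces_with_gateway_on(network, interfaces_text):
    """Return iface names addressed inside `network` that also set a gateway."""
    stanzas = {}    # iface name -> {"in_net": bool, "gateway": bool}
    current = None
    for raw in interfaces_text.splitlines():
        words = raw.split("#", 1)[0].split()   # drop comments, tokenize
        if not words:
            continue
        if words[0] == "iface" and len(words) > 1:
            current = stanzas.setdefault(words[1], {"in_net": False, "gateway": False})
        elif current is not None and words[0] == "address" and len(words) > 1:
            addr = ipaddress.ip_interface(words[1]).ip
            current["in_net"] = current["in_net"] or addr in network
        elif current is not None and words[0] == "gateway":
            current["gateway"] = True
    return [name for name, s in stanzas.items() if s["in_net"] and s["gateway"]]


# Hypothetical sample; the gateway line is deliberately wrong for demonstration.
SAMPLE = """
auto ceph-pub
iface ceph-pub inet static
    address 10.20.60.20/24

auto ceph-clu
iface ceph-clu inet static
    address 10.20.65.20/24
    gateway 10.20.65.1
"""

print(ifaces_with_gateway_on(CLUSTER_NET, SAMPLE))
```

An empty list means no interface on the cluster network carries a gateway.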

Per-Node Network IPs

| Node | Ceph public (10.20.60.x) | Ceph cluster (10.20.65.x) |
|---|---|---|
| sb-cmp-01 | 10.20.60.20 | 10.20.65.20 |
| sb-cmp-02 | 10.20.60.21 | 10.20.65.21 |
| sb-cmp-03 | 10.20.60.30 | 10.20.65.30 |
| sb-cmp-04 | 10.20.60.31 | 10.20.65.31 |
| sb-cmp-05 | 10.20.60.32 | 10.20.65.32 |
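
The address plan above can be sanity-checked before it is applied to hosts. This sketch verifies that each address falls in the correct subnet, that no address is reused, and that each node uses the same host octet on both networks. The matching-octet rule is a convention inferred from the table, not a stated requirement, and `check_plan` is an illustrative name.

```python
import ipaddress

PUBLIC_NET = ipaddress.ip_network("10.20.60.0/24")
CLUSTER_NET = ipaddress.ip_network("10.20.65.0/24")

PLAN = {
    "sb-cmp-01": ("10.20.60.20", "10.20.65.20"),
    "sb-cmp-02": ("10.20.60.21", "10.20.65.21"),
    "sb-cmp-03": ("10.20.60.30", "10.20.65.30"),
    "sb-cmp-04": ("10.20.60.31", "10.20.65.31"),
    "sb-cmp-05": ("10.20.60.32", "10.20.65.32"),
}


def check_plan(plan):
    """Return a list of human-readable problems; empty means the plan is consistent."""
    problems = []
    seen = set()
    for node, (public, cluster) in plan.items():
        pub = ipaddress.ip_address(public)
        clu = ipaddress.ip_address(cluster)
        if pub not in PUBLIC_NET:
            problems.append(f"{node}: {pub} is outside {PUBLIC_NET}")
        if clu not in CLUSTER_NET:
            problems.append(f"{node}: {clu} is outside {CLUSTER_NET}")
        # Assumed convention: same last octet on both networks.
        if public.rsplit(".", 1)[1] != cluster.rsplit(".", 1)[1]:
            problems.append(f"{node}: host octets differ ({public} vs {cluster})")
        for addr in (pub, clu):
            if addr in seen:
                problems.append(f"{node}: {addr} is assigned more than once")
            seen.add(addr)
    return problems


print(check_plan(PLAN))
```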

Deployment

Ceph is deployed and managed through Proxmox via pveceph. Do not install Ceph packages manually outside the Proxmox toolchain.

Ceph Release and Repository

The cluster runs Ceph Tentacle (not Squid). The Ansible baseline role configures each node with the Proxmox ceph-tentacle no-subscription repository at download.proxmox.com/debian/ceph-tentacle (suite: trixie) and removes the Proxmox enterprise source file entirely.

Package version management

Proxmox owns the Ceph package versions through this repository. The Ansible packages role blacklists the full Ceph stack (ceph.*, radosgw.*, librados.*, librbd.*, libcephfs.*, librgw.*, python3-ceph*) from unattended-upgrades so that security auto-patching never desynchronizes OSD and MON versions across nodes. Ceph upgrades are a manual, staged Proxmox operation.
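
To see which package names the blacklist patterns cover, the sketch below applies them the way unattended-upgrades is understood to: as Python regular expressions matched from the start of the package name. That matching behavior is an assumption here and should be confirmed against the installed unattended-upgrades version. The package names tested are common Debian Ceph packages chosen for illustration.

```python
import re

# Patterns as listed for the Ansible packages role.
BLACKLIST = [
    "ceph.*", "radosgw.*", "librados.*", "librbd.*",
    "libcephfs.*", "librgw.*", "python3-ceph*",
]


def is_blacklisted(package: str) -> bool:
    """True if any pattern matches at the start of the package name."""
    return any(re.match(pattern, package) for pattern in BLACKLIST)


for name in ("ceph-osd", "ceph-mon", "librados2", "libcephfs2",
             "python3-ceph-argparse", "python3-rbd", "openssl"):
    print(f"{name:24} {'held' if is_blacklisted(name) else 'auto-upgradable'}")
```

Because matching is anchored at the start of the name, `ceph.*` does not cover `libcephfs2`, which is why the library packages need their own patterns. Under the same assumption, a name such as `python3-rbd` matches none of the listed patterns, so it is worth comparing the list against the packages actually installed on a node.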

Deploy Sequence

Complete these steps after the sb-pve cluster is formed and all nodes have their final 10.20.20.x Proxmox Management IPs.

  1. Confirm all five nodes are joined to sb-pve and reachable on VLAN 20.
  2. Verify VLAN 60 and VLAN 65 interfaces are up and correctly addressed on all nodes.
  3. Initialize Ceph on sb-cmp-01, specifying the public network (10.20.60.0/24) and cluster network (10.20.65.0/24).
  4. Add Ceph MON on all five nodes; bring up MGR on sb-cmp-01 and sb-cmp-02 first.
  5. Create OSDs per node using the designated enterprise SSDs.
  6. Create pools for VM disk images and Kubernetes persistent volumes.
  7. Verify cluster health (ceph -s) after each node is added.
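
Steps 3 through 7 above map onto a short series of `pveceph` and `ceph` commands. The sketch below only builds and prints that command plan as a dry run; it executes nothing. It assumes the Ceph packages are already installed on every node, that steps 1 and 2 have been verified separately, and that `min_size 2` is acceptable for the pools. The OSD device path `/dev/sdX` and the pool names `vm-disks` and `k8s-pv` are hypothetical placeholders.

```python
PUBLIC_NET = "10.20.60.0/24"
CLUSTER_NET = "10.20.65.0/24"
NODES = [f"sb-cmp-{n:02d}" for n in range(1, 6)]
MGR_NODES = NODES[:2]            # sb-cmp-01 and sb-cmp-02
OSD_DEVICES = ["/dev/sdX"]       # hypothetical; list each node's real SSDs
POOLS = ["vm-disks", "k8s-pv"]   # hypothetical pool names


def deploy_plan():
    """Return an ordered list of (node, command) pairs for steps 3-7."""
    first = NODES[0]
    plan = [(first, f"pveceph init --network {PUBLIC_NET} "
                    f"--cluster-network {CLUSTER_NET}")]              # step 3
    plan += [(node, "pveceph mon create") for node in NODES]         # step 4
    plan += [(node, "pveceph mgr create") for node in MGR_NODES]     # step 4
    for node in NODES:                                               # step 5
        plan += [(node, f"pveceph osd create {dev}") for dev in OSD_DEVICES]
        plan.append((node, "ceph -s"))                               # step 7
    plan += [(first, f"pveceph pool create {pool} --size 3 --min_size 2")
             for pool in POOLS]                                      # step 6
    plan.append((first, "ceph -s"))                                  # step 7
    return plan


for node, command in deploy_plan():
    print(f"[{node}] {command}")
```

Keeping the plan as data makes it easy to review the order of operations before running anything by hand or through Ansible.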

Cross-Site Disaster Recovery

Ceph replication does not cross the WireGuard link. PBS replication from PBS-B (backup-data 10.20.90.40) to PBS-A (backup-data 10.10.90.40) provides cross-site disaster recovery for VM and container data. See PBS & Backups for the replication configuration.
