Homelab Proxmox Cluster Design with Role-Based Node Separation
A heterogeneous Proxmox cluster with role-separated nodes (reverse proxy, database, app containers, heavy compute) provides better resilience and cost efficiency than a single powerful server.
The design targets small-scale web application hosting (Elixir/Phoenix, Docker containers).

**Reference architecture:**

| Node | Role | Hardware | Cost |
|------|------|----------|------|
| Node 1 | Caddy reverse proxy + DNS + monitoring | Cheapest available (HP t630 / N100 8GB) | ~€50–80 |
| Node 2 | PostgreSQL + pgbouncer | Mid-tier with fast NVMe (CPU barely matters) | ~€90 |
| Node 3–4 | Application containers | ThinkCentre M720q 16GB | ~€120 each |
| Node 5 | Heavy jobs (scraping, LLM inference, CI) | M920q i7 or beefier Ryzen box | ~€160+ |

**Why role separation matters:**

- If an app node OOMs during a heavy scraping session, Postgres keeps running untouched
- The database node needs fast NVMe and stable RAM, not CPU: an €80 N100 box with a good SSD is a perfectly adequate Postgres server for low-traffic apps
- Proxmox clusters work fine with heterogeneous hardware; nodes just need the same Proxmox version and network connectivity
- Live VM migration between mixed AMD/Intel nodes generally works if the VM CPU type is set to a generic baseline like `x86-64-v2`

**Networking essentials:**

- Tailscale for private mesh networking (SSH, database access between nodes)
- Cloudflare Tunnel for public-facing services; running it alongside Tailscale is a common hybrid setup
- Pangolin as a self-hosted alternative to the Tailscale/Cloudflare combination
- TP-Link TL-SG108E (8-port 1GbE managed, ~€25) for basic inter-node networking
- MikroTik CRS310-8G+2S+IN (~$219 USD) for 2.5GbE managed switching

**Home hosting extras often overlooked:**

- A UPS (APC Back-UPS 700VA, ~€60–80) prevents Postgres corruption during power blips and provides 15–20 minutes of graceful shutdown time at ~30–40W cluster draw
- Home internet dependency is the real vulnerability: keep a cloud VPS as the public-facing edge (Caddy, DNS) and run compute at home
- For nodes without vPro, a cheap KVM-over-IP dongle (~€30) enables remote rescue of unresponsive nodes
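
To put the "economics" claim in concrete terms, the prices quoted above can be tallied into a rough budget. This is only a sketch under stated assumptions: the dictionary keys are made-up labels, Node 5 is counted at its €160 floor (the source lists it as "€160+"), the switch is the 1GbE TP-Link rather than the MikroTik, and a single KVM-over-IP dongle is assumed.

```python
# (low, high) price ranges in EUR, copied from the figures quoted above.
# Keys are illustrative labels, not names used by Proxmox or any tool.
NODES = {
    "node1_proxy": (50, 80),
    "node2_postgres": (90, 90),
    "node3_app": (120, 120),
    "node4_app": (120, 120),
    # Listed as "~EUR 160+": 160 is a floor, so the real upper bound is open.
    "node5_heavy": (160, 160),
}

# Assumptions: the 1GbE TP-Link switch and one KVM-over-IP dongle.
EXTRAS = {
    "switch_tl_sg108e": (25, 25),
    "ups_apc_700va": (60, 80),
    "kvm_over_ip_dongle": (30, 30),
}


def total(items):
    """Sum the low and high ends of each price range separately."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


if __name__ == "__main__":
    nodes_low, nodes_high = total(NODES)
    extras_low, extras_high = total(EXTRAS)
    print(f"Nodes:  EUR {nodes_low}-{nodes_high} (Node 5 at its floor price)")
    print(f"Extras: EUR {extras_low}-{extras_high}")
    print(f"Total:  EUR {nodes_low + extras_low}-{nodes_high + extras_high}")
```

Under these assumptions the five nodes come to roughly €540–570 and the full setup to roughly €655–705, before any upgrade to Node 5 or to 2.5GbE switching.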