The Hardware
ElysianLab is a 15-year-old Sony Vaio that should have been retired years ago. Second-gen i3, 4GB RAM, running Debian. The battery is dead so it’s permanently plugged in, which makes it a desktop that happens to have a keyboard attached to it.
And yet it’s been running for 10 months without any major problems. The RAM constraint turned out to be a feature. It forced me to think about what I was running and why, instead of just throwing containers at it.
SeleneBox is an i5 6th gen mini PC. Recently acquired, not yet fully put to work.
Both machines are unattended. If a node goes down, I’m not there to press the power button. This shapes every decision I make about how things run.
What’s Running
Everything runs in Docker, organized into logical stacks, each with its own docker-compose.yaml:
- Nextcloud: personal cloud storage, the thing I’d miss most if it went down
- Gitea: self-hosted Git, because I don’t want all my personal repos on GitHub
- Vaultwarden: Bitwarden-compatible password manager
- Linkding: bookmark manager
- Memos: quick markdown notes
- MariaDB: shared database backend for several of the above
- Redis: key-value store, shared cache
- Traefik: reverse proxy, handles routing and TLS for everything
- Glance: home dashboard showing server stats and container health
- Beszel + Beszel Agent: lightweight monitoring, tracks CPU/RAM/disk across nodes
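A single stack's docker-compose.yaml might look roughly like the sketch below, using Linkding as the example. This is an assumption about the general shape, not the actual file: the hostname, the external network name `proxy`, and the volume name are hypothetical, and the labels assume Traefik's Docker provider is what discovers the containers.

```yaml
# Hypothetical linkding/docker-compose.yaml.
# Hostname, network name and volume name are placeholders, not taken from the real setup.
services:
  linkding:
    image: sissbruecker/linkding:latest
    restart: unless-stopped
    volumes:
      - linkding-data:/etc/linkding/data   # Linkding keeps its database here
    networks:
      - proxy                              # network shared with the Traefik stack
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.linkding.rule=Host(`links.example.internal`)"
      - "traefik.http.services.linkding.loadbalancer.server.port=9090"

volumes:
  linkding-data:

networks:
  proxy:
    external: true   # created once outside this stack so Traefik can reach the container
```

Keeping one file per stack means a stack can be brought up, updated, or torn down without touching the others.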
And then there’s the music stack, which deserves its own mention.
Navidrome + Lidarr + Lidatube is a fully self-hosted music pipeline. Lidarr monitors artists and manages the collection. Lidatube handles downloading via yt-dlp. Navidrome sits on top as the streaming server, with a clean web UI and Subsonic API compatibility so any Subsonic client just works.
The result behaves like Spotify but the library is mine, the data stays local, and there are no ads, no algorithm, no “this song is no longer available in your region.” It’s the service I’d miss most if the machine died. Even more than Nextcloud, if I’m honest.
Networking is handled through Tailscale. Everything is reachable from anywhere without exposing ports to the public internet.
Backups are a set of shell scripts: backup-all.sh orchestrates the rest, mariadb-backup.sh and nextcloud-backup.sh handle the stateful bits, vaultwarden-backup.sh for the passwords. Obviously can’t lose those.
Memory Limits and Unattended Nodes
Since both machines are in my hometown and I’m not, a crashing container is always preferable to a crashing node. A container Docker can restart on its own. A crashed node needs someone to physically be there.
So every container has a strict memory limit set in the compose file. This matters more than it sounds. Several services in a typical homelab stack are Python or Node.js applications, and both are prone to memory leaks over long uptimes. Node.js in particular can accumulate memory through uncollected event listeners, closures holding references longer than they should, and unmanaged buffers. Python is better about it but not immune. Left unconstrained on a 4GB machine, one leaky service can quietly eat enough RAM to take everything else down with it.
Hard limits mean the leaky container gets OOM-killed and, as long as it has a restart policy, brought back up by Docker. Everything else keeps running. Not elegant, but reliable.
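A minimal sketch of what that looks like in a compose file, assuming plain Docker Compose (not Swarm). The choice of Memos as the example, the image tag, and the 256m figure are illustrative placeholders; the right cap depends on what each service actually needs.

```yaml
# Illustrative only: the service, image tag and numbers are placeholders.
services:
  memos:
    image: neosmemo/memos:stable
    restart: unless-stopped   # Docker restarts the container after it is OOM-killed
    mem_limit: 256m           # hard cap on the container's memory
    memswap_limit: 256m       # equal to mem_limit, so the container gets no swap on top
```

Setting `memswap_limit` to the same value as `mem_limit` is a design choice: without it, a leaking container may spill into swap and slow the whole node down before it is finally killed.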
The Two-Node Problem
The obvious next step with two machines is to connect them: run Kubernetes, get proper service discovery, health checks, rolling updates, the whole thing.
The power button problem makes this complicated. A Kubernetes node that can go down and stay down until someone physically intervenes isn’t a reliable cluster member. You can tune tolerations and pod eviction all you want, but in a two-node cluster, if one node never comes back, its pods have to fit on the single remaining machine, and anything pinned to the dead node’s local storage can’t reschedule at all. And on ElysianLab’s 4GB of RAM, a control plane alone would eat most of the headroom before a single workload ran.
So full Kubernetes is off the table for now.
What I’m considering instead is Docker Swarm. It’s lighter than Kubernetes, runs comfortably on constrained hardware, and gives you actual orchestration with a control plane that is built into the Docker Engine instead of being a separate set of components to run. The unattended power problem doesn’t go away, but Swarm is at least realistic on these machines in a way Kubernetes isn’t.
Haven’t set it up yet. But it’s the most sensible path forward given the constraints.
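If that move happens, the memory caps would need to move too: in Swarm mode, limits and restart behaviour are declared under each service's `deploy` section. A hedged sketch of what one service in a stack file could look like follows. Nothing here is a tested configuration; the service, image tag, and numbers are placeholders.

```yaml
# Hypothetical stack file for `docker stack deploy`; not a tested configuration.
version: "3.8"   # stack deploy has traditionally expected an explicit file version
services:
  memos:
    image: neosmemo/memos:stable
    deploy:
      replicas: 1
      restart_policy:
        condition: any        # restart the task whenever it exits, including after an OOM kill
      resources:
        limits:
          memory: 256M        # Swarm equivalent of the per-container hard cap
```

The rough workflow would be `docker swarm init` on one machine, joining the other with the token it prints, then `docker stack deploy -c <file> <stack-name>` for each stack.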