homelab-planning

It starts with a big git repo.

terraform proxmox ansible & nix/nixos docker compose tailscale fun???

hardware

storage - truenas host

Intel. Lots of storage, lots of ram. Power efficient.

Dedicated “storage” for the system. Currently a random 3-disk zfs pool w/ ~6TB. Kinda slow, but perfect. 10G link to switch.

compute - proxmox host

AMD/Nvidia - Lots of ram - dedicated graphics card (2070S)

Proxmox is the main compute workhorse. Runs several docker-enabled LXC containers and a couple VMs.

LXC containers each have a docker compose “stack” - monitoring, srv (proxy, git server, file sync, etc), media (file streaming, photo upload), arr (archiver) - and some others

VMs include:

  • Proxmox Backup Server - backs up all important LXC/VM volumes (this data eventually gets rsynced to the storage box)
  • Home assistant
  • Anything else that requires a full vm

backups — 3,2,1

  • All devices make backups locally (nixos configurations, docker volumes, dotfiles in a git repo)
  • All data is sent to the storage box in
    • [pool]/archives/[service]
      • rsynced to
        • RaspberryPI (Offsite)
    • [pool]/library/Backups/[service]/*
    • [pool]/library/Photos/*
      • rsynced to
        • RaspberryPI (Offsite)
        • Backblaze B2 (Cloud)
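A minimal sketch of the fan-out above, written as data plus a small helper that builds the rsync commands. The pool path, the host name rpi-offsite and the /backups target directory are placeholders I made up for illustration. The B2 leg is only recorded in the plan and not turned into a command, since B2 is object storage and would likely need a different sync tool than plain rsync.

```python
# Sketch of the backup fan-out. POOL, the host name and the remote path are
# hypothetical placeholders, not values from the real setup.
POOL = "/mnt/pool"

# dataset (relative to the pool) -> destinations it gets copied to
PLAN = {
    "archives": ["rpi-offsite"],
    "library/Photos": ["rpi-offsite", "b2"],
}


def rsync_commands(pool, plan):
    """Return one rsync argv per (dataset, SSH-reachable destination)."""
    commands = []
    for dataset, destinations in plan.items():
        for dest in destinations:
            if dest == "b2":
                # Object storage: handled by some other tool, skipped here.
                continue
            src = f"{pool}/{dataset}/"
            commands.append(["rsync", "-a", src, f"{dest}:/backups/{dataset}/"])
    return commands


if __name__ == "__main__":
    for argv in rsync_commands(POOL, PLAN):
        print(" ".join(argv))
```

Keeping the plan as data makes it easy to see at a glance which datasets have one offsite copy and which have two.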

software

tailscale

Tailscale is used as the VPN of choice. It lets me connect all hosts to a private network and lets me access them from the internet, pretty easily, pretty reliably.

nixos

Nixos is a declarative OS. We’ll be utilizing flakes (NOT home-manager) to declare the os and any package dependencies yada yada.

this will be the main os for any virtualized guest running a service because we can declare the config and guarantee that it’s reproducible.

packer

Packer will build images - nixos, debian, centos to use as proxmox templates

terraform

Terraform to configure hardware as much as possible.

Starting with proxmox module. (eventually, routers, switches, etc)

terraform will write out n LXC/VMs within proxmox. Most will run nixOS.

Then there’s one of each for testing/building/whatever. They don’t boot with the machine tho.

  • ubuntu (debian)
  • fedora (rhel)
  • arch
  • windows
  • macos

The nixos guests will each run a stack (docker compose? docker swarm? k3s?) of services (ctrl, srv, stats, media, arr, nvidia, etc).

proxmox

Proxmox, as a hypervisor, runs the guests, backs them up, and handles networking, configs, and other things I don’t want to manage myself.

proxmox backup server

In one of the vms, we run proxmox backup server. This will back up the other lxc/vms, using a minimal amount of space thanks to its deduplication.
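To make the space saving concrete, here is a toy illustration of chunk-based deduplication in general. It is not Proxmox Backup Server's actual implementation. The chunk size and function names are invented for the example. The idea is that each chunk is stored once under its SHA-256 digest, and every backup is just a list of digests.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example stays readable


def store(data, chunk_store):
    """Split data into fixed-size chunks and keep each unique chunk once.

    Returns the backup's index: the ordered list of chunk digests.
    """
    index = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)  # no-op if already stored
        index.append(digest)
    return index


def restore(index, chunk_store):
    """Rebuild the original bytes from an index."""
    return b"".join(chunk_store[digest] for digest in index)


chunks = {}
monday = store(b"AAAABBBBCCCC", chunks)
tuesday = store(b"AAAABBBBDDDD", chunks)  # only the last chunk changed
```

Two backups of three chunks each end up as four stored chunks instead of six, because the unchanged chunks are shared.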

docker compose

Docker compose is responsible for running our services.

Each “stack” has a git repo with a compose.yml file defined. This compose.yml will import others to build a complete “stack”.

You’ll often find an “agents.yml” file as well, which contains some common containers for monitoring or other agentic activities depending on the host: cadvisor, watchtower, portainer/gitea agent

traefik

Traefik is the reverse proxy of choice. Any containers running on the same host utilize labels to receive a domain, otherwise a config file is made per-service.

We also use traefik to proxy to services outside of docker, e.g. the truenas server at nas.[domain]
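For the label-based case, a small sketch of the labels a container would carry. The label keys follow Traefik's Docker provider conventions (v2 and later). The helper name, the service name and the example.com domain are placeholders for illustration only.

```python
def traefik_labels(name, host, port):
    """Build the docker labels that route `host` to a container's `port`.

    `name` is used as both the router and the service name.
    """
    return {
        "traefik.enable": "true",
        f"traefik.http.routers.{name}.rule": f"Host(`{host}`)",
        f"traefik.http.services.{name}.loadbalancer.server.port": str(port),
    }


if __name__ == "__main__":
    # Hypothetical service: gitea listening on port 3000
    for key, value in traefik_labels("gitea", "git.example.com", 3000).items():
        print(f"{key}={value}")
```

Services outside of docker (like the nas) can't carry labels, which is why those get a per-service config file instead.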

gitea

Gitea is the internal git server of choice, offering actions and agents to power most of this setup.

Everything is mirrored to a cloud provider for … reasons (github? gitlab? sourceforge)

ollama & friends

Ollama (and open-webui, comfyui) are run on an nvidia-enabled guest for running local and openAI API-compatible models!

misc
  • owncloud - file sync
  • flame - dashboard
  • coolify - internal dev/deployment/staging pipeline
  • coder - internal remote dev containers
  • vaultwarden - password manager server
  • jellyfin & friends - media streaming
  • arr & friends - internet archiving
  • grafana & friends - monitoring + stats
  • other misc containers that I should probably get rid of

homelab-troubleo

So, after homelab-planning, I’ve run into a couple issues, and some awesome stuff, then more issues.

I have no ssh access

Immediately after provisioning an LXC/VM with proxmox, I don’t have ssh access.

So this step, in between terraform and ansible, is still manual.

There’s a couple different ways to solve though,

  • SSH - SSH proxy through the proxmox host into the lxc (ahh, bad memory, I thought that the host always has access)
  • Cloud-init - images (nada)
  • Packer - Seems like a lot, but all of this is a lot sooo….
  • NixOS - Some way of creating custom images and uploading to proxmox

Terraform Ansible

All of my hosts are statically defined in ansible. So, when terraform is done, ansible kinda “assumes” that everything is a-okay.

Plus, the two run entirely independently. This is alright since I don’t have ssh access, but it’d be nice for terraform to run ansible-playbook -i containers:[list-of-ips] sync --tasks core,docker or something similar.
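One way this hand-off could look, as a rough sketch: read the IPs from `terraform output -json` and build an ansible-playbook command with an inline inventory. The output name container_ips and the playbook name sync.yml are assumptions, not things that exist in the repo yet. I've used ansible's real --tags flag in place of the --tasks placeholder above, and the trailing comma is what tells ansible the -i value is a host list and not a file.

```python
import json


def inventory_from_terraform(output_json, key="container_ips"):
    """Turn `terraform output -json` text into an inline ansible inventory.

    Assumes a terraform output named `key` (hypothetical) holding a list of IPs.
    """
    outputs = json.loads(output_json)
    ips = outputs[key]["value"]
    return ",".join(ips) + ","  # trailing comma marks an inline host list


def playbook_command(inventory, playbook="sync.yml", tags=("core", "docker")):
    """Build the ansible-playbook argv; the playbook name is a placeholder."""
    return ["ansible-playbook", "-i", inventory, playbook, "--tags", ",".join(tags)]


if __name__ == "__main__":
    sample = '{"container_ips": {"sensitive": false, "type": ["list", "string"], "value": ["10.0.0.11", "10.0.0.12"]}}'
    print(" ".join(playbook_command(inventory_from_terraform(sample))))
```

In practice the JSON would come from running terraform in the repo, and the resulting argv could be passed to subprocess.run once ssh access is sorted out.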

Docker Mounts

Most of my current services have a docker volume structure like this

  • [srv]-data - runtime data
  • [srv]-config - config data
  • ./[misc] - various configs I wanted saved with the stack

For example traefik

  • traefik-data - docker volume with runtime data
  • traefik-config - docker volume with config data
  • ./_certs - bind mounts with random stuff

Some static mounts are fine; these can be committed to the stack.

My original assumption was that docker volumes would be easiest to move/backup. This is not true. Files are files. Everything is a file and there’s nothing simpler than moving files around.

For the future, my plan is for each host to have a “docker” mountpoint. Inside of this directory, each service will have a config and data folder, respectively.

In the stack, we’ll give an env variable like $STATIC_MOUNT for each service to use in defining volumes.

Having the volumes separated like this allows the hosts to be spun up/down as needed and just attach the storage.
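A small sketch of the layout this implies, assuming $STATIC_MOUNT points at the host's “docker” mountpoint. The /mnt/docker fallback and the helper name are made up for the example.

```python
import os
from pathlib import PurePosixPath


def service_mounts(service, static_mount=None):
    """Return the config and data bind-mount paths for one service.

    Falls back to $STATIC_MOUNT, then to /mnt/docker (a hypothetical default).
    """
    root = PurePosixPath(static_mount or os.environ.get("STATIC_MOUNT", "/mnt/docker"))
    base = root / service
    return {"config": str(base / "config"), "data": str(base / "data")}


if __name__ == "__main__":
    for kind, path in service_mounts("traefik").items():
        print(f"{kind}: {path}")
```

Because every service follows the same `[mount]/[service]/config` and `[mount]/[service]/data` pattern, moving a stack to a new host only means attaching the same storage and setting one variable.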

This leads me to the next semi-issue…

My storage paths are a kinda….

In the nas, I have a few main datasets

  • library - music, movies, books, photos and everything else
  • archive - backups and whatnot (synced to B2 and the RPI)
  • shared - per-user home directories

Proxmox mounts the library and archive directories. Archive is used for backups, holding isos, and whatever else proxmox needs.

Docker hosts will mount library, services will bind-mount as needed.

To fix the docker problem, I can create a new dataset, services or something. This will house each service’s config and data dirs. This new dataset will be mounted on proxmox as nfs, then the service host can mount that and the service can bind mount. Nice.

The goal

is to have a system where I define some resources (LXC/VM) in terraform that deploys to proxmox. Once the resources are up, I then run ansible to configure them, install the stacks, and boot them. When the stacks run, they have a predefined config and maybe data.

I can scale this up/down in terraform as needed.


More Homelabbing

SSH

is awesome, when you understand it. Currently it’s the bane of my existence (not really).

The problem is that these isos/lxc don’t have ssh enabled by default. There are a couple ways to solve this.

  • Prebuilt isos - build your own isos/lxc-template with ssh enabled by default
  • cloud-init - setup cloud-init to enable ssh with specific keys
  • manually - go in the pve dashboard and enable ssh (HELL NO!)

So most likely, the best thing is a combination of the first two options.

A custom prebuilt iso with packages, users, ssh w/ public-keys enabled. Also, enable cloud-init for further configuration during provisioning.

Customize cloud-init during provisioning to further add ssh-keys and specifically set ip addresses.
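A rough sketch of the two pieces cloud-init would need per guest, under some assumptions. The user name, key and addresses are placeholders. The `ip=...,gw=...` string follows the format Proxmox uses for its ipconfig0 option. The user-data is a plain #cloud-config document with a users entry carrying ssh_authorized_keys.

```python
import ipaddress


def ipconfig0(address, gateway):
    """Build a Proxmox-style ipconfig0 value, e.g. ip=10.0.0.50/24,gw=10.0.0.1.

    Raises ValueError if the gateway is outside the guest's subnet.
    """
    iface = ipaddress.ip_interface(address)
    gw = ipaddress.ip_address(gateway)
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is not inside {iface.network}")
    return f"ip={iface.with_prefixlen},gw={gw}"


def user_data(user, ssh_keys):
    """Render minimal cloud-config user-data that installs SSH public keys."""
    lines = ["#cloud-config", "users:", f"  - name: {user}", "    ssh_authorized_keys:"]
    lines += [f"      - {key}" for key in ssh_keys]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values, not real addresses or keys
    print(ipconfig0("10.0.0.50/24", "10.0.0.1"))
    print(user_data("admin", ["ssh-ed25519 AAAA... me@laptop"]))
```

Checking the gateway against the subnet up front catches the kind of typo that otherwise shows up as a guest that boots but never answers on its static address.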

IP Addresses

Currently, we have no control over the ip addressing in a vm. This is okayish —

  • dhcp - i tried enabling dhcp but the problem is that now i don’t have access to the ip
  • static - i set it, it doesn’t respect it — likely resolved by the ssh issue.

The plan

  • Get rid of the router.
    • Turn it into another docker node, (services like monitoring and backups)
  • Get rid of proxmox.
    • Move to docker swarm / stacks on bare metal with resource controls
    • Expose an NFS share that can be accessed from docker containers and on the network.
  • Sell the temporary NAS.
    • Rely on shared storage from the server and cloud / offsite backups
  • (Optional) Integrate a cloud VPS

Networking

  • Cloudflare for domain / dns management
  • Traefik for SSL termination and reverse proxy