proxmox in a container on fedora

violently disregarding all documentation

written: 2024-06-28T23:54:00Z
last modified: 2024-07-30T23:22:46Z

screenshot: proxmox-ve running a vm called "test", booted into debian bookworm with the GNOME desktop. a host terminal shows neofetch reporting fedora 40 and pidof pvedaemon returning '7987'.

what

i ran proxmox virtual environment on fedora, utilizing systemd-nspawn containers.

why

i wanted to see if i could.

when

today.

how

i took my dying nvme and slapped on a new partition table, created a new luks2 partition and formatted it as btrfs. created a few subvolumes (standard stuff) and installed debian on it from fedora using debootstrap.
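
roughly what that looked like, with made-up device names and a simplified subvolume layout (my real one differed a bit):

# new gpt + one big partition
sgdisk --zap-all /dev/nvme0n1
sgdisk --new=1:0:0 /dev/nvme0n1
# luks2 on top, btrfs inside
cryptsetup luksFormat --type luks2 /dev/nvme0n1p1
cryptsetup open /dev/nvme0n1p1 cryptroot
mkfs.btrfs /dev/mapper/cryptroot
mount /dev/mapper/cryptroot /mnt
btrfs subvolume create /mnt/@
btrfs subvolume create /mnt/@home
umount /mnt
mount -o subvol=@,compress=zstd /dev/mapper/cryptroot /mnt
# debian from fedora: debootstrap is packaged in the fedora repos
debootstrap --arch amd64 bookworm /mnt https://deb.debian.org/debian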

then i went to the proxmox documentation and ignored all of it, found the manual installation guide, and actually read that one. i set up the system all nice and well, installed daemons and stuff, and set up the repos.
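
the repo part is more or less the usual snippet from the manual install guide, done inside the debian tree (the key url/path is the one from the guide and may have moved since):

echo "deb [arch=amd64] http://download.proxmox.com/debian/pve bookworm pve-no-subscription" > /etc/apt/sources.list.d/pve-install-repo.list
wget https://enterprise.proxmox.com/debian/proxmox-release-bookworm.gpg -O /etc/apt/trusted.gpg.d/proxmox-release-bookworm.gpg
apt update && apt full-upgrade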

after breaking all the rules, i booted the system in systemd-nspawn using the appropriate tools, then installed proxmox-ve in it. proxmox-ve depends on the proxmox kernel, which i couldn't even use because this was a regular container, not a virtual machine.
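
"the appropriate tools" being roughly this, assuming the debian tree ends up under /var/lib/machines/ (the container name "pve" is a placeholder):

machinectl start pve     # boots the tree via systemd-nspawn@pve.service
machinectl shell pve     # root shell inside the container
# then, inside: the package set from the manual install guide, give or take
apt install proxmox-ve postfix open-iscsi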

i rebooted the container. pve-cluster was now failing to start, because /dev/fuse didn't exist. there was a stale issue on the systemd repo about just this and it happened to contain a patch that 'fixes' it. i swiftly figured out how to patch and build a system package on fedora and compiled systemd several times before i got it to work. i also had to regenerate the patch file cuz it didn't apply with fedora's weird patch applying command.
i also had to DeviceAllow= it in systemctl edit systemd-nspawn@<container-name>.service. i am still not sure if this was required or i could have just bound it like how i did with /dev/kvm below.
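
the rebuild was more or less the generic fedora source-rpm dance (from memory, not my exact commands; download/builddep come from dnf-plugins-core):

dnf download --source systemd
rpm -ivh systemd-*.src.rpm                   # unpacks into ~/rpmbuild/
dnf builddep ~/rpmbuild/SPECS/systemd.spec
# add a PatchNNNN: line for the fix to the spec and drop the patch into ~/rpmbuild/SOURCES/
rpmbuild -bb ~/rpmbuild/SPECS/systemd.spec
dnf install ~/rpmbuild/RPMS/x86_64/systemd-*.rpm

the DeviceAllow= drop-in itself is tiny (container name is a placeholder):

# systemctl edit systemd-nspawn@pve.service
[Service]
DeviceAllow=/dev/fuse rwm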

at this point, i could launch and use proxmox-ve. i uploaded a debian netinst iso, tried to start a vm, and got:

TASK ERROR: KVM virtualisation configured, but not available. Either disable in VM configuration or enable in BIOS.

this looks like a scary error, but i just had to bind /dev/kvm and DeviceAllow= it as well. after doing this, kvm on the host broke, cuz /dev/kvm was now owned by GID 104, which is input on the host but group kvm in the container. this normally wouldn't be a problem, but i had to set PrivateUsers=off to work around other permission issues. i fixed it by editing /etc/group and changing some GIDs around in the container.
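
reconstructed, the kvm side looks roughly like this (placeholder container name, not a verbatim copy of my config):

# /etc/systemd/nspawn/pve.nspawn
[Exec]
PrivateUsers=off

[Files]
Bind=/dev/kvm

# plus the cgroup device filter, via systemctl edit systemd-nspawn@pve.service
[Service]
DeviceAllow=/dev/kvm rw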

while doing all this, i was trying to get {net,}working. it was horrible.

some insignificant touch-ups here and there and everything was working. well, except for dhcp, cuz i didn't set that up.

networking

70% of the time i spent was probably on doing, or trying to do, networking.

thanks to my prior experience with systemd-nspawn, i knew the default networking setup was to just expose all host interfaces to the container. i wanted all my (non-proxmox) vms and containers on their own LAN, so this was not the way to go for me. apparently systemd-nspawn@.service actually uses a virtual ethernet interface and NATs it, which inexplicably did not happen on my machine. i checked the systemd-networkd network definitions that nspawn ships, and they matched the ArchWiki ones, yet still nothing got NATed.
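
(the files in question, if you want to stare at them yourself; if i remember right, the masquerading is requested via IPMasquerade= in the ve-* one:)

networkctl list
cat /usr/lib/systemd/network/80-container-ve.network      # host side of the ve-* veth pair
cat /usr/lib/systemd/network/80-container-host0.network   # container side (host0)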

i then tried just using libvirt's default virbr0 interface as a bridge. it worked perfectly* on the first try and gave the desired effect. but it depended on libvirtd running, and at first i couldn't figure out how to get that running consistently.
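
one way to wire that up is a single line in the .nspawn file (placeholder name again):

# /etc/systemd/nspawn/pve.nspawn
[Network]
Bridge=virbr0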

after a few hours of trying to configure my own bridge interface, i realized it wasn't going anywhere, so i gave up. not before breaking some stuff and having to fix it, though. instead i configured the container to share the host's net ns as usual, and starting a vm greeted me with:

interface activation failed
kvm: -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown: network script /var/lib/qemu-server/pve-bridge failed with status 7680
TASK ERROR: start failed: QEMU exited with code 1

i gave up rather quickly and went back to using libvirt's virbr0. to fix the * mentioned previously, i added masquerade rules to /etc/network/interfaces:

auto vmbr0
iface vmbr0 inet static
        address 192.168.55.1/24
        gateway 192.168.55.1
        bridge-ports none
        bridge-stp off
        bridge-fd 0
        # epic rules
        post-up   iptables -t nat -A POSTROUTING -s '192.168.55.1/24' -o host0 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s '192.168.55.1/24' -o host0 -j MASQUERADE

this worked well, but did not have a dhcp server and i was too exhausted from networking to set that up.

i also had to forward port 8006 from the container network to enp6s0. of course i tried DNAT first. it worked for a few minutes and then started getting rejected by libvirt's firewall chain????? i tried using the Port= option, which was also initially getting caught by firewalld. i disabled firewalld and it stopped matching the rules; everything was ACCEPT by default and everyone was happy. except it didn't work, in exactly the same way as before. i have no idea why.
i ended up going with the worst possible option: a service that runs socat to forward the port. i also removed a system service in the process, which then made my system unbootable and required recovery from my other OS.
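
the unit is basically this (unit name and the container's address are made up for illustration):

# /etc/systemd/system/pve-web-forward.service
[Unit]
Description=forward host :8006 to the pve container
After=network-online.target

[Service]
ExecStart=/usr/bin/socat TCP-LISTEN:8006,fork,reuseaddr TCP:192.168.122.2:8006
Restart=always

[Install]
WantedBy=multi-user.target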

proxmox ruins it all

proxmox's web interface lets you create virtual bridge interfaces. these interfaces override the default route with themselves and break container networking (alongside proxmox networking).

moral of story

don't use proxmox.

footnotes

i'm working on another post which will hopefully be up at /posts/png soon. it's slightly more important than a single-day project, so i don't want to publish a post as shitty as this one.