proxmox in a container on fedora
violently disregarding all documentation
what
i ran proxmox virtual environment on fedora, inside a systemd-nspawn container.
why
i wanted to see if i could.
when
today.
how
i took my dying nvme and slapped a new partition table on it, created a new luks2 partition, and formatted it as btrfs. created a few subvolumes (standard stuff) and installed debian on it from fedora using debootstrap.
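the whole dance looked roughly like this (device name, subvolume layout and debian suite are examples, not necessarily what i used):

# create the encrypted btrfs root (device name is an example)
cryptsetup luksFormat --type luks2 /dev/nvme0n1p2
cryptsetup open /dev/nvme0n1p2 pveroot
mkfs.btrfs /dev/mapper/pveroot
mount /dev/mapper/pveroot /mnt
btrfs subvolume create /mnt/@
# bootstrap debian into the subvolume (suite is an assumption)
dnf install debootstrap
debootstrap bookworm /mnt/@ https://deb.debian.org/debian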
then i went to the proxmox documentation and ignored all of it, until i found the manual installation guide and actually read it. i set the system up all nice and well, installed daemons and stuff, and set up the repos.
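setting up the repos amounts to something like this inside the chroot (the bookworm suite and the no-subscription repo are assumptions on my part):

# /etc/apt/sources.list.d/pve.list
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription

# fetch the proxmox release key
wget https://enterprise.proxmox.com/debian/proxmox-release-bookworm.gpg \
    -O /etc/apt/trusted.gpg.d/proxmox-release-bookworm.gpg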
after breaking all the rules, i booted the system in systemd-nspawn using the appropriate tools, then installed proxmox-ve in it. proxmox-ve depends on the proxmox kernel, which i couldn't even use because this was a regular container, not a virtual machine.
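a minimal sketch of the boot-and-install step, assuming the container lives at /mnt/@ and is named pve:

systemd-nspawn --boot --directory=/mnt/@ --machine=pve
# inside the container:
apt update
apt install proxmox-ve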
i rebooted the container. pve-cluster was now failing to start, because /dev/fuse didn't exist. there was a stale issue on the systemd repo about just this, and it happened to contain a patch that 'fixes' it. i swiftly figured out how to patch and build a system package on fedora and compiled systemd several times before i got it to work. i also had to regenerate the patch file cuz it didn't apply with fedora's weird patch-applying command.
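for the curious, the fedora rebuild loop looks roughly like this (a sketch of the standard rpmbuild workflow, not my exact commands):

dnf download --source systemd
rpm -ivh systemd-*.src.rpm
dnf builddep ~/rpmbuild/SPECS/systemd.spec
# drop the patch into ~/rpmbuild/SOURCES, add a PatchN: line to the spec,
# then rebuild and install
rpmbuild -ba ~/rpmbuild/SPECS/systemd.spec
dnf install ~/rpmbuild/RPMS/x86_64/systemd-*.rpm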
i also had to DeviceAllow= it in systemctl edit systemd-nspawn@<container-name>.service. i am still not sure if this was required or if i could have just bound it like i did with /dev/kvm below.
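the drop-in is tiny; something like this (container name is a placeholder):

# systemctl edit systemd-nspawn@<container-name>.service
[Service]
DeviceAllow=/dev/fuse rwm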
at this point, i could launch and use proxmox-ve. i uploaded a debian netinst iso and:
TASK ERROR: KVM virtualisation configured, but not available. Either disable in VM configuration or enable in BIOS.
this looks like a scary error, but i just had to bind /dev/kvm and DeviceAllow= it as well. after doing this, kvm on the host broke, cuz it was now owned by GID 104 (input on the host), which was group kvm in the container. this normally wouldn't be a problem, but i had to set PrivateUsers=off to work around other permission issues. i fixed it by editing /etc/group and changing some GIDs around in the container.
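the bind lives in the container's .nspawn file; a sketch, assuming the container is called pve:

# /etc/systemd/nspawn/pve.nspawn
[Exec]
PrivateUsers=off

[Files]
Bind=/dev/kvm
# plus DeviceAllow=/dev/kvm rwm in the service drop-in, same as with /dev/fuse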
while doing all this, i was trying to get {net,}working. it was horrible.
some insignificant touch-ups here and there and everything was working. well, except for dhcp cuz i didn't set that up.
networking
probably 70% of my time was spent doing, or trying to do, networking.
thanks to my prior experience with systemd-nspawn, i knew the default networking setup just exposes all host interfaces to the container. i wanted all my (non-proxmox) vms and containers in a LAN, so this was not the way to go for me. apparently systemd-nspawn@.service actually uses a virtual ethernet interface and NATs it, which inexplicably did not happen on my machine. i checked nspawn's systemd-networkd link definitions, and they matched the ArchWiki ones, yet still nothing got NATed.
i then tried just using libvirt's default virbr0 interface as a bridge. it worked perfectly* first try and resulted in the desired effect. but this depended on libvirtd running, and i, at first, couldn't figure out how to get that running consistently.
after a few hours of trying to configure my own bridge interface, i realized it wasn't going anywhere, so i gave up. not before breaking some stuff and needing to fix it, though. instead, i configured the container to share the host's network namespace as usual.
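if you want to do the same, the switch is one line in the .nspawn file (assuming the same pve container name as before):

# /etc/systemd/nspawn/pve.nspawn
[Network]
VirtualEthernet=no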
interface activation failed
kvm: -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown: network script /var/lib/qemu-server/pve-bridge failed with status 7680
TASK ERROR: start failed: QEMU exited with code 1
i gave up rather quickly and went back to using libvirt's virbr0. to fix the * mentioned previously, i added masquerade rules to /etc/network/interfaces:
auto vmbr0
iface vmbr0 inet static
address 192.168.55.1/24
gateway 192.168.55.1
bridge-ports none
bridge-stp off
bridge-fd 0
# epic rules
post-up iptables -t nat -A POSTROUTING -s '192.168.55.1/24' -o host0 -j MASQUERADE
post-down iptables -t nat -D POSTROUTING -s '192.168.55.1/24' -o host0 -j MASQUERADE
this worked well, but there was no dhcp server, and i was too exhausted from networking to set one up.
i also had to forward port 8006 from the container network to enp6s0. of course, i tried DNAT before everything else. this worked for a few minutes and then started getting rejected by libvirt's firewall chain????? i tried using the Port= option, which was also initially getting caught by firewalld. i disabled firewalld and it stopped matching the rules; everything was ACCEPT by default and everyone was happy. except it didn't work, in the exact same way as before. i have no idea why.
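for reference, the Port= option also goes in the .nspawn file, though it only applies when nspawn manages a veth for you (which may be part of why it didn't help here):

# /etc/systemd/nspawn/pve.nspawn
[Network]
Port=tcp:8006:8006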
i ended up going with the worst possible option and making a service that runs socat to forward the port, and removed a system service in the process (which then made my system unbootable and required recovery from my other OS).
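the socat unit was something like this (unit name and the container's address are made up for illustration):

# /etc/systemd/system/pve-forward.service
[Unit]
Description=forward port 8006 to the proxmox container
After=network.target

[Service]
ExecStart=/usr/bin/socat TCP-LISTEN:8006,fork,reuseaddr TCP:192.168.122.2:8006
Restart=always

[Install]
WantedBy=multi-user.target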
proxmox ruins it all
proxmox's web interface lets you create virtual bridge interfaces. these interfaces override the default route with themselves and break container networking (alongside proxmox networking).
moral of story
don't use proxmox.
footnotes
i'm working on another post which will hopefully be up at /posts/png soon. it's slightly more important than a single-day project, so i don't want to publish a post as shitty as this one.