Comment by ~droyo on ~droyo/misc
I ended up going with raw sockets. The implementation can be found in the tapalloc repo.
REPORTED
RESOLVED CLOSED
Ticket created by ~droyo on ~droyo/misc
I want to be able to make small changes to guix packages and rebuild them quickly. Unfortunately, every time I make a small change to a package guix rebuilds it from scratch, including all the .o files that were unaffected. When the build is complete, it tosses all the intermediate files.
Rather than work out a way to preserve the intermediate files in the store, I could use ccache or a modern take on the idea like sccache to cache those artifacts and speed up rebuilds.
The target use case here is making small, iterative changes to fundamental packages like the linux kernel, or gcc, or coreutils, where even a tiny change is certain to kick off a massive rebuild. The rebuild should be fast.
~droyo assigned ~droyo to #2 on ~droyo/misc
Ticket created by ~droyo on ~droyo/misc
On Linux, ipvlan devices and their derived ipvtap devices have a lot going for them:
- They perform better than a tap/veth + bridge combo
- They work with wireless interfaces (unlike macvlan)
- They don't pollute the connected switch's arp table or put the nic into promiscuous mode (unlike macvlan)
I see them as a low-overhead way to give VMs and containers a "real" IP address on my network, one that works just as well on a wired computer as on a wireless laptop. The caveats, and they are big ones, are:
- ipvlan devices can only receive unicast traffic for addresses that are assigned to their interface in the kernel
- ipvlan devices all use the mac address of the physical interface
This means DHCP will not work unless the client sets the BROADCAST flag, which tells the server to broadcast its DHCPOFFER messages. It also means that, by default, every interface will be handed the same address, since the server sees the same MAC address for all of them. Finally, for ipvtap devices, which I plan to give to VMs, the VM has to use the same L3 addresses that are set on the interface in the host.
For ipvlan devices, SLAAC will "just work" since the kernel adds some unique data into the middle of the EUI64 address. That will not help VMs, since they would need a way to discover what address is configured at the host level, which implies a level of privilege I don't want to provide.
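To make the BROADCAST flag and Client Identifier concrete, here is a minimal sketch of a DHCPDISCOVER built by hand. The field layout follows RFC 2131 and the option codes follow RFC 2132; the function name `build_discover` and the sample identifier are made up for illustration, and this is not what any particular DHCP client does internally.

```python
import struct

BOOTREQUEST = 1
HTYPE_ETHERNET = 1
FLAG_BROADCAST = 0x8000          # leftmost bit of the 16-bit flags field
MAGIC_COOKIE = b"\x63\x82\x53\x63"
OPT_MESSAGE_TYPE = 53
OPT_CLIENT_ID = 61
OPT_END = 255
DHCPDISCOVER = 1


def build_discover(xid: int, mac: bytes, client_id: bytes) -> bytes:
    """Return a minimal DHCPDISCOVER payload (hypothetical helper).

    The BROADCAST flag asks the server to broadcast its reply, and the
    Client Identifier lets the server tell apart clients that share a MAC.
    """
    zero_addr = b"\0" * 4
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        BOOTREQUEST, HTYPE_ETHERNET, 6, 0,   # op, htype, hlen, hops
        xid, 0, FLAG_BROADCAST,              # xid, secs, flags
        zero_addr, zero_addr, zero_addr, zero_addr,  # ciaddr, yiaddr, siaddr, giaddr
        mac,                                 # chaddr, zero-padded to 16 bytes
        b"", b"",                            # sname, file (unused)
    )
    options = bytes([OPT_MESSAGE_TYPE, 1, DHCPDISCOVER])
    # Type byte 0 marks the identifier as something other than a hardware address.
    options += bytes([OPT_CLIENT_ID, len(client_id) + 1, 0]) + client_id
    options += bytes([OPT_END])
    return header + MAGIC_COOKIE + options
```

Two guests behind the same parent interface would send the same `chaddr`, so the Client Identifier is the only field here that distinguishes them.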
I am figuring out a way to make this work while requiring minimal extra configuration of the guest VMs. I would be comfortable requiring their DHCP clients to use Client Identifiers, and requiring their IPv6 autoconf to use something like RFC7217.
Solutions I can think of:
1. Sniff DHCPv4/DHCPv6/ICMPv6 packets from ipvtap devices and configure the L3 addresses that we observe the guest obtaining.
2. Act as a DHCPv4/DHCPv6 relay and configure the L3 addresses we negotiate on behalf of the client.
3. For ipvlan devices, implement a namespace-aware DHCP client.
The benefit of 1 is that it could still work with servers that decide they don't want to respond to requests from a relay. The benefit of 2 is that the relay could generate DUID/Client IDs if the client doesn't want to. The benefit of 3 is that containers don't have to run their own DHCP client.
In any case, for SLAAC, ipvtap clients cannot use the mac address for their EUI64 address. On Linux, this can be achieved with the sysctl net.ipv6.conf.default.addr_gen_mode=2. OpenBSD does this by default.
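To show why the MAC-derived address is a problem, here is a small sketch of the classic modified EUI-64 derivation (flip the universal/local bit, insert ff:fe in the middle). The function name, the documentation prefix and the MAC address are made up for illustration. Any two guests that present the same MAC end up with the same address.

```python
import ipaddress


def eui64_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Derive a SLAAC address from a /64 prefix and a MAC (hypothetical helper)."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC to get a 64-bit identifier.
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    network = ipaddress.IPv6Network(prefix)
    return network[int.from_bytes(iid, "big")]


# Two ipvtap guests sharing the parent's MAC collide on the same address.
guest_a = eui64_address("2001:db8::/64", "52:54:00:12:34:56")
guest_b = eui64_address("2001:db8::/64", "52:54:00:12:34:56")
print(guest_a, guest_a == guest_b)
```

A stable-privacy scheme like RFC7217 avoids the collision because the identifier no longer depends on the MAC alone.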
The real open question is how to intercept packets, and how to tell what interface the packets are coming from. Implementing a relay is easy; just bind to the relevant broadcast/multicast addresses. But that won't tell you what interface a packet came from. What I'm working on now is using netfilter queues to deliver packets to a queue serviced by a program, hoping the deliveries carry enough metadata to figure out the source. If that doesn't work, a raw socket à la tcpdump will definitely work.
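For the raw socket fallback, a rough sketch of what tcpdump-style capture could look like in Python follows. It is not the tapalloc implementation. It assumes Linux, root (or CAP_NET_RAW), untagged Ethernet frames and IPv4 only; `is_dhcpv4` and `sniff` are hypothetical names. Whether the kernel reports the ipvtap device or its parent as the receiving interface is something I have not verified.

```python
import socket
import struct

ETH_P_ALL = 0x0003   # capture every protocol
ETH_P_IP = 0x0800
ETH_HLEN = 14
IPPROTO_UDP = 17


def is_dhcpv4(frame: bytes) -> bool:
    """Return True if an untagged Ethernet frame carries DHCPv4 (UDP 67/68)."""
    if len(frame) < ETH_HLEN + 20 + 8:
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETH_P_IP:
        return False
    ihl = (frame[ETH_HLEN] & 0x0F) * 4          # IPv4 header length in bytes
    if frame[ETH_HLEN + 9] != IPPROTO_UDP:
        return False
    udp = ETH_HLEN + ihl
    if len(frame) < udp + 4:
        return False
    sport, dport = struct.unpack("!HH", frame[udp:udp + 4])
    return {sport, dport} <= {67, 68}


def sniff() -> None:
    """Print the receiving interface for each DHCPv4 frame seen (needs root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                       socket.htons(ETH_P_ALL)) as sock:
        while True:
            # For AF_PACKET the address tuple starts with the interface name.
            frame, (ifname, _proto, _pkttype, _hatype, _addr) = sock.recvfrom(65535)
            if is_dhcpv4(frame):
                print(f"DHCPv4 frame on {ifname}, {len(frame)} bytes")
```

Calling `sniff()` as root starts the capture loop. The interface name comes from the socket address rather than from the packet, which is the metadata a relay bound to a broadcast address does not get.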
Comment by ~droyo on ~droyo/misc
I suspect that this is working as intended. I stumbled across this blog post series that shows a similar issue with "injecting" a mount into a user namespace:
https://xkyle.com/Advancing-the-State-of-The-Art-of-Container-Storage-With-Titus-Part-2/
The same restriction probably applies to block devices.