~droyo/misc#1: Expected permissions of chardev injected into userns

I've been working on a project that allocates network devices on behalf of unprivileged users. While developing its test suite, I ran into a permissions issue. I can restructure the tests to avoid it, so it's not a big problem for me, but I couldn't explain the behavior, so I'm reaching out here.

In short, if I use the mknodat(2) system call to create a character device in a tmpfs mount in a user+mount namespace, from a process that is outside that mount namespace, the processes in that mount namespace get an EACCES error when trying to open it. The steps to reproduce are:

  1. pid x: create a new user+mount namespace.
  2. pid x: mount a tmpfs somewhere, obtain a dirfd for its root
  3. pid y: obtain dirfd from pid x, via SCM_RIGHTS or pidfd_getfd(2)
  4. pid y: use mknodat(dirfd, ...) to make a char device in the tmpfs
  5. pid x: attempt to open that device
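
For reference, step 3 via SCM_RIGHTS might look roughly like the sketch below. This is not taken from the attached mknodat-perm.c. The helper names send_fd and recv_fd are made up for illustration, and the sketch assumes pid x and pid y already share a connected AF_UNIX socket (for example, from a socketpair(2) created before forking).

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one file descriptor over a connected
 * AF_UNIX socket. Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd)
{
    char byte = 0; /* ancillary data must accompany at least one real byte */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one file descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

With helpers like these, pid x would call send_fd() with the dirfd of the tmpfs root after step 2, and pid y would pass the result of recv_fd() as the first argument to mknodat() in step 4.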

I am on kernel 6.12.0. I've attached mknodat-perm.c which demonstrates the issue. It needs CAP_MKNOD to run. You can run it as root, setcap the binary, or put it in your ambient caps:

gcc mknodat-perm.c
sudo setpriv --ambient-caps +mknod ./a.out

The EACCES error does not occur for regular files, and it does not occur when the device is created outside of the tmpfs mountpoint. Entering the mount namespace before calling mknodat does not change the outcome. The error also does not occur if the tmpfs mountpoint is set up before the user+mount namespace is created.

Status: REPORTED
Submitter: David Arroyo
Assigned to: (unassigned)
Submitted: a month ago
Updated: a month ago
Labels: No labels applied.

~droyo a month ago

I suspect that this is working as intended. I stumbled across this blog post series that shows a similar issue with "injecting" a mount into a user namespace:

https://xkyle.com/Advancing-the-State-of-The-Art-of-Container-Storage-With-Titus-Part-2/

The same restriction probably applies to block devices.
