Linux userland API discussions
* Re: [RFC] Null Namespaces
From: John Ericson @ 2026-06-26 17:23 UTC
  To: David Laight, Andy Lutomirski
  Cc: H. Peter Anvin, Al Viro, Li Chen, Cong Wang, Christian Brauner,
	linux-arch, LKML, linux-fsdevel, linux-api, Arnd Bergmann,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Jan Kara, Jonathan Corbet, Shuah Khan, Kees Cook,
	Sergei Zimmerman, Farid Zakaria
In-Reply-To: <20260626092750.58a8de9c@pumpkin>

I am replying to both Andy and David in a single email --- hope that is
not confusing.

On Thu, Jun 25, 2026, at 7:09 PM, Andy Lutomirski wrote:
> On Thu, Jun 25, 2026 at 2:53 PM John Ericson <mail@johnericson.me> wrote:
> >
> > The argument against just having an empty, immutable root directory and
> > calling it a day is the tie-in with a new process-spawning API discussed
> > near the bottom of my original email. I want to have nice secure
> > defaults, rather than forcing the programmer to remember to unshare, but
> > I also don't want to degrade performance by speculatively creating new
> > empty mount namespaces that might just be thrown away. Null fields alone
> > get us both --- security and good performance.
>
> This seems like a false dichotomy.  There's such a thing as a singleton.
>
> In fact, we have this spiffy nullfs_fs_get_tree.  It seems relatively
> straightforward to have an API to get an fd to the singleton nullfs,
> and the default for a newly spawned process could even be to have cwd
> pointing at nullfs.

Ah! This is the first I am learning about the new nullfs. OK, yes, I
agree this gives us both properties (secure defaults and no speculative
namespace creation), since it is truly immutably empty.

I still have a slight preference for something that also makes
statting/opening/etc. of `/` itself fail, but this is otherwise good ---
there's no denying it.

> root is still harder, because of the shadowing issue.  I think I
> proposed, ages ago, relaxing the chroot rules so that, at least under
> certain circumstances (e.g. the task is not already chrooted) an
> unprivileged task could chroot.  chrooting to nullfs seems like a
> somewhat useful operation.
>
> I can imagine more complex schemes to allow even a chrooted process to
> safely start acting as though their root is nullfs, but that would be
> potentially fairly nasty.  *Maybe* everything would work if there was
> a root-for-dotdot and a separate root-for-absolute-paths, and
> nameidata->root could point to the former, but I'm certainly not
> willing to say that I think this would work with any confidence at
> all.

I really like these ideas!

- Splitting the two uses of root sounds great. Even more generally (at
  least as a thought experiment; I don't like the O(n) performance), one
  can imagine a set of paths one must not `cd ..` past. Conceptually, I
  feel optimistic that inserting another boundary path into the set on
  every `chroot` makes it safe.

- In the original "real root", the "root for .." field could be null,
  since no `..` check is actually needed. Then, if we only want to have
  a single "root for .." (to avoid the O(n)), only the initial
  assignment of it from null to non-null would be unprivileged --- this
  would implement your "task is not already chrooted" idea. Subsequent
  assignment would still be privileged since we are replacing, not
  extending our "set". (The nullable single path means we have 0 or 1
  paths in our set.)
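
A toy model of the two bullets above, as a sketch only. It is plain
Python, not kernel code; every name in it (`Task`, `chroot_extend`,
`chroot_single`, `resolve`) is made up for illustration, paths are
tuples of components, and symlinks and mounts are ignored. It only
shows why inserting a boundary looks safe while replacing one does not:

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

Path = Tuple[str, ...]  # () stands for the real root


@dataclass
class Task:
    cwd: Path = ()
    abs_root: Path = ()  # "root for absolute paths"
    dotdot_roots: Set[Path] = field(default_factory=set)  # "root(s) for .."


def chroot_extend(task: Task, new_root: Path) -> None:
    """Set variant: every chroot inserts one more boundary, none is dropped."""
    task.dotdot_roots.add(new_root)
    task.abs_root = new_root


def chroot_single(task: Task, new_root: Path, privileged: bool = False) -> None:
    """0-or-1 variant: only the first (empty -> one) assignment is unprivileged."""
    if task.dotdot_roots and not privileged:
        raise PermissionError("replacing the '..' boundary needs privilege")
    task.dotdot_roots = {new_root}
    task.abs_root = new_root


def resolve(task: Task, path: str) -> Path:
    """Purely lexical walk: no symlinks, no mounts, no existence checks."""
    cur = list(task.abs_root if path.startswith("/") else task.cwd)
    for comp in path.split("/"):
        if comp in ("", "."):
            continue
        if comp == "..":
            # Stay put at any boundary, and at the real root.
            if cur and tuple(cur) not in task.dotdot_roots:
                cur.pop()
        else:
            cur.append(comp)
    return tuple(cur)


if __name__ == "__main__":
    jailed = Task(cwd=("jail",))
    chroot_single(jailed, ("jail",))     # first assignment: unprivileged
    print(resolve(jailed, "../../etc"))  # cannot climb past the boundary
    # A (privileged) replacement drops the old boundary while cwd stays put:
    chroot_single(jailed, ("jail", "sub"), privileged=True)
    print(resolve(jailed, ".."))         # now walks above the old root
```

In this model the replacing `chroot_single` is exactly the case that
lets a cwd left outside the new root climb past the old one, which is
why it stays privileged, while `chroot_extend` never removes a boundary.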

----

On Fri, Jun 26, 2026, at 4:27 AM, David Laight wrote:
>
> You'd also need to sort out the 'pwd' mess.
> The kernel inode always has its real parent, inside a chroot the scan stops
> when the inode is the same as that of the base of the chroot.
> But faff about with namespaces (IIRC I was doing an unshare to get out of
> a network namespace) and that comparison can fail (if the chroot base isn't
> a mount point) - so "../.." can go all the way back to the real root rather
> than stopping at the base of the chroot (as you would expect).
>
> David

I did get the impression that the `..` check is...rather fragile. I am
also thinking that a global setting in the spirit of `openat2`'s
`RESOLVE_BENEATH`, but one that makes `..` never work at all, would be
useful; then all manner of chrooting is trivially safe, because you
cannot go up regardless!
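
A toy illustration of the "`..` never works" idea, again only a sketch:
plain Python with hypothetical names (`resolve_no_dotdot`,
`DotDotDisabled`), paths as tuples of components, and no symlinks or
mounts. Note that this is stricter than `RESOLVE_BENEATH` itself, which,
as I read the openat2(2) man page, still permits `..` components that
stay beneath the dirfd and only rejects escapes.

```python
from typing import Tuple

Path = Tuple[str, ...]  # () stands for the real root


class DotDotDisabled(Exception):
    """Hypothetical error for a task that has switched `..` off."""


def resolve_no_dotdot(root: Path, cwd: Path, path: str) -> Path:
    """Lexical walk where any `..` component is an error (symlinks not modeled)."""
    cur = list(root if path.startswith("/") else cwd)
    for comp in path.split("/"):
        if comp in ("", "."):
            continue
        if comp == "..":
            # No boundary comparison is needed: going up is simply refused.
            raise DotDotDisabled(path)
        cur.append(comp)
    return tuple(cur)
```

Since the walk can only append components, the result always has its
starting point (root or cwd) as a prefix, so there is no comparison
against a chroot base that could fail in the way David describes.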

----

Given the state of the discussion, I'll submit my null cwd and root
patch shortly. The nullfs alternative is quite compelling. Since I
still prefer that operations on the root itself fail, as I said above,
I think my best shot is demonstrating that the patch is small and
lightweight enough that its simplicity justifies this slight benefit.

John
