Linux userland API discussions
From: "John Ericson" <mail@johnericson.me>
To: "Al Viro" <viro@zeniv.linux.org.uk>
Cc: "Andy Lutomirski" <luto@kernel.org>, "Li Chen" <me@linux.beauty>,
	"Cong Wang" <cwang@multikernel.io>,
	"Christian Brauner" <brauner@kernel.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-api <linux-api@vger.kernel.org>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Thomas Gleixner" <tglx@kernel.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>, "Jan Kara" <jack@suse.cz>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Shuah Khan" <skhan@linuxfoundation.org>,
	"Kees Cook" <kees@kernel.org>,
	"Sergei Zimmerman" <sergei@zimmerman.foo>,
	"Farid Zakaria" <farid.m.zakaria@gmail.com>
Subject: Re: [RFC] Null Namespaces
Date: Fri, 26 Jun 2026 12:26:09 -0400
Message-ID: <ba6be060-2dd8-4cda-ac73-4b154f968692@app.fastmail.com>
In-Reply-To: <20260626001538.GO2636677@ZenIV>

On Thu, Jun 25, 2026, at 8:15 PM, Al Viro wrote:
> On Wed, Jun 24, 2026 at 11:41:07PM -0400, John Ericson wrote:
>
> > But I don't want that global state.
>
> Don't use it, then... out of curiosity, does that extend to stdout et.al.?

Good question; it turns out I like the standard streams much better!

First of all, the standard streams are just an idiom --- there is
nothing actually special about file descriptors 0, 1, and 2. That's a
clean design --- the kernel doesn't need to know about userspace idioms.

Second of all, if you don't want any of those, you can just close 'em!
You can't do that with the cwd, however. It's stuck open.

Ideally `*at` would have been with us from the beginning, and, say, file
descriptor 3 would have been the "current working directory" merely by
convention.
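
A minimal sketch of that convention, assuming Python's `os.open` with
its `dir_fd` parameter as a stand-in for `openat(2)`. The helper names
(`read_via_dirfd`, `demo`) and the file name are hypothetical; the point
is only that the directory is passed explicitly rather than taken from
process-wide state.

```python
import os
import tempfile


def read_via_dirfd(dirfd, name, limit=4096):
    """Read a file named relative to an explicit directory descriptor."""
    # With dir_fd set, os.open() resolves a relative name against that
    # descriptor (openat(2) semantics) instead of the process-wide cwd.
    fd = os.open(name, os.O_RDONLY, dir_fd=dirfd)
    try:
        return os.read(fd, limit)
    finally:
        os.close(fd)


def demo():
    # Hypothetical setup: a scratch directory holding a single file.
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "notes.txt"), "w") as f:
            f.write("hello\n")
        # This descriptor plays the role the cwd would otherwise play.
        workdir_fd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY)
        try:
            return read_via_dirfd(workdir_fd, "notes.txt")
        finally:
            os.close(workdir_fd)


if __name__ == "__main__":
    print(demo())
```

Unlike the cwd, `workdir_fd` here is an ordinary descriptor: it can be
closed, duplicated, or simply never opened.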

> Would that kind of thing added kernel-side assist the development of such
> library?  Maybe, but I wouldn't bet too much on that - if you start from
> scratch, you can trivially verify that you don't even attempt given
> set of syscalls and if you use libc as a starting point, you get to
> debug all the failure exits you've added...

First of all, I am trying to change what processes are allowed to do,
and this includes programs I did not write. A libc-based solution relies
on the program cooperating with its own sandboxing; it does not help
with running arbitrary, possibly untrusted programs in a restricted
manner.

Second of all, this would be very laborious in practice, because we're
talking not about what syscalls the program uses, but about what data is
passed in those syscalls. Any program that consumes arbitrary user input
(like shell utilities) might receive an absolute or relative path, and
so it would have to manually check for that, lest the user input "trick"
the program into using the root dir and cwd it is trying to ignore.
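
To make the "trick" concrete, here is a hedged sketch, again using
Python's `os.open(..., dir_fd=...)` as a stand-in for `openat(2)`. The
names `naive_open`, `looks_confined`, and `demo`, and the directory
layout, are hypothetical. It shows that passing a directory descriptor
does not by itself confine the lookup: an absolute path ignores the
descriptor entirely, and a `..` component walks out of it.

```python
import os
import tempfile


def naive_open(dirfd, user_path):
    """Open user-supplied input relative to dirfd, with no checks."""
    return os.open(user_path, os.O_RDONLY, dir_fd=dirfd)


def looks_confined(user_path):
    """The kind of manual check a cooperating program would need.

    Rejects absolute paths and '..' components. This is deliberately
    incomplete: it does nothing about symlinks inside the directory.
    """
    if os.path.isabs(user_path):
        return False
    return ".." not in user_path.split("/")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)
        sandbox = os.path.join(tmp, "sandbox")
        os.mkdir(sandbox)
        outside = os.path.join(tmp, "secret.txt")
        with open(outside, "w") as f:
            f.write("outside\n")

        dirfd = os.open(sandbox, os.O_RDONLY | os.O_DIRECTORY)
        leaked = []
        try:
            # Both inputs reach a file that is not under `sandbox`.
            for user_path in (outside, "../secret.txt"):
                fd = naive_open(dirfd, user_path)
                try:
                    leaked.append(os.read(fd, 100))
                finally:
                    os.close(fd)
        finally:
            os.close(dirfd)
        return leaked


if __name__ == "__main__":
    print(demo())
```

Every call site that accepts a path from the user would need something
like `looks_confined`, and would need to get it right.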

Making a few small edits in the kernel path resolution logic to allow
for these null fields is much more practical than defending a much
broader perimeter in userspace.

> > The programmer (or coding agent) is
> > encouraged to do everything with file descriptors rather than path
> > concatenations etc., because they need to use `*at` anyways, and then
> > voilà, without browbeating anyone in security seminars or code review, a
> > bunch of TOCTOU issues disappear simply because doing the right thing is
> > now the path of least resistance.
>
> I'm sorry, but the path of least resistance is picking a snippet from google
> that will implement open(), etc., on top of your setup and using it.
> _Especially_ if coding agents are going to be involved, precisely because
> they'll do a convincing simulation of human duhveloper's behaviour, i.e.
> "cut'n'paste it from the net".

We agree! But this is precisely why it is important to make these things
fail. Mindless Stack Overflow cut'n'pasters (human or agent) still run
their program to make sure it works. Making the thing you don't want
them to do *actually fail* creates sufficiently strong and incremental
feedback that they will end up doing the right thing.

> > The current working directory, roughly, is *just* some global state
> > holding a directory file descriptor.
>
> So's the descriptor table; what's the difference?

Now that I've responded to everything else, I can answer this in
summary:

- File descriptors can be closed; cwd and root cannot be.

- File descriptors need to be explicitly used in syscalls. The cwd and
  root are implicitly used (in too many different syscalls to make
  syscall-level auditing practical) depending on the kind of path string
  passed to the syscall, without the program's explicit consent.
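
Both points can be sketched in a few lines of Python, under the
assumption that its `os` module calls map directly onto the underlying
syscalls. The function names are hypothetical. The first function shows
that a closed descriptor is really gone (`EBADF`); the second shows a
relative path being resolved against the cwd although no descriptor was
ever passed.

```python
import errno
import os
import tempfile


def use_after_close():
    """Return the errno from writing to a descriptor that was closed."""
    r, w = os.pipe()
    os.close(r)
    os.close(w)
    try:
        os.write(w, b"x")
    except OSError as e:
        return e.errno
    return None


def implicit_cwd_lookup():
    """Open a relative path; the cwd is consulted without being named."""
    old = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "here.txt"), "w") as f:
            f.write("found via cwd\n")
        os.chdir(tmp)
        try:
            # No directory descriptor appears in this call.
            fd = os.open("here.txt", os.O_RDONLY)
            try:
                return os.read(fd, 100)
            finally:
                os.close(fd)
        finally:
            # Restore the previous cwd before the scratch dir is removed.
            os.chdir(old)


if __name__ == "__main__":
    print(use_after_close() == errno.EBADF)
    print(implicit_cwd_lookup())
```

Note that the cwd can be changed with `os.chdir`, but there is no call
that closes it.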

John

