All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Alexey Gladkov <legion@kernel.org>
Cc: Dan Klishch <danilklishch@gmail.com>,
	 Al Viro <viro@zeniv.linux.org.uk>,
	"Eric W . Biederman" <ebiederm@xmission.com>,
	 Kees Cook <keescook@chromium.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v8 4/5] proc: Relax check of mount visibility
Date: Tue, 14 Apr 2026 11:55:26 +0200	[thread overview]
Message-ID: <20260414-mathematik-beispiel-cd22801e71af@brauner> (raw)
In-Reply-To: <adjfhxs5JTrZDnbv@example.org>

On Fri, Apr 10, 2026 at 01:31:19PM +0200, Alexey Gladkov wrote:
> On Fri, Apr 10, 2026 at 01:12:36PM +0200, Christian Brauner wrote:
> > On Tue, Feb 17, 2026 at 12:59:54PM +0100, Christian Brauner wrote:
> > > On Fri, Feb 13, 2026 at 11:44:29AM +0100, Alexey Gladkov wrote:
> > > > When /proc is mounted with the subset=pid option, all system files from
> > > > the root of the file system are not accessible in userspace. Only
> > > > dynamic information about processes is available, which cannot be
> > > > hidden with overmount.
> > > > 
> > > > For this reason, checking for full visibility is not relevant if
> > > > mounting is performed with the subset=pid option.
> > > > 
> > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > > ---
> > > >  fs/namespace.c                 | 29 ++++++++++++++++-------------
> > > >  fs/proc/root.c                 | 17 ++++++++++-------
> > > >  include/linux/fs/super_types.h |  2 ++
> > > >  3 files changed, 28 insertions(+), 20 deletions(-)
> > > > 
> > > > diff --git a/fs/namespace.c b/fs/namespace.c
> > > > index c58674a20cad..7daa86315c05 100644
> > > > --- a/fs/namespace.c
> > > > +++ b/fs/namespace.c
> > > > @@ -6116,7 +6116,8 @@ static bool mnt_already_visible(struct mnt_namespace *ns,
> > > >  		/* This mount is not fully visible if it's root directory
> > > >  		 * is not the root directory of the filesystem.
> > > >  		 */
> > > > -		if (mnt->mnt.mnt_root != mnt->mnt.mnt_sb->s_root)
> > > > +		if (!(sb->s_iflags & SB_I_USERNS_ALLOW_REVEALING) &&
> > > > +		    mnt->mnt.mnt_root != mnt->mnt.mnt_sb->s_root)
> > > >  			continue;
> > > >  
> > > >  		/* A local view of the mount flags */
> > > > @@ -6136,18 +6137,20 @@ static bool mnt_already_visible(struct mnt_namespace *ns,
> > > >  		    ((mnt_flags & MNT_ATIME_MASK) != (new_flags & MNT_ATIME_MASK)))
> > > >  			continue;
> > > 
> > > There are a few things that I find problematic here.
> > > 
> > > Even before your change the mount flags of the first fully visible
> > > procfs mount would be picked up. If the caller was unlucky they could
> > > stumble upon the most restricted procfs mount in the mount namespace
> > > rbtree. Leading to weird scenarios where a user cannot write to the
> > > procfs instance they just mounted but could to another one that is also
> > > in their namespace.
> > > 
> > > The other thing is that with this change specifically:
> > > 
> > >     if (!(sb->s_iflags & SB_I_USERNS_ALLOW_REVEALING) &&
> > >         mnt->mnt.mnt_root != mnt->mnt.mnt_sb->s_root)
> > > 
> > > we start caring about mount options of even partially exposed procfs
> > > mounts. IOW, if someone had a bind-mount of e.g., /proc/pressure
> > > somewhere that got inherited via CLONE_NEWNS then we suddenly take the
> > > mount options of that into account for a new /proc/<pid>/* only instance.
> > > I think we should continue caring only about procfs mounts that are
> > > visible from their root.
> > > 
> > > The the other problem is that it is really annoying that we walk all
> > > mounts in a mount namespace just to find procfs and sysfs mounts in
> > > there. Currently a lot of workloads still do the CLONE_NEWNS dance
> > > meaning they inherit all the crap from the host and then proceed to
> > > setup their new rootfs. Busy container workloads that can be a lot.
> > > 
> > > So let's just be honest about it and treat procfs and sysfs as the
> > > snowflakes that they have become and record their instances in a
> > > separate per mount namespace hlist as in the (untested) patch below [1].
> > > 
> > > Also SB_I_USERNS_ALLOW_REVEALING seems unnecessary. The only time we
> > > care about that flag is when we setup a new superblock. So this could
> > > easily be a struct fs_context bitfield that just exists for the duration
> > > of the creation of the new superblock and mount. So maybe pass that down
> > > to mount_too_revealing() and further down into the actual helper.
> > > 
> > > [1]:
> > > >From 4bbd41e7a3ef91667dd334f31b1b6bf8caec0599 Mon Sep 17 00:00:00 2001
> > > From: Christian Brauner <brauner@kernel.org>
> > > Date: Tue, 17 Feb 2026 12:02:34 +0100
> > > Subject: [PATCH] namespace: record fully visible mounts in list
> > > 
> > > Instead of wading through all the mounts in the mount namespace rbtree
> > > to find fully visible procfs and sysfs mounts, be honest about them
> > > being special cruft and record them in a separate per-mount namespace
> > > list.
> > 
> > If you rework this I would expect to take it for v7.3. It's a bit late
> > now...
> 
> No problem. I understand. Sorry it took me so long to get back to you.
> I was laid off from my job and had to look for a new one quickly.

Damn, I'm very sorry to hear this. But the hopefully this will just be a
very minor setback.

  reply	other threads:[~2026-04-14  9:55 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20251213050639.735940-1-danilklishch@gmail.com>
     [not found] ` <cover.1768295900.git.legion@kernel.org>
     [not found]   ` <e14856f2c5f4635ddf72de61ecc59851c131489c.1768295900.git.legion@kernel.org>
2026-02-04 14:39     ` [PATCH v7 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Christian Brauner
2026-02-11 19:35       ` Alexey Gladkov
2026-02-13 10:44   ` [PATCH v8 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-02-13 10:44     ` [PATCH v8 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-02-13 10:44     ` [PATCH v8 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-02-13 10:44     ` [PATCH v8 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-02-13 10:44     ` [PATCH v8 4/5] proc: Relax check of mount visibility Alexey Gladkov
2026-02-17 11:59       ` Christian Brauner
2026-04-10 11:12         ` Christian Brauner
2026-04-10 11:31           ` Alexey Gladkov
2026-04-14  9:55             ` Christian Brauner [this message]
2026-02-13 10:44     ` [PATCH v8 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2026-04-13 11:19     ` [PATCH v9 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-04-13 11:19       ` [PATCH v9 1/5] namespace: record fully visible mounts in list Alexey Gladkov
2026-04-13 11:19       ` [PATCH v9 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-04-13 11:19       ` [PATCH v9 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-04-13 11:19       ` [PATCH v9 4/5] proc: Skip the visibility check if subset=pid is used Alexey Gladkov
2026-04-16 12:30         ` Aleksa Sarai
2026-04-16 12:46           ` Aleksa Sarai
2026-04-16 13:30             ` Christian Brauner
2026-04-16 15:03               ` Aleksa Sarai
2026-04-21 11:51                 ` Christian Brauner
2026-04-21 12:24                   ` Alexey Gladkov
2026-04-22 12:46                     ` Christian Brauner
2026-04-22 22:32                   ` Aleksa Sarai
2026-04-16 12:52         ` Christian Brauner
2026-04-13 11:19       ` [PATCH v9 5/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-04-27  8:26       ` [PATCH v10 0/7] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-04-27  8:26         ` [PATCH v10 1/7] namespace: record fully visible mounts in list Alexey Gladkov
2026-04-27  8:26         ` [PATCH v10 2/7] fs: move SB_I_USERNS_VISIBLE to FS_USERNS_MOUNT_RESTRICTED Alexey Gladkov
2026-04-27  8:26         ` [PATCH v10 3/7] sysfs: remove trivial sysfs_get_tree() wrapper Alexey Gladkov
2026-04-27  8:26         ` [PATCH v10 4/7] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-04-27  8:26         ` [PATCH v10 5/7] proc: prevent reconfiguring subset=pid Alexey Gladkov
2026-04-27 22:31           ` Aleksa Sarai
2026-04-27  8:26         ` [PATCH v10 6/7] proc: handle subset=pid separately in userns visibility checks Alexey Gladkov
2026-04-27  8:26         ` [PATCH v10 7/7] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-04-27 15:54         ` [PATCH v10 0/7] proc: subset=pid: Relax check of mount visibility Christian Brauner
2026-04-27 22:34         ` Aleksa Sarai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260414-mathematik-beispiel-cd22801e71af@brauner \
    --to=brauner@kernel.org \
    --cc=danilklishch@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=keescook@chromium.org \
    --cc=legion@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.