From: Christian Brauner <brauner@kernel.org>
To: Alexey Gladkov <legion@kernel.org>
Cc: Dan Klishch <danilklishch@gmail.com>,
Al Viro <viro@zeniv.linux.org.uk>,
"Eric W . Biederman" <ebiederm@xmission.com>,
Kees Cook <keescook@chromium.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v8 4/5] proc: Relax check of mount visibility
Date: Fri, 10 Apr 2026 13:12:36 +0200 [thread overview]
Message-ID: <20260410-textstellen-anrollen-17acbf930c3a@brauner> (raw)
In-Reply-To: <20260217-helikopter-parkplatz-859f8f7c0054@brauner>
On Tue, Feb 17, 2026 at 12:59:54PM +0100, Christian Brauner wrote:
> On Fri, Feb 13, 2026 at 11:44:29AM +0100, Alexey Gladkov wrote:
> > When /proc is mounted with the subset=pid option, all system files from
> > the root of the file system are not accessible in userspace. Only
> > dynamic information about processes is available, which cannot be
> > hidden with overmount.
> >
> > For this reason, checking for full visibility is not relevant if
> > mounting is performed with the subset=pid option.
> >
> > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > ---
> > fs/namespace.c | 29 ++++++++++++++++-------------
> > fs/proc/root.c | 17 ++++++++++-------
> > include/linux/fs/super_types.h | 2 ++
> > 3 files changed, 28 insertions(+), 20 deletions(-)
> >
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index c58674a20cad..7daa86315c05 100644
> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
> > @@ -6116,7 +6116,8 @@ static bool mnt_already_visible(struct mnt_namespace *ns,
> > /* This mount is not fully visible if it's root directory
> > * is not the root directory of the filesystem.
> > */
> > - if (mnt->mnt.mnt_root != mnt->mnt.mnt_sb->s_root)
> > + if (!(sb->s_iflags & SB_I_USERNS_ALLOW_REVEALING) &&
> > + mnt->mnt.mnt_root != mnt->mnt.mnt_sb->s_root)
> > continue;
> >
> > /* A local view of the mount flags */
> > @@ -6136,18 +6137,20 @@ static bool mnt_already_visible(struct mnt_namespace *ns,
> > ((mnt_flags & MNT_ATIME_MASK) != (new_flags & MNT_ATIME_MASK)))
> > continue;
>
> There are a few things that I find problematic here.
>
> Even before your change the mount flags of the first fully visible
> procfs mount would be picked up. If the caller was unlucky they could
> stumble upon the most restricted procfs mount in the mount namespace
> rbtree. Leading to weird scenarios where a user cannot write to the
> procfs instance they just mounted but could to another one that is also
> in their namespace.
>
> The other thing is that with this change specifically:
>
> if (!(sb->s_iflags & SB_I_USERNS_ALLOW_REVEALING) &&
> mnt->mnt.mnt_root != mnt->mnt.mnt_sb->s_root)
>
> we start caring about mount options of even partially exposed procfs
> mounts. IOW, if someone had a bind-mount of e.g., /proc/pressure
> somewhere that got inherited via CLONE_NEWNS then we suddenly take the
> mount options of that into account for a new /proc/<pid>/* only instance.
> I think we should continue caring only about procfs mounts that are
> visible from their root.
>
> The the other problem is that it is really annoying that we walk all
> mounts in a mount namespace just to find procfs and sysfs mounts in
> there. Currently a lot of workloads still do the CLONE_NEWNS dance
> meaning they inherit all the crap from the host and then proceed to
> setup their new rootfs. Busy container workloads that can be a lot.
>
> So let's just be honest about it and treat procfs and sysfs as the
> snowflakes that they have become and record their instances in a
> separate per mount namespace hlist as in the (untested) patch below [1].
>
> Also SB_I_USERNS_ALLOW_REVEALING seems unnecessary. The only time we
> care about that flag is when we setup a new superblock. So this could
> easily be a struct fs_context bitfield that just exists for the duration
> of the creation of the new superblock and mount. So maybe pass that down
> to mount_too_revealing() and further down into the actual helper.
>
> [1]:
> >From 4bbd41e7a3ef91667dd334f31b1b6bf8caec0599 Mon Sep 17 00:00:00 2001
> From: Christian Brauner <brauner@kernel.org>
> Date: Tue, 17 Feb 2026 12:02:34 +0100
> Subject: [PATCH] namespace: record fully visible mounts in list
>
> Instead of wading through all the mounts in the mount namespace rbtree
> to find fully visible procfs and sysfs mounts, be honest about them
> being special cruft and record them in a separate per-mount namespace
> list.
If you rework this I would expect to take it for v7.3. It's a bit late
now...
next prev parent reply other threads:[~2026-04-10 11:12 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-16 10:45 [RESEND PATCH v6 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2021-07-16 10:45 ` [RESEND PATCH v6 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 4/5] proc: Relax check of mount visibility Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2025-12-13 5:06 ` [RESEND PATCH v6 0/5] proc: subset=pid: Relax check of mount visibility Dan Klishch
2025-12-13 10:49 ` Alexey Gladkov
2025-12-13 18:00 ` Dan Klishch
2025-12-14 16:40 ` Alexey Gladkov
2025-12-14 18:02 ` Dan Klishch
2025-12-15 10:10 ` Alexey Gladkov
2025-12-15 14:46 ` Dan Klishch
2025-12-15 14:58 ` Alexey Gladkov
2025-12-24 12:55 ` Christian Brauner
2026-01-30 13:34 ` Alexey Gladkov
2025-12-15 11:30 ` Christian Brauner
2026-01-13 9:20 ` [PATCH v7 " Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-02-04 14:39 ` Christian Brauner
2026-02-11 19:35 ` Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 4/5] proc: Relax check of mount visibility Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 4/5] proc: Relax check of mount visibility Alexey Gladkov
2026-02-17 11:59 ` Christian Brauner
2026-04-10 11:12 ` Christian Brauner [this message]
2026-04-10 11:31 ` Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260410-textstellen-anrollen-17acbf930c3a@brauner \
--to=brauner@kernel.org \
--cc=danilklishch@gmail.com \
--cc=ebiederm@xmission.com \
--cc=keescook@chromium.org \
--cc=legion@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox