From: Alexey Gladkov <legion@kernel.org>
To: Christian Brauner <brauner@kernel.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>,
Dan Klishch <danilklishch@gmail.com>,
Al Viro <viro@zeniv.linux.org.uk>,
"Eric W . Biederman" <ebiederm@xmission.com>,
Kees Cook <keescook@chromium.org>,
containers@lists.linux.dev, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 4/5] proc: Skip the visibility check if subset=pid is used
Date: Tue, 21 Apr 2026 14:24:08 +0200 [thread overview]
Message-ID: <aedsaAYi48NKWg_0@example.org> (raw)
In-Reply-To: <20260421-duzen-laugenbrezel-e1d1138ff5ae@brauner>
On Tue, Apr 21, 2026 at 01:51:47PM +0200, Christian Brauner wrote:
> On Fri, Apr 17, 2026 at 01:03:46AM +1000, Aleksa Sarai wrote:
> > On 2026-04-16, Christian Brauner <brauner@kernel.org> wrote:
> > > On Thu, Apr 16, 2026 at 10:46:50PM +1000, Aleksa Sarai wrote:
> > > > On 2026-04-16, Aleksa Sarai <cyphar@cyphar.com> wrote:
> > > > > On 2026-04-13, Alexey Gladkov <legion@kernel.org> wrote:
> > > > > > When procfs is mounted with the subset=pid option, all system files and
> > > > > > directories from the root of the filesystem are not accessible in
> > > > > > userspace. Only dynamic information about processes is available, which
> > > > > > cannot be hidden with overmount.
> > > > > >
> > > > > > For this reason, checking for full visibility is not relevant if mounting
> > > > > > is performed with the subset=pid option.
> > > > > >
> > > > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > > > > ---
> > > > >
> > > > > > -static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags)
> > > > > > +static bool mount_too_revealing(struct fs_context *fc, int *new_mnt_flags)
> > > > > > {
> > > > > > const unsigned long required_iflags = SB_I_NOEXEC | SB_I_NODEV;
> > > > > > struct mnt_namespace *ns = current->nsproxy->mnt_ns;
> > > > > > + const struct super_block *sb = fc->root->d_sb;
> > > > > > unsigned long s_iflags;
> > > > > >
> > > > > > if (ns->user_ns == &init_user_ns)
> > > > > > @@ -6388,7 +6387,7 @@ static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags
> > > > > > return true;
> > > > > > }
> > > > > >
> > > > > > - return !mnt_already_visible(ns, sb, new_mnt_flags);
> > > > > > + return (!fc->skip_visibility && !mnt_already_visible(ns, sb, new_mnt_flags));
> > > > > > }
> > > > >
> > > > > Unless I'm missing something (I haven't tested this locally yet, sorry),
> > > > > this will allow you to bypass mount_too_revealing() even for
> > > > > non-subset=pid mounts because once you create a subset=pid mount then a
> > > > > regular procfs mount will see the subset=pid mount and permit it.
> > > > >
> > > > > I think the solution is quite simple -- you can also skip super-blocks
> > > > > that have fc->skip_visibility set in mnt_already_visible().
> > > >
> > > > I now see that check was present in v8 but I guess its importance wasn't
> > > > obvious. I guess this means we will need to reintroduce
> > > > SB_I_USERNS_ALLOW_REVEALING. :/
> > >
> > > I've been playing with something else. So first we should move
> > > SB_I_USERNS_REVEALING to an fs_type flag. It's not an optional thing and
> > > always set and never removed. That also means we can simplify
> > > sysfs_get_tree() to just kernfs_get_tree().
> >
> > Seems quite reasonable to me.
> >
> > > And then we raise SB_I_USERNS_RESTRICTED on all procfs mounts with
> > > pid_only and disallow using them for calculating mount permissions for
> > > unrestricted procfs mounts.
> >
> > I think that's fine here and is probably all we need for now (though I
> > think that the name is a little confusing especially given my next
> > comment), but I think there is a bit of a deeper problem here that
> > deserves to be mentioned if only for posterity.
> >
> > The core issue really is that we actually have two versions of procfs
> > now, and they are effectively distinguished by fs_context flags instead
> > of fs_type. Back when subset=pid was merged there was a discussion about
> > a hypothetical proc2 -- while that got dropped, we are kind of
> > reinventing this split in an ad-hoc way.
> >
> > Conceptually considering them as two distinct fs_types would be the more
> > semantically correct thing to do, though I think it would be overkill to
> > come up with some framework for that. (I also really hope we don't have
> > any other filesystems where this is also the case.)
>
> Doing this is almost trivial. The problem really is that we'd expose
> "procfs2" or whatever as a new mount option to userspace without
> actually having fundamentally revamped what we would like procfs2 to do.
Agree.
The new procfs2 should not have the issues present in procfs.
> IOW, it would elevate a security hack that got added a while ago to a
> new filesystem type. I think that would just be sad. So I'm not so
> excited about doing this.
>
> > This is why I think the name is a little weird, since it isn't an issue
> > about one procfs being restricted in a userns, it's that they are
> > conceptually unrelated filesystem types. A very lightweight version of
>
> I somewhat disagree because it's unclear where you draw the line. If you
> use the hidepid= stuff at its most restrictive where it only shows tasks
> your user owns you could also argue that it's very close to a separate
> filesystem type. It's strictly subtractive and so is pidonly. There's
> just nothing that's better or novel.
>
> > this would be making it something like SB_I_RESTRICTED_VARIANT that
> > would be more a little more strict than what you have now --
> > SB_I_RESTRICTED_VARIANT would not be treated as compatible with anything
> > (even another SB_I_RESTRICTED_VARIANT mount) so that if we add more
> > variants in the future we don't treat them the same either. (Again, I
> > don't think we need this now?) Of course this would still allow
> > subset=pid without a procfs mount because of the other special casing
> > for it in mount_too_revealing().
>
> It's not a problem today. I think we can simply treat
> SB_I_RESTRICTED_VARIANT as not allowing other restricted variants. And
> we'd need a more elaborate mechanism anyway once we'd have "compatible"
> subsets.
>
> > We are also limited in what semantics we can change here too (and I'm a
> > little worried that even this bit for SB_I_USERNS_RESTRICTED is at risk
> > of breaking something) so maybe that isn't needed today.
>
> So I think yes, SB_I_RESTRICTED_VARIANT is fine by me for now.
Should I create a new version and add this to my patchset?
--
Rgrds, legion
next prev parent reply other threads:[~2026-04-21 12:24 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-16 10:45 [RESEND PATCH v6 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2021-07-16 10:45 ` [RESEND PATCH v6 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 4/5] proc: Relax check of mount visibility Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2025-12-13 5:06 ` [RESEND PATCH v6 0/5] proc: subset=pid: Relax check of mount visibility Dan Klishch
2025-12-13 10:49 ` Alexey Gladkov
2025-12-13 18:00 ` Dan Klishch
2025-12-14 16:40 ` Alexey Gladkov
2025-12-14 18:02 ` Dan Klishch
2025-12-15 10:10 ` Alexey Gladkov
2025-12-15 14:46 ` Dan Klishch
2025-12-15 14:58 ` Alexey Gladkov
2025-12-24 12:55 ` Christian Brauner
2026-01-30 13:34 ` Alexey Gladkov
2025-12-15 11:30 ` Christian Brauner
2026-01-13 9:20 ` [PATCH v7 " Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-02-04 14:39 ` Christian Brauner
2026-02-11 19:35 ` Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 4/5] proc: Relax check of mount visibility Alexey Gladkov
2026-01-13 9:20 ` [PATCH v7 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-02-13 10:44 ` [PATCH v8 4/5] proc: Relax check of mount visibility Alexey Gladkov
2026-02-17 11:59 ` Christian Brauner
2026-04-10 11:12 ` Christian Brauner
2026-04-10 11:31 ` Alexey Gladkov
2026-04-14 9:55 ` Christian Brauner
2026-02-13 10:44 ` [PATCH v8 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2026-04-13 11:19 ` [PATCH v9 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-04-13 11:19 ` [PATCH v9 1/5] namespace: record fully visible mounts in list Alexey Gladkov
2026-04-13 11:19 ` [PATCH v9 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-04-13 11:19 ` [PATCH v9 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-04-13 11:19 ` [PATCH v9 4/5] proc: Skip the visibility check if subset=pid is used Alexey Gladkov
2026-04-16 12:30 ` Aleksa Sarai
2026-04-16 12:46 ` Aleksa Sarai
2026-04-16 13:30 ` Christian Brauner
2026-04-16 15:03 ` Aleksa Sarai
2026-04-21 11:51 ` Christian Brauner
2026-04-21 12:24 ` Alexey Gladkov [this message]
2026-04-16 12:52 ` Christian Brauner
2026-04-13 11:19 ` [PATCH v9 5/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aedsaAYi48NKWg_0@example.org \
--to=legion@kernel.org \
--cc=brauner@kernel.org \
--cc=containers@lists.linux.dev \
--cc=cyphar@cyphar.com \
--cc=danilklishch@gmail.com \
--cc=ebiederm@xmission.com \
--cc=keescook@chromium.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox