public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: Alexey Gladkov <legion@kernel.org>
To: Christian Brauner <brauner@kernel.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>,
	Dan Klishch <danilklishch@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	"Eric W . Biederman" <ebiederm@xmission.com>,
	Kees Cook <keescook@chromium.org>,
	containers@lists.linux.dev, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 4/5] proc: Skip the visibility check if subset=pid is used
Date: Tue, 21 Apr 2026 14:24:08 +0200	[thread overview]
Message-ID: <aedsaAYi48NKWg_0@example.org> (raw)
In-Reply-To: <20260421-duzen-laugenbrezel-e1d1138ff5ae@brauner>

On Tue, Apr 21, 2026 at 01:51:47PM +0200, Christian Brauner wrote:
> On Fri, Apr 17, 2026 at 01:03:46AM +1000, Aleksa Sarai wrote:
> > On 2026-04-16, Christian Brauner <brauner@kernel.org> wrote:
> > > On Thu, Apr 16, 2026 at 10:46:50PM +1000, Aleksa Sarai wrote:
> > > > On 2026-04-16, Aleksa Sarai <cyphar@cyphar.com> wrote:
> > > > > On 2026-04-13, Alexey Gladkov <legion@kernel.org> wrote:
> > > > > > When procfs is mounted with the subset=pid option, all system files and
> > > > > > directories from the root of the filesystem are not accessible in
> > > > > > userspace. Only dynamic information about processes is available, which
> > > > > > cannot be hidden with overmount.
> > > > > > 
> > > > > > For this reason, checking for full visibility is not relevant if mounting
> > > > > > is performed with the subset=pid option.
> > > > > > 
> > > > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > > > > ---
> > > > > 
> > > > > > -static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags)
> > > > > > +static bool mount_too_revealing(struct fs_context *fc, int *new_mnt_flags)
> > > > > >  {
> > > > > >  	const unsigned long required_iflags = SB_I_NOEXEC | SB_I_NODEV;
> > > > > >  	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
> > > > > > +	const struct super_block *sb = fc->root->d_sb;
> > > > > >  	unsigned long s_iflags;
> > > > > >  
> > > > > >  	if (ns->user_ns == &init_user_ns)
> > > > > > @@ -6388,7 +6387,7 @@ static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags
> > > > > >  		return true;
> > > > > >  	}
> > > > > >  
> > > > > > -	return !mnt_already_visible(ns, sb, new_mnt_flags);
> > > > > > +	return (!fc->skip_visibility && !mnt_already_visible(ns, sb, new_mnt_flags));
> > > > > >  }
> > > > > 
> > > > > Unless I'm missing something (I haven't tested this locally yet, sorry),
> > > > > this will allow you to bypass mount_too_revealing() even for
> > > > > non-subset=pid mounts because once you create a subset=pid mount then a
> > > > > regular procfs mount will see the subset=pid mount and permit it.
> > > > > 
> > > > > I think the solution is quite simple -- you can also skip super-blocks
> > > > > that have fc->skip_visibility set in mnt_already_visible().
> > > > 
> > > > I now see that check was present in v8 but I guess its importance wasn't
> > > > obvious. I guess this means we will need to reintroduce
> > > > SB_I_USERNS_ALLOW_REVEALING. :/
> > > 
> > > I've been playing with something else. So first we should move
> > > SB_I_USERNS_REVEALING to an fs_type flag. It's not an optional thing and
> > > always set and never removed. That also means we can simplify
> > > sysfs_get_tree() to just kernfs_get_tree().
> > 
> > Seems quite reasonable to me.
> > 
> > > And then we raise SB_I_USERNS_RESTRICTED on all procfs mounts with
> > > pid_only and disallow using them for calculating mount permissions for
> > > unrestricted procfs mounts.
> > 
> > I think that's fine here and is probably all we need for now (though I
> > think that the name is a little confusing especially given my next
> > comment), but I think there is a bit of a deeper problem here that
> > deserves to be mentioned if only for posterity.
> > 
> > The core issue really is that we actually have two versions of procfs
> > now, and they are effectively distinguished by fs_context flags instead
> > of fs_type. Back when subset=pid was merged there was a discussion about
> > a hypothetical proc2 -- while that got dropped, we are kind of
> > reinventing this split in an ad-hoc way.
> > 
> > Conceptually considering them as two distinct fs_types would be the more
> > semantically correct thing to do, though I think it would be overkill to
> > come up with some framework for that. (I also really hope we don't have
> > any other filesystems where this is also the case.)
> 
> Doing this is almost trivial. The problem really is that we'd expose
> "procfs2" or whatever as a new mount option to userspace without
> actually having fundamentally revamped what we would like procfs2 to do.

Agree.

The new procfs2 should not have the issues present in procfs.

> IOW, it would elevate a security hack that got added a while ago to a
> new filesystem type. I think that would just be sad. So I'm not so
> excited about doing this.
> 
> > This is why I think the name is a little weird, since it isn't an issue
> > about one procfs being restricted in a userns, it's that they are
> > conceptually unrelated filesystem types. A very lightweight version of
> 
> I somewhat disagree because it's unclear where you draw the line. If you
> use the hidepid= stuff at its most restrictive where it only shows tasks
> your user owns you could also argue that it's very close to a separate
> filesystem type. It's strictly subtractive and so is pidonly. There's
> just nothing that's better or novel.
> 
> > this would be making it something like SB_I_RESTRICTED_VARIANT that
> > would be more a little more strict than what you have now --
> > SB_I_RESTRICTED_VARIANT would not be treated as compatible with anything
> > (even another SB_I_RESTRICTED_VARIANT mount) so that if we add more
> > variants in the future we don't treat them the same either. (Again, I
> > don't think we need this now?) Of course this would still allow
> > subset=pid without a procfs mount because of the other special casing
> > for it in mount_too_revealing().
> 
> It's not a problem today. I think we can simply treat
> SB_I_RESTRICTED_VARIANT as not allowing other restricted variants. And
> we'd need a more elaborate mechanism anyway once we'd have "compatible"
> subsets.
> 
> > We are also limited in what semantics we can change here too (and I'm a
> > little worried that even this bit for SB_I_USERNS_RESTRICTED is at risk
> > of breaking something) so maybe that isn't needed today.
> 
> So I think yes, SB_I_RESTRICTED_VARIANT is fine by me for now.

Should I create a new version and add this to my patchset?

-- 
Rgrds, legion


  reply	other threads:[~2026-04-21 12:24 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-16 10:45 [RESEND PATCH v6 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2021-07-16 10:45 ` [RESEND PATCH v6 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 4/5] proc: Relax check of mount visibility Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2025-12-13  5:06 ` [RESEND PATCH v6 0/5] proc: subset=pid: Relax check of mount visibility Dan Klishch
2025-12-13 10:49   ` Alexey Gladkov
2025-12-13 18:00     ` Dan Klishch
2025-12-14 16:40       ` Alexey Gladkov
2025-12-14 18:02         ` Dan Klishch
2025-12-15 10:10           ` Alexey Gladkov
2025-12-15 14:46             ` Dan Klishch
2025-12-15 14:58               ` Alexey Gladkov
2025-12-24 12:55                 ` Christian Brauner
2026-01-30 13:34                   ` Alexey Gladkov
2025-12-15 11:30           ` Christian Brauner
2026-01-13  9:20   ` [PATCH v7 " Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-02-04 14:39       ` Christian Brauner
2026-02-11 19:35         ` Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 4/5] proc: Relax check of mount visibility Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2026-02-13 10:44     ` [PATCH v8 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-02-13 10:44       ` [PATCH v8 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-02-13 10:44       ` [PATCH v8 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-02-13 10:44       ` [PATCH v8 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-02-13 10:44       ` [PATCH v8 4/5] proc: Relax check of mount visibility Alexey Gladkov
2026-02-17 11:59         ` Christian Brauner
2026-04-10 11:12           ` Christian Brauner
2026-04-10 11:31             ` Alexey Gladkov
2026-04-14  9:55               ` Christian Brauner
2026-02-13 10:44       ` [PATCH v8 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2026-04-13 11:19       ` [PATCH v9 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-04-13 11:19         ` [PATCH v9 1/5] namespace: record fully visible mounts in list Alexey Gladkov
2026-04-13 11:19         ` [PATCH v9 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-04-13 11:19         ` [PATCH v9 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-04-13 11:19         ` [PATCH v9 4/5] proc: Skip the visibility check if subset=pid is used Alexey Gladkov
2026-04-16 12:30           ` Aleksa Sarai
2026-04-16 12:46             ` Aleksa Sarai
2026-04-16 13:30               ` Christian Brauner
2026-04-16 15:03                 ` Aleksa Sarai
2026-04-21 11:51                   ` Christian Brauner
2026-04-21 12:24                     ` Alexey Gladkov [this message]
2026-04-16 12:52           ` Christian Brauner
2026-04-13 11:19         ` [PATCH v9 5/5] docs: proc: add documentation about mount restrictions Alexey Gladkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aedsaAYi48NKWg_0@example.org \
    --to=legion@kernel.org \
    --cc=brauner@kernel.org \
    --cc=containers@lists.linux.dev \
    --cc=cyphar@cyphar.com \
    --cc=danilklishch@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=keescook@chromium.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox