All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexey Gladkov <legion@kernel.org>
To: Christian Brauner <brauner@kernel.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>,
	Dan Klishch <danilklishch@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	"Eric W . Biederman" <ebiederm@xmission.com>,
	Kees Cook <keescook@chromium.org>,
	containers@lists.linux.dev, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 4/5] proc: Skip the visibility check if subset=pid is used
Date: Tue, 21 Apr 2026 14:24:08 +0200	[thread overview]
Message-ID: <aedsaAYi48NKWg_0@example.org> (raw)
In-Reply-To: <20260421-duzen-laugenbrezel-e1d1138ff5ae@brauner>

On Tue, Apr 21, 2026 at 01:51:47PM +0200, Christian Brauner wrote:
> On Fri, Apr 17, 2026 at 01:03:46AM +1000, Aleksa Sarai wrote:
> > On 2026-04-16, Christian Brauner <brauner@kernel.org> wrote:
> > > On Thu, Apr 16, 2026 at 10:46:50PM +1000, Aleksa Sarai wrote:
> > > > On 2026-04-16, Aleksa Sarai <cyphar@cyphar.com> wrote:
> > > > > On 2026-04-13, Alexey Gladkov <legion@kernel.org> wrote:
> > > > > > When procfs is mounted with the subset=pid option, all system files and
> > > > > > directories from the root of the filesystem are not accessible in
> > > > > > userspace. Only dynamic information about processes is available, which
> > > > > > cannot be hidden with overmount.
> > > > > > 
> > > > > > For this reason, checking for full visibility is not relevant if mounting
> > > > > > is performed with the subset=pid option.
> > > > > > 
> > > > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > > > > ---
> > > > > 
> > > > > > -static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags)
> > > > > > +static bool mount_too_revealing(struct fs_context *fc, int *new_mnt_flags)
> > > > > >  {
> > > > > >  	const unsigned long required_iflags = SB_I_NOEXEC | SB_I_NODEV;
> > > > > >  	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
> > > > > > +	const struct super_block *sb = fc->root->d_sb;
> > > > > >  	unsigned long s_iflags;
> > > > > >  
> > > > > >  	if (ns->user_ns == &init_user_ns)
> > > > > > @@ -6388,7 +6387,7 @@ static bool mount_too_revealing(const struct super_block *sb, int *new_mnt_flags
> > > > > >  		return true;
> > > > > >  	}
> > > > > >  
> > > > > > -	return !mnt_already_visible(ns, sb, new_mnt_flags);
> > > > > > +	return (!fc->skip_visibility && !mnt_already_visible(ns, sb, new_mnt_flags));
> > > > > >  }
> > > > > 
> > > > > Unless I'm missing something (I haven't tested this locally yet, sorry),
> > > > > this will allow you to bypass mount_too_revealing() even for
> > > > > non-subset=pid mounts because once you create a subset=pid mount then a
> > > > > regular procfs mount will see the subset=pid mount and permit it.
> > > > > 
> > > > > I think the solution is quite simple -- you can also skip super-blocks
> > > > > that have fc->skip_visibility set in mnt_already_visible().
> > > > 
> > > > I now see that check was present in v8 but I guess its importance wasn't
> > > > obvious. I guess this means we will need to reintroduce
> > > > SB_I_USERNS_ALLOW_REVEALING. :/
> > > 
> > > I've been playing with something else. So first we should move
> > > SB_I_USERNS_REVEALING to an fs_type flag. It's not an optional thing and
> > > always set and never removed. That also means we can simplify
> > > sysfs_get_tree() to just kernfs_get_tree().
> > 
> > Seems quite reasonable to me.
> > 
> > > And then we raise SB_I_USERNS_RESTRICTED on all procfs mounts with
> > > pid_only and disallow using them for calculating mount permissions for
> > > unrestricted procfs mounts.
> > 
> > I think that's fine here and is probably all we need for now (though I
> > think that the name is a little confusing especially given my next
> > comment), but I think there is a bit of a deeper problem here that
> > deserves to be mentioned if only for posterity.
> > 
> > The core issue really is that we actually have two versions of procfs
> > now, and they are effectively distinguished by fs_context flags instead
> > of fs_type. Back when subset=pid was merged there was a discussion about
> > a hypothetical proc2 -- while that got dropped, we are kind of
> > reinventing this split in an ad-hoc way.
> > 
> > Conceptually considering them as two distinct fs_types would be the more
> > semantically correct thing to do, though I think it would be overkill to
> > come up with some framework for that. (I also really hope we don't have
> > any other filesystems where this is also the case.)
> 
> Doing this is almost trivial. The problem really is that we'd expose
> "procfs2" or whatever as a new mount option to userspace without
> actually having fundamentally revamped what we would like procfs2 to do.

Agree.

The new procfs2 should not have the issues present in procfs.

> IOW, it would elevate a security hack that got added a while ago to a
> new filesystem type. I think that would just be sad. So I'm not so
> excited about doing this.
> 
> > This is why I think the name is a little weird, since it isn't an issue
> > about one procfs being restricted in a userns, it's that they are
> > conceptually unrelated filesystem types. A very lightweight version of
> 
> I somewhat disagree because it's unclear where you draw the line. If you
> use the hidepid= stuff at its most restrictive where it only shows tasks
> your user owns you could also argue that it's very close to a separate
> filesystem type. It's strictly subtractive and so is pidonly. There's
> just nothing that's better or novel.
> 
> > this would be making it something like SB_I_RESTRICTED_VARIANT that
> > would be more a little more strict than what you have now --
> > SB_I_RESTRICTED_VARIANT would not be treated as compatible with anything
> > (even another SB_I_RESTRICTED_VARIANT mount) so that if we add more
> > variants in the future we don't treat them the same either. (Again, I
> > don't think we need this now?) Of course this would still allow
> > subset=pid without a procfs mount because of the other special casing
> > for it in mount_too_revealing().
> 
> It's not a problem today. I think we can simply treat
> SB_I_RESTRICTED_VARIANT as not allowing other restricted variants. And
> we'd need a more elaborate mechanism anyway once we'd have "compatible"
> subsets.
> 
> > We are also limited in what semantics we can change here too (and I'm a
> > little worried that even this bit for SB_I_USERNS_RESTRICTED is at risk
> > of breaking something) so maybe that isn't needed today.
> 
> So I think yes, SB_I_RESTRICTED_VARIANT is fine by me for now.

Should I create a new version and add this to my patchset?

-- 
Rgrds, legion


  reply	other threads:[~2026-04-21 12:24 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-16 10:45 [RESEND PATCH v6 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2021-07-16 10:45 ` [RESEND PATCH v6 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 4/5] proc: Relax check of mount visibility Alexey Gladkov
2021-07-16 10:46 ` [RESEND PATCH v6 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2025-12-13  5:06 ` [RESEND PATCH v6 0/5] proc: subset=pid: Relax check of mount visibility Dan Klishch
2025-12-13 10:49   ` Alexey Gladkov
2025-12-13 18:00     ` Dan Klishch
2025-12-14 16:40       ` Alexey Gladkov
2025-12-14 18:02         ` Dan Klishch
2025-12-15 10:10           ` Alexey Gladkov
2025-12-15 14:46             ` Dan Klishch
2025-12-15 14:58               ` Alexey Gladkov
2025-12-24 12:55                 ` Christian Brauner
2026-01-30 13:34                   ` Alexey Gladkov
2025-12-15 11:30           ` Christian Brauner
2026-01-13  9:20   ` [PATCH v7 " Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-02-04 14:39       ` Christian Brauner
2026-02-11 19:35         ` Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 4/5] proc: Relax check of mount visibility Alexey Gladkov
2026-01-13  9:20     ` [PATCH v7 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2026-02-13 10:44     ` [PATCH v8 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-02-13 10:44       ` [PATCH v8 1/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-02-13 10:44       ` [PATCH v8 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-02-13 10:44       ` [PATCH v8 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-02-13 10:44       ` [PATCH v8 4/5] proc: Relax check of mount visibility Alexey Gladkov
2026-02-17 11:59         ` Christian Brauner
2026-04-10 11:12           ` Christian Brauner
2026-04-10 11:31             ` Alexey Gladkov
2026-04-14  9:55               ` Christian Brauner
2026-02-13 10:44       ` [PATCH v8 5/5] docs: proc: add documentation about relaxing visibility restrictions Alexey Gladkov
2026-04-13 11:19       ` [PATCH v9 0/5] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-04-13 11:19         ` [PATCH v9 1/5] namespace: record fully visible mounts in list Alexey Gladkov
2026-04-13 11:19         ` [PATCH v9 2/5] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-04-13 11:19         ` [PATCH v9 3/5] proc: Disable cancellation of subset=pid option Alexey Gladkov
2026-04-13 11:19         ` [PATCH v9 4/5] proc: Skip the visibility check if subset=pid is used Alexey Gladkov
2026-04-16 12:30           ` Aleksa Sarai
2026-04-16 12:46             ` Aleksa Sarai
2026-04-16 13:30               ` Christian Brauner
2026-04-16 15:03                 ` Aleksa Sarai
2026-04-21 11:51                   ` Christian Brauner
2026-04-21 12:24                     ` Alexey Gladkov [this message]
2026-04-22 12:46                       ` Christian Brauner
2026-04-22 22:32                     ` Aleksa Sarai
2026-04-16 12:52           ` Christian Brauner
2026-04-13 11:19         ` [PATCH v9 5/5] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-04-27  8:26         ` [PATCH v10 0/7] proc: subset=pid: Relax check of mount visibility Alexey Gladkov
2026-04-27  8:26           ` [PATCH v10 1/7] namespace: record fully visible mounts in list Alexey Gladkov
2026-04-27  8:26           ` [PATCH v10 2/7] fs: move SB_I_USERNS_VISIBLE to FS_USERNS_MOUNT_RESTRICTED Alexey Gladkov
2026-04-27  8:26           ` [PATCH v10 3/7] sysfs: remove trivial sysfs_get_tree() wrapper Alexey Gladkov
2026-04-27  8:26           ` [PATCH v10 4/7] proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN Alexey Gladkov
2026-04-27  8:26           ` [PATCH v10 5/7] proc: prevent reconfiguring subset=pid Alexey Gladkov
2026-04-27 22:31             ` Aleksa Sarai
2026-04-27  8:26           ` [PATCH v10 6/7] proc: handle subset=pid separately in userns visibility checks Alexey Gladkov
2026-04-27  8:26           ` [PATCH v10 7/7] docs: proc: add documentation about mount restrictions Alexey Gladkov
2026-04-27 15:54           ` [PATCH v10 0/7] proc: subset=pid: Relax check of mount visibility Christian Brauner
2026-04-27 22:34           ` Aleksa Sarai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aedsaAYi48NKWg_0@example.org \
    --to=legion@kernel.org \
    --cc=brauner@kernel.org \
    --cc=containers@lists.linux.dev \
    --cc=cyphar@cyphar.com \
    --cc=danilklishch@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=keescook@chromium.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.