From: Al Viro <viro@zeniv.linux.org.uk>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: linux-fsdevel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Christian Brauner <brauner@kernel.org>
Subject: Re: [RFC] MNT_LOCKED vs. finish_automount()
Date: Mon, 5 May 2025 00:24:41 +0100 [thread overview]
Message-ID: <20250504232441.GC2023217@ZenIV> (raw)
In-Reply-To: <87plgq8igd.fsf@email.froward.int.ebiederm.org>
On Fri, May 02, 2025 at 10:46:26PM -0500, Eric W. Biederman wrote:
> Al Viro <viro@zeniv.linux.org.uk> writes:
>
> > Back in 2011, when ->d_automount() had been introduced,
> > we went with "stepping on NFS referral, etc., has the submount
> > inherit the flags of parent one" (with the obvious exceptions
> > for internal-only flags). Back then MNT_LOCKED didn't exist.
> >
> > Two years later, when MNT_LOCKED had been added, an explicit
> > "don't set MNT_LOCKED on expirable mounts when propagating across
> > the userns boundary; their underlying mountpoints can be exposed
> > whenever the original expires anyway". Same went for root of
> > subtree attached by explicit mount --[r]bind - the mountpoint
> > had been exposed before the call, after all and for roots of
> > any propagation copies created by such (same reason). Normal mount
> > (created by do_new_mount()) could never get MNT_LOCKED to start with.
> >
> > However, mounts created by finish_automount() bloody well
> > could - if the parent mount had MNT_LOCKED on it, submounts would
> > inherited it. Even if they had been expirable. Moreover, all their
> > propagation copies would have MNT_LOCKED stripped out.
> >
> > IMO this inconsistency is a bug; MNT_LOCKED should not
> > be inherited in finish_automount().
> >
> > Eric, is there something subtle I'm missing here?
>
> I don't think you are missing anything. This looks like a pretty clear
> cut case of simply not realizing finish_automount was special in a way
> that could result in MNT_LOCKED getting set.
>
> I skimmed through the code just a minute ago and my reading of it
> matches your reading of it above.
>
> The intended semantics of MNT_LOCKED are to not let an unprivileged user
> see under mounts they would never be able to see under without creating
> a mount namespace.
>
> The mount point of an automount is pretty clearly something that is safe
> to see under. Doubly so if this is a directory that will always be
> empty on a pseudo filesystem (aka autofs).
Does anybody have objections to the following?
[PATCH] finish_automount(): don't leak MNT_LOCKED from parent to child
Intention for MNT_LOCKED had always been to protect the internal
mountpoints within a subtree that got copied across the userns boundary,
not the mountpoint that tree got attached to - after all, it _was_
exposed before the copying.
For roots of secondary copies that is enforced in attach_recursive_mnt() -
MNT_LOCKED is explicitly stripped for those. For the root of primary
copy we are almost always guaranteed that MNT_LOCKED won't be there,
so attach_recursive_mnt() doesn't bother. Unfortunately, one call
chain got overlooked - triggering e.g. NFS referral will have the
submount inherit the public flags from parent; that's fine for such
things as read-only, nosuid, etc., but not for MNT_LOCKED.
This is particularly pointless since the mount attached by finish_automount()
is usually expirable, which makes any protection granted by MNT_LOCKED
null and void; just wait for a while and that mount will go away on its own.
The minimal fix is to have do_add_mount() treat MNT_LOCKED the same
way as other internal-only flags. Longer term it would be cleaner to
deal with that in attach_recursive_mnt(), but that takes a bit more
massage, so let's go with the one-liner fix for now.
Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/namespace.c b/fs/namespace.c
index 04a9bb9f31fa..352b4ccf1aaa 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3761,7 +3761,7 @@ static int do_add_mount(struct mount *newmnt, struct mountpoint *mp,
{
struct mount *parent = real_mount(path->mnt);
- mnt_flags &= ~MNT_INTERNAL_FLAGS;
+ mnt_flags &= ~(MNT_INTERNAL_FLAGS | MNT_LOCKED);
if (unlikely(!check_mnt(parent))) {
/* that's acceptable only for automounts done in private ns */
next prev parent reply other threads:[~2025-05-04 23:24 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-01 20:15 [RFC] MNT_LOCKED vs. finish_automount() Al Viro
2025-05-03 3:46 ` Eric W. Biederman
2025-05-04 23:24 ` Al Viro [this message]
2025-05-05 13:52 ` Eric W. Biederman
2025-05-05 19:02 ` Al Viro
2025-05-06 2:25 ` Al Viro
2025-05-05 13:54 ` Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250504232441.GC2023217@ZenIV \
--to=viro@zeniv.linux.org.uk \
--cc=brauner@kernel.org \
--cc=ebiederm@xmission.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.