All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Song Liu <song@kernel.org>
Cc: Jan Kara <jack@suse.cz>,
	bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, kernel-team@meta.com,
	andrii@kernel.org, eddyz87@gmail.com, ast@kernel.org,
	daniel@iogearbox.net, martin.lau@linux.dev, brauner@kernel.org,
	kpsingh@kernel.org, mattbobrowski@google.com, amir73il@gmail.com,
	repnop@google.com, jlayton@kernel.org, josef@toxicpanda.com,
	mic@digikod.net, gnoack@google.com
Subject: Re: [PATCH bpf-next 3/4] bpf: Introduce path iterator
Date: Fri, 30 May 2025 00:10:18 +0100	[thread overview]
Message-ID: <20250529231018.GP2023217@ZenIV> (raw)
In-Reply-To: <CAPhsuW5oXZVEaMwNpSF74O7wZ_f2Qr_44pu9L4_=LBwdW5T9=w@mail.gmail.com>

On Thu, May 29, 2025 at 03:13:10PM -0700, Song Liu wrote:

> Is it an issue if we only hold a reference to a MNT_LOCKED mount for
> short period of time? "Short period" means it may get interrupted, page
> faults, or wait for an IO (read xattr), but it won't hold a reference to the
> mount and sleep indefinitely.

MNT_LOCKED mount itself is not a problem.  What shouldn't be done is
looking around in the mountpoint it covers.  It depends upon the things
you are going to do with that, but it's very easy to get an infoleak
that way.

> > OTOH, there's a good cause for moving some of the flags, MNT_LOCKED
> > included, out of ->mnt_flags and into a separate field in struct mount.
> > However, that would conflict with any code using that to deal with
> > your iterator safely.
> >
> > What's more, AFAICS in case of a stack of mounts each covering the root
> > of parent mount, you stop in each of those.  The trouble is, umount(2)
> > propagation logics assumes that intermediate mounts can be pulled out of
> > such stack without causing trouble.  For pathname resolution that is
> > true; it goes through the entire stack atomically wrt that stuff.
> > For your API that's not the case; somebody who has no idea about an
> > intermediate mount being there might get caught on it while it's getting
> > pulled from the stack.
> >
> > What exactly do you need around the mountpoint crossing?
> 
> I thought about skipping intermediate mounts (that are hidden by
> other mounts). AFAICT, not skipping them will not cause any issue.

It can.  Suppose e.g. that /mnt gets propagation from another namespace,
but not the other way round and you mount something on /mnt.

Later, in that another namespace, somebody mounts something on wherever
your /mnt gets propagation to.  A copy will be propagated _between_
your /mnt and whatever you've mounted on top of it; it will be entirely
invisible until you umount your /mnt.  At that point the propagated
copy will show up there, same as if it had appeared just after your
umount.  Prior to that it's entirely invisible.  If its original
counterpart in another namespace gets unmounted first, the copy will
be quietly pulled out.

Note that choose_mountpoint_rcu() callers (including choose_mountpoint())
will have mount_lock seqcount sampled before the traversal _and_ recheck
it after having reached the bottom of stack.  IOW, if you traverse ..
on the way to root, you won't get caught on the sucker being pulled out.

Your iterator, OTOH, would stop in that intermediate mount - and get
an unpleasant surprise when it comes back to do the next step (towards
/mnt on root filesystem, that is) and finds that path->mnt points
to something that is detached from everything - no way to get from
it any further.  That - despite the fact that location you've started
from is still mounted, still has the same pathname, etc. and nothing
had been disrupted for it.

And yes, landlock has a narrow race in the matching place.  Needs to
be fixed.  At least it does ignore those as far as any decisions are
concerned...

Note, BTW, that it might be better off by doing that similar to
d_path.c - without arseloads of dget_parent/dput et.al.; not sure
how feasible it is, but if everything in it can be done under
rcu_read_lock(), that's something to look into.

  reply	other threads:[~2025-05-29 23:10 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-28 22:26 [PATCH bpf-next 0/4] bpf path iterator Song Liu
2025-05-28 22:26 ` [PATCH bpf-next 1/4] namei: Introduce new helper function path_parent() Song Liu
2025-05-28 22:26 ` [PATCH bpf-next 2/4] landlock: Use path_parent() Song Liu
2025-05-31 13:51   ` Tingmao Wang
2025-06-02 13:36     ` Song Liu
2025-06-03  0:10       ` Song Liu
2025-06-03 12:47         ` Mickaël Salaün
2025-06-02 17:35     ` Mickaël Salaün
2025-06-02 22:56       ` Tingmao Wang
2025-05-28 22:26 ` [PATCH bpf-next 3/4] bpf: Introduce path iterator Song Liu
2025-05-28 22:37   ` Al Viro
2025-05-29 11:58     ` Jan Kara
2025-05-29 16:53       ` Song Liu
2025-05-29 16:57         ` Alexei Starovoitov
2025-05-29 17:05           ` Song Liu
2025-05-30 14:20             ` Mickaël Salaün
2025-06-02  9:41               ` Christian Brauner
2025-06-03  9:46               ` Jan Kara
2025-06-03 12:49                 ` Mickaël Salaün
2025-06-03 21:13                   ` Jan Kara
2025-05-29 17:38         ` Al Viro
2025-05-29 18:00           ` Song Liu
2025-05-29 18:35             ` Al Viro
2025-05-29 19:46               ` Song Liu
2025-05-29 20:15                 ` Al Viro
2025-05-29 21:07                   ` Song Liu
2025-05-29 21:45                     ` Al Viro
2025-05-29 22:13                       ` Song Liu
2025-05-29 23:10                         ` Al Viro [this message]
2025-05-30  0:42                           ` Song Liu
2025-05-30 12:20                             ` Mickaël Salaün
2025-05-30 18:43                               ` Al Viro
2025-05-31  8:39                                 ` Mickaël Salaün
2025-06-02  9:32                                 ` Christian Brauner
2025-05-30 18:55                               ` Song Liu
2025-05-31  8:40                                 ` Mickaël Salaün
2025-05-31 14:05                                 ` Tingmao Wang
2025-06-01 23:33                                   ` Song Liu
2025-06-04  0:58                             ` Tingmao Wang
2025-06-02  9:30                           ` Christian Brauner
2025-06-02  9:27             ` Christian Brauner
2025-06-02 13:27               ` Song Liu
2025-06-02 15:40                 ` Alexei Starovoitov
2025-06-02 21:39                   ` Song Liu
2025-05-28 22:26 ` [PATCH bpf-next 4/4] selftests/bpf: Add tests for bpf " Song Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250529231018.GP2023217@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=amir73il@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=gnoack@google.com \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@meta.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=mattbobrowski@google.com \
    --cc=mic@digikod.net \
    --cc=repnop@google.com \
    --cc=song@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.