linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Mickaël Salaün" <mic@digikod.net>
To: Song Liu <songliubraving@meta.com>
Cc: NeilBrown <neil@brown.name>, "Tingmao Wang" <m@maowtm.org>,
	"Song Liu" <song@kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-security-module@vger.kernel.org"
	<linux-security-module@vger.kernel.org>,
	"brauner@kernel.org" <brauner@kernel.org>,
	"Kernel Team" <kernel-team@meta.com>,
	"andrii@kernel.org" <andrii@kernel.org>,
	"eddyz87@gmail.com" <eddyz87@gmail.com>,
	"ast@kernel.org" <ast@kernel.org>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	"martin.lau@linux.dev" <martin.lau@linux.dev>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
	"jack@suse.cz" <jack@suse.cz>,
	"kpsingh@kernel.org" <kpsingh@kernel.org>,
	"mattbobrowski@google.com" <mattbobrowski@google.com>,
	"Günther Noack" <gnoack@google.com>
Subject: Re: [PATCH v5 bpf-next 0/5] bpf path iterator
Date: Thu, 26 Jun 2025 11:43:49 +0200	[thread overview]
Message-ID: <20250626.eifo4iChohWo@digikod.net> (raw)
In-Reply-To: <9BD19ABC-08B8-4976-912D-DFCC06C29CAA@meta.com>

On Thu, Jun 26, 2025 at 05:52:50AM +0000, Song Liu wrote:
> 
> 
> > On Jun 25, 2025, at 6:05 PM, NeilBrown <neil@brown.name> wrote:
> 
> [...]
> 
> >> 
> >> I can't speak for Mickaël, but a callback-based interface is less flexible
> >> (and _maybe_ less performant?).  Also, probably we will want to fallback
> >> to a reference-taking walk if the walk fails (rather than, say, retry
> >> infinitely), and this should probably use Song's proposed iterator.  I'm
> >> not sure if Song would be keen to rewrite this iterator patch series in
> >> callback style (to be clear, it doesn't necessarily seem like a good idea
> >> to me, and I'm not asking him to), which means that we will end up with
> >> the reference walk API being a "call this function repeatedly", and the
> >> rcu walk API taking a callback.  I think it is still workable (after all,
> >> if Landlock wants to reuse the code in the callback it can just call the
> >> callback function itself when doing the reference walk), but it seems a
> >> bit "ugly" to me.
> > 
> > call-back can have a performance impact (less opportunity for compiler
> > optimisation and CPU speculation), though less than taking spinlock and
> > references.  However Al and Christian have drawn a hard line against
> > making seq numbers visible outside VFS code so I think it is the
> > approach most likely to be accepted.
> > 
> > Certainly vfs_walk_ancestors() would fallback to ref-walk if rcu-walk
> > resulted in -ECHILD - just like all other path walking code in namei.c.
> > This would be largely transparent to the caller - the caller would only
> > see that the callback received a NULL path indicating a restart.  It
> > wouldn't need to know why.

Given the constraints this looks good to me.  Here is an updated API
with two extra consts, an updated walk_cb() signature, and a new
"flags" and without @root:

int vfs_walk_ancestors(struct path *path,
                       bool (*walk_cb)(const struct path *ancestor, void *data),
                       void *data, int flags)

The walk continue while walk_cb() returns true.  walk_cb() can then
check if @ancestor is equal to a @root, or other properties.  The
walk_cb() return value (if not bool) should not be returned by
vfs_walk_ancestors() because a walk stop doesn't mean an error.

@path would be updated with latest ancestor path (e.g. @root).
@flags could contain LOOKUP_RCU or not, which enables us to have
walk_cb() not-RCU compatible.

When passing LOOKUP_RCU, if the first call to vfs_walk_ancestors()
failed with -ECHILD, the caller can restart the walk by calling
vfs_walk_ancestors() again but without LOOKUP_RCU.

> 
> I guess I misunderstood the proposal of vfs_walk_ancestors() 
> initially, so some clarification:
> 
> I think vfs_walk_ancestors() is good for the rcu-walk, and some 
> rcu-then-ref-walk. However, I don’t think it fits all use cases. 
> A reliable step-by-step ref-walk, like this set, works well with 
> BPF, and we want to keep it. 

The above updated API should work for both use cases: if the caller wants
to walk only one level, walk_cb() can just always return false (and
potentially save that it was called) and then stop the walk the first
time it is called.  This makes it possible to write an eBPF helper with
the same API as path_walk_parent(), while making the kernel API more
flexible.

> 
> Can we ship this set as-is (or after fixing the comment reported
> by kernel test robot)? I really don’t think we need figure out 
> all details about the rcu-walk here. 
> 
> Once this is landed, we can try implementing the rcu-walk
> (vfs_walk_ancestors or some variation). If no one volunteers, I
> can give it a try. 

My understanding is that Christian only wants one helper (that should
handle both use cases).  I think this updated API should be good enough
for everyone.  Most of your code should stay the same.  What do you
think?

> 
> Thanks,
> Song
> 

  reply	other threads:[~2025-06-26  9:50 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-17  6:11 [PATCH v5 bpf-next 0/5] bpf path iterator Song Liu
2025-06-17  6:11 ` [PATCH v5 bpf-next 1/5] namei: Introduce new helper function path_walk_parent() Song Liu
2025-06-18  1:02   ` kernel test robot
2025-06-24 12:18   ` Jan Kara
2025-06-24 17:37     ` Song Liu
2025-06-25 10:30       ` Jan Kara
2025-07-04 17:40   ` Yonghong Song
2025-07-06 23:54     ` Song Liu
2025-07-07 17:53       ` Yonghong Song
2025-06-17  6:11 ` [PATCH v5 bpf-next 2/5] landlock: Use path_walk_parent() Song Liu
2025-07-03 18:29   ` Mickaël Salaün
2025-07-03 22:27     ` Song Liu
2025-07-04  9:00       ` Mickaël Salaün
2025-07-06 22:29         ` Song Liu
2025-07-07 10:28         ` Christian Brauner
2025-06-17  6:11 ` [PATCH v5 bpf-next 3/5] bpf: Introduce path iterator Song Liu
2025-06-17  6:11 ` [PATCH v5 bpf-next 4/5] selftests/bpf: Add tests for bpf " Song Liu
2025-06-17  6:11 ` [PATCH v5 bpf-next 5/5] selftests/bpf: Path walk test Song Liu
2025-06-20 21:59 ` [PATCH v5 bpf-next 0/5] bpf path iterator Song Liu
2025-06-24 18:45   ` Mickaël Salaün
2025-06-24 21:38     ` NeilBrown
2025-06-25 13:14       ` Mickaël Salaün
2025-06-25 23:04         ` NeilBrown
2025-06-25 23:17           ` Song Liu
2025-06-26  0:07           ` Tingmao Wang
2025-06-26  1:05             ` NeilBrown
2025-06-26  5:52               ` Song Liu
2025-06-26  9:43                 ` Mickaël Salaün [this message]
2025-06-26 14:49                   ` Song Liu
2025-06-26 10:22                 ` NeilBrown
2025-06-26 14:28                   ` Song Liu
2025-06-26 22:51                     ` NeilBrown
2025-06-27  0:21                       ` Song Liu
2025-07-07 10:46                       ` Christian Brauner
2025-07-07 11:17                         ` Christian Brauner
2025-07-07 18:50                           ` Song Liu
2025-07-09 16:06                             ` Mickaël Salaün
2025-07-09 17:31                               ` Song Liu
2025-07-09 22:24                                 ` NeilBrown
2025-07-09 22:50                                   ` Song Liu
2025-07-10  0:58                                     ` NeilBrown
2025-07-10  6:28                                       ` Song Liu
2025-07-14 21:09                                         ` Song Liu
2025-07-24 17:35                                           ` Mickaël Salaün
2025-07-26  9:52                                             ` Song Liu
2025-07-09 22:14                             ` NeilBrown
2025-07-09 22:41                               ` Song Liu
2025-07-10  0:58                                 ` NeilBrown
2025-07-07 10:43               ` Christian Brauner
2025-07-03  5:04     ` Song Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250626.eifo4iChohWo@digikod.net \
    --to=mic@digikod.net \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=gnoack@google.com \
    --cc=jack@suse.cz \
    --cc=kernel-team@meta.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=m@maowtm.org \
    --cc=martin.lau@linux.dev \
    --cc=mattbobrowski@google.com \
    --cc=neil@brown.name \
    --cc=song@kernel.org \
    --cc=songliubraving@meta.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).