linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v5] fs: add predicts based on nd->depth
@ 2025-11-19 14:29 Mateusz Guzik
  2025-11-25  9:04 ` Christian Brauner
  2025-12-12  1:22 ` Chris Mason
  0 siblings, 2 replies; 5+ messages in thread
From: Mateusz Guzik @ 2025-11-19 14:29 UTC (permalink / raw)
  To: brauner, viro; +Cc: jack, linux-kernel, linux-fsdevel, Mateusz Guzik

Stats from nd->depth usage during the venerable kernel build collected like so:
bpftrace -e 'kprobe:terminate_walk,kprobe:walk_component,kprobe:legitimize_links
{ @[probe] = lhist(((struct nameidata *)arg0)->depth, 0, 8, 1); }'

@[kprobe:legitimize_links]:
[0, 1)           6554906 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1, 2)              3534 |                                                    |

@[kprobe:terminate_walk]:
[0, 1)          12153664 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|

@[kprobe:walk_component]:
[0, 1)          53075749 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1, 2)            971421 |                                                    |
[2, 3)             84946 |                                                    |

Additionally a custom probe was added for depth within link_path_walk():
bpftrace -e 'kprobe:link_path_walk_probe { @[probe] = lhist(arg0, 0, 8, 1); }'
@[kprobe:link_path_walk_probe]:
[0, 1)           7528231 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1, 2)            407905 |@@                                                  |

Given these results:
1. terminate_walk() is called towards the end of the lookup and in this
   test it never had any links to clean up.
2. legitimize_links() is also called towards the end of lookup and most
   of the time there is 0 depth. Patch consumers to avoid calling into it
   in that case.
3. walk_component() is typically called with WALK_MORE and zero depth,
   checked in that order. Check depth first and predict it is 0.
4. link_path_walk() also does not deal with a symlink most of the time
   when !*name. Predict that depth is 0 there.
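
For illustration only, a userspace sketch of the reordering in point 3 (this
is not kernel code). likely()/unlikely() are spelled the way the kernel
spells them for GCC/Clang; struct nameidata_model, the two helper names and
the WALK_MORE value are assumptions made up for this example.

```c
#include <stdbool.h>

/* Branch hints built on the GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

#define WALK_MORE 2	/* illustrative value for this sketch */

/* Hypothetical stand-in for the one field of struct nameidata used here. */
struct nameidata_model {
	unsigned depth;
};

/* Old order: test the flag first, then the depth. */
static bool should_put_link_old(const struct nameidata_model *nd, int flags)
{
	return !(flags & WALK_MORE) && nd->depth;
}

/* New order: test the depth first and tell the compiler it is usually 0. */
static bool should_put_link_new(const struct nameidata_model *nd, int flags)
{
	return unlikely(nd->depth) && !(flags & WALK_MORE);
}
```

Both helpers compute the same truth value for every input; only the
evaluation order and the branch hint differ, which is why this part of the
patch is not expected to change behavior.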

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---

v5:
- tweak the commit message + add link_path_walk probe stats, no code
  changes

 fs/namei.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 2a112b2c0951..11295fcf877c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -785,7 +785,8 @@ static void leave_rcu(struct nameidata *nd)
 
 static void terminate_walk(struct nameidata *nd)
 {
-	drop_links(nd);
+	if (unlikely(nd->depth))
+		drop_links(nd);
 	if (!(nd->flags & LOOKUP_RCU)) {
 		int i;
 		path_put(&nd->path);
@@ -882,7 +883,7 @@ static bool try_to_unlazy(struct nameidata *nd)
 
 	BUG_ON(!(nd->flags & LOOKUP_RCU));
 
-	if (unlikely(!legitimize_links(nd)))
+	if (unlikely(nd->depth && !legitimize_links(nd)))
 		goto out1;
 	if (unlikely(!legitimize_path(nd, &nd->path, nd->seq)))
 		goto out;
@@ -917,7 +918,7 @@ static bool try_to_unlazy_next(struct nameidata *nd, struct dentry *dentry)
 	int res;
 	BUG_ON(!(nd->flags & LOOKUP_RCU));
 
-	if (unlikely(!legitimize_links(nd)))
+	if (unlikely(nd->depth && !legitimize_links(nd)))
 		goto out2;
 	res = __legitimize_mnt(nd->path.mnt, nd->m_seq);
 	if (unlikely(res)) {
@@ -2179,7 +2180,7 @@ static const char *walk_component(struct nameidata *nd, int flags)
 	 * parent relationships.
 	 */
 	if (unlikely(nd->last_type != LAST_NORM)) {
-		if (!(flags & WALK_MORE) && nd->depth)
+		if (unlikely(nd->depth) && !(flags & WALK_MORE))
 			put_link(nd);
 		return handle_dots(nd, nd->last_type);
 	}
@@ -2191,7 +2192,7 @@ static const char *walk_component(struct nameidata *nd, int flags)
 		if (IS_ERR(dentry))
 			return ERR_CAST(dentry);
 	}
-	if (!(flags & WALK_MORE) && nd->depth)
+	if (unlikely(nd->depth) && !(flags & WALK_MORE))
 		put_link(nd);
 	return step_into(nd, flags, dentry);
 }
@@ -2544,7 +2545,7 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 		if (unlikely(!*name)) {
 OK:
 			/* pathname or trailing symlink, done */
-			if (!depth) {
+			if (likely(!depth)) {
 				nd->dir_vfsuid = i_uid_into_vfsuid(idmap, nd->inode);
 				nd->dir_mode = nd->inode->i_mode;
 				nd->flags &= ~LOOKUP_PARENT;
-- 
2.48.1



* Re: [PATCH v5] fs: add predicts based on nd->depth
  2025-11-19 14:29 [PATCH v5] fs: add predicts based on nd->depth Mateusz Guzik
@ 2025-11-25  9:04 ` Christian Brauner
  2025-12-12  1:22 ` Chris Mason
  1 sibling, 0 replies; 5+ messages in thread
From: Christian Brauner @ 2025-11-25  9:04 UTC (permalink / raw)
  To: Mateusz Guzik; +Cc: Christian Brauner, jack, linux-kernel, linux-fsdevel, viro

On Wed, 19 Nov 2025 15:29:54 +0100, Mateusz Guzik wrote:
> Stats from nd->depth usage during the venerable kernel build collected like so:
> bpftrace -e 'kprobe:terminate_walk,kprobe:walk_component,kprobe:legitimize_links
> { @[probe] = lhist(((struct nameidata *)arg0)->depth, 0, 8, 1); }'
> 
> @[kprobe:legitimize_links]:
> [0, 1)           6554906 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [1, 2)              3534 |                                                    |
> 
> [...]

Applied to the vfs-6.19.misc branch of the vfs/vfs.git tree.
Patches in the vfs-6.19.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new reply to the original patch series so that we can drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-6.19.misc

[1/1] fs: add predicts based on nd->depth
      https://git.kernel.org/vfs/vfs/c/7c179096e77e


* Re: [PATCH v5] fs: add predicts based on nd->depth
  2025-11-19 14:29 [PATCH v5] fs: add predicts based on nd->depth Mateusz Guzik
  2025-11-25  9:04 ` Christian Brauner
@ 2025-12-12  1:22 ` Chris Mason
  2025-12-12 11:55   ` Mateusz Guzik
  1 sibling, 1 reply; 5+ messages in thread
From: Chris Mason @ 2025-12-12  1:22 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: Chris Mason, brauner, viro, jack, linux-kernel, linux-fsdevel

On Wed, 19 Nov 2025 15:29:54 +0100 Mateusz Guzik <mjguzik@gmail.com> wrote:

Hi everyone,

I ran some of the recent commits through the AI review prompts and it
flagged this one.  The comments look right to me:

[ ... ]

> diff --git a/fs/namei.c b/fs/namei.c
> index a5bffc2a29f6c..5fbb05fb96b75 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c

[ ... ]

> @@ -882,7 +883,7 @@ static bool try_to_unlazy(struct nameidata *nd)
>
>  	BUG_ON(!(nd->flags & LOOKUP_RCU));
>
> -	if (unlikely(!legitimize_links(nd)))
> +	if (unlikely(nd->depth && !legitimize_links(nd)))
>  		goto out1;

When nd->depth is 0, this short-circuits and never calls legitimize_links().
However, legitimize_links() contains a LOOKUP_CACHED check that was moved
there in commit b93a6e389bdc7 ("fix handling of nd->depth on LOOKUP_CACHED
failures in try_to_unlazy*"):

    static bool legitimize_links(struct nameidata *nd)
    {
        if (unlikely(nd->flags & LOOKUP_CACHED)) {
            drop_links(nd);
            nd->depth = 0;
            return false;
        }
        ...
    }

Before that commit, the LOOKUP_CACHED check was done directly in
try_to_unlazy() before calling legitimize_links(). It was moved into
legitimize_links() for consolidation, not because it's semantically
related to having links.

With LOOKUP_CACHED set (used by io_uring for non-blocking opens) and
nd->depth == 0, try_to_unlazy() now proceeds to legitimize_path() and may
succeed, when previously it would have failed. This can cause
lookup_fast()->try_to_unlazy() to succeed, leading the caller to call
lookup_slow() which takes inode_lock_shared() and blocks - violating
LOOKUP_CACHED semantics.

Should the LOOKUP_CACHED check be moved back to try_to_unlazy() before
the depth check, like:

    if (unlikely(nd->flags & LOOKUP_CACHED))
        goto out1;
    if (unlikely(nd->depth && !legitimize_links(nd)))
        goto out1;
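
The regression described above can be reproduced in a small userspace model.
This is a sketch under stated assumptions, not the kernel code: struct
nd_model, the *_model and unlazy_* names and the LOOKUP_CACHED bit value are
invented for illustration, and the link revalidation and legitimize_path()
steps are assumed to succeed.

```c
#include <stdbool.h>

#define LOOKUP_CACHED 0x1	/* illustrative bit, not the kernel's value */

/* Hypothetical stand-in for the two nameidata fields that matter here. */
struct nd_model {
	unsigned flags;
	unsigned depth;
};

/* Mirrors the prologue quoted above; the rest is assumed to succeed. */
static bool legitimize_links_model(struct nd_model *nd)
{
	if (nd->flags & LOOKUP_CACHED) {
		nd->depth = 0;	/* stands in for drop_links() + depth reset */
		return false;
	}
	return true;
}

/* Before the patch: legitimize_links() is always called. */
static bool unlazy_before(struct nd_model *nd)
{
	if (!legitimize_links_model(nd))
		return false;
	return true;	/* legitimize_path() etc. assumed to succeed */
}

/* After the patch: depth == 0 skips the call and the flag check in it. */
static bool unlazy_after(struct nd_model *nd)
{
	if (nd->depth && !legitimize_links_model(nd))
		return false;
	return true;
}
```

With flags = LOOKUP_CACHED and depth = 0, unlazy_before() returns false while
unlazy_after() returns true, which is the behavior change flagged here.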

>  	if (unlikely(!legitimize_path(nd, &nd->path, nd->seq)))
>  		goto out;

[ ... ]

> @@ -917,7 +918,7 @@ static bool try_to_unlazy_next(struct nameidata *nd, struct dentry *dentry)
>  	int res;
>  	BUG_ON(!(nd->flags & LOOKUP_RCU));
>
> -	if (unlikely(!legitimize_links(nd)))
> +	if (unlikely(nd->depth && !legitimize_links(nd)))
>  		goto out2;

The same issue applies here in try_to_unlazy_next().

[ ... ]


-chris


* Re: [PATCH v5] fs: add predicts based on nd->depth
  2025-12-12  1:22 ` Chris Mason
@ 2025-12-12 11:55   ` Mateusz Guzik
  2025-12-12 20:57     ` Chris Mason
  0 siblings, 1 reply; 5+ messages in thread
From: Mateusz Guzik @ 2025-12-12 11:55 UTC (permalink / raw)
  To: Chris Mason; +Cc: brauner, viro, jack, linux-kernel, linux-fsdevel

On Fri, Dec 12, 2025 at 2:22 AM Chris Mason <clm@meta.com> wrote:
>
> On Wed, 19 Nov 2025 15:29:54 +0100 Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> Hi everyone,
>
> I ran some of the recent commits through the AI review prompts and it
> flagged this one.  The comments look right to me:
>
> [ ... ]
>
> > diff --git a/fs/namei.c b/fs/namei.c
> > index a5bffc2a29f6c..5fbb05fb96b75 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
>
> [ ... ]
>
> > @@ -882,7 +883,7 @@ static bool try_to_unlazy(struct nameidata *nd)
> >
> >       BUG_ON(!(nd->flags & LOOKUP_RCU));
> >
> > -     if (unlikely(!legitimize_links(nd)))
> > +     if (unlikely(nd->depth && !legitimize_links(nd)))
> >               goto out1;
>
> When nd->depth is 0, this short-circuits and never calls legitimize_links().
> However, legitimize_links() contains a LOOKUP_CACHED check that was moved
> there in commit b93a6e389bdc7 ("fix handling of nd->depth on LOOKUP_CACHED
> failures in try_to_unlazy*"):
>
>     static bool legitimize_links(struct nameidata *nd)
>     {
>         if (unlikely(nd->flags & LOOKUP_CACHED)) {
>             drop_links(nd);
>             nd->depth = 0;
>             return false;
>         }
>         ...
>     }
>
> Before that commit, the LOOKUP_CACHED check was done directly in
> try_to_unlazy() before calling legitimize_links(). It was moved into
> legitimize_links() for consolidation, not because it's semantically
> related to having links.
>
> With LOOKUP_CACHED set (used by io_uring for non-blocking opens) and
> nd->depth == 0, try_to_unlazy() now proceeds to legitimize_path() and may
> succeed, when previously it would have failed. This can cause
> lookup_fast()->try_to_unlazy() to succeed, leading the caller to call
> lookup_slow() which takes inode_lock_shared() and blocks - violating
> LOOKUP_CACHED semantics.
>
> Should the LOOKUP_CACHED check be moved back to try_to_unlazy() before
> the depth check, like:
>
>     if (unlikely(nd->flags & LOOKUP_CACHED))
>         goto out1;
>     if (unlikely(nd->depth && !legitimize_links(nd)))
>         goto out1;
>

Thanks for the report. This is indeed a bug on my end. In my defense,
the current behavior is... interesting -- why would the routine
fail when it had nothing to do?

The commit hash you referenced does not exist in master, I found this
instead: eacd9aa8cedeb412842c7b339adbaa0477fdd5ad

That said, the proposed patch does not do the trick as it fails to
clean up links if nd->depth && nd->flags & LOOKUP_CACHED. The check
however can be planted *after* if (unlikely(nd->depth &&
!legitimize_links(nd)))

This would clean up the bug but retain the weird (for me anyway)
state. Perhaps this is good enough as a fixup for the release and some
clean up is -next material
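
To make the difference between the two orderings concrete, here is a
userspace sketch (not the kernel code and not a tested fix). struct nd_model,
the function names and the LOOKUP_CACHED bit value are assumptions for
illustration; the model only tracks whether nd->depth gets reset on failure.

```c
#include <stdbool.h>

#define LOOKUP_CACHED 0x1	/* illustrative bit, not the kernel's value */

/* Hypothetical stand-in for the two nameidata fields that matter here. */
struct nd_model {
	unsigned flags;
	unsigned depth;
};

/* On LOOKUP_CACHED this drops the links (modelled as depth = 0) and fails. */
static bool legitimize_links_model(struct nd_model *nd)
{
	if (nd->flags & LOOKUP_CACHED) {
		nd->depth = 0;
		return false;
	}
	return true;
}

/* Ordering from the report: flag check first. Links are left in place. */
static bool unlazy_check_first(struct nd_model *nd)
{
	if (nd->flags & LOOKUP_CACHED)
		return false;	/* nd->depth is untouched on this path */
	if (nd->depth && !legitimize_links_model(nd))
		return false;
	return true;
}

/* Ordering suggested in this reply: flag check after the depth test. */
static bool unlazy_check_after(struct nd_model *nd)
{
	if (nd->depth && !legitimize_links_model(nd))
		return false;	/* links already dropped by the callee */
	if (nd->flags & LOOKUP_CACHED)
		return false;	/* only reachable with depth == 0 or no flag */
	return true;
}
```

In this model both variants fail for LOOKUP_CACHED with depth == 0, but only
unlazy_check_after() leaves depth at 0 when LOOKUP_CACHED is set and links
are present.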


* Re: [PATCH v5] fs: add predicts based on nd->depth
  2025-12-12 11:55   ` Mateusz Guzik
@ 2025-12-12 20:57     ` Chris Mason
  0 siblings, 0 replies; 5+ messages in thread
From: Chris Mason @ 2025-12-12 20:57 UTC (permalink / raw)
  To: Mateusz Guzik; +Cc: brauner, viro, jack, linux-kernel, linux-fsdevel

On 12/12/25 6:55 AM, Mateusz Guzik wrote:
> On Fri, Dec 12, 2025 at 2:22 AM Chris Mason <clm@meta.com> wrote:
>>
>> On Wed, 19 Nov 2025 15:29:54 +0100 Mateusz Guzik <mjguzik@gmail.com> wrote:
>>
>> Hi everyone,
>>
>> I ran some of the recent commits through the AI review prompts and it
>> flagged this one.  The comments look right to me:
>>
>> [ ... ]
>>
>>> diff --git a/fs/namei.c b/fs/namei.c
>>> index a5bffc2a29f6c..5fbb05fb96b75 100644
>>> --- a/fs/namei.c
>>> +++ b/fs/namei.c
>>
>> [ ... ]
>>
>>> @@ -882,7 +883,7 @@ static bool try_to_unlazy(struct nameidata *nd)
>>>
>>>       BUG_ON(!(nd->flags & LOOKUP_RCU));
>>>
>>> -     if (unlikely(!legitimize_links(nd)))
>>> +     if (unlikely(nd->depth && !legitimize_links(nd)))
>>>               goto out1;
>>
>> When nd->depth is 0, this short-circuits and never calls legitimize_links().
>> However, legitimize_links() contains a LOOKUP_CACHED check that was moved
>> there in commit b93a6e389bdc7 ("fix handling of nd->depth on LOOKUP_CACHED
>> failures in try_to_unlazy*"):
>>
>>     static bool legitimize_links(struct nameidata *nd)
>>     {
>>         if (unlikely(nd->flags & LOOKUP_CACHED)) {
>>             drop_links(nd);
>>             nd->depth = 0;
>>             return false;
>>         }
>>         ...
>>     }
>>
>> Before that commit, the LOOKUP_CACHED check was done directly in
>> try_to_unlazy() before calling legitimize_links(). It was moved into
>> legitimize_links() for consolidation, not because it's semantically
>> related to having links.
>>
>> With LOOKUP_CACHED set (used by io_uring for non-blocking opens) and
>> nd->depth == 0, try_to_unlazy() now proceeds to legitimize_path() and may
>> succeed, when previously it would have failed. This can cause
>> lookup_fast()->try_to_unlazy() to succeed, leading the caller to call
>> lookup_slow() which takes inode_lock_shared() and blocks - violating
>> LOOKUP_CACHED semantics.
>>
>> Should the LOOKUP_CACHED check be moved back to try_to_unlazy() before
>> the depth check, like:
>>
>>     if (unlikely(nd->flags & LOOKUP_CACHED))
>>         goto out1;
>>     if (unlikely(nd->depth && !legitimize_links(nd)))
>>         goto out1;
>>
> 
> Thanks for the report. This is indeed a bug on my end. In my defense,
> the current behavior is... interesting -- why would the routine
> fail when it had nothing to do?

These corners are the best use case for the AI reviews. I never would
have found this bug without a test case and a bisection.

> 
> The commit hash you referenced does not exist in master, I found this
> instead: eacd9aa8cedeb412842c7b339adbaa0477fdd5ad

Sorry about that, the tree I was working on had a bunch of backports to
the fb kernels in different branches.

> 
> That said, the proposed patch does not do the trick as it fails to
> clean up links if nd->depth && nd->flags & LOOKUP_CACHED. The check
> however can be planted *after* if (unlikely(nd->depth &&
> !legitimize_links(nd)))

Oh yeah, I missed that part.  It looks like that's exactly what Al's
eacd9aa8c was fixing.

> 
> This would clean up the bug but retain the weird (for me anyway)
> state. Perhaps this is good enough as a fixup for the release and some
> clean up is -next material

Thanks,
Chris
