* [PATCH v4] fs: add predicts based on nd->depth
@ 2025-11-10 16:59 Mateusz Guzik
2025-11-11 3:32 ` Al Viro
0 siblings, 1 reply; 3+ messages in thread
From: Mateusz Guzik @ 2025-11-10 16:59 UTC (permalink / raw)
To: viro; +Cc: brauner, jack, linux-kernel, linux-fsdevel, Mateusz Guzik
Stats from nd->depth usage during the venerable kernel build collected like so:
bpftrace -e 'kprobe:terminate_walk,kprobe:walk_component,kprobe:legitimize_links
{ @[probe] = lhist(((struct nameidata *)arg0)->depth, 0, 8, 1); }'
@[kprobe:legitimize_links]:
[0, 1) 6554906 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1, 2) 3534 | |
@[kprobe:terminate_walk]:
[0, 1) 12153664 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
@[kprobe:walk_component]:
[0, 1) 53075749 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1, 2) 971421 | |
[2, 3) 84946 | |
Given these results:
1. terminate_walk() is called towards the end of the lookup. I failed
run into a case where it has any depth to clean up. For now predict
it does not.
2. legitimize_links() is also called towards the end of lookup and most
of the time there s 0 depth. Patch consumers to avoid calling into it
in that case.
3. walk_component() is typically called with WALK_MORE and zero depth,
checked in that order. Check depth first and predict it is 0.
4. link_path_walk() predicts not dealing with a symlink, but the other
part of symlink handling fails to make the same predict. Add it.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
v4:
- fix backwards predict in link_path_walk
v3:
- more predicts
This obsoletes the previous patch which only took care of
legitimize_links().
fs/namei.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 2a112b2c0951..11295fcf877c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -785,7 +785,8 @@ static void leave_rcu(struct nameidata *nd)
static void terminate_walk(struct nameidata *nd)
{
- drop_links(nd);
+ if (unlikely(nd->depth))
+ drop_links(nd);
if (!(nd->flags & LOOKUP_RCU)) {
int i;
path_put(&nd->path);
@@ -882,7 +883,7 @@ static bool try_to_unlazy(struct nameidata *nd)
BUG_ON(!(nd->flags & LOOKUP_RCU));
- if (unlikely(!legitimize_links(nd)))
+ if (unlikely(nd->depth && !legitimize_links(nd)))
goto out1;
if (unlikely(!legitimize_path(nd, &nd->path, nd->seq)))
goto out;
@@ -917,7 +918,7 @@ static bool try_to_unlazy_next(struct nameidata *nd, struct dentry *dentry)
int res;
BUG_ON(!(nd->flags & LOOKUP_RCU));
- if (unlikely(!legitimize_links(nd)))
+ if (unlikely(nd->depth && !legitimize_links(nd)))
goto out2;
res = __legitimize_mnt(nd->path.mnt, nd->m_seq);
if (unlikely(res)) {
@@ -2179,7 +2180,7 @@ static const char *walk_component(struct nameidata *nd, int flags)
* parent relationships.
*/
if (unlikely(nd->last_type != LAST_NORM)) {
- if (!(flags & WALK_MORE) && nd->depth)
+ if (unlikely(nd->depth) && !(flags & WALK_MORE))
put_link(nd);
return handle_dots(nd, nd->last_type);
}
@@ -2191,7 +2192,7 @@ static const char *walk_component(struct nameidata *nd, int flags)
if (IS_ERR(dentry))
return ERR_CAST(dentry);
}
- if (!(flags & WALK_MORE) && nd->depth)
+ if (unlikely(nd->depth) && !(flags & WALK_MORE))
put_link(nd);
return step_into(nd, flags, dentry);
}
@@ -2544,7 +2545,7 @@ static int link_path_walk(const char *name, struct nameidata *nd)
if (unlikely(!*name)) {
OK:
/* pathname or trailing symlink, done */
- if (!depth) {
+ if (likely(!depth)) {
nd->dir_vfsuid = i_uid_into_vfsuid(idmap, nd->inode);
nd->dir_mode = nd->inode->i_mode;
nd->flags &= ~LOOKUP_PARENT;
--
2.48.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v4] fs: add predicts based on nd->depth
2025-11-10 16:59 [PATCH v4] fs: add predicts based on nd->depth Mateusz Guzik
@ 2025-11-11 3:32 ` Al Viro
2025-11-11 18:00 ` Mateusz Guzik
0 siblings, 1 reply; 3+ messages in thread
From: Al Viro @ 2025-11-11 3:32 UTC (permalink / raw)
To: Mateusz Guzik; +Cc: brauner, jack, linux-kernel, linux-fsdevel
On Mon, Nov 10, 2025 at 05:59:01PM +0100, Mateusz Guzik wrote:
> Given these results:
> 1. terminate_walk() is called towards the end of the lookup. I failed
> run into a case where it has any depth to clean up. For now predict
> it does not.
Easy - just have an error while resolving a nested symlink in the middle
of pathname. On success it will have zero nd->depth, of course.
> 2. legitimize_links() is also called towards the end of lookup and most
> of the time there s 0 depth. Patch consumers to avoid calling into it
> in that case.
On transition from lazy to non-lazy mode on cache miss, ->d_revalidate()
saying "dunno, try in non-lazy mode", etc.
That can happen inside a nested symlink as well as on the top level, but
the latter is more common on most of the loads.
> 3. walk_component() is typically called with WALK_MORE and zero depth,
> checked in that order. Check depth first and predict it is 0.
Does it give a measurable effect?
> 4. link_path_walk() predicts not dealing with a symlink, but the other
> part of symlink handling fails to make the same predict. Add it.
Unconvincing, that - one is "we have a component; what are the odds of that
component being a symlink?", another - "we have reached the end of pathname
or the end of nested symlink; what are the odds of the latter being the case?"
I can believe that answers to both questions are fairly low, but they are
not the same. I'd expect the latter to be considerably higher than the
former.
> - if (unlikely(!legitimize_links(nd)))
> + if (unlikely(nd->depth && !legitimize_links(nd)))
I suspect that
if (unlikely(nd->depth) && !legitimize_links(nd))
might be better...
> - if (unlikely(!legitimize_links(nd)))
> + if (unlikely(nd->depth && !legitimize_links(nd)))
> goto out2;
Ditto.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v4] fs: add predicts based on nd->depth
2025-11-11 3:32 ` Al Viro
@ 2025-11-11 18:00 ` Mateusz Guzik
0 siblings, 0 replies; 3+ messages in thread
From: Mateusz Guzik @ 2025-11-11 18:00 UTC (permalink / raw)
To: Al Viro; +Cc: brauner, jack, linux-kernel, linux-fsdevel
On Tue, Nov 11, 2025 at 4:32 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Mon, Nov 10, 2025 at 05:59:01PM +0100, Mateusz Guzik wrote:
>
> > Given these results:
> > 1. terminate_walk() is called towards the end of the lookup. I failed
> > run into a case where it has any depth to clean up. For now predict
> > it does not.
>
> Easy - just have an error while resolving a nested symlink in the middle
> of pathname. On success it will have zero nd->depth, of course.
>
Ok, I'll update the commit message.
> > 2. legitimize_links() is also called towards the end of lookup and most
> > of the time there s 0 depth. Patch consumers to avoid calling into it
> > in that case.
>
> On transition from lazy to non-lazy mode on cache miss, ->d_revalidate()
> saying "dunno, try in non-lazy mode", etc.
>
> That can happen inside a nested symlink as well as on the top level, but
> the latter is more common on most of the loads.
>
Given the rest of the e-mail I take it you are clarifying when depth >
0 in this case, as opposed to contesting the predict.
> > 3. walk_component() is typically called with WALK_MORE and zero depth,
> > checked in that order. Check depth first and predict it is 0.
>
> Does it give a measurable effect?
I did not benchmark this patch at all, merely checked for predicts
going the right way with bpftrace. What do I intend to benchmark is
the following: cleanup and inlining of func calls to walk_component()
and step_into(), which happen every time.
The routine got ->depth predicts in this patch because I was already
sprinkling it.
The current asm of walk_component() is penalized by lookup_slow(!)
being inlined and convincing the compiler to spill extra registers.
step_into() also looks like it can be shortened for the common case.
When inlined the compiler should be able to elide branching on WALK_MORE anyway.
You can find profiling info from a kernel running my patches (modulo
this one) here:
https://lore.kernel.org/linux-fsdevel/20251111-brillant-umgegangen-e7c891513bce@brauner/T/#mee42496c2c6495a54c6b0b4da5c4121540ad92d9
walk_component() is hanging out there at over 2.7% and step_into() is
4.8%. I would suspect there is at least 1% total hiding here for
trivial fixups + inlining.
That said, I can drop this bit no problem and add it later to the
patchset which takes care of both walk_component() and step_into()
(which will come with benchmarks).
>
> > 4. link_path_walk() predicts not dealing with a symlink, but the other
> > part of symlink handling fails to make the same predict. Add it.
>
> Unconvincing, that - one is "we have a component; what are the odds of that
> component being a symlink?", another - "we have reached the end of pathname
> or the end of nested symlink; what are the odds of the latter being the case?"
>
Fair, so I added the following crapper:
diff --git a/fs/dcache.c b/fs/dcache.c
index de3e4e9777ea..2bac37c09bf6 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3268,3 +3268,7 @@ void __init vfs_caches_init(void)
bdev_cache_init();
chrdev_init();
}
+
+
+void link_path_walk_probe(int);
+void link_path_walk_probe(int) { }
diff --git a/fs/namei.c b/fs/namei.c
index caeed986108d..00125c578af4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2470,6 +2470,8 @@ static inline const char *hash_name(struct
nameidata *nd, const char *name, unsi
#define LAST_WORD_IS_DOTDOT 0x2e2e
#endif
+void link_path_walk_probe(int);
+
/*
* Name resolution.
* This is the basic name resolution function, turning a pathname into
@@ -2544,6 +2546,7 @@ static int link_path_walk(const char *name,
struct nameidata *nd)
} while (unlikely(*name == '/'));
if (unlikely(!*name)) {
OK:
+ link_path_walk_probe(depth);
/* pathname or trailing symlink, done */
if (likely(!depth)) {
nd->dir_vfsuid =
i_uid_into_vfsuid(idmap, nd->inode);
then probing over kernel build:
bpftrace -e 'kprobe:link_path_walk_probe { @[probe] = lhist(arg0, 0, 8, 1); }'
@[kprobe:link_path_walk_probe]:
[0, 1) 7528231 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1, 2) 407905 |@@ |
I think that's nicely skewed.
> I can believe that answers to both questions are fairly low, but they are
> not the same. I'd expect the latter to be considerably higher than the
> former.
>
> > - if (unlikely(!legitimize_links(nd)))
> > + if (unlikely(nd->depth && !legitimize_links(nd)))
>
> I suspect that
> if (unlikely(nd->depth) && !legitimize_links(nd))
> might be better...
>
Well it says it is unlikely there is depth, but when there is and
legitimize_links is called, it is unlikely it fails. I think this
matches the current intent, except avodiing the func call most of the
time.
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-11-11 18:00 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-10 16:59 [PATCH v4] fs: add predicts based on nd->depth Mateusz Guzik
2025-11-11 3:32 ` Al Viro
2025-11-11 18:00 ` Mateusz Guzik
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).