From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
NeilBrown <neil@brown.name>
Subject: [RFC PATCH 21/25] shrinking rcu_read_lock() scope in d_alloc_parallel()
Date: Tue, 5 May 2026 06:54:08 +0100
Message-ID: <20260505055412.1261144-22-viro@zeniv.linux.org.uk>
In-Reply-To: <20260505055412.1261144-1-viro@zeniv.linux.org.uk>

The current use of rcu_read_lock() in d_alloc_parallel() is fairly
opaque - the single large scope serves two purposes.

We start with a lookup in the normal hash, and there the
rcu_read_lock() scope puts __d_lookup_rcu() and the subsequent
lockref_get_not_dead() into the same RCU read-side critical section.

If no match is found, we proceed to lock the hash chain of the
in-lookup hash and scan that for a match. If we find one, we want to
grab it and wait for the lookup in progress to finish. Since the bitlock
we use for these hash chains has to nest inside ->d_lock, we need to
unlock the chain first and use lockref_get_not_dead() on the match.
That has to be done without leaving the RCU read-side critical section,
and we use the same rcu_read_lock() scope to bridge over.

The thing is, after having grabbed the reference (and that is very
unlikely to fail) we proceed to grab ->d_lock - d_wait_lookup() and
__d_lookup_unhash()/__d_wake_in_lookup_waiters() are using that for
serialization. That makes lockref_get_not_dead() pointless - we avoid
grabbing ->d_lock for the refcount increment, only to grab it anyway
immediately after that. If we grab ->d_lock first and replace
lockref_get_not_dead() with a direct check of the sign, followed by an
increment if the count is non-negative, we can move rcu_read_unlock() to
immediately after grabbing ->d_lock. Moreover, we don't need a single
RCU read-side critical section stretching all the way back to before the
earlier __d_lookup_rcu() - we can just as well terminate the earlier one
ASAP and call rcu_read_lock() again only after having found a match (if
any) in the in-lookup hash chain.
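
As a rough userspace illustration of the lock-first variant (a sketch
only, not the kernel code): struct mock_dentry, mock_spin_lock(),
mock_spin_unlock() and mock_grab_locked() are hypothetical stand-ins,
with a C11 atomic_flag playing the role of ->d_lock and a plain long
playing the role of ->d_lockref.count. The assumption carried over from
the description above is that a negative count means the dentry is dead.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock + refcount pair of a dentry. */
struct mock_dentry {
	atomic_flag lock;	/* plays the role of ->d_lock */
	long count;		/* plays the role of ->d_lockref.count */
};

/* Minimal test-and-set spinlock; the kernel spinlock is far more elaborate. */
static void mock_spin_lock(struct mock_dentry *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;	/* spin until the previous holder clears the flag */
}

static void mock_spin_unlock(struct mock_dentry *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/*
 * Take the lock first, then look at the sign of the count.
 * Returns true with the lock still held and the count bumped;
 * returns false with the lock dropped and the count untouched.
 */
static bool mock_grab_locked(struct mock_dentry *d)
{
	mock_spin_lock(d);
	if (d->count < 0) {		/* dead: caller would retry */
		mock_spin_unlock(d);
		return false;
	}
	d->count++;			/* safe: the count is stable under the lock */
	return true;
}
```

On success the caller keeps the lock, which matches the patched code
going straight on to d_wait_lookup() with ->d_lock held.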

That makes the entire thing easier to follow and the purpose of those
rcu_read_lock() calls easier to describe - the first scope is for
__d_lookup_rcu() + lockref_get_not_dead(), the second one bridges over
from the bitlock scope to the ->d_lock scope on the match found in the
in-lookup hash.
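
The ordering of the second scope can be modelled in userspace along
these lines. This is only a toy trace, not kernel code: every mock_*()
helper below is hypothetical and merely records one letter per step, so
that the sequence "lock chain, enter RCU, unlock chain, take ->d_lock,
leave RCU" can be checked.

```c
#include <string.h>

/* Hypothetical trace buffer: each mock primitive appends one letter. */
static char trace[16];

static void note(char c)
{
	size_t n = strlen(trace);

	if (n + 1 < sizeof(trace)) {
		trace[n] = c;
		trace[n + 1] = '\0';
	}
}

static void mock_chain_lock(void)	{ note('B'); }	/* hlist_bl_lock() stand-in */
static void mock_chain_unlock(void)	{ note('b'); }	/* hlist_bl_unlock() stand-in */
static void mock_rcu_read_lock(void)	{ note('R'); }
static void mock_rcu_read_unlock(void)	{ note('r'); }
static void mock_d_lock(void)		{ note('D'); }	/* spin_lock(&dentry->d_lock) stand-in */

/* Replays the hand-over from the bitlock scope to the ->d_lock scope. */
static const char *mock_bridge(void)
{
	trace[0] = '\0';
	mock_chain_lock();	/* the scan of the in-lookup chain finds a match */
	mock_rcu_read_lock();	/* open the RCU scope before letting go of the chain */
	mock_chain_unlock();	/* the bitlock nests inside ->d_lock, so drop it first */
	mock_d_lock();		/* ->d_lock is taken inside the RCU scope */
	mock_rcu_read_unlock();	/* with ->d_lock held the RCU scope can end */
	return trace;
}
```

The point of the trace is that the RCU scope overlaps both the tail of
the bitlock scope and the head of the ->d_lock scope, and nothing else.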

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/dcache.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 2a342d618303..f3c8d46867a9 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2729,38 +2729,33 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
 	spin_unlock(&parent->d_lock);

 retry:
-	rcu_read_lock();
 	seq = smp_load_acquire(&parent->d_inode->i_dir_seq);
 	r_seq = read_seqbegin(&rename_lock);
+	rcu_read_lock();
 	dentry = __d_lookup_rcu(parent, name, &d_seq);
 	if (unlikely(dentry)) {
 		if (!lockref_get_not_dead(&dentry->d_lockref)) {
 			rcu_read_unlock();
 			goto retry;
 		}
+		rcu_read_unlock();
 		if (read_seqcount_retry(&dentry->d_seq, d_seq)) {
-			rcu_read_unlock();
 			dput(dentry);
 			goto retry;
 		}
-		rcu_read_unlock();
 		dput(new);
 		return dentry;
 	}
-	if (unlikely(read_seqretry(&rename_lock, r_seq))) {
-		rcu_read_unlock();
+	rcu_read_unlock();
+	if (unlikely(read_seqretry(&rename_lock, r_seq)))
 		goto retry;
-	}

-	if (unlikely(seq & 1)) {
-		rcu_read_unlock();
+	if (unlikely(seq & 1))
 		goto retry;
-	}

 	hlist_bl_lock(b);
 	if (unlikely(READ_ONCE(parent->d_inode->i_dir_seq) != seq)) {
 		hlist_bl_unlock(b);
-		rcu_read_unlock();
 		goto retry;
 	}
 	/*
@@ -2777,19 +2772,20 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
 			continue;
 		if (!d_same_name(dentry, parent, name))
 			continue;
+		rcu_read_lock();
 		hlist_bl_unlock(b);
+		spin_lock(&dentry->d_lock);
+		rcu_read_unlock();
 		/* now we can try to grab a reference */
-		if (!lockref_get_not_dead(&dentry->d_lockref)) {
-			rcu_read_unlock();
+		if (unlikely(dentry->d_lockref.count < 0)) {
+			spin_unlock(&dentry->d_lock);
 			goto retry;
 		}
-
-		rcu_read_unlock();
 		/*
 		 * somebody is likely to be still doing lookup for it;
-		 * wait for them to finish
+		 * pin it and wait for them to finish
 		 */
-		spin_lock(&dentry->d_lock);
+		dget_dlock(dentry);
 		d_wait_lookup(dentry);
 		/*
 		 * it's not in-lookup anymore; in principle we should repeat
@@ -2810,7 +2806,6 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
 		dput(new);
 		return dentry;
 	}
-	rcu_read_unlock();
 	hlist_bl_add_head(&new->d_in_lookup_hash, b);
 	hlist_bl_unlock(b);
 	return new;
--
2.47.3