From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
NeilBrown <neil@brown.name>
Subject: Re: [RFC PATCH 12/25] reducing rcu_read_lock() scopes in dput and friends, step 1
Date: Thu, 7 May 2026 08:35:10 +0100 [thread overview]
Message-ID: <20260507073510.GL3518998@ZenIV> (raw)
In-Reply-To: <20260505224256.GH3518998@ZenIV>
On Tue, May 05, 2026 at 11:42:56PM +0100, Al Viro wrote:
> > Honestly, if this is then going to make some still later patch
> > cleaner, I'd prefer 12/13 to be just combined into one "change rcu
> > locking semantics to be better".
> >
> > Because as-is, I think patch 12 on its own is just ugly.
> >
> > It looks like patch 14 then uses the thing to clean up dentry_kill() a
> > bit, so that combined patch wouldn't trigger my "this is just ugly"
> > case, adn would still have a real reason for it (outside of a
> > theoretical "shrink rcu region" without numbers to say why it's
> > needed).
>
> The main payoff is in #15, really. The entire 12--17 started as a single
> commit, which I fucked up on the first two attempts. Thus the carve-up
> into chunks too small for idiot me to fumble...
Does the following look sane to you? That's pretty much #12--#15 combined,
with a hopefully saner commit message.
commit f6c6d2194a4911619a4ab8011953fbf2ec202778
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Sat Apr 11 03:24:28 2026 -0400
simplify safety for lock_for_kill() slowpath
rcu_read_lock() scopes in the dentry eviction machinery are too wide
and badly structured; we end up with too many of those, quite
a few essentially identical. Worse, quite a few of the functions
involved are not neutral wrt that, making them harder to reason about.
rcu_read_lock() scope is not the only thing establishing an
RCU read-side critical area - spin_lock scope does the same and
they can be mixed - the sequence
rcu_read_lock()
...
spin_lock()
...
rcu_read_unlock()
...
rcu_read_lock()
...
spin_unlock()
...
rcu_read_unlock()
is an unbroken RCU read-side critical area.
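To make the mechanics explicit (an illustration, not part of the patch):
the same sequence, annotated with which mechanism keeps the read side
alive at each point. This relies upon spinlock-held regions counting as
RCU read-side critical sections - by way of disabled preemption on
!PREEMPT_RT, and of the implicit rcu_read_lock() done by the RT spinlock
implementation:
	rcu_read_lock()		// covered by: RCU
	...
	spin_lock()		// covered by: RCU + spinlock
	...
	rcu_read_unlock()	// covered by: spinlock
	...
	rcu_read_lock()		// covered by: spinlock + RCU
	...
	spin_unlock()		// covered by: RCU
	...
	rcu_read_unlock()	// critical area ends here
At no point between the first rcu_read_lock() and the last
rcu_read_unlock() does the task leave a read-side critical section.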
That observation allows us to simplify things. First of all,
lock_for_kill() relies upon being in an unbroken RCU read-side
critical area. It's always called with ->d_lock held, and normally
returns without having ever dropped that spinlock. We would not
need rcu_read_lock() at all, if not for the slow path - if trylock
of inode->i_lock fails, we need to drop and retake ->d_lock.
Having all calls of lock_for_kill() inside an rcu_read_lock() scope
takes care of that, but to show that lock_for_kill() slow path is safe,
we need to demonstrate such rcu_read_lock() scope for any call chain
leading to lock_for_kill(). Which is not fun, seeing that there are
10 such scopes, with 5 distinct beginnings between them.
Case 1: opens in dput(), proceeds through fast_dput() grabbing ->d_lock,
returning false into dput() and there a call of finish_dput() which calls
dentry_kill(), which calls lock_for_kill(); ends in dentry_kill(), either
right after lock_for_kill() success or right after dropping ->d_lock
on lock_for_kill() failure. ->d_lock is held continuously all the way
into lock_for_kill().
Case 2: opens in finish_dput(), where we proceed to the same call of
dentry_kill() as in case 1. ->d_lock is held since before the
beginning of the scope and all the way into lock_for_kill().
Case 3: opens in select_collect2(), proceeds through the return to
d_walk() and to shrink_dcache_tree() where we grab ->d_lock and
proceed to call shrink_kill(), which calls dentry_kill(), then as
in the previous scopes.
Case 4: opens in shrink_dentry_list(), followed by call of shrink_kill(),
then same as in case 3. ->d_lock is held since before the beginning
of the scope and all the way into lock_for_kill().
Case 5: opens in shrink_kill(), where it's immediately followed by
call of dentry_kill(), then same as in the previous scopes. ->d_lock
is held since before the beginning of the scope all the way into
lock_for_kill().
Note that in cases 2, 4 and 5 the slow path of lock_for_kill() is the
only part of the rcu_read_lock() scope that is not covered by spinlock
scopes. In case 1 we have the area in fast_dput() as well, and in
case 3 - the return path from select_collect2() and the chunk in
shrink_dcache_tree() up to grabbing ->d_lock.
Seeing that the reasons we need rcu_read_lock() in these additional
areas are completely unrelated to the lock_for_kill() slow path, things
get much more straightforward with
* an explicit rcu_read_lock() scope surrounding the area in the slow
path of lock_for_kill() where ->d_lock is not held;
* shrink_dentry_list() dropping rcu_read_lock() as soon as it has
grabbed ->d_lock;
* dput() dropping rcu_read_lock() just before calling finish_dput();
* the rcu_read_lock() calls in finish_dput(), d_make_discardable(),
shrink_kill() and shrink_dentry_list() removed, along with the
rcu_read_unlock() calls in dentry_kill().
RCU read-side critical areas are unchanged by that, the safety of the
lock_for_kill() slow path becomes trivial to verify, and a bunch of
rcu_read_lock() scopes are either gone or easier to describe.
Incidentally, all calls of fast_dput() are now immediately preceded by
rcu_read_lock() and followed by rcu_read_unlock(), which will allow
simplifying those in the next step...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
diff --git a/fs/dcache.c b/fs/dcache.c
index 0943a17eccbc..4a657bcd5b4f 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -763,6 +763,7 @@ static bool lock_for_kill(struct dentry *dentry)
if (!inode || likely(spin_trylock(&inode->i_lock)))
return true;
+ rcu_read_lock();
do {
spin_unlock(&dentry->d_lock);
spin_lock(&inode->i_lock);
@@ -772,6 +773,7 @@ static bool lock_for_kill(struct dentry *dentry)
spin_unlock(&inode->i_lock);
inode = dentry->d_inode;
} while (inode);
+ rcu_read_unlock();
if (likely(!dentry->d_lockref.count))
return true;
if (inode)
@@ -783,10 +785,8 @@ static struct dentry *dentry_kill(struct dentry *dentry)
{
if (unlikely(!lock_for_kill(dentry))) {
spin_unlock(&dentry->d_lock);
- rcu_read_unlock();
return NULL;
}
- rcu_read_unlock();
return __dentry_kill(dentry);
}
@@ -931,14 +931,12 @@ static inline bool fast_dput(struct dentry *dentry)
static void finish_dput(struct dentry *dentry)
__releases(dentry->d_lock)
- __releases(RCU)
{
while ((dentry = dentry_kill(dentry)) != NULL) {
if (retain_dentry(dentry, true)) {
spin_unlock(&dentry->d_lock);
return;
}
- rcu_read_lock();
}
}
@@ -978,6 +976,7 @@ void dput(struct dentry *dentry)
rcu_read_unlock();
return;
}
+ rcu_read_unlock();
finish_dput(dentry);
}
EXPORT_SYMBOL(dput);
@@ -988,7 +987,6 @@ void d_make_discardable(struct dentry *dentry)
WARN_ON(!(dentry->d_flags & DCACHE_PERSISTENT));
dentry->d_flags &= ~DCACHE_PERSISTENT;
dentry->d_lockref.count--;
- rcu_read_lock();
finish_dput(dentry);
}
EXPORT_SYMBOL(d_make_discardable);
@@ -1217,7 +1215,7 @@ EXPORT_SYMBOL(d_prune_aliases);
static inline void shrink_kill(struct dentry *victim)
{
while ((victim = dentry_kill(victim)) != NULL)
- rcu_read_lock();
+ ;
}
void shrink_dentry_list(struct list_head *list)
@@ -1233,7 +1231,6 @@ void shrink_dentry_list(struct list_head *list)
spin_unlock(&dentry->d_lock);
continue;
}
- rcu_read_lock();
shrink_kill(dentry);
}
}
@@ -1669,6 +1666,7 @@ static void shrink_dcache_tree(struct dentry *parent, bool for_umount)
struct dentry *v = data.victim;
spin_lock(&v->d_lock);
+ rcu_read_unlock();
if (v->d_lockref.count < 0 &&
!(v->d_flags & DCACHE_DENTRY_KILLED)) {
struct completion_list wait;
@@ -1676,7 +1674,6 @@ static void shrink_dcache_tree(struct dentry *parent, bool for_umount)
// it becomes invisible to d_walk().
d_add_waiter(v, &wait);
spin_unlock(&v->d_lock);
- rcu_read_unlock();
if (!list_empty(&data.dispose))
shrink_dentry_list(&data.dispose);
wait_for_completion(&wait.completion);