Linux filesystem development
From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	NeilBrown <neil@brown.name>
Subject: Re: [RFC PATCH 12/25] reducing rcu_read_lock() scopes in dput and friends, step 1
Date: Thu, 7 May 2026 08:35:10 +0100	[thread overview]
Message-ID: <20260507073510.GL3518998@ZenIV> (raw)
In-Reply-To: <20260505224256.GH3518998@ZenIV>

On Tue, May 05, 2026 at 11:42:56PM +0100, Al Viro wrote:
> > Honestly, if this is then going to make some still later patch
> > cleaner, I'd prefer 12/13 to be just combined into one "change rcu
> > locking semantics to be better".
> > 
> > Because as-is, I think patch 12 on its own is just ugly.
> > 
> > It looks like patch 14 then uses the thing to clean up dentry_kill() a
> > bit, so that combined patch wouldn't trigger my "this is just ugly"
> > case, and would still have a real reason for it (outside of a
> > theoretical "shrink rcu region" without numbers to say why it's
> > needed).
> 
> The main payoff is in #15, really.  The entire 12--17 started as a single
> commit, which I fucked up on the first two attempts.  Thus the carve-up
> into chunks too small for idiot me to fumble...

Does the following look sane to you?  That's pretty much #12--#15 combined,
with a hopefully saner commit message.

commit f6c6d2194a4911619a4ab8011953fbf2ec202778
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sat Apr 11 03:24:28 2026 -0400

    simplify safety for lock_for_kill() slowpath
    
    rcu_read_lock() scopes in dentry eviction machinery are too wide
    and badly structured; we end up with too many of those, quite
    a few essentially identical.  Worse, quite a few of the functions
    involved are not neutral wrt that, making them harder to reason about.
    
    rcu_read_lock() scope is not the only thing establishing an
    RCU read-side critical area - spin_lock scope does the same and
    they can be mixed - the sequence
            rcu_read_lock()
            ...
            spin_lock()
            ...
            rcu_read_unlock()
            ...
            rcu_read_lock()
            ...
            spin_unlock()
            ...
            rcu_read_unlock()
    is an unbroken RCU read-side critical area.
    
    That observation allows us to simplify things.  First of all,
    lock_for_kill() relies upon being in an unbroken RCU read-side
    critical area.  It's always called with ->d_lock held, and normally
    returns without having ever dropped that spinlock.  We would not
    need rcu_read_lock() at all, if not for the slow path - if trylock
    of inode->i_lock fails, we need to drop and retake ->d_lock.
    
    Having all calls of lock_for_kill() inside an rcu_read_lock() scope
    takes care of that, but to show that the lock_for_kill() slow path is
    safe, we need to demonstrate such an rcu_read_lock() scope for every
    call chain leading to lock_for_kill().  Which is not fun, seeing that
    there are 10 such scopes, with 5 distinct beginnings between them.
    
    Case 1: opens in dput(); proceeds through fast_dput() grabbing ->d_lock
    and returning false into dput(), and from there to a call of
    finish_dput(), which calls dentry_kill(), which calls lock_for_kill();
    ends in dentry_kill(), either right after lock_for_kill() success or
    right after dropping ->d_lock on lock_for_kill() failure.  ->d_lock is
    held continuously all the way into lock_for_kill().
    
    Case 2: opens in dentry_kill(), where we proceed to the same call of
    dentry_kill() as in case 1.  ->d_lock is held since before the
    beginning of the scope and all the way into lock_for_kill().
    
    Case 3: opens in select_collect2(), proceeds through the return to
    d_walk() and to shrink_dcache_tree() where we grab ->d_lock and
    proceed to call shrink_kill(), which calls dentry_kill(), then as
    in the previous scopes.
    
    Case 4: opens in shrink_dentry_list(), followed by call of shrink_kill(),
    then same as in case 3.  ->d_lock is held since before the beginning
    of the scope and all the way into lock_for_kill().
    
    Case 5: opens in shrink_kill(), where it's immediately followed by
    call of dentry_kill(), then same as in the previous scopes.  ->d_lock
    is held since before the beginning of the scope all the way into
    lock_for_kill().
    
    Note that in cases 2, 4 and 5 the slow path of lock_for_kill() is the
    only part of rcu_read_lock() scope that is not covered by spinlock
    scopes.  In case 1 we also have the area in fast_dput(), and in case 3
    the return path from select_collect2() and the chunk in
    shrink_dcache_tree() up to grabbing ->d_lock.
    
    Seeing that the reasons we need rcu_read_lock() in these additional
    areas are completely unrelated to the lock_for_kill() slow path, things
    get much more straightforward with
            * an explicit rcu_read_lock() scope surrounding the area in the
    slow path of lock_for_kill() where ->d_lock is not held;
            * shrink_dentry_list() dropping rcu_read_lock() as soon as it
    has grabbed ->d_lock;
            * dput() dropping rcu_read_lock() just before calling
    finish_dput();
            * removal of the rcu_read_lock() calls in finish_dput(),
    shrink_kill() and shrink_dentry_list(), along with the rcu_read_unlock()
    calls in dentry_kill().
    
    RCU read-side critical areas are unchanged by that, the safety of the
    lock_for_kill() slow path becomes trivial to verify, and a bunch of
    rcu_read_lock() scopes are either gone or easier to describe.
    
    Incidentally, all calls of fast_dput() are now immediately preceded by
    rcu_read_lock() and followed by rcu_read_unlock(), which will allow us
    to simplify those in the next step...
    
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

diff --git a/fs/dcache.c b/fs/dcache.c
index 0943a17eccbc..4a657bcd5b4f 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -763,6 +763,7 @@ static bool lock_for_kill(struct dentry *dentry)
 	if (!inode || likely(spin_trylock(&inode->i_lock)))
 		return true;
 
+	rcu_read_lock();
 	do {
 		spin_unlock(&dentry->d_lock);
 		spin_lock(&inode->i_lock);
@@ -772,6 +773,7 @@ static bool lock_for_kill(struct dentry *dentry)
 		spin_unlock(&inode->i_lock);
 		inode = dentry->d_inode;
 	} while (inode);
+	rcu_read_unlock();
 	if (likely(!dentry->d_lockref.count))
 		return true;
 	if (inode)
@@ -783,10 +785,8 @@ static struct dentry *dentry_kill(struct dentry *dentry)
 {
 	if (unlikely(!lock_for_kill(dentry))) {
 		spin_unlock(&dentry->d_lock);
-		rcu_read_unlock();
 		return NULL;
 	}
-	rcu_read_unlock();
 	return __dentry_kill(dentry);
 }
 
@@ -931,14 +931,12 @@ static inline bool fast_dput(struct dentry *dentry)
 
 static void finish_dput(struct dentry *dentry)
 	__releases(dentry->d_lock)
-	__releases(RCU)
 {
 	while ((dentry = dentry_kill(dentry)) != NULL) {
 		if (retain_dentry(dentry, true)) {
 			spin_unlock(&dentry->d_lock);
 			return;
 		}
-		rcu_read_lock();
 	}
 }
 
@@ -978,6 +976,7 @@ void dput(struct dentry *dentry)
 		rcu_read_unlock();
 		return;
 	}
+	rcu_read_unlock();
 	finish_dput(dentry);
 }
 EXPORT_SYMBOL(dput);
@@ -988,7 +987,6 @@ void d_make_discardable(struct dentry *dentry)
 	WARN_ON(!(dentry->d_flags & DCACHE_PERSISTENT));
 	dentry->d_flags &= ~DCACHE_PERSISTENT;
 	dentry->d_lockref.count--;
-	rcu_read_lock();
 	finish_dput(dentry);
 }
 EXPORT_SYMBOL(d_make_discardable);
@@ -1217,7 +1215,7 @@ EXPORT_SYMBOL(d_prune_aliases);
 static inline void shrink_kill(struct dentry *victim)
 {
 	while ((victim = dentry_kill(victim)) != NULL)
-		rcu_read_lock();
+		;
 }
 
 void shrink_dentry_list(struct list_head *list)
@@ -1233,7 +1231,6 @@ void shrink_dentry_list(struct list_head *list)
 			spin_unlock(&dentry->d_lock);
 			continue;
 		}
-		rcu_read_lock();
 		shrink_kill(dentry);
 	}
 }
@@ -1669,6 +1666,7 @@ static void shrink_dcache_tree(struct dentry *parent, bool for_umount)
 			struct dentry *v = data.victim;
 
 			spin_lock(&v->d_lock);
+			rcu_read_unlock();
 			if (v->d_lockref.count < 0 &&
 			    !(v->d_flags & DCACHE_DENTRY_KILLED)) {
 				struct completion_list wait;
@@ -1676,7 +1674,6 @@ static void shrink_dcache_tree(struct dentry *parent, bool for_umount)
 				// it becomes invisible to d_walk().
 				d_add_waiter(v, &wait);
 				spin_unlock(&v->d_lock);
-				rcu_read_unlock();
 				if (!list_empty(&data.dispose))
 					shrink_dentry_list(&data.dispose);
 				wait_for_completion(&wait.completion);


Thread overview: 53+ messages
2026-05-05  5:53 [RFC PATCH 00/25] assorted dcache cleanups and fixes Al Viro
2026-05-05  5:53 ` [RFC PATCH 01/25] VFS: use wait_var_event for waiting in d_alloc_parallel() Al Viro
2026-05-05  5:53 ` [RFC PATCH 02/25] alloc_path_pseudo(): make sure we don't end up with NORCU dentries for directories Al Viro
2026-05-05  8:21   ` NeilBrown
2026-05-05 17:48     ` Al Viro
2026-05-05  5:53 ` [RFC PATCH 03/25] fix a race between d_find_any_alias() and final dput() of NORCU dentries Al Viro
2026-05-05 17:06   ` Linus Torvalds
2026-05-05 20:29     ` Al Viro
2026-05-05  5:53 ` [RFC PATCH 04/25] find_acceptable_alias(): skip NORCU aliases with zero refcount Al Viro
2026-05-05  5:53 ` [RFC PATCH 05/25] select_collect(): ignore dentries on shrink lists if they have positive refcounts Al Viro
2026-05-05  5:53 ` [RFC PATCH 06/25] make to_shrink_list() return whether it has moved dentry to list Al Viro
2026-05-05  5:53 ` [RFC PATCH 07/25] kill d_dispose_if_unused() Al Viro
2026-05-05  5:53 ` [RFC PATCH 08/25] d_prune_aliases(): make sure to skip NORCU aliases Al Viro
2026-05-05  5:53 ` [RFC PATCH 09/25] shrink_dentry_list(): start with removing from shrink list Al Viro
2026-05-07 20:39   ` Al Viro
2026-05-05  5:53 ` [RFC PATCH 10/25] fold lock_for_kill() into shrink_kill() Al Viro
2026-05-05  5:53 ` [RFC PATCH 11/25] fold lock_for_kill() and __dentry_kill() into common helper Al Viro
2026-05-05  5:53 ` [RFC PATCH 12/25] reducing rcu_read_lock() scopes in dput and friends, step 1 Al Viro
2026-05-05  8:55   ` NeilBrown
2026-05-05 14:22     ` Al Viro
2026-05-05 21:58       ` NeilBrown
2026-05-05 16:47   ` Linus Torvalds
2026-05-05 22:42     ` Al Viro
2026-05-07  7:35       ` Al Viro [this message]
2026-05-07 15:32         ` Linus Torvalds
2026-05-05  5:54 ` [RFC PATCH 13/25] reducing rcu_read_lock() scopes in dput and friends, step 2 Al Viro
2026-05-05  5:54 ` [RFC PATCH 14/25] reducing rcu_read_lock() scopes in dput and friends, step 3 Al Viro
2026-05-05  5:54 ` [RFC PATCH 15/25] reducing rcu_read_lock() scopes in dput and friends, step 4 Al Viro
2026-05-05  5:54 ` [RFC PATCH 16/25] reducing rcu_read_lock() scopes in dput and friends, step 5 Al Viro
2026-05-05  5:54 ` [RFC PATCH 17/25] reducing rcu_read_lock() scopes in dput and friends, step 6 Al Viro
2026-05-05  5:54 ` [RFC PATCH 18/25] adjust calling conventions of lock_for_kill(), fold __dentry_kill() into dentry_kill() Al Viro
2026-05-05  5:54 ` [RFC PATCH 19/25] document dentry_kill() Al Viro
2026-05-05  5:54 ` [RFC PATCH 20/25] d_walk(): shrink rcu_read_lock() scope Al Viro
2026-05-05 17:01   ` Linus Torvalds
2026-05-05 20:05     ` Al Viro
2026-05-05 21:40       ` Frederic Weisbecker
2026-05-05 22:50         ` Al Viro
2026-05-06  3:49           ` Paul E. McKenney
2026-05-07 22:39             ` NeilBrown
2026-05-07 23:21               ` Paul E. McKenney
2026-05-08 14:47                 ` Al Viro
2026-05-08 22:03                   ` Paul E. McKenney
2026-05-08 23:03                     ` Al Viro
2026-05-08  3:01         ` Al Viro
2026-05-05  5:54 ` [RFC PATCH 21/25] shrinking rcu_read_lock() scope in d_alloc_parallel() Al Viro
2026-05-07 21:52   ` Jori Koolstra
2026-05-08  3:12     ` Al Viro
2026-05-08  9:28       ` Jori Koolstra
2026-05-05  5:54 ` [RFC PATCH 22/25] shrink_dentry_tree(): unify the calls of shrink_dentry_list() Al Viro
2026-05-05  5:54 ` [RFC PATCH 23/25] wind ->s_roots via ->d_sib instead of ->d_hash Al Viro
2026-05-05  5:54 ` [RFC PATCH 24/25] nfs: get rid of fake root dentries Al Viro
2026-05-05  5:54 ` [RFC PATCH 25/25] make cursors NORCU Al Viro
2026-05-05 17:09 ` [RFC PATCH 00/25] assorted dcache cleanups and fixes Linus Torvalds
