Linux filesystem development
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	linux-fsdevel@vger.kernel.org,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	NeilBrown <neil@brown.name>, Boqun Feng <boqun@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Uladzislau Rezki <urezki@gmail.com>
Subject: Re: [RFC PATCH 20/25] d_walk(): shrink rcu_read_lock() scope
Date: Tue, 5 May 2026 23:40:44 +0200	[thread overview]
Message-ID: <afpj3FBrvHer0E6Y@pavilion.home> (raw)
In-Reply-To: <20260505200501.GF3518998@ZenIV>

(Adding other RCU folks in Cc)


Le Tue, May 05, 2026 at 09:05:01PM +0100, Al Viro a écrit :
> On Tue, May 05, 2026 at 10:01:39AM -0700, Linus Torvalds wrote:
> 
> > If we had a function that was "transfer spinlock without dropping RCU
> > read lock", and we'd use that as an _optimization_ (maybe we could
> > avoid all the preemption count games and testing for whether there are
> > active RCU events etc), that would make sense to me.
> >
> > At that point it would *all* become "we rely on the spinlock to hold
> > RCU read locking", and it wouldn't mix locking models. So I think such
> > an operation might make sense on its own.
> 
> Definitely; we have the same "bridge two spin_lock scopes together into
> a contiguous RCU read-side critical area" elsewhere as well.  Hell,
> 
> 	rcu_read_lock();
> 	do {
> 		spin_unlock(&dentry->d_lock);
> 		spin_lock(&inode->i_lock);
> 		spin_lock(&dentry->d_lock);
> 		if (likely(inode == dentry->d_inode))
> 			break;
> 		spin_unlock(&inode->i_lock);
> 		inode = dentry->d_inode;
> 	} while (inode);
> 	rcu_read_unlock();
> 
> in lock_for_kill() might be better off as
> 
> 	do {
> 		transfer_spinlock_with_rcu(&dentry->d_lock, &inode->i_lock);
> 		spin_lock(&dentry->d_lock);
> 		if (likely(inode == dentry->d_inode))
> 			break;
> 		spin_unlock(&inode->i_lock);
> 		inode = dentry->d_inode;
> 	} while (inode);
> 
> Explicit rcu_read_lock() uses don't explain what's really going on; in this
> form it's "we can't take ->i_lock under ->d_lock, but we need an unbroken
> RCU read-side critical area for memory safety".  The same goes for d_walk()
> use ("can't take parent's ->d_lock under child's one"), as well as in
> shrink_dcache_for_umount() ("can't take ->d_lock under ->s_roots_lock")
> and I wouldn't be surprised to see other instances of the same pattern.
> 
> I would really prefer to have all explicit rcu_read_lock() in fs/dcache.c
> clearly documented ;-/

And the calls to rcu_read_lock() / rcu_read_unlock() are _usually_ cheap enough
that their explicit calls should go unnoticed over an implicit spinlock RCU
critical section. Ok rcu_read_unlock() has its slowpath but it would otherwise
trigger randomly when implicit read side are used (context switch, interrupts,
etc...)

Also rcu_read_lock() has the advantage to tell clearly that we are doing an RCU read
side (and its uses indeed claim to be documented) here while relying on spinlocks to do
that is fine but implicit and more error-prone when there are tricky things
like spinlock transfers: better don't ask for trouble.

So I would say that we should probably avoid introducing such an API if possible.

> FWIW, back in 2015 spin_lock() did *not* provide an RCU read-side critical
> area, according to https://lwn.net/Articles/652156/; would be nice to have
> the history mentioned in D/RCU/whatisRCU.rst - at least to the level of
> "once upon a time one *did* need an explicit rcu_read_lock() in addition
> to spin_lock() - spin_lock() or preempt_disable() alone was not enough;
> that has changed in <...>".
> 
> I'm not familiar enough with that code for that kind of archaeology ;-/
> Could somebody help with that?  Paul?

That's quite outdated. Now preempt disabled, IRQs disabled, and spinlocks
are all implicit RCU read sides. On non RT spinlocks disable preemption and
on RT they call rcu_read_lock() internally.

Thanks.

-- 
Frederic Weisbecker
SUSE Labs

  reply	other threads:[~2026-05-05 21:40 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-05  5:53 [RFC PATCH 00/25] assorted dcache cleanups and fixes Al Viro
2026-05-05  5:53 ` [RFC PATCH 01/25] VFS: use wait_var_event for waiting in d_alloc_parallel() Al Viro
2026-05-05  5:53 ` [RFC PATCH 02/25] alloc_path_pseudo(): make sure we don't end up with NORCU dentries for directories Al Viro
2026-05-05  8:21   ` NeilBrown
2026-05-05 17:48     ` Al Viro
2026-05-05  5:53 ` [RFC PATCH 03/25] fix a race between d_find_any_alias() and final dput() of NORCU dentries Al Viro
2026-05-05 17:06   ` Linus Torvalds
2026-05-05 20:29     ` Al Viro
2026-05-05  5:53 ` [RFC PATCH 04/25] find_acceptable_alias(): skip NORCU aliases with zero refcount Al Viro
2026-05-05  5:53 ` [RFC PATCH 05/25] select_collect(): ignore dentries on shrink lists if they have positive refcounts Al Viro
2026-05-05  5:53 ` [RFC PATCH 06/25] make to_shrink_list() return whether it has moved dentry to list Al Viro
2026-05-05  5:53 ` [RFC PATCH 07/25] kill d_dispose_if_unused() Al Viro
2026-05-05  5:53 ` [RFC PATCH 08/25] d_prune_aliases(): make sure to skip NORCU aliases Al Viro
2026-05-05  5:53 ` [RFC PATCH 09/25] shrink_dentry_list(): start with removing from shrink list Al Viro
2026-05-07 20:39   ` Al Viro
2026-05-05  5:53 ` [RFC PATCH 10/25] fold lock_for_kill() into shrink_kill() Al Viro
2026-05-05  5:53 ` [RFC PATCH 11/25] fold lock_for_kill() and __dentry_kill() into common helper Al Viro
2026-05-05  5:53 ` [RFC PATCH 12/25] reducing rcu_read_lock() scopes in dput and friends, step 1 Al Viro
2026-05-05  8:55   ` NeilBrown
2026-05-05 14:22     ` Al Viro
2026-05-05 21:58       ` NeilBrown
2026-05-05 16:47   ` Linus Torvalds
2026-05-05 22:42     ` Al Viro
2026-05-07  7:35       ` Al Viro
2026-05-07 15:32         ` Linus Torvalds
2026-05-05  5:54 ` [RFC PATCH 13/25] reducing rcu_read_lock() scopes in dput and friends, step 2 Al Viro
2026-05-05  5:54 ` [RFC PATCH 14/25] reducing rcu_read_lock() scopes in dput and friends, step 3 Al Viro
2026-05-05  5:54 ` [RFC PATCH 15/25] reducing rcu_read_lock() scopes in dput and friends, step 4 Al Viro
2026-05-05  5:54 ` [RFC PATCH 16/25] reducing rcu_read_lock() scopes in dput and friends, step 5 Al Viro
2026-05-05  5:54 ` [RFC PATCH 17/25] reducing rcu_read_lock() scopes in dput and friends, step 6 Al Viro
2026-05-05  5:54 ` [RFC PATCH 18/25] adjust calling conventions of lock_for_kill(), fold __dentry_kill() into dentry_kill() Al Viro
2026-05-05  5:54 ` [RFC PATCH 19/25] document dentry_kill() Al Viro
2026-05-05  5:54 ` [RFC PATCH 20/25] d_walk(): shrink rcu_read_lock() scope Al Viro
2026-05-05 17:01   ` Linus Torvalds
2026-05-05 20:05     ` Al Viro
2026-05-05 21:40       ` Frederic Weisbecker [this message]
2026-05-05 22:50         ` Al Viro
2026-05-06  3:49           ` Paul E. McKenney
2026-05-07 22:39             ` NeilBrown
2026-05-07 23:21               ` Paul E. McKenney
2026-05-08 14:47                 ` Al Viro
2026-05-08 22:03                   ` Paul E. McKenney
2026-05-08 23:03                     ` Al Viro
2026-05-08  3:01         ` Al Viro
2026-05-05  5:54 ` [RFC PATCH 21/25] shrinking rcu_read_lock() scope in d_alloc_parallel() Al Viro
2026-05-07 21:52   ` Jori Koolstra
2026-05-08  3:12     ` Al Viro
2026-05-08  9:28       ` Jori Koolstra
2026-05-05  5:54 ` [RFC PATCH 22/25] shrink_dentry_tree(): unify the calls of shrink_dentry_list() Al Viro
2026-05-05  5:54 ` [RFC PATCH 23/25] wind ->s_roots via ->d_sib instead of ->d_hash Al Viro
2026-05-05  5:54 ` [RFC PATCH 24/25] nfs: get rid of fake root dentries Al Viro
2026-05-05  5:54 ` [RFC PATCH 25/25] make cursors NORCU Al Viro
2026-05-05 17:09 ` [RFC PATCH 00/25] assorted dcache cleanups and fixes Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afpj3FBrvHer0E6Y@pavilion.home \
    --to=frederic@kernel.org \
    --cc=boqun@kernel.org \
    --cc=brauner@kernel.org \
    --cc=jack@suse.cz \
    --cc=joelagnelf@nvidia.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=neil@brown.name \
    --cc=paulmck@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=urezki@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox