public inbox for linux-fsdevel@vger.kernel.org
From: Jeff Layton <jlayton@kernel.org>
To: Al Viro <viro@zeniv.linux.org.uk>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Boqun Feng <boqun@kernel.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	linux-fsdevel@vger.kernel.org,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Nikolay Borisov <nik.borisov@suse.com>,
	Max Kellermann <max.kellermann@ionos.com>,
	Eric Sandeen <sandeen@redhat.com>,
	Paulo Alcantara <pc@manguebit.org>
Subject: Re: [RFC][PATCH] make sure that lock_for_kill() callers drop the locks in safe order
Date: Sun, 12 Apr 2026 09:15:56 -0400
Message-ID: <72698ebb266c70c36910d401d03f45ac0fc33fc0.camel@kernel.org>
In-Reply-To: <20260411213348.GB3836593@ZenIV>

On Sat, 2026-04-11 at 22:33 +0100, Al Viro wrote:
> On Fri, Apr 10, 2026 at 09:24:04PM +0100, Al Viro wrote:
> > On Fri, Apr 10, 2026 at 12:30:13PM -0700, Linus Torvalds wrote:
> > 
> > > The reason it exists is because lock_for_kill() can drop d_lock(), but
> > > > that's in the unlikely case that we can't just immediately get the
> > > inode lock.
> > > 
> > > So honestly, I think that rcu_read_lock() should be inside
> > > lock_for_kill(), rather than in the caller as a "just in case things
> > > go down".
> > 
> > Yup, in the cascade of followups I've mentioned...
> 
> FWIW, see #work.dcache-cleanups (on top of #work.dcache-busy-wait).  That's
> obviously next cycle fodder, and it needs review and testing (at the moment
> I've only build-tested that).
> 
> Branch is in git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #work.dcache-cleanups
> individual patches in followups.
> 
> I think the eviction machinery got easier to read, if nothing else, and
> rcu_read_lock() scopes are saner now.  OTOH, if we *do* have RCU bugs wrt
> overlapping scopes, we'll probably see a lot of UAF on that - rcu_read_lock()
> scopes got aggressively minimized...
> 
> I've split the rcu_read_lock() massage into a series of small steps after
> getting confused while trying to do them at once; not sure if it needs
> to be carved up so much, but...
> 
> Shortlog:
> Al Viro (11):
>       shrink_dentry_list(): start with removing from shrink list
>       fold lock_for_kill() into shrink_kill()
>       fold lock_for_kill() and __dentry_kill() into common helper
>       reducing rcu_read_lock() scopes in dput and friends, step 1
>       reducing rcu_read_lock() scopes in dput and friends, step 2
>       reducing rcu_read_lock() scopes in dput and friends, step 3
>       reducing rcu_read_lock() scopes in dput and friends, step 4
>       reducing rcu_read_lock() scopes in dput and friends, step 5
>       reducing rcu_read_lock() scopes in dput and friends, step 6
>       adjust calling conventions of lock_for_kill(), fold __dentry_kill() into dentry_kill()
>       document dentry_kill()
> 
> Diffstat:
>  fs/dcache.c | 214 ++++++++++++++++++++++++++++++++++++------------------------
>  1 file changed, 129 insertions(+), 85 deletions(-)
> 63 lines of comments added - outside of comments it's -20LoC...

FWIW, I ran fstests + the earlier reproducer on a v6.19-ish kernel
without any of your fixes, and with "slub_debug=P,dentry" on the kernel
command line. It also had the patch below. I got no warnings back:

--------------------------8<--------------------------

diff --git a/fs/Kconfig b/fs/Kconfig
index 0bfdaecaa877..f3a228cd14e8 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -9,6 +9,15 @@ menu "File systems"
 config DCACHE_WORD_ACCESS
        bool
 
+config DCACHE_SHRINK_RACE_DEBUG
+	bool "Debug: inject delay in __dentry_kill to widen race window"
+	depends on DEBUG_KERNEL
+	default n
+	help
+	  Inject a probabilistic delay in __dentry_kill() between releasing
+	  d_lock and re-acquiring it, to make dcache shrink races reproducible
+	  in test environments. Only enable for testing.
+
 config VALIDATE_FS_PARSER
 	bool "Validate filesystem parameter description"
 	help
diff --git a/fs/dcache.c b/fs/dcache.c
index 66dd1bb830d1..b1625f7e30e1 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -32,6 +32,8 @@
 #include <linux/bit_spinlock.h>
 #include <linux/rculist_bl.h>
 #include <linux/list_lru.h>
+#include <linux/delay.h>
+#include <linux/random.h>
 #include "internal.h"
 #include "mount.h"
 
@@ -675,6 +677,16 @@ static struct dentry *__dentry_kill(struct dentry *dentry)
 		dentry->d_op->d_release(dentry);
 
 	cond_resched();
+#ifdef CONFIG_DCACHE_SHRINK_RACE_DEBUG
+	/*
+	 * Probabilistic delay to widen the race window in __dentry_kill()
+	 * between dropping d_lock and re-acquiring it.  30% chance of 5ms
+	 * delay for proc dentries.
+	 */
+	if (dentry->d_sb->s_magic == 0x9fa0 /* PROC_SUPER_MAGIC */ &&
+	    get_random_u32_below(100) < 30)
+		mdelay(5);
+#endif
 	/* now that it's negative, ->d_parent is stable */
 	if (!IS_ROOT(dentry)) {
 		parent = dentry->d_parent;
@@ -2722,6 +2734,9 @@ static wait_queue_head_t *__d_lookup_unhash(struct dentry *dentry)
 
 	lockdep_assert_held(&dentry->d_lock);
 
+	if (WARN_ON(!(dentry->d_flags & DCACHE_PAR_LOOKUP)))
+		return NULL;
+
 	b = in_lookup_hash(dentry->d_parent, dentry->d_name.hash);
 	hlist_bl_lock(b);
 	dentry->d_flags &= ~DCACHE_PAR_LOOKUP;
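
For anyone who wants to play with the delay-injection idea outside the
kernel, here is a minimal userspace sketch of the same pattern: roll a
number below 100, and stall only when it falls under a threshold. It is
an illustration, not part of the patch. The names should_inject(),
delay_ms() and count_injections() are made up for this example; rand()
stands in for get_random_u32_below() and nanosleep() for mdelay().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for nanosleep() */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical stand-in for "get_random_u32_below(100) < pct".
 * rand() % 100 is slightly biased, which is fine for a debug knob.
 */
static bool should_inject(unsigned int pct)
{
	assert(pct <= 100);
	return (unsigned int)(rand() % 100) < pct;
}

/*
 * Hypothetical stand-in for mdelay(): sleeps for about 'ms'
 * milliseconds.  Unlike mdelay(), this sleeps rather than spins.
 */
static void delay_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/*
 * Run 'iterations' passes through a simulated race window and return
 * how many of them had the delay injected.
 */
static unsigned int count_injections(unsigned int pct,
				     unsigned int iterations, long ms)
{
	unsigned int hits = 0;
	unsigned int i;

	for (i = 0; i < iterations; i++) {
		if (should_inject(pct)) {
			delay_ms(ms);	/* widen the window */
			hits++;
		}
	}
	return hits;
}
```

Calling count_injections(30, 1000, 5) mirrors the "30% chance of 5ms"
setting used in the patch. Keeping the probability well under 100 means
most passes run at normal speed, so the test load still makes progress
while a fraction of passes hit the widened window.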


