From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
NeilBrown <neil@brown.name>
Subject: [RFC PATCH 03/25] fix a race between d_find_any_alias() and final dput() of NORCU dentries
Date: Tue, 5 May 2026 06:53:50 +0100
Message-ID: <20260505055412.1261144-4-viro@zeniv.linux.org.uk>
In-Reply-To: <20260505055412.1261144-1-viro@zeniv.linux.org.uk>

Refcount of a NORCU dentry must not be incremented after having dropped
to zero.  Otherwise we might end up with the following race:

	CPU1: in fast_dput(d), rcu_read_lock();
	CPU1: decrements refcount of d to 0
	CPU1: notices that it's unhashed
	CPU2: grabs a reference to d
	CPU2: dput(d), freeing d
	CPU1: ... looks like we need to evict d, let's grab ->d_lock, recheck
	      the refcount, etc.

and that spin_lock(&d->d_lock) ends up a UAF, despite still being in
an RCU read-side critical section started back when the refcount had been
positive.  If not for DCACHE_NORCU in d->d_flags, freeing would've been
RCU-delayed, so we'd have grabbed ->d_lock, noticed the negative value
stored into refcount by __dentry_kill(), dropped the locks and that would
be it.  For NORCU dentries freeing is _not_ delayed, though.
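
For illustration only, the interleaving above can be replayed
single-threaded on a toy object.  This is a sketch, not kernel code:
struct toy_dentry, toy_final_put() and toy_replay_race() are made-up
names, and the model ignores locking and the negative refcount that
__dentry_kill() stores; it only tracks whether the memory would
already be gone when CPU1 gets to ->d_lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a dentry; every name in this block is hypothetical. */
struct toy_dentry {
	int count;	/* models ->d_lockref.count */
	bool norcu;	/* models DCACHE_NORCU in ->d_flags */
	bool freed;	/* set once the memory would have been released */
};

/*
 * Drop a reference.  On the final put a NORCU object is freed at once;
 * otherwise freeing is RCU-delayed, so it cannot complete while a reader
 * is still inside its RCU read-side critical section.
 */
static void toy_final_put(struct toy_dentry *d, bool reader_in_rcu)
{
	assert(d->count > 0);
	if (--d->count == 0 && (d->norcu || !reader_in_rcu))
		d->freed = true;
}

/* Returns true if CPU1 would end up taking ->d_lock on freed memory. */
static bool toy_replay_race(bool norcu)
{
	struct toy_dentry d = { .count = 1, .norcu = norcu, .freed = false };

	/* CPU1: fast_dput(): rcu_read_lock(), refcount 1 -> 0, sees unhashed */
	d.count--;
	/* CPU2: grabs a reference although the refcount has reached zero */
	d.count++;
	/* CPU2: dput() drops that reference again and kills the dentry */
	toy_final_put(&d, true);
	/* CPU1: about to spin_lock(&d->d_lock); is the memory still there? */
	return d.freed;
}
```

With norcu set the replay reports a use-after-free; without it the
RCU delay keeps the object around until CPU1 is done.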

Most of the non-counting references are excluded for NORCU dentries -
they are not allowed to be hashed, they never get placed on LRU, they
never get placed into anyone's list of children and while dput_to_list()
might put them into a shrink list, nobody bumps refcount of something
that had been reached that way.

However, inode's list of aliases can be a problem - it does not contribute
to dentry refcount (for obvious reasons) and we *do* have places that
grab references to something found on that list - that's precisely what
d_find_alias() is.  In case of d_find_alias() we are safe - it skips
unhashed aliases, so all NORCU ones are ignored there.  d_find_any_alias()
is *not* limited to hashed ones, though, and while it's usually called
for directories (which never get NORCU dentries), there are callers that
use it to get something for non-directories with no hashed aliases.

Having d_find_any_alias() hit a NORCU dentry is not impossible - it can
be easily arranged if you have CAP_DAC_READ_SEARCH (memfd_create() + mmap()
+ name_to_handle_at() for /proc/self/map_files/<...> + munmap() +
open_by_handle_at() will do that, and adding a second memfd_create() for
mount_fd makes it possible to do that without having memfd pinned).

The race window is narrow, and hitting it is probably not feasible on
bare hardware, but...

It's not hard to fix, fortunately:
	* separate __d_find_dir_alias() (== current __d_find_any_alias()) to
	  be used for directory inodes.
	* provide __dget_alias_careful() that would return false for NORCU
	  dentries with zero refcount and return true incrementing refcount
	  otherwise.
	* make __d_find_any_alias() go over the list of aliases, using
	  __dget_alias_careful() and returning the alias it succeeds on
	  (normally the first one).  Any NORCU alias with zero refcount is
	  going to be evicted by the thread that had dropped the final
	  reference; this makes __d_find_any_alias() pretend it had lost the
	  race with eviction.
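
As a rough userspace model of the rule in the last two items (a sketch
under assumptions, not the kernel helpers): struct toy_alias,
toy_dget_alias_careful() and toy_find_any_alias() are hypothetical
names, the alias list is a plain array, and all locking (->i_lock held
by the caller, ->d_lock taken for NORCU aliases) is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy alias; every name in this block is hypothetical. */
struct toy_alias {
	int count;	/* models ->d_lockref.count */
	bool norcu;	/* models DCACHE_NORCU in ->d_flags */
};

/*
 * Models the rule of __dget_alias_careful(): a NORCU alias whose
 * refcount is not positive must not be grabbed; anything else gets
 * its refcount bumped.  Locking is deliberately omitted here.
 */
static bool toy_dget_alias_careful(struct toy_alias *a)
{
	if (a->norcu && a->count <= 0)
		return false;
	a->count++;
	return true;
}

/*
 * Models the new __d_find_any_alias(): walk the aliases and return the
 * first one that could be grabbed, or NULL if none could.
 */
static struct toy_alias *toy_find_any_alias(struct toy_alias *aliases, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (toy_dget_alias_careful(&aliases[i]))
			return &aliases[i];
	return NULL;
}
```

A zero-refcount NORCU alias is skipped as if it had already been
evicted; if it is the only alias, the lookup returns NULL, same as
losing the race with eviction.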
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/dcache.c            | 21 ++++++++++++++++++---
 include/linux/dcache.h | 18 ++++++++++++++++++
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 0aff2c510beb..923e499ffe7e 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1052,7 +1052,10 @@ struct dentry *dget_parent(struct dentry *dentry)
 }
 EXPORT_SYMBOL(dget_parent);
 
-static struct dentry * __d_find_any_alias(struct inode *inode)
+/*
+ * inode is a directory, inode->i_lock is held by the caller
+ */
+static struct dentry * __d_find_dir_alias(struct inode *inode)
 {
 	struct dentry *alias;
 
@@ -1063,6 +1066,18 @@ static struct dentry * __d_find_any_alias(struct inode *inode)
 	return alias;
 }
 
+static struct dentry * __d_find_any_alias(struct inode *inode)
+{
+	struct dentry *alias;
+
+	if (hlist_empty(&inode->i_dentry))
+		return NULL;
+	for_each_alias(alias, inode)
+		if (__dget_alias_careful(alias))
+			return alias;
+	return NULL;
+}
+
 /**
  * d_find_any_alias - find any alias for a given inode
  * @inode: inode to find an alias for
@@ -1086,7 +1101,7 @@ static struct dentry *__d_find_alias(struct inode *inode)
 	struct dentry *alias;
 
 	if (S_ISDIR(inode->i_mode))
-		return __d_find_any_alias(inode);
+		return __d_find_dir_alias(inode);
 
 	for_each_alias(alias, inode) {
 		spin_lock(&alias->d_lock);
@@ -3150,7 +3165,7 @@ struct dentry *d_splice_alias_ops(struct inode *inode, struct dentry *dentry,
 	security_d_instantiate(dentry, inode);
 	spin_lock(&inode->i_lock);
 	if (S_ISDIR(inode->i_mode)) {
-		struct dentry *new = __d_find_any_alias(inode);
+		struct dentry *new = __d_find_dir_alias(inode);
 		if (unlikely(new)) {
 			/* The reference to new ensures it remains an alias */
 			spin_unlock(&inode->i_lock);
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 97a887be150a..684aeb9e9cbe 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -365,6 +365,24 @@ static inline struct dentry *dget(struct dentry *dentry)
 	return dentry;
 }
 
+/* dentry->d_inode->i_lock must be held by caller */
+static inline bool __dget_alias_careful(struct dentry *dentry)
+{
+	if (likely(!(READ_ONCE(dentry->d_flags) & DCACHE_NORCU))) {
+		lockref_get(&dentry->d_lockref);
+		return true;
+	}
+	// NORCU dentries with zero refcount MUST NOT be grabbed
+	spin_lock(&dentry->d_lock);
+	if (dentry->d_lockref.count > 0) {
+		dget_dlock(dentry);
+		spin_unlock(&dentry->d_lock);
+		return true;
+	}
+	spin_unlock(&dentry->d_lock);
+	return false;
+}
+
 extern struct dentry *dget_parent(struct dentry *dentry);
 
 /**
--
2.47.3