public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement
@ 2025-05-21  1:53 Qingfang Deng
  2025-05-21  1:53 ` [PATCH 5.10 1/5] kernfs: add a revision to identify directory node changes Qingfang Deng
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Qingfang Deng @ 2025-05-21  1:53 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman, Tejun Heo, linux-kernel; +Cc: Ian Kent

KCSAN reports concurrent accesses to inode->i_mode:

==================================================================
BUG: KCSAN: data-race in generic_permission / kernfs_iop_permission

write to 0xffffffe001129590 of 2 bytes by task 2477 on cpu 1:
 kernfs_iop_permission+0x72/0x1a0
 link_path_walk.part.0.constprop.0+0x348/0x420
 path_openat+0xee/0x10f0
 do_filp_open+0xaa/0x160
 do_sys_openat2+0x252/0x380
 sys_openat+0x4c/0xa0
 ret_from_syscall+0x0/0x2

read to 0xffffffe001129590 of 2 bytes by task 3902 on cpu 3:
 generic_permission+0x26/0x120
 kernfs_iop_permission+0x150/0x1a0
 link_path_walk.part.0.constprop.0+0x348/0x420
 path_lookupat+0x58/0x280
 filename_lookup+0xae/0x1f0
 user_path_at_empty+0x3a/0x70
 vfs_statx+0x82/0x170
 __do_sys_newfstatat+0x36/0x70
 sys_newfstatat+0x2e/0x50
 ret_from_syscall+0x0/0x2

Reported by Kernel Concurrency Sanitizer on:
CPU: 3 PID: 3902 Comm: ls Not tainted 5.10.104+ #0
==================================================================

kernfs_iop_permission+0x72/0x1a0:

kernfs_refresh_inode at fs/kernfs/inode.c:174
 169 	
 170 	static void kernfs_refresh_inode(struct kernfs_node *kn, struct inode *inode)
 171 	{
 172 		struct kernfs_iattrs *attrs = kn->iattr;
 173 	
>174<		inode->i_mode = kn->mode;
 175 		if (attrs)
 176 			/*
 177 			 * kernfs_node has non-default attributes get them from
 178 			 * persistent copy in kernfs_node.
 179 			 */

(inlined by) kernfs_iop_permission at fs/kernfs/inode.c:285
 280 			return -ECHILD;
 281 	
 282 		kn = inode->i_private;
 283 	
 284 		mutex_lock(&kernfs_mutex);
>285<		kernfs_refresh_inode(kn, inode);
 286 		mutex_unlock(&kernfs_mutex);
 287 	
 288 		return generic_permission(inode, mask);
 289 	}
 290 	

generic_permission+0x26/0x120:

acl_permission_check at fs/namei.c:298
 293 	 * Note that the POSIX ACL check cares about the MAY_NOT_BLOCK bit,
 294 	 * for RCU walking.
 295 	 */
 296 	static int acl_permission_check(struct inode *inode, int mask)
 297 	{
>298<		unsigned int mode = inode->i_mode;
 299 	
 300 		/* Are we the owner? If so, ACL's don't matter */
 301 		if (likely(uid_eq(current_fsuid(), inode->i_uid))) {
 302 			mask &= 7;
 303 			mode >>= 6;

(inlined by) generic_permission at fs/namei.c:353
 348 		int ret;
 349 	
 350 		/*
 351 		 * Do the basic permission checks.
 352 		 */
>353<		ret = acl_permission_check(inode, mask);
 354 		if (ret != -EACCES)
 355 			return ret;
 356 	
 357 		if (S_ISDIR(inode->i_mode)) {
 358 			/* DACs are overridable for directories */

Backport the series from 5.15 to fix the concurrency bug.
https://lore.kernel.org/all/162642752894.63632.5596341704463755308.stgit@web.messagingengine.com

Ian Kent (5):
  kernfs: add a revision to identify directory node changes
  kernfs: use VFS negative dentry caching
  kernfs: switch kernfs to use an rwsem
  kernfs: use i_lock to protect concurrent inode updates
  kernfs: dont call d_splice_alias() under kernfs node lock

 fs/kernfs/dir.c             | 153 ++++++++++++++++++++----------------
 fs/kernfs/file.c            |   4 +-
 fs/kernfs/inode.c           |  26 +++---
 fs/kernfs/kernfs-internal.h |  24 +++++-
 fs/kernfs/mount.c           |  12 +--
 fs/kernfs/symlink.c         |   4 +-
 include/linux/kernfs.h      |   7 +-
 7 files changed, 138 insertions(+), 92 deletions(-)

-- 
2.43.0



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 5.10 1/5] kernfs: add a revision to identify directory node changes
  2025-05-21  1:53 [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Qingfang Deng
@ 2025-05-21  1:53 ` Qingfang Deng
  2025-05-21  4:16   ` Greg Kroah-Hartman
  2025-05-22  2:07   ` Sasha Levin
  2025-05-21  1:53 ` [PATCH 5.10 2/5] kernfs: use VFS negative dentry caching Qingfang Deng
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 15+ messages in thread
From: Qingfang Deng @ 2025-05-21  1:53 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman, Tejun Heo, linux-kernel
  Cc: Ian Kent, Miklos Szeredi

From: Ian Kent <raven@themaw.net>

Commit 895adbec302e92086359e6fd92611ac3be6d92c3 upstream.

Add a revision counter to kernfs directory nodes so it can be used
to detect if a directory node has changed during negative dentry
revalidation.

There's an assumption that sizeof(unsigned long) <= sizeof(pointer)
on all architectures and as far as I know that assumption holds.

So adding a revision counter to the struct kernfs_elem_dir variant of
the kernfs_node type union won't increase the size of the kernfs_node
struct. This is because struct kernfs_elem_dir is at least
sizeof(pointer) smaller than the largest union variant. It's tempting
to make the revision counter a u64 but that would increase the size of
kernfs_node on archs where sizeof(pointer) is smaller than the revision
counter.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Link: https://lore.kernel.org/r/162642769895.63632.8356662784964509867.stgit@web.messagingengine.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/kernfs/dir.c             |  2 ++
 fs/kernfs/kernfs-internal.h | 19 +++++++++++++++++++
 include/linux/kernfs.h      |  5 +++++
 3 files changed, 26 insertions(+)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 0ba056e06e48..9bc73c8b6e3f 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -380,6 +380,7 @@ static int kernfs_link_sibling(struct kernfs_node *kn)
 	/* successfully added, account subdir number */
 	if (kernfs_type(kn) == KERNFS_DIR)
 		kn->parent->dir.subdirs++;
+	kernfs_inc_rev(kn->parent);
 
 	return 0;
 }
@@ -402,6 +403,7 @@ static bool kernfs_unlink_sibling(struct kernfs_node *kn)
 
 	if (kernfs_type(kn) == KERNFS_DIR)
 		kn->parent->dir.subdirs--;
+	kernfs_inc_rev(kn->parent);
 
 	rb_erase(&kn->rb, &kn->parent->dir.children);
 	RB_CLEAR_NODE(&kn->rb);
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 7ee97ef59184..6a8d0ca26d03 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -81,6 +81,25 @@ static inline struct kernfs_node *kernfs_dentry_node(struct dentry *dentry)
 	return d_inode(dentry)->i_private;
 }
 
+static inline void kernfs_set_rev(struct kernfs_node *parent,
+				  struct dentry *dentry)
+{
+	dentry->d_time = parent->dir.rev;
+}
+
+static inline void kernfs_inc_rev(struct kernfs_node *parent)
+{
+	parent->dir.rev++;
+}
+
+static inline bool kernfs_dir_changed(struct kernfs_node *parent,
+				      struct dentry *dentry)
+{
+	if (parent->dir.rev != dentry->d_time)
+		return true;
+	return false;
+}
+
 extern const struct super_operations kernfs_sops;
 extern struct kmem_cache *kernfs_node_cache, *kernfs_iattrs_cache;
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 89f6a4214a70..195afa63ab1c 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -98,6 +98,11 @@ struct kernfs_elem_dir {
 	 * better directly in kernfs_node but is here to save space.
 	 */
 	struct kernfs_root	*root;
+	/*
+	 * Monotonic revision counter, used to identify if a directory
+	 * node has changed during negative dentry revalidation.
+	 */
+	unsigned long		rev;
 };
 
 struct kernfs_elem_symlink {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5.10 2/5] kernfs: use VFS negative dentry caching
  2025-05-21  1:53 [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Qingfang Deng
  2025-05-21  1:53 ` [PATCH 5.10 1/5] kernfs: add a revision to identify directory node changes Qingfang Deng
@ 2025-05-21  1:53 ` Qingfang Deng
  2025-05-22  2:05   ` Sasha Levin
  2025-05-21  1:53 ` [PATCH 5.10 3/5] kernfs: switch kernfs to use an rwsem Qingfang Deng
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Qingfang Deng @ 2025-05-21  1:53 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman, Tejun Heo, linux-kernel
  Cc: Ian Kent, Miklos Szeredi

From: Ian Kent <raven@themaw.net>

Commit c7e7c04274b13f98f758fb69b03f2ab61976ea80 upstream.

If there are many lookups for non-existent paths these negative lookups
can lead to a lot of overhead during path walks.

The VFS allows dentries to be created as negative and hashed, and caches
them so they can be used to reduce the fairly high overhead alloc/free
cycle that occurs during these lookups.

Use the kernfs node parent revision to identify if a change has been
made to the containing directory so that the negative dentry can be
discarded and the lookup redone.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Link: https://lore.kernel.org/r/162642770420.63632.15791924970508867106.stgit@web.messagingengine.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/kernfs/dir.c | 55 +++++++++++++++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 9bc73c8b6e3f..1756f7e43aac 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -565,9 +565,31 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	if (flags & LOOKUP_RCU)
 		return -ECHILD;
 
-	/* Always perform fresh lookup for negatives */
-	if (d_really_is_negative(dentry))
-		goto out_bad_unlocked;
+	/* Negative hashed dentry? */
+	if (d_really_is_negative(dentry)) {
+		struct kernfs_node *parent;
+
+		/* If the kernfs parent node has changed discard and
+		 * proceed to ->lookup.
+		 */
+		mutex_lock(&kernfs_mutex);
+		spin_lock(&dentry->d_lock);
+		parent = kernfs_dentry_node(dentry->d_parent);
+		if (parent) {
+			if (kernfs_dir_changed(parent, dentry)) {
+				spin_unlock(&dentry->d_lock);
+				mutex_unlock(&kernfs_mutex);
+				return 0;
+			}
+		}
+		spin_unlock(&dentry->d_lock);
+		mutex_unlock(&kernfs_mutex);
+
+		/* The kernfs parent node hasn't changed, leave the
+		 * dentry negative and return success.
+		 */
+		return 1;
+	}
 
 	kn = kernfs_dentry_node(dentry);
 	mutex_lock(&kernfs_mutex);
@@ -593,7 +615,6 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	return 1;
 out_bad:
 	mutex_unlock(&kernfs_mutex);
-out_bad_unlocked:
 	return 0;
 }
 
@@ -1104,33 +1125,27 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 	struct dentry *ret;
 	struct kernfs_node *parent = dir->i_private;
 	struct kernfs_node *kn;
-	struct inode *inode;
+	struct inode *inode = NULL;
 	const void *ns = NULL;
 
 	mutex_lock(&kernfs_mutex);
-
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dir->i_sb)->ns;
 
 	kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
-
-	/* no such entry */
-	if (!kn || !kernfs_active(kn)) {
-		ret = NULL;
-		goto out_unlock;
-	}
-
 	/* attach dentry and inode */
-	inode = kernfs_get_inode(dir->i_sb, kn);
-	if (!inode) {
-		ret = ERR_PTR(-ENOMEM);
-		goto out_unlock;
+	if (kn && kernfs_active(kn)) {
+		inode = kernfs_get_inode(dir->i_sb, kn);
+		if (!inode)
+			inode = ERR_PTR(-ENOMEM);
 	}
-
-	/* instantiate and hash dentry */
+	/* Needed only for negative dentry validation */
+	if (!inode)
+		kernfs_set_rev(parent, dentry);
+	/* instantiate and hash (possibly negative) dentry */
 	ret = d_splice_alias(inode, dentry);
- out_unlock:
 	mutex_unlock(&kernfs_mutex);
+
 	return ret;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5.10 3/5] kernfs: switch kernfs to use an rwsem
  2025-05-21  1:53 [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Qingfang Deng
  2025-05-21  1:53 ` [PATCH 5.10 1/5] kernfs: add a revision to identify directory node changes Qingfang Deng
  2025-05-21  1:53 ` [PATCH 5.10 2/5] kernfs: use VFS negative dentry caching Qingfang Deng
@ 2025-05-21  1:53 ` Qingfang Deng
  2025-05-22  2:03   ` Sasha Levin
  2025-05-21  1:53 ` [PATCH 5.10 4/5] kernfs: use i_lock to protect concurrent inode updates Qingfang Deng
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Qingfang Deng @ 2025-05-21  1:53 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman, Tejun Heo, linux-kernel
  Cc: Ian Kent, Miklos Szeredi

From: Ian Kent <raven@themaw.net>

Commit 7ba0273b2f34a55efe967d3c7381fb1da2ca195f upstream.

The kernfs global lock restricts the ability to perform kernfs node
lookup operations in parallel during path walks.

Change the kernfs mutex to an rwsem so that, when opportunity arises,
node searches can be done in parallel with path walk lookups.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Link: https://lore.kernel.org/r/162642770946.63632.2218304587223241374.stgit@web.messagingengine.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/kernfs/dir.c             | 100 ++++++++++++++++++------------------
 fs/kernfs/file.c            |   4 +-
 fs/kernfs/inode.c           |  16 +++---
 fs/kernfs/kernfs-internal.h |   5 +-
 fs/kernfs/mount.c           |  12 ++---
 fs/kernfs/symlink.c         |   4 +-
 include/linux/kernfs.h      |   2 +-
 7 files changed, 72 insertions(+), 71 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 1756f7e43aac..0443a9cd72a3 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -17,7 +17,7 @@
 
 #include "kernfs-internal.h"
 
-DEFINE_MUTEX(kernfs_mutex);
+DECLARE_RWSEM(kernfs_rwsem);
 static DEFINE_SPINLOCK(kernfs_rename_lock);	/* kn->parent and ->name */
 /*
  * Don't use rename_lock to piggy back on pr_cont_buf. We don't want to
@@ -34,7 +34,7 @@ static DEFINE_SPINLOCK(kernfs_idr_lock);	/* root->ino_idr */
 
 static bool kernfs_active(struct kernfs_node *kn)
 {
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held(&kernfs_rwsem);
 	return atomic_read(&kn->active) >= 0;
 }
 
@@ -348,7 +348,7 @@ static int kernfs_sd_compare(const struct kernfs_node *left,
  *	@kn->parent->dir.children.
  *
  *	Locking:
- *	mutex_lock(kernfs_mutex)
+ *	kernfs_rwsem held exclusive
  *
  *	RETURNS:
  *	0 on susccess -EEXIST on failure.
@@ -394,7 +394,7 @@ static int kernfs_link_sibling(struct kernfs_node *kn)
  *	removed, %false if @kn wasn't on the rbtree.
  *
  *	Locking:
- *	mutex_lock(kernfs_mutex)
+ *	kernfs_rwsem held exclusive
  */
 static bool kernfs_unlink_sibling(struct kernfs_node *kn)
 {
@@ -465,14 +465,14 @@ void kernfs_put_active(struct kernfs_node *kn)
  * return after draining is complete.
  */
 static void kernfs_drain(struct kernfs_node *kn)
-	__releases(&kernfs_mutex) __acquires(&kernfs_mutex)
+	__releases(&kernfs_rwsem) __acquires(&kernfs_rwsem)
 {
 	struct kernfs_root *root = kernfs_root(kn);
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held_write(&kernfs_rwsem);
 	WARN_ON_ONCE(kernfs_active(kn));
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	if (kernfs_lockdep(kn)) {
 		rwsem_acquire(&kn->dep_map, 0, 0, _RET_IP_);
@@ -491,7 +491,7 @@ static void kernfs_drain(struct kernfs_node *kn)
 
 	kernfs_drain_open_files(kn);
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 }
 
 /**
@@ -572,18 +572,18 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 		/* If the kernfs parent node has changed discard and
 		 * proceed to ->lookup.
 		 */
-		mutex_lock(&kernfs_mutex);
+		down_read(&kernfs_rwsem);
 		spin_lock(&dentry->d_lock);
 		parent = kernfs_dentry_node(dentry->d_parent);
 		if (parent) {
 			if (kernfs_dir_changed(parent, dentry)) {
 				spin_unlock(&dentry->d_lock);
-				mutex_unlock(&kernfs_mutex);
+				up_read(&kernfs_rwsem);
 				return 0;
 			}
 		}
 		spin_unlock(&dentry->d_lock);
-		mutex_unlock(&kernfs_mutex);
+		up_read(&kernfs_rwsem);
 
 		/* The kernfs parent node hasn't changed, leave the
 		 * dentry negative and return success.
@@ -592,7 +592,7 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	}
 
 	kn = kernfs_dentry_node(dentry);
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 
 	/* The kernfs node has been deactivated */
 	if (!kernfs_active(kn))
@@ -611,10 +611,10 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	    kernfs_info(dentry->d_sb)->ns != kn->ns)
 		goto out_bad;
 
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 	return 1;
 out_bad:
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 	return 0;
 }
 
@@ -809,7 +809,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 	bool has_ns;
 	int ret;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	ret = -EINVAL;
 	has_ns = kernfs_ns_enabled(parent);
@@ -840,7 +840,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 		ps_iattr->ia_mtime = ps_iattr->ia_ctime;
 	}
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	/*
 	 * Activate the new node unless CREATE_DEACTIVATED is requested.
@@ -854,7 +854,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 	return 0;
 
 out_unlock:
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return ret;
 }
 
@@ -875,7 +875,7 @@ static struct kernfs_node *kernfs_find_ns(struct kernfs_node *parent,
 	bool has_ns = kernfs_ns_enabled(parent);
 	unsigned int hash;
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held(&kernfs_rwsem);
 
 	if (has_ns != (bool)ns) {
 		WARN(1, KERN_WARNING "kernfs: ns %s in '%s' for '%s'\n",
@@ -907,7 +907,7 @@ static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
 	size_t len;
 	char *p, *name;
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held_read(&kernfs_rwsem);
 
 	spin_lock_irq(&kernfs_pr_cont_lock);
 
@@ -946,10 +946,10 @@ struct kernfs_node *kernfs_find_and_get_ns(struct kernfs_node *parent,
 {
 	struct kernfs_node *kn;
 
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 	kn = kernfs_find_ns(parent, name, ns);
 	kernfs_get(kn);
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 
 	return kn;
 }
@@ -970,10 +970,10 @@ struct kernfs_node *kernfs_walk_and_get_ns(struct kernfs_node *parent,
 {
 	struct kernfs_node *kn;
 
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 	kn = kernfs_walk_ns(parent, path, ns);
 	kernfs_get(kn);
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 
 	return kn;
 }
@@ -1128,7 +1128,7 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 	struct inode *inode = NULL;
 	const void *ns = NULL;
 
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dir->i_sb)->ns;
 
@@ -1144,7 +1144,7 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 		kernfs_set_rev(parent, dentry);
 	/* instantiate and hash (possibly negative) dentry */
 	ret = d_splice_alias(inode, dentry);
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 
 	return ret;
 }
@@ -1264,7 +1264,7 @@ static struct kernfs_node *kernfs_next_descendant_post(struct kernfs_node *pos,
 {
 	struct rb_node *rbn;
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held_write(&kernfs_rwsem);
 
 	/* if first iteration, visit leftmost descendant which may be root */
 	if (!pos)
@@ -1300,7 +1300,7 @@ void kernfs_activate(struct kernfs_node *kn)
 {
 	struct kernfs_node *pos;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	pos = NULL;
 	while ((pos = kernfs_next_descendant_post(pos, kn))) {
@@ -1314,14 +1314,14 @@ void kernfs_activate(struct kernfs_node *kn)
 		pos->flags |= KERNFS_ACTIVATED;
 	}
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 }
 
 static void __kernfs_remove(struct kernfs_node *kn)
 {
 	struct kernfs_node *pos;
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held_write(&kernfs_rwsem);
 
 	/*
 	 * Short-circuit if non-root @kn has already finished removal.
@@ -1344,7 +1344,7 @@ static void __kernfs_remove(struct kernfs_node *kn)
 		pos = kernfs_leftmost_descendant(kn);
 
 		/*
-		 * kernfs_drain() drops kernfs_mutex temporarily and @pos's
+		 * kernfs_drain() drops kernfs_rwsem temporarily and @pos's
 		 * base ref could have been put by someone else by the time
 		 * the function returns.  Make sure it doesn't go away
 		 * underneath us.
@@ -1391,9 +1391,9 @@ static void __kernfs_remove(struct kernfs_node *kn)
  */
 void kernfs_remove(struct kernfs_node *kn)
 {
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	__kernfs_remove(kn);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 }
 
 /**
@@ -1480,17 +1480,17 @@ bool kernfs_remove_self(struct kernfs_node *kn)
 {
 	bool ret;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	kernfs_break_active_protection(kn);
 
 	/*
 	 * SUICIDAL is used to arbitrate among competing invocations.  Only
 	 * the first one will actually perform removal.  When the removal
 	 * is complete, SUICIDED is set and the active ref is restored
-	 * while holding kernfs_mutex.  The ones which lost arbitration
-	 * waits for SUICDED && drained which can happen only after the
-	 * enclosing kernfs operation which executed the winning instance
-	 * of kernfs_remove_self() finished.
+	 * while kernfs_rwsem for held exclusive.  The ones which lost
+	 * arbitration waits for SUICIDED && drained which can happen only
+	 * after the enclosing kernfs operation which executed the winning
+	 * instance of kernfs_remove_self() finished.
 	 */
 	if (!(kn->flags & KERNFS_SUICIDAL)) {
 		kn->flags |= KERNFS_SUICIDAL;
@@ -1508,9 +1508,9 @@ bool kernfs_remove_self(struct kernfs_node *kn)
 			    atomic_read(&kn->active) == KN_DEACTIVATED_BIAS)
 				break;
 
-			mutex_unlock(&kernfs_mutex);
+			up_write(&kernfs_rwsem);
 			schedule();
-			mutex_lock(&kernfs_mutex);
+			down_write(&kernfs_rwsem);
 		}
 		finish_wait(waitq, &wait);
 		WARN_ON_ONCE(!RB_EMPTY_NODE(&kn->rb));
@@ -1518,12 +1518,12 @@ bool kernfs_remove_self(struct kernfs_node *kn)
 	}
 
 	/*
-	 * This must be done while holding kernfs_mutex; otherwise, waiting
-	 * for SUICIDED && deactivated could finish prematurely.
+	 * This must be done while kernfs_rwsem held exclusive; otherwise,
+	 * waiting for SUICIDED && deactivated could finish prematurely.
 	 */
 	kernfs_unbreak_active_protection(kn);
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return ret;
 }
 
@@ -1547,7 +1547,7 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
 		return -ENOENT;
 	}
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	kn = kernfs_find_ns(parent, name, ns);
 	if (kn) {
@@ -1556,7 +1556,7 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
 		kernfs_put(kn);
 	}
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	if (kn)
 		return 0;
@@ -1582,7 +1582,7 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 	if (!kn->parent)
 		return -EINVAL;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	error = -ENOENT;
 	if (!kernfs_active(kn) || !kernfs_active(new_parent) ||
@@ -1636,7 +1636,7 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 
 	error = 0;
  out:
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return error;
 }
 
@@ -1711,7 +1711,7 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx)
 
 	if (!dir_emit_dots(file, ctx))
 		return 0;
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dentry->d_sb)->ns;
@@ -1728,12 +1728,12 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx)
 		file->private_data = pos;
 		kernfs_get(pos);
 
-		mutex_unlock(&kernfs_mutex);
+		up_read(&kernfs_rwsem);
 		if (!dir_emit(ctx, name, len, ino, type))
 			return 0;
-		mutex_lock(&kernfs_mutex);
+		down_read(&kernfs_rwsem);
 	}
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 	file->private_data = NULL;
 	ctx->pos = INT_MAX;
 	return 0;
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index c75719312147..60e2a86c535e 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -860,7 +860,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 	spin_unlock_irq(&kernfs_notify_lock);
 
 	/* kick fsnotify */
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
 		struct kernfs_node *parent;
@@ -898,7 +898,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 		iput(inode);
 	}
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	kernfs_put(kn);
 	goto repeat;
 }
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index fc2469a20fed..ddaf18198935 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -106,9 +106,9 @@ int kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr)
 {
 	int ret;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	ret = __kernfs_setattr(kn, iattr);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return ret;
 }
 
@@ -121,7 +121,7 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
 	if (!kn)
 		return -EINVAL;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	error = setattr_prepare(dentry, iattr);
 	if (error)
 		goto out;
@@ -134,7 +134,7 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
 	setattr_copy(inode, iattr);
 
 out:
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return error;
 }
 
@@ -189,9 +189,9 @@ int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
 	struct inode *inode = d_inode(path->dentry);
 	struct kernfs_node *kn = inode->i_private;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	kernfs_refresh_inode(kn, inode);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	generic_fillattr(inode, stat);
 	return 0;
@@ -281,9 +281,9 @@ int kernfs_iop_permission(struct inode *inode, int mask)
 
 	kn = inode->i_private;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	kernfs_refresh_inode(kn, inode);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	return generic_permission(inode, mask);
 }
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 6a8d0ca26d03..c933d9bd8a78 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -13,6 +13,7 @@
 #include <linux/lockdep.h>
 #include <linux/fs.h>
 #include <linux/mutex.h>
+#include <linux/rwsem.h>
 #include <linux/xattr.h>
 
 #include <linux/kernfs.h>
@@ -69,7 +70,7 @@ struct kernfs_super_info {
 	 */
 	const void		*ns;
 
-	/* anchored at kernfs_root->supers, protected by kernfs_mutex */
+	/* anchored at kernfs_root->supers, protected by kernfs_rwsem */
 	struct list_head	node;
 };
 #define kernfs_info(SB) ((struct kernfs_super_info *)(SB->s_fs_info))
@@ -118,7 +119,7 @@ int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr);
 /*
  * dir.c
  */
-extern struct mutex kernfs_mutex;
+extern struct rw_semaphore kernfs_rwsem;
 extern const struct dentry_operations kernfs_dops;
 extern const struct file_operations kernfs_dir_fops;
 extern const struct inode_operations kernfs_dir_iops;
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 9dc7e7a64e10..baa4155ba2ed 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -255,9 +255,9 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
 	sb->s_shrink.seeks = 0;
 
 	/* get root inode, initialize and unlock it */
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	inode = kernfs_get_inode(sb, info->root->kn);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	if (!inode) {
 		pr_debug("kernfs: could not get root inode\n");
 		return -ENOMEM;
@@ -344,9 +344,9 @@ int kernfs_get_tree(struct fs_context *fc)
 		}
 		sb->s_flags |= SB_ACTIVE;
 
-		mutex_lock(&kernfs_mutex);
+		down_write(&kernfs_rwsem);
 		list_add(&info->node, &info->root->supers);
-		mutex_unlock(&kernfs_mutex);
+		up_write(&kernfs_rwsem);
 	}
 
 	fc->root = dget(sb->s_root);
@@ -372,9 +372,9 @@ void kernfs_kill_sb(struct super_block *sb)
 {
 	struct kernfs_super_info *info = kernfs_info(sb);
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	list_del(&info->node);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	/*
 	 * Remove the superblock from fs_supers/s_instances
diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c
index 5432883d819f..c8f8e41b8411 100644
--- a/fs/kernfs/symlink.c
+++ b/fs/kernfs/symlink.c
@@ -116,9 +116,9 @@ static int kernfs_getlink(struct inode *inode, char *path)
 	struct kernfs_node *target = kn->symlink.target_kn;
 	int error;
 
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 	error = kernfs_get_target_path(parent, target, path);
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 
 	return error;
 }
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 195afa63ab1c..95e1948379d0 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -193,7 +193,7 @@ struct kernfs_root {
 	u32			id_highbits;
 	struct kernfs_syscall_ops *syscall_ops;
 
-	/* list of kernfs_super_info of this root, protected by kernfs_mutex */
+	/* list of kernfs_super_info of this root, protected by kernfs_rwsem */
 	struct list_head	supers;
 
 	wait_queue_head_t	deactivate_waitq;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5.10 4/5] kernfs: use i_lock to protect concurrent inode updates
  2025-05-21  1:53 [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Qingfang Deng
                   ` (2 preceding siblings ...)
  2025-05-21  1:53 ` [PATCH 5.10 3/5] kernfs: switch kernfs to use an rwsem Qingfang Deng
@ 2025-05-21  1:53 ` Qingfang Deng
  2025-05-22  2:07   ` Sasha Levin
  2025-05-21  1:53 ` [PATCH 5.10 5/5] kernfs: dont call d_splice_alias() under kernfs node lock Qingfang Deng
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Qingfang Deng @ 2025-05-21  1:53 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman, Tejun Heo, linux-kernel
  Cc: Ian Kent, Miklos Szeredi

From: Ian Kent <raven@themaw.net>

Commit 47b5c64d0ab5e7136db2b78c6ec710e0d8a5a36b upstream.

The inode operations .permission() and .getattr() use the kernfs node
write lock but all that's needed is the read lock to protect against
partial updates of these kernfs node fields which are all done under
the write lock.

And .permission() is called frequently during path walks and can cause
quite a bit of contention between kernfs node operations and path
walks when the number of concurrent walks is high.

To change kernfs_iop_getattr() and kernfs_iop_permission() to take
the rw sem read lock instead of the write lock an additional lock is
needed to protect against multiple processes concurrently updating
the inode attributes and link count in kernfs_refresh_inode().

The inode i_lock seems like the sensible thing to use to protect these
inode attribute updates so use it in kernfs_refresh_inode().

The last hunk in the patch, applied to kernfs_fill_super(), is possibly
not needed but taking the lock was present originally. I prefer to
continue to take it to protect against a partial update of the source
kernfs fields during the call to kernfs_refresh_inode() made by
kernfs_get_inode().

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Link: https://lore.kernel.org/r/162642771474.63632.16295959115893904470.stgit@web.messagingengine.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/kernfs/inode.c | 18 ++++++++++++------
 fs/kernfs/mount.c |  4 ++--
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index ddaf18198935..73d7d4a24c51 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -189,11 +189,13 @@ int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
 	struct inode *inode = d_inode(path->dentry);
 	struct kernfs_node *kn = inode->i_private;
 
-	down_write(&kernfs_rwsem);
+	down_read(&kernfs_rwsem);
+	spin_lock(&inode->i_lock);
 	kernfs_refresh_inode(kn, inode);
-	up_write(&kernfs_rwsem);
-
 	generic_fillattr(inode, stat);
+	spin_unlock(&inode->i_lock);
+	up_read(&kernfs_rwsem);
+
 	return 0;
 }
 
@@ -275,17 +277,21 @@ void kernfs_evict_inode(struct inode *inode)
 int kernfs_iop_permission(struct inode *inode, int mask)
 {
 	struct kernfs_node *kn;
+	int ret;
 
 	if (mask & MAY_NOT_BLOCK)
 		return -ECHILD;
 
 	kn = inode->i_private;
 
-	down_write(&kernfs_rwsem);
+	down_read(&kernfs_rwsem);
+	spin_lock(&inode->i_lock);
 	kernfs_refresh_inode(kn, inode);
-	up_write(&kernfs_rwsem);
+	ret = generic_permission(inode, mask);
+	spin_unlock(&inode->i_lock);
+	up_read(&kernfs_rwsem);
 
-	return generic_permission(inode, mask);
+	return ret;
 }
 
 int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index baa4155ba2ed..f2f909d09f52 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -255,9 +255,9 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
 	sb->s_shrink.seeks = 0;
 
 	/* get root inode, initialize and unlock it */
-	down_write(&kernfs_rwsem);
+	down_read(&kernfs_rwsem);
 	inode = kernfs_get_inode(sb, info->root->kn);
-	up_write(&kernfs_rwsem);
+	up_read(&kernfs_rwsem);
 	if (!inode) {
 		pr_debug("kernfs: could not get root inode\n");
 		return -ENOMEM;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5.10 5/5] kernfs: dont call d_splice_alias() under kernfs node lock
  2025-05-21  1:53 [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Qingfang Deng
                   ` (3 preceding siblings ...)
  2025-05-21  1:53 ` [PATCH 5.10 4/5] kernfs: use i_lock to protect concurrent inode updates Qingfang Deng
@ 2025-05-21  1:53 ` Qingfang Deng
  2025-05-22  2:08   ` Sasha Levin
  2025-05-21  4:15 ` [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Greg Kroah-Hartman
  2025-05-21  5:35 ` Ian Kent
  6 siblings, 1 reply; 15+ messages in thread
From: Qingfang Deng @ 2025-05-21  1:53 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman, Tejun Heo, linux-kernel
  Cc: Ian Kent, Miklos Szeredi

From: Ian Kent <raven@themaw.net>

Commit df6192f47d2311cf40cd4321cc59863a5853b665 upstream.

The call to d_splice_alias() in kernfs_iop_lookup() doesn't depend on
any kernfs node so there's no reason to hold the kernfs node lock when
calling it.

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Link: https://lore.kernel.org/r/162642772000.63632.10672683419693513226.stgit@web.messagingengine.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/kernfs/dir.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 0443a9cd72a3..f301605d70b0 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1122,7 +1122,6 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 					struct dentry *dentry,
 					unsigned int flags)
 {
-	struct dentry *ret;
 	struct kernfs_node *parent = dir->i_private;
 	struct kernfs_node *kn;
 	struct inode *inode = NULL;
@@ -1142,11 +1141,10 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 	/* Needed only for negative dentry validation */
 	if (!inode)
 		kernfs_set_rev(parent, dentry);
-	/* instantiate and hash (possibly negative) dentry */
-	ret = d_splice_alias(inode, dentry);
 	up_read(&kernfs_rwsem);
 
-	return ret;
+	/* instantiate and hash (possibly negative) dentry */
+	return d_splice_alias(inode, dentry);
 }
 
 static int kernfs_iop_mkdir(struct inode *dir, struct dentry *dentry,
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement
  2025-05-21  1:53 [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Qingfang Deng
                   ` (4 preceding siblings ...)
  2025-05-21  1:53 ` [PATCH 5.10 5/5] kernfs: dont call d_splice_alias() under kernfs node lock Qingfang Deng
@ 2025-05-21  4:15 ` Greg Kroah-Hartman
  2025-05-21  5:35 ` Ian Kent
  6 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-05-21  4:15 UTC (permalink / raw)
  To: Qingfang Deng; +Cc: stable, Tejun Heo, linux-kernel, Ian Kent

On Wed, May 21, 2025 at 09:53:30AM +0800, Qingfang Deng wrote:
> KCSAN reports concurrent accesses to inode->i_mode:

<snip>

Yes, but can you actually trigger this issue with a real workload?  How
were these tested in the 5.10 tree, and if this is a problem, can you
just move to a newer release instead?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5.10 1/5] kernfs: add a revision to identify directory node changes
  2025-05-21  1:53 ` [PATCH 5.10 1/5] kernfs: add a revision to identify directory node changes Qingfang Deng
@ 2025-05-21  4:16   ` Greg Kroah-Hartman
  2025-05-22  2:07   ` Sasha Levin
  1 sibling, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-05-21  4:16 UTC (permalink / raw)
  To: Qingfang Deng; +Cc: stable, Tejun Heo, linux-kernel, Ian Kent, Miklos Szeredi

On Wed, May 21, 2025 at 09:53:31AM +0800, Qingfang Deng wrote:
> From: Ian Kent <raven@themaw.net>
> 
> Commit 895adbec302e92086359e6fd92611ac3be6d92c3 upstream.
> 
> Add a revision counter to kernfs directory nodes so it can be used
> to detect if a directory node has changed during negative dentry
> revalidation.
> 
> There's an assumption that sizeof(unsigned long) <= sizeof(pointer)
> on all architectures and as far as I know that assumption holds.
> 
> So adding a revision counter to the struct kernfs_elem_dir variant of
> the kernfs_node type union won't increase the size of the kernfs_node
> struct. This is because struct kernfs_elem_dir is at least
> sizeof(pointer) smaller than the largest union variant. It's tempting
> to make the revision counter a u64 but that would increase the size of
> kernfs_node on archs where sizeof(pointer) is smaller than the revision
> counter.
> 
> Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
> Signed-off-by: Ian Kent <raven@themaw.net>
> Link: https://lore.kernel.org/r/162642769895.63632.8356662784964509867.stgit@web.messagingengine.com
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

When forwarding on patches, you HAVE to sign off on them as well.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement
  2025-05-21  1:53 [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Qingfang Deng
                   ` (5 preceding siblings ...)
  2025-05-21  4:15 ` [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Greg Kroah-Hartman
@ 2025-05-21  5:35 ` Ian Kent
  2025-05-21  6:09   ` Ian Kent
  6 siblings, 1 reply; 15+ messages in thread
From: Ian Kent @ 2025-05-21  5:35 UTC (permalink / raw)
  To: Qingfang Deng, stable, Greg Kroah-Hartman, Tejun Heo,
	linux-kernel

On 21/5/25 09:53, Qingfang Deng wrote:
> KCSAN reports concurrent accesses to inode->i_mode:
>
> ==================================================================
> BUG: KCSAN: data-race in generic_permission / kernfs_iop_permission
>
> write to 0xffffffe001129590 of 2 bytes by task 2477 on cpu 1:
>   kernfs_iop_permission+0x72/0x1a0
>   link_path_walk.part.0.constprop.0+0x348/0x420
>   path_openat+0xee/0x10f0
>   do_filp_open+0xaa/0x160
>   do_sys_openat2+0x252/0x380
>   sys_openat+0x4c/0xa0
>   ret_from_syscall+0x0/0x2
>
> read to 0xffffffe001129590 of 2 bytes by task 3902 on cpu 3:
>   generic_permission+0x26/0x120
>   kernfs_iop_permission+0x150/0x1a0
>   link_path_walk.part.0.constprop.0+0x348/0x420
>   path_lookupat+0x58/0x280
>   filename_lookup+0xae/0x1f0
>   user_path_at_empty+0x3a/0x70
>   vfs_statx+0x82/0x170
>   __do_sys_newfstatat+0x36/0x70
>   sys_newfstatat+0x2e/0x50
>   ret_from_syscall+0x0/0x2
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 3 PID: 3902 Comm: ls Not tainted 5.10.104+ #0
> ==================================================================

It's been soo long since this was merged.

I seem to vaguely remember something along these lines and after 
analyzing it

came to the conclusion it was a false positive.

Let me think about it for a while and see if I can remember the reasoning.


Ian

>
> kernfs_iop_permission+0x72/0x1a0:
>
> kernfs_refresh_inode at fs/kernfs/inode.c:174
>   169 	
>   170 	static void kernfs_refresh_inode(struct kernfs_node *kn, struct inode *inode)
>   171 	{
>   172 		struct kernfs_iattrs *attrs = kn->iattr;
>   173 	
>> 174<		inode->i_mode = kn->mode;
>   175 		if (attrs)
>   176 			/*
>   177 			 * kernfs_node has non-default attributes get them from
>   178 			 * persistent copy in kernfs_node.
>   179 			 */
>
> (inlined by) kernfs_iop_permission at fs/kernfs/inode.c:285
>   280 			return -ECHILD;
>   281 	
>   282 		kn = inode->i_private;
>   283 	
>   284 		mutex_lock(&kernfs_mutex);
>> 285<		kernfs_refresh_inode(kn, inode);
>   286 		mutex_unlock(&kernfs_mutex);
>   287 	
>   288 		return generic_permission(inode, mask);
>   289 	}
>   290 	
>
> generic_permission+0x26/0x120:
>
> acl_permission_check at fs/namei.c:298
>   293 	 * Note that the POSIX ACL check cares about the MAY_NOT_BLOCK bit,
>   294 	 * for RCU walking.
>   295 	 */
>   296 	static int acl_permission_check(struct inode *inode, int mask)
>   297 	{
>> 298<		unsigned int mode = inode->i_mode;
>   299 	
>   300 		/* Are we the owner? If so, ACL's don't matter */
>   301 		if (likely(uid_eq(current_fsuid(), inode->i_uid))) {
>   302 			mask &= 7;
>   303 			mode >>= 6;
>
> (inlined by) generic_permission at fs/namei.c:353
>   348 		int ret;
>   349 	
>   350 		/*
>   351 		 * Do the basic permission checks.
>   352 		 */
>> 353<		ret = acl_permission_check(inode, mask);
>   354 		if (ret != -EACCES)
>   355 			return ret;
>   356 	
>   357 		if (S_ISDIR(inode->i_mode)) {
>   358 			/* DACs are overridable for directories */
>
> Backport the series from 5.15 to fix the concurrency bug.
> https://lore.kernel.org/all/162642752894.63632.5596341704463755308.stgit@web.messagingengine.com
>
> Ian Kent (5):
>    kernfs: add a revision to identify directory node changes
>    kernfs: use VFS negative dentry caching
>    kernfs: switch kernfs to use an rwsem
>    kernfs: use i_lock to protect concurrent inode updates
>    kernfs: dont call d_splice_alias() under kernfs node lock
>
>   fs/kernfs/dir.c             | 153 ++++++++++++++++++++----------------
>   fs/kernfs/file.c            |   4 +-
>   fs/kernfs/inode.c           |  26 +++---
>   fs/kernfs/kernfs-internal.h |  24 +++++-
>   fs/kernfs/mount.c           |  12 +--
>   fs/kernfs/symlink.c         |   4 +-
>   include/linux/kernfs.h      |   7 +-
>   7 files changed, 138 insertions(+), 92 deletions(-)
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement
  2025-05-21  5:35 ` Ian Kent
@ 2025-05-21  6:09   ` Ian Kent
  0 siblings, 0 replies; 15+ messages in thread
From: Ian Kent @ 2025-05-21  6:09 UTC (permalink / raw)
  To: Qingfang Deng, stable, Greg Kroah-Hartman, Tejun Heo,
	linux-kernel

On 21/5/25 13:35, Ian Kent wrote:
> On 21/5/25 09:53, Qingfang Deng wrote:
>> KCSAN reports concurrent accesses to inode->i_mode:
>>
>> ==================================================================
>> BUG: KCSAN: data-race in generic_permission / kernfs_iop_permission
>>
>> write to 0xffffffe001129590 of 2 bytes by task 2477 on cpu 1:
>>   kernfs_iop_permission+0x72/0x1a0
>>   link_path_walk.part.0.constprop.0+0x348/0x420
>>   path_openat+0xee/0x10f0
>>   do_filp_open+0xaa/0x160
>>   do_sys_openat2+0x252/0x380
>>   sys_openat+0x4c/0xa0
>>   ret_from_syscall+0x0/0x2
>>
>> read to 0xffffffe001129590 of 2 bytes by task 3902 on cpu 3:
>>   generic_permission+0x26/0x120
>>   kernfs_iop_permission+0x150/0x1a0
>>   link_path_walk.part.0.constprop.0+0x348/0x420
>>   path_lookupat+0x58/0x280
>>   filename_lookup+0xae/0x1f0
>>   user_path_at_empty+0x3a/0x70
>>   vfs_statx+0x82/0x170
>>   __do_sys_newfstatat+0x36/0x70
>>   sys_newfstatat+0x2e/0x50
>>   ret_from_syscall+0x0/0x2
>>
>> Reported by Kernel Concurrency Sanitizer on:
>> CPU: 3 PID: 3902 Comm: ls Not tainted 5.10.104+ #0
>> ==================================================================
>
> It's been soo long since this was merged.
>
> I seem to vaguely remember something along these lines and after 
> analyzing it
>
> came to the conclusion it was a false positive.
>
> Let me think about it for a while and see if I can remember the reasoning.

Ok, IIRC, so my thinking was that mode is actually stored in the node 
->mode and is

always updated while holding the write lock and copying the same value 
from ->mode

in multiple concurrent threads wouldn't lead to corruption of inode->mode.


>
>
>
> Ian
>
>>
>> kernfs_iop_permission+0x72/0x1a0:
>>
>> kernfs_refresh_inode at fs/kernfs/inode.c:174
>>   169
>>   170     static void kernfs_refresh_inode(struct kernfs_node *kn, 
>> struct inode *inode)
>>   171     {
>>   172         struct kernfs_iattrs *attrs = kn->iattr;
>>   173
>>> 174<        inode->i_mode = kn->mode;
>>   175         if (attrs)
>>   176             /*
>>   177              * kernfs_node has non-default attributes get them 
>> from
>>   178              * persistent copy in kernfs_node.
>>   179              */
>>
>> (inlined by) kernfs_iop_permission at fs/kernfs/inode.c:285
>>   280             return -ECHILD;
>>   281
>>   282         kn = inode->i_private;
>>   283
>>   284         mutex_lock(&kernfs_mutex);
>>> 285<        kernfs_refresh_inode(kn, inode);
>>   286         mutex_unlock(&kernfs_mutex);
>>   287
>>   288         return generic_permission(inode, mask);
>>   289     }
>>   290
>>
>> generic_permission+0x26/0x120:
>>
>> acl_permission_check at fs/namei.c:298
>>   293      * Note that the POSIX ACL check cares about the 
>> MAY_NOT_BLOCK bit,
>>   294      * for RCU walking.
>>   295      */
>>   296     static int acl_permission_check(struct inode *inode, int mask)
>>   297     {
>>> 298<        unsigned int mode = inode->i_mode;
>>   299
>>   300         /* Are we the owner? If so, ACL's don't matter */
>>   301         if (likely(uid_eq(current_fsuid(), inode->i_uid))) {
>>   302             mask &= 7;
>>   303             mode >>= 6;
>>
>> (inlined by) generic_permission at fs/namei.c:353
>>   348         int ret;
>>   349
>>   350         /*
>>   351          * Do the basic permission checks.
>>   352          */
>>> 353<        ret = acl_permission_check(inode, mask);
>>   354         if (ret != -EACCES)
>>   355             return ret;
>>   356
>>   357         if (S_ISDIR(inode->i_mode)) {
>>   358             /* DACs are overridable for directories */
>>
>> Backport the series from 5.15 to fix the concurrency bug.
>> https://lore.kernel.org/all/162642752894.63632.5596341704463755308.stgit@web.messagingengine.com 
>>
>>
>> Ian Kent (5):
>>    kernfs: add a revision to identify directory node changes
>>    kernfs: use VFS negative dentry caching
>>    kernfs: switch kernfs to use an rwsem
>>    kernfs: use i_lock to protect concurrent inode updates
>>    kernfs: dont call d_splice_alias() under kernfs node lock
>>
>>   fs/kernfs/dir.c             | 153 ++++++++++++++++++++----------------
>>   fs/kernfs/file.c            |   4 +-
>>   fs/kernfs/inode.c           |  26 +++---
>>   fs/kernfs/kernfs-internal.h |  24 +++++-
>>   fs/kernfs/mount.c           |  12 +--
>>   fs/kernfs/symlink.c         |   4 +-
>>   include/linux/kernfs.h      |   7 +-
>>   7 files changed, 138 insertions(+), 92 deletions(-)
>>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5.10 3/5] kernfs: switch kernfs to use an rwsem
  2025-05-21  1:53 ` [PATCH 5.10 3/5] kernfs: switch kernfs to use an rwsem Qingfang Deng
@ 2025-05-22  2:03   ` Sasha Levin
  0 siblings, 0 replies; 15+ messages in thread
From: Sasha Levin @ 2025-05-22  2:03 UTC (permalink / raw)
  To: stable; +Cc: Qingfang Deng, Sasha Levin

[ Sasha's backport helper bot ]

Hi,

✅ All tests passed successfully. No issues detected.
No action required from the submitter.

The upstream commit SHA1 provided is correct: 7ba0273b2f34a55efe967d3c7381fb1da2ca195f

WARNING: Author mismatch between patch and upstream commit:
Backport author: Qingfang Deng<dqfext@gmail.com>
Commit author: Ian Kent<raven@themaw.net>

Status in newer kernel trees:
6.14.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (exact SHA1)
6.1.y | Present (exact SHA1)
5.15.y | Present (exact SHA1)

Note: The patch differs from the upstream commit:
---
1:  7ba0273b2f34a ! 1:  a0418b6d194c8 kernfs: switch kernfs to use an rwsem
    @@ Metadata
      ## Commit message ##
         kernfs: switch kernfs to use an rwsem
     
    +    Commit 7ba0273b2f34a55efe967d3c7381fb1da2ca195f upstream.
    +
         The kernfs global lock restricts the ability to perform kernfs node
         lookup operations in parallel during path walks.
     
    @@ fs/kernfs/dir.c
     -DEFINE_MUTEX(kernfs_mutex);
     +DECLARE_RWSEM(kernfs_rwsem);
      static DEFINE_SPINLOCK(kernfs_rename_lock);	/* kn->parent and ->name */
    - static char kernfs_pr_cont_buf[PATH_MAX];	/* protected by rename_lock */
    - static DEFINE_SPINLOCK(kernfs_idr_lock);	/* root->ino_idr */
    + /*
    +  * Don't use rename_lock to piggy back on pr_cont_buf. We don't want to
     @@ fs/kernfs/dir.c: static DEFINE_SPINLOCK(kernfs_idr_lock);	/* root->ino_idr */
      
      static bool kernfs_active(struct kernfs_node *kn)
    @@ fs/kernfs/dir.c: static void kernfs_drain(struct kernfs_node *kn)
      }
      
      /**
    +@@ fs/kernfs/dir.c: static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
    + 		/* If the kernfs parent node has changed discard and
    + 		 * proceed to ->lookup.
    + 		 */
    +-		mutex_lock(&kernfs_mutex);
    ++		down_read(&kernfs_rwsem);
    + 		spin_lock(&dentry->d_lock);
    + 		parent = kernfs_dentry_node(dentry->d_parent);
    + 		if (parent) {
    + 			if (kernfs_dir_changed(parent, dentry)) {
    + 				spin_unlock(&dentry->d_lock);
    +-				mutex_unlock(&kernfs_mutex);
    ++				up_read(&kernfs_rwsem);
    + 				return 0;
    + 			}
    + 		}
    + 		spin_unlock(&dentry->d_lock);
    +-		mutex_unlock(&kernfs_mutex);
    ++		up_read(&kernfs_rwsem);
    + 
    + 		/* The kernfs parent node hasn't changed, leave the
    + 		 * dentry negative and return success.
    +@@ fs/kernfs/dir.c: static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
    + 	}
    + 
    + 	kn = kernfs_dentry_node(dentry);
    +-	mutex_lock(&kernfs_mutex);
    ++	down_read(&kernfs_rwsem);
    + 
    + 	/* The kernfs node has been deactivated */
    + 	if (!kernfs_active(kn))
    +@@ fs/kernfs/dir.c: static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
    + 	    kernfs_info(dentry->d_sb)->ns != kn->ns)
    + 		goto out_bad;
    + 
    +-	mutex_unlock(&kernfs_mutex);
    ++	up_read(&kernfs_rwsem);
    + 	return 1;
    + out_bad:
    +-	mutex_unlock(&kernfs_mutex);
    ++	up_read(&kernfs_rwsem);
    + 	return 0;
    + }
    + 
     @@ fs/kernfs/dir.c: int kernfs_add_one(struct kernfs_node *kn)
      	bool has_ns;
      	int ret;
    @@ fs/kernfs/dir.c: static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *p
     -	lockdep_assert_held(&kernfs_mutex);
     +	lockdep_assert_held_read(&kernfs_rwsem);
      
    - 	/* grab kernfs_rename_lock to piggy back on kernfs_pr_cont_buf */
    - 	spin_lock_irq(&kernfs_rename_lock);
    + 	spin_lock_irq(&kernfs_pr_cont_lock);
    + 
     @@ fs/kernfs/dir.c: struct kernfs_node *kernfs_find_and_get_ns(struct kernfs_node *parent,
      {
      	struct kernfs_node *kn;
    @@ fs/kernfs/dir.c: struct kernfs_node *kernfs_walk_and_get_ns(struct kernfs_node *
      
      	return kn;
      }
    -@@ fs/kernfs/dir.c: static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
    - 		/* If the kernfs parent node has changed discard and
    - 		 * proceed to ->lookup.
    - 		 */
    --		mutex_lock(&kernfs_mutex);
    -+		down_read(&kernfs_rwsem);
    - 		spin_lock(&dentry->d_lock);
    - 		parent = kernfs_dentry_node(dentry->d_parent);
    - 		if (parent) {
    - 			if (kernfs_dir_changed(parent, dentry)) {
    - 				spin_unlock(&dentry->d_lock);
    --				mutex_unlock(&kernfs_mutex);
    -+				up_read(&kernfs_rwsem);
    - 				return 0;
    - 			}
    - 		}
    - 		spin_unlock(&dentry->d_lock);
    --		mutex_unlock(&kernfs_mutex);
    -+		up_read(&kernfs_rwsem);
    - 
    - 		/* The kernfs parent node hasn't changed, leave the
    - 		 * dentry negative and return success.
    -@@ fs/kernfs/dir.c: static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
    - 	}
    - 
    - 	kn = kernfs_dentry_node(dentry);
    --	mutex_lock(&kernfs_mutex);
    -+	down_read(&kernfs_rwsem);
    - 
    - 	/* The kernfs node has been deactivated */
    - 	if (!kernfs_active(kn))
    -@@ fs/kernfs/dir.c: static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
    - 	    kernfs_info(dentry->d_sb)->ns != kn->ns)
    - 		goto out_bad;
    - 
    --	mutex_unlock(&kernfs_mutex);
    -+	up_read(&kernfs_rwsem);
    - 	return 1;
    - out_bad:
    --	mutex_unlock(&kernfs_mutex);
    -+	up_read(&kernfs_rwsem);
    - 	return 0;
    - }
    - 
     @@ fs/kernfs/dir.c: static struct dentry *kernfs_iop_lookup(struct inode *dir,
      	struct inode *inode = NULL;
      	const void *ns = NULL;
    @@ fs/kernfs/dir.c: int kernfs_remove_by_name_ns(struct kernfs_node *parent, const
     +	down_write(&kernfs_rwsem);
      
      	kn = kernfs_find_ns(parent, name, ns);
    - 	if (kn)
    - 		__kernfs_remove(kn);
    + 	if (kn) {
    +@@ fs/kernfs/dir.c: int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
    + 		kernfs_put(kn);
    + 	}
      
     -	mutex_unlock(&kernfs_mutex);
     +	up_write(&kernfs_rwsem);
    @@ fs/kernfs/inode.c: int kernfs_setattr(struct kernfs_node *kn, const struct iattr
      	return ret;
      }
      
    -@@ fs/kernfs/inode.c: int kernfs_iop_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
    +@@ fs/kernfs/inode.c: int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
      	if (!kn)
      		return -EINVAL;
      
     -	mutex_lock(&kernfs_mutex);
     +	down_write(&kernfs_rwsem);
    - 	error = setattr_prepare(&init_user_ns, dentry, iattr);
    + 	error = setattr_prepare(dentry, iattr);
      	if (error)
      		goto out;
    -@@ fs/kernfs/inode.c: int kernfs_iop_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
    - 	setattr_copy(&init_user_ns, inode, iattr);
    +@@ fs/kernfs/inode.c: int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
    + 	setattr_copy(inode, iattr);
      
      out:
     -	mutex_unlock(&kernfs_mutex);
    @@ fs/kernfs/inode.c: int kernfs_iop_setattr(struct user_namespace *mnt_userns, str
      	return error;
      }
      
    -@@ fs/kernfs/inode.c: int kernfs_iop_getattr(struct user_namespace *mnt_userns,
    +@@ fs/kernfs/inode.c: int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
      	struct inode *inode = d_inode(path->dentry);
      	struct kernfs_node *kn = inode->i_private;
      
    @@ fs/kernfs/inode.c: int kernfs_iop_getattr(struct user_namespace *mnt_userns,
     -	mutex_unlock(&kernfs_mutex);
     +	up_write(&kernfs_rwsem);
      
    - 	generic_fillattr(&init_user_ns, inode, stat);
    + 	generic_fillattr(inode, stat);
      	return 0;
    -@@ fs/kernfs/inode.c: int kernfs_iop_permission(struct user_namespace *mnt_userns,
    +@@ fs/kernfs/inode.c: int kernfs_iop_permission(struct inode *inode, int mask)
      
      	kn = inode->i_private;
      
    @@ fs/kernfs/inode.c: int kernfs_iop_permission(struct user_namespace *mnt_userns,
     -	mutex_unlock(&kernfs_mutex);
     +	up_write(&kernfs_rwsem);
      
    - 	return generic_permission(&init_user_ns, inode, mask);
    + 	return generic_permission(inode, mask);
      }
     
      ## fs/kernfs/kernfs-internal.h ##
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-5.15.y       |  Success    |  Success   |

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5.10 2/5] kernfs: use VFS negative dentry caching
  2025-05-21  1:53 ` [PATCH 5.10 2/5] kernfs: use VFS negative dentry caching Qingfang Deng
@ 2025-05-22  2:05   ` Sasha Levin
  0 siblings, 0 replies; 15+ messages in thread
From: Sasha Levin @ 2025-05-22  2:05 UTC (permalink / raw)
  To: stable, dqfext; +Cc: Sasha Levin

[ Sasha's backport helper bot ]

Hi,

Summary of potential issues:
ℹ️ This is part 2/5 of a series
⚠️ Found follow-up fixes in mainline

The upstream commit SHA1 provided is correct: c7e7c04274b13f98f758fb69b03f2ab61976ea80

WARNING: Author mismatch between patch and upstream commit:
Backport author: Qingfang Deng<dqfext@gmail.com>
Commit author: Ian Kent<raven@themaw.net>

Status in newer kernel trees:
6.14.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (exact SHA1)
6.1.y | Present (exact SHA1)
5.15.y | Present (exact SHA1)

Found fixes commits:
410d591a1954 kernfs: don't create a negative dentry if inactive node exists
df38d852c681 kernfs: also call kernfs_set_rev() for positive dentry

Note: The patch differs from the upstream commit:
---
1:  c7e7c04274b13 ! 1:  6ca146f13dc9d kernfs: use VFS negative dentry caching
    @@ Metadata
      ## Commit message ##
         kernfs: use VFS negative dentry caching
     
    +    Commit c7e7c04274b13f98f758fb69b03f2ab61976ea80 upstream.
    +
         If there are many lookups for non-existent paths these negative lookups
         can lead to a lot of overhead during path walks.
     
---

NOTE: These results are for this patch alone. Full series testing will be
performed when all parts are received.

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-5.15.y       |  Success    |  Success   |

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5.10 1/5] kernfs: add a revision to identify directory node changes
  2025-05-21  1:53 ` [PATCH 5.10 1/5] kernfs: add a revision to identify directory node changes Qingfang Deng
  2025-05-21  4:16   ` Greg Kroah-Hartman
@ 2025-05-22  2:07   ` Sasha Levin
  1 sibling, 0 replies; 15+ messages in thread
From: Sasha Levin @ 2025-05-22  2:07 UTC (permalink / raw)
  To: stable; +Cc: Qingfang Deng, Sasha Levin

[ Sasha's backport helper bot ]

Hi,

✅ All tests passed successfully. No issues detected.
No action required from the submitter.

The upstream commit SHA1 provided is correct: 895adbec302e92086359e6fd92611ac3be6d92c3

WARNING: Author mismatch between patch and upstream commit:
Backport author: Qingfang Deng<dqfext@gmail.com>
Commit author: Ian Kent<raven@themaw.net>

Status in newer kernel trees:
6.14.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (exact SHA1)
6.1.y | Present (exact SHA1)
5.15.y | Present (exact SHA1)

Note: The patch differs from the upstream commit:
---
1:  895adbec302e9 ! 1:  384581c600ad7 kernfs: add a revision to identify directory node changes
    @@ Metadata
      ## Commit message ##
         kernfs: add a revision to identify directory node changes
     
    +    Commit 895adbec302e92086359e6fd92611ac3be6d92c3 upstream.
    +
         Add a revision counter to kernfs directory nodes so it can be used
         to detect if a directory node has changed during negative dentry
         revalidation.
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-5.10.y       |  Success    |  Success   |

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5.10 4/5] kernfs: use i_lock to protect concurrent inode updates
  2025-05-21  1:53 ` [PATCH 5.10 4/5] kernfs: use i_lock to protect concurrent inode updates Qingfang Deng
@ 2025-05-22  2:07   ` Sasha Levin
  0 siblings, 0 replies; 15+ messages in thread
From: Sasha Levin @ 2025-05-22  2:07 UTC (permalink / raw)
  To: stable; +Cc: Qingfang Deng, Sasha Levin

[ Sasha's backport helper bot ]

Hi,

✅ All tests passed successfully. No issues detected.
No action required from the submitter.

The upstream commit SHA1 provided is correct: 47b5c64d0ab5e7136db2b78c6ec710e0d8a5a36b

WARNING: Author mismatch between patch and upstream commit:
Backport author: Qingfang Deng<dqfext@gmail.com>
Commit author: Ian Kent<raven@themaw.net>

Status in newer kernel trees:
6.14.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (exact SHA1)
6.1.y | Present (exact SHA1)
5.15.y | Present (exact SHA1)

Note: The patch differs from the upstream commit:
---
1:  47b5c64d0ab5e ! 1:  3ab6e3573bab5 kernfs: use i_lock to protect concurrent inode updates
    @@ Metadata
      ## Commit message ##
         kernfs: use i_lock to protect concurrent inode updates
     
    +    Commit 47b5c64d0ab5e7136db2b78c6ec710e0d8a5a36b upstream.
    +
         The inode operations .permission() and .getattr() use the kernfs node
         write lock but all that's needed is the read lock to protect against
         partial updates of these kernfs node fields which are all done under
    @@ Commit message
         Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
     
      ## fs/kernfs/inode.c ##
    -@@ fs/kernfs/inode.c: int kernfs_iop_getattr(struct user_namespace *mnt_userns,
    +@@ fs/kernfs/inode.c: int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
      	struct inode *inode = d_inode(path->dentry);
      	struct kernfs_node *kn = inode->i_private;
      
    @@ fs/kernfs/inode.c: int kernfs_iop_getattr(struct user_namespace *mnt_userns,
      	kernfs_refresh_inode(kn, inode);
     -	up_write(&kernfs_rwsem);
     -
    - 	generic_fillattr(&init_user_ns, inode, stat);
    + 	generic_fillattr(inode, stat);
     +	spin_unlock(&inode->i_lock);
     +	up_read(&kernfs_rwsem);
     +
      	return 0;
      }
      
    -@@ fs/kernfs/inode.c: int kernfs_iop_permission(struct user_namespace *mnt_userns,
    - 			  struct inode *inode, int mask)
    +@@ fs/kernfs/inode.c: void kernfs_evict_inode(struct inode *inode)
    + int kernfs_iop_permission(struct inode *inode, int mask)
      {
      	struct kernfs_node *kn;
     +	int ret;
    @@ fs/kernfs/inode.c: int kernfs_iop_permission(struct user_namespace *mnt_userns,
     +	spin_lock(&inode->i_lock);
      	kernfs_refresh_inode(kn, inode);
     -	up_write(&kernfs_rwsem);
    -+	ret = generic_permission(&init_user_ns, inode, mask);
    ++	ret = generic_permission(inode, mask);
     +	spin_unlock(&inode->i_lock);
     +	up_read(&kernfs_rwsem);
      
    --	return generic_permission(&init_user_ns, inode, mask);
    +-	return generic_permission(inode, mask);
     +	return ret;
      }
      
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-5.15.y       |  Success    |  Success   |

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5.10 5/5] kernfs: dont call d_splice_alias() under kernfs node lock
  2025-05-21  1:53 ` [PATCH 5.10 5/5] kernfs: dont call d_splice_alias() under kernfs node lock Qingfang Deng
@ 2025-05-22  2:08   ` Sasha Levin
  0 siblings, 0 replies; 15+ messages in thread
From: Sasha Levin @ 2025-05-22  2:08 UTC (permalink / raw)
  To: stable; +Cc: Qingfang Deng, Sasha Levin

[ Sasha's backport helper bot ]

Hi,

✅ All tests passed successfully. No issues detected.
No action required from the submitter.

The upstream commit SHA1 provided is correct: df6192f47d2311cf40cd4321cc59863a5853b665

WARNING: Author mismatch between patch and upstream commit:
Backport author: Qingfang Deng<dqfext@gmail.com>
Commit author: Ian Kent<raven@themaw.net>

Status in newer kernel trees:
6.14.y | Present (exact SHA1)
6.12.y | Present (exact SHA1)
6.6.y | Present (exact SHA1)
6.1.y | Present (exact SHA1)
5.15.y | Present (exact SHA1)

Note: The patch differs from the upstream commit:
---
1:  df6192f47d231 ! 1:  fde730faa2db3 kernfs: dont call d_splice_alias() under kernfs node lock
    @@ Metadata
      ## Commit message ##
         kernfs: dont call d_splice_alias() under kernfs node lock
     
    +    Commit df6192f47d2311cf40cd4321cc59863a5853b665 upstream.
    +
         The call to d_splice_alias() in kernfs_iop_lookup() doesn't depend on
         any kernfs node so there's no reason to hold the kernfs node lock when
         calling it.
    @@ fs/kernfs/dir.c: static struct dentry *kernfs_iop_lookup(struct inode *dir,
     +	return d_splice_alias(inode, dentry);
      }
      
    - static int kernfs_iop_mkdir(struct user_namespace *mnt_userns,
    + static int kernfs_iop_mkdir(struct inode *dir, struct dentry *dentry,
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-5.15.y       |  Success    |  Success   |

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-05-22  2:08 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-21  1:53 [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Qingfang Deng
2025-05-21  1:53 ` [PATCH 5.10 1/5] kernfs: add a revision to identify directory node changes Qingfang Deng
2025-05-21  4:16   ` Greg Kroah-Hartman
2025-05-22  2:07   ` Sasha Levin
2025-05-21  1:53 ` [PATCH 5.10 2/5] kernfs: use VFS negative dentry caching Qingfang Deng
2025-05-22  2:05   ` Sasha Levin
2025-05-21  1:53 ` [PATCH 5.10 3/5] kernfs: switch kernfs to use an rwsem Qingfang Deng
2025-05-22  2:03   ` Sasha Levin
2025-05-21  1:53 ` [PATCH 5.10 4/5] kernfs: use i_lock to protect concurrent inode updates Qingfang Deng
2025-05-22  2:07   ` Sasha Levin
2025-05-21  1:53 ` [PATCH 5.10 5/5] kernfs: dont call d_splice_alias() under kernfs node lock Qingfang Deng
2025-05-22  2:08   ` Sasha Levin
2025-05-21  4:15 ` [PATCH 5.10 0/5] kernfs: backport locking and concurrency improvement Greg Kroah-Hartman
2025-05-21  5:35 ` Ian Kent
2025-05-21  6:09   ` Ian Kent

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox