linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH kernfs 0/3] kernfs: switch global locks to per-fs lock
@ 2025-04-11 18:31 alexjlzheng
  2025-04-11 18:31 ` [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock " alexjlzheng
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: alexjlzheng @ 2025-04-11 18:31 UTC (permalink / raw)
  To: gregkh, tj; +Cc: alexjlzheng, linux-kernel

From: Jinliang Zheng <alexjlzheng@tencent.com>

The kernfs implementation has big lock granularity so every kernfs-based
(e.g., sysfs, cgroup) fs are able to compete the locks. This patchset
switches the global locks to per-fs locks.

In fact, the implementation of global locks has not yet introduced
performance issues. But in the long run, more and more file systems will
be implemented based on the kernfs framework, so this optimization is
meaningful.

Jinliang Zheng (3):
  kernfs: switch global kernfs_idr_lock to per-fs lock
  kernfs: switch global kernfs_rename_lock to per-fs lock
  kernfs: switch global kernfs_pr_cont_lock to per-fs lock

 fs/kernfs/dir.c             | 74 ++++++++++++++++++-------------------
 fs/kernfs/kernfs-internal.h | 27 ++++++++++++--
 2 files changed, 59 insertions(+), 42 deletions(-)

-- 
2.48.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock to per-fs lock
  2025-04-11 18:31 [PATCH kernfs 0/3] kernfs: switch global locks to per-fs lock alexjlzheng
@ 2025-04-11 18:31 ` alexjlzheng
  2025-04-12  6:12   ` Greg KH
  2025-04-14 17:09   ` Tejun Heo
  2025-04-11 18:31 ` [PATCH kernfs 2/3] kernfs: switch global kernfs_rename_lock " alexjlzheng
  2025-04-11 18:31 ` [PATCH kernfs 3/3] kernfs: switch global kernfs_pr_cont_lock " alexjlzheng
  2 siblings, 2 replies; 15+ messages in thread
From: alexjlzheng @ 2025-04-11 18:31 UTC (permalink / raw)
  To: gregkh, tj; +Cc: alexjlzheng, linux-kernel

From: Jinliang Zheng <alexjlzheng@tencent.com>

The kernfs implementation has big lock granularity(kernfs_idr_lock) so
every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.

This patch switches the global kernfs_idr_lock to per-fs lock, which
put the spinlock into kernfs_root.

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 fs/kernfs/dir.c             | 14 +++++++-------
 fs/kernfs/kernfs-internal.h |  1 +
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index fc70d72c3fe8..355d943ffe27 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -27,7 +27,6 @@ DEFINE_RWLOCK(kernfs_rename_lock);	/* kn->parent and ->name */
  */
 static DEFINE_SPINLOCK(kernfs_pr_cont_lock);
 static char kernfs_pr_cont_buf[PATH_MAX];	/* protected by pr_cont_lock */
-static DEFINE_SPINLOCK(kernfs_idr_lock);	/* root->ino_idr */
 
 #define rb_to_kn(X) rb_entry((X), struct kernfs_node, rb)
 
@@ -584,9 +583,9 @@ void kernfs_put(struct kernfs_node *kn)
 	if (kernfs_type(kn) == KERNFS_LINK)
 		kernfs_put(kn->symlink.target_kn);
 
-	spin_lock(&kernfs_idr_lock);
+	spin_lock(&root->kernfs_idr_lock);
 	idr_remove(&root->ino_idr, (u32)kernfs_ino(kn));
-	spin_unlock(&kernfs_idr_lock);
+	spin_unlock(&root->kernfs_idr_lock);
 
 	call_rcu(&kn->rcu, kernfs_free_rcu);
 
@@ -639,13 +638,13 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
 		goto err_out1;
 
 	idr_preload(GFP_KERNEL);
-	spin_lock(&kernfs_idr_lock);
+	spin_lock(&root->kernfs_idr_lock);
 	ret = idr_alloc_cyclic(&root->ino_idr, kn, 1, 0, GFP_ATOMIC);
 	if (ret >= 0 && ret < root->last_id_lowbits)
 		root->id_highbits++;
 	id_highbits = root->id_highbits;
 	root->last_id_lowbits = ret;
-	spin_unlock(&kernfs_idr_lock);
+	spin_unlock(&root->kernfs_idr_lock);
 	idr_preload_end();
 	if (ret < 0)
 		goto err_out2;
@@ -681,9 +680,9 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
 	return kn;
 
  err_out3:
-	spin_lock(&kernfs_idr_lock);
+	spin_lock(&root->kernfs_idr_lock);
 	idr_remove(&root->ino_idr, (u32)kernfs_ino(kn));
-	spin_unlock(&kernfs_idr_lock);
+	spin_unlock(&root->kernfs_idr_lock);
  err_out2:
 	kmem_cache_free(kernfs_node_cache, kn);
  err_out1:
@@ -989,6 +988,7 @@ struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
 		return ERR_PTR(-ENOMEM);
 
 	idr_init(&root->ino_idr);
+	spin_lock_init(&root->kernfs_idr_lock);
 	init_rwsem(&root->kernfs_rwsem);
 	init_rwsem(&root->kernfs_iattr_rwsem);
 	init_rwsem(&root->kernfs_supers_rwsem);
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 40a2a9cd819d..24e9514565ac 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -40,6 +40,7 @@ struct kernfs_root {
 
 	/* private fields, do not use outside kernfs proper */
 	struct idr		ino_idr;
+	spinlock_t		kernfs_idr_lock;	/* root->ino_idr */
 	u32			last_id_lowbits;
 	u32			id_highbits;
 	struct kernfs_syscall_ops *syscall_ops;
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH kernfs 2/3] kernfs: switch global kernfs_rename_lock to per-fs lock
  2025-04-11 18:31 [PATCH kernfs 0/3] kernfs: switch global locks to per-fs lock alexjlzheng
  2025-04-11 18:31 ` [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock " alexjlzheng
@ 2025-04-11 18:31 ` alexjlzheng
  2025-04-14 17:26   ` Tejun Heo
  2025-04-11 18:31 ` [PATCH kernfs 3/3] kernfs: switch global kernfs_pr_cont_lock " alexjlzheng
  2 siblings, 1 reply; 15+ messages in thread
From: alexjlzheng @ 2025-04-11 18:31 UTC (permalink / raw)
  To: gregkh, tj; +Cc: alexjlzheng, linux-kernel

From: Jinliang Zheng <alexjlzheng@tencent.com>

The kernfs implementation has big lock granularity(kernfs_rename_lock) so
every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.

This patch switches the global kernfs_rename_lock to per-fs lock, which
put the rwlock into kernfs_root.

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 fs/kernfs/dir.c             | 14 ++++++++------
 fs/kernfs/kernfs-internal.h | 15 +++++++++++----
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 355d943ffe27..d63a96786c9b 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -17,7 +17,6 @@
 
 #include "kernfs-internal.h"
 
-DEFINE_RWLOCK(kernfs_rename_lock);	/* kn->parent and ->name */
 /*
  * Don't use rename_lock to piggy back on pr_cont_buf. We don't want to
  * call pr_cont() while holding rename_lock. Because sometimes pr_cont()
@@ -228,7 +227,7 @@ int kernfs_path_from_node(struct kernfs_node *to, struct kernfs_node *from,
 	if (to) {
 		root = kernfs_root(to);
 		if (!(root->flags & KERNFS_ROOT_INVARIANT_PARENT)) {
-			guard(read_lock_irqsave)(&kernfs_rename_lock);
+			guard(read_lock_irqsave)(&root->kernfs_rename_lock);
 			return kernfs_path_from_node_locked(to, from, buf, buflen);
 		}
 	}
@@ -295,12 +294,14 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
 struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn)
 {
 	struct kernfs_node *parent;
+	struct kernfs_root *root;
 	unsigned long flags;
 
-	read_lock_irqsave(&kernfs_rename_lock, flags);
+	root = kernfs_root(kn);
+	read_lock_irqsave(&root->kernfs_rename_lock, flags);
 	parent = kernfs_parent(kn);
 	kernfs_get(parent);
-	read_unlock_irqrestore(&kernfs_rename_lock, flags);
+	read_unlock_irqrestore(&root->kernfs_rename_lock, flags);
 
 	return parent;
 }
@@ -993,6 +994,7 @@ struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
 	init_rwsem(&root->kernfs_iattr_rwsem);
 	init_rwsem(&root->kernfs_supers_rwsem);
 	INIT_LIST_HEAD(&root->supers);
+	rwlock_init(&root->kernfs_rename_lock);
 
 	/*
 	 * On 64bit ino setups, id is ino.  On 32bit, low 32bits are ino.
@@ -1789,7 +1791,7 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 	/* rename_lock protects ->parent accessors */
 	if (old_parent != new_parent) {
 		kernfs_get(new_parent);
-		write_lock_irq(&kernfs_rename_lock);
+		write_lock_irq(&root->kernfs_rename_lock);
 
 		rcu_assign_pointer(kn->__parent, new_parent);
 
@@ -1797,7 +1799,7 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 		if (new_name)
 			rcu_assign_pointer(kn->name, new_name);
 
-		write_unlock_irq(&kernfs_rename_lock);
+		write_unlock_irq(&root->kernfs_rename_lock);
 		kernfs_put(old_parent);
 	} else {
 		/* name assignment is RCU protected, parent is the same */
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 24e9514565ac..6061b6f70d2a 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -19,8 +19,6 @@
 #include <linux/kernfs.h>
 #include <linux/fs_context.h>
 
-extern rwlock_t kernfs_rename_lock;
-
 struct kernfs_iattrs {
 	kuid_t			ia_uid;
 	kgid_t			ia_gid;
@@ -53,6 +51,9 @@ struct kernfs_root {
 	struct rw_semaphore	kernfs_iattr_rwsem;
 	struct rw_semaphore	kernfs_supers_rwsem;
 
+	/* kn->parent and kn->name */
+	rwlock_t		kernfs_rename_lock;
+
 	struct rcu_head		rcu;
 };
 
@@ -108,6 +109,11 @@ static inline bool kernfs_root_is_locked(const struct kernfs_node *kn)
 	return lockdep_is_held(&kernfs_root(kn)->kernfs_rwsem);
 }
 
+static inline bool kernfs_rename_is_locked(const struct kernfs_node *kn)
+{
+	return lockdep_is_held(&kernfs_root(kn)->kernfs_rename_lock);
+}
+
 static inline const char *kernfs_rcu_name(const struct kernfs_node *kn)
 {
 	return rcu_dereference_check(kn->name, kernfs_root_is_locked(kn));
@@ -118,14 +124,15 @@ static inline struct kernfs_node *kernfs_parent(const struct kernfs_node *kn)
 	/*
 	 * The kernfs_node::__parent remains valid within a RCU section. The kn
 	 * can be reparented (and renamed) which changes the entry. This can be
-	 * avoided by locking kernfs_root::kernfs_rwsem or kernfs_rename_lock.
+	 * avoided by locking kernfs_root::kernfs_rwsem or
+	 * kernfs_root::kernfs_rename_lock.
 	 * Both locks can be used to obtain a reference on __parent. Once the
 	 * reference count reaches 0 then the node is about to be freed
 	 * and can not be renamed (or become a different parent) anymore.
 	 */
 	return rcu_dereference_check(kn->__parent,
 				     kernfs_root_is_locked(kn) ||
-				     lockdep_is_held(&kernfs_rename_lock) ||
+				     kernfs_rename_is_locked(kn) ||
 				     !atomic_read(&kn->count));
 }
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH kernfs 3/3] kernfs: switch global kernfs_pr_cont_lock to per-fs lock
  2025-04-11 18:31 [PATCH kernfs 0/3] kernfs: switch global locks to per-fs lock alexjlzheng
  2025-04-11 18:31 ` [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock " alexjlzheng
  2025-04-11 18:31 ` [PATCH kernfs 2/3] kernfs: switch global kernfs_rename_lock " alexjlzheng
@ 2025-04-11 18:31 ` alexjlzheng
  2025-04-14 17:27   ` Tejun Heo
  2 siblings, 1 reply; 15+ messages in thread
From: alexjlzheng @ 2025-04-11 18:31 UTC (permalink / raw)
  To: gregkh, tj; +Cc: alexjlzheng, linux-kernel

From: Jinliang Zheng <alexjlzheng@tencent.com>

The kernfs implementation has big lock granularity(kernfs_pr_cont_lock) so
every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.

This patch switches the global kernfs_pr_cont_lock to per-fs lock, which
put the spinlock into kernfs_root. Of course, kernfs_pr_cont_buf also needs
to be moved to kernfs_root.

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 fs/kernfs/dir.c             | 46 +++++++++++++++++--------------------
 fs/kernfs/kernfs-internal.h | 11 +++++++++
 2 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index d63a96786c9b..29605d7f0ab0 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -17,16 +17,6 @@
 
 #include "kernfs-internal.h"
 
-/*
- * Don't use rename_lock to piggy back on pr_cont_buf. We don't want to
- * call pr_cont() while holding rename_lock. Because sometimes pr_cont()
- * will perform wakeups when releasing console_sem. Holding rename_lock
- * will introduce deadlock if the scheduler reads the kernfs_name in the
- * wakeup path.
- */
-static DEFINE_SPINLOCK(kernfs_pr_cont_lock);
-static char kernfs_pr_cont_buf[PATH_MAX];	/* protected by pr_cont_lock */
-
 #define rb_to_kn(X) rb_entry((X), struct kernfs_node, rb)
 
 static bool __kernfs_active(struct kernfs_node *kn)
@@ -244,13 +234,15 @@ EXPORT_SYMBOL_GPL(kernfs_path_from_node);
 void pr_cont_kernfs_name(struct kernfs_node *kn)
 {
 	unsigned long flags;
+	struct kernfs_root *root = kernfs_root(kn);
 
-	spin_lock_irqsave(&kernfs_pr_cont_lock, flags);
+	spin_lock_irqsave(&root->kernfs_pr_cont_lock, flags);
 
-	kernfs_name(kn, kernfs_pr_cont_buf, sizeof(kernfs_pr_cont_buf));
-	pr_cont("%s", kernfs_pr_cont_buf);
+	kernfs_name(kn, root->kernfs_pr_cont_buf,
+			sizeof(root->kernfs_pr_cont_buf));
+	pr_cont("%s", root->kernfs_pr_cont_buf);
 
-	spin_unlock_irqrestore(&kernfs_pr_cont_lock, flags);
+	spin_unlock_irqrestore(&root->kernfs_pr_cont_lock, flags);
 }
 
 /**
@@ -263,11 +255,12 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
 {
 	unsigned long flags;
 	int sz;
+	struct kernfs_root *root = kernfs_root(kn);
 
-	spin_lock_irqsave(&kernfs_pr_cont_lock, flags);
+	spin_lock_irqsave(&root->kernfs_pr_cont_lock, flags);
 
-	sz = kernfs_path_from_node(kn, NULL, kernfs_pr_cont_buf,
-				   sizeof(kernfs_pr_cont_buf));
+	sz = kernfs_path_from_node(kn, NULL, root->kernfs_pr_cont_buf,
+				   sizeof(root->kernfs_pr_cont_buf));
 	if (sz < 0) {
 		if (sz == -E2BIG)
 			pr_cont("(name too long)");
@@ -276,10 +269,10 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
 		goto out;
 	}
 
-	pr_cont("%s", kernfs_pr_cont_buf);
+	pr_cont("%s", root->kernfs_pr_cont_buf);
 
 out:
-	spin_unlock_irqrestore(&kernfs_pr_cont_lock, flags);
+	spin_unlock_irqrestore(&root->kernfs_pr_cont_lock, flags);
 }
 
 /**
@@ -888,19 +881,21 @@ static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
 {
 	ssize_t len;
 	char *p, *name;
+	struct kernfs_root *root = kernfs_root(parent);
 
-	lockdep_assert_held_read(&kernfs_root(parent)->kernfs_rwsem);
+	lockdep_assert_held_read(&root->kernfs_rwsem);
 
-	spin_lock_irq(&kernfs_pr_cont_lock);
+	spin_lock_irq(&root->kernfs_pr_cont_lock);
 
-	len = strscpy(kernfs_pr_cont_buf, path, sizeof(kernfs_pr_cont_buf));
+	len = strscpy(root->kernfs_pr_cont_buf, path,
+			sizeof(root->kernfs_pr_cont_buf));
 
 	if (len < 0) {
-		spin_unlock_irq(&kernfs_pr_cont_lock);
+		spin_unlock_irq(&root->kernfs_pr_cont_lock);
 		return NULL;
 	}
 
-	p = kernfs_pr_cont_buf;
+	p = root->kernfs_pr_cont_buf;
 
 	while ((name = strsep(&p, "/")) && parent) {
 		if (*name == '\0')
@@ -908,7 +903,7 @@ static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
 		parent = kernfs_find_ns(parent, name, ns);
 	}
 
-	spin_unlock_irq(&kernfs_pr_cont_lock);
+	spin_unlock_irq(&root->kernfs_pr_cont_lock);
 
 	return parent;
 }
@@ -995,6 +990,7 @@ struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
 	init_rwsem(&root->kernfs_supers_rwsem);
 	INIT_LIST_HEAD(&root->supers);
 	rwlock_init(&root->kernfs_rename_lock);
+	spin_lock_init(&root->kernfs_pr_cont_lock);
 
 	/*
 	 * On 64bit ino setups, id is ino.  On 32bit, low 32bits are ino.
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 6061b6f70d2a..c7fe50955e8c 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -55,6 +55,17 @@ struct kernfs_root {
 	rwlock_t		kernfs_rename_lock;
 
 	struct rcu_head		rcu;
+
+	/*
+	 * Don't use rename_lock to piggy back on pr_cont_buf. We don't want to
+	 * call pr_cont() while holding rename_lock. Because sometimes pr_cont()
+	 * will perform wakeups when releasing console_sem. Holding rename_lock
+	 * will introduce deadlock if the scheduler reads the kernfs_name in the
+	 * wakeup path.
+	 */
+	spinlock_t		kernfs_pr_cont_lock;
+	/* protected by pr_cont_lock */
+	char			kernfs_pr_cont_buf[PATH_MAX];
 };
 
 /* +1 to avoid triggering overflow warning when negating it */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock to per-fs lock
  2025-04-11 18:31 ` [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock " alexjlzheng
@ 2025-04-12  6:12   ` Greg KH
  2025-04-12 11:50     ` alexjlzheng
  2025-04-14 17:09   ` Tejun Heo
  1 sibling, 1 reply; 15+ messages in thread
From: Greg KH @ 2025-04-12  6:12 UTC (permalink / raw)
  To: alexjlzheng; +Cc: tj, alexjlzheng, linux-kernel

On Sat, Apr 12, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> 
> This patch switches the global kernfs_idr_lock to per-fs lock, which
> put the spinlock into kernfs_root.
> 
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> ---
>  fs/kernfs/dir.c             | 14 +++++++-------
>  fs/kernfs/kernfs-internal.h |  1 +
>  2 files changed, 8 insertions(+), 7 deletions(-)

What kind of testing / benchmark did you do for this series that shows
that this works, AND that this actually is measureable?  What workload
are you doing that causes these changes to be needed?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock to per-fs lock
  2025-04-12  6:12   ` Greg KH
@ 2025-04-12 11:50     ` alexjlzheng
  2025-04-13  7:51       ` Greg KH
  0 siblings, 1 reply; 15+ messages in thread
From: alexjlzheng @ 2025-04-12 11:50 UTC (permalink / raw)
  To: gregkh; +Cc: alexjlzheng, alexjlzheng, linux-kernel, tj

On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > 
> > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > 
> > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > put the spinlock into kernfs_root.
> > 
> > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > ---
> >  fs/kernfs/dir.c             | 14 +++++++-------
> >  fs/kernfs/kernfs-internal.h |  1 +
> >  2 files changed, 8 insertions(+), 7 deletions(-)
> 
> What kind of testing / benchmark did you do for this series that shows
> that this works, AND that this actually is measureable?  What workload
> are you doing that causes these changes to be needed?

Thank you for your reply. :)

We are trying to implement a kernfs-based filesystem that will have
multiple instances running at the same time, i.e., multiple kernfs_roots.

While investigating the kernfs implementation, we found some global locks
that would cause noticeable lock contention when there are many filesystem
instances.

Fortunately, we found that some optimizations have been made in [1], which
moved kernfs_rwsem into kernfs_root. But there are still some global locks
left.

We think it is also necessary to switch the remaining global locks to
per-fs. Moreover, we strongly agree with Tejun Heo's point in [1]:

  "... this is the right thing to do even if there is no concrete
   performance argument (not saying there isn't). It's just weird to
   entangle these completely unrelated users in a single rwsem."

We think kernfs will be widely used to build other filesystems, so we
strongly recommend switching global locks to per-fs.

Thank you,
Jinliang Zheng :)

[1] https://lore.kernel.org/all/YZbbxK1F7jY%2FRBFF@slm.duckdns.org/

> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock to per-fs lock
  2025-04-12 11:50     ` alexjlzheng
@ 2025-04-13  7:51       ` Greg KH
  2025-04-14  3:20         ` Jinliang Zheng
  0 siblings, 1 reply; 15+ messages in thread
From: Greg KH @ 2025-04-13  7:51 UTC (permalink / raw)
  To: alexjlzheng; +Cc: alexjlzheng, linux-kernel, tj

On Sat, Apr 12, 2025 at 07:50:54PM +0800, alexjlzheng@gmail.com wrote:
> On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> > On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > > 
> > > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > > 
> > > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > > put the spinlock into kernfs_root.
> > > 
> > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > ---
> > >  fs/kernfs/dir.c             | 14 +++++++-------
> > >  fs/kernfs/kernfs-internal.h |  1 +
> > >  2 files changed, 8 insertions(+), 7 deletions(-)
> > 
> > What kind of testing / benchmark did you do for this series that shows
> > that this works, AND that this actually is measureable?  What workload
> > are you doing that causes these changes to be needed?
> 
> Thank you for your reply. :)
> 
> We are trying to implement a kernfs-based filesystem that will have
> multiple instances running at the same time, i.e., multiple kernfs_roots.

I don't think that kernfs is meant for that very well, what is that
filesystem going to be for?

> While investigating the kernfs implementation, we found some global locks
> that would cause noticeable lock contention when there are many filesystem
> instances.
> 
> Fortunately, we found that some optimizations have been made in [1], which
> moved kernfs_rwsem into kernfs_root. But there are still some global locks
> left.
> 
> We think it is also necessary to switch the remaining global locks to
> per-fs. Moreover, we strongly agree with Tejun Heo's point in [1]:
> 
>   "... this is the right thing to do even if there is no concrete
>    performance argument (not saying there isn't). It's just weird to
>    entangle these completely unrelated users in a single rwsem."
> 
> We think kernfs will be widely used to build other filesystems, so we
> strongly recommend switching global locks to per-fs.

I don't strongly object, but I would like to see some real-world numbers first.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock to per-fs lock
  2025-04-13  7:51       ` Greg KH
@ 2025-04-14  3:20         ` Jinliang Zheng
  2025-04-15 15:16           ` Greg KH
  0 siblings, 1 reply; 15+ messages in thread
From: Jinliang Zheng @ 2025-04-14  3:20 UTC (permalink / raw)
  To: gregkh; +Cc: alexjlzheng, alexjlzheng, linux-kernel, tj

> On Sat, Apr 12, 2025 at 07:50:54PM +0800, alexjlzheng@gmail.com wrote:
> > On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> > > On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > 
> > > > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > > > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > > > 
> > > > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > > > put the spinlock into kernfs_root.
> > > > 
> > > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > ---
> > > >  fs/kernfs/dir.c             | 14 +++++++-------
> > > >  fs/kernfs/kernfs-internal.h |  1 +
> > > >  2 files changed, 8 insertions(+), 7 deletions(-)
> > > 
> > > What kind of testing / benchmark did you do for this series that shows
> > > that this works, AND that this actually is measureable?  What workload
> > > are you doing that causes these changes to be needed?
> > 
> > Thank you for your reply. :)
> > 
> > We are trying to implement a kernfs-based filesystem that will have
> > multiple instances running at the same time, i.e., multiple kernfs_roots.
> 
> I don't think that kernfs is meant for that very well, what is that
> filesystem going to be for?

Thank you for your reply. :)

Similar to cgroupfs and sysfs, it is used to export the status and configurations
of some kernel variables in hierarchical modes of the kernel. The only difference
is that it may have many instances, that is, many kernfs_roots.

> 
> > While investigating the kernfs implementation, we found some global locks
> > that would cause noticeable lock contention when there are many filesystem
> > instances.
> > 
> > Fortunately, we found that some optimizations have been made in [1], which
> > moved kernfs_rwsem into kernfs_root. But there are still some global locks
> > left.
> > 
> > We think it is also necessary to switch the remaining global locks to
> > per-fs. Moreover, we strongly agree with Tejun Heo's point in [1]:
> > 
> >   "... this is the right thing to do even if there is no concrete
> >    performance argument (not saying there isn't). It's just weird to
> >    entangle these completely unrelated users in a single rwsem."
> > 
> > We think kernfs will be widely used to build other filesystems, so we
> > strongly recommend switching global locks to per-fs.
> 
> I don't strongly object, but I would like to see some real-world numbers first.

Haha, we are still evaluating whether to implement it based on kernfs, so it
has not been implemented and tested yet. However, for these global locks, I
think it is more reasonable to switch to per-fs, regardless of whether there
is a significant performance improvement (in fact, when the number of kernfs-based
filesystem instances increases, the lock contention will definitely be reduced.
At least, it will not get worse, hahaha).

Haha, but if you think it is not necessary to do this, just ignore this patchset.

Thank you,
Jinliang Zheng. :)

> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock to per-fs lock
  2025-04-11 18:31 ` [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock " alexjlzheng
  2025-04-12  6:12   ` Greg KH
@ 2025-04-14 17:09   ` Tejun Heo
  2025-04-15  6:28     ` Jinliang Zheng
  1 sibling, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2025-04-14 17:09 UTC (permalink / raw)
  To: alexjlzheng; +Cc: gregkh, alexjlzheng, linux-kernel

On Sat, Apr 12, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> 
> This patch switches the global kernfs_idr_lock to per-fs lock, which
> put the spinlock into kernfs_root.
> 
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>

Given that it doesn't really make things any more complicated, I think this
makes more sense than the existing code even without any direct evidence
that this improves performance.

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 2/3] kernfs: switch global kernfs_rename_lock to per-fs lock
  2025-04-11 18:31 ` [PATCH kernfs 2/3] kernfs: switch global kernfs_rename_lock " alexjlzheng
@ 2025-04-14 17:26   ` Tejun Heo
  0 siblings, 0 replies; 15+ messages in thread
From: Tejun Heo @ 2025-04-14 17:26 UTC (permalink / raw)
  To: alexjlzheng; +Cc: gregkh, alexjlzheng, linux-kernel

On Sat, Apr 12, 2025 at 02:31:08AM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> The kernfs implementation has big lock granularity(kernfs_rename_lock) so
> every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> 
> This patch switches the global kernfs_rename_lock to per-fs lock, which
> put the rwlock into kernfs_root.
> 
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 3/3] kernfs: switch global kernfs_pr_cont_lock to per-fs lock
  2025-04-11 18:31 ` [PATCH kernfs 3/3] kernfs: switch global kernfs_pr_cont_lock " alexjlzheng
@ 2025-04-14 17:27   ` Tejun Heo
  2025-04-15  6:37     ` Jinliang Zheng
  0 siblings, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2025-04-14 17:27 UTC (permalink / raw)
  To: alexjlzheng; +Cc: gregkh, alexjlzheng, linux-kernel

On Sat, Apr 12, 2025 at 02:31:09AM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> The kernfs implementation has big lock granularity(kernfs_pr_cont_lock) so
> every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> 
> This patch switches the global kernfs_pr_cont_lock to per-fs lock, which
> put the spinlock into kernfs_root. Of course, kernfs_pr_cont_buf also needs
> to be moved to kernfs_root.
> 
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>

I don't think this one makes sense. There are lots more things that are
globally synchronizing in the printk pass. This is necessarily a really cold
path and it doesn't make anything better to split this lock.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock to per-fs lock
  2025-04-14 17:09   ` Tejun Heo
@ 2025-04-15  6:28     ` Jinliang Zheng
  0 siblings, 0 replies; 15+ messages in thread
From: Jinliang Zheng @ 2025-04-15  6:28 UTC (permalink / raw)
  To: tj; +Cc: alexjlzheng, alexjlzheng, gregkh, linux-kernel

On Mon, 14 Apr 2025 07:09:28 -1000, tj@kernel.org wrote:
> On Sat, Apr 12, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > 
> > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > 
> > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > put the spinlock into kernfs_root.
> > 
> > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> Given that it doesn't really make things any more complicated, I think this
> makes more sense than the existing code even without any direct evidence
> that this improves performance.

I agree.

thanks,
Jinliang Zheng :)

> 
> Acked-by: Tejun Heo <tj@kernel.org>
> 
> Thanks.
> 
> -- 
> tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 3/3] kernfs: switch global kernfs_pr_cont_lock to per-fs lock
  2025-04-14 17:27   ` Tejun Heo
@ 2025-04-15  6:37     ` Jinliang Zheng
  0 siblings, 0 replies; 15+ messages in thread
From: Jinliang Zheng @ 2025-04-15  6:37 UTC (permalink / raw)
  To: tj; +Cc: alexjlzheng, alexjlzheng, gregkh, linux-kernel

On Mon, 14 Apr 2025 07:27:21 -1000, tj@kernel.org wrote:
> On Sat, Apr 12, 2025 at 02:31:09AM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > 
> > The kernfs implementation has big lock granularity(kernfs_pr_cont_lock) so
> > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > 
> > This patch switches the global kernfs_pr_cont_lock to per-fs lock, which
> > put the spinlock into kernfs_root. Of course, kernfs_pr_cont_buf also needs
> > to be moved to kernfs_root.
> > 
> > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> I don't think this one makes sense. There are lots more things that are
> globally synchronizing in the printk pass. This is necessarily a really cold
> path and it doesn't make anything better to split this lock.

Thank you for your reply, :)

From a performance perspective, I agree. From a design perspective, I can
only agree 50%, hahahaha.

But compared to the other two global locks in this patchset, the changes to
this lock are relatively unimportant.

thanks,
Jinliang Zheng :)

> 
> Thanks.
> 
> -- 
> tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock to per-fs lock
  2025-04-14  3:20         ` Jinliang Zheng
@ 2025-04-15 15:16           ` Greg KH
  2025-04-15 15:44             ` Jinliang Zheng
  0 siblings, 1 reply; 15+ messages in thread
From: Greg KH @ 2025-04-15 15:16 UTC (permalink / raw)
  To: Jinliang Zheng; +Cc: alexjlzheng, linux-kernel, tj

On Mon, Apr 14, 2025 at 11:20:54AM +0800, Jinliang Zheng wrote:
> > On Sat, Apr 12, 2025 at 07:50:54PM +0800, alexjlzheng@gmail.com wrote:
> > > On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> > > > On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > > > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > > 
> > > > > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > > > > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > > > > 
> > > > > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > > > > put the spinlock into kernfs_root.
> > > > > 
> > > > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > > ---
> > > > >  fs/kernfs/dir.c             | 14 +++++++-------
> > > > >  fs/kernfs/kernfs-internal.h |  1 +
> > > > >  2 files changed, 8 insertions(+), 7 deletions(-)
> > > > 
> > > > What kind of testing / benchmark did you do for this series that shows
> > > > that this works, AND that this actually is measureable?  What workload
> > > > are you doing that causes these changes to be needed?
> > > 
> > > Thank you for your reply. :)
> > > 
> > > We are trying to implement a kernfs-based filesystem that will have
> > > multiple instances running at the same time, i.e., multiple kernfs_roots.
> > 
> > I don't think that kernfs is meant for that very well, what is that
> > filesystem going to be for?
> 
> Thank you for your reply. :)
> 
> Similar to cgroupfs and sysfs, it is used to export the status and configurations
> of some kernel variables in hierarchical modes of the kernel. The only difference
> is that it may have many instances, that is, many kernfs_roots.

Let's see that filesystem first please before determining more, as you
would be adding a new user/kernel api that we all need to argue about :)

Anyway, for the 2 patches that Tejun agrees with here, can you resend
just them?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock to per-fs lock
  2025-04-15 15:16           ` Greg KH
@ 2025-04-15 15:44             ` Jinliang Zheng
  0 siblings, 0 replies; 15+ messages in thread
From: Jinliang Zheng @ 2025-04-15 15:44 UTC (permalink / raw)
  To: gregkh; +Cc: alexjlzheng, alexjlzheng, linux-kernel, tj

On Tue, 15 Apr 2025 17:16:46 +0200, gregkh@linuxfoundation.org wrote:
> On Mon, Apr 14, 2025 at 11:20:54AM +0800, Jinliang Zheng wrote:
> > > On Sat, Apr 12, 2025 at 07:50:54PM +0800, alexjlzheng@gmail.com wrote:
> > > > On Sat, 12 Apr 2025 08:12:22 +0200, gregkh@linuxfoundation.org wrote:
> > > > > On Sat, Apr 13, 2025 at 02:31:07AM +0800, alexjlzheng@gmail.com wrote:
> > > > > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > > > 
> > > > > > The kernfs implementation has big lock granularity(kernfs_idr_lock) so
> > > > > > every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the lock.
> > > > > > 
> > > > > > This patch switches the global kernfs_idr_lock to per-fs lock, which
> > > > > > put the spinlock into kernfs_root.
> > > > > > 
> > > > > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > > > > ---
> > > > > >  fs/kernfs/dir.c             | 14 +++++++-------
> > > > > >  fs/kernfs/kernfs-internal.h |  1 +
> > > > > >  2 files changed, 8 insertions(+), 7 deletions(-)
> > > > > 
> > > > > What kind of testing / benchmark did you do for this series that shows
> > > > > that this works, AND that this actually is measureable?  What workload
> > > > > are you doing that causes these changes to be needed?
> > > > 
> > > > Thank you for your reply. :)
> > > > 
> > > > We are trying to implement a kernfs-based filesystem that will have
> > > > multiple instances running at the same time, i.e., multiple kernfs_roots.
> > > 
> > > I don't think that kernfs is meant for that very well, what is that
> > > filesystem going to be for?
> > 
> > Thank you for your reply. :)
> > 
> > Similar to cgroupfs and sysfs, it is used to export the status and configurations
> > of some kernel variables in hierarchical modes of the kernel. The only difference
> > is that it may have many instances, that is, many kernfs_roots.
> 
> Let's see that filesystem first please before determining more, as you
> would be adding a new user/kernel api that we all need to argue about :)
> 
> Anyway, for the 2 patches that Tejun agrees with here, can you resend
> just them?

Fine. I have resent them at https://lore.kernel.org/all/20250415153659.14950-1-alexjlzheng@tencent.com/

thanks,
Jinliang Zheng. :)

> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-04-15 15:44 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-11 18:31 [PATCH kernfs 0/3] kernfs: switch global locks to per-fs lock alexjlzheng
2025-04-11 18:31 ` [PATCH kernfs 1/3] kernfs: switch global kernfs_idr_lock " alexjlzheng
2025-04-12  6:12   ` Greg KH
2025-04-12 11:50     ` alexjlzheng
2025-04-13  7:51       ` Greg KH
2025-04-14  3:20         ` Jinliang Zheng
2025-04-15 15:16           ` Greg KH
2025-04-15 15:44             ` Jinliang Zheng
2025-04-14 17:09   ` Tejun Heo
2025-04-15  6:28     ` Jinliang Zheng
2025-04-11 18:31 ` [PATCH kernfs 2/3] kernfs: switch global kernfs_rename_lock " alexjlzheng
2025-04-14 17:26   ` Tejun Heo
2025-04-11 18:31 ` [PATCH kernfs 3/3] kernfs: switch global kernfs_pr_cont_lock " alexjlzheng
2025-04-14 17:27   ` Tejun Heo
2025-04-15  6:37     ` Jinliang Zheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).