public inbox for linux-fsdevel@vger.kernel.org
* [PATCH v4 0/2] fs, audit: Avoid excessive dput/dget in audit_context setup and reset paths
@ 2026-02-28 18:27 Waiman Long
  2026-02-28 18:27 ` [PATCH v4 1/2] fs: Add a pool of extra fs->pwd references to fs_struct Waiman Long
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Waiman Long @ 2026-02-28 18:27 UTC
  To: Paul Moore, Eric Paris, Christian Brauner, Al Viro, Jan Kara
  Cc: linux-kernel, linux-fsdevel, audit, Richard Guy Briggs,
	Ricardo Robaina, Waiman Long

 v4:
  - Add ack and review tags
  - Simplify put_fs_pwd_pool() in patch 1 as suggested by Paul Moore

 v3:
  - https://lore.kernel.org/lkml/20260206201918.1988344-1-longman@redhat.com/

When the audit subsystem is enabled, it can make a lot of get_fs_pwd()
calls to get references to fs->pwd and then release those references
with path_put() later. That may cause a lot of spinlock contention
on a single pwd's dentry lock because of the constant changes to the
reference count when many processes in the same working directory
are actively doing open/close system calls. This can cause a
noticeable performance regression compared with the case where the
audit subsystem is turned off, especially on systems with a lot of
CPUs, which are becoming more common these days.

This patch series aims to avoid this type of performance regression
caused by audit by adding a new set of fs_struct helpers that reduce
unnecessary path_get() and path_put() calls, and by modifying the audit
code to use these new helpers.

Waiman Long (2):
  fs: Add a pool of extra fs->pwd references to fs_struct
  audit: Use the new {get,put}_fs_pwd_pool() APIs to get/put pwd
    references

 fs/fs_struct.c            | 26 +++++++++++++++++++++-----
 fs/namespace.c            |  8 ++++++++
 include/linux/fs_struct.h | 28 +++++++++++++++++++++++++++-
 kernel/auditsc.c          |  7 +++++--
 4 files changed, 61 insertions(+), 8 deletions(-)

-- 
2.53.0



* [PATCH v4 1/2] fs: Add a pool of extra fs->pwd references to fs_struct
  2026-02-28 18:27 [PATCH v4 0/2] fs, audit: Avoid excessive dput/dget in audit_context setup and reset paths Waiman Long
@ 2026-02-28 18:27 ` Waiman Long
  2026-02-28 18:27 ` [PATCH v4 2/2] audit: Use the new {get,put}_fs_pwd_pool() APIs to get/put pwd references Waiman Long
  2026-03-05 21:46 ` [PATCH v4 0/2] fs, audit: Avoid excessive dput/dget in audit_context setup and reset paths Christian Brauner
  2 siblings, 0 replies; 5+ messages in thread
From: Waiman Long @ 2026-02-28 18:27 UTC
  To: Paul Moore, Eric Paris, Christian Brauner, Al Viro, Jan Kara
  Cc: linux-kernel, linux-fsdevel, audit, Richard Guy Briggs,
	Ricardo Robaina, Waiman Long

When the audit subsystem is enabled, it can make a lot of get_fs_pwd()
calls to get references to fs->pwd and then release those references
with path_put() later. That may cause a lot of spinlock contention
on a single pwd's dentry lock because of the constant changes to the
reference count when many processes in the same working directory
are actively doing open/close system calls. This can cause a
noticeable performance regression compared with the case where the
audit subsystem is turned off, especially on systems with a lot of
CPUs, which are becoming more common these days.

A simple and elegant solution to avoid this kind of performance
regression is to add a common pool of extra fs->pwd references inside
the fs_struct. When a caller needs a pwd reference, it can borrow one
from the pool, if available, to avoid an explicit path_get(). When it
is time to release the reference, the caller can put it back into the
common pool without doing a path_put(), provided fs->pwd has not
changed in the meantime. We still need to acquire the fs's spinlock,
but fs_struct is more distributed and it is less common to have many
tasks sharing a single fs_struct.

A new set of get_fs_pwd_pool()/put_fs_pwd_pool() APIs is introduced
with this patch to enable other subsystems to acquire and release
a pwd reference from the common pool without doing unnecessary
path_get()/path_put() calls.
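
The borrow/return behaviour described above can be sketched in userspace.
This is a minimal single-threaded model, not the kernel code: struct obj,
struct fs_model and every model_* name are hypothetical stand-ins, locking
is omitted, and a plain counter replaces the dentry/mount refcounts.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared object behind fs->pwd. */
struct obj {
	long refcount;     /* models the contended reference count */
	long refcount_ops; /* number of times refcount was modified */
};

/* Hypothetical stand-in for fs_struct: only what the pool needs. */
struct fs_model {
	struct obj *pwd;
	int pwd_refs;      /* extra references parked in the pool */
};

static void obj_get(struct obj *o)
{
	o->refcount++;
	o->refcount_ops++;
}

static void obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	o->refcount--;
	o->refcount_ops++;
}

/* Borrow a pooled reference if one exists, else take a real one. */
static struct obj *model_get_pwd_pool(struct fs_model *fs)
{
	struct obj *o = fs->pwd;

	if (fs->pwd_refs)
		fs->pwd_refs--;
	else
		obj_get(o);
	return o;
}

/* Park the reference in the pool if pwd is unchanged, else drop it. */
static void model_put_pwd_pool(struct fs_model *fs, struct obj *o)
{
	if (fs->pwd == o)
		fs->pwd_refs++;
	else
		obj_put(o);
}

/* Run n get/put pairs; return how many real refcount updates happened. */
static long model_run_cycles(struct fs_model *fs, int n)
{
	long before = fs->pwd->refcount_ops;

	for (int i = 0; i < n; i++) {
		struct obj *o = model_get_pwd_pool(fs);

		model_put_pwd_pool(fs, o);
	}
	return fs->pwd->refcount_ops - before;
}
```

Starting from an empty pool, the first cycle in this model takes one real
reference and parks it; every later cycle only adjusts pwd_refs, so the
shared refcount is modified once no matter how many cycles run.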

Besides fs/fs_struct.c, the copy_mnt_ns() function of fs/namespace.c is
also modified to properly handle the extra pwd references, if available.

Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
---
 fs/fs_struct.c            | 26 +++++++++++++++++++++-----
 fs/namespace.c            |  8 ++++++++
 include/linux/fs_struct.h | 28 +++++++++++++++++++++++++++-
 3 files changed, 56 insertions(+), 6 deletions(-)

diff --git a/fs/fs_struct.c b/fs/fs_struct.c
index 394875d06fd6..43af98e0a10c 100644
--- a/fs/fs_struct.c
+++ b/fs/fs_struct.c
@@ -33,15 +33,19 @@ void set_fs_root(struct fs_struct *fs, const struct path *path)
 void set_fs_pwd(struct fs_struct *fs, const struct path *path)
 {
 	struct path old_pwd;
+	int count;
 
 	path_get(path);
 	write_seqlock(&fs->seq);
 	old_pwd = fs->pwd;
 	fs->pwd = *path;
+	count = fs->pwd_refs + 1;
+	fs->pwd_refs = 0;
 	write_sequnlock(&fs->seq);
 
 	if (old_pwd.dentry)
-		path_put(&old_pwd);
+		while (count--)
+			path_put(&old_pwd);
 }
 
 static inline int replace_path(struct path *p, const struct path *old, const struct path *new)
@@ -63,10 +67,15 @@ void chroot_fs_refs(const struct path *old_root, const struct path *new_root)
 		task_lock(p);
 		fs = p->fs;
 		if (fs) {
-			int hits = 0;
+			int hits;
+
 			write_seqlock(&fs->seq);
+			hits = replace_path(&fs->pwd, old_root, new_root);
+			if (hits && fs->pwd_refs) {
+				count += fs->pwd_refs;
+				fs->pwd_refs = 0;
+			}
 			hits += replace_path(&fs->root, old_root, new_root);
-			hits += replace_path(&fs->pwd, old_root, new_root);
 			while (hits--) {
 				count++;
 				path_get(new_root);
@@ -82,8 +91,11 @@ void chroot_fs_refs(const struct path *old_root, const struct path *new_root)
 
 void free_fs_struct(struct fs_struct *fs)
 {
+	int count = fs->pwd_refs + 1;
+
 	path_put(&fs->root);
-	path_put(&fs->pwd);
+	while (count--)
+		path_put(&fs->pwd);
 	kmem_cache_free(fs_cachep, fs);
 }
 
@@ -111,6 +123,7 @@ struct fs_struct *copy_fs_struct(struct fs_struct *old)
 	if (fs) {
 		fs->users = 1;
 		fs->in_exec = 0;
+		fs->pwd_refs = 0;
 		seqlock_init(&fs->seq);
 		fs->umask = old->umask;
 
@@ -118,7 +131,10 @@ struct fs_struct *copy_fs_struct(struct fs_struct *old)
 		fs->root = old->root;
 		path_get(&fs->root);
 		fs->pwd = old->pwd;
-		path_get(&fs->pwd);
+		if (old->pwd_refs)
+			old->pwd_refs--;
+		else
+			path_get(&fs->pwd);
 		read_sequnlock_excl(&old->seq);
 	}
 	return fs;
diff --git a/fs/namespace.c b/fs/namespace.c
index 854f4fc66469..96d41f00add6 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4272,6 +4272,14 @@ struct mnt_namespace *copy_mnt_ns(u64 flags, struct mnt_namespace *ns,
 	 * as belonging to new namespace.  We have already acquired a private
 	 * fs_struct, so tsk->fs->lock is not needed.
 	 */
+	if (new_fs)
+		WARN_ON_ONCE(new_fs->users != 1);
+
+	/* Release the extra pwd references of new_fs, if present. */
+	while (new_fs && new_fs->pwd_refs) {
+		path_put(&new_fs->pwd);
+		new_fs->pwd_refs--;
+	}
 	p = old;
 	q = new;
 	while (p) {
diff --git a/include/linux/fs_struct.h b/include/linux/fs_struct.h
index 0070764b790a..f8cf3b280398 100644
--- a/include/linux/fs_struct.h
+++ b/include/linux/fs_struct.h
@@ -8,10 +8,11 @@
 #include <linux/seqlock.h>
 
 struct fs_struct {
-	int users;
 	seqlock_t seq;
+	int users;
 	int umask;
 	int in_exec;
+	int pwd_refs;	/* A pool of extra pwd references */
 	struct path root, pwd;
 } __randomize_layout;
 
@@ -40,6 +41,31 @@ static inline void get_fs_pwd(struct fs_struct *fs, struct path *pwd)
 	read_sequnlock_excl(&fs->seq);
 }
 
+/* Acquire a pwd reference from the pwd_refs pool, if available */
+static inline void get_fs_pwd_pool(struct fs_struct *fs, struct path *pwd)
+{
+	read_seqlock_excl(&fs->seq);
+	*pwd = fs->pwd;
+	if (fs->pwd_refs)
+		fs->pwd_refs--;
+	else
+		path_get(pwd);
+	read_sequnlock_excl(&fs->seq);
+}
+
+/* Release a pwd reference back to the pwd_refs pool, if appropriate */
+static inline void put_fs_pwd_pool(struct fs_struct *fs, struct path *pwd)
+{
+	read_seqlock_excl(&fs->seq);
+	if ((fs->pwd.dentry == pwd->dentry) && (fs->pwd.mnt == pwd->mnt)) {
+		fs->pwd_refs++;
+		pwd = NULL;
+	}
+	read_sequnlock_excl(&fs->seq);
+	if (pwd)
+		path_put(pwd);
+}
+
 extern bool current_chrooted(void);
 
 static inline int current_umask(void)
-- 
2.53.0



* [PATCH v4 2/2] audit: Use the new {get,put}_fs_pwd_pool() APIs to get/put pwd references
  2026-02-28 18:27 [PATCH v4 0/2] fs, audit: Avoid excessive dput/dget in audit_context setup and reset paths Waiman Long
  2026-02-28 18:27 ` [PATCH v4 1/2] fs: Add a pool of extra fs->pwd references to fs_struct Waiman Long
@ 2026-02-28 18:27 ` Waiman Long
  2026-03-05 21:46 ` [PATCH v4 0/2] fs, audit: Avoid excessive dput/dget in audit_context setup and reset paths Christian Brauner
  2 siblings, 0 replies; 5+ messages in thread
From: Waiman Long @ 2026-02-28 18:27 UTC
  To: Paul Moore, Eric Paris, Christian Brauner, Al Viro, Jan Kara
  Cc: linux-kernel, linux-fsdevel, audit, Richard Guy Briggs,
	Ricardo Robaina, Waiman Long

When the audit subsystem is enabled, it can make a lot of get_fs_pwd()
calls to get references to fs->pwd and then release those references
with path_put() later. That may cause a lot of spinlock contention
on a single pwd's dentry lock because of the constant changes to the
reference count when many processes in the same working directory
are actively doing open/close system calls. This can cause a
noticeable performance regression compared with the case where the
audit subsystem is turned off, especially on systems with a lot of
CPUs, which are becoming more common these days.

To avoid this kind of performance regression, use the new
get_fs_pwd_pool() and put_fs_pwd_pool() APIs to acquire and release an
fs->pwd reference. This should greatly reduce the number of path_get()
and path_put() calls that are needed.

A test kernel with auditing enabled and with counters added to track
the get_fs_pwd_pool() and put_fs_pwd_pool() calls was installed on a
2-socket 96-core test system. After running a parallel kernel build,
the counter values for this particular test run were as shown below.

  fs_get_path=307,903
  fs_get_pool=56,583,192
  fs_put_path=6,209
  fs_put_pool=56,885,147

Of the roughly 57M calls to each of get_fs_pwd_pool() and
put_fs_pwd_pool(), the majority just update the pwd_refs counter. Fewer
than 1% of those calls require an actual path_get() or path_put() call.
The difference between fs_get_path and fs_put_path represents the extra
pwd references that were still stored in various active task->fs's when
the counter values were retrieved.

The number of path_get() and path_put() calls is therefore greatly
reduced.
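
The percentages can be rechecked from the counters listed above. The
helper below is only a sketch of that arithmetic: struct pool_counters,
slow_path_ppm() and outstanding_refs() are hypothetical names, not part
of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Counter values as reported for the test run (names are hypothetical). */
struct pool_counters {
	uint64_t get_path; /* gets that needed a real path_get() */
	uint64_t get_pool; /* gets served from the pool */
	uint64_t put_path; /* puts that needed a real path_put() */
	uint64_t put_pool; /* puts returned to the pool */
};

/* Share of calls that took the slow path, in parts per million. */
static uint64_t slow_path_ppm(uint64_t slow, uint64_t fast)
{
	uint64_t total = slow + fast;

	return total ? slow * 1000000ULL / total : 0;
}

/* References taken with path_get() and not yet dropped with path_put(). */
static uint64_t outstanding_refs(const struct pool_counters *c)
{
	return c->get_path - c->put_path;
}
```

With the numbers above, slow_path_ppm() gives 5412 ppm (about 0.54%) for
the get side and 109 ppm (about 0.01%) for the put side, and
outstanding_refs() gives 301,694.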

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
---
 kernel/auditsc.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index f6af6a8f68c4..26ba61eabfb0 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -931,6 +931,9 @@ static inline void audit_free_names(struct audit_context *context)
 {
 	struct audit_names *n, *next;
 
+	if (!context->name_count)
+		return;	/* audit_alloc_name() has not been called */
+
 	list_for_each_entry_safe(n, next, &context->names_list, list) {
 		list_del(&n->list);
 		if (n->name)
@@ -939,7 +942,7 @@ static inline void audit_free_names(struct audit_context *context)
 			kfree(n);
 	}
 	context->name_count = 0;
-	path_put(&context->pwd);
+	put_fs_pwd_pool(current->fs, &context->pwd);
 	context->pwd.dentry = NULL;
 	context->pwd.mnt = NULL;
 }
@@ -2165,7 +2168,7 @@ static struct audit_names *audit_alloc_name(struct audit_context *context,
 
 	context->name_count++;
 	if (!context->pwd.dentry)
-		get_fs_pwd(current->fs, &context->pwd);
+		get_fs_pwd_pool(current->fs, &context->pwd);
 	return aname;
 }
 
-- 
2.53.0



* Re: [PATCH v4 0/2] fs, audit: Avoid excessive dput/dget in audit_context setup and reset paths
  2026-02-28 18:27 [PATCH v4 0/2] fs, audit: Avoid excessive dput/dget in audit_context setup and reset paths Waiman Long
  2026-02-28 18:27 ` [PATCH v4 1/2] fs: Add a pool of extra fs->pwd references to fs_struct Waiman Long
  2026-02-28 18:27 ` [PATCH v4 2/2] audit: Use the new {get,put}_fs_pwd_pool() APIs to get/put pwd references Waiman Long
@ 2026-03-05 21:46 ` Christian Brauner
  2026-03-06  0:53   ` Waiman Long
  2 siblings, 1 reply; 5+ messages in thread
From: Christian Brauner @ 2026-03-05 21:46 UTC
  To: Waiman Long
  Cc: Paul Moore, Eric Paris, Al Viro, Jan Kara, linux-kernel,
	linux-fsdevel, audit, Richard Guy Briggs, Ricardo Robaina

[-- Attachment #1: Type: text/plain, Size: 1433 bytes --]

On Sat, Feb 28, 2026 at 01:27:55PM -0500, Waiman Long wrote:
>  v4:
>   - Add ack and review tags
>   - Simplify put_fs_pwd_pool() in patch 1 as suggested by Paul Moore
> 
>  v3:
>   - https://lore.kernel.org/lkml/20260206201918.1988344-1-longman@redhat.com/
> 
> When the audit subsystem is enabled, it can make a lot of get_fs_pwd()
> calls to get references to fs->pwd and then release those references
> with path_put() later. That may cause a lot of spinlock contention
> on a single pwd's dentry lock because of the constant changes to the
> reference count when many processes in the same working directory
> are actively doing open/close system calls. This can cause a
> noticeable performance regression compared with the case where the
> audit subsystem is turned off, especially on systems with a lot of
> CPUs, which are becoming more common these days.
>
> This patch series aims to avoid this type of performance regression
> caused by audit by adding a new set of fs_struct helpers that reduce
> unnecessary path_get() and path_put() calls, and by modifying the audit
> code to use these new helpers.

Tbh, the open-coding everywhere is really not very tasteful and makes me
not want to do this at all. Ideally we'd have a better mechanism that
avoids all this new spaghetti in various codepaths.

In its current form I don't find it palatable. I added a few cleanups
on top that make it at least somewhat ok.

[-- Attachment #2: 0001-fs-use-path_equal-in-fs_struct-helpers.patch --]
[-- Type: text/x-diff, Size: 1504 bytes --]

From df813ea26394f5d1d1dac0eb49b18d029c73906a Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Thu, 5 Mar 2026 22:06:18 +0100
Subject: [PATCH 1/4] fs: use path_equal() in fs_struct helpers

Replace the open-coded dentry/mnt pointer comparisons in replace_path()
and put_fs_pwd_pool() with the existing path_equal() helper.

No functional change.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/fs_struct.c            | 2 +-
 include/linux/fs_struct.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/fs_struct.c b/fs/fs_struct.c
index 43af98e0a10c..ce814b76bde7 100644
--- a/fs/fs_struct.c
+++ b/fs/fs_struct.c
@@ -50,7 +50,7 @@ void set_fs_pwd(struct fs_struct *fs, const struct path *path)
 
 static inline int replace_path(struct path *p, const struct path *old, const struct path *new)
 {
-	if (likely(p->dentry != old->dentry || p->mnt != old->mnt))
+	if (likely(!path_equal(p, old)))
 		return 0;
 	*p = *new;
 	return 1;
diff --git a/include/linux/fs_struct.h b/include/linux/fs_struct.h
index f8cf3b280398..9414a572d8f2 100644
--- a/include/linux/fs_struct.h
+++ b/include/linux/fs_struct.h
@@ -57,7 +57,7 @@ static inline void get_fs_pwd_pool(struct fs_struct *fs, struct path *pwd)
 static inline void put_fs_pwd_pool(struct fs_struct *fs, struct path *pwd)
 {
 	read_seqlock_excl(&fs->seq);
-	if ((fs->pwd.dentry == pwd->dentry) && (fs->pwd.mnt == pwd->mnt)) {
+	if (path_equal(&fs->pwd, pwd)) {
 		fs->pwd_refs++;
 		pwd = NULL;
 	}
-- 
2.47.3


[-- Attachment #3: 0002-fs-document-seqlock-usage-in-pwd-pool-APIs.patch --]
[-- Type: text/x-diff, Size: 2074 bytes --]

From cd1f838cec0303780025906dee3c789a4680e402 Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Thu, 5 Mar 2026 22:06:39 +0100
Subject: [PATCH 2/4] fs: document seqlock usage in pwd pool APIs

Document why get_fs_pwd_pool() and put_fs_pwd_pool() use
read_seqlock_excl() rather than write_seqlock() to modify pwd_refs.

read_seqlock_excl() acquires the writer spinlock without bumping the
sequence counter. This is correct because pwd_refs changes don't affect
the path values that lockless seq readers care about. Using
write_seqlock() would needlessly force retries in concurrent
get_fs_pwd()/get_fs_root() callers.
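
The difference between the two lock flavours can be illustrated with a
tiny model. This is a single-threaded sketch under stated assumptions:
struct model_seqlock and the model_* functions are hypothetical, a bool
stands in for the spinlock, and the real reader's wait for an even
sequence value is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a seqlock: a lock flag plus a sequence counter. */
struct model_seqlock {
	unsigned int sequence; /* bumped on writer entry and exit */
	bool locked;           /* stands in for the writer spinlock */
};

/* Models write_seqlock(): take the lock and bump the sequence. */
static void model_write_lock(struct model_seqlock *sl)
{
	assert(!sl->locked);
	sl->locked = true;
	sl->sequence++;
}

static void model_write_unlock(struct model_seqlock *sl)
{
	sl->sequence++;
	sl->locked = false;
}

/* Models read_seqlock_excl(): same lock, sequence left untouched. */
static void model_excl_lock(struct model_seqlock *sl)
{
	assert(!sl->locked);
	sl->locked = true;
}

static void model_excl_unlock(struct model_seqlock *sl)
{
	sl->locked = false;
}

/* Lockless reader: sample the sequence, later check whether it moved. */
static unsigned int model_read_begin(const struct model_seqlock *sl)
{
	return sl->sequence;
}

static bool model_read_retry(const struct model_seqlock *sl, unsigned int start)
{
	return sl->sequence != start;
}
```

In this model a counter update done between model_excl_lock() and
model_excl_unlock() leaves model_read_retry() false for a reader that
started earlier, while any model_write_lock()/model_write_unlock() pair
makes it true.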

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 include/linux/fs_struct.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/fs_struct.h b/include/linux/fs_struct.h
index 9414a572d8f2..b88437f04672 100644
--- a/include/linux/fs_struct.h
+++ b/include/linux/fs_struct.h
@@ -41,7 +41,15 @@ static inline void get_fs_pwd(struct fs_struct *fs, struct path *pwd)
 	read_sequnlock_excl(&fs->seq);
 }
 
-/* Acquire a pwd reference from the pwd_refs pool, if available */
+/*
+ * Acquire a pwd reference from the pwd_refs pool, if available.
+ *
+ * Uses read_seqlock_excl() (writer spinlock without sequence bump) rather
+ * than write_seqlock() because modifying pwd_refs does not change the path
+ * values that lockless seq readers care about. Bumping the sequence counter
+ * would force unnecessary retries in concurrent get_fs_pwd()/get_fs_root()
+ * callers.
+ */
 static inline void get_fs_pwd_pool(struct fs_struct *fs, struct path *pwd)
 {
 	read_seqlock_excl(&fs->seq);
@@ -53,7 +61,7 @@ static inline void get_fs_pwd_pool(struct fs_struct *fs, struct path *pwd)
 	read_sequnlock_excl(&fs->seq);
 }
 
-/* Release a pwd reference back to the pwd_refs pool, if appropriate */
+/* Release a pwd reference back to the pwd_refs pool, if appropriate. */
 static inline void put_fs_pwd_pool(struct fs_struct *fs, struct path *pwd)
 {
 	read_seqlock_excl(&fs->seq);
-- 
2.47.3


[-- Attachment #4: 0003-fs-add-drain_fs_pwd_pool-helper.patch --]
[-- Type: text/x-diff, Size: 3007 bytes --]

From 367cb14a4623f0ae35dd13d586eb224cffee7a11 Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Thu, 5 Mar 2026 22:07:07 +0100
Subject: [PATCH 3/4] fs: add drain_fs_pwd_pool() helper

Add a drain_fs_pwd_pool() function to encapsulate draining the pwd
reference pool. This keeps the pool implementation details private to
fs_struct code.

Use it in free_fs_struct() and copy_mnt_ns(). The latter previously
manipulated fs->pwd_refs directly from namespace code.

The caller must ensure exclusive access to the fs_struct, either
because fs->users == 1 or the write side of the seqlock is held.
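
The accounting behind the drain can be sketched in userspace as follows.
This is a hedged model only: struct obj, struct fs_model and the model_*
functions are hypothetical, and exclusive access is simply assumed rather
than enforced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the object behind fs->pwd. */
struct obj {
	long refcount;
};

/* Hypothetical stand-in for fs_struct. */
struct fs_model {
	struct obj *pwd;
	int pwd_refs; /* extra references parked in the pool */
};

static void obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	o->refcount--;
}

/* Drop every pooled reference; caller is assumed to own fs exclusively. */
static void model_drain_pwd_pool(struct fs_model *fs)
{
	while (fs->pwd_refs) {
		obj_put(fs->pwd);
		fs->pwd_refs--;
	}
}

/* Teardown: drain the pool, then drop the reference held by fs->pwd. */
static void model_free_fs(struct fs_model *fs)
{
	model_drain_pwd_pool(fs);
	obj_put(fs->pwd);
	fs->pwd = NULL;
}
```

An object referenced once by fs->pwd and three times by the pool starts
with a count of four in this model; after model_free_fs() the count is
zero and the pool is empty.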

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/fs_struct.c            | 18 ++++++++++++++----
 fs/namespace.c            | 12 +++++++-----
 include/linux/fs_struct.h |  1 +
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/fs/fs_struct.c b/fs/fs_struct.c
index ce814b76bde7..0a72fb3ea427 100644
--- a/fs/fs_struct.c
+++ b/fs/fs_struct.c
@@ -89,13 +89,23 @@ void chroot_fs_refs(const struct path *old_root, const struct path *new_root)
 		path_put(old_root);
 }
 
-void free_fs_struct(struct fs_struct *fs)
+/*
+ * Drain extra pwd references from the pool. The caller must ensure
+ * exclusive access to @fs (e.g., fs->users == 1 or under write_seqlock).
+ */
+void drain_fs_pwd_pool(struct fs_struct *fs)
 {
-	int count = fs->pwd_refs + 1;
+	while (fs->pwd_refs) {
+		path_put(&fs->pwd);
+		fs->pwd_refs--;
+	}
+}
 
+void free_fs_struct(struct fs_struct *fs)
+{
 	path_put(&fs->root);
-	while (count--)
-		path_put(&fs->pwd);
+	drain_fs_pwd_pool(fs);
+	path_put(&fs->pwd);
 	kmem_cache_free(fs_cachep, fs);
 }
 
diff --git a/fs/namespace.c b/fs/namespace.c
index 89aef4e81f23..06b856410a01 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4262,11 +4262,13 @@ struct mnt_namespace *copy_mnt_ns(u64 flags, struct mnt_namespace *ns,
 	if (new_fs)
 		WARN_ON_ONCE(new_fs->users != 1);
 
-	/* Release the extra pwd references of new_fs, if present. */
-	while (new_fs && new_fs->pwd_refs) {
-		path_put(&new_fs->pwd);
-		new_fs->pwd_refs--;
-	}
+	/*
+	 * Drain the pwd reference pool. The pool must be empty before we
+	 * update new_fs->pwd.mnt below since the pooled references belong
+	 * to the old mount. Safe to access without locking: new_fs->users == 1.
+	 */
+	if (new_fs)
+		drain_fs_pwd_pool(new_fs);
 	p = old;
 	q = new;
 	while (p) {
diff --git a/include/linux/fs_struct.h b/include/linux/fs_struct.h
index b88437f04672..e67d92f88605 100644
--- a/include/linux/fs_struct.h
+++ b/include/linux/fs_struct.h
@@ -23,6 +23,7 @@ extern void set_fs_root(struct fs_struct *, const struct path *);
 extern void set_fs_pwd(struct fs_struct *, const struct path *);
 extern struct fs_struct *copy_fs_struct(struct fs_struct *);
 extern void free_fs_struct(struct fs_struct *);
+extern void drain_fs_pwd_pool(struct fs_struct *);
 extern int unshare_fs_struct(void);
 
 static inline void get_fs_root(struct fs_struct *fs, struct path *root)
-- 
2.47.3


[-- Attachment #5: 0004-fs-factor-out-get_fs_pwd_pool_locked-for-lock-held-c.patch --]
[-- Type: text/x-diff, Size: 2129 bytes --]

From 8d0fcb0fdde5e29ed01d04d6123df32bc5e325c3 Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Thu, 5 Mar 2026 22:13:23 +0100
Subject: [PATCH 4/4] fs: factor out get_fs_pwd_pool_locked() for lock-held
 callers

Extract the inner pool borrow logic from get_fs_pwd_pool() into
get_fs_pwd_pool_locked() for callers that already hold fs->seq.

Use it in copy_fs_struct() which open-coded the same pool borrow
pattern under an existing lock hold.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/fs_struct.c            |  6 +-----
 include/linux/fs_struct.h | 16 +++++++++++-----
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/fs_struct.c b/fs/fs_struct.c
index 0a72fb3ea427..e1487ca6256d 100644
--- a/fs/fs_struct.c
+++ b/fs/fs_struct.c
@@ -140,11 +140,7 @@ struct fs_struct *copy_fs_struct(struct fs_struct *old)
 		read_seqlock_excl(&old->seq);
 		fs->root = old->root;
 		path_get(&fs->root);
-		fs->pwd = old->pwd;
-		if (old->pwd_refs)
-			old->pwd_refs--;
-		else
-			path_get(&fs->pwd);
+		get_fs_pwd_pool_locked(old, &fs->pwd);
 		read_sequnlock_excl(&old->seq);
 	}
 	return fs;
diff --git a/include/linux/fs_struct.h b/include/linux/fs_struct.h
index e67d92f88605..b63003cec25f 100644
--- a/include/linux/fs_struct.h
+++ b/include/linux/fs_struct.h
@@ -42,6 +42,16 @@ static inline void get_fs_pwd(struct fs_struct *fs, struct path *pwd)
 	read_sequnlock_excl(&fs->seq);
 }
 
+/* Borrow a pwd reference from the pool. Caller must hold fs->seq. */
+static inline void get_fs_pwd_pool_locked(struct fs_struct *fs, struct path *pwd)
+{
+	*pwd = fs->pwd;
+	if (fs->pwd_refs)
+		fs->pwd_refs--;
+	else
+		path_get(pwd);
+}
+
 /*
  * Acquire a pwd reference from the pwd_refs pool, if available.
  *
@@ -54,11 +64,7 @@ static inline void get_fs_pwd(struct fs_struct *fs, struct path *pwd)
 static inline void get_fs_pwd_pool(struct fs_struct *fs, struct path *pwd)
 {
 	read_seqlock_excl(&fs->seq);
-	*pwd = fs->pwd;
-	if (fs->pwd_refs)
-		fs->pwd_refs--;
-	else
-		path_get(pwd);
+	get_fs_pwd_pool_locked(fs, pwd);
 	read_sequnlock_excl(&fs->seq);
 }
 
-- 
2.47.3



* Re: [PATCH v4 0/2] fs, audit: Avoid excessive dput/dget in audit_context setup and reset paths
  2026-03-05 21:46 ` [PATCH v4 0/2] fs, audit: Avoid excessive dput/dget in audit_context setup and reset paths Christian Brauner
@ 2026-03-06  0:53   ` Waiman Long
  0 siblings, 0 replies; 5+ messages in thread
From: Waiman Long @ 2026-03-06  0:53 UTC
  To: Christian Brauner
  Cc: Paul Moore, Eric Paris, Al Viro, Jan Kara, linux-kernel,
	linux-fsdevel, audit, Richard Guy Briggs, Ricardo Robaina

On 3/5/26 4:46 PM, Christian Brauner wrote:
> On Sat, Feb 28, 2026 at 01:27:55PM -0500, Waiman Long wrote:
>>   v4:
>>    - Add ack and review tags
>>    - Simplify put_fs_pwd_pool() in patch 1 as suggested by Paul Moore
>>
>>   v3:
>>    - https://lore.kernel.org/lkml/20260206201918.1988344-1-longman@redhat.com/
>>
>> When the audit subsystem is enabled, it can make a lot of get_fs_pwd()
>> calls to get references to fs->pwd and then release those references
>> with path_put() later. That may cause a lot of spinlock contention
>> on a single pwd's dentry lock because of the constant changes to the
>> reference count when many processes in the same working directory
>> are actively doing open/close system calls. This can cause a
>> noticeable performance regression compared with the case where the
>> audit subsystem is turned off, especially on systems with a lot of
>> CPUs, which are becoming more common these days.
>>
>> This patch series aims to avoid this type of performance regression
>> caused by audit by adding a new set of fs_struct helpers that reduce
>> unnecessary path_get() and path_put() calls, and by modifying the audit
>> code to use these new helpers.
> Tbh, the open-coding everywhere is really not very tasteful and makes me
> not want to do this at all. Ideally we'd have a better mechanism that
> avoids all this new spaghetti in various codepaths.
>
> In its current form I don't find it palatable. I added a few cleanups
> on top that make it at least somewhat ok.

Thanks for the cleanup patches. They all look good to me.

Reviewed-by: Waiman Long <longman@redhat.com>

