public inbox for linux-nfs@vger.kernel.org
* [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks
@ 2026-02-19 21:50 Chuck Lever
  2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Chuck Lever @ 2026-02-19 21:50 UTC
  To: misanjum, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
	Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Attempt to address three crashes reported here:

https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/

These are compile-tested and regression-tested, but as I do not have
a PowerPC system handy, I will need someone who has one to test
whether they actually address the crashes.

Chuck Lever (2):
  NFSD: Defer sub-object cleanup in export put callbacks
  NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd

 fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
 fs/nfsd/export.h |  7 ++++--
 fs/nfsd/nfsctl.c | 22 ++++++++++++++---
 3 files changed, 78 insertions(+), 14 deletions(-)

-- 
2.53.0


* [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks
  2026-02-19 21:50 [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks Chuck Lever
@ 2026-02-19 21:50 ` Chuck Lever
  2026-02-20 15:50   ` Jeff Layton
  2026-02-19 21:50 ` [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd Chuck Lever
  2026-02-21 22:57 ` [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks NeilBrown
  2 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2026-02-19 21:50 UTC
  To: misanjum, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
	Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

svc_export_put() calls path_put() and auth_domain_put() immediately
when the last reference drops, before the RCU grace period. RCU
readers in e_show() and c_show() access both ex_path (via
seq_path/d_path) and ex_client->name (via seq_escape) without
holding a reference. If cache_clean removes the entry and drops the
last reference concurrently, the sub-objects are freed while still
in use, producing a NULL pointer dereference in d_path.

Commit 2530766492ec ("nfsd: fix UAF when access ex_uuid or
ex_stats") moved kfree of ex_uuid and ex_stats into the
call_rcu callback, but left path_put() and auth_domain_put() running
before the grace period because both may sleep and call_rcu
callbacks execute in softirq context.

Replace call_rcu/kfree_rcu with queue_rcu_work(), which defers the
callback until after the RCU grace period and executes it in process
context where sleeping is permitted. This allows path_put() and
auth_domain_put() to be moved into the deferred callback alongside
the other resource releases. Apply the same fix to expkey_put(),
which has the identical pattern with ek_path and ek_client.

A dedicated workqueue scopes the shutdown drain to only NFSD
export release work items; flushing the shared
system_unbound_wq would stall on unrelated work from other
subsystems. nfsd_export_shutdown() uses rcu_barrier() followed
by flush_workqueue() to ensure all deferred release callbacks
complete before the export caches are destroyed.

Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
Fixes: c224edca7af0 ("nfsd: no need get cache ref when protected by rcu")
Fixes: 1b10f0b603c0 ("SUNRPC: no need get cache ref when protected by rcu")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
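For readers following the description above, here is a minimal
userspace sketch of the ordering this patch relies on: the put
callback only queues the object, and the sub-objects are freed later,
once no reader can still be looking at them. This is not kernel code
and not part of the patch. It is a single-threaded model, and every
name in it (struct entry, reader_lock(), entry_put(),
run_deferred_releases()) is hypothetical. The reader counter merely
stands in for RCU read-side critical sections, and the deferred list
stands in for work queued with queue_rcu_work().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cache entry that owns a sub-object. */
struct entry {
	int *sub_object;		/* models ex_path / ex_client */
	struct entry *next_deferred;	/* link on the deferred list */
};

/* Models release work that has been queued but has not yet run. */
static struct entry *deferred_list;

/* Models the number of readers inside a read-side critical section. */
static int active_readers;

static struct entry *entry_alloc(int value)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->sub_object = malloc(sizeof(*e->sub_object));
	if (!e->sub_object) {
		free(e);
		return NULL;
	}
	*e->sub_object = value;
	e->next_deferred = NULL;
	return e;
}

static void reader_lock(void)
{
	active_readers++;
}

static void reader_unlock(void)
{
	assert(active_readers > 0);
	active_readers--;
}

/*
 * Models the put callback after the change: the last reference is
 * gone, but nothing is freed here. The entry is only queued.
 */
static void entry_put(struct entry *e)
{
	e->next_deferred = deferred_list;
	deferred_list = e;
}

/*
 * Models the grace period ending and the work items running. While a
 * reader is active nothing is released. Returns the number of entries
 * released.
 */
static int run_deferred_releases(void)
{
	int released = 0;

	if (active_readers > 0)
		return 0;
	while (deferred_list) {
		struct entry *e = deferred_list;

		deferred_list = e->next_deferred;
		free(e->sub_object);	/* sub-object and container go together */
		free(e);
		released++;
	}
	return released;
}
```

In this model a reader that took reader_lock() before entry_put() can
keep dereferencing e->sub_object until it calls reader_unlock(); only
a later run_deferred_releases() frees anything. Real RCU tracks grace
periods very differently, so treat this purely as an illustration of
the ordering.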
 fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
 fs/nfsd/export.h |  7 ++++--
 fs/nfsd/nfsctl.c |  8 +++++-
 3 files changed, 66 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 04b18f0f402f..53fe66784ed2 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -36,19 +36,30 @@
  * second map contains a reference to the entry in the first map.
  */
 
+static struct workqueue_struct *nfsd_export_wq;
+
 #define	EXPKEY_HASHBITS		8
 #define	EXPKEY_HASHMAX		(1 << EXPKEY_HASHBITS)
 #define	EXPKEY_HASHMASK		(EXPKEY_HASHMAX -1)
 
-static void expkey_put(struct kref *ref)
+static void expkey_release(struct work_struct *work)
 {
-	struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
+	struct svc_expkey *key = container_of(to_rcu_work(work),
+					      struct svc_expkey, ek_rwork);
 
 	if (test_bit(CACHE_VALID, &key->h.flags) &&
 	    !test_bit(CACHE_NEGATIVE, &key->h.flags))
 		path_put(&key->ek_path);
 	auth_domain_put(key->ek_client);
-	kfree_rcu(key, ek_rcu);
+	kfree(key);
+}
+
+static void expkey_put(struct kref *ref)
+{
+	struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
+
+	INIT_RCU_WORK(&key->ek_rwork, expkey_release);
+	queue_rcu_work(nfsd_export_wq, &key->ek_rwork);
 }
 
 static int expkey_upcall(struct cache_detail *cd, struct cache_head *h)
@@ -353,11 +364,13 @@ static void export_stats_destroy(struct export_stats *stats)
 					    EXP_STATS_COUNTERS_NUM);
 }
 
-static void svc_export_release(struct rcu_head *rcu_head)
+static void svc_export_release(struct work_struct *work)
 {
-	struct svc_export *exp = container_of(rcu_head, struct svc_export,
-			ex_rcu);
+	struct svc_export *exp = container_of(to_rcu_work(work),
+					      struct svc_export, ex_rwork);
 
+	path_put(&exp->ex_path);
+	auth_domain_put(exp->ex_client);
 	nfsd4_fslocs_free(&exp->ex_fslocs);
 	export_stats_destroy(exp->ex_stats);
 	kfree(exp->ex_stats);
@@ -369,9 +382,8 @@ static void svc_export_put(struct kref *ref)
 {
 	struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
 
-	path_put(&exp->ex_path);
-	auth_domain_put(exp->ex_client);
-	call_rcu(&exp->ex_rcu, svc_export_release);
+	INIT_RCU_WORK(&exp->ex_rwork, svc_export_release);
+	queue_rcu_work(nfsd_export_wq, &exp->ex_rwork);
 }
 
 static int svc_export_upcall(struct cache_detail *cd, struct cache_head *h)
@@ -1481,6 +1493,36 @@ const struct seq_operations nfs_exports_op = {
 	.show	= e_show,
 };
 
+/**
+ * nfsd_export_wq_init - allocate the export release workqueue
+ *
+ * Called once at module load. The workqueue runs deferred svc_export and
+ * svc_expkey release work scheduled by queue_rcu_work() in the cache put
+ * callbacks.
+ *
+ * Return values:
+ *   %0: workqueue allocated
+ *   %-ENOMEM: allocation failed
+ */
+int nfsd_export_wq_init(void)
+{
+	nfsd_export_wq = alloc_workqueue("nfsd_export", WQ_UNBOUND, 0);
+	if (!nfsd_export_wq)
+		return -ENOMEM;
+	return 0;
+}
+
+/**
+ * nfsd_export_wq_shutdown - drain and free the export release workqueue
+ *
+ * Called once at module unload. Per-namespace teardown in
+ * nfsd_export_shutdown() has already drained all deferred work.
+ */
+void nfsd_export_wq_shutdown(void)
+{
+	destroy_workqueue(nfsd_export_wq);
+}
+
 /*
  * Initialize the exports module.
  */
@@ -1542,6 +1584,9 @@ nfsd_export_shutdown(struct net *net)
 
 	cache_unregister_net(nn->svc_expkey_cache, net);
 	cache_unregister_net(nn->svc_export_cache, net);
+	/* Drain deferred export and expkey release work. */
+	rcu_barrier();
+	flush_workqueue(nfsd_export_wq);
 	cache_destroy_net(nn->svc_expkey_cache, net);
 	cache_destroy_net(nn->svc_export_cache, net);
 	svcauth_unix_purge(net);
diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
index d2b09cd76145..b05399374574 100644
--- a/fs/nfsd/export.h
+++ b/fs/nfsd/export.h
@@ -7,6 +7,7 @@
 
 #include <linux/sunrpc/cache.h>
 #include <linux/percpu_counter.h>
+#include <linux/workqueue.h>
 #include <uapi/linux/nfsd/export.h>
 #include <linux/nfs4.h>
 
@@ -75,7 +76,7 @@ struct svc_export {
 	u32			ex_layout_types;
 	struct nfsd4_deviceid_map *ex_devid_map;
 	struct cache_detail	*cd;
-	struct rcu_head		ex_rcu;
+	struct rcu_work		ex_rwork;
 	unsigned long		ex_xprtsec_modes;
 	struct export_stats	*ex_stats;
 };
@@ -92,7 +93,7 @@ struct svc_expkey {
 	u32			ek_fsid[6];
 
 	struct path		ek_path;
-	struct rcu_head		ek_rcu;
+	struct rcu_work		ek_rwork;
 };
 
 #define EX_ISSYNC(exp)		(!((exp)->ex_flags & NFSEXP_ASYNC))
@@ -110,6 +111,8 @@ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
 /*
  * Function declarations
  */
+int			nfsd_export_wq_init(void);
+void			nfsd_export_wq_shutdown(void);
 int			nfsd_export_init(struct net *);
 void			nfsd_export_shutdown(struct net *);
 void			nfsd_export_flush(struct net *);
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 664a3275c511..4166f59908f4 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -2308,9 +2308,12 @@ static int __init init_nfsd(void)
 	if (retval)
 		goto out_free_pnfs;
 	nfsd_lockd_init();	/* lockd->nfsd callbacks */
+	retval = nfsd_export_wq_init();
+	if (retval)
+		goto out_free_lockd;
 	retval = register_pernet_subsys(&nfsd_net_ops);
 	if (retval < 0)
-		goto out_free_lockd;
+		goto out_free_export_wq;
 	retval = register_cld_notifier();
 	if (retval)
 		goto out_free_subsys;
@@ -2339,6 +2342,8 @@ static int __init init_nfsd(void)
 	unregister_cld_notifier();
 out_free_subsys:
 	unregister_pernet_subsys(&nfsd_net_ops);
+out_free_export_wq:
+	nfsd_export_wq_shutdown();
 out_free_lockd:
 	nfsd_lockd_shutdown();
 	nfsd_drc_slab_free();
@@ -2359,6 +2364,7 @@ static void __exit exit_nfsd(void)
 	nfsd4_destroy_laundry_wq();
 	unregister_cld_notifier();
 	unregister_pernet_subsys(&nfsd_net_ops);
+	nfsd_export_wq_shutdown();
 	nfsd_drc_slab_free();
 	nfsd_lockd_shutdown();
 	nfsd4_free_slabs();
-- 
2.53.0


* [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
  2026-02-19 21:50 [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks Chuck Lever
  2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
@ 2026-02-19 21:50 ` Chuck Lever
  2026-02-20 15:52   ` Jeff Layton
  2026-02-21 22:57 ` [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks NeilBrown
  2 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2026-02-19 21:50 UTC
  To: misanjum, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
	Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

The /proc/fs/nfs/exports proc entry is created at module init
and persists for the module's lifetime. exports_proc_open()
captures the caller's current network namespace and stores
its svc_export_cache in seq->private, but takes no reference
on the namespace. If the namespace is subsequently torn down
(e.g. container destruction after the opener does setns() to a
different namespace), nfsd_net_exit() calls nfsd_export_shutdown()
which frees the cache. Subsequent reads on the still-open fd
dereference the freed cache_detail, walking a freed hash table.

Hold a reference on the struct net for the lifetime of the open
file descriptor. This prevents nfsd_net_exit() from running --
and thus prevents nfsd_export_shutdown() from freeing the cache
-- while any exports fd is open. cache_detail already stores
its net pointer (cd->net, set by cache_create_net()), so
exports_release() can retrieve it without additional per-file
storage.

Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
Fixes: 96d851c4d28d ("nfsd: use proper net while reading "exports" file")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
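As an illustration of the lifetime rule described above, here is a
minimal userspace sketch, not kernel code and not part of the patch.
All names (struct toy_net, struct toy_cache, toy_exports_open(),
toy_exports_release()) are hypothetical. The one simplification to
keep in mind: the model tears the namespace down synchronously when
the count reaches zero, which is an assumption made for brevity and
not how the kernel schedules namespace cleanup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_net;

/* Models cache_detail: it keeps a back-pointer to its namespace. */
struct toy_cache {
	struct toy_net *net;
	int entries;
};

/* Models struct net: the cache is freed when the last ref is dropped. */
struct toy_net {
	int refcount;
	struct toy_cache *cache;
};

/* Models an open file: only the cache pointer is stored per file. */
struct toy_file {
	struct toy_cache *private_cache;
};

static struct toy_net *toy_net_create(void)
{
	struct toy_net *net = malloc(sizeof(*net));
	struct toy_cache *cache = malloc(sizeof(*cache));

	if (!net || !cache) {
		free(net);
		free(cache);
		return NULL;
	}
	cache->net = net;
	cache->entries = 3;
	net->refcount = 1;	/* reference held by the namespace owner */
	net->cache = cache;
	return net;
}

static void toy_get_net(struct toy_net *net)
{
	net->refcount++;
}

/* Returns true if this dropped the last reference and teardown ran. */
static bool toy_put_net(struct toy_net *net)
{
	assert(net->refcount > 0);
	if (--net->refcount > 0)
		return false;
	free(net->cache);
	free(net);
	return true;
}

/* Models open: stash the cache and pin the namespace that owns it. */
static void toy_exports_open(struct toy_net *net, struct toy_file *file)
{
	file->private_cache = net->cache;
	toy_get_net(net);
}

/* Models release: find the namespace through the cache and unpin it. */
static bool toy_exports_release(struct toy_file *file)
{
	struct toy_net *net = file->private_cache->net;

	file->private_cache = NULL;
	return toy_put_net(net);
}
```

With the reference taken at open time, the owner dropping its own
reference does not free the cache while the file is still open; the
cache goes away only when toy_exports_release() drops the final
reference. Because the cache carries the back-pointer, the file needs
no extra field to remember which namespace it pinned.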
 fs/nfsd/nfsctl.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 4166f59908f4..3d5a676e1d14 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -149,9 +149,19 @@ static int exports_net_open(struct net *net, struct file *file)
 
 	seq = file->private_data;
 	seq->private = nn->svc_export_cache;
+	get_net(net);
 	return 0;
 }
 
+static int exports_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *seq = file->private_data;
+	struct cache_detail *cd = seq->private;
+
+	put_net(cd->net);
+	return seq_release(inode, file);
+}
+
 static int exports_nfsd_open(struct inode *inode, struct file *file)
 {
 	return exports_net_open(inode->i_sb->s_fs_info, file);
@@ -161,7 +171,7 @@ static const struct file_operations exports_nfsd_operations = {
 	.open		= exports_nfsd_open,
 	.read		= seq_read,
 	.llseek		= seq_lseek,
-	.release	= seq_release,
+	.release	= exports_release,
 };
 
 static int export_features_show(struct seq_file *m, void *v)
@@ -1376,7 +1386,7 @@ static const struct proc_ops exports_proc_ops = {
 	.proc_open	= exports_proc_open,
 	.proc_read	= seq_read,
 	.proc_lseek	= seq_lseek,
-	.proc_release	= seq_release,
+	.proc_release	= exports_release,
 };
 
 static int create_proc_exports_entry(void)
-- 
2.53.0


* Re: [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks
  2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
@ 2026-02-20 15:50   ` Jeff Layton
  2026-02-25 18:29     ` Olga Kornievskaia
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2026-02-20 15:50 UTC
  To: Chuck Lever, misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Tom Talpey
  Cc: linux-nfs, Chuck Lever

On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> svc_export_put() calls path_put() and auth_domain_put() immediately
> when the last reference drops, before the RCU grace period. RCU
> readers in e_show() and c_show() access both ex_path (via
> seq_path/d_path) and ex_client->name (via seq_escape) without
> holding a reference. If cache_clean removes the entry and drops the
> last reference concurrently, the sub-objects are freed while still
> in use, producing a NULL pointer dereference in d_path.
> 
> Commit 2530766492ec ("nfsd: fix UAF when access ex_uuid or
> ex_stats") moved kfree of ex_uuid and ex_stats into the
> call_rcu callback, but left path_put() and auth_domain_put() running
> before the grace period because both may sleep and call_rcu
> callbacks execute in softirq context.
> 
> Replace call_rcu/kfree_rcu with queue_rcu_work(), which defers the
> callback until after the RCU grace period and executes it in process
> context where sleeping is permitted. This allows path_put() and
> auth_domain_put() to be moved into the deferred callback alongside
> the other resource releases. Apply the same fix to expkey_put(),
> which has the identical pattern with ek_path and ek_client.
> 
> A dedicated workqueue scopes the shutdown drain to only NFSD
> export release work items; flushing the shared
> system_unbound_wq would stall on unrelated work from other
> subsystems. nfsd_export_shutdown() uses rcu_barrier() followed
> by flush_workqueue() to ensure all deferred release callbacks
> complete before the export caches are destroyed.
> 
> Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
> Fixes: c224edca7af0 ("nfsd: no need get cache ref when protected by rcu")
> Fixes: 1b10f0b603c0 ("SUNRPC: no need get cache ref when protected by rcu")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Looks good.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

* Re: [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
  2026-02-19 21:50 ` [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd Chuck Lever
@ 2026-02-20 15:52   ` Jeff Layton
  2026-02-25 18:29     ` Olga Kornievskaia
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2026-02-20 15:52 UTC
  To: Chuck Lever, misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Tom Talpey
  Cc: linux-nfs, Chuck Lever

On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> The /proc/fs/nfs/exports proc entry is created at module init
> and persists for the module's lifetime. exports_proc_open()
> captures the caller's current network namespace and stores
> its svc_export_cache in seq->private, but takes no reference
> on the namespace. If the namespace is subsequently torn down
> (e.g. container destruction after the opener does setns() to a
> different namespace), nfsd_net_exit() calls nfsd_export_shutdown()
> which frees the cache. Subsequent reads on the still-open fd
> dereference the freed cache_detail, walking a freed hash table.
> 
> Hold a reference on the struct net for the lifetime of the open
> file descriptor. This prevents nfsd_net_exit() from running --
> and thus prevents nfsd_export_shutdown() from freeing the cache
> -- while any exports fd is open. cache_detail already stores
> its net pointer (cd->net, set by cache_create_net()), so
> exports_release() can retrieve it without additional per-file
> storage.
> 
> Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
> Fixes: 96d851c4d28d ("nfsd: use proper net while reading "exports" file")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Reviewed-by: Jeff Layton <jlayton@kernel.org>

* Re: [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks
  2026-02-19 21:50 [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks Chuck Lever
  2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
  2026-02-19 21:50 ` [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd Chuck Lever
@ 2026-02-21 22:57 ` NeilBrown
  2026-02-22 15:41   ` Chuck Lever
  2 siblings, 1 reply; 10+ messages in thread
From: NeilBrown @ 2026-02-21 22:57 UTC
  To: Chuck Lever
  Cc: misanjum, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs, Chuck Lever

On Fri, 20 Feb 2026, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Attempt to address three crashes reported here:
> 
> https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
> 
> These are compile-tested and regression-tested, but as I do not have
> a PowerPC system handy, I will need someone who has one to test
> whether they actually address the crashes.
> 
> Chuck Lever (2):
>   NFSD: Defer sub-object cleanup in export put callbacks
>   NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd

Nice.  I particularly liked the thorough commit descriptions!

Reviewed-by: NeilBrown <neil@brown.name>


> 
>  fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
>  fs/nfsd/export.h |  7 ++++--
>  fs/nfsd/nfsctl.c | 22 ++++++++++++++---
>  3 files changed, 78 insertions(+), 14 deletions(-)
> 
> -- 
> 2.53.0
> 
> 


* Re: [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks
  2026-02-21 22:57 ` [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks NeilBrown
@ 2026-02-22 15:41   ` Chuck Lever
  0 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2026-02-22 15:41 UTC
  To: NeilBrown
  Cc: misanjum, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs, Chuck Lever



On Sat, Feb 21, 2026, at 5:57 PM, NeilBrown wrote:
> On Fri, 20 Feb 2026, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>> 
>> Attempt to address three crashes reported here:
>> 
>> https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
>> 
>> These are compile-tested and regression-tested, but as I do not have
>> a PowerPC system handy, I will need someone who has one to test
>> whether they actually address the crashes.
>> 
>> Chuck Lever (2):
>>   NFSD: Defer sub-object cleanup in export put callbacks
>>   NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
>
> Nice.  I particularly liked the thorough commit descriptions!
>
> Reviewed-by: NeilBrown <neil@brown.name>

Thanks Neil!

-- 
Chuck Lever

* Re: [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks
  2026-02-20 15:50   ` Jeff Layton
@ 2026-02-25 18:29     ` Olga Kornievskaia
  2026-02-25 18:53       ` Chuck Lever
  0 siblings, 1 reply; 10+ messages in thread
From: Olga Kornievskaia @ 2026-02-25 18:29 UTC
  To: Jeff Layton
  Cc: Chuck Lever, misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Tom Talpey, linux-nfs, Chuck Lever

On Fri, Feb 20, 2026 at 10:50 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > svc_export_put() calls path_put() and auth_domain_put() immediately
> > when the last reference drops, before the RCU grace period. RCU
> > readers in e_show() and c_show() access both ex_path (via
> > seq_path/d_path) and ex_client->name (via seq_escape) without
> > holding a reference. If cache_clean removes the entry and drops the
> > last reference concurrently, the sub-objects are freed while still
> > in use, producing a NULL pointer dereference in d_path.
> >
> > Commit 2530766492ec ("nfsd: fix UAF when access ex_uuid or
> > ex_stats") moved kfree of ex_uuid and ex_stats into the
> > call_rcu callback, but left path_put() and auth_domain_put() running
> > before the grace period because both may sleep and call_rcu
> > callbacks execute in softirq context.
> >
> > Replace call_rcu/kfree_rcu with queue_rcu_work(), which defers the
> > callback until after the RCU grace period and executes it in process
> > context where sleeping is permitted. This allows path_put() and
> > auth_domain_put() to be moved into the deferred callback alongside
> > the other resource releases. Apply the same fix to expkey_put(),
> > which has the identical pattern with ek_path and ek_client.
> >
> > A dedicated workqueue scopes the shutdown drain to only NFSD
> > export release work items; flushing the shared
> > system_unbound_wq would stall on unrelated work from other
> > subsystems. nfsd_export_shutdown() uses rcu_barrier() followed
> > by flush_workqueue() to ensure all deferred release callbacks
> > complete before the export caches are destroyed.
> >
> > Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> > Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
> > Fixes: c224edca7af0 ("nfsd: no need get cache ref when protected by rcu")
> > Fixes: 1b10f0b603c0 ("SUNRPC: no need get cache ref when protected by rcu")
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Tested-by: Olga Kornievskaia <okorniev@redhat.com>

I can reproduce the problem, and I can confirm that with the 2 patches
applied I no longer see it.

> > ---
> >  fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
> >  fs/nfsd/export.h |  7 ++++--
> >  fs/nfsd/nfsctl.c |  8 +++++-
> >  3 files changed, 66 insertions(+), 12 deletions(-)
> >
> > diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> > index 04b18f0f402f..53fe66784ed2 100644
> > --- a/fs/nfsd/export.c
> > +++ b/fs/nfsd/export.c
> > @@ -36,19 +36,30 @@
> >   * second map contains a reference to the entry in the first map.
> >   */
> >
> > +static struct workqueue_struct *nfsd_export_wq;
> > +
> >  #define      EXPKEY_HASHBITS         8
> >  #define      EXPKEY_HASHMAX          (1 << EXPKEY_HASHBITS)
> >  #define      EXPKEY_HASHMASK         (EXPKEY_HASHMAX -1)
> >
> > -static void expkey_put(struct kref *ref)
> > +static void expkey_release(struct work_struct *work)
> >  {
> > -     struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
> > +     struct svc_expkey *key = container_of(to_rcu_work(work),
> > +                                           struct svc_expkey, ek_rwork);
> >
> >       if (test_bit(CACHE_VALID, &key->h.flags) &&
> >           !test_bit(CACHE_NEGATIVE, &key->h.flags))
> >               path_put(&key->ek_path);
> >       auth_domain_put(key->ek_client);
> > -     kfree_rcu(key, ek_rcu);
> > +     kfree(key);
> > +}
> > +
> > +static void expkey_put(struct kref *ref)
> > +{
> > +     struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
> > +
> > +     INIT_RCU_WORK(&key->ek_rwork, expkey_release);
> > +     queue_rcu_work(nfsd_export_wq, &key->ek_rwork);
> >  }
> >
> >  static int expkey_upcall(struct cache_detail *cd, struct cache_head *h)
> > @@ -353,11 +364,13 @@ static void export_stats_destroy(struct export_stats *stats)
> >                                           EXP_STATS_COUNTERS_NUM);
> >  }
> >
> > -static void svc_export_release(struct rcu_head *rcu_head)
> > +static void svc_export_release(struct work_struct *work)
> >  {
> > -     struct svc_export *exp = container_of(rcu_head, struct svc_export,
> > -                     ex_rcu);
> > +     struct svc_export *exp = container_of(to_rcu_work(work),
> > +                                           struct svc_export, ex_rwork);
> >
> > +     path_put(&exp->ex_path);
> > +     auth_domain_put(exp->ex_client);
> >       nfsd4_fslocs_free(&exp->ex_fslocs);
> >       export_stats_destroy(exp->ex_stats);
> >       kfree(exp->ex_stats);
> > @@ -369,9 +382,8 @@ static void svc_export_put(struct kref *ref)
> >  {
> >       struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
> >
> > -     path_put(&exp->ex_path);
> > -     auth_domain_put(exp->ex_client);
> > -     call_rcu(&exp->ex_rcu, svc_export_release);
> > +     INIT_RCU_WORK(&exp->ex_rwork, svc_export_release);
> > +     queue_rcu_work(nfsd_export_wq, &exp->ex_rwork);
> >  }
> >
> >  static int svc_export_upcall(struct cache_detail *cd, struct cache_head *h)
> > @@ -1481,6 +1493,36 @@ const struct seq_operations nfs_exports_op = {
> >       .show   = e_show,
> >  };
> >
> > +/**
> > + * nfsd_export_wq_init - allocate the export release workqueue
> > + *
> > + * Called once at module load. The workqueue runs deferred svc_export and
> > + * svc_expkey release work scheduled by queue_rcu_work() in the cache put
> > + * callbacks.
> > + *
> > + * Return values:
> > + *   %0: workqueue allocated
> > + *   %-ENOMEM: allocation failed
> > + */
> > +int nfsd_export_wq_init(void)
> > +{
> > +     nfsd_export_wq = alloc_workqueue("nfsd_export", WQ_UNBOUND, 0);
> > +     if (!nfsd_export_wq)
> > +             return -ENOMEM;
> > +     return 0;
> > +}
> > +
> > +/**
> > + * nfsd_export_wq_shutdown - drain and free the export release workqueue
> > + *
> > + * Called once at module unload. Per-namespace teardown in
> > + * nfsd_export_shutdown() has already drained all deferred work.
> > + */
> > +void nfsd_export_wq_shutdown(void)
> > +{
> > +     destroy_workqueue(nfsd_export_wq);
> > +}
> > +
> >  /*
> >   * Initialize the exports module.
> >   */
> > @@ -1542,6 +1584,9 @@ nfsd_export_shutdown(struct net *net)
> >
> >       cache_unregister_net(nn->svc_expkey_cache, net);
> >       cache_unregister_net(nn->svc_export_cache, net);
> > +     /* Drain deferred export and expkey release work. */
> > +     rcu_barrier();
> > +     flush_workqueue(nfsd_export_wq);
> >       cache_destroy_net(nn->svc_expkey_cache, net);
> >       cache_destroy_net(nn->svc_export_cache, net);
> >       svcauth_unix_purge(net);
> > diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
> > index d2b09cd76145..b05399374574 100644
> > --- a/fs/nfsd/export.h
> > +++ b/fs/nfsd/export.h
> > @@ -7,6 +7,7 @@
> >
> >  #include <linux/sunrpc/cache.h>
> >  #include <linux/percpu_counter.h>
> > +#include <linux/workqueue.h>
> >  #include <uapi/linux/nfsd/export.h>
> >  #include <linux/nfs4.h>
> >
> > @@ -75,7 +76,7 @@ struct svc_export {
> >       u32                     ex_layout_types;
> >       struct nfsd4_deviceid_map *ex_devid_map;
> >       struct cache_detail     *cd;
> > -     struct rcu_head         ex_rcu;
> > +     struct rcu_work         ex_rwork;
> >       unsigned long           ex_xprtsec_modes;
> >       struct export_stats     *ex_stats;
> >  };
> > @@ -92,7 +93,7 @@ struct svc_expkey {
> >       u32                     ek_fsid[6];
> >
> >       struct path             ek_path;
> > -     struct rcu_head         ek_rcu;
> > +     struct rcu_work         ek_rwork;
> >  };
> >
> >  #define EX_ISSYNC(exp)               (!((exp)->ex_flags & NFSEXP_ASYNC))
> > @@ -110,6 +111,8 @@ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
> >  /*
> >   * Function declarations
> >   */
> > +int                  nfsd_export_wq_init(void);
> > +void                 nfsd_export_wq_shutdown(void);
> >  int                  nfsd_export_init(struct net *);
> >  void                 nfsd_export_shutdown(struct net *);
> >  void                 nfsd_export_flush(struct net *);
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 664a3275c511..4166f59908f4 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -2308,9 +2308,12 @@ static int __init init_nfsd(void)
> >       if (retval)
> >               goto out_free_pnfs;
> >       nfsd_lockd_init();      /* lockd->nfsd callbacks */
> > +     retval = nfsd_export_wq_init();
> > +     if (retval)
> > +             goto out_free_lockd;
> >       retval = register_pernet_subsys(&nfsd_net_ops);
> >       if (retval < 0)
> > -             goto out_free_lockd;
> > +             goto out_free_export_wq;
> >       retval = register_cld_notifier();
> >       if (retval)
> >               goto out_free_subsys;
> > @@ -2339,6 +2342,8 @@ static int __init init_nfsd(void)
> >       unregister_cld_notifier();
> >  out_free_subsys:
> >       unregister_pernet_subsys(&nfsd_net_ops);
> > +out_free_export_wq:
> > +     nfsd_export_wq_shutdown();
> >  out_free_lockd:
> >       nfsd_lockd_shutdown();
> >       nfsd_drc_slab_free();
> > @@ -2359,6 +2364,7 @@ static void __exit exit_nfsd(void)
> >       nfsd4_destroy_laundry_wq();
> >       unregister_cld_notifier();
> >       unregister_pernet_subsys(&nfsd_net_ops);
> > +     nfsd_export_wq_shutdown();
> >       nfsd_drc_slab_free();
> >       nfsd_lockd_shutdown();
> >       nfsd4_free_slabs();
>
> Looks good.
>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>


* Re: [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
  2026-02-20 15:52   ` Jeff Layton
@ 2026-02-25 18:29     ` Olga Kornievskaia
  0 siblings, 0 replies; 10+ messages in thread
From: Olga Kornievskaia @ 2026-02-25 18:29 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo,
	Tom Talpey, linux-nfs, Chuck Lever

On Fri, Feb 20, 2026 at 10:53 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > The /proc/fs/nfs/exports proc entry is created at module init
> > and persists for the module's lifetime. exports_proc_open()
> > captures the caller's current network namespace and stores
> > its svc_export_cache in seq->private, but takes no reference
> > on the namespace. If the namespace is subsequently torn down
> > (e.g. container destruction after the opener does setns() to a
> > different namespace), nfsd_net_exit() calls nfsd_export_shutdown()
> > which frees the cache. Subsequent reads on the still-open fd
> > dereference the freed cache_detail, walking a freed hash table.
> >
> > Hold a reference on the struct net for the lifetime of the open
> > file descriptor. This prevents nfsd_net_exit() from running --
> > and thus prevents nfsd_export_shutdown() from freeing the cache
> > -- while any exports fd is open. cache_detail already stores
> > its net pointer (cd->net, set by cache_create_net()), so
> > exports_release() can retrieve it without additional per-file
> > storage.
> >
> > Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> > Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
> > Fixes: 96d851c4d28d ("nfsd: use proper net while reading "exports" file")
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Tested-by: Olga Kornievskaia <okorniev@redhat.com>
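
The lifetime rule in the commit message above can be sketched in
userspace C. This is only a model under stated assumptions: every
model_* name is hypothetical, the counter is a plain int rather than
the kernel's reference counting, and setting cache_freed stands in
for nfsd_net_exit() calling nfsd_export_shutdown(). It is not the
kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net plus its export cache. */
struct model_net {
	int refcount;		/* references held on the namespace */
	bool cache_freed;	/* true once teardown has run */
};

/* Hypothetical stand-in for an open exports file descriptor. */
struct model_file {
	struct model_net *net;
};

/* Models get_net(): take one reference. */
static void model_get_net(struct model_net *net)
{
	net->refcount++;
}

/* Models put_net(): teardown runs only when the last reference drops. */
static void model_put_net(struct model_net *net)
{
	assert(net->refcount > 0);
	if (--net->refcount == 0)
		net->cache_freed = true;
}

/* Models exports_net_open() after the patch: pin the namespace. */
static void model_exports_open(struct model_net *net, struct model_file *file)
{
	file->net = net;
	model_get_net(net);
}

/* Models exports_release(): drop the reference taken at open time. */
static void model_exports_release(struct model_file *file)
{
	model_put_net(file->net);
	file->net = NULL;
}

/* A read is safe only while the file is open and the cache still exists. */
static bool model_exports_read_ok(const struct model_file *file)
{
	return file->net && !file->net->cache_freed;
}
```

In this model, a namespace whose owner drops its own reference while
a file is still open keeps its cache until model_exports_release()
runs. Without the model_get_net() call in model_exports_open(), the
same sequence would free the cache while the file is still open.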

> > ---
> >  fs/nfsd/nfsctl.c | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 4166f59908f4..3d5a676e1d14 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -149,9 +149,19 @@ static int exports_net_open(struct net *net, struct file *file)
> >
> >       seq = file->private_data;
> >       seq->private = nn->svc_export_cache;
> > +     get_net(net);
> >       return 0;
> >  }
> >
> > +static int exports_release(struct inode *inode, struct file *file)
> > +{
> > +     struct seq_file *seq = file->private_data;
> > +     struct cache_detail *cd = seq->private;
> > +
> > +     put_net(cd->net);
> > +     return seq_release(inode, file);
> > +}
> > +
> >  static int exports_nfsd_open(struct inode *inode, struct file *file)
> >  {
> >       return exports_net_open(inode->i_sb->s_fs_info, file);
> > @@ -161,7 +171,7 @@ static const struct file_operations exports_nfsd_operations = {
> >       .open           = exports_nfsd_open,
> >       .read           = seq_read,
> >       .llseek         = seq_lseek,
> > -     .release        = seq_release,
> > +     .release        = exports_release,
> >  };
> >
> >  static int export_features_show(struct seq_file *m, void *v)
> > @@ -1376,7 +1386,7 @@ static const struct proc_ops exports_proc_ops = {
> >       .proc_open      = exports_proc_open,
> >       .proc_read      = seq_read,
> >       .proc_lseek     = seq_lseek,
> > -     .proc_release   = seq_release,
> > +     .proc_release   = exports_release,
> >  };
> >
> >  static int create_proc_exports_entry(void)
>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>


* Re: [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks
  2026-02-25 18:29     ` Olga Kornievskaia
@ 2026-02-25 18:53       ` Chuck Lever
  0 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2026-02-25 18:53 UTC (permalink / raw)
  To: Olga Kornievskaia, Jeff Layton
  Cc: misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs, Chuck Lever



On Wed, Feb 25, 2026, at 1:29 PM, Olga Kornievskaia wrote:
> On Fri, Feb 20, 2026 at 10:50 AM Jeff Layton <jlayton@kernel.org> wrote:
>>
>> On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
>> > From: Chuck Lever <chuck.lever@oracle.com>
>> >
>> > svc_export_put() calls path_put() and auth_domain_put() immediately
>> > when the last reference drops, before the RCU grace period. RCU
>> > readers in e_show() and c_show() access both ex_path (via
>> > seq_path/d_path) and ex_client->name (via seq_escape) without
>> > holding a reference. If cache_clean removes the entry and drops the
>> > last reference concurrently, the sub-objects are freed while still
>> > in use, producing a NULL pointer dereference in d_path.
>> >
>> > Commit 2530766492ec ("nfsd: fix UAF when access ex_uuid or
>> > ex_stats") moved kfree of ex_uuid and ex_stats into the
>> > call_rcu callback, but left path_put() and auth_domain_put() running
>> > before the grace period because both may sleep and call_rcu
>> > callbacks execute in softirq context.
>> >
>> > Replace call_rcu/kfree_rcu with queue_rcu_work(), which defers the
>> > callback until after the RCU grace period and executes it in process
>> > context where sleeping is permitted. This allows path_put() and
>> > auth_domain_put() to be moved into the deferred callback alongside
>> > the other resource releases. Apply the same fix to expkey_put(),
>> > which has the identical pattern with ek_path and ek_client.
>> >
>> > A dedicated workqueue scopes the shutdown drain to only NFSD
>> > export release work items; flushing the shared
>> > system_unbound_wq would stall on unrelated work from other
>> > subsystems. nfsd_export_shutdown() uses rcu_barrier() followed
>> > by flush_workqueue() to ensure all deferred release callbacks
>> > complete before the export caches are destroyed.
>> >
>> > Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
>> > Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
>> > Fixes: c224edca7af0 ("nfsd: no need get cache ref when protected by rcu")
>> > Fixes: 1b10f0b603c0 ("SUNRPC: no need get cache ref when protected by rcu")
>> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>
> Tested-by: Olga Kornievskaia <okorniev@redhat.com>
>
> I can reproduce the problem, and I verified that with the 2 patches
> applied I no longer see it.

Excellent, thank you!
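
The two-stage deferral and the drain order described in the quoted
commit message can be sketched in userspace C. This is a model under
stated assumptions, not kernel code: every model_* name is
hypothetical, the "grace period" is just a list that
model_rcu_barrier() empties, and the "workqueue" runs items inline
when flushed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_work. */
struct model_work {
	struct model_work *next;
	void (*func)(struct model_work *work);
};

static struct model_work *grace_pending;	/* waiting for a grace period */
static struct model_work *work_ready;		/* handed to the workqueue */
static int model_released;			/* count of completed releases */

static void push(struct model_work **list, struct model_work *w)
{
	w->next = *list;
	*list = w;
}

static struct model_work *pop(struct model_work **list)
{
	struct model_work *w = *list;

	if (w)
		*list = w->next;
	return w;
}

/* Models queue_rcu_work(): the item first waits out a grace period. */
static void model_queue_rcu_work(struct model_work *w,
				 void (*func)(struct model_work *))
{
	w->func = func;
	push(&grace_pending, w);
}

/* Models rcu_barrier(): pending items are handed to the workqueue. */
static void model_rcu_barrier(void)
{
	struct model_work *w;

	while ((w = pop(&grace_pending)))
		push(&work_ready, w);
}

/* Models flush_workqueue(): run ready items; returns how many ran. */
static int model_flush_workqueue(void)
{
	struct model_work *w;
	int n = 0;

	while ((w = pop(&work_ready))) {
		assert(w->func);
		w->func(w);	/* may free w; w is already unlinked */
		n++;
	}
	return n;
}

/* Hypothetical stand-in for struct svc_export. */
struct model_export {
	char *path;			/* stands in for ex_path */
	struct model_work rwork;	/* stands in for ex_rwork */
};

/* Sub-objects and the entry itself are released together, late. */
static void model_export_release(struct model_work *w)
{
	struct model_export *exp = (struct model_export *)
		((char *)w - offsetof(struct model_export, rwork));

	free(exp->path);
	free(exp);
	model_released++;
}

static struct model_export *model_export_new(void)
{
	struct model_export *exp = calloc(1, sizeof(*exp));

	if (!exp)
		return NULL;
	exp->path = malloc(1);
	if (!exp->path) {
		free(exp);
		return NULL;
	}
	return exp;
}

/* Models svc_export_put() after the patch: only schedule the release. */
static void model_export_put(struct model_export *exp)
{
	model_queue_rcu_work(&exp->rwork, model_export_release);
}
```

In this model a flush on its own runs nothing, because the items are
still waiting for their grace period. Only a barrier followed by a
flush releases them, which is the order the commit message gives for
nfsd_export_shutdown().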


>> > ---
>> >  fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
>> >  fs/nfsd/export.h |  7 ++++--
>> >  fs/nfsd/nfsctl.c |  8 +++++-
>> >  3 files changed, 66 insertions(+), 12 deletions(-)
>> >
>> > diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
>> > index 04b18f0f402f..53fe66784ed2 100644
>> > --- a/fs/nfsd/export.c
>> > +++ b/fs/nfsd/export.c
>> > @@ -36,19 +36,30 @@
>> >   * second map contains a reference to the entry in the first map.
>> >   */
>> >
>> > +static struct workqueue_struct *nfsd_export_wq;
>> > +
>> >  #define      EXPKEY_HASHBITS         8
>> >  #define      EXPKEY_HASHMAX          (1 << EXPKEY_HASHBITS)
>> >  #define      EXPKEY_HASHMASK         (EXPKEY_HASHMAX -1)
>> >
>> > -static void expkey_put(struct kref *ref)
>> > +static void expkey_release(struct work_struct *work)
>> >  {
>> > -     struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
>> > +     struct svc_expkey *key = container_of(to_rcu_work(work),
>> > +                                           struct svc_expkey, ek_rwork);
>> >
>> >       if (test_bit(CACHE_VALID, &key->h.flags) &&
>> >           !test_bit(CACHE_NEGATIVE, &key->h.flags))
>> >               path_put(&key->ek_path);
>> >       auth_domain_put(key->ek_client);
>> > -     kfree_rcu(key, ek_rcu);
>> > +     kfree(key);
>> > +}
>> > +
>> > +static void expkey_put(struct kref *ref)
>> > +{
>> > +     struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
>> > +
>> > +     INIT_RCU_WORK(&key->ek_rwork, expkey_release);
>> > +     queue_rcu_work(nfsd_export_wq, &key->ek_rwork);
>> >  }
>> >
>> >  static int expkey_upcall(struct cache_detail *cd, struct cache_head *h)
>> > @@ -353,11 +364,13 @@ static void export_stats_destroy(struct export_stats *stats)
>> >                                           EXP_STATS_COUNTERS_NUM);
>> >  }
>> >
>> > -static void svc_export_release(struct rcu_head *rcu_head)
>> > +static void svc_export_release(struct work_struct *work)
>> >  {
>> > -     struct svc_export *exp = container_of(rcu_head, struct svc_export,
>> > -                     ex_rcu);
>> > +     struct svc_export *exp = container_of(to_rcu_work(work),
>> > +                                           struct svc_export, ex_rwork);
>> >
>> > +     path_put(&exp->ex_path);
>> > +     auth_domain_put(exp->ex_client);
>> >       nfsd4_fslocs_free(&exp->ex_fslocs);
>> >       export_stats_destroy(exp->ex_stats);
>> >       kfree(exp->ex_stats);
>> > @@ -369,9 +382,8 @@ static void svc_export_put(struct kref *ref)
>> >  {
>> >       struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
>> >
>> > -     path_put(&exp->ex_path);
>> > -     auth_domain_put(exp->ex_client);
>> > -     call_rcu(&exp->ex_rcu, svc_export_release);
>> > +     INIT_RCU_WORK(&exp->ex_rwork, svc_export_release);
>> > +     queue_rcu_work(nfsd_export_wq, &exp->ex_rwork);
>> >  }
>> >
>> >  static int svc_export_upcall(struct cache_detail *cd, struct cache_head *h)
>> > @@ -1481,6 +1493,36 @@ const struct seq_operations nfs_exports_op = {
>> >       .show   = e_show,
>> >  };
>> >
>> > +/**
>> > + * nfsd_export_wq_init - allocate the export release workqueue
>> > + *
>> > + * Called once at module load. The workqueue runs deferred svc_export and
>> > + * svc_expkey release work scheduled by queue_rcu_work() in the cache put
>> > + * callbacks.
>> > + *
>> > + * Return values:
>> > + *   %0: workqueue allocated
>> > + *   %-ENOMEM: allocation failed
>> > + */
>> > +int nfsd_export_wq_init(void)
>> > +{
>> > +     nfsd_export_wq = alloc_workqueue("nfsd_export", WQ_UNBOUND, 0);
>> > +     if (!nfsd_export_wq)
>> > +             return -ENOMEM;
>> > +     return 0;
>> > +}
>> > +
>> > +/**
>> > + * nfsd_export_wq_shutdown - drain and free the export release workqueue
>> > + *
>> > + * Called once at module unload. Per-namespace teardown in
>> > + * nfsd_export_shutdown() has already drained all deferred work.
>> > + */
>> > +void nfsd_export_wq_shutdown(void)
>> > +{
>> > +     destroy_workqueue(nfsd_export_wq);
>> > +}
>> > +
>> >  /*
>> >   * Initialize the exports module.
>> >   */
>> > @@ -1542,6 +1584,9 @@ nfsd_export_shutdown(struct net *net)
>> >
>> >       cache_unregister_net(nn->svc_expkey_cache, net);
>> >       cache_unregister_net(nn->svc_export_cache, net);
>> > +     /* Drain deferred export and expkey release work. */
>> > +     rcu_barrier();
>> > +     flush_workqueue(nfsd_export_wq);
>> >       cache_destroy_net(nn->svc_expkey_cache, net);
>> >       cache_destroy_net(nn->svc_export_cache, net);
>> >       svcauth_unix_purge(net);
>> > diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
>> > index d2b09cd76145..b05399374574 100644
>> > --- a/fs/nfsd/export.h
>> > +++ b/fs/nfsd/export.h
>> > @@ -7,6 +7,7 @@
>> >
>> >  #include <linux/sunrpc/cache.h>
>> >  #include <linux/percpu_counter.h>
>> > +#include <linux/workqueue.h>
>> >  #include <uapi/linux/nfsd/export.h>
>> >  #include <linux/nfs4.h>
>> >
>> > @@ -75,7 +76,7 @@ struct svc_export {
>> >       u32                     ex_layout_types;
>> >       struct nfsd4_deviceid_map *ex_devid_map;
>> >       struct cache_detail     *cd;
>> > -     struct rcu_head         ex_rcu;
>> > +     struct rcu_work         ex_rwork;
>> >       unsigned long           ex_xprtsec_modes;
>> >       struct export_stats     *ex_stats;
>> >  };
>> > @@ -92,7 +93,7 @@ struct svc_expkey {
>> >       u32                     ek_fsid[6];
>> >
>> >       struct path             ek_path;
>> > -     struct rcu_head         ek_rcu;
>> > +     struct rcu_work         ek_rwork;
>> >  };
>> >
>> >  #define EX_ISSYNC(exp)               (!((exp)->ex_flags & NFSEXP_ASYNC))
>> > @@ -110,6 +111,8 @@ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
>> >  /*
>> >   * Function declarations
>> >   */
>> > +int                  nfsd_export_wq_init(void);
>> > +void                 nfsd_export_wq_shutdown(void);
>> >  int                  nfsd_export_init(struct net *);
>> >  void                 nfsd_export_shutdown(struct net *);
>> >  void                 nfsd_export_flush(struct net *);
>> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
>> > index 664a3275c511..4166f59908f4 100644
>> > --- a/fs/nfsd/nfsctl.c
>> > +++ b/fs/nfsd/nfsctl.c
>> > @@ -2308,9 +2308,12 @@ static int __init init_nfsd(void)
>> >       if (retval)
>> >               goto out_free_pnfs;
>> >       nfsd_lockd_init();      /* lockd->nfsd callbacks */
>> > +     retval = nfsd_export_wq_init();
>> > +     if (retval)
>> > +             goto out_free_lockd;
>> >       retval = register_pernet_subsys(&nfsd_net_ops);
>> >       if (retval < 0)
>> > -             goto out_free_lockd;
>> > +             goto out_free_export_wq;
>> >       retval = register_cld_notifier();
>> >       if (retval)
>> >               goto out_free_subsys;
>> > @@ -2339,6 +2342,8 @@ static int __init init_nfsd(void)
>> >       unregister_cld_notifier();
>> >  out_free_subsys:
>> >       unregister_pernet_subsys(&nfsd_net_ops);
>> > +out_free_export_wq:
>> > +     nfsd_export_wq_shutdown();
>> >  out_free_lockd:
>> >       nfsd_lockd_shutdown();
>> >       nfsd_drc_slab_free();
>> > @@ -2359,6 +2364,7 @@ static void __exit exit_nfsd(void)
>> >       nfsd4_destroy_laundry_wq();
>> >       unregister_cld_notifier();
>> >       unregister_pernet_subsys(&nfsd_net_ops);
>> > +     nfsd_export_wq_shutdown();
>> >       nfsd_drc_slab_free();
>> >       nfsd_lockd_shutdown();
>> >       nfsd4_free_slabs();
>>
>> Looks good.
>>
>> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>>

-- 
Chuck Lever


end of thread, other threads:[~2026-02-25 18:53 UTC | newest]

Thread overview: 10+ messages
2026-02-19 21:50 [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks Chuck Lever
2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
2026-02-20 15:50   ` Jeff Layton
2026-02-25 18:29     ` Olga Kornievskaia
2026-02-25 18:53       ` Chuck Lever
2026-02-19 21:50 ` [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd Chuck Lever
2026-02-20 15:52   ` Jeff Layton
2026-02-25 18:29     ` Olga Kornievskaia
2026-02-21 22:57 ` [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks NeilBrown
2026-02-22 15:41   ` Chuck Lever
