* [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks
@ 2026-02-19 21:50 Chuck Lever
2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Chuck Lever @ 2026-02-19 21:50 UTC
To: misanjum, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
Attempt to address three crashes reported here:
https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
These are compile-tested and regression-tested, but as I do not have
a PowerPC system handy, I will need someone who has one to test
whether they actually address the crashes.
Chuck Lever (2):
NFSD: Defer sub-object cleanup in export put callbacks
NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
fs/nfsd/export.h | 7 ++++--
fs/nfsd/nfsctl.c | 22 ++++++++++++++---
3 files changed, 78 insertions(+), 14 deletions(-)
--
2.53.0
* [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks
2026-02-19 21:50 [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks Chuck Lever
@ 2026-02-19 21:50 ` Chuck Lever
2026-02-20 15:50 ` Jeff Layton
2026-02-19 21:50 ` [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd Chuck Lever
2026-02-21 22:57 ` [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks NeilBrown
2 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2026-02-19 21:50 UTC
To: misanjum, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
svc_export_put() calls path_put() and auth_domain_put() immediately
when the last reference drops, before the RCU grace period. RCU
readers in e_show() and c_show() access both ex_path (via
seq_path/d_path) and ex_client->name (via seq_escape) without
holding a reference. If cache_clean removes the entry and drops the
last reference concurrently, the sub-objects are freed while still
in use, producing a NULL pointer dereference in d_path.
Commit 2530766492ec ("nfsd: fix UAF when access ex_uuid or
ex_stats") moved kfree of ex_uuid and ex_stats into the
call_rcu callback, but left path_put() and auth_domain_put() running
before the grace period because both may sleep and call_rcu
callbacks execute in softirq context.
Replace call_rcu/kfree_rcu with queue_rcu_work(), which defers the
callback until after the RCU grace period and executes it in process
context where sleeping is permitted. This allows path_put() and
auth_domain_put() to be moved into the deferred callback alongside
the other resource releases. Apply the same fix to expkey_put(),
which has the identical pattern with ek_path and ek_client.
A dedicated workqueue scopes the shutdown drain to only NFSD
export release work items; flushing the shared
system_unbound_wq would stall on unrelated work from other
subsystems. nfsd_export_shutdown() uses rcu_barrier() followed
by flush_workqueue() to ensure all deferred release callbacks
complete before the export caches are destroyed.
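The deferral described above can be modeled outside the kernel. What follows is a minimal userspace sketch, not kernel code: export_model, export_model_put() and export_model_drain() are hypothetical names, a singly linked list stands in for the dedicated workqueue, and calling the drain function stands in for the grace period plus rcu_barrier()/flush_workqueue(). It only illustrates that the final put queues the object, and that the sub-objects stay valid until the deferred release runs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for svc_export: owns two sub-objects. */
struct export_model {
	char *path;                         /* models ex_path */
	char *client_name;                  /* models ex_client->name */
	int refcount;                       /* models h.ref */
	struct export_model *next_deferred; /* models the ex_rwork linkage */
};

/* Models the dedicated workqueue: objects waiting for their release. */
static struct export_model *deferred_head;
static int released_count;

static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

static struct export_model *export_model_new(const char *path,
					     const char *client)
{
	struct export_model *exp = calloc(1, sizeof(*exp));

	if (!exp)
		return NULL;
	exp->path = dup_string(path);
	exp->client_name = dup_string(client);
	if (!exp->path || !exp->client_name) {
		free(exp->path);
		free(exp->client_name);
		free(exp);
		return NULL;
	}
	exp->refcount = 1;
	return exp;
}

/* Deferred release: the only place where sub-objects are freed. */
static void export_model_release(struct export_model *exp)
{
	free(exp->path);
	free(exp->client_name);
	free(exp);
	released_count++;
}

/* Final put frees nothing; it only queues the object for later release. */
static void export_model_put(struct export_model *exp)
{
	assert(exp->refcount > 0);
	if (--exp->refcount > 0)
		return;
	exp->next_deferred = deferred_head;
	deferred_head = exp;
}

/* Models the shutdown drain: run every queued release, return the count. */
static int export_model_drain(void)
{
	int drained = 0;

	while (deferred_head) {
		struct export_model *exp = deferred_head;

		deferred_head = exp->next_deferred;
		export_model_release(exp);
		drained++;
	}
	return drained;
}
```

In this model a reader that fetched exp->path before the final export_model_put() can still read the string afterwards, because nothing is freed until export_model_drain() runs. The real patch gets the same property from queue_rcu_work(), with the RCU grace period deciding when the release may run.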
Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
Fixes: c224edca7af0 ("nfsd: no need get cache ref when protected by rcu")
Fixes: 1b10f0b603c0 ("SUNRPC: no need get cache ref when protected by rcu")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
fs/nfsd/export.h | 7 ++++--
fs/nfsd/nfsctl.c | 8 +++++-
3 files changed, 66 insertions(+), 12 deletions(-)
diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 04b18f0f402f..53fe66784ed2 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -36,19 +36,30 @@
* second map contains a reference to the entry in the first map.
*/
+static struct workqueue_struct *nfsd_export_wq;
+
#define EXPKEY_HASHBITS 8
#define EXPKEY_HASHMAX (1 << EXPKEY_HASHBITS)
#define EXPKEY_HASHMASK (EXPKEY_HASHMAX -1)
-static void expkey_put(struct kref *ref)
+static void expkey_release(struct work_struct *work)
{
- struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
+ struct svc_expkey *key = container_of(to_rcu_work(work),
+ struct svc_expkey, ek_rwork);
if (test_bit(CACHE_VALID, &key->h.flags) &&
!test_bit(CACHE_NEGATIVE, &key->h.flags))
path_put(&key->ek_path);
auth_domain_put(key->ek_client);
- kfree_rcu(key, ek_rcu);
+ kfree(key);
+}
+
+static void expkey_put(struct kref *ref)
+{
+ struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
+
+ INIT_RCU_WORK(&key->ek_rwork, expkey_release);
+ queue_rcu_work(nfsd_export_wq, &key->ek_rwork);
}
static int expkey_upcall(struct cache_detail *cd, struct cache_head *h)
@@ -353,11 +364,13 @@ static void export_stats_destroy(struct export_stats *stats)
EXP_STATS_COUNTERS_NUM);
}
-static void svc_export_release(struct rcu_head *rcu_head)
+static void svc_export_release(struct work_struct *work)
{
- struct svc_export *exp = container_of(rcu_head, struct svc_export,
- ex_rcu);
+ struct svc_export *exp = container_of(to_rcu_work(work),
+ struct svc_export, ex_rwork);
+ path_put(&exp->ex_path);
+ auth_domain_put(exp->ex_client);
nfsd4_fslocs_free(&exp->ex_fslocs);
export_stats_destroy(exp->ex_stats);
kfree(exp->ex_stats);
@@ -369,9 +382,8 @@ static void svc_export_put(struct kref *ref)
{
struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
- path_put(&exp->ex_path);
- auth_domain_put(exp->ex_client);
- call_rcu(&exp->ex_rcu, svc_export_release);
+ INIT_RCU_WORK(&exp->ex_rwork, svc_export_release);
+ queue_rcu_work(nfsd_export_wq, &exp->ex_rwork);
}
static int svc_export_upcall(struct cache_detail *cd, struct cache_head *h)
@@ -1481,6 +1493,36 @@ const struct seq_operations nfs_exports_op = {
.show = e_show,
};
+/**
+ * nfsd_export_wq_init - allocate the export release workqueue
+ *
+ * Called once at module load. The workqueue runs deferred svc_export and
+ * svc_expkey release work scheduled by queue_rcu_work() in the cache put
+ * callbacks.
+ *
+ * Return values:
+ * %0: workqueue allocated
+ * %-ENOMEM: allocation failed
+ */
+int nfsd_export_wq_init(void)
+{
+ nfsd_export_wq = alloc_workqueue("nfsd_export", WQ_UNBOUND, 0);
+ if (!nfsd_export_wq)
+ return -ENOMEM;
+ return 0;
+}
+
+/**
+ * nfsd_export_wq_shutdown - drain and free the export release workqueue
+ *
+ * Called once at module unload. Per-namespace teardown in
+ * nfsd_export_shutdown() has already drained all deferred work.
+ */
+void nfsd_export_wq_shutdown(void)
+{
+ destroy_workqueue(nfsd_export_wq);
+}
+
/*
* Initialize the exports module.
*/
@@ -1542,6 +1584,9 @@ nfsd_export_shutdown(struct net *net)
cache_unregister_net(nn->svc_expkey_cache, net);
cache_unregister_net(nn->svc_export_cache, net);
+ /* Drain deferred export and expkey release work. */
+ rcu_barrier();
+ flush_workqueue(nfsd_export_wq);
cache_destroy_net(nn->svc_expkey_cache, net);
cache_destroy_net(nn->svc_export_cache, net);
svcauth_unix_purge(net);
diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
index d2b09cd76145..b05399374574 100644
--- a/fs/nfsd/export.h
+++ b/fs/nfsd/export.h
@@ -7,6 +7,7 @@
#include <linux/sunrpc/cache.h>
#include <linux/percpu_counter.h>
+#include <linux/workqueue.h>
#include <uapi/linux/nfsd/export.h>
#include <linux/nfs4.h>
@@ -75,7 +76,7 @@ struct svc_export {
u32 ex_layout_types;
struct nfsd4_deviceid_map *ex_devid_map;
struct cache_detail *cd;
- struct rcu_head ex_rcu;
+ struct rcu_work ex_rwork;
unsigned long ex_xprtsec_modes;
struct export_stats *ex_stats;
};
@@ -92,7 +93,7 @@ struct svc_expkey {
u32 ek_fsid[6];
struct path ek_path;
- struct rcu_head ek_rcu;
+ struct rcu_work ek_rwork;
};
#define EX_ISSYNC(exp) (!((exp)->ex_flags & NFSEXP_ASYNC))
@@ -110,6 +111,8 @@ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
/*
* Function declarations
*/
+int nfsd_export_wq_init(void);
+void nfsd_export_wq_shutdown(void);
int nfsd_export_init(struct net *);
void nfsd_export_shutdown(struct net *);
void nfsd_export_flush(struct net *);
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 664a3275c511..4166f59908f4 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -2308,9 +2308,12 @@ static int __init init_nfsd(void)
if (retval)
goto out_free_pnfs;
nfsd_lockd_init(); /* lockd->nfsd callbacks */
+ retval = nfsd_export_wq_init();
+ if (retval)
+ goto out_free_lockd;
retval = register_pernet_subsys(&nfsd_net_ops);
if (retval < 0)
- goto out_free_lockd;
+ goto out_free_export_wq;
retval = register_cld_notifier();
if (retval)
goto out_free_subsys;
@@ -2339,6 +2342,8 @@ static int __init init_nfsd(void)
unregister_cld_notifier();
out_free_subsys:
unregister_pernet_subsys(&nfsd_net_ops);
+out_free_export_wq:
+ nfsd_export_wq_shutdown();
out_free_lockd:
nfsd_lockd_shutdown();
nfsd_drc_slab_free();
@@ -2359,6 +2364,7 @@ static void __exit exit_nfsd(void)
nfsd4_destroy_laundry_wq();
unregister_cld_notifier();
unregister_pernet_subsys(&nfsd_net_ops);
+ nfsd_export_wq_shutdown();
nfsd_drc_slab_free();
nfsd_lockd_shutdown();
nfsd4_free_slabs();
--
2.53.0
* [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
2026-02-19 21:50 [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks Chuck Lever
2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
@ 2026-02-19 21:50 ` Chuck Lever
2026-02-20 15:52 ` Jeff Layton
2026-02-21 22:57 ` [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks NeilBrown
2 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2026-02-19 21:50 UTC
To: misanjum, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
The /proc/fs/nfs/exports proc entry is created at module init
and persists for the module's lifetime. exports_proc_open()
captures the caller's current network namespace and stores
its svc_export_cache in seq->private, but takes no reference
on the namespace. If the namespace is subsequently torn down
(e.g. container destruction after the opener does setns() to a
different namespace), nfsd_net_exit() calls nfsd_export_shutdown()
which frees the cache. Subsequent reads on the still-open fd
dereference the freed cache_detail, walking a freed hash table.
Hold a reference on the struct net for the lifetime of the open
file descriptor. This prevents nfsd_net_exit() from running --
and thus prevents nfsd_export_shutdown() from freeing the cache
-- while any exports fd is open. cache_detail already stores
its net pointer (cd->net, set by cache_create_net()), so
exports_release() can retrieve it without additional per-file
storage.
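The lifetime rule described above can be sketched in userspace as well. This is an illustrative model under stated assumptions, not the kernel implementation: net_model, cache_model, open_file_model and the exports_model_*() helpers are hypothetical names, and a plain integer replaces the real namespace reference count. It shows the open path taking a reference through the cache's back pointer, and the release path dropping it through that same pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net: torn down when refcount hits 0. */
struct net_model {
	int refcount;
	int torn_down;
};

/* Hypothetical stand-in for cache_detail, which records its owning net. */
struct cache_model {
	struct net_model *net;
};

/* Hypothetical stand-in for the open file's seq->private pointer. */
struct open_file_model {
	struct cache_model *cache;
};

static void net_model_get(struct net_model *net)
{
	net->refcount++;
}

static void net_model_put(struct net_model *net)
{
	assert(net->refcount > 0);
	if (--net->refcount == 0)
		net->torn_down = 1; /* models namespace exit freeing the cache */
}

/* Open: remember the cache and pin the namespace that owns it. */
static void exports_model_open(struct open_file_model *file,
			       struct cache_model *cache)
{
	file->cache = cache;
	net_model_get(cache->net);
}

/* Release: find the namespace through the cache and drop the pin. */
static void exports_model_release(struct open_file_model *file)
{
	struct net_model *net = file->cache->net;

	file->cache = NULL;
	net_model_put(net);
}
```

With one open file in this model, dropping the namespace's own reference leaves refcount at 1 and torn_down at 0. Teardown happens only when exports_model_release() runs, which corresponds to the ordering the patch enforces with get_net() and put_net().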
Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
Fixes: 96d851c4d28d ("nfsd: use proper net while reading "exports" file")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfsctl.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 4166f59908f4..3d5a676e1d14 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -149,9 +149,19 @@ static int exports_net_open(struct net *net, struct file *file)
seq = file->private_data;
seq->private = nn->svc_export_cache;
+ get_net(net);
return 0;
}
+static int exports_release(struct inode *inode, struct file *file)
+{
+ struct seq_file *seq = file->private_data;
+ struct cache_detail *cd = seq->private;
+
+ put_net(cd->net);
+ return seq_release(inode, file);
+}
+
static int exports_nfsd_open(struct inode *inode, struct file *file)
{
return exports_net_open(inode->i_sb->s_fs_info, file);
@@ -161,7 +171,7 @@ static const struct file_operations exports_nfsd_operations = {
.open = exports_nfsd_open,
.read = seq_read,
.llseek = seq_lseek,
- .release = seq_release,
+ .release = exports_release,
};
static int export_features_show(struct seq_file *m, void *v)
@@ -1376,7 +1386,7 @@ static const struct proc_ops exports_proc_ops = {
.proc_open = exports_proc_open,
.proc_read = seq_read,
.proc_lseek = seq_lseek,
- .proc_release = seq_release,
+ .proc_release = exports_release,
};
static int create_proc_exports_entry(void)
--
2.53.0
* Re: [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks
2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
@ 2026-02-20 15:50 ` Jeff Layton
2026-02-25 18:29 ` Olga Kornievskaia
0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2026-02-20 15:50 UTC
To: Chuck Lever, misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo,
Tom Talpey
Cc: linux-nfs, Chuck Lever
On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> svc_export_put() calls path_put() and auth_domain_put() immediately
> when the last reference drops, before the RCU grace period. RCU
> readers in e_show() and c_show() access both ex_path (via
> seq_path/d_path) and ex_client->name (via seq_escape) without
> holding a reference. If cache_clean removes the entry and drops the
> last reference concurrently, the sub-objects are freed while still
> in use, producing a NULL pointer dereference in d_path.
>
> Commit 2530766492ec ("nfsd: fix UAF when access ex_uuid or
> ex_stats") moved kfree of ex_uuid and ex_stats into the
> call_rcu callback, but left path_put() and auth_domain_put() running
> before the grace period because both may sleep and call_rcu
> callbacks execute in softirq context.
>
> Replace call_rcu/kfree_rcu with queue_rcu_work(), which defers the
> callback until after the RCU grace period and executes it in process
> context where sleeping is permitted. This allows path_put() and
> auth_domain_put() to be moved into the deferred callback alongside
> the other resource releases. Apply the same fix to expkey_put(),
> which has the identical pattern with ek_path and ek_client.
>
> A dedicated workqueue scopes the shutdown drain to only NFSD
> export release work items; flushing the shared
> system_unbound_wq would stall on unrelated work from other
> subsystems. nfsd_export_shutdown() uses rcu_barrier() followed
> by flush_workqueue() to ensure all deferred release callbacks
> complete before the export caches are destroyed.
>
> Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
> Fixes: c224edca7af0 ("nfsd: no need get cache ref when protected by rcu")
> Fixes: 1b10f0b603c0 ("SUNRPC: no need get cache ref when protected by rcu")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
> fs/nfsd/export.h | 7 ++++--
> fs/nfsd/nfsctl.c | 8 +++++-
> 3 files changed, 66 insertions(+), 12 deletions(-)
>
> diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> index 04b18f0f402f..53fe66784ed2 100644
> --- a/fs/nfsd/export.c
> +++ b/fs/nfsd/export.c
> @@ -36,19 +36,30 @@
> * second map contains a reference to the entry in the first map.
> */
>
> +static struct workqueue_struct *nfsd_export_wq;
> +
> #define EXPKEY_HASHBITS 8
> #define EXPKEY_HASHMAX (1 << EXPKEY_HASHBITS)
> #define EXPKEY_HASHMASK (EXPKEY_HASHMAX -1)
>
> -static void expkey_put(struct kref *ref)
> +static void expkey_release(struct work_struct *work)
> {
> - struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
> + struct svc_expkey *key = container_of(to_rcu_work(work),
> + struct svc_expkey, ek_rwork);
>
> if (test_bit(CACHE_VALID, &key->h.flags) &&
> !test_bit(CACHE_NEGATIVE, &key->h.flags))
> path_put(&key->ek_path);
> auth_domain_put(key->ek_client);
> - kfree_rcu(key, ek_rcu);
> + kfree(key);
> +}
> +
> +static void expkey_put(struct kref *ref)
> +{
> + struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
> +
> + INIT_RCU_WORK(&key->ek_rwork, expkey_release);
> + queue_rcu_work(nfsd_export_wq, &key->ek_rwork);
> }
>
> static int expkey_upcall(struct cache_detail *cd, struct cache_head *h)
> @@ -353,11 +364,13 @@ static void export_stats_destroy(struct export_stats *stats)
> EXP_STATS_COUNTERS_NUM);
> }
>
> -static void svc_export_release(struct rcu_head *rcu_head)
> +static void svc_export_release(struct work_struct *work)
> {
> - struct svc_export *exp = container_of(rcu_head, struct svc_export,
> - ex_rcu);
> + struct svc_export *exp = container_of(to_rcu_work(work),
> + struct svc_export, ex_rwork);
>
> + path_put(&exp->ex_path);
> + auth_domain_put(exp->ex_client);
> nfsd4_fslocs_free(&exp->ex_fslocs);
> export_stats_destroy(exp->ex_stats);
> kfree(exp->ex_stats);
> @@ -369,9 +382,8 @@ static void svc_export_put(struct kref *ref)
> {
> struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
>
> - path_put(&exp->ex_path);
> - auth_domain_put(exp->ex_client);
> - call_rcu(&exp->ex_rcu, svc_export_release);
> + INIT_RCU_WORK(&exp->ex_rwork, svc_export_release);
> + queue_rcu_work(nfsd_export_wq, &exp->ex_rwork);
> }
>
> static int svc_export_upcall(struct cache_detail *cd, struct cache_head *h)
> @@ -1481,6 +1493,36 @@ const struct seq_operations nfs_exports_op = {
> .show = e_show,
> };
>
> +/**
> + * nfsd_export_wq_init - allocate the export release workqueue
> + *
> + * Called once at module load. The workqueue runs deferred svc_export and
> + * svc_expkey release work scheduled by queue_rcu_work() in the cache put
> + * callbacks.
> + *
> + * Return values:
> + * %0: workqueue allocated
> + * %-ENOMEM: allocation failed
> + */
> +int nfsd_export_wq_init(void)
> +{
> + nfsd_export_wq = alloc_workqueue("nfsd_export", WQ_UNBOUND, 0);
> + if (!nfsd_export_wq)
> + return -ENOMEM;
> + return 0;
> +}
> +
> +/**
> + * nfsd_export_wq_shutdown - drain and free the export release workqueue
> + *
> + * Called once at module unload. Per-namespace teardown in
> + * nfsd_export_shutdown() has already drained all deferred work.
> + */
> +void nfsd_export_wq_shutdown(void)
> +{
> + destroy_workqueue(nfsd_export_wq);
> +}
> +
> /*
> * Initialize the exports module.
> */
> @@ -1542,6 +1584,9 @@ nfsd_export_shutdown(struct net *net)
>
> cache_unregister_net(nn->svc_expkey_cache, net);
> cache_unregister_net(nn->svc_export_cache, net);
> + /* Drain deferred export and expkey release work. */
> + rcu_barrier();
> + flush_workqueue(nfsd_export_wq);
> cache_destroy_net(nn->svc_expkey_cache, net);
> cache_destroy_net(nn->svc_export_cache, net);
> svcauth_unix_purge(net);
> diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
> index d2b09cd76145..b05399374574 100644
> --- a/fs/nfsd/export.h
> +++ b/fs/nfsd/export.h
> @@ -7,6 +7,7 @@
>
> #include <linux/sunrpc/cache.h>
> #include <linux/percpu_counter.h>
> +#include <linux/workqueue.h>
> #include <uapi/linux/nfsd/export.h>
> #include <linux/nfs4.h>
>
> @@ -75,7 +76,7 @@ struct svc_export {
> u32 ex_layout_types;
> struct nfsd4_deviceid_map *ex_devid_map;
> struct cache_detail *cd;
> - struct rcu_head ex_rcu;
> + struct rcu_work ex_rwork;
> unsigned long ex_xprtsec_modes;
> struct export_stats *ex_stats;
> };
> @@ -92,7 +93,7 @@ struct svc_expkey {
> u32 ek_fsid[6];
>
> struct path ek_path;
> - struct rcu_head ek_rcu;
> + struct rcu_work ek_rwork;
> };
>
> #define EX_ISSYNC(exp) (!((exp)->ex_flags & NFSEXP_ASYNC))
> @@ -110,6 +111,8 @@ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
> /*
> * Function declarations
> */
> +int nfsd_export_wq_init(void);
> +void nfsd_export_wq_shutdown(void);
> int nfsd_export_init(struct net *);
> void nfsd_export_shutdown(struct net *);
> void nfsd_export_flush(struct net *);
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 664a3275c511..4166f59908f4 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -2308,9 +2308,12 @@ static int __init init_nfsd(void)
> if (retval)
> goto out_free_pnfs;
> nfsd_lockd_init(); /* lockd->nfsd callbacks */
> + retval = nfsd_export_wq_init();
> + if (retval)
> + goto out_free_lockd;
> retval = register_pernet_subsys(&nfsd_net_ops);
> if (retval < 0)
> - goto out_free_lockd;
> + goto out_free_export_wq;
> retval = register_cld_notifier();
> if (retval)
> goto out_free_subsys;
> @@ -2339,6 +2342,8 @@ static int __init init_nfsd(void)
> unregister_cld_notifier();
> out_free_subsys:
> unregister_pernet_subsys(&nfsd_net_ops);
> +out_free_export_wq:
> + nfsd_export_wq_shutdown();
> out_free_lockd:
> nfsd_lockd_shutdown();
> nfsd_drc_slab_free();
> @@ -2359,6 +2364,7 @@ static void __exit exit_nfsd(void)
> nfsd4_destroy_laundry_wq();
> unregister_cld_notifier();
> unregister_pernet_subsys(&nfsd_net_ops);
> + nfsd_export_wq_shutdown();
> nfsd_drc_slab_free();
> nfsd_lockd_shutdown();
> nfsd4_free_slabs();
Looks good.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
2026-02-19 21:50 ` [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd Chuck Lever
@ 2026-02-20 15:52 ` Jeff Layton
2026-02-25 18:29 ` Olga Kornievskaia
0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2026-02-20 15:52 UTC
To: Chuck Lever, misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo,
Tom Talpey
Cc: linux-nfs, Chuck Lever
On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> The /proc/fs/nfs/exports proc entry is created at module init
> and persists for the module's lifetime. exports_proc_open()
> captures the caller's current network namespace and stores
> its svc_export_cache in seq->private, but takes no reference
> on the namespace. If the namespace is subsequently torn down
> (e.g. container destruction after the opener does setns() to a
> different namespace), nfsd_net_exit() calls nfsd_export_shutdown()
> which frees the cache. Subsequent reads on the still-open fd
> dereference the freed cache_detail, walking a freed hash table.
>
> Hold a reference on the struct net for the lifetime of the open
> file descriptor. This prevents nfsd_net_exit() from running --
> and thus prevents nfsd_export_shutdown() from freeing the cache
> -- while any exports fd is open. cache_detail already stores
> its net pointer (cd->net, set by cache_create_net()), so
> exports_release() can retrieve it without additional per-file
> storage.
>
> Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
> Fixes: 96d851c4d28d ("nfsd: use proper net while reading "exports" file")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfsd/nfsctl.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 4166f59908f4..3d5a676e1d14 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -149,9 +149,19 @@ static int exports_net_open(struct net *net, struct file *file)
>
> seq = file->private_data;
> seq->private = nn->svc_export_cache;
> + get_net(net);
> return 0;
> }
>
> +static int exports_release(struct inode *inode, struct file *file)
> +{
> + struct seq_file *seq = file->private_data;
> + struct cache_detail *cd = seq->private;
> +
> + put_net(cd->net);
> + return seq_release(inode, file);
> +}
> +
> static int exports_nfsd_open(struct inode *inode, struct file *file)
> {
> return exports_net_open(inode->i_sb->s_fs_info, file);
> @@ -161,7 +171,7 @@ static const struct file_operations exports_nfsd_operations = {
> .open = exports_nfsd_open,
> .read = seq_read,
> .llseek = seq_lseek,
> - .release = seq_release,
> + .release = exports_release,
> };
>
> static int export_features_show(struct seq_file *m, void *v)
> @@ -1376,7 +1386,7 @@ static const struct proc_ops exports_proc_ops = {
> .proc_open = exports_proc_open,
> .proc_read = seq_read,
> .proc_lseek = seq_lseek,
> - .proc_release = seq_release,
> + .proc_release = exports_release,
> };
>
> static int create_proc_exports_entry(void)
Reviewed-by: Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks
2026-02-19 21:50 [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks Chuck Lever
2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
2026-02-19 21:50 ` [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd Chuck Lever
@ 2026-02-21 22:57 ` NeilBrown
2026-02-22 15:41 ` Chuck Lever
2 siblings, 1 reply; 10+ messages in thread
From: NeilBrown @ 2026-02-21 22:57 UTC
To: Chuck Lever
Cc: misanjum, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
linux-nfs, Chuck Lever
On Fri, 20 Feb 2026, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> Attempt to address three crashes reported here:
>
> https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
>
> These are compile-tested and regression-tested, but as I do not have
> a PowerPC system handy, I will need someone who has one to test
> whether they actually address the crashes.
>
> Chuck Lever (2):
> NFSD: Defer sub-object cleanup in export put callbacks
> NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
Nice. I particularly liked the thorough commit descriptions!
Reviewed-by: NeilBrown <neil@brown.name>
>
> fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
> fs/nfsd/export.h | 7 ++++--
> fs/nfsd/nfsctl.c | 22 ++++++++++++++---
> 3 files changed, 78 insertions(+), 14 deletions(-)
>
> --
> 2.53.0
>
>
* Re: [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks
2026-02-21 22:57 ` [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks NeilBrown
@ 2026-02-22 15:41 ` Chuck Lever
0 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2026-02-22 15:41 UTC
To: NeilBrown
Cc: misanjum, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
linux-nfs, Chuck Lever
On Sat, Feb 21, 2026, at 5:57 PM, NeilBrown wrote:
> On Fri, 20 Feb 2026, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> Attempt to address three crashes reported here:
>>
>> https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
>>
>> These are compile-tested and regression-tested, but as I do not have
>> a PowerPC system handy, I will need someone who has one to test
>> whether they actually address the crashes.
>>
>> Chuck Lever (2):
>> NFSD: Defer sub-object cleanup in export put callbacks
>> NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
>
> Nice. I particularly liked the thorough commit descriptions!
>
> Reviewed-by: NeilBrown <neil@brown.name>
Thanks Neil!
--
Chuck Lever
* Re: [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks
2026-02-20 15:50 ` Jeff Layton
@ 2026-02-25 18:29 ` Olga Kornievskaia
2026-02-25 18:53 ` Chuck Lever
0 siblings, 1 reply; 10+ messages in thread
From: Olga Kornievskaia @ 2026-02-25 18:29 UTC
To: Jeff Layton
Cc: Chuck Lever, misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo,
Tom Talpey, linux-nfs, Chuck Lever
On Fri, Feb 20, 2026 at 10:50 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > svc_export_put() calls path_put() and auth_domain_put() immediately
> > when the last reference drops, before the RCU grace period. RCU
> > readers in e_show() and c_show() access both ex_path (via
> > seq_path/d_path) and ex_client->name (via seq_escape) without
> > holding a reference. If cache_clean removes the entry and drops the
> > last reference concurrently, the sub-objects are freed while still
> > in use, producing a NULL pointer dereference in d_path.
> >
> > Commit 2530766492ec ("nfsd: fix UAF when access ex_uuid or
> > ex_stats") moved kfree of ex_uuid and ex_stats into the
> > call_rcu callback, but left path_put() and auth_domain_put() running
> > before the grace period because both may sleep and call_rcu
> > callbacks execute in softirq context.
> >
> > Replace call_rcu/kfree_rcu with queue_rcu_work(), which defers the
> > callback until after the RCU grace period and executes it in process
> > context where sleeping is permitted. This allows path_put() and
> > auth_domain_put() to be moved into the deferred callback alongside
> > the other resource releases. Apply the same fix to expkey_put(),
> > which has the identical pattern with ek_path and ek_client.
> >
> > A dedicated workqueue scopes the shutdown drain to only NFSD
> > export release work items; flushing the shared
> > system_unbound_wq would stall on unrelated work from other
> > subsystems. nfsd_export_shutdown() uses rcu_barrier() followed
> > by flush_workqueue() to ensure all deferred release callbacks
> > complete before the export caches are destroyed.
> >
> > Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> > Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
> > Fixes: c224edca7af0 ("nfsd: no need get cache ref when protected by rcu")
> > Fixes: 1b10f0b603c0 ("SUNRPC: no need get cache ref when protected by rcu")
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Olga Kornievskaia <okorniev@redhat.com>
I can reproduce the problem and have verified that, with the 2 patches
applied, I no longer see it.
> > ---
> > fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
> > fs/nfsd/export.h | 7 ++++--
> > fs/nfsd/nfsctl.c | 8 +++++-
> > 3 files changed, 66 insertions(+), 12 deletions(-)
> >
> > diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> > index 04b18f0f402f..53fe66784ed2 100644
> > --- a/fs/nfsd/export.c
> > +++ b/fs/nfsd/export.c
> > @@ -36,19 +36,30 @@
> > * second map contains a reference to the entry in the first map.
> > */
> >
> > +static struct workqueue_struct *nfsd_export_wq;
> > +
> > #define EXPKEY_HASHBITS 8
> > #define EXPKEY_HASHMAX (1 << EXPKEY_HASHBITS)
> > #define EXPKEY_HASHMASK (EXPKEY_HASHMAX -1)
> >
> > -static void expkey_put(struct kref *ref)
> > +static void expkey_release(struct work_struct *work)
> > {
> > - struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
> > + struct svc_expkey *key = container_of(to_rcu_work(work),
> > + struct svc_expkey, ek_rwork);
> >
> > if (test_bit(CACHE_VALID, &key->h.flags) &&
> > !test_bit(CACHE_NEGATIVE, &key->h.flags))
> > path_put(&key->ek_path);
> > auth_domain_put(key->ek_client);
> > - kfree_rcu(key, ek_rcu);
> > + kfree(key);
> > +}
> > +
> > +static void expkey_put(struct kref *ref)
> > +{
> > + struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
> > +
> > + INIT_RCU_WORK(&key->ek_rwork, expkey_release);
> > + queue_rcu_work(nfsd_export_wq, &key->ek_rwork);
> > }
> >
> > static int expkey_upcall(struct cache_detail *cd, struct cache_head *h)
> > @@ -353,11 +364,13 @@ static void export_stats_destroy(struct export_stats *stats)
> > EXP_STATS_COUNTERS_NUM);
> > }
> >
> > -static void svc_export_release(struct rcu_head *rcu_head)
> > +static void svc_export_release(struct work_struct *work)
> > {
> > - struct svc_export *exp = container_of(rcu_head, struct svc_export,
> > - ex_rcu);
> > + struct svc_export *exp = container_of(to_rcu_work(work),
> > + struct svc_export, ex_rwork);
> >
> > + path_put(&exp->ex_path);
> > + auth_domain_put(exp->ex_client);
> > nfsd4_fslocs_free(&exp->ex_fslocs);
> > export_stats_destroy(exp->ex_stats);
> > kfree(exp->ex_stats);
> > @@ -369,9 +382,8 @@ static void svc_export_put(struct kref *ref)
> > {
> > struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
> >
> > - path_put(&exp->ex_path);
> > - auth_domain_put(exp->ex_client);
> > - call_rcu(&exp->ex_rcu, svc_export_release);
> > + INIT_RCU_WORK(&exp->ex_rwork, svc_export_release);
> > + queue_rcu_work(nfsd_export_wq, &exp->ex_rwork);
> > }
> >
> > static int svc_export_upcall(struct cache_detail *cd, struct cache_head *h)
> > @@ -1481,6 +1493,36 @@ const struct seq_operations nfs_exports_op = {
> > .show = e_show,
> > };
> >
> > +/**
> > + * nfsd_export_wq_init - allocate the export release workqueue
> > + *
> > + * Called once at module load. The workqueue runs deferred svc_export and
> > + * svc_expkey release work scheduled by queue_rcu_work() in the cache put
> > + * callbacks.
> > + *
> > + * Return values:
> > + * %0: workqueue allocated
> > + * %-ENOMEM: allocation failed
> > + */
> > +int nfsd_export_wq_init(void)
> > +{
> > + nfsd_export_wq = alloc_workqueue("nfsd_export", WQ_UNBOUND, 0);
> > + if (!nfsd_export_wq)
> > + return -ENOMEM;
> > + return 0;
> > +}
> > +
> > +/**
> > + * nfsd_export_wq_shutdown - drain and free the export release workqueue
> > + *
> > + * Called once at module unload. Per-namespace teardown in
> > + * nfsd_export_shutdown() has already drained all deferred work.
> > + */
> > +void nfsd_export_wq_shutdown(void)
> > +{
> > + destroy_workqueue(nfsd_export_wq);
> > +}
> > +
> > /*
> > * Initialize the exports module.
> > */
> > @@ -1542,6 +1584,9 @@ nfsd_export_shutdown(struct net *net)
> >
> > cache_unregister_net(nn->svc_expkey_cache, net);
> > cache_unregister_net(nn->svc_export_cache, net);
> > + /* Drain deferred export and expkey release work. */
> > + rcu_barrier();
> > + flush_workqueue(nfsd_export_wq);
> > cache_destroy_net(nn->svc_expkey_cache, net);
> > cache_destroy_net(nn->svc_export_cache, net);
> > svcauth_unix_purge(net);
> > diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
> > index d2b09cd76145..b05399374574 100644
> > --- a/fs/nfsd/export.h
> > +++ b/fs/nfsd/export.h
> > @@ -7,6 +7,7 @@
> >
> > #include <linux/sunrpc/cache.h>
> > #include <linux/percpu_counter.h>
> > +#include <linux/workqueue.h>
> > #include <uapi/linux/nfsd/export.h>
> > #include <linux/nfs4.h>
> >
> > @@ -75,7 +76,7 @@ struct svc_export {
> > u32 ex_layout_types;
> > struct nfsd4_deviceid_map *ex_devid_map;
> > struct cache_detail *cd;
> > - struct rcu_head ex_rcu;
> > + struct rcu_work ex_rwork;
> > unsigned long ex_xprtsec_modes;
> > struct export_stats *ex_stats;
> > };
> > @@ -92,7 +93,7 @@ struct svc_expkey {
> > u32 ek_fsid[6];
> >
> > struct path ek_path;
> > - struct rcu_head ek_rcu;
> > + struct rcu_work ek_rwork;
> > };
> >
> > #define EX_ISSYNC(exp) (!((exp)->ex_flags & NFSEXP_ASYNC))
> > @@ -110,6 +111,8 @@ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
> > /*
> > * Function declarations
> > */
> > +int nfsd_export_wq_init(void);
> > +void nfsd_export_wq_shutdown(void);
> > int nfsd_export_init(struct net *);
> > void nfsd_export_shutdown(struct net *);
> > void nfsd_export_flush(struct net *);
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 664a3275c511..4166f59908f4 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -2308,9 +2308,12 @@ static int __init init_nfsd(void)
> > if (retval)
> > goto out_free_pnfs;
> > nfsd_lockd_init(); /* lockd->nfsd callbacks */
> > + retval = nfsd_export_wq_init();
> > + if (retval)
> > + goto out_free_lockd;
> > retval = register_pernet_subsys(&nfsd_net_ops);
> > if (retval < 0)
> > - goto out_free_lockd;
> > + goto out_free_export_wq;
> > retval = register_cld_notifier();
> > if (retval)
> > goto out_free_subsys;
> > @@ -2339,6 +2342,8 @@ static int __init init_nfsd(void)
> > unregister_cld_notifier();
> > out_free_subsys:
> > unregister_pernet_subsys(&nfsd_net_ops);
> > +out_free_export_wq:
> > + nfsd_export_wq_shutdown();
> > out_free_lockd:
> > nfsd_lockd_shutdown();
> > nfsd_drc_slab_free();
> > @@ -2359,6 +2364,7 @@ static void __exit exit_nfsd(void)
> > nfsd4_destroy_laundry_wq();
> > unregister_cld_notifier();
> > unregister_pernet_subsys(&nfsd_net_ops);
> > + nfsd_export_wq_shutdown();
> > nfsd_drc_slab_free();
> > nfsd_lockd_shutdown();
> > nfsd4_free_slabs();
>
> Looks good.
>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>
* Re: [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
2026-02-20 15:52 ` Jeff Layton
@ 2026-02-25 18:29 ` Olga Kornievskaia
0 siblings, 0 replies; 10+ messages in thread
From: Olga Kornievskaia @ 2026-02-25 18:29 UTC (permalink / raw)
To: Jeff Layton
Cc: Chuck Lever, misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo,
Tom Talpey, linux-nfs, Chuck Lever
On Fri, Feb 20, 2026 at 10:53 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > The /proc/fs/nfs/exports proc entry is created at module init
> > and persists for the module's lifetime. exports_proc_open()
> > captures the caller's current network namespace and stores
> > its svc_export_cache in seq->private, but takes no reference
> > on the namespace. If the namespace is subsequently torn down
> > (e.g. container destruction after the opener does setns() to a
> > different namespace), nfsd_net_exit() calls nfsd_export_shutdown()
> > which frees the cache. Subsequent reads on the still-open fd
> > dereference the freed cache_detail, walking a freed hash table.
> >
> > Hold a reference on the struct net for the lifetime of the open
> > file descriptor. This prevents nfsd_net_exit() from running --
> > and thus prevents nfsd_export_shutdown() from freeing the cache
> > -- while any exports fd is open. cache_detail already stores
> > its net pointer (cd->net, set by cache_create_net()), so
> > exports_release() can retrieve it without additional per-file
> > storage.
> >
> > Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> > Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
> > Fixes: 96d851c4d28d ("nfsd: use proper net while reading "exports" file")
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Olga Kornievskaia <okorniev@redhat.com>
> > ---
> > fs/nfsd/nfsctl.c | 14 ++++++++++++--
> > 1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 4166f59908f4..3d5a676e1d14 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -149,9 +149,19 @@ static int exports_net_open(struct net *net, struct file *file)
> >
> > seq = file->private_data;
> > seq->private = nn->svc_export_cache;
> > + get_net(net);
> > return 0;
> > }
> >
> > +static int exports_release(struct inode *inode, struct file *file)
> > +{
> > + struct seq_file *seq = file->private_data;
> > + struct cache_detail *cd = seq->private;
> > +
> > + put_net(cd->net);
> > + return seq_release(inode, file);
> > +}
> > +
> > static int exports_nfsd_open(struct inode *inode, struct file *file)
> > {
> > return exports_net_open(inode->i_sb->s_fs_info, file);
> > @@ -161,7 +171,7 @@ static const struct file_operations exports_nfsd_operations = {
> > .open = exports_nfsd_open,
> > .read = seq_read,
> > .llseek = seq_lseek,
> > - .release = seq_release,
> > + .release = exports_release,
> > };
> >
> > static int export_features_show(struct seq_file *m, void *v)
> > @@ -1376,7 +1386,7 @@ static const struct proc_ops exports_proc_ops = {
> > .proc_open = exports_proc_open,
> > .proc_read = seq_read,
> > .proc_lseek = seq_lseek,
> > - .proc_release = seq_release,
> > + .proc_release = exports_release,
> > };
> >
> > static int create_proc_exports_entry(void)
>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>
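For readers following along outside the kernel tree, the open/release
pairing described in the commit message above can be modelled in plain
userspace C. This is only a sketch under stated assumptions: every
toy_* name is hypothetical, the counter stands in for the real
get_net()/put_net() reference counting, and none of it is the NFSD
code itself. It shows the one property the patch relies on: teardown
cannot happen while an open file still pins the namespace, and release
can find the namespace through the cache's back-pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net. */
struct toy_net {
	int refcount;    /* models the namespace reference count */
	bool torn_down;  /* set once the last reference is dropped */
};

/* Hypothetical stand-in for struct cache_detail. */
struct toy_cache {
	struct toy_net *net;  /* back-pointer, analogous to cd->net */
};

/* Hypothetical stand-in for the open file and its seq->private. */
struct toy_file {
	struct toy_cache *private;
};

static void toy_get_net(struct toy_net *net)
{
	net->refcount++;
}

static void toy_put_net(struct toy_net *net)
{
	assert(net->refcount > 0);
	if (--net->refcount == 0)
		net->torn_down = true;  /* teardown may run only now */
}

/* open: remember the cache and pin the namespace that owns it */
static void toy_exports_open(struct toy_file *f, struct toy_cache *cd)
{
	f->private = cd;
	toy_get_net(cd->net);
}

/* release: recover the namespace through the cache back-pointer */
static void toy_exports_release(struct toy_file *f)
{
	toy_put_net(f->private->net);
	f->private = NULL;
}
```

In this model, dropping the creator's reference while the file is still
open leaves the namespace alive; only the release call lets teardown
proceed.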
* Re: [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks
2026-02-25 18:29 ` Olga Kornievskaia
@ 2026-02-25 18:53 ` Chuck Lever
0 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2026-02-25 18:53 UTC (permalink / raw)
To: Olga Kornievskaia, Jeff Layton
Cc: misanjum, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
linux-nfs, Chuck Lever
On Wed, Feb 25, 2026, at 1:29 PM, Olga Kornievskaia wrote:
> On Fri, Feb 20, 2026 at 10:50 AM Jeff Layton <jlayton@kernel.org> wrote:
>>
>> On Thu, 2026-02-19 at 16:50 -0500, Chuck Lever wrote:
>> > From: Chuck Lever <chuck.lever@oracle.com>
>> >
>> > svc_export_put() calls path_put() and auth_domain_put() immediately
>> > when the last reference drops, before the RCU grace period. RCU
>> > readers in e_show() and c_show() access both ex_path (via
>> > seq_path/d_path) and ex_client->name (via seq_escape) without
>> > holding a reference. If cache_clean removes the entry and drops the
>> > last reference concurrently, the sub-objects are freed while still
>> > in use, producing a NULL pointer dereference in d_path.
>> >
>> > Commit 2530766492ec ("nfsd: fix UAF when access ex_uuid or
>> > ex_stats") moved kfree of ex_uuid and ex_stats into the
>> > call_rcu callback, but left path_put() and auth_domain_put() running
>> > before the grace period because both may sleep and call_rcu
>> > callbacks execute in softirq context.
>> >
>> > Replace call_rcu/kfree_rcu with queue_rcu_work(), which defers the
>> > callback until after the RCU grace period and executes it in process
>> > context where sleeping is permitted. This allows path_put() and
>> > auth_domain_put() to be moved into the deferred callback alongside
>> > the other resource releases. Apply the same fix to expkey_put(),
>> > which has the identical pattern with ek_path and ek_client.
>> >
>> > A dedicated workqueue scopes the shutdown drain to only NFSD
>> > export release work items; flushing the shared
>> > system_unbound_wq would stall on unrelated work from other
>> > subsystems. nfsd_export_shutdown() uses rcu_barrier() followed
>> > by flush_workqueue() to ensure all deferred release callbacks
>> > complete before the export caches are destroyed.
>> >
>> > Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
>> > Closes: https://lore.kernel.org/linux-nfs/dcd371d3a95815a84ba7de52cef447b8@linux.ibm.com/
>> > Fixes: c224edca7af0 ("nfsd: no need get cache ref when protected by rcu")
>> > Fixes: 1b10f0b603c0 ("SUNRPC: no need get cache ref when protected by rcu")
>> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>
> Tested-by: Olga Kornievskaia <okorniev@redhat.com>
>
> I can reproduce the problem and verify that with the 2 patches applied
> I no longer see it.
Excellent, thank you!
>> > ---
>> > fs/nfsd/export.c | 63 +++++++++++++++++++++++++++++++++++++++++-------
>> > fs/nfsd/export.h | 7 ++++--
>> > fs/nfsd/nfsctl.c | 8 +++++-
>> > 3 files changed, 66 insertions(+), 12 deletions(-)
>> >
>> > diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
>> > index 04b18f0f402f..53fe66784ed2 100644
>> > --- a/fs/nfsd/export.c
>> > +++ b/fs/nfsd/export.c
>> > @@ -36,19 +36,30 @@
>> > * second map contains a reference to the entry in the first map.
>> > */
>> >
>> > +static struct workqueue_struct *nfsd_export_wq;
>> > +
>> > #define EXPKEY_HASHBITS 8
>> > #define EXPKEY_HASHMAX (1 << EXPKEY_HASHBITS)
>> > #define EXPKEY_HASHMASK (EXPKEY_HASHMAX -1)
>> >
>> > -static void expkey_put(struct kref *ref)
>> > +static void expkey_release(struct work_struct *work)
>> > {
>> > - struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
>> > + struct svc_expkey *key = container_of(to_rcu_work(work),
>> > + struct svc_expkey, ek_rwork);
>> >
>> > if (test_bit(CACHE_VALID, &key->h.flags) &&
>> > !test_bit(CACHE_NEGATIVE, &key->h.flags))
>> > path_put(&key->ek_path);
>> > auth_domain_put(key->ek_client);
>> > - kfree_rcu(key, ek_rcu);
>> > + kfree(key);
>> > +}
>> > +
>> > +static void expkey_put(struct kref *ref)
>> > +{
>> > + struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
>> > +
>> > + INIT_RCU_WORK(&key->ek_rwork, expkey_release);
>> > + queue_rcu_work(nfsd_export_wq, &key->ek_rwork);
>> > }
>> >
>> > static int expkey_upcall(struct cache_detail *cd, struct cache_head *h)
>> > @@ -353,11 +364,13 @@ static void export_stats_destroy(struct export_stats *stats)
>> > EXP_STATS_COUNTERS_NUM);
>> > }
>> >
>> > -static void svc_export_release(struct rcu_head *rcu_head)
>> > +static void svc_export_release(struct work_struct *work)
>> > {
>> > - struct svc_export *exp = container_of(rcu_head, struct svc_export,
>> > - ex_rcu);
>> > + struct svc_export *exp = container_of(to_rcu_work(work),
>> > + struct svc_export, ex_rwork);
>> >
>> > + path_put(&exp->ex_path);
>> > + auth_domain_put(exp->ex_client);
>> > nfsd4_fslocs_free(&exp->ex_fslocs);
>> > export_stats_destroy(exp->ex_stats);
>> > kfree(exp->ex_stats);
>> > @@ -369,9 +382,8 @@ static void svc_export_put(struct kref *ref)
>> > {
>> > struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
>> >
>> > - path_put(&exp->ex_path);
>> > - auth_domain_put(exp->ex_client);
>> > - call_rcu(&exp->ex_rcu, svc_export_release);
>> > + INIT_RCU_WORK(&exp->ex_rwork, svc_export_release);
>> > + queue_rcu_work(nfsd_export_wq, &exp->ex_rwork);
>> > }
>> >
>> > static int svc_export_upcall(struct cache_detail *cd, struct cache_head *h)
>> > @@ -1481,6 +1493,36 @@ const struct seq_operations nfs_exports_op = {
>> > .show = e_show,
>> > };
>> >
>> > +/**
>> > + * nfsd_export_wq_init - allocate the export release workqueue
>> > + *
>> > + * Called once at module load. The workqueue runs deferred svc_export and
>> > + * svc_expkey release work scheduled by queue_rcu_work() in the cache put
>> > + * callbacks.
>> > + *
>> > + * Return values:
>> > + * %0: workqueue allocated
>> > + * %-ENOMEM: allocation failed
>> > + */
>> > +int nfsd_export_wq_init(void)
>> > +{
>> > + nfsd_export_wq = alloc_workqueue("nfsd_export", WQ_UNBOUND, 0);
>> > + if (!nfsd_export_wq)
>> > + return -ENOMEM;
>> > + return 0;
>> > +}
>> > +
>> > +/**
>> > + * nfsd_export_wq_shutdown - drain and free the export release workqueue
>> > + *
>> > + * Called once at module unload. Per-namespace teardown in
>> > + * nfsd_export_shutdown() has already drained all deferred work.
>> > + */
>> > +void nfsd_export_wq_shutdown(void)
>> > +{
>> > + destroy_workqueue(nfsd_export_wq);
>> > +}
>> > +
>> > /*
>> > * Initialize the exports module.
>> > */
>> > @@ -1542,6 +1584,9 @@ nfsd_export_shutdown(struct net *net)
>> >
>> > cache_unregister_net(nn->svc_expkey_cache, net);
>> > cache_unregister_net(nn->svc_export_cache, net);
>> > + /* Drain deferred export and expkey release work. */
>> > + rcu_barrier();
>> > + flush_workqueue(nfsd_export_wq);
>> > cache_destroy_net(nn->svc_expkey_cache, net);
>> > cache_destroy_net(nn->svc_export_cache, net);
>> > svcauth_unix_purge(net);
>> > diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
>> > index d2b09cd76145..b05399374574 100644
>> > --- a/fs/nfsd/export.h
>> > +++ b/fs/nfsd/export.h
>> > @@ -7,6 +7,7 @@
>> >
>> > #include <linux/sunrpc/cache.h>
>> > #include <linux/percpu_counter.h>
>> > +#include <linux/workqueue.h>
>> > #include <uapi/linux/nfsd/export.h>
>> > #include <linux/nfs4.h>
>> >
>> > @@ -75,7 +76,7 @@ struct svc_export {
>> > u32 ex_layout_types;
>> > struct nfsd4_deviceid_map *ex_devid_map;
>> > struct cache_detail *cd;
>> > - struct rcu_head ex_rcu;
>> > + struct rcu_work ex_rwork;
>> > unsigned long ex_xprtsec_modes;
>> > struct export_stats *ex_stats;
>> > };
>> > @@ -92,7 +93,7 @@ struct svc_expkey {
>> > u32 ek_fsid[6];
>> >
>> > struct path ek_path;
>> > - struct rcu_head ek_rcu;
>> > + struct rcu_work ek_rwork;
>> > };
>> >
>> > #define EX_ISSYNC(exp) (!((exp)->ex_flags & NFSEXP_ASYNC))
>> > @@ -110,6 +111,8 @@ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
>> > /*
>> > * Function declarations
>> > */
>> > +int nfsd_export_wq_init(void);
>> > +void nfsd_export_wq_shutdown(void);
>> > int nfsd_export_init(struct net *);
>> > void nfsd_export_shutdown(struct net *);
>> > void nfsd_export_flush(struct net *);
>> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
>> > index 664a3275c511..4166f59908f4 100644
>> > --- a/fs/nfsd/nfsctl.c
>> > +++ b/fs/nfsd/nfsctl.c
>> > @@ -2308,9 +2308,12 @@ static int __init init_nfsd(void)
>> > if (retval)
>> > goto out_free_pnfs;
>> > nfsd_lockd_init(); /* lockd->nfsd callbacks */
>> > + retval = nfsd_export_wq_init();
>> > + if (retval)
>> > + goto out_free_lockd;
>> > retval = register_pernet_subsys(&nfsd_net_ops);
>> > if (retval < 0)
>> > - goto out_free_lockd;
>> > + goto out_free_export_wq;
>> > retval = register_cld_notifier();
>> > if (retval)
>> > goto out_free_subsys;
>> > @@ -2339,6 +2342,8 @@ static int __init init_nfsd(void)
>> > unregister_cld_notifier();
>> > out_free_subsys:
>> > unregister_pernet_subsys(&nfsd_net_ops);
>> > +out_free_export_wq:
>> > + nfsd_export_wq_shutdown();
>> > out_free_lockd:
>> > nfsd_lockd_shutdown();
>> > nfsd_drc_slab_free();
>> > @@ -2359,6 +2364,7 @@ static void __exit exit_nfsd(void)
>> > nfsd4_destroy_laundry_wq();
>> > unregister_cld_notifier();
>> > unregister_pernet_subsys(&nfsd_net_ops);
>> > + nfsd_export_wq_shutdown();
>> > nfsd_drc_slab_free();
>> > nfsd_lockd_shutdown();
>> > nfsd4_free_slabs();
>>
>> Looks good.
>>
>> Reviewed-by: Jeff Layton <jlayton@kernel.org>
>>
--
Chuck Lever
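The ordering argument in the patch 1/2 description (release work runs
only after the grace period, and shutdown does rcu_barrier() before
flush_workqueue()) can be illustrated with a small userspace model.
This is a hedged sketch, not the kernel implementation: all toy_*
names are hypothetical, the two singly linked lists stand in for
"waiting for a grace period" and "queued on the workqueue", and no
real RCU or workqueue machinery is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct svc_export. */
struct toy_export {
	bool path_held;           /* models the ex_path reference */
	bool freed;               /* models kfree() of the entry */
	struct toy_export *next;
};

/* Entries whose last reference dropped, still inside the grace period. */
static struct toy_export *toy_waiting_gp;
/* Entries whose grace period ended, waiting for a worker to run them. */
static struct toy_export *toy_work_ready;

/* put callback: do no cleanup here, only queue the entry */
static void toy_export_put(struct toy_export *exp)
{
	assert(!exp->freed);
	exp->next = toy_waiting_gp;
	toy_waiting_gp = exp;
}

/* Models the grace period ending: waiting entries become runnable work. */
static void toy_grace_period_end(void)
{
	while (toy_waiting_gp) {
		struct toy_export *exp = toy_waiting_gp;

		toy_waiting_gp = exp->next;
		exp->next = toy_work_ready;
		toy_work_ready = exp;
	}
}

/* Models flushing the workqueue: run every release item that is ready. */
static void toy_flush_work(void)
{
	while (toy_work_ready) {
		struct toy_export *exp = toy_work_ready;

		toy_work_ready = exp->next;
		exp->path_held = false;  /* path_put() analogue */
		exp->freed = true;       /* kfree() analogue */
	}
}
```

Flushing before the grace period has ended finds nothing to run in this
model, which is the reason the drain in nfsd_export_shutdown() waits
for the RCU callbacks first and flushes the workqueue second.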
Thread overview: 10+ messages
2026-02-19 21:50 [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks Chuck Lever
2026-02-19 21:50 ` [PATCH v1 1/2] NFSD: Defer sub-object cleanup in export put callbacks Chuck Lever
2026-02-20 15:50 ` Jeff Layton
2026-02-25 18:29 ` Olga Kornievskaia
2026-02-25 18:53 ` Chuck Lever
2026-02-19 21:50 ` [PATCH v1 2/2] NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd Chuck Lever
2026-02-20 15:52 ` Jeff Layton
2026-02-25 18:29 ` Olga Kornievskaia
2026-02-21 22:57 ` [PATCH v1 0/2] Address UAF in sunrpc cache show callbacks NeilBrown
2026-02-22 15:41 ` Chuck Lever