From: Chuck Lever <cel@kernel.org>
To: Misbah Anjum N <misanjum@linux.ibm.com>,
Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
Trond Myklebust <trondmy@kernel.org>,
Anna Schumaker <anna@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
Yang Erkun <yangerkun@huawei.com>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH 2/6] SUNRPC: Provide a shared workqueue for cache release callbacks
Date: Fri, 01 May 2026 10:51:08 -0400 [thread overview]
Message-ID: <20260501-cache-uaf-fix-v1-2-a49928bf4817@oracle.com> (raw)
In-Reply-To: <20260501-cache-uaf-fix-v1-0-a49928bf4817@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Cache .put callbacks may need to release sub-objects whose
cleanup sleeps (path_put, auth_domain_put, put_group_info), which
precludes running the release from a call_rcu() softirq callback.
Commit 48db892356d6 ("NFSD: Defer sub-object cleanup in export
put callbacks") introduced nfsd_export_wq for that purpose, with
a dedicated workqueue chosen so that flush_workqueue() in the
per-namespace teardown path drains only NFSD export release work
rather than blocking on unrelated work queued to system_unbound_wq.
Subsequent patches in this series convert the sunrpc ip_map and
unix_gid put callbacks to the same queue_rcu_work() pattern, and
those would otherwise need their own per-cache workqueue for the
same reason. Hoist the workqueue up to the sunrpc layer so that
all four cache_detail put callbacks share a single workqueue,
managed entirely within net/sunrpc/cache.c.
Expose the workqueue through three helpers.
sunrpc_cache_queue_release() schedules a deferred release after
the next RCU grace period. sunrpc_cache_destroy_net()
encapsulates the cache_unregister_net() + drain +
cache_destroy_net() sequence that single-cache teardowns
otherwise have to open-code, putting the ordering rule in one
place. sunrpc_cache_drain() exposes the underlying
rcu_barrier() + flush_workqueue() primitive for the rare caller
that drains multiple cache_details together, such as
nfsd_export_shutdown(). Allocate the workqueue in
cache_initialize() and destroy it in a new cache_destroy()
called from cleanup_sunrpc(). Replace the local nfsd_export_wq
with the shared sunrpc helpers and drop the
nfsd_export_wq_init/shutdown helpers and their callers.
Assisted-by: Claude:claude-opus-4-7[1m]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/export.c | 41 +++-----------------------
fs/nfsd/export.h | 2 --
fs/nfsd/nfsctl.c | 8 +----
include/linux/sunrpc/cache.h | 3 ++
net/sunrpc/cache.c | 70 +++++++++++++++++++++++++++++++++++++++++++-
net/sunrpc/sunrpc.h | 3 +-
net/sunrpc/sunrpc_syms.c | 23 +++++++++------
7 files changed, 93 insertions(+), 57 deletions(-)
diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 15972919e1e9..3c4340e743fa 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -39,8 +39,6 @@
* second map contains a reference to the entry in the first map.
*/
-static struct workqueue_struct *nfsd_export_wq;
-
#define EXPKEY_HASHBITS 8
#define EXPKEY_HASHMAX (1 << EXPKEY_HASHBITS)
#define EXPKEY_HASHMASK (EXPKEY_HASHMAX -1)
@@ -62,7 +60,7 @@ static void expkey_put(struct kref *ref)
struct svc_expkey *key = container_of(ref, struct svc_expkey, h.ref);
INIT_RCU_WORK(&key->ek_rwork, expkey_release);
- queue_rcu_work(nfsd_export_wq, &key->ek_rwork);
+ sunrpc_cache_queue_release(&key->ek_rwork);
}
static int expkey_upcall(struct cache_detail *cd, struct cache_head *h)
@@ -652,7 +650,7 @@ static void svc_export_put(struct kref *ref)
struct svc_export *exp = container_of(ref, struct svc_export, h.ref);
INIT_RCU_WORK(&exp->ex_rwork, svc_export_release);
- queue_rcu_work(nfsd_export_wq, &exp->ex_rwork);
+ sunrpc_cache_queue_release(&exp->ex_rwork);
}
/**
@@ -2193,36 +2191,6 @@ const struct seq_operations nfs_exports_op = {
.show = e_show,
};
-/**
- * nfsd_export_wq_init - allocate the export release workqueue
- *
- * Called once at module load. The workqueue runs deferred svc_export and
- * svc_expkey release work scheduled by queue_rcu_work() in the cache put
- * callbacks.
- *
- * Return values:
- * %0: workqueue allocated
- * %-ENOMEM: allocation failed
- */
-int nfsd_export_wq_init(void)
-{
- nfsd_export_wq = alloc_workqueue("nfsd_export", WQ_UNBOUND, 0);
- if (!nfsd_export_wq)
- return -ENOMEM;
- return 0;
-}
-
-/**
- * nfsd_export_wq_shutdown - drain and free the export release workqueue
- *
- * Called once at module unload. Per-namespace teardown in
- * nfsd_export_shutdown() has already drained all deferred work.
- */
-void nfsd_export_wq_shutdown(void)
-{
- destroy_workqueue(nfsd_export_wq);
-}
-
/*
* Initialize the exports module.
*/
@@ -2284,9 +2252,8 @@ nfsd_export_shutdown(struct net *net)
cache_unregister_net(nn->svc_expkey_cache, net);
cache_unregister_net(nn->svc_export_cache, net);
- /* Drain deferred export and expkey release work. */
- rcu_barrier();
- flush_workqueue(nfsd_export_wq);
+ /* One drain covers both caches' deferred release work. */
+ sunrpc_cache_drain();
cache_destroy_net(nn->svc_expkey_cache, net);
cache_destroy_net(nn->svc_export_cache, net);
svcauth_unix_purge(net);
diff --git a/fs/nfsd/export.h b/fs/nfsd/export.h
index b05399374574..8969e81de448 100644
--- a/fs/nfsd/export.h
+++ b/fs/nfsd/export.h
@@ -111,8 +111,6 @@ __be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp,
/*
* Function declarations
*/
-int nfsd_export_wq_init(void);
-void nfsd_export_wq_shutdown(void);
int nfsd_export_init(struct net *);
void nfsd_export_shutdown(struct net *);
void nfsd_export_flush(struct net *);
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 064a2e749bc9..468aad8c3af9 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -2536,12 +2536,9 @@ static int __init init_nfsd(void)
if (retval)
goto out_free_pnfs;
nfsd_lockd_init(); /* lockd->nfsd callbacks */
- retval = nfsd_export_wq_init();
- if (retval)
- goto out_free_lockd;
retval = register_pernet_subsys(&nfsd_net_ops);
if (retval < 0)
- goto out_free_export_wq;
+ goto out_free_lockd;
retval = register_cld_notifier();
if (retval)
goto out_free_subsys;
@@ -2570,8 +2567,6 @@ static int __init init_nfsd(void)
unregister_cld_notifier();
out_free_subsys:
unregister_pernet_subsys(&nfsd_net_ops);
-out_free_export_wq:
- nfsd_export_wq_shutdown();
out_free_lockd:
nfsd_lockd_shutdown();
nfsd_drc_slab_free();
@@ -2592,7 +2587,6 @@ static void __exit exit_nfsd(void)
nfsd4_destroy_laundry_wq();
unregister_cld_notifier();
unregister_pernet_subsys(&nfsd_net_ops);
- nfsd_export_wq_shutdown();
nfsd_drc_slab_free();
nfsd_lockd_shutdown();
nfsd4_free_slabs();
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 83c88dc82e69..84802438a5fc 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -237,11 +237,14 @@ extern int cache_check(struct cache_detail *detail,
extern void cache_flush(void);
extern void cache_purge(struct cache_detail *detail);
#define NEVER (0x7FFFFFFF)
+extern void sunrpc_cache_queue_release(struct rcu_work *rwork);
+extern void sunrpc_cache_drain(void);
extern int cache_register_net(struct cache_detail *cd, struct net *net);
extern void cache_unregister_net(struct cache_detail *cd, struct net *net);
extern struct cache_detail *cache_create_net(const struct cache_detail *tmpl, struct net *net);
extern void cache_destroy_net(struct cache_detail *cd, struct net *net);
+extern void sunrpc_cache_destroy_net(struct cache_detail *cd, struct net *net);
extern void sunrpc_init_cache_detail(struct cache_detail *cd);
extern void sunrpc_destroy_cache_detail(struct cache_detail *cd);
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 488a14961b19..733bcd3daa46 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1705,9 +1705,77 @@ static int create_cache_proc_entries(struct cache_detail *cd, struct net *net)
return -ENOMEM;
}
-void __init cache_initialize(void)
+static struct workqueue_struct *sunrpc_cache_wq;
+
+/**
+ * sunrpc_cache_queue_release - schedule deferred cache release work
+ * @rwork: caller-initialized rcu_work to queue
+ *
+ * Run @rwork in process context after the next RCU grace period.
+ * Use this for cache .put callbacks whose cleanup may sleep
+ * (path_put(), auth_domain_put()).
+ */
+void sunrpc_cache_queue_release(struct rcu_work *rwork)
{
+ queue_rcu_work(sunrpc_cache_wq, rwork);
+}
+EXPORT_SYMBOL_GPL(sunrpc_cache_queue_release);
+
+/**
+ * sunrpc_cache_drain - drain pending cache release work
+ *
+ * Wait for outstanding RCU callbacks to enqueue their release
+ * work, then flush that work to completion.
+ */
+void sunrpc_cache_drain(void)
+{
+ rcu_barrier();
+ flush_workqueue(sunrpc_cache_wq);
+}
+EXPORT_SYMBOL_GPL(sunrpc_cache_drain);
+
+/**
+ * sunrpc_cache_destroy_net - quiesce and tear down a per-net cache
+ * @cd: the cache_detail to release
+ * @net: the network namespace owning @cd
+ *
+ * Canonical teardown for caches whose .put callbacks use
+ * sunrpc_cache_queue_release(). Unregister @cd to stop new
+ * lookups, drain in-flight RCU callbacks and queued release
+ * work, then free @cd and its hash table. The drain ensures
+ * release workers complete while the cache_detail is still
+ * valid.
+ */
+void sunrpc_cache_destroy_net(struct cache_detail *cd, struct net *net)
+{
+ cache_unregister_net(cd, net);
+ sunrpc_cache_drain();
+ cache_destroy_net(cd, net);
+}
+EXPORT_SYMBOL_GPL(sunrpc_cache_destroy_net);
+
+/**
+ * cache_initialize - allocate sunrpc cache subsystem resources
+ */
+int __init cache_initialize(void)
+{
+ sunrpc_cache_wq = alloc_workqueue("sunrpc_cache",
+ WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
+ if (!sunrpc_cache_wq)
+ return -ENOMEM;
INIT_DEFERRABLE_WORK(&cache_cleaner, do_cache_clean);
+ return 0;
+}
+
+/**
+ * cache_destroy - release sunrpc cache subsystem resources
+ *
+ * Caller must ensure no further sunrpc_cache_queue_release()
+ * calls can be scheduled before invoking this.
+ */
+void cache_destroy(void)
+{
+ destroy_workqueue(sunrpc_cache_wq);
}
int cache_register_net(struct cache_detail *cd, struct net *net)
diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
index 7fa35ee8f9a4..75ee201e4800 100644
--- a/net/sunrpc/sunrpc.h
+++ b/net/sunrpc/sunrpc.h
@@ -41,7 +41,8 @@ struct svc_rqst;
int rpc_clients_notifier_register(void);
void rpc_clients_notifier_unregister(void);
void auth_domain_cleanup(void);
-void __init cache_initialize(void);
+int __init cache_initialize(void);
+void cache_destroy(void);
void svc_sock_update_bufs(struct svc_serv *serv);
enum svc_auth_status svc_authenticate(struct svc_rqst *rqstp);
#endif /* _NET_SUNRPC_SUNRPC_H */
diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c
index ab88ce46afb5..d75ff1e592f2 100644
--- a/net/sunrpc/sunrpc_syms.c
+++ b/net/sunrpc/sunrpc_syms.c
@@ -97,24 +97,26 @@ init_sunrpc(void)
if (err)
goto out2;
- cache_initialize();
-
- err = register_pernet_subsys(&sunrpc_net_ops);
+ err = cache_initialize();
if (err)
goto out3;
- err = register_rpc_pipefs();
+ err = register_pernet_subsys(&sunrpc_net_ops);
if (err)
goto out4;
- err = rpc_sysfs_init();
+ err = register_rpc_pipefs();
if (err)
goto out5;
- err = genl_register_family(&sunrpc_nl_family);
+ err = rpc_sysfs_init();
if (err)
goto out6;
+ err = genl_register_family(&sunrpc_nl_family);
+ if (err)
+ goto out7;
+
sunrpc_debugfs_init();
#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
rpc_register_sysctl();
@@ -123,12 +125,14 @@ init_sunrpc(void)
init_socket_xprt(); /* clnt sock transport */
return 0;
-out6:
+out7:
rpc_sysfs_exit();
-out5:
+out6:
unregister_rpc_pipefs();
-out4:
+out5:
unregister_pernet_subsys(&sunrpc_net_ops);
+out4:
+ cache_destroy();
out3:
rpcauth_remove_module();
out2:
@@ -157,6 +161,7 @@ cleanup_sunrpc(void)
rpc_unregister_sysctl();
#endif
rcu_barrier(); /* Wait for completion of call_rcu()'s */
+ cache_destroy();
}
MODULE_DESCRIPTION("Sun RPC core");
MODULE_LICENSE("GPL");
--
2.53.0
next prev parent reply other threads:[~2026-05-01 14:51 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-01 14:51 [PATCH 0/6] SUNRPC: Address remaining cache_check_rcu() UAF in cache content files Chuck Lever
2026-05-01 14:51 ` [PATCH 1/6] SUNRPC: Move cache_initialize() declaration to sunrpc-private header Chuck Lever
2026-05-01 14:51 ` Chuck Lever [this message]
2026-05-01 14:51 ` [PATCH 3/6] SUNRPC: Defer ip_map sub-object cleanup past RCU grace period Chuck Lever
2026-05-01 14:51 ` [PATCH 4/6] SUNRPC: Use shared release pattern for the unix_gid cache Chuck Lever
2026-05-01 14:51 ` [PATCH 5/6] SUNRPC: Hold cd->net for the lifetime of cache files Chuck Lever
2026-05-01 14:51 ` [PATCH 6/6] NFSD: Convert nfsd_export_shutdown() to sunrpc_cache_destroy_net() Chuck Lever
2026-05-05 5:32 ` [PATCH 0/6] SUNRPC: Address remaining cache_check_rcu() UAF in cache content files Jeff Layton
2026-05-05 10:49 ` Calum Mackay
2026-05-05 10:53 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260501-cache-uaf-fix-v1-2-a49928bf4817@oracle.com \
--to=cel@kernel.org \
--cc=Dai.Ngo@oracle.com \
--cc=anna@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=jlayton@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=misanjum@linux.ibm.com \
--cc=neil@brown.name \
--cc=netdev@vger.kernel.org \
--cc=okorniev@redhat.com \
--cc=pabeni@redhat.com \
--cc=tom@talpey.com \
--cc=trondmy@kernel.org \
--cc=yangerkun@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox