* [PATCH] rpcrdma: arm rn_done before publishing the notification
From: Chuck Lever @ 2026-06-01 20:17 UTC
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, linux-rdma, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
rpcrdma_rn_register() inserts @rn into rd_xa with xa_alloc() before
storing the caller's callback in rn->rn_done. Once @rn is in the
xarray it is reachable from rpcrdma_remove_one(), which walks rd_xa
and invokes rn->rn_done(rn) for every registered notification.
Notification objects are zero-allocated by their owners, so a device
removal that races a fresh registration can observe @rn with rn_done
still NULL and call through a NULL function pointer.
Store rn->rn_done before xa_alloc() publishes @rn. The xarray's
store-side and load-side ordering then guarantees that any CPU which
finds @rn in rd_xa also observes the armed callback.
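The initialize-then-publish rule above can be sketched in user space.
This is only a rough model, not the kernel code: struct pub_note, the
one-entry pub_slot registry, and the helpers pub_publish() and
pub_notify_all() are hypothetical names, and C11 release/acquire
atomics are assumed here as a stand-in for the ordering the xarray
provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the notification object. */
struct pub_note {
	void (*done)(struct pub_note *n);
	int fired;
};

/* Hypothetical one-entry registry standing in for rd_xa. */
static _Atomic(struct pub_note *) pub_slot;

/* Example callback: records that it ran. */
static void pub_mark_fired(struct pub_note *n)
{
	n->fired = 1;
}

/* Writer: arm the callback first, then publish with release order. */
static void pub_publish(struct pub_note *n, void (*done)(struct pub_note *))
{
	n->done = done;
	atomic_store_explicit(&pub_slot, n, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store, so a reader
 * that finds the object also sees the armed callback.
 * Returns 1 if a callback was invoked, 0 if nothing was published.
 */
static int pub_notify_all(void)
{
	struct pub_note *n;

	n = atomic_load_explicit(&pub_slot, memory_order_acquire);
	if (!n)
		return 0;
	assert(n->done != NULL);
	n->done(n);
	return 1;
}
```

Swapping the two statements in pub_publish() would reproduce the bug
in this model: a reader could then find the object while its callback
pointer is still NULL.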
rpcrdma_rn_unregister() treats a non-NULL rn_done as the sentinel
for a completed registration, so the early store must not survive a
failed registration. Clear rn_done again when xa_alloc() fails.
Were it left set, the failed-accept cleanup path would call
rpcrdma_rn_unregister() on an @rn that was never inserted, erasing
an unrelated rd_xa slot and underflowing rd_kref.
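The rollback requirement can be illustrated with a small user-space
sketch. Everything here is hypothetical and simplified: a two-slot
array (demo_table) stands in for rd_xa, a plain counter (demo_refs)
stands in for rd_kref, and demo_register() / demo_unregister() only
mimic the shape of the real functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_SLOTS 2	/* hypothetical capacity, small to force failure */

struct demo_rn {
	void (*done)(struct demo_rn *rn);
	unsigned int index;
};

static struct demo_rn *demo_table[DEMO_SLOTS];	/* stands in for rd_xa */
static int demo_refs;				/* stands in for rd_kref */

/* Store @rn in the first free slot, or fail when the table is full. */
static int demo_alloc(struct demo_rn *rn)
{
	for (unsigned int i = 0; i < DEMO_SLOTS; i++) {
		if (!demo_table[i]) {
			demo_table[i] = rn;
			rn->index = i;
			return 0;
		}
	}
	return -ENOMEM;
}

static int demo_register(struct demo_rn *rn, void (*done)(struct demo_rn *))
{
	rn->done = done;		/* arm before publishing */
	if (demo_alloc(rn) < 0) {
		rn->done = NULL;	/* roll the sentinel back */
		return -ENOMEM;
	}
	demo_refs++;
	return 0;
}

static void demo_unregister(struct demo_rn *rn)
{
	if (!rn->done)			/* never registered: no-op */
		return;
	assert(demo_table[rn->index] == rn);
	demo_table[rn->index] = NULL;
	rn->done = NULL;
	demo_refs--;
}

/* Example callback that does nothing. */
static void demo_noop(struct demo_rn *rn)
{
	(void)rn;
}
```

In this model a zero-initialized object whose registration failed
still has index 0. Without the rollback, demo_unregister() would treat
it as registered and clear slot 0, which belongs to another object.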
Fixes: 7e86845a0346 ("rpcrdma: Implement generic device removal")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/sunrpc/xprtrdma/ib_client.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/net/sunrpc/xprtrdma/ib_client.c b/net/sunrpc/xprtrdma/ib_client.c
index 69166d5d9987..188f7a13397f 100644
--- a/net/sunrpc/xprtrdma/ib_client.c
+++ b/net/sunrpc/xprtrdma/ib_client.c
@@ -52,8 +52,8 @@ static struct rpcrdma_device *rpcrdma_get_client_data(struct ib_device *device)
* is unregistered first.
*
* On failure, a negative errno is returned. rn->rn_done is left
- * NULL on every failure path (it is assigned only after xa_alloc
- * and kref_get have both succeeded), so the @rn may safely be
+ * NULL on every failure path (it is armed before xa_alloc but
+ * cleared again if xa_alloc fails), so the @rn may safely be
* passed to rpcrdma_rn_unregister() without a separate
* registered/unregistered flag in the caller.
*/
@@ -66,10 +66,21 @@ int rpcrdma_rn_register(struct ib_device *device,
if (!rd || test_bit(RPCRDMA_RD_F_REMOVING, &rd->rd_flags))
return -ENETUNREACH;
- if (xa_alloc(&rd->rd_xa, &rn->rn_index, rn, xa_limit_32b, GFP_KERNEL) < 0)
- return -ENOMEM;
- kref_get(&rd->rd_kref);
+ /*
+ * Arm rn_done before xa_alloc() publishes @rn: once @rn is
+ * visible in rd_xa, a concurrent rpcrdma_remove_one() can
+ * call rn->rn_done(), so the pointer must already be set.
+ *
+ * Restore NULL if xa_alloc() fails. rn_done doubles as the
+ * registration sentinel for rpcrdma_rn_unregister(); a stale
+ * value would unregister an @rn that was never inserted.
+ */
rn->rn_done = done;
+ if (xa_alloc(&rd->rd_xa, &rn->rn_index, rn, xa_limit_32b, GFP_KERNEL) < 0) {
+ rn->rn_done = NULL;
+ return -ENOMEM;
+ }
+ kref_get(&rd->rd_kref);
trace_rpcrdma_client_register(device, rn);
return 0;
}
@@ -102,8 +113,9 @@ void rpcrdma_rn_unregister(struct ib_device *device,
/*
* rn_done is the registration sentinel: rpcrdma_rn_register
- * assigns it last, after xa_alloc and kref_get have both
- * succeeded. A NULL rn_done means this notification was
+ * leaves it NULL on every failure path, clearing it again if
+ * xa_alloc fails, so a non-NULL rn_done marks a completed
+ * registration. A NULL rn_done means this notification was
* never registered (or its registration failed) or has
* already been unregistered, and the call is a no-op.
* Without this guard, rn_index == 0 from a kzalloc'd
--
2.54.0