* [PATCH] rpcrdma: arm rn_done before publishing the notification
@ 2026-06-01 20:17 Chuck Lever
From: Chuck Lever @ 2026-06-01 20:17 UTC
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, linux-rdma, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
rpcrdma_rn_register() inserts @rn into rd_xa with xa_alloc() before
storing the caller's callback in rn->rn_done. The xarray makes @rn
reachable to rpcrdma_remove_one(), which walks rd_xa and invokes
rn->rn_done(rn) for every registered notification. Notification
objects are zero-allocated by their owners, so a device removal
that races a fresh registration can observe @rn with rn_done still
NULL and call through a NULL function pointer.

Store rn->rn_done before xa_alloc() publishes @rn. The xarray's
store-side and load-side ordering then guarantees that any CPU which
finds @rn in rd_xa also observes the armed callback.
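
The arm-then-publish ordering can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: it swaps the
xarray for a single C11 atomic pointer with release/acquire ordering,
and every name in it (struct notification, slot, model_register,
model_remove_one, mark_fired) is hypothetical and not part of the
kernel code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the notification object; not the kernel struct. */
struct notification {
	void (*done)(struct notification *n);
	int fired;
};

/* One shared slot models the rd_xa entry that makes the object reachable. */
static _Atomic(struct notification *) slot;

/* Writer: arm the callback first, then publish with release semantics. */
static void model_register(struct notification *n,
			   void (*done)(struct notification *n))
{
	n->done = done;
	atomic_store_explicit(&slot, n, memory_order_release);
}

/*
 * Reader: an acquire load that observes the published pointer also
 * observes the store to n->done that preceded the release store.
 * Returns 1 if a callback was invoked, 0 if nothing was published.
 */
static int model_remove_one(void)
{
	struct notification *n =
		atomic_load_explicit(&slot, memory_order_acquire);

	if (!n)
		return 0;
	n->done(n);
	return 1;
}

/* Example callback used to show that the armed pointer is callable. */
static void mark_fired(struct notification *n)
{
	n->fired = 1;
}
```

If the two statements in model_register() were swapped, a reader could
load the pointer before done is written, which is the NULL call this
patch closes.
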
rpcrdma_rn_unregister() treats a non-NULL rn_done as the sentinel
for a completed registration, so the early store must not survive a
failed registration. Clear rn_done again when xa_alloc() fails.
Were it left set, the failed-accept cleanup path would call
rpcrdma_rn_unregister() on an @rn that was never inserted, erasing
an unrelated rd_xa slot and underflowing rd_kref.
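
The rollback can be sketched the same way. The model below is an
assumption-laden illustration, not the kernel implementation: a
four-entry array stands in for rd_xa, a plain int stands in for
rd_kref, fail_alloc is a test knob that forces the allocation to fail,
and all rd_model_* and rn_model names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

#define RD_MODEL_SLOTS 4

/* Hypothetical notification; a zeroed object has done == NULL, index == 0. */
struct rn_model {
	void (*done)(struct rn_model *rn);
	unsigned int index;
};

/* Hypothetical device: slots[] models rd_xa, refs models rd_kref. */
struct rd_model {
	struct rn_model *slots[RD_MODEL_SLOTS];
	int refs;
	int fail_alloc;		/* test knob: force the allocation to fail */
};

/* Claim the first free slot, or fail. */
static int rd_model_alloc(struct rd_model *rd, struct rn_model *rn)
{
	unsigned int i;

	if (rd->fail_alloc)
		return -ENOMEM;
	for (i = 0; i < RD_MODEL_SLOTS; i++) {
		if (!rd->slots[i]) {
			rd->slots[i] = rn;
			rn->index = i;
			return 0;
		}
	}
	return -ENOMEM;
}

static int rd_model_register(struct rd_model *rd, struct rn_model *rn,
			     void (*done)(struct rn_model *rn))
{
	rn->done = done;		/* arm before publishing */
	if (rd_model_alloc(rd, rn) < 0) {
		rn->done = NULL;	/* restore the "not registered" sentinel */
		return -ENOMEM;
	}
	rd->refs++;
	return 0;
}

static void rd_model_unregister(struct rd_model *rd, struct rn_model *rn)
{
	if (!rn->done)			/* never registered: no-op */
		return;
	rd->slots[rn->index] = NULL;
	rn->done = NULL;
	rd->refs--;
}

/* Example callback; does nothing. */
static void noop_done(struct rn_model *rn)
{
	(void)rn;
}
```

With the rn->done = NULL line removed, unregistering a zeroed object
whose registration failed would clear slot 0, which belongs to someone
else, and drop a reference that was never taken.
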
Fixes: 7e86845a0346 ("rpcrdma: Implement generic device removal")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/sunrpc/xprtrdma/ib_client.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/net/sunrpc/xprtrdma/ib_client.c b/net/sunrpc/xprtrdma/ib_client.c
index 69166d5d9987..188f7a13397f 100644
--- a/net/sunrpc/xprtrdma/ib_client.c
+++ b/net/sunrpc/xprtrdma/ib_client.c
@@ -52,8 +52,8 @@ static struct rpcrdma_device *rpcrdma_get_client_data(struct ib_device *device)
* is unregistered first.
*
* On failure, a negative errno is returned. rn->rn_done is left
- * NULL on every failure path (it is assigned only after xa_alloc
- * and kref_get have both succeeded), so the @rn may safely be
+ * NULL on every failure path (it is armed before xa_alloc but
+ * cleared again if xa_alloc fails), so the @rn may safely be
* passed to rpcrdma_rn_unregister() without a separate
* registered/unregistered flag in the caller.
*/
@@ -66,10 +66,21 @@ int rpcrdma_rn_register(struct ib_device *device,
if (!rd || test_bit(RPCRDMA_RD_F_REMOVING, &rd->rd_flags))
return -ENETUNREACH;
- if (xa_alloc(&rd->rd_xa, &rn->rn_index, rn, xa_limit_32b, GFP_KERNEL) < 0)
- return -ENOMEM;
- kref_get(&rd->rd_kref);
+ /*
+ * Arm rn_done before xa_alloc() publishes @rn: once @rn is
+ * visible in rd_xa, a concurrent rpcrdma_remove_one() can
+ * call rn->rn_done(), so the pointer must already be set.
+ *
+ * Restore NULL if xa_alloc() fails. rn_done doubles as the
+ * registration sentinel for rpcrdma_rn_unregister(); a stale
+ * value would unregister an @rn that was never inserted.
+ */
rn->rn_done = done;
+ if (xa_alloc(&rd->rd_xa, &rn->rn_index, rn, xa_limit_32b, GFP_KERNEL) < 0) {
+ rn->rn_done = NULL;
+ return -ENOMEM;
+ }
+ kref_get(&rd->rd_kref);
trace_rpcrdma_client_register(device, rn);
return 0;
}
@@ -102,8 +113,9 @@ void rpcrdma_rn_unregister(struct ib_device *device,
/*
* rn_done is the registration sentinel: rpcrdma_rn_register
- * assigns it last, after xa_alloc and kref_get have both
- * succeeded. A NULL rn_done means this notification was
+ * leaves it NULL on every failure path, clearing it again if
+ * xa_alloc fails, so a non-NULL rn_done marks a completed
+ * registration. A NULL rn_done means this notification was
* never registered (or its registration failed) or has
* already been unregistered, and the call is a no-op.
* Without this guard, rn_index == 0 from a kzalloc'd
--
2.54.0