From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7C4293EFFDA; Mon, 1 Jun 2026 20:17:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780345028; cv=none; b=OSnpr24t1aL5e6mUXn3QbUy6qtvYlEGCfN94IB2kdx23dnrOkdRUnFwNkqCYakQGt5r6fYDkAO69tadVOkPNYtPx++tLk6O3FDRpPO121C3hKmbzoFgHy+oAAUv3Bb1q1kz3GwihlQpl9at2Im58A781DR7EaJmSJZyVHZSlmaI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780345028; c=relaxed/simple; bh=4SAI65EkvBGD3doUpnTPzYUuQiCfdnCC+SHYXU+mCgE=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version; b=AzOEmrBGdw84KKpmMxhw8EBpS/Sf0cNyX67irlTh3zjcktJxyzYYnCTfnvwdbVV4KM9uX9T5OBKeOisjiutxk6bj0d1fyr9d7O4c964Czi9Gr1PdGFO28Bbl63u+W2xEk1DG7/hJW7B5VJO9QyPhA0m8LvGSKl+KYLyZlmpPAkA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WBs+mP8l; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WBs+mP8l" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9387E1F00893; Mon, 1 Jun 2026 20:17:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780345025; bh=0AFjyC97XtGoAMi6YDHtcRzBtnK0UUtQ/Zwq5uDWpWo=; h=From:To:Cc:Subject:Date; b=WBs+mP8l9NPXhaDi3mPd7WxRisRYDgUXpgAOb+srW8tYicRaQenbS46y6+ZQRpg6k XhfcEw0Hxf1o53eihwhupGwAdUinDoZ3mhDtuA9CUobMNRw9c3O+IOUG2feQHQ4h5O ieTL6TQRsuaaqixq3Q51ElX23PZoTT6WrWSbeMRhPupYQEnurVfU+9bS0SzMP4ZadG qv7gSEpsCrhn3BxKAnfaq2/C4HHpMZ+wMu/iYEQ9hSbgEet4gHsL7/k3hl/IRCDUjw aphVjSNJW1/B5DqimBzWyhgtvodqsOM01kYAmXtHAla68pLSjtdqQsHcw/zyYjP2eJ FVEU/dj51zCiQ== From: Chuck Lever To: NeilBrown , Jeff Layton , Olga Kornievskaia , Dai Ngo , Tom Talpey Cc: , , Chuck Lever Subject: [PATCH] rpcrdma: arm rn_done before publishing the notification Date: Mon, 1 Jun 2026 16:17:03 -0400 Message-ID: <20260601201703.46078-1-cel@kernel.org> X-Mailer: git-send-email 2.54.0 Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: Chuck Lever rpcrdma_rn_register() inserts @rn into rd_xa with xa_alloc() before storing the caller's callback in rn->rn_done. The xarray makes @rn reachable to rpcrdma_remove_one(), which walks rd_xa and invokes rn->rn_done(rn) for every registered notification. A device removal that races a fresh registration can therefore observe @rn with rn_done still NULL, because the notification objects are zero allocated by their owners, and call through a NULL function pointer. Store rn->rn_done before xa_alloc() publishes @rn. The xarray's store-side and load-side ordering then guarantees that any CPU which finds @rn in rd_xa also observes the armed callback. rpcrdma_rn_unregister() treats a non-NULL rn_done as the sentinel for a completed registration, so the early store must not survive a failed registration. Clear rn_done again when xa_alloc() fails. Were it left set, the failed-accept cleanup path would call rpcrdma_rn_unregister() on an @rn that was never inserted, erasing an unrelated rd_xa slot and underflowing rd_kref. Fixes: 7e86845a0346 ("rpcrdma: Implement generic device removal") Signed-off-by: Chuck Lever --- net/sunrpc/xprtrdma/ib_client.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/net/sunrpc/xprtrdma/ib_client.c b/net/sunrpc/xprtrdma/ib_client.c index 69166d5d9987..188f7a13397f 100644 --- a/net/sunrpc/xprtrdma/ib_client.c +++ b/net/sunrpc/xprtrdma/ib_client.c @@ -52,8 +52,8 @@ static struct rpcrdma_device *rpcrdma_get_client_data(struct ib_device *device) * is unregistered first. * * On failure, a negative errno is returned. rn->rn_done is left - * NULL on every failure path (it is assigned only after xa_alloc - * and kref_get have both succeeded), so the @rn may safely be + * NULL on every failure path (it is armed before xa_alloc but + * cleared again if xa_alloc fails), so the @rn may safely be * passed to rpcrdma_rn_unregister() without a separate * registered/unregistered flag in the caller. */ @@ -66,10 +66,21 @@ int rpcrdma_rn_register(struct ib_device *device, if (!rd || test_bit(RPCRDMA_RD_F_REMOVING, &rd->rd_flags)) return -ENETUNREACH; - if (xa_alloc(&rd->rd_xa, &rn->rn_index, rn, xa_limit_32b, GFP_KERNEL) < 0) - return -ENOMEM; - kref_get(&rd->rd_kref); + /* + * Arm rn_done before xa_alloc() publishes @rn: once @rn is + * visible in rd_xa, a concurrent rpcrdma_remove_one() can + * call rn->rn_done(), so the pointer must already be set. + * + * Restore NULL if xa_alloc() fails. rn_done doubles as the + * registration sentinel for rpcrdma_rn_unregister(); a stale + * value would unregister an @rn that was never inserted. + */ rn->rn_done = done; + if (xa_alloc(&rd->rd_xa, &rn->rn_index, rn, xa_limit_32b, GFP_KERNEL) < 0) { + rn->rn_done = NULL; + return -ENOMEM; + } + kref_get(&rd->rd_kref); trace_rpcrdma_client_register(device, rn); return 0; } @@ -102,8 +113,9 @@ void rpcrdma_rn_unregister(struct ib_device *device, /* * rn_done is the registration sentinel: rpcrdma_rn_register - * assigns it last, after xa_alloc and kref_get have both - * succeeded. A NULL rn_done means this notification was + * leaves it NULL on every failure path, clearing it again if + * xa_alloc fails, so a non-NULL rn_done marks a completed + * registration. A NULL rn_done means this notification was * never registered (or its registration failed) or has * already been unregistered, and the call is a no-op. * Without this guard, rn_index == 0 from a kzalloc'd -- 2.54.0