* [PATCH net v2] net/smc: fix UAF in smc_cdc_rx_handler() by pinning the socket
From: Xiang Mei @ 2026-06-30 18:32 UTC
To: Sidraya Jayagond, D . Wythe, Dust Li, Wenjia Zhang,
Mahanta Jambigi, Tony Lu, Wen Gu, netdev
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Hans Wippel, linux-rdma, linux-s390, Weiming Shi,
Xiang Mei
smc_cdc_rx_handler() looks up the connection by token under the link
group's conns_lock, drops the lock, and then dereferences conn and the
smc_sock derived from it, ending in sock_hold(&smc->sk) inside
smc_cdc_msg_recv(). No reference is held across the lock release.
The only reference pinning the socket while the connection is
discoverable in the link group is taken in smc_lgr_register_conn()
(sock_hold) and dropped in __smc_lgr_unregister_conn() (sock_put), both
under conns_lock. Once the handler drops conns_lock, a concurrent
close() -> smc_release() -> smc_conn_free() -> smc_lgr_unregister_conn()
can drop that reference and free the smc_sock, so the handler's later
sock_hold() runs on freed memory:
WARNING: lib/refcount.c:25 at refcount_warn_saturate
Workqueue: rxe_wq do_work
refcount_warn_saturate (lib/refcount.c:25)
smc_cdc_msg_recv (net/smc/smc_cdc.c:430)
smc_cdc_rx_handler (net/smc/smc_cdc.c:502)
smc_wr_rx_tasklet_fn (net/smc/smc_wr.c:445)
tasklet_action_common (kernel/softirq.c:938)
handle_softirqs (kernel/softirq.c:622)
Kernel panic - not syncing: panic_on_warn set
Only SMC-R is affected. The SMC-D receive tasklet is stopped by
tasklet_kill(&conn->rx_tsklet) in smc_conn_free() before the connection
is unregistered, so it cannot run concurrently with the free.
Take the socket reference while still holding conns_lock, so the
registration reference can no longer be the last one, and drop it once
the handler is done.
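
The lookup-and-pin pattern can be sketched in userspace C. This is an
illustration only, not the kernel code: struct obj, the one-slot registry,
and every obj_*() helper are hypothetical stand-ins for smc_sock, the link
group's connection tree, and sock_hold()/sock_put(), and a pthread mutex
stands in for conns_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted object; stands in for the socket. */
struct obj {
	atomic_int refcnt;
	unsigned int token;
};

/* Hypothetical one-slot registry guarded by a lock named after conns_lock. */
static pthread_mutex_t conns_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj *registered;

static void obj_hold(struct obj *o)
{
	int old = atomic_fetch_add(&o->refcnt, 1);

	assert(old > 0);	/* taking a reference on a dead object is a bug */
	(void)old;
}

/* Returns 1 if this put dropped the last reference and freed the object. */
static int obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
		free(o);
		return 1;
	}
	return 0;
}

static struct obj *obj_register(unsigned int token)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	atomic_init(&o->refcnt, 1);	/* the registration reference */
	o->token = token;
	pthread_mutex_lock(&conns_lock);
	registered = o;
	pthread_mutex_unlock(&conns_lock);
	return o;
}

/* Drops the registration reference under the lock; returns 1 if that freed it. */
static int obj_unregister(void)
{
	int freed = 0;

	pthread_mutex_lock(&conns_lock);
	if (registered) {
		freed = obj_put(registered);
		registered = NULL;
	}
	pthread_mutex_unlock(&conns_lock);
	return freed;
}

/* Looks up and pins in one critical section, so the result cannot be freed. */
static struct obj *obj_find_and_hold(unsigned int token)
{
	struct obj *o;

	pthread_mutex_lock(&conns_lock);
	o = registered;
	if (o && o->token == token)
		obj_hold(o);	/* pinned before the lock is released */
	else
		o = NULL;
	pthread_mutex_unlock(&conns_lock);
	return o;
}
```

Had obj_find_and_hold() taken its reference only after unlocking,
obj_unregister() could have freed the object in between, which is the window
this patch closes.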
Fixes: d7b0e37c1ac1 ("net/smc: restructure CDC message reception")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Xiang Mei <xmei5@asu.edu>
---
v2:
- Take the reference under conns_lock, and compute smc once
- Initialize smc = NULL at declaration
net/smc/smc_cdc.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index 619b3bab3824..32d6d03df321 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -470,9 +470,9 @@ static void smc_cdc_rx_handler(struct ib_wc *wc, void *buf)
 {
 	struct smc_link *link = (struct smc_link *)wc->qp->qp_context;
 	struct smc_cdc_msg *cdc = buf;
+	struct smc_sock *smc = NULL;
 	struct smc_connection *conn;
 	struct smc_link_group *lgr;
-	struct smc_sock *smc;
 
 	if (wc->byte_len < offsetof(struct smc_cdc_msg, reserved))
 		return; /* short message */
@@ -483,21 +483,26 @@ static void smc_cdc_rx_handler(struct ib_wc *wc, void *buf)
 	lgr = smc_get_lgr(link);
 	read_lock_bh(&lgr->conns_lock);
 	conn = smc_lgr_find_conn(ntohl(cdc->token), lgr);
-	read_unlock_bh(&lgr->conns_lock);
-	if (!conn || conn->out_of_sync)
+	if (!conn || conn->out_of_sync) {
+		read_unlock_bh(&lgr->conns_lock);
 		return;
+	}
 	smc = container_of(conn, struct smc_sock, conn);
+	sock_hold(&smc->sk);
+	read_unlock_bh(&lgr->conns_lock);
 
 	if (cdc->prod_flags.failover_validation) {
 		smc_cdc_msg_validate(smc, cdc, link);
-		return;
+		goto out;
 	}
 	if (smc_cdc_before(ntohs(cdc->seqno),
 			   conn->local_rx_ctrl.seqno))
 		/* received seqno is old */
-		return;
+		goto out;
 
 	smc_cdc_msg_recv(smc, cdc);
+out:
+	sock_put(&smc->sk);
 }
 
 static struct smc_wr_rx_handler smc_cdc_rx_handlers[] = {
--
2.43.0
* Re: [PATCH net v2] net/smc: fix UAF in smc_cdc_rx_handler() by pinning the socket
From: sashiko-bot @ 2026-07-01 18:32 UTC
To: Xiang Mei; +Cc: Alexander Gordeev, Vasily Gorbik, linux-s390, Heiko Carstens
Thank you for your contribution! Sashiko AI review found 1 potential issue to consider:
Pre-existing issues:
- [Critical] Concurrent unprotected bitfield modifications in `struct smc_connection` can lead to lost updates for `freed`, `killed`, or `out_of_sync` flags, causing reference count underflows and Use-After-Free.
--
commit 2f1e95dfacd6a1bad8a756fc60b2dadb865fcaac
Author: Xiang Mei <xmei5@asu.edu>
net/smc: fix UAF in smc_cdc_rx_handler() by pinning the socket
This commit fixes a use-after-free in smc_cdc_rx_handler() by taking a socket
reference while holding conns_lock. This ensures the connection and socket
remain valid while the receive handler runs.
> diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
> index 619b3bab38248..32d6d03df3214 100644
> --- a/net/smc/smc_cdc.c
> +++ b/net/smc/smc_cdc.c
[ ... ]
> @@ -483,21 +483,26 @@ static void smc_cdc_rx_handler(struct ib_wc *wc, void *buf)
>  	lgr = smc_get_lgr(link);
>  	read_lock_bh(&lgr->conns_lock);
>  	conn = smc_lgr_find_conn(ntohl(cdc->token), lgr);
> -	read_unlock_bh(&lgr->conns_lock);
> -	if (!conn || conn->out_of_sync)
> +	if (!conn || conn->out_of_sync) {
> +		read_unlock_bh(&lgr->conns_lock);
>  		return;
> +	}
>  	smc = container_of(conn, struct smc_sock, conn);
> +	sock_hold(&smc->sk);
> +	read_unlock_bh(&lgr->conns_lock);
>
>  	if (cdc->prod_flags.failover_validation) {
>  		smc_cdc_msg_validate(smc, cdc, link);
> -		return;
> +		goto out;
[Severity: Critical]
This is a pre-existing issue, but can concurrent bitfield modifications in
struct smc_connection lead to lost updates and a use-after-free?
In net/smc/smc.h, we have:
struct smc_connection {
	...
	u8 killed : 1;		/* abnormal termination */
	u8 freed : 1;		/* normal termination */
	u8 out_of_sync : 1;	/* out of sync with peer */
};
When smc_cdc_msg_validate() is called from softirq (tasklet) context here,
it sets conn->out_of_sync = 1 locklessly.
If a softirq interrupts a process context that is simultaneously setting
conn->freed = 1 in smc_conn_free(), or if they execute concurrently on
different CPUs, the compiler's read-modify-write on the shared byte could
cause one update to overwrite the other.
If the update to conn->freed = 1 is lost, could smc_conn_free() be executed
multiple times on the same connection, bypassing an early if (conn->freed)
check? This seems like it could underflow the link and link-group reference
counts, leading to a use-after-free of those structures.
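
The lost update described above can be reproduced deterministically by
spelling out the load, OR, and store steps by hand. This is a single-threaded
userspace sketch under stated assumptions: the FLAG_* masks and both helper
functions are hypothetical, the byte stands in for the storage unit shared by
the three bitfields, and the atomic variant is shown only for contrast, not
as a fix proposed in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical masks mirroring the three one-bit fields quoted above. */
#define FLAG_KILLED		0x01u
#define FLAG_FREED		0x02u
#define FLAG_OUT_OF_SYNC	0x04u

static_assert((FLAG_KILLED | FLAG_FREED | FLAG_OUT_OF_SYNC) <= UINT8_MAX,
	      "all flags must share one byte");

/*
 * Simulates two writers whose load/OR/store sequences overlap on one byte.
 * The "first" writer loads early and stores late; the "second" writer runs
 * its whole sequence in between.
 */
static uint8_t racy_update(uint8_t first, uint8_t second)
{
	uint8_t shared = 0;
	uint8_t stale = shared;			/* first writer loads the byte */

	shared = (uint8_t)(shared | second);	/* second writer: load, OR, store */
	shared = (uint8_t)(stale | first);	/* first writer stores a stale copy */
	return shared;
}

/* Same two updates, each done as one indivisible read-modify-write. */
static uint8_t atomic_update(uint8_t first, uint8_t second)
{
	_Atomic uint8_t shared = 0;

	atomic_fetch_or(&shared, second);
	atomic_fetch_or(&shared, first);
	return atomic_load(&shared);
}
```

With first = FLAG_OUT_OF_SYNC and second = FLAG_FREED, racy_update() returns
a byte with the freed bit cleared, matching the scenario where the handler's
store overwrites a concurrent conn->freed = 1.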
>  	}
>  	if (smc_cdc_before(ntohs(cdc->seqno),
>  			   conn->local_rx_ctrl.seqno))
>  		/* received seqno is old */
> -		return;
> +		goto out;
>
>  	smc_cdc_msg_recv(smc, cdc);
> +out:
> +	sock_put(&smc->sk);
>  }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260630183227.2044998-1-xmei5@asu.edu?part=1