* [PATCH net] net/smc: fix UAF in smc_cdc_rx_handler() by pinning the socket
@ 2026-06-27 1:49 Xiang Mei
2026-06-28 1:50 ` sashiko-bot
2026-06-30 5:39 ` Sidraya Jayagond
0 siblings, 2 replies; 4+ messages in thread
From: Xiang Mei @ 2026-06-27 1:49 UTC
To: D . Wythe, Dust Li, Sidraya Jayagond, Wenjia Zhang,
Mahanta Jambigi, Tony Lu, Wen Gu, netdev
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Hans Wippel, linux-rdma, linux-s390, Weiming Shi,
Xiang Mei

smc_cdc_rx_handler() looks up the connection by token under the link
group's conns_lock, drops the lock, and then dereferences conn and the
smc_sock derived from it, ending in sock_hold(&smc->sk) inside
smc_cdc_msg_recv(). No reference is held across the lock release.

The only reference pinning the socket while the connection is
discoverable in the link group is taken in smc_lgr_register_conn()
(sock_hold) and dropped in __smc_lgr_unregister_conn() (sock_put), both
under conns_lock. Once the handler drops conns_lock, a concurrent
close() -> smc_release() -> smc_conn_free() -> smc_lgr_unregister_conn()
can drop that reference and free the smc_sock, so the handler's later
sock_hold() runs on freed memory:

  WARNING: lib/refcount.c:25 at refcount_warn_saturate
  Workqueue: rxe_wq do_work
   refcount_warn_saturate (lib/refcount.c:25)
   smc_cdc_msg_recv (net/smc/smc_cdc.c:430)
   smc_cdc_rx_handler (net/smc/smc_cdc.c:502)
   smc_wr_rx_tasklet_fn (net/smc/smc_wr.c:445)
   tasklet_action_common (kernel/softirq.c:938)
   handle_softirqs (kernel/softirq.c:622)
  Kernel panic - not syncing: panic_on_warn set

Only SMC-R is affected. The SMC-D receive tasklet is stopped by
tasklet_kill(&conn->rx_tsklet) in smc_conn_free() before the connection
is unregistered, so it cannot run concurrently with the free.

Take the socket reference while still holding conns_lock, so the
registration reference can no longer be the last one, and drop it once
the handler is done.

Fixes: d7b0e37c1ac1 ("net/smc: restructure CDC message reception")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Xiang Mei <xmei5@asu.edu>
---
net/smc/smc_cdc.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index 619b3bab3824..b809139d7e87 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -483,21 +483,27 @@ static void smc_cdc_rx_handler(struct ib_wc *wc, void *buf)
 	lgr = smc_get_lgr(link);
 	read_lock_bh(&lgr->conns_lock);
 	conn = smc_lgr_find_conn(ntohl(cdc->token), lgr);
+	if (conn && !conn->out_of_sync)
+		sock_hold(&container_of(conn, struct smc_sock, conn)->sk);
+	else
+		conn = NULL;
 	read_unlock_bh(&lgr->conns_lock);
-	if (!conn || conn->out_of_sync)
+	if (!conn)
 		return;
 	smc = container_of(conn, struct smc_sock, conn);
 
 	if (cdc->prod_flags.failover_validation) {
 		smc_cdc_msg_validate(smc, cdc, link);
-		return;
+		goto out;
 	}
 	if (smc_cdc_before(ntohs(cdc->seqno),
 			   conn->local_rx_ctrl.seqno))
 		/* received seqno is old */
-		return;
+		goto out;
 
 	smc_cdc_msg_recv(smc, cdc);
+out:
+	sock_put(&smc->sk);
 }
 
 static struct smc_wr_rx_handler smc_cdc_rx_handlers[] = {
--
2.43.0
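
The lookup-and-pin pattern the commit message describes can be modelled
in userspace roughly as follows. This is a minimal sketch, not the kernel
code: struct model_sock, struct model_lgr and every model_* helper are
hypothetical stand-ins, a pthread mutex replaces the kernel's rwlock, and
a C11 atomic counter replaces the socket refcount. It only illustrates
why taking the reference before the lock is released means the
registration reference cannot be the last one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_MAX_CONNS 4

/* Hypothetical stand-in for struct smc_sock plus its embedded connection. */
struct model_sock {
	atomic_int refcnt;	/* models the socket reference count */
	unsigned int token;	/* models the connection token */
	int out_of_sync;
};

/* Hypothetical stand-in for the link group's connection table. */
struct model_lgr {
	pthread_mutex_t conns_lock;	/* the kernel uses an rwlock here */
	struct model_sock *conns[MODEL_MAX_CONNS];
};

static void model_hold(struct model_sock *sk)
{
	atomic_fetch_add(&sk->refcnt, 1);
}

/* Returns 1 if this call dropped the last reference. */
static int model_put(struct model_sock *sk)
{
	return atomic_fetch_sub(&sk->refcnt, 1) == 1;
}

/* Look up by token and pin the socket before the lock is released. */
static struct model_sock *model_find_and_pin(struct model_lgr *lgr,
					     unsigned int token)
{
	struct model_sock *found = NULL;
	size_t i;

	pthread_mutex_lock(&lgr->conns_lock);
	for (i = 0; i < MODEL_MAX_CONNS; i++) {
		struct model_sock *sk = lgr->conns[i];

		if (sk && sk->token == token && !sk->out_of_sync) {
			model_hold(sk);	/* taken while still discoverable */
			found = sk;
			break;
		}
	}
	pthread_mutex_unlock(&lgr->conns_lock);
	return found;
}

/* Remove the entry and drop the registration reference under the lock. */
static int model_unregister(struct model_lgr *lgr, struct model_sock *sk)
{
	int last = 0;
	size_t i;

	pthread_mutex_lock(&lgr->conns_lock);
	for (i = 0; i < MODEL_MAX_CONNS; i++) {
		if (lgr->conns[i] == sk) {
			lgr->conns[i] = NULL;
			last = model_put(sk);
			break;
		}
	}
	pthread_mutex_unlock(&lgr->conns_lock);
	return last;
}
```

In this model, an unregister that runs after a successful
model_find_and_pin() returns 0, so the object stays alive until the
caller's own model_put() drops the final reference.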
* Re: [PATCH net] net/smc: fix UAF in smc_cdc_rx_handler() by pinning the socket
From: sashiko-bot @ 2026-06-28 1:50 UTC
To: Xiang Mei; +Cc: linux-s390, Heiko Carstens, Alexander Gordeev, Vasily Gorbik
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
Pre-existing issues:
- [High] Asynchronous `tx_work` can be queued during socket release and execute after the socket is freed, leading to a Use-After-Free.
- [High] Concurrent execution of `smc_conn_free()` and `smc_cdc_rx_handler()` causes a Use-After-Free on connection buffers and the link group.
- [High] Concurrent modification of bitfields (`out_of_sync`, `freed`, `killed`) in `struct smc_connection` causes torn writes and lost state updates.
- [High] Cancelling `abort_work` in `smc_conn_free()` without checking its return value leaks a socket reference.
--
commit 2ba91a5beb3d341e982b17474cb075d5a25cb381
Author: Xiang Mei <xmei5@asu.edu>
net/smc: fix UAF in smc_cdc_rx_handler() by pinning the socket
smc_cdc_rx_handler() looks up the connection by token under the link
group's conns_lock, drops the lock, and then dereferences conn and the
smc_sock derived from it. This patch takes the socket reference while
still holding conns_lock to prevent the socket from being freed
concurrently, and drops the reference once the handler is done.
> diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
> index 619b3bab38248..b809139d7e870 100644
> --- a/net/smc/smc_cdc.c
> +++ b/net/smc/smc_cdc.c
> @@ -483,21 +483,27 @@ static void smc_cdc_rx_handler(struct ib_wc *wc, void *buf)
> lgr = smc_get_lgr(link);
> read_lock_bh(&lgr->conns_lock);
> conn = smc_lgr_find_conn(ntohl(cdc->token), lgr);
> + if (conn && !conn->out_of_sync)
> + sock_hold(&container_of(conn, struct smc_sock, conn)->sk);
> + else
> + conn = NULL;
> read_unlock_bh(&lgr->conns_lock);
[Severity: High]
This is a pre-existing issue, but does dropping lgr->conns_lock here leave the
connection buffers and link group vulnerable to concurrent freeing?
Looking at net/smc/smc_core.c:smc_conn_free(), the buffers can be freed
concurrently:
smc_buf_unuse(conn, lgr); /* allow buffer reuse */
Because smc_conn_free() doesn't synchronize with the RX handler for SMC-R,
could the subsequent call to smc_cdc_msg_recv_action() dereference freed memory
when it accesses conn->rmb_desc->len or conn->lgr?
> - if (!conn || conn->out_of_sync)
> + if (!conn)
> return;
> smc = container_of(conn, struct smc_sock, conn);
>
> if (cdc->prod_flags.failover_validation) {
> smc_cdc_msg_validate(smc, cdc, link);
[Severity: High]
This isn't a bug introduced by this patch, but can concurrent bitfield
modifications lead to lost state updates here?
In net/smc/smc_cdc.c:smc_cdc_msg_validate(), conn->out_of_sync is set without a
lock from the softirq context:
conn->out_of_sync = 1; /* prevent any further receives */
Concurrently, net/smc/smc_core.c:smc_conn_free() modifies another bit in the
same byte from process context:
conn->freed = 1;
Since these bitfields share the same byte in struct smc_connection, could the
read-modify-write on these bitfields cause the freed = 1 update to be
lost, resulting in a double-free if the connection is torn down again?
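
The lost-update scenario being asked about can be replayed
deterministically in a single thread. This is a sketch under stated
assumptions: struct model_conn_flags is a hypothetical layout, not the
real struct smc_connection, and the explicit copy/store steps stand in
for a compiler-generated, non-atomic read-modify-write of the shared
storage unit. Whether a given compiler and architecture actually emit
such a sequence for the kernel code is not something this model shows.

```c
#include <assert.h>

/* Hypothetical layout: three one-bit flags packed into one storage unit. */
struct model_conn_flags {
	unsigned char killed : 1;
	unsigned char freed : 1;
	unsigned char out_of_sync : 1;
};

/*
 * Replays one unlucky interleaving of two non-atomic read-modify-write
 * sequences on the shared unit and returns the final state.
 */
static struct model_conn_flags model_lost_update(void)
{
	struct model_conn_flags shared = { 0 };
	struct model_conn_flags softirq_copy, process_copy;

	softirq_copy = shared;		/* softirq context loads the unit */
	process_copy = shared;		/* process context loads the unit */

	process_copy.freed = 1;		/* models conn->freed = 1 */
	shared = process_copy;		/* process context stores first */

	softirq_copy.out_of_sync = 1;	/* models conn->out_of_sync = 1 */
	shared = softirq_copy;		/* stale store overwrites freed */

	return shared;
}
```

In the returned state out_of_sync is set but freed is clear, which is the
lost update the question describes.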
[Severity: High]
This is another pre-existing issue, but does cancelling the abort work without
checking its return value cause a socket reference leak?
In net/smc/smc_cdc.c:smc_cdc_msg_validate():

	sock_hold(&smc->sk);
	if (!queue_work(smc_close_wq, &conn->abort_work))
		sock_put(&smc->sk);

If net/smc/smc_core.c:smc_conn_free() successfully cancels this pending work:

	if (current_work() != &conn->abort_work)
		cancel_work_sync(&conn->abort_work);

The work function won't execute to drop the reference. Since smc_conn_free()
ignores the return value of cancel_work_sync(), does the socket reference
remain permanently leaked?
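
The reference accounting behind this question can be sketched in
userspace. Everything here is hypothetical: struct model_work,
struct model_ref and the model_* helpers only imitate the documented
return conventions of queue_work() (false if already pending) and
cancel_work_sync() (true if the work was pending). The sketch shows the
bookkeeping only; it does not establish whether the kernel path really
leaks or what the right fix would be.

```c
#include <assert.h>
#include <stdbool.h>

struct model_work { bool pending; };	/* stand-in for a work item */
struct model_ref { int count; };	/* stand-in for the socket refcount */

/* Like queue_work(): returns false if the item was already pending. */
static bool model_queue_work(struct model_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Like cancel_work_sync(): returns true if the item was pending. */
static bool model_cancel_work(struct model_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Queue side: a successfully queued work item owns one reference. */
static void model_schedule_abort(struct model_ref *ref, struct model_work *w)
{
	ref->count++;
	if (!model_queue_work(w))
		ref->count--;
}

/* Teardown that ignores the cancel result: the work's reference stays. */
static void model_teardown_ignoring(struct model_ref *ref,
				    struct model_work *w)
{
	(void)ref;
	(void)model_cancel_work(w);
}

/* Teardown that drops the work's reference when the cancel succeeded. */
static void model_teardown_checking(struct model_ref *ref,
				    struct model_work *w)
{
	if (model_cancel_work(w))
		ref->count--;
}
```

Starting from a count of 1, scheduling followed by
model_teardown_ignoring() leaves the count at 2, while
model_teardown_checking() returns it to 1.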
> - return;
> + goto out;
> }
> if (smc_cdc_before(ntohs(cdc->seqno),
> conn->local_rx_ctrl.seqno))
> /* received seqno is old */
> - return;
> + goto out;
>
> smc_cdc_msg_recv(smc, cdc);
[Severity: High]
This is a pre-existing issue, but can this lead to a use-after-free of the
smc socket during release?
In net/smc/smc_cdc.c:smc_cdc_msg_recv_action(), if the socket lock is held
by the release thread, the handler sets a flag instead of queuing work:
conn->tx_in_release_sock = true;
When net/smc/af_smc.c:smc_release() runs, it invokes release_sock(), causing
net/smc/af_smc.c:smc_release_cb() to queue the work asynchronously:
smc_tx_pending(&smc->conn);
Since smc_release() drops the final socket reference right after this:
sock_put(sk); /* final sock_put */
Could the asynchronous net/smc/smc_tx.c:smc_tx_work() execute after the socket
is freed, resulting in a use-after-free when it dereferences the socket
structure?
> +out:
> + sock_put(&smc->sk);
> }
[ ... ]
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260627014948.3049512-1-xmei5@asu.edu?part=1
* Re: [PATCH net] net/smc: fix UAF in smc_cdc_rx_handler() by pinning the socket
From: Sidraya Jayagond @ 2026-06-30 5:39 UTC
To: Xiang Mei, D . Wythe, Dust Li, Wenjia Zhang, Mahanta Jambigi,
Tony Lu, Wen Gu, netdev
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Hans Wippel, linux-rdma, linux-s390, Weiming Shi
On 27/06/26 7:19 am, Xiang Mei wrote:
> smc_cdc_rx_handler() looks up the connection by token under the link
> group's conns_lock, drops the lock, and then dereferences conn and the
> smc_sock derived from it, ending in sock_hold(&smc->sk) inside
> smc_cdc_msg_recv(). No reference is held across the lock release.
>
> The only reference pinning the socket while the connection is
> discoverable in the link group is taken in smc_lgr_register_conn()
> (sock_hold) and dropped in __smc_lgr_unregister_conn() (sock_put), both
> under conns_lock. Once the handler drops conns_lock, a concurrent
> close() -> smc_release() -> smc_conn_free() -> smc_lgr_unregister_conn()
> can drop that reference and free the smc_sock, so the handler's later
> sock_hold() runs on freed memory:
>
> WARNING: lib/refcount.c:25 at refcount_warn_saturate
> Workqueue: rxe_wq do_work
> refcount_warn_saturate (lib/refcount.c:25)
> smc_cdc_msg_recv (net/smc/smc_cdc.c:430)
> smc_cdc_rx_handler (net/smc/smc_cdc.c:502)
> smc_wr_rx_tasklet_fn (net/smc/smc_wr.c:445)
> tasklet_action_common (kernel/softirq.c:938)
> handle_softirqs (kernel/softirq.c:622)
> Kernel panic - not syncing: panic_on_warn set
>
> Only SMC-R is affected. The SMC-D receive tasklet is stopped by
> tasklet_kill(&conn->rx_tsklet) in smc_conn_free() before the connection
> is unregistered, so it cannot run concurrently with the free.
>
> Take the socket reference while still holding conns_lock, so the
> registration reference can no longer be the last one, and drop it once
> the handler is done.
>
> Fixes: d7b0e37c1ac1 ("net/smc: restructure CDC message reception")
> Reported-by: Weiming Shi <bestswngs@gmail.com>
> Assisted-by: Claude:claude-opus-4-8
> Signed-off-by: Xiang Mei <xmei5@asu.edu>
> ---
> net/smc/smc_cdc.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
> index 619b3bab3824..b809139d7e87 100644
> --- a/net/smc/smc_cdc.c
> +++ b/net/smc/smc_cdc.c
> @@ -483,21 +483,27 @@ static void smc_cdc_rx_handler(struct ib_wc *wc, void *buf)
> lgr = smc_get_lgr(link);
> read_lock_bh(&lgr->conns_lock);
> conn = smc_lgr_find_conn(ntohl(cdc->token), lgr);
> + if (conn && !conn->out_of_sync)
> + sock_hold(&container_of(conn, struct smc_sock, conn)->sk);
> + else
> + conn = NULL;
> read_unlock_bh(&lgr->conns_lock);
> - if (!conn || conn->out_of_sync)
> + if (!conn)
> return;
> smc = container_of(conn, struct smc_sock, conn);
>
Fix looks correct.
A few nits on the implementation:

container_of() is called twice for the same conn. The conn = NULL
sentinel and the second, post-unlock check can also be dropped: flip the
condition, return early inside the lock, and compute smc once:

	if (!conn || conn->out_of_sync) {
		read_unlock_bh(&lgr->conns_lock);
		return;
	}
	smc = container_of(conn, struct smc_sock, conn);
	sock_hold(&smc->sk);
	read_unlock_bh(&lgr->conns_lock);

Also, please initialize smc = NULL at declaration. It's not a bug now,
since the early return guards it; this just makes the code refactor-safe.
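
For illustration, the overall shape of the handler with these nits
applied might look like the userspace model below. It is a sketch, not
the v2 patch: struct model_sock, struct model_lgr and model_rx_handler()
are hypothetical, a pthread mutex stands in for conns_lock, the link
group holds a single connection, and the wraparound-aware sequence
comparison and the "delivered" counter are simplifying assumptions. The
point is only that every path after the hold reaches the single put at
the out label.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the socket and its connection state. */
struct model_sock {
	atomic_int refcnt;
	int out_of_sync;
	uint16_t last_seqno;	/* last accepted sequence number */
	int delivered;		/* messages passed on to the receive path */
};

/* Hypothetical link group holding at most one connection. */
struct model_lgr {
	pthread_mutex_t conns_lock;
	struct model_sock *conn;
};

static void model_rx_handler(struct model_lgr *lgr, int failover_validation,
			     uint16_t seqno)
{
	struct model_sock *smc = NULL;

	pthread_mutex_lock(&lgr->conns_lock);
	smc = lgr->conn;
	if (!smc || smc->out_of_sync) {
		/* early return inside the lock, nothing was pinned */
		pthread_mutex_unlock(&lgr->conns_lock);
		return;
	}
	atomic_fetch_add(&smc->refcnt, 1);	/* pin before unlocking */
	pthread_mutex_unlock(&lgr->conns_lock);

	if (failover_validation)
		goto out;
	if ((int16_t)(seqno - smc->last_seqno) < 0)
		goto out;	/* received seqno is old */

	smc->last_seqno = seqno;
	smc->delivered++;
out:
	atomic_fetch_sub(&smc->refcnt, 1);	/* single exit drops the pin */
}
```

Calling the handler with a fresh, an old, and a failover-validation
message leaves the reference count where it started in each case.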
> if (cdc->prod_flags.failover_validation) {
> smc_cdc_msg_validate(smc, cdc, link);
> - return;
> + goto out;
> }
> if (smc_cdc_before(ntohs(cdc->seqno),
> conn->local_rx_ctrl.seqno))
> /* received seqno is old */
> - return;
> + goto out;
>
> smc_cdc_msg_recv(smc, cdc);
> +out:
> + sock_put(&smc->sk);
> }
>
> static struct smc_wr_rx_handler smc_cdc_rx_handlers[] = {
* Re: [PATCH net] net/smc: fix UAF in smc_cdc_rx_handler() by pinning the socket
From: Xiang Mei @ 2026-06-30 18:18 UTC
To: Sidraya Jayagond
Cc: D . Wythe, Dust Li, Wenjia Zhang, Mahanta Jambigi, Tony Lu,
Wen Gu, netdev, David S . Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Hans Wippel, linux-rdma, linux-s390,
Weiming Shi
On Mon, Jun 29, 2026 at 10:40 PM Sidraya Jayagond <sidraya@linux.ibm.com> wrote:
>
>
>
> On 27/06/26 7:19 am, Xiang Mei wrote:
> > smc_cdc_rx_handler() looks up the connection by token under the link
> > group's conns_lock, drops the lock, and then dereferences conn and the
> > smc_sock derived from it, ending in sock_hold(&smc->sk) inside
> > smc_cdc_msg_recv(). No reference is held across the lock release.
> >
> > The only reference pinning the socket while the connection is
> > discoverable in the link group is taken in smc_lgr_register_conn()
> > (sock_hold) and dropped in __smc_lgr_unregister_conn() (sock_put), both
> > under conns_lock. Once the handler drops conns_lock, a concurrent
> > close() -> smc_release() -> smc_conn_free() -> smc_lgr_unregister_conn()
> > can drop that reference and free the smc_sock, so the handler's later
> > sock_hold() runs on freed memory:
> >
> > WARNING: lib/refcount.c:25 at refcount_warn_saturate
> > Workqueue: rxe_wq do_work
> > refcount_warn_saturate (lib/refcount.c:25)
> > smc_cdc_msg_recv (net/smc/smc_cdc.c:430)
> > smc_cdc_rx_handler (net/smc/smc_cdc.c:502)
> > smc_wr_rx_tasklet_fn (net/smc/smc_wr.c:445)
> > tasklet_action_common (kernel/softirq.c:938)
> > handle_softirqs (kernel/softirq.c:622)
> > Kernel panic - not syncing: panic_on_warn set
> >
> > Only SMC-R is affected. The SMC-D receive tasklet is stopped by
> > tasklet_kill(&conn->rx_tsklet) in smc_conn_free() before the connection
> > is unregistered, so it cannot run concurrently with the free.
> >
> > Take the socket reference while still holding conns_lock, so the
> > registration reference can no longer be the last one, and drop it once
> > the handler is done.
> >
> > Fixes: d7b0e37c1ac1 ("net/smc: restructure CDC message reception")
> > Reported-by: Weiming Shi <bestswngs@gmail.com>
> > Assisted-by: Claude:claude-opus-4-8
> > Signed-off-by: Xiang Mei <xmei5@asu.edu>
> > ---
> > net/smc/smc_cdc.c | 12 +++++++++---
> > 1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
> > index 619b3bab3824..b809139d7e87 100644
> > --- a/net/smc/smc_cdc.c
> > +++ b/net/smc/smc_cdc.c
> > @@ -483,21 +483,27 @@ static void smc_cdc_rx_handler(struct ib_wc *wc, void *buf)
> > lgr = smc_get_lgr(link);
> > read_lock_bh(&lgr->conns_lock);
> > conn = smc_lgr_find_conn(ntohl(cdc->token), lgr);
> > + if (conn && !conn->out_of_sync)
> > + sock_hold(&container_of(conn, struct smc_sock, conn)->sk);
> > + else
> > + conn = NULL;
> > read_unlock_bh(&lgr->conns_lock);
> > - if (!conn || conn->out_of_sync)
> > + if (!conn)
> > return;
> > smc = container_of(conn, struct smc_sock, conn);
> >
>
> Fix looks correct.
> A few nits on the implementation:
> container_of() is called twice for the same conn. The conn = NULL
> sentinel and the second post unlock check can also be dropped. Flip the
> condition, early return inside the lock, compute smc once:
>
> if (!conn || conn->out_of_sync) {
> read_unlock_bh(&lgr->conns_lock);
> return;
> }
> smc = container_of(conn, struct smc_sock, conn);
> sock_hold(&smc->sk);
> read_unlock_bh(&lgr->conns_lock);
>
> Also please initialize smc = NULL at declaration, it's not a bug now
> since the early return guards it, just to make it refactor safe.
>
Thanks so much for the review!
Both suggestions are good. v2 takes the reference under conns_lock,
returns early inside the lock, and initializes smc to NULL at
declaration.
Will send v2.
Xiang
> > if (cdc->prod_flags.failover_validation) {
> > smc_cdc_msg_validate(smc, cdc, link);
> > - return;
> > + goto out;
> > }
> > if (smc_cdc_before(ntohs(cdc->seqno),
> > conn->local_rx_ctrl.seqno))
> > /* received seqno is old */
> > - return;
> > + goto out;
> >
> > smc_cdc_msg_recv(smc, cdc);
> > +out:
> > + sock_put(&smc->sk);
> > }
> >
> > static struct smc_wr_rx_handler smc_cdc_rx_handlers[] = {
>