* [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
@ 2026-03-09 2:38 Jiayuan Chen
2026-03-09 6:06 ` D. Wythe
2026-03-09 7:26 ` Eric Dumazet
0 siblings, 2 replies; 6+ messages in thread
From: Jiayuan Chen @ 2026-03-09 2:38 UTC (permalink / raw)
To: netdev
Cc: Jiayuan Chen, syzbot+827ae2bfb3a3529333e9, Eric Dumazet, D. Wythe,
Dust Li, Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu,
Wen Gu, David S. Miller, Jakub Kicinski, Paolo Abeni,
Simon Horman, linux-rdma, linux-s390, linux-kernel
Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].
smc_tcp_syn_recv_sock() is called in the TCP receive path
(softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
listening socket). It reads sk_user_data to get the smc_sock
pointer. However, when the SMC listen socket is being closed
concurrently, smc_close_active() sets clcsock->sk_user_data
to NULL under sk_callback_lock, and then the smc_sock itself
can be freed via sock_put() in smc_release().
This leads to two issues:
1) NULL pointer dereference: sk_user_data is NULL when
accessed.
2) Use-after-free: sk_user_data is read as non-NULL, but the
smc_sock is freed before its fields (e.g., queued_smc_hs,
ori_af_ops) are accessed.
The race window looks like this:
CPU A (softirq)                          CPU B (process ctx)

tcp_v4_rcv()
  TCP_NEW_SYN_RECV:
    sk = req->rsk_listener
    sock_hold(sk)
    /* No lock on listener */
                                         smc_close_active():
                                           write_lock_bh(cb_lock)
                                           sk_user_data = NULL
                                           write_unlock_bh(cb_lock)
                                         ...
                                         smc_clcsock_release()
                                           sock_put(smc->sk) x2
                                             -> smc_sock freed!
  tcp_check_req()
    smc_tcp_syn_recv_sock():
      smc = user_data(sk)
        -> NULL or dangling
      smc->queued_smc_hs
        -> crash!
Note that the clcsock and smc_sock are two independent objects
with separate refcounts. The TCP stack holds a reference on the
clcsock, which keeps it alive, but this does NOT prevent the
smc_sock from being freed.
Fix this by using RCU and refcount_inc_not_zero() to safely
access smc_sock. Since smc_tcp_syn_recv_sock() is called in
the TCP three-way handshake path, taking read_lock_bh on
sk_callback_lock is too heavy and would not survive a SYN
flood attack. Using rcu_read_lock() is much more lightweight.
- Set SOCK_RCU_FREE on the SMC listen socket so that
smc_sock freeing is deferred until after the RCU grace
period. This guarantees the memory is still valid when
accessed inside rcu_read_lock().
- Use rcu_read_lock() to protect reading sk_user_data.
- Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
smc_sock. If the refcount has already reached zero (close
path completed), it returns false and we bail out safely.
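As a rough userspace model (plain C11 atomics; this is not the kernel's
refcount_t implementation, and the function name is made up for
illustration), the pin-or-bail semantics look like:

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace model of the kernel's refcount_inc_not_zero():
 * take a reference only if the count has not already hit zero.
 * Returning false means the close path has completed and the
 * object must not be touched again.
 */
static bool refcount_inc_not_zero_model(atomic_int *ref)
{
	int old = atomic_load_explicit(ref, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* object already dying: bail out */
	} while (!atomic_compare_exchange_weak(ref, &old, old + 1));

	return true;			/* reference pinned; drop it later */
}
```

If the close path has already dropped the last reference, the CAS loop is
never entered and the caller bails out, matching the early return in the
patch below; on success the caller owns a reference it must later release.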
Note: smc_hs_congested() has a similar lockless read of
sk_user_data without rcu_read_lock(), but it only checks for
NULL and accesses the global smc_hs_wq, never dereferencing
any smc_sock field, so it is not affected.
The reproducer was verified with mdelay injection and smc_run;
the issue no longer occurs with this patch applied.
[1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
v2:
- Use rcu_read_lock() + refcount_inc_not_zero() instead of
read_lock_bh(sk_callback_lock) + sock_hold(), since this
is the TCP handshake hot path and read_lock_bh is too
expensive under SYN flood.
- Set SOCK_RCU_FREE on SMC listen socket to ensure
RCU-deferred freeing.
v1: https://lore.kernel.org/netdev/20260307032158.372165-1-jiayuan.chen@linux.dev/
---
net/smc/af_smc.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index d0119afcc6a1..72ac1d8c62d4 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -131,7 +131,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
 	struct smc_sock *smc;
 	struct sock *child;
 
+	rcu_read_lock();
 	smc = smc_clcsock_user_data(sk);
+	if (!smc || !refcount_inc_not_zero(&smc->sk.sk_refcnt)) {
+		rcu_read_unlock();
+		return NULL;
+	}
+	rcu_read_unlock();
 
 	if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
 	    sk->sk_max_ack_backlog)
@@ -153,11 +159,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
 		if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
 			inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
 	}
+	sock_put(&smc->sk);
 	return child;
 
 drop:
 	dst_release(dst);
 	tcp_listendrop(sk);
+	sock_put(&smc->sk);
 	return NULL;
 }
 
@@ -2691,6 +2699,7 @@ int smc_listen(struct socket *sock, int backlog)
 		write_unlock_bh(&smc->clcsock->sk->sk_callback_lock);
 		goto out;
 	}
+	sock_set_flag(sk, SOCK_RCU_FREE);
 	sk->sk_max_ack_backlog = backlog;
 	sk->sk_ack_backlog = 0;
 	sk->sk_state = SMC_LISTEN;
--
2.43.0
* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
2026-03-09 2:38 [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock() Jiayuan Chen
@ 2026-03-09 6:06 ` D. Wythe
2026-03-09 7:48 ` Jiayuan Chen
2026-03-09 7:26 ` Eric Dumazet
1 sibling, 1 reply; 6+ messages in thread
From: D. Wythe @ 2026-03-09 6:06 UTC (permalink / raw)
To: Jiayuan Chen
Cc: netdev, syzbot+827ae2bfb3a3529333e9, Eric Dumazet, D. Wythe,
Dust Li, Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu,
Wen Gu, David S. Miller, Jakub Kicinski, Paolo Abeni,
Simon Horman, linux-rdma, linux-s390, linux-kernel
On Mon, Mar 09, 2026 at 10:38:45AM +0800, Jiayuan Chen wrote:
> Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].
>
> smc_tcp_syn_recv_sock() is called in the TCP receive path
> (softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
> listening socket). It reads sk_user_data to get the smc_sock
> pointer. However, when the SMC listen socket is being closed
> concurrently, smc_close_active() sets clcsock->sk_user_data
> to NULL under sk_callback_lock, and then the smc_sock itself
> can be freed via sock_put() in smc_release().
>
> This leads to two issues:
>
> 1) NULL pointer dereference: sk_user_data is NULL when
> accessed.
> 2) Use-after-free: sk_user_data is read as non-NULL, but the
> smc_sock is freed before its fields (e.g., queued_smc_hs,
> ori_af_ops) are accessed.
>
> The race window looks like this:
>
> CPU A (softirq)                          CPU B (process ctx)
>
> tcp_v4_rcv()
>   TCP_NEW_SYN_RECV:
>     sk = req->rsk_listener
>     sock_hold(sk)
>     /* No lock on listener */
>                                          smc_close_active():
>                                            write_lock_bh(cb_lock)
>                                            sk_user_data = NULL
>                                            write_unlock_bh(cb_lock)
>                                          ...
>                                          smc_clcsock_release()
>                                            sock_put(smc->sk) x2
>                                              -> smc_sock freed!
>   tcp_check_req()
>     smc_tcp_syn_recv_sock():
>       smc = user_data(sk)
>         -> NULL or dangling
>       smc->queued_smc_hs
>         -> crash!
>
> Note that the clcsock and smc_sock are two independent objects
> with separate refcounts. TCP stack holds a reference on the
> clcsock, which keeps it alive, but this does NOT prevent the
> smc_sock from being freed.
>
> Fix this by using RCU and refcount_inc_not_zero() to safely
> access smc_sock. Since smc_tcp_syn_recv_sock() is called in
> the TCP three-way handshake path, taking read_lock_bh on
> sk_callback_lock is too heavy and would not survive a SYN
> flood attack. Using rcu_read_lock() is much more lightweight.
>
> - Set SOCK_RCU_FREE on the SMC listen socket so that
> smc_sock freeing is deferred until after the RCU grace
> period. This guarantees the memory is still valid when
> accessed inside rcu_read_lock().
> - Use rcu_read_lock() to protect reading sk_user_data.
> - Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
> smc_sock. If the refcount has already reached zero (close
> path completed), it returns false and we bail out safely.
>
> Note: smc_hs_congested() has a similar lockless read of
> sk_user_data without rcu_read_lock(), but it only checks for
> NULL and accesses the global smc_hs_wq, never dereferencing
> any smc_sock field, so it is not affected.
>
> Reproducer was verified with mdelay injection and smc_run,
> the issue no longer occurs with this patch applied.
>
> [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
>
> Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
> v2:
> - Use rcu_read_lock() + refcount_inc_not_zero() instead of
> read_lock_bh(sk_callback_lock) + sock_hold(), since this
> is the TCP handshake hot path and read_lock_bh is too
> expensive under SYN flood.
> - Set SOCK_RCU_FREE on SMC listen socket to ensure
> RCU-deferred freeing.
>
> v1: https://lore.kernel.org/netdev/20260307032158.372165-1-jiayuan.chen@linux.dev/
> ---
> net/smc/af_smc.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index d0119afcc6a1..72ac1d8c62d4 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -131,7 +131,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> struct smc_sock *smc;
> struct sock *child;
>
> + rcu_read_lock();
> smc = smc_clcsock_user_data(sk);
> + if (!smc || !refcount_inc_not_zero(&smc->sk.sk_refcnt)) {
> + rcu_read_unlock();
> + return NULL;
> + }
> + rcu_read_unlock();
>
> if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
> sk->sk_max_ack_backlog)
> @@ -153,11 +159,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
> inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
> }
> + sock_put(&smc->sk);
> return child;
>
> drop:
> dst_release(dst);
> tcp_listendrop(sk);
> + sock_put(&smc->sk);
> return NULL;
> }
>
> @@ -2691,6 +2699,7 @@ int smc_listen(struct socket *sock, int backlog)
> write_unlock_bh(&smc->clcsock->sk->sk_callback_lock);
> goto out;
> }
> + sock_set_flag(sk, SOCK_RCU_FREE);
This RCU approach looks good to me. Since SOCK_RCU_FREE is now enabled,
other callers of smc_clcsock_user_data() should also follow this
RCU-based pattern. It will eventually allow us to completely remove the
annoying sk_callback_lock.
D. Wythe
> sk->sk_max_ack_backlog = backlog;
> sk->sk_ack_backlog = 0;
> sk->sk_state = SMC_LISTEN;
> --
> 2.43.0
* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
2026-03-09 2:38 [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock() Jiayuan Chen
2026-03-09 6:06 ` D. Wythe
@ 2026-03-09 7:26 ` Eric Dumazet
2026-03-09 7:36 ` Eric Dumazet
1 sibling, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2026-03-09 7:26 UTC (permalink / raw)
To: Jiayuan Chen
Cc: netdev, syzbot+827ae2bfb3a3529333e9, D. Wythe, Dust Li,
Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu, Wen Gu,
David S. Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
linux-rdma, linux-s390, linux-kernel
On Mon, Mar 9, 2026 at 3:38 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].
>
> smc_tcp_syn_recv_sock() is called in the TCP receive path
> (softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
> listening socket). It reads sk_user_data to get the smc_sock
> pointer. However, when the SMC listen socket is being closed
> concurrently, smc_close_active() sets clcsock->sk_user_data
> to NULL under sk_callback_lock, and then the smc_sock itself
> can be freed via sock_put() in smc_release().
>
> This leads to two issues:
>
> 1) NULL pointer dereference: sk_user_data is NULL when
> accessed.
> 2) Use-after-free: sk_user_data is read as non-NULL, but the
> smc_sock is freed before its fields (e.g., queued_smc_hs,
> ori_af_ops) are accessed.
>
> The race window looks like this:
>
> CPU A (softirq)                          CPU B (process ctx)
>
> tcp_v4_rcv()
>   TCP_NEW_SYN_RECV:
>     sk = req->rsk_listener
>     sock_hold(sk)
>     /* No lock on listener */
>                                          smc_close_active():
>                                            write_lock_bh(cb_lock)
>                                            sk_user_data = NULL
>                                            write_unlock_bh(cb_lock)
>                                          ...
>                                          smc_clcsock_release()
>                                            sock_put(smc->sk) x2
>                                              -> smc_sock freed!
>   tcp_check_req()
>     smc_tcp_syn_recv_sock():
>       smc = user_data(sk)
>         -> NULL or dangling
>       smc->queued_smc_hs
>         -> crash!
>
> Note that the clcsock and smc_sock are two independent objects
> with separate refcounts. TCP stack holds a reference on the
> clcsock, which keeps it alive, but this does NOT prevent the
> smc_sock from being freed.
>
> Fix this by using RCU and refcount_inc_not_zero() to safely
> access smc_sock. Since smc_tcp_syn_recv_sock() is called in
> the TCP three-way handshake path, taking read_lock_bh on
> sk_callback_lock is too heavy and would not survive a SYN
> flood attack. Using rcu_read_lock() is much more lightweight.
>
> - Set SOCK_RCU_FREE on the SMC listen socket so that
> smc_sock freeing is deferred until after the RCU grace
> period. This guarantees the memory is still valid when
> accessed inside rcu_read_lock().
> - Use rcu_read_lock() to protect reading sk_user_data.
> - Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
> smc_sock. If the refcount has already reached zero (close
> path completed), it returns false and we bail out safely.
>
> Note: smc_hs_congested() has a similar lockless read of
> sk_user_data without rcu_read_lock(), but it only checks for
> NULL and accesses the global smc_hs_wq, never dereferencing
> any smc_sock field, so it is not affected.
>
> Reproducer was verified with mdelay injection and smc_run,
> the issue no longer occurs with this patch applied.
>
> [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
>
> Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
> v2:
> - Use rcu_read_lock() + refcount_inc_not_zero() instead of
> read_lock_bh(sk_callback_lock) + sock_hold(), since this
> is the TCP handshake hot path and read_lock_bh is too
> expensive under SYN flood.
> - Set SOCK_RCU_FREE on SMC listen socket to ensure
> RCU-deferred freeing.
This looks better, but please note there is something missing in your
RCU implementation.

You still need to ensure sk->sk_user_data is read / written with
load/store tearing prevention. The standard rcu_dereference() /
rcu_assign_pointer() helpers deal with this aspect.
For instance, the following helper assumes a lock is held:

static inline struct smc_sock *smc_clcsock_user_data(const struct sock *clcsk)
{
	return (struct smc_sock *)
	       ((uintptr_t)clcsk->sk_user_data & ~SK_USER_DATA_NOCOPY);
}
One possible way is to use
rcu_dereference_sk_user_data()/rcu_assign_sk_user_data() or similar
helpers.
Thanks.
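The tearing hazard described above can be demonstrated in a standalone
userspace model: the tagged pointer is loaded exactly once into a local
(the analogue of READ_ONCE()/rcu_dereference_sk_user_data()) and only then
are the flag bits masked off. All names below (USER_DATA_NOCOPY,
user_data_read(), struct obj) are illustrative, not the kernel API:

```c
#include <stdatomic.h>
#include <stdint.h>

#define USER_DATA_NOCOPY 0x1UL	/* illustrative flag bit, in the spirit of
				 * SK_USER_DATA_NOCOPY */

struct obj { int id; };

/* Model of sk_user_data: a pointer whose low bit carries a flag. */
static _Atomic uintptr_t user_data;

static void user_data_write(struct obj *p, uintptr_t flags)
{
	/* analogue of rcu_assign_sk_user_data(): a single atomic store,
	 * so readers never observe a half-written pointer */
	atomic_store_explicit(&user_data, (uintptr_t)p | flags,
			      memory_order_release);
}

static struct obj *user_data_read(void)
{
	/* Load once into a local, then mask. Re-reading the shared
	 * location after a NULL check could observe a concurrent
	 * update and hand back a different pointer. */
	uintptr_t v = atomic_load_explicit(&user_data,
					   memory_order_acquire);

	return (struct obj *)(v & ~USER_DATA_NOCOPY);
}
```

Reading `user_data` twice (once for the NULL check, once for the mask) is
exactly what the compiler is free to do with a plain, non-annotated access;
the single annotated load is what the rcu_* helpers provide in the kernel.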
* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
2026-03-09 7:26 ` Eric Dumazet
@ 2026-03-09 7:36 ` Eric Dumazet
0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2026-03-09 7:36 UTC (permalink / raw)
To: Jiayuan Chen
Cc: netdev, syzbot+827ae2bfb3a3529333e9, D. Wythe, Dust Li,
Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu, Wen Gu,
David S. Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
linux-rdma, linux-s390, linux-kernel
On Mon, Mar 9, 2026 at 8:26 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Mar 9, 2026 at 3:38 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> >
> > Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].
> >
> > smc_tcp_syn_recv_sock() is called in the TCP receive path
> > (softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
> > listening socket). It reads sk_user_data to get the smc_sock
> > pointer. However, when the SMC listen socket is being closed
> > concurrently, smc_close_active() sets clcsock->sk_user_data
> > to NULL under sk_callback_lock, and then the smc_sock itself
> > can be freed via sock_put() in smc_release().
> >
> > This leads to two issues:
> >
> > 1) NULL pointer dereference: sk_user_data is NULL when
> > accessed.
> > 2) Use-after-free: sk_user_data is read as non-NULL, but the
> > smc_sock is freed before its fields (e.g., queued_smc_hs,
> > ori_af_ops) are accessed.
> >
> > The race window looks like this:
> >
> > CPU A (softirq)                          CPU B (process ctx)
> >
> > tcp_v4_rcv()
> >   TCP_NEW_SYN_RECV:
> >     sk = req->rsk_listener
> >     sock_hold(sk)
> >     /* No lock on listener */
> >                                          smc_close_active():
> >                                            write_lock_bh(cb_lock)
> >                                            sk_user_data = NULL
> >                                            write_unlock_bh(cb_lock)
> >                                          ...
> >                                          smc_clcsock_release()
> >                                            sock_put(smc->sk) x2
> >                                              -> smc_sock freed!
> >   tcp_check_req()
> >     smc_tcp_syn_recv_sock():
> >       smc = user_data(sk)
> >         -> NULL or dangling
> >       smc->queued_smc_hs
> >         -> crash!
> >
> > Note that the clcsock and smc_sock are two independent objects
> > with separate refcounts. TCP stack holds a reference on the
> > clcsock, which keeps it alive, but this does NOT prevent the
> > smc_sock from being freed.
> >
> > Fix this by using RCU and refcount_inc_not_zero() to safely
> > access smc_sock. Since smc_tcp_syn_recv_sock() is called in
> > the TCP three-way handshake path, taking read_lock_bh on
> > sk_callback_lock is too heavy and would not survive a SYN
> > flood attack. Using rcu_read_lock() is much more lightweight.
> >
> > - Set SOCK_RCU_FREE on the SMC listen socket so that
> > smc_sock freeing is deferred until after the RCU grace
> > period. This guarantees the memory is still valid when
> > accessed inside rcu_read_lock().
> > - Use rcu_read_lock() to protect reading sk_user_data.
> > - Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
> > smc_sock. If the refcount has already reached zero (close
> > path completed), it returns false and we bail out safely.
> >
> > Note: smc_hs_congested() has a similar lockless read of
> > sk_user_data without rcu_read_lock(), but it only checks for
> > NULL and accesses the global smc_hs_wq, never dereferencing
> > any smc_sock field, so it is not affected.
> >
> > Reproducer was verified with mdelay injection and smc_run,
> > the issue no longer occurs with this patch applied.
> >
> > [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
> >
> > Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> > Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> > Suggested-by: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > ---
> > v2:
> > - Use rcu_read_lock() + refcount_inc_not_zero() instead of
> > read_lock_bh(sk_callback_lock) + sock_hold(), since this
> > is the TCP handshake hot path and read_lock_bh is too
> > expensive under SYN flood.
> > - Set SOCK_RCU_FREE on SMC listen socket to ensure
> > RCU-deferred freeing.
>
> This looks better, but please note there is something missing in your
> RCU implementation.
>
> You still need to ensure s->sk_user_data is read / written with
> load/store tearing prevention.
> Standard rcu_dereference()/rcu_assign() are dealing with this aspect.
>
> For instance, the following helper is assuming a lock was held :
>
> static inline struct smc_sock *smc_clcsock_user_data(const struct sock *clcsk)
> {
> return (struct smc_sock *)
> ((uintptr_t)clcsk->sk_user_data & ~SK_USER_DATA_NOCOPY);
> }
>
>
> One possible way is to use
> rcu_dereference_sk_user_data()/rcu_assign_sk_user_data() or similar
> helpers.
>
More context in this old commit:

commit 559835ea7292e2f09304d81eda16f4209433245e
Author: Pravin B Shelar <pshelar@nicira.com>
Date:   Tue Sep 24 10:25:40 2013 -0700

    vxlan: Use RCU apis to access sk_user_data.

    Use of RCU api makes vxlan code easier to understand. It also
    fixes bug due to missing ACCESS_ONCE() on sk_user_data dereference.
    In rare case without ACCESS_ONCE() compiler might omit vs on
    sk_user_data dereference.

    Compiler can use vs as alias for sk->sk_user_data, resulting in
    multiple sk_user_data dereference in rcu read context which
    could change.

    CC: Jesse Gross <jesse@nicira.com>
    Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
2026-03-09 6:06 ` D. Wythe
@ 2026-03-09 7:48 ` Jiayuan Chen
2026-03-10 5:51 ` D. Wythe
0 siblings, 1 reply; 6+ messages in thread
From: Jiayuan Chen @ 2026-03-09 7:48 UTC (permalink / raw)
To: D. Wythe
Cc: netdev, syzbot+827ae2bfb3a3529333e9, Eric Dumazet, D. Wythe,
Dust Li, Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu,
Wen Gu, David S. Miller, Jakub Kicinski, Paolo Abeni,
Simon Horman, linux-rdma, linux-s390, linux-kernel
On March 9, 2026 at 14:06, "D. Wythe" <alibuda@linux.alibaba.com> wrote:
[...]
> > Reproducer was verified with mdelay injection and smc_run,
> > the issue no longer occurs with this patch applied.
> >
> > [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
> >
> > Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> > Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> > Suggested-by: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > ---
> > v2:
> > - Use rcu_read_lock() + refcount_inc_not_zero() instead of
> > read_lock_bh(sk_callback_lock) + sock_hold(), since this
> > is the TCP handshake hot path and read_lock_bh is too
> > expensive under SYN flood.
> > - Set SOCK_RCU_FREE on SMC listen socket to ensure
> > RCU-deferred freeing.
> >
> > v1: https://lore.kernel.org/netdev/20260307032158.372165-1-jiayuan.chen@linux.dev/
> > ---
> > net/smc/af_smc.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> > index d0119afcc6a1..72ac1d8c62d4 100644
> > --- a/net/smc/af_smc.c
> > +++ b/net/smc/af_smc.c
> > @@ -131,7 +131,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> > struct smc_sock *smc;
> > struct sock *child;
> >
> > + rcu_read_lock();
> > smc = smc_clcsock_user_data(sk);
> > + if (!smc || !refcount_inc_not_zero(&smc->sk.sk_refcnt)) {
> > + rcu_read_unlock();
> > + return NULL;
> > + }
> > + rcu_read_unlock();
> >
> > if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
> > sk->sk_max_ack_backlog)
> > @@ -153,11 +159,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> > if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
> > inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
> > }
> > + sock_put(&smc->sk);
> > return child;
> >
> > drop:
> > dst_release(dst);
> > tcp_listendrop(sk);
> > + sock_put(&smc->sk);
> > return NULL;
> > }
> >
> > @@ -2691,6 +2699,7 @@ int smc_listen(struct socket *sock, int backlog)
> > write_unlock_bh(&smc->clcsock->sk->sk_callback_lock);
> > goto out;
> > }
> > + sock_set_flag(sk, SOCK_RCU_FREE);
> >
> This RCU approach looks good to me. Since SOCK_RCU_FREE is now enabled,
> other callers of smc_clcsock_user_data() should also follow this
> RCU-based pattern. It will eventually allow us to completely remove the
> annoying sk_callback_lock.
>
> D. Wythe
>
Hi D. Wythe,
Thanks for the suggestion. I agree that converting all smc_clcsock_user_data()
callers to RCU is a reasonable direction, and it would allow us to eventually
remove the sk_callback_lock dependency for sk_user_data access.
However, I'd prefer to keep this patch focused on fixing the specific bug in
smc_tcp_syn_recv_sock(), since it needs to be backported to stable trees.
Mixing a bug fix with broader refactoring makes backporting harder and increases
the risk of regressions.
Also, converting the other callers is not entirely trivial. For example:
- smc_fback_state_change/data_ready/write_space/error_report():
the sk_callback_lock there protects not only sk_user_data but also the
consistency of the saved callback pointers (e.g., smc->clcsk_state_change).
Switching to RCU requires careful ordering analysis against the write side
in smc_fback_restore_callbacks(). Additionally, fallback sockets would also
need SOCK_RCU_FREE.
- smc_clcsock_data_ready():
the sock_hold() would need to become refcount_inc_not_zero() to handle the
case where the refcount has already reached zero.
I'd like to address these conversions in a follow-up patch series.
What do you think?
> >
> > sk->sk_max_ack_backlog = backlog;
> > sk->sk_ack_backlog = 0;
> > sk->sk_state = SMC_LISTEN;
> > --
> > 2.43.0
> >
>
* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
2026-03-09 7:48 ` Jiayuan Chen
@ 2026-03-10 5:51 ` D. Wythe
0 siblings, 0 replies; 6+ messages in thread
From: D. Wythe @ 2026-03-10 5:51 UTC (permalink / raw)
To: Jiayuan Chen
Cc: D. Wythe, netdev, syzbot+827ae2bfb3a3529333e9, Eric Dumazet,
Dust Li, Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu,
Wen Gu, David S. Miller, Jakub Kicinski, Paolo Abeni,
Simon Horman, linux-rdma, linux-s390, linux-kernel
On Mon, Mar 09, 2026 at 07:48:54AM +0000, Jiayuan Chen wrote:
> On March 9, 2026 at 14:06, "D. Wythe" <alibuda@linux.alibaba.com> wrote:
>
>
> [...]
> > > Reproducer was verified with mdelay injection and smc_run,
> > > the issue no longer occurs with this patch applied.
> > >
> > > [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
> > >
> > > Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> > > Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> > > Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> > > Suggested-by: Eric Dumazet <edumazet@google.com>
> > > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > > ---
> > > v2:
> > > - Use rcu_read_lock() + refcount_inc_not_zero() instead of
> > > read_lock_bh(sk_callback_lock) + sock_hold(), since this
> > > is the TCP handshake hot path and read_lock_bh is too
> > > expensive under SYN flood.
> > > - Set SOCK_RCU_FREE on SMC listen socket to ensure
> > > RCU-deferred freeing.
> > >
> > > v1: https://lore.kernel.org/netdev/20260307032158.372165-1-jiayuan.chen@linux.dev/
> > > ---
> > > net/smc/af_smc.c | 9 +++++++++
> > > 1 file changed, 9 insertions(+)
> > >
> > > diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> > > index d0119afcc6a1..72ac1d8c62d4 100644
> > > --- a/net/smc/af_smc.c
> > > +++ b/net/smc/af_smc.c
> > > @@ -131,7 +131,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> > > struct smc_sock *smc;
> > > struct sock *child;
> > >
> > > + rcu_read_lock();
> > > smc = smc_clcsock_user_data(sk);
> > > + if (!smc || !refcount_inc_not_zero(&smc->sk.sk_refcnt)) {
> > > + rcu_read_unlock();
> > > + return NULL;
> > > + }
> > > + rcu_read_unlock();
> > >
> > > if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
> > > sk->sk_max_ack_backlog)
> > > @@ -153,11 +159,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> > > if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
> > > inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
> > > }
> > > + sock_put(&smc->sk);
> > > return child;
> > >
> > > drop:
> > > dst_release(dst);
> > > tcp_listendrop(sk);
> > > + sock_put(&smc->sk);
> > > return NULL;
> > > }
> > >
> > > @@ -2691,6 +2699,7 @@ int smc_listen(struct socket *sock, int backlog)
> > > write_unlock_bh(&smc->clcsock->sk->sk_callback_lock);
> > > goto out;
> > > }
> > > + sock_set_flag(sk, SOCK_RCU_FREE);
> > >
> > This RCU approach looks good to me. Since SOCK_RCU_FREE is now enabled,
> > other callers of smc_clcsock_user_data() should also follow this
> > RCU-based pattern. It will eventually allow us to completely remove the
> > annoying sk_callback_lock.
> >
> > D. Wythe
> >
>
>
> Hi D. Wythe,
>
> Thanks for the suggestion. I agree that converting all smc_clcsock_user_data()
> callers to RCU is a reasonable direction, and it would allow us to eventually
> remove the sk_callback_lock dependency for sk_user_data access.
>
> However, I'd prefer to keep this patch focused on fixing the specific bug in
> smc_tcp_syn_recv_sock(), since it needs to be backported to stable trees.
> Mixing a bug fix with broader refactoring makes backporting harder and increases
> the risk of regressions.
>
> Also, converting the other callers is not entirely trivial. For example:
>
> - smc_fback_state_change/data_ready/write_space/error_report():
> the sk_callback_lock there protects not only sk_user_data but also the
> consistency of the saved callback pointers (e.g.,smc->clcsk_state_change).
> Switching to RCU requires careful ordering analysis against the write side
> in smc_fback_restore_callbacks(). Additionally, fallback sockets would also
> need SOCK_RCU_FREE.
>
> - smc_clcsock_data_ready():
> the sock_hold() would need to become refcount_inc_not_zero() to handle the
> case where the refcount has already reached zero.
>
> I'd like to address these conversions in a follow-up patch series.
>
> What do you think?
Thanks for the detailed explanation. I agree with your plan; it's
better to keep the bug fix and the optimization separate.
>
> > >
> > > sk->sk_max_ack_backlog = backlog;
> > > sk->sk_ack_backlog = 0;
> > > sk->sk_state = SMC_LISTEN;
> > > --
> > > 2.43.0
> > >
> >