public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
@ 2026-03-09  2:38 Jiayuan Chen
  2026-03-09  6:06 ` D. Wythe
  2026-03-09  7:26 ` Eric Dumazet
  0 siblings, 2 replies; 6+ messages in thread
From: Jiayuan Chen @ 2026-03-09  2:38 UTC (permalink / raw)
  To: netdev
  Cc: Jiayuan Chen, syzbot+827ae2bfb3a3529333e9, Eric Dumazet, D. Wythe,
	Dust Li, Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu,
	Wen Gu, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-rdma, linux-s390, linux-kernel

Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].

smc_tcp_syn_recv_sock() is called in the TCP receive path
(softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
listening socket). It reads sk_user_data to get the smc_sock
pointer. However, when the SMC listen socket is being closed
concurrently, smc_close_active() sets clcsock->sk_user_data
to NULL under sk_callback_lock, and then the smc_sock itself
can be freed via sock_put() in smc_release().

This leads to two issues:

1) NULL pointer dereference: sk_user_data is NULL when
   accessed.
2) Use-after-free: sk_user_data is read as non-NULL, but the
   smc_sock is freed before its fields (e.g., queued_smc_hs,
   ori_af_ops) are accessed.

The race window looks like this:

  CPU A (softirq)              CPU B (process ctx)

  tcp_v4_rcv()
    TCP_NEW_SYN_RECV:
    sk = req->rsk_listener
    sock_hold(sk)
    /* No lock on listener */
                               smc_close_active():
                                 write_lock_bh(cb_lock)
                                 sk_user_data = NULL
                                 write_unlock_bh(cb_lock)
                                 ...
                                 smc_clcsock_release()
                                 sock_put(smc->sk) x2
                                   -> smc_sock freed!
    tcp_check_req()
      smc_tcp_syn_recv_sock():
        smc = user_data(sk)
          -> NULL or dangling
        smc->queued_smc_hs
          -> crash!

Note that the clcsock and smc_sock are two independent objects
with separate refcounts. TCP stack holds a reference on the
clcsock, which keeps it alive, but this does NOT prevent the
smc_sock from being freed.

Fix this by using RCU and refcount_inc_not_zero() to safely
access smc_sock. Since smc_tcp_syn_recv_sock() is called in
the TCP three-way handshake path, taking read_lock_bh on
sk_callback_lock is too heavy and would not survive a SYN
flood attack. Using rcu_read_lock() is much more lightweight.

- Set SOCK_RCU_FREE on the SMC listen socket so that
  smc_sock freeing is deferred until after the RCU grace
  period. This guarantees the memory is still valid when
  accessed inside rcu_read_lock().
- Use rcu_read_lock() to protect reading sk_user_data.
- Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
  smc_sock. If the refcount has already reached zero (close
  path completed), it returns false and we bail out safely.

Note: smc_hs_congested() has a similar lockless read of
sk_user_data without rcu_read_lock(), but it only checks for
NULL and accesses the global smc_hs_wq, never dereferencing
any smc_sock field, so it is not affected.

Reproducer was verified with mdelay injection and smc_run,
the issue no longer occurs with this patch applied.

[1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9

Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
v2:
- Use rcu_read_lock() + refcount_inc_not_zero() instead of
  read_lock_bh(sk_callback_lock) + sock_hold(), since this
  is the TCP handshake hot path and read_lock_bh is too
  expensive under SYN flood.
- Set SOCK_RCU_FREE on SMC listen socket to ensure
  RCU-deferred freeing.

v1: https://lore.kernel.org/netdev/20260307032158.372165-1-jiayuan.chen@linux.dev/
---
 net/smc/af_smc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index d0119afcc6a1..72ac1d8c62d4 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -131,7 +131,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
 	struct smc_sock *smc;
 	struct sock *child;
 
+	rcu_read_lock();
 	smc = smc_clcsock_user_data(sk);
+	if (!smc || !refcount_inc_not_zero(&smc->sk.sk_refcnt)) {
+		rcu_read_unlock();
+		return NULL;
+	}
+	rcu_read_unlock();
 
 	if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
 				sk->sk_max_ack_backlog)
@@ -153,11 +159,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
 		if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
 			inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
 	}
+	sock_put(&smc->sk);
 	return child;
 
 drop:
 	dst_release(dst);
 	tcp_listendrop(sk);
+	sock_put(&smc->sk);
 	return NULL;
 }
 
@@ -2691,6 +2699,7 @@ int smc_listen(struct socket *sock, int backlog)
 		write_unlock_bh(&smc->clcsock->sk->sk_callback_lock);
 		goto out;
 	}
+	sock_set_flag(sk, SOCK_RCU_FREE);
 	sk->sk_max_ack_backlog = backlog;
 	sk->sk_ack_backlog = 0;
 	sk->sk_state = SMC_LISTEN;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
  2026-03-09  2:38 [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock() Jiayuan Chen
@ 2026-03-09  6:06 ` D. Wythe
  2026-03-09  7:48   ` Jiayuan Chen
  2026-03-09  7:26 ` Eric Dumazet
  1 sibling, 1 reply; 6+ messages in thread
From: D. Wythe @ 2026-03-09  6:06 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: netdev, syzbot+827ae2bfb3a3529333e9, Eric Dumazet, D. Wythe,
	Dust Li, Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu,
	Wen Gu, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-rdma, linux-s390, linux-kernel

On Mon, Mar 09, 2026 at 10:38:45AM +0800, Jiayuan Chen wrote:
> Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].
> 
> smc_tcp_syn_recv_sock() is called in the TCP receive path
> (softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
> listening socket). It reads sk_user_data to get the smc_sock
> pointer. However, when the SMC listen socket is being closed
> concurrently, smc_close_active() sets clcsock->sk_user_data
> to NULL under sk_callback_lock, and then the smc_sock itself
> can be freed via sock_put() in smc_release().
> 
> This leads to two issues:
> 
> 1) NULL pointer dereference: sk_user_data is NULL when
>    accessed.
> 2) Use-after-free: sk_user_data is read as non-NULL, but the
>    smc_sock is freed before its fields (e.g., queued_smc_hs,
>    ori_af_ops) are accessed.
> 
> The race window looks like this:
> 
>   CPU A (softirq)              CPU B (process ctx)
> 
>   tcp_v4_rcv()
>     TCP_NEW_SYN_RECV:
>     sk = req->rsk_listener
>     sock_hold(sk)
>     /* No lock on listener */
>                                smc_close_active():
>                                  write_lock_bh(cb_lock)
>                                  sk_user_data = NULL
>                                  write_unlock_bh(cb_lock)
>                                  ...
>                                  smc_clcsock_release()
>                                  sock_put(smc->sk) x2
>                                    -> smc_sock freed!
>     tcp_check_req()
>       smc_tcp_syn_recv_sock():
>         smc = user_data(sk)
>           -> NULL or dangling
>         smc->queued_smc_hs
>           -> crash!
> 
> Note that the clcsock and smc_sock are two independent objects
> with separate refcounts. TCP stack holds a reference on the
> clcsock, which keeps it alive, but this does NOT prevent the
> smc_sock from being freed.
> 
> Fix this by using RCU and refcount_inc_not_zero() to safely
> access smc_sock. Since smc_tcp_syn_recv_sock() is called in
> the TCP three-way handshake path, taking read_lock_bh on
> sk_callback_lock is too heavy and would not survive a SYN
> flood attack. Using rcu_read_lock() is much more lightweight.
> 
> - Set SOCK_RCU_FREE on the SMC listen socket so that
>   smc_sock freeing is deferred until after the RCU grace
>   period. This guarantees the memory is still valid when
>   accessed inside rcu_read_lock().
> - Use rcu_read_lock() to protect reading sk_user_data.
> - Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
>   smc_sock. If the refcount has already reached zero (close
>   path completed), it returns false and we bail out safely.
> 
> Note: smc_hs_congested() has a similar lockless read of
> sk_user_data without rcu_read_lock(), but it only checks for
> NULL and accesses the global smc_hs_wq, never dereferencing
> any smc_sock field, so it is not affected.
> 
> Reproducer was verified with mdelay injection and smc_run,
> the issue no longer occurs with this patch applied.
> 
> [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
> 
> Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
> v2:
> - Use rcu_read_lock() + refcount_inc_not_zero() instead of
>   read_lock_bh(sk_callback_lock) + sock_hold(), since this
>   is the TCP handshake hot path and read_lock_bh is too
>   expensive under SYN flood.
> - Set SOCK_RCU_FREE on SMC listen socket to ensure
>   RCU-deferred freeing.
> 
> v1: https://lore.kernel.org/netdev/20260307032158.372165-1-jiayuan.chen@linux.dev/
> ---
>  net/smc/af_smc.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index d0119afcc6a1..72ac1d8c62d4 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -131,7 +131,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
>  	struct smc_sock *smc;
>  	struct sock *child;
>  
> +	rcu_read_lock();
>  	smc = smc_clcsock_user_data(sk);
> +	if (!smc || !refcount_inc_not_zero(&smc->sk.sk_refcnt)) {
> +		rcu_read_unlock();
> +		return NULL;
> +	}
> +	rcu_read_unlock();
>  
>  	if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
>  				sk->sk_max_ack_backlog)
> @@ -153,11 +159,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
>  		if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
>  			inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
>  	}
> +	sock_put(&smc->sk);
>  	return child;
>  
>  drop:
>  	dst_release(dst);
>  	tcp_listendrop(sk);
> +	sock_put(&smc->sk);
>  	return NULL;
>  }
>  
> @@ -2691,6 +2699,7 @@ int smc_listen(struct socket *sock, int backlog)
>  		write_unlock_bh(&smc->clcsock->sk->sk_callback_lock);
>  		goto out;
>  	}
> +	sock_set_flag(sk, SOCK_RCU_FREE);

This RCU approach looks good to me. Since SOCK_RCU_FREE is now enabled,
other callers of smc_clcsock_user_data() should also follow this
RCU-based pattern. It will eventually allow us to completely remove the
annoying sk_callback_lock.

D. Wythe

>  	sk->sk_max_ack_backlog = backlog;
>  	sk->sk_ack_backlog = 0;
>  	sk->sk_state = SMC_LISTEN;
> -- 
> 2.43.0

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
  2026-03-09  2:38 [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock() Jiayuan Chen
  2026-03-09  6:06 ` D. Wythe
@ 2026-03-09  7:26 ` Eric Dumazet
  2026-03-09  7:36   ` Eric Dumazet
  1 sibling, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2026-03-09  7:26 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: netdev, syzbot+827ae2bfb3a3529333e9, D. Wythe, Dust Li,
	Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu, Wen Gu,
	David S. Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-rdma, linux-s390, linux-kernel

On Mon, Mar 9, 2026 at 3:38 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].
>
> smc_tcp_syn_recv_sock() is called in the TCP receive path
> (softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
> listening socket). It reads sk_user_data to get the smc_sock
> pointer. However, when the SMC listen socket is being closed
> concurrently, smc_close_active() sets clcsock->sk_user_data
> to NULL under sk_callback_lock, and then the smc_sock itself
> can be freed via sock_put() in smc_release().
>
> This leads to two issues:
>
> 1) NULL pointer dereference: sk_user_data is NULL when
>    accessed.
> 2) Use-after-free: sk_user_data is read as non-NULL, but the
>    smc_sock is freed before its fields (e.g., queued_smc_hs,
>    ori_af_ops) are accessed.
>
> The race window looks like this:
>
>   CPU A (softirq)              CPU B (process ctx)
>
>   tcp_v4_rcv()
>     TCP_NEW_SYN_RECV:
>     sk = req->rsk_listener
>     sock_hold(sk)
>     /* No lock on listener */
>                                smc_close_active():
>                                  write_lock_bh(cb_lock)
>                                  sk_user_data = NULL
>                                  write_unlock_bh(cb_lock)
>                                  ...
>                                  smc_clcsock_release()
>                                  sock_put(smc->sk) x2
>                                    -> smc_sock freed!
>     tcp_check_req()
>       smc_tcp_syn_recv_sock():
>         smc = user_data(sk)
>           -> NULL or dangling
>         smc->queued_smc_hs
>           -> crash!
>
> Note that the clcsock and smc_sock are two independent objects
> with separate refcounts. TCP stack holds a reference on the
> clcsock, which keeps it alive, but this does NOT prevent the
> smc_sock from being freed.
>
> Fix this by using RCU and refcount_inc_not_zero() to safely
> access smc_sock. Since smc_tcp_syn_recv_sock() is called in
> the TCP three-way handshake path, taking read_lock_bh on
> sk_callback_lock is too heavy and would not survive a SYN
> flood attack. Using rcu_read_lock() is much more lightweight.
>
> - Set SOCK_RCU_FREE on the SMC listen socket so that
>   smc_sock freeing is deferred until after the RCU grace
>   period. This guarantees the memory is still valid when
>   accessed inside rcu_read_lock().
> - Use rcu_read_lock() to protect reading sk_user_data.
> - Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
>   smc_sock. If the refcount has already reached zero (close
>   path completed), it returns false and we bail out safely.
>
> Note: smc_hs_congested() has a similar lockless read of
> sk_user_data without rcu_read_lock(), but it only checks for
> NULL and accesses the global smc_hs_wq, never dereferencing
> any smc_sock field, so it is not affected.
>
> Reproducer was verified with mdelay injection and smc_run,
> the issue no longer occurs with this patch applied.
>
> [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
>
> Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
> v2:
> - Use rcu_read_lock() + refcount_inc_not_zero() instead of
>   read_lock_bh(sk_callback_lock) + sock_hold(), since this
>   is the TCP handshake hot path and read_lock_bh is too
>   expensive under SYN flood.
> - Set SOCK_RCU_FREE on SMC listen socket to ensure
>   RCU-deferred freeing.

This looks better, but please note there is something missing in your
RCU implementation.

You still need to ensure s->sk_user_data is read / written with
load/store tearing prevention.
Standard rcu_dereference()/rcu_assign() are dealing with this aspect.

For instance, the following helper is assuming a lock was held :

static inline struct smc_sock *smc_clcsock_user_data(const struct sock *clcsk)
{
return (struct smc_sock *)
       ((uintptr_t)clcsk->sk_user_data & ~SK_USER_DATA_NOCOPY);
}


One possible way is to use
rcu_dereference_sk_user_data()/rcu_assign_sk_user_data() or similar
helpers.


Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
  2026-03-09  7:26 ` Eric Dumazet
@ 2026-03-09  7:36   ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2026-03-09  7:36 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: netdev, syzbot+827ae2bfb3a3529333e9, D. Wythe, Dust Li,
	Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu, Wen Gu,
	David S. Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-rdma, linux-s390, linux-kernel

On Mon, Mar 9, 2026 at 8:26 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Mar 9, 2026 at 3:38 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> >
> > Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].
> >
> > smc_tcp_syn_recv_sock() is called in the TCP receive path
> > (softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
> > listening socket). It reads sk_user_data to get the smc_sock
> > pointer. However, when the SMC listen socket is being closed
> > concurrently, smc_close_active() sets clcsock->sk_user_data
> > to NULL under sk_callback_lock, and then the smc_sock itself
> > can be freed via sock_put() in smc_release().
> >
> > This leads to two issues:
> >
> > 1) NULL pointer dereference: sk_user_data is NULL when
> >    accessed.
> > 2) Use-after-free: sk_user_data is read as non-NULL, but the
> >    smc_sock is freed before its fields (e.g., queued_smc_hs,
> >    ori_af_ops) are accessed.
> >
> > The race window looks like this:
> >
> >   CPU A (softirq)              CPU B (process ctx)
> >
> >   tcp_v4_rcv()
> >     TCP_NEW_SYN_RECV:
> >     sk = req->rsk_listener
> >     sock_hold(sk)
> >     /* No lock on listener */
> >                                smc_close_active():
> >                                  write_lock_bh(cb_lock)
> >                                  sk_user_data = NULL
> >                                  write_unlock_bh(cb_lock)
> >                                  ...
> >                                  smc_clcsock_release()
> >                                  sock_put(smc->sk) x2
> >                                    -> smc_sock freed!
> >     tcp_check_req()
> >       smc_tcp_syn_recv_sock():
> >         smc = user_data(sk)
> >           -> NULL or dangling
> >         smc->queued_smc_hs
> >           -> crash!
> >
> > Note that the clcsock and smc_sock are two independent objects
> > with separate refcounts. TCP stack holds a reference on the
> > clcsock, which keeps it alive, but this does NOT prevent the
> > smc_sock from being freed.
> >
> > Fix this by using RCU and refcount_inc_not_zero() to safely
> > access smc_sock. Since smc_tcp_syn_recv_sock() is called in
> > the TCP three-way handshake path, taking read_lock_bh on
> > sk_callback_lock is too heavy and would not survive a SYN
> > flood attack. Using rcu_read_lock() is much more lightweight.
> >
> > - Set SOCK_RCU_FREE on the SMC listen socket so that
> >   smc_sock freeing is deferred until after the RCU grace
> >   period. This guarantees the memory is still valid when
> >   accessed inside rcu_read_lock().
> > - Use rcu_read_lock() to protect reading sk_user_data.
> > - Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
> >   smc_sock. If the refcount has already reached zero (close
> >   path completed), it returns false and we bail out safely.
> >
> > Note: smc_hs_congested() has a similar lockless read of
> > sk_user_data without rcu_read_lock(), but it only checks for
> > NULL and accesses the global smc_hs_wq, never dereferencing
> > any smc_sock field, so it is not affected.
> >
> > Reproducer was verified with mdelay injection and smc_run,
> > the issue no longer occurs with this patch applied.
> >
> > [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
> >
> > Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> > Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> > Suggested-by: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > ---
> > v2:
> > - Use rcu_read_lock() + refcount_inc_not_zero() instead of
> >   read_lock_bh(sk_callback_lock) + sock_hold(), since this
> >   is the TCP handshake hot path and read_lock_bh is too
> >   expensive under SYN flood.
> > - Set SOCK_RCU_FREE on SMC listen socket to ensure
> >   RCU-deferred freeing.
>
> This looks better, but please note there is something missing in your
> RCU implementation.
>
> You still need to ensure s->sk_user_data is read / written with
> load/store tearing prevention.
> Standard rcu_dereference()/rcu_assign() are dealing with this aspect.
>
> For instance, the following helper is assuming a lock was held :
>
> static inline struct smc_sock *smc_clcsock_user_data(const struct sock *clcsk)
> {
> return (struct smc_sock *)
>        ((uintptr_t)clcsk->sk_user_data & ~SK_USER_DATA_NOCOPY);
> }
>
>
> One possible way is to use
> rcu_dereference_sk_user_data()/rcu_assign_sk_user_data() or similar
> helpers.
>

More context in this old commit

commit 559835ea7292e2f09304d81eda16f4209433245e
Author: Pravin B Shelar <pshelar@nicira.com>
Date:   Tue Sep 24 10:25:40 2013 -0700

    vxlan: Use RCU apis to access sk_user_data.

    Use of RCU api makes vxlan code easier to understand.  It also
    fixes bug due to missing ACCESS_ONCE() on sk_user_data dereference.
    In rare case without ACCESS_ONCE() compiler might omit vs on
    sk_user_data dereference.
    Compiler can use vs as alias for sk->sk_user_data, resulting in
    multiple sk_user_data dereference in rcu read context which
    could change.

    CC: Jesse Gross <jesse@nicira.com>
    Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
  2026-03-09  6:06 ` D. Wythe
@ 2026-03-09  7:48   ` Jiayuan Chen
  2026-03-10  5:51     ` D. Wythe
  0 siblings, 1 reply; 6+ messages in thread
From: Jiayuan Chen @ 2026-03-09  7:48 UTC (permalink / raw)
  To: D. Wythe
  Cc: netdev, syzbot+827ae2bfb3a3529333e9, Eric Dumazet, D. Wythe,
	Dust Li, Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu,
	Wen Gu, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-rdma, linux-s390, linux-kernel

March 9, 2026 at 14:06, "D. Wythe" <alibuda@linux.alibaba.com mailto:alibuda@linux.alibaba.com?to=%22D.%20Wythe%22%20%3Calibuda%40linux.alibaba.com%3E > wrote:


[...]
> >  Reproducer was verified with mdelay injection and smc_run,
> >  the issue no longer occurs with this patch applied.
> >  
> >  [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
> >  
> >  Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> >  Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> >  Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> >  Suggested-by: Eric Dumazet <edumazet@google.com>
> >  Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> >  ---
> >  v2:
> >  - Use rcu_read_lock() + refcount_inc_not_zero() instead of
> >  read_lock_bh(sk_callback_lock) + sock_hold(), since this
> >  is the TCP handshake hot path and read_lock_bh is too
> >  expensive under SYN flood.
> >  - Set SOCK_RCU_FREE on SMC listen socket to ensure
> >  RCU-deferred freeing.
> >  
> >  v1: https://lore.kernel.org/netdev/20260307032158.372165-1-jiayuan.chen@linux.dev/
> >  ---
> >  net/smc/af_smc.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >  
> >  diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> >  index d0119afcc6a1..72ac1d8c62d4 100644
> >  --- a/net/smc/af_smc.c
> >  +++ b/net/smc/af_smc.c
> >  @@ -131,7 +131,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> >  struct smc_sock *smc;
> >  struct sock *child;
> >  
> >  + rcu_read_lock();
> >  smc = smc_clcsock_user_data(sk);
> >  + if (!smc || !refcount_inc_not_zero(&smc->sk.sk_refcnt)) {
> >  + rcu_read_unlock();
> >  + return NULL;
> >  + }
> >  + rcu_read_unlock();
> >  
> >  if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
> >  sk->sk_max_ack_backlog)
> >  @@ -153,11 +159,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> >  if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
> >  inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
> >  }
> >  + sock_put(&smc->sk);
> >  return child;
> >  
> >  drop:
> >  dst_release(dst);
> >  tcp_listendrop(sk);
> >  + sock_put(&smc->sk);
> >  return NULL;
> >  }
> >  
> >  @@ -2691,6 +2699,7 @@ int smc_listen(struct socket *sock, int backlog)
> >  write_unlock_bh(&smc->clcsock->sk->sk_callback_lock);
> >  goto out;
> >  }
> >  + sock_set_flag(sk, SOCK_RCU_FREE);
> > 
> This RCU approach looks good to me. Since SOCK_RCU_FREE is now enabled,
> other callers of smc_clcsock_user_data() should also follow this
> RCU-based pattern. It will eventually allow us to completely remove the
> annoying sk_callback_lock.
> 
> D. Wythe
> 


Hi D. Wythe,

Thanks for the suggestion. I agree that converting all smc_clcsock_user_data()
callers to RCU is a reasonable direction, and it would allow us to eventually
remove the sk_callback_lock dependency for sk_user_data access.

However, I'd prefer to keep this patch focused on fixing the specific bug in
smc_tcp_syn_recv_sock(), since it needs to be backported to stable trees.
Mixing a bug fix with broader refactoring makes backporting harder and increases
the risk of regressions.

Also, converting the other callers is not entirely trivial. For example:

- smc_fback_state_change/data_ready/write_space/error_report():
  the sk_callback_lock there protects not only sk_user_data but also the
  consistency of the saved callback pointers (e.g.,smc->clcsk_state_change).
  Switching to RCU requires careful ordering analysis against the write side
  in smc_fback_restore_callbacks(). Additionally, fallback sockets would also
  need SOCK_RCU_FREE.

- smc_clcsock_data_ready():
  the sock_hold() would need to become refcount_inc_not_zero() to handle the
  case where the refcount has already reached zero.

I'd like to address these conversions in a follow-up patch series.

What do you think?

> > 
> > sk->sk_max_ack_backlog = backlog;
> >  sk->sk_ack_backlog = 0;
> >  sk->sk_state = SMC_LISTEN;
> >  -- 
> >  2.43.0
> >
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
  2026-03-09  7:48   ` Jiayuan Chen
@ 2026-03-10  5:51     ` D. Wythe
  0 siblings, 0 replies; 6+ messages in thread
From: D. Wythe @ 2026-03-10  5:51 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: D. Wythe, netdev, syzbot+827ae2bfb3a3529333e9, Eric Dumazet,
	Dust Li, Sidraya Jayagond, Wenjia Zhang, Mahanta Jambigi, Tony Lu,
	Wen Gu, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-rdma, linux-s390, linux-kernel

On Mon, Mar 09, 2026 at 07:48:54AM +0000, Jiayuan Chen wrote:
> March 9, 2026 at 14:06, "D. Wythe" <alibuda@linux.alibaba.com mailto:alibuda@linux.alibaba.com?to=%22D.%20Wythe%22%20%3Calibuda%40linux.alibaba.com%3E > wrote:
> 
> 
> [...]
> > >  Reproducer was verified with mdelay injection and smc_run,
> > >  the issue no longer occurs with this patch applied.
> > >  
> > >  [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
> > >  
> > >  Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> > >  Reported-by: syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com
> > >  Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/
> > >  Suggested-by: Eric Dumazet <edumazet@google.com>
> > >  Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > >  ---
> > >  v2:
> > >  - Use rcu_read_lock() + refcount_inc_not_zero() instead of
> > >  read_lock_bh(sk_callback_lock) + sock_hold(), since this
> > >  is the TCP handshake hot path and read_lock_bh is too
> > >  expensive under SYN flood.
> > >  - Set SOCK_RCU_FREE on SMC listen socket to ensure
> > >  RCU-deferred freeing.
> > >  
> > >  v1: https://lore.kernel.org/netdev/20260307032158.372165-1-jiayuan.chen@linux.dev/
> > >  ---
> > >  net/smc/af_smc.c | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > >  
> > >  diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> > >  index d0119afcc6a1..72ac1d8c62d4 100644
> > >  --- a/net/smc/af_smc.c
> > >  +++ b/net/smc/af_smc.c
> > >  @@ -131,7 +131,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> > >  struct smc_sock *smc;
> > >  struct sock *child;
> > >  
> > >  + rcu_read_lock();
> > >  smc = smc_clcsock_user_data(sk);
> > >  + if (!smc || !refcount_inc_not_zero(&smc->sk.sk_refcnt)) {
> > >  + rcu_read_unlock();
> > >  + return NULL;
> > >  + }
> > >  + rcu_read_unlock();
> > >  
> > >  if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
> > >  sk->sk_max_ack_backlog)
> > >  @@ -153,11 +159,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
> > >  if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
> > >  inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
> > >  }
> > >  + sock_put(&smc->sk);
> > >  return child;
> > >  
> > >  drop:
> > >  dst_release(dst);
> > >  tcp_listendrop(sk);
> > >  + sock_put(&smc->sk);
> > >  return NULL;
> > >  }
> > >  
> > >  @@ -2691,6 +2699,7 @@ int smc_listen(struct socket *sock, int backlog)
> > >  write_unlock_bh(&smc->clcsock->sk->sk_callback_lock);
> > >  goto out;
> > >  }
> > >  + sock_set_flag(sk, SOCK_RCU_FREE);
> > > 
> > This RCU approach looks good to me. Since SOCK_RCU_FREE is now enabled,
> > other callers of smc_clcsock_user_data() should also follow this
> > RCU-based pattern. It will eventually allow us to completely remove the
> > annoying sk_callback_lock.
> > 
> > D. Wythe
> > 
> 
> 
> Hi D. Wythe,
> 
> Thanks for the suggestion. I agree that converting all smc_clcsock_user_data()
> callers to RCU is a reasonable direction, and it would allow us to eventually
> remove the sk_callback_lock dependency for sk_user_data access.
> 
> However, I'd prefer to keep this patch focused on fixing the specific bug in
> smc_tcp_syn_recv_sock(), since it needs to be backported to stable trees.
> Mixing a bug fix with broader refactoring makes backporting harder and increases
> the risk of regressions.
> 
> Also, converting the other callers is not entirely trivial. For example:
> 
> - smc_fback_state_change/data_ready/write_space/error_report():
>   the sk_callback_lock there protects not only sk_user_data but also the
>   consistency of the saved callback pointers (e.g.,smc->clcsk_state_change).
>   Switching to RCU requires careful ordering analysis against the write side
>   in smc_fback_restore_callbacks(). Additionally, fallback sockets would also
>   need SOCK_RCU_FREE.
> 
> - smc_clcsock_data_ready():
>   the sock_hold() would need to become refcount_inc_not_zero() to handle the
>   case where the refcount has already reached zero.
> 
> I'd like to address these conversions in a follow-up patch series.
> 
> What do you think?

Thanks for the detailed explanation. I agree with your plan, it's
better to keep the bug fix and the optimization separate.

> 
> > > 
> > > sk->sk_max_ack_backlog = backlog;
> > >  sk->sk_ack_backlog = 0;
> > >  sk->sk_state = SMC_LISTEN;
> > >  -- 
> > >  2.43.0
> > >
> >

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-03-10  5:51 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-09  2:38 [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock() Jiayuan Chen
2026-03-09  6:06 ` D. Wythe
2026-03-09  7:48   ` Jiayuan Chen
2026-03-10  5:51     ` D. Wythe
2026-03-09  7:26 ` Eric Dumazet
2026-03-09  7:36   ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox