public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: jiayuan.chen@linux.dev
Cc: Jakub Kicinski <kuba@kernel.org>,
	jiayuan.chen@shopee.com, mjambigi@linux.ibm.com,
	wenjia@linux.ibm.com, horms@kernel.org, sidraya@linux.ibm.com,
	guwen@linux.alibaba.com, davem@davemloft.net,
	linux-s390@vger.kernel.org, netdev@vger.kernel.org,
	linux-rdma@vger.kernel.org, dust.li@linux.alibaba.com,
	syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com,
	pabeni@redhat.com, tonylu@linux.alibaba.com,
	linux-kernel@vger.kernel.org, edumazet@google.com,
	alibuda@linux.alibaba.com
Subject: Re: [net,v4] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()
Date: Wed, 11 Mar 2026 20:05:20 -0700	[thread overview]
Message-ID: <20260312030520.626362-1-kuba@kernel.org> (raw)
In-Reply-To: <20260311022451.395802-1-jiayuan.chen@linux.dev>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()

This patch fixes a race condition in smc_tcp_syn_recv_sock() where
sk_user_data can be nullified or the smc_sock can be freed during
concurrent socket closure, using RCU and refcount_inc_not_zero() for
safe access.

Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@google.com/T/

> The race window looks like this:
>
>   CPU A (softirq)              CPU B (process ctx)
>
>   tcp_v4_rcv()
>     TCP_NEW_SYN_RECV:
>     sk = req->rsk_listener
>     sock_hold(sk)
>     /* No lock on listener */
>                                smc_close_active():
>                                  write_lock_bh(cb_lock)
>                                  sk_user_data = NULL
>                                  write_unlock_bh(cb_lock)
>                                  ...
>                                  smc_clcsock_release()
>                                  sock_put(smc->sk) x2
>                                    -> smc_sock freed!
>     tcp_check_req()
>       smc_tcp_syn_recv_sock():
>         smc = user_data(sk)
>           -> NULL or dangling
>         smc->queued_smc_hs
>           -> crash!

The race diagram describes the tcp_check_req() path (TCP_NEW_SYN_RECV
handling), but the actual syzbot crash occurred through a different path.
Looking at the syzbot stacktrace, the crash shows:

smc_tcp_syn_recv_sock+0xa7/0x4b0 called from tcp_get_cookie_sock+0xd5/0x790

This is the SYN cookie path:

tcp_v4_rcv() -> tcp_v4_do_rcv() -> tcp_v4_cookie_check() ->
cookie_v4_check() -> tcp_get_cookie_sock() -> smc_tcp_syn_recv_sock()

The syzbot log even shows 'Possible SYN flooding on port [::]:20002.
Sending cookies.' confirming syncookies were active. Both paths are
vulnerable to the race, but the race diagram doesn't match the actual
crash path that the fuzzer triggered.

Would it be more accurate to note that both paths (tcp_check_req for
normal SYN handling and tcp_get_cookie_sock for SYN cookies) can trigger
the race?

> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -131,7 +131,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
>  	struct smc_sock *smc;
>  	struct sock *child;
>
> -	smc = smc_clcsock_user_data(sk);
> +	rcu_read_lock();
> +	smc = smc_clcsock_user_data_rcu(sk);
> +	if (!smc || !refcount_inc_not_zero(&smc->sk.sk_refcnt)) {
> +		rcu_read_unlock();
> +		return NULL;

When smc_tcp_syn_recv_sock() is called from the SYN cookie path
(tcp_get_cookie_sock() in net/ipv4/syncookies.c:205 and
net/ipv6/syncookies.c:275), the dst parameter is non-NULL.

The new early return here returns NULL without calling dst_release(dst).
The syn_recv_sock contract requires the callee to either consume dst
(attach to child socket on success) or release it (call dst_release on
failure).

Looking at tcp_get_cookie_sock() in net/ipv4/syncookies.c:

    child = af_ops->syn_recv_sock(..., &dst, ...)
    if (!child) {
        reqsk_free(req);
        return NULL;
    }

When syn_recv_sock returns NULL, tcp_get_cookie_sock immediately returns
NULL without releasing dst, expecting the syn_recv_sock callee to have
already handled it.

The existing 'drop:' label at line 165 correctly calls dst_release(dst),
but the new early return at line 138 bypasses it. This causes a dst_entry
reference count leak under SYN flood conditions on an SMC listener.

Should this be: goto drop instead of return NULL, or should dst_release(dst)
be added before the return NULL?

> +	}
> +	rcu_read_unlock();
>
>  	if (READ_ONCE(sk->sk_ack_backlog) + atomic_read(&smc->queued_smc_hs) >
>  				sk->sk_max_ack_backlog)
> @@ -153,11 +159,13 @@ static struct sock *smc_tcp_syn_recv_sock(const struct sock *sk,
>  		if (inet_csk(child)->icsk_af_ops == inet_csk(sk)->icsk_af_ops)
>  			inet_csk(child)->icsk_af_ops = smc->ori_af_ops;
>  	}
> +	sock_put(&smc->sk);
>  	return child;
>
>  drop:
>  	dst_release(dst);
>  	tcp_listendrop(sk);
> +	sock_put(&smc->sk);
>  	return NULL;
>  }
>
> @@ -254,7 +262,7 @@ static void smc_fback_restore_callbacks(struct smc_sock *smc)
>  	struct sock *clcsk = smc->clcsock->sk;
>
>  	write_lock_bh(&clcsk->sk_callback_lock);
> -	clcsk->sk_user_data = NULL;
> +	rcu_assign_sk_user_data(clcsk, NULL);
>
>  	smc_clcsock_restore_cb(&clcsk->sk_state_change, &smc->clcsk_state_change);
>  	smc_clcsock_restore_cb(&clcsk->sk_data_ready, &smc->clcsk_data_ready);

[ ... ]

> @@ -2687,10 +2695,11 @@ int smc_listen(struct socket *sock, int backlog)
>  		write_lock_bh(&smc->clcsock->sk->sk_callback_lock);
>  		smc_clcsock_restore_cb(&smc->clcsock->sk->sk_data_ready,
>  				       &smc->clcsk_data_ready);
> -		smc->clcsock->sk->sk_user_data = NULL;
> +		rcu_assign_sk_user_data(smc->clcsock->sk, NULL);
>  		write_unlock_bh(&smc->clcsock->sk->sk_callback_lock);
>  		goto out;
>  	}
> +	sock_set_flag(sk, SOCK_RCU_FREE);
>  	sk->sk_max_ack_backlog = backlog;
>  	sk->sk_ack_backlog = 0;
>  	sk->sk_state = SMC_LISTEN;

[ ... ]
-- 
pw-bot: cr

  parent reply	other threads:[~2026-03-12  3:05 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-11  2:24 [PATCH net v4] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock() Jiayuan Chen
2026-03-11  9:17 ` Eric Dumazet
2026-03-12  3:05 ` Jakub Kicinski [this message]
2026-03-12  3:27   ` [net,v4] " Jiayuan Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260312030520.626362-1-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=alibuda@linux.alibaba.com \
    --cc=davem@davemloft.net \
    --cc=dust.li@linux.alibaba.com \
    --cc=edumazet@google.com \
    --cc=guwen@linux.alibaba.com \
    --cc=horms@kernel.org \
    --cc=jiayuan.chen@linux.dev \
    --cc=jiayuan.chen@shopee.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjambigi@linux.ibm.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sidraya@linux.ibm.com \
    --cc=syzbot+827ae2bfb3a3529333e9@syzkaller.appspotmail.com \
    --cc=tonylu@linux.alibaba.com \
    --cc=wenjia@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox