From: KaFai Wan <kafai.wan@linux.dev>
To: sashiko@lists.linux.dev
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH bpf v3 1/2] bpf: Reject TCP_NODELAY in TCP header option callbacks
Date: Mon, 20 Apr 2026 21:41:06 +0800
Message-ID: <0e5602f1ca92074cbef0554a7a399ff8b1cc8a1c.camel@linux.dev>
In-Reply-To: <20260418092235.98444C19424@smtp.kernel.org>

On Sat, 2026-04-18 at 09:22 +0000, sashiko-bot@kernel.org wrote:
> Sashiko AI review found 1 potential issue(s):
> - [High] The patch prevents an infinite recursion causing a kernel stack overflow by rejecting
> TCP_NODELAY in specific BPF_SOCK_OPS callbacks. However, it fails to address the exact same
> recursion vulnerability achievable via bpf_sk_setsockopt in a BPF TCP congestion control algorithm
> (BPF_PROG_TYPE_STRUCT_OPS).
> --
> 
> commit 4ff4d98f6b4e3c3d90f9268761de84573f49fd86
> Author: KaFai Wan <kafai.wan@linux.dev>
> 
> bpf: Reject TCP_NODELAY in TCP header option callbacks
> 
> This patch rejects bpf_setsockopt(TCP_NODELAY) when called from
> BPF_SOCK_OPS_HDR_OPT_LEN_CB or BPF_SOCK_OPS_WRITE_HDR_OPT_CB to prevent
> unbounded recursion and kernel stack overflow.
> 
> Closes: https://lore.kernel.org/bpf/d1d523c9-6901-4454-a183-94462b8f3e4e@std.uestc.edu.cn/
> 
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index fcfcb72663ca3..eae2f2c542716 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -5833,6 +5833,12 @@ BPF_CALL_5(bpf_sock_ops_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
> >  	if (!is_locked_tcp_sock_ops(bpf_sock))
> >  		return -EOPNOTSUPP;
> >  
> > +	/* TCP_NODELAY triggers tcp_push_pending_frames() and re-enters these callbacks. */
> > +	if ((bpf_sock->op == BPF_SOCK_OPS_HDR_OPT_LEN_CB ||
> > +	     bpf_sock->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB) &&
> > +	    level == SOL_TCP && optname == TCP_NODELAY)
> > +		return -EOPNOTSUPP;
> 
> Does this same recursion vulnerability exist for BPF TCP congestion control
> algorithms using BPF_PROG_TYPE_STRUCT_OPS?
> 
> If a BPF congestion control algorithm invokes bpf_setsockopt(TCP_NODELAY)
> from its cwnd_event callback when handling CA_EVENT_TX_START, could it
> trigger the same unbounded recursion?
> 
> When the kernel transmits the first packet of a data train via
> tcp_transmit_skb(), it invokes tcp_event_data_sent(). Because
> tp->packets_out is not incremented until later, tcp_packets_in_flight(tp)
> evaluates to 0, triggering tcp_ca_event(sk, CA_EVENT_TX_START).
> 
> If the BPF program then calls bpf_setsockopt(TCP_NODELAY), it would result
> in this call chain:
> 
> tcp_transmit_skb()
>   tcp_event_data_sent() -> invokes CA_EVENT_TX_START
>     cwnd_event()
>       bpf_setsockopt(TCP_NODELAY)
>         tcp_push_pending_frames()
>           tcp_write_xmit()
> 
> Since the outer tcp_transmit_skb() hasn't finished, the send head hasn't
> advanced. Wouldn't tcp_write_xmit() see the same SKB, attempt to transmit
> it again, and re-enter tcp_transmit_skb() causing an infinite recursion?
> 
You are right. I can reproduce this. 
 
> Should the restriction on TCP_NODELAY be enforced at a broader level, such
> as inside _bpf_setsockopt(), to protect contexts holding the socket lock
> during TX paths?
> 
We can do the check in sol_tcp_sockopt() instead.

> > +
> >  	return _bpf_setsockopt(bpf_sock->sk, level, optname, optval, optlen);
> >  }
> > 
> 

-- 
Thanks,
KaFai

Thread overview: 13+ messages
2026-04-17  9:20 [PATCH bpf v3 0/2] bpf: Reject TCP_NODELAY in TCP header option callbacks KaFai Wan
2026-04-17  9:20 ` [PATCH bpf v3 1/2] " KaFai Wan
2026-04-17 10:10   ` bot+bpf-ci
2026-04-17 10:26   ` Jiayuan Chen
2026-04-18  9:22   ` sashiko-bot
2026-04-20 13:41     ` KaFai Wan [this message]
2026-04-20 18:12       ` Martin KaFai Lau
2026-04-21 15:50         ` KaFai Wan
2026-04-17  9:20 ` [PATCH bpf v3 2/2] selftests/bpf: Test TCP_NODELAY in TCP hdr opt callbacks KaFai Wan
2026-04-17 10:45   ` Jiayuan Chen
2026-04-17 16:25   ` Martin KaFai Lau
2026-04-18  2:19     ` KaFai Wan
2026-04-20 17:09       ` Martin KaFai Lau