From: KaFai Wan <kafai.wan@linux.dev>
To: sashiko@lists.linux.dev
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH bpf v3 1/2] bpf: Reject TCP_NODELAY in TCP header option callbacks
Date: Mon, 20 Apr 2026 21:41:06 +0800
Message-ID: <0e5602f1ca92074cbef0554a7a399ff8b1cc8a1c.camel@linux.dev>
In-Reply-To: <20260418092235.98444C19424@smtp.kernel.org>

On Sat, 2026-04-18 at 09:22 +0000, sashiko-bot@kernel.org wrote:
> Sashiko AI review found 1 potential issue(s):
> - [High] The patch prevents an infinite recursion causing a kernel stack overflow by rejecting
> TCP_NODELAY in specific BPF_SOCK_OPS callbacks. However, it fails to address the exact same
> recursion vulnerability achievable via bpf_sk_setsockopt in a BPF TCP congestion control algorithm
> (BPF_PROG_TYPE_STRUCT_OPS).
> --
> 
> commit 4ff4d98f6b4e3c3d90f9268761de84573f49fd86
> Author: KaFai Wan <kafai.wan@linux.dev>
> 
> bpf: Reject TCP_NODELAY in TCP header option callbacks
> 
> This patch rejects bpf_setsockopt(TCP_NODELAY) when called from
> BPF_SOCK_OPS_HDR_OPT_LEN_CB or BPF_SOCK_OPS_WRITE_HDR_OPT_CB to prevent
> unbounded recursion and kernel stack overflow.
> 
> Closes: https://lore.kernel.org/bpf/d1d523c9-6901-4454-a183-94462b8f3e4e@std.uestc.edu.cn/
> 
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index fcfcb72663ca3..eae2f2c542716 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -5833,6 +5833,12 @@ BPF_CALL_5(bpf_sock_ops_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
> >  	if (!is_locked_tcp_sock_ops(bpf_sock))
> >  		return -EOPNOTSUPP;
> >  
> > +	/* TCP_NODELAY triggers tcp_push_pending_frames() and re-enters these callbacks. */
> > +	if ((bpf_sock->op == BPF_SOCK_OPS_HDR_OPT_LEN_CB ||
> > +	     bpf_sock->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB) &&
> > +	    level == SOL_TCP && optname == TCP_NODELAY)
> > +		return -EOPNOTSUPP;
> 
> Does this same recursion vulnerability exist for BPF TCP congestion control
> algorithms using BPF_PROG_TYPE_STRUCT_OPS?
> 
> If a BPF congestion control algorithm invokes bpf_setsockopt(TCP_NODELAY)
> from its cwnd_event callback when handling CA_EVENT_TX_START, could it
> trigger the same unbounded recursion?
> 
> When the kernel transmits the first packet of a data train via
> tcp_transmit_skb(), it invokes tcp_event_data_sent(). Because
> tp->packets_out is not incremented until later, tcp_packets_in_flight(tp)
> evaluates to 0, triggering tcp_ca_event(sk, CA_EVENT_TX_START).
> 
> If the BPF program then calls bpf_setsockopt(TCP_NODELAY), it would result
> in this call chain:
> 
> tcp_transmit_skb()
>   tcp_event_data_sent() -> invokes CA_EVENT_TX_START
>     cwnd_event()
>       bpf_setsockopt(TCP_NODELAY)
>         tcp_push_pending_frames()
>           tcp_write_xmit()
> 
> Since the outer tcp_transmit_skb() hasn't finished, the send head hasn't
> advanced. Wouldn't tcp_write_xmit() see the same SKB, attempt to transmit
> it again, and re-enter tcp_transmit_skb() causing an infinite recursion?
> 
You are right. I can reproduce this. 
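
To make the re-entry concrete, here is a small user-space model of the call chain quoted above. It is only a sketch, not the reproducer and not kernel code: every `model_*` name, the struct fields, and the `MODEL_MAX_DEPTH` cap are made up for illustration. The cap exists only so the model terminates; in the scenario described above nothing bounds the nesting.

```c
#include <stdbool.h>

/* Hypothetical cap so this model terminates; the scenario in the
 * review has no such bound. */
#define MODEL_MAX_DEPTH 8

struct model_sock {
	int send_head;          /* index of the next segment to send */
	int depth;              /* current nesting of model_transmit_skb() */
	int max_depth_seen;     /* deepest nesting observed */
	bool ca_sets_nodelay;   /* does the modelled cwnd_event set TCP_NODELAY? */
};

static void model_write_xmit(struct model_sock *sk);

/* Stands in for tcp_transmit_skb(): the TX_START callback runs here,
 * before the caller has advanced the send head. */
static void model_transmit_skb(struct model_sock *sk)
{
	sk->depth++;
	if (sk->depth > sk->max_depth_seen)
		sk->max_depth_seen = sk->depth;

	/* Modelled cwnd_event() -> setsockopt(TCP_NODELAY) -> push frames. */
	if (sk->ca_sets_nodelay && sk->depth < MODEL_MAX_DEPTH)
		model_write_xmit(sk);

	sk->depth--;
}

/* Stands in for tcp_write_xmit(): it picks the segment at the send head
 * and advances the head only after the transmit call has returned. */
static void model_write_xmit(struct model_sock *sk)
{
	int seg = sk->send_head;

	model_transmit_skb(sk);

	if (sk->send_head == seg)
		sk->send_head = seg + 1;
}

/* Run one transmit and report how deep the nesting went. */
static int model_max_depth(bool ca_sets_nodelay)
{
	struct model_sock sk = { 0, 0, 0, ca_sets_nodelay };

	model_write_xmit(&sk);
	return sk.max_depth_seen;
}
```

With `ca_sets_nodelay` false the nesting depth stays at 1. With it true, every nested `model_write_xmit()` sees the same `send_head`, so the depth grows until the artificial cap stops it.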
 
> Should the restriction on TCP_NODELAY be enforced at a broader level, such
> as inside _bpf_setsockopt(), to protect contexts holding the socket lock
> during TX paths?
> 
We can do the check in sol_tcp_sockopt() instead.
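
As a rough illustration of why a check in the common helper is broader than one in a single entry point, here is a user-space sketch. It is not the proposed patch. All `model_*` names are hypothetical, and the `in_tx_path` flag is an assumption for the sketch only: how the kernel would detect the TX-path context is not decided in this message.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative constants for the model (values match Linux's SOL_TCP
 * and TCP_NODELAY, but nothing here calls into the kernel). */
#define MODEL_SOL_TCP      6
#define MODEL_TCP_NODELAY  1

/* Hypothetical context: whether the caller is inside a TX path. */
struct model_ctx {
	bool in_tx_path;
};

/* Common helper, standing in for sol_tcp_sockopt(): one check here is
 * seen by every entry point that funnels into it. */
static int model_sol_tcp_sockopt(const struct model_ctx *ctx, int optname)
{
	if (ctx->in_tx_path && optname == MODEL_TCP_NODELAY)
		return -EOPNOTSUPP;
	return 0;
}

/* Entry point modelled on the sock_ops setsockopt helper. */
static int model_sock_ops_setsockopt(const struct model_ctx *ctx,
				     int level, int optname)
{
	if (level == MODEL_SOL_TCP)
		return model_sol_tcp_sockopt(ctx, optname);
	return 0;
}

/* Entry point modelled on the struct_ops (congestion control) helper. */
static int model_struct_ops_setsockopt(const struct model_ctx *ctx,
				       int level, int optname)
{
	if (level == MODEL_SOL_TCP)
		return model_sol_tcp_sockopt(ctx, optname);
	return 0;
}
```

Neither entry point carries its own TCP_NODELAY test in this model; both get the rejection from the shared helper, which is the property the broader check is after.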

> > +
> >  	return _bpf_setsockopt(bpf_sock->sk, level, optname, optval, optlen);
> >  }
> > 
> 

-- 
Thanks,
KaFai
