From: Jiayuan Chen <jiayuan.chen@linux.dev>
To: KaFai Wan <kafai.wan@linux.dev>,
martin.lau@linux.dev, daniel@iogearbox.net,
john.fastabend@gmail.com, sdf@fomichev.me, ast@kernel.org,
andrii@kernel.org, eddyz87@gmail.com, memxor@gmail.com,
song@kernel.org, yonghong.song@linux.dev, jolsa@kernel.org,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org, shuah@kernel.org,
jiayuan.chen@linux.dev, bpf@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org
Cc: Quan Sun <2022090917019@std.uestc.edu.cn>,
Yinhao Hu <dddddd@hust.edu.cn>,
Kaiyan Mei <M202472210@hust.edu.cn>
Subject: Re: [PATCH bpf v2 1/2] bpf: Reject TCP_NODELAY in TCP header option callbacks
Date: Fri, 17 Apr 2026 10:43:53 +0800
Message-ID: <ce2019d0-c127-4c0a-8d2a-f373a68d1bd2@linux.dev>
In-Reply-To: <20260416112308.1820332-2-kafai.wan@linux.dev>

On 4/16/26 7:23 PM, KaFai Wan wrote:
> A BPF_SOCK_OPS program can enable
> BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG and then call
> bpf_setsockopt(TCP_NODELAY) from BPF_SOCK_OPS_HDR_OPT_LEN_CB or
> BPF_SOCK_OPS_WRITE_HDR_OPT_CB.
>
> In these callbacks, bpf_setsockopt(TCP_NODELAY) can reach
> __tcp_sock_set_nodelay(), which can call tcp_push_pending_frames().
>
> From BPF_SOCK_OPS_HDR_OPT_LEN_CB, tcp_push_pending_frames() can call
> tcp_current_mss(), which calls tcp_established_options() and re-enters
> bpf_skops_hdr_opt_len().
>
> BPF_SOCK_OPS_HDR_OPT_LEN_CB
> -> bpf_setsockopt(TCP_NODELAY)
> -> tcp_push_pending_frames()
> -> tcp_current_mss()
> -> tcp_established_options()
> -> bpf_skops_hdr_opt_len()
> -> BPF_SOCK_OPS_HDR_OPT_LEN_CB
>
> From BPF_SOCK_OPS_WRITE_HDR_OPT_CB, tcp_push_pending_frames() can call
> tcp_write_xmit(), which calls tcp_transmit_skb(). That path recomputes
> header option length through tcp_established_options() and
> bpf_skops_hdr_opt_len() before re-entering bpf_skops_write_hdr_opt().
>
> BPF_SOCK_OPS_WRITE_HDR_OPT_CB
> -> bpf_setsockopt(TCP_NODELAY)
> -> tcp_push_pending_frames()
> -> tcp_write_xmit()
> -> tcp_transmit_skb()
> -> tcp_established_options()
> -> bpf_skops_hdr_opt_len()
> -> bpf_skops_write_hdr_opt()
> -> BPF_SOCK_OPS_WRITE_HDR_OPT_CB
>
> This leads to unbounded recursion and can overflow the kernel stack.
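
For anyone following along, the re-entry described above can be modelled in
userspace. This is only a toy sketch, not kernel code: every function below
is a hypothetical stand-in for the kernel function named in its comment, and
MAX_DEPTH is an artificial cap I added so the model terminates (the real path
has no such cap, which is the problem).

```c
#include <stdbool.h>

/* Hypothetical cap so the toy model terminates; the kernel has none. */
#define MAX_DEPTH 8

static int depth;            /* current nesting of the callback */
static int max_seen;         /* deepest nesting observed in a run */
static bool reject_nodelay;  /* models the check proposed in this patch */

static void hdr_opt_len_cb(void);

/* Stand-in for tcp_push_pending_frames() -> tcp_current_mss() ->
 * tcp_established_options() -> bpf_skops_hdr_opt_len().
 */
static void push_pending_frames(void)
{
	hdr_opt_len_cb();
}

/* Stand-in for bpf_setsockopt(TCP_NODELAY) called from the callback. */
static int set_nodelay_from_cb(void)
{
	if (reject_nodelay)
		return -1;	/* rejected before any frames are pushed */
	push_pending_frames();
	return 0;
}

/* Stand-in for a BPF program attached to BPF_SOCK_OPS_HDR_OPT_LEN_CB. */
static void hdr_opt_len_cb(void)
{
	depth++;
	if (depth > max_seen)
		max_seen = depth;
	if (depth < MAX_DEPTH)
		set_nodelay_from_cb();
	depth--;
}

/* Runs the model once and returns the deepest callback nesting reached. */
int run_model(bool reject)
{
	depth = 0;
	max_seen = 0;
	reject_nodelay = reject;
	hdr_opt_len_cb();
	return max_seen;
}
```

Without the rejection the nesting only stops at the artificial cap; with it,
the callback runs exactly once.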
>
> Reject TCP_NODELAY with -EOPNOTSUPP in bpf_sock_ops_setsockopt()
> when bpf_setsockopt() is called from
> BPF_SOCK_OPS_HDR_OPT_LEN_CB or BPF_SOCK_OPS_WRITE_HDR_OPT_CB.
>
> Reported-by: Quan Sun <2022090917019@std.uestc.edu.cn>
> Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
> Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
> Closes: https://lore.kernel.org/bpf/d1d523c9-6901-4454-a183-94462b8f3e4e@std.uestc.edu.cn/
> Fixes: 7e41df5dbba2 ("bpf: Add a few optnames to bpf_setsockopt")
> Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
> ---
> net/core/filter.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index fcfcb72663ca..911ff04bca5a 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5833,6 +5833,11 @@ BPF_CALL_5(bpf_sock_ops_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
>  	if (!is_locked_tcp_sock_ops(bpf_sock))
>  		return -EOPNOTSUPP;
>
> +	if ((bpf_sock->op == BPF_SOCK_OPS_HDR_OPT_LEN_CB ||
> +	     bpf_sock->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB) &&
> +	    IS_ENABLED(CONFIG_INET) && level == SOL_TCP && optname == TCP_NODELAY)
> +		return -EOPNOTSUPP;
> +
>  	return _bpf_setsockopt(bpf_sock->sk, level, optname, optval, optlen);
>  }
>
A short comment above the new check would help, e.g.:

	/* TCP_NODELAY triggers tcp_push_pending_frames() and re-enters these
	 * callbacks.
	 */

Also, as Martin pointed out before, BPF_SOCK_OPS_HDR_OPT_LEN_CB and
BPF_SOCK_OPS_WRITE_HDR_OPT_CB can only be produced under CONFIG_INET, so
the IS_ENABLED(CONFIG_INET) check is dead code.
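
To make both suggestions concrete, here is roughly the shape I have in mind,
written as a standalone userspace sketch rather than a tested kernel hunk. The
MODEL_* constants and check_setsockopt() are hypothetical stand-ins for the
uapi definitions and for the check in bpf_sock_ops_setsockopt(); the real
code would keep using the kernel names.

```c
#include <errno.h>

/* Local stand-ins so the sketch builds in userspace. They are assumptions
 * for illustration; the real definitions live in the kernel headers.
 */
enum {
	MODEL_HDR_OPT_LEN_CB   = 14,
	MODEL_WRITE_HDR_OPT_CB = 15,
};
#define MODEL_SOL_TCP     6
#define MODEL_TCP_NODELAY 1

/* Returns -EOPNOTSUPP when the request must be rejected, 0 when it would
 * be passed on to _bpf_setsockopt().
 */
int check_setsockopt(int op, int level, int optname)
{
	/* TCP_NODELAY triggers tcp_push_pending_frames() and re-enters these
	 * callbacks.
	 */
	if ((op == MODEL_HDR_OPT_LEN_CB || op == MODEL_WRITE_HDR_OPT_CB) &&
	    level == MODEL_SOL_TCP && optname == MODEL_TCP_NODELAY)
		return -EOPNOTSUPP;

	return 0;
}
```

Note there is no CONFIG_INET test in the condition: the two ops alone are
enough to scope the rejection, and other ops or other optnames still go
through.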