From: Jiayuan Chen <jiayuan.chen@linux.dev>
To: mkf <message1887@163.com>, bpf@vger.kernel.org
Cc: Quan Sun <2022090917019@std.uestc.edu.cn>,
Yinhao Hu <dddddd@hust.edu.cn>,
Kaiyan Mei <M202472210@hust.edu.cn>,
Dongliang Mu <dzm91@hust.edu.cn>,
Eric Dumazet <edumazet@google.com>,
Neal Cardwell <ncardwell@google.com>,
Kuniyuki Iwashima <kuniyu@google.com>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>, Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, David Ahern <dsahern@kernel.org>,
netdev@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH bpf] bpf,tcp: avoid infinite recursion in BPF_SOCK_OPS_HDR_OPT_LEN_CB
Date: Wed, 15 Apr 2026 09:47:30 +0800 [thread overview]
Message-ID: <0b3a3a41-f709-4414-8a5d-d2eb4959db3f@linux.dev> (raw)
In-Reply-To: <42c1fed84a84519c2432163aa46f587f2d624fef.camel@163.com>
On 4/14/26 11:37 PM, mkf wrote:
> On Tue, 2026-04-14 at 18:57 +0800, Jiayuan Chen wrote:
[...]
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -475,12 +475,21 @@ struct tcp_sock {
> u8 bpf_sock_ops_cb_flags; /* Control calling BPF programs
> * values defined in uapi/linux/tcp.h
> */
> - u8 bpf_chg_cc_inprogress:1; /* In the middle of
> + u8 bpf_chg_cc_inprogress:1, /* In the middle of
> * bpf_setsockopt(TCP_CONGESTION),
> * it is to avoid the bpf_tcp_cc->init()
> * to recur itself by calling
> * bpf_setsockopt(TCP_CONGESTION, "itself").
> */
> + bpf_hdr_opt_len_cb_inprogress:1; /* It is set before invoking the
> + * callback so that a nested
> + * bpf_setsockopt(TCP_NODELAY) or
> + * bpf_setsockopt(TCP_CORK) cannot
> + * trigger tcp_push_pending_frames(),
> + * which would call tcp_current_mss()
> + * -> bpf_skops_hdr_opt_len(), causing
> + * infinite recursion.
> + */
> #define BPF_SOCK_OPS_TEST_FLAG(TP, ARG) (TP->bpf_sock_ops_cb_flags & ARG)
> #else
> #define BPF_SOCK_OPS_TEST_FLAG(TP, ARG) 0
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 78b548158fb0..518699429a7a 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5483,6 +5483,10 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
> if (sk->sk_protocol != IPPROTO_TCP)
> return -EINVAL;
>
> + if ((optname == TCP_NODELAY || optname == TCP_CORK) &&
> + tcp_sk(sk)->bpf_hdr_opt_len_cb_inprogress)
> + return -EBUSY;
> +
> TCP_CORK is not support in sol_tcp_sockopt(), return -EINVAL by default. and put the check here
> could also prevent us from calling getsockopt(TCP_NODELAY) below.
>
>> switch (optname) {
>> case TCP_NODELAY:
>> case TCP_MAXSEG:
>> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
>> index dafb63b923d0..fb06c464ac16 100644
>> --- a/net/ipv4/tcp_minisocks.c
>> +++ b/net/ipv4/tcp_minisocks.c
>> @@ -663,6 +663,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
>> RCU_INIT_POINTER(newtp->fastopen_rsk, NULL);
>>
>> newtp->bpf_chg_cc_inprogress = 0;
>> + newtp->bpf_hdr_opt_len_cb_inprogress = 0;
>> tcp_bpf_clone(sk, newsk);
>>
>> __TCP_INC_STATS(sock_net(sk), TCP_MIB_PASSIVEOPENS);
>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>> index 326b58ff1118..c9654e690e1a 100644
>> --- a/net/ipv4/tcp_output.c
>> +++ b/net/ipv4/tcp_output.c
>> @@ -475,6 +475,7 @@ static void bpf_skops_hdr_opt_len(struct sock *sk, struct sk_buff *skb,
>> unsigned int *remaining)
>> {
>> struct bpf_sock_ops_kern sock_ops;
>> + struct tcp_sock *tp = tcp_sk(sk);
>> int err;
>>
>> if (likely(!BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk),
>> @@ -519,7 +520,9 @@ static void bpf_skops_hdr_opt_len(struct sock *sk, struct sk_buff *skb,
>> if (skb)
>> bpf_skops_init_skb(&sock_ops, skb, 0);
>>
>> + tp->bpf_hdr_opt_len_cb_inprogress = 1;
> we check the BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG before calling BPF_CGROUP_RUN_PROG_SOCK_OPS_SK,
> could this flag use for the same purpose? so we don't need to add an extra field.
>
> if (likely(!BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk),
> BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG)) ||
> !*remaining)
> return;
Hi Martin, I saw your patch. Your solution is better, please ignore mine :)
next prev parent reply other threads:[~2026-04-15 1:48 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-14 10:57 [PATCH bpf] bpf,tcp: avoid infinite recursion in BPF_SOCK_OPS_HDR_OPT_LEN_CB Jiayuan Chen
2026-04-14 14:33 ` Alexei Starovoitov
2026-04-14 15:37 ` mkf
2026-04-15 1:47 ` Jiayuan Chen [this message]
2026-04-15 12:52 ` KaFai Wan
2026-04-15 18:55 ` Martin KaFai Lau
2026-04-15 20:47 ` KaFai Wan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0b3a3a41-f709-4414-8a5d-d2eb4959db3f@linux.dev \
--to=jiayuan.chen@linux.dev \
--cc=2022090917019@std.uestc.edu.cn \
--cc=M202472210@hust.edu.cn \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=corbet@lwn.net \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=dddddd@hust.edu.cn \
--cc=dsahern@kernel.org \
--cc=dzm91@hust.edu.cn \
--cc=eddyz87@gmail.com \
--cc=edumazet@google.com \
--cc=haoluo@google.com \
--cc=horms@kernel.org \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=kuniyu@google.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=martin.lau@linux.dev \
--cc=message1887@163.com \
--cc=ncardwell@google.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=skhan@linuxfoundation.org \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox