From: Martin KaFai Lau <martin.lau@linux.dev>
To: Kuniyuki Iwashima <kuniyu@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
Stanislav Fomichev <sdf@fomichev.me>,
Eric Dumazet <edumazet@google.com>,
Neal Cardwell <ncardwell@google.com>,
Willem de Bruijn <willemb@google.com>,
Tenzin Ukyab <ukyab@berkeley.edu>,
Kuniyuki Iwashima <kuni1840@gmail.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH v3 bpf-next 03/11] bpf: tcp: Support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB.
Date: Tue, 26 May 2026 13:34:03 -0700
Message-ID: <202652620632.prOx.martin.lau@linux.dev>
In-Reply-To: <20260523083001.2911931-4-kuniyu@google.com>
On Sat, May 23, 2026 at 08:29:32AM +0000, Kuniyuki Iwashima wrote:
> When a TCP skb is queued to sk->sk_receive_queue, BPF SOCK_OPS
> prog can be called with BPF_SOCK_OPS_RCVQ_CB.
>
> In this hook, we want to parse the RPC descriptor in the skb
> and adjust sk->sk_rcvlowat based on the RPC frame size.
>
> However, we cannot access payload via bpf_sock_ops.data on
> modern NICs with TCP header/data split on as the payload is
> not placed in the linear area.
>
> Let's support bpf_skb_load_bytes() for BPF_SOCK_OPS_RCVQ_CB.
>
> Three notes:
>
> 1) bpf_sock_ops_kern.skb will be NULL when the BPF prog is
> invoked from recvmsg().
>
> 2) Access to bpf_sock_ops.data will be disabled by passing
> 0 end_offset to bpf_skops_init_skb().
>
> 3) ____bpf_skb_load_bytes() is called directly instead of
> __bpf_skb_load_bytes() to allow compilers to inline it
> instead of generating a tail-call.
Some observations below.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---
> v2: Explain why using ____ version instead of __
> ---
> net/core/filter.c | 34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 4a50fe2cd863..fa8a7c7d86eb 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -7760,6 +7760,38 @@ static const struct bpf_func_proto bpf_sk_assign_proto = {
> .arg3_type = ARG_ANYTHING,
> };
>
> +BPF_CALL_4(bpf_sock_ops_skb_load_bytes, struct bpf_sock_ops_kern *, bpf_sock,
> + u32, offset, void *, to, u32, len)
> +{
> + int err;
> +
> + if (bpf_sock->op != BPF_SOCK_OPS_RCVQ_CB) {
The bpf_dynptr_from_skb() and bpf_dynptr_slice() kfuncs could also be
considered. That would leave one less bpf_sock->op check to maintain in
filter.c and could also avoid a data copy. The tradeoff: bpf_cast_to_kern_ctx()
already gives a trusted skops_kern pointer, but reaching skops_kern->skb from
it will need changes in verifier.c (e.g. in type_is_trusted_or_null).

If this new rcvq callback is added to the 'bpf_tcp_ops' proposal [1],
all of this goes away. 'struct sk_buff *skb' can be passed directly to an
ops of 'bpf_tcp_ops'. Supporting '*skb' in a struct_ops has already
been done in bpf_qdisc.
[1]: https://lore.kernel.org/bpf/20260519215841.2984970-11-martin.lau@linux.dev/
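
For readers following the copy-avoidance point above, here is a minimal
userspace sketch of the difference between a copy-always accessor and a
slice-style accessor. It is not kernel code and not the implementation of
either API. struct toy_pkt, toy_load_bytes() and toy_slice() are hypothetical
names invented for illustration. The sketch assumes that a slice may point
directly into the first (linear) segment and falls back to copying into the
caller's buffer when the requested range leaves it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a packet split into a linear area and one
 * non-linear fragment, e.g. with header/data split enabled. */
struct toy_pkt {
	const uint8_t *head;	/* linear area */
	size_t head_len;
	const uint8_t *frag;	/* non-linear payload */
	size_t frag_len;
};

/* Copy-always access: the caller gets its own copy of [off, off + len). */
static int toy_load_bytes(const struct toy_pkt *p, size_t off,
			  void *to, size_t len)
{
	size_t total = p->head_len + p->frag_len;
	uint8_t *dst = to;

	if (off > total || len > total - off)
		return -1;

	if (off < p->head_len) {
		/* Copy the part that lives in the linear area first. */
		size_t n = p->head_len - off;

		if (n > len)
			n = len;
		memcpy(dst, p->head + off, n);
		dst += n;
		off += n;
		len -= n;
	}
	if (len)
		memcpy(dst, p->frag + (off - p->head_len), len);
	return 0;
}

/* Slice-style access (assumed model): return a pointer into the linear
 * area when the whole range fits there, otherwise copy into buf and
 * return buf. Returns NULL when the range is out of bounds. */
static const void *toy_slice(const struct toy_pkt *p, size_t off,
			     void *buf, size_t len)
{
	size_t total = p->head_len + p->frag_len;

	if (off > total || len > total - off)
		return NULL;
	if (len <= p->head_len && off <= p->head_len - len)
		return p->head + off;	/* no copy needed */
	if (toy_load_bytes(p, off, buf, len))
		return NULL;
	return buf;
}
```

In this model the caller of toy_slice() always supplies a buffer but only
pays for the copy when the range is not fully in the linear area, whereas
toy_load_bytes() copies on every call.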