All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Sitnicki <jakub@cloudflare.com>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: bpf@vger.kernel.org,  John Fastabend <john.fastabend@gmail.com>,
	 "David S. Miller" <davem@davemloft.net>,
	 Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	 Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
	 Neal Cardwell <ncardwell@google.com>,
	Kuniyuki Iwashima <kuniyu@google.com>,
	 David Ahern <dsahern@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	 Eduard Zingerman <eddyz87@gmail.com>,
	 Alexei Starovoitov <ast@kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	 Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>,
	 Yonghong Song <yonghong.song@linux.dev>,
	 KP Singh <kpsingh@kernel.org>,
	 Stanislav Fomichev <sdf@fomichev.me>,
	 Hao Luo <haoluo@google.com>,  Jiri Olsa <jolsa@kernel.org>,
	 Shuah Khan <shuah@kernel.org>,  Michal Luczaj <mhal@rbox.co>,
	 Cong Wang <cong.wang@bytedance.com>,
	 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	 linux-kselftest@vger.kernel.org
Subject: Re: [PATCH bpf-next v9 2/3] bpf, sockmap: Fix FIONREAD for sockmap
Date: Mon, 26 Jan 2026 11:53:01 +0100	[thread overview]
Message-ID: <87zf60sl0y.fsf@cloudflare.com> (raw)
In-Reply-To: <20260124113314.113584-3-jiayuan.chen@linux.dev> (Jiayuan Chen's message of "Sat, 24 Jan 2026 19:32:44 +0800")

On Sat, Jan 24, 2026 at 07:32 PM +08, Jiayuan Chen wrote:
> A socket using sockmap has its own independent receive queue: ingress_msg.
> This queue may contain data from its own protocol stack or from other
> sockets.
>
> Therefore, for sockmap, relying solely on copied_seq and rcv_nxt to
> calculate FIONREAD is not enough.
>
> This patch adds a new msg_tot_len field in the psock structure to record
> the data length in ingress_msg. Additionally, we implement new ioctl
> interfaces for TCP and UDP to intercept FIONREAD operations.
>
> Note that we intentionally do not include sk_receive_queue data in the
> FIONREAD result. Data in sk_receive_queue has not yet been processed by
> the BPF verdict program, and may be redirected to other sockets or
> dropped. Including it would create semantic ambiguity since this data
> may never be readable by the user.
>
> Unix and VSOCK sockets have similar issues, but fixing them is outside
> the scope of this patch as it would require more intrusive changes.
>
> Previous work by John Fastabend made some efforts towards FIONREAD support:
> commit e5c6de5fa025 ("bpf, sockmap: Incorrectly handling copied_seq")
> Although the current patch is based on the previous work by John Fastabend,
> it is acceptable for our Fixes tag to point to the same commit.
>
>                                                       FD1:read()
>                                                       --  FD1->copied_seq++
>                                                           |  [read data]
>                                                           |
>                                    [enqueue data]         v
>                   [sockmap]     -> ingress to self ->  ingress_msg queue
> FD1 native stack  ------>                                 ^
> -- FD1->rcv_nxt++               -> redirect to other      | [enqueue data]
>                                        |                  |
>                                        |             ingress to FD1
>                                        v                  ^
>                                       ...                 |  [sockmap]
>                                                      FD2 native stack
>
> Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---

Thanks, Jiayuan!

Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>

  parent reply	other threads:[~2026-01-26 10:53 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-24 11:32 [PATCH bpf-next v9 0/3] bpf: Fix FIONREAD and copied_seq issues Jiayuan Chen
2026-01-24 11:32 ` [PATCH bpf-next v9 1/3] bpf, sockmap: Fix incorrect copied_seq calculation Jiayuan Chen
2026-01-24 11:32 ` [PATCH bpf-next v9 2/3] bpf, sockmap: Fix FIONREAD for sockmap Jiayuan Chen
2026-01-25  2:10   ` Alexei Starovoitov
2026-01-26 10:53   ` Jakub Sitnicki [this message]
2026-01-24 11:32 ` [PATCH bpf-next v9 3/3] bpf, selftest: Add tests for FIONREAD and copied_seq Jiayuan Chen
2026-01-26 10:59   ` Jakub Sitnicki
2026-01-27 17:20 ` [PATCH bpf-next v9 0/3] bpf: Fix FIONREAD and copied_seq issues patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zf60sl0y.fsf@cloudflare.com \
    --to=jakub@cloudflare.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=cong.wang@bytedance.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=haoluo@google.com \
    --cc=horms@kernel.org \
    --cc=jiayuan.chen@linux.dev \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=kuniyu@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=mhal@rbox.co \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.