From: Stanislav Fomichev <sdf.kernel@gmail.com>
To: Kuniyuki Iwashima <kuniyu@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	 Martin KaFai Lau <martin.lau@linux.dev>,
	Eduard Zingerman <eddyz87@gmail.com>,
	 Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Yonghong Song <yonghong.song@linux.dev>,
	 John Fastabend <john.fastabend@gmail.com>,
	Stanislav Fomichev <sdf@fomichev.me>,
	 Eric Dumazet <edumazet@google.com>,
	Neal Cardwell <ncardwell@google.com>,
	 Willem de Bruijn <willemb@google.com>,
	Tenzin Ukyab <ukyab@berkeley.edu>,
	 Kuniyuki Iwashima <kuni1840@gmail.com>,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH v1 bpf-next 7/8] bpf: tcp: Add SOCK_OPS rcvlowat hook.
Date: Fri, 8 May 2026 08:28:10 -0700
Message-ID: <af4AO2-9jwluWuik@devvm7509.cco0.facebook.com>
In-Reply-To: <20260508073355.3916746-8-kuniyu@google.com>

On 05/08, Kuniyuki Iwashima wrote:
> Now, it is time to add the new hooks for BPF_SOCK_OPS_RCVLOWAT_CB.
> 
> Let's invoke the BPF SOCK_OPS prog when
> 
>   1. TCP stack enqueues skb to sk->sk_receive_queue
>      -> tcp_queue_rcv(), tcp_ofo_queue(), and tcp_fastopen_add_skb()
> 
>   2. TCP recvmsg() completes
>      -> __tcp_cleanup_rbuf()
> 
> This will allow the BPF prog to parse each skb and dynamically
> adjust sk->sk_rcvlowat to suppress unnecessary EPOLLIN wakeups
> until sufficient data (e.g., a full RPC frame) is available
> in the receive queue.
> 
> Note that the direct access to bpf_sock_ops.data is intentionally
> disabled by passing 0 as end_offset.
> 
> Instead, the BPF prog is supposed to use bpf_skb_load_bytes()
> with bpf_sock_ops because payload is not in the linear area
> with TCP header/data split on and skb may contain a RPC
> descriptor in skb frag.  This also simplifies the BPF prog.
> 
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
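
To make the quoted description concrete, here is a minimal userspace sketch
of the decision such a prog would make. It is not taken from the series. The
4-byte big-endian length prefix, the name rpc_frame_lowat() and the
max_lowat cap are all assumptions for illustration; a real prog would fetch
the header bytes with bpf_skb_load_bytes() rather than from a flat buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format (an assumption, not from the series):
 * a 4-byte big-endian payload length followed by the payload.
 */
#define RPC_HDR_LEN 4u

/*
 * Return how many queued bytes the reader should wait for before a
 * wakeup is useful: the header alone while it is still incomplete,
 * otherwise header plus payload, capped at max_lowat.
 * The caller must pass max_lowat >= RPC_HDR_LEN.
 */
static uint32_t rpc_frame_lowat(const uint8_t *queue, size_t queued,
				uint32_t max_lowat)
{
	uint32_t payload_len;

	/* Not enough data to read the length prefix yet. */
	if (queued < RPC_HDR_LEN)
		return RPC_HDR_LEN;

	payload_len = ((uint32_t)queue[0] << 24) |
		      ((uint32_t)queue[1] << 16) |
		      ((uint32_t)queue[2] << 8) |
		      (uint32_t)queue[3];

	/* Compare before adding so the sum cannot wrap around. */
	if (payload_len > max_lowat - RPC_HDR_LEN)
		return max_lowat;

	return RPC_HDR_LEN + payload_len;
}
```

In the enqueue case the result would be applied through whatever interface
the series provides for adjusting sk->sk_rcvlowat. The cap is there because
the length prefix is peer-controlled data.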

I was reading the series expecting to find some skb_queue_walk-like
implementation, but since it's a cgroup hook we obviously don't
need to do that. So at this point BPF_SOCK_OPS_RCVLOWAT_CB_FLAG
is basically an "rx queue skb" hook, right? If so, should we make
the name more generic? There is really nothing lowat-specific
here besides your new kfunc to read the payload.

> ---
>  include/net/tcp.h       | 14 ++++++++++++++
>  net/ipv4/tcp.c          |  2 ++
>  net/ipv4/tcp_fastopen.c |  2 ++
>  net/ipv4/tcp_input.c    | 10 ++++++++++
>  4 files changed, 28 insertions(+)
> 
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 4e9e634e276b..003e46c9b500 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -737,6 +737,20 @@ static inline struct request_sock *cookie_bpf_check(struct net *net, struct sock
>  }
>  #endif
>  
> +#ifdef CONFIG_CGROUP_BPF
> +void bpf_skops_rcvlowat(struct sock *sk, struct sk_buff *skb);
> +
> +static inline void tcp_bpf_rcvlowat(struct sock *sk, struct sk_buff *skb)
> +{
> +	if (BPF_SOCK_OPS_TEST_FLAG(tcp_sk(sk), BPF_SOCK_OPS_RCVLOWAT_CB_FLAG))
> +		bpf_skops_rcvlowat(sk, skb);
> +}
> +#else
> +static inline void tcp_bpf_rcvlowat(struct sock *sk, struct sk_buff *skb)
> +{
> +}
> +#endif
> +
>  /* From net/ipv6/syncookies.c */
>  int __cookie_v6_check(const struct ipv6hdr *iph, const struct tcphdr *th);
>  struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb);
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 1d9e52fc454f..80144b97a87a 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1602,6 +1602,8 @@ void __tcp_cleanup_rbuf(struct sock *sk, int copied)
>  		tcp_mstamp_refresh(tp);
>  		tcp_send_ack(sk);
>  	}
> +
> +	tcp_bpf_rcvlowat(sk, NULL);
>  }
>  
>  void tcp_cleanup_rbuf(struct sock *sk, int copied)
> diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
> index 471c78be5513..91bf421fc5b6 100644
> --- a/net/ipv4/tcp_fastopen.c
> +++ b/net/ipv4/tcp_fastopen.c
> @@ -281,6 +281,8 @@ void tcp_fastopen_add_skb(struct sock *sk, struct sk_buff *skb)
>  	TCP_SKB_CB(skb)->seq++;
>  	TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_SYN;
>  
> +	tcp_bpf_rcvlowat(sk, skb);
> +

I'm also not sure about the particular placement of some of these.
For example here, why call the hook before updating tp? Why not after?

(The same question applies to tcp_ofo_queue.)
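
To show why the order can matter, here is a toy userspace model, not kernel
code. It assumes the callback derives the readable byte count from
rcv_nxt - copied_seq, which may or may not be what a real prog does.
struct toy_tp and both helpers are hypothetical names.

```c
#include <stdint.h>

/* Toy stand-in for two sequence counters; not the kernel's struct tcp_sock. */
struct toy_tp {
	uint32_t rcv_nxt;	/* next sequence number expected from the peer */
	uint32_t copied_seq;	/* first sequence number not yet read by the app */
};

/* Bytes a callback would consider readable if it looks only at tp state. */
static uint32_t toy_readable(const struct toy_tp *tp)
{
	return tp->rcv_nxt - tp->copied_seq;
}

/*
 * Queue an skb that ends at end_seq and return what the hook observes.
 * hook_first != 0 models calling the hook before tp is updated.
 */
static uint32_t toy_enqueue(struct toy_tp *tp, uint32_t end_seq, int hook_first)
{
	uint32_t seen;

	if (hook_first) {
		seen = toy_readable(tp);	/* does not count the new skb */
		tp->rcv_nxt = end_seq;
	} else {
		tp->rcv_nxt = end_seq;
		seen = toy_readable(tp);	/* counts the new skb */
	}

	return seen;
}
```

With rcv_nxt == copied_seq == 1000 and an skb ending at 1100, the hook-first
order observes 0 readable bytes while the update-first order observes 100.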
