public inbox for netdev@vger.kernel.org
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Jason Xing <kerneljasonxing@gmail.com>,
	 davem@davemloft.net,  edumazet@google.com,  kuba@kernel.org,
	 pabeni@redhat.com,  horms@kernel.org,  willemb@google.com,
	 martin.lau@kernel.org
Cc: netdev@vger.kernel.org,  bpf@vger.kernel.org,
	 Jason Xing <kernelxing@tencent.com>,
	 Yushan Zhou <katrinzhou@tencent.com>
Subject: Re: [PATCH net-next v2 3/4] bpf-timestamp: keep track of the skb when wait_for_space occurs
Date: Sun, 05 Apr 2026 22:28:35 -0400
Message-ID: <willemdebruijn.kernel.2ff565d2f9e7f@gmail.com>
In-Reply-To: <20260404150452.83904-4-kerneljasonxing@gmail.com>

Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> This patch is part 1 of 2 of the push-level granularity feature.
> 
> Tag the skb in tcp_sendmsg_locked() when wait_for_space occurs even
> though it might not carry the last byte of the sendmsg.
> 
> Prior to this patch, BPF timestamping cannot cover this case. The
> following steps reproduce it:
> 1) skb A is the current last skb before entering wait_for_space process
> 2) tcp_push() pushes A without any tag
> 3) A is transmitted from TCP to driver without putting any skb carrying
>    timestamps in the error queue, like SCHED, DRV/HARDWARE.
> 4) sk_stream_wait_memory() sleeps for a while and then returns with an
>    error code. Note that the socket lock is released.
> 5) skb A finally gets acked and removed from the rtx queue.
> 6) tcp_sendmsg_locked() continues: it jumps (goto) to the 'do_error'
>    label and then to the 'out' label.
> 7) at this moment, skb A turns out to be the last one in this send
>    syscall, and misses the following tcp_bpf_tx_timestamp() opportunity
>    before the final tcp_push()
> 8) the BPF script fails to see any timestamps this time
> 
> Signed-off-by: Yushan Zhou <katrinzhou@tencent.com>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
>  net/ipv4/tcp.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index c603b90057f6..7d030a11d004 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1400,9 +1400,11 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>  wait_for_space:
>  		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
>  		tcp_remove_empty_skb(sk);
> -		if (copied)
> +		if (copied) {
> +			tcp_bpf_tx_timestamp(sk);
>  			tcp_push(sk, flags & ~MSG_MORE, mss_now,
>  				 TCP_NAGLE_PUSH, size_goal);

Now the number of tracked skbs becomes unpredictable, varying with
memory pressure.

That sounds hard to use to me, especially if these extra pushes
cannot be identified as such.

Perhaps if all skbs from the same sendmsg call can be identified,
that would help explain patterns in the data resulting from these
uncommon extra data points.
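
As a rough illustration of that idea (not part of this series), assume
each tracked skb reported to the consumer carried a hypothetical
sendmsg_id shared by all skbs from one sendmsg call. No such field
exists in the kernel or in the BPF timestamping interface today;
struct push_event and both helpers below are made up for this sketch.
A user-space consumer could then tell the extra pushes apart:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record emitted once per tracked skb.
 * Both fields are assumptions for illustration only. */
struct push_event {
	unsigned int sendmsg_id;      /* assumed: identical for all skbs of one sendmsg() */
	unsigned long long tstamp_ns; /* assumed: timestamp reported for this skb */
};

/* Count how many tracked skbs belong to the given sendmsg call. */
static size_t pushes_in_call(const struct push_event *ev, size_t n,
			     unsigned int sendmsg_id)
{
	size_t i, count = 0;

	assert(ev != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (ev[i].sendmsg_id == sendmsg_id)
			count++;
	return count;
}

/* Tracked skbs beyond the single one normally expected per call,
 * e.g. those tagged on the wait_for_space path. */
static size_t extra_pushes(const struct push_event *ev, size_t n,
			   unsigned int sendmsg_id)
{
	size_t count = pushes_in_call(ev, n, sendmsg_id);

	return count > 1 ? count - 1 : 0;
}
```

With such an id, a call that produced two events would show one extra
push, and the consumer could attribute it to memory pressure rather
than treat it as an unrelated data point.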

> +		}
>  
>  		err = sk_stream_wait_memory(sk, &timeo);
>  		if (err != 0)
> -- 
> 2.41.3
> 



