From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
dsahern@kernel.org, willemdebruijn.kernel@gmail.com,
willemb@google.com, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, eddyz87@gmail.com, song@kernel.org,
yonghong.song@linux.dev, john.fastabend@gmail.com,
kpsingh@kernel.org, sdf@fomichev.me, haoluo@google.com,
jolsa@kernel.org, horms@kernel.org, bpf@vger.kernel.org,
netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v8 10/12] bpf: make TCP tx timestamp bpf extension work
Date: Wed, 5 Feb 2025 16:47:04 -0800
Message-ID: <0a8e7b84-bab6-4852-8616-577d9b561f4c@linux.dev>
In-Reply-To: <CAL+tcoCQ165Y4R7UWG=J=8e=EzwFLxSX3MQPOv=kOS3W1Q7R0A@mail.gmail.com>
On 2/5/25 4:12 PM, Jason Xing wrote:
> On Thu, Feb 6, 2025 at 5:57 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 2/4/25 5:57 PM, Jakub Kicinski wrote:
>>> On Wed, 5 Feb 2025 02:30:22 +0800 Jason Xing wrote:
>>>> + if (cgroup_bpf_enabled(CGROUP_SOCK_OPS) &&
>>>> + SK_BPF_CB_FLAG_TEST(sk, SK_BPF_CB_TX_TIMESTAMPING) && skb) {
>>>> + struct skb_shared_info *shinfo = skb_shinfo(skb);
>>>> + struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);
>>>> +
>>>> + tcb->txstamp_ack_bpf = 1;
>>>> + shinfo->tx_flags |= SKBTX_BPF;
>>>> + shinfo->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1;
>>>> + }
>>>
>>> If BPF program is attached we'll timestamp all skbs? Am I reading this
>>> right?
>>
>> If the attached bpf program explicitly turns on the SK_BPF_CB_TX_TIMESTAMPING
>> bit of a sock, then all skbs of this sock will be tx timestamp-ed.
>
> Martin, I'm afraid it's not like what you expect. Only the last
> portion of the sendmsg will enter the above function which means if
> the size of sendmsg is large, only the last skb will be set SKBTX_BPF
> and be timestamped.
Sure. The last skb of a large msg, and more skbs when the msgs are small (or
when MSG_EOR is used). My point is that attaching a bpf prog alone is not
enough. The SK_BPF_CB_TX_TIMESTAMPING bit still needs to be turned on.
>
>>
>>>
>>> Wouldn't it be better to let BPF_SOCK_OPS_TS_SND_CB return whether it's
>>> interested in tracing current packet all the way thru the stack?
>>
>> I like this idea. It can give the BPF prog a chance to do skb sampling on a
>> particular socket.
>>
>> The return value of BPF_SOCK_OPS_TS_SND_CB (or any cgroup BPF prog return value)
>> already has another usage, which its return value is currently enforced by the
>> verifier. It is better not to convolute it further.
>>
>> I don't prefer to add more use cases to skops->reply either, which is an union
>> of args[4], such that later progs (in the cgrp prog array) may lose the args value.
>>
>> Jason, instead of always setting SKBTX_BPF and txstamp_ack_bpf in the kernel, a
>> new BPF kfunc can be added so that the BPF prog can call it to selectively set
>> SKBTX_BPF and txstamp_ack_bpf in some skb.
>
> Agreed because at netdev 0x19 I have an explicit plan to share the
> experience from our company about how to trace all the skbs which were
> completed through a kernel module. It's how we use in production
> especially for debug or diagnose use.
This is fine. The bpf prog can still do that by calling the kfunc. I don't see
why moving the bit setting into a kfunc would make the whole set stop working.
> I'm not knowledgeable enough about BPF, so I'd like to know if there
> are some functions that I can take as good examples?
>
> I think it's a standalone and good feature, can I handle it after this series?
Unfortunately, no. Once the default is on, it cannot be changed later.
I think Jakub's suggestion to allow the bpf prog to selectively choose which
skb to timestamp is useful, so I suggested a way to do it.