From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
dsahern@kernel.org, willemdebruijn.kernel@gmail.com,
willemb@google.com, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, eddyz87@gmail.com, song@kernel.org,
yonghong.song@linux.dev, john.fastabend@gmail.com,
kpsingh@kernel.org, sdf@fomichev.me, haoluo@google.com,
jolsa@kernel.org, horms@kernel.org, bpf@vger.kernel.org,
netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v8 10/12] bpf: make TCP tx timestamp bpf extension work
Date: Wed, 5 Feb 2025 16:47:04 -0800
Message-ID: <0a8e7b84-bab6-4852-8616-577d9b561f4c@linux.dev>
In-Reply-To: <CAL+tcoCQ165Y4R7UWG=J=8e=EzwFLxSX3MQPOv=kOS3W1Q7R0A@mail.gmail.com>
On 2/5/25 4:12 PM, Jason Xing wrote:
> On Thu, Feb 6, 2025 at 5:57 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 2/4/25 5:57 PM, Jakub Kicinski wrote:
>>> On Wed, 5 Feb 2025 02:30:22 +0800 Jason Xing wrote:
>>>> + if (cgroup_bpf_enabled(CGROUP_SOCK_OPS) &&
>>>> + SK_BPF_CB_FLAG_TEST(sk, SK_BPF_CB_TX_TIMESTAMPING) && skb) {
>>>> + struct skb_shared_info *shinfo = skb_shinfo(skb);
>>>> + struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);
>>>> +
>>>> + tcb->txstamp_ack_bpf = 1;
>>>> + shinfo->tx_flags |= SKBTX_BPF;
>>>> + shinfo->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1;
>>>> + }
>>>
>>> If BPF program is attached we'll timestamp all skbs? Am I reading this
>>> right?
>>
>> If the attached bpf program explicitly turns on the SK_BPF_CB_TX_TIMESTAMPING
>> bit of a sock, then all skbs of this sock will be tx timestamp-ed.
>
> Martin, I'm afraid it's not like what you expect. Only the last
> portion of the sendmsg will enter the above function which means if
> the size of sendmsg is large, only the last skb will be set SKBTX_BPF
> and be timestamped.
Sure. Only the last skb of a large msg, and more skbs when the msgs are small
(or MSG_EOR is used). My point is that attaching a bpf prog alone is not enough.
The SK_BPF_CB_TX_TIMESTAMPING bit still needs to be turned on for the sock.
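
To make that gating concrete, here is a rough userspace model of the quoted
hunk. It is only a sketch, not kernel code: model_sock, model_skb and
model_tx_timestamp() are made-up stand-ins, and only the condition and the
tskey arithmetic are taken from the patch above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel objects in the hunk. */
struct model_sock {
	bool cgroup_sockops_attached; /* models cgroup_bpf_enabled(CGROUP_SOCK_OPS) */
	bool tx_timestamping_on;      /* models the SK_BPF_CB_TX_TIMESTAMPING bit */
};

struct model_skb {
	uint32_t seq;         /* models TCP_SKB_CB(skb)->seq */
	uint32_t len;         /* models skb->len */
	bool skbtx_bpf;       /* models SKBTX_BPF in shinfo->tx_flags */
	bool txstamp_ack_bpf; /* models tcb->txstamp_ack_bpf */
	uint32_t tskey;       /* models shinfo->tskey */
};

/*
 * Mirrors the condition in the quoted hunk: an attached prog is not
 * sufficient, the per-sock bit must be set and an skb must be present.
 * Returns true when the skb was marked for bpf tx timestamping.
 */
static bool model_tx_timestamp(const struct model_sock *sk, struct model_skb *skb)
{
	if (!sk->cgroup_sockops_attached || !sk->tx_timestamping_on || !skb)
		return false;

	skb->txstamp_ack_bpf = true;
	skb->skbtx_bpf = true;
	/* Key is the sequence number of the last byte carried by this skb. */
	skb->tskey = skb->seq + skb->len - 1;
	return true;
}
```

With cgroup_sockops_attached set but tx_timestamping_on clear, the skb is left
untouched. Once both are set, an skb with seq 1000 and len 100 gets tskey 1099.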
>
>>
>>>
>>> Wouldn't it be better to let BPF_SOCK_OPS_TS_SND_CB return whether it's
>>> interested in tracing current packet all the way thru the stack?
>>
>> I like this idea. It can give the BPF prog a chance to do skb sampling on a
>> particular socket.
>>
>> The return value of BPF_SOCK_OPS_TS_SND_CB (or any cgroup BPF prog return value)
>> already has another usage, which its return value is currently enforced by the
>> verifier. It is better not to convolute it further.
>>
>> I don't prefer to add more use cases to skops->reply either, which is an union
>> of args[4], such that later progs (in the cgrp prog array) may lose the args value.
>>
>> Jason, instead of always setting SKBTX_BPF and txstamp_ack_bpf in the kernel, a
>> new BPF kfunc can be added so that the BPF prog can call it to selectively set
>> SKBTX_BPF and txstamp_ack_bpf in some skb.
>
> Agreed because at netdev 0x19 I have an explicit plan to share the
> experience from our company about how to trace all the skbs which were
> completed through a kernel module. It's how we use in production
> especially for debug or diagnose use.
This is fine. The bpf prog can still do that by calling the kfunc. I don't see
why moving the bit setting into a kfunc would make the whole set stop working.
> I'm not knowledgeable enough about BPF, so I'd like to know if there
> are some functions that I can take as good examples?
>
> I think it's a standalone and good feature, can I handle it after this series?
Unfortunately, no. Once the default is on, it cannot be changed afterwards.
I think Jakub's suggestion to allow the bpf prog to selectively choose which
skb to timestamp is useful, so I suggested a way to do it.
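
As a rough illustration of the opt-in approach, here is a userspace model.
All names are hypothetical: model_enable_tx_tstamp() stands in for the
proposed kfunc (its real name and signature are not settled in this thread),
and model_ts_snd_cb() stands in for a bpf prog handling the send callback
that samples one skb in every N.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-skb bits discussed above. */
struct sample_skb {
	bool skbtx_bpf;       /* models SKBTX_BPF */
	bool txstamp_ack_bpf; /* models txstamp_ack_bpf */
};

/* Hypothetical stand-in for the proposed kfunc: sets both bits on one skb. */
static void model_enable_tx_tstamp(struct sample_skb *skb)
{
	skb->skbtx_bpf = true;
	skb->txstamp_ack_bpf = true;
}

/* Per-socket sampling state that a prog could keep, e.g. in a map. */
struct sampler {
	uint32_t every; /* timestamp one skb out of this many; 0 disables */
	uint32_t seen;  /* skbs seen so far */
};

/*
 * Stand-in for the prog's send callback. The kernel no longer sets the
 * bits by default; the prog decides per skb and calls the kfunc model.
 * Returns true when this skb was chosen.
 */
static bool model_ts_snd_cb(struct sampler *s, struct sample_skb *skb)
{
	bool pick;

	if (s->every == 0)
		return false;

	pick = (s->seen % s->every) == 0;
	s->seen++;
	if (pick)
		model_enable_tx_tstamp(skb);
	return pick;
}
```

Setting every to 1 marks every skb that reaches the callback, which covers the
trace-everything use case. A larger value gives the per-socket sampling that
Jakub asked about.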