From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
willemb@google.com, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, dsahern@kernel.org,
ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
haoluo@google.com, jolsa@kernel.org, shuah@kernel.org,
ykolal@fb.com, bpf@vger.kernel.org, netdev@vger.kernel.org,
Jason Xing <kernelxing@tencent.com>
Subject: Re: [PATCH net-next v3 02/14] net-timestamp: allow two features to work parallelly
Date: Tue, 5 Nov 2024 11:22:09 -0800 [thread overview]
Message-ID: <f27ab4ce-02df-464e-90ed-852652fb7e3e@linux.dev> (raw)
In-Reply-To: <CAL+tcoBf+kQ3_kc9x62KnHx9O+6c==_DN+6EheL82UKQ3xQN1A@mail.gmail.com>
On 11/4/24 10:22 PM, Jason Xing wrote:
> On Tue, Nov 5, 2024 at 10:09 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 11/1/24 6:32 AM, Willem de Bruijn wrote:
>>>> In udp/raw/..., I don't know how likely is the user space having "cork->tx_flags
>>>> & SKBTX_ANY_TSTAMP" set but has neither "READ_ONCE(sk->sk_tsflags) &
>>>> SOF_TIMESTAMPING_OPT_ID" nor "cork->flags & IPCORK_TS_OPT_ID" set.
>>> This is not something to rely on. OPT_ID was added relatively recently.
>>> Older applications, or any that just use the most straightforward API,
>>> will not set this.
>>
>> Good point that the OPT_ID per cmsg is very new.
>>
>> The datagram support on SOF_TIMESTAMPING_OPT_ID in sk->sk_tsflags had
>> been there for quite some time now. Is it a safe assumption that
>> most applications doing udp tx timestamping should have
>> the SOF_TIMESTAMPING_OPT_ID set to be useful?
>>
>>>
>>>> If it is
>>>> unlikely, may be we can just disallow bpf prog from directly setting
>>>> skb_shinfo(skb)->tskey for this particular skb.
>>>>
>>>> For all other cases, in __ip[6]_append_data, directly call a bpf prog and also
>>>> pass the kernel decided tskey to the bpf prog.
>>>>
>>>> The kernel passed tskey could be 0 (meaning the user space has not used it). The
>>>> bpf prog can give one for the kernel to use. The bpf prog can store the
>>>> sk_tskey_bpf in the bpf_sk_storage now. Meaning no need to add one to the struct
>>>> sock. The bpf prog does not have to start from 0 (e.g. start from U32_MAX
>>>> instead) if it helps.
>>>>
>>>> If the kernel passed tskey is not 0, the bpf prog can just use that one
>>>> (assuming the user space is doing something sane, like the value in
>>>> SCM_TS_OPT_ID won't be jumping back and front between 0 to U32_MAX). I hope this
>>>> is very unlikely also (?) but the bpf prog can probably detect this and choose
>>>> to ignore this sk.
>>> If an applications uses OPT_ID, it is unlikely that they will toggle
>>> the feature on and off on a per-packet basis. So in the common case
>>> the program could use the user-set counter or use its own if userspace
>>> does not enable the feature. In the rare case that an application does
>>> intermittently set an OPT_ID, the numbering would be erratic. This
>>> does mean that an actively malicious application could mess with admin
>>> measurements.
>>
>> All make sense. Given it is reasonable to assume the user space should either
>> has SOF_TIMESTAMPING_OPT_ID always on or always off. When it is off, the bpf
>> prog can directly provide its own tskey to be used in shinfo->tskey. The bpf
>> prog can generate the id itself without using the sk->sk_tskey, e.g. store an
>> atomic int in the bpf_sk_storage.
>
> I wonder, how can we correlate the key with each skb in the bpf
> program for non-TCP type without implementing a bpf extension for
> SCM_TS_OPT_ID? Every time the timestamp is reported, we cannot know
> which sendmsg() the skb belongs to for non-TCP cases.
SCM_TS_OPT_ID is eventually setting the shinfo->tskey.
If the shinfo->tskey is not set by the user space, the bpf prog can directly set
the shinfo->tskey. There is no need to use the sk->sk_tskey as the ID generator
also. The bpf prog can have its own id generator.
If the user space has already set the shinfo->tskey (either by sk->sk_tskey or
SCM_TS_OPT_ID), the bpf prog can just use the user space one.
If there is a weird application that flips flops between OPT_ID on/off, the bpf
prog will get confused which is fine. The bpf prog can detect this and choose to
ignore measuring this sk/skb. The bpf prog can also choose to be on the very
safe side and ignore all skb with SKBTX_ANY_TSTAMP set in txflags but with no
OPT_ID. The bpf prog can look into the details of the sk and skb to decide what
makes the most sense for its deployment.
I don't know whether it makes more sense to call the bpf prog to decide the
shinfo->{tx_flags,tskey} just before the "while (length > 0)" in
__ip[6]_append_data or it is better to call the bpf prog in ip[6]_setup_cork.
I admittedly less familiar with this code path than the tcp one.
next prev parent reply other threads:[~2024-11-05 19:22 UTC|newest]
Thread overview: 88+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-28 11:05 [PATCH net-next v3 00/14] net-timestamp: bpf extension to equip applications transparently Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 01/14] net-timestamp: reorganize in skb_tstamp_tx_output() Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 02/14] net-timestamp: allow two features to work parallelly Jason Xing
2024-10-29 23:00 ` Martin KaFai Lau
2024-10-30 1:23 ` Jason Xing
2024-10-30 1:45 ` Willem de Bruijn
2024-10-30 2:32 ` Jason Xing
2024-10-30 2:47 ` Willem de Bruijn
2024-10-30 3:04 ` Jason Xing
2024-10-30 5:37 ` Martin KaFai Lau
2024-10-30 6:42 ` Jason Xing
2024-10-30 17:15 ` Willem de Bruijn
2024-10-30 23:54 ` Jason Xing
2024-10-31 0:13 ` Jason Xing
2024-10-31 6:27 ` Martin KaFai Lau
2024-10-31 7:04 ` Jason Xing
2024-10-31 12:30 ` Willem de Bruijn
2024-10-31 13:50 ` Jason Xing
2024-10-31 23:26 ` Martin KaFai Lau
2024-11-01 7:47 ` Jason Xing
2024-11-05 1:50 ` Martin KaFai Lau
2024-11-05 3:13 ` Jason Xing
2024-11-01 13:32 ` Willem de Bruijn
2024-11-01 16:08 ` Jason Xing
2024-11-01 16:39 ` Willem de Bruijn
2024-11-05 2:09 ` Martin KaFai Lau
2024-11-05 6:22 ` Jason Xing
2024-11-05 19:22 ` Martin KaFai Lau [this message]
2024-11-06 0:17 ` Jason Xing
2024-11-06 1:09 ` Martin KaFai Lau
2024-11-06 2:51 ` Jason Xing
2024-11-07 1:19 ` Martin KaFai Lau
2024-11-07 3:31 ` Jason Xing
2024-11-07 19:05 ` Martin KaFai Lau
2024-11-06 1:11 ` Willem de Bruijn
2024-11-06 2:37 ` Jason Xing
2024-11-05 14:29 ` Willem de Bruijn
2024-11-02 13:43 ` Simon Horman
2024-11-03 0:42 ` Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 03/14] net-timestamp: open gate for bpf_setsockopt/_getsockopt Jason Xing
2024-10-29 0:59 ` Willem de Bruijn
2024-10-29 1:18 ` Jason Xing
2024-10-30 0:32 ` Martin KaFai Lau
2024-10-30 1:15 ` Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 04/14] net-timestamp: introduce TS_SCHED_OPT_CB to generate dev xmit timestamp Jason Xing
2024-10-29 0:23 ` kernel test robot
2024-10-29 1:02 ` Willem de Bruijn
2024-10-29 1:30 ` Jason Xing
2024-10-29 1:04 ` kernel test robot
2024-10-28 11:05 ` [PATCH net-next v3 05/14] net-timestamp: introduce TS_SW_OPT_CB to generate driver timestamp Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 06/14] net-timestamp: introduce TS_ACK_OPT_CB to generate tcp acked timestamp Jason Xing
2024-10-29 1:03 ` Willem de Bruijn
2024-10-29 1:19 ` Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 07/14] net-timestamp: add a new triggered point to set sk_tsflags_bpf in UDP layer Jason Xing
2024-10-29 1:07 ` Willem de Bruijn
2024-10-29 1:23 ` Jason Xing
2024-10-29 1:33 ` Willem de Bruijn
2024-10-29 3:12 ` Jason Xing
2024-10-29 15:04 ` Willem de Bruijn
2024-10-29 15:44 ` Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 08/14] net-timestamp: make bpf for tx timestamp work Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 09/14] net-timestamp: add a common helper to set tskey Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 10/14] net-timestamp: add basic support with tskey offset Jason Xing
2024-10-29 1:24 ` Willem de Bruijn
2024-10-29 2:41 ` Jason Xing
2024-10-29 15:03 ` Willem de Bruijn
2024-10-29 15:50 ` Jason Xing
2024-10-29 19:45 ` Willem de Bruijn
2024-10-30 3:27 ` Jason Xing
2024-10-30 5:42 ` Martin KaFai Lau
2024-10-30 6:50 ` Jason Xing
2024-10-31 1:17 ` Martin KaFai Lau
2024-10-31 2:41 ` Jason Xing
2024-10-31 3:27 ` Jason Xing
2024-10-31 5:52 ` Martin KaFai Lau
2024-10-31 6:16 ` Jason Xing
2024-10-31 23:50 ` Martin KaFai Lau
2024-11-01 6:33 ` Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 11/14] net-timestamp: support OPT_ID for TCP proto Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 12/14] net-timestamp: add OPT_ID for UDP proto Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 13/14] net-timestamp: use static key to control bpf extension Jason Xing
2024-10-28 11:05 ` [PATCH net-next v3 14/14] bpf: add simple bpf tests in the tx path for so_timstamping feature Jason Xing
2024-10-29 1:26 ` Willem de Bruijn
2024-10-29 1:33 ` Jason Xing
2024-10-29 1:40 ` Willem de Bruijn
2024-10-29 3:13 ` Jason Xing
2024-10-30 5:57 ` Martin KaFai Lau
2024-10-30 6:54 ` Jason Xing
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f27ab4ce-02df-464e-90ed-852652fb7e3e@linux.dev \
--to=martin.lau@linux.dev \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=eddyz87@gmail.com \
--cc=edumazet@google.com \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kerneljasonxing@gmail.com \
--cc=kernelxing@tencent.com \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=shuah@kernel.org \
--cc=song@kernel.org \
--cc=willemb@google.com \
--cc=willemdebruijn.kernel@gmail.com \
--cc=ykolal@fb.com \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.