From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	willemb@google.com, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, dsahern@kernel.org,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, shuah@kernel.org,
	ykolal@fb.com, bpf@vger.kernel.org, netdev@vger.kernel.org,
	Jason Xing <kernelxing@tencent.com>
Subject: Re: [PATCH net-next v3 02/14] net-timestamp: allow two features to work parallelly
Date: Tue, 5 Nov 2024 11:22:09 -0800
Message-ID: <f27ab4ce-02df-464e-90ed-852652fb7e3e@linux.dev>
In-Reply-To: <CAL+tcoBf+kQ3_kc9x62KnHx9O+6c==_DN+6EheL82UKQ3xQN1A@mail.gmail.com>

On 11/4/24 10:22 PM, Jason Xing wrote:
> On Tue, Nov 5, 2024 at 10:09 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 11/1/24 6:32 AM, Willem de Bruijn wrote:
>>>> In udp/raw/..., I don't know how likely is the user space having "cork->tx_flags
>>>> & SKBTX_ANY_TSTAMP" set but has neither "READ_ONCE(sk->sk_tsflags) &
>>>> SOF_TIMESTAMPING_OPT_ID" nor "cork->flags & IPCORK_TS_OPT_ID" set.
>>> This is not something to rely on. OPT_ID was added relatively recently.
>>> Older applications, or any that just use the most straightforward API,
>>> will not set this.
>>
>> Good point that the OPT_ID per cmsg is very new.
>>
>> The datagram support on SOF_TIMESTAMPING_OPT_ID in sk->sk_tsflags had
>> been there for quite some time now. Is it a safe assumption that
>> most applications doing udp tx timestamping should have
>> the SOF_TIMESTAMPING_OPT_ID set to be useful?
>>
>>>
>>>> If it is
>>>> unlikely, maybe we can just disallow bpf prog from directly setting
>>>> skb_shinfo(skb)->tskey for this particular skb.
>>>>
>>>> For all other cases, in __ip[6]_append_data, directly call a bpf prog and also
>>>> pass the kernel decided tskey to the bpf prog.
>>>>
>>>> The kernel passed tskey could be 0 (meaning the user space has not used it). The
>>>> bpf prog can give one for the kernel to use. The bpf prog can store the
>>>> sk_tskey_bpf in the bpf_sk_storage now. Meaning no need to add one to the struct
>>>> sock. The bpf prog does not have to start from 0 (e.g. start from U32_MAX
>>>> instead) if it helps.
>>>>
>>>> If the kernel passed tskey is not 0, the bpf prog can just use that one
>>>> (assuming the user space is doing something sane, like the value in
>>>> SCM_TS_OPT_ID won't be jumping back and forth between 0 and U32_MAX). I hope this
>>>> is very unlikely also (?) but the bpf prog can probably detect this and choose
>>>> to ignore this sk.
>>> If an application uses OPT_ID, it is unlikely that they will toggle
>>> the feature on and off on a per-packet basis. So in the common case
>>> the program could use the user-set counter or use its own if userspace
>>> does not enable the feature. In the rare case that an application does
>>> intermittently set an OPT_ID, the numbering would be erratic. This
>>> does mean that an actively malicious application could mess with admin
>>> measurements.
>>
>> All makes sense. It is reasonable to assume that the user space either has
>> SOF_TIMESTAMPING_OPT_ID always on or always off. When it is off, the bpf
>> prog can directly provide its own tskey to be used in shinfo->tskey. The bpf
>> prog can generate the id itself without using the sk->sk_tskey, e.g. store an
>> atomic int in the bpf_sk_storage.
> 
> I wonder, how can we correlate the key with each skb in the bpf
> program for non-TCP type without implementing a bpf extension for
> SCM_TS_OPT_ID? Every time the timestamp is reported, we cannot know
> which sendmsg() the skb belongs to for non-TCP cases.

SCM_TS_OPT_ID eventually sets shinfo->tskey.
If shinfo->tskey is not set by the user space, the bpf prog can set it
directly. There is also no need to use sk->sk_tskey as the ID generator.
The bpf prog can have its own ID generator.

If the user space has already set the shinfo->tskey (either by sk->sk_tskey or 
SCM_TS_OPT_ID), the bpf prog can just use the user space one.
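
To make the two cases above concrete, here is a rough userspace model of the
selection rule. It is only a sketch, not kernel or bpf code: struct
ts_sk_state, struct ts_ctx, and pick_tskey() are hypothetical names made up
for illustration. The explicit user_key_set flag is also an assumption, used
here so that a user-provided key of 0 is not mistaken for "not set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-socket state. In a real bpf prog this could live in
 * bpf_sk_storage instead of struct sock. */
struct ts_sk_state {
	uint32_t next_id;	/* prog-owned ID generator, not sk->sk_tskey */
};

/* Hypothetical view of what the prog sees for one skb. */
struct ts_ctx {
	bool user_key_set;	/* set via sk->sk_tskey or SCM_TS_OPT_ID */
	uint32_t tskey;		/* models shinfo->tskey */
};

/* Return the key used to correlate timestamps for this skb. */
static uint32_t pick_tskey(struct ts_sk_state *st, struct ts_ctx *ctx)
{
	assert(st && ctx);

	/* The user space already chose a key: just reuse it. */
	if (ctx->user_key_set)
		return ctx->tskey;

	/* Otherwise the prog supplies its own key and stores it. */
	ctx->tskey = st->next_id++;
	return ctx->tskey;
}
```

The generator can start from any value (e.g. U32_MAX) since it is private to
the prog.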

If there is a weird application that flip-flops between OPT_ID on/off, the bpf
prog will get confused, which is fine. The bpf prog can detect this and choose to
skip measuring this sk/skb. The bpf prog can also choose to be on the very
safe side and ignore all skbs with SKBTX_ANY_TSTAMP set in tx_flags but with no
OPT_ID. The bpf prog can look into the details of the sk and skb to decide what
makes the most sense for its deployment.
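
One possible shape for that detection, again as a plain C model rather than
real bpf code. All names here (enum id_source, struct sk_track,
should_measure()) and the "strict" knob are hypothetical. The policy of
giving up on the whole sk after the first flip is just one choice a
deployment could make.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the key for this socket has been coming from so far. */
enum id_source { ID_UNKNOWN, ID_FROM_USER, ID_FROM_BPF, ID_IGNORE_SK };

/* Hypothetical per-socket tracking state (e.g. kept in bpf_sk_storage). */
struct sk_track {
	enum id_source source;
};

/*
 * Decide whether to measure this skb.
 * user_tstamp: models SKBTX_ANY_TSTAMP being set in tx_flags.
 * user_opt_id: models the user space having set OPT_ID for this skb.
 * strict:      skip every skb that has user_tstamp but no OPT_ID.
 */
static bool should_measure(struct sk_track *t, bool user_tstamp,
			   bool user_opt_id, bool strict)
{
	enum id_source now = user_opt_id ? ID_FROM_USER : ID_FROM_BPF;

	assert(t);

	if (t->source == ID_IGNORE_SK)
		return false;

	/* Very safe side: do not touch this skb, leave the state alone. */
	if (strict && user_tstamp && !user_opt_id)
		return false;

	if (t->source == ID_UNKNOWN) {
		t->source = now;
	} else if (t->source != now) {
		/* OPT_ID flipped on this socket: stop measuring it. */
		t->source = ID_IGNORE_SK;
		return false;
	}
	return true;
}
```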

I don't know whether it makes more sense to call the bpf prog to decide
shinfo->{tx_flags,tskey} just before the "while (length > 0)" loop in
__ip[6]_append_data, or whether it is better to call the bpf prog in
ip[6]_setup_cork. I am admittedly less familiar with this code path than the
tcp one.
