netdev.vger.kernel.org archive mirror
From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, dsahern@kernel.org,
	willemdebruijn.kernel@gmail.com, willemb@google.com,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, horms@kernel.org,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v9 00/12] net-timestamp: bpf extension to equip applications transparently
Date: Mon, 10 Feb 2025 15:37:19 -0800
Message-ID: <a483c1dd-f593-4f6b-9afe-bfb6d43647bf@linux.dev>
In-Reply-To: <20250208103220.72294-1-kerneljasonxing@gmail.com>

On 2/8/25 2:32 AM, Jason Xing wrote:
> "Timestamping is key to debugging network stack latency. With
> SO_TIMESTAMPING, bugs that are otherwise incorrectly assumed to be
> network issues can be attributed to the kernel." This is extracted
> from the talk "SO_TIMESTAMPING: Powering Fleetwide RPC Monitoring"
> addressed by Willem de Bruijn at netdevconf 0x17).
> 
> There are a few areas that need optimization with the consideration of
> easier use and less performance impact, which I highlighted and mainly
> discussed at netconf 2024 with Willem de Bruijn and John Fastabend:
> uAPI compatibility, extra system call overhead, and the need for
> application modification. I initially managed to solve these issues
> by writing a kernel module that hooks various key functions. However,
> this approach is not suitable for the next kernel release. Therefore,
> a BPF extension was proposed. During recent period, Martin KaFai Lau
> provides invaluable suggestions about BPF along the way. Many thanks
> here!
> 
> In this series, only support foundamental codes and tx for TCP.

typo: fundamental. This was brought up before (in v7?).

By "fundamental", I suspect you meant the bpf timestamping infrastructure,
something like:
"This series adds the BPF networking timestamping infrastructure. This series
also adds TX timestamping support for TCP. The RX timestamping and UDP support
will be added in the future."

> This approach mostly relies on existing SO_TIMESTAMPING feature, users

It reuses most of the tx timestamping callbacks that are currently enabled by
SO_TIMESTAMPING. However, I don't think there is much overlap in terms of the
SO_TIMESTAMPING API, even though this comment reads like API reuse on first
reading.
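
To make the distinction concrete, here is a toy userspace sketch of what I mean
by "shared callback, separate API". It is not kernel code: struct sock_model,
USER_TX_SOFTWARE, BPF_CB_TX_TSTAMP and tx_tstamp_requesters() are hypothetical
names made up for illustration. The only thing it assumes from the series is
that the bpf side keeps its own flag word instead of writing into the flags
that the SO_TIMESTAMPING uAPI owns.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define USER_TX_SOFTWARE (1u << 0) /* set through setsockopt(SO_TIMESTAMPING) */
#define BPF_CB_TX_TSTAMP (1u << 0) /* set through bpf_setsockopt() */

enum requester {
	REQ_NONE = 0,
	REQ_USER = 1 << 0,
	REQ_BPF  = 1 << 1,
};

/* Toy socket: each mode owns its own flag word. */
struct sock_model {
	uint32_t user_tsflags;
	uint32_t bpf_cb_flags;
};

/* Shared tx timestamping callback site: reports who asked for a timestamp. */
static int tx_tstamp_requesters(const struct sock_model *sk)
{
	int who = REQ_NONE;

	if (sk->user_tsflags & USER_TX_SOFTWARE)
		who |= REQ_USER;
	if (sk->bpf_cb_flags & BPF_CB_TX_TSTAMP)
		who |= REQ_BPF;
	return who;
}

/* Enabling the bpf mode leaves the user-visible flags untouched. */
static void demo(void)
{
	struct sock_model sk = { 0 };

	sk.bpf_cb_flags |= BPF_CB_TX_TSTAMP;
	assert(sk.user_tsflags == 0);
	assert(tx_tstamp_requesters(&sk) == REQ_BPF);

	sk.user_tsflags |= USER_TX_SOFTWARE;
	assert(tx_tstamp_requesters(&sk) == (REQ_USER | REQ_BPF));
}
```

In this model the two bit values may even collide (both are bit 0 here) without
any interaction, which is the sense in which the callback sites are reused but
the SO_TIMESTAMPING API is not.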

> only needs to pass certain flags through bpf_setsocktopt() to a separate
> tsflags. Please see the last selftest patch in this series.
> 
> ---
> v8
> Link: https://lore.kernel.org/all/20250128084620.57547-1-kerneljasonxing@gmail.com/
> 1. adjust some commit messages and titles
> 2. add sk cookie in selftests
> 3. handle the NULL pointer in hwstamp
> 4. use kfunc to do selective sampling
> 
> v7
> Link: https://lore.kernel.org/all/20250121012901.87763-1-kerneljasonxing@gmail.com/
> 1. target bpf-next tree
> 2. simplely and directly stop timestamping callbacks calling a few BPF
> CALLS due to safety concern.
> 3. add more new testcases and adjust the existing testcases
> 4. revise some comments of new timestamping callbacks
> 5. remove a few BPF CGROUP locks
> 
> RFC v6
> In the meantime, any suggestions and reviews are welcome!
> Link: https://lore.kernel.org/all/20250112113748.73504-1-kerneljasonxing@gmail.com/
> 1. handle those safety problem by using the correct method.
> 2. support bpf_getsockopt.
> 3. adjust the position of BPF_SOCK_OPS_TS_TCP_SND_CB
> 4. fix mishandling the hardware timestamp error
> 5. add more corresponding tests
> 
> v5
> Link: https://lore.kernel.org/all/20241207173803.90744-1-kerneljasonxing@gmail.com/
> 1. handle the safety issus when someone tries to call unrelated bpf
> helpers.
> 2. avoid adding direct function call in the hot path like
> __dev_queue_xmit()
> 3. remove reporting the hardware timestamp and tskey since they can be
> fetched through the existing helper with the help of
> bpf_skops_init_skb(), please see the selftest.
> 4. add new sendmsg callback in tcp_sendmsg, and introduce tskey_bpf used
> by bpf program to correlate tcp_sendmsg with other hook points in patch [13/15].
> 
> v4
> Link: https://lore.kernel.org/all/20241028110535.82999-1-kerneljasonxing@gmail.com/
> 1. introduce sk->sk_bpf_cb_flags to let user use bpf_setsockopt() (Martin)
> 2. introduce SKBTX_BPF to enable the bpf SO_TIMESTAMPING feature (Martin)
> 3. introduce bpf map in tests (Martin)
> 4. I choose to make this series as simple as possible, so I only support
> most cases in the tx path for TCP protocol.
> 
> v3
> Link: https://lore.kernel.org/all/20241012040651.95616-1-kerneljasonxing@gmail.com/
> 1. support UDP proto by introducing a new generation point.
> 2. for OPT_ID, introducing sk_tskey_bpf_offset to compute the delta
> between the current socket key and bpf socket key. It is desiged for
> UDP, which also applies to TCP.
> 3. support bpf_getsockopt()
> 4. use cgroup static key instead.
> 5. add one simple bpf selftest to show how it can be used.
> 6. remove the rx support from v2 because the number of patches could
> exceed the limit of one series.
> 
> V2
> Link: https://lore.kernel.org/all/20241008095109.99918-1-kerneljasonxing@gmail.com/
> 1. Introduce tsflag requestors so that we are able to extend more in the
> future. Besides, it enables TX flags for bpf extension feature separately
> without breaking users. It is suggested by Vadim Fedorenko.
> 2. introduce a static key to control the whole feature. (Willem)
> 3. Open the gate of bpf_setsockopt for the SO_TIMESTAMPING feature in
> some TX/RX cases, not all the cases.
> 
> Jason Xing (12):
>    bpf: add support for bpf_setsockopt()
>    bpf: prepare for timestamping callbacks use
>    bpf: stop unsafely accessing TCP fields in bpf callbacks
>    bpf: stop calling some sock_op BPF CALLs in new timestamping callbacks
>    net-timestamp: prepare for isolating two modes of SO_TIMESTAMPING
>    bpf: support SCM_TSTAMP_SCHED of SO_TIMESTAMPING
>    bpf: support sw SCM_TSTAMP_SND of SO_TIMESTAMPING
>    bpf: support hw SCM_TSTAMP_SND of SO_TIMESTAMPING
>    bpf: support SCM_TSTAMP_ACK of SO_TIMESTAMPING
>    bpf: add a new callback in tcp_tx_timestamp()
>    bpf: support selective sampling for bpf timestamping
>    selftests/bpf: add simple bpf tests in the tx path for timestamping
>      feature
> 
>   include/linux/filter.h                        |   1 +
>   include/linux/skbuff.h                        |  12 +-
>   include/net/sock.h                            |  10 +
>   include/net/tcp.h                             |   5 +-
>   include/uapi/linux/bpf.h                      |  30 ++
>   kernel/bpf/btf.c                              |   1 +
>   net/core/dev.c                                |   3 +-
>   net/core/filter.c                             |  75 ++++-
>   net/core/skbuff.c                             |  48 ++-
>   net/core/sock.c                               |  15 +
>   net/dsa/user.c                                |   2 +-
>   net/ipv4/tcp.c                                |   4 +
>   net/ipv4/tcp_input.c                          |   2 +
>   net/ipv4/tcp_output.c                         |   2 +
>   net/socket.c                                  |   2 +-
>   tools/include/uapi/linux/bpf.h                |  23 ++
>   .../bpf/prog_tests/so_timestamping.c          |  79 +++++
>   .../selftests/bpf/progs/so_timestamping.c     | 312 ++++++++++++++++++
>   18 files changed, 612 insertions(+), 14 deletions(-)
>   create mode 100644 tools/testing/selftests/bpf/prog_tests/so_timestamping.c
>   create mode 100644 tools/testing/selftests/bpf/progs/so_timestamping.c
> 


