From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, dsahern@kernel.org,
willemdebruijn.kernel@gmail.com, willemb@google.com,
ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
haoluo@google.com, jolsa@kernel.org, horms@kernel.org,
bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v9 00/12] net-timestamp: bpf extension to equip applications transparently
Date: Mon, 10 Feb 2025 15:37:19 -0800
Message-ID: <a483c1dd-f593-4f6b-9afe-bfb6d43647bf@linux.dev>
In-Reply-To: <20250208103220.72294-1-kerneljasonxing@gmail.com>

On 2/8/25 2:32 AM, Jason Xing wrote:
> "Timestamping is key to debugging network stack latency. With
> SO_TIMESTAMPING, bugs that are otherwise incorrectly assumed to be
> network issues can be attributed to the kernel." This is extracted
> from the talk "SO_TIMESTAMPING: Powering Fleetwide RPC Monitoring"
> addressed by Willem de Bruijn at netdevconf 0x17).
>
> There are a few areas that need optimization with the consideration of
> easier use and less performance impact, which I highlighted and mainly
> discussed at netconf 2024 with Willem de Bruijn and John Fastabend:
> uAPI compatibility, extra system call overhead, and the need for
> application modification. I initially managed to solve these issues
> by writing a kernel module that hooks various key functions. However,
> this approach is not suitable for the next kernel release. Therefore,
> a BPF extension was proposed. During recent period, Martin KaFai Lau
> provides invaluable suggestions about BPF along the way. Many thanks
> here!
>
> In this series, only support foundamental codes and tx for TCP.

Typo: "fundamental". This was already brought up before (in v7?).

By "fundamental", I suspect you meant the bpf timestamping infrastructure,
something like:

"This series adds the BPF networking timestamping infrastructure. This series
also adds TX timestamping support for TCP. The RX timestamping and UDP support
will be added in the future."

> This approach mostly relies on existing SO_TIMESTAMPING feature, users

It reuses most of the tx timestamping callbacks that are currently enabled by
SO_TIMESTAMPING. However, I don't think there is much overlap in terms of the
SO_TIMESTAMPING API, even though this comment reads like API reuse on first
reading.

> only needs to pass certain flags through bpf_setsocktopt() to a separate
> tsflags. Please see the last selftest patch in this series.
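
To make the "separate tsflags" point concrete, here is a minimal user-space
sketch. It is only a model, not kernel code and not taken from this series:
struct demo_sock, the DEMO_* bits and the demo_* helpers are all hypothetical
names invented for illustration. The assumption is simply that the BPF side
keeps its own flag word, so enabling BPF timestamping never touches whatever
the application set through SO_TIMESTAMPING.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only; not kernel definitions. */
#define DEMO_USER_TX_SOFTWARE       (1u << 0)
#define DEMO_BPF_CB_TX_TIMESTAMPING (1u << 0)

/* Toy socket: two independent flag words, one per requestor. */
struct demo_sock {
	uint32_t user_tsflags;	/* models flags set by the application */
	uint32_t bpf_cb_flags;	/* models the separate BPF-owned flags */
};

/* Models a BPF program setting its own flags; rejects unknown bits. */
static int demo_bpf_set_cb_flags(struct demo_sock *sk, uint32_t flags)
{
	if (flags & ~DEMO_BPF_CB_TX_TIMESTAMPING)
		return -1;
	sk->bpf_cb_flags = flags;
	return 0;
}

/* True when the BPF side asked for tx timestamping. */
static int demo_bpf_tx_enabled(const struct demo_sock *sk)
{
	return (sk->bpf_cb_flags & DEMO_BPF_CB_TX_TIMESTAMPING) != 0;
}

/* True when the application asked for tx timestamping. */
static int demo_user_tx_enabled(const struct demo_sock *sk)
{
	return (sk->user_tsflags & DEMO_USER_TX_SOFTWARE) != 0;
}

/* Enabling the BPF mode must leave the application's flags untouched. */
static void demo_selfcheck(void)
{
	struct demo_sock sk = { .user_tsflags = 0, .bpf_cb_flags = 0 };

	assert(demo_bpf_set_cb_flags(&sk, DEMO_BPF_CB_TX_TIMESTAMPING) == 0);
	assert(demo_bpf_tx_enabled(&sk));
	assert(!demo_user_tx_enabled(&sk));
	assert(sk.user_tsflags == 0);
}
```

Both bits deliberately share the value (1u << 0): because they live in
different words, they cannot collide, which is the property the model is
meant to show.
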
>
> ---
> v8
> Link: https://lore.kernel.org/all/20250128084620.57547-1-kerneljasonxing@gmail.com/
> 1. adjust some commit messages and titles
> 2. add sk cookie in selftests
> 3. handle the NULL pointer in hwstamp
> 4. use kfunc to do selective sampling
>
> v7
> Link: https://lore.kernel.org/all/20250121012901.87763-1-kerneljasonxing@gmail.com/
> 1. target bpf-next tree
> 2. simplely and directly stop timestamping callbacks calling a few BPF
> CALLS due to safety concern.
> 3. add more new testcases and adjust the existing testcases
> 4. revise some comments of new timestamping callbacks
> 5. remove a few BPF CGROUP locks
>
> RFC v6
> In the meantime, any suggestions and reviews are welcome!
> Link: https://lore.kernel.org/all/20250112113748.73504-1-kerneljasonxing@gmail.com/
> 1. handle those safety problem by using the correct method.
> 2. support bpf_getsockopt.
> 3. adjust the position of BPF_SOCK_OPS_TS_TCP_SND_CB
> 4. fix mishandling the hardware timestamp error
> 5. add more corresponding tests
>
> v5
> Link: https://lore.kernel.org/all/20241207173803.90744-1-kerneljasonxing@gmail.com/
> 1. handle the safety issus when someone tries to call unrelated bpf
> helpers.
> 2. avoid adding direct function call in the hot path like
> __dev_queue_xmit()
> 3. remove reporting the hardware timestamp and tskey since they can be
> fetched through the existing helper with the help of
> bpf_skops_init_skb(), please see the selftest.
> 4. add new sendmsg callback in tcp_sendmsg, and introduce tskey_bpf used
> by bpf program to correlate tcp_sendmsg with other hook points in patch [13/15].
>
> v4
> Link: https://lore.kernel.org/all/20241028110535.82999-1-kerneljasonxing@gmail.com/
> 1. introduce sk->sk_bpf_cb_flags to let user use bpf_setsockopt() (Martin)
> 2. introduce SKBTX_BPF to enable the bpf SO_TIMESTAMPING feature (Martin)
> 3. introduce bpf map in tests (Martin)
> 4. I choose to make this series as simple as possible, so I only support
> most cases in the tx path for TCP protocol.
>
> v3
> Link: https://lore.kernel.org/all/20241012040651.95616-1-kerneljasonxing@gmail.com/
> 1. support UDP proto by introducing a new generation point.
> 2. for OPT_ID, introducing sk_tskey_bpf_offset to compute the delta
> between the current socket key and bpf socket key. It is desiged for
> UDP, which also applies to TCP.
> 3. support bpf_getsockopt()
> 4. use cgroup static key instead.
> 5. add one simple bpf selftest to show how it can be used.
> 6. remove the rx support from v2 because the number of patches could
> exceed the limit of one series.
>
> V2
> Link: https://lore.kernel.org/all/20241008095109.99918-1-kerneljasonxing@gmail.com/
> 1. Introduce tsflag requestors so that we are able to extend more in the
> future. Besides, it enables TX flags for bpf extension feature separately
> without breaking users. It is suggested by Vadim Fedorenko.
> 2. introduce a static key to control the whole feature. (Willem)
> 3. Open the gate of bpf_setsockopt for the SO_TIMESTAMPING feature in
> some TX/RX cases, not all the cases.
>
> Jason Xing (12):
> bpf: add support for bpf_setsockopt()
> bpf: prepare for timestamping callbacks use
> bpf: stop unsafely accessing TCP fields in bpf callbacks
> bpf: stop calling some sock_op BPF CALLs in new timestamping callbacks
> net-timestamp: prepare for isolating two modes of SO_TIMESTAMPING
> bpf: support SCM_TSTAMP_SCHED of SO_TIMESTAMPING
> bpf: support sw SCM_TSTAMP_SND of SO_TIMESTAMPING
> bpf: support hw SCM_TSTAMP_SND of SO_TIMESTAMPING
> bpf: support SCM_TSTAMP_ACK of SO_TIMESTAMPING
> bpf: add a new callback in tcp_tx_timestamp()
> bpf: support selective sampling for bpf timestamping
> selftests/bpf: add simple bpf tests in the tx path for timestamping
> feature
>
> include/linux/filter.h | 1 +
> include/linux/skbuff.h | 12 +-
> include/net/sock.h | 10 +
> include/net/tcp.h | 5 +-
> include/uapi/linux/bpf.h | 30 ++
> kernel/bpf/btf.c | 1 +
> net/core/dev.c | 3 +-
> net/core/filter.c | 75 ++++-
> net/core/skbuff.c | 48 ++-
> net/core/sock.c | 15 +
> net/dsa/user.c | 2 +-
> net/ipv4/tcp.c | 4 +
> net/ipv4/tcp_input.c | 2 +
> net/ipv4/tcp_output.c | 2 +
> net/socket.c | 2 +-
> tools/include/uapi/linux/bpf.h | 23 ++
> .../bpf/prog_tests/so_timestamping.c | 79 +++++
> .../selftests/bpf/progs/so_timestamping.c | 312 ++++++++++++++++++
> 18 files changed, 612 insertions(+), 14 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/so_timestamping.c
> create mode 100644 tools/testing/selftests/bpf/progs/so_timestamping.c
>