netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Xing <kerneljasonxing@gmail.com>
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, dsahern@kernel.org,
	willemdebruijn.kernel@gmail.com, willemb@google.com,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	martin.lau@linux.dev, eddyz87@gmail.com, song@kernel.org,
	yonghong.song@linux.dev, john.fastabend@gmail.com,
	kpsingh@kernel.org, sdf@fomichev.me, haoluo@google.com,
	jolsa@kernel.org, horms@kernel.org
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Jason Xing <kerneljasonxing@gmail.com>
Subject: [PATCH bpf-next v11 00/12] net-timestamp: bpf extension to equip applications transparently
Date: Fri, 14 Feb 2025 09:00:26 +0800	[thread overview]
Message-ID: <20250214010038.54131-1-kerneljasonxing@gmail.com> (raw)

"Timestamping is key to debugging network stack latency. With
SO_TIMESTAMPING, bugs that are otherwise incorrectly assumed to be
network issues can be attributed to the kernel." This is extracted
from the talk "SO_TIMESTAMPING: Powering Fleetwide RPC Monitoring"
addressed by Willem de Bruijn at netdevconf 0x17).

There are a few areas that need optimization with the consideration of
easier use and less performance impact, which I highlighted and mainly
discussed at netconf 2024 with Willem de Bruijn and John Fastabend:
uAPI compatibility, extra system call overhead, and the need for
application modification. I initially managed to solve these issues
by writing a kernel module that hooks various key functions. However,
this approach is not suitable for the next kernel release. Therefore,
a BPF extension was proposed. During recent period, Martin KaFai Lau
provides invaluable suggestions about BPF along the way. Many thanks
here!

This series adds the BPF networking timestamping infrastructure through
reusing most of the tx timestamping callback that is currently enabled
by the SO_TIMESTAMPING.. This series also adds TX timestamping support
for TCP. The RX timestamping and UDP support will be added in the future.

---
v10
Link: https://lore.kernel.org/all/20250212061855.71154-1-kerneljasonxing@gmail.com/
1. rename hwts with hwtimestamp
2. use subtest and pid filter in selftest
3. use 'tcb->txstamp_ack |= TSTAMP_ACK_SK'

v9
Link: https://lore.kernel.org/all/20250208103220.72294-1-kerneljasonxing@gmail.com/
1. set the hwtstamp to skb when the skb enters into the hw SND case
2. fix co-existence problem in patch 9 and add corresponding check in
patch 12.
3. refine some commit messages and titles

v8
Link: https://lore.kernel.org/all/20250128084620.57547-1-kerneljasonxing@gmail.com/
1. adjust some commit messages and titles
2. add sk cookie in selftests
3. handle the NULL pointer in hwstamp
4. use kfunc to do selective sampling

v7
Link: https://lore.kernel.org/all/20250121012901.87763-1-kerneljasonxing@gmail.com/
1. target bpf-next tree
2. simplely and directly stop timestamping callbacks calling a few BPF
CALLS due to safety concern.
3. add more new testcases and adjust the existing testcases
4. revise some comments of new timestamping callbacks
5. remove a few BPF CGROUP locks

RFC v6
In the meantime, any suggestions and reviews are welcome!
Link: https://lore.kernel.org/all/20250112113748.73504-1-kerneljasonxing@gmail.com/
1. handle those safety problem by using the correct method.
2. support bpf_getsockopt.
3. adjust the position of BPF_SOCK_OPS_TS_TCP_SND_CB
4. fix mishandling the hardware timestamp error
5. add more corresponding tests

v5
Link: https://lore.kernel.org/all/20241207173803.90744-1-kerneljasonxing@gmail.com/
1. handle the safety issus when someone tries to call unrelated bpf
helpers.
2. avoid adding direct function call in the hot path like
__dev_queue_xmit()
3. remove reporting the hardware timestamp and tskey since they can be
fetched through the existing helper with the help of
bpf_skops_init_skb(), please see the selftest.
4. add new sendmsg callback in tcp_sendmsg, and introduce tskey_bpf used
by bpf program to correlate tcp_sendmsg with other hook points in patch [13/15].

v4
Link: https://lore.kernel.org/all/20241028110535.82999-1-kerneljasonxing@gmail.com/
1. introduce sk->sk_bpf_cb_flags to let user use bpf_setsockopt() (Martin)
2. introduce SKBTX_BPF to enable the bpf SO_TIMESTAMPING feature (Martin)
3. introduce bpf map in tests (Martin)
4. I choose to make this series as simple as possible, so I only support
most cases in the tx path for TCP protocol.

v3
Link: https://lore.kernel.org/all/20241012040651.95616-1-kerneljasonxing@gmail.com/
1. support UDP proto by introducing a new generation point.
2. for OPT_ID, introducing sk_tskey_bpf_offset to compute the delta
between the current socket key and bpf socket key. It is desiged for
UDP, which also applies to TCP.
3. support bpf_getsockopt()
4. use cgroup static key instead.
5. add one simple bpf selftest to show how it can be used.
6. remove the rx support from v2 because the number of patches could
exceed the limit of one series.

V2
Link: https://lore.kernel.org/all/20241008095109.99918-1-kerneljasonxing@gmail.com/
1. Introduce tsflag requestors so that we are able to extend more in the
future. Besides, it enables TX flags for bpf extension feature separately
without breaking users. It is suggested by Vadim Fedorenko.
2. introduce a static key to control the whole feature. (Willem)
3. Open the gate of bpf_setsockopt for the SO_TIMESTAMPING feature in
some TX/RX cases, not all the cases.


Jason Xing (12):
  bpf: add networking timestamping support to bpf_get/setsockopt()
  bpf: prepare the sock_ops ctx and call bpf prog for TX timestamping
  bpf: prevent unsafe access to the sock fields in the BPF timestamping
    callback
  bpf: disable unsafe helpers in TX timestamping callbacks
  net-timestamp: prepare for isolating two modes of SO_TIMESTAMPING
  bpf: add BPF_SOCK_OPS_TS_SCHED_OPT_CB callback
  bpf: add BPF_SOCK_OPS_TS_SW_OPT_CB callback
  bpf: add BPF_SOCK_OPS_TS_HW_OPT_CB callback
  bpf: add BPF_SOCK_OPS_TS_ACK_OPT_CB callback
  bpf: add BPF_SOCK_OPS_TS_SND_CB callback
  bpf: support selective sampling for bpf timestamping
  selftests/bpf: add simple bpf tests in the tx path for timestamping
    feature

 include/linux/filter.h                        |   1 +
 include/linux/skbuff.h                        |  12 +-
 include/net/sock.h                            |  10 +
 include/net/tcp.h                             |   7 +-
 include/uapi/linux/bpf.h                      |  30 +++
 kernel/bpf/btf.c                              |   1 +
 net/core/dev.c                                |   3 +-
 net/core/filter.c                             |  81 +++++-
 net/core/skbuff.c                             |  53 ++++
 net/core/sock.c                               |  14 +
 net/dsa/user.c                                |   2 +-
 net/ipv4/tcp.c                                |   6 +-
 net/ipv4/tcp_input.c                          |   2 +
 net/ipv4/tcp_output.c                         |   2 +
 net/socket.c                                  |   2 +-
 tools/include/uapi/linux/bpf.h                |  23 ++
 .../bpf/prog_tests/net_timestamping.c         | 239 +++++++++++++++++
 .../selftests/bpf/progs/net_timestamping.c    | 248 ++++++++++++++++++
 18 files changed, 722 insertions(+), 14 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/net_timestamping.c
 create mode 100644 tools/testing/selftests/bpf/progs/net_timestamping.c

-- 
2.43.5


             reply	other threads:[~2025-02-14  1:00 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-14  1:00 Jason Xing [this message]
2025-02-14  1:00 ` [PATCH bpf-next v11 01/12] bpf: add networking timestamping support to bpf_get/setsockopt() Jason Xing
2025-02-14  1:00 ` [PATCH bpf-next v11 02/12] bpf: prepare the sock_ops ctx and call bpf prog for TX timestamping Jason Xing
2025-02-14  1:00 ` [PATCH bpf-next v11 03/12] bpf: prevent unsafe access to the sock fields in the BPF timestamping callback Jason Xing
2025-02-14  1:00 ` [PATCH bpf-next v11 04/12] bpf: disable unsafe helpers in TX timestamping callbacks Jason Xing
2025-02-14  1:00 ` [PATCH bpf-next v11 05/12] net-timestamp: prepare for isolating two modes of SO_TIMESTAMPING Jason Xing
2025-02-14  1:00 ` [PATCH bpf-next v11 06/12] bpf: add BPF_SOCK_OPS_TS_SCHED_OPT_CB callback Jason Xing
2025-02-14  1:00 ` [PATCH bpf-next v11 07/12] bpf: add BPF_SOCK_OPS_TS_SW_OPT_CB callback Jason Xing
2025-02-14  1:00 ` [PATCH bpf-next v11 08/12] bpf: add BPF_SOCK_OPS_TS_HW_OPT_CB callback Jason Xing
2025-02-15 15:06   ` Willem de Bruijn
2025-02-15 16:20     ` Jason Xing
2025-02-15 18:08       ` Willem de Bruijn
2025-02-15 22:23         ` Jason Xing
2025-02-15 22:58           ` Martin KaFai Lau
2025-02-15 23:10             ` Jason Xing
2025-02-16 14:36               ` Willem de Bruijn
2025-02-16 14:45                 ` Jason Xing
2025-02-16 14:48                   ` Jason Xing
2025-02-16 16:17                     ` Jason Xing
2025-02-18  0:56                     ` Willem de Bruijn
2025-02-18  4:51                       ` Jason Xing
2025-02-18  0:55                 ` Willem de Bruijn
2025-02-14  1:00 ` [PATCH bpf-next v11 09/12] bpf: add BPF_SOCK_OPS_TS_ACK_OPT_CB callback Jason Xing
2025-02-14 20:33   ` Martin KaFai Lau
2025-02-14 23:16     ` Jason Xing
2025-02-14 23:41       ` Martin KaFai Lau
2025-02-15 15:16         ` Willem de Bruijn
2025-02-14  1:00 ` [PATCH bpf-next v11 10/12] bpf: add BPF_SOCK_OPS_TS_SND_CB callback Jason Xing
2025-02-14  1:00 ` [PATCH bpf-next v11 11/12] bpf: support selective sampling for bpf timestamping Jason Xing
2025-02-15 15:10   ` Willem de Bruijn
2025-02-15 16:17     ` Jason Xing
2025-02-15 18:01       ` Willem de Bruijn
2025-02-15 21:11         ` Jason Xing
2025-02-14  1:00 ` [PATCH bpf-next v11 12/12] selftests/bpf: add simple bpf tests in the tx path for timestamping feature Jason Xing
2025-02-14 20:40   ` Martin KaFai Lau
2025-02-14 23:18     ` Jason Xing
2025-02-15 15:15   ` Willem de Bruijn
2025-02-15 16:17     ` Jason Xing
2025-02-15 18:01       ` Willem de Bruijn
2025-02-14 20:42 ` [PATCH bpf-next v11 00/12] net-timestamp: bpf extension to equip applications transparently Martin KaFai Lau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250214010038.54131-1-kerneljasonxing@gmail.com \
    --to=kerneljasonxing@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=haoluo@google.com \
    --cc=horms@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=song@kernel.org \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).