netdev.vger.kernel.org archive mirror
* [PATCH RFC bpf-next v2 00/17] traits: Per packet metadata KV store
@ 2025-04-22 13:23 Arthur Fabre
From: Arthur Fabre @ 2025-04-22 13:23 UTC
  To: netdev, bpf
  Cc: jakub, hawk, yan, jbrandeburg, thoiland, lbiancon, ast, kuba,
	edumazet, Arthur Fabre

The only way to attach information to a sk_buff that travels
through the network stack is with the mark. This field can be
read in firewall rules, can drive routing decisions, and can be
accessed by BPF programs.

However, at only 32 bits, it creates competition for bits, which
restricts its practical use.

We propose using part of the packet headroom to store metadata.
This would allow:
- Tracing packets through the network stack and across the kernel-user
  space boundary, by assigning them a unique ID.
- Metadata-driven packet redirection, routing, and socket steering with
  early classification in XDP.
- Extracting information from encapsulation headers and sharing it with
  user space or vice versa.
- Exposing XDP RX Metadata, like the timestamp, to the rest of the
  network stack.

We originally proposed extending XDP metadata (binary blob storage,
also in the headroom) to expose it throughout the network stack.
However, based on feedback at LPC 2024 [1]:
- sharing a binary blob amongst different applications is hard,
- exposing a binary blob to userspace is awkward,
we've shifted to a limited KV store in the headroom.

To differentiate this from the overloaded "metadata" term, it's
tentatively called "packet traits".

Traits are currently stored at the start of the headroom:

| xdp_frame | traits | headroom | XDP metadata | data / packet |

This makes adding encap headers to a packet easier: the traits don't
have to be moved out of the way first.
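
To illustrate why this placement helps, here is a minimal userspace
sketch of the layout above. It is a toy model, not code from this
series: struct pkt, the pkt_*() helpers, and all sizes are hypothetical.
Pushing an encap header only moves the data offset backwards into the
free headroom, so the traits at the front stay where they are. If the
traits sat directly in front of the packet data instead, they would have
to be moved first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All sizes are hypothetical and chosen only for illustration. */
#define FRAME_SZ  40	/* stand-in for the xdp_frame area */
#define TRAITS_SZ 64	/* stand-in for the space reserved for traits */
#define HEADROOM  256	/* total headroom in front of the packet data */
#define BUF_SZ    2048

/* | xdp_frame | traits | free headroom | data / packet | */
struct pkt {
	uint8_t buf[BUF_SZ];
	size_t data;	/* offset of the first packet byte */
	size_t len;	/* packet length in bytes */
};

/* Traits live at a fixed offset right after the frame area. */
static uint8_t *pkt_traits(struct pkt *p)
{
	return p->buf + FRAME_SZ;
}

/* Bytes available between the end of the traits and the packet data. */
static size_t pkt_free_headroom(const struct pkt *p)
{
	return p->data - (FRAME_SZ + TRAITS_SZ);
}

static int pkt_init(struct pkt *p, const void *payload, size_t len)
{
	if (len > BUF_SZ - HEADROOM)
		return -1;
	memset(p, 0, sizeof(*p));
	p->data = HEADROOM;
	p->len = len;
	memcpy(p->buf + p->data, payload, len);
	return 0;
}

/* Prepend an encap header. Only the data offset moves; traits do not. */
static int pkt_push(struct pkt *p, const void *hdr, size_t hlen)
{
	if (hlen > pkt_free_headroom(p))
		return -1;
	p->data -= hlen;
	p->len += hlen;
	memcpy(p->buf + p->data, hdr, hlen);
	return 0;
}
```

In this model the only cost of adding a header is a bounds check against
the end of the traits area.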

To keep the option of moving trait storage in the future (for example
to where XDP metadata sits today), XDP metadata and traits currently
can't be used together.

A get() / set() / delete() API is exposed to BPF to store and
retrieve traits.
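
As a rough illustration of what such a store could look like, here is a
self-contained userspace sketch of a get() / set() / delete() KV store
limited to values of 0, 4, or 8 bytes. It is not the implementation or
the BPF API from this series: struct trait_store, the trait_*() helpers,
the 64 key limit, the 128 byte capacity, and the two-bitmap length
encoding are all assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRAIT_KEYS 64	/* hypothetical: keys 0..63 */
#define TRAIT_CAP  128	/* hypothetical: bytes available for values */

/* Two bits per key, split across two bitmaps (an assumption of this
 * sketch): 0 = absent, 1 = flag (0 bytes), 2 = 4 bytes, 3 = 8 bytes. */
struct trait_store {
	uint64_t lo, hi;
	uint8_t vals[TRAIT_CAP];	/* values packed in ascending key order */
};

static unsigned code_of(const struct trait_store *s, unsigned key)
{
	return (unsigned)((s->lo >> key) & 1) |
	       ((unsigned)((s->hi >> key) & 1) << 1);
}

static size_t code_len(unsigned code)
{
	return code == 2 ? 4 : code == 3 ? 8 : 0;
}

/* Offset of key's value: the summed lengths of all smaller keys. */
static size_t offset_of(const struct trait_store *s, unsigned key)
{
	size_t off = 0;

	for (unsigned k = 0; k < key; k++)
		off += code_len(code_of(s, k));
	return off;
}

static int trait_set(struct trait_store *s, unsigned key,
		     const void *val, size_t len)
{
	if (key >= TRAIT_KEYS || (len != 0 && len != 4 && len != 8))
		return -EINVAL;

	size_t old = code_len(code_of(s, key));
	size_t used = offset_of(s, TRAIT_KEYS);
	size_t off = offset_of(s, key);

	if (used - old + len > TRAIT_CAP)
		return -ENOSPC;

	/* Shift the values of larger keys to open (or close) the gap. */
	memmove(s->vals + off + len, s->vals + off + old, used - off - old);
	if (len)
		memcpy(s->vals + off, val, len);

	unsigned code = len == 0 ? 1 : len == 4 ? 2 : 3;

	s->lo = (s->lo & ~(1ULL << key)) | ((uint64_t)(code & 1) << key);
	s->hi = (s->hi & ~(1ULL << key)) | ((uint64_t)(code >> 1) << key);
	return 0;
}

/* Returns the value length (0, 4, or 8) or a negative errno. */
static int trait_get(const struct trait_store *s, unsigned key,
		     void *val, size_t cap)
{
	if (key >= TRAIT_KEYS)
		return -EINVAL;

	unsigned code = code_of(s, key);

	if (!code)
		return -ENOENT;

	size_t len = code_len(code);

	if (len > cap)
		return -ENOSPC;
	if (len)
		memcpy(val, s->vals + offset_of(s, key), len);
	return (int)len;
}

static int trait_del(struct trait_store *s, unsigned key)
{
	if (key >= TRAIT_KEYS)
		return -EINVAL;

	unsigned code = code_of(s, key);

	if (!code)
		return -ENOENT;

	size_t off = offset_of(s, key), len = code_len(code);
	size_t used = offset_of(s, TRAIT_KEYS);

	/* Close the gap left by the removed value. */
	memmove(s->vals + off, s->vals + off + len, used - off - len);
	s->lo &= ~(1ULL << key);
	s->hi &= ~(1ULL << key);
	return 0;
}
```

Packing values in key order keeps the sketch's header to two words, at
the cost of a memmove() on every insert or delete.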

Initial benchmarks in XDP are promising, with get() / set() comparable
to an indirect function call. See patch 7: "trait: Replace memmove calls
with inline move" for full results.

We imagine adding first-class support for this in netfilter (setting
/ checking traits in rules) and routing (selecting routing tables
based on traits) in follow-up work.
We also envisage a first-class userspace API for storing and
retrieving traits in the future.

Like XDP metadata, this relies on there being sufficient headroom
available. Piggybacking on top of that work, traits are currently
only supported:
- On ingress.
- By NIC drivers that support XDP metadata.
- When an XDP program is attached.
This limits the applicability of traits. But future work
guaranteeing sufficient headroom through other means should allow
these restrictions to be lifted.

[1] https://lpc.events/event/18/contributions/1935/

---
Changes in v2:
- Support sizes 0 (for flags), 4, and 8. 16 will be supported in the
  future with a batch API, to set two consecutive 8 byte KVs at once.
- Prevent traits and XDP metadata from being used at the same time.
  This will let us move trait storage where XDP metadata is today if
  we want to.
- Use SKB extensions to store the traits in skbs.
- Drop registration API.
- Link to v1: https://lore.kernel.org/r/20250305-afabre-traits-010-rfc2-v1-0-d0ecfb869797@cloudflare.com

---
Arthur Fabre (16):
      trait: limited KV store for packet metadata
      xdp: Track if metadata is supported in xdp_frame <> xdp_buff conversions
      trait: XDP support
      trait: XDP selftest
      trait: XDP benchmark
      trait: Replace memcpy calls with inline copies
      trait: Replace memmove calls with inline move
      skb: Extension header in packet headroom
      trait: Store traits in sk_buff extension
      bnxt: Propagate trait presence to skb
      ice: Propagate trait presence to skb
      veth: Propagate trait presence to skb
      virtio_net: Propagate trait presence to skb
      mlx5: Propagate trait presence to skb
      xdp generic: Propagate trait presence to skb
      trait: Allow socket filters to access traits

Jesper Dangaard Brouer (1):
      mlx5: move xdp_buff scope one level up

 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |   4 +
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c     |   5 -
 drivers/net/ethernet/intel/ice/ice_txrx.c          |   4 +
 drivers/net/ethernet/intel/ice/ice_xsk.c           |   2 +
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |   6 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c    |   6 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.h    |   6 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 114 ++++----
 drivers/net/veth.c                                 |   4 +
 drivers/net/virtio_net.c                           |   8 +-
 include/linux/skbuff.h                             |  42 +++
 include/net/trait.h                                | 302 +++++++++++++++++++++
 include/net/xdp.h                                  |  56 +++-
 net/core/dev.c                                     |   1 +
 net/core/filter.c                                  |  10 +-
 net/core/skbuff.c                                  | 231 ++++++++++++++--
 net/core/xdp.c                                     |  69 ++++-
 net/xdp/xsk.c                                      |  11 +-
 tools/testing/selftests/bpf/Makefile               |   2 +
 tools/testing/selftests/bpf/bench.c                |   8 +
 .../selftests/bpf/benchs/bench_xdp_traits.c        | 160 +++++++++++
 .../testing/selftests/bpf/prog_tests/xdp_traits.c  |  33 +++
 .../testing/selftests/bpf/progs/bench_xdp_traits.c | 128 +++++++++
 .../testing/selftests/bpf/progs/test_xdp_traits.c  | 206 ++++++++++++++
 24 files changed, 1319 insertions(+), 99 deletions(-)
---
base-commit: 5709be4c35ba760b001733939e20069de033a697
change-id: 20250305-afabre-traits-010-rfc2-a8e4de0c490b

Best regards,
-- 
Arthur Fabre <arthur@arthurfabre.com>



Thread overview: 36+ messages
2025-04-22 13:23 [PATCH RFC bpf-next v2 00/17] traits: Per packet metadata KV store Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 01/17] trait: limited KV store for packet metadata Arthur Fabre
2025-04-24 16:22   ` Alexei Starovoitov
2025-04-25 19:26     ` Arthur Fabre
2025-04-29 23:36       ` Alexei Starovoitov
2025-04-30  9:19         ` Toke Høiland-Jørgensen
2025-04-30 16:29           ` Alexei Starovoitov
2025-05-01  7:30             ` Arthur Fabre
2025-04-30 19:19           ` Jakub Sitnicki
2025-05-01 10:43             ` Toke Høiland-Jørgensen
2025-05-01 14:03               ` Jesper Dangaard Brouer
2025-05-05 10:18               ` Jakub Sitnicki
2025-05-05 12:35                 ` Toke Høiland-Jørgensen
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 02/17] xdp: Track if metadata is supported in xdp_frame <> xdp_buff conversions Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 03/17] trait: XDP support Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 04/17] trait: XDP selftest Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 05/17] trait: XDP benchmark Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 06/17] trait: Replace memcpy calls with inline copies Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 07/17] trait: Replace memmove calls with inline move Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 08/17] skb: Extension header in packet headroom Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 09/17] trait: Store traits in sk_buff extension Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 10/17] bnxt: Propagate trait presence to skb Arthur Fabre
2025-04-23 16:36   ` Stanislav Fomichev
2025-04-23 20:54     ` Arthur Fabre
2025-04-23 23:45       ` Stanislav Fomichev
2025-04-24  9:49         ` Toke Høiland-Jørgensen
2025-04-24 15:39           ` Stanislav Fomichev
2025-04-24 18:59             ` Jakub Sitnicki
2025-04-25  8:06               ` Toke Høiland-Jørgensen
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 11/17] ice: " Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 12/17] veth: " Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 13/17] virtio_net: " Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 14/17] mlx5: move xdp_buff scope one level up Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 15/17] mlx5: Propagate trait presence to skb Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 16/17] xdp generic: " Arthur Fabre
2025-04-22 13:23 ` [PATCH RFC bpf-next v2 17/17] trait: Allow socket filters to access traits Arthur Fabre
