public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <hawk@kernel.org>
To: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Jakub Kicinski <kuba@kernel.org>,
	lorenzo@kernel.org
Cc: Jesper Dangaard Brouer <hawk@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <borkmann@iogearbox.net>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Paolo Abeni <pabeni@redhat.com>,
	sdf@fomichev.me, kernel-team@cloudflare.com,
	arthur@arthurfabre.com, jakub@cloudflare.com
Subject: [PATCH bpf-next V2 7/7] net: xdp: update documentation for xdp-rx-metadata.rst
Date: Wed, 02 Jul 2025 16:58:59 +0200	[thread overview]
Message-ID: <175146833962.1421237.12791103620310340595.stgit@firesoul> (raw)
In-Reply-To: <175146824674.1421237.18351246421763677468.stgit@firesoul>

Update the documentation[1] based on the changes in this patchset.

[1] https://docs.kernel.org/networking/xdp-rx-metadata.html

Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
---
 Documentation/networking/xdp-rx-metadata.rst |   77 +++++++++++++++++++++-----
 net/core/xdp.c                               |   32 +++++++++++
 2 files changed, 93 insertions(+), 16 deletions(-)

diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index a6e0ece18be5..e2b89c066a82 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -90,22 +90,67 @@ the ``data_meta`` pointer.
 In the future, we'd like to support a case where an XDP program
 can override some of the metadata used for building ``skbs``.
 
-bpf_redirect_map
-================
-
-``bpf_redirect_map`` can redirect the frame to a different device.
-Some devices (like virtual ethernet links) support running a second XDP
-program after the redirect. However, the final consumer doesn't have
-access to the original hardware descriptor and can't access any of
-the original metadata. The same applies to XDP programs installed
-into devmaps and cpumaps.
-
-This means that for redirected packets only custom metadata is
-currently supported, which has to be prepared by the initial XDP program
-before redirect. If the frame is eventually passed to the kernel, the
-``skb`` created from such a frame won't have any hardware metadata populated
-in its ``skb``. If such a packet is later redirected into an ``XSK``,
-that will also only have access to the custom metadata.
+XDP_REDIRECT
+============
+
+The ``XDP_REDIRECT`` action forwards an XDP frame (``xdp_frame``) to another net
+device or a CPU (via cpumap/devmap) for further processing. It is invoked using
+BPF helpers like ``bpf_redirect_map()`` or ``bpf_redirect()``.  When an XDP
+frame is redirected, the recipient (e.g., an XDP program on a veth device, or
+the kernel stack via cpumap) naturally loses direct access to the original NIC's
+hardware descriptor and thus its hardware metadata hints.
+
+By default, if an ``xdp_frame`` is redirected and then converted to an ``skb``,
+its fields for hardware-derived metadata like ``skb->hash`` are not
+populated. When this occurs, the network stack recalculates the hash in
+software. This is particularly problematic for encapsulated tunnel traffic
+(e.g., IPsec, GRE), as the software hash is based on the outer headers. For a
+single tunnel, this can cause all flows to receive the same hash, leading to
+poor load balancing when redirected to a veth device or processed by cpumap.
+
+To solve this, a BPF program can calculate a more appropriate hint from the
+packet data (e.g., from the inner headers of a tunnel) and store it for later
+use. While it is also possible for the BPF program to propagate existing
+hardware hints, this is not useful for the tunnel use case; it is unnecessary to
+read the existing hardware metadata hint, as it is based on the outer headers
+and must be recalculated to correctly reflect the inner flow.
+
+For example, a BPF program can perform partial decryption on an IPsec packet,
+calculate a hash from the inner headers, and use ``bpf_xdp_store_rx_hash()`` to
+save it. This ensures that when the packet is redirected to a veth device, it is
+placed on the correct RX queue, achieving proper load balancing.
+
+When these kfuncs are used to store hints before redirection:
+
+* If the ``xdp_frame`` is converted to an ``skb``, the networking stack will use
+  the stored hints to populate the corresponding ``skb`` fields (e.g.,
+  ``skb->hash``, ``skb->vlan_tci``, timestamps).
+
+* When running a second XDP-program after the redirect. The veth driver supports
+  access to the previous stored metadata is accessed though the normal reader
+  kfuncs.
+
+The BPF programmer must explicitly call these "store" kfuncs to save the desired
+hints. The NIC driver is responsible for ensuring sufficient headroom is
+available; kfuncs may return ``-ENOSPC`` if space is inadequate.
+
+Kfuncs are available for storing RX hash (``bpf_xdp_store_rx_hash()``),
+VLAN information (``bpf_xdp_store_rx_vlan()``), and hardware timestamps
+(``bpf_xdp_store_rx_ts()``). Consult the kfunc API documentation for usage
+details, expected data, return codes, and relevant XDP flags that may
+indicate success or metadata availability.
+
+Kfuncs for **store** operations:
+
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_store_rx_timestamp
+
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_store_rx_hash
+
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_store_rx_vlan_tag
+
 
 bpf_tail_call
 =============
diff --git a/net/core/xdp.c b/net/core/xdp.c
index f1b2a3b4ba95..e8faf9f6fc7e 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -990,6 +990,18 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
 	return -EOPNOTSUPP;
 }
 
+/**
+ * bpf_xdp_store_rx_hash - Store XDP frame RX hash.
+ * @ctx: XDP context pointer.
+ * @hash: 32-bit hash value.
+ * @rss_type: RSS hash type.
+ *
+ * The RSS hash type (@rss_type) is as descibed in bpf_xdp_metadata_rx_hash.
+ *
+ * Return:
+ * * Returns 0 on success or ``-errno`` on error.
+ * * ``-NOSPC``   : means device driver doesn't provide enough headroom for storing
+ */
 __bpf_kfunc int bpf_xdp_store_rx_hash(struct xdp_md *ctx, u32 hash,
 				      enum xdp_rss_hash_type rss_type)
 {
@@ -1005,6 +1017,18 @@ __bpf_kfunc int bpf_xdp_store_rx_hash(struct xdp_md *ctx, u32 hash,
 	return 0;
 }
 
+/**
+ * bpf_xdp_store_rx_vlan_tag - Store XDP packet outermost VLAN tag
+ * @ctx: XDP context pointer.
+ * @vlan_proto: VLAN protocol stored in **network byte order (BE)**
+ * @vlan_tci: VLAN TCI (VID + DEI + PCP) stored in **host byte order**
+ *
+ * See bpf_xdp_metadata_rx_vlan_tag() for byte order reasoning.
+ *
+ * Return:
+ * * Returns 0 on success or ``-errno`` on error.
+ * * ``-NOSPC``   : means device driver doesn't provide enough headroom for storing
+ */
 __bpf_kfunc int bpf_xdp_store_rx_vlan(struct xdp_md *ctx, __be16 vlan_proto,
 				      u16 vlan_tci)
 {
@@ -1020,6 +1044,14 @@ __bpf_kfunc int bpf_xdp_store_rx_vlan(struct xdp_md *ctx, __be16 vlan_proto,
 	return 0;
 }
 
+/**
+ * bpf_xdp_metadata_rx_timestamp - Store XDP frame RX timestamp.
+ * @ctx: XDP context pointer.
+ * @timestamp: Timestamp value.
+ *
+ * Return:
+ * * Returns 0 on success or ``-errno`` on error.
+ */
 __bpf_kfunc int bpf_xdp_store_rx_ts(struct xdp_md *ctx, u64 ts)
 {
 	struct xdp_buff *xdp = (struct xdp_buff *)ctx;



  parent reply	other threads:[~2025-07-02 14:59 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-02 14:58 [PATCH bpf-next V2 0/7] xdp: Allow BPF to set RX hints for XDP_REDIRECTed packets Jesper Dangaard Brouer
2025-07-02 14:58 ` [PATCH bpf-next V2 1/7] net: xdp: Add xdp_rx_meta structure Jesper Dangaard Brouer
2025-07-17  9:19   ` Jakub Sitnicki
2025-07-17 14:40     ` Jesper Dangaard Brouer
2025-07-18 10:33       ` Jakub Sitnicki
2025-07-02 14:58 ` [PATCH bpf-next V2 2/7] selftests/bpf: Adjust test for maximum packet size in xdp_do_redirect Jesper Dangaard Brouer
2025-07-02 14:58 ` [PATCH bpf-next V2 3/7] net: xdp: Add kfuncs to store hw metadata in xdp_buff Jesper Dangaard Brouer
2025-07-03 11:41   ` Jesper Dangaard Brouer
2025-07-03 12:26     ` Lorenzo Bianconi
2025-07-02 14:58 ` [PATCH bpf-next V2 4/7] net: xdp: Set skb hw metadata from xdp_frame Jesper Dangaard Brouer
2025-07-02 14:58 ` [PATCH bpf-next V2 5/7] net: veth: Read xdp metadata from rx_meta struct if available Jesper Dangaard Brouer
2025-07-17 12:11   ` Jakub Sitnicki
2025-07-02 14:58 ` [PATCH bpf-next V2 6/7] bpf: selftests: Add rx_meta store kfuncs selftest Jesper Dangaard Brouer
2025-07-23  9:24   ` Bouska, Zdenek
2025-07-02 14:58 ` Jesper Dangaard Brouer [this message]
2025-07-02 16:05 ` [PATCH bpf-next V2 0/7] xdp: Allow BPF to set RX hints for XDP_REDIRECTed packets Stanislav Fomichev
2025-07-03 11:17   ` Jesper Dangaard Brouer
2025-07-07 14:40     ` Stanislav Fomichev
2025-07-09  9:31       ` Lorenzo Bianconi
2025-07-11 16:04         ` Stanislav Fomichev
2025-07-16 11:17           ` Lorenzo Bianconi
2025-07-16 21:20             ` Jakub Kicinski
2025-07-17 13:08               ` Jesper Dangaard Brouer
2025-07-18  1:25                 ` Jakub Kicinski
2025-07-18 10:56                   ` Jesper Dangaard Brouer
2025-07-22  1:13                     ` Jakub Kicinski
2025-07-28 10:53                       ` Lorenzo Bianconi
2025-07-28 16:29                         ` Jakub Kicinski
2025-07-29 11:15                           ` Jesper Dangaard Brouer
2025-07-29 19:47                             ` Martin KaFai Lau
2025-07-31 16:27                               ` Jesper Dangaard Brouer
2025-08-01 20:38                                 ` Jakub Kicinski
2025-08-04 13:18                                   ` Jesper Dangaard Brouer
2025-08-06  0:28                                     ` Jakub Kicinski
2025-08-07 18:26                                       ` Jesper Dangaard Brouer
2025-08-06  1:24                                     ` Martin KaFai Lau
2025-08-07 19:07                                       ` Jesper Dangaard Brouer
2025-08-13  2:59                                         ` Martin KaFai Lau
2025-07-31 21:18                           ` Lorenzo Bianconi
2025-08-01 20:40                             ` Jakub Kicinski
2025-08-05 13:18                               ` Lorenzo Bianconi
2025-08-05 23:54                                 ` Jakub Kicinski
2025-07-18  9:55                 ` Lorenzo Bianconi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=175146833962.1421237.12791103620310340595.stgit@firesoul \
    --to=hawk@kernel.org \
    --cc=arthur@arthurfabre.com \
    --cc=ast@kernel.org \
    --cc=borkmann@iogearbox.net \
    --cc=bpf@vger.kernel.org \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=jakub@cloudflare.com \
    --cc=kernel-team@cloudflare.com \
    --cc=kuba@kernel.org \
    --cc=lorenzo@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox