public inbox for netdev@vger.kernel.org
* [PATCH bpf-next v3 0/5] Add the capability to load HW RX checksum in eBPF programs
@ 2026-02-17  8:33 Lorenzo Bianconi
  2026-02-17  8:33 ` [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs Lorenzo Bianconi
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-17  8:33 UTC (permalink / raw)
  To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski
  Cc: Jakub Sitnicki, netdev, bpf, intel-wired-lan, linux-kselftest,
	Lorenzo Bianconi, Aleksandr Loktionov

Introduce the bpf_xdp_metadata_rx_checksum() kfunc in order to load the HW
RX checksum result in the eBPF program bound to the NIC.
Implement the xmo_rx_checksum callback for the veth and ice drivers.

If the hardware detects a wrong/failed checksum, it reports
CHECKSUM_NONE in the packet metadata. CHECKSUM_NONE is also returned
when the NIC cannot parse the packet (e.g. if it does not support a
specific protocol). A possible use case for
bpf_xdp_metadata_rx_checksum() is an XDP DDoS application [1] that
combines the info from the bpf_xdp_metadata_rx_checksum() and
bpf_xdp_metadata_rx_hash() kfuncs in order to filter packets with a
wrong/failed checksum.

[1] https://blog.cloudflare.com/unimog-cloudflares-edge-load-balancer/
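
A minimal sketch of how a filter might act on the kfunc result, written
as plain userspace C so the decision logic can be read in isolation. The
enum values mirror the XDP_CHECKSUM_* definitions in the selftest header
of this series; the verdict names and classify_rx_checksum() are
hypothetical and not part of the series. Because CHECKSUM_NONE is also
reported when the NIC cannot parse the packet, the sketch treats it as
"unverified" rather than as proof of a bad checksum.

```c
#include <assert.h>

/* Values mirror the XDP_CHECKSUM_* definitions used by the selftests. */
enum xdp_checksum_sketch {
	SKETCH_CHECKSUM_NONE = 0,
	SKETCH_CHECKSUM_UNNECESSARY = 1,
	SKETCH_CHECKSUM_COMPLETE = 2,
};

static_assert(SKETCH_CHECKSUM_NONE == 0, "NONE is expected to be zero");

/* Hypothetical verdicts, for illustration only. */
enum rx_csum_verdict {
	RX_CSUM_TRUSTED,	/* HW validated the checksum */
	RX_CSUM_VERIFY_SW,	/* HW supplied a raw sum; finish validation in SW */
	RX_CSUM_UNVERIFIED,	/* failed, unparsed, or no hint available */
};

/*
 * err is the kfunc return value (0, -EOPNOTSUPP or -ENODATA per the
 * kfunc documentation); ip_summed is only meaningful when err == 0.
 */
static enum rx_csum_verdict classify_rx_checksum(int err,
						 unsigned int ip_summed)
{
	if (err)
		return RX_CSUM_UNVERIFIED;

	switch (ip_summed) {
	case SKETCH_CHECKSUM_UNNECESSARY:
		return RX_CSUM_TRUSTED;
	case SKETCH_CHECKSUM_COMPLETE:
		return RX_CSUM_VERIFY_SW;
	case SKETCH_CHECKSUM_NONE:
	default:
		return RX_CSUM_UNVERIFIED;
	}
}
```

A filter built this way would likely apply its stricter policy (rate
limiting, software validation, or drop) only to the "unverified" class,
since that class mixes corrupted packets with packets the NIC simply did
not parse.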

---
Changes in v3:
- Remove leftover assignment from v2 in veth_xdp_rx_checksum()
- Fix typos
- Fix commit logs
- Link to v2: https://lore.kernel.org/r/20260213-bpf-xdp-meta-rxcksum-v2-0-a82c4802afbe@kernel.org

Changes in v2:
- Remove XDP_CHECKSUM_PARTIAL definition
- Improve veth_xdp_rx_checksum() callback
- Fix uninitialized case for cksum_meta in ice_get_rx_csum()
- Fix sparse warnings in ice driver
- Fix typos
- Link to v1: https://lore.kernel.org/r/20260210-bpf-xdp-meta-rxcksum-v1-0-e5d55caa0541@kernel.org

Changes in v1:
- Rebase on top of bpf-next
- Test ice driver using xdp_hw_metadata tool available in the bpf
  kernel selftest
- Improve cover letter with a use case for
  bpf_xdp_metadata_rx_checksum()
- Link to RFC v2: https://lore.kernel.org/r/20250925-bpf-xdp-meta-rxcksum-v2-0-6b3fe987ce91@kernel.org

Changes in RFC v2:
- Squash patch 1/6 and 2/6
- Introduce enum xdp_checksum definitions
- Rework ice support to reuse ice_rx_csum codebase

---
Lorenzo Bianconi (5):
      netlink: specs: Add XDP RX checksum capability to XDP metadata specs
      net: veth: Add xmo_rx_checksum callback to veth driver
      net: ice: Add xmo_rx_checksum callback
      selftests/bpf: Add selftest support for bpf_xdp_metadata_rx_checksum
      selftests/bpf: Add bpf_xdp_metadata_rx_checksum support to xdp_hw_metadata prog

 Documentation/netlink/specs/netdev.yaml            |   5 +
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c      | 118 +++++++++++++--------
 drivers/net/veth.c                                 |  29 +++++
 include/net/xdp.h                                  |  13 +++
 include/uapi/linux/netdev.h                        |   3 +
 net/core/xdp.c                                     |  28 +++++
 tools/include/uapi/linux/netdev.h                  |   3 +
 .../selftests/bpf/prog_tests/xdp_metadata.c        |   6 ++
 .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |   8 ++
 tools/testing/selftests/bpf/progs/xdp_metadata.c   |   3 +
 tools/testing/selftests/bpf/xdp_hw_metadata.c      |  25 +++++
 tools/testing/selftests/bpf/xdp_metadata.h         |  12 +++
 12 files changed, 211 insertions(+), 42 deletions(-)
---
base-commit: 192c0159402e6bfbe13de6f8379546943297783d
change-id: 20250925-bpf-xdp-meta-rxcksum-900685e2909d

Best regards,
-- 
Lorenzo Bianconi <lorenzo@kernel.org>



* [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-17  8:33 [PATCH bpf-next v3 0/5] Add the capability to load HW RX checksum in eBPF programs Lorenzo Bianconi
@ 2026-02-17  8:33 ` Lorenzo Bianconi
  2026-02-18  1:01   ` Stanislav Fomichev
  2026-02-19  1:47   ` Jakub Kicinski
  2026-02-17  8:33 ` [PATCH bpf-next v3 2/5] net: veth: Add xmo_rx_checksum callback to veth driver Lorenzo Bianconi
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-17  8:33 UTC (permalink / raw)
  To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski
  Cc: Jakub Sitnicki, netdev, bpf, intel-wired-lan, linux-kselftest,
	Lorenzo Bianconi

Introduce the XDP RX checksum capability to the XDP metadata specs. XDP RX
checksum will be used by devices capable of exposing the receive checksum
result via bpf_xdp_metadata_rx_checksum().
Moreover, introduce the xmo_rx_checksum netdev callback in order to allow
the eBPF program bound to the device to retrieve the RX checksum result
computed by the NIC hardware.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 Documentation/netlink/specs/netdev.yaml |  5 +++++
 include/net/xdp.h                       | 13 +++++++++++++
 include/uapi/linux/netdev.h             |  3 +++
 net/core/xdp.c                          | 28 ++++++++++++++++++++++++++++
 tools/include/uapi/linux/netdev.h       |  3 +++
 5 files changed, 52 insertions(+)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index 596c306ce52b8303b20680ff0cd34d4fd9db0e48..58eda634668a07860447a65d9fc2284839af6244 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -61,6 +61,11 @@ definitions:
         doc: |
           Device is capable of exposing receive packet VLAN tag via
           bpf_xdp_metadata_rx_vlan_tag().
+      -
+        name: checksum
+        doc: |
+          Device is capable of exposing receive checksum result via
+          bpf_xdp_metadata_rx_checksum().
   -
     type: flags
     name: xsk-flags
diff --git a/include/net/xdp.h b/include/net/xdp.h
index aa742f413c358575396530879af4570dc3fc18de..00abb2e1e85514b4080d0e4e6e3b8f5f67f73b61 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -586,6 +586,10 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
 			   NETDEV_XDP_RX_METADATA_VLAN_TAG, \
 			   bpf_xdp_metadata_rx_vlan_tag, \
 			   xmo_rx_vlan_tag) \
+	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_CHECKSUM, \
+			   NETDEV_XDP_RX_METADATA_CHECKSUM, \
+			   bpf_xdp_metadata_rx_checksum, \
+			   xmo_rx_checksum)
 
 enum xdp_rx_metadata {
 #define XDP_METADATA_KFUNC(name, _, __, ___) name,
@@ -643,12 +647,21 @@ enum xdp_rss_hash_type {
 	XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP | XDP_RSS_L3_DYNHDR,
 };
 
+enum xdp_checksum {
+	XDP_CHECKSUM_NONE		= CHECKSUM_NONE,
+	XDP_CHECKSUM_UNNECESSARY	= CHECKSUM_UNNECESSARY,
+	XDP_CHECKSUM_COMPLETE		= CHECKSUM_COMPLETE,
+};
+
 struct xdp_metadata_ops {
 	int	(*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
 	int	(*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
 			       enum xdp_rss_hash_type *rss_type);
 	int	(*xmo_rx_vlan_tag)(const struct xdp_md *ctx, __be16 *vlan_proto,
 				   u16 *vlan_tci);
+	int	(*xmo_rx_checksum)(const struct xdp_md *ctx,
+				   enum xdp_checksum *ip_summed,
+				   u32 *cksum_meta);
 };
 
 #ifdef CONFIG_NET
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index e0b579a1df4f2126acec6c44c299e97bbbefe640..d20da430cfd57bc26b5ea2f406c27b48d8a81693 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -47,11 +47,14 @@ enum netdev_xdp_act {
  *   hash via bpf_xdp_metadata_rx_hash().
  * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive
  *   packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
+ * @NETDEV_XDP_RX_METADATA_CHECKSUM: Device is capable of exposing receive
+ *   checksum result via bpf_xdp_metadata_rx_checksum().
  */
 enum netdev_xdp_rx_metadata {
 	NETDEV_XDP_RX_METADATA_TIMESTAMP = 1,
 	NETDEV_XDP_RX_METADATA_HASH = 2,
 	NETDEV_XDP_RX_METADATA_VLAN_TAG = 4,
+	NETDEV_XDP_RX_METADATA_CHECKSUM = 8,
 };
 
 /**
diff --git a/net/core/xdp.c b/net/core/xdp.c
index fee6d080ee85fc2d278bfdddfd1365633058ec06..7d1e08d8ab4151ab42c91203def2afafc66d3149 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -961,6 +961,34 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
 	return -EOPNOTSUPP;
 }
 
+/**
+ * bpf_xdp_metadata_rx_checksum - Read XDP frame RX checksum.
+ * @ctx: XDP context pointer.
+ * @ip_summed: Return value pointer indicating checksum result.
+ * @cksum_meta: Return value pointer indicating checksum result metadata.
+ *
+ * In case of success, ``ip_summed`` is set to the RX checksum result. Possible
+ * values are:
+ * ``XDP_CHECKSUM_NONE``
+ * ``XDP_CHECKSUM_UNNECESSARY``
+ * ``XDP_CHECKSUM_COMPLETE``
+ *
+ * In case of success, ``cksum_meta`` contains the hw computed checksum value
+ * for ``XDP_CHECKSUM_COMPLETE`` or the ``csum_level`` for
+ * ``XDP_CHECKSUM_UNNECESSARY``. It is set to 0 for ``XDP_CHECKSUM_NONE``
+ *
+ * Return:
+ * * Returns 0 on success or ``-errno`` on error.
+ * * ``-EOPNOTSUPP`` : means device driver does not implement kfunc
+ * * ``-ENODATA``    : means no RX-checksum available for this frame
+ */
+__bpf_kfunc int bpf_xdp_metadata_rx_checksum(const struct xdp_md *ctx,
+					     enum xdp_checksum *ip_summed,
+					     u32 *cksum_meta)
+{
+	return -EOPNOTSUPP;
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(xdp_metadata_kfunc_ids)
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index e0b579a1df4f2126acec6c44c299e97bbbefe640..d20da430cfd57bc26b5ea2f406c27b48d8a81693 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -47,11 +47,14 @@ enum netdev_xdp_act {
  *   hash via bpf_xdp_metadata_rx_hash().
  * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive
  *   packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
+ * @NETDEV_XDP_RX_METADATA_CHECKSUM: Device is capable of exposing receive
+ *   checksum result via bpf_xdp_metadata_rx_checksum().
  */
 enum netdev_xdp_rx_metadata {
 	NETDEV_XDP_RX_METADATA_TIMESTAMP = 1,
 	NETDEV_XDP_RX_METADATA_HASH = 2,
 	NETDEV_XDP_RX_METADATA_VLAN_TAG = 4,
+	NETDEV_XDP_RX_METADATA_CHECKSUM = 8,
 };
 
 /**

-- 
2.53.0



* [PATCH bpf-next v3 2/5] net: veth: Add xmo_rx_checksum callback to veth driver
  2026-02-17  8:33 [PATCH bpf-next v3 0/5] Add the capability to load HW RX checksum in eBPF programs Lorenzo Bianconi
  2026-02-17  8:33 ` [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs Lorenzo Bianconi
@ 2026-02-17  8:33 ` Lorenzo Bianconi
  2026-02-17  8:33 ` [PATCH bpf-next v3 3/5] net: ice: Add xmo_rx_checksum callback Lorenzo Bianconi
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-17  8:33 UTC (permalink / raw)
  To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski
  Cc: Jakub Sitnicki, netdev, bpf, intel-wired-lan, linux-kselftest,
	Lorenzo Bianconi

Implement the xmo_rx_checksum callback in the veth driver to report the RX
checksum result to the eBPF program bound to the veth device.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/veth.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 9982412fd7f238e996ccdff24342974cb25094bf..de67c98a995112a931dd871a71d84333b045fd62 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1697,6 +1697,34 @@ static int veth_xdp_rx_vlan_tag(const struct xdp_md *ctx, __be16 *vlan_proto,
 	return err;
 }
 
+static int veth_xdp_rx_checksum(const struct xdp_md *ctx,
+				enum xdp_checksum *ip_summed,
+				u32 *cksum_meta)
+{
+	const struct veth_xdp_buff *_ctx = (void *)ctx;
+	const struct sk_buff *skb = _ctx->skb;
+
+	if (!skb)
+		return -ENODATA;
+
+	switch (skb->ip_summed) {
+	case CHECKSUM_COMPLETE:
+		*ip_summed = XDP_CHECKSUM_COMPLETE;
+		*cksum_meta = skb->csum;
+		break;
+	case CHECKSUM_UNNECESSARY:
+		*ip_summed = XDP_CHECKSUM_UNNECESSARY;
+		*cksum_meta = skb->csum_level;
+		break;
+	default:
+		*ip_summed = XDP_CHECKSUM_NONE;
+		*cksum_meta = 0;
+		break;
+	}
+
+	return 0;
+}
+
 static const struct net_device_ops veth_netdev_ops = {
 	.ndo_init            = veth_dev_init,
 	.ndo_open            = veth_open,
@@ -1722,6 +1750,7 @@ static const struct xdp_metadata_ops veth_xdp_metadata_ops = {
 	.xmo_rx_timestamp		= veth_xdp_rx_timestamp,
 	.xmo_rx_hash			= veth_xdp_rx_hash,
 	.xmo_rx_vlan_tag		= veth_xdp_rx_vlan_tag,
+	.xmo_rx_checksum		= veth_xdp_rx_checksum,
 };
 
 #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \

-- 
2.53.0



* [PATCH bpf-next v3 3/5] net: ice: Add xmo_rx_checksum callback
  2026-02-17  8:33 [PATCH bpf-next v3 0/5] Add the capability to load HW RX checksum in eBPF programs Lorenzo Bianconi
  2026-02-17  8:33 ` [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs Lorenzo Bianconi
  2026-02-17  8:33 ` [PATCH bpf-next v3 2/5] net: veth: Add xmo_rx_checksum callback to veth driver Lorenzo Bianconi
@ 2026-02-17  8:33 ` Lorenzo Bianconi
  2026-02-17  8:33 ` [PATCH bpf-next v3 4/5] selftests/bpf: Add selftest support for bpf_xdp_metadata_rx_checksum Lorenzo Bianconi
  2026-02-17  8:34 ` [PATCH bpf-next v3 5/5] selftests/bpf: Add bpf_xdp_metadata_rx_checksum support to xdp_hw_metadata prog Lorenzo Bianconi
  4 siblings, 0 replies; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-17  8:33 UTC (permalink / raw)
  To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski
  Cc: Jakub Sitnicki, netdev, bpf, intel-wired-lan, linux-kselftest,
	Lorenzo Bianconi

Implement the xmo_rx_checksum callback in the ice driver to report the RX
checksum result to the eBPF program bound to the NIC.
Introduce the ice_get_rx_csum() utility routine in order to make the RX
checksum code reusable from ice_rx_csum().
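
The raw_csum conversion performed in ice_get_rx_csum() for the
CHECKSUM_COMPLETE case can be sketched in userspace C. This assumes the
generic csum_unfold(), which only widens a folded 16-bit sum to 32 bits
without changing its value; the *_sketch names and
raw_csum_to_cksum_meta() are stand-ins for illustration, not kernel
symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for swab16(): swap the two bytes of a 16-bit value. */
static uint16_t swab16_sketch(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Stand-in for the generic csum_unfold(): widen, value unchanged. */
static uint32_t csum_unfold_sketch(uint16_t folded)
{
	return (uint32_t)folded;
}

/* Byte-swap the descriptor checksum, then widen it to the u32 hint. */
static uint32_t raw_csum_to_cksum_meta(uint16_t raw_csum)
{
	return csum_unfold_sketch(swab16_sketch(raw_csum));
}

/* Usage example: the upper 16 bits of the result are always zero. */
static void raw_csum_selfcheck(void)
{
	assert(raw_csum_to_cksum_meta(0x1234) == 0x3412);
	assert((raw_csum_to_cksum_meta(0xffff) >> 16) == 0);
}
```

Carrying the value as a plain u32 is what lets ice_rx_csum() cast
cksum_meta back to __wsum for skb->csum without further conversion.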

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 118 +++++++++++++++++---------
 1 file changed, 76 insertions(+), 42 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 956da38d63b0032db238325dcff818bbd99478e9..65b34b29f396560739fc58293496a5e9ddf09b53 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -81,69 +81,47 @@ ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring,
 		libeth_rx_pt_set_hash(skb, hash, decoded);
 }
 
-/**
- * ice_rx_gcs - Set generic checksum in skb
- * @skb: skb currently being received and modified
- * @rx_desc: receive descriptor
- */
-static void ice_rx_gcs(struct sk_buff *skb,
-		       const union ice_32b_rx_flex_desc *rx_desc)
-{
-	const struct ice_32b_rx_flex_desc_nic *desc;
-	u16 csum;
-
-	desc = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
-	skb->ip_summed = CHECKSUM_COMPLETE;
-	csum = (__force u16)desc->raw_csum;
-	skb->csum = csum_unfold((__force __sum16)swab16(csum));
-}
-
-/**
- * ice_rx_csum - Indicate in skb if checksum is good
- * @ring: the ring we care about
- * @skb: skb currently being received and modified
- * @rx_desc: the receive descriptor
- * @ptype: the packet type decoded by hardware
- *
- * skb->protocol must be set before this function is called
- */
 static void
-ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
-	    union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
+ice_get_rx_csum(const union ice_32b_rx_flex_desc *rx_desc, u16 ptype,
+		struct ice_rx_ring *ring, enum xdp_checksum *ip_summed,
+		u32 *cksum_meta)
 {
-	struct libeth_rx_pt decoded;
+	struct libeth_rx_pt decoded = libie_rx_pt_parse(ptype);
 	u16 rx_status0, rx_status1;
 	bool ipv4, ipv6;
 
-	/* Start with CHECKSUM_NONE and by default csum_level = 0 */
-	skb->ip_summed = CHECKSUM_NONE;
-
-	decoded = libie_rx_pt_parse(ptype);
 	if (!libeth_rx_pt_has_checksum(ring->netdev, decoded))
-		return;
+		goto checksum_none;
 
 	rx_status0 = le16_to_cpu(rx_desc->wb.status_error0);
 	rx_status1 = le16_to_cpu(rx_desc->wb.status_error1);
-
 	if ((ring->flags & ICE_RX_FLAGS_RING_GCS) &&
 	    rx_desc->wb.rxdid == ICE_RXDID_FLEX_NIC &&
 	    (decoded.inner_prot == LIBETH_RX_PT_INNER_TCP ||
 	     decoded.inner_prot == LIBETH_RX_PT_INNER_UDP ||
 	     decoded.inner_prot == LIBETH_RX_PT_INNER_ICMP)) {
-		ice_rx_gcs(skb, rx_desc);
+		const struct ice_32b_rx_flex_desc_nic *desc;
+		__wsum wcsum;
+		u16 csum;
+
+		desc = (struct ice_32b_rx_flex_desc_nic *)rx_desc;
+		*ip_summed = XDP_CHECKSUM_COMPLETE;
+		csum = (__force u16)desc->raw_csum;
+		wcsum = csum_unfold((__force __sum16)swab16(csum));
+		*cksum_meta = (__force u32)wcsum;
 		return;
 	}
 
 	/* check if HW has decoded the packet and checksum */
 	if (!(rx_status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_L3L4P_S)))
-		return;
+		goto checksum_none;
 
 	ipv4 = libeth_rx_pt_get_ip_ver(decoded) == LIBETH_RX_PT_OUTER_IPV4;
 	ipv6 = libeth_rx_pt_get_ip_ver(decoded) == LIBETH_RX_PT_OUTER_IPV6;
 
 	if (ipv4 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_EIPE_S)))) {
 		ring->vsi->back->hw_rx_eipe_error++;
-		return;
+		goto checksum_none;
 	}
 
 	if (ipv4 && (rx_status0 & (BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_IPE_S))))
@@ -167,14 +145,45 @@ ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
 	 * we need to bump the checksum level by 1 to reflect the fact that
 	 * we are indicating we validated the inner checksum.
 	 */
-	if (decoded.tunnel_type >= LIBETH_RX_PT_TUNNEL_IP_GRENAT)
-		skb->csum_level = 1;
-
-	skb->ip_summed = CHECKSUM_UNNECESSARY;
+	*cksum_meta = decoded.tunnel_type >= LIBETH_RX_PT_TUNNEL_IP_GRENAT;
+	*ip_summed = XDP_CHECKSUM_UNNECESSARY;
 	return;
 
 checksum_fail:
 	ring->vsi->back->hw_csum_rx_error++;
+checksum_none:
+	*ip_summed = XDP_CHECKSUM_NONE;
+	*cksum_meta = 0;
+}
+
+/**
+ * ice_rx_csum - Indicate in skb if checksum is good
+ * @ring: the ring we care about
+ * @skb: skb currently being received and modified
+ * @rx_desc: the receive descriptor
+ * @ptype: the packet type decoded by hardware
+ *
+ * skb->protocol must be set before this function is called
+ */
+static void
+ice_rx_csum(struct ice_rx_ring *ring, struct sk_buff *skb,
+	    union ice_32b_rx_flex_desc *rx_desc, u16 ptype)
+{
+	enum xdp_checksum ip_summed;
+	u32 cksum_meta;
+
+	ice_get_rx_csum(rx_desc, ptype, ring, &ip_summed, &cksum_meta);
+	switch (ip_summed) {
+	case XDP_CHECKSUM_UNNECESSARY:
+		skb->csum_level = cksum_meta;
+		break;
+	case XDP_CHECKSUM_COMPLETE:
+		skb->csum = (__force __wsum)cksum_meta;
+		break;
+	default:
+		break;
+	}
+	skb->ip_summed = ip_summed;
 }
 
 /**
@@ -569,6 +578,30 @@ static int ice_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash,
 	return 0;
 }
 
+/**
+ * ice_xdp_rx_checksum - RX checksum XDP hint handler
+ * @ctx: XDP buff pointer
+ * @ip_summed: RX checksum result destination address
+ * @cksum_meta: XDP RX checksum metadata destination address
+ *
+ * Copy RX checksum result (if available) and its metadata to the
+ * destination address.
+ */
+static int ice_xdp_rx_checksum(const struct xdp_md *ctx,
+			       enum xdp_checksum *ip_summed,
+			       u32 *cksum_meta)
+{
+	const struct libeth_xdp_buff *xdp_ext = (void *)ctx;
+	const union ice_32b_rx_flex_desc *rx_desc = xdp_ext->desc;
+	struct ice_rx_ring *ring;
+
+	ring = libeth_xdp_buff_to_rq(xdp_ext, typeof(*ring), xdp_rxq);
+	ice_get_rx_csum(rx_desc, ice_get_ptype(rx_desc), ring, ip_summed,
+			cksum_meta);
+
+	return 0;
+}
+
 /**
  * ice_xdp_rx_vlan_tag - VLAN tag XDP hint handler
  * @ctx: XDP buff pointer
@@ -601,4 +634,5 @@ const struct xdp_metadata_ops ice_xdp_md_ops = {
 	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
 	.xmo_rx_hash			= ice_xdp_rx_hash,
 	.xmo_rx_vlan_tag		= ice_xdp_rx_vlan_tag,
+	.xmo_rx_checksum		= ice_xdp_rx_checksum,
 };

-- 
2.53.0



* [PATCH bpf-next v3 4/5] selftests/bpf: Add selftest support for bpf_xdp_metadata_rx_checksum
  2026-02-17  8:33 [PATCH bpf-next v3 0/5] Add the capability to load HW RX checksum in eBPF programs Lorenzo Bianconi
                   ` (2 preceding siblings ...)
  2026-02-17  8:33 ` [PATCH bpf-next v3 3/5] net: ice: Add xmo_rx_checksum callback Lorenzo Bianconi
@ 2026-02-17  8:33 ` Lorenzo Bianconi
  2026-02-17  9:17   ` bot+bpf-ci
  2026-02-17  8:34 ` [PATCH bpf-next v3 5/5] selftests/bpf: Add bpf_xdp_metadata_rx_checksum support to xdp_hw_metadata prog Lorenzo Bianconi
  4 siblings, 1 reply; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-17  8:33 UTC (permalink / raw)
  To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski
  Cc: Jakub Sitnicki, netdev, bpf, intel-wired-lan, linux-kselftest,
	Aleksandr Loktionov, Lorenzo Bianconi

Add support for the bpf_xdp_metadata_rx_checksum kfunc to the xdp_metadata
selftest.

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 tools/testing/selftests/bpf/prog_tests/xdp_metadata.c | 6 ++++++
 tools/testing/selftests/bpf/progs/xdp_metadata.c      | 3 +++
 tools/testing/selftests/bpf/xdp_metadata.h            | 8 ++++++++
 3 files changed, 17 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
index 19f92affc2daa23fdd869554e7a0475b86350a4f..31f3629eb0681be656fa0af74fc0d419a3d135fc 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
@@ -310,6 +310,12 @@ static int verify_xsk_metadata(struct xsk *xsk, bool sent_from_af_xdp)
 	if (!ASSERT_NEQ(meta->rx_hash, 0, "rx_hash"))
 		return -1;
 
+	if (!ASSERT_EQ(meta->ip_summed, XDP_CHECKSUM_NONE, "rx_ip_summed"))
+		return -1;
+
+	if (!ASSERT_EQ(meta->cksum_meta, 0, "rx_cksum_meta"))
+		return -1;
+
 	if (!sent_from_af_xdp) {
 		if (!ASSERT_NEQ(meta->rx_hash_type & XDP_RSS_TYPE_L4, 0, "rx_hash_type"))
 			return -1;
diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c
index 09bb8a038d528cf26c5b314cc927915ac2796bf0..72f69c5c659592cca1f04a512868f2101aa2e962 100644
--- a/tools/testing/selftests/bpf/progs/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c
@@ -98,6 +98,9 @@ int rx(struct xdp_md *ctx)
 	bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash, &meta->rx_hash_type);
 	bpf_xdp_metadata_rx_vlan_tag(ctx, &meta->rx_vlan_proto,
 				     &meta->rx_vlan_tci);
+	bpf_xdp_metadata_rx_checksum(ctx,
+				     (enum xdp_checksum *)&meta->ip_summed,
+				     &meta->cksum_meta);
 
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
 }
diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
index 87318ad1117a1d677af121f11778178532e2a562..837cd1efe6b5aebd0f62bae4c49d5bfd77db64bc 100644
--- a/tools/testing/selftests/bpf/xdp_metadata.h
+++ b/tools/testing/selftests/bpf/xdp_metadata.h
@@ -30,6 +30,10 @@ enum xdp_meta_field {
 	XDP_META_FIELD_VLAN_TAG	= BIT(2),
 };
 
+#define XDP_CHECKSUM_NONE		0
+#define XDP_CHECKSUM_UNNECESSARY	1
+#define XDP_CHECKSUM_COMPLETE		2
+
 struct xdp_meta {
 	union {
 		__u64 rx_timestamp;
@@ -48,5 +52,9 @@ struct xdp_meta {
 		};
 		__s32 rx_vlan_tag_err;
 	};
+	struct {
+		__u8 ip_summed;
+		__u32 cksum_meta;
+	};
 	enum xdp_meta_field hint_valid;
 };

-- 
2.53.0



* [PATCH bpf-next v3 5/5] selftests/bpf: Add bpf_xdp_metadata_rx_checksum support to xdp_hw_metadata prog
  2026-02-17  8:33 [PATCH bpf-next v3 0/5] Add the capability to load HW RX checksum in eBPF programs Lorenzo Bianconi
                   ` (3 preceding siblings ...)
  2026-02-17  8:33 ` [PATCH bpf-next v3 4/5] selftests/bpf: Add selftest support for bpf_xdp_metadata_rx_checksum Lorenzo Bianconi
@ 2026-02-17  8:34 ` Lorenzo Bianconi
  4 siblings, 0 replies; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-17  8:34 UTC (permalink / raw)
  To: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski
  Cc: Jakub Sitnicki, netdev, bpf, intel-wired-lan, linux-kselftest,
	Lorenzo Bianconi

Add the capability to dump the HW RX checksum in the xdp_hw_metadata
program via the bpf_xdp_metadata_rx_checksum() kfunc.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 .../testing/selftests/bpf/progs/xdp_hw_metadata.c  |  8 +++++++
 tools/testing/selftests/bpf/xdp_hw_metadata.c      | 25 ++++++++++++++++++++++
 tools/testing/selftests/bpf/xdp_metadata.h         | 10 ++++++---
 3 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
index 330ece2eabdb454da2bb2cbd297d2b2dd6efddc0..9113288fce54bb39dbb6b38e1f6badd0a8ef4cde 100644
--- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
@@ -110,6 +110,14 @@ int rx(struct xdp_md *ctx)
 	else
 		meta->hint_valid |= XDP_META_FIELD_VLAN_TAG;
 
+	err = bpf_xdp_metadata_rx_checksum(ctx,
+			(enum xdp_checksum *)&meta->ip_summed,
+			&meta->cksum_meta);
+	if (err)
+		meta->rx_cksum_err = err;
+	else
+		meta->hint_valid |= XDP_META_FIELD_CHECKSUM;
+
 	__sync_add_and_fetch(&pkts_redir, 1);
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
 }
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 3d8de0d4c96a7afdf5f60b2fdff73c22b876ce54..f1cf8490cc497b5dd75b8d839918fb85c2a0684d 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -8,6 +8,7 @@
  * - Metadata verified:
  *   - rx_timestamp
  *   - rx_hash
+ *   - rx_checksum
  *
  * TX:
  * - UDP 9091 packets trigger TX reply
@@ -219,6 +220,25 @@ static void print_vlan_tci(__u16 tag)
 	printf("PCP=%u, DEI=%d, VID=0x%X\n", pcp, dei, vlan_id);
 }
 
+static void print_rx_cksum(__u8 ip_summed, __u32 cksum_meta)
+{
+	const char *cksum = "CHECKSUM_NONE";
+
+	switch (ip_summed) {
+	case XDP_CHECKSUM_UNNECESSARY:
+		cksum = "CHECKSUM_UNNECESSARY";
+		break;
+	case XDP_CHECKSUM_COMPLETE:
+		cksum = "CHECKSUM_COMPLETE";
+		break;
+	case XDP_CHECKSUM_NONE:
+	default:
+		break;
+	}
+
+	printf("rx-cksum: %s, csum_meta=0x%x\n", cksum, cksum_meta);
+}
+
 static void verify_xdp_metadata(void *data, clockid_t clock_id)
 {
 	struct xdp_meta *meta;
@@ -254,6 +274,11 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id)
 		printf("No rx_vlan_tci or rx_vlan_proto, err=%d\n",
 		       meta->rx_vlan_tag_err);
 	}
+
+	if (meta->hint_valid & XDP_META_FIELD_CHECKSUM)
+		print_rx_cksum(meta->ip_summed, meta->cksum_meta);
+	else
+		printf("No rx_cksum, err=%d\n", meta->rx_cksum_err);
 }
 
 static void verify_skb_metadata(int fd)
diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
index 837cd1efe6b5aebd0f62bae4c49d5bfd77db64bc..822e34390582b3d3660a25e725c9a622edbf635c 100644
--- a/tools/testing/selftests/bpf/xdp_metadata.h
+++ b/tools/testing/selftests/bpf/xdp_metadata.h
@@ -28,6 +28,7 @@ enum xdp_meta_field {
 	XDP_META_FIELD_TS	= BIT(0),
 	XDP_META_FIELD_RSS	= BIT(1),
 	XDP_META_FIELD_VLAN_TAG	= BIT(2),
+	XDP_META_FIELD_CHECKSUM = BIT(3),
 };
 
 #define XDP_CHECKSUM_NONE		0
@@ -52,9 +53,12 @@ struct xdp_meta {
 		};
 		__s32 rx_vlan_tag_err;
 	};
-	struct {
-		__u8 ip_summed;
-		__u32 cksum_meta;
+	union {
+		struct {
+			__u8 ip_summed;
+			__u32 cksum_meta;
+		};
+		__s32 rx_cksum_err;
 	};
 	enum xdp_meta_field hint_valid;
 };

-- 
2.53.0



* Re: [PATCH bpf-next v3 4/5] selftests/bpf: Add selftest support for bpf_xdp_metadata_rx_checksum
  2026-02-17  8:33 ` [PATCH bpf-next v3 4/5] selftests/bpf: Add selftest support for bpf_xdp_metadata_rx_checksum Lorenzo Bianconi
@ 2026-02-17  9:17   ` bot+bpf-ci
  0 siblings, 0 replies; 17+ messages in thread
From: bot+bpf-ci @ 2026-02-17  9:17 UTC (permalink / raw)
  To: lorenzo, donald.hunter, kuba, davem, edumazet, pabeni, horms, ast,
	daniel, hawk, john.fastabend, sdf, andrew+netdev,
	anthony.l.nguyen, przemyslaw.kitszel, aleksander.lobakin, andrii,
	martin.lau, eddyz87, song, yonghong.song, kpsingh, haoluo, jolsa,
	shuah, maciej.fijalkowski
  Cc: jakub, netdev, bpf, intel-wired-lan, linux-kselftest,
	aleksandr.loktionov, lorenzo, ast, andrii, daniel, martin.lau,
	eddyz87, yonghong.song, clm, ihor.solodrai

[-- Attachment #1: Type: text/plain, Size: 2116 bytes --]

> diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
> --- a/tools/testing/selftests/bpf/xdp_metadata.h
> +++ b/tools/testing/selftests/bpf/xdp_metadata.h

[ ... ]

> +	struct {
> +		__u8 ip_summed;
> +		__u32 cksum_meta;
> +	};

The ip_summed field is __u8 (1 byte), but bpf_xdp_metadata_rx_checksum()
takes an enum xdp_checksum * argument, which is 4 bytes.

In the BPF program:

> diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c
> --- a/tools/testing/selftests/bpf/progs/xdp_metadata.c
> +++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c

[ ... ]

> +	bpf_xdp_metadata_rx_checksum(ctx,
> +				     (enum xdp_checksum *)&meta->ip_summed,
> +				     &meta->cksum_meta);

The cast here passes a pointer to a 1-byte field where the kfunc will
perform a 4-byte store.  On little-endian this happens to work because
the value lands in the low byte and the remaining 3 bytes go into struct
padding.

On big-endian, the value would be stored in the high byte, so reading
back ip_summed as __u8 would yield 0 for any checksum result.

For comparison, the other metadata fields match their kfunc argument
sizes: rx_hash is __u32 matching u32 *, and rx_hash_type is __u32
matching enum xdp_rss_hash_type *.

Should ip_summed be __u32 (or enum xdp_checksum) to match the kfunc
signature and avoid the cast?
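
To make the mismatch concrete, here is a minimal userspace sketch; it is
not the selftest code. The structs meta_u8 and meta_u32 and the helpers
store_u32(), read_back_u8() and read_back_u32() are hypothetical names
that only mimic the quoted layout and the 4-byte store the kfunc is
assumed to perform. The value 2 used below corresponds to
CHECKSUM_COMPLETE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the quoted selftest layout. */
struct meta_u8 {
	uint8_t  ip_summed;	/* 1 byte, typically followed by 3 bytes of padding */
	uint32_t cksum_meta;
};

/* Proposed layout: the field is as wide as the store. */
struct meta_u32 {
	uint32_t ip_summed;
	uint32_t cksum_meta;
};

/* Simulate the 4-byte store done through an enum xdp_checksum pointer.
 * memcpy keeps this userspace sketch free of undefined behaviour.
 */
static void store_u32(void *dst, uint32_t val)
{
	memcpy(dst, &val, sizeof(val));
}

static int host_is_little_endian(void)
{
	const uint32_t one = 1;
	uint8_t first;

	memcpy(&first, &one, 1);
	return first == 1;
}

/* Value seen through the __u8 field after a 4-byte store at its address. */
static uint8_t read_back_u8(uint32_t val)
{
	struct meta_u8 m;

	/* The store must stay inside the struct for the sketch to be valid. */
	assert(sizeof(struct meta_u8) >= sizeof(uint32_t));
	memset(&m, 0, sizeof(m));
	store_u32(&m, val);	/* ip_summed lives at offset 0 */
	return m.ip_summed;
}

/* Value seen through a __u32 field: independent of byte order. */
static uint32_t read_back_u32(uint32_t val)
{
	struct meta_u32 m;

	memset(&m, 0, sizeof(m));
	store_u32(&m.ip_summed, val);
	return m.ip_summed;
}
```

With the 1-byte field, read_back_u8(2) returns 2 on a little-endian host
and 0 on a big-endian one; read_back_u32(2) returns 2 on both.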


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22091902298

AI-authorship-score: low
AI-authorship-explanation: Code and commit message follow established kernel selftest patterns consistent with a known contributor, and the type choice reflects human engineering judgment.
issues-found: 1
issue-severity-score: low
issue-severity-explanation: Type size mismatch in selftest code where a __u8 field is cast to enum xdp_checksum * (4 bytes) for a kfunc call, causing incorrect behavior on big-endian architectures.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-17  8:33 ` [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs Lorenzo Bianconi
@ 2026-02-18  1:01   ` Stanislav Fomichev
  2026-02-18 10:58     ` Jesper Dangaard Brouer
  2026-02-19  1:47   ` Jakub Kicinski
  1 sibling, 1 reply; 17+ messages in thread
From: Stanislav Fomichev @ 2026-02-18  1:01 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Lorenzo Bianconi, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Andrew Lunn, Tony Nguyen, Przemek Kitszel,
	Alexander Lobakin, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, KP Singh, Hao Luo,
	Jiri Olsa, Shuah Khan, Maciej Fijalkowski, Jakub Sitnicki, netdev,
	bpf, intel-wired-lan, linux-kselftest

On 02/17, Lorenzo Bianconi wrote:
> Introduce XDP RX checksum capability to XDP metadata specs. XDP RX
> checksum will be used by devices capable of exposing receive checksum
> result via bpf_xdp_metadata_rx_checksum().
> Moreover, introduce xmo_rx_checksum netdev callback in order to allow
> the eBPF program bound to the device to retrieve the RX checksum result
> computed by the hw NIC.
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  Documentation/netlink/specs/netdev.yaml |  5 +++++
>  include/net/xdp.h                       | 13 +++++++++++++
>  include/uapi/linux/netdev.h             |  3 +++
>  net/core/xdp.c                          | 28 ++++++++++++++++++++++++++++
>  tools/include/uapi/linux/netdev.h       |  3 +++
>  5 files changed, 52 insertions(+)
> 
> diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
> index 596c306ce52b8303b20680ff0cd34d4fd9db0e48..58eda634668a07860447a65d9fc2284839af6244 100644
> --- a/Documentation/netlink/specs/netdev.yaml
> +++ b/Documentation/netlink/specs/netdev.yaml
> @@ -61,6 +61,11 @@ definitions:
>          doc: |
>            Device is capable of exposing receive packet VLAN tag via
>            bpf_xdp_metadata_rx_vlan_tag().
> +      -
> +        name: checksum
> +        doc: |
> +          Device is capable of exposing receive checksum result via
> +          bpf_xdp_metadata_rx_checksum().
>    -
>      type: flags
>      name: xsk-flags
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index aa742f413c358575396530879af4570dc3fc18de..00abb2e1e85514b4080d0e4e6e3b8f5f67f73b61 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -586,6 +586,10 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
>  			   NETDEV_XDP_RX_METADATA_VLAN_TAG, \
>  			   bpf_xdp_metadata_rx_vlan_tag, \
>  			   xmo_rx_vlan_tag) \
> +	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_CHECKSUM, \
> +			   NETDEV_XDP_RX_METADATA_CHECKSUM, \
> +			   bpf_xdp_metadata_rx_checksum, \
> +			   xmo_rx_checksum)
>  
>  enum xdp_rx_metadata {
>  #define XDP_METADATA_KFUNC(name, _, __, ___) name,
> @@ -643,12 +647,21 @@ enum xdp_rss_hash_type {
>  	XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP | XDP_RSS_L3_DYNHDR,
>  };
>  
> +enum xdp_checksum {
> +	XDP_CHECKSUM_NONE		= CHECKSUM_NONE,
> +	XDP_CHECKSUM_UNNECESSARY	= CHECKSUM_UNNECESSARY,
> +	XDP_CHECKSUM_COMPLETE		= CHECKSUM_COMPLETE,
> +};
> +
>  struct xdp_metadata_ops {
>  	int	(*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
>  	int	(*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
>  			       enum xdp_rss_hash_type *rss_type);
>  	int	(*xmo_rx_vlan_tag)(const struct xdp_md *ctx, __be16 *vlan_proto,
>  				   u16 *vlan_tci);
> +	int	(*xmo_rx_checksum)(const struct xdp_md *ctx,
> +				   enum xdp_checksum *ip_summed,
> +				   u32 *cksum_meta);
>  };
>  
>  #ifdef CONFIG_NET
> diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
> index e0b579a1df4f2126acec6c44c299e97bbbefe640..d20da430cfd57bc26b5ea2f406c27b48d8a81693 100644
> --- a/include/uapi/linux/netdev.h
> +++ b/include/uapi/linux/netdev.h
> @@ -47,11 +47,14 @@ enum netdev_xdp_act {
>   *   hash via bpf_xdp_metadata_rx_hash().
>   * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive
>   *   packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
> + * @NETDEV_XDP_RX_METADATA_CHECKSUM: Device is capable of exposing receive
> + *   checksum result via bpf_xdp_metadata_rx_checksum().
>   */
>  enum netdev_xdp_rx_metadata {
>  	NETDEV_XDP_RX_METADATA_TIMESTAMP = 1,
>  	NETDEV_XDP_RX_METADATA_HASH = 2,
>  	NETDEV_XDP_RX_METADATA_VLAN_TAG = 4,
> +	NETDEV_XDP_RX_METADATA_CHECKSUM = 8,
>  };
>  
>  /**
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index fee6d080ee85fc2d278bfdddfd1365633058ec06..7d1e08d8ab4151ab42c91203def2afafc66d3149 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -961,6 +961,34 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
>  	return -EOPNOTSUPP;
>  }
>  
> +/**
> + * bpf_xdp_metadata_rx_checksum - Read XDP frame RX checksum.
> + * @ctx: XDP context pointer.
> + * @ip_summed: Return value pointer indicating checksum result.
> + * @cksum_meta: Return value pointer indicating checksum result metadata.
> + *
> + * In case of success, ``ip_summed`` is set to the RX checksum result. Possible
> + * values are:
> + * ``XDP_CHECKSUM_NONE``
> + * ``XDP_CHECKSUM_UNNECESSARY``
> + * ``XDP_CHECKSUM_COMPLETE``
> + *
> + * In case of success, ``cksum_meta`` contains the hw computed checksum value
> + * for ``XDP_CHECKSUM_COMPLETE`` or the ``csum_level`` for
> + * ``XDP_CHECKSUM_UNNECESSARY``. It is set to 0 for ``XDP_CHECKSUM_NONE``

The only thing I'm still not sure about is the csum_level and whether
we need to export it or just start with csum_level=0 and extend later
when needed. The rest looks good.

Jesper, Lorenzo mentioned that you might need it? Can you clarify?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-18  1:01   ` Stanislav Fomichev
@ 2026-02-18 10:58     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 17+ messages in thread
From: Jesper Dangaard Brouer @ 2026-02-18 10:58 UTC (permalink / raw)
  To: Stanislav Fomichev, Jesse Brandeburg, Arthur Fabre
  Cc: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Lorenzo Bianconi, John Fastabend, Stanislav Fomichev, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Alexander Lobakin, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	KP Singh, Hao Luo, Jiri Olsa, Shuah Khan, Maciej Fijalkowski,
	Jakub Sitnicki, netdev, bpf, intel-wired-lan, linux-kselftest,
	kernel-team, Willem Ferguson



On 18/02/2026 02.01, Stanislav Fomichev wrote:
> On 02/17, Lorenzo Bianconi wrote:
>> Introduce XDP RX checksum capability to XDP metadata specs. XDP RX
>> checksum will be used by devices capable of exposing receive checksum
>> result via bpf_xdp_metadata_rx_checksum().
>> Moreover, introduce xmo_rx_checksum netdev callback in order to allow
>> the eBPF program bound to the device to retrieve the RX checksum result
>> computed by the hw NIC.
>>
>> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
>> ---
>>   Documentation/netlink/specs/netdev.yaml |  5 +++++
>>   include/net/xdp.h                       | 13 +++++++++++++
>>   include/uapi/linux/netdev.h             |  3 +++
>>   net/core/xdp.c                          | 28 ++++++++++++++++++++++++++++
>>   tools/include/uapi/linux/netdev.h       |  3 +++
>>   5 files changed, 52 insertions(+)
>>
>> diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
>> index 596c306ce52b8303b20680ff0cd34d4fd9db0e48..58eda634668a07860447a65d9fc2284839af6244 100644
>> --- a/Documentation/netlink/specs/netdev.yaml
>> +++ b/Documentation/netlink/specs/netdev.yaml
>> @@ -61,6 +61,11 @@ definitions:
>>           doc: |
>>             Device is capable of exposing receive packet VLAN tag via
>>             bpf_xdp_metadata_rx_vlan_tag().
>> +      -
>> +        name: checksum
>> +        doc: |
>> +          Device is capable of exposing receive checksum result via
>> +          bpf_xdp_metadata_rx_checksum().
>>     -
>>       type: flags
>>       name: xsk-flags
>> diff --git a/include/net/xdp.h b/include/net/xdp.h
>> index aa742f413c358575396530879af4570dc3fc18de..00abb2e1e85514b4080d0e4e6e3b8f5f67f73b61 100644
>> --- a/include/net/xdp.h
>> +++ b/include/net/xdp.h
>> @@ -586,6 +586,10 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
>>   			   NETDEV_XDP_RX_METADATA_VLAN_TAG, \
>>   			   bpf_xdp_metadata_rx_vlan_tag, \
>>   			   xmo_rx_vlan_tag) \
>> +	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_CHECKSUM, \
>> +			   NETDEV_XDP_RX_METADATA_CHECKSUM, \
>> +			   bpf_xdp_metadata_rx_checksum, \
>> +			   xmo_rx_checksum)
>>   
>>   enum xdp_rx_metadata {
>>   #define XDP_METADATA_KFUNC(name, _, __, ___) name,
>> @@ -643,12 +647,21 @@ enum xdp_rss_hash_type {
>>   	XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP | XDP_RSS_L3_DYNHDR,
>>   };
>>   
>> +enum xdp_checksum {
>> +	XDP_CHECKSUM_NONE		= CHECKSUM_NONE,
>> +	XDP_CHECKSUM_UNNECESSARY	= CHECKSUM_UNNECESSARY,
>> +	XDP_CHECKSUM_COMPLETE		= CHECKSUM_COMPLETE,
>> +};
>> +
>>   struct xdp_metadata_ops {
>>   	int	(*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
>>   	int	(*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
>>   			       enum xdp_rss_hash_type *rss_type);
>>   	int	(*xmo_rx_vlan_tag)(const struct xdp_md *ctx, __be16 *vlan_proto,
>>   				   u16 *vlan_tci);
>> +	int	(*xmo_rx_checksum)(const struct xdp_md *ctx,
>> +				   enum xdp_checksum *ip_summed,
>> +				   u32 *cksum_meta);
>>   };
>>   
>>   #ifdef CONFIG_NET
>> diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
>> index e0b579a1df4f2126acec6c44c299e97bbbefe640..d20da430cfd57bc26b5ea2f406c27b48d8a81693 100644
>> --- a/include/uapi/linux/netdev.h
>> +++ b/include/uapi/linux/netdev.h
>> @@ -47,11 +47,14 @@ enum netdev_xdp_act {
>>    *   hash via bpf_xdp_metadata_rx_hash().
>>    * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive
>>    *   packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
>> + * @NETDEV_XDP_RX_METADATA_CHECKSUM: Device is capable of exposing receive
>> + *   checksum result via bpf_xdp_metadata_rx_checksum().
>>    */
>>   enum netdev_xdp_rx_metadata {
>>   	NETDEV_XDP_RX_METADATA_TIMESTAMP = 1,
>>   	NETDEV_XDP_RX_METADATA_HASH = 2,
>>   	NETDEV_XDP_RX_METADATA_VLAN_TAG = 4,
>> +	NETDEV_XDP_RX_METADATA_CHECKSUM = 8,
>>   };
>>   
>>   /**
>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>> index fee6d080ee85fc2d278bfdddfd1365633058ec06..7d1e08d8ab4151ab42c91203def2afafc66d3149 100644
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -961,6 +961,34 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
>>   	return -EOPNOTSUPP;
>>   }
>>   
>> +/**
>> + * bpf_xdp_metadata_rx_checksum - Read XDP frame RX checksum.
>> + * @ctx: XDP context pointer.
>> + * @ip_summed: Return value pointer indicating checksum result.
>> + * @cksum_meta: Return value pointer indicating checksum result metadata.
>> + *
>> + * In case of success, ``ip_summed`` is set to the RX checksum result. Possible
>> + * values are:
>> + * ``XDP_CHECKSUM_NONE``
>> + * ``XDP_CHECKSUM_UNNECESSARY``
>> + * ``XDP_CHECKSUM_COMPLETE``
>> + *
>> + * In case of success, ``cksum_meta`` contains the hw computed checksum value
>> + * for ``XDP_CHECKSUM_COMPLETE`` or the ``csum_level`` for
>> + * ``XDP_CHECKSUM_UNNECESSARY``. It is set to 0 for ``XDP_CHECKSUM_NONE``
> 
> The only thing I'm still not sure about is the csum_level and whether
> we need to export it or just start with csum_level=0 and extend later
> when needed. The rest looks good.
> 
> Jesper, Lorenzo mentioned that you might need it? Can you clarify?

At Cloudflare our load-balancer Unimog[1] does GUE (Generic UDP
Encapsulation) when XDP_TX'ing packets to neighboring servers.
Thus, I assume we want to know the csum_level, as this is for
encapsulated packets, right?

Cc Jesse, as he knows more about the hardware and csum_level. To Jesse,
we need to test how hardware handles our GUE packet format (which is
slightly modified).

Cc Arthur + Willem, as they know the details of how Unimog
currently has to recalculate packet checksums in software.  Hopefully this
patchset can help us avoid doing this in some cases.

--Jesper

  [1] https://blog.cloudflare.com/unimog-cloudflares-edge-load-balancer/#encapsulation
  [Patch-0/5] https://lore.kernel.org/all/20260217-bpf-xdp-meta-rxcksum-v3-0-30024c50ba71@kernel.org/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-17  8:33 ` [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs Lorenzo Bianconi
  2026-02-18  1:01   ` Stanislav Fomichev
@ 2026-02-19  1:47   ` Jakub Kicinski
  2026-02-19 11:04     ` Lorenzo Bianconi
  1 sibling, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2026-02-19  1:47 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Donald Hunter, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski, Jakub Sitnicki, netdev, bpf, intel-wired-lan,
	linux-kselftest

On Tue, 17 Feb 2026 09:33:56 +0100 Lorenzo Bianconi wrote:
> + * In case of success, ``ip_summed`` is set to the RX checksum result. Possible
> + * values are:
> + * ``XDP_CHECKSUM_NONE``
> + * ``XDP_CHECKSUM_UNNECESSARY``
> + * ``XDP_CHECKSUM_COMPLETE``
> + *
> + * In case of success, ``cksum_meta`` contains the hw computed checksum value
> + * for ``XDP_CHECKSUM_COMPLETE`` or the ``csum_level`` for
> + * ``XDP_CHECKSUM_UNNECESSARY``. It is set to 0 for ``XDP_CHECKSUM_NONE``

It's fairly common for NICs to report both csum complete and
unnecessary. Which one should the driver return in that case?
What if the user prefers the other one?..

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-19  1:47   ` Jakub Kicinski
@ 2026-02-19 11:04     ` Lorenzo Bianconi
  2026-02-19 17:13       ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-19 11:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Donald Hunter, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski, Jakub Sitnicki, netdev, bpf, intel-wired-lan,
	linux-kselftest

> On Tue, 17 Feb 2026 09:33:56 +0100 Lorenzo Bianconi wrote:
> > + * In case of success, ``ip_summed`` is set to the RX checksum result. Possible
> > + * values are:
> > + * ``XDP_CHECKSUM_NONE``
> > + * ``XDP_CHECKSUM_UNNECESSARY``
> > + * ``XDP_CHECKSUM_COMPLETE``
> > + *
> > + * In case of success, ``cksum_meta`` contains the hw computed checksum value
> > + * for ``XDP_CHECKSUM_COMPLETE`` or the ``csum_level`` for
> > + * ``XDP_CHECKSUM_UNNECESSARY``. It is set to 0 for ``XDP_CHECKSUM_NONE``
> 
> It's fairly common for NICs to report both csum complete and
> unnecessary. Which one should the driver return in that case?

Do you mean what the value of cksum_meta would be if we do not report
csum_level for the XDP_CHECKSUM_UNNECESSARY/CHECKSUM_UNNECESSARY use case
(as suggested by Stanislav)?

My original idea is:
- if the hw reports CHECKSUM_COMPLETE:
  - ip_summed = XDP_CHECKSUM_COMPLETE
  - cksum_meta contains the checksum computed by the hw
- if the hw reports CHECKSUM_UNNECESSARY
  - ip_summed = XDP_CHECKSUM_UNNECESSARY
  - cksum_meta = csum_level <-- Stanislav suggests to drop this one
- if the hw reports CHECKSUM_NONE
  - ip_summed = XDP_CHECKSUM_NONE
  - cksum_meta = 0
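
As a rough illustration of the mapping above, here is a userspace sketch of
how a consumer might decode the (ip_summed, cksum_meta) pair. The enum
mirrors the one in the patch, with values following the kernel's
CHECKSUM_NONE/UNNECESSARY/COMPLETE (0, 1, 2); struct rx_csum_info and
decode_rx_csum() are hypothetical names used only for this example. It
relies on csum_level being the number of verified checksums minus one.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors enum xdp_checksum from the patch (CHECKSUM_* values 0, 1, 2). */
enum xdp_checksum {
	XDP_CHECKSUM_NONE		= 0,
	XDP_CHECKSUM_UNNECESSARY	= 1,
	XDP_CHECKSUM_COMPLETE		= 2,
};

/* Hypothetical decoded view of the (ip_summed, cksum_meta) pair. */
struct rx_csum_info {
	int      verified_csums;	/* checksums the hw validated */
	int      has_hw_csum;		/* non-zero if hw_csum is meaningful */
	uint32_t hw_csum;		/* checksum computed by the hw */
};

static struct rx_csum_info decode_rx_csum(enum xdp_checksum ip_summed,
					  uint32_t cksum_meta)
{
	struct rx_csum_info info = { 0, 0, 0 };

	switch (ip_summed) {
	case XDP_CHECKSUM_COMPLETE:
		/* cksum_meta carries the checksum computed by the hw */
		info.has_hw_csum = 1;
		info.hw_csum = cksum_meta;
		break;
	case XDP_CHECKSUM_UNNECESSARY:
		/* cksum_meta carries csum_level: verified checksums minus one */
		info.verified_csums = (int)cksum_meta + 1;
		break;
	case XDP_CHECKSUM_NONE:
	default:
		/* nothing verified, cksum_meta is 0 */
		break;
	}
	return info;
}

/* Usage: an encapsulated packet with outer and inner checksums verified. */
static void decode_rx_csum_example(void)
{
	struct rx_csum_info info = decode_rx_csum(XDP_CHECKSUM_UNNECESSARY, 1);

	assert(info.verified_csums == 2);
	assert(!info.has_hw_csum);
}
```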

Regards,
Lorenzo

> What if the user prefers the other one?..

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-19 11:04     ` Lorenzo Bianconi
@ 2026-02-19 17:13       ` Jakub Kicinski
  2026-02-23 17:11         ` Lorenzo Bianconi
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2026-02-19 17:13 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Donald Hunter, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski, Jakub Sitnicki, netdev, bpf, intel-wired-lan,
	linux-kselftest

On Thu, 19 Feb 2026 12:04:56 +0100 Lorenzo Bianconi wrote:
> > On Tue, 17 Feb 2026 09:33:56 +0100 Lorenzo Bianconi wrote:  
> > > + * In case of success, ``ip_summed`` is set to the RX checksum result. Possible
> > > + * values are:
> > > + * ``XDP_CHECKSUM_NONE``
> > > + * ``XDP_CHECKSUM_UNNECESSARY``
> > > + * ``XDP_CHECKSUM_COMPLETE``
> > > + *
> > > + * In case of success, ``cksum_meta`` contains the hw computed checksum value
> > > + * for ``XDP_CHECKSUM_COMPLETE`` or the ``csum_level`` for
> > > + * ``XDP_CHECKSUM_UNNECESSARY``. It is set to 0 for ``XDP_CHECKSUM_NONE``  
> > 
> > It's fairly common for NICs to report both csum complete and
> > unnecessary. Which one should the driver return in that case?  
> 
> Do you mean what the value of cksum_meta would be if we do not report
> csum_level for the XDP_CHECKSUM_UNNECESSARY/CHECKSUM_UNNECESSARY use case
> (as suggested by Stanislav)?

More fundamentally whether the API is right.

> My original idea is:
> - if the hw reports CHECKSUM_COMPLETE:
>   - ip_summed = XDP_CHECKSUM_COMPLETE
>   - cksum_meta contains the checksum computed by the hw
> - if the hw reports CHECKSUM_UNNECESSARY
>   - ip_summed = XDP_CHECKSUM_UNNECESSARY
>   - cksum_meta = csum_level <-- Stanislav suggests to drop this one
> - if the hw reports CHECKSUM_NONE
>   - ip_summed = XDP_CHECKSUM_NONE
>   - cksum_meta = 0

Off the top of my head drivers prefer reporting UNNECESSARY when they
have both, and reserve COMPLETE for cases where L4 could not be found
or is incorrect. Why don't we report both? We're using 3 args, we still
have 3 to go. We could turn ip_summed into a bitmap and have explicit
output args for both level and csum complete value?
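
A possible shape for the "report both" idea, written as a userspace sketch
rather than a proposal for the actual kfunc signature. The flag names
XDP_CSUM_F_UNNECESSARY and XDP_CSUM_F_COMPLETE, struct hw_rx_csum, and
rx_checksum_both() are all hypothetical; none of them exist in the posted
series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for a bitmap-style result. */
#define XDP_CSUM_F_UNNECESSARY	(1u << 0)
#define XDP_CSUM_F_COMPLETE	(1u << 1)

/* Hypothetical view of what a driver parsed from its RX descriptor. */
struct hw_rx_csum {
	int      l4_verified;	/* hw validated the L4 checksum(s) */
	uint32_t csum_level;	/* verified checksums minus one */
	int      has_full_csum;	/* hw summed the whole packet */
	uint32_t full_csum;
};

/* Report both results at once: a flags bitmap plus one explicit output
 * argument per kind of result.
 */
static uint32_t rx_checksum_both(const struct hw_rx_csum *hw,
				 uint32_t *csum_level, uint32_t *csum)
{
	uint32_t flags = 0;

	assert(hw && csum_level && csum);
	*csum_level = 0;
	*csum = 0;

	if (hw->l4_verified) {
		flags |= XDP_CSUM_F_UNNECESSARY;
		*csum_level = hw->csum_level;
	}
	if (hw->has_full_csum) {
		flags |= XDP_CSUM_F_COMPLETE;
		*csum = hw->full_csum;
	}
	return flags;
}

/* Usage helper: flags reported for a given combination of hw results. */
static uint32_t example_flags(int l4_verified, int has_full_csum)
{
	struct hw_rx_csum hw = { l4_verified, 1, has_full_csum, 0x1234 };
	uint32_t level, csum;

	return rx_checksum_both(&hw, &level, &csum);
}
```

With this shape the program, not the driver, decides which of the two
results it prefers when the hardware provides both.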

One more thing I'd like us to at least have a plan for at this stage
is how to deal with COMPLETE + modified packet + XDP_PASS.
Right now some drivers discard COMPLETE when XDP is attached since
they can't be sure if XDP modifies the packet. Other drivers don't
and we end up with bad csum splat. Do we have a recommendation on
the correct behavior? If not - should we have a kfunc to adjust /
discard csum complete explicitly?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-19 17:13       ` Jakub Kicinski
@ 2026-02-23 17:11         ` Lorenzo Bianconi
  2026-02-23 23:18           ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-23 17:11 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Donald Hunter, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski, Jakub Sitnicki, netdev, bpf, intel-wired-lan,
	linux-kselftest

> On Thu, 19 Feb 2026 12:04:56 +0100 Lorenzo Bianconi wrote:
> > > On Tue, 17 Feb 2026 09:33:56 +0100 Lorenzo Bianconi wrote:  
> > > > + * In case of success, ``ip_summed`` is set to the RX checksum result. Possible
> > > > + * values are:
> > > > + * ``XDP_CHECKSUM_NONE``
> > > > + * ``XDP_CHECKSUM_UNNECESSARY``
> > > > + * ``XDP_CHECKSUM_COMPLETE``
> > > > + *
> > > > + * In case of success, ``cksum_meta`` contains the hw computed checksum value
> > > > + * for ``XDP_CHECKSUM_COMPLETE`` or the ``csum_level`` for
> > > > + * ``XDP_CHECKSUM_UNNECESSARY``. It is set to 0 for ``XDP_CHECKSUM_NONE``  
> > > 
> > > It's fairly common for NICs to report both csum complete and
> > > unnecessary. Which one should the driver return in that case?  
> > 
> > Do you mean what the value of cksum_meta would be if we do not report
> > csum_level for the XDP_CHECKSUM_UNNECESSARY/CHECKSUM_UNNECESSARY use case
> > (as suggested by Stanislav)?
> 
> More fundamentally whether the API is right.
> 
> > My original idea is:
> > - if the hw reports CHECKSUM_COMPLETE:
> >   - ip_summed = XDP_CHECKSUM_COMPLETE
> >   - cksum_meta contains the checksum computed by the hw
> > - if the hw reports CHECKSUM_UNNECESSARY
> >   - ip_summed = XDP_CHECKSUM_UNNECESSARY
> >   - cksum_meta = csum_level <-- Stanislav suggests to drop this one
> > - if the hw reports CHECKSUM_NONE
> >   - ip_summed = XDP_CHECKSUM_NONE
> >   - cksum_meta = 0
> 
> Off the top of my head drivers prefer reporting UNNECESSARY when they
> have both, and reserve COMPLETE for cases where L4 could not be found
> or is incorrect. Why don't we report both? We're using 3 args, we still
> have 3 to go. We could turn ip_summed into a bitmap and have explicit
> output args for both level and csum complete value?

Ack, thx for the explanation. Just for the sake of understanding, is there
any NIC capable of reporting both csum_level and the csum value for the same
packet in the DMA descriptor? Or is this change needed to be future-proof?

> 
> One more thing I'd like us to at least have a plan for at this stage
> is how to deal with COMPLETE + modified packet + XDP_PASS.
> Right now some drivers discard COMPLETE when XDP is attached since
> they can't be sure if XDP modifies the packet. Other drivers don't
> and we end up with bad csum splat. Do we have a recommendation on
> the correct behavior? If not - should we have a kfunc to adjust /
> discard csum complete explicitly?

At the moment there is no way to store the csum value returned by
bpf_xdp_metadata_rx_checksum() so that it can be consumed during the
xdp_buff/xdp_frame to skb conversion (this info can only be consumed in the
eBPF program bound to the NIC), but I guess the issue you pointed out can be
solved in the verifier at program load time. What do you think?

Regards,
Lorenzo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-23 17:11         ` Lorenzo Bianconi
@ 2026-02-23 23:18           ` Jakub Kicinski
  2026-02-27 13:21             ` Lorenzo Bianconi
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2026-02-23 23:18 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Donald Hunter, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski, Jakub Sitnicki, netdev, bpf, intel-wired-lan,
	linux-kselftest

On Mon, 23 Feb 2026 18:11:54 +0100 Lorenzo Bianconi wrote:
> > Off the top of my head drivers prefer reporting UNNECESSARY when they
> > have both, and reserve COMPLETE for cases where L4 could not be found
> > or is incorrect. Why don't we report both? We're using 3 args, we still
> > have 3 to go. We could turn ip_summed into a bitmap and have explicit
> > output args for both level and csum complete value?  
> 
> > Ack, thx for the explanation. Just for the sake of understanding, is there
> > any NIC capable of reporting both csum_level and the csum value for the same
> > packet in the DMA descriptor? Or is this change needed to be future-proof?

Both nfp and fbnic definitely can. Off the top of my head - mlx5 also
can, but I haven't double checked.

> > One more thing I'd like us to at least have a plan for at this stage
> > is how to deal with COMPLETE + modified packet + XDP_PASS.
> > Right now some drivers discard COMPLETE when XDP is attached since
> > they can't be sure if XDP modifies the packet. Other drivers don't
> > and we end up with bad csum splat. Do we have a recommendation on
> > the correct behavior? If not - should we have a kfunc to adjust /
> > discard csum complete explicitly?  
> 
> At the moment there is no way to store the csum value we got running
> bpf_xdp_metadata_rx_checksum() in order to be consumed during
> xdp_buff/xdp_frame to skb conversion (this info can just be consumed in the
> ebpf program bound to the NIC) but

I think the scope here is much narrower than the xdp_buff to xdp_frame
to skb conversion. We are just passing information between the program and
the driver which owns the xdp_buff. Very similar to your new xmo.

We could either tell the driver to discard the csum complete or even
add a helper to "adjust" the csum value, similar to the helper
we have to adjust the csum in TC / skb context.
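
The "adjust" option can be illustrated in plain userspace C, assuming the
hardware value is an Internet-style ones' complement sum over 16-bit
words. This is only a sketch of the arithmetic (the incremental update
described in RFC 1624), not an existing kernel or BPF helper; the names
csum_fold16(), csum_full() and csum_replace16() are hypothetical, and the
sum is kept folded to 16 bits for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum over big-endian 16-bit words; len must be even. */
static uint16_t csum_full(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(len % 2 == 0);
	for (i = 0; i < len; i += 2)
		sum = csum_fold16(sum + (uint32_t)(data[i] << 8 | data[i + 1]));
	return (uint16_t)sum;
}

/* Adjust an existing sum after one 16-bit word changed from old_word to
 * new_word: add the complement of the old word, then add the new word.
 * Note that 0x0000 and 0xffff both represent zero in this arithmetic.
 */
static uint16_t csum_replace16(uint16_t sum, uint16_t old_word,
			       uint16_t new_word)
{
	uint32_t acc = sum;

	acc += (uint16_t)~old_word;
	acc += new_word;
	return csum_fold16(acc);
}

/* Usage: rewrite one word and check that the adjusted sum matches a full
 * recomputation over the modified buffer.
 */
static int csum_adjust_matches_recompute(void)
{
	uint8_t pkt[8] = { 0x45, 0x00, 0x00, 0x54, 0x12, 0x34, 0x40, 0x00 };
	uint16_t before = csum_full(pkt, sizeof(pkt));
	uint16_t old_word = (uint16_t)(pkt[4] << 8 | pkt[5]);
	uint16_t new_word = 0xabcd;

	pkt[4] = (uint8_t)(new_word >> 8);
	pkt[5] = (uint8_t)(new_word & 0xff);

	return csum_replace16(before, old_word, new_word) ==
	       csum_full(pkt, sizeof(pkt));
}
```

The point of the sketch is that the program only needs the old and new
contents of the bytes it touched, not the whole packet, to keep the
hardware sum usable.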

> I guess the issue you pointed out can be solved in the verifier
> during program load time. What do you think?

It could, but at the verifier level we'd probably have to be fairly
coarse-grained. Any write to the packet data would mean csum complete
cannot be trusted, that's not too hard. But also any tail call / fentry?
I'm not really up to date on the latest in program chaining in BPF but
I think a lot of real-life deployments would use either chaining or
fentry. So it may be a lot of complexity that still ends up with csum
complete always disabled w/ XDP in practice.

Up to you. I'm totally okay to just say** that drivers should never
report csum complete with XDP (until appropriate API is built).
Perhaps this will force those who care about XDP+csum_complete to
tell us what their requirements are?

[**] "just say" == document and add driver kselftest that validates it

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-23 23:18           ` Jakub Kicinski
@ 2026-02-27 13:21             ` Lorenzo Bianconi
  2026-02-27 23:32               ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-27 13:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Donald Hunter, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski, Jakub Sitnicki, netdev, bpf, intel-wired-lan,
	linux-kselftest

> On Mon, 23 Feb 2026 18:11:54 +0100 Lorenzo Bianconi wrote:
> > > Off the top of my head drivers prefer reporting UNNECESSARY when they
> > > have both, and reserve COMPLETE for cases where L4 could not be found
> > > or is incorrect. Why don't we report both? We're using 3 args, we still
> > > have 3 to go. We could turn ip_summed into a bitmap and have explicit
> > > output args for both level and csum complete value?  
> > 
> > > Ack, thx for the explanation. Just for the sake of understanding, is there
> > > any NIC capable of reporting both csum_level and the csum value for the same
> > > packet in the DMA descriptor? Or is this change needed to be future-proof?
> 
> Both nfp and fbnic definitely can. Off the top of my head - mlx5 also
> can, but I haven't double checked.

Ack, thx for pointing this out, I was not aware of it. I will modify the API
so that it can report both the checksum value and csum_level for a given
packet.

> 
> > > One more thing I'd like us to at least have a plan for at this stage
> > > is how to deal with COMPLETE + modified packet + XDP_PASS.
> > > Right now some drivers discard COMPLETE when XDP is attached since
> > > they can't be sure if XDP modifies the packet. Other drivers don't
> > > and we end up with bad csum splat. Do we have a recommendation on
> > > the correct behavior? If not - should we have a kfunc to adjust /
> > > discard csum complete explicitly?  
> > 
> > At the moment there is no way to store the csum value we got running
> > bpf_xdp_metadata_rx_checksum() in order to be consumed during
> > xdp_buff/xdp_frame to skb conversion (this info can just be consumed in the
> > ebpf program bound to the NIC) but
> 
> I think the scope here is much narrower than the xdp_buff to xdp_frame
> to skb conversion. We are just passing information between the program and
> the driver which owns the xdp_buff. Very similar to your new xmo.
> 
> We could either tell the driver to discard the csum complete or even
> add a helper to "adjust" the csum value, similar to the helper
> we have to adjust the csum in TC / skb context.

IIUC, for the CHECKSUM_COMPLETE case, we want to add a kfunc to update (or
invalidate) the checksum value (if the packet has been modified by the eBPF
program bound to the NIC) and report the updated checksum to the driver if
the XDP verdict is XDP_PASS. Correct?

I guess we could have two approaches here:
- Write the new checksum value into the xdp_metadata area (if available)
  where the driver can load it and update the checksum value before
  allocating the skb.
  The main downside of this approach is that we need to modify each driver.
- Add a new xmo callback used to set the checksum value and report it
  from the eBPF program into a specific memory area provided by the driver
  (e.g. DMA descriptor) that is used to build the skb.
 
What do you think?

Moreover, since we already have this issue upstream, do you think this new
feature must be part of this series, or can we do it in a follow-up
patch/series?

Regards,
Lorenzo

> 
> > I guess the issue you pointed out can be solved in the verifier
> > during program load time. What do you think?
> 
> It could, but at the verifier level we'd probably have to be fairly
> coarse-grained. Any write to the packet data would mean csum complete
> cannot be trusted, that's not too hard. But also any tail call / fentry?
> I'm not really up to date on the latest in program chaining in BPF but
> I think a lot of real-life deployments would use either chaining or
> fentry. So it may be a lot of complexity just to end up with csum
> complete always disabled w/ XDP, in practice.
> 
> Up to you. I'm totally okay to just say** that drivers should never
> report csum complete with XDP (until appropriate API is built).
> Perhaps this will force those who care about XDP+csum_complete to
> tell us what their requirements are?
> 
> [**] "just say" == document and add driver kselftest that validates it


* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-27 13:21             ` Lorenzo Bianconi
@ 2026-02-27 23:32               ` Jakub Kicinski
  2026-02-28 11:58                 ` Lorenzo Bianconi
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2026-02-27 23:32 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Donald Hunter, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski, Jakub Sitnicki, netdev, bpf, intel-wired-lan,
	linux-kselftest

On Fri, 27 Feb 2026 14:21:44 +0100 Lorenzo Bianconi wrote:
> > > At the moment there is no way to store the csum value we got running
> > > bpf_xdp_metadata_rx_checksum() in order to be consumed during
> > > xdp_buff/xdp_frame to skb conversion (this info can just be consumed in the
> > > ebpf program bound to the NIC) but  
> > 
> > I think the scope here is much narrower than the xdp_buff to xdp_frame
> > to skb conversion. We are just passing information between the program and
> > driver which owns xdp_buff. Very similar to your new xmo.
> > 
> > We could either tell the driver to discard the csum complete or even
> > add a helper to "adjust" the csum value. Similar to the helper
> > we have to adjust the csum in TC / skb context.  
> 
> IIUC, for the CSUM_COMPLETE case, we want to add a kfunc used to update (or
> invalidate) the checksum value (if the packet has been modified by the eBPF
> program bound to the NIC) and report the updated checksum to the driver if
> the XDP verdict is XDP_PASS. Correct?
> 
> I guess we could have two approaches here:
> - Write the new checksum value into the xdp_metadata area (if available)
>   where the driver can load it and update the checksum value before
>   allocating the skb.
>   The main downside of this approach is that we need to modify each driver.
> - Add a new xmo callback used to set the checksum value and report it
>   from the eBPF program into a specific memory area provided by the driver
>   (e.g. DMA descriptor) that is used to build the skb.
>  
> What do you think?

Exactly. The invalidation is easier 'cause using a single bit in the
flags should be uncontroversial. If we want to be able to repair /
provide the csum complete then we have to pick one of the two options
you outlined. As you may suspect from previous discussions I favor 
the latter. But we'd probably have to have a PoC with either one and
see where the consensus falls.
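
A minimal sketch of the single-bit invalidation idea, assuming the driver
keeps a per-packet context with a flags word next to the checksum state the
hardware reported. Every demo_* / DEMO_* name is hypothetical and is neither
part of this series nor an existing kernel API:

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel's CHECKSUM_* states. */
enum demo_csum_state {
	DEMO_CSUM_NONE,
	DEMO_CSUM_UNNECESSARY,
	DEMO_CSUM_COMPLETE,
};

/* Hypothetical flag: the program says the HW checksum result is stale. */
#define DEMO_XDP_F_CSUM_INVALID (1u << 0)

/* Hypothetical per-packet state owned by the driver. */
struct demo_rx_ctx {
	uint32_t flags;
	enum demo_csum_state hw_state;
};

/* What an invalidation kfunc could do on behalf of the program. */
static void demo_xdp_csum_invalidate(struct demo_rx_ctx *ctx)
{
	ctx->flags |= DEMO_XDP_F_CSUM_INVALID;
}

/*
 * What the driver could do on XDP_PASS, before building the skb:
 * fall back to "no checksum info" if the program invalidated it.
 */
static enum demo_csum_state demo_csum_for_skb(const struct demo_rx_ctx *ctx)
{
	if (ctx->flags & DEMO_XDP_F_CSUM_INVALID)
		return DEMO_CSUM_NONE;
	return ctx->hw_state;
}
```

With this shape the stack simply verifies the checksum in software for any
packet the program marked, which is why only a single bit is needed.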

Actually, thinking about it more, I guess this is not just a
CSUM_COMPLETE issue. XDP_PASS will also risk reporting invalid
UNNECESSARY to the stack (e.g. when XDP stripped a UDP tunnel
for which the NIC computed the UNNECESSARY but the packet inside
the tunnel has an invalid csum).

> Moreover, since we already have this issue upstream, do you think
> this new feature must be part of this series or can we do it with a
> follow-up patch/series?

We don't have to add the kfunc to adjust / invalidate the csum.
But we should document how the drivers are expected to behave until
such kfunc exists and we should add a selftest that checks the
documented expectation.


* Re: [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs
  2026-02-27 23:32               ` Jakub Kicinski
@ 2026-02-28 11:58                 ` Lorenzo Bianconi
  0 siblings, 0 replies; 17+ messages in thread
From: Lorenzo Bianconi @ 2026-02-28 11:58 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Donald Hunter, David S. Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Alexander Lobakin,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, Shuah Khan,
	Maciej Fijalkowski, Jakub Sitnicki, netdev, bpf, intel-wired-lan,
	linux-kselftest


> On Fri, 27 Feb 2026 14:21:44 +0100 Lorenzo Bianconi wrote:
> > > > At the moment there is no way to store the csum value we got running
> > > > bpf_xdp_metadata_rx_checksum() in order to be consumed during
> > > > xdp_buff/xdp_frame to skb conversion (this info can just be consumed in the
> > > > ebpf program bound to the NIC) but  
> > > 
> > > I think the scope here is much narrower than the xdp_buff to xdp_frame
> > > to skb conversion. We are just passing information between the program and
> > > driver which owns xdp_buff. Very similar to your new xmo.
> > > 
> > > We could either tell the driver to discard the csum complete or even
> > > add a helper to "adjust" the csum value. Similar to the helper
> > > we have to adjust the csum in TC / skb context.  
> > 
> > IIUC, for the CSUM_COMPLETE case, we want to add a kfunc used to update (or
> > invalidate) the checksum value (if the packet has been modified by the eBPF
> > program bound to the NIC) and report the updated checksum to the driver if
> > the XDP verdict is XDP_PASS. Correct?
> > 
> > I guess we could have two approaches here:
> > - Write the new checksum value into the xdp_metadata area (if available)
> >   where the driver can load it and update the checksum value before
> >   allocating the skb.
> >   The main downside of this approach is that we need to modify each driver.
> > - Add a new xmo callback used to set the checksum value and report it
> >   from the eBPF program into a specific memory area provided by the driver
> >   (e.g. DMA descriptor) that is used to build the skb.
> >  
> > What do you think?
> 
> Exactly. The invalidation is easier 'cause using a single bit in the
> flags should be uncontroversial. If we want to be able to repair /
> provide the csum complete then we have to pick one of the two options
> you outlined. As you may suspect from previous discussions I favor 
> the latter. But we'd probably have to have a PoC with either one and
> see where the consensus falls.

ack, I will work on a PoC.

> 
> Actually, thinking about it more, I guess this is not just a
> CSUM_COMPLETE issue. XDP_PASS will also risk reporting invalid
> UNNECESSARY to the stack (e.g. when XDP stripped a UDP tunnel
> for which the NIC computed the UNNECESSARY but the packet inside
> the tunnel has an invalid csum).
> 
> > Moreover, since we already have this issue upstream, do you think
> > this new feature must be part of this series or can we do it with a
> > follow-up patch/series?
> 
> We don't have to add the kfunc to adjust / invalidate the csum.
> But we should document how the drivers are expected to behave until
> such kfunc exists and we should add a selftest that checks the
> documented expectation.

I will add the required documentation and kselftest in the next iteration.

Regards,
Lorenzo



end of thread, other threads:[~2026-02-28 11:58 UTC | newest]

Thread overview: 17+ messages
2026-02-17  8:33 [PATCH bpf-next v3 0/5] Add the the capability to load HW RX checsum in eBPF programs Lorenzo Bianconi
2026-02-17  8:33 ` [PATCH bpf-next v3 1/5] netlink: specs: Add XDP RX checksum capability to XDP metadata specs Lorenzo Bianconi
2026-02-18  1:01   ` Stanislav Fomichev
2026-02-18 10:58     ` Jesper Dangaard Brouer
2026-02-19  1:47   ` Jakub Kicinski
2026-02-19 11:04     ` Lorenzo Bianconi
2026-02-19 17:13       ` Jakub Kicinski
2026-02-23 17:11         ` Lorenzo Bianconi
2026-02-23 23:18           ` Jakub Kicinski
2026-02-27 13:21             ` Lorenzo Bianconi
2026-02-27 23:32               ` Jakub Kicinski
2026-02-28 11:58                 ` Lorenzo Bianconi
2026-02-17  8:33 ` [PATCH bpf-next v3 2/5] net: veth: Add xmo_rx_checksum callback to veth driver Lorenzo Bianconi
2026-02-17  8:33 ` [PATCH bpf-next v3 3/5] net: ice: Add xmo_rx_checksum callback Lorenzo Bianconi
2026-02-17  8:33 ` [PATCH bpf-next v3 4/5] selftests/bpf: Add selftest support for bpf_xdp_metadata_rx_checksum Lorenzo Bianconi
2026-02-17  9:17   ` bot+bpf-ci
2026-02-17  8:34 ` [PATCH bpf-next v3 5/5] selftests/bpf: Add bpf_xdp_metadata_rx_checksum support to xdp_hw_metadat prog Lorenzo Bianconi
