Netdev List
* [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect
@ 2026-06-30 19:15 Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 1/7] xdp: let XDP programs assert the RX checksum " Vladimir Vdovin
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Vladimir Vdovin @ 2026-06-30 19:15 UTC
  To: bpf, netdev
  Cc: ast, daniel, andrii, martin.lau, sdf, hawk, john.fastabend, kuba,
	Vladimir Vdovin

This series lets XDP programs work with the hardware RX checksum verdict:
read what the NIC concluded about a packet, and carry a "the L4 checksum
is correct" assertion across a redirect so the stack does not revalidate
it in software.

When an XDP program redirects a frame to a cpumap (or any other path that
rebuilds an skb from an xdp_frame via __xdp_build_skb_from_frame()), the
HW RX checksum status is lost and the stack revalidates the L4 checksum in
software.

Two kfuncs are added:

 - bpf_xdp_metadata_rx_csum(): a device-bound RX-metadata hint, like the
   existing rx_hash / rx_vlan_tag ones.  It reports enum xdp_csum_status
   (XDP_CSUM_NONE / XDP_CSUM_VERIFIED) and is implemented for mlx5e, ice
   and veth.

 - bpf_xdp_assert_rx_csum(): a generic, non-device-bound kfunc that lets
   the program assert the L4 checksum is correct.  It sets
   XDP_FLAGS_RX_CSUM_UNNECESSARY on the xdp_buff; the flag rides into the
   xdp_frame, and __xdp_build_skb_from_frame() turns it into
   skb->ip_summed = CHECKSUM_UNNECESSARY.  The kernel cannot verify the
   assertion; the program takes responsibility, as it already does when
   rewriting packet contents.
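
The flag path described above can be modelled in user space.  The sketch
below is not the kernel code: every model_* struct and function is a
hypothetical stand-in for xdp_buff, xdp_frame, sk_buff and the helpers
that move data between them.  Only the flag bit (BIT(3)) and the
CHECKSUM_* values are taken from this series and its selftest.

```c
#include <stdint.h>

/* Values as used in this series; all model_* names are hypothetical. */
#define XDP_FLAGS_RX_CSUM_UNNECESSARY	(1u << 3)
#define CHECKSUM_NONE			0
#define CHECKSUM_UNNECESSARY		1

struct model_buff  { uint32_t flags; };		/* stands in for xdp_buff */
struct model_frame { uint32_t flags; };		/* stands in for xdp_frame */
struct model_skb   { uint8_t ip_summed; };	/* stands in for sk_buff */

/* Models bpf_xdp_assert_rx_csum(): set the flag on the buff, return 0. */
static inline int model_assert_rx_csum(struct model_buff *xdp)
{
	xdp->flags |= XDP_FLAGS_RX_CSUM_UNNECESSARY;
	return 0;
}

/* Models the buff -> frame conversion on redirect: flags are carried over. */
static inline struct model_frame model_buff_to_frame(const struct model_buff *xdp)
{
	struct model_frame frame = { .flags = xdp->flags };

	return frame;
}

/* Models the skb rebuild: the flag becomes CHECKSUM_UNNECESSARY. */
static inline struct model_skb model_build_skb(const struct model_frame *frame)
{
	struct model_skb skb = { .ip_summed = CHECKSUM_NONE };

	if (frame->flags & XDP_FLAGS_RX_CSUM_UNNECESSARY)
		skb.ip_summed = CHECKSUM_UNNECESSARY;

	return skb;
}
```

Without the assertion the rebuilt skb stays CHECKSUM_NONE and the stack
validates it as before, so programs that never call the kfunc see no
change in behaviour.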

Posted as RFC to get feedback on:

 - whether the read hint (bpf_xdp_metadata_rx_csum() and its driver
   support) belongs in this series at all.  bpf_xdp_assert_rx_csum() is
   self-contained and already covers the main use case: a program that
   computes or fixes the L4 checksum itself, or trusts the source, and
   wants the rebuilt skb to skip software revalidation.  The read hint is
   an optimization for programs that did not touch the payload and only
   want to relay the hardware verdict.  These could just as well be two
   independent series (assert-only first);
 - the kfunc naming, bpf_xdp_assert_rx_csum() in particular.
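
To make the two use cases in the first question concrete, here is a
minimal sketch of the decision a program would make before calling
bpf_xdp_assert_rx_csum().  The helper name should_assert_rx_csum() and its
program_fixed_csum parameter are hypothetical; the enum mirrors the one
added in patch 3/7, and hint_ret stands for the return value of
bpf_xdp_metadata_rx_csum().

```c
#include <errno.h>
#include <stdbool.h>

/* Mirrors enum xdp_csum_status from patch 3/7. */
enum xdp_csum_status {
	XDP_CSUM_NONE = 0,	/* HW did not validate the checksum */
	XDP_CSUM_VERIFIED,	/* HW validated the L4 checksum */
};

/* Hypothetical policy helper: should the program assert the checksum? */
static inline bool should_assert_rx_csum(int hint_ret,
					 enum xdp_csum_status status,
					 bool program_fixed_csum)
{
	/* Assert-only case: the program computed or fixed the checksum. */
	if (program_fixed_csum)
		return true;

	/* Read hint unavailable (-EOPNOTSUPP, -ENODATA): nothing to relay. */
	if (hint_ret != 0)
		return false;

	/* Relay case: payload untouched, pass on the hardware verdict. */
	return status == XDP_CSUM_VERIFIED;
}
```

Only the last branch needs the read hint and its driver support, which is
why the assert-only part could stand as a series of its own.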

Testing:

 - new selftest xdp_cpumap_rx_csum drives a frame through a native-XDP
   veth into a cpumap redirect and checks, via fexit on
   __xdp_build_skb_from_frame(), that the rebuilt skb is
   CHECKSUM_UNNECESSARY iff the program called bpf_xdp_assert_rx_csum();
 - xdp_metadata calls bpf_xdp_metadata_rx_csum() over veth and checks both
   verdicts: XDP_CSUM_NONE for an AF_XDP-injected frame and
   XDP_CSUM_VERIFIED for one sent through the stack.
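
The two veth verdicts follow from how patch 6/7 maps skb->ip_summed to the
hint.  A user-space sketch of that mapping is below; model_veth_rx_csum()
is a hypothetical mirror of veth_xdp_rx_csum(), and a NULL ip_summed
pointer stands in for "the frame has no skb".  The CHECKSUM_* values are
the kernel's skb->ip_summed encodings.

```c
#include <errno.h>
#include <stddef.h>

/* Kernel skb->ip_summed values, not exported to userspace headers. */
#define CHECKSUM_NONE		0
#define CHECKSUM_UNNECESSARY	1
#define CHECKSUM_COMPLETE	2
#define CHECKSUM_PARTIAL	3

/* Mirrors enum xdp_csum_status from patch 3/7. */
enum xdp_csum_status {
	XDP_CSUM_NONE = 0,
	XDP_CSUM_VERIFIED,
};

/* Hypothetical mirror of veth_xdp_rx_csum(); NULL means "no skb". */
static inline int model_veth_rx_csum(const int *ip_summed,
				     enum xdp_csum_status *csum_status)
{
	if (!ip_summed)
		return -ENODATA;

	/* UNNECESSARY and PARTIAL both count as a usable verdict. */
	if (*ip_summed == CHECKSUM_UNNECESSARY ||
	    *ip_summed == CHECKSUM_PARTIAL)
		*csum_status = XDP_CSUM_VERIFIED;
	else
		*csum_status = XDP_CSUM_NONE;

	return 0;
}
```

In the selftest terms, the AF_XDP-injected frame corresponds to the
CHECKSUM_NONE input and the frame sent through the stack to the
CHECKSUM_PARTIAL input.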

Vladimir Vdovin (7):
  xdp: let XDP programs assert the RX checksum over redirect
  selftests/bpf: add test for bpf_xdp_assert_rx_csum over cpumap
  xdp: add bpf_xdp_metadata_rx_csum() RX metadata kfunc
  net/mlx5e: support the rx_csum XDP metadata hint
  ice: support the rx_csum XDP metadata hint
  veth: support the rx_csum XDP metadata hint
  selftests/bpf: cover bpf_xdp_metadata_rx_csum in xdp_metadata

 Documentation/netlink/specs/netdev.yaml       |   5 +
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c |  32 ++++
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  23 +++
 drivers/net/veth.c                            |  23 +++
 include/net/xdp.h                             |  23 +++
 include/uapi/linux/netdev.h                   |   3 +
 net/core/xdp.c                                |  73 ++++++++-
 tools/include/uapi/linux/netdev.h             |   3 +
 .../bpf/prog_tests/xdp_cpumap_rx_csum.c       | 150 ++++++++++++++++++
 .../selftests/bpf/prog_tests/xdp_metadata.c   |  10 ++
 .../selftests/bpf/progs/bpf_tracing_net.h     |   1 +
 .../bpf/progs/test_xdp_cpumap_rx_csum.c       |  51 ++++++
 .../selftests/bpf/progs/xdp_metadata.c        |   9 ++
 tools/testing/selftests/bpf/xdp_metadata.h    |   8 +
 14 files changed, 412 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_cpumap_rx_csum.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_cpumap_rx_csum.c


base-commit: f456c1922c49e6be5ce407ddb74a6e61af5b65cf
-- 
2.47.0



* [RFC PATCH bpf-next v1 1/7] xdp: let XDP programs assert the RX checksum over redirect
  2026-06-30 19:15 [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Vladimir Vdovin
@ 2026-06-30 19:15 ` Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 2/7] selftests/bpf: add test for bpf_xdp_assert_rx_csum over cpumap Vladimir Vdovin
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Vladimir Vdovin @ 2026-06-30 19:15 UTC
  To: bpf, netdev
  Cc: ast, daniel, andrii, martin.lau, sdf, hawk, john.fastabend, kuba,
	Vladimir Vdovin

When an XDP program redirects a frame to a cpumap (or any other path
that rebuilds an skb from an xdp_frame via __xdp_build_skb_from_frame()),
the HW RX checksum status is lost and the stack revalidates the L4
checksum in software.

Add a non-dev-bound kfunc, bpf_xdp_assert_rx_csum(), that lets the program
assert the L4 checksum is correct.  It sets XDP_FLAGS_RX_CSUM_UNNECESSARY
on the buffer; the flag rides into the xdp_frame and
__xdp_build_skb_from_frame() turns it into skb->ip_summed =
CHECKSUM_UNNECESSARY.  The kernel cannot verify the assertion; the program
takes responsibility, the same way it is already trusted to rewrite
arbitrary packet contents.

Signed-off-by: Vladimir Vdovin <deliran@verdict.gg>
---
 include/net/xdp.h | 11 +++++++++++
 net/core/xdp.c    | 50 +++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index aa742f413c35..5a1e2cc9c312 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -81,6 +81,11 @@ enum xdp_buff_flags {
 	 * XDP program is not attached.
 	 */
 	XDP_FLAGS_FRAGS_UNREADABLE	= BIT(2),
+	/* XDP program asserts the L4 checksum is correct, so the skb built
+	 * out of this frame (e.g. on the cpumap redirect path) can be marked
+	 * CHECKSUM_UNNECESSARY instead of being validated in software.
+	 */
+	XDP_FLAGS_RX_CSUM_UNNECESSARY	= BIT(3),
 };
 
 struct xdp_buff {
@@ -316,6 +321,12 @@ xdp_frame_get_skb_flags(const struct xdp_frame *frame)
 	return frame->flags;
 }
 
+static __always_inline bool
+xdp_frame_rx_csum_unnecessary(const struct xdp_frame *frame)
+{
+	return !!(frame->flags & XDP_FLAGS_RX_CSUM_UNNECESSARY);
+}
+
 #define XDP_BULK_QUEUE_SIZE	16
 struct xdp_frame_bulk {
 	int count;
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 9890a30584ba..63ee36ec93de 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -830,8 +830,11 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 	/* Essential SKB info: protocol and skb->dev */
 	skb->protocol = eth_type_trans(skb, dev);
 
+	/* HW checksum info, if the XDP program asserted it */
+	if (xdp_frame_rx_csum_unnecessary(xdpf))
+		skb->ip_summed = CHECKSUM_UNNECESSARY;
+
 	/* Optional SKB info, currently missing:
-	 * - HW checksum info		(skb->ip_summed)
 	 * - HW RX hash			(skb_set_hash)
 	 * - RX ring dev queue index	(skb_record_rx_queue)
 	 */
@@ -961,6 +964,31 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
 	return -EOPNOTSUPP;
 }
 
+/**
+ * bpf_xdp_assert_rx_csum - Assert the packet's L4 checksum is correct.
+ * @ctx: XDP context pointer.
+ *
+ * Mark the frame so that an skb later built out of it (e.g. on the cpumap
+ * redirect path, see __xdp_build_skb_from_frame()) is set to
+ * CHECKSUM_UNNECESSARY instead of being validated in software when it enters
+ * the stack.
+ *
+ * This is an assertion made by the XDP program: the kernel cannot verify it.
+ * The program takes responsibility for the checksum being correct, the same
+ * way it is already trusted to rewrite arbitrary packet contents. If the
+ * program modifies L4 data after calling this kfunc, the assertion may no
+ * longer hold.
+ *
+ * Return: 0.
+ */
+__bpf_kfunc int bpf_xdp_assert_rx_csum(struct xdp_md *ctx)
+{
+	struct xdp_buff *xdp = (struct xdp_buff *)ctx;
+
+	xdp->flags |= XDP_FLAGS_RX_CSUM_UNNECESSARY;
+	return 0;
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(xdp_metadata_kfunc_ids)
@@ -974,6 +1002,18 @@ static const struct btf_kfunc_id_set xdp_metadata_kfunc_set = {
 	.set   = &xdp_metadata_kfunc_ids,
 };
 
+/* Generic XDP kfuncs that need no driver support and are therefore not
+ * dev-bound (unlike the rx-metadata kfuncs above).
+ */
+BTF_KFUNCS_START(xdp_kfunc_ids)
+BTF_ID_FLAGS(func, bpf_xdp_assert_rx_csum)
+BTF_KFUNCS_END(xdp_kfunc_ids)
+
+static const struct btf_kfunc_id_set xdp_kfunc_set = {
+	.owner = THIS_MODULE,
+	.set   = &xdp_kfunc_ids,
+};
+
 BTF_ID_LIST(xdp_metadata_kfunc_ids_unsorted)
 #define XDP_METADATA_KFUNC(name, _, str, __) BTF_ID(func, str)
 XDP_METADATA_KFUNC_xxx
@@ -992,7 +1032,13 @@ bool bpf_dev_bound_kfunc_id(u32 btf_id)
 
 static int __init xdp_metadata_init(void)
 {
-	return register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &xdp_metadata_kfunc_set);
+	int ret;
+
+	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &xdp_metadata_kfunc_set);
+	if (ret)
+		return ret;
+
+	return register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &xdp_kfunc_set);
 }
 late_initcall(xdp_metadata_init);
 
-- 
2.47.0



* [RFC PATCH bpf-next v1 2/7] selftests/bpf: add test for bpf_xdp_assert_rx_csum over cpumap
  2026-06-30 19:15 [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 1/7] xdp: let XDP programs assert the RX checksum " Vladimir Vdovin
@ 2026-06-30 19:15 ` Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 3/7] xdp: add bpf_xdp_metadata_rx_csum() RX metadata kfunc Vladimir Vdovin
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Vladimir Vdovin @ 2026-06-30 19:15 UTC
  To: bpf, netdev
  Cc: ast, daniel, andrii, martin.lau, sdf, hawk, john.fastabend, kuba,
	Vladimir Vdovin

Drive a frame through a native-XDP veth into a cpumap redirect and
observe, via fexit on __xdp_build_skb_from_frame(), that the rebuilt skb
is CHECKSUM_UNNECESSARY when the program called bpf_xdp_assert_rx_csum()
and CHECKSUM_NONE otherwise.  fexit is used because cpumap GRO would
otherwise normalize ip_summed before any later hook can observe it.

Signed-off-by: Vladimir Vdovin <deliran@verdict.gg>
---
 .../bpf/prog_tests/xdp_cpumap_rx_csum.c       | 150 ++++++++++++++++++
 .../selftests/bpf/progs/bpf_tracing_net.h     |   1 +
 .../bpf/progs/test_xdp_cpumap_rx_csum.c       |  51 ++++++
 3 files changed, 202 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_cpumap_rx_csum.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_cpumap_rx_csum.c

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_rx_csum.c b/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_rx_csum.c
new file mode 100644
index 000000000000..2def92fe1111
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_cpumap_rx_csum.c
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <net/if.h>
+#include <linux/if_ether.h>
+#include <linux/if_link.h>
+#include <linux/if_packet.h>
+#include <linux/ipv6.h>
+#include <netinet/in.h>
+#include <netinet/udp.h>
+#include <sys/socket.h>
+
+#include "test_progs.h"
+#include "network_helpers.h"
+#include <bpf/bpf_endian.h>
+#include "test_xdp_cpumap_rx_csum.skel.h"
+
+#define TEST_NS		"xdp_cm_csum_ns"
+#define UDP_TEST_PORT	7777
+
+/* Kernel skb->ip_summed values, not exported to userspace headers. */
+#define CHECKSUM_NONE		0
+#define CHECKSUM_UNNECESSARY	1
+
+struct udp_pkt {
+	struct ethhdr eth;
+	struct ipv6hdr iph;
+	struct udphdr udp;
+	__u8 payload[16];
+} __packed;
+
+static struct udp_pkt pkt = {
+	.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
+	.eth.h_dest = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+	.eth.h_source = {0x66, 0x77, 0x88, 0x99, 0xaa, 0xbb},
+	.iph.version = 6,
+	.iph.nexthdr = IPPROTO_UDP,
+	.iph.payload_len = __bpf_constant_htons(sizeof(struct udphdr) + 16),
+	.iph.hop_limit = 64,
+	.udp.source = __bpf_constant_htons(1),
+	.udp.dest = __bpf_constant_htons(UDP_TEST_PORT),
+	.udp.len = __bpf_constant_htons(sizeof(struct udphdr) + 16),
+};
+
+/* Inject one frame on veth0; it is received on veth1 where native XDP
+ * redirects it into the cpumap. Report the ip_summed the rebuilt skb carried.
+ */
+static int inject_and_observe(struct test_xdp_cpumap_rx_csum *skel, int sfd,
+			      int ifindex_src, bool assert_csum, int *ip_summed)
+{
+	struct sockaddr_ll sll = {
+		.sll_family = AF_PACKET,
+		.sll_ifindex = ifindex_src,
+		.sll_halen = 0,
+	};
+	int i, n;
+
+	skel->bss->assert_csum = assert_csum;
+	skel->bss->seen = false;
+	skel->data->observed_ip_summed = -1;
+
+	n = sendto(sfd, &pkt, sizeof(pkt), 0, (void *)&sll, sizeof(sll));
+	if (!ASSERT_EQ(n, sizeof(pkt), "sendto"))
+		return -1;
+
+	/* The skb is built asynchronously by the cpumap kthread. */
+	for (i = 0; i < 20 && !skel->bss->seen; i++)
+		usleep(50000);
+
+	if (!ASSERT_TRUE(skel->bss->seen, "skb built from frame"))
+		return -1;
+
+	*ip_summed = skel->data->observed_ip_summed;
+	return 0;
+}
+
+void test_xdp_cpumap_rx_csum(void)
+{
+	struct test_xdp_cpumap_rx_csum *skel = NULL;
+	struct bpf_cpumap_val val = { .qsize = 192 };
+	struct bpf_link *fexit_link = NULL;
+	struct nstoken *nstoken = NULL;
+	int err, map_fd, ifindex_dst = 0, ifindex_src, sfd = -1, ip_summed;
+	bool xdp_attached = false;
+	__u32 idx = 0;
+
+	SYS(out, "ip netns add %s", TEST_NS);
+	nstoken = open_netns(TEST_NS);
+	if (!ASSERT_OK_PTR(nstoken, "open_netns"))
+		goto out;
+
+	/* veth pair: a frame TX'd on veth0 is RX'd on veth1. */
+	SYS(out, "ip link add veth0 type veth peer name veth1");
+	SYS(out, "ip link set veth0 up");
+	SYS(out, "ip link set veth1 up");
+
+	skel = test_xdp_cpumap_rx_csum__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel open_and_load"))
+		goto out;
+
+	/* cpumap entry without a program: a plain redirect that forces the
+	 * frame->skb conversion in __xdp_build_skb_from_frame().
+	 */
+	map_fd = bpf_map__fd(skel->maps.cpu_map);
+	err = bpf_map_update_elem(map_fd, &idx, &val, 0);
+	if (!ASSERT_OK(err, "cpumap update"))
+		goto out;
+
+	ifindex_dst = if_nametoindex("veth1");
+	ifindex_src = if_nametoindex("veth0");
+	if (!ASSERT_GT(ifindex_dst, 0, "veth1 ifindex") ||
+	    !ASSERT_GT(ifindex_src, 0, "veth0 ifindex"))
+		goto out;
+
+	/* Native XDP so the redirect goes through xdp_convert_buff_to_frame(),
+	 * which propagates the rx-csum flag into the frame. Generic mode would
+	 * redirect a ready-made skb and never hit our code path.
+	 */
+	err = bpf_xdp_attach(ifindex_dst, bpf_program__fd(skel->progs.xdp_redir),
+			     XDP_FLAGS_DRV_MODE, NULL);
+	if (!ASSERT_OK(err, "attach native xdp"))
+		goto out;
+	xdp_attached = true;
+
+	fexit_link = bpf_program__attach(skel->progs.on_build);
+	if (!ASSERT_OK_PTR(fexit_link, "attach fexit"))
+		goto out;
+
+	sfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
+	if (!ASSERT_GE(sfd, 0, "AF_PACKET socket"))
+		goto out;
+
+	/* Program asserts the checksum -> CHECKSUM_UNNECESSARY. */
+	if (!inject_and_observe(skel, sfd, ifindex_src, true, &ip_summed))
+		ASSERT_EQ(ip_summed, CHECKSUM_UNNECESSARY,
+			  "ip_summed marked unnecessary");
+
+	/* No assertion -> skb is left CHECKSUM_NONE for the stack to validate. */
+	if (!inject_and_observe(skel, sfd, ifindex_src, false, &ip_summed))
+		ASSERT_EQ(ip_summed, CHECKSUM_NONE, "ip_summed left none");
+
+out:
+	if (sfd >= 0)
+		close(sfd);
+	bpf_link__destroy(fexit_link);
+	if (xdp_attached)
+		bpf_xdp_detach(ifindex_dst, XDP_FLAGS_DRV_MODE, NULL);
+	test_xdp_cpumap_rx_csum__destroy(skel);
+	if (nstoken)
+		close_netns(nstoken);
+	SYS_NOFAIL("ip netns del %s", TEST_NS);
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
index d8dacef37c16..c3a0b2696035 100644
--- a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
+++ b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
@@ -87,6 +87,7 @@
 #define TCPOLEN_SACK_PERM	2
 
 #define CHECKSUM_NONE		0
+#define CHECKSUM_UNNECESSARY	1
 #define CHECKSUM_PARTIAL	3
 
 #define IFNAMSIZ		16
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_cpumap_rx_csum.c b/tools/testing/selftests/bpf/progs/test_xdp_cpumap_rx_csum.c
new file mode 100644
index 000000000000..86c691887d25
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_cpumap_rx_csum.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "vmlinux.h"
+#include "bpf_tracing_net.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_endian.h>
+
+extern int bpf_xdp_assert_rx_csum(struct xdp_md *ctx) __ksym;
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CPUMAP);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(struct bpf_cpumap_val));
+	__uint(max_entries, 1);
+} cpu_map SEC(".maps");
+
+/* Set from userspace before injecting each packet. */
+bool assert_csum = false;
+
+/* Filled in by the fexit program when the cpumap skb is built. */
+bool seen = false;
+int observed_ip_summed = -1;
+
+SEC("xdp")
+int xdp_redir(struct xdp_md *ctx)
+{
+	/* Assert the L4 checksum so the skb built on the cpumap redirect
+	 * path is marked CHECKSUM_UNNECESSARY instead of validated in software.
+	 */
+	if (assert_csum)
+		bpf_xdp_assert_rx_csum(ctx);
+
+	return bpf_redirect_map(&cpu_map, 0, 0);
+}
+
+/* Observe ip_summed exactly as __xdp_build_skb_from_frame() leaves it, before
+ * GRO in the cpumap kthread can normalize it. tc-ingress would be too late:
+ * GRO software-validates a CHECKSUM_NONE skb and marks it UNNECESSARY anyway.
+ */
+SEC("fexit/__xdp_build_skb_from_frame")
+int BPF_PROG(on_build, struct xdp_frame *xdpf, struct sk_buff *skb,
+	     struct net_device *dev, struct sk_buff *ret)
+{
+	if (ret && ret->protocol == bpf_htons(ETH_P_IPV6)) {
+		observed_ip_summed = ret->ip_summed;
+		seen = true;
+	}
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.47.0



* [RFC PATCH bpf-next v1 3/7] xdp: add bpf_xdp_metadata_rx_csum() RX metadata kfunc
  2026-06-30 19:15 [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 1/7] xdp: let XDP programs assert the RX checksum " Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 2/7] selftests/bpf: add test for bpf_xdp_assert_rx_csum over cpumap Vladimir Vdovin
@ 2026-06-30 19:15 ` Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 4/7] net/mlx5e: support the rx_csum XDP metadata hint Vladimir Vdovin
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Vladimir Vdovin @ 2026-06-30 19:15 UTC
  To: bpf, netdev
  Cc: ast, daniel, andrii, martin.lau, sdf, hawk, john.fastabend, kuba,
	Vladimir Vdovin

Add a device-bound RX-metadata kfunc that reports the hardware
checksum verdict (enum xdp_csum_status: XDP_CSUM_NONE / XDP_CSUM_VERIFIED)
through a new xmo_rx_csum operation, so an XDP program can make an
informed decision (e.g. call bpf_xdp_assert_rx_csum() only on a verified
packet) instead of asserting the checksum blindly.  Wire it into the
XDP_METADATA_KFUNC machinery and advertise it via
NETDEV_XDP_RX_METADATA_CSUM.

Signed-off-by: Vladimir Vdovin <deliran@verdict.gg>
---
 Documentation/netlink/specs/netdev.yaml |  5 +++++
 include/net/xdp.h                       | 12 ++++++++++++
 include/uapi/linux/netdev.h             |  3 +++
 net/core/xdp.c                          | 23 +++++++++++++++++++++++
 tools/include/uapi/linux/netdev.h       |  3 +++
 5 files changed, 46 insertions(+)

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index 5f143da7458c..86017f7402d9 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -61,6 +61,11 @@ definitions:
         doc: |
           Device is capable of exposing receive packet VLAN tag via
           bpf_xdp_metadata_rx_vlan_tag().
+      -
+        name: csum
+        doc: |
+          Device is capable of exposing receive packet checksum status via
+          bpf_xdp_metadata_rx_csum().
   -
     type: flags
     name: xsk-flags
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 5a1e2cc9c312..40f6fba41962 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -597,6 +597,10 @@ void xdp_attachment_setup(struct xdp_attachment_info *info,
 			   NETDEV_XDP_RX_METADATA_VLAN_TAG, \
 			   bpf_xdp_metadata_rx_vlan_tag, \
 			   xmo_rx_vlan_tag) \
+	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_CSUM, \
+			   NETDEV_XDP_RX_METADATA_CSUM, \
+			   bpf_xdp_metadata_rx_csum, \
+			   xmo_rx_csum) \
 
 enum xdp_rx_metadata {
 #define XDP_METADATA_KFUNC(name, _, __, ___) name,
@@ -654,12 +658,20 @@ enum xdp_rss_hash_type {
 	XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP | XDP_RSS_L3_DYNHDR,
 };
 
+/* Checksum status reported by bpf_xdp_metadata_rx_csum(). */
+enum xdp_csum_status {
+	XDP_CSUM_NONE = 0,	/* HW did not validate the checksum */
+	XDP_CSUM_VERIFIED,	/* HW validated the L4 checksum; it is correct */
+};
+
 struct xdp_metadata_ops {
 	int	(*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
 	int	(*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
 			       enum xdp_rss_hash_type *rss_type);
 	int	(*xmo_rx_vlan_tag)(const struct xdp_md *ctx, __be16 *vlan_proto,
 				   u16 *vlan_tci);
+	int	(*xmo_rx_csum)(const struct xdp_md *ctx,
+			       enum xdp_csum_status *csum_status);
 };
 
 #ifdef CONFIG_NET
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 2f3ab75e8cc0..99cda716f0ee 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -47,11 +47,14 @@ enum netdev_xdp_act {
  *   hash via bpf_xdp_metadata_rx_hash().
  * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive
  *   packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
+ * @NETDEV_XDP_RX_METADATA_CSUM: Device is capable of exposing receive packet
+ *   checksum status via bpf_xdp_metadata_rx_csum().
  */
 enum netdev_xdp_rx_metadata {
 	NETDEV_XDP_RX_METADATA_TIMESTAMP = 1,
 	NETDEV_XDP_RX_METADATA_HASH = 2,
 	NETDEV_XDP_RX_METADATA_VLAN_TAG = 4,
+	NETDEV_XDP_RX_METADATA_CSUM = 8,
 };
 
 /**
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 63ee36ec93de..7f4b5c6f7c87 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -964,6 +964,29 @@ __bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
 	return -EOPNOTSUPP;
 }
 
+/**
+ * bpf_xdp_metadata_rx_csum - Read the device's RX checksum verdict.
+ * @ctx: XDP context pointer.
+ * @csum_status: Destination pointer for the checksum status.
+ *
+ * Report what the hardware concluded about the packet's checksum, so the
+ * program can decide whether to assert it (e.g. via bpf_xdp_assert_rx_csum()
+ * before a cpumap redirect) instead of having the stack validate it again.
+ *
+ * On ``XDP_CSUM_VERIFIED`` the device has checked the L4 checksum and it is
+ * correct. ``XDP_CSUM_NONE`` means the device did not validate it.
+ *
+ * Return:
+ * * Returns 0 on success or ``-errno`` on error.
+ * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc
+ * * ``-ENODATA``    : checksum information is not available
+ */
+__bpf_kfunc int bpf_xdp_metadata_rx_csum(const struct xdp_md *ctx,
+					 enum xdp_csum_status *csum_status)
+{
+	return -EOPNOTSUPP;
+}
+
 /**
  * bpf_xdp_assert_rx_csum - Assert the packet's L4 checksum is correct.
  * @ctx: XDP context pointer.
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 2f3ab75e8cc0..99cda716f0ee 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -47,11 +47,14 @@ enum netdev_xdp_act {
  *   hash via bpf_xdp_metadata_rx_hash().
  * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive
  *   packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
+ * @NETDEV_XDP_RX_METADATA_CSUM: Device is capable of exposing receive packet
+ *   checksum status via bpf_xdp_metadata_rx_csum().
  */
 enum netdev_xdp_rx_metadata {
 	NETDEV_XDP_RX_METADATA_TIMESTAMP = 1,
 	NETDEV_XDP_RX_METADATA_HASH = 2,
 	NETDEV_XDP_RX_METADATA_VLAN_TAG = 4,
+	NETDEV_XDP_RX_METADATA_CSUM = 8,
 };
 
 /**
-- 
2.47.0



* [RFC PATCH bpf-next v1 4/7] net/mlx5e: support the rx_csum XDP metadata hint
  2026-06-30 19:15 [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Vladimir Vdovin
                   ` (2 preceding siblings ...)
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 3/7] xdp: add bpf_xdp_metadata_rx_csum() RX metadata kfunc Vladimir Vdovin
@ 2026-06-30 19:15 ` Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 5/7] ice: " Vladimir Vdovin
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Vladimir Vdovin @ 2026-06-30 19:15 UTC
  To: bpf, netdev
  Cc: ast, daniel, andrii, martin.lau, sdf, hawk, john.fastabend, kuba,
	Vladimir Vdovin

Implement xmo_rx_csum by reading CQE_L3_OK/CQE_L4_OK, mirroring the
verdict mlx5e_handle_csum() uses for CHECKSUM_UNNECESSARY.
CHECKSUM_COMPLETE is intentionally not surfaced: it is already
disabled while an XDP program is loaded.

Signed-off-by: Vladimir Vdovin <deliran@verdict.gg>
---
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  | 23 +++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index d8c7cb8837d7..6ac06bd24c79 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -277,10 +277,33 @@ static int mlx5e_xdp_rx_vlan_tag(const struct xdp_md *ctx, __be16 *vlan_proto,
 	return 0;
 }
 
+static int mlx5e_xdp_rx_csum(const struct xdp_md *ctx,
+			     enum xdp_csum_status *csum_status)
+{
+	const struct mlx5e_xdp_buff *_ctx = (void *)ctx;
+	const struct mlx5_cqe64 *cqe = _ctx->cqe;
+
+	if (unlikely(!(_ctx->xdp.rxq->dev->features & NETIF_F_RXCSUM)))
+		return -ENODATA;
+
+	/* Same verdict the normal RX path uses for CHECKSUM_UNNECESSARY.
+	 * CHECKSUM_COMPLETE is deliberately not surfaced here: it is disabled
+	 * while an XDP program is loaded (see mlx5e_handle_csum()).
+	 */
+	if (likely((cqe->hds_ip_ext & CQE_L3_OK) &&
+		   (cqe->hds_ip_ext & CQE_L4_OK)))
+		*csum_status = XDP_CSUM_VERIFIED;
+	else
+		*csum_status = XDP_CSUM_NONE;
+
+	return 0;
+}
+
 const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = {
 	.xmo_rx_timestamp		= mlx5e_xdp_rx_timestamp,
 	.xmo_rx_hash			= mlx5e_xdp_rx_hash,
 	.xmo_rx_vlan_tag		= mlx5e_xdp_rx_vlan_tag,
+	.xmo_rx_csum			= mlx5e_xdp_rx_csum,
 };
 
 struct mlx5e_xsk_tx_complete {
-- 
2.47.0



* [RFC PATCH bpf-next v1 5/7] ice: support the rx_csum XDP metadata hint
  2026-06-30 19:15 [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Vladimir Vdovin
                   ` (3 preceding siblings ...)
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 4/7] net/mlx5e: support the rx_csum XDP metadata hint Vladimir Vdovin
@ 2026-06-30 19:15 ` Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 6/7] veth: " Vladimir Vdovin
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Vladimir Vdovin @ 2026-06-30 19:15 UTC
  To: bpf, netdev
  Cc: ast, daniel, andrii, martin.lau, sdf, hawk, john.fastabend, kuba,
	Vladimir Vdovin

Implement xmo_rx_csum from the Rx flex descriptor status0 bits
(L3L4P set and no XSUM_L4E), mirroring ice_rx_csum().  Return -ENODATA
when RX checksum offload (NETIF_F_RXCSUM) is disabled, since the status
bits are not meaningful then.

Signed-off-by: Vladimir Vdovin <deliran@verdict.gg>
---
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 32 +++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index e695a664e53d..d13c5e76bc13 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -594,8 +594,40 @@ static int ice_xdp_rx_vlan_tag(const struct xdp_md *ctx, __be16 *vlan_proto,
 	return 0;
 }
 
+/**
+ * ice_xdp_rx_csum - RX checksum XDP hint handler
+ * @ctx: XDP buff pointer
+ * @csum_status: destination for the checksum verdict
+ *
+ * Report whether the hardware validated the packet's L4 checksum, mirroring
+ * the verdict ice_rx_csum() uses for CHECKSUM_UNNECESSARY.  Return -ENODATA
+ * when RX checksum offload is disabled, since the status bits are not
+ * meaningful then.
+ */
+static int ice_xdp_rx_csum(const struct xdp_md *ctx,
+			   enum xdp_csum_status *csum_status)
+{
+	const struct libeth_xdp_buff *xdp_ext = (void *)ctx;
+	struct ice_rx_ring *rx_ring;
+	u16 status0;
+
+	rx_ring = libeth_xdp_buff_to_rq(xdp_ext, typeof(*rx_ring), xdp_rxq);
+	if (!(rx_ring->netdev->features & NETIF_F_RXCSUM))
+		return -ENODATA;
+
+	status0 = le16_to_cpu(xdp_ext->desc->wb.status_error0);
+	if ((status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_L3L4P_S)) &&
+	    !(status0 & BIT(ICE_RX_FLEX_DESC_STATUS0_XSUM_L4E_S)))
+		*csum_status = XDP_CSUM_VERIFIED;
+	else
+		*csum_status = XDP_CSUM_NONE;
+
+	return 0;
+}
+
 const struct xdp_metadata_ops ice_xdp_md_ops = {
 	.xmo_rx_timestamp		= ice_xdp_rx_hw_ts,
 	.xmo_rx_hash			= ice_xdp_rx_hash,
 	.xmo_rx_vlan_tag		= ice_xdp_rx_vlan_tag,
+	.xmo_rx_csum			= ice_xdp_rx_csum,
 };
-- 
2.47.0



* [RFC PATCH bpf-next v1 6/7] veth: support the rx_csum XDP metadata hint
  2026-06-30 19:15 [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Vladimir Vdovin
                   ` (4 preceding siblings ...)
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 5/7] ice: " Vladimir Vdovin
@ 2026-06-30 19:15 ` Vladimir Vdovin
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 7/7] selftests/bpf: cover bpf_xdp_metadata_rx_csum in xdp_metadata Vladimir Vdovin
  2026-06-30 21:18 ` [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Stanislav Fomichev
  7 siblings, 0 replies; 11+ messages in thread
From: Vladimir Vdovin @ 2026-06-30 19:15 UTC
  To: bpf, netdev
  Cc: ast, daniel, andrii, martin.lau, sdf, hawk, john.fastabend, kuba,
	Vladimir Vdovin

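Implement xmo_rx_csum from skb->ip_summed.  veth has no real hardware, so
the hint surfaces whatever checksum verdict the skb already carries:
CHECKSUM_UNNECESSARY and CHECKSUM_PARTIAL are reported as
XDP_CSUM_VERIFIED, anything else as XDP_CSUM_NONE, and -ENODATA is
returned when the frame has no skb.  This also makes the metadata kfunc
testable without a NIC.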

Signed-off-by: Vladimir Vdovin <deliran@verdict.gg>
---
 drivers/net/veth.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 1c5142149175..b7bc5a3b07e5 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1700,6 +1700,28 @@ static int veth_xdp_rx_vlan_tag(const struct xdp_md *ctx, __be16 *vlan_proto,
 	return err;
 }
 
+static int veth_xdp_rx_csum(const struct xdp_md *ctx,
+			    enum xdp_csum_status *csum_status)
+{
+	const struct veth_xdp_buff *_ctx = (void *)ctx;
+	const struct sk_buff *skb = _ctx->skb;
+
+	if (!skb)
+		return -ENODATA;
+
+	/* veth has no real hardware; surface whatever checksum verdict the
+	 * skb already carries (e.g. CHECKSUM_PARTIAL/UNNECESSARY from a local
+	 * sender or a previous validation).
+	 */
+	if (skb->ip_summed == CHECKSUM_UNNECESSARY ||
+	    skb->ip_summed == CHECKSUM_PARTIAL)
+		*csum_status = XDP_CSUM_VERIFIED;
+	else
+		*csum_status = XDP_CSUM_NONE;
+
+	return 0;
+}
+
 static const struct net_device_ops veth_netdev_ops = {
 	.ndo_init            = veth_dev_init,
 	.ndo_open            = veth_open,
@@ -1725,6 +1747,7 @@ static const struct xdp_metadata_ops veth_xdp_metadata_ops = {
 	.xmo_rx_timestamp		= veth_xdp_rx_timestamp,
 	.xmo_rx_hash			= veth_xdp_rx_hash,
 	.xmo_rx_vlan_tag		= veth_xdp_rx_vlan_tag,
+	.xmo_rx_csum			= veth_xdp_rx_csum,
 };
 
 #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \
-- 
2.47.0



* [RFC PATCH bpf-next v1 7/7] selftests/bpf: cover bpf_xdp_metadata_rx_csum in xdp_metadata
  2026-06-30 19:15 [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Vladimir Vdovin
                   ` (5 preceding siblings ...)
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 6/7] veth: " Vladimir Vdovin
@ 2026-06-30 19:15 ` Vladimir Vdovin
  2026-06-30 21:18 ` [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Stanislav Fomichev
  7 siblings, 0 replies; 11+ messages in thread
From: Vladimir Vdovin @ 2026-06-30 19:15 UTC (permalink / raw)
  To: bpf, netdev
  Cc: ast, daniel, andrii, martin.lau, sdf, hawk, john.fastabend, kuba,
	Vladimir Vdovin

Call bpf_xdp_metadata_rx_csum() in the xdp_metadata program and export the
status to userspace.  veth surfaces skb->ip_summed: a frame injected via
AF_XDP carries no checksum context (XDP_CSUM_NONE), while one sent through
the stack is CHECKSUM_PARTIAL (XDP_CSUM_VERIFIED).  Assert each.

Signed-off-by: Vladimir Vdovin <deliran@verdict.gg>
---
 tools/testing/selftests/bpf/prog_tests/xdp_metadata.c | 10 ++++++++++
 tools/testing/selftests/bpf/progs/xdp_metadata.c      |  9 +++++++++
 tools/testing/selftests/bpf/xdp_metadata.h            |  8 ++++++++
 3 files changed, 27 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
index 5c31054ad4a4..77f55696eb78 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
@@ -310,6 +310,16 @@ static int verify_xsk_metadata(struct xsk *xsk, bool sent_from_af_xdp)
 	if (!ASSERT_NEQ(meta->rx_hash, 0, "rx_hash"))
 		return -1;
 
+	/* veth surfaces the checksum verdict from skb->ip_summed.  A packet
+	 * injected via AF_XDP carries no checksum context and is CHECKSUM_NONE,
+	 * while one sent through the stack is CHECKSUM_PARTIAL and reads back as
+	 * verified.
+	 */
+	if (!ASSERT_EQ(meta->rx_csum_status,
+		       sent_from_af_xdp ? XDP_META_CSUM_NONE : XDP_META_CSUM_VERIFIED,
+		       "rx_csum_status"))
+		return -1;
+
 	if (!sent_from_af_xdp) {
 		if (!ASSERT_NEQ(meta->rx_hash_type & XDP_RSS_TYPE_L4, 0, "rx_hash_type"))
 			return -1;
diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c
index 09bb8a038d52..0089c6c5a2e4 100644
--- a/tools/testing/selftests/bpf/progs/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c
@@ -33,6 +33,8 @@ extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
 extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
 					__be16 *vlan_proto,
 					__u16 *vlan_tci) __ksym;
+extern int bpf_xdp_metadata_rx_csum(const struct xdp_md *ctx,
+				    enum xdp_csum_status *csum_status) __ksym;
 
 SEC("xdp")
 int rx(struct xdp_md *ctx)
@@ -43,6 +45,7 @@ int rx(struct xdp_md *ctx)
 	struct udphdr *udp = NULL;
 	struct iphdr *iph = NULL;
 	struct xdp_meta *meta;
+	enum xdp_csum_status csum_status;
 	u64 timestamp = -1;
 	int ret;
 
@@ -99,6 +102,12 @@ int rx(struct xdp_md *ctx)
 	bpf_xdp_metadata_rx_vlan_tag(ctx, &meta->rx_vlan_proto,
 				     &meta->rx_vlan_tci);
 
+	ret = bpf_xdp_metadata_rx_csum(ctx, &csum_status);
+	if (ret < 0)
+		meta->rx_csum_err = ret;
+	else
+		meta->rx_csum_status = csum_status;
+
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
 }
 
diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
index 87318ad1117a..ba1b2902b371 100644
--- a/tools/testing/selftests/bpf/xdp_metadata.h
+++ b/tools/testing/selftests/bpf/xdp_metadata.h
@@ -30,6 +30,10 @@ enum xdp_meta_field {
 	XDP_META_FIELD_VLAN_TAG	= BIT(2),
 };
 
+/* Mirror of enum xdp_csum_status (include/net/xdp.h) for userspace asserts. */
+#define XDP_META_CSUM_NONE	0
+#define XDP_META_CSUM_VERIFIED	1
+
 struct xdp_meta {
 	union {
 		__u64 rx_timestamp;
@@ -48,5 +52,9 @@ struct xdp_meta {
 		};
 		__s32 rx_vlan_tag_err;
 	};
+	union {
+		__u32 rx_csum_status;
+		__s32 rx_csum_err;
+	};
 	enum xdp_meta_field hint_valid;
 };
-- 
2.47.0



* Re: [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect
  2026-06-30 19:15 [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Vladimir Vdovin
                   ` (6 preceding siblings ...)
  2026-06-30 19:15 ` [RFC PATCH bpf-next v1 7/7] selftests/bpf: cover bpf_xdp_metadata_rx_csum in xdp_metadata Vladimir Vdovin
@ 2026-06-30 21:18 ` Stanislav Fomichev
  2026-06-30 22:16   ` Lorenzo Bianconi
  7 siblings, 1 reply; 11+ messages in thread
From: Stanislav Fomichev @ 2026-06-30 21:18 UTC (permalink / raw)
  To: Vladimir Vdovin
  Cc: bpf, netdev, ast, daniel, andrii, martin.lau, sdf, hawk,
	john.fastabend, kuba, lorenzo

On 06/30, Vladimir Vdovin wrote:
> This series lets XDP programs work with the hardware RX checksum verdict:
> read what the NIC concluded about a packet, and carry a "the L4 checksum
> is correct" assertion across a redirect so the stack does not revalidate
> it in software.
> 
> When an XDP program redirects a frame to a cpumap (or any other path that
> rebuilds an skb from an xdp_frame via __xdp_build_skb_from_frame()), the
> HW RX checksum status is lost and the stack revalidates the L4 checksum in
> software.
> 
> Two kfuncs are added:
> 
>  - bpf_xdp_metadata_rx_csum(): a device-bound RX-metadata hint, like the
>    existing rx_hash / rx_vlan_tag ones.  It reports enum xdp_csum_status
>    (XDP_CSUM_NONE / XDP_CSUM_VERIFIED) and is implemented for mlx5e, ice
>    and veth.
> 
>  - bpf_xdp_assert_rx_csum(): a generic, non-device-bound kfunc that lets
>    the program assert the L4 checksum is correct.  It sets a buff flag
>    that rides into the xdp_frame, and __xdp_build_skb_from_frame() turns
>    it into skb->ip_summed = CHECKSUM_UNNECESSARY.  The kernel cannot
>    verify the assertion; the program takes responsibility, as it already
>    does when rewriting packet contents.
> 
> Posted as RFC to get feedback on:
> 
>  - whether the read hint (bpf_xdp_metadata_rx_csum() and its driver
>    support) belongs in this series at all.  bpf_xdp_assert_rx_csum() is
>    self-contained and already covers the main use case: a program that
>    computes or fixes the L4 checksum itself, or trusts the source, and
>    wants the rebuilt skb to skip software revalidation.  The read hint is
>    an optimization for programs that did not touch the payload and only
>    want to relay the hardware verdict.  These could just as well be two
>    independent series (assert-only first);
>  - the kfunc naming, bpf_xdp_assert_rx_csum() in particular.
> 
> Testing:
> 
>  - new selftest xdp_cpumap_rx_csum drives a frame through a native-XDP
>    veth into a cpumap redirect and checks, via fexit on
>    __xdp_build_skb_from_frame(), that the rebuilt skb is
>    CHECKSUM_UNNECESSARY iff the program called bpf_xdp_assert_rx_csum();
>  - xdp_metadata calls bpf_xdp_metadata_rx_csum() over veth and checks both
>    verdicts: XDP_CSUM_NONE for an AF_XDP-injected frame and
>    XDP_CSUM_VERIFIED for one sent through the stack.

Lorenzo posted something similar fairly recently (and it had a fair bit of
discussion), but there hasn't been a follow-up:
https://lore.kernel.org/bpf/20260217-bpf-xdp-meta-rxcksum-v3-0-30024c50ba71@kernel.org/


* Re: [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect
  2026-06-30 21:18 ` [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Stanislav Fomichev
@ 2026-06-30 22:16   ` Lorenzo Bianconi
  2026-07-01 17:10     ` Vladimir Vdovin
  0 siblings, 1 reply; 11+ messages in thread
From: Lorenzo Bianconi @ 2026-06-30 22:16 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Vladimir Vdovin, bpf, netdev, ast, daniel, andrii, martin.lau,
	sdf, hawk, john.fastabend, kuba


On Jun 30, Stanislav Fomichev wrote:
> On 06/30, Vladimir Vdovin wrote:
> > This series lets XDP programs work with the hardware RX checksum verdict:
> > read what the NIC concluded about a packet, and carry a "the L4 checksum
> > is correct" assertion across a redirect so the stack does not revalidate
> > it in software.
> > 
> > When an XDP program redirects a frame to a cpumap (or any other path that
> > rebuilds an skb from an xdp_frame via __xdp_build_skb_from_frame()), the
> > HW RX checksum status is lost and the stack revalidates the L4 checksum in
> > software.
> > 
> > Two kfuncs are added:
> > 
> >  - bpf_xdp_metadata_rx_csum(): a device-bound RX-metadata hint, like the
> >    existing rx_hash / rx_vlan_tag ones.  It reports enum xdp_csum_status
> >    (XDP_CSUM_NONE / XDP_CSUM_VERIFIED) and is implemented for mlx5e, ice
> >    and veth.
> > 
> >  - bpf_xdp_assert_rx_csum(): a generic, non-device-bound kfunc that lets
> >    the program assert the L4 checksum is correct.  It sets a buff flag
> >    that rides into the xdp_frame, and __xdp_build_skb_from_frame() turns
> >    it into skb->ip_summed = CHECKSUM_UNNECESSARY.  The kernel cannot
> >    verify the assertion; the program takes responsibility, as it already
> >    does when rewriting packet contents.
> > 
> > Posted as RFC to get feedback on:
> > 
> >  - whether the read hint (bpf_xdp_metadata_rx_csum() and its driver
> >    support) belongs in this series at all.  bpf_xdp_assert_rx_csum() is
> >    self-contained and already covers the main use case: a program that
> >    computes or fixes the L4 checksum itself, or trusts the source, and
> >    wants the rebuilt skb to skip software revalidation.  The read hint is
> >    an optimization for programs that did not touch the payload and only
> >    want to relay the hardware verdict.  These could just as well be two
> >    independent series (assert-only first);
> >  - the kfunc naming, bpf_xdp_assert_rx_csum() in particular.
> > 
> > Testing:
> > 
> >  - new selftest xdp_cpumap_rx_csum drives a frame through a native-XDP
> >    veth into a cpumap redirect and checks, via fexit on
> >    __xdp_build_skb_from_frame(), that the rebuilt skb is
> >    CHECKSUM_UNNECESSARY iff the program called bpf_xdp_assert_rx_csum();
> >  - xdp_metadata calls bpf_xdp_metadata_rx_csum() over veth and checks both
> >    verdicts: XDP_CSUM_NONE for an AF_XDP-injected frame and
> >    XDP_CSUM_VERIFIED for one sent through the stack.
> 
> Lorenzo posted something similar fairly recently (and it had a fair bit of
> discussion), but there hasn't been a follow-up:
> https://lore.kernel.org/bpf/20260217-bpf-xdp-meta-rxcksum-v3-0-30024c50ba71@kernel.org/

Hi Vladimir and Stanislav,

AFAIK, my series is only missing the driver selftest requested by Jakub.
I have not had time to look into it yet.
@Vladimir: if you have any free cycles, would you be willing to add the missing
test requested by Jakub to my series? Thanks in advance.

Regards,
Lorenzo



* Re: [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect
  2026-06-30 22:16   ` Lorenzo Bianconi
@ 2026-07-01 17:10     ` Vladimir Vdovin
  0 siblings, 0 replies; 11+ messages in thread
From: Vladimir Vdovin @ 2026-07-01 17:10 UTC (permalink / raw)
  To: lorenzo, sdf, kuba
  Cc: andrii, ast, daniel, hawk, john.fastabend, martin.lau, sdf.kernel,
	bpf, netdev, Vladimir Vdovin

Hi Lorenzo,

Sorry, I completely missed your earlier RX-checksum series before I posted
mine. Thanks, Stanislav, for the pointer.

To answer your question: yes, I'm happy to take on the driver selftest
Jakub asked for.

As for my own series, the read side clearly overlaps yours and you own it,
so I'll drop my bpf_xdp_metadata_rx_csum() hint and its driver bits.

What's left that is genuinely separate is the "assertion" half -- a non-dev-bound
bpf_xdp_assert_rx_csum() that preserves the HW verdict across a
cpumap/redirect: it sets a flag on the xdp_buff that rides into the
xdp_frame and becomes skb->ip_summed = CHECKSUM_UNNECESSARY in
__xdp_build_skb_from_frame().
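
To make that flow concrete, here is a minimal userspace model of it. This
is only a sketch: the model_* names, the struct layouts and the flag bit
are placeholders invented for illustration, not the definitions from the
patch. Only the two ip_summed values mirror the kernel's CHECKSUM_NONE (0)
and CHECKSUM_UNNECESSARY (1).

```c
#include <assert.h>
#include <stdint.h>

/* Mirror the kernel's skb->ip_summed values for NONE and UNNECESSARY. */
enum { MODEL_CHECKSUM_NONE = 0, MODEL_CHECKSUM_UNNECESSARY = 1 };

/* Hypothetical flag bit; the real name and value come from the series. */
#define MODEL_FLAG_RX_CSUM_ASSERTED (1u << 0)

/* Illustrative stand-ins for xdp_buff, xdp_frame and sk_buff. */
struct model_xdp_buff  { uint32_t flags; };
struct model_xdp_frame { uint32_t flags; };
struct model_skb       { uint8_t ip_summed; };

/* Stand-in for the kfunc: record the program's assertion on the buff. */
static void model_assert_rx_csum(struct model_xdp_buff *xdp)
{
	assert(xdp);
	xdp->flags |= MODEL_FLAG_RX_CSUM_ASSERTED;
}

/* buff -> frame conversion: the flags ride along unchanged. */
static struct model_xdp_frame
model_buff_to_frame(const struct model_xdp_buff *xdp)
{
	struct model_xdp_frame frame = { .flags = xdp->flags };

	return frame;
}

/* frame -> skb rebuild: the asserted flag becomes CHECKSUM_UNNECESSARY. */
static struct model_skb model_build_skb(const struct model_xdp_frame *frame)
{
	struct model_skb skb = { .ip_summed = MODEL_CHECKSUM_NONE };

	if (frame->flags & MODEL_FLAG_RX_CSUM_ASSERTED)
		skb.ip_summed = MODEL_CHECKSUM_UNNECESSARY;
	return skb;
}
```

Without the assert call the rebuilt skb stays at CHECKSUM_NONE in this
model, which corresponds to the stack revalidating the checksum in
software.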

Should I resend that as a small standalone series (v2, assert-only)?
It also looks similar to a PoC that you and Jakub discussed on v3 [1].

A few things I'd like to confirm before writing the test:

1. API for v4. In the v3 discussion you agreed to rework the API to report
   both COMPLETE and UNNECESSARY (+ csum_level), per Jakub. Do you plan to
   send that in v4, or should the driver selftest target the current v3
   signature (enum xdp_checksum + cksum_meta)? I'd rather write the test
   against the API you intend to keep.

2. Documented behavior. The selftest is meant to "check the documented
   expectation", so which rule should it assert -- "a driver must never
   report CHECKSUM_COMPLETE while an XDP program is attached", or that the
   driver downgrades/repairs COMPLETE on the XDP_PASS path? I'll write the
   doc paragraph and the test to match whatever we settle on.

3. Drivers. Your series adds veth and ice; I don't see mlx5e -- was that
   intentional (left to the driver maintainers)? I had an mlx5e
   implementation in my v1 and I'm happy to contribute it to your series if
   it's useful.

For the test itself I was thinking of extending
tools/testing/selftests/drivers/net/hw/xdp_metadata.py, gated on the new
"checksum" xdp-rx-metadata feature, with good-csum / bad-csum / modify +
XDP_PASS cases. Does that match what you and Jakub had in mind?
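
For the good-csum / bad-csum cases the test packets need a valid or a
deliberately corrupted checksum. As a rough illustration of what "good"
means there, below is a plain C sketch of the RFC 1071 one's-complement
checksum. The helper names are made up for this example, and it leaves out
the pseudo-header that real UDP/TCP checksums also cover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of big-endian 16-bit words, folded to 16 bits.
 * Intended for packet-sized buffers; the 32-bit accumulator is not
 * guarded against overflow on very large inputs.
 */
static uint16_t csum_fold_sum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(data || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)	/* odd trailing byte is padded with zero */
		sum += (uint32_t)data[len - 1] << 8;
	while (sum >> 16)	/* fold carries back into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Value to store in the checksum field, computed with that field zeroed. */
static uint16_t csum_compute(const uint8_t *data, size_t len)
{
	return (uint16_t)~csum_fold_sum(data, len);
}

/* A buffer that includes its checksum field is good if it sums to 0xffff. */
static int csum_is_good(const uint8_t *data, size_t len)
{
	return csum_fold_sum(data, len) == 0xffff;
}
```

A "modify" case would change a payload byte and then either recompute the
field with csum_compute() or leave it stale, depending on which verdict
the test expects.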

[1] https://lore.kernel.org/bpf/20260217-bpf-xdp-meta-rxcksum-v3-0-30024c50ba71@kernel.org/

Thanks,
Vladimir


end of thread, other threads:[~2026-07-01 17:11 UTC | newest]

Thread overview: 11+ messages
2026-06-30 19:15 [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Vladimir Vdovin
2026-06-30 19:15 ` [RFC PATCH bpf-next v1 1/7] xdp: let XDP programs assert the RX checksum " Vladimir Vdovin
2026-06-30 19:15 ` [RFC PATCH bpf-next v1 2/7] selftests/bpf: add test for bpf_xdp_assert_rx_csum over cpumap Vladimir Vdovin
2026-06-30 19:15 ` [RFC PATCH bpf-next v1 3/7] xdp: add bpf_xdp_metadata_rx_csum() RX metadata kfunc Vladimir Vdovin
2026-06-30 19:15 ` [RFC PATCH bpf-next v1 4/7] net/mlx5e: support the rx_csum XDP metadata hint Vladimir Vdovin
2026-06-30 19:15 ` [RFC PATCH bpf-next v1 5/7] ice: " Vladimir Vdovin
2026-06-30 19:15 ` [RFC PATCH bpf-next v1 6/7] veth: " Vladimir Vdovin
2026-06-30 19:15 ` [RFC PATCH bpf-next v1 7/7] selftests/bpf: cover bpf_xdp_metadata_rx_csum in xdp_metadata Vladimir Vdovin
2026-06-30 21:18 ` [RFC PATCH bpf-next v1 0/7] xdp: RX checksum metadata hint and checksum assertion over redirect Stanislav Fomichev
2026-06-30 22:16   ` Lorenzo Bianconi
2026-07-01 17:10     ` Vladimir Vdovin
