netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/6] XDP metadata support for tun driver
@ 2025-02-17 17:23 Marcus Wichelmann
  2025-02-17 17:23 ` [PATCH bpf-next v2 1/6] net: tun: enable XDP metadata support Marcus Wichelmann
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-17 17:23 UTC (permalink / raw)
  To: netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

Hi all,

Thank you for your review comments. Here is an updated patch series with
the requested changes.

To add a selftest for the metadata support of the tun driver, I refactored
an existing "xdp_context_functional" test which already tested something
similar but for the veth driver. I made the testing logic behind it more
reusable so that it also works for the tun driver and possibly other
drivers in the future.

The last patch ("fix file descriptor assertion in open_tuntap helper")
fixes an assertion in an existing helper function that I just moved and
reused. Somehow the file descriptor for /dev/net/tun turned out to be 0
when running in the BPF kernel-patches GitHub CI, so the assert condition
needed adjustment:
https://github.com/kernel-patches/bpf/actions/runs/13339140896

Successful pipeline:
https://github.com/kernel-patches/bpf/actions/runs/13372306548

---

v2:
- submit against bpf-next subtree
- split commits and improved commit messages
- remove redundant metasize check and add clarifying comment instead
- use max() instead of ternary operator
- add selftest for metadata support in the tun driver

v1: https://lore.kernel.org/all/20250130171614.1657224-1-marcus.wichelmann@hetzner-cloud.de/

Marcus Wichelmann (6):
  net: tun: enable XDP metadata support
  net: tun: enable transfer of XDP metadata to skb
  selftests/bpf: move open_tuntap to network helpers
  selftests/bpf: refactor xdp_context_functional test and bpf program
  selftests/bpf: add test for XDP metadata support in tun driver
  selftests/bpf: fix file descriptor assertion in open_tuntap helper

 drivers/net/tun.c                             |  24 ++-
 tools/testing/selftests/bpf/network_helpers.c |  28 ++++
 tools/testing/selftests/bpf/network_helpers.h |   3 +
 .../selftests/bpf/prog_tests/lwt_helpers.h    |  29 ----
 .../bpf/prog_tests/xdp_context_test_run.c     | 152 +++++++++++++++---
 .../selftests/bpf/progs/test_xdp_meta.c       |  56 ++++---
 6 files changed, 215 insertions(+), 77 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH bpf-next v2 1/6] net: tun: enable XDP metadata support
  2025-02-17 17:23 [PATCH bpf-next v2 0/6] XDP metadata support for tun driver Marcus Wichelmann
@ 2025-02-17 17:23 ` Marcus Wichelmann
  2025-02-19  3:17   ` Jason Wang
  2025-02-17 17:23 ` [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb Marcus Wichelmann
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-17 17:23 UTC (permalink / raw)
  To: netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

Enable the support for the bpf_xdp_adjust_meta helper function for XDP
buffers initialized by the tun driver. This allows to reserve a metadata
area that is useful to pass any information from one XDP program to
another one, for example when using tail-calls.

Whether this helper function can be used in an XDP program depends on
how the xdp_buff was initialized. Most net drivers initialize the
xdp_buff in a way, that allows bpf_xdp_adjust_meta to be used. In case
of the tun driver, this is currently not the case.

There are two code paths in the tun driver that lead to a
bpf_prog_run_xdp and where metadata support should be enabled:

1. tun_build_skb, which is called by tun_get_user and is used when
   writing packets from userspace into the device. In this case, the
   xdp_buff created in tun_build_skb has no support for
   bpf_xdp_adjust_meta and calls of that helper function result in
   ENOTSUPP.

   For this code path, it's sufficient to set the meta_valid argument of
   the xdp_prepare_buff call. The reserved headroom is large enough
   already.

2. tun_xdp_one, which is called by tun_sendmsg which again is called by
   other drivers (e.g. vhost_net). When the TUN_MSG_PTR mode is used,
   another driver may pass a batch of xdp_buffs to the tun driver. In
   this case, that other driver is the one initializing the xdp_buff.

   See commit 043d222f93ab ("tuntap: accept an array of XDP buffs
   through sendmsg()") for details.

   For now, the vhost_net driver is the only one using TUN_MSG_PTR and
   it already initializes the xdp_buffs with metadata support and
   sufficient headroom. But the tun driver disables it again, so the
   xdp_set_data_meta_invalid call has to be removed.

Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
---
 drivers/net/tun.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 28624cca91f8..c95ab9c46bd2 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1713,7 +1713,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 		u32 act;
 
 		xdp_init_buff(&xdp, buflen, &tfile->xdp_rxq);
-		xdp_prepare_buff(&xdp, buf, pad, len, false);
+		xdp_prepare_buff(&xdp, buf, pad, len, true);
 
 		act = bpf_prog_run_xdp(xdp_prog, &xdp);
 		if (act == XDP_REDIRECT || act == XDP_TX) {
@@ -2471,7 +2471,6 @@ static int tun_xdp_one(struct tun_struct *tun,
 		}
 
 		xdp_init_buff(xdp, buflen, &tfile->xdp_rxq);
-		xdp_set_data_meta_invalid(xdp);
 
 		act = bpf_prog_run_xdp(xdp_prog, xdp);
 		ret = tun_xdp_act(tun, xdp_prog, xdp, act);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb
  2025-02-17 17:23 [PATCH bpf-next v2 0/6] XDP metadata support for tun driver Marcus Wichelmann
  2025-02-17 17:23 ` [PATCH bpf-next v2 1/6] net: tun: enable XDP metadata support Marcus Wichelmann
@ 2025-02-17 17:23 ` Marcus Wichelmann
  2025-02-18  1:45   ` Willem de Bruijn
  2025-02-18  1:47   ` Willem de Bruijn
  2025-02-17 17:23 ` [PATCH bpf-next v2 3/6] selftests/bpf: move open_tuntap to network helpers Marcus Wichelmann
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-17 17:23 UTC (permalink / raw)
  To: netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

When the XDP metadata area was used, it is expected that the same
metadata can also be accessed from TC, as can be read in the description
of the bpf_xdp_adjust_meta helper function. In the tun driver, this was
not yet implemented.

To make this work, the skb that is being built on XDP_PASS should know
of the current size of the metadata area. This is ensured by adding
calls to skb_metadata_set. For the tun_xdp_one code path, an additional
check is necessary to handle the case where the externally initialized
xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).

More information about this feature can be found in the commit message
of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").

Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
---
 drivers/net/tun.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index c95ab9c46bd2..3dde6cd29a84 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1604,7 +1604,8 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
 
 static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
 				       struct page_frag *alloc_frag, char *buf,
-				       int buflen, int len, int pad)
+				       int buflen, int len, int pad,
+				       int metasize)
 {
 	struct sk_buff *skb = build_skb(buf, buflen);
 
@@ -1613,6 +1614,8 @@ static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
 
 	skb_reserve(skb, pad);
 	skb_put(skb, len);
+	if (metasize)
+		skb_metadata_set(skb, metasize);
 	skb_set_owner_w(skb, tfile->socket.sk);
 
 	get_page(alloc_frag->page);
@@ -1672,6 +1675,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 	char *buf;
 	size_t copied;
 	int pad = TUN_RX_PAD;
+	int metasize = 0;
 	int err = 0;
 
 	rcu_read_lock();
@@ -1699,7 +1703,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 	if (hdr->gso_type || !xdp_prog) {
 		*skb_xdp = 1;
 		return __tun_build_skb(tfile, alloc_frag, buf, buflen, len,
-				       pad);
+				       pad, metasize);
 	}
 
 	*skb_xdp = 0;
@@ -1734,12 +1738,18 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 
 		pad = xdp.data - xdp.data_hard_start;
 		len = xdp.data_end - xdp.data;
+
+		/* It is known that the xdp_buff was prepared with metadata
+		 * support, so no additional check is necessary.
+		 */
+		metasize = xdp.data - xdp.data_meta;
 	}
 	bpf_net_ctx_clear(bpf_net_ctx);
 	rcu_read_unlock();
 	local_bh_enable();
 
-	return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad);
+	return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad,
+			       metasize);
 
 out:
 	bpf_net_ctx_clear(bpf_net_ctx);
@@ -2456,6 +2466,7 @@ static int tun_xdp_one(struct tun_struct *tun,
 	struct sk_buff_head *queue;
 	u32 rxhash = 0, act;
 	int buflen = hdr->buflen;
+	int metasize = 0;
 	int ret = 0;
 	bool skb_xdp = false;
 	struct page *page;
@@ -2510,6 +2521,10 @@ static int tun_xdp_one(struct tun_struct *tun,
 	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	skb_put(skb, xdp->data_end - xdp->data);
 
+	metasize = max(xdp->data - xdp->data_meta, 0);
+	if (metasize)
+		skb_metadata_set(skb, metasize);
+
 	if (virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) {
 		atomic_long_inc(&tun->rx_frame_errors);
 		kfree_skb(skb);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH bpf-next v2 3/6] selftests/bpf: move open_tuntap to network helpers
  2025-02-17 17:23 [PATCH bpf-next v2 0/6] XDP metadata support for tun driver Marcus Wichelmann
  2025-02-17 17:23 ` [PATCH bpf-next v2 1/6] net: tun: enable XDP metadata support Marcus Wichelmann
  2025-02-17 17:23 ` [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb Marcus Wichelmann
@ 2025-02-17 17:23 ` Marcus Wichelmann
  2025-02-18  1:50   ` Willem de Bruijn
  2025-02-17 17:23 ` [PATCH bpf-next v2 4/6] selftests/bpf: refactor xdp_context_functional test and bpf program Marcus Wichelmann
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-17 17:23 UTC (permalink / raw)
  To: netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

To test the XDP metadata functionality of the tun driver, it's necessary
to create a new tap device first. A helper function for this already
exists in lwt_helpers.h. Move it to the common network helpers header,
so it can be reused in other tests.

Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
---
 tools/testing/selftests/bpf/network_helpers.c | 28 ++++++++++++++++++
 tools/testing/selftests/bpf/network_helpers.h |  3 ++
 .../selftests/bpf/prog_tests/lwt_helpers.h    | 29 -------------------
 3 files changed, 31 insertions(+), 29 deletions(-)

diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
index 737a952dcf80..e1cfa1b37754 100644
--- a/tools/testing/selftests/bpf/network_helpers.c
+++ b/tools/testing/selftests/bpf/network_helpers.c
@@ -565,6 +565,34 @@ void close_netns(struct nstoken *token)
 	free(token);
 }
 
+int open_tuntap(const char *dev_name, bool need_mac)
+{
+	int err = 0;
+	struct ifreq ifr;
+	int fd = open("/dev/net/tun", O_RDWR);
+
+	if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)"))
+		return -1;
+
+	ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN);
+	strncpy(ifr.ifr_name, dev_name, IFNAMSIZ - 1);
+	ifr.ifr_name[IFNAMSIZ - 1] = '\0';
+
+	err = ioctl(fd, TUNSETIFF, &ifr);
+	if (!ASSERT_OK(err, "ioctl(TUNSETIFF)")) {
+		close(fd);
+		return -1;
+	}
+
+	err = fcntl(fd, F_SETFL, O_NONBLOCK);
+	if (!ASSERT_OK(err, "fcntl(O_NONBLOCK)")) {
+		close(fd);
+		return -1;
+	}
+
+	return fd;
+}
+
 int get_socket_local_port(int sock_fd)
 {
 	struct sockaddr_storage addr;
diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h
index 9f6e05d886c5..99d1417c1d8b 100644
--- a/tools/testing/selftests/bpf/network_helpers.h
+++ b/tools/testing/selftests/bpf/network_helpers.h
@@ -8,6 +8,7 @@
 typedef __u16 __sum16;
 #include <linux/if_ether.h>
 #include <linux/if_packet.h>
+#include <linux/if_tun.h>
 #include <linux/ip.h>
 #include <linux/ipv6.h>
 #include <linux/ethtool.h>
@@ -98,6 +99,8 @@ int send_recv_data(int lfd, int fd, uint32_t total_bytes);
 int make_netns(const char *name);
 int remove_netns(const char *name);
 
+int open_tuntap(const char *dev_name, bool need_mac);
+
 /**
  * append_tid() - Append thread ID to the given string.
  *
diff --git a/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h b/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h
index fb1eb8c67361..ccec0fcdabc1 100644
--- a/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h
+++ b/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h
@@ -5,7 +5,6 @@
 
 #include <time.h>
 #include <net/if.h>
-#include <linux/if_tun.h>
 #include <linux/icmp.h>
 
 #include "test_progs.h"
@@ -37,34 +36,6 @@ static inline int netns_delete(void)
 	return system("ip netns del " NETNS ">/dev/null 2>&1");
 }
 
-static int open_tuntap(const char *dev_name, bool need_mac)
-{
-	int err = 0;
-	struct ifreq ifr;
-	int fd = open("/dev/net/tun", O_RDWR);
-
-	if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)"))
-		return -1;
-
-	ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN);
-	strncpy(ifr.ifr_name, dev_name, IFNAMSIZ - 1);
-	ifr.ifr_name[IFNAMSIZ - 1] = '\0';
-
-	err = ioctl(fd, TUNSETIFF, &ifr);
-	if (!ASSERT_OK(err, "ioctl(TUNSETIFF)")) {
-		close(fd);
-		return -1;
-	}
-
-	err = fcntl(fd, F_SETFL, O_NONBLOCK);
-	if (!ASSERT_OK(err, "fcntl(O_NONBLOCK)")) {
-		close(fd);
-		return -1;
-	}
-
-	return fd;
-}
-
 #define ICMP_PAYLOAD_SIZE     100
 
 /* Match an ICMP packet with payload len ICMP_PAYLOAD_SIZE */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH bpf-next v2 4/6] selftests/bpf: refactor xdp_context_functional test and bpf program
  2025-02-17 17:23 [PATCH bpf-next v2 0/6] XDP metadata support for tun driver Marcus Wichelmann
                   ` (2 preceding siblings ...)
  2025-02-17 17:23 ` [PATCH bpf-next v2 3/6] selftests/bpf: move open_tuntap to network helpers Marcus Wichelmann
@ 2025-02-17 17:23 ` Marcus Wichelmann
  2025-02-17 17:23 ` [PATCH bpf-next v2 5/6] selftests/bpf: add test for XDP metadata support in tun driver Marcus Wichelmann
  2025-02-17 17:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: fix file descriptor assertion in open_tuntap helper Marcus Wichelmann
  5 siblings, 0 replies; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-17 17:23 UTC (permalink / raw)
  To: netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

The existing XDP metadata test works by creating a veth pair and
attaching XDP & TC programs that drop the packet when the condition of
the test isn't fulfilled. The test then pings through the veth pair and
succeeds when the ping comes through.

While this test works great for a veth pair, it is hard to replicate for
tap devices to test the XDP metadata support of them. A similar test for
the tun driver would either involve logic to reply to the ping request,
or would have to capture the packet to check if it was dropped or not.

To make the testing of other drivers easier while still maximizing code
reuse, this commit refactors the existing xdp_context_functional test to
use a test_result map. Instead of conditionally passing or dropping the
packet, the TC program is changed to copy the received metadata into the
value of that single-entry array map. Tests can then verify that the map
value matches the expectation.

This testing logic is easy to adapt to other network drivers as the only
remaining requirement is that there is some way to send a custom
Ethernet packet through it that triggers the XDP & TC programs.

The payload of the Ethernet packet is used as the test data that is
expected to be passed as metadata from the XDP to the TC program and
written to the map. It has a fixed size of 32 bytes which is a
reasonalbe size that should be supported by both drivers. Additional
packet headers are not necessary for the test and were therefore skipped
to keep the testing code short.

Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
---
 .../bpf/prog_tests/xdp_context_test_run.c     | 93 +++++++++++++++----
 .../selftests/bpf/progs/test_xdp_meta.c       | 56 ++++++-----
 2 files changed, 106 insertions(+), 43 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
index 937da9b7532a..0cd2a0d8ae60 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
@@ -4,13 +4,19 @@
 #include "test_xdp_context_test_run.skel.h"
 #include "test_xdp_meta.skel.h"
 
-#define TX_ADDR "10.0.0.1"
-#define RX_ADDR "10.0.0.2"
 #define RX_NAME "veth0"
 #define TX_NAME "veth1"
 #define TX_NETNS "xdp_context_tx"
 #define RX_NETNS "xdp_context_rx"
 
+#define TEST_PAYLOAD_LEN 32
+static const __u8 test_payload[TEST_PAYLOAD_LEN] = {
+	0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
+	0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18,
+	0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28,
+	0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38,
+};
+
 void test_xdp_context_error(int prog_fd, struct bpf_test_run_opts opts,
 			    __u32 data_meta, __u32 data, __u32 data_end,
 			    __u32 ingress_ifindex, __u32 rx_queue_index,
@@ -112,15 +118,66 @@ void test_xdp_context_test_run(void)
 	test_xdp_context_test_run__destroy(skel);
 }
 
-void test_xdp_context_functional(void)
+int send_test_packet(int ifindex)
+{
+	int n, sock = -1;
+	__u8 packet[sizeof(struct ethhdr) + TEST_PAYLOAD_LEN];
+
+	/* The ethernet header is not relevant for this test and doesn't need to
+	 * be meaningful.
+	 */
+	struct ethhdr eth = { 0 };
+
+	memcpy(packet, &eth, sizeof(eth));
+	memcpy(packet + sizeof(eth), test_payload, TEST_PAYLOAD_LEN);
+
+	sock = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
+	if (!ASSERT_GE(sock, 0, "socket"))
+		goto err;
+
+	struct sockaddr_ll saddr = {
+		.sll_family = PF_PACKET,
+		.sll_ifindex = ifindex,
+		.sll_halen = ETH_ALEN
+	};
+	n = sendto(sock, packet, sizeof(packet), 0, (struct sockaddr *)&saddr,
+		   sizeof(saddr));
+	if (!ASSERT_EQ(n, sizeof(packet), "sendto"))
+		goto err;
+
+	close(sock);
+	return 0;
+
+err:
+	if (sock >= 0)
+		close(sock);
+	return -1;
+}
+
+void assert_test_result(struct test_xdp_meta *skel)
+{
+	int err;
+	__u32 map_key = 0;
+	__u8 map_value[TEST_PAYLOAD_LEN];
+
+	err = bpf_map__lookup_elem(skel->maps.test_result, &map_key,
+				   sizeof(map_key), &map_value,
+				   TEST_PAYLOAD_LEN, BPF_ANY);
+	if (!ASSERT_OK(err, "lookup test_result"))
+		return;
+
+	ASSERT_MEMEQ(&map_value, &test_payload, TEST_PAYLOAD_LEN,
+		     "test_result map contains test payload");
+}
+
+void test_xdp_context_veth(void)
 {
 	LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS);
 	LIBBPF_OPTS(bpf_tc_opts, tc_opts, .handle = 1, .priority = 1);
 	struct netns_obj *rx_ns = NULL, *tx_ns = NULL;
-	struct bpf_program *tc_prog, *xdp_prog;
 	struct test_xdp_meta *skel = NULL;
 	struct nstoken *nstoken = NULL;
-	int rx_ifindex;
+	int rx_ifindex, tx_ifindex;
 	int ret;
 
 	tx_ns = netns_new(TX_NETNS, false);
@@ -138,7 +195,6 @@ void test_xdp_context_functional(void)
 	if (!ASSERT_OK_PTR(nstoken, "setns rx_ns"))
 		goto close;
 
-	SYS(close, "ip addr add " RX_ADDR "/24 dev " RX_NAME);
 	SYS(close, "ip link set dev " RX_NAME " up");
 
 	skel = test_xdp_meta__open_and_load();
@@ -154,21 +210,12 @@ void test_xdp_context_functional(void)
 	if (!ASSERT_OK(ret, "bpf_tc_hook_create"))
 		goto close;
 
-	tc_prog = bpf_object__find_program_by_name(skel->obj, "ing_cls");
-	if (!ASSERT_OK_PTR(tc_prog, "open ing_cls prog"))
-		goto close;
-
-	tc_opts.prog_fd = bpf_program__fd(tc_prog);
+	tc_opts.prog_fd = bpf_program__fd(skel->progs.ing_cls);
 	ret = bpf_tc_attach(&tc_hook, &tc_opts);
 	if (!ASSERT_OK(ret, "bpf_tc_attach"))
 		goto close;
 
-	xdp_prog = bpf_object__find_program_by_name(skel->obj, "ing_xdp");
-	if (!ASSERT_OK_PTR(xdp_prog, "open ing_xdp prog"))
-		goto close;
-
-	ret = bpf_xdp_attach(rx_ifindex,
-			     bpf_program__fd(xdp_prog),
+	ret = bpf_xdp_attach(rx_ifindex, bpf_program__fd(skel->progs.ing_xdp),
 			     0, NULL);
 	if (!ASSERT_GE(ret, 0, "bpf_xdp_attach"))
 		goto close;
@@ -179,9 +226,17 @@ void test_xdp_context_functional(void)
 	if (!ASSERT_OK_PTR(nstoken, "setns tx_ns"))
 		goto close;
 
-	SYS(close, "ip addr add " TX_ADDR "/24 dev " TX_NAME);
 	SYS(close, "ip link set dev " TX_NAME " up");
-	ASSERT_OK(SYS_NOFAIL("ping -c 1 " RX_ADDR), "ping");
+
+	tx_ifindex = if_nametoindex(TX_NAME);
+	if (!ASSERT_GE(tx_ifindex, 0, "if_nametoindex tx"))
+		goto close;
+
+	ret = send_test_packet(tx_ifindex);
+	if (!ASSERT_OK(ret, "send_test_packet"))
+		goto close;
+
+	assert_test_result(skel);
 
 close:
 	close_netns(nstoken);
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_meta.c b/tools/testing/selftests/bpf/progs/test_xdp_meta.c
index fe2d71ae0e71..56a83307b483 100644
--- a/tools/testing/selftests/bpf/progs/test_xdp_meta.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_meta.c
@@ -4,49 +4,57 @@
 
 #include <bpf/bpf_helpers.h>
 
-#define __round_mask(x, y) ((__typeof__(x))((y) - 1))
-#define round_up(x, y) ((((x) - 1) | __round_mask(x, y)) + 1)
-#define ctx_ptr(ctx, mem) (void *)(unsigned long)ctx->mem
+#define META_SIZE 32
+
+/* Demonstrates how metadata can be passed from an XDP program to a TC program
+ * using bpf_xdp_adjust_meta.
+ * For the sake of testing the metadata support in drivers, the XDP program uses
+ * a fixed-size payload after the Ethernet header as metadata. The TC program
+ * copies the metadata it receives into a map so it can be checked from
+ * userspace.
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, __u32);
+	__uint(value_size, META_SIZE);
+} test_result SEC(".maps");
 
 SEC("tc")
 int ing_cls(struct __sk_buff *ctx)
 {
-	__u8 *data, *data_meta, *data_end;
-	__u32 diff = 0;
+	void *data_meta = (void *)(unsigned long)ctx->data_meta;
+	void *data = (void *)(unsigned long)ctx->data;
 
-	data_meta = ctx_ptr(ctx, data_meta);
-	data_end  = ctx_ptr(ctx, data_end);
-	data      = ctx_ptr(ctx, data);
-
-	if (data + ETH_ALEN > data_end ||
-	    data_meta + round_up(ETH_ALEN, 4) > data)
+	if (data_meta + META_SIZE > data)
 		return TC_ACT_SHOT;
 
-	diff |= ((__u32 *)data_meta)[0] ^ ((__u32 *)data)[0];
-	diff |= ((__u16 *)data_meta)[2] ^ ((__u16 *)data)[2];
+	int key = 0;
+
+	bpf_map_update_elem(&test_result, &key, data_meta, BPF_ANY);
 
-	return diff ? TC_ACT_SHOT : TC_ACT_OK;
+	return TC_ACT_SHOT;
 }
 
 SEC("xdp")
 int ing_xdp(struct xdp_md *ctx)
 {
-	__u8 *data, *data_meta, *data_end;
-	int ret;
-
-	ret = bpf_xdp_adjust_meta(ctx, -round_up(ETH_ALEN, 4));
+	int ret = bpf_xdp_adjust_meta(ctx, -META_SIZE);
 	if (ret < 0)
 		return XDP_DROP;
 
-	data_meta = ctx_ptr(ctx, data_meta);
-	data_end  = ctx_ptr(ctx, data_end);
-	data      = ctx_ptr(ctx, data);
+	void *data_meta = (void *)(unsigned long)ctx->data_meta;
+	void *data = (void *)(unsigned long)ctx->data;
+	void *data_end = (void *)(unsigned long)ctx->data_end;
 
-	if (data + ETH_ALEN > data_end ||
-	    data_meta + round_up(ETH_ALEN, 4) > data)
+	void *payload = data + sizeof(struct ethhdr);
+
+	if (data_meta + META_SIZE > data || payload + META_SIZE > data_end)
 		return XDP_DROP;
 
-	__builtin_memcpy(data_meta, data, ETH_ALEN);
+	__builtin_memcpy(data_meta, payload, META_SIZE);
+
 	return XDP_PASS;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH bpf-next v2 5/6] selftests/bpf: add test for XDP metadata support in tun driver
  2025-02-17 17:23 [PATCH bpf-next v2 0/6] XDP metadata support for tun driver Marcus Wichelmann
                   ` (3 preceding siblings ...)
  2025-02-17 17:23 ` [PATCH bpf-next v2 4/6] selftests/bpf: refactor xdp_context_functional test and bpf program Marcus Wichelmann
@ 2025-02-17 17:23 ` Marcus Wichelmann
  2025-02-17 17:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: fix file descriptor assertion in open_tuntap helper Marcus Wichelmann
  5 siblings, 0 replies; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-17 17:23 UTC (permalink / raw)
  To: netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

Add a selftest that creates a tap device, attaches XDP and TC programs,
writes a packet with a test payload into the tap device and checks the
test result. This test ensures that the XDP metadata support in the tun
driver is enabled and that the metadata size is correctly passed to the
skb.

See the previous commit ("selftests/bpf: refactor xdp_context_functional
test and bpf program") for details about the test design.

Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
---
 .../bpf/prog_tests/xdp_context_test_run.c     | 59 +++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
index 0cd2a0d8ae60..5eba4a823184 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
@@ -8,6 +8,7 @@
 #define TX_NAME "veth1"
 #define TX_NETNS "xdp_context_tx"
 #define RX_NETNS "xdp_context_rx"
+#define TAP_NAME "tap0"
 
 #define TEST_PAYLOAD_LEN 32
 static const __u8 test_payload[TEST_PAYLOAD_LEN] = {
@@ -245,3 +246,61 @@ void test_xdp_context_veth(void)
 	netns_free(tx_ns);
 }
 
+void test_xdp_context_tuntap(void)
+{
+	LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS);
+	LIBBPF_OPTS(bpf_tc_opts, tc_opts, .handle = 1, .priority = 1);
+	struct test_xdp_meta *skel = NULL;
+	__u8 packet[sizeof(struct ethhdr) + TEST_PAYLOAD_LEN];
+	int tap_fd = -1;
+	int tap_ifindex;
+	int ret;
+
+	tap_fd = open_tuntap(TAP_NAME, true);
+	if (!ASSERT_GE(tap_fd, 0, "open_tuntap"))
+		goto close;
+
+	SYS(close, "ip link set dev " TAP_NAME " up");
+
+	skel = test_xdp_meta__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open and load skeleton"))
+		goto close;
+
+	tap_ifindex = if_nametoindex(TAP_NAME);
+	if (!ASSERT_GE(tap_ifindex, 0, "if_nametoindex"))
+		goto close;
+
+	tc_hook.ifindex = tap_ifindex;
+	ret = bpf_tc_hook_create(&tc_hook);
+	if (!ASSERT_OK(ret, "bpf_tc_hook_create"))
+		goto close;
+
+	tc_opts.prog_fd = bpf_program__fd(skel->progs.ing_cls);
+	ret = bpf_tc_attach(&tc_hook, &tc_opts);
+	if (!ASSERT_OK(ret, "bpf_tc_attach"))
+		goto close;
+
+	ret = bpf_xdp_attach(tap_ifindex, bpf_program__fd(skel->progs.ing_xdp),
+			     0, NULL);
+	if (!ASSERT_GE(ret, 0, "bpf_xdp_attach"))
+		goto close;
+
+	/* The ethernet header is not relevant for this test and doesn't need to
+	 * be meaningful.
+	 */
+	struct ethhdr eth = { 0 };
+
+	memcpy(packet, &eth, sizeof(eth));
+	memcpy(packet + sizeof(eth), test_payload, TEST_PAYLOAD_LEN);
+
+	ret = write(tap_fd, packet, sizeof(packet));
+	if (!ASSERT_EQ(ret, sizeof(packet), "write packet"))
+		goto close;
+
+	assert_test_result(skel);
+
+close:
+	if (tap_fd >= 0)
+		close(tap_fd);
+	test_xdp_meta__destroy(skel);
+}
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH bpf-next v2 6/6] selftests/bpf: fix file descriptor assertion in open_tuntap helper
  2025-02-17 17:23 [PATCH bpf-next v2 0/6] XDP metadata support for tun driver Marcus Wichelmann
                   ` (4 preceding siblings ...)
  2025-02-17 17:23 ` [PATCH bpf-next v2 5/6] selftests/bpf: add test for XDP metadata support in tun driver Marcus Wichelmann
@ 2025-02-17 17:23 ` Marcus Wichelmann
  2025-02-18  1:56   ` Willem de Bruijn
  5 siblings, 1 reply; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-17 17:23 UTC (permalink / raw)
  To: netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

The open_tuntap helper function uses open() to get a file descriptor for
/dev/net/tun.

The open(2) manpage writes this about its return value:

  On success, open(), openat(), and creat() return the new file
  descriptor (a nonnegative integer).  On error, -1 is returned and
  errno is set to indicate the error.

This means that the fd > 0 assertion in the open_tuntap helper is
incorrect and should rather check for fd >= 0.

When running the BPF selftests locally, this incorrect assertion was not
an issue, but the BPF kernel-patches CI failed because of this:

  open_tuntap:FAIL:open(/dev/net/tun) unexpected open(/dev/net/tun):
  actual 0 <= expected 0

Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
---
 tools/testing/selftests/bpf/network_helpers.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
index e1cfa1b37754..9b59bfd5d912 100644
--- a/tools/testing/selftests/bpf/network_helpers.c
+++ b/tools/testing/selftests/bpf/network_helpers.c
@@ -571,7 +571,7 @@ int open_tuntap(const char *dev_name, bool need_mac)
 	struct ifreq ifr;
 	int fd = open("/dev/net/tun", O_RDWR);
 
-	if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)"))
+	if (!ASSERT_GE(fd, 0, "open(/dev/net/tun)"))
 		return -1;
 
 	ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb
  2025-02-17 17:23 ` [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb Marcus Wichelmann
@ 2025-02-18  1:45   ` Willem de Bruijn
  2025-02-18  1:47   ` Willem de Bruijn
  1 sibling, 0 replies; 16+ messages in thread
From: Willem de Bruijn @ 2025-02-18  1:45 UTC (permalink / raw)
  To: Marcus Wichelmann, netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

Marcus Wichelmann wrote:
> When the XDP metadata area was used, it is expected that the same
> metadata can also be accessed from TC, as can be read in the description
> of the bpf_xdp_adjust_meta helper function. In the tun driver, this was
> not yet implemented.
> 
> To make this work, the skb that is being built on XDP_PASS should know
> of the current size of the metadata area. This is ensured by adding
> calls to skb_metadata_set. For the tun_xdp_one code path, an additional
> check is necessary to handle the case where the externally initialized
> xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).
> 
> More information about this feature can be found in the commit message
> of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").
> 
> Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
> ---
>  drivers/net/tun.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index c95ab9c46bd2..3dde6cd29a84 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1604,7 +1604,8 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
>  
>  static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
>  				       struct page_frag *alloc_frag, char *buf,
> -				       int buflen, int len, int pad)
> +				       int buflen, int len, int pad,
> +				       int metasize)
>  {
>  	struct sk_buff *skb = build_skb(buf, buflen);
>  
> @@ -1613,6 +1614,8 @@ static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
>  
>  	skb_reserve(skb, pad);
>  	skb_put(skb, len);
> +	if (metasize)
> +		skb_metadata_set(skb, metasize);
>  	skb_set_owner_w(skb, tfile->socket.sk);
>  
>  	get_page(alloc_frag->page);
> @@ -1672,6 +1675,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  	char *buf;
>  	size_t copied;
>  	int pad = TUN_RX_PAD;
> +	int metasize = 0;
>  	int err = 0;
>  
>  	rcu_read_lock();
> @@ -1699,7 +1703,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  	if (hdr->gso_type || !xdp_prog) {
>  		*skb_xdp = 1;
>  		return __tun_build_skb(tfile, alloc_frag, buf, buflen, len,
> -				       pad);
> +				       pad, metasize);
>  	}
>  
>  	*skb_xdp = 0;
> @@ -1734,12 +1738,18 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  
>  		pad = xdp.data - xdp.data_hard_start;
>  		len = xdp.data_end - xdp.data;
> +
> +		/* It is known that the xdp_buff was prepared with metadata
> +		 * support, so no additional check is necessary.
> +		 */
> +		metasize = xdp.data - xdp.data_meta;
>  	}
>  	bpf_net_ctx_clear(bpf_net_ctx);
>  	rcu_read_unlock();
>  	local_bh_enable();
>  
> -	return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad);
> +	return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad,
> +			       metasize);
>  
>  out:
>  	bpf_net_ctx_clear(bpf_net_ctx);
> @@ -2456,6 +2466,7 @@ static int tun_xdp_one(struct tun_struct *tun,
>  	struct sk_buff_head *queue;
>  	u32 rxhash = 0, act;
>  	int buflen = hdr->buflen;
> +	int metasize = 0;
>  	int ret = 0;
>  	bool skb_xdp = false;
>  	struct page *page;
> @@ -2510,6 +2521,10 @@ static int tun_xdp_one(struct tun_struct *tun,
>  	skb_reserve(skb, xdp->data - xdp->data_hard_start);
>  	skb_put(skb, xdp->data_end - xdp->data);
>  
> +	metasize = max(xdp->data - xdp->data_meta, 0);

Is it ever possible for xdp->data_meta to be greater than xdp->data?

This is pointer arithmetic, which is a bit complex wrt type. This is
likely ptrdiff_t, which is signed. But may want to use max_t(int, to
make this more explicit.

> +	if (metasize)
> +		skb_metadata_set(skb, metasize);
> +

Technically not needed as skb_metadata_clear is just
skb_metadata_set(skb, 0). But fine to test and elide.

>  	if (virtio_net_hdr_to_skb(skb, gso, tun_is_lttle_endian(tun))) {
>  		atomic_long_inc(&tun->rx_frame_errors);
>  		kfree_skb(skb);
> -- 
> 2.43.0
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb
  2025-02-17 17:23 ` [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb Marcus Wichelmann
  2025-02-18  1:45   ` Willem de Bruijn
@ 2025-02-18  1:47   ` Willem de Bruijn
  2025-02-19 14:47     ` Marcus Wichelmann
  1 sibling, 1 reply; 16+ messages in thread
From: Willem de Bruijn @ 2025-02-18  1:47 UTC (permalink / raw)
  To: Marcus Wichelmann, netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

Marcus Wichelmann wrote:
> When the XDP metadata area was used, it is expected that the same
> metadata can also be accessed from TC, as can be read in the description
> of the bpf_xdp_adjust_meta helper function. In the tun driver, this was
> not yet implemented.
> 
> To make this work, the skb that is being built on XDP_PASS should know
> of the current size of the metadata area. This is ensured by adding
> calls to skb_metadata_set. For the tun_xdp_one code path, an additional
> check is necessary to handle the case where the externally initialized
> xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).
> 
> More information about this feature can be found in the commit message
> of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").
> 
> Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
> ---
>  drivers/net/tun.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index c95ab9c46bd2..3dde6cd29a84 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1604,7 +1604,8 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
>  
>  static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
>  				       struct page_frag *alloc_frag, char *buf,
> -				       int buflen, int len, int pad)
> +				       int buflen, int len, int pad,
> +				       int metasize)
>  {
>  	struct sk_buff *skb = build_skb(buf, buflen);
>  
> @@ -1613,6 +1614,8 @@ static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
>  
>  	skb_reserve(skb, pad);
>  	skb_put(skb, len);
> +	if (metasize)
> +		skb_metadata_set(skb, metasize);
>  	skb_set_owner_w(skb, tfile->socket.sk);
>  
>  	get_page(alloc_frag->page);
> @@ -1672,6 +1675,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  	char *buf;
>  	size_t copied;
>  	int pad = TUN_RX_PAD;
> +	int metasize = 0;
>  	int err = 0;
>  
>  	rcu_read_lock();
> @@ -1699,7 +1703,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  	if (hdr->gso_type || !xdp_prog) {
>  		*skb_xdp = 1;
>  		return __tun_build_skb(tfile, alloc_frag, buf, buflen, len,
> -				       pad);
> +				       pad, metasize);
>  	}
>  
>  	*skb_xdp = 0;
> @@ -1734,12 +1738,18 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  
>  		pad = xdp.data - xdp.data_hard_start;
>  		len = xdp.data_end - xdp.data;
> +
> +		/* It is known that the xdp_buff was prepared with metadata
> +		 * support, so no additional check is necessary.
> +		 */
> +		metasize = xdp.data - xdp.data_meta;
>  	}
>  	bpf_net_ctx_clear(bpf_net_ctx);
>  	rcu_read_unlock();
>  	local_bh_enable();
>  
> -	return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad);
> +	return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad,
> +			       metasize);
>  
>  out:
>  	bpf_net_ctx_clear(bpf_net_ctx);
> @@ -2456,6 +2466,7 @@ static int tun_xdp_one(struct tun_struct *tun,
>  	struct sk_buff_head *queue;
>  	u32 rxhash = 0, act;
>  	int buflen = hdr->buflen;
> +	int metasize = 0;
>  	int ret = 0;
>  	bool skb_xdp = false;
>  	struct page *page;
> @@ -2510,6 +2521,10 @@ static int tun_xdp_one(struct tun_struct *tun,
>  	skb_reserve(skb, xdp->data - xdp->data_hard_start);
>  	skb_put(skb, xdp->data_end - xdp->data);
>  
> +	metasize = max(xdp->data - xdp->data_meta, 0);

Can xdp->data_meta ever be greater than xdp->data?

This is pointer comparison, which is tricky wrt type. It likely is
ptrdiff_t and thus signed. But may want to use max_t(long int, ..) to
make this explicit.

> +	if (metasize)
> +		skb_metadata_set(skb, metasize);
> +

Not strictly needed. As skb_metadata_clear is just
skb_metadata_set(skb, 0). But also not wrong, so fine to keep.

>  	if (virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) {
>  		atomic_long_inc(&tun->rx_frame_errors);
>  		kfree_skb(skb);
> -- 
> 2.43.0
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v2 3/6] selftests/bpf: move open_tuntap to network helpers
  2025-02-17 17:23 ` [PATCH bpf-next v2 3/6] selftests/bpf: move open_tuntap to network helpers Marcus Wichelmann
@ 2025-02-18  1:50   ` Willem de Bruijn
  0 siblings, 0 replies; 16+ messages in thread
From: Willem de Bruijn @ 2025-02-18  1:50 UTC (permalink / raw)
  To: Marcus Wichelmann, netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

Marcus Wichelmann wrote:
> To test the XDP metadata functionality of the tun driver, it's necessary
> to create a new tap device first. A helper function for this already
> exists in lwt_helpers.h. Move it to the common network helpers header,
> so it can be reused in other tests.
> 
> Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>

Reviewed-by: Willem de Bruijn <willemb@google.com>

Both tests that include lwt_helpers.h also include network_helpers.h,
so the linking with network_helpers.o is already addressed. This was
not entirely obvious from just reading the code.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v2 6/6] selftests/bpf: fix file descriptor assertion in open_tuntap helper
  2025-02-17 17:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: fix file descriptor assertion in open_tuntap helper Marcus Wichelmann
@ 2025-02-18  1:56   ` Willem de Bruijn
  2025-02-19 14:58     ` Marcus Wichelmann
  0 siblings, 1 reply; 16+ messages in thread
From: Willem de Bruijn @ 2025-02-18  1:56 UTC (permalink / raw)
  To: Marcus Wichelmann, netdev, linux-kernel, bpf, linux-kselftest
  Cc: willemdebruijn.kernel, jasowang, andrew+netdev, davem, edumazet,
	kuba, pabeni, ast, daniel, andrii, martin.lau, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	mykolal, shuah, hawk, marcus.wichelmann

Marcus Wichelmann wrote:
> The open_tuntap helper function uses open() to get a file descriptor for
> /dev/net/tun.
> 
> The open(2) manpage writes this about its return value:
> 
>   On success, open(), openat(), and creat() return the new file
>   descriptor (a nonnegative integer).  On error, -1 is returned and
>   errno is set to indicate the error.
> 
> This means that the fd > 0 assertion in the open_tuntap helper is
> incorrect and should rather check for fd >= 0.
> 
> When running the BPF selftests locally, this incorrect assertion was not
> an issue, but the BPF kernel-patches CI failed because of this:
> 
>   open_tuntap:FAIL:open(/dev/net/tun) unexpected open(/dev/net/tun):
>   actual 0 <= expected 0

Wow. What kind of environment is this that 0 is not assigned stdin.
 
> Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>

The code makes sense.

I suppose that if this condition can hit, then it can also affect
existing lwt_* tests and thus should be a fix to commit 43a7c3ef8a15
("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT"), sent
separately to bpf (not bpf-next)?

Since it's a test and no failure was reported so far, maybe fine
to just merge as part of this bpf-next series, not my call.

> ---
>  tools/testing/selftests/bpf/network_helpers.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
> index e1cfa1b37754..9b59bfd5d912 100644
> --- a/tools/testing/selftests/bpf/network_helpers.c
> +++ b/tools/testing/selftests/bpf/network_helpers.c
> @@ -571,7 +571,7 @@ int open_tuntap(const char *dev_name, bool need_mac)
>  	struct ifreq ifr;
>  	int fd = open("/dev/net/tun", O_RDWR);
>  
> -	if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)"))
> +	if (!ASSERT_GE(fd, 0, "open(/dev/net/tun)"))
>  		return -1;
>  
>  	ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN);
> -- 
> 2.43.0
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v2 1/6] net: tun: enable XDP metadata support
  2025-02-17 17:23 ` [PATCH bpf-next v2 1/6] net: tun: enable XDP metadata support Marcus Wichelmann
@ 2025-02-19  3:17   ` Jason Wang
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Wang @ 2025-02-19  3:17 UTC (permalink / raw)
  To: Marcus Wichelmann
  Cc: netdev, linux-kernel, bpf, linux-kselftest, willemdebruijn.kernel,
	andrew+netdev, davem, edumazet, kuba, pabeni, ast, daniel, andrii,
	martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, mykolal, shuah, hawk

On Tue, Feb 18, 2025 at 1:23 AM Marcus Wichelmann
<marcus.wichelmann@hetzner-cloud.de> wrote:
>
> Enable the support for the bpf_xdp_adjust_meta helper function for XDP
> buffers initialized by the tun driver. This allows to reserve a metadata
> area that is useful to pass any information from one XDP program to
> another one, for example when using tail-calls.
>
> Whether this helper function can be used in an XDP program depends on
> how the xdp_buff was initialized. Most net drivers initialize the
> xdp_buff in a way, that allows bpf_xdp_adjust_meta to be used. In case
> of the tun driver, this is currently not the case.
>
> There are two code paths in the tun driver that lead to a
> bpf_prog_run_xdp and where metadata support should be enabled:
>
> 1. tun_build_skb, which is called by tun_get_user and is used when
>    writing packets from userspace into the device. In this case, the
>    xdp_buff created in tun_build_skb has no support for
>    bpf_xdp_adjust_meta and calls of that helper function result in
>    ENOTSUPP.
>
>    For this code path, it's sufficient to set the meta_valid argument of
>    the xdp_prepare_buff call. The reserved headroom is large enough
>    already.
>
> 2. tun_xdp_one, which is called by tun_sendmsg which again is called by
>    other drivers (e.g. vhost_net). When the TUN_MSG_PTR mode is used,
>    another driver may pass a batch of xdp_buffs to the tun driver. In
>    this case, that other driver is the one initializing the xdp_buff.
>
>    See commit 043d222f93ab ("tuntap: accept an array of XDP buffs
>    through sendmsg()") for details.
>
>    For now, the vhost_net driver is the only one using TUN_MSG_PTR and
>    it already initializes the xdp_buffs with metadata support and
>    sufficient headroom. But the tun driver disables it again, so the
>    xdp_set_data_meta_invalid call has to be removed.
>
> Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
> ---

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb
  2025-02-18  1:47   ` Willem de Bruijn
@ 2025-02-19 14:47     ` Marcus Wichelmann
  2025-02-19 15:06       ` Willem de Bruijn
  0 siblings, 1 reply; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-19 14:47 UTC (permalink / raw)
  To: Willem de Bruijn, netdev, linux-kernel, bpf, linux-kselftest
  Cc: jasowang, andrew+netdev, davem, edumazet, kuba, pabeni, ast,
	daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, mykolal, shuah, hawk

Am 18.02.25 um 02:47 schrieb Willem de Bruijn:
> Marcus Wichelmann wrote:
>> [...]
>> +	metasize = max(xdp->data - xdp->data_meta, 0);
> 
> Can xdp->data_meta ever be greater than xdp->data?

When an xdp_buff has no metadata support, then this is marked by setting
xdp->data_meta to xdp->data + 1. See xdp_prepare_buff or
xdp_set_data_meta_invalid.

In the case of tun_xdp_one, the xdp_buff is externally created by another
driver and passed to the tun driver using sendmsg and TUN_MSG_PTR. For
now, the vhost_net driver is the only driver doing that, and
xdp->data_meta is set to xdp->data there, marking support for metadata.

So knowing that vhost_net is currently the only driver passing xdp_buffs
to tun_sendmsg, the check is not strictly necessary. But other drivers
may use this API as well in the future. That's why I'd like to not make
the assumption that other drivers always create the xdp_buffs with
metadata support, when they pass them to tun_sendmsg.

Or am I just to careful about this? What do you think?

> This is pointer comparison, which is tricky wrt type. It likely is
> ptrdiff_t and thus signed. But may want to use max_t(long int, ..) to
> make this explicit.

Ah, I see, good point.

So like that?

	metasize = max_t(long int, xdp->data - xdp->data_meta, 0);
	if (metasize)
		skb_metadata_set(skb, metasize);

Alternatively, there is also xdp_data_meta_unsupported(xdp_buff) which
could be used to make this check very explicit, but I don't see it being
used in network drivers elsewhere. Not sure why.

>> +	if (metasize)
>> +		skb_metadata_set(skb, metasize);
>> +
> 
> Not strictly needed. As skb_metadata_clear is just
> skb_metadata_set(skb, 0). But also not wrong, so fine to keep.

Oh, haven't seen that.
I'm following a common pattern here that I've seen in many other network
drivers (grep for "skb_metadata_set"):

	unsigned int metasize = xdp->data - xdp->data_meta;
	[...]
	if (metasize)
		skb_metadata_set(skb, metasize);

Marcus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v2 6/6] selftests/bpf: fix file descriptor assertion in open_tuntap helper
  2025-02-18  1:56   ` Willem de Bruijn
@ 2025-02-19 14:58     ` Marcus Wichelmann
  0 siblings, 0 replies; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-19 14:58 UTC (permalink / raw)
  To: Willem de Bruijn, netdev, linux-kernel, bpf, linux-kselftest
  Cc: jasowang, andrew+netdev, davem, edumazet, kuba, pabeni, ast,
	daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, mykolal, shuah, hawk

Am 18.02.25 um 02:56 schrieb Willem de Bruijn:
> Marcus Wichelmann wrote:
>> [...]
>> When running the BPF selftests locally, this incorrect assertion was not
>> an issue, but the BPF kernel-patches CI failed because of this:
>>
>>    open_tuntap:FAIL:open(/dev/net/tun) unexpected open(/dev/net/tun):
>>    actual 0 <= expected 0
> 
> Wow. What kind of environment is this that 0 is not assigned stdin.
>   
>> Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
> 
> The code makes sense.
> 
> I suppose that if this condition can hit, then it can also affect
> existing lwt_* tests and thus should be a fix to commit 43a7c3ef8a15
> ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT"), sent
> separately to bpf (not bpf-next)?
> 
> Since it's a test and no failure was reported so far, maybe fine
> to just merge as part of this bpf-next series, not my call.
I'm not sure why this only became an issue after I added the
xdp_context_tuntap test and never before. This may have to do with
the order of test execution.

If nobody speaks up, I'll leave it in this patch series for now.

Marcus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb
  2025-02-19 14:47     ` Marcus Wichelmann
@ 2025-02-19 15:06       ` Willem de Bruijn
  2025-02-19 15:15         ` Marcus Wichelmann
  0 siblings, 1 reply; 16+ messages in thread
From: Willem de Bruijn @ 2025-02-19 15:06 UTC (permalink / raw)
  To: Marcus Wichelmann, Willem de Bruijn, netdev, linux-kernel, bpf,
	linux-kselftest
  Cc: jasowang, andrew+netdev, davem, edumazet, kuba, pabeni, ast,
	daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, mykolal, shuah, hawk

Marcus Wichelmann wrote:
> Am 18.02.25 um 02:47 schrieb Willem de Bruijn:
> > Marcus Wichelmann wrote:
> >> [...]
> >> +	metasize = max(xdp->data - xdp->data_meta, 0);
> > 
> > Can xdp->data_meta ever be greater than xdp->data?
> 
> When an xdp_buff has no metadata support, then this is marked by setting
> xdp->data_meta to xdp->data + 1. See xdp_prepare_buff or
> xdp_set_data_meta_invalid.
> 
> In the case of tun_xdp_one, the xdp_buff is externally created by another
> driver and passed to the tun driver using sendmsg and TUN_MSG_PTR. For
> now, the vhost_net driver is the only driver doing that, and
> xdp->data_meta is set to xdp->data there, marking support for metadata.
> 
> So knowing that vhost_net is currently the only driver passing xdp_buffs
> to tun_sendmsg, the check is not strictly necessary. But other drivers
> may use this API as well in the future. That's why I'd like to not make
> the assumption that other drivers always create the xdp_buffs with
> metadata support, when they pass them to tun_sendmsg.
> 
> Or am I just to careful about this? What do you think?

I agree.
 
> > This is pointer comparison, which is tricky wrt type. It likely is
> > ptrdiff_t and thus signed. But may want to use max_t(long int, ..) to
> > make this explicit.
> 
> Ah, I see, good point.
> 
> So like that?
> 
> 	metasize = max_t(long int, xdp->data - xdp->data_meta, 0);
> 	if (metasize)
> 		skb_metadata_set(skb, metasize);

Or just this? Also ensures the test uses signed int.

    int metasize;

    ...


    metasize = xdp->data - xdp->data_meta;
    if (metasize > 0)
            skb_metadata_set(skb, metasize);


> Alternatively, there is also xdp_data_meta_unsupported(xdp_buff) which
> could be used to make this check very explicit, but I don't see it being
> used in network drivers elsewhere. Not sure why.
> 
> >> +	if (metasize)
> >> +		skb_metadata_set(skb, metasize);
> >> +
> > 
> > Not strictly needed. As skb_metadata_clear is just
> > skb_metadata_set(skb, 0). But also not wrong, so fine to keep.
> 
> Oh, haven't seen that.
> I'm following a common pattern here that I've seen in many other network
> drivers (grep for "skb_metadata_set"):
> 
> 	unsigned int metasize = xdp->data - xdp->data_meta;
> 	[...]
> 	if (metasize)
> 		skb_metadata_set(skb, metasize);

Thanks for that context. Sounds good.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb
  2025-02-19 15:06       ` Willem de Bruijn
@ 2025-02-19 15:15         ` Marcus Wichelmann
  0 siblings, 0 replies; 16+ messages in thread
From: Marcus Wichelmann @ 2025-02-19 15:15 UTC (permalink / raw)
  To: Willem de Bruijn, netdev, linux-kernel, bpf, linux-kselftest
  Cc: jasowang, andrew+netdev, davem, edumazet, kuba, pabeni, ast,
	daniel, andrii, martin.lau, eddyz87, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, mykolal, shuah, hawk

Am 19.02.25 um 16:06 schrieb Willem de Bruijn:
> Marcus Wichelmann wrote:
>> Am 18.02.25 um 02:47 schrieb Willem de Bruijn:
>>> [...]
>>> This is pointer comparison, which is tricky wrt type. It likely is
>>> ptrdiff_t and thus signed. But may want to use max_t(long int, ..) to
>>> make this explicit.
>>
>> Ah, I see, good point.
>>
>> So like that?
>>
>> 	metasize = max_t(long int, xdp->data - xdp->data_meta, 0);
>> 	if (metasize)
>> 		skb_metadata_set(skb, metasize);
> 
> Or just this? Also ensures the test uses signed int.
> 
>      int metasize;
> 
>      ...
> 
> 
>      metasize = xdp->data - xdp->data_meta;
>      if (metasize > 0)
>              skb_metadata_set(skb, metasize);
> 

Well, yeah, just keep it simple I guess. ;) Will do that.

I'll send a V3 patch series with the change.

Thanks!

Marcus

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-02-19 15:15 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-17 17:23 [PATCH bpf-next v2 0/6] XDP metadata support for tun driver Marcus Wichelmann
2025-02-17 17:23 ` [PATCH bpf-next v2 1/6] net: tun: enable XDP metadata support Marcus Wichelmann
2025-02-19  3:17   ` Jason Wang
2025-02-17 17:23 ` [PATCH bpf-next v2 2/6] net: tun: enable transfer of XDP metadata to skb Marcus Wichelmann
2025-02-18  1:45   ` Willem de Bruijn
2025-02-18  1:47   ` Willem de Bruijn
2025-02-19 14:47     ` Marcus Wichelmann
2025-02-19 15:06       ` Willem de Bruijn
2025-02-19 15:15         ` Marcus Wichelmann
2025-02-17 17:23 ` [PATCH bpf-next v2 3/6] selftests/bpf: move open_tuntap to network helpers Marcus Wichelmann
2025-02-18  1:50   ` Willem de Bruijn
2025-02-17 17:23 ` [PATCH bpf-next v2 4/6] selftests/bpf: refactor xdp_context_functional test and bpf program Marcus Wichelmann
2025-02-17 17:23 ` [PATCH bpf-next v2 5/6] selftests/bpf: add test for XDP metadata support in tun driver Marcus Wichelmann
2025-02-17 17:23 ` [PATCH bpf-next v2 6/6] selftests/bpf: fix file descriptor assertion in open_tuntap helper Marcus Wichelmann
2025-02-18  1:56   ` Willem de Bruijn
2025-02-19 14:58     ` Marcus Wichelmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).