netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
@ 2024-02-15 13:26 Toke Høiland-Jørgensen
  2024-02-15 13:26 ` [PATCH net-next 1/3] net: Register system page pool as an XDP memory model Toke Høiland-Jørgensen
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-15 13:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Toke Høiland-Jørgensen, netdev, bpf

Now that we have a system-wide page pool, we can use that for the live
frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
avoid the cost of creating a separate page pool instance for each
syscall invocation. See the individual patches for more details.
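
For reference, the live frame mode is driven from userspace through the
regular bpf_prog_test_run_opts() interface; a rough sketch of a
low-repetition invocation (illustrative only, error handling and packet
construction omitted) looks something like this:

  #include <linux/bpf.h>
  #include <bpf/bpf.h>

  /* Fire one small burst of live XDP frames through prog_fd. Before this
   * series, every such call created and tore down its own page pool; with
   * it, the frames are drawn from the per-CPU system page pool instead.
   */
  static int run_live_burst(int prog_fd, void *pkt, __u32 pkt_len)
  {
          LIBBPF_OPTS(bpf_test_run_opts, opts,
                  .data_in = pkt,
                  .data_size_in = pkt_len,
                  .repeat = 64,                        /* small repetition count */
                  .batch_size = 64,
                  .flags = BPF_F_TEST_XDP_LIVE_FRAMES, /* live frame mode */
          );

          return bpf_prog_test_run_opts(prog_fd, &opts);
  }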

Toke Høiland-Jørgensen (3):
  net: Register system page pool as an XDP memory model
  bpf: test_run: Use system page pool for XDP live frame mode
  bpf: test_run: Fix cacheline alignment of live XDP frame data
    structures

 include/linux/netdevice.h |   1 +
 net/bpf/test_run.c        | 138 +++++++++++++++++++-------------------
 net/core/dev.c            |  13 +++-
 3 files changed, 81 insertions(+), 71 deletions(-)

-- 
2.43.0



* [PATCH net-next 1/3] net: Register system page pool as an XDP memory model
  2024-02-15 13:26 [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames Toke Høiland-Jørgensen
@ 2024-02-15 13:26 ` Toke Høiland-Jørgensen
  2024-02-15 13:26 ` [PATCH net-next 2/3] bpf: test_run: Use system page pool for XDP live frame mode Toke Høiland-Jørgensen
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-15 13:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend
  Cc: Alexander Lobakin, Toke Høiland-Jørgensen, netdev, bpf

To make the system page pool usable as a source for allocating XDP
frames, we need to register it with xdp_reg_mem_model(), so that page
return works correctly. This is done in preparation for using the system
page pool for the XDP live frame mode in BPF_TEST_RUN; for the same
reason, make the per-cpu variable non-static so we can access it from
the test_run code as well.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/linux/netdevice.h |  1 +
 net/core/dev.c            | 13 ++++++++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c541550b0e6e..e1dfdf0c4075 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3345,6 +3345,7 @@ static inline void input_queue_tail_incr_save(struct softnet_data *sd,
 }
 
 DECLARE_PER_CPU_ALIGNED(struct softnet_data, softnet_data);
+DECLARE_PER_CPU_ALIGNED(struct page_pool *, system_page_pool);
 
 static inline int dev_recursion_level(void)
 {
diff --git a/net/core/dev.c b/net/core/dev.c
index d8dd293a7a27..cdb916a647e7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -428,7 +428,7 @@ EXPORT_PER_CPU_SYMBOL(softnet_data);
  * PP consumers must pay attention to run APIs in the appropriate context
  * (e.g. NAPI context).
  */
-static DEFINE_PER_CPU_ALIGNED(struct page_pool *, system_page_pool);
+DEFINE_PER_CPU_ALIGNED(struct page_pool *, system_page_pool);
 
 #ifdef CONFIG_LOCKDEP
 /*
@@ -11739,12 +11739,20 @@ static int net_page_pool_create(int cpuid)
 		.pool_size = SYSTEM_PERCPU_PAGE_POOL_SIZE,
 		.nid = NUMA_NO_NODE,
 	};
+	struct xdp_mem_info info;
 	struct page_pool *pp_ptr;
+	int err;
 
 	pp_ptr = page_pool_create_percpu(&page_pool_params, cpuid);
 	if (IS_ERR(pp_ptr))
 		return -ENOMEM;
 
+	err = xdp_reg_mem_model(&info, MEM_TYPE_PAGE_POOL, pp_ptr);
+	if (err) {
+		page_pool_destroy(pp_ptr);
+		return err;
+	}
+
 	per_cpu(system_page_pool, cpuid) = pp_ptr;
 #endif
 	return 0;
@@ -11834,12 +11842,15 @@ static int __init net_dev_init(void)
 out:
 	if (rc < 0) {
 		for_each_possible_cpu(i) {
+			struct xdp_mem_info mem = { .type = MEM_TYPE_PAGE_POOL };
 			struct page_pool *pp_ptr;
 
 			pp_ptr = per_cpu(system_page_pool, i);
 			if (!pp_ptr)
 				continue;
 
+			mem.id = pp_ptr->xdp_mem_id;
+			xdp_unreg_mem_model(&mem);
 			page_pool_destroy(pp_ptr);
 			per_cpu(system_page_pool, i) = NULL;
 		}
-- 
2.43.0



* [PATCH net-next 2/3] bpf: test_run: Use system page pool for XDP live frame mode
  2024-02-15 13:26 [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames Toke Høiland-Jørgensen
  2024-02-15 13:26 ` [PATCH net-next 1/3] net: Register system page pool as an XDP memory model Toke Høiland-Jørgensen
@ 2024-02-15 13:26 ` Toke Høiland-Jørgensen
  2024-02-20  9:06   ` Daniel Borkmann
  2024-02-15 13:26 ` [PATCH net-next 3/3] bpf: test_run: Fix cacheline alignment of live XDP frame data structures Toke Høiland-Jørgensen
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-15 13:26 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Jakub Kicinski, Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Toke Høiland-Jørgensen, Eric Dumazet,
	Paolo Abeni, bpf, netdev

The BPF_TEST_RUN code in XDP live frame mode creates a new page pool
each time it is called and uses that to allocate the frames used for the
XDP run. This works well if the syscall is used with a high repetition
count, as it allows for efficient page recycling. However, if used with
a small number of repetitions, the overhead of creating and tearing down
the page pool is significant, and can even lead to system stalls if the
syscall is called in a tight loop.

Now that we have a persistent system page pool instance, it becomes
pretty straightforward to change the test_run code to use it. The only
wrinkle is that we can no longer rely on a custom page init callback
from page_pool itself; instead, we change the test_run code to write a
random cookie value to the beginning of the page as an indicator that
the page has been initialised and can be re-used without copying the
initial data again.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 net/bpf/test_run.c | 134 ++++++++++++++++++++++-----------------------
 1 file changed, 66 insertions(+), 68 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index dfd919374017..c742869a0612 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -94,10 +94,14 @@ static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
 }
 
 /* We put this struct at the head of each page with a context and frame
- * initialised when the page is allocated, so we don't have to do this on each
- * repetition of the test run.
+ * initialised the first time a given page is used, saving the memcpy() of the
+ * data on subsequent repetition of the test run. The cookie value is used to
+ * mark the page data the first time we initialise it so we can skip it the next
+ * time we see that page.
  */
+
 struct xdp_page_head {
+	u64 cookie;
 	struct xdp_buff orig_ctx;
 	struct xdp_buff ctx;
 	union {
@@ -111,10 +115,9 @@ struct xdp_test_data {
 	struct xdp_buff *orig_ctx;
 	struct xdp_rxq_info rxq;
 	struct net_device *dev;
-	struct page_pool *pp;
 	struct xdp_frame **frames;
 	struct sk_buff **skbs;
-	struct xdp_mem_info mem;
+	u64 cookie;
 	u32 batch_size;
 	u32 frame_cnt;
 };
@@ -126,48 +129,9 @@ struct xdp_test_data {
 #define TEST_XDP_FRAME_SIZE (PAGE_SIZE - sizeof(struct xdp_page_head))
 #define TEST_XDP_MAX_BATCH 256
 
-static void xdp_test_run_init_page(struct page *page, void *arg)
-{
-	struct xdp_page_head *head = phys_to_virt(page_to_phys(page));
-	struct xdp_buff *new_ctx, *orig_ctx;
-	u32 headroom = XDP_PACKET_HEADROOM;
-	struct xdp_test_data *xdp = arg;
-	size_t frm_len, meta_len;
-	struct xdp_frame *frm;
-	void *data;
-
-	orig_ctx = xdp->orig_ctx;
-	frm_len = orig_ctx->data_end - orig_ctx->data_meta;
-	meta_len = orig_ctx->data - orig_ctx->data_meta;
-	headroom -= meta_len;
-
-	new_ctx = &head->ctx;
-	frm = head->frame;
-	data = head->data;
-	memcpy(data + headroom, orig_ctx->data_meta, frm_len);
-
-	xdp_init_buff(new_ctx, TEST_XDP_FRAME_SIZE, &xdp->rxq);
-	xdp_prepare_buff(new_ctx, data, headroom, frm_len, true);
-	new_ctx->data = new_ctx->data_meta + meta_len;
-
-	xdp_update_frame_from_buff(new_ctx, frm);
-	frm->mem = new_ctx->rxq->mem;
-
-	memcpy(&head->orig_ctx, new_ctx, sizeof(head->orig_ctx));
-}
-
 static int xdp_test_run_setup(struct xdp_test_data *xdp, struct xdp_buff *orig_ctx)
 {
-	struct page_pool *pp;
 	int err = -ENOMEM;
-	struct page_pool_params pp_params = {
-		.order = 0,
-		.flags = 0,
-		.pool_size = xdp->batch_size,
-		.nid = NUMA_NO_NODE,
-		.init_callback = xdp_test_run_init_page,
-		.init_arg = xdp,
-	};
 
 	xdp->frames = kvmalloc_array(xdp->batch_size, sizeof(void *), GFP_KERNEL);
 	if (!xdp->frames)
@@ -177,34 +141,21 @@ static int xdp_test_run_setup(struct xdp_test_data *xdp, struct xdp_buff *orig_c
 	if (!xdp->skbs)
 		goto err_skbs;
 
-	pp = page_pool_create(&pp_params);
-	if (IS_ERR(pp)) {
-		err = PTR_ERR(pp);
-		goto err_pp;
-	}
-
-	/* will copy 'mem.id' into pp->xdp_mem_id */
-	err = xdp_reg_mem_model(&xdp->mem, MEM_TYPE_PAGE_POOL, pp);
-	if (err)
-		goto err_mmodel;
-
-	xdp->pp = pp;
-
 	/* We create a 'fake' RXQ referencing the original dev, but with an
 	 * xdp_mem_info pointing to our page_pool
 	 */
 	xdp_rxq_info_reg(&xdp->rxq, orig_ctx->rxq->dev, 0, 0);
-	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL;
-	xdp->rxq.mem.id = pp->xdp_mem_id;
+	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL; /* mem id is set per-frame below */
 	xdp->dev = orig_ctx->rxq->dev;
 	xdp->orig_ctx = orig_ctx;
 
+	/* We need a random cookie for each run as pages can stick around
+	 * between runs in the system page pool
+	 */
+	get_random_bytes(&xdp->cookie, sizeof(xdp->cookie));
+
 	return 0;
 
-err_mmodel:
-	page_pool_destroy(pp);
-err_pp:
-	kvfree(xdp->skbs);
 err_skbs:
 	kvfree(xdp->frames);
 	return err;
@@ -212,8 +163,6 @@ static int xdp_test_run_setup(struct xdp_test_data *xdp, struct xdp_buff *orig_c
 
 static void xdp_test_run_teardown(struct xdp_test_data *xdp)
 {
-	xdp_unreg_mem_model(&xdp->mem);
-	page_pool_destroy(xdp->pp);
 	kfree(xdp->frames);
 	kfree(xdp->skbs);
 }
@@ -235,8 +184,12 @@ static bool ctx_was_changed(struct xdp_page_head *head)
 		head->orig_ctx.data_end != head->ctx.data_end;
 }
 
-static void reset_ctx(struct xdp_page_head *head)
+static void reset_ctx(struct xdp_page_head *head, struct xdp_test_data *xdp)
 {
+	/* mem id can change if we migrate CPUs between batches */
+	if (head->frame->mem.id != xdp->rxq.mem.id)
+		head->frame->mem.id = xdp->rxq.mem.id;
+
 	if (likely(!frame_was_changed(head) && !ctx_was_changed(head)))
 		return;
 
@@ -246,6 +199,48 @@ static void reset_ctx(struct xdp_page_head *head)
 	xdp_update_frame_from_buff(&head->ctx, head->frame);
 }
 
+static struct xdp_page_head *
+xdp_test_run_init_page(struct page *page, struct xdp_test_data *xdp)
+{
+	struct xdp_page_head *head = phys_to_virt(page_to_phys(page));
+	struct xdp_buff *new_ctx, *orig_ctx;
+	u32 headroom = XDP_PACKET_HEADROOM;
+	size_t frm_len, meta_len;
+	struct xdp_frame *frm;
+	void *data;
+
+	/* Optimise for the recycle case, which is the normal case when doing
+	 * high-repetition REDIRECTS to drivers that return frames.
+	 */
+	if (likely(head->cookie == xdp->cookie)) {
+		reset_ctx(head, xdp);
+		return head;
+	}
+
+	head->cookie = xdp->cookie;
+
+	orig_ctx = xdp->orig_ctx;
+	frm_len = orig_ctx->data_end - orig_ctx->data_meta;
+	meta_len = orig_ctx->data - orig_ctx->data_meta;
+	headroom -= meta_len;
+
+	new_ctx = &head->ctx;
+	frm = head->frame;
+	data = head->data;
+	memcpy(data + headroom, orig_ctx->data_meta, frm_len);
+
+	xdp_init_buff(new_ctx, TEST_XDP_FRAME_SIZE, &xdp->rxq);
+	xdp_prepare_buff(new_ctx, data, headroom, frm_len, true);
+	new_ctx->data = new_ctx->data_meta + meta_len;
+
+	xdp_update_frame_from_buff(new_ctx, frm);
+	frm->mem = new_ctx->rxq->mem;
+
+	memcpy(&head->orig_ctx, new_ctx, sizeof(head->orig_ctx));
+
+	return head;
+}
+
 static int xdp_recv_frames(struct xdp_frame **frames, int nframes,
 			   struct sk_buff **skbs,
 			   struct net_device *dev)
@@ -287,6 +282,7 @@ static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog,
 	struct xdp_page_head *head;
 	struct xdp_frame *frm;
 	bool redirect = false;
+	struct page_pool *pp;
 	struct xdp_buff *ctx;
 	struct page *page;
 
@@ -295,15 +291,17 @@ static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog,
 	local_bh_disable();
 	xdp_set_return_frame_no_direct();
 
+	pp = this_cpu_read(system_page_pool);
+	xdp->rxq.mem.id = pp->xdp_mem_id;
+
 	for (i = 0; i < batch_sz; i++) {
-		page = page_pool_dev_alloc_pages(xdp->pp);
+		page = page_pool_dev_alloc_pages(pp);
 		if (!page) {
 			err = -ENOMEM;
 			goto out;
 		}
 
-		head = phys_to_virt(page_to_phys(page));
-		reset_ctx(head);
+		head = xdp_test_run_init_page(page, xdp);
 		ctx = &head->ctx;
 		frm = head->frame;
 		xdp->frame_cnt++;
-- 
2.43.0



* [PATCH net-next 3/3] bpf: test_run: Fix cacheline alignment of live XDP frame data structures
  2024-02-15 13:26 [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames Toke Høiland-Jørgensen
  2024-02-15 13:26 ` [PATCH net-next 1/3] net: Register system page pool as an XDP memory model Toke Høiland-Jørgensen
  2024-02-15 13:26 ` [PATCH net-next 2/3] bpf: test_run: Use system page pool for XDP live frame mode Toke Høiland-Jørgensen
@ 2024-02-15 13:26 ` Toke Høiland-Jørgensen
  2024-02-20  9:06   ` Daniel Borkmann
  2024-02-15 15:30 ` [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames Alexander Lobakin
  2024-02-19 18:52 ` Toke Høiland-Jørgensen
  4 siblings, 1 reply; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-15 13:26 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Jakub Kicinski, Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Toke Høiland-Jørgensen, Eric Dumazet,
	Paolo Abeni, bpf, netdev

The live XDP frame code in BPF_PROG_RUN suffered from suboptimal cache
line placement due to the forced cache line alignment of struct
xdp_rxq_info. Rearrange things so we don't waste a whole cache line on
padding, and also add explicit alignment to the data_hard_start field in
the start-of-page data structure we use for the data pages.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 net/bpf/test_run.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index c742869a0612..d153a3b3528f 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -108,12 +108,12 @@ struct xdp_page_head {
 		/* ::data_hard_start starts here */
 		DECLARE_FLEX_ARRAY(struct xdp_frame, frame);
 		DECLARE_FLEX_ARRAY(u8, data);
-	};
+	} ____cacheline_aligned;
 };
 
 struct xdp_test_data {
-	struct xdp_buff *orig_ctx;
 	struct xdp_rxq_info rxq;
+	struct xdp_buff *orig_ctx;
 	struct net_device *dev;
 	struct xdp_frame **frames;
 	struct sk_buff **skbs;
-- 
2.43.0



* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-15 13:26 [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames Toke Høiland-Jørgensen
                   ` (2 preceding siblings ...)
  2024-02-15 13:26 ` [PATCH net-next 3/3] bpf: test_run: Fix cacheline alignment of live XDP frame data structures Toke Høiland-Jørgensen
@ 2024-02-15 15:30 ` Alexander Lobakin
  2024-02-15 17:06   ` Toke Høiland-Jørgensen
  2024-02-19 18:52 ` Toke Høiland-Jørgensen
  4 siblings, 1 reply; 21+ messages in thread
From: Alexander Lobakin @ 2024-02-15 15:30 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Jesper Dangaard Brouer, netdev, bpf

From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 15 Feb 2024 14:26:29 +0100

> Now that we have a system-wide page pool, we can use that for the live
> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
> avoid the cost of creating a separate page pool instance for each
> syscall invocation. See the individual patches for more details.

Tested xdp-trafficgen on my development tree[0]; no regressions compared
to net-next with my patch that increases the live frames PP size.

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com>

(with some cosmetic nits, like [1])

> 
> Toke Høiland-Jørgensen (3):
>   net: Register system page pool as an XDP memory model
>   bpf: test_run: Use system page pool for XDP live frame mode
>   bpf: test_run: Fix cacheline alignment of live XDP frame data
>     structures
> 
>  include/linux/netdevice.h |   1 +
>  net/bpf/test_run.c        | 138 +++++++++++++++++++-------------------
>  net/core/dev.c            |  13 +++-
>  3 files changed, 81 insertions(+), 71 deletions(-)
> 

[0] https://github.com/alobakin/linux/commits/idpf-libie-new
[1] https://github.com/alobakin/linux/commit/62b4cb03486a

Thanks,
Olek


* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-15 15:30 ` [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames Alexander Lobakin
@ 2024-02-15 17:06   ` Toke Høiland-Jørgensen
  2024-02-16 11:41     ` Alexander Lobakin
  0 siblings, 1 reply; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-15 17:06 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Jesper Dangaard Brouer, netdev, bpf

Alexander Lobakin <aleksander.lobakin@intel.com> writes:

> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Thu, 15 Feb 2024 14:26:29 +0100
>
>> Now that we have a system-wide page pool, we can use that for the live
>> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
>> avoid the cost of creating a separate page pool instance for each
>> syscall invocation. See the individual patches for more details.
>
> Tested xdp-trafficgen on my development tree[0], no regressions from the
> net-next with my patch which increases live frames PP size.
>
> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com>

Great, thanks for taking it for a spin! :)

-Toke



* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-15 17:06   ` Toke Høiland-Jørgensen
@ 2024-02-16 11:41     ` Alexander Lobakin
  2024-02-16 14:00       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 21+ messages in thread
From: Alexander Lobakin @ 2024-02-16 11:41 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Jesper Dangaard Brouer, netdev, bpf

From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 15 Feb 2024 18:06:42 +0100

> Alexander Lobakin <aleksander.lobakin@intel.com> writes:
> 
>> From: Toke Høiland-Jørgensen <toke@redhat.com>
>> Date: Thu, 15 Feb 2024 14:26:29 +0100
>>
>>> Now that we have a system-wide page pool, we can use that for the live
>>> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
>>> avoid the cost of creating a separate page pool instance for each
>>> syscall invocation. See the individual patches for more details.
>>
>> Tested xdp-trafficgen on my development tree[0], no regressions from the
>> net-next with my patch which increases live frames PP size.
>>
>> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>> Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> 
> Great, thanks for taking it for a spin! :)

BTW, you removed the usage of page_pool->slow.init_callback; maybe we
could remove it completely?

> 
> -Toke

Thanks,
Olek


* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-16 11:41     ` Alexander Lobakin
@ 2024-02-16 14:00       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-16 14:00 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Jesper Dangaard Brouer, netdev, bpf

Alexander Lobakin <aleksander.lobakin@intel.com> writes:

> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Thu, 15 Feb 2024 18:06:42 +0100
>
>> Alexander Lobakin <aleksander.lobakin@intel.com> writes:
>> 
>>> From: Toke Høiland-Jørgensen <toke@redhat.com>
>>> Date: Thu, 15 Feb 2024 14:26:29 +0100
>>>
>>>> Now that we have a system-wide page pool, we can use that for the live
>>>> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
>>>> avoid the cost of creating a separate page pool instance for each
>>>> syscall invocation. See the individual patches for more details.
>>>
>>> Tested xdp-trafficgen on my development tree[0], no regressions from the
>>> net-next with my patch which increases live frames PP size.
>>>
>>> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>>> Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>> 
>> Great, thanks for taking it for a spin! :)
>
> BTW, you remove the usage of page_pool->slow.init_callback, maybe we
> could remove it completely?

Ohh, you're right. Totally forgot that this was something I introduced
for this use case :D

I'll send a follow-up to get rid of it after this lands.

-Toke



* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-15 13:26 [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames Toke Høiland-Jørgensen
                   ` (3 preceding siblings ...)
  2024-02-15 15:30 ` [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames Alexander Lobakin
@ 2024-02-19 18:52 ` Toke Høiland-Jørgensen
  2024-02-20  8:39   ` Daniel Borkmann
  4 siblings, 1 reply; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-19 18:52 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann
  Cc: Alexander Lobakin, netdev, bpf

Toke Høiland-Jørgensen <toke@redhat.com> writes:

> Now that we have a system-wide page pool, we can use that for the live
> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
> avoid the cost of creating a separate page pool instance for each
> syscall invocation. See the individual patches for more details.
>
> Toke Høiland-Jørgensen (3):
>   net: Register system page pool as an XDP memory model
>   bpf: test_run: Use system page pool for XDP live frame mode
>   bpf: test_run: Fix cacheline alignment of live XDP frame data
>     structures
>
>  include/linux/netdevice.h |   1 +
>  net/bpf/test_run.c        | 138 +++++++++++++++++++-------------------
>  net/core/dev.c            |  13 +++-
>  3 files changed, 81 insertions(+), 71 deletions(-)

Hi maintainers,

This series is targeting net-next, but it's listed as delegate:bpf in
patchwork[0]; is that a mistake? Do I need to do anything more to nudge it
along?

-Toke

[0] https://patchwork.kernel.org/project/netdevbpf/list/?series=826384



* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-19 18:52 ` Toke Høiland-Jørgensen
@ 2024-02-20  8:39   ` Daniel Borkmann
  2024-02-20  9:03     ` Paolo Abeni
  2024-02-20 11:23     ` Toke Høiland-Jørgensen
  0 siblings, 2 replies; 21+ messages in thread
From: Daniel Borkmann @ 2024-02-20  8:39 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov
  Cc: Alexander Lobakin, netdev, bpf

On 2/19/24 7:52 PM, Toke Høiland-Jørgensen wrote:
> Toke Høiland-Jørgensen <toke@redhat.com> writes:
> 
>> Now that we have a system-wide page pool, we can use that for the live
>> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
>> avoid the cost of creating a separate page pool instance for each
>> syscall invocation. See the individual patches for more details.
>>
>> Toke Høiland-Jørgensen (3):
>>    net: Register system page pool as an XDP memory model
>>    bpf: test_run: Use system page pool for XDP live frame mode
>>    bpf: test_run: Fix cacheline alignment of live XDP frame data
>>      structures
>>
>>   include/linux/netdevice.h |   1 +
>>   net/bpf/test_run.c        | 138 +++++++++++++++++++-------------------
>>   net/core/dev.c            |  13 +++-
>>   3 files changed, 81 insertions(+), 71 deletions(-)
> 
> Hi maintainers
> 
> This series is targeting net-next, but it's listed as delegate:bpf in
> patchwork[0]; is that a mistake? Do I need to do anything more to nudge it
> along?

I moved it over to netdev. Next time, if there are dependencies which are in
net-next but not yet in bpf-next, it would be good to state them clearly, given
that the majority of this series touches the bpf test infra code.

> -Toke
> 
> [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=826384


* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-20  8:39   ` Daniel Borkmann
@ 2024-02-20  9:03     ` Paolo Abeni
  2024-02-20  9:19       ` Daniel Borkmann
  2024-02-20 11:23     ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 21+ messages in thread
From: Paolo Abeni @ 2024-02-20  9:03 UTC (permalink / raw)
  To: Daniel Borkmann, Toke Høiland-Jørgensen,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Alexei Starovoitov
  Cc: Alexander Lobakin, netdev, bpf

On Tue, 2024-02-20 at 09:39 +0100, Daniel Borkmann wrote:
> On 2/19/24 7:52 PM, Toke Høiland-Jørgensen wrote:
> > Toke Høiland-Jørgensen <toke@redhat.com> writes:
> > 
> > > Now that we have a system-wide page pool, we can use that for the live
> > > frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
> > > avoid the cost of creating a separate page pool instance for each
> > > syscall invocation. See the individual patches for more details.
> > > 
> > > Toke Høiland-Jørgensen (3):
> > >    net: Register system page pool as an XDP memory model
> > >    bpf: test_run: Use system page pool for XDP live frame mode
> > >    bpf: test_run: Fix cacheline alignment of live XDP frame data
> > >      structures
> > > 
> > >   include/linux/netdevice.h |   1 +
> > >   net/bpf/test_run.c        | 138 +++++++++++++++++++-------------------
> > >   net/core/dev.c            |  13 +++-
> > >   3 files changed, 81 insertions(+), 71 deletions(-)
> > 
> > Hi maintainers
> > 
> > This series is targeting net-next, but it's listed as delegate:bpf in
> > patchwork[0]; is that a mistake? Do I need to do anything more to nudge it
> > along?
> 
> I moved it over to netdev, it would be good next time if there are dependencies
> which are in net-next but not yet bpf-next to clearly state them given from this
> series the majority touches the bpf test infra code.

This series apparently causes bpf self-tests failures:

https://github.com/kernel-patches/bpf/actions/runs/7929088890/job/21648828278

I'm unsure if that is blocking, or just a CI glitch.

The series LGTM, but I think it would be better if someone from the
ebpf team could also have a look.

Thanks!

Paolo



* Re: [PATCH net-next 2/3] bpf: test_run: Use system page pool for XDP live frame mode
  2024-02-15 13:26 ` [PATCH net-next 2/3] bpf: test_run: Use system page pool for XDP live frame mode Toke Høiland-Jørgensen
@ 2024-02-20  9:06   ` Daniel Borkmann
  2024-02-20  9:45     ` Paolo Abeni
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Borkmann @ 2024-02-20  9:06 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Alexei Starovoitov,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Eric Dumazet, Paolo Abeni, bpf, netdev

On 2/15/24 2:26 PM, Toke Høiland-Jørgensen wrote:
> The BPF_TEST_RUN code in XDP live frame mode creates a new page pool
> each time it is called and uses that to allocate the frames used for the
> XDP run. This works well if the syscall is used with a high repetitions
> number, as it allows for efficient page recycling. However, if used with
> a small number of repetitions, the overhead of creating and tearing down
> the page pool is significant, and can even lead to system stalls if the
> syscall is called in a tight loop.
> 
> Now that we have a persistent system page pool instance, it becomes
> pretty straight forward to change the test_run code to use it. The only
> wrinkle is that we can no longer rely on a custom page init callback
> from page_pool itself; instead, we change the test_run code to write a
> random cookie value to the beginning of the page as an indicator that
> the page has been initialised and can be re-used without copying the
> initial data again.
> 
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>

[...]
> -
>   	/* We create a 'fake' RXQ referencing the original dev, but with an
>   	 * xdp_mem_info pointing to our page_pool
>   	 */
>   	xdp_rxq_info_reg(&xdp->rxq, orig_ctx->rxq->dev, 0, 0);
> -	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL;
> -	xdp->rxq.mem.id = pp->xdp_mem_id;
> +	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL; /* mem id is set per-frame below */
>   	xdp->dev = orig_ctx->rxq->dev;
>   	xdp->orig_ctx = orig_ctx;
>   
> +	/* We need a random cookie for each run as pages can stick around
> +	 * between runs in the system page pool
> +	 */
> +	get_random_bytes(&xdp->cookie, sizeof(xdp->cookie));
> +

So the assumption is that there is only a tiny chance of collisions with
users outside of xdp test_run. If they do collide however, you'd leak data.
Presumably the 64-bit cookie might suffice... nit: perhaps it makes sense
to explicitly exclude a zero cookie?
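
Something like the below is what I have in mind (sketch only, reusing the
xdp->cookie field from this patch):

	/* Re-roll until non-zero so a freshly zeroed page can never look
	 * "already initialised".
	 */
	do {
		get_random_bytes(&xdp->cookie, sizeof(xdp->cookie));
	} while (!xdp->cookie);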

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Thanks,
Daniel


* Re: [PATCH net-next 3/3] bpf: test_run: Fix cacheline alignment of live XDP frame data structures
  2024-02-15 13:26 ` [PATCH net-next 3/3] bpf: test_run: Fix cacheline alignment of live XDP frame data structures Toke Høiland-Jørgensen
@ 2024-02-20  9:06   ` Daniel Borkmann
  0 siblings, 0 replies; 21+ messages in thread
From: Daniel Borkmann @ 2024-02-20  9:06 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Alexei Starovoitov,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Eric Dumazet, Paolo Abeni, bpf, netdev

On 2/15/24 2:26 PM, Toke Høiland-Jørgensen wrote:
> The live XDP frame code in BPF_PROG_RUN suffered from suboptimal cache
> line placement due to the forced cache line alignment of struct
> xdp_rxq_info. Rearrange things so we don't waste a whole cache line on
> padding, and also add explicit alignment to the data_hard_start field in
> the start-of-page data structure we use for the data pages.
> 
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>



* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-20  9:03     ` Paolo Abeni
@ 2024-02-20  9:19       ` Daniel Borkmann
  0 siblings, 0 replies; 21+ messages in thread
From: Daniel Borkmann @ 2024-02-20  9:19 UTC (permalink / raw)
  To: Paolo Abeni, Toke Høiland-Jørgensen, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Alexei Starovoitov
  Cc: Alexander Lobakin, netdev, bpf

On 2/20/24 10:03 AM, Paolo Abeni wrote:
> On Tue, 2024-02-20 at 09:39 +0100, Daniel Borkmann wrote:
>> On 2/19/24 7:52 PM, Toke Høiland-Jørgensen wrote:
>>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>>
>>>> Now that we have a system-wide page pool, we can use that for the live
>>>> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
>>>> avoid the cost of creating a separate page pool instance for each
>>>> syscall invocation. See the individual patches for more details.
>>>>
>>>> Toke Høiland-Jørgensen (3):
>>>>     net: Register system page pool as an XDP memory model
>>>>     bpf: test_run: Use system page pool for XDP live frame mode
>>>>     bpf: test_run: Fix cacheline alignment of live XDP frame data
>>>>       structures
>>>>
>>>>    include/linux/netdevice.h |   1 +
>>>>    net/bpf/test_run.c        | 138 +++++++++++++++++++-------------------
>>>>    net/core/dev.c            |  13 +++-
>>>>    3 files changed, 81 insertions(+), 71 deletions(-)
>>>
>>> Hi maintainers
>>>
>>> This series is targeting net-next, but it's listed as delegate:bpf in
>>> patchwork[0]; is that a mistake? Do I need to do anything more to nudge it
>>> along?
>>
>> I moved it over to netdev, it would be good next time if there are dependencies
>> which are in net-next but not yet bpf-next to clearly state them given from this
>> series the majority touches the bpf test infra code.
> 
> This series apparently causes bpf self-tests failures:
> 
> https://github.com/kernel-patches/bpf/actions/runs/7929088890/job/21648828278
> 
> I'm unsure if that is blocking, or just a CI glitch.
> 
> The series LGTM, but I think it would be better if someone from the
> ebpf team could also have a look.

The CI was not able to apply the patches, so this looks unrelated:

Cmd('git') failed due to: exit code(128)
   cmdline: git am --3way
   stdout: 'Applying: net: Register system page pool as an XDP memory model
Patch failed at 0001 net: Register system page pool as an XDP memory model
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".'
   stderr: 'error: sha1 information is lacking or useless (include/linux/netdevice.h).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch'


* Re: [PATCH net-next 2/3] bpf: test_run: Use system page pool for XDP live frame mode
  2024-02-20  9:06   ` Daniel Borkmann
@ 2024-02-20  9:45     ` Paolo Abeni
  2024-02-20 13:14       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Abeni @ 2024-02-20  9:45 UTC (permalink / raw)
  To: Daniel Borkmann, Toke Høiland-Jørgensen,
	Alexei Starovoitov, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Jakub Kicinski, Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Eric Dumazet, bpf, netdev

On Tue, 2024-02-20 at 10:06 +0100, Daniel Borkmann wrote:
> On 2/15/24 2:26 PM, Toke Høiland-Jørgensen wrote:
> > The BPF_TEST_RUN code in XDP live frame mode creates a new page pool
> > each time it is called and uses that to allocate the frames used for the
> > XDP run. This works well if the syscall is used with a high repetitions
> > number, as it allows for efficient page recycling. However, if used with
> > a small number of repetitions, the overhead of creating and tearing down
> > the page pool is significant, and can even lead to system stalls if the
> > syscall is called in a tight loop.
> > 
> > Now that we have a persistent system page pool instance, it becomes
> > pretty straight forward to change the test_run code to use it. The only
> > wrinkle is that we can no longer rely on a custom page init callback
> > from page_pool itself; instead, we change the test_run code to write a
> > random cookie value to the beginning of the page as an indicator that
> > the page has been initialised and can be re-used without copying the
> > initial data again.
> > 
> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> 
> [...]
> > -
> >   	/* We create a 'fake' RXQ referencing the original dev, but with an
> >   	 * xdp_mem_info pointing to our page_pool
> >   	 */
> >   	xdp_rxq_info_reg(&xdp->rxq, orig_ctx->rxq->dev, 0, 0);
> > -	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL;
> > -	xdp->rxq.mem.id = pp->xdp_mem_id;
> > +	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL; /* mem id is set per-frame below */
> >   	xdp->dev = orig_ctx->rxq->dev;
> >   	xdp->orig_ctx = orig_ctx;
> >   
> > +	/* We need a random cookie for each run as pages can stick around
> > +	 * between runs in the system page pool
> > +	 */
> > +	get_random_bytes(&xdp->cookie, sizeof(xdp->cookie));
> > +
> 
> So the assumption is that there is only a tiny chance of collisions with
> users outside of xdp test_run. If they do collide however, you'd leak data.

Good point. @Toke: what is the worst-case thing that could happen in
case a page is recycled from another user of the pool?

Could we possibly end up matching the cookie for a page containing
'random' orig_ctx/ctx, so that the bpf program later tries to access
equally random ptrs?

Thanks!

Paolo



* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-20  8:39   ` Daniel Borkmann
  2024-02-20  9:03     ` Paolo Abeni
@ 2024-02-20 11:23     ` Toke Høiland-Jørgensen
  2024-02-20 12:35       ` Daniel Borkmann
  1 sibling, 1 reply; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-20 11:23 UTC (permalink / raw)
  To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov
  Cc: Alexander Lobakin, netdev, bpf

Daniel Borkmann <daniel@iogearbox.net> writes:

> On 2/19/24 7:52 PM, Toke Høiland-Jørgensen wrote:
>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>> 
>>> Now that we have a system-wide page pool, we can use that for the live
>>> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
>>> avoid the cost of creating a separate page pool instance for each
>>> syscall invocation. See the individual patches for more details.
>>>
>>> Toke Høiland-Jørgensen (3):
>>>    net: Register system page pool as an XDP memory model
>>>    bpf: test_run: Use system page pool for XDP live frame mode
>>>    bpf: test_run: Fix cacheline alignment of live XDP frame data
>>>      structures
>>>
>>>   include/linux/netdevice.h |   1 +
>>>   net/bpf/test_run.c        | 138 +++++++++++++++++++-------------------
>>>   net/core/dev.c            |  13 +++-
>>>   3 files changed, 81 insertions(+), 71 deletions(-)
>> 
>> Hi maintainers
>> 
>> This series is targeting net-next, but it's listed as delegate:bpf in
>> patchwork[0]; is that a mistake? Do I need to do anything more to nudge it
>> along?
>
> I moved it over to netdev, it would be good next time if there are dependencies
> which are in net-next but not yet bpf-next to clearly state them given from this
> series the majority touches the bpf test infra code.

Right, I thought that was what I was doing by targeting them at net-next
(in the subject). What's the proper way to do this, then, just noting it
in the cover letter? :)

-Toke



* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-20 11:23     ` Toke Høiland-Jørgensen
@ 2024-02-20 12:35       ` Daniel Borkmann
  2024-02-20 15:24         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Borkmann @ 2024-02-20 12:35 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov
  Cc: Alexander Lobakin, netdev, bpf

On 2/20/24 12:23 PM, Toke Høiland-Jørgensen wrote:
> Daniel Borkmann <daniel@iogearbox.net> writes:
>> On 2/19/24 7:52 PM, Toke Høiland-Jørgensen wrote:
>>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>>
>>>> Now that we have a system-wide page pool, we can use that for the live
>>>> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
>>>> avoid the cost of creating a separate page pool instance for each
>>>> syscall invocation. See the individual patches for more details.
>>>>
>>>> Toke Høiland-Jørgensen (3):
>>>>     net: Register system page pool as an XDP memory model
>>>>     bpf: test_run: Use system page pool for XDP live frame mode
>>>>     bpf: test_run: Fix cacheline alignment of live XDP frame data
>>>>       structures
>>>>
>>>>    include/linux/netdevice.h |   1 +
>>>>    net/bpf/test_run.c        | 138 +++++++++++++++++++-------------------
>>>>    net/core/dev.c            |  13 +++-
>>>>    3 files changed, 81 insertions(+), 71 deletions(-)
>>>
>>> Hi maintainers
>>>
>>> This series is targeting net-next, but it's listed as delegate:bpf in
>>> patchwork[0]; is that a mistake? Do I need to do anything more to nudge it
>>> along?
>>
>> I moved it over to netdev, it would be good next time if there are dependencies
>> which are in net-next but not yet bpf-next to clearly state them given from this
>> series the majority touches the bpf test infra code.
> 
> Right, I thought that was what I was doing by targeting them at net-next
> (in the subject). What's the proper way to do this, then, just noting it
> in the cover letter? :)

An explicit lore link to the series this depends on would be best.

Thanks,
Daniel


* Re: [PATCH net-next 2/3] bpf: test_run: Use system page pool for XDP live frame mode
  2024-02-20  9:45     ` Paolo Abeni
@ 2024-02-20 13:14       ` Toke Høiland-Jørgensen
  2024-02-20 14:57         ` Paolo Abeni
  0 siblings, 1 reply; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-20 13:14 UTC (permalink / raw)
  To: Paolo Abeni, Daniel Borkmann, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Jakub Kicinski, Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Eric Dumazet, bpf, netdev

Paolo Abeni <pabeni@redhat.com> writes:

> On Tue, 2024-02-20 at 10:06 +0100, Daniel Borkmann wrote:
>> On 2/15/24 2:26 PM, Toke Høiland-Jørgensen wrote:
>> > The BPF_TEST_RUN code in XDP live frame mode creates a new page pool
>> > each time it is called and uses that to allocate the frames used for the
>> > XDP run. This works well if the syscall is used with a high repetitions
>> > number, as it allows for efficient page recycling. However, if used with
>> > a small number of repetitions, the overhead of creating and tearing down
>> > the page pool is significant, and can even lead to system stalls if the
>> > syscall is called in a tight loop.
>> > 
>> > Now that we have a persistent system page pool instance, it becomes
>> > pretty straight forward to change the test_run code to use it. The only
>> > wrinkle is that we can no longer rely on a custom page init callback
>> > from page_pool itself; instead, we change the test_run code to write a
>> > random cookie value to the beginning of the page as an indicator that
>> > the page has been initialised and can be re-used without copying the
>> > initial data again.
>> > 
>> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> 
>> [...]
>> > -
>> >   	/* We create a 'fake' RXQ referencing the original dev, but with an
>> >   	 * xdp_mem_info pointing to our page_pool
>> >   	 */
>> >   	xdp_rxq_info_reg(&xdp->rxq, orig_ctx->rxq->dev, 0, 0);
>> > -	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL;
>> > -	xdp->rxq.mem.id = pp->xdp_mem_id;
>> > +	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL; /* mem id is set per-frame below */
>> >   	xdp->dev = orig_ctx->rxq->dev;
>> >   	xdp->orig_ctx = orig_ctx;
>> >   
>> > +	/* We need a random cookie for each run as pages can stick around
>> > +	 * between runs in the system page pool
>> > +	 */
>> > +	get_random_bytes(&xdp->cookie, sizeof(xdp->cookie));
>> > +
>> 
>> So the assumption is that there is only a tiny chance of collisions with
>> users outside of xdp test_run. If they do collide however, you'd leak data.
>
> Good point. @Toke: what is the worst-case thing that could happen in
> case a page is recycled from another pool's user?
>
> could we possibly end-up matching the cookie for a page containing
> 'random' orig_ctx/ctx, so that bpf program later tries to access
> equally random ptrs?

Well, yes, if there's a collision in the cookie value we'll end up
basically dereferencing garbage pointer values, with all the badness
that ensues (most likely just a crash, but system compromise is probably
also possible in such a case).

A 64-bit value is probably too small to be resistant against random
collisions in a "protect global data across the internet" type scenario
(for instance, a 64-bit cryptographic key is considered weak). However,
in this case the collision domain is only for the lifetime of the
running system, and each cookie value only stays valid for the duration
of a single syscall (seconds, at most), so I figured it was acceptable.

We could exclude all-zeros as a valid cookie value (and also anything
that looks like a valid pointer), but that only removes a few of the
possible random collision values, so if we're really worried about
random collisions of 64-bit numbers, I think a better approach would be
to just make the cookie a 128-bit value instead. I can respin with that
if you prefer? :)

-Toke



* Re: [PATCH net-next 2/3] bpf: test_run: Use system page pool for XDP live frame mode
  2024-02-20 13:14       ` Toke Høiland-Jørgensen
@ 2024-02-20 14:57         ` Paolo Abeni
  2024-02-20 19:33           ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Abeni @ 2024-02-20 14:57 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Jakub Kicinski, Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Eric Dumazet, bpf, netdev

On Tue, 2024-02-20 at 14:14 +0100, Toke Høiland-Jørgensen wrote:
> Paolo Abeni <pabeni@redhat.com> writes:
> 
> > On Tue, 2024-02-20 at 10:06 +0100, Daniel Borkmann wrote:
> > > On 2/15/24 2:26 PM, Toke Høiland-Jørgensen wrote:
> > > > The BPF_TEST_RUN code in XDP live frame mode creates a new page pool
> > > > each time it is called and uses that to allocate the frames used for the
> > > > XDP run. This works well if the syscall is used with a high repetitions
> > > > number, as it allows for efficient page recycling. However, if used with
> > > > a small number of repetitions, the overhead of creating and tearing down
> > > > the page pool is significant, and can even lead to system stalls if the
> > > > syscall is called in a tight loop.
> > > > 
> > > > Now that we have a persistent system page pool instance, it becomes
> > > > pretty straight forward to change the test_run code to use it. The only
> > > > wrinkle is that we can no longer rely on a custom page init callback
> > > > from page_pool itself; instead, we change the test_run code to write a
> > > > random cookie value to the beginning of the page as an indicator that
> > > > the page has been initialised and can be re-used without copying the
> > > > initial data again.
> > > > 
> > > > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> > > 
> > > [...]
> > > > -
> > > >   	/* We create a 'fake' RXQ referencing the original dev, but with an
> > > >   	 * xdp_mem_info pointing to our page_pool
> > > >   	 */
> > > >   	xdp_rxq_info_reg(&xdp->rxq, orig_ctx->rxq->dev, 0, 0);
> > > > -	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL;
> > > > -	xdp->rxq.mem.id = pp->xdp_mem_id;
> > > > +	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL; /* mem id is set per-frame below */
> > > >   	xdp->dev = orig_ctx->rxq->dev;
> > > >   	xdp->orig_ctx = orig_ctx;
> > > >   
> > > > +	/* We need a random cookie for each run as pages can stick around
> > > > +	 * between runs in the system page pool
> > > > +	 */
> > > > +	get_random_bytes(&xdp->cookie, sizeof(xdp->cookie));
> > > > +
> > > 
> > > So the assumption is that there is only a tiny chance of collisions with
> > > users outside of xdp test_run. If they do collide however, you'd leak data.
> > 
> > Good point. @Toke: what is the worst-case thing that could happen in
> > case a page is recycled from another pool's user?
> > 
> > could we possibly end-up matching the cookie for a page containing
> > 'random' orig_ctx/ctx, so that bpf program later tries to access
> > equally random ptrs?
> 
> Well, yes, if there's a collision in the cookie value we'll end up
> basically dereferencing garbage pointer values, with all the badness
> that ensues (most likely just a crash, but system compromise is probably
> also possible in such a case).
> 
> A 64-bit value is probably too small to be resistant against random
> collisions in a "protect global data across the internet" type scenario
> (for instance, a 64-bit cryptographic key is considered weak). However,
> in this case the collision domain is only for the lifetime of the
> running system, and each cookie value only stays valid for the duration
> of a single syscall (seconds, at most), so I figured it was acceptable.
> 
> We could exclude all-zeros as a valid cookie value (and also anything
> that looks as a valid pointer), but that only removes a few of the
> possible random collision values, so if we're really worried about
> random collisions of 64-bit numbers, I think a better approach would be
> to just make the cookie a 128-bit value instead. I can respin with that
> if you prefer? :)

I must admit that merging code that could allow trashing the kernel -
even with a very low probability - is quite scary to me.

How relevant is the recycle-case optimization? Could removing that
optimization completely be considered?

Thanks!

Paolo



* Re: [PATCH net-next 0/3] Change BPF_TEST_RUN use the system page pool for live XDP frames
  2024-02-20 12:35       ` Daniel Borkmann
@ 2024-02-20 15:24         ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-20 15:24 UTC (permalink / raw)
  To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov
  Cc: Alexander Lobakin, netdev, bpf

Daniel Borkmann <daniel@iogearbox.net> writes:

> On 2/20/24 12:23 PM, Toke Høiland-Jørgensen wrote:
>> Daniel Borkmann <daniel@iogearbox.net> writes:
>>> On 2/19/24 7:52 PM, Toke Høiland-Jørgensen wrote:
>>>> Toke Høiland-Jørgensen <toke@redhat.com> writes:
>>>>
>>>>> Now that we have a system-wide page pool, we can use that for the live
>>>>> frame mode of BPF_TEST_RUN (used by the XDP traffic generator), and
>>>>> avoid the cost of creating a separate page pool instance for each
>>>>> syscall invocation. See the individual patches for more details.
>>>>>
>>>>> Toke Høiland-Jørgensen (3):
>>>>>     net: Register system page pool as an XDP memory model
>>>>>     bpf: test_run: Use system page pool for XDP live frame mode
>>>>>     bpf: test_run: Fix cacheline alignment of live XDP frame data
>>>>>       structures
>>>>>
>>>>>    include/linux/netdevice.h |   1 +
>>>>>    net/bpf/test_run.c        | 138 +++++++++++++++++++-------------------
>>>>>    net/core/dev.c            |  13 +++-
>>>>>    3 files changed, 81 insertions(+), 71 deletions(-)
>>>>
>>>> Hi maintainers
>>>>
>>>> This series is targeting net-next, but it's listed as delegate:bpf in
>>>> patchwork[0]; is that a mistake? Do I need to do anything more to nudge it
>>>> along?
>>>
>>> I moved it over to netdev, it would be good next time if there are dependencies
>>> which are in net-next but not yet bpf-next to clearly state them given from this
>>> series the majority touches the bpf test infra code.
>> 
>> Right, I thought that was what I was doing by targeting them at net-next
>> (in the subject). What's the proper way to do this, then, just noting it
>> in the cover letter? :)
>
> An explicit lore link to the series this depends on would be best.

Alright; seems I'm respinning anyway, so will add one in the next
revision :)

-Toke



* Re: [PATCH net-next 2/3] bpf: test_run: Use system page pool for XDP live frame mode
  2024-02-20 14:57         ` Paolo Abeni
@ 2024-02-20 19:33           ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-02-20 19:33 UTC (permalink / raw)
  To: Paolo Abeni, Daniel Borkmann, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Jakub Kicinski, Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Eric Dumazet, bpf, netdev

Paolo Abeni <pabeni@redhat.com> writes:

> On Tue, 2024-02-20 at 14:14 +0100, Toke Høiland-Jørgensen wrote:
>> Paolo Abeni <pabeni@redhat.com> writes:
>> 
>> > On Tue, 2024-02-20 at 10:06 +0100, Daniel Borkmann wrote:
>> > > On 2/15/24 2:26 PM, Toke Høiland-Jørgensen wrote:
>> > > > The BPF_TEST_RUN code in XDP live frame mode creates a new page pool
>> > > > each time it is called and uses that to allocate the frames used for the
>> > > > XDP run. This works well if the syscall is used with a high repetitions
>> > > > number, as it allows for efficient page recycling. However, if used with
>> > > > a small number of repetitions, the overhead of creating and tearing down
>> > > > the page pool is significant, and can even lead to system stalls if the
>> > > > syscall is called in a tight loop.
>> > > > 
>> > > > Now that we have a persistent system page pool instance, it becomes
>> > > > pretty straight forward to change the test_run code to use it. The only
>> > > > wrinkle is that we can no longer rely on a custom page init callback
>> > > > from page_pool itself; instead, we change the test_run code to write a
>> > > > random cookie value to the beginning of the page as an indicator that
>> > > > the page has been initialised and can be re-used without copying the
>> > > > initial data again.
>> > > > 
>> > > > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> > > 
>> > > [...]
>> > > > -
>> > > >   	/* We create a 'fake' RXQ referencing the original dev, but with an
>> > > >   	 * xdp_mem_info pointing to our page_pool
>> > > >   	 */
>> > > >   	xdp_rxq_info_reg(&xdp->rxq, orig_ctx->rxq->dev, 0, 0);
>> > > > -	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL;
>> > > > -	xdp->rxq.mem.id = pp->xdp_mem_id;
>> > > > +	xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL; /* mem id is set per-frame below */
>> > > >   	xdp->dev = orig_ctx->rxq->dev;
>> > > >   	xdp->orig_ctx = orig_ctx;
>> > > >   
>> > > > +	/* We need a random cookie for each run as pages can stick around
>> > > > +	 * between runs in the system page pool
>> > > > +	 */
>> > > > +	get_random_bytes(&xdp->cookie, sizeof(xdp->cookie));
>> > > > +
>> > > 
>> > > So the assumption is that there is only a tiny chance of collisions with
>> > > users outside of xdp test_run. If they do collide however, you'd leak data.
>> > 
>> > Good point. @Toke: what is the worst-case thing that could happen in
>> > case a page is recycled from another pool's user?
>> > 
>> > could we possibly end-up matching the cookie for a page containing
>> > 'random' orig_ctx/ctx, so that bpf program later tries to access
>> > equally random ptrs?
>> 
>> Well, yes, if there's a collision in the cookie value we'll end up
>> basically dereferencing garbage pointer values, with all the badness
>> that ensues (most likely just a crash, but system compromise is probably
>> also possible in such a case).
>> 
>> A 64-bit value is probably too small to be resistant against random
>> collisions in a "protect global data across the internet" type scenario
>> (for instance, a 64-bit cryptographic key is considered weak). However,
>> in this case the collision domain is only for the lifetime of the
>> running system, and each cookie value only stays valid for the duration
>> of a single syscall (seconds, at most), so I figured it was acceptable.
>> 
>> We could exclude all-zeros as a valid cookie value (and also anything
>> that looks as a valid pointer), but that only removes a few of the
>> possible random collision values, so if we're really worried about
>> random collisions of 64-bit numbers, I think a better approach would be
>> to just make the cookie a 128-bit value instead. I can respin with that
>> if you prefer? :)
>
> I must admit that merging a code that will allow trashing the kernel -
> even with a very low probability - is quite scaring to me.
>
> How much relevant is the recycle case optimization? Could removing
> completely that optimization be considered?

Did a quick test of this, and skipping the recycling eats ~12.5%
performance, so I don't think getting rid of it is a good solution.

However, increasing the cookie size to 128 bits makes no performance
difference (everything stays in the same cache lines). If we do that,
the collision probability enters "won't happen before the heat death of
the universe" territory, so I don't think there's any real concern that
this will happen.
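
To put rough numbers on it (back-of-envelope, assuming foreign page
contents behave like random bits and that as many as 10^6 such pages get
inspected per second, continuously): each inspected page matches a given
cookie with probability 2^-64 (resp. 2^-128), so the expected number of
false matches per year comes out to roughly

  10^6 * 3.15*10^7 / 2^64  ~= 1.7*10^-6   (64-bit cookie)
  10^6 * 3.15*10^7 / 2^128 ~= 9*10^-26    (128-bit cookie)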

I'll respin with the bigger cookie size (and add that patch Olek
suggested to remove the init callback from the page pool code).

-Toke


