public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH io_uring-7.1 00/16] zcrx update for-7.1
@ 2026-03-23 12:43 Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 01/16] io_uring/zcrx: return back two step unregistration Pavel Begunkov
                   ` (17 more replies)
  0 siblings, 18 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

The series mostly consists of cleanups and preparation patches. Patch 1
closes the interface queue earlier, at the start of io_ring_exit_work(),
as there are reports of io_uring quiesce taking too long and causing
subsequent attempts to reuse a queue to fail. Patch 5 introduces a
device-less mode, where there is only the copy fallback and no
dma/devices/page_pool/etc. Patches 11-12 start moving the memory
provider API in the direction of passing netmem arrays instead of
working with the page pool directly, which was suggested before.

Pavel Begunkov (16):
  io_uring/zcrx: return back two step unregistration
  io_uring/zcrx: fully clean area on error in io_import_umem()
  io_uring/zcrx: always dma map in advance
  io_uring/zcrx: extract netdev+area init into a helper
  io_uring/zcrx: implement device-less mode for zcrx
  io_uring/zcrx: use better name for RQ region
  io_uring/zcrx: add a struct for refill queue
  io_uring/zcrx: use guards for locking
  io_uring/zcrx: move count check into zcrx_get_free_niov
  io_uring/zcrx: warn on alloc with non-empty pp cache
  io_uring/zcrx: netmem array as refilling format
  io_uring/zcrx: consolidate dma syncing
  io_uring/zcrx: warn on a repeated area append
  io_uring/zcrx: cache fallback availability in zcrx ctx
  io_uring/zcrx: check ctrl op payload struct sizes
  io_uring/zcrx: rename zcrx [un]register functions

 include/uapi/linux/io_uring/zcrx.h |   9 +-
 io_uring/io_uring.c                |   6 +-
 io_uring/register.c                |   2 +-
 io_uring/zcrx.c                    | 364 ++++++++++++++++++-----------
 io_uring/zcrx.h                    |  33 ++-
 5 files changed, 257 insertions(+), 157 deletions(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH io_uring-7.1 01/16] io_uring/zcrx: return back two step unregistration
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 15:01   ` Jens Axboe
  2026-03-23 12:43 ` [PATCH io_uring-7.1 02/16] io_uring/zcrx: fully clean area on error in io_import_umem() Pavel Begunkov
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev, Youngmin Choi

There are reports where io_uring instance removal takes too long and an
ifq reallocation by another zcrx instance fails. Split zcrx destruction
into two steps, similarly to how it was done before: first close the
queue early while keeping the zcrx alive, then, once all inflight
requests have completed, drop the main zcrx reference. For extra
protection, mark terminated zcrx instances in the xarray and warn if we
double put them.

Cc: stable@vger.kernel.org # 6.19+
Link: https://github.com/axboe/liburing/issues/1550
Reported-by: Youngmin Choi <youngminchoi94@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/io_uring.c |  4 ++++
 io_uring/zcrx.c     | 44 +++++++++++++++++++++++++++++++++++++++++---
 io_uring/zcrx.h     |  4 ++++
 3 files changed, 49 insertions(+), 3 deletions(-)
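The split lifetime the patch restores (one counter quiesces the queue
early, another keeps the object alive for inflight users) can be
illustrated with a minimal userspace sketch; all names below are
hypothetical stand-ins, not the kernel code:

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Two-step teardown sketch: user_refs gates quiescing the queue as
 * soon as the last user unregisters, while refs keeps the object
 * itself alive until every in-flight reference is dropped.
 */
struct fake_ifq {
	atomic_int user_refs;	/* last drop closes the HW queue */
	atomic_int refs;	/* last drop frees the object */
	bool queue_closed;
	bool freed;
};

/* Step 1: quiesce early, without freeing anything. */
static void fake_ifq_unregister_user(struct fake_ifq *q)
{
	if (atomic_fetch_sub(&q->user_refs, 1) == 1)
		q->queue_closed = true;	/* stand-in for io_close_queue() */
}

/* Step 2: free only once every reference is gone. */
static void fake_ifq_put(struct fake_ifq *q)
{
	if (atomic_fetch_sub(&q->refs, 1) == 1)
		q->freed = true;	/* stand-in for freeing the ifq */
}
```

The point of the split is visible in the ordering: the queue can be
returned to the device long before the last inflight request lets go of
the object.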

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 6eaa21e09469..34104c256c88 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2308,6 +2308,10 @@ static __cold void io_ring_exit_work(struct work_struct *work)
 	struct io_tctx_node *node;
 	int ret;
 
+	mutex_lock(&ctx->uring_lock);
+	io_terminate_zcrx(ctx);
+	mutex_unlock(&ctx->uring_lock);
+
 	/*
 	 * If we're doing polled IO and end up having requests being
 	 * submitted async (out-of-line), then completions can come in while
diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 73fa82759771..8c76c174380d 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -624,12 +624,17 @@ static void io_zcrx_scrub(struct io_zcrx_ifq *ifq)
 	}
 }
 
-static void zcrx_unregister(struct io_zcrx_ifq *ifq)
+static void zcrx_unregister_user(struct io_zcrx_ifq *ifq)
 {
 	if (refcount_dec_and_test(&ifq->user_refs)) {
 		io_close_queue(ifq);
 		io_zcrx_scrub(ifq);
 	}
+}
+
+static void zcrx_unregister(struct io_zcrx_ifq *ifq)
+{
+	zcrx_unregister_user(ifq);
 	io_put_zcrx_ifq(ifq);
 }
 
@@ -887,6 +892,36 @@ static struct net_iov *__io_zcrx_get_free_niov(struct io_zcrx_area *area)
 	return &area->nia.niovs[niov_idx];
 }
 
+static inline bool is_zcrx_entry_marked(struct io_ring_ctx *ctx, unsigned long id)
+{
+	return xa_get_mark(&ctx->zcrx_ctxs, id, XA_MARK_0);
+}
+
+static inline void set_zcrx_entry_mark(struct io_ring_ctx *ctx, unsigned long id)
+{
+	xa_set_mark(&ctx->zcrx_ctxs, id, XA_MARK_0);
+}
+
+void io_terminate_zcrx(struct io_ring_ctx *ctx)
+{
+	struct io_zcrx_ifq *ifq;
+	unsigned long id = 0;
+
+	lockdep_assert_held(&ctx->uring_lock);
+
+	while (1) {
+		scoped_guard(mutex, &ctx->mmap_lock)
+			ifq = xa_find(&ctx->zcrx_ctxs, &id, ULONG_MAX, XA_PRESENT);
+		if (!ifq)
+			break;
+		if (WARN_ON_ONCE(is_zcrx_entry_marked(ctx, id)))
+			break;
+		set_zcrx_entry_mark(ctx, id);
+		id++;
+		zcrx_unregister_user(ifq);
+	}
+}
+
 void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
 {
 	struct io_zcrx_ifq *ifq;
@@ -898,12 +933,15 @@ void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
 			unsigned long id = 0;
 
 			ifq = xa_find(&ctx->zcrx_ctxs, &id, ULONG_MAX, XA_PRESENT);
-			if (ifq)
+			if (ifq) {
+				if (WARN_ON_ONCE(!is_zcrx_entry_marked(ctx, id)))
+					break;
 				xa_erase(&ctx->zcrx_ctxs, id);
+			}
 		}
 		if (!ifq)
 			break;
-		zcrx_unregister(ifq);
+		io_put_zcrx_ifq(ifq);
 	}
 
 	xa_destroy(&ctx->zcrx_ctxs);
diff --git a/io_uring/zcrx.h b/io_uring/zcrx.h
index 0ddcf0ee8861..0316a41a3561 100644
--- a/io_uring/zcrx.h
+++ b/io_uring/zcrx.h
@@ -74,6 +74,7 @@ int io_zcrx_ctrl(struct io_ring_ctx *ctx, void __user *arg, unsigned nr_arg);
 int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 			 struct io_uring_zcrx_ifq_reg __user *arg);
 void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx);
+void io_terminate_zcrx(struct io_ring_ctx *ctx);
 int io_zcrx_recv(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
 		 struct socket *sock, unsigned int flags,
 		 unsigned issue_flags, unsigned int *len);
@@ -88,6 +89,9 @@ static inline int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 static inline void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
 {
 }
+static inline void io_terminate_zcrx(struct io_ring_ctx *ctx)
+{
+}
 static inline int io_zcrx_recv(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
 			       struct socket *sock, unsigned int flags,
 			       unsigned issue_flags, unsigned int *len)
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH io_uring-7.1 02/16] io_uring/zcrx: fully clean area on error in io_import_umem()
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 01/16] io_uring/zcrx: return back two step unregistration Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 03/16] io_uring/zcrx: always dma map in advance Pavel Begunkov
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

When accounting fails, io_import_umem() leaves the page array etc. set
and returns an error, expecting the error handling code to take care of
the rest. To make the next patch simpler, only return fully initialised
areas from the function.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)
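The shape the function converges on is the standard kernel goto-unwind
idiom: on any failure, undo everything already set up so the caller
only ever sees a fully initialised object or a clean error. A generic
sketch with illustrative (not kernel) resource names:

```c
#include <stdlib.h>

/* Two resources acquired in order, released in reverse on failure. */
struct resource {
	char *buf_a;
	char *buf_b;
};

static int resource_init(struct resource *res, size_t len)
{
	res->buf_a = malloc(len);
	if (!res->buf_a)
		goto out_err;

	res->buf_b = malloc(len);
	if (!res->buf_b)
		goto out_free_a;

	return 0;		/* success: everything initialised */

out_free_a:
	free(res->buf_a);
	res->buf_a = NULL;	/* never leak a half-built object */
out_err:
	return -1;
}
```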

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 8c76c174380d..5739ce14d8ea 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -207,22 +207,26 @@ static int io_import_umem(struct io_zcrx_ifq *ifq,
 	ret = sg_alloc_table_from_pages(&mem->page_sg_table, pages, nr_pages,
 					0, (unsigned long)nr_pages << PAGE_SHIFT,
 					GFP_KERNEL_ACCOUNT);
-	if (ret) {
-		unpin_user_pages(pages, nr_pages);
-		kvfree(pages);
-		return ret;
-	}
+	if (ret)
+		goto out_err;
 
 	mem->account_pages = io_count_account_pages(pages, nr_pages);
 	ret = io_account_mem(ifq->user, ifq->mm_account, mem->account_pages);
-	if (ret < 0)
+	if (ret < 0) {
 		mem->account_pages = 0;
+		goto out_err;
+	}
 
 	mem->sgt = &mem->page_sg_table;
 	mem->pages = pages;
 	mem->nr_folios = nr_pages;
 	mem->size = area_reg->len;
 	return ret;
+out_err:
+	sg_free_table(&mem->page_sg_table);
+	unpin_user_pages(pages, nr_pages);
+	kvfree(pages);
+	return ret;
 }
 
 static void io_release_area_mem(struct io_zcrx_mem *mem)
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH io_uring-7.1 03/16] io_uring/zcrx: always dma map in advance
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 01/16] io_uring/zcrx: return back two step unregistration Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 02/16] io_uring/zcrx: fully clean area on error in io_import_umem() Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 04/16] io_uring/zcrx: extract netdev+area init into a helper Pavel Begunkov
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

zcrx originally established dma mappings at a late stage, when it was
being bound to a page pool. Dma-buf couldn't work this way, so it is
initialised during area creation instead.

It's messy having the two paths do it at different spots, so just move
everything to area creation time.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 44 +++++++++++++++-----------------------------
 1 file changed, 15 insertions(+), 29 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 5739ce14d8ea..4e8064fc5561 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -194,6 +194,7 @@ static int io_import_umem(struct io_zcrx_ifq *ifq,
 {
 	struct page **pages;
 	int nr_pages, ret;
+	bool mapped = false;
 
 	if (area_reg->dmabuf_fd)
 		return -EINVAL;
@@ -210,6 +211,12 @@ static int io_import_umem(struct io_zcrx_ifq *ifq,
 	if (ret)
 		goto out_err;
 
+	ret = dma_map_sgtable(ifq->dev, &mem->page_sg_table,
+			      DMA_FROM_DEVICE, IO_DMA_ATTR);
+	if (ret < 0)
+		goto out_err;
+	mapped = true;
+
 	mem->account_pages = io_count_account_pages(pages, nr_pages);
 	ret = io_account_mem(ifq->user, ifq->mm_account, mem->account_pages);
 	if (ret < 0) {
@@ -223,6 +230,9 @@ static int io_import_umem(struct io_zcrx_ifq *ifq,
 	mem->size = area_reg->len;
 	return ret;
 out_err:
+	if (mapped)
+		dma_unmap_sgtable(ifq->dev, &mem->page_sg_table,
+				  DMA_FROM_DEVICE, IO_DMA_ATTR);
 	sg_free_table(&mem->page_sg_table);
 	unpin_user_pages(pages, nr_pages);
 	kvfree(pages);
@@ -288,30 +298,6 @@ static void io_zcrx_unmap_area(struct io_zcrx_ifq *ifq,
 	}
 }
 
-static int io_zcrx_map_area(struct io_zcrx_ifq *ifq, struct io_zcrx_area *area)
-{
-	int ret;
-
-	guard(mutex)(&ifq->pp_lock);
-	if (area->is_mapped)
-		return 0;
-
-	if (!area->mem.is_dmabuf) {
-		ret = dma_map_sgtable(ifq->dev, &area->mem.page_sg_table,
-				      DMA_FROM_DEVICE, IO_DMA_ATTR);
-		if (ret < 0)
-			return ret;
-	}
-
-	ret = io_populate_area_dma(ifq, area);
-	if (ret && !area->mem.is_dmabuf)
-		dma_unmap_sgtable(ifq->dev, &area->mem.page_sg_table,
-				  DMA_FROM_DEVICE, IO_DMA_ATTR);
-	if (ret == 0)
-		area->is_mapped = true;
-	return ret;
-}
-
 static void io_zcrx_sync_for_device(struct page_pool *pool,
 				    struct net_iov *niov)
 {
@@ -464,6 +450,7 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
 	ret = io_import_area(ifq, &area->mem, area_reg);
 	if (ret)
 		goto err;
+	area->is_mapped = true;
 
 	if (buf_size_shift > io_area_max_shift(&area->mem)) {
 		ret = -ERANGE;
@@ -499,6 +486,10 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
 		niov->type = NET_IOV_IOURING;
 	}
 
+	ret = io_populate_area_dma(ifq, area);
+	if (ret)
+		goto err;
+
 	area->free_count = nr_iovs;
 	/* we're only supporting one area per ifq for now */
 	area->area_id = 0;
@@ -1080,7 +1071,6 @@ static bool io_pp_zc_release_netmem(struct page_pool *pp, netmem_ref netmem)
 static int io_pp_zc_init(struct page_pool *pp)
 {
 	struct io_zcrx_ifq *ifq = io_pp_to_ifq(pp);
-	int ret;
 
 	if (WARN_ON_ONCE(!ifq))
 		return -EINVAL;
@@ -1093,10 +1083,6 @@ static int io_pp_zc_init(struct page_pool *pp)
 	if (pp->p.dma_dir != DMA_FROM_DEVICE)
 		return -EOPNOTSUPP;
 
-	ret = io_zcrx_map_area(ifq, ifq->area);
-	if (ret)
-		return ret;
-
 	refcount_inc(&ifq->refs);
 	return 0;
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH io_uring-7.1 04/16] io_uring/zcrx: extract netdev+area init into a helper
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (2 preceding siblings ...)
  2026-03-23 12:43 ` [PATCH io_uring-7.1 03/16] io_uring/zcrx: always dma map in advance Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 05/16] io_uring/zcrx: implement device-less mode for zcrx Pavel Begunkov
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

In preparation for the following patches, add a function that is
responsible for looking up a netdev, creating an area, DMA mapping it,
and opening a queue.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 72 +++++++++++++++++++++++++++++--------------------
 1 file changed, 43 insertions(+), 29 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 4e8064fc5561..d2798a82c678 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -751,10 +751,50 @@ static int import_zcrx(struct io_ring_ctx *ctx,
 	return ret;
 }
 
+static int zcrx_register_netdev(struct io_zcrx_ifq *ifq,
+				struct io_uring_zcrx_ifq_reg *reg,
+				struct io_uring_zcrx_area_reg *area)
+{
+	struct pp_memory_provider_params mp_param = {};
+	unsigned if_rxq = reg->if_rxq;
+	int ret;
+
+	ifq->netdev = netdev_get_by_index_lock(current->nsproxy->net_ns,
+						reg->if_idx);
+	if (!ifq->netdev)
+		return -ENODEV;
+
+	netdev_hold(ifq->netdev, &ifq->netdev_tracker, GFP_KERNEL);
+
+	ifq->dev = netdev_queue_get_dma_dev(ifq->netdev, if_rxq);
+	if (!ifq->dev) {
+		ret = -EOPNOTSUPP;
+		goto netdev_put_unlock;
+	}
+	get_device(ifq->dev);
+
+	ret = io_zcrx_create_area(ifq, area, reg);
+	if (ret)
+		goto netdev_put_unlock;
+
+	if (reg->rx_buf_len)
+		mp_param.rx_page_size = 1U << ifq->niov_shift;
+	mp_param.mp_ops = &io_uring_pp_zc_ops;
+	mp_param.mp_priv = ifq;
+	ret = __net_mp_open_rxq(ifq->netdev, if_rxq, &mp_param, NULL);
+	if (ret)
+		goto netdev_put_unlock;
+
+	ifq->if_rxq = if_rxq;
+	ret = 0;
+netdev_put_unlock:
+	netdev_unlock(ifq->netdev);
+	return ret;
+}
+
 int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 			  struct io_uring_zcrx_ifq_reg __user *arg)
 {
-	struct pp_memory_provider_params mp_param = {};
 	struct io_uring_zcrx_area_reg area;
 	struct io_uring_zcrx_ifq_reg reg;
 	struct io_uring_region_desc rd;
@@ -821,33 +861,9 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 	if (ret)
 		goto err;
 
-	ifq->netdev = netdev_get_by_index_lock(current->nsproxy->net_ns, reg.if_idx);
-	if (!ifq->netdev) {
-		ret = -ENODEV;
-		goto err;
-	}
-	netdev_hold(ifq->netdev, &ifq->netdev_tracker, GFP_KERNEL);
-
-	ifq->dev = netdev_queue_get_dma_dev(ifq->netdev, reg.if_rxq);
-	if (!ifq->dev) {
-		ret = -EOPNOTSUPP;
-		goto netdev_put_unlock;
-	}
-	get_device(ifq->dev);
-
-	ret = io_zcrx_create_area(ifq, &area, &reg);
-	if (ret)
-		goto netdev_put_unlock;
-
-	if (reg.rx_buf_len)
-		mp_param.rx_page_size = 1U << ifq->niov_shift;
-	mp_param.mp_ops = &io_uring_pp_zc_ops;
-	mp_param.mp_priv = ifq;
-	ret = __net_mp_open_rxq(ifq->netdev, reg.if_rxq, &mp_param, NULL);
+	ret = zcrx_register_netdev(ifq, &reg, &area);
 	if (ret)
-		goto netdev_put_unlock;
-	netdev_unlock(ifq->netdev);
-	ifq->if_rxq = reg.if_rxq;
+		goto err;
 
 	reg.zcrx_id = id;
 
@@ -867,8 +883,6 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 		goto err;
 	}
 	return 0;
-netdev_put_unlock:
-	netdev_unlock(ifq->netdev);
 err:
 	scoped_guard(mutex, &ctx->mmap_lock)
 		xa_erase(&ctx->zcrx_ctxs, id);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH io_uring-7.1 05/16] io_uring/zcrx: implement device-less mode for zcrx
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (3 preceding siblings ...)
  2026-03-23 12:43 ` [PATCH io_uring-7.1 04/16] io_uring/zcrx: extract netdev+area init into a helper Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 06/16] io_uring/zcrx: use better name for RQ region Pavel Begunkov
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Allow creating a zcrx instance without attaching it to a net device.
All data will be copied through the fallback path. The user is also
expected to use ZCRX_CTRL_FLUSH_RQ to handle overflows, as it normally
should even with a netdev, but it becomes even more relevant here since
there will likely be no one to automatically pick up buffers.

Apart from that, it follows the zcrx uapi for the I/O path, and it is
useful for testing, experimentation, and potentially, if improved, for
the copy receive path in the future.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/uapi/linux/io_uring/zcrx.h |  9 ++++++-
 io_uring/zcrx.c                    | 41 ++++++++++++++++++++----------
 io_uring/zcrx.h                    |  2 +-
 3 files changed, 36 insertions(+), 16 deletions(-)
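The new registration rule (a device-less instance must not name an
interface or an RX queue) can be sketched as a plain userspace check;
the struct and function here are simplified hypothetical mirrors of the
uapi, not its definition:

```c
#include <errno.h>
#include <stdint.h>

#define ZCRX_REG_NODEV	2u	/* flag value from the uapi header */

/* Hypothetical, trimmed-down mirror of the registration argument. */
struct zcrx_reg {
	uint32_t flags;
	uint32_t if_idx;
	int32_t  if_rxq;
	uint32_t rq_entries;
};

/*
 * Device-less mode has no netdev and no HW queue, so naming either
 * alongside ZCRX_REG_NODEV is rejected; the queue index is left at 0.
 */
static int zcrx_validate_reg(const struct zcrx_reg *reg)
{
	if (reg->if_rxq == -1 || !reg->rq_entries)
		return -EINVAL;
	if ((reg->if_rxq || reg->if_idx) && (reg->flags & ZCRX_REG_NODEV))
		return -EINVAL;
	return 0;
}
```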

diff --git a/include/uapi/linux/io_uring/zcrx.h b/include/uapi/linux/io_uring/zcrx.h
index 3163a4b8aeb0..103d65e690eb 100644
--- a/include/uapi/linux/io_uring/zcrx.h
+++ b/include/uapi/linux/io_uring/zcrx.h
@@ -49,7 +49,14 @@ struct io_uring_zcrx_area_reg {
 };
 
 enum zcrx_reg_flags {
-	ZCRX_REG_IMPORT	= 1,
+	ZCRX_REG_IMPORT		= 1,
+
+	/*
+	 * Register a zcrx instance without a net device. All data will be
+	 * copied. The refill queue entries might not be automatically
+	 * consumed and need to be flushed, see ZCRX_CTRL_FLUSH_RQ.
+	 */
+	ZCRX_REG_NODEV		= 2,
 };
 
 enum zcrx_features {
diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index d2798a82c678..d772e1609c4b 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -127,10 +127,10 @@ static int io_import_dmabuf(struct io_zcrx_ifq *ifq,
 	int dmabuf_fd = area_reg->dmabuf_fd;
 	int i, ret;
 
+	if (!ifq->dev)
+		return -EINVAL;
 	if (off)
 		return -EINVAL;
-	if (WARN_ON_ONCE(!ifq->dev))
-		return -EFAULT;
 	if (!IS_ENABLED(CONFIG_DMA_SHARED_BUFFER))
 		return -EINVAL;
 
@@ -211,11 +211,13 @@ static int io_import_umem(struct io_zcrx_ifq *ifq,
 	if (ret)
 		goto out_err;
 
-	ret = dma_map_sgtable(ifq->dev, &mem->page_sg_table,
-			      DMA_FROM_DEVICE, IO_DMA_ATTR);
-	if (ret < 0)
-		goto out_err;
-	mapped = true;
+	if (ifq->dev) {
+		ret = dma_map_sgtable(ifq->dev, &mem->page_sg_table,
+				      DMA_FROM_DEVICE, IO_DMA_ATTR);
+		if (ret < 0)
+			goto out_err;
+		mapped = true;
+	}
 
 	mem->account_pages = io_count_account_pages(pages, nr_pages);
 	ret = io_account_mem(ifq->user, ifq->mm_account, mem->account_pages);
@@ -450,7 +452,8 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
 	ret = io_import_area(ifq, &area->mem, area_reg);
 	if (ret)
 		goto err;
-	area->is_mapped = true;
+	if (ifq->dev)
+		area->is_mapped = true;
 
 	if (buf_size_shift > io_area_max_shift(&area->mem)) {
 		ret = -ERANGE;
@@ -486,9 +489,11 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
 		niov->type = NET_IOV_IOURING;
 	}
 
-	ret = io_populate_area_dma(ifq, area);
-	if (ret)
-		goto err;
+	if (ifq->dev) {
+		ret = io_populate_area_dma(ifq, area);
+		if (ret)
+			goto err;
+	}
 
 	area->free_count = nr_iovs;
 	/* we're only supporting one area per ifq for now */
@@ -826,6 +831,8 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 		return -EFAULT;
 	if (reg.if_rxq == -1 || !reg.rq_entries)
 		return -EINVAL;
+	if ((reg.if_rxq || reg.if_idx) && (reg.flags & ZCRX_REG_NODEV))
+		return -EINVAL;
 	if (reg.rq_entries > IO_RQ_MAX_ENTRIES) {
 		if (!(ctx->flags & IORING_SETUP_CLAMP))
 			return -EINVAL;
@@ -861,9 +868,15 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 	if (ret)
 		goto err;
 
-	ret = zcrx_register_netdev(ifq, &reg, &area);
-	if (ret)
-		goto err;
+	if (!(reg.flags & ZCRX_REG_NODEV)) {
+		ret = zcrx_register_netdev(ifq, &reg, &area);
+		if (ret)
+			goto err;
+	} else {
+		ret = io_zcrx_create_area(ifq, &area, &reg);
+		if (ret)
+			goto err;
+	}
 
 	reg.zcrx_id = id;
 
diff --git a/io_uring/zcrx.h b/io_uring/zcrx.h
index 0316a41a3561..f395656c3160 100644
--- a/io_uring/zcrx.h
+++ b/io_uring/zcrx.h
@@ -8,7 +8,7 @@
 #include <net/page_pool/types.h>
 #include <net/net_trackers.h>
 
-#define ZCRX_SUPPORTED_REG_FLAGS	(ZCRX_REG_IMPORT)
+#define ZCRX_SUPPORTED_REG_FLAGS	(ZCRX_REG_IMPORT | ZCRX_REG_NODEV)
 #define ZCRX_FEATURES			(ZCRX_FEATURE_RX_PAGE_SIZE)
 
 struct io_zcrx_mem {
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH io_uring-7.1 06/16] io_uring/zcrx: use better name for RQ region
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (4 preceding siblings ...)
  2026-03-23 12:43 ` [PATCH io_uring-7.1 05/16] io_uring/zcrx: implement device-less mode for zcrx Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 07/16] io_uring/zcrx: add a struct for refill queue Pavel Begunkov
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Rename "region" to "rq_region" to highlight that it's a refill queue
region.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 8 ++++----
 io_uring/zcrx.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index d772e1609c4b..f10df7750740 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -384,11 +384,11 @@ static int io_allocate_rbuf_ring(struct io_ring_ctx *ctx,
 	mmap_offset = IORING_MAP_OFF_ZCRX_REGION;
 	mmap_offset += id << IORING_OFF_PBUF_SHIFT;
 
-	ret = io_create_region(ctx, &ifq->region, rd, mmap_offset);
+	ret = io_create_region(ctx, &ifq->rq_region, rd, mmap_offset);
 	if (ret < 0)
 		return ret;
 
-	ptr = io_region_get_ptr(&ifq->region);
+	ptr = io_region_get_ptr(&ifq->rq_region);
 	ifq->rq_ring = (struct io_uring *)ptr;
 	ifq->rqes = (struct io_uring_zcrx_rqe *)(ptr + off);
 
@@ -397,7 +397,7 @@ static int io_allocate_rbuf_ring(struct io_ring_ctx *ctx,
 
 static void io_free_rbuf_ring(struct io_zcrx_ifq *ifq)
 {
-	io_free_region(ifq->user, &ifq->region);
+	io_free_region(ifq->user, &ifq->rq_region);
 	ifq->rq_ring = NULL;
 	ifq->rqes = NULL;
 }
@@ -645,7 +645,7 @@ struct io_mapped_region *io_zcrx_get_region(struct io_ring_ctx *ctx,
 
 	lockdep_assert_held(&ctx->mmap_lock);
 
-	return ifq ? &ifq->region : NULL;
+	return ifq ? &ifq->rq_region : NULL;
 }
 
 static int zcrx_box_release(struct inode *inode, struct file *file)
diff --git a/io_uring/zcrx.h b/io_uring/zcrx.h
index f395656c3160..3b2681a1fafd 100644
--- a/io_uring/zcrx.h
+++ b/io_uring/zcrx.h
@@ -66,7 +66,7 @@ struct io_zcrx_ifq {
 	 * net stack.
 	 */
 	struct mutex			pp_lock;
-	struct io_mapped_region		region;
+	struct io_mapped_region		rq_region;
 };
 
 #if defined(CONFIG_IO_URING_ZCRX)
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH io_uring-7.1 07/16] io_uring/zcrx: add a struct for refill queue
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (5 preceding siblings ...)
  2026-03-23 12:43 ` [PATCH io_uring-7.1 06/16] io_uring/zcrx: use better name for RQ region Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 08/16] io_uring/zcrx: use guards for locking Pavel Begunkov
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Add a new structure that keeps the refill queue state. It's cleaner and
will be useful once we introduce multiple refill queues.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 54 +++++++++++++++++++++++++------------------------
 io_uring/zcrx.h | 14 ++++++++-----
 2 files changed, 37 insertions(+), 31 deletions(-)
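The head/tail arithmetic the new struct carries can be shown in a
standalone sketch (simplified, with the kernel's memory barriers
omitted): head and tail are free-running u32 counters, so the entry
count is a plain unsigned subtraction that stays correct across
wraparound, and slots are addressed through a power-of-two mask.

```c
#include <stdint.h>

/* Simplified analogue of struct zcrx_rq, without locking/barriers. */
struct rq {
	uint32_t cached_head;	/* consumer's private copy of head */
	uint32_t tail;		/* producer-advanced tail */
	uint32_t nr_entries;	/* must be a power of two */
};

static uint32_t rq_entries(const struct rq *rq)
{
	/* unsigned subtraction handles u32 wraparound for free */
	uint32_t entries = rq->tail - rq->cached_head;

	/* clamp so a misbehaving producer can't claim more than the ring holds */
	return entries < rq->nr_entries ? entries : rq->nr_entries;
}

static uint32_t rq_next_idx(struct rq *rq)
{
	/* free-running counter masked down to a slot index */
	return rq->cached_head++ & (rq->nr_entries - 1);
}
```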

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index f10df7750740..0a5f8eab92c3 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -389,8 +389,8 @@ static int io_allocate_rbuf_ring(struct io_ring_ctx *ctx,
 		return ret;
 
 	ptr = io_region_get_ptr(&ifq->rq_region);
-	ifq->rq_ring = (struct io_uring *)ptr;
-	ifq->rqes = (struct io_uring_zcrx_rqe *)(ptr + off);
+	ifq->rq.ring = (struct io_uring *)ptr;
+	ifq->rq.rqes = (struct io_uring_zcrx_rqe *)(ptr + off);
 
 	return 0;
 }
@@ -398,8 +398,8 @@ static int io_allocate_rbuf_ring(struct io_ring_ctx *ctx,
 static void io_free_rbuf_ring(struct io_zcrx_ifq *ifq)
 {
 	io_free_region(ifq->user, &ifq->rq_region);
-	ifq->rq_ring = NULL;
-	ifq->rqes = NULL;
+	ifq->rq.ring = NULL;
+	ifq->rq.rqes = NULL;
 }
 
 static void io_zcrx_free_area(struct io_zcrx_ifq *ifq,
@@ -519,7 +519,7 @@ static struct io_zcrx_ifq *io_zcrx_ifq_alloc(struct io_ring_ctx *ctx)
 		return NULL;
 
 	ifq->if_rxq = -1;
-	spin_lock_init(&ifq->rq_lock);
+	spin_lock_init(&ifq->rq.lock);
 	mutex_init(&ifq->pp_lock);
 	refcount_set(&ifq->refs, 1);
 	refcount_set(&ifq->user_refs, 1);
@@ -855,7 +855,7 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 		mmgrab(ctx->mm_account);
 		ifq->mm_account = ctx->mm_account;
 	}
-	ifq->rq_entries = reg.rq_entries;
+	ifq->rq.nr_entries = reg.rq_entries;
 
 	scoped_guard(mutex, &ctx->mmap_lock) {
 		/* preallocate id */
@@ -969,20 +969,19 @@ void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
 	xa_destroy(&ctx->zcrx_ctxs);
 }
 
-static inline u32 io_zcrx_rqring_entries(struct io_zcrx_ifq *ifq)
+static inline u32 zcrx_rq_entries(struct zcrx_rq *rq)
 {
 	u32 entries;
 
-	entries = smp_load_acquire(&ifq->rq_ring->tail) - ifq->cached_rq_head;
-	return min(entries, ifq->rq_entries);
+	entries = smp_load_acquire(&rq->ring->tail) - rq->cached_head;
+	return min(entries, rq->nr_entries);
 }
 
-static struct io_uring_zcrx_rqe *io_zcrx_get_rqe(struct io_zcrx_ifq *ifq,
-						 unsigned mask)
+static struct io_uring_zcrx_rqe *zcrx_next_rqe(struct zcrx_rq *rq, unsigned mask)
 {
-	unsigned int idx = ifq->cached_rq_head++ & mask;
+	unsigned int idx = rq->cached_head++ & mask;
 
-	return &ifq->rqes[idx];
+	return &rq->rqes[idx];
 }
 
 static inline bool io_parse_rqe(struct io_uring_zcrx_rqe *rqe,
@@ -1011,18 +1010,19 @@ static inline bool io_parse_rqe(struct io_uring_zcrx_rqe *rqe,
 static void io_zcrx_ring_refill(struct page_pool *pp,
 				struct io_zcrx_ifq *ifq)
 {
-	unsigned int mask = ifq->rq_entries - 1;
+	struct zcrx_rq *rq = &ifq->rq;
+	unsigned int mask = rq->nr_entries - 1;
 	unsigned int entries;
 
-	guard(spinlock_bh)(&ifq->rq_lock);
+	guard(spinlock_bh)(&rq->lock);
 
-	entries = io_zcrx_rqring_entries(ifq);
+	entries = zcrx_rq_entries(rq);
 	entries = min_t(unsigned, entries, PP_ALLOC_CACHE_REFILL);
 	if (unlikely(!entries))
 		return;
 
 	do {
-		struct io_uring_zcrx_rqe *rqe = io_zcrx_get_rqe(ifq, mask);
+		struct io_uring_zcrx_rqe *rqe = zcrx_next_rqe(rq, mask);
 		struct net_iov *niov;
 		netmem_ref netmem;
 
@@ -1044,7 +1044,7 @@ static void io_zcrx_ring_refill(struct page_pool *pp,
 		net_mp_netmem_place_in_cache(pp, netmem);
 	} while (--entries);
 
-	smp_store_release(&ifq->rq_ring->head, ifq->cached_rq_head);
+	smp_store_release(&rq->ring->head, rq->cached_head);
 }
 
 static void io_zcrx_refill_slow(struct page_pool *pp, struct io_zcrx_ifq *ifq)
@@ -1157,14 +1157,14 @@ static const struct memory_provider_ops io_uring_pp_zc_ops = {
 };
 
 static unsigned zcrx_parse_rq(netmem_ref *netmem_array, unsigned nr,
-			      struct io_zcrx_ifq *zcrx)
+			      struct io_zcrx_ifq *zcrx, struct zcrx_rq *rq)
 {
-	unsigned int mask = zcrx->rq_entries - 1;
+	unsigned int mask = rq->nr_entries - 1;
 	unsigned int i;
 
-	nr = min(nr, io_zcrx_rqring_entries(zcrx));
+	nr = min(nr, zcrx_rq_entries(rq));
 	for (i = 0; i < nr; i++) {
-		struct io_uring_zcrx_rqe *rqe = io_zcrx_get_rqe(zcrx, mask);
+		struct io_uring_zcrx_rqe *rqe = zcrx_next_rqe(rq, mask);
 		struct net_iov *niov;
 
 		if (!io_parse_rqe(rqe, zcrx, &niov))
@@ -1172,7 +1172,7 @@ static unsigned zcrx_parse_rq(netmem_ref *netmem_array, unsigned nr,
 		netmem_array[i] = net_iov_to_netmem(niov);
 	}
 
-	smp_store_release(&zcrx->rq_ring->head, zcrx->cached_rq_head);
+	smp_store_release(&rq->ring->head, rq->cached_head);
 	return i;
 }
 
@@ -1206,8 +1206,10 @@ static int zcrx_flush_rq(struct io_ring_ctx *ctx, struct io_zcrx_ifq *zcrx,
 		return -EINVAL;
 
 	do {
-		scoped_guard(spinlock_bh, &zcrx->rq_lock) {
-			nr = zcrx_parse_rq(netmems, ZCRX_FLUSH_BATCH, zcrx);
+		struct zcrx_rq *rq = &zcrx->rq;
+
+		scoped_guard(spinlock_bh, &rq->lock) {
+			nr = zcrx_parse_rq(netmems, ZCRX_FLUSH_BATCH, zcrx, rq);
 			zcrx_return_buffers(netmems, nr);
 		}
 
@@ -1216,7 +1218,7 @@ static int zcrx_flush_rq(struct io_ring_ctx *ctx, struct io_zcrx_ifq *zcrx,
 		if (fatal_signal_pending(current))
 			break;
 		cond_resched();
-	} while (nr == ZCRX_FLUSH_BATCH && total < zcrx->rq_entries);
+	} while (nr == ZCRX_FLUSH_BATCH && total < zcrx->rq.nr_entries);
 
 	return 0;
 }
diff --git a/io_uring/zcrx.h b/io_uring/zcrx.h
index 3b2681a1fafd..893cd3708a06 100644
--- a/io_uring/zcrx.h
+++ b/io_uring/zcrx.h
@@ -41,17 +41,21 @@ struct io_zcrx_area {
 	struct io_zcrx_mem	mem;
 };
 
+struct zcrx_rq {
+	spinlock_t			lock;
+	struct io_uring			*ring;
+	struct io_uring_zcrx_rqe	*rqes;
+	u32				cached_head;
+	u32				nr_entries;
+};
+
 struct io_zcrx_ifq {
 	struct io_zcrx_area		*area;
 	unsigned			niov_shift;
 	struct user_struct		*user;
 	struct mm_struct		*mm_account;
 
-	spinlock_t			rq_lock ____cacheline_aligned_in_smp;
-	struct io_uring			*rq_ring;
-	struct io_uring_zcrx_rqe	*rqes;
-	u32				cached_rq_head;
-	u32				rq_entries;
+	struct zcrx_rq			rq ____cacheline_aligned_in_smp;
 
 	u32				if_rxq;
 	struct device			*dev;
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH io_uring-7.1 08/16] io_uring/zcrx: use guards for locking
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (6 preceding siblings ...)
  2026-03-23 12:43 ` [PATCH io_uring-7.1 07/16] io_uring/zcrx: add a struct for refill queue Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 09/16] io_uring/zcrx: move count check into zcrx_get_free_niov Pavel Begunkov
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Convert the last several places that still use manual locking over to
guards, to simplify the code.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 0a5f8eab92c3..db723644ddcb 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -586,9 +586,8 @@ static void io_zcrx_return_niov_freelist(struct net_iov *niov)
 {
 	struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);
 
-	spin_lock_bh(&area->freelist_lock);
+	guard(spinlock_bh)(&area->freelist_lock);
 	area->freelist[area->free_count++] = net_iov_idx(niov);
-	spin_unlock_bh(&area->freelist_lock);
 }
 
 static void io_zcrx_return_niov(struct net_iov *niov)
@@ -1051,7 +1050,8 @@ static void io_zcrx_refill_slow(struct page_pool *pp, struct io_zcrx_ifq *ifq)
 {
 	struct io_zcrx_area *area = ifq->area;
 
-	spin_lock_bh(&area->freelist_lock);
+	guard(spinlock_bh)(&area->freelist_lock);
+
 	while (area->free_count && pp->alloc.count < PP_ALLOC_CACHE_REFILL) {
 		struct net_iov *niov = __io_zcrx_get_free_niov(area);
 		netmem_ref netmem = net_iov_to_netmem(niov);
@@ -1060,7 +1060,6 @@ static void io_zcrx_refill_slow(struct page_pool *pp, struct io_zcrx_ifq *ifq)
 		io_zcrx_sync_for_device(pp, niov);
 		net_mp_netmem_place_in_cache(pp, netmem);
 	}
-	spin_unlock_bh(&area->freelist_lock);
 }
 
 static netmem_ref io_pp_zc_alloc_netmems(struct page_pool *pp, gfp_t gfp)
@@ -1283,10 +1282,10 @@ static struct net_iov *io_alloc_fallback_niov(struct io_zcrx_ifq *ifq)
 	if (area->mem.is_dmabuf)
 		return NULL;
 
-	spin_lock_bh(&area->freelist_lock);
-	if (area->free_count)
-		niov = __io_zcrx_get_free_niov(area);
-	spin_unlock_bh(&area->freelist_lock);
+	scoped_guard(spinlock_bh, &area->freelist_lock) {
+		if (area->free_count)
+			niov = __io_zcrx_get_free_niov(area);
+	}
 
 	if (niov)
 		page_pool_fragment_netmem(net_iov_to_netmem(niov), 1);
-- 
2.53.0



* [PATCH io_uring-7.1 09/16] io_uring/zcrx: move count check into zcrx_get_free_niov
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (7 preceding siblings ...)
  2026-03-23 12:43 ` [PATCH io_uring-7.1 08/16] io_uring/zcrx: use guards for locking Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 12:43 ` [PATCH io_uring-7.1 10/16] io_uring/zcrx: warn on alloc with non-empty pp cache Pavel Begunkov
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Instead of relying on the caller of __io_zcrx_get_free_niov() to check
that there are free niovs available (i.e. free_count > 0), move the
check into the function and return NULL if it can't allocate. This
consolidates the free count checks and will make it easier to extend
the niov free list allocator in the future.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index db723644ddcb..b4352c7b2d84 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -590,6 +590,19 @@ static void io_zcrx_return_niov_freelist(struct net_iov *niov)
 	area->freelist[area->free_count++] = net_iov_idx(niov);
 }
 
+static struct net_iov *zcrx_get_free_niov(struct io_zcrx_area *area)
+{
+	unsigned niov_idx;
+
+	lockdep_assert_held(&area->freelist_lock);
+
+	if (unlikely(!area->free_count))
+		return NULL;
+
+	niov_idx = area->freelist[--area->free_count];
+	return &area->nia.niovs[niov_idx];
+}
+
 static void io_zcrx_return_niov(struct net_iov *niov)
 {
 	netmem_ref netmem = net_iov_to_netmem(niov);
@@ -903,16 +916,6 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 	return ret;
 }
 
-static struct net_iov *__io_zcrx_get_free_niov(struct io_zcrx_area *area)
-{
-	unsigned niov_idx;
-
-	lockdep_assert_held(&area->freelist_lock);
-
-	niov_idx = area->freelist[--area->free_count];
-	return &area->nia.niovs[niov_idx];
-}
-
 static inline bool is_zcrx_entry_marked(struct io_ring_ctx *ctx, unsigned long id)
 {
 	return xa_get_mark(&ctx->zcrx_ctxs, id, XA_MARK_0);
@@ -1052,12 +1055,15 @@ static void io_zcrx_refill_slow(struct page_pool *pp, struct io_zcrx_ifq *ifq)
 
 	guard(spinlock_bh)(&area->freelist_lock);
 
-	while (area->free_count && pp->alloc.count < PP_ALLOC_CACHE_REFILL) {
-		struct net_iov *niov = __io_zcrx_get_free_niov(area);
-		netmem_ref netmem = net_iov_to_netmem(niov);
+	while (pp->alloc.count < PP_ALLOC_CACHE_REFILL) {
+		struct net_iov *niov = zcrx_get_free_niov(area);
+		netmem_ref netmem;
 
+		if (!niov)
+			break;
 		net_mp_niov_set_page_pool(pp, niov);
 		io_zcrx_sync_for_device(pp, niov);
+		netmem = net_iov_to_netmem(niov);
 		net_mp_netmem_place_in_cache(pp, netmem);
 	}
 }
@@ -1282,10 +1288,8 @@ static struct net_iov *io_alloc_fallback_niov(struct io_zcrx_ifq *ifq)
 	if (area->mem.is_dmabuf)
 		return NULL;
 
-	scoped_guard(spinlock_bh, &area->freelist_lock) {
-		if (area->free_count)
-			niov = __io_zcrx_get_free_niov(area);
-	}
+	scoped_guard(spinlock_bh, &area->freelist_lock)
+		niov = zcrx_get_free_niov(area);
 
 	if (niov)
 		page_pool_fragment_netmem(net_iov_to_netmem(niov), 1);
-- 
2.53.0



* [PATCH io_uring-7.1 10/16] io_uring/zcrx: warn on alloc with non-empty pp cache
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (8 preceding siblings ...)
  2026-03-23 12:43 ` [PATCH io_uring-7.1 09/16] io_uring/zcrx: move count check into zcrx_get_free_niov Pavel Begunkov
@ 2026-03-23 12:43 ` Pavel Begunkov
  2026-03-23 12:44 ` [PATCH io_uring-7.1 11/16] io_uring/zcrx: netmem array as refilling format Pavel Begunkov
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:43 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

The page pool ensures the cache is empty before asking us to refill it.
Warn if that assumption is violated.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index b4352c7b2d84..04718a3f2831 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -1073,8 +1073,8 @@ static netmem_ref io_pp_zc_alloc_netmems(struct page_pool *pp, gfp_t gfp)
 	struct io_zcrx_ifq *ifq = io_pp_to_ifq(pp);
 
 	/* pp should already be ensuring that */
-	if (unlikely(pp->alloc.count))
-		goto out_return;
+	if (WARN_ON_ONCE(pp->alloc.count))
+		return 0;
 
 	io_zcrx_ring_refill(pp, ifq);
 	if (likely(pp->alloc.count))
-- 
2.53.0



* [PATCH io_uring-7.1 11/16] io_uring/zcrx: netmem array as refilling format
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (9 preceding siblings ...)
  2026-03-23 12:43 ` [PATCH io_uring-7.1 10/16] io_uring/zcrx: warn on alloc with non-empty pp cache Pavel Begunkov
@ 2026-03-23 12:44 ` Pavel Begunkov
  2026-03-23 12:44 ` [PATCH io_uring-7.1 12/16] io_uring/zcrx: consolidate dma syncing Pavel Begunkov
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:44 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Instead of peeking into the page pool allocation cache directly or via
net_mp_netmem_place_in_cache(), pass a netmem array around. It's a
better intermediate format: for example, it can live on the stack so the
refilling code can be reused, and it decouples the code from page pools
a bit more.

For now the array still points directly into the page pool's cache, so
there are no additional copies. As the next step, we can change the
callback prototype to take the netmem array from the page pool.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 04718a3f2831..070b4941d001 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -1009,19 +1009,21 @@ static inline bool io_parse_rqe(struct io_uring_zcrx_rqe *rqe,
 	return true;
 }
 
-static void io_zcrx_ring_refill(struct page_pool *pp,
-				struct io_zcrx_ifq *ifq)
+static unsigned io_zcrx_ring_refill(struct page_pool *pp,
+				    struct io_zcrx_ifq *ifq,
+				    netmem_ref *netmems, unsigned to_alloc)
 {
 	struct zcrx_rq *rq = &ifq->rq;
 	unsigned int mask = rq->nr_entries - 1;
 	unsigned int entries;
+	unsigned allocated = 0;
 
 	guard(spinlock_bh)(&rq->lock);
 
 	entries = zcrx_rq_entries(rq);
-	entries = min_t(unsigned, entries, PP_ALLOC_CACHE_REFILL);
+	entries = min_t(unsigned, entries, to_alloc);
 	if (unlikely(!entries))
-		return;
+		return 0;
 
 	do {
 		struct io_uring_zcrx_rqe *rqe = zcrx_next_rqe(rq, mask);
@@ -1043,48 +1045,56 @@ static void io_zcrx_ring_refill(struct page_pool *pp,
 		}
 
 		io_zcrx_sync_for_device(pp, niov);
-		net_mp_netmem_place_in_cache(pp, netmem);
+		netmems[allocated] = netmem;
+		allocated++;
 	} while (--entries);
 
 	smp_store_release(&rq->ring->head, rq->cached_head);
+	return allocated;
 }
 
-static void io_zcrx_refill_slow(struct page_pool *pp, struct io_zcrx_ifq *ifq)
+static unsigned io_zcrx_refill_slow(struct page_pool *pp, struct io_zcrx_ifq *ifq,
+				    netmem_ref *netmems, unsigned to_alloc)
 {
 	struct io_zcrx_area *area = ifq->area;
+	unsigned allocated = 0;
 
 	guard(spinlock_bh)(&area->freelist_lock);
 
-	while (pp->alloc.count < PP_ALLOC_CACHE_REFILL) {
+	for (allocated = 0; allocated < to_alloc; allocated++) {
 		struct net_iov *niov = zcrx_get_free_niov(area);
-		netmem_ref netmem;
 
 		if (!niov)
 			break;
 		net_mp_niov_set_page_pool(pp, niov);
 		io_zcrx_sync_for_device(pp, niov);
-		netmem = net_iov_to_netmem(niov);
-		net_mp_netmem_place_in_cache(pp, netmem);
+		netmems[allocated] = net_iov_to_netmem(niov);
 	}
+	return allocated;
 }
 
 static netmem_ref io_pp_zc_alloc_netmems(struct page_pool *pp, gfp_t gfp)
 {
 	struct io_zcrx_ifq *ifq = io_pp_to_ifq(pp);
+	netmem_ref *netmems = pp->alloc.cache;
+	unsigned to_alloc = PP_ALLOC_CACHE_REFILL;
+	unsigned allocated;
 
 	/* pp should already be ensuring that */
 	if (WARN_ON_ONCE(pp->alloc.count))
 		return 0;
 
-	io_zcrx_ring_refill(pp, ifq);
-	if (likely(pp->alloc.count))
+	allocated = io_zcrx_ring_refill(pp, ifq, netmems, to_alloc);
+	if (likely(allocated))
 		goto out_return;
 
-	io_zcrx_refill_slow(pp, ifq);
-	if (!pp->alloc.count)
+	allocated = io_zcrx_refill_slow(pp, ifq, netmems, to_alloc);
+	if (!allocated)
 		return 0;
 out_return:
-	return pp->alloc.cache[--pp->alloc.count];
+	allocated--;
+	pp->alloc.count += allocated;
+	return netmems[allocated];
 }
 
 static bool io_pp_zc_release_netmem(struct page_pool *pp, netmem_ref netmem)
-- 
2.53.0



* [PATCH io_uring-7.1 12/16] io_uring/zcrx: consolidate dma syncing
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (10 preceding siblings ...)
  2026-03-23 12:44 ` [PATCH io_uring-7.1 11/16] io_uring/zcrx: netmem array as refiling format Pavel Begunkov
@ 2026-03-23 12:44 ` Pavel Begunkov
  2026-03-23 12:44 ` [PATCH io_uring-7.1 13/16] io_uring/zcrx: warn on a repeated area append Pavel Begunkov
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:44 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Split refilling into two steps: first allocate the niovs, then DMA sync
them. This way the DMA synchronisation code can be better optimised,
e.g. we don't need to call dma_dev_need_sync() for each and every niov,
and in the future we may also be able to coalesce syncs for adjacent
netmems.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 070b4941d001..bf3dd15678c9 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -300,21 +300,23 @@ static void io_zcrx_unmap_area(struct io_zcrx_ifq *ifq,
 	}
 }
 
-static void io_zcrx_sync_for_device(struct page_pool *pool,
-				    struct net_iov *niov)
+static void zcrx_sync_for_device(struct page_pool *pp, struct io_zcrx_ifq *zcrx,
+				 netmem_ref *netmems, unsigned nr)
 {
 #if defined(CONFIG_HAS_DMA) && defined(CONFIG_DMA_NEED_SYNC)
+	struct device *dev = pp->p.dev;
+	unsigned i, niov_size;
 	dma_addr_t dma_addr;
 
-	unsigned niov_size;
-
-	if (!dma_dev_need_sync(pool->p.dev))
+	if (!dma_dev_need_sync(dev))
 		return;
+	niov_size = 1U << zcrx->niov_shift;
 
-	niov_size = 1U << io_pp_to_ifq(pool)->niov_shift;
-	dma_addr = page_pool_get_dma_addr_netmem(net_iov_to_netmem(niov));
-	__dma_sync_single_for_device(pool->p.dev, dma_addr + pool->p.offset,
-				     niov_size, pool->p.dma_dir);
+	for (i = 0; i < nr; i++) {
+		dma_addr = page_pool_get_dma_addr_netmem(netmems[i]);
+		__dma_sync_single_for_device(dev, dma_addr + pp->p.offset,
+					     niov_size, pp->p.dma_dir);
+	}
 #endif
 }
 
@@ -1044,7 +1046,6 @@ static unsigned io_zcrx_ring_refill(struct page_pool *pp,
 			continue;
 		}
 
-		io_zcrx_sync_for_device(pp, niov);
 		netmems[allocated] = netmem;
 		allocated++;
 	} while (--entries);
@@ -1067,7 +1068,6 @@ static unsigned io_zcrx_refill_slow(struct page_pool *pp, struct io_zcrx_ifq *if
 		if (!niov)
 			break;
 		net_mp_niov_set_page_pool(pp, niov);
-		io_zcrx_sync_for_device(pp, niov);
 		netmems[allocated] = net_iov_to_netmem(niov);
 	}
 	return allocated;
@@ -1092,6 +1092,7 @@ static netmem_ref io_pp_zc_alloc_netmems(struct page_pool *pp, gfp_t gfp)
 	if (!allocated)
 		return 0;
 out_return:
+	zcrx_sync_for_device(pp, ifq, netmems, allocated);
 	allocated--;
 	pp->alloc.count += allocated;
 	return netmems[allocated];
-- 
2.53.0



* [PATCH io_uring-7.1 13/16] io_uring/zcrx: warn on a repeated area append
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (11 preceding siblings ...)
  2026-03-23 12:44 ` [PATCH io_uring-7.1 12/16] io_uring/zcrx: consolidate dma syncing Pavel Begunkov
@ 2026-03-23 12:44 ` Pavel Begunkov
  2026-03-23 12:44 ` [PATCH io_uring-7.1 14/16] io_uring/zcrx: cache fallback availability in zcrx ctx Pavel Begunkov
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:44 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

We only support a single area, so no path should be able to call
io_zcrx_append_area() twice. Warn if that happens instead of just
returning an error.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index bf3dd15678c9..265b3a744ac2 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -423,7 +423,7 @@ static void io_zcrx_free_area(struct io_zcrx_ifq *ifq,
 static int io_zcrx_append_area(struct io_zcrx_ifq *ifq,
 				struct io_zcrx_area *area)
 {
-	if (ifq->area)
+	if (WARN_ON_ONCE(ifq->area))
 		return -EINVAL;
 	ifq->area = area;
 	return 0;
-- 
2.53.0



* [PATCH io_uring-7.1 14/16] io_uring/zcrx: cache fallback availability in zcrx ctx
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (12 preceding siblings ...)
  2026-03-23 12:44 ` [PATCH io_uring-7.1 13/16] io_uring/zcrx: warn on a repeated area append Pavel Begunkov
@ 2026-03-23 12:44 ` Pavel Begunkov
  2026-03-23 12:44 ` [PATCH io_uring-7.1 15/16] io_uring/zcrx: check ctrl op payload struct sizes Pavel Begunkov
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:44 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Store a flag in struct io_zcrx_ifq that tells whether the backing memory
is normal-page or dmabuf based. Previously it was looked up from the
area; however, fallback allocation logically belongs to the zcrx ctx and
not to a particular area, and once more than one area is supported that
lookup would become a mess.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 9 ++++++++-
 io_uring/zcrx.h | 1 +
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 265b3a744ac2..d6475f95b815 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -423,8 +423,13 @@ static void io_zcrx_free_area(struct io_zcrx_ifq *ifq,
 static int io_zcrx_append_area(struct io_zcrx_ifq *ifq,
 				struct io_zcrx_area *area)
 {
+	bool kern_readable = !area->mem.is_dmabuf;
+
 	if (WARN_ON_ONCE(ifq->area))
 		return -EINVAL;
+	if (WARN_ON_ONCE(ifq->kern_readable != kern_readable))
+		return -EINVAL;
+
 	ifq->area = area;
 	return 0;
 }
@@ -882,6 +887,8 @@ int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 	if (ret)
 		goto err;
 
+	ifq->kern_readable = !(area.flags & IORING_ZCRX_AREA_DMABUF);
+
 	if (!(reg.flags & ZCRX_REG_NODEV)) {
 		ret = zcrx_register_netdev(ifq, &reg, &area);
 		if (ret)
@@ -1296,7 +1303,7 @@ static struct net_iov *io_alloc_fallback_niov(struct io_zcrx_ifq *ifq)
 	struct io_zcrx_area *area = ifq->area;
 	struct net_iov *niov = NULL;
 
-	if (area->mem.is_dmabuf)
+	if (!ifq->kern_readable)
 		return NULL;
 
 	scoped_guard(spinlock_bh, &area->freelist_lock)
diff --git a/io_uring/zcrx.h b/io_uring/zcrx.h
index 893cd3708a06..3e07238a4eb0 100644
--- a/io_uring/zcrx.h
+++ b/io_uring/zcrx.h
@@ -54,6 +54,7 @@ struct io_zcrx_ifq {
 	unsigned			niov_shift;
 	struct user_struct		*user;
 	struct mm_struct		*mm_account;
+	bool				kern_readable;
 
 	struct zcrx_rq			rq ____cacheline_aligned_in_smp;
 
-- 
2.53.0



* [PATCH io_uring-7.1 15/16] io_uring/zcrx: check ctrl op payload struct sizes
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (13 preceding siblings ...)
  2026-03-23 12:44 ` [PATCH io_uring-7.1 14/16] io_uring/zcrx: cache fallback availability in zcrx ctx Pavel Begunkov
@ 2026-03-23 12:44 ` Pavel Begunkov
  2026-03-23 12:44 ` [PATCH io_uring-7.1 16/16] io_uring/zcrx: rename zcrx [un]register functions Pavel Begunkov
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:44 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Add a build-time check that the ctrl op payloads are all the same size
and don't grow struct zcrx_ctrl.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/zcrx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index d6475f95b815..620482cdb083 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -1251,6 +1251,8 @@ int io_zcrx_ctrl(struct io_ring_ctx *ctx, void __user *arg, unsigned nr_args)
 	struct zcrx_ctrl ctrl;
 	struct io_zcrx_ifq *zcrx;
 
+	BUILD_BUG_ON(sizeof(ctrl.zc_export) != sizeof(ctrl.zc_flush));
+
 	if (nr_args)
 		return -EINVAL;
 	if (copy_from_user(&ctrl, arg, sizeof(ctrl)))
-- 
2.53.0



* [PATCH io_uring-7.1 16/16] io_uring/zcrx: rename zcrx [un]register functions
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (14 preceding siblings ...)
  2026-03-23 12:44 ` [PATCH io_uring-7.1 15/16] io_uring/zcrx: check ctrl op payload struct sizes Pavel Begunkov
@ 2026-03-23 12:44 ` Pavel Begunkov
  2026-03-23 22:32 ` [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Jakub Kicinski
  2026-03-23 22:34 ` Jens Axboe
  17 siblings, 0 replies; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 12:44 UTC (permalink / raw)
  To: io-uring; +Cc: asml.silence, axboe, netdev

Drop "ifqs" from the function names: "ifq" refers to an interface queue,
and there might not be one once the device-less mode is introduced.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/io_uring.c |  2 +-
 io_uring/register.c |  2 +-
 io_uring/zcrx.c     |  6 +++---
 io_uring/zcrx.h     | 10 +++++-----
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 34104c256c88..16122f877aed 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2156,7 +2156,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 	mutex_lock(&ctx->uring_lock);
 	io_sqe_buffers_unregister(ctx);
 	io_sqe_files_unregister(ctx);
-	io_unregister_zcrx_ifqs(ctx);
+	io_unregister_zcrx(ctx);
 	io_cqring_overflow_kill(ctx);
 	io_eventfd_unregister(ctx);
 	io_free_alloc_caches(ctx);
diff --git a/io_uring/register.c b/io_uring/register.c
index 489a6feaf228..35432471a550 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -900,7 +900,7 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 		ret = -EINVAL;
 		if (!arg || nr_args != 1)
 			break;
-		ret = io_register_zcrx_ifq(ctx, arg);
+		ret = io_register_zcrx(ctx, arg);
 		break;
 	case IORING_REGISTER_RESIZE_RINGS:
 		ret = -EINVAL;
diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 620482cdb083..c2f4fd93b928 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -816,8 +816,8 @@ static int zcrx_register_netdev(struct io_zcrx_ifq *ifq,
 	return ret;
 }
 
-int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
-			  struct io_uring_zcrx_ifq_reg __user *arg)
+int io_register_zcrx(struct io_ring_ctx *ctx,
+		     struct io_uring_zcrx_ifq_reg __user *arg)
 {
 	struct io_uring_zcrx_area_reg area;
 	struct io_uring_zcrx_ifq_reg reg;
@@ -955,7 +955,7 @@ void io_terminate_zcrx(struct io_ring_ctx *ctx)
 	}
 }
 
-void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
+void io_unregister_zcrx(struct io_ring_ctx *ctx)
 {
 	struct io_zcrx_ifq *ifq;
 
diff --git a/io_uring/zcrx.h b/io_uring/zcrx.h
index 3e07238a4eb0..75e0a4e6ef6e 100644
--- a/io_uring/zcrx.h
+++ b/io_uring/zcrx.h
@@ -76,9 +76,9 @@ struct io_zcrx_ifq {
 
 #if defined(CONFIG_IO_URING_ZCRX)
 int io_zcrx_ctrl(struct io_ring_ctx *ctx, void __user *arg, unsigned nr_arg);
-int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
+int io_register_zcrx(struct io_ring_ctx *ctx,
 			 struct io_uring_zcrx_ifq_reg __user *arg);
-void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx);
+void io_unregister_zcrx(struct io_ring_ctx *ctx);
 void io_terminate_zcrx(struct io_ring_ctx *ctx);
 int io_zcrx_recv(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
 		 struct socket *sock, unsigned int flags,
@@ -86,12 +86,12 @@ int io_zcrx_recv(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
 struct io_mapped_region *io_zcrx_get_region(struct io_ring_ctx *ctx,
 					    unsigned int id);
 #else
-static inline int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
-					struct io_uring_zcrx_ifq_reg __user *arg)
+static inline int io_register_zcrx(struct io_ring_ctx *ctx,
+				   struct io_uring_zcrx_ifq_reg __user *arg)
 {
 	return -EOPNOTSUPP;
 }
-static inline void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
+static inline void io_unregister_zcrx(struct io_ring_ctx *ctx)
 {
 }
 static inline void io_terminate_zcrx(struct io_ring_ctx *ctx)
-- 
2.53.0



* Re: [PATCH io_uring-7.1 01/16] io_uring/zcrx: return back two step unregistration
  2026-03-23 12:43 ` [PATCH io_uring-7.1 01/16] io_uring/zcrx: return back two step unregistration Pavel Begunkov
@ 2026-03-23 15:01   ` Jens Axboe
  2026-03-23 16:14     ` Pavel Begunkov
  0 siblings, 1 reply; 22+ messages in thread
From: Jens Axboe @ 2026-03-23 15:01 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: netdev, Youngmin Choi

On 3/23/26 6:43 AM, Pavel Begunkov wrote:
> @@ -898,12 +933,15 @@ void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
>  			unsigned long id = 0;
>  
>  			ifq = xa_find(&ctx->zcrx_ctxs, &id, ULONG_MAX, XA_PRESENT);
> -			if (ifq)
> +			if (ifq) {
> +				if (WARN_ON_ONCE(!is_zcrx_entry_marked(ctx, id)))
> +					break;

This break is inside the scoped_guard(), does this need an ifq = NULL
here? I do like scoped locking, but this seems a bit tricky...

-- 
Jens Axboe


* Re: [PATCH io_uring-7.1 01/16] io_uring/zcrx: return back two step unregistration
  2026-03-23 15:01   ` Jens Axboe
@ 2026-03-23 16:14     ` Pavel Begunkov
  2026-03-23 16:44       ` Jens Axboe
  0 siblings, 1 reply; 22+ messages in thread
From: Pavel Begunkov @ 2026-03-23 16:14 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: netdev, Youngmin Choi

On 3/23/26 15:01, Jens Axboe wrote:
> On 3/23/26 6:43 AM, Pavel Begunkov wrote:
>> @@ -898,12 +933,15 @@ void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
>>   			unsigned long id = 0;
>>   
>>   			ifq = xa_find(&ctx->zcrx_ctxs, &id, ULONG_MAX, XA_PRESENT);
>> -			if (ifq)
>> +			if (ifq) {
>> +				if (WARN_ON_ONCE(!is_zcrx_entry_marked(ctx, id)))
>> +					break;
> 
> This break is inside the scoped_guard(), does this need an ifq = NULL
> here? I do like scoped locking, but this seems a bit tricky...

That should work, want me to resend or would you amend it? It's a good
thing I was pointed at it, but I'm not too concerned about this case as
it's a warn once.

-- 
Pavel Begunkov



* Re: [PATCH io_uring-7.1 01/16] io_uring/zcrx: return back two step unregistration
  2026-03-23 16:14     ` Pavel Begunkov
@ 2026-03-23 16:44       ` Jens Axboe
  0 siblings, 0 replies; 22+ messages in thread
From: Jens Axboe @ 2026-03-23 16:44 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: netdev, Youngmin Choi

On 3/23/26 10:14 AM, Pavel Begunkov wrote:
> On 3/23/26 15:01, Jens Axboe wrote:
>> On 3/23/26 6:43 AM, Pavel Begunkov wrote:
>>> @@ -898,12 +933,15 @@ void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
>>>               unsigned long id = 0;
>>>                 ifq = xa_find(&ctx->zcrx_ctxs, &id, ULONG_MAX, XA_PRESENT);
>>> -            if (ifq)
>>> +            if (ifq) {
>>> +                if (WARN_ON_ONCE(!is_zcrx_entry_marked(ctx, id)))
>>> +                    break;
>>
>> This break is inside the scoped_guard(), does this need an ifq = NULL
>> here? I do like scoped locking, but this seems a bit tricky...
> 
> That should work, want me to resend or would you amend it? It's a good
> thing I was pointed at it, but I'm not too concerned about this case as
> it's a warn once.

I can add it and add a note. Outside of that, I think the series looks
fine, no further comments.

-- 
Jens Axboe



* Re: [PATCH io_uring-7.1 00/16] zcrx update for-7.1
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (15 preceding siblings ...)
  2026-03-23 12:44 ` [PATCH io_uring-7.1 16/16] io_uring/zcrx: rename zcrx [un]register functions Pavel Begunkov
@ 2026-03-23 22:32 ` Jakub Kicinski
  2026-03-23 22:34 ` Jens Axboe
  17 siblings, 0 replies; 22+ messages in thread
From: Jakub Kicinski @ 2026-03-23 22:32 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: io-uring, axboe, netdev

On Mon, 23 Mar 2026 12:43:49 +0000 Pavel Begunkov wrote:
> The series mostly consists of cleanups and preparation patches. Patch 1
> tries to close the interface queue earlier, at the start of
> io_ring_exit_work(), as there are reports of io_uring quiesce taking too
> long and leading to failures on attempts to reuse a queue. Patch 5
> introduces a device-less mode, where there is only the copy fallback and
> no dma/devices/page_pool/etc. Patches 11-12 start moving the memory
> provider API in the direction of passing netmem arrays instead of
> working with the pp directly, which was suggested before.

LGTM, FWIW.


* Re: [PATCH io_uring-7.1 00/16] zcrx update for-7.1
  2026-03-23 12:43 [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Pavel Begunkov
                   ` (16 preceding siblings ...)
  2026-03-23 22:32 ` [PATCH io_uring-7.1 00/16] zcrx update for-7.1 Jakub Kicinski
@ 2026-03-23 22:34 ` Jens Axboe
  17 siblings, 0 replies; 22+ messages in thread
From: Jens Axboe @ 2026-03-23 22:34 UTC (permalink / raw)
  To: io-uring, Pavel Begunkov; +Cc: netdev


On Mon, 23 Mar 2026 12:43:49 +0000, Pavel Begunkov wrote:
> The series mostly consists of cleanups and preparation patches. Patch 1
> tries to close the interface queue earlier, at the start of
> io_ring_exit_work(), as there are reports of io_uring quiesce taking too
> long and leading to failures on attempts to reuse a queue. Patch 5
> introduces a device-less mode, where there is only the copy fallback and
> no dma/devices/page_pool/etc. Patches 11-12 start moving the memory
> provider API in the direction of passing netmem arrays instead of
> working with the pp directly, which was suggested before.
> 
> [...]

Applied, thanks!

[01/16] io_uring/zcrx: return back two step unregistration
        commit: fda90d43f4fac7c0ee56a71c5a9a563bd57dcd96
[02/16] io_uring/zcrx: fully clean area on error in io_import_umem()
        commit: 234fe7bc53d8b2b37bf26a1392020e5b7b58c7d1
[03/16] io_uring/zcrx: always dma map in advance
        commit: 8c0cab0b7bf768594e8efc73f7b8f3d5abeb74f1
[04/16] io_uring/zcrx: extract netdev+area init into a helper
        commit: 80a4144de4e1cc8faeea700fb5a6e6ccc8aa02be
[05/16] io_uring/zcrx: implement device-less mode for zcrx
        commit: c11728021d5cdf8d99a5b127ec21d957d93e2d6c
[06/16] io_uring/zcrx: use better name for RQ region
        commit: 3bb8e0665fd7497e325ef799f945eb9e70186476
[07/16] io_uring/zcrx: add a struct for refill queue
        commit: 161399f0a7414e6b1f09cc76bc1067816bb04ad4
[08/16] io_uring/zcrx: use guards for locking
        commit: a5da6e340ccf62d2672ea90a400a4a66bd13205a
[09/16] io_uring/zcrx: move count check into zcrx_get_free_niov
        commit: ac02a64c479af1ab85b5c31b82345c1c9b6016d1
[10/16] io_uring/zcrx: warn on alloc with non-empty pp cache
        commit: 072237bd1a919545a1c174cd6171a5ee8e709096
[11/16] io_uring/zcrx: netmem array as refilling format
        commit: f3e6e4b057a8e1d4913d92f564c80c3bdd5dab55
[12/16] io_uring/zcrx: consolidate dma syncing
        commit: 2bd8e5066fde4ca5f9f382676ffa830c0e2803fd
[13/16] io_uring/zcrx: warn on a repeated area append
        commit: d2df9b6808abcc46cec4122457693001436e06e7
[14/16] io_uring/zcrx: cache fallback availability in zcrx ctx
        commit: edec451ccfce61291588163f2f8f7e9ed46bb119
[15/16] io_uring/zcrx: check ctrl op payload struct sizes
        commit: 49105528107676a49e5d5a50fa865781986a7c61
[16/16] io_uring/zcrx: rename zcrx [un]register functions
        commit: 623a6d44981f78d7f3391a59d62ae8b55f694850

Best regards,
-- 
Jens Axboe



