public inbox for io-uring@vger.kernel.org
From: Pavel Begunkov <asml.silence@gmail.com>
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com, axboe@kernel.dk, netdev@vger.kernel.org,
	Youngmin Choi <youngminchoi94@gmail.com>
Subject: [PATCH io_uring-7.1 01/16] io_uring/zcrx: return back two step unregistration
Date: Mon, 23 Mar 2026 12:43:50 +0000	[thread overview]
Message-ID: <0ce21f0565ab4358668922a28a8a36922dfebf76.1774261953.git.asml.silence@gmail.com> (raw)
In-Reply-To: <cover.1774261953.git.asml.silence@gmail.com>

There are reports that io_uring instance removal can take too long and
that an ifq reallocation by another zcrx instance then fails. Split zcrx
destruction back into two steps, similarly to how it was done before:
first close the queue early while keeping the zcrx instance alive, and
then, once all inflight requests have completed, drop the main zcrx
reference. For extra protection, mark terminated zcrx instances in the
xarray and warn if one is put twice.

Cc: stable@vger.kernel.org # 6.19+
Link: https://github.com/axboe/liburing/issues/1550
Reported-by: Youngmin Choi <youngminchoi94@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/io_uring.c |  4 ++++
 io_uring/zcrx.c     | 44 +++++++++++++++++++++++++++++++++++++++++---
 io_uring/zcrx.h     |  4 ++++
 3 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 6eaa21e09469..34104c256c88 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2308,6 +2308,10 @@ static __cold void io_ring_exit_work(struct work_struct *work)
 	struct io_tctx_node *node;
 	int ret;
 
+	mutex_lock(&ctx->uring_lock);
+	io_terminate_zcrx(ctx);
+	mutex_unlock(&ctx->uring_lock);
+
 	/*
 	 * If we're doing polled IO and end up having requests being
 	 * submitted async (out-of-line), then completions can come in while
diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index 73fa82759771..8c76c174380d 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -624,12 +624,17 @@ static void io_zcrx_scrub(struct io_zcrx_ifq *ifq)
 	}
 }
 
-static void zcrx_unregister(struct io_zcrx_ifq *ifq)
+static void zcrx_unregister_user(struct io_zcrx_ifq *ifq)
 {
 	if (refcount_dec_and_test(&ifq->user_refs)) {
 		io_close_queue(ifq);
 		io_zcrx_scrub(ifq);
 	}
+}
+
+static void zcrx_unregister(struct io_zcrx_ifq *ifq)
+{
+	zcrx_unregister_user(ifq);
 	io_put_zcrx_ifq(ifq);
 }
 
@@ -887,6 +892,36 @@ static struct net_iov *__io_zcrx_get_free_niov(struct io_zcrx_area *area)
 	return &area->nia.niovs[niov_idx];
 }
 
+static inline bool is_zcrx_entry_marked(struct io_ring_ctx *ctx, unsigned long id)
+{
+	return xa_get_mark(&ctx->zcrx_ctxs, id, XA_MARK_0);
+}
+
+static inline void set_zcrx_entry_mark(struct io_ring_ctx *ctx, unsigned long id)
+{
+	xa_set_mark(&ctx->zcrx_ctxs, id, XA_MARK_0);
+}
+
+void io_terminate_zcrx(struct io_ring_ctx *ctx)
+{
+	struct io_zcrx_ifq *ifq;
+	unsigned long id = 0;
+
+	lockdep_assert_held(&ctx->uring_lock);
+
+	while (1) {
+		scoped_guard(mutex, &ctx->mmap_lock)
+			ifq = xa_find(&ctx->zcrx_ctxs, &id, ULONG_MAX, XA_PRESENT);
+		if (!ifq)
+			break;
+		if (WARN_ON_ONCE(is_zcrx_entry_marked(ctx, id)))
+			break;
+		set_zcrx_entry_mark(ctx, id);
+		id++;
+		zcrx_unregister_user(ifq);
+	}
+}
+
 void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
 {
 	struct io_zcrx_ifq *ifq;
@@ -898,12 +933,15 @@ void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
 			unsigned long id = 0;
 
 			ifq = xa_find(&ctx->zcrx_ctxs, &id, ULONG_MAX, XA_PRESENT);
-			if (ifq)
+			if (ifq) {
+				if (WARN_ON_ONCE(!is_zcrx_entry_marked(ctx, id)))
+					break;
 				xa_erase(&ctx->zcrx_ctxs, id);
+			}
 		}
 		if (!ifq)
 			break;
-		zcrx_unregister(ifq);
+		io_put_zcrx_ifq(ifq);
 	}
 
 	xa_destroy(&ctx->zcrx_ctxs);
diff --git a/io_uring/zcrx.h b/io_uring/zcrx.h
index 0ddcf0ee8861..0316a41a3561 100644
--- a/io_uring/zcrx.h
+++ b/io_uring/zcrx.h
@@ -74,6 +74,7 @@ int io_zcrx_ctrl(struct io_ring_ctx *ctx, void __user *arg, unsigned nr_arg);
 int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 			 struct io_uring_zcrx_ifq_reg __user *arg);
 void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx);
+void io_terminate_zcrx(struct io_ring_ctx *ctx);
 int io_zcrx_recv(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
 		 struct socket *sock, unsigned int flags,
 		 unsigned issue_flags, unsigned int *len);
@@ -88,6 +89,9 @@ static inline int io_register_zcrx_ifq(struct io_ring_ctx *ctx,
 static inline void io_unregister_zcrx_ifqs(struct io_ring_ctx *ctx)
 {
 }
+static inline void io_terminate_zcrx(struct io_ring_ctx *ctx)
+{
+}
 static inline int io_zcrx_recv(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
 			       struct socket *sock, unsigned int flags,
 			       unsigned issue_flags, unsigned int *len)
-- 
2.53.0


