* [PATCH 0/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
@ 2025-09-03 3:26 Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 1/4] io_uring: don't include filetable.h in io_uring.h Caleb Sander Mateos
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Caleb Sander Mateos @ 2025-09-03 3:26 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, linux-kernel, Caleb Sander Mateos
As far as I can tell, setting IORING_SETUP_SINGLE_ISSUER when creating
an io_uring doesn't actually enable any additional optimizations (aside
from being a requirement for IORING_SETUP_DEFER_TASKRUN). This series
leverages IORING_SETUP_SINGLE_ISSUER's guarantee that only one task
submits SQEs to skip taking the uring_lock mutex in the submission and
task work paths.
First, we need to close a hole in the IORING_SETUP_SINGLE_ISSUER checks
where IORING_REGISTER_CLONE_BUFFERS only checks whether the thread is
allowed to access one of the two io_urings. It assumes the uring_lock
will prevent concurrent access to the other io_uring, but this will no
longer be the case after the optimization to skip taking uring_lock.
We also need to remove the unused filetable.h #include from io_uring.h
to avoid an #include cycle.
Caleb Sander Mateos (4):
io_uring: don't include filetable.h in io_uring.h
io_uring/rsrc: respect submitter_task in io_register_clone_buffers()
io_uring: factor out uring_lock helpers
io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
io_uring/cancel.c | 1 +
io_uring/fdinfo.c | 2 +-
io_uring/filetable.c | 3 ++-
io_uring/io_uring.c | 58 +++++++++++++++++++++++++++-----------------
io_uring/io_uring.h | 43 ++++++++++++++++++++++++++------
io_uring/kbuf.c | 6 ++---
io_uring/net.c | 1 +
io_uring/notif.c | 5 ++--
io_uring/notif.h | 3 ++-
io_uring/openclose.c | 1 +
io_uring/poll.c | 2 +-
io_uring/register.c | 1 +
io_uring/rsrc.c | 10 +++++++-
io_uring/rsrc.h | 3 ++-
io_uring/rw.c | 3 ++-
io_uring/splice.c | 1 +
io_uring/waitid.c | 2 +-
17 files changed, 102 insertions(+), 43 deletions(-)
--
2.45.2
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/4] io_uring: don't include filetable.h in io_uring.h
2025-09-03 3:26 [PATCH 0/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
@ 2025-09-03 3:26 ` Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 2/4] io_uring/rsrc: respect submitter_task in io_register_clone_buffers() Caleb Sander Mateos
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Caleb Sander Mateos @ 2025-09-03 3:26 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, linux-kernel, Caleb Sander Mateos
io_uring/io_uring.h doesn't use anything declared in
io_uring/filetable.h, so drop the unnecessary #include. Add filetable.h
includes in .c files previously relying on the transitive include from
io_uring.h.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
---
io_uring/cancel.c | 1 +
io_uring/fdinfo.c | 2 +-
io_uring/io_uring.c | 1 +
io_uring/io_uring.h | 1 -
io_uring/net.c | 1 +
io_uring/openclose.c | 1 +
io_uring/register.c | 1 +
io_uring/rsrc.c | 1 +
io_uring/rw.c | 1 +
io_uring/splice.c | 1 +
10 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/io_uring/cancel.c b/io_uring/cancel.c
index 6d57602304df..64b51e82baa2 100644
--- a/io_uring/cancel.c
+++ b/io_uring/cancel.c
@@ -9,10 +9,11 @@
#include <linux/nospec.h>
#include <linux/io_uring.h>
#include <uapi/linux/io_uring.h>
+#include "filetable.h"
#include "io_uring.h"
#include "tctx.h"
#include "poll.h"
#include "timeout.h"
#include "waitid.h"
diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
index 5c7339838769..ff3364531c77 100644
--- a/io_uring/fdinfo.c
+++ b/io_uring/fdinfo.c
@@ -7,11 +7,11 @@
#include <linux/seq_file.h>
#include <linux/io_uring.h>
#include <uapi/linux/io_uring.h>
-#include "io_uring.h"
+#include "filetable.h"
#include "sqpoll.h"
#include "fdinfo.h"
#include "cancel.h"
#include "rsrc.h"
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 545a7d5eefec..9c1190b19adf 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -77,10 +77,11 @@
#include <uapi/linux/io_uring.h>
#include "io-wq.h"
+#include "filetable.h"
#include "io_uring.h"
#include "opdef.h"
#include "refs.h"
#include "tctx.h"
#include "register.h"
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index fa8a66b34d4e..d62b7d9fafed 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -9,11 +9,10 @@
#include <linux/io_uring_types.h>
#include <uapi/linux/eventpoll.h>
#include "alloc_cache.h"
#include "io-wq.h"
#include "slist.h"
-#include "filetable.h"
#include "opdef.h"
#ifndef CREATE_TRACE_POINTS
#include <trace/events/io_uring.h>
#endif
diff --git a/io_uring/net.c b/io_uring/net.c
index d2ca49ceb79d..cf4bf4a2264b 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -8,10 +8,11 @@
#include <net/compat.h>
#include <linux/io_uring.h>
#include <uapi/linux/io_uring.h>
+#include "filetable.h"
#include "io_uring.h"
#include "kbuf.h"
#include "alloc_cache.h"
#include "net.h"
#include "notif.h"
diff --git a/io_uring/openclose.c b/io_uring/openclose.c
index d70700e5cef8..bfeb91b31bba 100644
--- a/io_uring/openclose.c
+++ b/io_uring/openclose.c
@@ -12,10 +12,11 @@
#include <uapi/linux/io_uring.h>
#include "../fs/internal.h"
+#include "filetable.h"
#include "io_uring.h"
#include "rsrc.h"
#include "openclose.h"
struct io_open {
diff --git a/io_uring/register.c b/io_uring/register.c
index aa5f56ad8358..5e493917a1a8 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -16,10 +16,11 @@
#include <linux/nospec.h>
#include <linux/compat.h>
#include <linux/io_uring.h>
#include <linux/io_uring_types.h>
+#include "filetable.h"
#include "io_uring.h"
#include "opdef.h"
#include "tctx.h"
#include "rsrc.h"
#include "sqpoll.h"
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index f75f5e43fa4a..2d15b8785a95 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -11,10 +11,11 @@
#include <linux/io_uring.h>
#include <linux/io_uring/cmd.h>
#include <uapi/linux/io_uring.h>
+#include "filetable.h"
#include "io_uring.h"
#include "openclose.h"
#include "rsrc.h"
#include "memmap.h"
#include "register.h"
diff --git a/io_uring/rw.c b/io_uring/rw.c
index dcde5bb7421a..ab6b4afccec3 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -13,10 +13,11 @@
#include <linux/io_uring/cmd.h>
#include <linux/indirect_call_wrapper.h>
#include <uapi/linux/io_uring.h>
+#include "filetable.h"
#include "io_uring.h"
#include "opdef.h"
#include "kbuf.h"
#include "alloc_cache.h"
#include "rsrc.h"
diff --git a/io_uring/splice.c b/io_uring/splice.c
index 35ce4e60b495..e81ebbb91925 100644
--- a/io_uring/splice.c
+++ b/io_uring/splice.c
@@ -9,10 +9,11 @@
#include <linux/io_uring.h>
#include <linux/splice.h>
#include <uapi/linux/io_uring.h>
+#include "filetable.h"
#include "io_uring.h"
#include "splice.h"
struct io_splice {
struct file *file_out;
--
2.45.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/4] io_uring/rsrc: respect submitter_task in io_register_clone_buffers()
2025-09-03 3:26 [PATCH 0/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 1/4] io_uring: don't include filetable.h in io_uring.h Caleb Sander Mateos
@ 2025-09-03 3:26 ` Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 3/4] io_uring: factor out uring_lock helpers Caleb Sander Mateos
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Caleb Sander Mateos @ 2025-09-03 3:26 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, linux-kernel, Caleb Sander Mateos
io_ring_ctx's enabled with IORING_SETUP_SINGLE_ISSUER are only allowed
a single task submitting to the ctx. Although the documentation only
mentions this restriction applying to io_uring_enter() syscalls,
commit d7cce96c449e ("io_uring: limit registration w/ SINGLE_ISSUER")
extends it to io_uring_register(). Ensuring only one task interacts
with the io_ring_ctx will be important to allow this task to avoid
taking the uring_lock.
There is, however, one gap in these checks: io_register_clone_buffers()
may take the uring_lock on a second (source) io_ring_ctx, but
__io_uring_register() only checks the current thread against the
*destination* io_ring_ctx's submitter_task. Fail the
IORING_REGISTER_CLONE_BUFFERS with -EEXIST if the source io_ring_ctx has
a registered submitter_task other than the current task.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
---
io_uring/rsrc.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index 2d15b8785a95..1e5b7833076a 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -1298,14 +1298,21 @@ int io_register_clone_buffers(struct io_ring_ctx *ctx, void __user *arg)
src_ctx = file->private_data;
if (src_ctx != ctx) {
mutex_unlock(&ctx->uring_lock);
lock_two_rings(ctx, src_ctx);
+
+ if (src_ctx->submitter_task &&
+ src_ctx->submitter_task != current) {
+ ret = -EEXIST;
+ goto out;
+ }
}
ret = io_clone_buffers(ctx, src_ctx, &buf);
+out:
if (src_ctx != ctx)
mutex_unlock(&src_ctx->uring_lock);
fput(file);
return ret;
--
2.45.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/4] io_uring: factor out uring_lock helpers
2025-09-03 3:26 [PATCH 0/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 1/4] io_uring: don't include filetable.h in io_uring.h Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 2/4] io_uring/rsrc: respect submitter_task in io_register_clone_buffers() Caleb Sander Mateos
@ 2025-09-03 3:26 ` Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 4/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
2025-09-03 21:55 ` [syzbot ci] " syzbot ci
4 siblings, 0 replies; 7+ messages in thread
From: Caleb Sander Mateos @ 2025-09-03 3:26 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, linux-kernel, Caleb Sander Mateos
A subsequent commit will skip acquiring the io_ring_ctx uring_lock in
io_uring_enter() and io_handle_tw_list() for IORING_SETUP_SINGLE_ISSUER.
Prepare for this change by factoring out the uring_lock accesses under
these functions into helper functions:
- io_ring_ctx_lock() for mutex_lock(&ctx->uring_lock)
- io_ring_ctx_unlock() for mutex_unlock(&ctx->uring_lock)
- io_ring_ctx_assert_locked() for lockdep_assert_held(&ctx->uring_lock)
For now, the helpers unconditionally call the mutex functions. But a
subsequent commit will condition them on !IORING_SETUP_SINGLE_ISSUER.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
---
io_uring/filetable.c | 3 ++-
io_uring/io_uring.c | 51 ++++++++++++++++++++++++++------------------
io_uring/io_uring.h | 28 ++++++++++++++++++------
io_uring/kbuf.c | 6 +++---
io_uring/notif.c | 5 +++--
io_uring/notif.h | 3 ++-
io_uring/poll.c | 2 +-
io_uring/rsrc.c | 2 +-
io_uring/rsrc.h | 3 ++-
io_uring/rw.c | 2 +-
io_uring/waitid.c | 2 +-
11 files changed, 67 insertions(+), 40 deletions(-)
diff --git a/io_uring/filetable.c b/io_uring/filetable.c
index a21660e3145a..aae283e77856 100644
--- a/io_uring/filetable.c
+++ b/io_uring/filetable.c
@@ -55,14 +55,15 @@ void io_free_file_tables(struct io_ring_ctx *ctx, struct io_file_table *table)
table->bitmap = NULL;
}
static int io_install_fixed_file(struct io_ring_ctx *ctx, struct file *file,
u32 slot_index)
- __must_hold(&req->ctx->uring_lock)
{
struct io_rsrc_node *node;
+ io_ring_ctx_assert_locked(ctx);
+
if (io_is_uring_fops(file))
return -EBADF;
if (!ctx->file_table.data.nr)
return -ENXIO;
if (slot_index >= ctx->file_table.data.nr)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 9c1190b19adf..7f19b6da5d3d 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -554,11 +554,11 @@ static unsigned io_linked_nr(struct io_kiocb *req)
static __cold noinline void io_queue_deferred(struct io_ring_ctx *ctx)
{
bool drain_seen = false, first = true;
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
__io_req_caches_free(ctx);
while (!list_empty(&ctx->defer_list)) {
struct io_defer_entry *de = list_first_entry(&ctx->defer_list,
struct io_defer_entry, list);
@@ -925,11 +925,11 @@ bool io_post_aux_cqe(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags
* Must be called from inline task_work so we now a flush will happen later,
* and obviously with ctx->uring_lock held (tw always has that).
*/
void io_add_aux_cqe(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags)
{
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
lockdep_assert(ctx->lockless_cq);
if (!io_fill_cqe_aux(ctx, user_data, res, cflags)) {
struct io_cqe cqe = io_init_cqe(user_data, res, cflags);
@@ -954,11 +954,11 @@ bool io_req_post_cqe(struct io_kiocb *req, s32 res, u32 cflags)
*/
if (!wq_list_empty(&ctx->submit_state.compl_reqs))
__io_submit_flush_completions(ctx);
lockdep_assert(!io_wq_current_is_worker());
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
if (!ctx->lockless_cq) {
spin_lock(&ctx->completion_lock);
posted = io_fill_cqe_aux(ctx, req->cqe.user_data, res, cflags);
spin_unlock(&ctx->completion_lock);
@@ -978,11 +978,11 @@ bool io_req_post_cqe32(struct io_kiocb *req, struct io_uring_cqe cqe[2])
{
struct io_ring_ctx *ctx = req->ctx;
bool posted;
lockdep_assert(!io_wq_current_is_worker());
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
cqe[0].user_data = req->cqe.user_data;
if (!ctx->lockless_cq) {
spin_lock(&ctx->completion_lock);
posted = io_fill_cqe_aux32(ctx, cqe);
@@ -1032,15 +1032,14 @@ static void io_req_complete_post(struct io_kiocb *req, unsigned issue_flags)
*/
req_ref_put(req);
}
void io_req_defer_failed(struct io_kiocb *req, s32 res)
- __must_hold(&ctx->uring_lock)
{
const struct io_cold_def *def = &io_cold_defs[req->opcode];
- lockdep_assert_held(&req->ctx->uring_lock);
+ io_ring_ctx_assert_locked(req->ctx);
req_set_fail(req);
io_req_set_res(req, res, io_put_kbuf(req, res, NULL));
if (def->fail)
def->fail(req);
@@ -1052,16 +1051,17 @@ void io_req_defer_failed(struct io_kiocb *req, s32 res)
* handlers and io_issue_sqe() are done with it, e.g. inline completion path.
* Because of that, io_alloc_req() should be called only under ->uring_lock
* and with extra caution to not get a request that is still worked on.
*/
__cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
{
gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_ZERO;
void *reqs[IO_REQ_ALLOC_BATCH];
int ret;
+ io_ring_ctx_assert_locked(ctx);
+
ret = kmem_cache_alloc_bulk(req_cachep, gfp, ARRAY_SIZE(reqs), reqs);
/*
* Bulk alloc is all-or-nothing. If we fail to get a batch,
* retry single alloc to be on the safe side.
@@ -1126,11 +1126,11 @@ static void ctx_flush_and_put(struct io_ring_ctx *ctx, io_tw_token_t tw)
return;
if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
atomic_andnot(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
io_submit_flush_completions(ctx);
- mutex_unlock(&ctx->uring_lock);
+ io_ring_ctx_unlock(ctx);
percpu_ref_put(&ctx->refs);
}
/*
* Run queued task_work, returning the number of entries processed in *count.
@@ -1150,11 +1150,11 @@ struct llist_node *io_handle_tw_list(struct llist_node *node,
io_task_work.node);
if (req->ctx != ctx) {
ctx_flush_and_put(ctx, ts);
ctx = req->ctx;
- mutex_lock(&ctx->uring_lock);
+ io_ring_ctx_lock(ctx);
percpu_ref_get(&ctx->refs);
}
INDIRECT_CALL_2(req->io_task_work.func,
io_poll_task_func, io_req_rw_complete,
req, ts);
@@ -1502,12 +1502,13 @@ static inline void io_req_put_rsrc_nodes(struct io_kiocb *req)
io_put_rsrc_node(req->ctx, req->buf_node);
}
static void io_free_batch_list(struct io_ring_ctx *ctx,
struct io_wq_work_node *node)
- __must_hold(&ctx->uring_lock)
{
+ io_ring_ctx_assert_locked(ctx);
+
do {
struct io_kiocb *req = container_of(node, struct io_kiocb,
comp_list);
if (unlikely(req->flags & IO_REQ_CLEAN_SLOW_FLAGS)) {
@@ -1543,15 +1544,16 @@ static void io_free_batch_list(struct io_ring_ctx *ctx,
io_req_add_to_cache(req, ctx);
} while (node);
}
void __io_submit_flush_completions(struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
{
struct io_submit_state *state = &ctx->submit_state;
struct io_wq_work_node *node;
+ io_ring_ctx_assert_locked(ctx);
+
__io_cq_lock(ctx);
__wq_list_for_each(node, &state->compl_reqs) {
struct io_kiocb *req = container_of(node, struct io_kiocb,
comp_list);
@@ -1767,16 +1769,17 @@ io_req_flags_t io_file_get_flags(struct file *file)
res |= REQ_F_SUPPORT_NOWAIT;
return res;
}
static __cold void io_drain_req(struct io_kiocb *req)
- __must_hold(&ctx->uring_lock)
{
struct io_ring_ctx *ctx = req->ctx;
bool drain = req->flags & IOSQE_IO_DRAIN;
struct io_defer_entry *de;
+ io_ring_ctx_assert_locked(ctx);
+
de = kmalloc(sizeof(*de), GFP_KERNEL_ACCOUNT);
if (!de) {
io_req_defer_failed(req, -ENOMEM);
return;
}
@@ -2043,12 +2046,13 @@ static int io_req_sqe_copy(struct io_kiocb *req, unsigned int issue_flags)
def->sqe_copy(req);
return 0;
}
static void io_queue_async(struct io_kiocb *req, unsigned int issue_flags, int ret)
- __must_hold(&req->ctx->uring_lock)
{
+ io_ring_ctx_assert_locked(req->ctx);
+
if (ret != -EAGAIN || (req->flags & REQ_F_NOWAIT)) {
fail:
io_req_defer_failed(req, ret);
return;
}
@@ -2068,16 +2072,17 @@ static void io_queue_async(struct io_kiocb *req, unsigned int issue_flags, int r
break;
}
}
static inline void io_queue_sqe(struct io_kiocb *req, unsigned int extra_flags)
- __must_hold(&req->ctx->uring_lock)
{
unsigned int issue_flags = IO_URING_F_NONBLOCK |
IO_URING_F_COMPLETE_DEFER | extra_flags;
int ret;
+ io_ring_ctx_assert_locked(req->ctx);
+
ret = io_issue_sqe(req, issue_flags);
/*
* We async punt it if the file wasn't marked NOWAIT, or if the file
* doesn't support non-blocking read/write attempts
@@ -2085,12 +2090,13 @@ static inline void io_queue_sqe(struct io_kiocb *req, unsigned int extra_flags)
if (unlikely(ret))
io_queue_async(req, issue_flags, ret);
}
static void io_queue_sqe_fallback(struct io_kiocb *req)
- __must_hold(&req->ctx->uring_lock)
{
+ io_ring_ctx_assert_locked(req->ctx);
+
if (unlikely(req->flags & REQ_F_FAIL)) {
/*
* We don't submit, fail them all, for that replace hardlinks
* with normal links. Extra REQ_F_LINK is tolerated.
*/
@@ -2155,17 +2161,18 @@ static __cold int io_init_fail_req(struct io_kiocb *req, int err)
return err;
}
static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
const struct io_uring_sqe *sqe)
- __must_hold(&ctx->uring_lock)
{
const struct io_issue_def *def;
unsigned int sqe_flags;
int personality;
u8 opcode;
+ io_ring_ctx_assert_locked(ctx);
+
req->ctx = ctx;
req->opcode = opcode = READ_ONCE(sqe->opcode);
/* same numerical values with corresponding REQ_F_*, safe to copy */
sqe_flags = READ_ONCE(sqe->flags);
req->flags = (__force io_req_flags_t) sqe_flags;
@@ -2290,15 +2297,16 @@ static __cold int io_submit_fail_init(const struct io_uring_sqe *sqe,
return 0;
}
static inline int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
const struct io_uring_sqe *sqe)
- __must_hold(&ctx->uring_lock)
{
struct io_submit_link *link = &ctx->submit_state.link;
int ret;
+ io_ring_ctx_assert_locked(ctx);
+
ret = io_init_req(ctx, req, sqe);
if (unlikely(ret))
return io_submit_fail_init(sqe, req, ret);
trace_io_uring_submit_req(req);
@@ -2419,16 +2427,17 @@ static bool io_get_sqe(struct io_ring_ctx *ctx, const struct io_uring_sqe **sqe)
*sqe = &ctx->sq_sqes[head];
return true;
}
int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
- __must_hold(&ctx->uring_lock)
{
unsigned int entries = io_sqring_entries(ctx);
unsigned int left;
int ret;
+ io_ring_ctx_assert_locked(ctx);
+
if (unlikely(!entries))
return 0;
/* make sure SQ entry isn't read before tail */
ret = left = min(nr, entries);
io_get_task_refs(left);
@@ -3518,14 +3527,14 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
} else if (to_submit) {
ret = io_uring_add_tctx_node(ctx);
if (unlikely(ret))
goto out;
- mutex_lock(&ctx->uring_lock);
+ io_ring_ctx_lock(ctx);
ret = io_submit_sqes(ctx, to_submit);
if (ret != to_submit) {
- mutex_unlock(&ctx->uring_lock);
+ io_ring_ctx_unlock(ctx);
goto out;
}
if (flags & IORING_ENTER_GETEVENTS) {
if (ctx->syscall_iopoll)
goto iopoll_locked;
@@ -3534,11 +3543,11 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
* it should handle ownership problems if any.
*/
if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
(void)io_run_local_work_locked(ctx, min_complete);
}
- mutex_unlock(&ctx->uring_lock);
+ io_ring_ctx_unlock(ctx);
}
if (flags & IORING_ENTER_GETEVENTS) {
int ret2;
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index d62b7d9fafed..a0580a1bf6b5 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -119,20 +119,35 @@ bool __io_alloc_req_refill(struct io_ring_ctx *ctx);
bool io_match_task_safe(struct io_kiocb *head, struct io_uring_task *tctx,
bool cancel_all);
void io_activate_pollwq(struct io_ring_ctx *ctx);
+static inline void io_ring_ctx_lock(struct io_ring_ctx *ctx)
+{
+ mutex_lock(&ctx->uring_lock);
+}
+
+static inline void io_ring_ctx_unlock(struct io_ring_ctx *ctx)
+{
+ mutex_unlock(&ctx->uring_lock);
+}
+
+static inline void io_ring_ctx_assert_locked(const struct io_ring_ctx *ctx)
+{
+ lockdep_assert_held(&ctx->uring_lock);
+}
+
static inline void io_lockdep_assert_cq_locked(struct io_ring_ctx *ctx)
{
#if defined(CONFIG_PROVE_LOCKING)
lockdep_assert(in_task());
if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
if (ctx->flags & IORING_SETUP_IOPOLL) {
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
} else if (!ctx->task_complete) {
lockdep_assert_held(&ctx->completion_lock);
} else if (ctx->submitter_task) {
/*
* ->submitter_task may be NULL and we can still post a CQE,
@@ -300,11 +315,11 @@ static inline void io_put_file(struct io_kiocb *req)
}
static inline void io_ring_submit_unlock(struct io_ring_ctx *ctx,
unsigned issue_flags)
{
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
if (unlikely(issue_flags & IO_URING_F_UNLOCKED))
mutex_unlock(&ctx->uring_lock);
}
static inline void io_ring_submit_lock(struct io_ring_ctx *ctx,
@@ -316,11 +331,11 @@ static inline void io_ring_submit_lock(struct io_ring_ctx *ctx,
* The only exception is when we've detached the request and issue it
* from an async worker thread, grab the lock for that case.
*/
if (unlikely(issue_flags & IO_URING_F_UNLOCKED))
mutex_lock(&ctx->uring_lock);
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
}
static inline void io_commit_cqring(struct io_ring_ctx *ctx)
{
/* order cqe stores with ring update */
@@ -428,24 +443,23 @@ static inline bool io_task_work_pending(struct io_ring_ctx *ctx)
return task_work_pending(current) || io_local_work_pending(ctx);
}
static inline void io_tw_lock(struct io_ring_ctx *ctx, io_tw_token_t tw)
{
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
}
/*
* Don't complete immediately but use deferred completion infrastructure.
* Protected by ->uring_lock and can only be used either with
* IO_URING_F_COMPLETE_DEFER or inside a tw handler holding the mutex.
*/
static inline void io_req_complete_defer(struct io_kiocb *req)
- __must_hold(&req->ctx->uring_lock)
{
struct io_submit_state *state = &req->ctx->submit_state;
- lockdep_assert_held(&req->ctx->uring_lock);
+ io_ring_ctx_assert_locked(req->ctx);
wq_list_add_tail(&req->comp_list, &state->compl_reqs);
}
static inline void io_commit_cqring_flush(struct io_ring_ctx *ctx)
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 3e9aab21af9d..ea6f3588d875 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -68,11 +68,11 @@ bool io_kbuf_commit(struct io_kiocb *req,
}
static inline struct io_buffer_list *io_buffer_get_list(struct io_ring_ctx *ctx,
unsigned int bgid)
{
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
return xa_load(&ctx->io_bl_xa, bgid);
}
static int io_buffer_add_list(struct io_ring_ctx *ctx,
@@ -337,11 +337,11 @@ int io_buffers_peek(struct io_kiocb *req, struct buf_sel_arg *arg,
{
struct io_ring_ctx *ctx = req->ctx;
struct io_buffer_list *bl;
int ret;
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
bl = io_buffer_get_list(ctx, arg->buf_group);
if (unlikely(!bl))
return -ENOENT;
@@ -393,11 +393,11 @@ static int io_remove_buffers_legacy(struct io_ring_ctx *ctx,
{
unsigned long i = 0;
struct io_buffer *nxt;
/* protects io_buffers_cache */
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
WARN_ON_ONCE(bl->flags & IOBL_BUF_RING);
for (i = 0; i < nbufs && !list_empty(&bl->buf_list); i++) {
nxt = list_first_entry(&bl->buf_list, struct io_buffer, list);
list_del(&nxt->list);
diff --git a/io_uring/notif.c b/io_uring/notif.c
index 8c92e9cde2c6..9dd248fcb213 100644
--- a/io_uring/notif.c
+++ b/io_uring/notif.c
@@ -14,11 +14,11 @@ static const struct ubuf_info_ops io_ubuf_ops;
static void io_notif_tw_complete(struct io_kiocb *notif, io_tw_token_t tw)
{
struct io_notif_data *nd = io_notif_to_data(notif);
struct io_ring_ctx *ctx = notif->ctx;
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
do {
notif = cmd_to_io_kiocb(nd);
if (WARN_ON_ONCE(ctx != notif->ctx))
@@ -108,15 +108,16 @@ static const struct ubuf_info_ops io_ubuf_ops = {
.complete = io_tx_ubuf_complete,
.link_skb = io_link_skb,
};
struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
{
struct io_kiocb *notif;
struct io_notif_data *nd;
+ io_ring_ctx_assert_locked(ctx);
+
if (unlikely(!io_alloc_req(ctx, ¬if)))
return NULL;
notif->ctx = ctx;
notif->opcode = IORING_OP_NOP;
notif->flags = 0;
diff --git a/io_uring/notif.h b/io_uring/notif.h
index f3589cfef4a9..c33c9a1179c9 100644
--- a/io_uring/notif.h
+++ b/io_uring/notif.h
@@ -31,14 +31,15 @@ static inline struct io_notif_data *io_notif_to_data(struct io_kiocb *notif)
{
return io_kiocb_to_cmd(notif, struct io_notif_data);
}
static inline void io_notif_flush(struct io_kiocb *notif)
- __must_hold(¬if->ctx->uring_lock)
{
struct io_notif_data *nd = io_notif_to_data(notif);
+ io_ring_ctx_assert_locked(notif->ctx);
+
io_tx_ubuf_complete(NULL, &nd->uarg, true);
}
static inline int io_notif_account_mem(struct io_kiocb *notif, unsigned len)
{
diff --git a/io_uring/poll.c b/io_uring/poll.c
index ea75c5cd81a0..ba71403c8fd8 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -121,11 +121,11 @@ static struct io_poll *io_poll_get_single(struct io_kiocb *req)
static void io_poll_req_insert(struct io_kiocb *req)
{
struct io_hash_table *table = &req->ctx->cancel_table;
u32 index = hash_long(req->cqe.user_data, table->hash_bits);
- lockdep_assert_held(&req->ctx->uring_lock);
+ io_ring_ctx_assert_locked(req->ctx);
hlist_add_head(&req->hash_node, &table->hbs[index].list);
}
static void io_init_poll_iocb(struct io_poll *poll, __poll_t events)
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index 1e5b7833076a..1c1753de7340 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -347,11 +347,11 @@ static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
struct io_uring_rsrc_update2 *up,
unsigned nr_args)
{
__u32 tmp;
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
if (check_add_overflow(up->offset, nr_args, &tmp))
return -EOVERFLOW;
switch (type) {
diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h
index a3ca6ba66596..d537a3b895d6 100644
--- a/io_uring/rsrc.h
+++ b/io_uring/rsrc.h
@@ -2,10 +2,11 @@
#ifndef IOU_RSRC_H
#define IOU_RSRC_H
#include <linux/io_uring_types.h>
#include <linux/lockdep.h>
+#include "io_uring.h"
#define IO_VEC_CACHE_SOFT_CAP 256
enum {
IORING_RSRC_FILE = 0,
@@ -97,11 +98,11 @@ static inline struct io_rsrc_node *io_rsrc_node_lookup(struct io_rsrc_data *data
return NULL;
}
static inline void io_put_rsrc_node(struct io_ring_ctx *ctx, struct io_rsrc_node *node)
{
- lockdep_assert_held(&ctx->uring_lock);
+ io_ring_ctx_assert_locked(ctx);
if (!--node->refs)
io_free_rsrc_node(ctx, node);
}
static inline bool io_reset_rsrc_node(struct io_ring_ctx *ctx,
diff --git a/io_uring/rw.c b/io_uring/rw.c
index ab6b4afccec3..f00e02a02dc7 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -461,11 +461,11 @@ int io_read_mshot_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
return 0;
}
void io_readv_writev_cleanup(struct io_kiocb *req)
{
- lockdep_assert_held(&req->ctx->uring_lock);
+ io_ring_ctx_assert_locked(req->ctx);
io_rw_recycle(req, 0);
}
static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req)
{
diff --git a/io_uring/waitid.c b/io_uring/waitid.c
index 26c118f3918d..f7a5054d4d81 100644
--- a/io_uring/waitid.c
+++ b/io_uring/waitid.c
@@ -114,11 +114,11 @@ static void io_waitid_complete(struct io_kiocb *req, int ret)
struct io_waitid *iw = io_kiocb_to_cmd(req, struct io_waitid);
/* anyone completing better be holding a reference */
WARN_ON_ONCE(!(atomic_read(&iw->refs) & IO_WAITID_REF_MASK));
- lockdep_assert_held(&req->ctx->uring_lock);
+ io_ring_ctx_assert_locked(req->ctx);
hlist_del_init(&req->hash_node);
ret = io_waitid_finish(req, ret);
if (ret < 0)
--
2.45.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 4/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
2025-09-03 3:26 [PATCH 0/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
` (2 preceding siblings ...)
2025-09-03 3:26 ` [PATCH 3/4] io_uring: factor out uring_lock helpers Caleb Sander Mateos
@ 2025-09-03 3:26 ` Caleb Sander Mateos
2025-09-03 21:55 ` [syzbot ci] " syzbot ci
4 siblings, 0 replies; 7+ messages in thread
From: Caleb Sander Mateos @ 2025-09-03 3:26 UTC (permalink / raw)
To: Jens Axboe; +Cc: io-uring, linux-kernel, Caleb Sander Mateos
io_ring_ctx's mutex uring_lock can be quite expensive in high-IOPS
workloads. Even when only one thread pinned to a single CPU is accessing
the io_ring_ctx, the atomic CAS required to lock and unlock the mutex is
a very hot instruction. The mutex's primary purpose is to prevent
concurrent io_uring system calls on the same io_ring_ctx. However, there
is already a flag IORING_SETUP_SINGLE_ISSUER that promises only one
task will make io_uring_enter() and io_uring_register() system calls on
the io_ring_ctx once it's enabled.
So if the io_ring_ctx is setup with IORING_SETUP_SINGLE_ISSUER, skip the
uring_lock mutex_lock() and mutex_unlock() for the io_uring_enter()
submission as well as for io_handle_tw_list(). io_uring_enter()
submission calls __io_uring_add_tctx_node_from_submit() to verify the
current task matches submitter_task for IORING_SETUP_SINGLE_ISSUER. And
task work can only be scheduled on tasks that submit io_uring requests,
so io_handle_tw_list() will also only be called on submitter_task.
There is a goto from the io_uring_enter() submission to the middle of
the IOPOLL block which assumed the uring_lock would already be held.
This is no longer the case for IORING_SETUP_SINGLE_ISSUER, so goto the
preceding mutex_lock() in that case.
It may be possible to avoid taking uring_lock in other places too for
IORING_SETUP_SINGLE_ISSUER, but these two cover the primary hot paths.
The uring_lock in io_uring_register() is necessary at least before the
io_uring is enabled because submitter_task isn't set yet. uring_lock is
also used to synchronize IOPOLL on submitting tasks with io_uring worker
tasks, so it's still needed there. But in principle, it should be
possible to remove the mutex entirely for IORING_SETUP_SINGLE_ISSUER by
running any code needing exclusive access to the io_ring_ctx in task
work context on submitter_task.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
---
io_uring/io_uring.c | 6 +++++-
io_uring/io_uring.h | 14 ++++++++++++++
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 7f19b6da5d3d..5793f6122159 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3534,12 +3534,15 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
if (ret != to_submit) {
io_ring_ctx_unlock(ctx);
goto out;
}
if (flags & IORING_ENTER_GETEVENTS) {
- if (ctx->syscall_iopoll)
+ if (ctx->syscall_iopoll) {
+ if (ctx->flags & IORING_SETUP_SINGLE_ISSUER)
+ goto iopoll;
goto iopoll_locked;
+ }
/*
* Ignore errors, we'll soon call io_cqring_wait() and
* it should handle ownership problems if any.
*/
if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
@@ -3556,10 +3559,11 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
* We disallow the app entering submit/complete with
* polling, but we still need to lock the ring to
* prevent racing with polled issue that got punted to
* a workqueue.
*/
+iopoll:
mutex_lock(&ctx->uring_lock);
iopoll_locked:
ret2 = io_validate_ext_arg(ctx, flags, argp, argsz);
if (likely(!ret2))
ret2 = io_iopoll_check(ctx, min_complete);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index a0580a1bf6b5..7296b12b0897 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -121,20 +121,34 @@ bool io_match_task_safe(struct io_kiocb *head, struct io_uring_task *tctx,
void io_activate_pollwq(struct io_ring_ctx *ctx);
static inline void io_ring_ctx_lock(struct io_ring_ctx *ctx)
{
+ if (ctx->flags & IORING_SETUP_SINGLE_ISSUER) {
+ WARN_ON_ONCE(current != ctx->submitter_task);
+ return;
+ }
+
mutex_lock(&ctx->uring_lock);
}
static inline void io_ring_ctx_unlock(struct io_ring_ctx *ctx)
{
+ if (ctx->flags & IORING_SETUP_SINGLE_ISSUER) {
+ WARN_ON_ONCE(current != ctx->submitter_task);
+ return;
+ }
+
mutex_unlock(&ctx->uring_lock);
}
static inline void io_ring_ctx_assert_locked(const struct io_ring_ctx *ctx)
{
+ if (ctx->flags & IORING_SETUP_SINGLE_ISSUER &&
+ current == ctx->submitter_task)
+ return;
+
lockdep_assert_held(&ctx->uring_lock);
}
static inline void io_lockdep_assert_cq_locked(struct io_ring_ctx *ctx)
{
--
2.45.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [syzbot ci] Re: io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
2025-09-03 3:26 [PATCH 0/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
` (3 preceding siblings ...)
2025-09-03 3:26 ` [PATCH 4/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
@ 2025-09-03 21:55 ` syzbot ci
2025-09-03 23:29 ` Jens Axboe
4 siblings, 1 reply; 7+ messages in thread
From: syzbot ci @ 2025-09-03 21:55 UTC (permalink / raw)
To: axboe, csander, io-uring, linux-kernel; +Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v1] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
https://lore.kernel.org/all/20250903032656.2012337-1-csander@purestorage.com
* [PATCH 1/4] io_uring: don't include filetable.h in io_uring.h
* [PATCH 2/4] io_uring/rsrc: respect submitter_task in io_register_clone_buffers()
* [PATCH 3/4] io_uring: factor out uring_lock helpers
* [PATCH 4/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
and found the following issue:
WARNING in io_handle_tw_list
Full report is available here:
https://ci.syzbot.org/series/54ae0eae-5e47-4cfe-9ae7-9eaaf959b5ae
***
WARNING in io_handle_tw_list
tree: linux-next
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next
base: 5d50cf9f7cf20a17ac469c20a2e07c29c1f6aab7
arch: amd64
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config: https://ci.syzbot.org/builds/1de646dd-4ee2-418d-9c62-617d88ed4fd2/config
syz repro: https://ci.syzbot.org/findings/e229a878-375f-4286-89fe-b6724c23addd/syz_repro
------------[ cut here ]------------
WARNING: io_uring/io_uring.h:127 at io_ring_ctx_lock io_uring/io_uring.h:127 [inline], CPU#1: iou-sqp-6294/6297
WARNING: io_uring/io_uring.h:127 at io_handle_tw_list+0x234/0x2e0 io_uring/io_uring.c:1155, CPU#1: iou-sqp-6294/6297
Modules linked in:
CPU: 1 UID: 0 PID: 6297 Comm: iou-sqp-6294 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:io_ring_ctx_lock io_uring/io_uring.h:127 [inline]
RIP: 0010:io_handle_tw_list+0x234/0x2e0 io_uring/io_uring.c:1155
Code: 00 00 48 c7 c7 e0 90 02 8c be 8e 04 00 00 31 d2 e8 01 e5 d2 fc 2e 2e 2e 31 c0 45 31 e4 4d 85 ff 75 89 eb 7c e8 ad fb 00 fd 90 <0f> 0b 90 e9 cf fe ff ff 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 22 ff
RSP: 0018:ffffc900032cf938 EFLAGS: 00010293
RAX: ffffffff84bfcba3 RBX: dffffc0000000000 RCX: ffff888107f61cc0
RDX: 0000000000000000 RSI: 0000000000001000 RDI: 0000000000000000
RBP: ffff8881119a8008 R08: ffff888110bb69c7 R09: 1ffff11022176d38
R10: dffffc0000000000 R11: ffffed1022176d39 R12: ffff8881119a8000
R13: ffff888108441e90 R14: ffff888107f61cc0 R15: 0000000000000000
FS: 00007f81f25716c0(0000) GS:ffff8881a39f5000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b31b63fff CR3: 000000010f24c000 CR4: 00000000000006f0
Call Trace:
<TASK>
tctx_task_work_run+0x99/0x370 io_uring/io_uring.c:1223
io_sq_tw io_uring/sqpoll.c:244 [inline]
io_sq_thread+0xed1/0x1e50 io_uring/sqpoll.c:327
ret_from_fork+0x47f/0x820 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [syzbot ci] Re: io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
2025-09-03 21:55 ` [syzbot ci] " syzbot ci
@ 2025-09-03 23:29 ` Jens Axboe
0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2025-09-03 23:29 UTC (permalink / raw)
To: syzbot ci, csander, io-uring, linux-kernel; +Cc: syzbot, syzkaller-bugs
On 9/3/25 3:55 PM, syzbot ci wrote:
> syzbot ci has tested the following series
>
> [v1] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
> https://lore.kernel.org/all/20250903032656.2012337-1-csander@purestorage.com
> * [PATCH 1/4] io_uring: don't include filetable.h in io_uring.h
> * [PATCH 2/4] io_uring/rsrc: respect submitter_task in io_register_clone_buffers()
> * [PATCH 3/4] io_uring: factor out uring_lock helpers
> * [PATCH 4/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
>
> and found the following issue:
> WARNING in io_handle_tw_list
>
> Full report is available here:
> https://ci.syzbot.org/series/54ae0eae-5e47-4cfe-9ae7-9eaaf959b5ae
>
> ***
>
> WARNING in io_handle_tw_list
>
> tree: linux-next
> URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next
> base: 5d50cf9f7cf20a17ac469c20a2e07c29c1f6aab7
> arch: amd64
> compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
> config: https://ci.syzbot.org/builds/1de646dd-4ee2-418d-9c62-617d88ed4fd2/config
> syz repro: https://ci.syzbot.org/findings/e229a878-375f-4286-89fe-b6724c23addd/syz_repro
>
> ------------[ cut here ]------------
> WARNING: io_uring/io_uring.h:127 at io_ring_ctx_lock io_uring/io_uring.h:127 [inline], CPU#1: iou-sqp-6294/6297
> WARNING: io_uring/io_uring.h:127 at io_handle_tw_list+0x234/0x2e0 io_uring/io_uring.c:1155, CPU#1: iou-sqp-6294/6297
> Modules linked in:
> CPU: 1 UID: 0 PID: 6297 Comm: iou-sqp-6294 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> RIP: 0010:io_ring_ctx_lock io_uring/io_uring.h:127 [inline]
> RIP: 0010:io_handle_tw_list+0x234/0x2e0 io_uring/io_uring.c:1155
> Code: 00 00 48 c7 c7 e0 90 02 8c be 8e 04 00 00 31 d2 e8 01 e5 d2 fc 2e 2e 2e 31 c0 45 31 e4 4d 85 ff 75 89 eb 7c e8 ad fb 00 fd 90 <0f> 0b 90 e9 cf fe ff ff 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 22 ff
> RSP: 0018:ffffc900032cf938 EFLAGS: 00010293
> RAX: ffffffff84bfcba3 RBX: dffffc0000000000 RCX: ffff888107f61cc0
> RDX: 0000000000000000 RSI: 0000000000001000 RDI: 0000000000000000
> RBP: ffff8881119a8008 R08: ffff888110bb69c7 R09: 1ffff11022176d38
> R10: dffffc0000000000 R11: ffffed1022176d39 R12: ffff8881119a8000
> R13: ffff888108441e90 R14: ffff888107f61cc0 R15: 0000000000000000
> FS: 00007f81f25716c0(0000) GS:ffff8881a39f5000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b31b63fff CR3: 000000010f24c000 CR4: 00000000000006f0
> Call Trace:
> <TASK>
> tctx_task_work_run+0x99/0x370 io_uring/io_uring.c:1223
> io_sq_tw io_uring/sqpoll.c:244 [inline]
> io_sq_thread+0xed1/0x1e50 io_uring/sqpoll.c:327
> ret_from_fork+0x47f/0x820 arch/x86/kernel/process.c:148
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>
Probably the sanest thing to do here is to clear
IORING_SETUP_SINGLE_ISSUER if it's set with IORING_SETUP_SQPOLL. If we
allow it, it'll be impossible to uphold the locking criteria on both the
issue and register side.
--
Jens Axboe
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-09-03 23:30 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-03 3:26 [PATCH 0/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 1/4] io_uring: don't include filetable.h in io_uring.h Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 2/4] io_uring/rsrc: respect submitter_task in io_register_clone_buffers() Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 3/4] io_uring: factor out uring_lock helpers Caleb Sander Mateos
2025-09-03 3:26 ` [PATCH 4/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER Caleb Sander Mateos
2025-09-03 21:55 ` [syzbot ci] " syzbot ci
2025-09-03 23:29 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).