From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: io-uring <io-uring@vger.kernel.org>
Subject: [GIT PULL] io_uring updates for 7.2
Date: Mon, 15 Jun 2026 09:18:22 -0600
Message-ID: <8829b16a-4247-4e07-aa35-c3a185780731@kernel.dk>
Hi Linus,
Here are the core io_uring updates queued up for the 7.2 merge window.
This pull request contains:
- Rework of the task_work infrastructure. Both the local (DEFER_TASKRUN)
and the normal (tctx) task_work lists were llist based, which is LIFO
ordered, and hence each run had to do an O(n) list reversal pass first
to restore queue order. Additionally, to cap the amount of task_work
run, each method needed a retry list as well. Add a lockless MPSC
FIFO queue (based on Dmitry Vyukov's intrusive MPSC algorithm) and
switch both task_work lists to it. It performs better than llists,
and since entries are popped one at a time, we can ditch the retry
lists as well. On top of those changes, run the tctx fallback
task_work directly and remove the now-unused per-ctx fallback machinery
entirely.
- zcrx user notifications. Add a mechanism for zcrx to communicate
conditions back to userspace via a dedicated CQE, with the initial
users being notification on running out of buffers and on a frag copy
fallback, plus shared-memory notification statistics. Alongside that, a
series of zcrx reliability and cleanup fixes: more reliable scrubbing,
poisoning pointers on unregistration, dropping an extra ifq close,
adding a ctx back-pointer, reordering fd allocation in the export path,
and killing a dead 'sock' member.
- Allow using io_uring registered buffers for plain SEND and RECV, not
just for the zero-copy send path. This enables targets like ublk's NBD
backend to push/pull IO data directly to/from a registered buffer over
a plain send/recv on a TCP socket.
- Registered buffer improvements: account huge pages correctly, bump the
io_mapped_ubuf length field to size_t, and raise the previous 1GB
registered buffer size limit.
- Restrict the ctx access exposed to io_uring BPF struct_ops programs by
handing them an opaque type rather than the full io_ring_ctx, and add a
separate MAINTAINERS entry for the bpf-ops code.
- Allow opcode filtering on IORING_OP_CONNECT.
- Validate ring-provided buffer addresses with access_ok(), and align the
legacy buffer add limit with MAX_BIDS_PER_BGID.
- Various other cleanups and minor fixes, including avoiding msghdr async
data on connect/bind, dropping async_size for OP_LISTEN, making the
POLL_FIRST receive side checks consistent, re-checking IO_WQ_BIT_EXIT
for each linked work item, and using trace_call__##name() at guarded
tracepoint call sites.
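For readers unfamiliar with the queue named in the first item, here is a
minimal userspace sketch of Dmitry Vyukov's intrusive MPSC algorithm using
C11 atomics. It is not the code in io_uring/mpscq.h: every name below
(mpsc_node, mpsc_queue, mpsc_init, mpsc_push, mpsc_pop, struct item) is
hypothetical, and the memory orderings are one reasonable choice rather than
the kernel's. It only illustrates why entries come back in FIFO order
without a reversal pass.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; a sketch, not the kernel implementation. */
struct mpsc_node {
	_Atomic(struct mpsc_node *) next;
};

struct mpsc_queue {
	_Atomic(struct mpsc_node *) head;	/* producers exchange this */
	struct mpsc_node *tail;			/* touched by the consumer only */
	struct mpsc_node stub;			/* dummy node, keeps the list non-empty */
};

static void mpsc_init(struct mpsc_queue *q)
{
	atomic_store(&q->stub.next, NULL);
	atomic_store(&q->head, &q->stub);
	q->tail = &q->stub;
}

/* Any number of producers: one atomic exchange, then link the old head. */
static void mpsc_push(struct mpsc_queue *q, struct mpsc_node *n)
{
	struct mpsc_node *prev;

	atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
	prev = atomic_exchange_explicit(&q->head, n, memory_order_acq_rel);
	atomic_store_explicit(&prev->next, n, memory_order_release);
}

/* Single consumer: returns the oldest node, or NULL if none is ready. */
static struct mpsc_node *mpsc_pop(struct mpsc_queue *q)
{
	struct mpsc_node *tail = q->tail;
	struct mpsc_node *next = atomic_load_explicit(&tail->next, memory_order_acquire);

	if (tail == &q->stub) {
		if (!next)
			return NULL;		/* queue is empty */
		q->tail = next;			/* skip over the stub */
		tail = next;
		next = atomic_load_explicit(&tail->next, memory_order_acquire);
	}
	if (next) {
		q->tail = next;
		return tail;
	}
	if (tail != atomic_load_explicit(&q->head, memory_order_acquire))
		return NULL;			/* a producer is mid-push, retry later */
	/* tail is the last real node: re-insert the stub so it can be detached */
	mpsc_push(q, &q->stub);
	next = atomic_load_explicit(&tail->next, memory_order_acquire);
	if (next) {
		q->tail = next;
		return tail;
	}
	return NULL;
}

/* Intrusive use: the node is the first member, so a plain cast recovers the item. */
struct item {
	struct mpsc_node node;
	int val;
};

/* Single-threaded sanity check: three pushes come back in push order. */
static void mpsc_demo(void)
{
	struct mpsc_queue q;
	struct item a = { .val = 1 }, b = { .val = 2 }, c = { .val = 3 };
	struct item *first, *second, *third;

	mpsc_init(&q);
	mpsc_push(&q, &a.node);
	mpsc_push(&q, &b.node);
	mpsc_push(&q, &c.node);
	first = (struct item *)mpsc_pop(&q);
	second = (struct item *)mpsc_pop(&q);
	third = (struct item *)mpsc_pop(&q);
	assert(first == &a && second == &b && third == &c);
	assert(mpsc_pop(&q) == NULL);
}
```

Because the consumer detaches one node per mpsc_pop() call, a run can stop
after any number of entries and resume later from the same tail, which is
the property that makes a separate retry list unnecessary in this sketch.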
Note that this will throw a merge conflict in io_uring/net.c due to late
changes on the 7.1 side. The merge resolution is fairly straightforward;
I'm including it at the end of this email for reference.
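The conflict centers on io_bind_file_create(), which after resolution takes
a struct sockaddr_storage rather than the async msghdr. As a rough
illustration of what that check decides, here is a userspace sketch of the
same three tests. The function name bind_may_create_file() and the demo
helper are hypothetical and exist only for this example; the kernel helper
returns an int and lives in io_uring/net.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical userspace mirror of the resolved check: a bind can only end
 * up creating a filesystem node if it is an AF_UNIX address that carries a
 * path, and that path is not in the abstract namespace (leading NUL byte).
 */
static bool bind_may_create_file(const struct sockaddr_storage *addr, int addr_len)
{
	const struct sockaddr_un *sun;

	if (addr->ss_family != AF_UNIX)
		return false;
	/* address too short to contain any part of sun_path */
	if (addr_len <= (int)offsetof(struct sockaddr_un, sun_path))
		return false;
	sun = (const struct sockaddr_un *)addr;
	return sun->sun_path[0] != '\0';
}

/* Small sanity check: a pathname socket qualifies, an abstract one does not. */
static void bind_check_demo(void)
{
	struct sockaddr_storage ss;
	struct sockaddr_un *un = (struct sockaddr_un *)&ss;

	memset(&ss, 0, sizeof(ss));
	un->sun_family = AF_UNIX;
	strcpy(un->sun_path, "/tmp/demo.sock");
	assert(bind_may_create_file(&ss, (int)sizeof(*un)));

	un->sun_path[0] = '\0';		/* abstract namespace */
	assert(!bind_may_create_file(&ss, (int)sizeof(*un)));
}
```

In the resolved io_bind_prep(), a positive result from the kernel helper sets
REQ_F_FORCE_ASYNC on the request.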
Please pull!
The following changes since commit 5d6919055dec134de3c40167a490f33c74c12581:
Linux 7.1-rc3 (2026-05-10 14:08:09 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git tags/for-7.2/io_uring-20260615
for you to fetch changes up to d9b710f683dc68b5c0b7dd0c6c64aeb5d27a1ac4:
io_uring/bpf-ops: add a separate maintainer entry (2026-06-13 06:36:16 -0600)
----------------------------------------------------------------
for-7.2/io_uring-20260615
----------------------------------------------------------------
Bertie Tryner (1):
io_uring/zcrx: reorder fd allocation in zcrx_export()
Clément Léger (2):
io_uring/zcrx: notify user on frag copy fallback
io_uring/zcrx: add shared-memory notification statistics
Gabriel Krisman Bertazi (3):
io_uring/net: Avoid msghdr on op_connect/op_bind async data
io_uring/net: Remove async_size for OP_LISTEN
io_uring/nop: Drop a wrong comment in struct io_nop
Jens Axboe (12):
io_uring/rsrc: add huge page accounting for registered buffers
io_uring/rsrc: bump struct io_mapped_ubuf length field to size_t
io_uring/rsrc: raise registered buffer 1GB limit
io_uring/kbuf: validate ring provided buffer addresses with access_ok()
io_uring/zcrx: kill dead 'sock' member in struct io_zcrx_args
io_uring: grab RCU read lock marking task run
io_uring/mpscq: add lockless multi-producer, single-consumer FIFO queue
io_uring: switch local task_work to a mpscq
io_uring: switch normal task_work to a mpscq
io_uring: run the tctx task_work fallback directly
io_uring: remove the per-ctx fallback task_work machinery
io_uring/net: make POLL_FIRST receive side checks consistent
Ming Lei (1):
io_uring/net: support registered buffer for plain send and recv
Pavel Begunkov (7):
io_uring/zcrx: make scrubbing more reliable
io_uring/zcrx: poison pointers on unregistration
io_uring/zcrx: remove extra ifq close
io_uring/zcrx: add ctx pointer to zcrx
io_uring/zcrx: notify user when out of buffers
io_uring/bpf-ops: restrict ctx access to BPF
io_uring/bpf-ops: add a separate maintainer entry
Runyu Xiao (1):
io_uring/io-wq: re-check IO_WQ_BIT_EXIT for each linked work item
Shouvik Kar (1):
io_uring/net: allow filtering on IORING_OP_CONNECT
Vineeth Pillai (1):
io_uring: Use trace_call__##name() at guarded tracepoint call sites
Yi Xie (1):
io_uring: parenthesize io_ring_head_to_buf() expansion
liyouhong (1):
io_uring/kbuf: align legacy buffer add limit with MAX_BIDS_PER_BGID
MAINTAINERS | 8 +
include/linux/io_uring_types.h | 46 ++++-
include/uapi/linux/io_uring/bpf_filter.h | 16 ++
include/uapi/linux/io_uring/query.h | 12 ++
include/uapi/linux/io_uring/zcrx.h | 36 +++-
io_uring/bpf-ops.c | 9 +-
io_uring/bpf-ops.h | 2 +-
io_uring/cancel.c | 2 -
io_uring/fdinfo.c | 2 +-
io_uring/io-wq.c | 2 +-
io_uring/io_uring.c | 14 +-
io_uring/io_uring.h | 3 +-
io_uring/kbuf.c | 18 +-
io_uring/loop.c | 2 +-
io_uring/loop.h | 10 +
io_uring/mpscq.h | 125 ++++++++++++
io_uring/net.c | 129 ++++++++++---
io_uring/net.h | 7 +
io_uring/nop.c | 1 -
io_uring/opdef.c | 7 +-
io_uring/query.c | 16 ++
io_uring/rsrc.c | 269 ++++++++++++++++++++------
io_uring/rsrc.h | 7 +-
io_uring/sqpoll.c | 30 +--
io_uring/tctx.c | 3 +-
io_uring/tw.c | 315 ++++++++++++++-----------------
io_uring/tw.h | 11 +-
io_uring/wait.c | 2 +-
io_uring/wait.h | 12 +-
io_uring/zcrx.c | 229 +++++++++++++++++++---
io_uring/zcrx.h | 11 +-
31 files changed, 1000 insertions(+), 356 deletions(-)
create mode 100644 io_uring/mpscq.h
commit fc98bae94161002f5f78081d55be8d4192ddadb2
Merge: 0e0611827f33 d9b710f683dc
Author: Jens Axboe <axboe@kernel.dk>
Date: Mon Jun 15 08:15:04 2026 -0600
Merge branch 'for-7.2/io_uring' into test
* for-7.2/io_uring: (31 commits)
io_uring/bpf-ops: add a separate maintainer entry
io_uring/net: make POLL_FIRST receive side checks consistent
io_uring: remove the per-ctx fallback task_work machinery
io_uring: run the tctx task_work fallback directly
io_uring: switch normal task_work to a mpscq
io_uring: switch local task_work to a mpscq
io_uring/mpscq: add lockless multi-producer, single-consumer FIFO queue
io_uring: grab RCU read lock marking task run
io_uring/zcrx: kill dead 'sock' member in struct io_zcrx_args
io_uring/kbuf: validate ring provided buffer addresses with access_ok()
io_uring/net: support registered buffer for plain send and recv
io_uring/nop: Drop a wrong comment in struct io_nop
io_uring/net: Remove async_size for OP_LISTEN
io_uring/net: Avoid msghdr on op_connect/op_bind async data
io_uring/bpf-ops: restrict ctx access to BPF
io_uring/io-wq: re-check IO_WQ_BIT_EXIT for each linked work item
io_uring/kbuf: align legacy buffer add limit with MAX_BIDS_PER_BGID
io_uring/zcrx: add shared-memory notification statistics
io_uring/zcrx: notify user on frag copy fallback
io_uring/zcrx: notify user when out of buffers
...
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --cc io_uring/cancel.c
index 4aa3103ba9c3,b0259e74f678..8c6fa6f367e4
--- a/io_uring/cancel.c
+++ b/io_uring/cancel.c
@@@ -561,12 -561,10 +561,10 @@@ __cold bool io_uring_try_cancel_request
ret |= io_waitid_remove_all(ctx, tctx, cancel_all);
ret |= io_futex_remove_all(ctx, tctx, cancel_all);
ret |= io_uring_try_cancel_uring_cmd(ctx, tctx, cancel_all);
- mutex_unlock(&ctx->uring_lock);
ret |= io_kill_timeouts(ctx, tctx, cancel_all);
+ mutex_unlock(&ctx->uring_lock);
if (tctx)
ret |= io_run_task_work() > 0;
- else
- ret |= flush_delayed_work(&ctx->fallback_work);
return ret;
}
diff --cc io_uring/net.c
index ee848eb65ec9,7deb62e3b4c0..081d1b7d77c8
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@@ -1801,29 -1879,11 +1881,29 @@@ out
return IOU_COMPLETE;
}
+/*
+ * Check if bind request would potentially end up with filename_create(),
+ * which in turn end up in mnt_want_write() which will grab the fs
+ * percpu start write sem. This can trigger a lockdep warning.
+ */
- static int io_bind_file_create(const struct io_async_msghdr *io, int addr_len)
++static int io_bind_file_create(const struct sockaddr_storage *addr, int addr_len)
+{
+ const struct sockaddr_un *sun;
+
- if (io->addr.ss_family != AF_UNIX)
++ if (addr->ss_family != AF_UNIX)
+ return 0;
+ if (addr_len <= offsetof(struct sockaddr_un, sun_path))
+ return 0;
- sun = (const struct sockaddr_un *) &io->addr;
++ sun = (const struct sockaddr_un *) addr;
+ return sun->sun_path[0] != '\0';
+}
+
int io_bind_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
struct io_bind *bind = io_kiocb_to_cmd(req, struct io_bind);
struct sockaddr __user *uaddr;
- struct io_async_msghdr *io;
+ struct sockaddr_storage *addr;
+ int ret;
if (sqe->len || sqe->buf_index || sqe->rw_flags || sqe->splice_fd_in)
return -EINVAL;
@@@ -1831,17 -1891,13 +1911,18 @@@
uaddr = u64_to_user_ptr(READ_ONCE(sqe->addr));
bind->addr_len = READ_ONCE(sqe->addr2);
- io = io_msg_alloc_async(req);
- if (unlikely(!io))
+ addr = io_uring_alloc_async_data(NULL, req);
+ if (unlikely(!addr))
return -ENOMEM;
- ret = move_addr_to_kernel(uaddr, bind->addr_len, &io->addr);
- return move_addr_to_kernel(uaddr, bind->addr_len, addr);
++ ret = move_addr_to_kernel(uaddr, bind->addr_len, addr);
+ if (unlikely(ret))
+ return ret;
- if (io_bind_file_create(io, bind->addr_len))
++ if (io_bind_file_create(addr, bind->addr_len))
+ req->flags |= REQ_F_FORCE_ASYNC;
+ return 0;
}
+
int io_bind(struct io_kiocb *req, unsigned int issue_flags)
{
struct io_bind *bind = io_kiocb_to_cmd(req, struct io_bind);
--
Jens Axboe