From: Jens Axboe <axboe@kernel.dk>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: io-uring@vger.kernel.org, dvyukov@google.com
Subject: Re: [PATCH 2/2] io_uring: switch local task_work to a mpscq
Date: Mon, 15 Jun 2026 12:00:08 -0600
Message-ID: <e0e6a5da-054e-494b-aad8-be08f040750f@kernel.dk>
In-Reply-To: <CADUfDZoEhdom7cqRfKhMkhhRc0vmRpzRR-AZXndMhLnLa9KqYg@mail.gmail.com>

On 6/15/26 11:55 AM, Caleb Sander Mateos wrote:
> On Fri, Jun 12, 2026 at 8:11 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 6/12/26 6:21 AM, Jens Axboe wrote:
>>> On 6/11/26 11:24 PM, Caleb Sander Mateos wrote:
>>>> On Thu, Jun 11, 2026 at 7:23 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> On 6/11/26 7:14 PM, Caleb Sander Mateos wrote:
>>>>>> This is great stuff! I had also observed these hotspots on a ublk
>>>>>> workload. Since incoming ublk requests post task work to the ublk
>>>>>> server's io_urings and completed ublk requests post task work to the
>>>>>> client's io_urings, there is significant cross-CPU contention on the
>>>>>> task work queues.
>>>>>
>>>>> Glad you like it! Once I post v2 tomorrow, perhaps you can try and run
>>>>> some tests with and without and see how it does for you?
>>>>
>>>> Haven't tested v2 yet, but v1 shows a 4% IOPS improvement on a ublk
>>>> 4-KB read workload. The workload has 8 CPUs (unpaired hypertwins)
>>>> running fio with io_uring submitting I/O to the ublk devices and 32
>>>> ublk server CPUs (paired hypertwins) servicing the requests, achieving
>>>> around 4M IOPS. Both the client and server CPUs look completely busy.
>>>
>>> That's a pretty nice improvement! Would be curious to hear what v2 looks
>>> like.
>
> Looks the same as v1, which makes sense as both the client and server
> are using IORING_SETUP_DEFER_TASKRUN.

OK, sounds good.

> I did observe fio seem to get stuck forever on one out of the 85 or so
> runs, though. I'm a little concerned there might be a missing wakeup.
> It was using the default iodepth_batch_complete_min=1 (waiting for
> io_uring completions) and IORING_SETUP_DEFER_TASKRUN.

There's a bug in v2 where the wakeup can get missed; the in-tree code
should have that fixed. It was the atomic_dec_and_test() and
atomic_try_cmpxchg() in io_req_local_work_add() racing.

>> And here's some more stuff on top you might find interesting. For a
>> 6 NVMe drive test, it drops my task work usage from top-of-profiles
>> to ~2%.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git/log/?h=io_uring-tw-mpscq-batch
>>
>> The patches sit on top of the io_uring-tw-mpscq branch.
>
> Yeah there are some interesting ideas there.
>
> The ublk server isn't using UBLK_F_BATCH_IO, so it unfortunately
> wouldn't benefit from the task work batching for
> UBLK_U_IO_COMMIT_IO_CMDS. The batching would probably need to be
> scoped to the whole io_submit_sqes() in order to allow batching across
> the multiple UBLK_U_IO_COMMIT_AND_FETCH_REQ commands. I'm also not
> sure about the claim that __ublk_walk_cmd_buf() won't sleep;
> ublk_batch_commit_io() calls io_buffer_unregister_bvec(), which could
> sleep depending on the io_uring issue_flags.

It's very much just a POC series of things... I suspect to get the
benefit of it, we'd need a bit of refactoring and reworking first. It
was more to get the idea out/across, not going anywhere right now.

> The NVMe passthrough task work batching could definitely reduce
> contention on the task work queue. I'll run a perf test.

Thanks!
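
To illustrate why batching can cut contention on a shared task work
list, here is a minimal userspace sketch in C11. It is not the io_uring
code: struct work_item, struct work_queue, the batch helpers and the
cas_attempts counter are all made up for the example, and the list is a
simple push list rather than the mpscq from this series. The point is
only that a producer can link items privately and then publish them with
one update of the shared head instead of one per item.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical work item; not the kernel's io_kiocb or llist_node. */
struct work_item {
	struct work_item *next;
	int id;
};

/* Shared multi-producer list head, drained by a single consumer. */
struct work_queue {
	_Atomic(struct work_item *) head;
	atomic_ulong cas_attempts;	/* instrumentation for this example only */
};

/* Push an already linked chain first..last onto the shared head. */
static void queue_push_chain(struct work_queue *q, struct work_item *first,
			     struct work_item *last)
{
	struct work_item *old = atomic_load_explicit(&q->head,
						     memory_order_relaxed);

	do {
		/* On failure 'old' is refreshed, so relink and retry. */
		last->next = old;
		atomic_fetch_add_explicit(&q->cas_attempts, 1,
					  memory_order_relaxed);
	} while (!atomic_compare_exchange_strong_explicit(&q->head, &old, first,
							  memory_order_release,
							  memory_order_relaxed));
}

/* Unbatched path: every item updates the shared head. */
static void queue_push_one(struct work_queue *q, struct work_item *item)
{
	queue_push_chain(q, item, item);
}

/* Producer-local batch: linking items here touches no shared memory. */
struct work_batch {
	struct work_item *first;
	struct work_item *last;
};

static void batch_add(struct work_batch *b, struct work_item *item)
{
	item->next = b->first;
	b->first = item;
	if (!b->last)
		b->last = item;
}

/* Publish the whole batch with a single update of the shared head. */
static void batch_flush(struct work_queue *q, struct work_batch *b)
{
	if (!b->first)
		return;
	queue_push_chain(q, b->first, b->last);
	b->first = b->last = NULL;
}

/* Consumer: detach everything at once and count the items. */
static size_t queue_drain(struct work_queue *q)
{
	struct work_item *item = atomic_exchange_explicit(&q->head, NULL,
							  memory_order_acquire);
	size_t n = 0;

	for (; item; item = item->next)
		n++;
	return n;
}
```

With no concurrent producers forcing retries, pushing eight items with
queue_push_one() does eight compare-exchanges on the shared head, while
eight batch_add() calls followed by one batch_flush() do one.
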
--
Jens Axboe
Thread overview: 16+ messages
2026-06-11 15:58 [PATCHSET 0/2] Add lockless MPSC FIFO queue for task work Jens Axboe
2026-06-11 15:58 ` [PATCH 1/2] io_uring/mpscq: add lockless multi-producer, single-consumer FIFO queue Jens Axboe
2026-06-11 16:49 ` Gabriel Krisman Bertazi
2026-06-11 16:58 ` Jens Axboe
2026-06-12 1:13 ` Caleb Sander Mateos
2026-06-12 2:21 ` Jens Axboe
2026-06-12 2:41 ` Caleb Sander Mateos
2026-06-11 15:58 ` [PATCH 2/2] io_uring: switch local task_work to a mpscq Jens Axboe
2026-06-12 1:14 ` Caleb Sander Mateos
2026-06-12 2:23 ` Jens Axboe
2026-06-12 5:24 ` Caleb Sander Mateos
2026-06-12 12:21 ` Jens Axboe
2026-06-12 15:11 ` Jens Axboe
2026-06-15 17:55 ` Caleb Sander Mateos
2026-06-15 18:00 ` Jens Axboe [this message]
2026-06-16 20:21 ` Caleb Sander Mateos