From: Jens Axboe <axboe@kernel.dk>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: io-uring@vger.kernel.org, dvyukov@google.com
Subject: Re: [PATCH 2/2] io_uring: switch local task_work to a mpscq
Date: Mon, 15 Jun 2026 12:00:08 -0600
Message-ID: <e0e6a5da-054e-494b-aad8-be08f040750f@kernel.dk>
In-Reply-To: <CADUfDZoEhdom7cqRfKhMkhhRc0vmRpzRR-AZXndMhLnLa9KqYg@mail.gmail.com>

On 6/15/26 11:55 AM, Caleb Sander Mateos wrote:
> On Fri, Jun 12, 2026 at 8:11 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 6/12/26 6:21 AM, Jens Axboe wrote:
>>> On 6/11/26 11:24 PM, Caleb Sander Mateos wrote:
>>>> On Thu, Jun 11, 2026 at 7:23 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> On 6/11/26 7:14 PM, Caleb Sander Mateos wrote:
>>>>>> This is great stuff! I had also observed these hotspots on a ublk
>>>>>> workload. Since incoming ublk requests post task work to the ublk
>>>>>> server's io_urings and completed ublk requests post task work to the
>>>>>> client's io_urings, there is significant cross-CPU contention on the
>>>>>> task work queues.
>>>>>
>>>>> Glad you like it! Once I post v2 tomorrow, perhaps you can try and run
>>>>> some tests with and without and see how it does for you?
>>>>
>>>> Haven't tested v2 yet, but v1 shows a 4% IOPS improvement on a ublk
>>>> 4-KB read workload. The workload has 8 CPUs (unpaired hypertwins)
>>>> running fio with io_uring submitting I/O to the ublk devices and 32
>>>> ublk server CPUs (paired hypertwins) servicing the requests, achieving
>>>> around 4M IOPS. Both the client and server CPUs look completely busy.
>>>
>>> That's a pretty nice improvement! Would be curious to hear what v2 looks
>>> like.
> 
> Looks the same as v1, which makes sense as both the client and server
> are using IORING_SETUP_DEFER_TASKRUN.

OK, sounds good.

> I did observe fio seem to get stuck forever on one out of the 85 or so
> runs, though. I'm a little concerned there might be a missing wakeup.
> It was using the default iodepth_batch_complete_min=1 (waiting for
> io_uring completions) and IORING_SETUP_DEFER_TASKRUN.

There's a bug in v2 where the wakeup can get missed; the in-tree code
should have that fixed. It was the atomic_dec_and_test() and
atomic_try_cmpxchg() in io_req_local_work_add() racing.
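
For context on the data structure named in the subject line, here is a
minimal userspace sketch of a lockless multi-producer, single-consumer
FIFO in C11. It follows the well-known intrusive, stub-node design and is
not the code from the patch; the names mpscq, mpscq_node, mpscq_init,
mpscq_push and mpscq_pop are hypothetical and chosen only for
illustration. It also does not attempt to model the wakeup race described
above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical intrusive node: embed it in the item being queued. */
struct mpscq_node {
	_Atomic(struct mpscq_node *) next;
};

/* Hypothetical queue: producers touch only head, the consumer owns tail. */
struct mpscq {
	_Atomic(struct mpscq_node *) head;	/* most recently pushed node */
	struct mpscq_node *tail;		/* oldest node, consumer only */
	struct mpscq_node stub;			/* dummy node, keeps list non-empty */
};

static void mpscq_init(struct mpscq *q)
{
	assert(q != NULL);
	atomic_init(&q->stub.next, NULL);
	atomic_init(&q->head, &q->stub);
	q->tail = &q->stub;
}

/* Safe to call from any number of producers concurrently. */
static void mpscq_push(struct mpscq *q, struct mpscq_node *n)
{
	struct mpscq_node *prev;

	atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
	/* One atomic exchange serializes producers; no lock, no retry loop. */
	prev = atomic_exchange_explicit(&q->head, n, memory_order_acq_rel);
	/* Until this store lands, the consumer cannot see n yet. */
	atomic_store_explicit(&prev->next, n, memory_order_release);
}

/*
 * Single consumer only. Returns the oldest node, or NULL if the queue is
 * empty or a producer is between its exchange and its link store.
 */
static struct mpscq_node *mpscq_pop(struct mpscq *q)
{
	struct mpscq_node *tail = q->tail;
	struct mpscq_node *next =
		atomic_load_explicit(&tail->next, memory_order_acquire);

	if (tail == &q->stub) {
		if (!next)
			return NULL;		/* nothing queued */
		q->tail = next;			/* skip over the stub */
		tail = next;
		next = atomic_load_explicit(&tail->next, memory_order_acquire);
	}
	if (next) {
		q->tail = next;
		return tail;
	}
	if (tail != atomic_load_explicit(&q->head, memory_order_acquire))
		return NULL;			/* producer is mid-push */

	/* tail is the last node: re-insert the stub so tail can be detached. */
	mpscq_push(q, &q->stub);
	next = atomic_load_explicit(&tail->next, memory_order_acquire);
	if (next) {
		q->tail = next;
		return tail;
	}
	return NULL;
}
```

A NULL return from mpscq_pop() does not strictly mean "empty", since a
producer may be mid-push; a consumer that sleeps on NULL therefore needs
some separate signal from producers, which is where wakeup ordering
starts to matter.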

>> And here's some more stuff on top you might find interesting. For a
>> 6 NVMe drive test, it drops my task work usage from top-of-profiles
>> to ~2%.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git/log/?h=io_uring-tw-mpscq-batch
>>
>> The patches sit on top of the io_uring-tw-mpscq branch.
> 
> Yeah there are some interesting ideas there.
> 
> The ublk server isn't using UBLK_F_BATCH_IO, so it unfortunately
> wouldn't benefit from the task work batching for
> UBLK_U_IO_COMMIT_IO_CMDS. The batching would probably need to be
> scoped to the whole io_submit_sqes() in order to allow batching across
> the multiple UBLK_U_IO_COMMIT_AND_FETCH_REQ commands. I'm also not
> sure about the claim that __ublk_walk_cmd_buf() won't sleep;
> ublk_batch_commit_io() calls io_buffer_unregister_bvec(), which could
> sleep depending on the io_uring issue_flags.

It's very much just a POC series of things... I suspect to get the
benefit of it, we'd need a bit of refactoring and reworking first. It
was more to get the idea out/across, not going anywhere right now.

> The NVMe passthrough task work batching could definitely reduce
> contention on the task work queue. I'll run a perf test.

Thanks!

-- 
Jens Axboe
