From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Uday Shankar <ushankar@purestorage.com>,
Stefani Seibold <stefani@seibold.net>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH V4 00/27] ublk: add UBLK_F_BATCH_IO
Date: Sat, 29 Nov 2025 09:24:07 +0800
Message-ID: <aSpLN3xPwCqToYrZ@fedora>
In-Reply-To: <CADUfDZoZ4Atind4x=GFsJ=H0TpSPFW2Ys2c5AQOMH3LnguSthw@mail.gmail.com>
On Fri, Nov 28, 2025 at 11:07:17AM -0800, Caleb Sander Mateos wrote:
> On Fri, Nov 28, 2025 at 8:19 AM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > On 11/28/25 4:59 AM, Ming Lei wrote:
> > > On Fri, Nov 21, 2025 at 09:58:22AM +0800, Ming Lei wrote:
> > >> Hello,
> > >>
> > >> This patchset adds the UBLK_F_BATCH_IO feature for batched communication
> > >> between the kernel and the ublk server:
> > >>
> > >> - Per-queue vs Per-I/O: Commands operate on queues rather than individual I/Os
> > >>
> > >> - Batch processing: Multiple I/Os are handled in single operation
> > >>
> > >> - Multishot commands: Use io_uring multishot for reducing submission overhead
> > >>
> > >> - Flexible task assignment: Any task can handle any I/O (no per-I/O daemons)
> > >>
> > >> - Better load balancing: Tasks can adjust their workload dynamically
> > >>
> > >> - help for future optimizations:
> > >>   - blk-mq batch tags free
> > >>   - support io-poll
> > >>   - per-task batch for avoiding per-io lock
> > >>   - fetch command priority
> > >>
> > >> - simplify command cancel process with per-queue lock
> > >>
> > >> selftests are provided.
> > >>
> > >>
> > >> Performance test result(IOPS) on V3:
> > >>
> > >> - page copy
> > >>
> > >> tools/testing/selftests/ublk//kublk add -t null -q 16 [-b]
> > >>
> > >> - zero copy(--auto_zc)
> > >> tools/testing/selftests/ublk//kublk add -t null -q 16 --auto_zc [-b]
> > >>
> > >> - IO test
> > >> taskset -c 0-31 fio/t/io_uring -p0 -n $JOBS -r 30 /dev/ublkb0
> > >>
> > >> 1) 16 jobs IO
> > >> - page copy: 37.77M vs. 42.40M(BATCH_IO), +12%
> > >> - zero copy(--auto_zc): 42.83M vs. 44.43M(BATCH_IO), +3.7%
> > >>
> > >>
> > >> 2) single job IO
> > >> - page copy: 2.54M vs. 2.6M(BATCH_IO), +2.3%
> > >> - zero copy(--auto_zc): 3.13M vs. 3.35M(BATCH_IO), +7%
> > >>
> > >>
> > >> V4:
> > >> - fix handling in case of running out of mshot buffer, request has to
> > >> be un-prepared for zero copy
> > >> - don't expose unused tag to userspace
> > >> - replace fixed buffer with plain user buffer for
> > >> UBLK_U_IO_PREP_IO_CMDS and UBLK_U_IO_COMMIT_IO_CMDS
> > >> - replace iov iterator with plain copy_from_user() for
> > >> ublk_walk_cmd_buf(), code is simplified with performance improvement
> > >> - don't touch sqe->len for UBLK_U_IO_PREP_IO_CMDS and
> > >> UBLK_U_IO_COMMIT_IO_CMDS(Caleb Sander Mateos)
> > >> - use READ_ONCE() for access sqe->addr (Caleb Sander Mateos)
> > >> - all kinds of patch style fix(Caleb Sander Mateos)
> > >> - inline __kfifo_alloc() (Caleb Sander Mateos)
> > >
> > > Hi Caleb Sander Mateos and Jens,
> > >
> > > Caleb has reviewed patch 1 ~ patch 8, and driver patches 9 ~ 18 are not
> > > reviewed yet.
> > >
> > > I'd like to hear your thoughts on how to move on. So far, it looks like there are
> > > several options:
> > >
> > > 1) merge patch 1 ~ patch 6 to v6.19 first, which can be prep patches for BATCH_IO
> > >
> > > 2) delay the whole patchset to v6.20 cycle
> > >
> > > 3) merge the whole patchset to v6.19
> > >
> > > I am fine with any of them; which one do you prefer?
> > >
> > > BTW, V4 passes all builtin functional and stress tests, and there is just one small bug
> > > fix not posted yet, which can be a follow-up. The new feature takes a standalone
> > > code path, so regression risk is pretty small.
> >
> > I'm fine taking the whole thing for 6.19. Caleb let me know if you
> > disagree. I'll queue 1..6 for now, then can follow up later today with
> > the rest as needed.
>
> Sorry I haven't gotten around to reviewing the rest of the series yet.
> I will try to take a look at them all this weekend. I'm not sure the
> batching feature would make sense for our ublk application use case,
> but I have no objection to it as long as it doesn't regress the
> non-batched ublk behavior/performance.
> No problem with queueing up patches 1-6 now (though patch 1 may need
> an ack from a kfifo maintainer?).
BTW, the BATCH_IO feature brings many good things:
- batch blk-mq completion: page copy IO mode has shown >12% IOPS improvement, and
there is a chance to apply it to zero copy too in the future
- io poll becomes much easier to support: it can be used to poll nvme char/block devices
to get better IOPS
- the io cancel code path becomes less fragile and easier to debug: in a typical
implementation, there are only one or two per-queue FETCH (multishot)
commands; the others are just sync one-shot commands.
- more chances to improve perf: lots of generic uring_cmd code path
cost is saved, such as security_uring_cmd()
- `perf bug fix` for UBLK_F_PER_IO_DAEMON, together with robust load balancing
support
IOPS is improved by 4X-5X in `fio/t/io_uring -p0 /dev/ublkbN` between:
./kublk add -t null --nthreads 8 -q 4 --per_io_tasks
and
./kublk add -t null --nthreads 8 -q 4 -b
- with per-io lock: the fast io path becomes more robust, and the lock can still
be bypassed in the future in the per-io-daemon case
The cost is some complexity in the ublk server implementation for maintaining
one or two per-queue FETCH buffers, and one or two per-queue COMMIT buffers.
Thanks,
Ming
Thread overview: 66+ messages
2025-11-21 1:58 [PATCH V4 00/27] ublk: add UBLK_F_BATCH_IO Ming Lei
2025-11-21 1:58 ` [PATCH V4 01/27] kfifo: add kfifo_alloc_node() helper for NUMA awareness Ming Lei
2025-11-29 19:12 ` Caleb Sander Mateos
2025-12-01 1:46 ` Ming Lei
2025-12-01 5:58 ` Caleb Sander Mateos
2025-11-21 1:58 ` [PATCH V4 02/27] ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg() Ming Lei
2025-11-21 1:58 ` [PATCH V4 03/27] ublk: add `union ublk_io_buf` with improved naming Ming Lei
2025-11-21 1:58 ` [PATCH V4 04/27] ublk: refactor auto buffer register in ublk_dispatch_req() Ming Lei
2025-11-21 1:58 ` [PATCH V4 05/27] ublk: pass const pointer to ublk_queue_is_zoned() Ming Lei
2025-11-21 1:58 ` [PATCH V4 06/27] ublk: add helper of __ublk_fetch() Ming Lei
2025-11-21 1:58 ` [PATCH V4 07/27] ublk: define ublk_ch_batch_io_fops for the coming feature F_BATCH_IO Ming Lei
2025-11-21 1:58 ` [PATCH V4 08/27] ublk: prepare for not tracking task context for command batch Ming Lei
2025-11-21 1:58 ` [PATCH V4 09/27] ublk: add new batch command UBLK_U_IO_PREP_IO_CMDS & UBLK_U_IO_COMMIT_IO_CMDS Ming Lei
2025-11-29 19:19 ` Caleb Sander Mateos
2025-11-21 1:58 ` [PATCH V4 10/27] ublk: handle UBLK_U_IO_PREP_IO_CMDS Ming Lei
2025-11-29 19:47 ` Caleb Sander Mateos
2025-11-30 19:25 ` Caleb Sander Mateos
2025-11-21 1:58 ` [PATCH V4 11/27] ublk: handle UBLK_U_IO_COMMIT_IO_CMDS Ming Lei
2025-11-30 16:39 ` Caleb Sander Mateos
2025-12-01 10:25 ` Ming Lei
2025-12-01 16:43 ` Caleb Sander Mateos
2025-11-21 1:58 ` [PATCH V4 12/27] ublk: add io events fifo structure Ming Lei
2025-11-30 16:53 ` Caleb Sander Mateos
2025-12-01 3:04 ` Ming Lei
2025-11-21 1:58 ` [PATCH V4 13/27] ublk: add batch I/O dispatch infrastructure Ming Lei
2025-11-30 19:24 ` Caleb Sander Mateos
2025-11-30 21:37 ` Caleb Sander Mateos
2025-12-01 2:32 ` Ming Lei
2025-12-01 17:37 ` Caleb Sander Mateos
2025-11-21 1:58 ` [PATCH V4 14/27] ublk: add UBLK_U_IO_FETCH_IO_CMDS for batch I/O processing Ming Lei
2025-12-01 5:55 ` Caleb Sander Mateos
2025-12-01 9:41 ` Ming Lei
2025-12-01 17:51 ` Caleb Sander Mateos
2025-12-02 1:27 ` Ming Lei
2025-12-02 1:39 ` Caleb Sander Mateos
2025-12-02 8:14 ` Ming Lei
2025-12-02 15:20 ` Caleb Sander Mateos
2025-11-21 1:58 ` [PATCH V4 15/27] ublk: abort requests filled in event kfifo Ming Lei
2025-12-01 18:52 ` Caleb Sander Mateos
2025-12-02 1:29 ` Ming Lei
2025-12-01 19:00 ` Caleb Sander Mateos
2025-11-21 1:58 ` [PATCH V4 16/27] ublk: add new feature UBLK_F_BATCH_IO Ming Lei
2025-12-01 21:16 ` Caleb Sander Mateos
2025-12-02 1:44 ` Ming Lei
2025-12-02 16:05 ` Caleb Sander Mateos
2025-12-03 2:21 ` Ming Lei
2025-11-21 1:58 ` [PATCH V4 17/27] ublk: document " Ming Lei
2025-12-01 21:46 ` Caleb Sander Mateos
2025-12-02 1:55 ` Ming Lei
2025-12-02 2:03 ` Ming Lei
2025-11-21 1:58 ` [PATCH V4 18/27] ublk: implement batch request completion via blk_mq_end_request_batch() Ming Lei
2025-12-01 21:55 ` Caleb Sander Mateos
2025-11-21 1:58 ` [PATCH V4 19/27] selftests: ublk: fix user_data truncation for tgt_data >= 256 Ming Lei
2025-11-21 1:58 ` [PATCH V4 20/27] selftests: ublk: replace assert() with ublk_assert() Ming Lei
2025-11-21 1:58 ` [PATCH V4 21/27] selftests: ublk: add ublk_io_buf_idx() for returning io buffer index Ming Lei
2025-11-21 1:58 ` [PATCH V4 22/27] selftests: ublk: add batch buffer management infrastructure Ming Lei
2025-11-21 1:58 ` [PATCH V4 23/27] selftests: ublk: handle UBLK_U_IO_PREP_IO_CMDS Ming Lei
2025-11-21 1:58 ` [PATCH V4 24/27] selftests: ublk: handle UBLK_U_IO_COMMIT_IO_CMDS Ming Lei
2025-11-21 1:58 ` [PATCH V4 25/27] selftests: ublk: handle UBLK_U_IO_FETCH_IO_CMDS Ming Lei
2025-11-21 1:58 ` [PATCH V4 26/27] selftests: ublk: add --batch/-b for enabling F_BATCH_IO Ming Lei
2025-11-21 1:58 ` [PATCH V4 27/27] selftests: ublk: support arbitrary threads/queues combination Ming Lei
2025-11-28 11:59 ` [PATCH V4 00/27] ublk: add UBLK_F_BATCH_IO Ming Lei
2025-11-28 16:19 ` Jens Axboe
2025-11-28 19:07 ` Caleb Sander Mateos
2025-11-29 1:24 ` Ming Lei [this message]
2025-11-28 16:22 ` (subset) " Jens Axboe