From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org,
	Uday Shankar <ushankar@purestorage.com>,
	Stefani Seibold <stefani@seibold.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH V4 00/27] ublk: add UBLK_F_BATCH_IO
Date: Sat, 29 Nov 2025 09:24:07 +0800
Message-ID: <aSpLN3xPwCqToYrZ@fedora>
In-Reply-To: <CADUfDZoZ4Atind4x=GFsJ=H0TpSPFW2Ys2c5AQOMH3LnguSthw@mail.gmail.com>

On Fri, Nov 28, 2025 at 11:07:17AM -0800, Caleb Sander Mateos wrote:
> On Fri, Nov 28, 2025 at 8:19 AM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > On 11/28/25 4:59 AM, Ming Lei wrote:
> > > On Fri, Nov 21, 2025 at 09:58:22AM +0800, Ming Lei wrote:
> > >> Hello,
> > >>
> > >> This patchset adds UBLK_F_BATCH_IO feature for communicating between kernel and ublk
> > >> server in batching way:
> > >>
> > >> - Per-queue vs Per-I/O: Commands operate on queues rather than individual I/Os
> > >>
> > >> - Batch processing: Multiple I/Os are handled in single operation
> > >>
> > >> - Multishot commands: Use io_uring multishot for reducing submission overhead
> > >>
> > >> - Flexible task assignment: Any task can handle any I/O (no per-I/O daemons)
> > >>
> > >> - Better load balancing: Tasks can adjust their workload dynamically
> > >>
> > >> - help for future optimizations:
> > >>      - blk-mq batch tags free
> > >>      - support io-poll
> > >>      - per-task batch for avoiding per-io lock
> > >>      - fetch command priority
> > >>
> > >> - simplify command cancel process with per-queue lock
> > >>
> > >> selftest are provided.
> > >>
> > >>
> > >> Performance test result(IOPS) on V3:
> > >>
> > >> - page copy
> > >>
> > >> tools/testing/selftests/ublk//kublk add -t null -q 16 [-b]
> > >>
> > >> - zero copy(--auto_zc)
> > >> tools/testing/selftests/ublk//kublk add -t null -q 16 --auto_zc [-b]
> > >>
> > >> - IO test
> > >> taskset -c 0-31 fio/t/io_uring -p0 -n $JOBS -r 30 /dev/ublkb0
> > >>
> > >> 1) 16 jobs IO
> > >> - page copy:                         37.77M vs. 42.40M(BATCH_IO), +12%
> > >> - zero copy(--auto_zc): 42.83M vs. 44.43M(BATCH_IO), +3.7%
> > >>
> > >>
> > >> 2) single job IO
> > >> - page copy:                         2.54M vs. 2.6M(BATCH_IO),   +2.3%
> > >> - zero copy(--auto_zc): 3.13M vs. 3.35M(BATCH_IO),  +7%
> > >>
> > >>
> > >> V4:
> > >>      - fix handling in case of running out of mshot buffer, request has to
> > >>        be un-prepared for zero copy
> > >>      - don't expose unused tag to userspace
> > >>      - replace fixed buffer with plain user buffer for
> > >>        UBLK_U_IO_PREP_IO_CMDS and UBLK_U_IO_COMMIT_IO_CMDS
> > >>      - replace iov iterator with plain copy_from_user() for
> > >>        ublk_walk_cmd_buf(), code is simplified with performance improvement
> > >>      - don't touch sqe->len for UBLK_U_IO_PREP_IO_CMDS and
> > >>        UBLK_U_IO_COMMIT_IO_CMDS(Caleb Sander Mateos)
> > >>      - use READ_ONCE() for access sqe->addr (Caleb Sander Mateos)
> > >>      - all kinds of patch style fix(Caleb Sander Mateos)
> > >>      - inline __kfifo_alloc() (Caleb Sander Mateos)
> > >
> > > Hi Caleb Sander Mateos and Jens,
> > >
> > > Caleb have reviewed patch 1 ~ patch 8, and driver patch 9 ~ patch 18 are not
> > > reviewed yet.
> > >
> > > I'd want to hear your idea for how to move on. So far, looks there are
> > > several ways:
> > >
> > > 1) merge patch 1 ~ patch 6 to v6.19 first, which can be prep patches for BATCH_IO
> > >
> > > 2) delay the whole patchset to v6.20 cycle
> > >
> > > 3) merge the whole patchset to v6.19
> > >
> > > I am fine with either one, which one do you prefer to?
> > >
> > > BTW, V4 pass all builtin function and stress tests, and there is just one small bug
> > > fix not posted yet, which can be a follow-up. The new feature takes standalone
> > > code path, so regression risk is pretty small.
> >
> > I'm fine taking the whole thing for 6.19. Caleb let me know if you
> > disagree. I'll queue 1..6 for now, then can follow up later today with
> > the rest as needed.
> 
> Sorry I haven't gotten around to reviewing the rest of the series yet.
> I will try to take a look at them all this weekend. I'm not sure the
> batching feature would make sense for our ublk application use case,
> but I have no objection to it as long as it doesn't regress the
> non-batched ublk behavior/performance.
> No problem with queueing up patches 1-6 now (though patch 1 may need
> an ack from a kfifo maintainer?).

BTW, the BATCH_IO feature brings many good things:

- batch blk-mq completion: page copy IO mode has shown >12% IOPS improvement, and
  there is a chance to apply it to zero copy too in future

- io poll becomes much easier to support: it can be used to poll an nvme
  char/block device to get better IOPS

- the io cancel code path becomes less fragile and easier to debug: in a typical
  implementation there are only one or two per-queue FETCH (multishot)
  commands, and the others are just sync one-shot commands.

- more chances to improve perf: a lot of generic uring_cmd code path cost
  is saved, such as security_uring_cmd()

- `perf bug fix` for UBLK_F_PER_IO_DAEMON and, at the same time, robust load
  balance support

	IOPS is improved by 4X-5X in `fio/t/io_uring -p0 /dev/ublkbN` between:
		./kublk add -t null  --nthreads 8 -q 4 --per_io_tasks
		and
		./kublk add -t null  --nthreads 8 -q 4 -b

- with the per-io lock, the fast io path becomes more robust; the lock can
  still be bypassed in future in the per-io-daemon case
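
As a quick sanity check on the ">12%" figure, the relative gains can be recomputed from the IOPS numbers quoted above from the cover letter (measured on V3). This is only arithmetic on those numbers; the helper name `gain_pct` is made up for illustration.

```python
def gain_pct(baseline_miops: float, batch_miops: float) -> float:
    """Relative IOPS gain of BATCH_IO over the baseline, in percent."""
    return (batch_miops - baseline_miops) / baseline_miops * 100.0


# (baseline, BATCH_IO) in millions of IOPS, copied from the quoted cover letter.
results = {
    "16 jobs, page copy": (37.77, 42.40),
    "16 jobs, zero copy": (42.83, 44.43),
    "1 job, page copy": (2.54, 2.6),
    "1 job, zero copy": (3.13, 3.35),
}

for name, (base, batch) in results.items():
    print(f"{name}: +{gain_pct(base, batch):.2f}%")
```

The 16-job page copy case works out to a little over 12%, which is where the ">12%" above comes from.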


The cost is some complexity in the ublk server implementation for maintaining
one or two per-queue FETCH buffers and one or two per-queue COMMIT buffers.
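
To give a feel for that bookkeeping, here is a toy model of one queue in a batch-mode server. It is a sketch under loud assumptions, not the real ublk uapi: the class `BatchQueue`, the `submit_commit` callback, and the (tag, result) layout of a commit entry are all hypothetical, and no io_uring calls are made. It only shows the shape of the work: each FETCH completion delivers a batch of tags, and results are accumulated in a per-queue COMMIT buffer that is submitted when full or on an explicit flush.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

CommitEntry = Tuple[int, int]  # (tag, result); hypothetical layout


@dataclass
class BatchQueue:
    """Toy model of one ublk queue in batch mode (not the real uapi)."""

    commit_capacity: int
    # Hypothetical stand-in for submitting one COMMIT command to the kernel.
    submit_commit: Callable[[List[CommitEntry]], None]
    commit_buf: List[CommitEntry] = field(default_factory=list)

    def on_fetch(self, tags: List[int], handle: Callable[[int], int]) -> None:
        """Process every tag delivered by one FETCH completion."""
        for tag in tags:
            self.complete(tag, handle(tag))

    def complete(self, tag: int, result: int) -> None:
        """Queue one result; submit the COMMIT buffer once it is full."""
        self.commit_buf.append((tag, result))
        if len(self.commit_buf) >= self.commit_capacity:
            self.flush()

    def flush(self) -> None:
        """Submit whatever is pending and start a fresh COMMIT buffer."""
        if self.commit_buf:
            self.submit_commit(self.commit_buf)
            self.commit_buf = []


submitted: List[List[CommitEntry]] = []
queue = BatchQueue(commit_capacity=4, submit_commit=submitted.append)
queue.on_fetch([0, 1, 2, 3, 4, 5], handle=lambda tag: 0)
queue.flush()
print(submitted)
```

With a capacity of 4, the six tags above end up in two COMMIT submissions instead of six per-I/O ones, which is the point of the extra buffer management.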


Thanks,
Ming


Thread overview: 66+ messages
2025-11-21  1:58 [PATCH V4 00/27] ublk: add UBLK_F_BATCH_IO Ming Lei
2025-11-21  1:58 ` [PATCH V4 01/27] kfifo: add kfifo_alloc_node() helper for NUMA awareness Ming Lei
2025-11-29 19:12   ` Caleb Sander Mateos
2025-12-01  1:46     ` Ming Lei
2025-12-01  5:58       ` Caleb Sander Mateos
2025-11-21  1:58 ` [PATCH V4 02/27] ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg() Ming Lei
2025-11-21  1:58 ` [PATCH V4 03/27] ublk: add `union ublk_io_buf` with improved naming Ming Lei
2025-11-21  1:58 ` [PATCH V4 04/27] ublk: refactor auto buffer register in ublk_dispatch_req() Ming Lei
2025-11-21  1:58 ` [PATCH V4 05/27] ublk: pass const pointer to ublk_queue_is_zoned() Ming Lei
2025-11-21  1:58 ` [PATCH V4 06/27] ublk: add helper of __ublk_fetch() Ming Lei
2025-11-21  1:58 ` [PATCH V4 07/27] ublk: define ublk_ch_batch_io_fops for the coming feature F_BATCH_IO Ming Lei
2025-11-21  1:58 ` [PATCH V4 08/27] ublk: prepare for not tracking task context for command batch Ming Lei
2025-11-21  1:58 ` [PATCH V4 09/27] ublk: add new batch command UBLK_U_IO_PREP_IO_CMDS & UBLK_U_IO_COMMIT_IO_CMDS Ming Lei
2025-11-29 19:19   ` Caleb Sander Mateos
2025-11-21  1:58 ` [PATCH V4 10/27] ublk: handle UBLK_U_IO_PREP_IO_CMDS Ming Lei
2025-11-29 19:47   ` Caleb Sander Mateos
2025-11-30 19:25   ` Caleb Sander Mateos
2025-11-21  1:58 ` [PATCH V4 11/27] ublk: handle UBLK_U_IO_COMMIT_IO_CMDS Ming Lei
2025-11-30 16:39   ` Caleb Sander Mateos
2025-12-01 10:25     ` Ming Lei
2025-12-01 16:43       ` Caleb Sander Mateos
2025-11-21  1:58 ` [PATCH V4 12/27] ublk: add io events fifo structure Ming Lei
2025-11-30 16:53   ` Caleb Sander Mateos
2025-12-01  3:04     ` Ming Lei
2025-11-21  1:58 ` [PATCH V4 13/27] ublk: add batch I/O dispatch infrastructure Ming Lei
2025-11-30 19:24   ` Caleb Sander Mateos
2025-11-30 21:37     ` Caleb Sander Mateos
2025-12-01  2:32     ` Ming Lei
2025-12-01 17:37       ` Caleb Sander Mateos
2025-11-21  1:58 ` [PATCH V4 14/27] ublk: add UBLK_U_IO_FETCH_IO_CMDS for batch I/O processing Ming Lei
2025-12-01  5:55   ` Caleb Sander Mateos
2025-12-01  9:41     ` Ming Lei
2025-12-01 17:51       ` Caleb Sander Mateos
2025-12-02  1:27         ` Ming Lei
2025-12-02  1:39           ` Caleb Sander Mateos
2025-12-02  8:14             ` Ming Lei
2025-12-02 15:20               ` Caleb Sander Mateos
2025-11-21  1:58 ` [PATCH V4 15/27] ublk: abort requests filled in event kfifo Ming Lei
2025-12-01 18:52   ` Caleb Sander Mateos
2025-12-02  1:29     ` Ming Lei
2025-12-01 19:00   ` Caleb Sander Mateos
2025-11-21  1:58 ` [PATCH V4 16/27] ublk: add new feature UBLK_F_BATCH_IO Ming Lei
2025-12-01 21:16   ` Caleb Sander Mateos
2025-12-02  1:44     ` Ming Lei
2025-12-02 16:05       ` Caleb Sander Mateos
2025-12-03  2:21         ` Ming Lei
2025-11-21  1:58 ` [PATCH V4 17/27] ublk: document " Ming Lei
2025-12-01 21:46   ` Caleb Sander Mateos
2025-12-02  1:55     ` Ming Lei
2025-12-02  2:03     ` Ming Lei
2025-11-21  1:58 ` [PATCH V4 18/27] ublk: implement batch request completion via blk_mq_end_request_batch() Ming Lei
2025-12-01 21:55   ` Caleb Sander Mateos
2025-11-21  1:58 ` [PATCH V4 19/27] selftests: ublk: fix user_data truncation for tgt_data >= 256 Ming Lei
2025-11-21  1:58 ` [PATCH V4 20/27] selftests: ublk: replace assert() with ublk_assert() Ming Lei
2025-11-21  1:58 ` [PATCH V4 21/27] selftests: ublk: add ublk_io_buf_idx() for returning io buffer index Ming Lei
2025-11-21  1:58 ` [PATCH V4 22/27] selftests: ublk: add batch buffer management infrastructure Ming Lei
2025-11-21  1:58 ` [PATCH V4 23/27] selftests: ublk: handle UBLK_U_IO_PREP_IO_CMDS Ming Lei
2025-11-21  1:58 ` [PATCH V4 24/27] selftests: ublk: handle UBLK_U_IO_COMMIT_IO_CMDS Ming Lei
2025-11-21  1:58 ` [PATCH V4 25/27] selftests: ublk: handle UBLK_U_IO_FETCH_IO_CMDS Ming Lei
2025-11-21  1:58 ` [PATCH V4 26/27] selftests: ublk: add --batch/-b for enabling F_BATCH_IO Ming Lei
2025-11-21  1:58 ` [PATCH V4 27/27] selftests: ublk: support arbitrary threads/queues combination Ming Lei
2025-11-28 11:59 ` [PATCH V4 00/27] ublk: add UBLK_F_BATCH_IO Ming Lei
2025-11-28 16:19   ` Jens Axboe
2025-11-28 19:07     ` Caleb Sander Mateos
2025-11-29  1:24       ` Ming Lei [this message]
2025-11-28 16:22 ` (subset) " Jens Axboe
