public inbox for linux-block@vger.kernel.org
From: Caleb Sander Mateos <csander@purestorage.com>
To: Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org,
	Caleb Sander Mateos <csander@purestorage.com>
Subject: [PATCH v2 00/14] ublk: allow off-daemon zero-copy buffer registration
Date: Fri, 20 Jun 2025 09:09:54 -0600
Message-ID: <20250620151008.3976463-1-csander@purestorage.com>

Currently, ublk zero-copy requires ublk request buffers to be registered
and unregistered by the ublk I/O's daemon task. However, nothing in the
current implementation depends on this restriction. Registration looks
up the request via the ublk device's tagset rather than the daemon-local
ublk_io structure and takes an atomic reference to prevent racing with
dispatch or completion of the request. Ming has expressed interest in
relaxing this restriction[1] so the ublk server can offload the I/O
operation that uses the zero-copy buffer to another thread.
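
As a rough illustration of why an atomic reference makes registration
safe from any task, here is a minimal userspace sketch in C11. It is not
the driver code: struct req_model and both helpers are hypothetical
names, and the model assumes only that a count of zero means the request
has already completed and must not be referenced again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-request reference count. */
struct req_model {
	atomic_uint ref;	/* 0 means the request has completed */
};

/* Take a reference unless the count has already dropped to zero. */
static bool req_model_tryget(struct req_model *r)
{
	unsigned int old = atomic_load(&r->ref);

	while (old != 0) {
		/* On failure, old is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&r->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller released the last one. */
static bool req_model_put(struct req_model *r)
{
	unsigned int old = atomic_fetch_sub(&r->ref, 1);

	assert(old > 0);
	return old == 1;
}
```

In this model, a tryget that loses the race with the final put fails
cleanly instead of touching a completed request, regardless of which
thread calls it.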

Additionally, optimize the buffer registration for the common case where
the buffer is registered and unregistered by the daemon task. Skip the
expensive atomic reference count increment and decrement and the several
pointer dereferences involved in looking up the request on the tagset.
On our ublk server threads, this results in a 24% decrease in the CPU
time spent in the ublk functions on a 4K zero-copy read workload, from
8.75% to 6.68%.
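
The idea behind the daemon-task fast path can be sketched in userspace
C11 as follows. This is a simplified model under stated assumptions, not
the layout of struct ublk_io or the code in this series: struct io_model,
its fields, and the on_daemon flag are hypothetical, and the model
assumes a buffer registered on the daemon thread is also unregistered
there.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-I/O state; not the layout of struct ublk_io. */
struct io_model {
	atomic_uint shared_refs;	/* taken from threads other than the daemon */
	unsigned int daemon_bufs;	/* touched only by the daemon thread */
};

static void io_model_register(struct io_model *io, bool on_daemon)
{
	if (on_daemon)
		io->daemon_bufs++;	/* fast path: plain increment */
	else
		atomic_fetch_add(&io->shared_refs, 1);	/* slow path: atomic */
}

static void io_model_unregister(struct io_model *io, bool on_daemon)
{
	if (on_daemon) {
		assert(io->daemon_bufs > 0);
		io->daemon_bufs--;	/* fast path: plain decrement */
	} else {
		unsigned int old = atomic_fetch_sub(&io->shared_refs, 1);

		assert(old > 0);
		(void)old;	/* silence unused warning when NDEBUG is set */
	}
}

/* Called on the daemon thread: buffers still registered by any thread. */
static unsigned int io_model_outstanding(struct io_model *io)
{
	return io->daemon_bufs + atomic_load(&io->shared_refs);
}
```

Because only one thread ever writes daemon_bufs, the common case needs
no atomic read-modify-write; the atomic counter is paid for only when
another thread registers or unregisters a buffer.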

Two preliminary fixes are included as well:
- Move the ublk request reference count from ublk_rq_data (a tail
  allocation of struct request) to ublk_io to prevent a use-after-free
  if the struct request is freed concurrently with taking a reference.
- Don't allocate physically contiguous memory for __queues, which can be
  very large. Doubling the size of struct ublk_io caused ENOMEM errors
  when adding a ublk device before this change.

[1]: https://lore.kernel.org/linux-block/aAmYJxaV1-yWEMRo@fedora/

v2:
- Add 2 fix patches
- Optimize buffer unregistration too
- Cache-align ublk_io
- Restore check for zero-copy support in ublk_unregister_io_buf()
- Check for registered buffer count overflow

Caleb Sander Mateos (14):
  ublk: use vmalloc for ublk_device's __queues
  ublk: remove struct ublk_rq_data
  ublk: check cmd_op first
  ublk: handle UBLK_IO_FETCH_REQ earlier
  ublk: remove task variable from __ublk_ch_uring_cmd()
  ublk: consolidate UBLK_IO_FLAG_{ACTIVE,OWNED_BY_SRV} checks
  ublk: move ublk_prep_cancel() to case UBLK_IO_COMMIT_AND_FETCH_REQ
  ublk: don't take ublk_queue in ublk_unregister_io_buf()
  ublk: allow UBLK_IO_(UN)REGISTER_IO_BUF on any task
  ublk: return early if blk_should_fake_timeout()
  ublk: optimize UBLK_IO_REGISTER_IO_BUF on daemon task
  ublk: optimize UBLK_IO_UNREGISTER_IO_BUF on daemon task
  ublk: remove ubq checks from ublk_{get,put}_req_ref()
  ublk: cache-align struct ublk_io

 drivers/block/ublk_drv.c      | 283 +++++++++++++++++++++-------------
 include/uapi/linux/ublk_cmd.h |  10 ++
 2 files changed, 187 insertions(+), 106 deletions(-)

-- 
2.45.2


Thread overview: 26+ messages
2025-06-20 15:09 Caleb Sander Mateos [this message]
2025-06-20 15:09 ` [PATCH v2 01/14] ublk: use vmalloc for ublk_device's __queues Caleb Sander Mateos
2025-06-23  7:20   ` Ming Lei
2025-06-20 15:09 ` [PATCH v2 02/14] ublk: remove struct ublk_rq_data Caleb Sander Mateos
2025-06-23  8:02   ` Ming Lei
2025-06-20 15:09 ` [PATCH v2 03/14] ublk: check cmd_op first Caleb Sander Mateos
2025-06-20 15:09 ` [PATCH v2 04/14] ublk: handle UBLK_IO_FETCH_REQ earlier Caleb Sander Mateos
2025-06-20 15:09 ` [PATCH v2 05/14] ublk: remove task variable from __ublk_ch_uring_cmd() Caleb Sander Mateos
2025-06-20 15:10 ` [PATCH v2 06/14] ublk: consolidate UBLK_IO_FLAG_{ACTIVE,OWNED_BY_SRV} checks Caleb Sander Mateos
2025-06-20 15:10 ` [PATCH v2 07/14] ublk: move ublk_prep_cancel() to case UBLK_IO_COMMIT_AND_FETCH_REQ Caleb Sander Mateos
2025-06-20 15:10 ` [PATCH v2 08/14] ublk: don't take ublk_queue in ublk_unregister_io_buf() Caleb Sander Mateos
2025-06-23  8:29   ` Ming Lei
2025-06-20 15:10 ` [PATCH v2 09/14] ublk: allow UBLK_IO_(UN)REGISTER_IO_BUF on any task Caleb Sander Mateos
2025-06-23  9:07   ` Ming Lei
2025-06-20 15:10 ` [PATCH v2 10/14] ublk: return early if blk_should_fake_timeout() Caleb Sander Mateos
2025-06-23  9:08   ` Ming Lei
2025-06-20 15:10 ` [PATCH v2 11/14] ublk: optimize UBLK_IO_REGISTER_IO_BUF on daemon task Caleb Sander Mateos
2025-06-23  9:44   ` Ming Lei
2025-06-20 15:10 ` [PATCH v2 12/14] ublk: optimize UBLK_IO_UNREGISTER_IO_BUF " Caleb Sander Mateos
2025-06-23  9:45   ` Ming Lei
2025-06-20 15:10 ` [PATCH v2 13/14] ublk: remove ubq checks from ublk_{get,put}_req_ref() Caleb Sander Mateos
2025-06-23  9:49   ` Ming Lei
2025-06-20 15:10 ` [PATCH v2 14/14] ublk: cache-align struct ublk_io Caleb Sander Mateos
2025-06-23  9:49   ` Ming Lei
2025-06-27  0:47 ` [PATCH v2 00/14] ublk: allow off-daemon zero-copy buffer registration Jens Axboe
2025-06-27  0:48   ` Jens Axboe
