Re: ublk: RFC fetch_req_multishot

From: Ming Lei <ming.lei@redhat.com>
To: Ofer Oshri <ofer@nvidia.com>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	Jared Holzman <jholzman@nvidia.com>, Yoav Cohen <yoav@nvidia.com>,
	Guy Eisenberg <geisenberg@nvidia.com>,
	Omri Levi <omril@nvidia.com>,
	Caleb Sander Mateos <csander@purestorage.com>
Subject: Re: ublk: RFC fetch_req_multishot
Date: Fri, 25 Apr 2025 12:10:38 +0800
Message-ID: <aAsLPk6x0a2HUG4m@fedora>
In-Reply-To: <IA1PR12MB606744884B96E0103570A1E9B6852@IA1PR12MB6067.namprd12.prod.outlook.com>

On Thu, Apr 24, 2025 at 06:19:29PM +0000, Ofer Oshri wrote:
> Hi,
> 
> Our code uses a single io_uring per core, which is shared among all block devices - meaning each block device on a core uses the same io_uring.
> 

Do I understand correctly that you are using a single io_uring to serve
one hw queue of multiple ublk devices?

> Let’s say the size of the io_uring is N. Each block device submits M UBLK_U_IO_FETCH_REQ requests. As a result, with the current implementation, we can only support up to P block devices, where P = N / M. This means that when we attempt to support block device P+1, it will fail due to io_uring exhaustion.
> 

Supposing N is the SQ size, the supported number of ublk devices can be much
bigger than N/M, because an SQE is freed and available again as soon as it
has been issued to the kernel. Here, the SQE is free for reuse once its
UBLK_U_IO_FETCH_REQ uring_cmd has been issued to the ublk driver.

That is to say, you can queue an arbitrary number of uring_cmds with a fixed
SQ size, since N is just the submission batch size.

But this requires the ublk server implementation to flush the queued SQEs
when io_uring_get_sqe() returns NULL.
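
A minimal sketch of that flush-on-full pattern, written as a toy model rather
than against liburing itself. The toy_* names are hypothetical: toy_get_sqe()
stands in for io_uring_get_sqe() and toy_submit() for io_uring_submit(), under
the assumption that submitting hands every queued SQE to the kernel and frees
its slot (a real server using SQPOLL may behave differently).

```c
#include <assert.h>

/* Toy model of a fixed-size submission queue (not the liburing layout). */
struct toy_sq {
    unsigned size;        /* N: number of SQE slots, must be > 0 */
    unsigned queued;      /* SQEs filled in but not yet submitted */
    unsigned long issued; /* total commands handed to the "kernel" */
};

/* Stand-in for io_uring_get_sqe(): returns 0 when no slot is free. */
static int toy_get_sqe(struct toy_sq *sq)
{
    if (sq->queued == sq->size)
        return 0;
    sq->queued++;
    return 1;
}

/* Stand-in for io_uring_submit(): all queued SQEs are consumed. */
static unsigned toy_submit(struct toy_sq *sq)
{
    unsigned n = sq->queued;

    sq->issued += n;
    sq->queued = 0;
    return n;
}

/* Queue one fetch command, flushing first if the SQ is full. */
static void toy_queue_fetch(struct toy_sq *sq)
{
    if (!toy_get_sqe(sq)) {
        int ok;

        toy_submit(sq);
        ok = toy_get_sqe(sq); /* the ring was just emptied */
        assert(ok);
        (void)ok;
    }
}

/* Queue devices * depth fetch commands through one ring of sq_size slots. */
static unsigned long toy_queue_all(unsigned sq_size, unsigned devices,
                                   unsigned depth)
{
    struct toy_sq sq = { .size = sq_size, .queued = 0, .issued = 0 };

    for (unsigned d = 0; d < devices; d++)
        for (unsigned i = 0; i < depth; i++)
            toy_queue_fetch(&sq);
    toy_submit(&sq); /* flush the final partial batch */
    return sq.issued;
}
```

With N = 8 and M = 16, toy_queue_all(8, 3, 16) issues all 48 fetch commands
for three devices even though N / M is less than one.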

> To address this, we’d like to propose an enhancement to the ublk driver. The idea is inspired by the multi-shot concept, where a single request allows multiple replies.
> 
> We propose adding:
> 
> 1. A method to register a pool of ublk_io commands.
> 
> 2. Introduce a new UBLK_U_IO_FETCH_REQ_MULTISHOT operation, where a pool of ublk_io commands is bound to a block device. Then, upon receiving a new BIO, the ublk driver can select a reply from the pre-registered pool and push it to the io_uring.
> 
> 3. Introduce a new UBLK_U_IO_COMMIT_REQ command to explicitly mark the completion of a request. In this case, the ublk driver returns the request to the pool.  We can retain the existing UBLK_U_IO_COMMIT_AND_FETCH_REQ command, but for multi-shot scenarios, the “FETCH” operation would simply mean returning the request to the pool.
> 
> What are your thoughts on this approach?

I think we need to understand the real problem you want to address
before digging into the uring_cmd pool concept.

1) to save memory for lots of ublk devices?

- so far, the main preallocation should come from blk-mq requests, and
as Caleb mentioned, the state memory from both ublk and io_uring isn't
very big

2) a need to support as many ublk devices as possible in a single io_uring
context with limited SQ/CQ size?

- it may not be a big problem, because a fixed SQ size still allows issuing
an arbitrary number of uring_cmds

- but the CQ size may limit the number of completed uring_cmds that notify
about incoming ublk requests. Is this your problem? Jens has added ring
resizing via IORING_REGISTER_RESIZE_RINGS:

https://lore.kernel.org/io-uring/20241022021159.820925-1-axboe@kernel.dk/


3) or some other requirement?
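
For the CQ side of 2), a rough sizing sketch. It assumes the worst case is
every outstanding fetch command on the shared ring completing at once, one
CQE each, and relies on io_uring rounding ring sizes up to a power of two.
The helper names are hypothetical, and whether the CQ really is the limit
depends on how the kernel handles CQ overflow in your setup.

```c
#include <assert.h>
#include <stdint.h>

/* Round up to the next power of two, as io_uring does for ring sizes. */
static uint64_t roundup_pow2(uint64_t v)
{
    uint64_t p = 1;

    while (p < v)
        p <<= 1;
    return p;
}

/*
 * Worst-case CQ entries for one ring that serves one hw queue from each of
 * `devices` ublk devices, each queue having `queue_depth` fetch commands
 * in flight.
 */
static uint64_t worst_case_cq_entries(uint32_t devices, uint16_t queue_depth)
{
    assert(devices > 0 && queue_depth > 0);
    return roundup_pow2((uint64_t)devices * queue_depth);
}
```

The result could be requested at setup time with IORING_SETUP_CQSIZE, or
later through the ring resize mentioned above.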



Thanks,
Ming

