From: Kanchan Joshi <joshi.k@samsung.com>
To: Keith Busch <kbusch@kernel.org>
Cc: Keith Busch <kbusch@meta.com>,
linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
axboe@kernel.dk, hch@lst.de, xiaoguang.wang@linux.alibaba.com
Subject: Re: [RFC 0/3] nvme uring passthrough diet
Date: Fri, 5 May 2023 13:44:55 +0530 [thread overview]
Message-ID: <20230505081455.GA32732@green245> (raw)
In-Reply-To: <ZFJ7pAuTY6ESCVgp@kbusch-mbp.dhcp.thefacebook.com>
On Wed, May 03, 2023 at 09:20:04AM -0600, Keith Busch wrote:
>On Wed, May 03, 2023 at 12:57:17PM +0530, Kanchan Joshi wrote:
>> On Mon, May 01, 2023 at 08:33:03AM -0700, Keith Busch wrote:
>> > From: Keith Busch <kbusch@kernel.org>
>> >
>> > When you disable all the optional features in your kernel config and
>> > request queue, it looks like the normal request dispatching is just as
>> > fast as any attempts to bypass it. So let's do that instead of
>> > reinventing everything.
>> >
>> > This doesn't require additional queues or user setup. It continues to
>> > work with multiple threads and processes, and relies on the well tested
>> > queueing mechanisms that track timeouts, handle tag exhaustion, and sync
>> > with controller state needed for reset control, hotplug events, and
>> > other error handling.
>>
>> I agree with your point that there are some functional holes in
>> the complete-bypass approach. Yet the work needed to be done to
>> figure out the gain of that approach, and to see whether the effort
>> to fill these holes is worthwhile.
>>
>> On your specific points
>> - requiring additional queues: not a showstopper IMO.
>> If queues are sitting unused in the hardware, we can reap more
>> performance by giving those to applications. If not, we fall back to
>> the existing path. No disruption as such.
>
>The current way we're reserving special queues is bad and we should
>try not to extend it further. It applies to the whole module and
>would steal resources from some devices that don't want poll queues.
>If you have a mix of device types in your system, the low end ones
>don't want to split their resources this way.
>
>NVMe has no problem creating new queues on the fly. Queue allocation
>doesn't have to be an initialization thing, but you would need to
>reserve the QIDs ahead of time.
Totally in agreement with that. Jens also mentioned this point.
And I had added preallocation to my to-be-killed list. Thanks for
expanding.
Related to that, I think the one-qid-per-ring restriction also needs to
be lifted. That should allow doing IO on two or more devices with a
single ring, and we can see how well that scales.
>> - tag exhaustion: that is not missing, a retry will be made. I actually
>> wanted to do single command-id management at the io_uring level itself,
>> and that would have cleaned things up. But it did not fit in
>> because of submission/completion lifetime differences.
>> - timeout and other bits you mentioned: yes, those need more work.
>>
>> Now with the alternative proposed in this series, I doubt whether
>> similar gains are possible. Happy to be proven wrong.
>
>One other thing: the pure-bypass does appear better at low queue
>depths, but utilizing the plug for aggregated sq doorbell writes
>is a real win at higher queue depths from this series. Batching
>submissions at 4 deep is the tipping point on my test box; this
>series outperforms pure bypass at any higher batch count.
I see.
I hit the 5M IOPS cliff without plug/batching primarily because pure
bypass reduces the amount of code executed to do the IO. But
plug/batching is needed to do better than this.
If we create space for a pointer in io_uring_cmd, it can be added to the
plug list (in place of struct request). That would be one way to sort
out the plugging.
Thread overview: 15+ messages
2023-05-01 15:33 ` [RFC 0/3] nvme uring passthrough diet Keith Busch
2023-05-01 15:33 ` [RFC 1/3] nvme: skip block cgroups for passthrough commands Keith Busch
2023-05-03 5:04 ` Christoph Hellwig
2023-05-03 15:25 ` Keith Busch
2023-05-15 15:47 ` Keith Busch
2023-05-01 15:33 ` [RFC 2/3] nvme: fix cdev name leak Keith Busch
2023-05-01 15:33 ` [RFC 3/3] nvme: create special request queue for cdev Keith Busch
2023-05-02 12:20 ` Johannes Thumshirn
2023-05-03 5:04 ` Christoph Hellwig
2023-05-03 14:56 ` Keith Busch
2023-05-01 19:01 ` [RFC 0/3] nvme uring passthrough diet Kanchan Joshi
2023-05-01 19:31 ` Keith Busch
2023-05-03 7:27 ` Kanchan Joshi
2023-05-03 15:20 ` Keith Busch
2023-05-05 8:14 ` Kanchan Joshi [this message]