Re: [RFC 0/3] nvme uring passthrough diet

public inbox for linux-nvme@lists.infradead.org

From: Kanchan Joshi <joshi.k@samsung.com>
To: Keith Busch <kbusch@meta.com>
Cc: linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
	axboe@kernel.dk, hch@lst.de, xiaoguang.wang@linux.alibaba.com,
	Keith Busch <kbusch@kernel.org>
Subject: Re: [RFC 0/3] nvme uring passthrough diet
Date: Wed, 3 May 2023 12:57:17 +0530
Message-ID: <20230503072625.GA18487@green245>
In-Reply-To: <20230501153306.537124-1-kbusch@meta.com>

On Mon, May 01, 2023 at 08:33:03AM -0700, Keith Busch wrote:
>From: Keith Busch <kbusch@kernel.org>
>
>When you disable all the optional features in your kernel config and
>request queue, it looks like the normal request dispatching is just as
>fast as any attempts to bypass it. So let's do that instead of
>reinventing everything.
>
>This doesn't require additional queues or user setup. It continues to
>work with multiple threads and processes, and relies on the well tested
>queueing mechanisms that track timeouts, handle tag exhaustion, and sync
>with controller state needed for reset control, hotplug events, and
>other error handling.

I agree with your point that there are some functional holes in
the complete-bypass approach. Still, the work needed to be done
to measure the gain of that approach and to see whether filling
these holes is worth the effort.

On your specific points:
- requiring additional queues: not a showstopper IMO.
  If queues are lying unused in the HW, we can reap more performance by
  giving those to the application. If not, we fall back to the existing
  path. No disruption as such.
- tag exhaustion: that is not missing; a retry will be made. I actually
  wanted to do single command-id management at the io_uring level itself,
  and that would have cleaned things up. But it did not fit in
  because of submission/completion lifetime differences.
- timeout and other bits you mentioned: yes, those need more work.
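
To make the first two points concrete, here is a toy user-space sketch
of the behaviour described above: use the bypass path only when a spare
HW queue exists, and report tag exhaustion so the caller can retry.
This is not code from either series; struct bypass_queue, choose_path(),
try_get_tag() and put_tag() are hypothetical names used only for
illustration.

```c
#include <stdbool.h>

/* Hypothetical model for illustration only, not kernel code. */
enum io_path { PATH_BYPASS, PATH_EXISTING };

struct bypass_queue {
	unsigned int depth;     /* number of tags (command ids) available */
	unsigned int in_flight; /* tags currently handed out */
};

/* Use the bypass path only if the controller has an unused queue to spare;
 * otherwise fall back to the existing path. */
enum io_path choose_path(unsigned int spare_hw_queues)
{
	return spare_hw_queues > 0 ? PATH_BYPASS : PATH_EXISTING;
}

/* Take one tag. Returns false on tag exhaustion, in which case the
 * caller is expected to retry the submission later. */
bool try_get_tag(struct bypass_queue *q)
{
	if (q->in_flight >= q->depth)
		return false;
	q->in_flight++;
	return true;
}

/* Release one tag on completion, making room for a retried submission. */
void put_tag(struct bypass_queue *q)
{
	if (q->in_flight > 0)
		q->in_flight--;
}
```

In this model a submission that fails try_get_tag() simply succeeds on a
later attempt once a completion has called put_tag(). Timeouts and the
other bits are deliberately left out.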

Now with the alternative proposed in this series, I doubt whether similar
gains are possible. Happy to be proven wrong.
Please note that for some non-block command sets, passthrough is the only
usable interface. So these users would want some of the functionality
bits too (e.g. cgroups). Cgroup support is broken for passthrough at the
moment, and I wanted to do something about that too.

Overall, the usage model that I imagine with multiple paths is this:

1. existing block IO path: for block-friendly command sets
2. existing passthrough IO path: for non-block command sets
3. new pure-bypass variant: for both; this one deliberately trims all
   the fat at the expense of some features/functionality.

#2 will not have all the features of #1, but it would be good to have
those that are necessary and that fit in without semantic trouble.
These may grow over time, leading to a kernel with better parity between
block and non-block IO.
Do you think this makes sense?
