From: hch@lst.de (Christoph Hellwig)
Subject: please revert a nvme-cli commit
Date: Fri, 15 Jun 2018 11:38:06 +0200 [thread overview]
Message-ID: <20180615093806.GA21345@lst.de> (raw)
In-Reply-To: <a4ece148-00b6-01de-ab6b-3fc8a89818f8@broadcom.com>
On Thu, Jun 14, 2018@09:29:24AM -0700, James Smart wrote:
>> I don't agree. For one so far we've guaranteed the device node
>> appears before we return from the write to the /dev/nvme-fabrics
>> device. If that changes it is a significant ABI break, and we need
>> to fix that in the kernel.
>
> what provides the guarantee that it appears first ?
>
> Based on what I've seen there is nothing but time that guarantees it, and
> if that's true, the udev user side always has a potential of getting hung
> up and exceeding the time.
Then we need to use the proper apis to wait for it to appear instead
of just trying a few times. I suspect those are going to be either
libudev or libsysfs APIs, but I'll need to look into the details.
>> And for retrying the actual I/O we need to decide on the exact semantics
>> we want to support first. Blind n time retry is always a bad idea,
>> we need to build some sort of reliable infrastructure. Be that
>> optionally marking requests as not failfast, and/or some sort of poll
>> notification for a device that is ready.
>
> Ok. Given I've already had complaints that the cli shouldn't hang out for a
> minute or more - the time it takes for a controller to lose connectivity
> and fail the loss timeout, and that's not considering the cmds that probe
> several controllers each of which could do this sequentially resulting in a
> delay of many minutes - and as the user may want to kill the waiting cli,
> I'm of the opinion that the kernel request should be terminated not
> suspended. Thus the ios should continue to be failfast and let the policy
> be in the cli based on what the command is.? Making the cli use the poll
> notification to find when the controller is ready again is fine.
The passthrough ioctls can set a per-command timeout, so we should set
that to the time we want to wait for. And yes, we should do a killable
wait. Right now the block layer doesn't offer an API for that, but
we really should switch all of NVMe over to that. I can take it as
an action item to add that support.
> what should be the general policy for any command? if receive an EAGAIN
> (not ready) then poll until ready or device failure ????? or return
> failure immediately unless command is connect-all (discovery log read) then
> it waits for ready/failure ?
We should only ever return EAGAIN if opened with O_NONBLOCK for a start.
And then preferable we should allow to poll for the controller to be
ready.
> what should be the policy for the commands that probe multiple controllers
> ? ?? do commands fail immediately causing command to skip controllers not
> ready (message when does so) ??? note: the skip is already there on
> command failure, but if we suspend as a general policy, it would affect it.
I think we should have the ioctl blocking until the sent timeout (or
killed) and just skip the controllers that time out. But I'd be happy
to listen to other idea if they sound more sensible.
prev parent reply other threads:[~2018-06-15 9:38 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-13 7:55 please revert a nvme-cli commit Christoph Hellwig
2018-06-13 13:31 ` Keith Busch
2018-06-13 15:51 ` James Smart
2018-06-14 12:42 ` Christoph Hellwig
2018-06-14 16:29 ` James Smart
2018-06-15 9:38 ` Christoph Hellwig [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180615093806.GA21345@lst.de \
--to=hch@lst.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox