From: Keith Busch <keith.busch@intel.com>
To: James Smart <james.smart@broadcom.com>
Cc: Sagi Grimberg <sagi@grimberg.me>, Jens Axboe <axboe@kernel.dk>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
Hannes Reinecke <hare@suse.de>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 1/2] blk-mq: Export iterating all tagged requests
Date: Tue, 4 Dec 2018 14:21:17 -0700
Message-ID: <20181204212117.GC16751@localhost.localdomain>
In-Reply-To: <b5b306cd-7341-7ce3-f2dc-fe98a01327fe@broadcom.com>
On Tue, Dec 04, 2018 at 11:33:33AM -0800, James Smart wrote:
>
>
> On 12/4/2018 9:48 AM, Keith Busch wrote:
> > On Tue, Dec 04, 2018 at 09:38:29AM -0800, Sagi Grimberg wrote:
> > > > > > Yes, I'm very much in favour of this, too.
> > > > > > We always have this IMO slightly weird notion of stopping the queue, setting
> > > > > > some error flags in the driver, then _restarting_ the queue, just so
> > > > > > that the driver then sees the error flag and terminates the requests.
> > > > > > Which I always found quite counter-intuitive.
> > > > > What about requests that come in after the iteration runs? How are those
> > > > > terminated?
> > > > If we've reached a dead state, I think you'd want to start a queue freeze
> > > > before running the terminating iterator.
> > > It's not necessarily dead; in fabrics we need to handle disconnections
> > > that last for a while before we are able to reconnect (for a variety of
> > > reasons) and we need a way to fail I/O for failover (or requeue, or
> > > block; it's up to the upper layer). It's less of a "last resort" action
> > > like in the pci case.
> > >
> > > Does this guarantee that after freeze+iter we won't get queued with any
> > > other request? If not then we still need to unfreeze and fail at
> > > queue_rq.
> > It sounds like there are different scenarios to consider.
> >
> > For the dead controller, we call blk_cleanup_queue() at the end which
> > ends callers who blocked on entering.
> >
> > If you're doing a failover, you'd replace the freeze with a current path
> > update in order to prevent new requests from entering.
> And if you're not multipath? I assume you want the io queues to be
> frozen so they queue there - which can block threads such as ns
> verification. It's good to have them live, as today's checks bounce the io,
> letting the thread terminate as it's in a reset/reconnect state, which allows
> those threads to exit out or finish before a new reconnect kicks them off
> again. We've already been fighting deadlocks with the reset/delete/rescan
> paths and these io paths. Suspending the queues completely over the
> reconnect will likely create more issues in this area.
>
>
> > In either case, you don't need checks in queue_rq. The queue_rq check
> > is redundant with the quiesce state that blk-mq already provides.
>
> I disagree. The cases I've run into are on the admin queue - where we are
> sending io to initialize the controller when another error/reset occurs, and
> the checks are required to identify/reject the "old" initialization
> commands, with another state check allowing them to proceed on the "new"
> initialization commands. And there are also cases for ioctls and other
> things that occur during the middle of those initialization steps that need
> to be weeded out. The Admin queue has to be kept live to allow the
> initialization commands on the new controller.
>
> state checks are also needed for those namespace validation cases....
>
> >
> > Once quiesced, the proposed iterator can handle the final termination
> > of the request, perform failover, or some other lld specific action
> > depending on your situation.
>
> I don't believe they can remain frozen, definitely not for the admin queue.
> -- james

Quiesced and frozen carry different semantics.

My understanding of the nvme-fc implementation is that it returns
BLK_STS_RESOURCE in the scenario you've described where the admin
command can't be executed at the moment. That just has the block layer
requeue it for resubmission 3 milliseconds later, which will
continue to return the same status code until you're really ready for
it.

What I'm proposing is that instead of using that return code, you may
have nvme-fc control when to dispatch those queued requests by utilizing
the blk-mq quiesce on/off states. Is there a reason that wouldn't work?
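
To make the contrast concrete, here is a rough user-space toy model in C.
It is not kernel code: every toy_* name is made up for illustration, none
of it is a blk-mq or nvme-fc API, and the only number taken from the
discussion is the 3 millisecond requeue delay. It just counts how often
the driver's dispatch hook runs under each approach while the controller
is not ready.

```c
/* User-space toy model; all toy_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>

#define TOY_REQUEUE_DELAY_MS 3	/* the 3 ms resubmission delay from the text */

enum toy_status { TOY_STS_OK, TOY_STS_RESOURCE };

struct toy_queue {
	bool quiesced;			/* stands in for the blk-mq quiesce state */
	bool ctrl_ready;		/* stands in for the driver's controller state */
	unsigned int queue_rq_calls;	/* how often the dispatch hook ran */
};

/* Dispatch hook: bounce the request while the controller is not ready. */
static enum toy_status toy_queue_rq(struct toy_queue *q)
{
	q->queue_rq_calls++;
	return q->ctrl_ready ? TOY_STS_OK : TOY_STS_RESOURCE;
}

/*
 * Approach 1: keep the queue live and return a busy status. The request
 * is retried every TOY_REQUEUE_DELAY_MS until the controller is ready.
 * Returns the number of dispatch hook invocations.
 */
static unsigned int toy_run_with_requeue(int ready_at_ms)
{
	struct toy_queue q = { .quiesced = false, .ctrl_ready = false };
	int now_ms = 0;

	assert(ready_at_ms >= 0);
	for (;;) {
		q.ctrl_ready = now_ms >= ready_at_ms;
		if (toy_queue_rq(&q) == TOY_STS_OK)
			return q.queue_rq_calls;
		now_ms += TOY_REQUEUE_DELAY_MS;
	}
}

/*
 * Approach 2: the driver quiesces the queue, so the request stays queued
 * and the dispatch hook is not called until the driver unquiesces.
 * Returns the number of dispatch hook invocations.
 */
static unsigned int toy_run_with_quiesce(int ready_at_ms)
{
	struct toy_queue q = { .quiesced = true, .ctrl_ready = false };
	int now_ms;

	assert(ready_at_ms >= 0);
	for (now_ms = 0; ; now_ms++) {
		if (now_ms >= ready_at_ms) {
			q.ctrl_ready = true;
			q.quiesced = false;	/* driver picks the dispatch time */
		}
		if (!q.quiesced && toy_queue_rq(&q) == TOY_STS_OK)
			return q.queue_rq_calls;
	}
}
```

In this model, a controller that becomes ready after 30 ms sees the
dispatch hook run 11 times with the requeue approach and once with the
quiesce approach. The model says nothing about the admin queue ordering
concerns raised above; it only illustrates who decides when dispatch
happens.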