From: ming.lei@redhat.com (Ming Lei)
Subject: [PATCH 2/2] nvme: flush scan_work when resetting controller
Date: Fri, 21 Jun 2019 14:58:52 +0800 [thread overview]
Message-ID: <20190621065851.GA22145@ming.t460p> (raw)
In-Reply-To: <3dbb8dc0-2491-6226-8715-b0f5b7f6a73a@suse.de>
On Fri, Jun 21, 2019@08:14:45AM +0200, Hannes Reinecke wrote:
> On 6/20/19 3:36 AM, Ming Lei wrote:
> > On Tue, Jun 18, 2019@12:10:25PM +0200, Hannes Reinecke wrote:
> >> When resetting the controller there is no point whatsoever to
> >> have a scan run in parallel; we cannot access the controller and
> >
> > scan won't be run in parallel, because .scan_work is embedded in
> > 'struct nvme_ctrl' which is per-HBA.
> >
> Wrong. We do.
> Not sure why having it embedded in the controller structure might
> prevent this from happening; both reset and scan are embedded, but
> running on different queues:
I mean the scan_work function itself is run exclusively, but yes, it can be
run when resetting is in-progress.
>
> void nvme_queue_scan(struct nvme_ctrl *ctrl)
> {
> /*
> * Only new queue scan work when admin and IO queues are both alive
> */
> if (ctrl->state == NVME_CTRL_LIVE)
> queue_work(nvme_wq, &ctrl->scan_work);
> }
> EXPORT_SYMBOL_GPL(nvme_queue_scan);
>
> int nvme_reset_ctrl(struct nvme_ctrl *ctrl)
> {
> if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
> return -EBUSY;
> if (!queue_work(nvme_reset_wq, &ctrl->reset_work))
> return -EBUSY;
> return 0;
> }
> EXPORT_SYMBOL_GPL(nvme_reset_ctrl);
>
> So there's nothing stopping them to run in parallel.
>
> >> we cannot tell which devices are present and which not.
> >> Additionally we'll run a scan after reset anyway.
> >> So flush existing scans before reconnecting, ensuring to
> >> short-circuit the scan workqueue function if the controller state
> >> isn't live to avoid lockups.
> >
> > This way may cause dead-lock.
> >
> > 1) nvme_revalidate_disk() might freeze queue in flush context, however
> > any in-flight requests won't be completed until reset is done, so
> > deadlock may be caused by flushing scans in reset context.
> >
> Which is why I'm checking the controller state; I've observed the
> deadlock plenty of times before introducing the controller state check.
Your check can't help wrt. the deadlock, for example:
1) in scan work context:
- blk_mq_freeze_queue() is being started after passing the controller
state check
2) timeout & reset is triggered in another context at the exact same time:
- all in-flight IOs won't be freed until disable controller & reset is done.
- now flush_work() in reset context can't move on, because
blk_mq_freeze_queue() in scan context can't make progress.
>
> > 2) sync IO may be involved in revalidate_disk() which is called in
> > scan context, so deadlock is caused for same reason with 1).
> >
> I/O during revalidate_disk() is protected by the state check, too, so we
> won't be issuing any I/O during resetting.
>
> To be precise: any I/O in flight when reset is triggered will be
> terminated, and any subsequent I/O is short-circuited by the state check.
No, any I/O in flight before resetting is just terminated from hardware,
but still in blk-mq sw or scheduler queue, so either sync IO or queue
freezing won't make progress.
Please see nvme_complete_rq(), all these IO will be retried usually.
Thanks,
Ming
next prev parent reply other threads:[~2019-06-21 6:58 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-18 10:10 [PATCH 0/2] nvme: flush rescan worker before resetting Hannes Reinecke
2019-06-18 10:10 ` [PATCH 1/2] nvme: Do not remove namespaces during reset Hannes Reinecke
2019-06-18 17:30 ` Sagi Grimberg
2019-06-20 1:22 ` Ming Lei
2019-06-18 10:10 ` [PATCH 2/2] nvme: flush scan_work when resetting controller Hannes Reinecke
2019-06-18 17:41 ` Sagi Grimberg
2019-06-19 6:22 ` Hannes Reinecke
2019-06-19 16:56 ` Sagi Grimberg
2019-06-19 18:45 ` Hannes Reinecke
2019-06-19 20:04 ` Sagi Grimberg
2019-06-21 16:26 ` Sagi Grimberg
2019-06-24 5:48 ` Hannes Reinecke
2019-06-24 6:13 ` Hannes Reinecke
2019-06-24 18:08 ` Sagi Grimberg
2019-06-24 18:51 ` James Smart
2019-06-25 6:07 ` Hannes Reinecke
2019-06-25 21:50 ` Sagi Grimberg
2019-06-26 5:34 ` Hannes Reinecke
2019-06-26 20:22 ` Sagi Grimberg
2019-07-02 5:38 ` Sagi Grimberg
2019-07-02 13:29 ` Hannes Reinecke
2019-06-20 1:36 ` Ming Lei
2019-06-21 6:14 ` Hannes Reinecke
2019-06-21 6:58 ` Ming Lei [this message]
2019-06-21 7:59 ` Hannes Reinecke
2019-06-21 17:23 ` James Smart
2019-06-21 17:23 ` James Smart
2019-06-24 3:29 ` Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190621065851.GA22145@ming.t460p \
--to=ming.lei@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.