From: ming.lei@redhat.com (Ming Lei)
Subject: [PATCH 2/2] nvme: flush scan_work when resetting controller
Date: Fri, 21 Jun 2019 14:58:52 +0800 [thread overview]
Message-ID: <20190621065851.GA22145@ming.t460p> (raw)
In-Reply-To: <3dbb8dc0-2491-6226-8715-b0f5b7f6a73a@suse.de>
On Fri, Jun 21, 2019@08:14:45AM +0200, Hannes Reinecke wrote:
> On 6/20/19 3:36 AM, Ming Lei wrote:
> > On Tue, Jun 18, 2019@12:10:25PM +0200, Hannes Reinecke wrote:
> >> When resetting the controller there is no point whatsoever to
> >> have a scan run in parallel; we cannot access the controller and
> >
> > scan won't be run in parallel, because .scan_work is embedded in
> > 'struct nvme_ctrl' which is per-HBA.
> >
> Wrong. We do.
> Not sure why having it embedded in the controller structure might
> prevent this from happening; both reset and scan are embedded, but
> running on different queues:
I mean the scan_work function itself is run exclusively, but yes, it can be
run when resetting is in-progress.
>
> void nvme_queue_scan(struct nvme_ctrl *ctrl)
> {
> /*
> * Only new queue scan work when admin and IO queues are both alive
> */
> if (ctrl->state == NVME_CTRL_LIVE)
> queue_work(nvme_wq, &ctrl->scan_work);
> }
> EXPORT_SYMBOL_GPL(nvme_queue_scan);
>
> int nvme_reset_ctrl(struct nvme_ctrl *ctrl)
> {
> if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
> return -EBUSY;
> if (!queue_work(nvme_reset_wq, &ctrl->reset_work))
> return -EBUSY;
> return 0;
> }
> EXPORT_SYMBOL_GPL(nvme_reset_ctrl);
>
> So there's nothing stopping them to run in parallel.
>
> >> we cannot tell which devices are present and which not.
> >> Additionally we'll run a scan after reset anyway.
> >> So flush existing scans before reconnecting, ensuring to
> >> short-circuit the scan workqueue function if the controller state
> >> isn't live to avoid lockups.
> >
> > This way may cause dead-lock.
> >
> > 1) nvme_revalidate_disk() might freeze queue in flush context, however
> > any in-flight requests won't be completed until reset is done, so
> > deadlock may be caused by flushing scans in reset context.
> >
> Which is why I'm checking the controller state; I've observed the
> deadlock plenty of times before introducing the controller state check.
Your check can't help wrt. the deadlock, for example:
1) in scan work context:
- blk_mq_freeze_queue() is being started after passing the controller
state check
2) timeout & reset is triggered in another context at the exact same time:
- all in-flight IOs won't be freed until disable controller & reset is done.
- now flush_work() in reset context can't move on, because
blk_mq_freeze_queue() in scan context can't make progress.
>
> > 2) sync IO may be involved in revalidate_disk() which is called in
> > scan context, so deadlock is caused for same reason with 1).
> >
> I/O during revalidate_disk() is protected by the state check, too, so we
> won't be issuing any I/O during resetting.
>
> To be precise: any I/O in flight when reset is triggered will be
> terminated, and any subsequent I/O is short-circuited by the state check.
No, any I/O in flight before resetting is just terminated from hardware,
but still in blk-mq sw or scheduler queue, so either sync IO or queue
freezing won't make progress.
Please see nvme_complete_rq(), all these IO will be retried usually.
Thanks,
Ming
next prev parent reply other threads:[~2019-06-21 6:58 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-18 10:10 [PATCH 0/2] nvme: flush rescan worker before resetting Hannes Reinecke
2019-06-18 10:10 ` [PATCH 1/2] nvme: Do not remove namespaces during reset Hannes Reinecke
2019-06-18 17:30 ` Sagi Grimberg
2019-06-20 1:22 ` Ming Lei
2019-06-18 10:10 ` [PATCH 2/2] nvme: flush scan_work when resetting controller Hannes Reinecke
2019-06-18 17:41 ` Sagi Grimberg
2019-06-19 6:22 ` Hannes Reinecke
2019-06-19 16:56 ` Sagi Grimberg
2019-06-19 18:45 ` Hannes Reinecke
2019-06-19 20:04 ` Sagi Grimberg
2019-06-21 16:26 ` Sagi Grimberg
2019-06-24 5:48 ` Hannes Reinecke
2019-06-24 6:13 ` Hannes Reinecke
2019-06-24 18:08 ` Sagi Grimberg
2019-06-24 18:51 ` James Smart
2019-06-25 6:07 ` Hannes Reinecke
2019-06-25 21:50 ` Sagi Grimberg
2019-06-26 5:34 ` Hannes Reinecke
2019-06-26 20:22 ` Sagi Grimberg
2019-07-02 5:38 ` Sagi Grimberg
2019-07-02 13:29 ` Hannes Reinecke
2019-06-20 1:36 ` Ming Lei
2019-06-21 6:14 ` Hannes Reinecke
2019-06-21 6:58 ` Ming Lei [this message]
2019-06-21 7:59 ` Hannes Reinecke
2019-06-21 17:23 ` James Smart
2019-06-21 17:23 ` James Smart
2019-06-24 3:29 ` Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190621065851.GA22145@ming.t460p \
--to=ming.lei@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox