Linux-NVME Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: ming.lei@redhat.com (Ming Lei)
Subject: [PATCH 2/2] nvme: flush scan_work when resetting controller
Date: Fri, 21 Jun 2019 14:58:52 +0800	[thread overview]
Message-ID: <20190621065851.GA22145@ming.t460p> (raw)
In-Reply-To: <3dbb8dc0-2491-6226-8715-b0f5b7f6a73a@suse.de>

On Fri, Jun 21, 2019@08:14:45AM +0200, Hannes Reinecke wrote:
> On 6/20/19 3:36 AM, Ming Lei wrote:
> > On Tue, Jun 18, 2019@12:10:25PM +0200, Hannes Reinecke wrote:
> >> When resetting the controller there is no point whatsoever to
> >> have a scan run in parallel; we cannot access the controller and
> > 
> > scan won't be run in parallel, because .scan_work is embedded in
> > 'struct nvme_ctrl' which is per-HBA.
> > 
> Wrong. We do.
> Not sure why having it embedded in the controller structure might
> prevent this from happening; both reset and scan are embedded, but
> running on different queues:

I mean the scan_work function itself is run exclusively, but yes, it can be 
run when resetting is in-progress.

> 
> void nvme_queue_scan(struct nvme_ctrl *ctrl)
> {
> 	/*
> 	 * Only new queue scan work when admin and IO queues are both alive
> 	 */
> 	if (ctrl->state == NVME_CTRL_LIVE)
> 		queue_work(nvme_wq, &ctrl->scan_work);
> }
> EXPORT_SYMBOL_GPL(nvme_queue_scan);
> 
> int nvme_reset_ctrl(struct nvme_ctrl *ctrl)
> {
> 	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
> 		return -EBUSY;
> 	if (!queue_work(nvme_reset_wq, &ctrl->reset_work))
> 		return -EBUSY;
> 	return 0;
> }
> EXPORT_SYMBOL_GPL(nvme_reset_ctrl);
> 
> So there's nothing stopping them to run in parallel.
> 
> >> we cannot tell which devices are present and which not.
> >> Additionally we'll run a scan after reset anyway.
> >> So flush existing scans before reconnecting, ensuring to
> >> short-circuit the scan workqueue function if the controller state
> >> isn't live to avoid lockups.
> > 
> > This way may cause dead-lock.
> > 
> > 1) nvme_revalidate_disk() might freeze queue in flush context, however
> > any in-flight requests won't be completed until reset is done, so
> > deadlock may be caused by flushing scans in reset context.
> > 
> Which is why I'm checking the controller state; I've observed the
> deadlock plenty of times before introducing the controller state check.

Your check can't help wrt. the deadlock, for example:

1) in scan work context:

- blk_mq_freeze_queue() is being started after passing the controller
  state check

2) timeout & reset is triggered in another context at the exact same time:

- all in-flight IOs won't be freed until disable controller & reset is done.

- now flush_work() in reset context can't move on, because
  blk_mq_freeze_queue() in scan context can't make progress.

> 
> > 2) sync IO may be involved in revalidate_disk() which is called in
> > scan context, so deadlock is caused for same reason with 1).
> > 
> I/O during revalidate_disk() is protected by the state check, too, so we
> won't be issuing any I/O during resetting.
> 
> To be precise: any I/O in flight when reset is triggered will be
> terminated, and any subsequent I/O is short-circuited by the state check.

No, any I/O in flight before resetting is just terminated from hardware,
but still in blk-mq sw or scheduler queue, so either sync IO or queue
freezing won't make progress.

Please see nvme_complete_rq(), all these IO will be retried usually.


Thanks,
Ming

  reply	other threads:[~2019-06-21  6:58 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-18 10:10 [PATCH 0/2] nvme: flush rescan worker before resetting Hannes Reinecke
2019-06-18 10:10 ` [PATCH 1/2] nvme: Do not remove namespaces during reset Hannes Reinecke
2019-06-18 17:30   ` Sagi Grimberg
2019-06-20  1:22   ` Ming Lei
2019-06-18 10:10 ` [PATCH 2/2] nvme: flush scan_work when resetting controller Hannes Reinecke
2019-06-18 17:41   ` Sagi Grimberg
2019-06-19  6:22     ` Hannes Reinecke
2019-06-19 16:56       ` Sagi Grimberg
2019-06-19 18:45         ` Hannes Reinecke
2019-06-19 20:04           ` Sagi Grimberg
2019-06-21 16:26             ` Sagi Grimberg
2019-06-24  5:48               ` Hannes Reinecke
2019-06-24  6:13               ` Hannes Reinecke
2019-06-24 18:08                 ` Sagi Grimberg
2019-06-24 18:51                   ` James Smart
2019-06-25  6:07                   ` Hannes Reinecke
2019-06-25 21:50                     ` Sagi Grimberg
2019-06-26  5:34                       ` Hannes Reinecke
2019-06-26 20:22                         ` Sagi Grimberg
2019-07-02  5:38                           ` Sagi Grimberg
2019-07-02 13:29                             ` Hannes Reinecke
2019-06-20  1:36   ` Ming Lei
2019-06-21  6:14     ` Hannes Reinecke
2019-06-21  6:58       ` Ming Lei [this message]
2019-06-21  7:59         ` Hannes Reinecke
2019-06-21 17:23           ` James Smart
2019-06-21 17:23           ` James Smart
2019-06-24  3:29           ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190621065851.GA22145@ming.t460p \
    --to=ming.lei@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox