All of lore.kernel.org
 help / color / mirror / Atom feed
From: keith.busch@intel.com (Keith Busch)
Subject: [PATCH] nvme/pci: Fix hot removal during error handling
Date: Wed, 19 Sep 2018 09:48:52 -0600	[thread overview]
Message-ID: <20180919154852.28187-1-keith.busch@intel.com> (raw)

A removal waits for the reset_work to complete. If a surprise removal
occurs around the same time as an error triggered controller reset,
and reset work happened to dispatch a command to the removed controller,
the command won't be recovered since the timeout work doesn't do
anything during error recovery.

This patch fixes this by killing admin queues prior to syncing reset.

Signed-off-by: Keith Busch <keith.busch at intel.com>
---
 drivers/nvme/host/core.c | 4 +++-
 drivers/nvme/host/pci.c  | 9 ++++-----
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dd8ec1dd9219..893f1fcc17cd 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3592,8 +3592,10 @@ void nvme_kill_queues(struct nvme_ctrl *ctrl)
 	down_read(&ctrl->namespaces_rwsem);
 
 	/* Forcibly unquiesce queues to avoid blocking dispatch */
-	if (ctrl->admin_q)
+	if (ctrl->admin_q) {
+		blk_set_queue_dying(ctrl->admin_q);
 		blk_mq_unquiesce_queue(ctrl->admin_q);
+	}
 
 	list_for_each_entry(ns, &ctrl->namespaces, list)
 		nvme_set_queue_dying(ns);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d668682f91df..800ee9b345f3 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1470,7 +1470,7 @@ static const struct blk_mq_ops nvme_mq_ops = {
 
 static void nvme_dev_remove_admin(struct nvme_dev *dev)
 {
-	if (dev->ctrl.admin_q && !blk_queue_dying(dev->ctrl.admin_q)) {
+	if (dev->ctrl.admin_q) {
 		/*
 		 * If the controller was reset during removal, it's possible
 		 * user requests may be waiting on a stopped queue. Start the
@@ -1479,6 +1479,7 @@ static void nvme_dev_remove_admin(struct nvme_dev *dev)
 		blk_mq_unquiesce_queue(dev->ctrl.admin_q);
 		blk_cleanup_queue(dev->ctrl.admin_q);
 		blk_mq_free_tag_set(&dev->admin_tagset);
+		dev->ctrl.admin_q = NULL;
 	}
 }
 
@@ -2565,15 +2566,13 @@ static void nvme_remove(struct pci_dev *pdev)
 
 	nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
 
-	cancel_work_sync(&dev->ctrl.reset_work);
 	pci_set_drvdata(pdev, NULL);
-
 	if (!pci_device_is_present(pdev)) {
 		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DEAD);
+		nvme_kill_queues(&dev->ctrl);
 		nvme_dev_disable(dev, true);
 	}
-
-	flush_work(&dev->ctrl.reset_work);
+	cancel_work_sync(&dev->ctrl.reset_work);
 	nvme_stop_ctrl(&dev->ctrl);
 	nvme_remove_namespaces(&dev->ctrl);
 	nvme_dev_disable(dev, true);
-- 
2.14.4

             reply	other threads:[~2018-09-19 15:48 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-19 15:48 Keith Busch [this message]
2018-09-20  6:30 ` [PATCH] nvme/pci: Fix hot removal during error handling Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180919154852.28187-1-keith.busch@intel.com \
    --to=keith.busch@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.