From: keith.busch@intel.com (Keith Busch)
Subject: [PATCH 3/3] nvme: Fail controller on timeouts during reset
Date: Fri, 9 Feb 2018 10:41:27 -0700
Message-ID: <20180209174127.7224-3-keith.busch@intel.com>
In-Reply-To: <20180209174127.7224-1-keith.busch@intel.com>
We can't schedule a second controller reset if the controller fails while
the driver is already attempting to start it. Synchronous admin commands
are already handled appropriately since they are never retried and the
completion status is read directly. Failures of asynchronous IO commands,
however, previously went undetected.
This patch fixes that by preventing retries on IO commands during
controller connecting states, and directing the controller to a failed
state after aborting the timed-out commands. Without this patch, a
controller that fails IO commands during startup would hang indefinitely.
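
To illustrate why setting the Do Not Retry bit stops the requeue loop, here is
a minimal userspace sketch, not the driver code. The two status constants use
the values from include/linux/nvme.h; the model_* names, the reduced state
enum, and the retry-budget parameters are hypothetical stand-ins, and the real
retry check in the driver considers more conditions than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/linux/nvme.h. */
#define NVME_SC_ABORT_REQ 0x7
#define NVME_SC_DNR       0x4000

/* Hypothetical, reduced stand-in for enum nvme_ctrl_state. */
enum model_ctrl_state {
	MODEL_CTRL_LIVE,
	MODEL_CTRL_RESETTING,
	MODEL_CTRL_CONNECTING,
	MODEL_CTRL_DELETING,
};

/* Mirrors the status assignment in nvme_cancel_request() after this patch. */
static uint16_t model_cancel_status(enum model_ctrl_state state)
{
	uint16_t status = NVME_SC_ABORT_REQ;

	/* A cancelled request must not be retried while the controller starts. */
	if (state == MODEL_CTRL_CONNECTING)
		status |= NVME_SC_DNR;
	return status;
}

/* Simplified retry decision: only the DNR bit and a retry budget (assumed). */
static bool model_needs_retry(uint16_t status, int retries, int max_retries)
{
	if (status & NVME_SC_DNR)
		return false;
	return retries < max_retries;
}
```

In this model, a request cancelled in the connecting state is never eligible
for retry, while one cancelled in the resetting state still is as long as its
retry budget is not exhausted.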
Reported-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
---
drivers/nvme/host/core.c | 6 ++++--
drivers/nvme/host/pci.c | 6 +++++-
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index a9bce23a991f..c0f4771d79a2 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -240,13 +240,15 @@ EXPORT_SYMBOL_GPL(nvme_complete_rq);
 void nvme_cancel_request(struct request *req, void *data, bool reserved)
 {
+	struct nvme_ctrl *ctrl = data;
 	if (!blk_mq_request_started(req))
 		return;
-	dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
-				"Cancelling I/O %d", req->tag);
+	dev_dbg_ratelimited(ctrl->device, "Cancelling I/O %d", req->tag);
 	nvme_req(req)->status = NVME_SC_ABORT_REQ;
+	if (ctrl->state == NVME_CTRL_CONNECTING)
+		nvme_req(req)->status |= NVME_SC_DNR;
 	blk_mq_complete_request(req);
 }
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 7a2e4383c468..77929d35eae8 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1212,11 +1212,15 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 	/*
 	 * Shutdown immediately if controller times out while starting. The
 	 * reset work will see the pci device disabled when it gets the forced
-	 * cancellation error. All outstanding requests are completed on
+	 * cancellation error. The driver won't see the status if it is waiting
+	 * on asynchronous commands, so we set the state to deleting to prevent
+	 * it from progressing. All outstanding requests are completed on
 	 * shutdown, so we return BLK_EH_HANDLED.
 	 */
 	switch (dev->ctrl.state) {
 	case NVME_CTRL_CONNECTING:
+		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
+		/* FALLTHRU */
 	case NVME_CTRL_RESETTING:
 		dev_warn(dev->ctrl.device,
 			 "I/O %d QID %d timeout, disable controller\n",
--
2.14.3
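
The state handling in the nvme_timeout() hunk above can be sketched in
isolation as follows. This is a simplified userspace model under stated
assumptions, not the driver code: every sketch_* name is hypothetical, the
state change is modeled as unconditional (nvme_change_ctrl_state() can refuse
a transition), and the abort path taken in other states is not modeled.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for enum nvme_ctrl_state. */
enum sketch_state {
	SKETCH_LIVE,
	SKETCH_RESETTING,
	SKETCH_CONNECTING,
	SKETCH_DELETING,
};

/* Hypothetical result codes; HANDLED corresponds to BLK_EH_HANDLED. */
enum sketch_eh_return {
	SKETCH_EH_HANDLED,
	SKETCH_EH_NOT_MODELED,
};

struct sketch_ctrl {
	enum sketch_state state;
	bool disabled;	/* set once the controller has been shut down */
};

static enum sketch_eh_return sketch_timeout(struct sketch_ctrl *ctrl)
{
	switch (ctrl->state) {
	case SKETCH_CONNECTING:
		/* Stop the startup path from progressing any further. */
		ctrl->state = SKETCH_DELETING;
		/* fall through */
	case SKETCH_RESETTING:
		/* Shutdown completes all outstanding requests. */
		ctrl->disabled = true;
		return SKETCH_EH_HANDLED;
	default:
		/* Other states take the abort path, which is not modeled. */
		return SKETCH_EH_NOT_MODELED;
	}
}
```

A timeout in the connecting state both moves the controller to deleting and
disables it, whereas a timeout in the resetting state only disables it.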