From: Hannes Reinecke <hare@suse.de>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Keith Busch <keith.busch@intel.com>,
John Managhini <john.meneghini@netapp.com>,
Christoph Hellwig <hch@lst.de>,
linux-nvme@lists.infradead.org, Hannes Reinecke <hare@suse.de>
Subject: [PATCH] nvme-multipath: do not reset controller on unknown status
Date: Wed, 12 Feb 2020 14:41:40 +0100 [thread overview]
Message-ID: <20200212134140.105817-1-hare@suse.de> (raw)
We're seeing occasional controller resets during straight I/O,
but only when multipath is active.
The problem here is the nvme-multipath will reset the controller
on every unknown status, which really is an odd behaviour, seeing
that the host already received a perfectly good status; it's just
that it's not smart enough to understand it.
And resetting wouldn't help at all; the error status will continue
to be received.
So we should rather pass up any unknown error to the generic
routines and let them deal with this situation.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: John Managhini <john.meneghini@netapp.com>
---
drivers/nvme/host/core.c | 4 ++--
drivers/nvme/host/multipath.c | 18 ++++++++++--------
drivers/nvme/host/nvme.h | 2 +-
3 files changed, 13 insertions(+), 11 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5dc32b72e7fa..edb081781ae7 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -293,8 +293,8 @@ void nvme_complete_rq(struct request *req)
if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
if ((req->cmd_flags & REQ_NVME_MPATH) &&
blk_path_error(status)) {
- nvme_failover_req(req);
- return;
+ if (nvme_failover_req(req))
+ return;
}
if (!blk_queue_dying(req->q)) {
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 797c18337d96..71e8acae78eb 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -64,16 +64,16 @@ void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
}
}
-void nvme_failover_req(struct request *req)
+bool nvme_failover_req(struct request *req)
{
struct nvme_ns *ns = req->q->queuedata;
u16 status = nvme_req(req)->status;
unsigned long flags;
+ bool handled = false;
spin_lock_irqsave(&ns->head->requeue_lock, flags);
blk_steal_bios(&ns->head->requeue_list, req);
spin_unlock_irqrestore(&ns->head->requeue_lock, flags);
- blk_mq_end_request(req, 0);
switch (status & 0x7ff) {
case NVME_SC_ANA_TRANSITION:
@@ -88,11 +88,13 @@ void nvme_failover_req(struct request *req)
* mark the the path as pending and kick of a re-read of the ANA
* log page ASAP.
*/
+ blk_mq_end_request(req, 0);
nvme_mpath_clear_current_path(ns);
if (ns->ctrl->ana_log_buf) {
set_bit(NVME_NS_ANA_PENDING, &ns->flags);
queue_work(nvme_wq, &ns->ctrl->ana_work);
}
+ handled = true;
break;
case NVME_SC_HOST_PATH_ERROR:
case NVME_SC_HOST_ABORTED_CMD:
@@ -100,18 +102,18 @@ void nvme_failover_req(struct request *req)
* Temporary transport disruption in talking to the controller.
* Try to send on a new path.
*/
+ blk_mq_end_request(req, 0);
nvme_mpath_clear_current_path(ns);
+ handled = true;
break;
default:
- /*
- * Reset the controller for any non-ANA error as we don't know
- * what caused the error.
- */
- nvme_reset_ctrl(ns->ctrl);
+ /* Delegate to common error handling */
break;
}
- kblockd_schedule_work(&ns->head->requeue_work);
+ if (handled)
+ kblockd_schedule_work(&ns->head->requeue_work);
+ return handled;
}
void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 1024fec7914c..7e28084f71af 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -550,7 +550,7 @@ void nvme_mpath_wait_freeze(struct nvme_subsystem *subsys);
void nvme_mpath_start_freeze(struct nvme_subsystem *subsys);
void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
struct nvme_ctrl *ctrl, int *flags);
-void nvme_failover_req(struct request *req);
+bool nvme_failover_req(struct request *req);
void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl);
int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl,struct nvme_ns_head *head);
void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id);
--
2.16.4
_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
next reply other threads:[~2020-02-12 13:42 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-12 13:41 Hannes Reinecke [this message]
2020-02-12 17:53 ` [PATCH] nvme-multipath: do not reset controller on unknown status Christoph Hellwig
2020-02-12 19:33 ` Sagi Grimberg
2020-02-13 7:02 ` Hannes Reinecke
2020-02-13 7:19 ` Sagi Grimberg
2020-02-13 17:02 ` Keith Busch
2020-02-14 14:22 ` Meneghini, John
2020-02-19 15:03 ` Christoph Hellwig
2020-02-13 6:53 ` Hannes Reinecke
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200212134140.105817-1-hare@suse.de \
--to=hare@suse.de \
--cc=hch@lst.de \
--cc=john.meneghini@netapp.com \
--cc=keith.busch@intel.com \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.