From: Nilay Shroff <nilay@linux.ibm.com>
To: kbusch@kernel.org
Cc: linux-nvme@lists.infradead.org, hch@lst.de, sagi@grimberg.me,
	gjoyce@linux.ibm.com, axboe@fb.com,
	Nilay Shroff <nilay@linux.ibm.com>
Subject: [PATCH v3 1/1] nvme-pci: Fix EEH failure on ppc after subsystem reset
Date: Tue,  4 Jun 2024 14:40:04 +0530
Message-ID: <20240604091523.1422027-2-nilay@linux.ibm.com>
In-Reply-To: <20240604091523.1422027-1-nilay@linux.ibm.com>

Executing an NVMe subsystem reset may cause the kernel to lose
communication with the NVMe adapter. Today the only ways to recover
the adapter are to re-enumerate the PCI bus, hotplug the NVMe disk,
or reboot the OS.
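
For context, a subsystem reset is normally issued from user space with
nvme-cli, e.g. "nvme subsystem-reset /dev/nvme0" (device name
illustrative). Inside the kernel it amounts to a single register write;
a minimal sketch, modeled on nvme_reset_subsystem() in
drivers/nvme/host/nvme.h (simplified, state checks trimmed):

  static int subsystem_reset_sketch(struct nvme_ctrl *ctrl)
  {
          /*
           * Writing the ASCII string "NVMe" (0x4e564d65) to the NSSR
           * register resets the whole NVM subsystem. On some adapters
           * the PCIe link drops as a side effect, which is the failure
           * mode described above.
           */
          return ctrl->ops->reg_write32(ctrl, NVME_REG_NSSR, 0x4e564d65);
  }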

The PPC architecture supports a mechanism called EEH (Enhanced Error
Handling) which allows PCI bus errors to be cleared and a PCI card to
be reset, without having to physically hotplug the NVMe disk or reboot
the OS.
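
For background, EEH drives recovery through the same struct
pci_error_handlers callbacks that AER uses on other platforms, and the
NVMe PCI driver already registers them in drivers/nvme/host/pci.c
(shown here for orientation, comments added):

  static const struct pci_error_handlers nvme_err_handler = {
          .error_detected = nvme_error_detected, /* channel frozen: reset or disconnect? */
          .slot_reset     = nvme_slot_reset,     /* platform has reset the slot */
          .resume         = nvme_error_resume,   /* recovery done, resume I/O */
          .reset_prepare  = nvme_reset_prepare,
          .reset_done     = nvme_reset_done,
  };

EEH can only walk these callbacks while the driver keeps the controller
alive, which is exactly what the change below preserves.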

In the current implementation, when the user executes a subsystem
reset and the kernel loses communication with the NVMe adapter, any
subsequent read/write to the device's PCIe config space fails. A
failed config space access makes the NVMe driver assume the device is
permanently gone, so the driver marks the NVMe controller dead and
frees all resources associated with that controller. Once the
controller is marked dead, EEH recovery cannot succeed.
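
The "permanent loss" conclusion comes from a liveness check along these
lines; a simplified sketch, modeled on nvme_pci_ctrl_is_dead() in
drivers/nvme/host/pci.c (exact checks vary across kernel versions):

  static bool ctrl_is_dead_sketch(struct nvme_dev *dev)
  {
          struct pci_dev *pdev = to_pci_dev(dev->dev);
          u32 csts;

          /* Config space inaccessible: reads come back all-ones */
          if (!pci_is_enabled(pdev) || !pci_device_is_present(pdev))
                  return true;

          /* Channel flagged by EEH/AER: treated as dead as well, even
           * though the platform may still be able to recover it */
          if (pdev->error_state != pci_channel_io_normal)
                  return true;

          csts = readl(dev->bar + NVME_REG_CSTS);
          return (csts & NVME_CSTS_CFS) || !(csts & NVME_CSTS_RDY);
  }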

This patch fixes the issue: if communication with the NVMe adapter is
lost after a subsystem reset and EEH recovery is initiated, the driver
now lets the EEH recovery make forward progress, giving the EEH thread
a fair chance to recover the adapter. If the EEH thread cannot recover
the adapter, it sets the erring adapter's PCI channel state to
"permanent failure" and removes the device.
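
For reference, the gate added to nvme_reset_work() below keys off
pci_channel_offline(), which in current kernels is simply (from
include/linux/pci.h):

  static inline int pci_channel_offline(struct pci_dev *pdev)
  {
          /* True while EEH/AER holds the device frozen, or after it
           * has declared permanent failure */
          return (pdev->error_state != pci_channel_io_normal);
  }

So while EEH owns the device, the reset path now parks the controller
in RESETTING instead of tearing it down, and nvme_error_detected()
drives the rest of the recovery.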

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 drivers/nvme/host/core.c |  1 +
 drivers/nvme/host/pci.c  | 21 ++++++++++++++++++---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f5d150c62955..afb8419566a9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -562,6 +562,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		switch (old_state) {
 		case NVME_CTRL_NEW:
 		case NVME_CTRL_LIVE:
+		case NVME_CTRL_CONNECTING:
 			changed = true;
 			fallthrough;
 		default:
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 102a9fb0c65f..f1bb8df20701 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2789,6 +2789,17 @@ static void nvme_reset_work(struct work_struct *work)
  out_unlock:
 	mutex_unlock(&dev->shutdown_lock);
  out:
+	/*
+	 * If PCI error recovery is in progress, let it finish first
+	 */
+	if (pci_channel_offline(to_pci_dev(dev->dev))) {
+		if (nvme_ctrl_state(&dev->ctrl) == NVME_CTRL_RESETTING ||
+		    nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING)) {
+			dev_warn(dev->ctrl.device,
+				"Let pci error recovery finish!\n");
+			return;
+		}
+	}
 	/*
 	 * Set state to deleting now to avoid blocking nvme_wait_reset(), which
 	 * may be holding this pci_dev's device lock.
@@ -3308,10 +3319,14 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
 	case pci_channel_io_frozen:
 		dev_warn(dev->ctrl.device,
 			"frozen state error detected, reset controller\n");
-		if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING)) {
-			nvme_dev_disable(dev, true);
-			return PCI_ERS_RESULT_DISCONNECT;
+		if (nvme_ctrl_state(&dev->ctrl) != NVME_CTRL_RESETTING) {
+			if (!nvme_change_ctrl_state(&dev->ctrl,
+					NVME_CTRL_RESETTING)) {
+				nvme_dev_disable(dev, true);
+				return PCI_ERS_RESULT_DISCONNECT;
+			}
 		}
+		flush_work(&dev->ctrl.reset_work);
 		nvme_dev_disable(dev, false);
 		return PCI_ERS_RESULT_NEED_RESET;
 	case pci_channel_io_perm_failure:
-- 
2.45.1


