public inbox for linux-nvme@lists.infradead.org
* [PATCH] nvme-pci: fix resume after AER recovery
@ 2023-01-30 10:14 Christoph Hellwig
  2023-01-30 18:35 ` Keith Busch
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2023-01-30 10:14 UTC
  To: kbusch, sagi; +Cc: linux-nvme, Maciej Grochowski

All I/O on an nvme controller hangs after injecting a malformed TLP error
using aer-inject with an error file like:

--- snip ---
AER
PCI_ID WWWW:XX.YY.Z
UNCOR_STATUS COMP_TIME
HEADER_LOG 0 1 2 3
--- snip ---

This is because in this case the ->resume method will be called directly
after ->error_detected, without ->slot_reset in between, leaving the
controller in disabled state and the queues frozen.  Fix this by doing a
controller reset in ->resume as well.
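
For illustration only, here is a minimal userspace sketch of the callback
ordering described above.  It is not kernel code: struct model_ctrl,
model_recover() and the model_* / old_* / new_* helpers are hypothetical
names, and the assumption that ->error_detected leaves the controller
disabled with frozen queues is taken from the description above rather
than from the driver source.

```c
#include <stdbool.h>

/* Hypothetical model of the controller state; not a kernel structure. */
struct model_ctrl {
	bool enabled;		/* controller is enabled and usable */
	bool queues_frozen;	/* queues are frozen, new I/O would hang */
};

/* Assumed effect of ->error_detected, per the description above. */
static void model_error_detected(struct model_ctrl *c)
{
	c->enabled = false;
	c->queues_frozen = true;
}

/* Stand-in for a controller reset that re-enables the device. */
static void model_reset_ctrl(struct model_ctrl *c)
{
	c->enabled = true;
	c->queues_frozen = false;
}

/* Old behaviour: only ->slot_reset resets the controller. */
static void old_slot_reset(struct model_ctrl *c)
{
	model_reset_ctrl(c);
}

/* Old ->resume only waits for an already scheduled reset: no-op here. */
static void old_resume(struct model_ctrl *c)
{
	(void)c;
}

/* New behaviour: ->resume resets, ->slot_reset just calls ->resume. */
static void new_resume(struct model_ctrl *c)
{
	model_reset_ctrl(c);
}

static void new_slot_reset(struct model_ctrl *c)
{
	new_resume(c);
}

struct model_handlers {
	void (*slot_reset)(struct model_ctrl *c);
	void (*resume)(struct model_ctrl *c);
};

static const struct model_handlers old_handlers = {
	.slot_reset	= old_slot_reset,
	.resume		= old_resume,
};

static const struct model_handlers new_handlers = {
	.slot_reset	= new_slot_reset,
	.resume		= new_resume,
};

/*
 * Run one modelled recovery.  with_slot_reset selects whether the
 * ->slot_reset step runs before ->resume.  Returns true if I/O could
 * flow again afterwards.
 */
static bool model_recover(const struct model_handlers *h, bool with_slot_reset)
{
	struct model_ctrl c = { .enabled = true, .queues_frozen = false };

	model_error_detected(&c);
	if (with_slot_reset)
		h->slot_reset(&c);
	h->resume(&c);
	return c.enabled && !c.queues_frozen;
}
```

In this model, model_recover(&old_handlers, false) is the only combination
that leaves the controller disabled, which corresponds to the hang
described above.  With new_handlers both orderings end with a reset.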

Fixes: a0a3408ee614 ("NVMe: Add pci error handlers")
Reported-by: Maciej Grochowski <Maciej.Grochowski@sony.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Maciej Grochowski <Maciej.Grochowski@sony.com>
---
 drivers/nvme/host/pci.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c734934c407ccf..ec1e95d1a8c236 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3336,21 +3336,19 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
 	return PCI_ERS_RESULT_NEED_RESET;
 }
 
-static pci_ers_result_t nvme_slot_reset(struct pci_dev *pdev)
+static void nvme_error_resume(struct pci_dev *pdev)
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
 
 	dev_info(dev->ctrl.device, "restart after slot reset\n");
 	pci_restore_state(pdev);
 	nvme_reset_ctrl(&dev->ctrl);
-	return PCI_ERS_RESULT_RECOVERED;
 }
 
-static void nvme_error_resume(struct pci_dev *pdev)
+static pci_ers_result_t nvme_slot_reset(struct pci_dev *pdev)
 {
-	struct nvme_dev *dev = pci_get_drvdata(pdev);
-
-	flush_work(&dev->ctrl.reset_work);
+	nvme_error_resume(pdev);
+	return PCI_ERS_RESULT_RECOVERED;
 }
 
 static const struct pci_error_handlers nvme_err_handler = {
-- 
2.39.0




end of thread, other threads:[~2023-02-09  7:56 UTC | newest]

Thread overview: 26+ messages
2023-01-30 10:14 [PATCH] nvme-pci: fix resume after AER recovery Christoph Hellwig
2023-01-30 18:35 ` Keith Busch
2023-01-30 18:43   ` Keith Busch
2023-01-30 18:54     ` Grochowski, Maciej
2023-01-31  8:58     ` Christoph Hellwig
2023-01-31 15:22       ` Keith Busch
2023-02-01 22:58         ` Grochowski, Maciej
2023-02-02 10:18           ` Christoph Hellwig
2023-02-02 18:47             ` Grochowski, Maciej
2023-02-02 19:43               ` Keith Busch
2023-02-03  1:29                 ` Grochowski, Maciej
2023-02-03  1:37                   ` Keith Busch
2023-02-03 18:45                     ` Grochowski, Maciej
2023-02-06 14:02                       ` Javier.gonz
2023-02-06 15:42                         ` Christoph Hellwig
2023-02-06 16:22                           ` Keith Busch
2023-02-06 17:51                             ` Javier.gonz
2023-02-07  1:51                               ` Grochowski, Maciej
2023-02-07  8:29                                 ` Javier.gonz
2023-02-07 10:36                                   ` Klaus Jensen
2023-02-07 19:05                                     ` Grochowski, Maciej
2023-02-08  6:43                                       ` Klaus Jensen
2023-02-08 17:26                                         ` Grochowski, Maciej
2023-02-08 17:39                                           ` Keith Busch
2023-02-08 22:38                                             ` Grochowski, Maciej
2023-02-09  7:55                                               ` Javier.gonz
