Linux-NVME Archive on lore.kernel.org
* [PATCH v2 0/1] nvme: Fix problem when booting from NVMe drive was leading to a hang.
@ 2024-03-04 18:25 Michael Kropaczek
From: Michael Kropaczek @ 2024-03-04 18:25 UTC
  To: linux-nvme
  Cc: Michael Kropaczek, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg

Description:

During an endurance test in which a system was repeatedly rebooted from an
NVMe drive, the boot process occasionally hung. The test was configured for
1000 reboot cycles at 120 s intervals; the hang occurred after ~300 cycles.
Investigation showed that the NVMe driver did not disable the host memory
buffer (HMB) during shutdown, leaving the NVMe controller in a state that
prevented proper initialization in the BIOS pre-boot stage.
Adding a call to nvme_set_host_mem(dev, 0) on the shutdown path fixed the
issue.
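
The ordering described above can be illustrated outside the kernel with a
small mock. This is only a sketch under stated assumptions: struct mock_dev,
mock_set_host_mem() and mock_dev_disable() are hypothetical stand-ins invented
for illustration, not the real struct nvme_dev, nvme_set_host_mem() or
nvme_dev_disable(), and the mock does not issue any NVMe command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct mock_dev {
	bool hmb_enabled;       /* controller may currently use host memory */
	int set_host_mem_calls; /* how often the HMB setting was changed */
};

/*
 * Mock of a "set host memory" request: bits == 0 tells the controller
 * to stop using the host memory buffer, any other value enables it.
 */
static int mock_set_host_mem(struct mock_dev *dev, unsigned int bits)
{
	assert(dev != NULL);
	dev->set_host_mem_calls++;
	dev->hmb_enabled = (bits != 0);
	return 0;
}

/*
 * Mock of the disable path: on shutdown, turn the HMB off first so the
 * controller holds no reference to host memory across the reboot.
 */
static void mock_dev_disable(struct mock_dev *dev, bool shutdown)
{
	if (shutdown && dev->hmb_enabled)
		mock_set_host_mem(dev, 0);

	/* Queue quiescing and controller shutdown would follow here. */
}
```

In this model a plain reset (shutdown == false) leaves the HMB enabled, while
a shutdown disables it exactly once; repeating the shutdown call is a no-op.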

Michael Kropaczek (1):
  nvme: Fix problem when booting from NVMe drive was leading to a hang.

 drivers/nvme/host/pci.c | 8 ++++++++
 1 file changed, 8 insertions(+)


base-commit: 8d30528a170905ede9ab6ab81f229e441808590b
-- 
2.34.1

From 9eec234181015af624d8e5cd8670ba5d82d0ce7e Mon Sep 17 00:00:00 2001
From: Michael Kropaczek <michael.kropaczek@solidigm.com>
Date: Thu, 29 Feb 2024 15:33:27 -0800
Subject: [PATCH v2 1/1] nvme: Fix problem when booting from NVMe drive was
 leading to a hang.
To: linux-nvme@lists.infradead.org
Cc: Keith Busch <kbusch@kernel.org>,
    Jens Axboe <axboe@fb.com>,
    Christoph Hellwig <hch@lst.de>,
    Sagi Grimberg <sagi@grimberg.me>,
    Michael Kropaczek <michael.kropaczek@solidigm.com>

On certain host architectures/HW, DRAM retains its contents across reboot
cycles. Certain NVMe controllers were observed accessing host memory after
startup, which led to an undefined state and prevented proper initialization
in the BIOS boot stage. Disabling the host memory buffer during host shutdown
prevents the problem from occurring.

Signed-off-by: Michael Kropaczek <michael.kropaczek@solidigm.com>
---
 drivers/nvme/host/pci.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e6267a6aa380..e5292c7b301f 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2593,6 +2593,14 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 			nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
 	}
 
+	/*
+	 * On certain host architectures/HW, DRAM retains its contents across
+	 * reboots. Certain controllers were seen accessing host memory after a
+	 * reset, ending in an undefined state that prevented initialization.
+	 */
+	if (shutdown && dev->hmb)
+		nvme_set_host_mem(dev, 0);
+
 	nvme_quiesce_io_queues(&dev->ctrl);
 
 	if (!dead && dev->ctrl.queue_count > 0) {
-- 
2.34.1



