Linux-NVME Archive on lore.kernel.org
* [PATCH v3 0/2] nvme-pci: Log failures to probe
@ 2025-10-24 11:05 Gerd Bayer
  2025-10-24 11:05 ` [PATCH v3 1/2] nvme-pci: Print error message on failure in nvme_probe Gerd Bayer
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Gerd Bayer @ 2025-10-24 11:05 UTC
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, John Garry, linux-nvme, linux-kernel, Gerd Bayer

Hi,

occasionally, an NVMe drive fails to reply to MMIO reads despite the PCI
link being operational. In these situations (obviously not on the
boot-device :) it is preferred to have this reported in the kernel log
instead of deducing that something is wrong by comparing expected vs.
observed number of nvme devices.

 - 1/2 adds a clear indication that the device driver failed to probe an
   NVMe drive and will not register an nvme device.
 - 2/2 adds a debug log that helps narrow down the exact failure.

I've split this into two patches, since 2/2 might be less important wrt
back-ports, etc.

Thanks,
Gerd

Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
Changes in v3:
- Both patches: Added Christoph Hellwig's R-b's
- Patch 2/2: add newline (per John Garry)
- Link to v2: https://lore.kernel.org/r/20251022-nvme_probefail-v2-0-26bbdf94280e@linux.ibm.com

Changes in v2:
- Patch 1/2:
  - Replace dev_err() with dev_err_probe(), now last before return
  - Drop R-b from Wilfred Mallawa
- Patch 2/2: Reword as suggested by Christoph Hellwig, pick up R-b's
- Link to v1: https://lore.kernel.org/r/20251020-nvme_probefail-v1-0-a420046d98f0@linux.ibm.com

---
Gerd Bayer (2):
      nvme-pci: Print error message on failure in nvme_probe
      nvme-pci: Add debug message on fail to read CSTS

 drivers/nvme/host/pci.c | 2 ++
 1 file changed, 2 insertions(+)
---
base-commit: 211ddde0823f1442e4ad052a2f30f050145ccada
change-id: 20251015-nvme_probefail-3976120a4719

Best regards,
-- 
Gerd Bayer <gbayer@linux.ibm.com>




* [PATCH v3 1/2] nvme-pci: Print error message on failure in nvme_probe
  2025-10-24 11:05 [PATCH v3 0/2] nvme-pci: Log failures to probe Gerd Bayer
@ 2025-10-24 11:05 ` Gerd Bayer
  2025-10-26  7:20   ` Wilfred Mallawa
  2025-10-24 11:05 ` [PATCH v3 2/2] nvme-pci: Add debug message on fail to read CSTS Gerd Bayer
  2025-10-27 16:13 ` [PATCH v3 0/2] nvme-pci: Log failures to probe Keith Busch
  2 siblings, 1 reply; 5+ messages in thread
From: Gerd Bayer @ 2025-10-24 11:05 UTC
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, John Garry, linux-nvme, linux-kernel, Gerd Bayer

Add a new error message that makes failures to probe visible in the
kernel log, like:
nvme 0008:00:00.0: error -ENODEV: probe failed

This highlights issues with a particular device right away instead of
leaving users to search for missing drives.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
 drivers/nvme/host/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c916176bd9f058b49e6e6768675711df52b15765..8c624960c9f67e0cc8409023de5e532d6ed9b3ac 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3522,6 +3522,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	nvme_uninit_ctrl(&dev->ctrl);
 out_put_ctrl:
 	nvme_put_ctrl(&dev->ctrl);
+	dev_err_probe(&pdev->dev, result, "probe failed\n");
 	return result;
 }
 

-- 
2.48.1
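
The pattern in this patch (all error paths unwind through shared labels, with one log call placed last before the return) can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: `err_name_sketch()`, `err_probe_sketch()`, `setup_step_sketch()` and `probe_sketch()` are hypothetical names, and `err_probe_sketch()` only imitates the "error <errno name>: <message>" shape of the log line quoted in the commit message.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: map a few negative errno values to their names. */
static const char *err_name_sketch(int err)
{
	switch (err) {
	case -ENODEV:
		return "-ENODEV";
	case -ENOMEM:
		return "-ENOMEM";
	case -EIO:
		return "-EIO";
	default:
		return "(unnamed errno)";
	}
}

/*
 * Hypothetical userspace stand-in for dev_err_probe(): formats
 * "<dev>: error <errno name>: <msg>" into log and returns err unchanged.
 */
static int err_probe_sketch(char *log, size_t len, const char *dev,
			    int err, const char *msg)
{
	snprintf(log, len, "%s: error %s: %s", dev, err_name_sketch(err), msg);
	return err;
}

/* Hypothetical setup step: fails with -ENODEV when step == fail_at. */
static int setup_step_sketch(int step, int fail_at)
{
	return step == fail_at ? -ENODEV : 0;
}

/*
 * Probe-like function: every failure funnels through one exit label,
 * so a single log call placed last before the return covers all paths.
 */
static int probe_sketch(int fail_at, char *log, size_t len)
{
	int result;

	log[0] = '\0';

	result = setup_step_sketch(1, fail_at);
	if (result)
		goto out_fail;

	result = setup_step_sketch(2, fail_at);
	if (result)
		goto out_fail;

	return 0;

out_fail:
	err_probe_sketch(log, len, "nvme 0008:00:00.0", result, "probe failed\n");
	return result;
}
```

Calling `probe_sketch(2, log, sizeof(log))` fails at the second step and leaves one line in `log`, while `probe_sketch(0, ...)` succeeds and logs nothing. Placing the message at the final label means new failure paths added later are covered without extra log calls.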




* [PATCH v3 2/2] nvme-pci: Add debug message on fail to read CSTS
  2025-10-24 11:05 [PATCH v3 0/2] nvme-pci: Log failures to probe Gerd Bayer
  2025-10-24 11:05 ` [PATCH v3 1/2] nvme-pci: Print error message on failure in nvme_probe Gerd Bayer
@ 2025-10-24 11:05 ` Gerd Bayer
  2025-10-27 16:13 ` [PATCH v3 0/2] nvme-pci: Log failures to probe Keith Busch
  2 siblings, 0 replies; 5+ messages in thread
From: Gerd Bayer @ 2025-10-24 11:05 UTC
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, John Garry, linux-nvme, linux-kernel, Gerd Bayer

Add a debug log spelling out that reading the CSTS register failed - to
distinguish this from other reasons for ENODEV.

Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
 drivers/nvme/host/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8c624960c9f67e0cc8409023de5e532d6ed9b3ac..15c12e6cba884a9bb5248f70d75490254014b9f9 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2906,6 +2906,7 @@ static int nvme_pci_enable(struct nvme_dev *dev)
 	pci_set_master(pdev);
 
 	if (readl(dev->bar + NVME_REG_CSTS) == -1) {
+		dev_dbg(dev->ctrl.device, "reading CSTS register failed\n");
 		result = -ENODEV;
 		goto disable;
 	}

-- 
2.48.1
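
For readers unfamiliar with the `readl(...) == -1` test in the hunk above: a 32-bit MMIO read from a device that does not respond typically comes back as all ones, and comparing an unsigned 32-bit value against `-1` converts the `-1` to `0xFFFFFFFF`. The following is a minimal userspace sketch under those assumptions, not the driver code: `readl_sketch()`, `check_csts_sketch()` and `NVME_REG_CSTS_SKETCH` are hypothetical names, and a plain array stands in for the mapped BAR.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* CSTS sits at byte offset 0x1c in the NVMe controller register map. */
#define NVME_REG_CSTS_SKETCH 0x1c

/* Hypothetical stand-in for readl(): fetch a 32-bit register from an array. */
static uint32_t readl_sketch(const uint32_t *bar, size_t offset)
{
	return bar[offset / sizeof(uint32_t)];
}

/*
 * Returns -ENODEV when CSTS reads back as all ones, which is taken to mean
 * the device did not answer the read; returns 0 otherwise.
 */
static int check_csts_sketch(const uint32_t *bar)
{
	/* (uint32_t)-1 is 0xFFFFFFFF, the same value "== -1" compares against. */
	if (readl_sketch(bar, NVME_REG_CSTS_SKETCH) == (uint32_t)-1)
		return -ENODEV;
	return 0;
}
```

A register array filled with `0xFFFFFFFF` makes `check_csts_sketch()` return `-ENODEV`, while any other CSTS value passes. Since several paths can yield `-ENODEV`, a message at this specific check is what lets a reader of the log tell this cause apart from the others.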




* Re: [PATCH v3 1/2] nvme-pci: Print error message on failure in nvme_probe
  2025-10-24 11:05 ` [PATCH v3 1/2] nvme-pci: Print error message on failure in nvme_probe Gerd Bayer
@ 2025-10-26  7:20   ` Wilfred Mallawa
  0 siblings, 0 replies; 5+ messages in thread
From: Wilfred Mallawa @ 2025-10-26  7:20 UTC
  To: kbusch@kernel.org, gbayer@linux.ibm.com, axboe@kernel.dk,
	sagi@grimberg.me, hch
  Cc: john.g.garry@oracle.com, gregkh@linuxfoundation.org,
	schnelle@linux.ibm.com, linux-nvme@lists.infradead.org,
	kch@nvidia.com, linux-kernel@vger.kernel.org

On Fri, 2025-10-24 at 13:05 +0200, Gerd Bayer wrote:
> Add a new error message that makes failures to probe visible in the
> kernel log, like:
> nvme 0008:00:00.0: error -ENODEV: probe failed
> 
> This highlights issues with a particular device right away instead of
> leaving users to search for missing drives.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>  drivers/nvme/host/pci.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index c916176bd9f058b49e6e6768675711df52b15765..8c624960c9f67e0cc8409023de5e532d6ed9b3ac 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3522,6 +3522,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	nvme_uninit_ctrl(&dev->ctrl);
>  out_put_ctrl:
>  	nvme_put_ctrl(&dev->ctrl);
> +	dev_err_probe(&pdev->dev, result, "probe failed\n");
>  	return result;
>  }
>  
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>

Regards,
Wilfred


* Re: [PATCH v3 0/2] nvme-pci: Log failures to probe
  2025-10-24 11:05 [PATCH v3 0/2] nvme-pci: Log failures to probe Gerd Bayer
  2025-10-24 11:05 ` [PATCH v3 1/2] nvme-pci: Print error message on failure in nvme_probe Gerd Bayer
  2025-10-24 11:05 ` [PATCH v3 2/2] nvme-pci: Add debug message on fail to read CSTS Gerd Bayer
@ 2025-10-27 16:13 ` Keith Busch
  2 siblings, 0 replies; 5+ messages in thread
From: Keith Busch @ 2025-10-27 16:13 UTC
  To: Gerd Bayer
  Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, Niklas Schnelle,
	Greg Kroah-Hartman, Chaitanya Kulkarni, Wilfred Mallawa,
	John Garry, linux-nvme, linux-kernel

On Fri, Oct 24, 2025 at 01:05:30PM +0200, Gerd Bayer wrote:
> occasionally, an NVMe drive fails to reply to MMIO reads despite the PCI
> link being operational. In these situations (obviously not on the
> boot-device :) it is preferred to have this reported in the kernel log
> instead of deducing that something is wrong by comparing expected vs.
> observed number of nvme devices.

Thanks, applied to nvme-6.19.

