linux-nvme.lists.infradead.org archive mirror
* [PATCH v2 0/2] nvme-pci: Log failures to probe
@ 2025-10-22 10:33 Gerd Bayer
  2025-10-22 10:33 ` [PATCH v2 1/2] nvme-pci: Print error message on failure in nvme_probe Gerd Bayer
  2025-10-22 10:33 ` [PATCH v2 2/2] nvme-pci: Add debug message on fail to read CSTS Gerd Bayer
  0 siblings, 2 replies; 8+ messages in thread
From: Gerd Bayer @ 2025-10-22 10:33 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, linux-nvme, linux-kernel, Gerd Bayer,
	Chaitanya Kulkarni

Hi,

occasionally, an NVMe drive fails to reply to MMIO reads despite the PCI
link being operational. In these situations (obviously not on the
boot device :) it is preferable to have this reported in the kernel log
instead of deducing that something is wrong by comparing the expected vs.
observed number of nvme devices.

 - 1/2 adds a clear indication that the device driver failed to probe an
   NVMe drive and will not register an nvme device.
 - 2/2 adds a debug log that helps narrow down the exact failure.

I've split this into two patches, since 2/2 might be less important with
regard to backports, etc.
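
For illustration only (not part of the patches): a minimal userspace
sketch of the two log points, assuming that a failed MMIO read returns
all ones. The names fake_readl(), fake_pci_enable() and fake_probe() are
hypothetical stand-ins for readl(), nvme_pci_enable() and nvme_probe();
the real driver uses dev_dbg() and dev_err_probe() instead of fprintf(),
and dev_err_probe() prints the symbolic errno name rather than a number.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for readl(): an unanswered read yields all ones. */
static uint32_t fake_readl(int device_responds)
{
	return device_responds ? 0x00000001u : 0xffffffffu;
}

/*
 * Sketch of the check touched by 2/2. The driver compares the u32 result
 * against -1, which converts to 0xffffffff, and bails out with -ENODEV.
 */
static int fake_pci_enable(int device_responds)
{
	if (fake_readl(device_responds) == (uint32_t)-1) {
		/* Stands in for the new dev_dbg() call. */
		fprintf(stderr, "debug: reading CSTS register failed\n");
		return -ENODEV;
	}
	return 0;
}

/*
 * Sketch of the error path touched by 1/2: whatever went wrong earlier,
 * one error line naming the device is printed before returning.
 */
static int fake_probe(const char *pci_name, int device_responds)
{
	int result = fake_pci_enable(device_responds);

	if (result) {
		/* Stands in for dev_err_probe(); prints the numeric errno. */
		fprintf(stderr, "nvme %s: error %d: probe failed\n",
			pci_name, result);
	}
	return result;
}
```

Calling fake_probe("0008:00:00.0", 0) returns -ENODEV and prints both
lines to stderr; with a responding device it returns 0 and stays silent.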

Thanks,
Gerd

Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
Changes in v2:
- Patch 1/2:
  - Replace dev_err() with dev_err_probe(), now last before return
  - Drop R-b from Wilfred Mallawa
- Patch 2/2: Reword as suggested by Christoph Hellwig, pick up R-b's
- Link to v1: https://lore.kernel.org/r/20251020-nvme_probefail-v1-0-a420046d98f0@linux.ibm.com

---
Gerd Bayer (2):
      nvme-pci: Print error message on failure in nvme_probe
      nvme-pci: Add debug message on fail to read CSTS

 drivers/nvme/host/pci.c | 2 ++
 1 file changed, 2 insertions(+)
---
base-commit: 211ddde0823f1442e4ad052a2f30f050145ccada
change-id: 20251015-nvme_probefail-3976120a4719

Best regards,
-- 
Gerd Bayer <gbayer@linux.ibm.com>




* [PATCH v2 1/2] nvme-pci: Print error message on failure in nvme_probe
  2025-10-22 10:33 [PATCH v2 0/2] nvme-pci: Log failures to probe Gerd Bayer
@ 2025-10-22 10:33 ` Gerd Bayer
  2025-10-22 10:53   ` Gerd Bayer
  2025-10-23  8:13   ` Christoph Hellwig
  2025-10-22 10:33 ` [PATCH v2 2/2] nvme-pci: Add debug message on fail to read CSTS Gerd Bayer
  1 sibling, 2 replies; 8+ messages in thread
From: Gerd Bayer @ 2025-10-22 10:33 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, linux-nvme, linux-kernel, Gerd Bayer

Add a new error message that makes failures to probe visible in the
kernel log, like:
nvme 0008:00:00.0: error -ENODEV: probe failed

This highlights issues with a particular device right away instead of
leaving users to search for missing drives.

Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
 drivers/nvme/host/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c916176bd9f058b49e6e6768675711df52b15765..8c624960c9f67e0cc8409023de5e532d6ed9b3ac 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3522,6 +3522,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	nvme_uninit_ctrl(&dev->ctrl);
 out_put_ctrl:
 	nvme_put_ctrl(&dev->ctrl);
+	dev_err_probe(&pdev->dev, result, "probe failed\n");
 	return result;
 }
 

-- 
2.48.1




* [PATCH v2 2/2] nvme-pci: Add debug message on fail to read CSTS
  2025-10-22 10:33 [PATCH v2 0/2] nvme-pci: Log failures to probe Gerd Bayer
  2025-10-22 10:33 ` [PATCH v2 1/2] nvme-pci: Print error message on failure in nvme_probe Gerd Bayer
@ 2025-10-22 10:33 ` Gerd Bayer
  2025-10-23  8:13   ` Christoph Hellwig
  2025-10-23 10:55   ` John Garry
  1 sibling, 2 replies; 8+ messages in thread
From: Gerd Bayer @ 2025-10-22 10:33 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, linux-nvme, linux-kernel, Gerd Bayer,
	Chaitanya Kulkarni

Add a debug log spelling out that reading the CSTS register failed - to
distinguish this from other reasons for ENODEV.

Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
 drivers/nvme/host/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8c624960c9f67e0cc8409023de5e532d6ed9b3ac..ce563e93d29443a25e9236d0b6d8152ee85796d7 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2906,6 +2906,7 @@ static int nvme_pci_enable(struct nvme_dev *dev)
 	pci_set_master(pdev);
 
 	if (readl(dev->bar + NVME_REG_CSTS) == -1) {
+		dev_dbg(dev->ctrl.device, "reading CSTS register failed");
 		result = -ENODEV;
 		goto disable;
 	}

-- 
2.48.1




* Re: [PATCH v2 1/2] nvme-pci: Print error message on failure in nvme_probe
  2025-10-22 10:33 ` [PATCH v2 1/2] nvme-pci: Print error message on failure in nvme_probe Gerd Bayer
@ 2025-10-22 10:53   ` Gerd Bayer
  2025-10-23  8:13   ` Christoph Hellwig
  1 sibling, 0 replies; 8+ messages in thread
From: Gerd Bayer @ 2025-10-22 10:53 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
  Cc: Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, linux-nvme, linux-kernel

On Wed, 2025-10-22 at 12:33 +0200, Gerd Bayer wrote:
> Add a new error message that makes failures to probe visible in the
> kernel log, like:
> nvme 0008:00:00.0: error -ENODEV: probe failed
> 
> This highlights issues with a particular device right away instead of
> leaving users to search for missing drives.
> 
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>  drivers/nvme/host/pci.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index c916176bd9f058b49e6e6768675711df52b15765..8c624960c9f67e0cc8409023de5e532d6ed9b3ac 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3522,6 +3522,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	nvme_uninit_ctrl(&dev->ctrl);
>  out_put_ctrl:
>  	nvme_put_ctrl(&dev->ctrl);
> +	dev_err_probe(&pdev->dev, result, "probe failed\n");
>  	return result;
>  }

This one failed to reach Chaitanya due to a typo in the email address
on my side. Sorry!



* Re: [PATCH v2 1/2] nvme-pci: Print error message on failure in nvme_probe
  2025-10-22 10:33 ` [PATCH v2 1/2] nvme-pci: Print error message on failure in nvme_probe Gerd Bayer
  2025-10-22 10:53   ` Gerd Bayer
@ 2025-10-23  8:13   ` Christoph Hellwig
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-10-23  8:13 UTC (permalink / raw)
  To: Gerd Bayer
  Cc: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, linux-nvme, linux-kernel

On Wed, Oct 22, 2025 at 12:33:16PM +0200, Gerd Bayer wrote:
> Add a new error message that makes failures to probe visible in the
> kernel log, like:
> nvme 0008:00:00.0: error -ENODEV: probe failed
> 
> This highlights issues with a particular device right away instead of
> leaving users to search for missing drives.
> 
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>




* Re: [PATCH v2 2/2] nvme-pci: Add debug message on fail to read CSTS
  2025-10-22 10:33 ` [PATCH v2 2/2] nvme-pci: Add debug message on fail to read CSTS Gerd Bayer
@ 2025-10-23  8:13   ` Christoph Hellwig
  2025-10-23 10:55   ` John Garry
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-10-23  8:13 UTC (permalink / raw)
  To: Gerd Bayer
  Cc: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
	Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, linux-nvme, linux-kernel

On Wed, Oct 22, 2025 at 12:33:17PM +0200, Gerd Bayer wrote:
> Add a debug log spelling out that reading the CSTS register failed - to
> distinguish this from other reasons for ENODEV.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH v2 2/2] nvme-pci: Add debug message on fail to read CSTS
  2025-10-22 10:33 ` [PATCH v2 2/2] nvme-pci: Add debug message on fail to read CSTS Gerd Bayer
  2025-10-23  8:13   ` Christoph Hellwig
@ 2025-10-23 10:55   ` John Garry
  2025-10-23 14:17     ` Gerd Bayer
  1 sibling, 1 reply; 8+ messages in thread
From: John Garry @ 2025-10-23 10:55 UTC (permalink / raw)
  To: Gerd Bayer, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg
  Cc: Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, linux-nvme, linux-kernel, Chaitanya Kulkarni

On 22/10/2025 11:33, Gerd Bayer wrote:
> Add a debug log spelling out that reading the CSTS register failed - to
> distinguish this from other reasons for ENODEV.
> 
> Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> ---
>   drivers/nvme/host/pci.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 8c624960c9f67e0cc8409023de5e532d6ed9b3ac..ce563e93d29443a25e9236d0b6d8152ee85796d7 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2906,6 +2906,7 @@ static int nvme_pci_enable(struct nvme_dev *dev)
>   	pci_set_master(pdev);
>   
>   	if (readl(dev->bar + NVME_REG_CSTS) == -1) {
> +		dev_dbg(dev->ctrl.device, "reading CSTS register failed");

we should add "\n" as a good practice, no?

>   		result = -ENODEV;
>   		goto disable;
>   	}
> 




* Re: [PATCH v2 2/2] nvme-pci: Add debug message on fail to read CSTS
  2025-10-23 10:55   ` John Garry
@ 2025-10-23 14:17     ` Gerd Bayer
  0 siblings, 0 replies; 8+ messages in thread
From: Gerd Bayer @ 2025-10-23 14:17 UTC (permalink / raw)
  To: John Garry, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg
  Cc: Niklas Schnelle, Greg Kroah-Hartman, Chaitanya Kulkarni,
	Wilfred Mallawa, linux-nvme, linux-kernel, Chaitanya Kulkarni

On Thu, 2025-10-23 at 11:55 +0100, John Garry wrote:
> On 22/10/2025 11:33, Gerd Bayer wrote:
> > Add a debug log spelling out that reading the CSTS register failed - to
> > distinguish this from other reasons for ENODEV.
> > 
> > Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> > Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> > Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
> > ---
> >   drivers/nvme/host/pci.c | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index 8c624960c9f67e0cc8409023de5e532d6ed9b3ac..ce563e93d29443a25e9236d0b6d8152ee85796d7 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -2906,6 +2906,7 @@ static int nvme_pci_enable(struct nvme_dev *dev)
> >   	pci_set_master(pdev);
> >   
> >   	if (readl(dev->bar + NVME_REG_CSTS) == -1) {
> > +		dev_dbg(dev->ctrl.device, "reading CSTS register failed");
> 
> we should add "\n" as a good practice, no?

that's a very valid point. Sheesh - how much can go wrong...

Thanks,
Gerd

> 
> >   		result = -ENODEV;
> >   		goto disable;
> >   	}
> > 

