* [PATCH] nvme: fixup boot failure on nvme-pci
@ 2025-04-14 12:05 hare
  2025-04-14 12:26 ` Christoph Hellwig
  2025-04-14 22:09 ` Sagi Grimberg
  0 siblings, 2 replies; 7+ messages in thread
From: hare @ 2025-04-14 12:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke,
	Srikanth Aithal

From: Hannes Reinecke <hare@kernel.org>

Commit 62baf70c3274 caused the ANA log page to be re-read, even on systems
where the ANA is not supported.

Fixes: 62baf70c3274 ("nvme: re-read ANA log page after ns scan completes")

Signed-off-by: Hannes Reinecke <hare@kernel.org>
Tested-by: Srikanth Aithal <sraithal@amd.com>
---
 drivers/nvme/host/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b502ac07483b..eb6ea8acb3cc 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4300,7 +4300,7 @@ static void nvme_scan_work(struct work_struct *work)
 	if (test_bit(NVME_AER_NOTICE_NS_CHANGED, &ctrl->events))
 		nvme_queue_scan(ctrl);
 #ifdef CONFIG_NVME_MULTIPATH
-	else
+	else if (ctrl->ana_log_buf)
 		/* Re-read the ANA log page to not miss updates */
 		queue_work(nvme_wq, &ctrl->ana_work);
 #endif
-- 
2.35.3
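
For readers following the one-line change, the patched branch can be modelled outside the kernel. The sketch below is a minimal user-space illustration, not kernel code: `struct mock_ctrl`, `enum scan_followup` and `scan_followup()` are hypothetical names, and the boolean stands in for the `test_bit(NVME_AER_NOTICE_NS_CHANGED, &ctrl->events)` check. It assumes, as the patch implies, that `ctrl->ana_log_buf` is NULL on controllers that do not support ANA.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pieces of controller state the hunk reads. */
struct mock_ctrl {
	bool ns_changed_pending;	/* models the NVME_AER_NOTICE_NS_CHANGED event bit */
	void *ana_log_buf;		/* assumed NULL when no ANA log buffer exists */
};

/* What the tail of the scan work decides to do next (illustrative only). */
enum scan_followup {
	FOLLOWUP_NONE,		/* nothing further to queue */
	FOLLOWUP_RESCAN,	/* corresponds to nvme_queue_scan(ctrl) */
	FOLLOWUP_ANA_WORK,	/* corresponds to queue_work(nvme_wq, &ctrl->ana_work) */
};

/* Mirrors the patched if / else-if chain with CONFIG_NVME_MULTIPATH enabled. */
static enum scan_followup scan_followup(const struct mock_ctrl *ctrl)
{
	if (ctrl->ns_changed_pending)
		return FOLLOWUP_RESCAN;
	else if (ctrl->ana_log_buf)
		return FOLLOWUP_ANA_WORK;
	return FOLLOWUP_NONE;
}
```

Before the patch, the second branch was a bare `else`, so a controller without an ANA log buffer would still have had the ANA work queued; with the guard, that case falls through and nothing is queued.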




* Re: [PATCH] nvme: fixup boot failure on nvme-pci
  2025-04-14 12:05 [PATCH] nvme: fixup boot failure on nvme-pci hare
@ 2025-04-14 12:26 ` Christoph Hellwig
  2025-04-14 13:07   ` Aithal, Srikanth
                     ` (2 more replies)
  2025-04-14 22:09 ` Sagi Grimberg
  1 sibling, 3 replies; 7+ messages in thread
From: Christoph Hellwig @ 2025-04-14 12:26 UTC (permalink / raw)
  To: hare
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme,
	Srikanth Aithal

On Mon, Apr 14, 2025 at 02:05:09PM +0200, hare@kernel.org wrote:
> From: Hannes Reinecke <hare@kernel.org>
> 
> Commit 62baf70c3274 caused the ANA log page to be re-read, even on systems
> where the ANA is not supported.

An unsupported log page should normally not cause a boot failure, but
it seems the controller in question does not handle it well.  I've
applied the patch with a better subject and commit message explaining this.

But if the controller handles unsupported log pages so badly it will
probably cause trouble in the future as well, or even now when
applications ask for unsupported log pages using the passthrough
interfaces.

Srikanth: what controller is this?  I'd like to add that to the commit
message as well.




* Re: [PATCH] nvme: fixup boot failure on nvme-pci
  2025-04-14 12:26 ` Christoph Hellwig
@ 2025-04-14 13:07   ` Aithal, Srikanth
  2025-04-14 14:23   ` Domenico Andreoli
  2025-04-16  6:48   ` Aithal, Srikanth
  2 siblings, 0 replies; 7+ messages in thread
From: Aithal, Srikanth @ 2025-04-14 13:07 UTC (permalink / raw)
  To: Christoph Hellwig, hare; +Cc: Sagi Grimberg, Keith Busch, linux-nvme


On 4/14/2025 5:56 PM, Christoph Hellwig wrote:
> On Mon, Apr 14, 2025 at 02:05:09PM +0200, hare@kernel.org wrote:
>> From: Hannes Reinecke <hare@kernel.org>
>>
>> Commit 62baf70c3274 caused the ANA log page to be re-read, even on systems
>> where the ANA is not supported.
> An unsupported log page should normally not cause a boot failure, but
> it seems the controller in question does not handle it well.  I've
> applied the patch with a better subject and commit message explaining this.
>
> But if the controller handles unsupported log pages so badly it will
> probably cause trouble in the future as well, or even now when
> applications ask for unsupported log pages using the passthrough
> interfaces.
>
> Srikanth: what controller is this?  I'd like to add that to the commit
> message as well.

It is a Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe 
SSD Controller SM981/PM981/PM983


>



* Re: [PATCH] nvme: fixup boot failure on nvme-pci
  2025-04-14 12:26 ` Christoph Hellwig
  2025-04-14 13:07   ` Aithal, Srikanth
@ 2025-04-14 14:23   ` Domenico Andreoli
  2025-04-16  6:48   ` Aithal, Srikanth
  2 siblings, 0 replies; 7+ messages in thread
From: Domenico Andreoli @ 2025-04-14 14:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: hare, Sagi Grimberg, Keith Busch, linux-nvme, Srikanth Aithal

On Mon, Apr 14, 2025 at 02:26:16PM +0200, Christoph Hellwig wrote:
> On Mon, Apr 14, 2025 at 02:05:09PM +0200, hare@kernel.org wrote:
> > From: Hannes Reinecke <hare@kernel.org>
> > 
> > Commit 62baf70c3274 caused the ANA log page to be re-read, even on systems
> > where the ANA is not supported.
> 
> An unsupported log page should normally not cause a boot failure, but
> it seems the controller in question does not handle it well.  I've
> applied the patch with a better subject and commit message explaining this.

These are the messages that hang my FriendlyELEC NanoPI M4 SBC at boot:

[    4.342362] nvme nvme0: 6/0/0 default/read/poll queues
[    4.359986]  nvme0n1: p2 p3 p8 p9
[   35.830402] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x1010
[   35.879571] nvme0n1: I/O Cmd(0x2) @ LBA 1000215040, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
[   35.880346] I/O error, dev nvme0n1, sector 1000215040 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   35.881320] nvme nvme0: Failed to get ANA log: -4
[   35.926157] nvme nvme0: D3 entry latency set to 8 seconds
[   35.939041] nvme nvme0: 6/0/0 default/read/poll queues
[   66.550428] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x1010
[   66.600020] nvme0n1: I/O Cmd(0x2) @ LBA 1000215152, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
[   66.600806] I/O error, dev nvme0n1, sector 1000215152 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   66.610493] nvme nvme0: Failed to get ANA log: -4
[   66.654065] nvme nvme0: D3 entry latency set to 8 seconds
[   66.667024] nvme nvme0: 6/0/0 default/read/poll queues
[   97.270420] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x1010
[   97.320023] nvme0n1: I/O Cmd(0x2) @ LBA 1000214240, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
[   97.320796] I/O error, dev nvme0n1, sector 1000214240 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   97.330486] nvme nvme0: Failed to get ANA log: -4
[   97.378015] nvme nvme0: D3 entry latency set to 8 seconds
[   97.390837] nvme nvme0: 6/0/0 default/read/poll queues
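
The `(sct 0x3 / sc 0x71)` pair in the log above can be decoded with a small helper. This is a rough sketch, not taken from the thread: `pack_status()` and `describe_path_status()` are hypothetical names, and the numeric values are the path-related status codes as I understand them from the kernel's `include/linux/nvme.h` (`NVME_SC_ANA_*`, `NVME_SC_HOST_PATH_ERROR`, `NVME_SC_HOST_ABORTED_CMD`). Only the handful of codes relevant here are covered.

```c
#include <stdint.h>

/* Combine status code type and status code as (sct << 8) | sc. */
static uint16_t pack_status(uint8_t sct, uint8_t sc)
{
	return (uint16_t)(((sct & 0x7u) << 8) | sc);
}

/* Name a few path-related statuses (SCT 0x3); anything else is left unnamed. */
static const char *describe_path_status(uint8_t sct, uint8_t sc)
{
	switch (pack_status(sct, sc)) {
	case 0x301:
		return "ANA persistent loss";
	case 0x302:
		return "ANA inaccessible";
	case 0x303:
		return "ANA transition";
	case 0x370:
		return "host pathing error";
	case 0x371:
		return "command aborted by host";
	default:
		return "not covered by this sketch";
	}
}
```

Read this way, `sct 0x3 / sc 0x71` suggests the read was aborted by the host while the controller was being reset, not that the medium failed. The `-4` in "Failed to get ANA log: -4" is `-EINTR` on Linux, which would be consistent with the ANA log request being cancelled by the same reset.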

> But if the controller handles unsupported log pages so badly it will
> probably cause trouble in the future as well, or even now when
> applications ask for unsupported log pages using the passthrough
> interfaces.
> 
> Srikanth: what controller is this?  I'd like to add that to the commit
> message as well.

00:00.0 PCI bridge: Rockchip Electronics Co., Ltd RK3399 PCI Express Root Port
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983

With the patch above everything works again.

Thanks,
Dom

-- 
rsa4096: 3B10 0CA1 8674 ACBA B4FE  FCD2 CE5B CF17 9960 DE13
ed25519: FFB4 0CC3 7F2E 091D F7DA  356E CC79 2832 ED38 CB05



* Re: [PATCH] nvme: fixup boot failure on nvme-pci
  2025-04-14 12:05 [PATCH] nvme: fixup boot failure on nvme-pci hare
  2025-04-14 12:26 ` Christoph Hellwig
@ 2025-04-14 22:09 ` Sagi Grimberg
  1 sibling, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2025-04-14 22:09 UTC (permalink / raw)
  To: hare, Christoph Hellwig; +Cc: Keith Busch, linux-nvme, Srikanth Aithal

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>



* Re: [PATCH] nvme: fixup boot failure on nvme-pci
  2025-04-14 12:26 ` Christoph Hellwig
  2025-04-14 13:07   ` Aithal, Srikanth
  2025-04-14 14:23   ` Domenico Andreoli
@ 2025-04-16  6:48   ` Aithal, Srikanth
  2025-04-16 14:29     ` Keith Busch
  2 siblings, 1 reply; 7+ messages in thread
From: Aithal, Srikanth @ 2025-04-16  6:48 UTC (permalink / raw)
  To: Christoph Hellwig, hare; +Cc: Sagi Grimberg, Keith Busch, linux-nvme

Hello,

When will this fix be included in linux-next? Please let me know.


On 4/14/2025 5:56 PM, Christoph Hellwig wrote:
> On Mon, Apr 14, 2025 at 02:05:09PM +0200, hare@kernel.org wrote:
>> From: Hannes Reinecke <hare@kernel.org>
>>
>> Commit 62baf70c3274 caused the ANA log page to be re-read, even on systems
>> where the ANA is not supported.
> An unsupported log page should normally not cause a boot failure, but
> it seems the controller in question does not handle it well.  I've
> applied the patch with a better subject and commit message explaining this.
>
> But if the controller handles unsupported log pages so badly it will
> probably cause trouble in the future as well, or even now when
> applications ask for unsupported log pages using the passthrough
> interfaces.
>
> Srikanth: what controller is this?  I'd like to add that to the commit
> message as well.
>



* Re: [PATCH] nvme: fixup boot failure on nvme-pci
  2025-04-16  6:48   ` Aithal, Srikanth
@ 2025-04-16 14:29     ` Keith Busch
  0 siblings, 0 replies; 7+ messages in thread
From: Keith Busch @ 2025-04-16 14:29 UTC (permalink / raw)
  To: Aithal, Srikanth; +Cc: Christoph Hellwig, hare, Sagi Grimberg, linux-nvme

On Wed, Apr 16, 2025 at 12:18:54PM +0530, Aithal, Srikanth wrote:
> Hello,
> 
> When will this fix be included in linux-next? Please let me know.

Pull requests from the linux-nvme tree usually go out on Thursday, so
tomorrow.

