* [PATCH] nvme-pci: fix resume after AER recovery
@ 2023-01-30 10:14 Christoph Hellwig
2023-01-30 18:35 ` Keith Busch
0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2023-01-30 10:14 UTC (permalink / raw)
To: kbusch, sagi; +Cc: linux-nvme, Maciej Grochowski

All I/O on an nvme controller hangs after injecting a malformed TLP error
using aer-inject with an error file like:
using aer-inject with an error file like:
--- snip ---
AER
PCI_ID WWWW:XX.YY.Z
UNCOR_STATUS COMP_TIME
HEADER_LOG 0 1 2 3
--- snip ---

This is because in this case the ->resume method will be called after
->error_detected and not ->slot_reset, leaving the controller in disabled
state and the queue frozen. Fix this by doing a controller reset to
resume as well.
Fixes: a0a3408ee614 ("NVMe: Add pci error handlers")
Reported-by: Maciej Grochowski <Maciej.Grochowski@sony.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Maciej Grochowski <Maciej.Grochowski@sony.com>
---
drivers/nvme/host/pci.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c734934c407ccf..ec1e95d1a8c236 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3336,21 +3336,19 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
 	return PCI_ERS_RESULT_NEED_RESET;
 }
 
-static pci_ers_result_t nvme_slot_reset(struct pci_dev *pdev)
+static void nvme_error_resume(struct pci_dev *pdev)
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
 
 	dev_info(dev->ctrl.device, "restart after slot reset\n");
 	pci_restore_state(pdev);
 	nvme_reset_ctrl(&dev->ctrl);
-	return PCI_ERS_RESULT_RECOVERED;
 }
 
-static void nvme_error_resume(struct pci_dev *pdev)
+static pci_ers_result_t nvme_slot_reset(struct pci_dev *pdev)
 {
-	struct nvme_dev *dev = pci_get_drvdata(pdev);
-
-	flush_work(&dev->ctrl.reset_work);
+	nvme_error_resume(pdev);
+	return PCI_ERS_RESULT_RECOVERED;
 }
 
 static const struct pci_error_handlers nvme_err_handler = {
--
2.39.0
^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-01-30 10:14 [PATCH] nvme-pci: fix resume after AER recovery Christoph Hellwig
@ 2023-01-30 18:35 ` Keith Busch
  2023-01-30 18:43   ` Keith Busch
  0 siblings, 1 reply; 26+ messages in thread
From: Keith Busch @ 2023-01-30 18:35 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: sagi, linux-nvme, Maciej Grochowski

On Mon, Jan 30, 2023 at 11:14:49AM +0100, Christoph Hellwig wrote:
> All I/O on an nvme controller hangs after injecting a malformed TLP error
> using aer-inject with an error file like:
>
> --- snip ---
> AER
> PCI_ID WWWW:XX.YY.Z
> UNCOR_STATUS COMP_TIME
> HEADER_LOG 0 1 2 3
> --- snip ---
>
> This is because in this case the ->resume method will be called after
> ->error_detected and not ->slot_reset, leaving the controller in disabled
> state and the queue frozen. Fix this by doing a controller reset to
> resume as well.

Why isn't slot_reset being called after error_detected? Driver should be
returning "RESULT_NEEDS_RESET", which should have the pcie error handling
always invoke the slot_reset callback.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-01-30 18:35 ` Keith Busch
@ 2023-01-30 18:43   ` Keith Busch
  2023-01-30 18:54     ` Grochowski, Maciej
  2023-01-31  8:58     ` Christoph Hellwig
  0 siblings, 2 replies; 26+ messages in thread
From: Keith Busch @ 2023-01-30 18:43 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: sagi, linux-nvme, Maciej Grochowski

On Mon, Jan 30, 2023 at 11:35:36AM -0700, Keith Busch wrote:
> On Mon, Jan 30, 2023 at 11:14:49AM +0100, Christoph Hellwig wrote:
> > All I/O on an nvme controller hangs after injecting a malformed TLP error
> > using aer-inject with an error file like:
> >
> > --- snip ---
> > AER
> > PCI_ID WWWW:XX.YY.Z
> > UNCOR_STATUS COMP_TIME
> > HEADER_LOG 0 1 2 3
> > --- snip ---
> >
> > This is because in this case the ->resume method will be called after
> > ->error_detected and not ->slot_reset, leaving the controller in disabled
> > state and the queue frozen. Fix this by doing a controller reset to
> > resume as well.
>
> Why isn't slot_reset being called after error_detected? Driver should be
> returning "RESULT_NEEDS_RESET", which should have the pcie error handling
> always invoke the slot_reset callback.

Are you using an older kernel that doesn't have

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=387c72cdd7fb6bef650fb078d0f6ae9682abf631

?

^ permalink raw reply	[flat|nested] 26+ messages in thread
* RE: [PATCH] nvme-pci: fix resume after AER recovery
  2023-01-30 18:43   ` Keith Busch
@ 2023-01-30 18:54     ` Grochowski, Maciej
  1 sibling, 0 replies; 26+ messages in thread
From: Grochowski, Maciej @ 2023-01-30 18:54 UTC (permalink / raw)
To: Keith Busch, Christoph Hellwig
Cc: sagi@grimberg.me, linux-nvme@lists.infradead.org

The issue was spotted on 5.10 LTS. I checked the sources and indeed they
are from before commit 387c72cd. Let me update that and retest.

-----Original Message-----
From: Keith Busch <kbusch@kernel.org>
Sent: Monday, January 30, 2023 10:43 AM
To: Christoph Hellwig <hch@lst.de>
Cc: sagi@grimberg.me; linux-nvme@lists.infradead.org; Grochowski, Maciej <Maciej.Grochowski@sony.com>
Subject: Re: [PATCH] nvme-pci: fix resume after AER recovery

On Mon, Jan 30, 2023 at 11:35:36AM -0700, Keith Busch wrote:
> On Mon, Jan 30, 2023 at 11:14:49AM +0100, Christoph Hellwig wrote:
> > All I/O on an nvme controller hangs after injecting a malformed TLP
> > error using aer-inject with an error file like:
> >
> > --- snip ---
> > AER
> > PCI_ID WWWW:XX.YY.Z
> > UNCOR_STATUS COMP_TIME
> > HEADER_LOG 0 1 2 3
> > --- snip ---
> >
> > This is because in this case the ->resume method will be called
> > after ->error_detected and not ->slot_reset, leaving the controller
> > in disabled state and the queue frozen. Fix this by doing a
> > controller reset to resume as well.
>
> Why isn't slot_reset being called after error_detected? Driver should
> be returning "RESULT_NEEDS_RESET", which should have the pcie error
> handling always invoke the slot_reset callback.

Are you using an older kernel that doesn't have

https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=387c72cdd7fb6bef650fb078d0f6ae9682abf631__;!!JmoZiZGBv3RvKRSx!40iIn65MdrKZ0SfOKFPZh2uzo1KyAjcza3Tj7fDDip143yds9jH361GQ09RcoZbiYiX0ot8uZYS5-Q2cGWE6$

?

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-01-30 18:43   ` Keith Busch
  2023-01-30 18:54     ` Grochowski, Maciej
@ 2023-01-31  8:58     ` Christoph Hellwig
  2023-01-31 15:22       ` Keith Busch
  1 sibling, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2023-01-31 8:58 UTC (permalink / raw)
To: Keith Busch; +Cc: Christoph Hellwig, sagi, linux-nvme, Maciej Grochowski

On Mon, Jan 30, 2023 at 11:43:28AM -0700, Keith Busch wrote:
> > Why isn't slot_reset being called after error_detected? Driver should be
> > returning "RESULT_NEEDS_RESET", which should have the pcie error handling
> > always invoke the slot_reset callback.
>
> Are you using an older kernel that doesn't have
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=387c72cdd7fb6bef650fb078d0f6ae9682abf631

Oh, that does look like the real fix. That being said, what is
the point of flushing the reset_work in nvme_error_resume?

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-01-31  8:58     ` Christoph Hellwig
@ 2023-01-31 15:22       ` Keith Busch
  2023-02-01 22:58         ` Grochowski, Maciej
  0 siblings, 1 reply; 26+ messages in thread
From: Keith Busch @ 2023-01-31 15:22 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: sagi, linux-nvme, Maciej Grochowski

On Tue, Jan 31, 2023 at 09:58:28AM +0100, Christoph Hellwig wrote:
> On Mon, Jan 30, 2023 at 11:43:28AM -0700, Keith Busch wrote:
> > > Why isn't slot_reset being called after error_detected? Driver should be
> > > returning "RESULT_NEEDS_RESET", which should have the pcie error handling
> > > always invoke the slot_reset callback.
> >
> > Are you using an older kernel that doesn't have
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=387c72cdd7fb6bef650fb078d0f6ae9682abf631
>
> Oh, that does look like the real fix. That being said, what is
> the point of flushing the reset_work in nvme_error_resume?

It's so we don't try to handle a new error or remove before finishing
recovery from the first one.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* RE: [PATCH] nvme-pci: fix resume after AER recovery
  2023-01-31 15:22       ` Keith Busch
@ 2023-02-01 22:58         ` Grochowski, Maciej
  2023-02-02 10:18           ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Grochowski, Maciej @ 2023-02-01 22:58 UTC (permalink / raw)
To: Keith Busch, Christoph Hellwig
Cc: sagi@grimberg.me, linux-nvme@lists.infradead.org

Hi Keith!

I updated the kernel on my machine to the 5.15.87 sources and ran the same
experiment with the aer_inject tool. As a result I got the following error
and the nvme device disappeared:

nvme 0000:02:00.0: can't change power state from D3cold to D0 (config space inaccessible)

full log
```
[ 679.061060] pcieport 0000:00:03.2: aer_inject: Injecting errors 00000000/00004000 into device 0000:02:00.0
[ 679.061100] pcieport 0000:00:03.2: AER: Uncorrected (Fatal) error received: 0000:02:00.0
[ 679.061111] nvme 0000:02:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
[ 679.061115] pcieport 0000:00:03.2: AER: broadcast error_detected message
[ 679.061120] nvme nvme15: frozen state error detected, reset controller
[ 679.076520] pcieport 0000:00:03.2: pciehp: pending interrupts 0x0010 from Slot Status
[ 679.076528] pcieport 0000:00:03.2: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 0
[ 680.103638] pcieport 0000:00:03.2: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 1008
[ 680.103660] pcieport 0000:00:03.2: pciehp: pending interrupts 0x0010 from Slot Status
[ 680.103670] pcieport 0000:00:03.2: AER: Root Port link has been reset (0)
[ 680.103674] pcieport 0000:00:03.2: AER: broadcast slot_reset message
[ 680.103677] nvme nvme15: restart after slot reset
[ 680.106193] nvme 0000:02:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x1ff)
...
[ 680.166532] pcieport 0000:00:03.2: AER: broadcast resume message
[ 680.171640] nvme 0000:02:00.0: can't change power state from D3cold to D0 (config space inaccessible)
[ 680.171825] nvme nvme15: Removing after probe failure status: -19
[ 680.177638] nvme15n1: detected capacity change from 3750748848 to 0
```

After that the device /dev/nvme15 disappears.

Are there any quirks that need to be performed for certain NVMe devices?
In my case I use a "Samsung Electronics Co Ltd NVMe SSD Controller
PM9A1/PM9A3/980PRO". I tried the same experiment on a different AMD
platform with a "Samsung Electronics Co Ltd Device a824" and it gave me
the same result, but when I did this test on a KIOXIA NVMe "1e0f:0007" it
worked fine and was able to recover.

-----Original Message-----
From: Keith Busch <kbusch@kernel.org>
Sent: Tuesday, January 31, 2023 7:22 AM
To: Christoph Hellwig <hch@lst.de>
Cc: sagi@grimberg.me; linux-nvme@lists.infradead.org; Grochowski, Maciej <Maciej.Grochowski@sony.com>
Subject: Re: [PATCH] nvme-pci: fix resume after AER recovery

On Tue, Jan 31, 2023 at 09:58:28AM +0100, Christoph Hellwig wrote:
> On Mon, Jan 30, 2023 at 11:43:28AM -0700, Keith Busch wrote:
> > > Why isn't slot_reset being called after error_detected? Driver
> > > should be returning "RESULT_NEEDS_RESET", which should have the
> > > pcie error handling always invoke the slot_reset callback.
> >
> > Are you using an older kernel that doesn't have
> >
> > https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=387c72cdd7fb6bef650fb078d0f6ae9682abf631__;!!JmoZiZGBv3RvKRSx!5wLCu5dVEfcVCU88GB-KhvnD4nWEP2QiJPMbeoDDQSvL03dI7O_wXoRKBzAEF9o23C0usSRp8QqIloxstMc8$
>
> Oh, that does look like the real fix. That being said, what is the
> point of flushing the reset_work in nvme_error_resume?

It's so we don't try to handle a new error or remove before finishing recovery from the first one.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-01 22:58         ` Grochowski, Maciej
@ 2023-02-02 10:18           ` Christoph Hellwig
  2023-02-02 18:47             ` Grochowski, Maciej
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2023-02-02 10:18 UTC (permalink / raw)
To: Grochowski, Maciej
Cc: Keith Busch, Christoph Hellwig, sagi@grimberg.me, linux-nvme@lists.infradead.org

On Wed, Feb 01, 2023 at 10:58:40PM +0000, Grochowski, Maciej wrote:
> Hi Keith!
>
> I updated the kernel on my machine to the 5.15.87 sources and ran the same experiment with the aer_inject tool.
> As a result I got the following error and the nvme device disappeared:
>
> nvme 0000:02:00.0: can't change power state from D3cold to D0 (config space inaccessible)

I've been trying to look for this code in latest upstream and it has
changed entirely. Any chance you could do a quick run with Linux 6.1?

^ permalink raw reply	[flat|nested] 26+ messages in thread
* RE: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-02 10:18           ` Christoph Hellwig
@ 2023-02-02 18:47             ` Grochowski, Maciej
  2023-02-02 19:43               ` Keith Busch
  0 siblings, 1 reply; 26+ messages in thread
From: Grochowski, Maciej @ 2023-02-02 18:47 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, sagi@grimberg.me, linux-nvme@lists.infradead.org

> I've been trying to look for this code in latest upstream and it has
> changed entirely. Any chance you could do a quick run with Linux 6.1?

Hi Christoph,

I tried the same test on 6.1.9 (uname -r -> 6.1.9) for a Samsung PM9A3
(Samsung Electronics Co Ltd Device a80a).

It gave me the same result as on the older kernel: a failure in the power
state change, and the device disappeared after the test (logs below)
```
[ 365.052300] pcieport 0000:00:03.4: aer_inject: Injecting errors 00000000/00004000 into device 0000:0b:00.0
[ 365.062200] pcieport 0000:00:03.4: AER: Uncorrected (Fatal) error received: 0000:0b:00.0
[ 365.070439] nvme 0000:0b:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
[ 365.081824] pcieport 0000:00:03.4: AER: broadcast error_detected message
[ 365.088635] nvme nvme5: frozen state error detected, reset controller
[ 365.157742] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 0
[ 366.205193] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 1008
[ 366.213394] pcieport 0000:00:03.4: AER: Root Port link has been reset (0)
[ 366.220312] pcieport 0000:00:03.4: AER: broadcast slot_reset message
[ 366.226771] nvme nvme5: restart after slot reset
[ 366.232018] pcieport 0000:00:03.4: re-enabling LTR
[ 366.239088] nvme 0000:0b:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x10a)
...
[ 366.994113] nvme 0000:0b:00.0: restoring config space at offset 0x8 (was 0xffffffff, writing 0x1080200)
[ 367.003924] nvme 0000:0b:00.0: restoring config space at offset 0x4 (was 0xffffffff, writing 0x100406)
[ 367.013663] nvme 0000:0b:00.0: restoring config space at offset 0x0 (was 0xffffffff, writing 0xa80a144d)
[ 367.023595] pcieport 0000:00:03.4: AER: broadcast resume message
[ 367.045269] nvme 0000:0b:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 367.061721] nvme nvme5: Removing after probe failure status: -19
[ 367.089330] nvme5n1: detected capacity change from 3750748848 to 0
[ 367.096431] pcieport 0000:00:03.4: AER: device recovery successful
[ 367.096470] nvme 0000:0b:00.0: vgaarb: pci_notify
[ 367.226016] pci 0000:0b:00.0: vgaarb: pci_notify
```

# ls /dev/nvme5n1
ls: cannot access '/dev/nvme5n1': No such file or directory

I ran the same test for another Samsung drive: PM1733

0b:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a824 (prog-if 02 [NVM Express])
        Subsystem: Samsung Electronics Co Ltd Device a801

```
[ 334.527200] pcieport 0000:00:03.4: aer_inject: Injecting errors 00000000/00004000 into device 0000:0b:00.0
[ 334.537072] pcieport 0000:00:03.4: AER: Uncorrected (Fatal) error received: 0000:0b:00.0
[ 334.545320] nvme 0000:0b:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
[ 334.556682] pcieport 0000:00:03.4: AER: broadcast error_detected message
[ 334.563467] nvme nvme5: frozen state error detected, reset controller
[ 334.615434] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 0
[ 335.655445] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 1008
[ 335.663647] pcieport 0000:00:03.4: AER: Root Port link has been reset (0)
[ 335.670523] pcieport 0000:00:03.4: AER: broadcast slot_reset message
[ 335.676954] nvme nvme5: restart after slot reset
[ 335.684371] nvme 0000:0b:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x10a)
[ 336.427724] nvme 0000:0b:00.0: restoring config space at offset 0x8 (was 0xffffffff, writing 0x1080200)
[ 336.437510] nvme 0000:0b:00.0: restoring config space at offset 0x4 (was 0xffffffff, writing 0x100406)
[ 336.447215] nvme 0000:0b:00.0: restoring config space at offset 0x0 (was 0xffffffff, writing 0xa824144d)
[ 336.457117] pcieport 0000:00:03.4: AER: broadcast resume message
[ 336.479575] nvme 0000:0b:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 336.494264] nvme nvme5: Removing after probe failure status: -19
[ 336.535861] pcieport 0000:00:03.4: AER: device recovery successful
[ 336.535899] nvme 0000:0b:00.0: vgaarb: pci_notify
[ 336.691465] pci 0000:0b:00.0: vgaarb: pci_notify
```

And again the KIOXIA CD6, Device 1e0f:0007. This time recovery seems to
work fine
```
[ 760.688465] pcieport 0000:80:03.4: aer_inject: Injecting errors 00000000/00004000 into device 0000:85:00.0
[ 760.700437] pcieport 0000:80:03.4: AER: Uncorrected (Fatal) error received: 0000:85:00.0
[ 760.710238] nvme 0000:85:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
[ 760.723600] pcieport 0000:80:03.4: AER: broadcast error_detected message
[ 760.732011] nvme nvme3: frozen state error detected, reset controller
[ 760.793311] pcieport 0000:80:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 0
[ 761.819692] pcieport 0000:80:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 1008
[ 761.829594] pcieport 0000:80:03.4: AER: Root Port link has been reset (0)
[ 761.838156] pcieport 0000:80:03.4: AER: broadcast slot_reset message
[ 761.846263] nvme nvme3: restart after slot reset
[ 761.852721] nvme 0000:85:00.0: restoring config space at offset 0x30 (was 0x0, writing 0xbc200000)
[ 761.863559] nvme 0000:85:00.0: restoring config space at offset 0x10 (was 0x4, writing 0xbc210004)
[ 761.874389] nvme 0000:85:00.0: restoring config space at offset 0xc (was 0x0, writing 0x10)
[ 761.884634] nvme 0000:85:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100406)
[ 761.895725] pcieport 0000:80:03.4: AER: broadcast resume message
[ 761.921897] nvme 0000:85:00.0: saving config space at offset 0x0 (reading 0x71e0f)
...
[ 762.055409] nvme 0000:85:00.0: saving config space at offset 0x38 (reading 0x0)
[ 762.064599] nvme 0000:85:00.0: saving config space at offset 0x3c (reading 0x1ff)
[ 762.079458] nvme nvme3: Shutdown timeout set to 16 seconds
[ 762.166218] nvme nvme3: 64/0/0 default/read/poll queues
[ 762.176470] pcieport 0000:80:03.4: AER: device recovery successful
```

Device is still visible:

# ls /dev/nvme3n1
/dev/nvme3n1

So it looks like the behavior on 6.1.9 is very similar to what I
experienced with the 5.15 kernel

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-02 18:47             ` Grochowski, Maciej
@ 2023-02-02 19:43               ` Keith Busch
  2023-02-03  1:29                 ` Grochowski, Maciej
  0 siblings, 1 reply; 26+ messages in thread
From: Keith Busch @ 2023-02-02 19:43 UTC (permalink / raw)
To: Grochowski, Maciej
Cc: Christoph Hellwig, sagi@grimberg.me, linux-nvme@lists.infradead.org

On Thu, Feb 02, 2023 at 06:47:35PM +0000, Grochowski, Maciej wrote:
> > I've been trying to look for this code in latest upstream and it has
> > changed entirely. Any chance you could do a quick run with Linux 6.1?
>
> It gave me the same result as on the older kernel: a failure in the power
> state change, and the device disappeared after the test (logs below)
> ```
> [ 365.052300] pcieport 0000:00:03.4: aer_inject: Injecting errors 00000000/00004000 into device 0000:0b:00.0
> [ 365.062200] pcieport 0000:00:03.4: AER: Uncorrected (Fatal) error received: 0000:0b:00.0
> [ 365.070439] nvme 0000:0b:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
> [ 365.081824] pcieport 0000:00:03.4: AER: broadcast error_detected message
> [ 365.088635] nvme nvme5: frozen state error detected, reset controller
> [ 365.157742] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 0
> [ 366.205193] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 1008
> [ 366.213394] pcieport 0000:00:03.4: AER: Root Port link has been reset (0)
> [ 366.220312] pcieport 0000:00:03.4: AER: broadcast slot_reset message
> [ 366.226771] nvme nvme5: restart after slot reset
> [ 366.232018] pcieport 0000:00:03.4: re-enabling LTR
> [ 366.239088] nvme 0000:0b:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x10a)
> ...
> [ 366.994113] nvme 0000:0b:00.0: restoring config space at offset 0x8 (was 0xffffffff, writing 0x1080200)
> [ 367.003924] nvme 0000:0b:00.0: restoring config space at offset 0x4 (was 0xffffffff, writing 0x100406)
> [ 367.013663] nvme 0000:0b:00.0: restoring config space at offset 0x0 (was 0xffffffff, writing 0xa80a144d)
> [ 367.023595] pcieport 0000:00:03.4: AER: broadcast resume message
> [ 367.045269] nvme 0000:0b:00.0: Unable to change power state from D3cold to D0, device inaccessible

We'll do a secondary bus reset right before restarting the controller. It
sounds like this particular device isn't recovering from it. Does the
device come back if you manually remove/rescan? Something like this:

  # echo 1 > /sys/bus/pci/devices/0000:0b:00.0/remove
  # echo 1 > /sys/bus/pci/rescan

^ permalink raw reply	[flat|nested] 26+ messages in thread
* RE: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-02 19:43               ` Keith Busch
@ 2023-02-03  1:29                 ` Grochowski, Maciej
  2023-02-03  1:37                   ` Keith Busch
  0 siblings, 1 reply; 26+ messages in thread
From: Grochowski, Maciej @ 2023-02-03 1:29 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, sagi@grimberg.me, linux-nvme@lists.infradead.org

I ran remove/rescan for this Samsung PM1733 and it looks like it works
fine on both 5.15 and 6.1.9. See the log below:

# echo 1 > /sys/bus/pci/devices/0000:0b:00.0/remove
```
[ 2915.510903] nvme 0000:0b:00.0: PME# disabled
[ 2917.286586] pci 0000:0b:00.0: device released
```

# echo 1 > /sys/bus/pci/rescan
```
[ 2920.627012] pci_bus 0000:c0: scanning bus
[ 2920.632857] pcieport 0000:c0:01.1: scanning [bus c1-c1] behind bridge, pass 0
[ 2920.641819] pci_bus 0000:c1: scanning bus
...
[ 2921.881870] pcieport 0000:00:03.3: scanning [bus 0a-0a] behind bridge, pass 0
[ 2921.892102] pci_bus 0000:0a: scanning bus
[ 2921.898621] pci_bus 0000:0a: bus scan returning with max=0a
[ 2921.906664] pcieport 0000:00:03.4: scanning [bus 0b-0b] behind bridge, pass 0
[ 2921.916322] pci_bus 0000:0b: scanning bus
[ 2921.923143] pci 0000:0b:00.0: [144d:a824] type 00 class 0x010802
[ 2921.931767] pci 0000:0b:00.0: reg 0x10: [mem 0xf0210000-0xf0217fff 64bit]
[ 2921.941367] pci 0000:0b:00.0: reg 0x30: [mem 0xf2d00000-0xf2d0ffff pref]
[ 2921.951365] pci 0000:0b:00.0: reg 0x20c: [mem 0xf0218000-0xf021ffff 64bit]
[ 2921.960879] pci 0000:0b:00.0: VF(n) BAR0 space: [mem 0xf0218000-0xf0317fff 64bit] (contains BAR0 for 32 VFs)
[ 2921.975227] pci_bus 0000:0b: bus scan returning with max=0b
[ 2921.983650] pcieport 0000:00:07.1: scanning [bus 0c-0c] behind bridge, pass 0
[ 2921.993646] pci_bus 0000:0c: scanning bus
[ 2922.000333] pci_bus 0000:0c: bus scan returning with max=0c
[ 2922.008557] pcieport 0000:00:08.1: scanning [bus 0d-0d] behind bridge, pass 0
[ 2922.018366] pci_bus 0000:0d: scanning bus
[ 2922.025031] pci_bus 0000:0d: bus scan returning with max=0d
[ 2922.033234] pcieport 0000:00:01.1: scanning [bus 01-08] behind bridge, pass 1
[ 2922.043121] pcieport 0000:00:03.1: scanning [bus 09-09] behind bridge, pass 1
[ 2922.052866] pcieport 0000:00:03.3: scanning [bus 0a-0a] behind bridge, pass 1
[ 2922.062577] pcieport 0000:00:03.4: scanning [bus 0b-0b] behind bridge, pass 1
[ 2922.072232] pcieport 0000:00:07.1: scanning [bus 0c-0c] behind bridge, pass 1
[ 2922.081835] pcieport 0000:00:08.1: scanning [bus 0d-0d] behind bridge, pass 1
[ 2922.091437] pci_bus 0000:00: bus scan returning with max=0d
[ 2922.499469] pci 0000:0b:00.0: BAR 6: assigned [mem 0xf0200000-0xf020ffff pref]
[ 2922.509087] pci 0000:0b:00.0: BAR 0: assigned [mem 0xf0210000-0xf0217fff 64bit]
[ 2922.518830] pci 0000:0b:00.0: BAR 7: assigned [mem 0xf0218000-0xf0317fff 64bit]
[ 2922.528871] nvme 0000:0b:00.0: runtime IRQ mapping not provided by arch
[ 2922.539667] nvme nvme5: pci function 0000:0b:00.0
[ 2922.548225] nvme 0000:0b:00.0: enabling bus mastering
[ 2922.555997] nvme 0000:0b:00.0: saving config space at offset 0x0 (reading 0xa824144d)
[ 2922.566124] nvme 0000:0b:00.0: saving config space at offset 0x4 (reading 0x100406)
[ 2922.576334] nvme 0000:0b:00.0: saving config space at offset 0x8 (reading 0x1080200)
[ 2922.586310] nvme 0000:0b:00.0: saving config space at offset 0xc (reading 0x10)
[ 2922.595446] nvme 0000:0b:00.0: saving config space at offset 0x10 (reading 0xf0210004)
[ 2922.605199] nvme 0000:0b:00.0: saving config space at offset 0x14 (reading 0x0)
[ 2922.614322] nvme 0000:0b:00.0: saving config space at offset 0x18 (reading 0x0)
[ 2922.623452] nvme 0000:0b:00.0: saving config space at offset 0x1c (reading 0x0)
[ 2922.632541] nvme 0000:0b:00.0: saving config space at offset 0x20 (reading 0x0)
[ 2922.641630] nvme 0000:0b:00.0: saving config space at offset 0x24 (reading 0x0)
[ 2922.650659] nvme 0000:0b:00.0: saving config space at offset 0x28 (reading 0x0)
[ 2922.659654] nvme 0000:0b:00.0: saving config space at offset 0x2c (reading 0xa801144d)
[ 2922.669236] nvme 0000:0b:00.0: saving config space at offset 0x30 (reading 0xf2d00000)
[ 2922.678813] nvme 0000:0b:00.0: saving config space at offset 0x34 (reading 0x40)
[ 2922.687822] nvme 0000:0b:00.0: saving config space at offset 0x38 (reading 0x0)
[ 2922.696721] nvme 0000:0b:00.0: saving config space at offset 0x3c (reading 0x10a)
[ 2922.723443] nvme nvme5: Shutdown timeout set to 10 seconds
[ 2922.765257] nvme nvme5: 63/0/0 default/read/poll queues
[ 2922.842614] nvme5n1: p1
```

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-03  1:29                 ` Grochowski, Maciej
@ 2023-02-03  1:37                   ` Keith Busch
  2023-02-03 18:45                     ` Grochowski, Maciej
  0 siblings, 1 reply; 26+ messages in thread
From: Keith Busch @ 2023-02-03 1:37 UTC (permalink / raw)
To: Grochowski, Maciej
Cc: Christoph Hellwig, sagi@grimberg.me, linux-nvme@lists.infradead.org

On Fri, Feb 03, 2023 at 01:29:46AM +0000, Grochowski, Maciej wrote:
> I ran remove/rescan for this Samsung PM1733 and it looks like it works
> fine on both 5.15 and 6.1.9

Sounds like the Samsung wants a longer, non-standard delay between SBR
and reinit.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* RE: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-03  1:37                   ` Keith Busch
@ 2023-02-03 18:45                     ` Grochowski, Maciej
  2023-02-06 14:02                       ` Javier.gonz
  0 siblings, 1 reply; 26+ messages in thread
From: Grochowski, Maciej @ 2023-02-03 18:45 UTC (permalink / raw)
To: Javier.gonz@samsung.com, Keith Busch
Cc: Christoph Hellwig, sagi@grimberg.me, linux-nvme@lists.infradead.org, Lewis, Nathaniel

> > I ran remove/rescan for this Samsung PM1733 and it looks like it works
> > fine on both 5.15 and 6.1.9
>
> Sounds like the Samsung wants a longer, non-standard delay between SBR and reinit.

Thanks for the suggestion.

Hi Javier:

We have 2 Samsung NVMe drives: PM9A3 and PM1733.
When we issue a fatal AER via aer_inject, these drives are not able to recover due to the
"Unable to change power state from D3cold to D0, device inaccessible"

Repeated log from the previous mail (this is consistent behavior on the 5.15 and 6.1 kernels)
```
[ 334.527200] pcieport 0000:00:03.4: aer_inject: Injecting errors 00000000/00004000 into device 0000:0b:00.0
[ 334.537072] pcieport 0000:00:03.4: AER: Uncorrected (Fatal) error received: 0000:0b:00.0
[ 334.545320] nvme 0000:0b:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
[ 334.556682] pcieport 0000:00:03.4: AER: broadcast error_detected message
[ 334.563467] nvme nvme5: frozen state error detected, reset controller
[ 334.615434] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 0
[ 335.655445] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 1008
[ 335.663647] pcieport 0000:00:03.4: AER: Root Port link has been reset (0)
[ 335.670523] pcieport 0000:00:03.4: AER: broadcast slot_reset message
[ 335.676954] nvme nvme5: restart after slot reset
[ 335.684371] nvme 0000:0b:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x10a)
[ 336.427724] nvme 0000:0b:00.0: restoring config space at offset 0x8 (was 0xffffffff, writing 0x1080200)
[ 336.437510] nvme 0000:0b:00.0: restoring config space at offset 0x4 (was 0xffffffff, writing 0x100406)
[ 336.447215] nvme 0000:0b:00.0: restoring config space at offset 0x0 (was 0xffffffff, writing 0xa824144d)
[ 336.457117] pcieport 0000:00:03.4: AER: broadcast resume message
[ 336.479575] nvme 0000:0b:00.0: Unable to change power state from D3cold to D0, device inaccessible
[ 336.494264] nvme nvme5: Removing after probe failure status: -19
[ 336.535861] pcieport 0000:00:03.4: AER: device recovery successful
[ 336.535899] nvme 0000:0b:00.0: vgaarb: pci_notify
[ 336.691465] pci 0000:0b:00.0: vgaarb: pci_notify
```

The same experiment on other NVMe vendors seems to work fine (I tried on a KIOXIA NVMe).
Is that something you can take a look at?

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-03 18:45                     ` Grochowski, Maciej
@ 2023-02-06 14:02                       ` Javier.gonz
  2023-02-06 15:42                         ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Javier.gonz @ 2023-02-06 14:02 UTC (permalink / raw)
To: Grochowski, Maciej
Cc: Keith Busch, Christoph Hellwig, sagi@grimberg.me, linux-nvme@lists.infradead.org, Lewis, Nathaniel

On 03.02.2023 18:45, Grochowski, Maciej wrote:
>> > I ran remove/rescan for this Samsung PM1733 and it looks like it works
>> > fine on both 5.15 and 6.1.9
>>
>> Sounds like the Samsung wants a longer, non-standard delay between SBR and reinit.
>
>Thanks for the suggestion.
>
>Hi Javier:
>
>We have 2 Samsung NVMe drives: PM9A3 and PM1733.
>When we issue a fatal AER via aer_inject, these drives are not able to recover due to the
>"Unable to change power state from D3cold to D0, device inaccessible"
>
>Repeated log from the previous mail (this is consistent behavior on the 5.15 and 6.1 kernels)
>```
>[ 334.527200] pcieport 0000:00:03.4: aer_inject: Injecting errors 00000000/00004000 into device 0000:0b:00.0
>[ 334.537072] pcieport 0000:00:03.4: AER: Uncorrected (Fatal) error received: 0000:0b:00.0
>[ 334.545320] nvme 0000:0b:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
>[ 334.556682] pcieport 0000:00:03.4: AER: broadcast error_detected message
>[ 334.563467] nvme nvme5: frozen state error detected, reset controller
>[ 334.615434] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 0
>[ 335.655445] pcieport 0000:00:03.4: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 1008
>[ 335.663647] pcieport 0000:00:03.4: AER: Root Port link has been reset (0)
>[ 335.670523] pcieport 0000:00:03.4: AER: broadcast slot_reset message
>[ 335.676954] nvme nvme5: restart after slot reset
>[ 335.684371] nvme 0000:0b:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x10a)
>[ 336.427724] nvme 0000:0b:00.0: restoring config space at offset 0x8 (was 0xffffffff, writing 0x1080200)
>[ 336.437510] nvme 0000:0b:00.0: restoring config space at offset 0x4 (was 0xffffffff, writing 0x100406)
>[ 336.447215] nvme 0000:0b:00.0: restoring config space at offset 0x0 (was 0xffffffff, writing 0xa824144d)
>[ 336.457117] pcieport 0000:00:03.4: AER: broadcast resume message
>[ 336.479575] nvme 0000:0b:00.0: Unable to change power state from D3cold to D0, device inaccessible
>[ 336.494264] nvme nvme5: Removing after probe failure status: -19
>[ 336.535861] pcieport 0000:00:03.4: AER: device recovery successful
>[ 336.535899] nvme 0000:0b:00.0: vgaarb: pci_notify
>[ 336.691465] pci 0000:0b:00.0: vgaarb: pci_notify
>```
>
>The same experiment on other NVMe vendors seems to work fine (I tried on a KIOXIA NVMe).
>Is that something you can take a look at?

Thanks for the note Maciej. I will report this internally.

Keith, Christoph,

Is there a chance we can get a quirk for this FW? Seems like an
issue on our side that is creating problems.

Thanks,
Javier

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-06 14:02   ` Javier.gonz
@ 2023-02-06 15:42     ` Christoph Hellwig
  2023-02-06 16:22       ` Keith Busch
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2023-02-06 15:42 UTC (permalink / raw)
  To: Javier.gonz@samsung.com
  Cc: Grochowski, Maciej, Keith Busch, Christoph Hellwig,
	sagi@grimberg.me, linux-nvme@lists.infradead.org, Lewis, Nathaniel

On Mon, Feb 06, 2023 at 03:02:20PM +0100, Javier.gonz@samsung.com wrote:
> Is there a chance we can get a quirk for this FW? Seems like an
> issue on our side that is creating problems.

So what would the quirk look like?  This would have to be something in
the core PCIe code, not NVMe as far as I can tell.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-06 15:42     ` Christoph Hellwig
@ 2023-02-06 16:22       ` Keith Busch
  2023-02-06 17:51         ` Javier.gonz
  0 siblings, 1 reply; 26+ messages in thread
From: Keith Busch @ 2023-02-06 16:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Javier.gonz@samsung.com, Grochowski, Maciej, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel

On Mon, Feb 06, 2023 at 04:42:18PM +0100, Christoph Hellwig wrote:
> On Mon, Feb 06, 2023 at 03:02:20PM +0100, Javier.gonz@samsung.com wrote:
> > Is there a chance we can get a quirk for this FW? Seems like an
> > issue on our side that is creating problems.
>
> So what would the quirk look like?  This would have to be something in
> the core PCIe code, not NVMe as far as I can tell.

Yeah, I'm assuming it's a PCI-level quirk because the remove-rescan was
successful at reinitializing. The remove-rescan should do a very similar
config space setup and nvme startup as an AER recovery, so I'm guessing this
is a matter of timing. If so, the below is how I'd imagine the quirk may look.

But if this idea isn't successful, it'd be a bit more work to figure out what
sequence this device wants to happen in order to complete recovery.

---
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index fba95486caaf2..f642e7029e0e9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5054,7 +5054,10 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
 	 * PCI spec v3.0 7.6.4.2 requires minimum Trst of 1ms. Double
 	 * this to 2ms to ensure that we meet the minimum requirement.
 	 */
-	msleep(2);
+	if (dev->quirks & <NEW_SB_DELAY_QUIRK_BIT>)
+		msleep(100);
+	else
+		msleep(2);
 
 	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
@@ -5066,7 +5069,10 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
 	 * be re-initialized. PCIe has some ways to shorten this,
 	 * but we don't make use of them yet.
 	 */
-	ssleep(1);
+	if (dev->quirks & <NEW_SB_DELAY_QUIRK_BIT>)
+		ssleep(5);
+	else
+		ssleep(1);
 }
 
 void __weak pcibios_reset_secondary_bus(struct pci_dev *dev)
--

^ permalink raw reply related	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-06 16:22       ` Keith Busch
@ 2023-02-06 17:51         ` Javier.gonz
  2023-02-07  1:51           ` Grochowski, Maciej
  0 siblings, 1 reply; 26+ messages in thread
From: Javier.gonz @ 2023-02-06 17:51 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, Grochowski, Maciej, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel

On 06.02.2023 09:22, Keith Busch wrote:
>On Mon, Feb 06, 2023 at 04:42:18PM +0100, Christoph Hellwig wrote:
>> On Mon, Feb 06, 2023 at 03:02:20PM +0100, Javier.gonz@samsung.com wrote:
>> > Is there a chance we can get a quirk for this FW? Seems like an
>> > issue on our side that is creating problems.
>>
>> So what would the quirk look like? This would have to be something in
>> the core PCIe code, not NVMe as far as I can tell.

Yes. I was wondering if you had a reason for this not being a good idea.

>
>Yeah, I'm assuming it's a PCI-level quirk because the remove-rescan was
>successful at reinitializing. The remove-rescan should do a very similar
>config space setup and nvme startup as an AER recovery, so I'm guessing this
>is a matter of timing. If so, the below is how I'd imagine the quirk may look.

Thanks, Keith. Let me try this out. I need to reproduce this first.

Maciej: it would be helpful if you can try this at your end.

>But if this idea isn't successful, it'd be a bit more work to figure out what
>sequence this device wants to happen in order to complete recovery.

Ok. Thanks!

>
>---
>diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>index fba95486caaf2..f642e7029e0e9 100644
>--- a/drivers/pci/pci.c
>+++ b/drivers/pci/pci.c
>@@ -5054,7 +5054,10 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
> 	 * PCI spec v3.0 7.6.4.2 requires minimum Trst of 1ms. Double
> 	 * this to 2ms to ensure that we meet the minimum requirement.
> 	 */
>-	msleep(2);
>+	if (dev->quirks & <NEW_SB_DELAY_QUIRK_BIT>)
>+		msleep(100);
>+	else
>+		msleep(2);
>
> 	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
>@@ -5066,7 +5069,10 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
> 	 * be re-initialized. PCIe has some ways to shorten this,
> 	 * but we don't make use of them yet.
> 	 */
>-	ssleep(1);
>+	if (dev->quirks & <NEW_SB_DELAY_QUIRK_BIT>)
>+		ssleep(5);
>+	else
>+		ssleep(1);
> }
>
> void __weak pcibios_reset_secondary_bus(struct pci_dev *dev)
>--

^ permalink raw reply	[flat|nested] 26+ messages in thread
* RE: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-06 17:51         ` Javier.gonz
@ 2023-02-07  1:51           ` Grochowski, Maciej
  2023-02-07  8:29             ` Javier.gonz
  0 siblings, 1 reply; 26+ messages in thread
From: Grochowski, Maciej @ 2023-02-07 1:51 UTC (permalink / raw)
  To: Javier.gonz@samsung.com, Keith Busch
  Cc: Christoph Hellwig, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel

I have tried the suggested approach, with one modification: the pci_dev in
pci_reset_secondary_bus() is actually the bridge, not the NVMe device itself,
so I checked the devices behind that bridge to see if any has the D0 bit set
and ran the custom delay based on that.

Unfortunately, even with this approach I see the same issue for both Samsung
drives, and based on the kernel logs I can see that the wait for secondary bus
reset does get increased. So it seems this quirk doesn't work for some reason.
(I also tried increasing the delays to different values, but it didn't work.)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index fba95486caaf..6ec2fe042765 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5042,19 +5042,38 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
 	}
 }
 
+static int pci_check_d0_under_bridge(struct pci_dev *dev, void *arg)
+{
+	u16 *d0 = arg;
+
+	if (dev->d0_delay) {
+		*d0 = 1;
+		return *d0;
+	}
+
+	return 0;
+}
+
 void pci_reset_secondary_bus(struct pci_dev *dev)
 {
 	u16 ctrl;
+	u16 d0_delay = 0;
 
 	pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &ctrl);
 	ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
 
+	if (dev->subordinate)
+		pci_walk_bus(dev->subordinate, pci_check_d0_under_bridge, &d0_delay);
+
 	/*
 	 * PCI spec v3.0 7.6.4.2 requires minimum Trst of 1ms. Double
 	 * this to 2ms to ensure that we meet the minimum requirement.
 	 */
-	msleep(2);
+	if (d0_delay)
+		msleep(100);
+	else
+		msleep(2);
 
 	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
@@ -5066,7 +5085,10 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
 	 * be re-initialized. PCIe has some ways to shorten this,
 	 * but we don't make use of them yet.
 	 */
-	ssleep(1);
+	if (d0_delay)
+		ssleep(5);
+	else
+		ssleep(1);
 }
 
 void __weak pcibios_reset_secondary_bus(struct pci_dev *dev)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 285acc4aaccc..c948f6b3fbc8 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5992,3 +5992,9 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2d, dpc_log_size);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2f, dpc_log_size);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
 #endif
+
+static void samsung_d0_fixup(struct pci_dev *pdev)
+{
+	pdev->d0_delay = 1;
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SAMSUNG, 0xa824, samsung_d0_fixup);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SAMSUNG, 0xa80a, samsung_d0_fixup);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index adffd65e84b4..2112cba45abd 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -464,6 +464,7 @@ struct pci_dev {
 	unsigned int	no_vf_scan:1;		/* Don't scan for VFs after IOV enablement */
 	unsigned int	no_command_memory:1;	/* No PCI_COMMAND_MEMORY */
 	unsigned int	rom_bar_overlap:1;	/* ROM BAR disable broken */
+	unsigned int	d0_delay:1;		/* Require additional delay for D3->D0 transition */
 	pci_dev_flags_t	dev_flags;
 	atomic_t	enable_cnt;		/* pci_enable_device has been called */

^ permalink raw reply related	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-07  1:51           ` Grochowski, Maciej
@ 2023-02-07  8:29             ` Javier.gonz
  2023-02-07 10:36               ` Klaus Jensen
  0 siblings, 1 reply; 26+ messages in thread
From: Javier.gonz @ 2023-02-07 8:29 UTC (permalink / raw)
  To: Grochowski, Maciej
  Cc: Keith Busch, Christoph Hellwig, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel, Kanchan Joshi, Klaus Jensen

On 07.02.2023 01:51, Grochowski, Maciej wrote:
>I have tried the suggested approach, with one modification: the pci_dev in
>pci_reset_secondary_bus() is actually the bridge, not the NVMe device itself,
>so I checked the devices behind that bridge to see if any has the D0 bit set
>and ran the custom delay based on that.
>
>Unfortunately, even with this approach I see the same issue for both Samsung
>drives, and based on the kernel logs I can see that the wait for secondary
>bus reset does get increased. So it seems this quirk doesn't work for some
>reason. (I also tried increasing the delays to different values, but it
>didn't work.)

Too bad.

I will write you separately to get some dumps from the device. We have not
seen this before, so we need to understand this a bit better.

Regarding the quirk, we are looking into it. We will come back with something
in this thread later. Cc'ing Kanchan and Klaus.

Thanks,
Javier

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-07  8:29             ` Javier.gonz
@ 2023-02-07 10:36               ` Klaus Jensen
  2023-02-07 19:05                 ` Grochowski, Maciej
  0 siblings, 1 reply; 26+ messages in thread
From: Klaus Jensen @ 2023-02-07 10:36 UTC (permalink / raw)
  To: Javier.gonz@samsung.com
  Cc: Grochowski, Maciej, Keith Busch, Christoph Hellwig, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel, Kanchan Joshi, Klaus Jensen

On Feb 7 09:29, Javier.gonz@samsung.com wrote:
> On 07.02.2023 01:51, Grochowski, Maciej wrote:
> > I have tried the suggested approach, with one modification: the pci_dev in
> > pci_reset_secondary_bus() is actually the bridge, not the NVMe device
> > itself, so I checked the devices behind that bridge to see if any has the
> > D0 bit set and ran the custom delay based on that.
> >
> > Unfortunately, even with this approach I see the same issue for both
> > Samsung drives, and based on the kernel logs I can see that the wait for
> > secondary bus reset does get increased. So it seems this quirk doesn't
> > work for some reason. (I also tried increasing the delays to different
> > values, but it didn't work.)
>
> Too bad.
>
> I will write you separately to get some dumps from the device. We have not
> seen this before, so we need to understand this a bit better.
>
> Regarding the quirk, we are looking into it. We will come back with
> something in this thread later. Cc'ing Kanchan and Klaus.
>

I dug up a PM1733 and I am not immediately able to reproduce on 6.2-rc7.

With an aer-inject error file,

  AER
  PCI_ID 0000:04:00.0
  UNCOR_STATUS MALF_TLP
  HEADER_LOG 0 1 2 3

I'm getting a Fatal error with type "Inaccessible, (Unregistered Agent ID)",
but it still recovers successfully:

  pcieport 0000:00:01.2: aer_inject: Injecting errors 00000000/00040000 into device 0000:04:00.0
  pcieport 0000:00:01.2: AER: Uncorrected (Fatal) error received: 0000:04:00.0
  nvme 0000:04:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
  nvme nvme1: frozen state error detected, reset controller
  pcieport 0000:03:00.0: AER: Downstream Port link has been reset (0)
  nvme nvme1: restart after slot reset
  nvme nvme1: Shutdown timeout set to 10 seconds
  nvme nvme1: 32/0/0 default/read/poll queues
  pcieport 0000:03:00.0: AER: device recovery successful

Maciej, can you share firmware revision information and a bit more details on
your reproducer/setup that might allow us to replicate?

Thanks,
Klaus

^ permalink raw reply	[flat|nested] 26+ messages in thread
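As a reproduction aid, the error file Klaus describes can be written out and fed to the aer-inject tool roughly like this. This is a sketch: the BDF 0000:04:00.0 is just an example, and the injection step itself (commented out) assumes a kernel built with CONFIG_PCIEAER_INJECT plus the aer-inject userspace tool.

```shell
# Write an aer-inject error file describing an uncorrectable Malformed TLP
# error against an example device at 0000:04:00.0 (the BDF is an assumption;
# substitute your NVMe device's address).
cat > /tmp/malf_tlp.aer <<'EOF'
AER
PCI_ID 0000:04:00.0
UNCOR_STATUS MALF_TLP
HEADER_LOG 0 1 2 3
EOF

# Hypothetical injection step -- requires CONFIG_PCIEAER_INJECT, the
# aer-inject tool, and root privileges, so it is left commented out here:
#   modprobe aer_inject
#   aer-inject /tmp/malf_tlp.aer

cat /tmp/malf_tlp.aer
```

The recovery outcome is then visible in dmesg, as in the logs quoted throughout this thread.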
* RE: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-07 10:36               ` Klaus Jensen
@ 2023-02-07 19:05                 ` Grochowski, Maciej
  2023-02-08  6:43                   ` Klaus Jensen
  0 siblings, 1 reply; 26+ messages in thread
From: Grochowski, Maciej @ 2023-02-07 19:05 UTC (permalink / raw)
  To: Klaus Jensen, Javier.gonz@samsung.com
  Cc: Keith Busch, Christoph Hellwig, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel, Kanchan Joshi, Klaus Jensen

On Tuesday, February 7, 2023 2:37 AM, Klaus Jensen wrote:
> I dug up a PM1733 and I am not immediately able to reproduce on 6.2-rc7.
[...]
> Maciej, can you share firmware revision information and a bit more details
> on your reproducer/setup that might allow us to replicate?

I also updated the kernel to 6.2-rc7 to make sure we are on the same revision.
I repeated the experiment and I am still getting the error on both drives,
PM1733 and PM9A3, so I think there may be some platform/FW differences.

I have two AMD Rome platforms that behave the same (a RomeD8-2T and another
custom-made one based on this design). I set these platforms to do OS-first
error handling (instead of relying on platform FW).

Do you have any set of nvme-cli commands that I should issue for these drives
so you can compare the FW revision and other details?

I also see that in your setup the NVMe is connected via the bridge
0000:00:01.2, so it looks very similar to my platform.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-07 19:05                 ` Grochowski, Maciej
@ 2023-02-08  6:43                   ` Klaus Jensen
  2023-02-08 17:26                     ` Grochowski, Maciej
  0 siblings, 1 reply; 26+ messages in thread
From: Klaus Jensen @ 2023-02-08 6:43 UTC (permalink / raw)
  To: Grochowski, Maciej
  Cc: Javier.gonz@samsung.com, Keith Busch, Christoph Hellwig, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel, Kanchan Joshi, Klaus Jensen

On Feb 7 19:05, Grochowski, Maciej wrote:
> Do you have any set of nvme-cli commands that I should issue for these
> drives so you can compare the FW revision and other details?

`nvme id-ctrl /dev/nvmeX` should do nicely :)

^ permalink raw reply	[flat|nested] 26+ messages in thread
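When comparing several drives, it can be handy to pull just the firmware revision (`fr`) field out of saved `id-ctrl` output. A small sketch, run here against a canned sample that mimics the `key : value` format nvme-cli prints (the values are taken from this thread):

```shell
# Canned sample mimicking `nvme id-ctrl` output for a PM1733.
cat > /tmp/pm1733-id-ctrl.txt <<'EOF'
vid       : 0x144d
sn        : S4YNNE0NA00859
mn        : SAMSUNG MZWLJ1T9HBJR-00007
fr        : EPK98B5Q
EOF

# Print only the firmware revision: match the "fr" key, then strip the
# leading whitespace from the value column.
awk -F':' '$1 ~ /^fr[ \t]*$/ { gsub(/^[ \t]+/, "", $2); print $2 }' \
    /tmp/pm1733-id-ctrl.txt
# prints: EPK98B5Q
```

Against live hardware the same filter would be piped from `nvme id-ctrl /dev/nvmeX` directly.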
* RE: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-08  6:43                   ` Klaus Jensen
@ 2023-02-08 17:26                     ` Grochowski, Maciej
  2023-02-08 17:39                       ` Keith Busch
  0 siblings, 1 reply; 26+ messages in thread
From: Grochowski, Maciej @ 2023-02-08 17:26 UTC (permalink / raw)
  To: Klaus Jensen
  Cc: Javier.gonz@samsung.com, Keith Busch, Christoph Hellwig, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel, Kanchan Joshi, Klaus Jensen

> `nvme id-ctrl /dev/nvmeX` should do nicely :)

Thanks, I attached the id-ctrl output for both the PM1733 and the PM9A3.

Also, I am not sure whether you use DPC, but in our setup we have the NVMe
drives in PCIe slots with DPC configured in the BIOS, and inside the kernel
we also enable it via the "pcie_ports=dpc-native" parameter. I could check
the behavior without DPC, but I am not sure how much value there is in
testing error recovery/hotplug without DPC enabled.

[-- Attachment #2: PM1733_nvme_cli.txt --]

PM1733:

root@cronos-ws00:/home/tester# nvme id-ctrl /dev/nvme5
NVME Identify Controller:
vid       : 0x144d
ssvid     : 0x144d
sn        : S4YNNE0NA00859
mn        : SAMSUNG MZWLJ1T9HBJR-00007
fr        : EPK98B5Q
rab       : 8
ieee      : 002538
cmic      : 0x3
mdts      : 5
cntlid    : 41
ver       : 10300
rtd3r     : e4e1c0
rtd3e     : 989680
oaes      : 0x300
ctratt    : 0
rrls      : 0
oacs      : 0xdf
acl       : 127
aerl      : 15
frmw      : 0x17
lpa       : 0xe
elpe      : 255
npss      : 0
avscc     : 0x1
apsta     : 0
wctemp    : 345
cctemp    : 358
mtfa      : 130
hmpre     : 0
hmmin     : 0
tnvmcap   : 1920383410176
unvmcap   : 0
rpmbs     : 0
edstt     : 2
dsto      : 1
fwug      : 255
kas       : 0
hctma     : 0
mntmt     : 0
mxtmt     : 0
sanicap   : 0x3
hmminds   : 0
hmmaxd    : 0
nsetidmax : 0
anatt     : 0
anacap    : 0
anagrpmax : 0
nanagrpid : 0
sqes      : 0x66
cqes      : 0x44
maxcmd    : 0
nn        : 32
oncs      : 0x7f
fuses     : 0
fna       : 0x4
vwc       : 0
awun      : 65535
awupf     : 0
nvscc     : 1
nwpc      : 0
acwu      : 0
sgls      : f0002
mnan      : 0
subnqn    : nqn.1994-11.com.samsung:nvme:PM1733:2.5-inch:S4YNNE0NA00859
ioccsz    : 0
iorcsz    : 0
icdoff    : 0
ctrattr   : 0
msdbd     : 0
ps 0 : mp:25.00W operational enlat:100 exlat:100 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:- active_power:-

[-- Attachment #3: PM9A3_nvme_cli.txt --]

PM9A3

/home/mgrochowski # nvme id-ctrl /dev/nvme1
NVME Identify Controller:
vid       : 0x144d
ssvid     : 0x144d
sn        : S64GNE0T503228
mn        : SAMSUNG MZQL21T9HCJR-00A07
fr        : GDC5602Q
rab       : 2
ieee      : 002538
cmic      : 0
mdts      : 9
cntlid    : 0x6
ver       : 0x10400
rtd3r     : 0x7a1200
rtd3e     : 0x7a1200
oaes      : 0x300
ctratt    : 0x80
rrls      : 0
cntrltype : 1
fguid     : 00000000-0000-0000-0000-000000000000
crdt1     : 0
crdt2     : 0
crdt3     : 0
nvmsr     : 1
vwci      : 0
mec       : 1
oacs      : 0x5f
acl       : 7
aerl      : 3
frmw      : 0x17
lpa       : 0xe
elpe      : 63
npss      : 1
avscc     : 0x1
apsta     : 0
wctemp    : 353
cctemp    : 356
mtfa      : 0
hmpre     : 0
hmmin     : 0
tnvmcap   : 1920383410176
unvmcap   : 0
rpmbs     : 0
edstt     : 35
dsto      : 1
fwug      : 0
kas       : 0
hctma     : 0
mntmt     : 0
mxtmt     : 0
sanicap   : 0x3
hmminds   : 0
hmmaxd    : 0
nsetidmax : 0
endgidmax : 0
anatt     : 0
anacap    : 0
anagrpmax : 0
nanagrpid : 0
pels      : 0
domainid  : 0
megcap    : 0
sqes      : 0x66
cqes      : 0x44
maxcmd    : 256
nn        : 32
oncs      : 0x5f
fuses     : 0
fna       : 0x4
vwc       : 0x6
awun      : 127
awupf     : 0
icsvscc   : 1
nwpc      : 0
acwu      : 0
ocfs      : 0
sgls      : 0
mnan      : 0
maxdna    : 0
maxcna    : 0
subnqn    : nqn.1994-11.com.samsung:nvme:PM9A3:2.5-inch:S64GNE0T503228
ioccsz    : 0
iorcsz    : 0
icdoff    : 0
fcatt     : 0
msdbd     : 0
ofcs      : 0
ps 0 : mp:25.00W operational enlat:70 exlat:70 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:4.00W active_power:14.00W active_power_workload:80K 128KiB SW
ps 1 : mp:8.00W operational enlat:70 exlat:70 rrt:1 rrl:1 rwt:1 rwl:1 idle_power:4.00W active_power:8.00W active_power_workload:80K 128KiB SW

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-08 17:26                     ` Grochowski, Maciej
@ 2023-02-08 17:39                       ` Keith Busch
  2023-02-08 22:38                         ` Grochowski, Maciej
  0 siblings, 1 reply; 26+ messages in thread
From: Keith Busch @ 2023-02-08 17:39 UTC (permalink / raw)
  To: Grochowski, Maciej
  Cc: Klaus Jensen, Javier.gonz@samsung.com, Christoph Hellwig, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel, Kanchan Joshi, Klaus Jensen

On Wed, Feb 08, 2023 at 05:26:42PM +0000, Grochowski, Maciej wrote:
> > `nvme id-ctrl /dev/nvmeX` should do nicely :)
>
> Thanks, I attached the id-ctrl output for both the PM1733 and the PM9A3.
>
> Also, I am not sure whether you use DPC, but in our setup we have the NVMe
> drives in PCIe slots with DPC configured in the BIOS, and inside the kernel
> we also enable it via the "pcie_ports=dpc-native" parameter. I could check
> the behavior without DPC, but I am not sure how much value there is in
> testing error recovery/hotplug without DPC enabled.

Are you observing DPC events during the error injection recovery? I didn't
see any in the previous logs.

If you have DPC enabled, it really doesn't make sense to aer_inject on
anything below downstream-capable ports. Those types of errors would be
contained by the DPC hardware; the kernel will get a DPC event instead of
an AER.

^ permalink raw reply	[flat|nested] 26+ messages in thread
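For readers checking their own setup: whether a port advertises and has armed DPC can be inspected with `lspci -vv` (from pciutils) on the port. The check below runs against a canned sample rather than live hardware, because the output is platform-specific; the field names and layout only approximate what lspci prints for the DPC capability, so treat them as an assumption.

```shell
# Canned sample approximating `lspci -vv` output for a port's DPC capability
# (the DpcCap/DpcCtl layout is an assumption, not captured from real hardware).
cat > /tmp/port-caps.txt <<'EOF'
	Capabilities: [2a0 v1] Downstream Port Containment
		DpcCap: INT Msg# 0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
		DpcCtl: Trigger:1 Cmpl- INT+ ErrCor- PoisonedTLP- SwTrigger- DL_ActiveErr-
EOF

# DPC is armed on the port if DpcCtl reports a non-zero trigger setting.
grep -E 'DpcCtl:.*Trigger:[12]' /tmp/port-caps.txt && echo "DPC enabled"
```

On a real system the equivalent would be `lspci -vv -s <bridge BDF>` piped into the same grep.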
* RE: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-08 17:39                       ` Keith Busch
@ 2023-02-08 22:38                         ` Grochowski, Maciej
  2023-02-09  7:55                           ` Javier.gonz
  0 siblings, 1 reply; 26+ messages in thread
From: Grochowski, Maciej @ 2023-02-08 22:38 UTC (permalink / raw)
  To: Keith Busch, Klaus Jensen
  Cc: Javier.gonz@samsung.com, Christoph Hellwig, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel, Kanchan Joshi, Klaus Jensen

> Are you observing DPC events during the error injection recovery? I didn't
> see any in the previous logs.

Correct, there was no DPC event triggered during error recovery. I initially
ran into an issue with an NVMe device not being able to recover when a DPC
event was triggered, and while trying to narrow down what happened I found
that the same effect can be caused by injecting a fatal AER error via
aer-inject.

> If you have DPC enabled, it really doesn't make sense to aer_inject on
> anything below downstream-capable ports. Those types of errors would be
> contained by the DPC hardware; the kernel will get a DPC event instead of
> an AER.

Understood, these two scenarios should be investigated separately; I will
remove DPC from the picture then. I disabled DPC on my setup and removed the
kernel option. I ran the same experiment and got the same results; however,
by playing with the connection to the NVMe drive I was able to reproduce the
same result on the KIOXIA.

That led me to the conclusion that this error may depend not on the NVMe
drive but on the connection. To test that, I got a Samsung PM9A3 in the M.2
form factor (my previous experiments were done on U.2 devices connected via
SlimSAS or OCuLink); interestingly, the M.2 PM9A3 is able to recover from
this scenario, proof below:

```
pcieport 0000:80:03.3: aer_inject: Injecting errors 00000000/00040000 into device 0000:84:00.0
pcieport 0000:80:03.3: AER: Uncorrected (Fatal) error received: 0000:84:00.0
nvme 0000:84:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
pcieport 0000:80:03.3: AER: broadcast error_detected message
nvme nvme0: frozen state error detected, reset controller
pcieport 0000:80:03.3: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 0
#d0_delay = 100ms
pcieport 0000:80:03.3: pciehp: pciehp_reset_slot: SLOTCTRL 70 write cmd 1008
pcieport 0000:80:03.3: AER: Root Port link has been reset (0)
pcieport 0000:80:03.3: AER: broadcast slot_reset message
nvme nvme0: restart after slot reset
nvme 0000:84:00.0: restoring config space at offset 0x30 (was 0x0, writing 0xbc200000)
nvme 0000:84:00.0: restoring config space at offset 0x10 (was 0x4, writing 0xbc210004)
nvme 0000:84:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100406)
pcieport 0000:80:03.3: AER: broadcast resume message
nvme 0000:84:00.0: saving config space at offset 0x0 (reading 0xa80a144d)
nvme 0000:84:00.0: saving config space at offset 0x4 (reading 0x100406)
nvme 0000:84:00.0: saving config space at offset 0x8 (reading 0x1080200)
nvme 0000:84:00.0: saving config space at offset 0xc (reading 0x0)
nvme 0000:84:00.0: saving config space at offset 0x10 (reading 0xbc210004)
nvme 0000:84:00.0: saving config space at offset 0x14 (reading 0x0)
nvme 0000:84:00.0: saving config space at offset 0x18 (reading 0x0)
nvme 0000:84:00.0: saving config space at offset 0x1c (reading 0x0)
nvme 0000:84:00.0: saving config space at offset 0x20 (reading 0x0)
nvme 0000:84:00.0: saving config space at offset 0x24 (reading 0x0)
nvme 0000:84:00.0: saving config space at offset 0x28 (reading 0x0)
nvme 0000:84:00.0: saving config space at offset 0x2c (reading 0xa812144d)
nvme 0000:84:00.0: saving config space at offset 0x30 (reading 0xbc200000)
nvme 0000:84:00.0: saving config space at offset 0x34 (reading 0x40)
nvme 0000:84:00.0: saving config space at offset 0x38 (reading 0x0)
nvme 0000:84:00.0: saving config space at offset 0x3c (reading 0x1ff)
nvme nvme0: Shutdown timeout set to 8 seconds
nvme nvme0: 64/0/0 default/read/poll queues
pcieport 0000:80:03.3: AER: device recovery successful
```

So my current conclusion is that this is very likely a hardware-related
issue, not driver/software. I will dig deeper into it, but for this
particular thread I think there is not much more we can do.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH] nvme-pci: fix resume after AER recovery
  2023-02-08 22:38                         ` Grochowski, Maciej
@ 2023-02-09  7:55                           ` Javier.gonz
  0 siblings, 0 replies; 26+ messages in thread
From: Javier.gonz @ 2023-02-09 7:55 UTC (permalink / raw)
  To: Grochowski, Maciej
  Cc: Keith Busch, Klaus Jensen, Christoph Hellwig, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, Lewis, Nathaniel, Kanchan Joshi, Klaus Jensen

On 08.02.2023 22:38, Grochowski, Maciej wrote:
[...]
>So my current conclusion is that this is very likely a hardware-related
>issue, not driver/software. I will dig deeper into it, but for this
>particular thread I think there is not much more we can do.

Thanks for sharing this, Maciej. Our firmware team is investigating the issue
and has not been able to reproduce it either. I will ask them to pause this
until we have more data. It seems we do not need a quirk then, either.

^ permalink raw reply	[flat|nested] 26+ messages in thread
end of thread, other threads:[~2023-02-09  7:56 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-30 10:14 [PATCH] nvme-pci: fix resume after AER recovery Christoph Hellwig
2023-01-30 18:35 ` Keith Busch
2023-01-30 18:43 ` Keith Busch
2023-01-30 18:54 ` Grochowski, Maciej
2023-01-31  8:58 ` Christoph Hellwig
2023-01-31 15:22 ` Keith Busch
2023-02-01 22:58 ` Grochowski, Maciej
2023-02-02 10:18 ` Christoph Hellwig
2023-02-02 18:47 ` Grochowski, Maciej
2023-02-02 19:43 ` Keith Busch
2023-02-03  1:29 ` Grochowski, Maciej
2023-02-03  1:37 ` Keith Busch
2023-02-03 18:45 ` Grochowski, Maciej
2023-02-06 14:02 ` Javier.gonz
2023-02-06 15:42 ` Christoph Hellwig
2023-02-06 16:22 ` Keith Busch
2023-02-06 17:51 ` Javier.gonz
2023-02-07  1:51 ` Grochowski, Maciej
2023-02-07  8:29 ` Javier.gonz
2023-02-07 10:36 ` Klaus Jensen
2023-02-07 19:05 ` Grochowski, Maciej
2023-02-08  6:43 ` Klaus Jensen
2023-02-08 17:26 ` Grochowski, Maciej
2023-02-08 17:39 ` Keith Busch
2023-02-08 22:38 ` Grochowski, Maciej
2023-02-09  7:55 ` Javier.gonz