From: Klaus Jensen <its@irrelevant.dk>
To: "Javier.gonz@samsung.com" <Javier.gonz@samsung.com>
Cc: "Grochowski, Maciej" <Maciej.Grochowski@sony.com>,
Keith Busch <kbusch@kernel.org>, Christoph Hellwig <hch@lst.de>,
"sagi@grimberg.me" <sagi@grimberg.me>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
"Lewis, Nathaniel" <Nathaniel.Lewis@sony.com>,
Kanchan Joshi <joshi.k@samsung.com>,
Klaus Jensen <k.jensen@samsung.com>
Subject: Re: [PATCH] nvme-pci: fix resume after AER recovery
Date: Tue, 7 Feb 2023 11:36:32 +0100
Message-ID: <Y+IpsEd8kMoljOAa@cormorant.local>
In-Reply-To: <20230207082930.i5mz2y5bsabnj2ud@ArmHalley.localdomain>
On Feb 7 09:29, Javier.gonz@samsung.com wrote:
> On 07.02.2023 01:51, Grochowski, Maciej wrote:
> > I have tried the suggested approach, with some modification: pci_device
> > in pci_reset_secondary_bus is actually the bridge, not the NVMe device
> > itself, so I checked the devices behind that bridge to see if any has
> > the D0 bit, and based on that logic I run the custom delay.
> >
> > Unfortunately, even with this approach I see the same issue for both
> > Samsung drives, and based on the kernel logs I can see that the wait
> > for the secondary bus reset did get increased. So it seems this quirk
> > doesn't work for some reason. (I also tried increasing the delays to
> > different values, but that didn't help.)
>
> Too bad.
>
> I will write to you separately to get some dumps from the device. We
> have not seen this before, so we need to understand it a bit better.
>
> Regarding the quirk, we are looking into it. We will come back with
> something in this thread later. Cc'ing Kanchan and Klaus.
>

I dug up a PM1733 and I am not immediately able to reproduce on 6.2-rc7.
With an aer-inject error file,

  AER
  PCI_ID 0000:04:00.0
  UNCOR_STATUS MALF_TLP
  HEADER_LOG 0 1 2 3

I'm getting a Fatal error with type "Inaccessible, (Unregistered Agent
ID)", but it still recovers successfully:

  pcieport 0000:00:01.2: aer_inject: Injecting errors 00000000/00040000 into device 0000:04:00.0
  pcieport 0000:00:01.2: AER: Uncorrected (Fatal) error received: 0000:04:00.0
  nvme 0000:04:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
  nvme nvme1: frozen state error detected, reset controller
  pcieport 0000:03:00.0: AER: Downstream Port link has been reset (0)
  nvme nvme1: restart after slot reset
  nvme nvme1: Shutdown timeout set to 10 seconds
  nvme nvme1: 32/0/0 default/read/poll queues
  pcieport 0000:03:00.0: AER: device recovery successful

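When comparing runs, the final outcome line can be picked out of the kernel log mechanically. A minimal sketch, assuming log lines shaped like the ones above; the function name and the regular expression are illustrative assumptions, not part of the kernel or of aer-inject, and the "failed" wording is assumed to mirror the "successful" line:

```python
import re

# Assumed shape of the outcome line, modeled on the log quoted above.
# The "failed" alternative is an assumption about the failure wording.
RECOVERY = re.compile(
    r"pcieport (?P<port>[0-9a-f:.]+): AER: device recovery "
    r"(?P<result>successful|failed)"
)


def aer_recovery_outcomes(lines):
    """Map each reporting port to the last recovery result seen in the log."""
    outcomes = {}
    for line in lines:
        match = RECOVERY.search(line)
        if match:
            # A later line for the same port overrides an earlier one.
            outcomes[match.group("port")] = match.group("result")
    return outcomes
```

Feeding it the lines of a saved dmesg capture gives one entry per port that reported a recovery result, which makes it easy to diff a passing and a failing setup.
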
Maciej, can you share the firmware revision and a bit more detail on
your reproducer/setup, so that we can try to replicate this?

Thanks,
Klaus