public inbox for linux-nvme@lists.infradead.org
From: Klaus Jensen <its@irrelevant.dk>
To: "Javier.gonz@samsung.com" <Javier.gonz@samsung.com>
Cc: "Grochowski, Maciej" <Maciej.Grochowski@sony.com>,
	Keith Busch <kbusch@kernel.org>, Christoph Hellwig <hch@lst.de>,
	"sagi@grimberg.me" <sagi@grimberg.me>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"Lewis, Nathaniel" <Nathaniel.Lewis@sony.com>,
	Kanchan Joshi <joshi.k@samsung.com>,
	Klaus Jensen <k.jensen@samsung.com>
Subject: Re: [PATCH] nvme-pci: fix resume after AER recovery
Date: Tue, 7 Feb 2023 11:36:32 +0100
Message-ID: <Y+IpsEd8kMoljOAa@cormorant.local>
In-Reply-To: <20230207082930.i5mz2y5bsabnj2ud@ArmHalley.localdomain>


On Feb  7 09:29, Javier.gonz@samsung.com wrote:
> On 07.02.2023 01:51, Grochowski, Maciej wrote:
> > I have tried the suggested approach, with some modification: the pci_device
> > in pci_reset_secondary_bus is actually the bridge, not the NVMe device
> > itself, so I checked the devices behind that bridge to see if any has the
> > D0 bit and, based on that, ran the custom delay.
> > 
> > Unfortunately, even with this approach I see the same issue on both
> > Samsung drives, and the kernel logs show that the wait for the
> > secondary bus reset did increase. So it seems this quirk doesn't
> > work for some reason. (I also tried increasing the delays to different
> > values, but it didn't help.)
> 
> Too bad.
> 
> I will write to you separately to get some dumps from the device. We have
> not seen this before, so we need to understand this a bit better.
> 
> Regarding the quirk, we are looking into it. We will come back with
> something in this thread later. Cc'ing Kanchan and Klaus.
> 

I dug up a PM1733 and I am not immediately able to reproduce the issue
on 6.2-rc7.

With an aer-inject error file,

  AER
  PCI_ID 0000:04:00.0
  UNCOR_STATUS MALF_TLP
  HEADER_LOG 0 1 2 3

I'm getting a Fatal error with type "Inaccessible, (Unregistered Agent
ID)", but it still recovers successfully:

  pcieport 0000:00:01.2: aer_inject: Injecting errors 00000000/00040000 into device 0000:04:00.0
  pcieport 0000:00:01.2: AER: Uncorrected (Fatal) error received: 0000:04:00.0
  nvme 0000:04:00.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
  nvme nvme1: frozen state error detected, reset controller
  pcieport 0000:03:00.0: AER: Downstream Port link has been reset (0)
  nvme nvme1: restart after slot reset
  nvme nvme1: Shutdown timeout set to 10 seconds
  nvme nvme1: 32/0/0 default/read/poll queues
  pcieport 0000:03:00.0: AER: device recovery successful
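
For anyone comparing their own logs against the sequence above, here is a
minimal Python sketch that pulls the injected status masks out of the
aer_inject line and checks for the "device recovery successful" message.
The function names and the UNCOR_BITS table are illustrative assumptions,
not part of any tool; the bit values are the PCI_ERR_UNC_* values from
include/uapi/linux/pci_regs.h, and only a subset is listed.

```python
import re

# Subset of Uncorrectable Error Status bits. Values follow the
# PCI_ERR_UNC_* macros in include/uapi/linux/pci_regs.h; the table
# itself is a hypothetical helper for this sketch.
UNCOR_BITS = {
    0x00000010: "DLP",
    0x00001000: "POISON_TLP",
    0x00004000: "COMP_TIME",
    0x00008000: "COMP_ABORT",
    0x00040000: "MALF_TLP",
    0x00100000: "UNSUP",
}

# Matches the aer_inject line: correctable mask / uncorrectable mask / target.
INJECT_RE = re.compile(
    r"aer_inject: Injecting errors ([0-9a-f]{8})/([0-9a-f]{8}) into device (\S+)"
)


def decode_uncor(mask):
    """Return the names of the known uncorrectable bits set in mask."""
    return [name for bit, name in sorted(UNCOR_BITS.items()) if mask & bit]


def summarize(log_text):
    """Report the injection target, decoded bits, and recovery outcome."""
    summary = {"device": None, "uncor": [], "recovered": False}
    for line in log_text.splitlines():
        match = INJECT_RE.search(line)
        if match:
            summary["device"] = match.group(3)
            summary["uncor"] = decode_uncor(int(match.group(2), 16))
        elif "AER: device recovery successful" in line:
            summary["recovered"] = True
    return summary


SAMPLE = """\
pcieport 0000:00:01.2: aer_inject: Injecting errors 00000000/00040000 into device 0000:04:00.0
pcieport 0000:03:00.0: AER: device recovery successful
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

In the log above, the second mask (00040000) is the uncorrectable status
and corresponds to the MALF_TLP bit requested in the error file.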

Maciej, can you share the firmware revision and a bit more detail on
your reproducer/setup that might allow us to replicate this?


Thanks,
Klaus


Thread overview: 26+ messages
2023-01-30 10:14 [PATCH] nvme-pci: fix resume after AER recovery Christoph Hellwig
2023-01-30 18:35 ` Keith Busch
2023-01-30 18:43   ` Keith Busch
2023-01-30 18:54     ` Grochowski, Maciej
2023-01-31  8:58     ` Christoph Hellwig
2023-01-31 15:22       ` Keith Busch
2023-02-01 22:58         ` Grochowski, Maciej
2023-02-02 10:18           ` Christoph Hellwig
2023-02-02 18:47             ` Grochowski, Maciej
2023-02-02 19:43               ` Keith Busch
2023-02-03  1:29                 ` Grochowski, Maciej
2023-02-03  1:37                   ` Keith Busch
2023-02-03 18:45                     ` Grochowski, Maciej
2023-02-06 14:02                       ` Javier.gonz
2023-02-06 15:42                         ` Christoph Hellwig
2023-02-06 16:22                           ` Keith Busch
2023-02-06 17:51                             ` Javier.gonz
2023-02-07  1:51                               ` Grochowski, Maciej
2023-02-07  8:29                                 ` Javier.gonz
2023-02-07 10:36                                   ` Klaus Jensen [this message]
2023-02-07 19:05                                     ` Grochowski, Maciej
2023-02-08  6:43                                       ` Klaus Jensen
2023-02-08 17:26                                         ` Grochowski, Maciej
2023-02-08 17:39                                           ` Keith Busch
2023-02-08 22:38                                             ` Grochowski, Maciej
2023-02-09  7:55                                               ` Javier.gonz
