linux-pci.vger.kernel.org archive mirror
From: Sathyanarayanan Kuppuswamy  <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Tushar Dave <tdave@nvidia.com>, Keith Busch <kbusch@kernel.org>,
	Bjorn Helgaas <helgaas@kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>, Sagi Grimberg <sagi@grimberg.me>,
	linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org
Subject: Re: nvme-pci: Disabling device after reset failure: -5 occurs while AER recovery
Date: Wed, 15 Mar 2023 13:43:26 -0700
Message-ID: <362f552f-df15-878a-1fd6-4ef086e8fdb1@linux.intel.com>
In-Reply-To: <e598b84f-2f90-b29a-6209-17309763514f@nvidia.com>



On 3/15/23 1:01 PM, Tushar Dave wrote:
> 
> 
> On 3/14/23 10:26, Keith Busch wrote:
>> On Tue, Mar 14, 2023 at 11:11:27AM -0500, Bjorn Helgaas wrote:
>>> On Mon, Mar 13, 2023 at 05:57:43PM -0700, Tushar Dave wrote:
>>>> On 3/11/23 00:22, Lukas Wunner wrote:
>>>>> On Fri, Mar 10, 2023 at 05:45:48PM -0800, Tushar Dave wrote:
>>>>>> On 3/10/2023 3:53 PM, Bjorn Helgaas wrote:
>>>>>>> In the log below, pciehp obviously is enabled; should I infer that in
>>>>>>> the log above, it is not?
>>>>>>
>>>>>> pciehp is enabled all the time, in both the log above and the one below.
>>>>>> I do not yet have an answer for why pciehp shows up only in some tests
>>>>>> (due to DPC link down/up) and not in others, as you noticed in both logs.
>>>>>
>>>>> Maybe some of the switch Downstream Ports are hotplug-capable and
>>>>> some are not?  (Check the Slot Implemented bit in the PCI Express
>>>>> Capabilities Register as well as the Hot-Plug Capable bit in the
>>>>> Slot Capabilities Register.)
>>>>> ...
>>>
>>>>>>> Generally we've avoided handling a device reset as a
>>>>>>> remove/add event because upper layers can't deal well with
>>>>>>> that.  But in the log below it looks like pciehp *did* treat
>>>>>>> the DPC containment as a remove/add, which of course involves
>>>>>>> configuring the "new" device and its MPS settings.
>>>>>>
>>>>>> Yes, and that puzzled me. Why, especially when the log says "Link
>>>>>> Down/Up ignored (recovered by DPC)"? Do we still have a race
>>>>>> somewhere? I am not sure.
>>>>>
>>>>> You're seeing the expected behavior.  pciehp ignores DLLSC events
>>>>> caused by DPC, but then double-checks that DPC recovery succeeded.
>>>>> If it didn't, it would be a bug not to bring down the slot.  So
>>>>> pciehp does exactly that.  See this code snippet in
>>>>> pciehp_ignore_dpc_link_change():
>>>>>
>>>>>     /*
>>>>>      * If the link is unexpectedly down after successful recovery,
>>>>>      * the corresponding link change may have been ignored above.
>>>>>      * Synthesize it to ensure that it is acted on.
>>>>>      */
>>>>>     down_read_nested(&ctrl->reset_lock, ctrl->depth);
>>>>>     if (!pciehp_check_link_active(ctrl))
>>>>>         pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
>>>>>     up_read(&ctrl->reset_lock);
>>>>>
>>>>> So on hotplug-capable ports, pciehp is able to mop up the mess
>>>>> created by fiddling with the MPS settings behind the kernel's
>>>>> back.
>>>>
>>>> That's the thing: even on a hotplug-capable slot I do not see pciehp
>>>> _all_ the time. Sometimes pciehp gets involved and takes care of things
>>>> (like I mentioned in the previous thread) and other times there is no
>>>> pciehp engagement at all!
>>>
>>> Possibly a timing issue, so I'll be interested to see if 53b54ad074de
>>> ("PCI/DPC: Await readiness of secondary bus after reset") makes any
>>> difference.  Lukas didn't mention that, so maybe it's a red herring,
>>> but I'm still curious since it explicitly mentions the DPC reset case
>>> that you're exercising here.
> 
> Commit 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset") didn't help.

I have not read the full thread. Since this appears to be in the EDR recovery
path, make sure your kernel includes the following patch:

https://lore.kernel.org/lkml/20230215200532.3126937-1-sathyanarayanan.kuppuswamy@linux.intel.com/T/

> 
> [ 6265.268757] pcieport 0000:a5:01.0: EDR: EDR event received
> [ 6265.276034] pcieport 0000:a5:01.0: EDR: Reported EDR dev: 0000:a9:10.0
> [ 6265.283780] pcieport 0000:a9:10.0: DPC: containment event, status:0x2009 source:0x0000
> [ 6265.292972] pcieport 0000:a9:10.0: DPC: unmasked uncorrectable error detected
> [ 6265.301284] pcieport 0000:a9:10.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
> [ 6265.313569] pcieport 0000:a9:10.0:   device [1000:c030] error status/mask=00040000/00180000
> [ 6265.323208] pcieport 0000:a9:10.0:    [18] MalfTLP                (First)
> [ 6265.331084] pcieport 0000:a9:10.0: AER:   TLP Header: 6000007a ab0000ff 00000001 629d4318
> [ 6265.340536] pcieport 0000:a9:10.0: AER: broadcast error_detected message
> [ 6265.348320] nvme nvme1: frozen state error detected, reset controller
> [ 6265.419633] pcieport 0000:a9:10.0: waiting 100 ms for downstream link, after activation
> [ 6265.627639] pcieport 0000:a9:10.0: AER: broadcast slot_reset message
> [ 6265.635289] nvme nvme1: restart after slot reset
> [ 6265.641016] nvme 0000:ab:00.0: restoring config space at offset 0x3c (was 0x100, writing 0x1ff)
> [ 6265.651248] nvme 0000:ab:00.0: restoring config space at offset 0x30 (was 0x0, writing 0xe0600000)
> [ 6265.661739] nvme 0000:ab:00.0: restoring config space at offset 0x10 (was 0x4, writing 0xe0710004)
> [ 6265.672210] nvme 0000:ab:00.0: restoring config space at offset 0xc (was 0x0, writing 0x8)
> [ 6265.681897] nvme 0000:ab:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100546)
> [ 6265.692616] pcieport 0000:a9:10.0: AER: broadcast resume message
> [ 6265.716299] nvme 0000:ab:00.0: saving config space at offset 0x0 (reading 0xa824144d)
> [ 6265.725614] nvme 0000:ab:00.0: saving config space at offset 0x4 (reading 0x100546)
> [ 6265.734657] nvme 0000:ab:00.0: saving config space at offset 0x8 (reading 0x1080200)
> [ 6265.743824] nvme 0000:ab:00.0: saving config space at offset 0xc (reading 0x8)
> [ 6265.752348] nvme 0000:ab:00.0: saving config space at offset 0x10 (reading 0xe0710004)
> [ 6265.761647] nvme 0000:ab:00.0: saving config space at offset 0x14 (reading 0x0)
> [ 6265.770247] nvme 0000:ab:00.0: saving config space at offset 0x18 (reading 0x0)
> [ 6265.778857] nvme 0000:ab:00.0: saving config space at offset 0x1c (reading 0x0)
> [ 6265.787450] nvme 0000:ab:00.0: saving config space at offset 0x20 (reading 0x0)
> [ 6265.796034] nvme 0000:ab:00.0: saving config space at offset 0x24 (reading 0x0)
> [ 6265.804620] nvme 0000:ab:00.0: saving config space at offset 0x28 (reading 0x0)
> [ 6265.813201] nvme 0000:ab:00.0: saving config space at offset 0x2c (reading 0xa80a144d)
> [ 6265.822473] nvme 0000:ab:00.0: saving config space at offset 0x30 (reading 0xe0600000)
> [ 6265.831816] nvme 0000:ab:00.0: saving config space at offset 0x34 (reading 0x40)
> [ 6265.840482] nvme 0000:ab:00.0: saving config space at offset 0x38 (reading 0x0)
> [ 6265.849037] nvme 0000:ab:00.0: saving config space at offset 0x3c (reading 0x1ff)
> [ 6275.037534] block nvme1n1: no usable path - requeuing I/O
> [ 6326.920009] nvme nvme1: I/O 22 QID 0 timeout, disable controller
> [ 6326.988701] nvme nvme1: Identify Controller failed (-4)
> [ 6326.995253] nvme nvme1: Disabling device after reset failure: -5
> [ 6327.032308] pcieport 0000:a9:10.0: AER: device recovery successful
> [ 6327.039781] pcieport 0000:a9:10.0: EDR: DPC port successfully recovered
> [ 6327.047687] pcieport 0000:a5:01.0: EDR: Status for 0000:a9:10.0: 0x80
> [ 6327.083131] pcieport 0000:a5:01.0: EDR: EDR event received
> [ 6327.090173] pcieport 0000:a5:01.0: EDR: Reported EDR dev: 0000:a9:10.0
> [ 6327.097816] pcieport 0000:a9:10.0: DPC: containment event, status:0x2009 source:0x0000
> [ 6327.107009] pcieport 0000:a9:10.0: DPC: unmasked uncorrectable error detected
> [ 6327.115330] pcieport 0000:a9:10.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
> [ 6327.127640] pcieport 0000:a9:10.0:   device [1000:c030] error status/mask=00040000/00180000
> [ 6327.137319] pcieport 0000:a9:10.0:    [18] MalfTLP                (First)
> [ 6327.145236] pcieport 0000:a9:10.0: AER:   TLP Header: 60000080 ab0000ff 00000001 5ad65000
> [ 6327.154728] pcieport 0000:a9:10.0: AER: broadcast error_detected message
> [ 6327.162624] nvme nvme1: frozen state error detected, reset controller
> [ 6327.183979] pcieport 0000:a9:10.0: waiting 100 ms for downstream link, after activation
> [ 6327.387969] pcieport 0000:a9:10.0: AER: broadcast slot_reset message
> [ 6327.395596] nvme nvme1: restart after slot reset
> [ 6327.401313] nvme 0000:ab:00.0: restoring config space at offset 0x3c (was 0x100, writing 0x1ff)
> [ 6327.411517] nvme 0000:ab:00.0: restoring config space at offset 0x30 (was 0x0, writing 0xe0600000)
> [ 6327.422045] nvme 0000:ab:00.0: restoring config space at offset 0x10 (was 0x4, writing 0xe0710004)
> [ 6327.432523] nvme 0000:ab:00.0: restoring config space at offset 0xc (was 0x0, writing 0x8)
> [ 6327.442212] nvme 0000:ab:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100546)
> [ 6327.452933] pcieport 0000:a9:10.0: AER: broadcast resume message
> [ 6327.460184] pcieport 0000:a9:10.0: AER: device recovery successful
> [ 6327.467533] pcieport 0000:a9:10.0: EDR: DPC port successfully recovered
> [ 6327.475367] pcieport 0000:a5:01.0: EDR: Status for 0000:a9:10.0: 0x80
> 
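For reference, the register values in the log above can be decoded by hand.
The following is only a minimal sketch: the bit values are local copies of
what I believe include/uapi/linux/pci_regs.h defines, while struct tlp_hdr,
dpc_trigger_reason() and decode_mem_tlp() are hypothetical helper names made
up for illustration, not kernel APIs. It assumes the "TLP Header:" line
prints the four header DWs in order, with header byte 0 in the most
significant byte of the first DW.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of register bits; values follow include/uapi/linux/pci_regs.h. */
#define DPC_STATUS_TRIGGER      0x0001u     /* Trigger Status */
#define DPC_STATUS_TRIGGER_RSN  0x0006u     /* Trigger Reason, bits 2:1 */
#define DPC_STATUS_INTERRUPT    0x0008u     /* Interrupt Status */
#define ERR_UNC_MALF_TLP        0x00040000u /* Malformed TLP, bit 18 */

/* Hypothetical container for the decoded fields; not a kernel structure. */
struct tlp_hdr {
	unsigned int fmt;       /* DW0 bits 31:29 */
	unsigned int type;      /* DW0 bits 28:24 */
	unsigned int length_dw; /* payload length in DWs */
	uint16_t requester_id;  /* bus in bits 15:8, devfn in bits 7:0 */
	uint8_t tag;
	uint64_t addr;
};

/* Trigger Reason 0 means "unmasked uncorrectable error". */
static unsigned int dpc_trigger_reason(uint16_t status)
{
	return (status & DPC_STATUS_TRIGGER_RSN) >> 1;
}

/* Decode a memory request header as printed on the "TLP Header:" line. */
static struct tlp_hdr decode_mem_tlp(const uint32_t dw[4])
{
	struct tlp_hdr h;
	unsigned int len;

	assert(dw);

	h.fmt = (dw[0] >> 29) & 0x7;
	h.type = (dw[0] >> 24) & 0x1f;

	/* A Length field of 0 encodes 1024 DWs. */
	len = dw[0] & 0x3ff;
	h.length_dw = len ? len : 1024;

	h.requester_id = (uint16_t)(dw[1] >> 16);
	h.tag = (uint8_t)(dw[1] >> 8);

	/* Only 4DW headers (Fmt bit 0 set) carry a 64-bit address. */
	if (h.fmt & 1)
		h.addr = ((uint64_t)dw[2] << 32) | (dw[3] & ~UINT32_C(3));
	else
		h.addr = dw[2] & ~UINT32_C(3);

	return h;
}
```

Feeding it the first logged header (6000007a ab0000ff 00000001 629d4318)
gives Fmt 3 / Type 0, i.e. a memory write with a 4DW header, a length of 122
DWs (488 bytes) and Requester ID 0xab00, which matches the nvme device at
ab:00.0. DPC status 0x2009 has Trigger Status and Interrupt Status set with
Trigger Reason 0, consistent with the "unmasked uncorrectable error
detected" line. The log does not show the MPS values in effect, so whether
the 488-byte payload is what made the TLP malformed is something to verify,
not something this sketch proves.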
>>
>> Catching the PDC event may be timing related. pciehp ignores the link events
>> during a DPC event, but it always reacts to PDC since it's indistinguishable
>> from a DPC occurring in response to a surprise removal, and these slots probably
>> don't have out-of-band presence detection.
> 
> Yeah, the In-Band PD Disable bit in the Slot Control register of the PCIe
> switch Downstream Port is set to '0'. I have no idea about out-of-band
> presence detection!
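
The bits discussed in this thread (Slot Implemented, Hot-Plug Capable,
In-Band PD Disable) can be checked from raw register values, for example
ones read from the PCI Express Capability of each Downstream Port with
lspci or setpci. A minimal sketch, assuming the bit positions below match
include/uapi/linux/pci_regs.h; the two function names are hypothetical
helpers for illustration, not kernel APIs:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of register bits; values follow include/uapi/linux/pci_regs.h. */
#define EXP_FLAGS_SLOT           0x0100u     /* PCIe Capabilities: Slot Implemented */
#define EXP_SLTCAP_HPC           0x00000040u /* Slot Capabilities: Hot-Plug Capable */
#define EXP_SLTCTL_IBPD_DISABLE  0x4000u     /* Slot Control: In-Band PD Disable */

/*
 * Slot Capabilities is only meaningful when the port implements a slot,
 * so both bits have to be set for the port to be hotplug-capable.
 */
static bool port_is_hotplug_capable(uint16_t pcie_flags, uint32_t slot_cap)
{
	return (pcie_flags & EXP_FLAGS_SLOT) && (slot_cap & EXP_SLTCAP_HPC);
}

/* True when in-band presence detection has been disabled for the slot. */
static bool inband_pd_disabled(uint16_t slot_ctl)
{
	return (slot_ctl & EXP_SLTCTL_IBPD_DISABLE) != 0;
}
```

Within the PCI Express Capability, the PCIe Capabilities register is at
offset 0x02, Slot Capabilities at 0x14 and Slot Control at 0x18. Running the
check on every switch Downstream Port would show whether only some of them
are hotplug-capable, which is one possible explanation for pciehp showing up
in some tests and not in others.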

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


Thread overview: 15+ messages
     [not found] <bcbd48b5-1d6e-8fe3-d6a0-cb341e5c34e3@nvidia.com>
2023-03-09 17:53 ` nvme-pci: Disabling device after reset failure: -5 occurs while AER recovery Bjorn Helgaas
2023-03-10 22:39   ` Tushar Dave
2023-03-10 23:53     ` Bjorn Helgaas
2023-03-11  1:45       ` Tushar Dave
2023-03-11  8:22         ` Lukas Wunner
2023-03-11 16:46           ` Keith Busch
2023-03-14  0:57           ` Tushar Dave
2023-03-14 16:11             ` Bjorn Helgaas
2023-03-14 17:26               ` Keith Busch
2023-03-15 20:01                 ` Tushar Dave
2023-03-15 20:43                   ` Sathyanarayanan Kuppuswamy [this message]
2023-03-15 22:16                     ` Tushar Dave
2023-03-15 22:23                       ` Sathyanarayanan Kuppuswamy
2023-03-15 22:25                         ` Sathyanarayanan Kuppuswamy
2023-03-18  0:15                   ` Tushar Dave
