linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: stuart hayes <stuart.w.hayes@gmail.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	Kuppuswamy Sathyanarayanan 
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Ethan Zhao <haifeng.zhao@intel.com>,
	Sinan Kaya <okaya@kernel.org>, Ashok Raj <ashok.raj@intel.com>,
	Keith Busch <kbusch@kernel.org>,
	Yicong Yang <yangyicong@hisilicon.com>,
	linux-pci@vger.kernel.org, Russell Currey <ruscur@russell.cc>,
	Oliver OHalloran <oohall@gmail.com>,
	Mika Westerberg <mika.westerberg@linux.intel.com>
Subject: Re: [PATCH v2] PCI: pciehp: Ignore Link Down/Up caused by DPC
Date: Tue, 20 Jul 2021 08:57:18 +0200	[thread overview]
Message-ID: <20210720065718.GA21556@wunner.de> (raw)
In-Reply-To: <a70a936d-d031-7199-4ed7-30753b3e7cfa@gmail.com>

On Mon, Jul 19, 2021 at 02:00:51PM -0500, stuart hayes wrote:
> On 7/19/2021 10:10 AM, Lukas Wunner wrote:
> > Could you test if the below patch fixes the issue?
> 
> That does appear to fix the issue, thanks!  Without your patch, the PCIe
> devices under 64:02.0 disappear (the triggered bit is still set in the DPC
> capability).  With your patch, recovery is successful and all of the PCIe
> devices are still there.

Thanks for testing.

The test patch clears DLLSC because the Hot Reset that is propagated
down the hierarchy causes the link to flap.  I'm wondering though if
that's sufficient or if PDC needs to be cleared as well.  According
to PCIe Base Spec sec. 4.2.6, LTSSM transitions from "Hot Reset" state
to "Detect", then "Polling".  If I understand the table "Link Status
Mapped to the LTSSM" in the spec correctly, in-band presence is 0b
in Detect state, hence I'd expect PDC to flap as well as a result of
a Hot Reset being propagated down the hierarchy.

Does the hotplug port at 0000:68:00.0 support In-Band Presence Disable?
That would explain why only clearing DLLSC is sufficient.

The problem is, if PDC is cleared as well, we lose the ability to
detect that a device was hot-removed while the reset was ongoing,
which is unfortunate.

If an error is handled by aer_root_reset() (instead of dpc_reset_link())
and the reset is performed at a hotplug port, then pciehp_reset_slot()
is invoked:

aer_root_reset()
  pci_bus_error_reset()
    pci_slot_reset()
      pci_reset_hotplug_slot()
        pciehp_reset_slot()

pciehp_reset_slot() temporarily masks both DLLSC *and* PDC events,
then performs a Secondary Bus Reset at the hotplug port.

If there are further hotplug ports below that hotplug port
where the SBR is performed, my expectation is that the Hot Reset
is likewise propagated down the hierarchy (just as with DPC),
so those cascaded hotplug ports should also see their link go down.

In other words, the issue you're seeing isn't really DPC-specific.
However, the test patch should fix the issue for AER-handled errors
as well.  Do you agree with this analysis or did I miss anything?

Thanks,

Lukas

  reply	other threads:[~2021-07-20  6:57 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-01  8:29 [PATCH v2] PCI: pciehp: Ignore Link Down/Up caused by DPC Lukas Wunner
2021-05-01  8:38 ` Lukas Wunner
2021-06-16 22:19 ` Bjorn Helgaas
2021-06-20  7:38   ` Lukas Wunner
2021-06-25 20:38     ` stuart hayes
2021-06-26  6:50       ` Lukas Wunner
2021-07-06 22:15         ` stuart hayes
2021-07-18 21:26           ` Lukas Wunner
2021-07-19 15:10       ` Lukas Wunner
2021-07-19 19:00         ` stuart hayes
2021-07-20  6:57           ` Lukas Wunner [this message]
2021-07-20 22:11             ` stuart hayes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210720065718.GA21556@wunner.de \
    --to=lukas@wunner.de \
    --cc=ashok.raj@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=haifeng.zhao@intel.com \
    --cc=helgaas@kernel.org \
    --cc=kbusch@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mika.westerberg@linux.intel.com \
    --cc=okaya@kernel.org \
    --cc=oohall@gmail.com \
    --cc=ruscur@russell.cc \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=stuart.w.hayes@gmail.com \
    --cc=yangyicong@hisilicon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).