From: Kai-Heng Feng <kai.heng.feng@canonical.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: bhelgaas@google.com, mika.westerberg@linux.intel.com,
koba.ko@canonical.com, Lukas Wunner <lukas@wunner.de>,
Stuart Hayes <stuart.w.hayes@gmail.com>,
Jan Kiszka <jan.kiszka@siemens.com>,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] PCI/portdrv: Skip enabling AER on external facing ports
Date: Fri, 7 Jan 2022 12:09:57 +0800
Message-ID: <CAAd53p5V9gCCc6v9Wdo-bONYfASnhtyGHVPPb6vOneft2XewQQ@mail.gmail.com>
In-Reply-To: <20220105201226.GA218998@bhelgaas>
On Thu, Jan 6, 2022 at 4:12 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Wed, Jan 05, 2022 at 02:06:41PM +0800, Kai-Heng Feng wrote:
> > The Thunderbolt root ports may constantly spew out uncorrected errors
> > from AER service:
> > [ 30.100211] pcieport 0000:00:1d.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:1d.0
> > [ 30.100251] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> > [ 30.100256] pcieport 0000:00:1d.0: device [8086:7ab0] error status/mask=00100000/00004000
> > [ 30.100262] pcieport 0000:00:1d.0: [20] UnsupReq (First)
> > [ 30.100267] pcieport 0000:00:1d.0: AER: TLP Header: 34000000 08000052 00000000 00000000
> > [ 30.100372] thunderbolt 0000:0a:00.0: AER: can't recover (no error_detected callback)
> > [ 30.100401] xhci_hcd 0000:3e:00.0: AER: can't recover (no error_detected callback)
> > [ 30.100427] pcieport 0000:00:1d.0: AER: device recovery failed
>
> No timestamps needed here; they don't add to understanding the
> problem.
Got it. I'll drop the timestamps in the next iteration.
>
> > The link may not be reliable on external facing ports, so don't enable
> > AER on those ports.
>
> I'm not sure what you want to accomplish here. If the errors are
> legitimate and the result of some hardware issue like a bad cable, why
> should we ignore them? If they're caused by a software problem, we
> should figure that out and fix it.
>
> Does this occur on a specific instance of possibly flaky hardware?
Only on the root ports of Thunderbolt devices.
The error occurs as soon as the root port is runtime suspended to D3cold.
Runtime suspending the AER service along with the port resolves the issue.
I wonder if that's the right thing to do here?
D3cold should also mean the PCI link is gone, so disabling AER seems to
be a reasonable approach.
Kai-Heng
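
The suspend-with-the-port idea described above can be sketched as a small
userspace model. This is only an illustration under stated assumptions, not
kernel code: struct port_model, port_runtime_suspend(),
port_runtime_resume() and port_report_error() are hypothetical names that do
not exist in the kernel. The model only shows that errors raised while the
port is in D3cold are not logged, and that reporting comes back on resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum power_state { PM_D0, PM_D3COLD };

struct port_model {
	enum power_state state;
	bool aer_wanted;            /* AER service is bound to this port */
	bool aer_reporting;         /* error reporting currently enabled */
	unsigned int errors_logged; /* errors that reached the log */
};

/* Stop error reporting before the link goes away in D3cold. */
void port_runtime_suspend(struct port_model *p)
{
	assert(p != NULL);
	p->aer_reporting = false;
	p->state = PM_D3COLD;
}

/* Restore error reporting once the port is back in D0. */
void port_runtime_resume(struct port_model *p)
{
	assert(p != NULL);
	p->state = PM_D0;
	p->aer_reporting = p->aer_wanted;
}

/* Returns true if the error was logged, false if it was suppressed. */
bool port_report_error(struct port_model *p)
{
	assert(p != NULL);
	if (!p->aer_reporting)
		return false;
	p->errors_logged++;
	return true;
}
```

In this model an Unsupported Request seen after port_runtime_suspend() is
dropped instead of triggering recovery, which is the behavior the reply is
asking about. Whether the real hardware error should be hidden this way is
exactly the open question in this thread.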
>
> You mention a spew of errors; do you think this is a single error that
> we fail to clear correctly? Or is it really many separate errors?
>
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215453
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > ---
> > drivers/pci/pcie/portdrv_core.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> > index bda630889f955..d464d00ade8f2 100644
> > --- a/drivers/pci/pcie/portdrv_core.c
> > +++ b/drivers/pci/pcie/portdrv_core.c
> > @@ -219,7 +219,8 @@ static int get_port_device_capability(struct pci_dev *dev)
> >
> > #ifdef CONFIG_PCIEAER
> > if (dev->aer_cap && pci_aer_available() &&
> > - (pcie_ports_native || host->native_aer)) {
> > + (pcie_ports_native || host->native_aer) &&
> > + !dev->external_facing) {
> > services |= PCIE_PORT_SERVICE_AER;
> >
> > /*
> > --
> > 2.33.1
> >
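
For reference, the condition the quoted patch proposes can be modelled
outside the kernel as a plain predicate. This is a minimal sketch, assuming
each kernel value is reduced to a boolean: struct port_caps and
should_enable_aer() are hypothetical names, and each field only stands in
for the kernel expression named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the values tested in the quoted hunk. */
struct port_caps {
	bool has_aer_cap;     /* stands in for dev->aer_cap != 0 */
	bool aer_available;   /* stands in for pci_aer_available() */
	bool ports_native;    /* stands in for pcie_ports_native */
	bool host_native_aer; /* stands in for host->native_aer */
	bool external_facing; /* stands in for dev->external_facing */
};

/*
 * Mirrors the patched condition: the AER service is requested only when
 * the capability exists, AER is available, the OS owns AER natively,
 * and the port is not external facing.
 */
bool should_enable_aer(const struct port_caps *c)
{
	assert(c != NULL);
	return c->has_aer_cap && c->aer_available &&
	       (c->ports_native || c->host_native_aer) &&
	       !c->external_facing;
}
```

With the extra term, an external-facing port that would otherwise qualify
no longer gets PCIE_PORT_SERVICE_AER, while internal ports are unaffected.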