Linux PCI subsystem development
From: Bjorn Helgaas <helgaas@kernel.org>
To: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: bhelgaas@google.com,
	"Mario Limonciello" <mario.limonciello@amd.com>,
	"Mika Westerberg" <mika.westerberg@linux.intel.com>,
	"Keith Busch" <kbusch@kernel.org>,
	"Kuppuswamy Sathyanarayanan"
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	"Pali Rohár" <pali@kernel.org>, "Stefan Roese" <sr@denx.de>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	"David E. Box" <david.e.box@linux.intel.com>
Subject: Re: [PATCH] PCI/portdrv: Avoid enabling AER on Thunderbolt devices
Date: Mon, 26 Dec 2022 16:50:45 -0600
Message-ID: <20221226225045.GA400369@bhelgaas>
In-Reply-To: <20221226153048.1208359-1-kai.heng.feng@canonical.com>

[+cc David]

Hi Kai-Heng,

Thanks for the report and the debugging!

On Mon, Dec 26, 2022 at 11:30:31PM +0800, Kai-Heng Feng wrote:
> We are seeing the igc Ethernet device on a Thunderbolt dock stop working
> after S3 resume because of AER errors, and sometimes S3 resume freezes:
> pcieport 0000:00:1d.0: AER: Multiple Corrected error received: 0000:00:1d.0
> pcieport 0000:00:1d.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
> pcieport 0000:00:1d.0:   device [8086:7ab0] error status/mask=00008000/00002000
> pcieport 0000:00:1d.0:    [15] HeaderOF
> pcieport 0000:00:1d.0: AER: Multiple Uncorrected (Non-Fatal) error received: 0000:00:1d.0
> pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> pcieport 0000:00:1d.0:   device [8086:7ab0] error status/mask=00100000/00004000
> pcieport 0000:00:1d.0:    [20] UnsupReq               (First)
> pcieport 0000:00:1d.0: AER:   TLP Header: 34000000 0a000052 00000000 00000000

From a very quick look, I think 34...... ......52 is a PTM message (as
you suggest below).
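
To spell out that reading: a minimal sketch, assuming the AER log prints
each header DWORD with TLP byte 0 in the most significant bits, and
assuming 0x52 is the PTM Request message code from the PCIe spec's PTM
message table.  The struct and helper names are made up for
illustration; they are not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: fields taken from DW0/DW1 of a Message TLP. */
struct tlp_msg_info {
	uint8_t  fmt;        /* DW0 bits 31:29 */
	uint8_t  type;       /* DW0 bits 28:24 */
	uint16_t requester;  /* DW1 bits 31:16, bus in the high byte */
	uint8_t  msg_code;   /* DW1 bits 7:0 */
};

/* Assumed value: PTM Request message code per the PCIe spec. */
#define TLP_MSG_CODE_PTM_REQUEST 0x52

static struct tlp_msg_info tlp_decode(uint32_t dw0, uint32_t dw1)
{
	struct tlp_msg_info t;

	t.fmt       = (dw0 >> 29) & 0x7;
	t.type      = (dw0 >> 24) & 0x1f;
	t.requester = (dw1 >> 16) & 0xffff;
	t.msg_code  = dw1 & 0xff;
	return t;
}

/* Message TLPs have Type 10rrr, where rrr is the routing field. */
static bool tlp_is_message(const struct tlp_msg_info *t)
{
	return (t->type & 0x18) == 0x10;
}

static bool tlp_is_ptm_request(const struct tlp_msg_info *t)
{
	return tlp_is_message(t) && t->msg_code == TLP_MSG_CODE_PTM_REQUEST;
}
```

Feeding it the first two DWORDs above (0x34000000, 0x0a000052) gives
Fmt 001, Type 10100, Requester ID 0a00, message code 52, i.e., a
locally-routed message with no data from bus 0a.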

> pcieport 0000:00:1d.0: AER:   Error of this Agent is reported first
> pcieport 0000:04:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> pcieport 0000:04:01.0:   device [8086:1136] error status/mask=00300000/00000000
> pcieport 0000:04:01.0:    [20] UnsupReq               (First)
> pcieport 0000:04:01.0:    [21] ACSViol
> pcieport 0000:04:01.0: AER:   TLP Header: 34000000 04000052 00000000 00000000
> thunderbolt 0000:05:00.0: AER: can't recover (no error_detected callback)
> 
> This supposedly should be fixed by commit c01163dbd1b8 ("PCI/PM: Always disable
> PTM for all devices during suspend"), but somehow it doesn't work for
> this case.
> 
> By dumping the PCI_PTM_CTRL register on resume, it turns out PTM is
> already flipped on by either the Thunderbolt dock firmware or the host
> BIOS. Writing 0 to PCI_PTM_CTRL yields the same result.

Can you share your debug patch and corresponding dmesg log in the
bugzilla?

> Windows, however, is not affected by this issue. WinDbg's !pci
> command shows that AER is not enabled for devices connected via a
> Thunderbolt port, which is why Windows doesn't exhibit the issue.
> 
> So turn a blind eye to external Thunderbolt devices, as Windows does,
> by disabling AER.

Unless there's something in the PCIe or Thunderbolt spec that says AER
shouldn't be used on external devices, I think we need to figure out
the root cause before disabling AER on all removable devices.

The dmesg in the bugzilla below is from an HP ZBook Fury 16.  Do you
see this on any other platforms?  Do you have any HP BIOS contacts to
ask about this?

It seems like a firmware defect to enable PTM without knowing whether
upstream devices have PTM enabled.
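
As a rough model of that ordering constraint (not kernel code; the
struct and function are hypothetical): walk the path from the Root Port
down to the device and flag the first device that has PTM enabled while
its immediate upstream neighbor does not.  The assumption here is that
such a device's PTM Requests end up as Unsupported Requests at the
upstream port, which would match the log above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one device on the path; not a kernel struct. */
struct ptm_node {
	const char *name;
	bool ptm_enabled;
};

/*
 * path[0] is the Root Port, path[n - 1] is the most downstream device.
 * Returns the index of the first device that has PTM enabled while the
 * device directly upstream of it has PTM disabled, or -1 if there is
 * no such device.
 */
static int ptm_first_inconsistent(const struct ptm_node *path, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (path[i].ptm_enabled && !path[i - 1].ptm_enabled)
			return (int)i;
	}
	return -1;
}
```

With PTM disabled on the Root Port and switch but enabled on the dock
device, this returns the dock device's index; with PTM uniformly on or
uniformly off along the path, it returns -1.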

We could leave PTM enabled on upstream devices when suspending, but
that apparently prevents some low-power states.  Adding David since he
worked on that.

> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216850
> Cc: Mario Limonciello <mario.limonciello@amd.com>
> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/pci/pcie/portdrv.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> index 2cc2e60bcb396..59d00e20e57bf 100644
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -237,7 +237,8 @@ static int get_port_device_capability(struct pci_dev *dev)
>  	if ((pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
>               pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC) &&
>  	    dev->aer_cap && pci_aer_available() &&
> -	    (pcie_ports_native || host->native_aer))
> +	    (pcie_ports_native || host->native_aer) &&
> +	    !dev_is_removable(&dev->dev))
>  		services |= PCIE_PORT_SERVICE_AER;
>  #endif
>  
> -- 
> 2.34.1
> 

