All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Lukas Wunner <lukas@wunner.de>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	Mika Westerberg <mika.westerberg@linux.intel.com>
Subject: Re: [PATCH v3] PCI: pciehp: Fix hotplug on Catlow Lake with unreliable PME status
Date: Tue, 24 Mar 2026 14:45:25 -0700	[thread overview]
Message-ID: <ca15feda-5d63-405d-a4c4-00cbb9991fed@linux.intel.com> (raw)
In-Reply-To: <20260323232437.GA1085990@bhelgaas>

Hi Bjorn,

Thanks for the review.

On 3/23/2026 4:24 PM, Bjorn Helgaas wrote:
> [+cc Mika, author of eb34da60edee]
> 
> On Mon, Mar 16, 2026 at 03:08:06PM -0700, Kuppuswamy Sathyanarayanan wrote:
>> On Intel Catlow Lake platforms, PCH PCIe root ports do not reliably
>> update PME status registers (PME Status and PME Requester_ID in the
>> Root Status register) during D3hot to D0 transitions, even though PME
>> interrupts are delivered correctly.
> 
> IIUC, the Root Status update should happen on receipt of a PME Message
> and is not directly related to a D3hot to D0 transition (PCIe r7.0,
> sec 6.1.6).


You're correct. PME status register updates occur on PME Message
reception, not during power state transitions.

I used "D3hot to D0 transitions" as shorthand for the full sequence: port
in D3hot detects hotplug event -> generates PME Message -> should update
Root Status -> PME interrupt -> port wakes to D0.

On Catlow Lake, the PME interrupt is delivered but the Root Status
registers (PME_Status and PME_Requester_ID) are not updated when the port
generates these PME Messages. This causes pcie_pme_irq() to return
IRQ_NONE, leaving the port in D3hot.

I'll clarify the commit message to avoid this misleading shorthand.
Something like below:

On Intel Catlow Lake platforms, PCH PCIe root ports do not reliably
update PME status registers (PME Status and PME Requester_ID in the
Root Status register) when receiving PME Messages while in D3hot state,
even though PME interrupts are delivered correctly

> 
>> This issue manifests during PCIe hotplug operations as follows:
>>
>> 1. After a hot-remove event, the PCIe port runtime suspends to D3hot.
>>    pciehp_suspend() disables hotplug interrupts (HPIE) to rely on
>>    PME-based wakeup.
> 
> Didn't we rely on PME interrupts anyway, independent of HPIE?

I don't think they are independent. If HPIE is enabled, hotplug interrupt
will be triggered as direct interrupt and we don't have to reply on PME
for wakeup.

> 
> eb34da60edee ("PCI: pciehp: Disable hotplug interrupt during suspend")
> cleared PCI_EXP_SLTCTL_HPIE so that when the link goes down, we
> wouldn't get a PCI_EXP_SLTSTA_DLLSC interrupt and wake the system.
> 
> I don't know the details of why the PCI_EXP_SLTSTA_DLLSC would cause
> that wakeup.  I would think pciehp should field that, and it should be
> able to figure out whether to bring the port out of D3hot.
> 
> Anyway, with this patch it looks like we'll leave PCI_EXP_SLTCTL_HPIE
> set, and potentially get that PCI_EXP_SLTSTA_DLLSC interrupt again?

I have tested this patch on Catlow Lake. Enabling HPIE does not result in
spurious wakeups as mentioned in Mika's patch.

Mika, any comments?

> 
>> 2. When a hot-add occurs while the port is in D3hot, a PME interrupt
>>    fires as expected to wake the port.
> 
> Why is a PME interrupt expected here?  I would expect a hot-add to
> cause a PCI_EXP_SLTSTA_PDC or PCI_EXP_SLTSTA_DLLSC interrupt.  Sec
> 6.1.6 suggests PME interrupts are only from Root Ports.
> 
>> 3. However, the PME interrupt handler finds the PME_Status and
>>    PME_Requester_ID registers unpopulated, preventing identification
>>    of which device triggered the PME. The handler returns IRQ_NONE,
>>    leaving the port in D3hot.
> 
> I guess this is pcie_pme_irq(), and it finds PCI_EXP_RTSTA_PME clear
> because of this Catlow defect?  It looks like it returns without even
> looking at PME_Requester_ID.

Yes, it returns after checking for PCI_EXP_RTSTA_PME. I tried modifying
it to proceed without the status check, but it still failed when reading
PME_Requester_ID (also not updated). So I mentioned both registers.

> 
> Sec 5.3.3.1 suggests that the purpose of PME_Requester_ID is to
> facilitate quicker PME service and shorter resume time.  So maybe the
> lack of PME_Requester_ID should only be a performance issue, not a
> functional problem?
> 
> If we know we got a PME interrupt, and we can wake up (maybe more
> slowly without a Requester ID), why can't we just do the wakeup
> independent of PCI_EXP_RTSTA_PME and PCI_EXP_RTSTA_PME_RQ_ID?  Are
> spurious PME interrupts a problem?
> 

Yes, I think we can call pcie_pme_walk_bus() even when PCI_EXP_RTSTA_PME
is clear for ports with the quirk. This would work but be slower without
the Requester_ID hint.

I can verify this alternative approach if Mika confirms that keeping HPIE
enabled is problematic. In our testing on Catlow Lake, enabling HPIE did not
cause any spurious wakeup issues.

>> 4. Because the port remains in D3hot with HPIE disabled, the hotplug
>>    event is lost and the newly inserted device is not recognized.
>>
>> The PME interrupt delivery mechanism itself works correctly;
>> interrupts arrive reliably. The problem is purely the missing status
>> register updates. Verification via IOSF-SideBand (IOSF-SB) backdoor
>> reads confirms that these registers remain empty when the PME
>> interrupt fires. Neither BIOS nor kernel code is clearing these
>> registers.
>>
>> This issue is present in all steppings of Catlow Lake PCH and affects
>> customers in production deployments. A public hardware errata document
>> is not yet available.
>>
>> Work around this issue by introducing a PCI_DEV_FLAGS_PME_UNRELIABLE
>> flag for affected ports. When this flag is set, pciehp keeps hotplug
>> interrupts (HPIE) enabled during D3hot instead of disabling them and
>> relying on PME. This allows hotplug events to be delivered via direct
>> interrupts rather than through the broken PME status mechanism.
>>
>> The port still enters D3hot for power savings during runtime suspend,
>> avoiding the power regression that would occur with pm_runtime_disable().
>> Testing confirms this approach does not impact PC6/PC10 package C-state
>> residency.
>>
>> During system suspend/resume, the behavior is unchanged. Ports are
>> resumed unconditionally when coming out of system sleep due to
>> DPM_FLAG_SMART_SUSPEND set by pcie_portdrv_probe(), and pciehp
>> re-enables interrupts and checks slot occupation status during resume.
>>
>> The quirk is applied only to Catlow PCH PCIe root ports (device IDs
>> 0x7a30 through 0x7a4b). Catlow CPU PCIe ports are not affected as
>> they are not hotplug-capable.
>>
>> Suggested-by: Lukas Wunner <lukas@wunner.de>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>> ---
>>
>> Changes since v2:
>>  * Switched from pm_runtime_disable() to PCI_DEV_FLAGS_PME_UNRELIABLE
>>    flag approach to avoid power regression (feedback from Rafael and Lukas)
>>  * Keep hotplug interrupts (HPIE) enabled during D3hot instead of
>>    preventing D3hot entry entirely
>>  * Port still enters D3hot for power savings; testing confirms no impact
>>    on PC6 package C-state residency
>>  * Modified pciehp to check pme_is_broken() before disabling/enabling
>>    hotplug interrupts during suspend/resume
>>  * Made quirk comment generic to cover both PME notification and status
>>    update issues, with Catlow Lake specifics documented separately
>>
>> Changes since v1:
>>  * Removed hack in hotplug driver and disabled runtime PM on affected ports.
>>  * Fixed the commit log and comments accordingly.
>>
>>  drivers/pci/hotplug/pciehp_core.c | 11 ++++--
>>  drivers/pci/quirks.c              | 60 +++++++++++++++++++++++++++++++
>>  include/linux/pci.h               |  2 ++
>>  3 files changed, 71 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
>> index 1e9158d7bac7..f854ef9551c3 100644
>> --- a/drivers/pci/hotplug/pciehp_core.c
>> +++ b/drivers/pci/hotplug/pciehp_core.c
>> @@ -260,13 +260,20 @@ static bool pme_is_native(struct pcie_device *dev)
>>  	return pcie_ports_native || host->native_pme;
>>  }
>>  
>> +static bool pme_is_broken(struct pcie_device *pcie)
>> +{
>> +	struct pci_dev *pdev = pcie->port;
>> +
>> +	return !!(pdev->dev_flags & PCI_DEV_FLAGS_PME_UNRELIABLE);
>> +}
>> +
>>  static void pciehp_disable_interrupt(struct pcie_device *dev)
>>  {
>>  	/*
>>  	 * Disable hotplug interrupt so that it does not trigger
>>  	 * immediately when the downstream link goes down.
>>  	 */
>> -	if (pme_is_native(dev))
>> +	if (pme_is_native(dev) && !pme_is_broken(dev))
>>  		pcie_disable_interrupt(get_service_data(dev));
>>  }
>>  
>> @@ -318,7 +325,7 @@ static int pciehp_resume(struct pcie_device *dev)
>>  {
>>  	struct controller *ctrl = get_service_data(dev);
>>  
>> -	if (pme_is_native(dev))
>> +	if (pme_is_native(dev) && !pme_is_broken(dev))
>>  		pcie_enable_interrupt(ctrl);
>>  
>>  	pciehp_check_presence(ctrl);
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 48946cca4be7..bfb52735c4e3 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -6380,3 +6380,63 @@ static void pci_mask_replay_timer_timeout(struct pci_dev *pdev)
>>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9750, pci_mask_replay_timer_timeout);
>>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9755, pci_mask_replay_timer_timeout);
>>  #endif
>> +
>> +/*
>> + * Some PCIe root ports have a hardware issue where PME-based wakeup
>> + * from D3hot is unreliable. This can manifest as either PME interrupts
>> + * not being delivered, or PME status registers (PME Status and PME
>> + * Requester_ID in Root Status) not being reliably updated even when
>> + * interrupts are delivered.
>> + *
>> + * When a hotplug event occurs while the port is in D3hot, the system
>> + * relies on PME to wake the port back to D0. If PME notification or
>> + * status updates are unreliable, the PME handler either doesn't get
>> + * invoked or cannot identify the event source. This leaves the port in
>> + * D3hot with hotplug interrupts disabled, causing hotplug events to be
>> + * missed.
>> + *
>> + * Mark affected ports with PCI_DEV_FLAGS_PME_UNRELIABLE to keep
>> + * hotplug interrupts (HPIE) enabled during D3hot instead of relying on
>> + * PME-based wakeup. This allows hotplug events to be delivered via
>> + * direct interrupts while still permitting the port to enter D3hot for
>> + * power savings.
>> + *
>> + * Known affected hardware:
>> + * - Intel Catlow Lake PCH PCIe root ports: PME status registers are
>> + *   not updated during D3hot to D0 transitions, even though PME
>> + *   interrupts are delivered correctly.
>> + */
>> +static void quirk_pcie_pme_unreliable(struct pci_dev *dev)
>> +{
>> +	dev->dev_flags |= PCI_DEV_FLAGS_PME_UNRELIABLE;
>> +	pci_info(dev, "PME status unreliable, keeping hotplug interrupts enabled in D3hot\n");
>> +}
>> +/* Apply quirk to Catlow Lake PCH root ports (0x7a30 - 0x7a4b) */
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a30, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a31, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a32, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a33, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a34, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a35, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a36, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a37, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a38, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a39, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3a, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3b, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3c, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3d, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3e, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a3f, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a40, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a41, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a42, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a43, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a44, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a45, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a46, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a47, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a48, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a49, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a4a, quirk_pcie_pme_unreliable);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7a4b, quirk_pcie_pme_unreliable);
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 1c270f1d5123..9761351c5d70 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -253,6 +253,8 @@ enum pci_dev_flags {
>>  	 * integrated with the downstream devices and doesn't use real PCI.
>>  	 */
>>  	PCI_DEV_FLAGS_PCI_BRIDGE_NO_ALIAS = (__force pci_dev_flags_t) (1 << 14),
>> +	/* Device PME is broken or unreliable */
>> +	PCI_DEV_FLAGS_PME_UNRELIABLE = (__force pci_dev_flags_t) (1 << 15),
>>  };
>>  
>>  enum pci_irq_reroute_variant {
>> -- 
>> 2.43.0
>>
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


  reply	other threads:[~2026-03-24 21:45 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-16 22:08 [PATCH v3] PCI: pciehp: Fix hotplug on Catlow Lake with unreliable PME status Kuppuswamy Sathyanarayanan
2026-03-23 12:53 ` Lukas Wunner
2026-03-23 23:24 ` Bjorn Helgaas
2026-03-24 21:45   ` Kuppuswamy Sathyanarayanan [this message]
2026-03-24 23:46     ` Bjorn Helgaas
2026-03-25  5:56     ` Lukas Wunner
2026-03-25 23:21       ` Bjorn Helgaas
2026-03-25  6:11     ` Mika Westerberg
2026-03-25 21:12       ` Kuppuswamy Sathyanarayanan
2026-03-26  6:12         ` Mika Westerberg
2026-03-26 21:23           ` Kuppuswamy Sathyanarayanan
2026-03-27 11:16             ` Mika Westerberg
2026-04-03 19:37               ` Kuppuswamy Sathyanarayanan
2026-04-07  7:08                 ` Mika Westerberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ca15feda-5d63-405d-a4c4-00cbb9991fed@linux.intel.com \
    --to=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=helgaas@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=mika.westerberg@linux.intel.com \
    --cc=rafael@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.