* [PATCH] PCI/PM: Don't call pci_pm_power_up_and_verify_state() for devices already in D0
@ 2026-03-04 17:03 Rich Hanes
2026-03-04 17:52 ` Lukas Wunner
0 siblings, 1 reply; 2+ messages in thread
From: Rich Hanes @ 2026-03-04 17:03 UTC
To: bhelgaas; +Cc: linux-pci, linux-kernel, mario.limonciello, stable, Rich Hanes
Commit 4d4c10f763d7 ("PCI: Explicitly put devices into D0 when
initializing") calls pci_pm_power_up_and_verify_state() for every PCI
device at the end of pci_pm_init(). This causes a boot-time regression
on PCIe endpoints that the firmware leaves in D0.
The problem is not the _PS0 call itself — pci_power_up() returns early
when the hardware PM_CTRL register already reads D0. The problem is
that pci_pm_power_up_and_verify_state() unconditionally calls
pci_update_current_state() afterwards, which sets current_state to
PCI_D0 even for devices that were never actually transitioned.
pcie_aspm_init_link_state() later checks current_state when the
pcieport driver binds to the upstream port. From aspm.c:
    /* Spec says both ports must be in D0 before enabling PCI PM substates */
    if (parent->current_state != PCI_D0 || child->current_state != PCI_D0) {
    	state &= ~PCIE_LINK_STATE_L1_SS_PCIPM;
When current_state is PCI_D0 (set prematurely by pci_pm_init()), ASPM
enables L1 PM substates including L1.2. L1.2 removes the PCIe
reference clock from the link during idle periods, putting the
downstream device into a D3cold-equivalent state. Without a subsequent
_PS0 call to restore it, the device does not respond to config space
reads when the link returns to L0. pciehp observes this as:
    pciehp: Slot(0): Card not present  <- L1.2 entry during link retrain
    pciehp: Slot(0): Card present
    pciehp: Slot(0): Link Up
    pciehp: Slot(0): No device found   <- device unresponsive in D3cold
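
For illustration, the current_state gate quoted from aspm.c can be modelled in isolation. This is a minimal userspace sketch, not kernel code: the enum, the LINK_* bit values, and gate_pcipm_substates() are hypothetical stand-ins chosen for the example, and it covers only the masking step quoted above.

```c
#include <stdint.h>

/* Illustrative stand-ins for pci_power_t values (hypothetical names). */
enum power_state { ST_D0 = 0, ST_D3HOT = 3, ST_UNKNOWN = 5 };

/* Illustrative link-state bits; the real PCIE_LINK_STATE_* values differ. */
#define LINK_L1           (1u << 0)
#define LINK_L1_1_PCIPM   (1u << 1)
#define LINK_L1_2_PCIPM   (1u << 2)
#define LINK_L1_SS_PCIPM  (LINK_L1_1_PCIPM | LINK_L1_2_PCIPM)

/*
 * Model of the quoted check: PCI PM L1 substates stay in the requested
 * state mask only if both ends of the link are recorded as being in D0.
 */
uint32_t gate_pcipm_substates(uint32_t state,
			      enum power_state parent,
			      enum power_state child)
{
	if (parent != ST_D0 || child != ST_D0)
		state &= ~LINK_L1_SS_PCIPM;
	return state;
}
```

With child set to ST_UNKNOWN the substate bits are cleared and only LINK_L1 remains. With both ends at ST_D0 the mask passes through unchanged, which is the case the commit message argues is reached too early.
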
Reproduced on a Lenovo ThinkPad with Intel Wireless-AC 7265 (8086:095a)
behind PCIe root port 8086:9d10. The workaround pcie_aspm=off confirms
that suppressing L1 PM substate configuration prevents the failure.
The fix is to read PM_CTRL before calling
pci_pm_power_up_and_verify_state() and skip it when the hardware already
reports D0. This preserves the _REG notification for devices the firmware
leaves in a non-D0 state (the original motivation for the commit) while
leaving current_state as PCI_UNKNOWN for devices already in D0, which
correctly defers L1 PM substate enablement.
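
As a rough userspace sketch of the PM_CTRL test described above (not the patch itself): pmcsr_reports_d0() is a hypothetical helper, the constants mirror the kernel's PCI_PM_CTRL_STATE_MASK (0x0003) and PCI_D0 (0), and the all-ones comparison stands in for PCI_POSSIBLE_ERROR() on a 16-bit read.

```c
#include <stdbool.h>
#include <stdint.h>

#define PM_CTRL_STATE_MASK  0x0003u  /* mirrors PCI_PM_CTRL_STATE_MASK */
#define PM_STATE_D0         0x0000u  /* mirrors PCI_D0 */
#define CFG_READ_ERROR      0xffffu  /* all-ones: config read did not complete */

/*
 * Hypothetical helper: returns true only when a PMCSR value read from
 * config space is valid and its PowerState field encodes D0.
 */
bool pmcsr_reports_d0(uint16_t pmcsr)
{
	if (pmcsr == CFG_READ_ERROR)
		return false;
	return (pmcsr & PM_CTRL_STATE_MASK) == PM_STATE_D0;
}
```

Under this model, pci_pm_init() would call pci_pm_power_up_and_verify_state() only when pmcsr_reports_d0() returns false, so a failed read is treated the same as a non-D0 state.
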
Also add an early return to pci_power_up() to avoid calling
platform_pci_set_power_state() (_PS0) redundantly on devices whose
hardware state is already D0 and whose software state is PCI_UNKNOWN.
Fixes: 4d4c10f763d7 ("PCI: Explicitly put devices into D0 when initializing")
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rich Hanes <georgebastille@gmail.com>
---
drivers/pci/pci.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8479c2e1f74f..e2e11b2884d4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1301,6 +1301,13 @@ int pci_power_up(struct pci_dev *dev)
 	pci_power_t state;
 	u16 pmcsr;
 
+	if (dev->pm_cap && dev->current_state == PCI_UNKNOWN && !pci_dev_is_disconnected(dev)) {
+		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
+		if (!PCI_POSSIBLE_ERROR(pmcsr) && (pmcsr & PCI_PM_CTRL_STATE_MASK) == PCI_D0) {
+			dev->current_state = PCI_D0;
+			return 0;
+		}
+	}
 	platform_pci_set_power_state(dev, PCI_D0);
 
 	if (!dev->pm_cap) {
@@ -3191,7 +3198,27 @@ void pci_pm_init(struct pci_dev *dev)
 	}
 
 poweron:
-	pci_pm_power_up_and_verify_state(dev);
+	/*
+	 * Explicitly put the device into D0 to ensure platform power methods
+	 * such as _PS0 are called and _REG is notified. Only do this when
+	 * the hardware is not already in D0: pcie_aspm_init_link_state()
+	 * enables PCIe L1 PM substates (L1.1/L1.2) only when both the
+	 * downstream device and its parent report current_state == PCI_D0.
+	 * Calling pci_pm_power_up_and_verify_state() for a device already in
+	 * D0 would set current_state prematurely, enabling L1.2 before the
+	 * driver is ready and causing the device to enter a deep sleep state
+	 * it cannot exit without platform intervention (_PS0).
+	 */
+	if (dev->pm_cap) {
+		u16 pmcsr;
+
+		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
+		if (PCI_POSSIBLE_ERROR(pmcsr) ||
+		    (pmcsr & PCI_PM_CTRL_STATE_MASK) != PCI_D0)
+			pci_pm_power_up_and_verify_state(dev);
+	} else {
+		pci_pm_power_up_and_verify_state(dev);
+	}
 	pm_runtime_forbid(&dev->dev);
 
 	/*
--
2.53.0
* Re: [PATCH] PCI/PM: Don't call pci_pm_power_up_and_verify_state() for devices already in D0
2026-03-04 17:03 [PATCH] PCI/PM: Don't call pci_pm_power_up_and_verify_state() for devices already in D0 Rich Hanes
@ 2026-03-04 17:52 ` Lukas Wunner
0 siblings, 0 replies; 2+ messages in thread
From: Lukas Wunner @ 2026-03-04 17:52 UTC
To: Rich Hanes; +Cc: bhelgaas, linux-pci, linux-kernel, mario.limonciello, stable
On Wed, Mar 04, 2026 at 05:03:24PM +0000, Rich Hanes wrote:
> Reproduced on a Lenovo ThinkPad with Intel Wireless-AC 7265 (8086:095a)
> behind PCIe root port 8086:9d10. The workaround pcie_aspm=off confirms
> that suppressing L1 PM substate configuration prevents the failure.
The same issue was observed on a Google Pixelbook Eve:
https://bugzilla.kernel.org/show_bug.cgi?id=220705
That laptop also uses an i7265 attached to a Sunrise Point-LP PCH.
The PCH suffers from a hardware erratum: it doesn't reinstate the
clock quickly enough after CLKREQ# assertion to stay below the
spec-prescribed 400 nsec.
Intel's specification update recommends disabling CLKREQ# support
at the Root Port to work around the issue. This must be done by
the BIOS; an operating system patch isn't the right approach.
I've prepared a BIOS change for the coreboot BIOS used on the
Pixelbook, see the above-linked bugzilla. It has not been
upstreamed yet as I'm waiting on the reporter to test it.
If your ThinkPad can be made to boot with coreboot, I can look into
porting the patch over to your machine. Please specify the exact
ThinkPad model you're using. Otherwise, please check whether Lenovo
has released a BIOS update for your machine.
Thanks,
Lukas