public inbox for linux-kernel@vger.kernel.org
* [PATCH] PCI/ASPM: Don't reconfigure ASPM entering low-power state
@ 2026-04-28  4:01 Carlos Bilbao (Lambda)
  2026-05-06 18:07 ` Bjorn Helgaas
  2026-05-06 18:10 ` Bjorn Helgaas
  0 siblings, 2 replies; 4+ messages in thread
From: Carlos Bilbao (Lambda) @ 2026-04-28  4:01 UTC (permalink / raw)
  To: bhelgaas; +Cc: eduardo.habkost, linux-pci, linux-kernel, bilbao, Carlos Bilbao

From: Carlos Bilbao <carlos.bilbao@kernel.org>

Reconfiguring ASPM when a device transitions to a low-power state can
enable L1.1/L1.2 substates on the PCIe link at a time when the device is
sleeping and may be unable to exit them. ASPM should be reconfigured on D0
entry (resume), not on the way down.

pci_set_low_power_state() calls pcie_aspm_pm_state_change() after writing
D3hot to PCI_PM_CTRL. pcie_aspm_pm_state_change() resets link->aspm_capable
to link->aspm_support and then calls pcie_config_aspm_path(), which can
enable ASPM L1.1/L1.2 substates on the PCIe link. If the device cannot
recover the link from L1.2 while in D3hot, subsequent config space reads
return 0xFFFF ("device inaccessible") and pci_power_up() fails with:

  Unable to change power state from D3hot to D0, device inaccessible

This was observed on NVIDIA H100 SXM5 GPUs bound to vfio-pci when Linux
runtime PM suspends them to D3hot: the GPU becomes permanently inaccessible
and disappears from the PCIe bus.

The call to pcie_aspm_pm_state_change() in pci_set_low_power_state() was
restored by commit f93e71aea6c6 ("Revert "PCI/ASPM: Remove
pcie_aspm_pm_state_change()""), which reverted commit 08d0cc5f3426
("PCI/ASPM: Remove pcie_aspm_pm_state_change()"). The revert was necessary
because the removal broke suspend/resume on certain platforms that required
ASPM to be reconfigured on D0 entry. However, the revert restored the call
in both pci_set_full_power_state() (D0 entry) and
pci_set_low_power_state() (low-power entry).

Only the D0-entry call is needed to fix the suspend/resume regression. The
low-power-entry call is harmful: reconfiguring ASPM immediately after
putting a device into D3hot can enable link substates that the device or
platform cannot exit while the device is sleeping.

Remove the pcie_aspm_pm_state_change() call from pci_set_low_power_state().
ASPM will still be reconfigured correctly when the device returns to D0 via
pci_set_full_power_state().

Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"")
Link: https://lore.kernel.org/r/20240102232550.1751655-1-helgaas@kernel.org
Signed-off-by: Carlos Bilbao (Lambda) <carlos.bilbao@kernel.org>
---
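Not for the commit log: a minimal userspace sketch of the ordering problem
described above. It is a toy model, not kernel code. Every toy_* name, the
struct layout, and the exits_l1ss_in_d3 flag are assumptions invented for
illustration, and real ASPM state selection (policy, aspm_capable,
per-state masks) is collapsed into a single boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_power { TOY_D0, TOY_D3HOT };

struct toy_dev {
	enum toy_power state;
	bool l1ss_supported;	/* stands in for link->aspm_support */
	bool l1ss_enabled;	/* L1.1/L1.2 currently enabled on the link */
	bool exits_l1ss_in_d3;	/* assumption: can the link wake while asleep? */
};

/* Stand-in for pcie_aspm_pm_state_change(): enable whatever is supported. */
static void toy_aspm_reconfigure(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->l1ss_enabled = dev->l1ss_supported;
}

/* Model a 16-bit PM control read; an unreachable device reads as all ones. */
static uint16_t toy_read_pmcsr(const struct toy_dev *dev)
{
	bool stuck = dev->state == TOY_D3HOT && dev->l1ss_enabled &&
		     !dev->exits_l1ss_in_d3;

	if (stuck)
		return 0xFFFF;
	/* Low two bits carry the power state: 0 = D0, 3 = D3hot. */
	return dev->state == TOY_D3HOT ? 0x0003 : 0x0000;
}

/* reconfigure_aspm = true models the code before this patch. */
static void toy_set_low_power(struct toy_dev *dev, bool reconfigure_aspm)
{
	dev->state = TOY_D3HOT;
	if (reconfigure_aspm)
		toy_aspm_reconfigure(dev);
}

/* Returns 0 on success, -1 if the device is inaccessible. */
static int toy_set_full_power(struct toy_dev *dev)
{
	if (toy_read_pmcsr(dev) == 0xFFFF)
		return -1;
	dev->state = TOY_D0;
	toy_aspm_reconfigure(dev);	/* the D0-entry call is kept */
	return 0;
}
```

In this model, a device that starts with substates off and cannot exit them
in D3hot makes toy_set_full_power() return -1 when toy_set_low_power() is
called with reconfigure_aspm = true. With reconfigure_aspm = false (this
patch), resume succeeds and the substates are enabled only once the device
is back in D0.
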
 drivers/pci/pci.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b2ccb8e122f2..8b47887019f9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1542,9 +1542,6 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
 				     pci_power_name(dev->current_state),
 				     pci_power_name(state));
 
-	if (dev->bus->self)
-		pcie_aspm_pm_state_change(dev->bus->self, locked);
-
 	return 0;
 }
 
-- 
2.50.1 (Apple Git-155)



Thread overview: 4+ messages
2026-04-28  4:01 [PATCH] PCI/ASPM: Don't reconfigure ASPM entering low-power state Carlos Bilbao (Lambda)
2026-05-06 18:07 ` Bjorn Helgaas
2026-05-06 18:10 ` Bjorn Helgaas
2026-05-07  2:51   ` Carlos Bilbao
