public inbox for linux-kernel@vger.kernel.org
* [PATCH] PCI/ASPM: Don't reconfigure ASPM entering low-power state
@ 2026-04-28  4:01 Carlos Bilbao (Lambda)
  2026-05-06 18:07 ` Bjorn Helgaas
  2026-05-06 18:10 ` Bjorn Helgaas
  0 siblings, 2 replies; 4+ messages in thread
From: Carlos Bilbao (Lambda) @ 2026-04-28  4:01 UTC
  To: bhelgaas; +Cc: eduardo.habkost, linux-pci, linux-kernel, bilbao, Carlos Bilbao

From: Carlos Bilbao <carlos.bilbao@kernel.org>

Reconfiguring ASPM when a device transitions to low-power state can enable
L1.1/L1.2 substates on the PCIe link at a time when the device is sleeping
and may be unable to exit them. ASPM should be reconfigured on D0 entry
(resume), not on the way down.

pci_set_low_power_state() calls pcie_aspm_pm_state_change() after writing
D3hot to PCI_PM_CTRL. pcie_aspm_pm_state_change() resets link->aspm_capable
to link->aspm_support and then calls pcie_config_aspm_path(), which can
enable ASPM L1.1/L1.2 substates on the PCIe link. If the device cannot
recover the link from L1.2 while in D3hot, subsequent config space reads
return 0xFFFF ("device inaccessible") and pci_power_up() fails with message
"Unable to change power state from D3hot to D0, device inaccessible".

This was observed on NVIDIA H100 SXM5 GPUs bound to vfio-pci when Linux
runtime PM suspends them to D3hot: the GPU becomes permanently inaccessible
and disappears from the PCIe bus.

The call to pcie_aspm_pm_state_change() in pci_set_low_power_state() was
restored by commit f93e71aea6c6 ("Revert "PCI/ASPM: Remove
pcie_aspm_pm_state_change()""), which reverted
commit 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()").
The revert was necessary because the removal broke suspend/resume on certain
platforms that required ASPM to be reconfigured on D0 entry. However, the
revert restored the call in both
pci_set_full_power_state() (D0 entry) and pci_set_low_power_state()
(low-power entry).

Only the D0-entry call is needed to fix the suspend/resume regression. The
low-power-entry call is harmful: reconfiguring ASPM immediately after
putting a device into D3hot can enable link substates that the device or
platform cannot exit while the device is sleeping.

Remove the pcie_aspm_pm_state_change() call from pci_set_low_power_state().
ASPM will still be reconfigured correctly when the device returns to D0 via
pci_set_full_power_state().

Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"")
Link: https://lore.kernel.org/r/20240102232550.1751655-1-helgaas@kernel.org
Signed-off-by: Carlos Bilbao (Lambda) <carlos.bilbao@kernel.org>
---
 drivers/pci/pci.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b2ccb8e122f2..8b47887019f9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1542,9 +1542,6 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
 				     pci_power_name(dev->current_state),
 				     pci_power_name(state));
 
-	if (dev->bus->self)
-		pcie_aspm_pm_state_change(dev->bus->self, locked);
-
 	return 0;
 }
 
-- 
2.50.1 (Apple Git-155)
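
The control flow described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every `toy_*` name is hypothetical, the link is reduced to two booleans, and `toy_aspm_pm_state_change()` merely imitates the described behavior of pcie_aspm_pm_state_change() (re-enabling whatever the link supports). It models the hazard the patch targets: L1 substates becoming enabled while the device is already in D3hot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for kernel state (not the real structs). */
enum toy_power_state { TOY_D0, TOY_D3HOT };

struct toy_link {
	bool l1ss_supported;	/* models L1.1/L1.2 being part of link->aspm_support */
	bool l1ss_enabled;	/* models what is currently programmed on the link */
};

struct toy_dev {
	enum toy_power_state state;
	struct toy_link *link;
};

/* Imitates the described reconfiguration: enable everything supported. */
static void toy_aspm_pm_state_change(struct toy_link *link)
{
	link->l1ss_enabled = link->l1ss_supported;
}

/* reconfigure_aspm == true models the code before the patch, false after it. */
static void toy_set_low_power_state(struct toy_dev *dev, bool reconfigure_aspm)
{
	dev->state = TOY_D3HOT;
	if (reconfigure_aspm)
		toy_aspm_pm_state_change(dev->link);
}

/* D0 entry keeps its reconfiguration in both the old and the new code. */
static void toy_set_full_power_state(struct toy_dev *dev)
{
	dev->state = TOY_D0;
	toy_aspm_pm_state_change(dev->link);
}

/* The hazard: substates enabled while the device is sleeping. */
static bool toy_link_at_risk(const struct toy_dev *dev)
{
	return dev->state == TOY_D3HOT && dev->link->l1ss_enabled;
}

/* Walks through both variants; all assertions hold in this model. */
static void toy_self_check(void)
{
	struct toy_link link = { .l1ss_supported = true, .l1ss_enabled = false };
	struct toy_dev dev = { .state = TOY_D0, .link = &link };

	toy_set_low_power_state(&dev, true);	/* before the patch */
	assert(toy_link_at_risk(&dev));

	link.l1ss_enabled = false;
	toy_set_low_power_state(&dev, false);	/* after the patch */
	assert(!toy_link_at_risk(&dev));

	toy_set_full_power_state(&dev);		/* resume still reconfigures ASPM */
	assert(link.l1ss_enabled && !toy_link_at_risk(&dev));
}
```

Calling `toy_self_check()` from a `main()` exercises both paths. The model says nothing about whether a given device can actually exit L1.2 from D3hot; it only shows where the reconfiguration happens relative to the power-state change.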



* Re: [PATCH] PCI/ASPM: Don't reconfigure ASPM entering low-power state
  2026-04-28  4:01 [PATCH] PCI/ASPM: Don't reconfigure ASPM entering low-power state Carlos Bilbao (Lambda)
@ 2026-05-06 18:07 ` Bjorn Helgaas
  2026-05-06 18:10 ` Bjorn Helgaas
  1 sibling, 0 replies; 4+ messages in thread
From: Bjorn Helgaas @ 2026-05-06 18:07 UTC
  To: Carlos Bilbao (Lambda)
  Cc: bhelgaas, eduardo.habkost, linux-pci, linux-kernel, bilbao,
	Kai-Heng Feng, Michael Schaller, David E. Box,
	Manivannan Sadhasivam

[+cc Kai-Heng, Michael, David, Mani]

On Mon, Apr 27, 2026 at 09:01:04PM -0700, Carlos Bilbao (Lambda) wrote:
> From: Carlos Bilbao <carlos.bilbao@kernel.org>
> 
> Reconfiguring ASPM when a device transitions to low-power state can enable
> L1.1/L1.2 substates on the PCIe link at a time when the device is sleeping
> and may be unable to exit them. ASPM should be reconfigured on D0 entry
> (resume), not on the way down.
> 
> pci_set_low_power_state() calls pcie_aspm_pm_state_change() after writing
> D3hot to PCI_PM_CTRL. pcie_aspm_pm_state_change() resets link->aspm_capable
> to link->aspm_support and then calls pcie_config_aspm_path(), which can
> enable ASPM L1.1/L1.2 substates on the PCIe link. If the device cannot
> recover the link from L1.2 while in D3hot, subsequent config space reads
> return 0xFFFF ("device inaccessible") and pci_power_up() fails with message
> "Unable to change power state from D3hot to D0, device inaccessible".
> 
> This was observed on NVIDIA H100 SXM5 GPUs bound to vfio-pci when Linux
> runtime PM suspends them to D3hot: the GPU becomes permanently inaccessible
> and disappears from the PCIe bus.
> 
> The call to pcie_aspm_pm_state_change() in pci_set_low_power_state() was
> restored by commit f93e71aea6c6 ("Revert "PCI/ASPM: Remove
> pcie_aspm_pm_state_change()""), which reverted
> commit 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()").
> The revert was necessary because the
> removal broke suspend/resume on certain platforms that required ASPM to be
> reconfigured on D0 entry. However, the revert restored the call in both
> pci_set_full_power_state() (D0 entry) and pci_set_low_power_state()
> (low-power entry).
> 
> Only the D0-entry call is needed to fix the suspend/resume regression. The
> low-power-entry call is harmful: reconfiguring ASPM immediately after
> putting a device into D3hot can enable link substates that the device or
> platform cannot exit while the device is sleeping.
> 
> Remove the pcie_aspm_pm_state_change() call from pci_set_low_power_state().
> ASPM will still be reconfigured correctly when the device returns to D0 via
> pci_set_full_power_state().

Sounds right to me.  I don't know why we would want to touch ASPM
during suspend to D3hot.  I could imagine disabling ASPM states
*before* that transition, but enabling new states at that point sounds
wrong.

Any comments, Kai-Heng and Michael?

I know your regression report was long ago, Michael,
(https://lore.kernel.org/all/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de/),
so likely not practical for you to test this change, but I would hate
to have your system break again.

> Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"")
> Link: https://lore.kernel.org/r/20240102232550.1751655-1-helgaas@kernel.org
> Signed-off-by: Carlos Bilbao (Lambda) <carlos.bilbao@kernel.org>
> ---
>  drivers/pci/pci.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b2ccb8e122f2..8b47887019f9 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1542,9 +1542,6 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
>  				     pci_power_name(dev->current_state),
>  				     pci_power_name(state));
>  
> -	if (dev->bus->self)
> -		pcie_aspm_pm_state_change(dev->bus->self, locked);
> -
>  	return 0;
>  }
>  
> -- 
> 2.50.1 (Apple Git-155)
> 


* Re: [PATCH] PCI/ASPM: Don't reconfigure ASPM entering low-power state
  2026-04-28  4:01 [PATCH] PCI/ASPM: Don't reconfigure ASPM entering low-power state Carlos Bilbao (Lambda)
  2026-05-06 18:07 ` Bjorn Helgaas
@ 2026-05-06 18:10 ` Bjorn Helgaas
  2026-05-07  2:51   ` Carlos Bilbao
  1 sibling, 1 reply; 4+ messages in thread
From: Bjorn Helgaas @ 2026-05-06 18:10 UTC
  To: Carlos Bilbao (Lambda)
  Cc: bhelgaas, eduardo.habkost, linux-pci, linux-kernel, bilbao

On Mon, Apr 27, 2026 at 09:01:04PM -0700, Carlos Bilbao (Lambda) wrote:
> From: Carlos Bilbao <carlos.bilbao@kernel.org>
> 
> Reconfiguring ASPM when a device transitions to low-power state can enable
> L1.1/L1.2 substates on the PCIe link at a time when the device is sleeping
> and may be unable to exit them. ASPM should be reconfigured on D0 entry
> (resume), not on the way down.
> 
> pci_set_low_power_state() calls pcie_aspm_pm_state_change() after writing
> D3hot to PCI_PM_CTRL. pcie_aspm_pm_state_change() resets link->aspm_capable
> to link->aspm_support and then calls pcie_config_aspm_path(), which can
> enable ASPM L1.1/L1.2 substates on the PCIe link. If the device cannot
> recover the link from L1.2 while in D3hot, subsequent config space reads
> return 0xFFFF ("device inaccessible") and pci_power_up() fails with message
> "Unable to change power state from D3hot to D0, device inaccessible".

Carlos, do you have a few lines of dmesg showing this issue that we
could quote to help people match the issue with this fix?


* Re: [PATCH] PCI/ASPM: Don't reconfigure ASPM entering low-power state
  2026-05-06 18:10 ` Bjorn Helgaas
@ 2026-05-07  2:51   ` Carlos Bilbao
  0 siblings, 0 replies; 4+ messages in thread
From: Carlos Bilbao @ 2026-05-07  2:51 UTC
  To: Bjorn Helgaas, Carlos Bilbao (Lambda)
  Cc: bhelgaas, eduardo.habkost, linux-pci, linux-kernel, bilbao

Hey Bjorn.

On 5/6/26 11:10, Bjorn Helgaas wrote:
> On Mon, Apr 27, 2026 at 09:01:04PM -0700, Carlos Bilbao (Lambda) wrote:
>> From: Carlos Bilbao <carlos.bilbao@kernel.org>
>>
>> Reconfiguring ASPM when a device transitions to low-power state can enable
>> L1.1/L1.2 substates on the PCIe link at a time when the device is sleeping
>> and may be unable to exit them. ASPM should be reconfigured on D0 entry
>> (resume), not on the way down.
>>
>> pci_set_low_power_state() calls pcie_aspm_pm_state_change() after writing
>> D3hot to PCI_PM_CTRL. pcie_aspm_pm_state_change() resets link->aspm_capable
>> to link->aspm_support and then calls pcie_config_aspm_path(), which can
>> enable ASPM L1.1/L1.2 substates on the PCIe link. If the device cannot
>> recover the link from L1.2 while in D3hot, subsequent config space reads
>> return 0xFFFF ("device inaccessible") and pci_power_up() fails with message
>> "Unable to change power state from D3hot to D0, device inaccessible".
> Carlos, do you have a few lines of dmesg showing this issue that we
> could quote to help people match the issue with this fix?


Thank you for reviewing this. Only the error message:

[160459.607156] vfio-pci 0000:5d:00.0: Unable to change power state from D3cold to D0, device inaccessible


Thanks,

Carlos
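
For readers trying to match a system against this report, one possible user-space check for the "config space reads return 0xFFFF" symptom is sketched below. It reads the first two bytes (the Vendor ID) of a device's sysfs `config` file and treats all ones as "inaccessible". The helper names are hypothetical, and the BDF would be taken from the dmesg line (for example 0000:5d:00.0). Two caveats, both assumptions about typical behavior rather than guarantees: reading config space through sysfs may itself wake a runtime-suspended device, and a device that has already dropped off the bus will have no sysfs entry at all, which this sketch reports as an error (-1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Vendor ID is the little-endian 16-bit field at config offset 0. */
static uint16_t cfg_vendor_id(const uint8_t *cfg)
{
	return (uint16_t)(cfg[0] | (cfg[1] << 8));
}

/* An all-ones Vendor ID is what a failed config read looks like. */
static bool cfg_looks_inaccessible(const uint8_t *cfg)
{
	return cfg_vendor_id(cfg) == 0xFFFF;
}

/*
 * Returns 1 if the device reads back all ones, 0 if it answers normally,
 * and -1 if the sysfs file cannot be opened or read.
 */
static int pci_dev_inaccessible(const char *bdf)
{
	char path[128];
	uint8_t cfg[2];
	size_t n;
	FILE *f;

	assert(bdf != NULL);
	snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/config", bdf);

	f = fopen(path, "rb");
	if (!f)
		return -1;
	n = fread(cfg, 1, sizeof(cfg), f);
	fclose(f);
	if (n != sizeof(cfg))
		return -1;

	return cfg_looks_inaccessible(cfg) ? 1 : 0;
}
```

A caller would pass the full domain:bus:device.function string, e.g. `pci_dev_inaccessible("0000:5d:00.0")`, and treat any nonzero result as worth a closer look.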



end of thread, other threads:[~2026-05-07  2:52 UTC | newest]

Thread overview: 4+ messages
2026-04-28  4:01 [PATCH] PCI/ASPM: Don't reconfigure ASPM entering low-power state Carlos Bilbao (Lambda)
2026-05-06 18:07 ` Bjorn Helgaas
2026-05-06 18:10 ` Bjorn Helgaas
2026-05-07  2:51   ` Carlos Bilbao
