* [PATCH] PCI: remove call pci_save_aspm_l1ss_state() from pci_save_pcie_state()
@ 2025-07-07 11:52 weilinghan
2025-07-07 19:49 ` Bjorn Helgaas
0 siblings, 1 reply; 3+ messages in thread
From: weilinghan @ 2025-07-07 11:52 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: linux-kernel, hulingchen, weilinghan, weipengliang
During the suspend-resume process, PCIe resumes by enabling L1.2
in the pci_restore_state function due to patch 4ff116d0d5fd.
However, in the following scenario, the resume process
becomes very time-consuming:
1.The platform has multiple PCI buses.
2.The link transition time from L1.2 to L0 exceeds 100 microseconds by
accessing the configuration space of the EP.
3.The PCI framework has async_suspend enabled
(by calling device_enable_async_suspend(&dev->dev)
in pci_pm_init(struct pci_dev *dev)).
4.On ARM platforms, CONFIG_PCI_LOCKLESS_CONFIG is not enabled, which means
the pci_bus_read_config_##size interfaces contain locks (spinlock).
Practical measurements show that enabling L1.2 during the
resume process introduces an additional delay of approximately
150ms in the pci_pm_resume_noirq() function for platforms
with two PCI buses, compared to when L1.2 is disabled.
Signed-off-by: weilinghan <weilinghan@xiaomi.com>
---
drivers/pci/pci.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 9e42090fb108..0834211b0f8c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1708,7 +1708,6 @@ static int pci_save_pcie_state(struct pci_dev *dev)
pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &cap[i++]);
pcie_capability_read_word(dev, PCI_EXP_SLTCTL2, &cap[i++]);
- pci_save_aspm_l1ss_state(dev);
pci_save_ltr_state(dev);
return 0;
@@ -1725,7 +1724,6 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
* LTR itself in PCI_EXP_DEVCTL2.
*/
pci_restore_ltr_state(dev);
- pci_restore_aspm_l1ss_state(dev);
save_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP);
if (!save_state)
--
2.43.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] PCI: remove call pci_save_aspm_l1ss_state() from pci_save_pcie_state()
2025-07-07 11:52 [PATCH] PCI: remove call pci_save_aspm_l1ss_state() from pci_save_pcie_state() weilinghan
@ 2025-07-07 19:49 ` Bjorn Helgaas
2025-07-08 5:53 ` weilinghan
0 siblings, 1 reply; 3+ messages in thread
From: Bjorn Helgaas @ 2025-07-07 19:49 UTC (permalink / raw)
To: weilinghan
Cc: Bjorn Helgaas, linux-kernel, hulingchen, weipengliang,
Vidya Sagar
[+cc Vidya, author of 4ff116d0d5fd]
On Mon, Jul 07, 2025 at 07:52:36PM +0800, weilinghan wrote:
> During the suspend-resume process, PCIe resumes by enabling L1.2
> in the pci_restore_state function due to patch 4ff116d0d5fd.
> However, in the following scenario, the resume process
> becomes very time-consuming:
>
> 1.The platform has multiple PCI buses.
> 2.The link transition time from L1.2 to L0 exceeds 100 microseconds by
> accessing the configuration space of the EP.
> 3.The PCI framework has async_suspend enabled
> (by calling device_enable_async_suspend(&dev->dev)
> in pci_pm_init(struct pci_dev *dev)).
> 4.On ARM platforms, CONFIG_PCI_LOCKLESS_CONFIG is not enabled, which means
> the pci_bus_read_config_##size interfaces contain locks (spinlock).
>
> Practical measurements show that enabling L1.2 during the
> resume process introduces an additional delay of approximately
> 150ms in the pci_pm_resume_noirq() function for platforms
> with two PCI buses, compared to when L1.2 is disabled.
We really need an argument for why this change would be correct, not
just the fact that it makes resume faster. Vidya made the change in
4ff116d0d5fd to fix a problem, and it looks like this patch would
reintroduce the problem.
Nits:
- Look at previous history and follow the subject line convention.
- Add "()" after function names, e.g., pci_restore_state().
- Omit "the ... function"; it's sufficient to say "in
pci_restore_state()" and "in pci_pm_resume_noirq()".
- Omit function parameter info since it's not really relevant here.
- Cite commits as: 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates
Capability for suspend/resume")
- s/4.0n/4/
> Signed-off-by: weilinghan <weilinghan@xiaomi.com>
> ---
> drivers/pci/pci.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9e42090fb108..0834211b0f8c 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1708,7 +1708,6 @@ static int pci_save_pcie_state(struct pci_dev *dev)
> pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &cap[i++]);
> pcie_capability_read_word(dev, PCI_EXP_SLTCTL2, &cap[i++]);
>
> - pci_save_aspm_l1ss_state(dev);
> pci_save_ltr_state(dev);
>
> return 0;
> @@ -1725,7 +1724,6 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
> * LTR itself in PCI_EXP_DEVCTL2.
> */
> pci_restore_ltr_state(dev);
> - pci_restore_aspm_l1ss_state(dev);
>
> save_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP);
> if (!save_state)
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] PCI: remove call pci_save_aspm_l1ss_state() from pci_save_pcie_state()
2025-07-07 19:49 ` Bjorn Helgaas
@ 2025-07-08 5:53 ` weilinghan
0 siblings, 0 replies; 3+ messages in thread
From: weilinghan @ 2025-07-08 5:53 UTC (permalink / raw)
To: helgaas
Cc: bhelgaas, hulingchen, linux-kernel, vidyas, weilinghan,
weipengliang
On Mon, 7 Jul 2025 14:49:03 -0500, Bjorn Helgaas wrote:
>On Mon, Jul 07, 2025 at 07:52:36PM +0800, weilinghan wrote:
>> During the suspend-resume process, PCIe resumes by enabling L1.2 in
>> the pci_restore_state function due to patch 4ff116d0d5fd.
>> However, in the following scenario, the resume process becomes very
>> time-consuming:
>>
>> 1.The platform has multiple PCI buses.
>> 2.The link transition time from L1.2 to L0 exceeds 100 microseconds by
>> accessing the configuration space of the EP.
>> 3.The PCI framework has async_suspend enabled (by calling
>> device_enable_async_suspend(&dev->dev)
>> in pci_pm_init(struct pci_dev *dev)).
>> 4.On ARM platforms, CONFIG_PCI_LOCKLESS_CONFIG is not enabled, which
>> means the pci_bus_read_config_##size interfaces contain locks (spinlock).
>>
>> Practical measurements show that enabling L1.2 during the resume
>> process introduces an additional delay of approximately 150ms in the
>> pci_pm_resume_noirq() function for platforms with two PCI buses,
>> compared to when L1.2 is disabled.
>We really need an argument for why this change would be correct, not just the fact that it makes resume faster. Vidya made the change in 4ff116d0d5fd to fix a problem, and it looks like this patch would reintroduce the problem.
Ok, I'm seeing lock contention issues when multiple PCI devices call pci_restore_state() during the resume_noirq phase.
This problem arises due to commit a1e4d72cd ("PM: Allow PCI devices to suspend/resume asynchronously"), which changed the noirq phase of PCI devices to be executed asynchronously. As a result, multiple PCI devices may attempt to restore configuration space concurrently, leading to contention on the PCI configuration lock.
Additionally, commit 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") by Vidya enables L1.2 state handling, which increases the time spent in the critical section, thereby further exacerbating the lock contention and increasing resume latency.
Currently, I'm considering a few possible approaches to address this:
1.In the driver, call device_disable_async_suspend() to prevent asynchronous suspend/resume for specific devices that are known to have contention issues.
2.Enable CONFIG_PCI_LOCKLESS_CONFIG on the ARM platform
3.Make dev_pm_skip_resume() return true for certain devices, skipping pci_restore_state() in the PCI core during resume.
4.revert commit a1e4d72cd ("PM: Allow PCI devices to suspend/resume asynchronously")
I'd appreciate any insights or recommendations from the community on the best way to proceed. Are there any preferred approaches for handling?
Thanks,
weilinghan
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-07-08 5:53 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-07 11:52 [PATCH] PCI: remove call pci_save_aspm_l1ss_state() from pci_save_pcie_state() weilinghan
2025-07-07 19:49 ` Bjorn Helgaas
2025-07-08 5:53 ` weilinghan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).