* [PATCH v2 0/3] PCI: Universal error recoverability of devices
@ 2025-11-19 8:50 Lukas Wunner
2025-11-19 8:50 ` [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths Lukas Wunner
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Lukas Wunner @ 2025-11-19 8:50 UTC (permalink / raw)
To: Bjorn Helgaas, Rafael J. Wysocki
Cc: Riana Tauro, Sean C. Dardis, Farhan Ali, Benjamin Block,
Niklas Schnelle, Alek Du, Mahesh J Salgaonkar, Oliver OHalloran,
linuxppc-dev, linux-pci, linux-pm
This series intends to replace commit 1dc302f7fccc ("PCI: Ensure error
recoverability at all times") on the pci/err topic branch:
https://git.kernel.org/pci/pci/c/1dc302f7fccc
The commit is assigning "dev->state_saved = false" in pci_bus_add_device()
and during review there were requests to explain the assignment more
clearly in a code comment.
However the assignment is (only) necessitated by missing assignments in
pci_legacy_suspend() and pci_pm_freeze(), so I propose to instead add
*those* assignments (patch [1/3]) and thus avoid the need for the
assignment in pci_bus_add_device(), together with its code comment.
Furthermore the commit is *removing* an assignment in pci_device_add().
I am separating that out to new patch [2/3].
So patch [3/3] is identical to the commit, but without the addition
of an assignment in pci_bus_add_device() and without the removal
of an assignment in pci_device_add().
I am looking into improving the documentation on pci_save_state()
in a separate series.
Lukas Wunner (3):
PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths
PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
PCI/ERR: Ensure error recoverability at all times
drivers/pci/bus.c | 3 +++
drivers/pci/pci-driver.c | 6 ++++--
drivers/pci/pci.c | 3 ---
drivers/pci/probe.c | 2 --
4 files changed, 7 insertions(+), 7 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths
2025-11-19 8:50 [PATCH v2 0/3] PCI: Universal error recoverability of devices Lukas Wunner
@ 2025-11-19 8:50 ` Lukas Wunner
2025-11-19 21:08 ` Rafael J. Wysocki
2025-11-25 23:18 ` Bjorn Helgaas
2025-11-19 8:50 ` [PATCH v2 2/3] PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw Lukas Wunner
` (2 subsequent siblings)
3 siblings, 2 replies; 12+ messages in thread
From: Lukas Wunner @ 2025-11-19 8:50 UTC (permalink / raw)
To: Bjorn Helgaas, Rafael J. Wysocki
Cc: Riana Tauro, Sean C. Dardis, Farhan Ali, Benjamin Block,
Niklas Schnelle, Alek Du, Mahesh J Salgaonkar, Oliver OHalloran,
linuxppc-dev, linux-pci, linux-pm
When a PCI device is suspended, it is normally the PCI core's job to save
Config Space and put the device into a low power state. However drivers
are allowed to assume these responsibilities. When they do, the PCI core
can tell by looking at the state_saved flag in struct pci_dev: The flag
is cleared before commencing the suspend sequence and it is set when
pci_save_state() is called. If the PCI core finds the flag set late in
the suspend sequence, it refrains from calling pci_save_state() itself.
But there are two corner cases where the PCI core neglects to clear the
flag before commencing the suspend sequence:
* If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects
to clear the flag. The (stale) flag is subsequently queried by
pci_legacy_suspend() itself and pci_legacy_suspend_late().
* If a device has no driver or its driver has no PCI PM callbacks,
pci_pm_freeze() neglects to clear the flag. The (stale) flag is
subsequently queried by pci_pm_freeze_noirq().
The flag may be set prior to suspend if the device went through error
recovery: Drivers commonly invoke pci_restore_state() + pci_save_state()
to restore Config Space after reset.
The flag may also be set if drivers call pci_save_state() on probe to
allow for recovery from subsequent errors.
The result is that pci_legacy_suspend_late() and pci_pm_freeze_noirq()
don't call pci_save_state() and so the state that will be restored on
resume is the one recorded on last error recovery or on probe, not the one
that the device had on suspend. If the two states happen to be identical,
there's no problem.
Reinstate clearing the flag in pci_legacy_suspend() and pci_pm_freeze().
The two functions used to do that until commit 4b77b0a2ba27 ("PCI: Clear
saved_state after the state has been restored") deemed it unnecessary
because it assumed that it's sufficient to clear the flag on resume in
pci_restore_state(). The commit seemingly did not take into account that
pci_save_state() and pci_restore_state() are not only used by power
management code, but also for error recovery.
Devices without driver or whose driver has no PCI PM callbacks may be in
runtime suspend when pci_pm_freeze() is called. Their state has already
been saved, so don't clear the flag to skip a pointless pci_save_state()
in pci_pm_freeze_noirq().
None of the drivers with legacy PCI PM callbacks seem to use runtime PM,
so clear the flag unconditionally in their case.
Fixes: 4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v2.6.32+
---
drivers/pci/pci-driver.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 302d61783f6c..327b21c48614 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -629,6 +629,8 @@ static int pci_legacy_suspend(struct device *dev, pm_message_t state)
struct pci_dev *pci_dev = to_pci_dev(dev);
struct pci_driver *drv = pci_dev->driver;
+ pci_dev->state_saved = false;
+
if (drv && drv->suspend) {
pci_power_t prev = pci_dev->current_state;
int error;
@@ -1036,6 +1038,8 @@ static int pci_pm_freeze(struct device *dev)
if (!pm) {
pci_pm_default_suspend(pci_dev);
+ if (!pm_runtime_suspended(dev))
+ pci_dev->state_saved = false;
return 0;
}
--
2.51.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 2/3] PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
2025-11-19 8:50 [PATCH v2 0/3] PCI: Universal error recoverability of devices Lukas Wunner
2025-11-19 8:50 ` [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths Lukas Wunner
@ 2025-11-19 8:50 ` Lukas Wunner
2025-11-19 21:09 ` Rafael J. Wysocki
2025-11-19 8:50 ` [PATCH v2 3/3] PCI/ERR: Ensure error recoverability at all times Lukas Wunner
2025-11-25 21:03 ` [PATCH v2 0/3] PCI: Universal error recoverability of devices Bjorn Helgaas
3 siblings, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2025-11-19 8:50 UTC (permalink / raw)
To: Bjorn Helgaas, Rafael J. Wysocki
Cc: Riana Tauro, Sean C. Dardis, Farhan Ali, Benjamin Block,
Niklas Schnelle, Alek Du, Mahesh J Salgaonkar, Oliver OHalloran,
linuxppc-dev, linux-pci, linux-pm
The state_saved flag tells the PCI core whether a driver assumes
responsibility to save Config Space and put the device into a low power
state on suspend.
The flag is currently initialized to false on enumeration, even though it
already is false (because struct pci_dev is zeroed by kzalloc()) and even
though it is set to false before commencing the suspend sequence (the only
code path where it's relevant).
The flag is also set to false in pci_pm_thaw(), i.e. on resume, when it's
no longer relevant.
Drop these two superfluous flag assignments for simplicity.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
drivers/pci/pci-driver.c | 2 --
drivers/pci/probe.c | 2 --
2 files changed, 4 deletions(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 327b21c48614..7c2d9d596258 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1133,8 +1133,6 @@ static int pci_pm_thaw(struct device *dev)
pci_pm_reenable_device(pci_dev);
}
- pci_dev->state_saved = false;
-
return error;
}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c83e75a0ec12..c7c7a3d5ec0f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2747,8 +2747,6 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
pci_reassigndev_resource_alignment(dev);
- dev->state_saved = false;
-
pci_init_capabilities(dev);
/*
--
2.51.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 3/3] PCI/ERR: Ensure error recoverability at all times
2025-11-19 8:50 [PATCH v2 0/3] PCI: Universal error recoverability of devices Lukas Wunner
2025-11-19 8:50 ` [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths Lukas Wunner
2025-11-19 8:50 ` [PATCH v2 2/3] PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw Lukas Wunner
@ 2025-11-19 8:50 ` Lukas Wunner
2025-11-25 21:03 ` [PATCH v2 0/3] PCI: Universal error recoverability of devices Bjorn Helgaas
3 siblings, 0 replies; 12+ messages in thread
From: Lukas Wunner @ 2025-11-19 8:50 UTC (permalink / raw)
To: Bjorn Helgaas, Rafael J. Wysocki
Cc: Riana Tauro, Sean C. Dardis, Farhan Ali, Benjamin Block,
Niklas Schnelle, Alek Du, Mahesh J Salgaonkar, Oliver OHalloran,
linuxppc-dev, linux-pci, linux-pm
When the PCI core gained power management support in 2002, it introduced
pci_save_state() and pci_restore_state() helpers to restore Config Space
after a D3hot or D3cold transition, which implies a Soft or Fundamental
Reset (PCIe r7.0 sec 5.8):
https://git.kernel.org/tglx/history/c/a5287abe398b
In 2006, EEH and AER were introduced to recover from errors by performing
a reset. Because errors can occur at any time, drivers began calling
pci_save_state() on probe to ensure recoverability.
In 2009, recoverability was foiled by commit c82f63e411f1 ("PCI: check
saved state before restore"): It amended pci_restore_state() to bail out
if the "state_saved" flag has been cleared. The flag is cleared by
pci_restore_state() itself, hence a saved state is now allowed to be
restored only once and is then invalidated. That doesn't seem to make
sense because the saved state should be good enough to be reused.
Soon after, drivers began to work around this behavior by calling
pci_save_state() immediately after pci_restore_state(), see e.g. commit
b94f2d775a71 ("igb: call pci_save_state after pci_restore_state").
Hilariously, two drivers even set the "saved_state" flag to true before
invoking pci_restore_state(), see ipr_reset_restore_cfg_space() and
e1000_io_slot_reset().
Despite these workarounds, recoverability at all times is not guaranteed:
E.g. when a PCIe port goes through a runtime suspend and resume cycle,
the "saved_state" flag is cleared by:
pci_pm_runtime_resume()
pci_pm_default_resume_early()
pci_restore_state()
... and hence on a subsequent AER event, the port's Config Space cannot be
restored. Riana reports a recovery failure of a GPU-integrated PCIe
switch and has root-caused it to the behavior of pci_restore_state().
Another workaround would be necessary, namely calling pci_save_state() in
pcie_port_device_runtime_resume().
The motivation of commit c82f63e411f1 was to prevent restoring state if
pci_save_state() hasn't been called before. But that can be achieved by
saving state already on device addition, after Config Space has been
initialized. A desirable side effect is that devices become recoverable
even if no driver gets bound. This renders the commit unnecessary, so
revert it.
Reported-by: Riana Tauro <riana.tauro@intel.com> # off-list
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
---
drivers/pci/bus.c | 3 +++
drivers/pci/pci.c | 3 ---
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index f26aec6ff588..9daf13ed3714 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -357,6 +357,9 @@ void pci_bus_add_device(struct pci_dev *dev)
pci_proc_attach_device(dev);
pci_bridge_d3_update(dev);
+ /* Save config space for error recoverability */
+ pci_save_state(dev);
+
/*
* If the PCI device is associated with a pwrctrl device with a
* power supply, create a device link between the PCI device and
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006c..2f0da5dbbba4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1855,9 +1855,6 @@ static void pci_restore_rebar_state(struct pci_dev *pdev)
*/
void pci_restore_state(struct pci_dev *dev)
{
- if (!dev->state_saved)
- return;
-
pci_restore_pcie_state(dev);
pci_restore_pasid_state(dev);
pci_restore_pri_state(dev);
--
2.51.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths
2025-11-19 8:50 ` [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths Lukas Wunner
@ 2025-11-19 21:08 ` Rafael J. Wysocki
2025-11-25 23:18 ` Bjorn Helgaas
1 sibling, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2025-11-19 21:08 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Rafael J. Wysocki, Riana Tauro, Sean C. Dardis,
Farhan Ali, Benjamin Block, Niklas Schnelle, Alek Du,
Mahesh J Salgaonkar, Oliver OHalloran, linuxppc-dev, linux-pci,
linux-pm
On Wed, Nov 19, 2025 at 9:59 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> When a PCI device is suspended, it is normally the PCI core's job to save
> Config Space and put the device into a low power state. However drivers
> are allowed to assume these responsibilities. When they do, the PCI core
> can tell by looking at the state_saved flag in struct pci_dev: The flag
> is cleared before commencing the suspend sequence and it is set when
> pci_save_state() is called. If the PCI core finds the flag set late in
> the suspend sequence, it refrains from calling pci_save_state() itself.
>
> But there are two corner cases where the PCI core neglects to clear the
> flag before commencing the suspend sequence:
>
> * If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects
> to clear the flag. The (stale) flag is subsequently queried by
> pci_legacy_suspend() itself and pci_legacy_suspend_late().
>
> * If a device has no driver or its driver has no PCI PM callbacks,
> pci_pm_freeze() neglects to clear the flag. The (stale) flag is
> subsequently queried by pci_pm_freeze_noirq().
>
> The flag may be set prior to suspend if the device went through error
> recovery: Drivers commonly invoke pci_restore_state() + pci_save_state()
> to restore Config Space after reset.
>
> The flag may also be set if drivers call pci_save_state() on probe to
> allow for recovery from subsequent errors.
>
> The result is that pci_legacy_suspend_late() and pci_pm_freeze_noirq()
> don't call pci_save_state() and so the state that will be restored on
> resume is the one recorded on last error recovery or on probe, not the one
> that the device had on suspend. If the two states happen to be identical,
> there's no problem.
>
> Reinstate clearing the flag in pci_legacy_suspend() and pci_pm_freeze().
> The two functions used to do that until commit 4b77b0a2ba27 ("PCI: Clear
> saved_state after the state has been restored") deemed it unnecessary
> because it assumed that it's sufficient to clear the flag on resume in
> pci_restore_state(). The commit seemingly did not take into account that
> pci_save_state() and pci_restore_state() are not only used by power
> management code, but also for error recovery.
That's right, it didn't.
> Devices without driver or whose driver has no PCI PM callbacks may be in
> runtime suspend when pci_pm_freeze() is called. Their state has already
> been saved, so don't clear the flag to skip a pointless pci_save_state()
> in pci_pm_freeze_noirq().
>
> None of the drivers with legacy PCI PM callbacks seem to use runtime PM,
> so clear the flag unconditionally in their case.
>
> Fixes: 4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored")
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v2.6.32+
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
> ---
> drivers/pci/pci-driver.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 302d61783f6c..327b21c48614 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -629,6 +629,8 @@ static int pci_legacy_suspend(struct device *dev, pm_message_t state)
> struct pci_dev *pci_dev = to_pci_dev(dev);
> struct pci_driver *drv = pci_dev->driver;
>
> + pci_dev->state_saved = false;
> +
> if (drv && drv->suspend) {
> pci_power_t prev = pci_dev->current_state;
> int error;
> @@ -1036,6 +1038,8 @@ static int pci_pm_freeze(struct device *dev)
>
> if (!pm) {
> pci_pm_default_suspend(pci_dev);
> + if (!pm_runtime_suspended(dev))
> + pci_dev->state_saved = false;
> return 0;
> }
>
> --
> 2.51.0
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/3] PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
2025-11-19 8:50 ` [PATCH v2 2/3] PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw Lukas Wunner
@ 2025-11-19 21:09 ` Rafael J. Wysocki
0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2025-11-19 21:09 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Rafael J. Wysocki, Riana Tauro, Sean C. Dardis,
Farhan Ali, Benjamin Block, Niklas Schnelle, Alek Du,
Mahesh J Salgaonkar, Oliver OHalloran, linuxppc-dev, linux-pci,
linux-pm
On Wed, Nov 19, 2025 at 10:02 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> The state_saved flag tells the PCI core whether a driver assumes
> responsibility to save Config Space and put the device into a low power
> state on suspend.
>
> The flag is currently initialized to false on enumeration, even though it
> already is false (because struct pci_dev is zeroed by kzalloc()) and even
> though it is set to false before commencing the suspend sequence (the only
> code path where it's relevant).
>
> The flag is also set to false in pci_pm_thaw(), i.e. on resume, when it's
> no longer relevant.
>
> Drop these two superfluous flag assignments for simplicity.
>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
> ---
> drivers/pci/pci-driver.c | 2 --
> drivers/pci/probe.c | 2 --
> 2 files changed, 4 deletions(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 327b21c48614..7c2d9d596258 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1133,8 +1133,6 @@ static int pci_pm_thaw(struct device *dev)
> pci_pm_reenable_device(pci_dev);
> }
>
> - pci_dev->state_saved = false;
> -
> return error;
> }
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index c83e75a0ec12..c7c7a3d5ec0f 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2747,8 +2747,6 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
>
> pci_reassigndev_resource_alignment(dev);
>
> - dev->state_saved = false;
> -
> pci_init_capabilities(dev);
>
> /*
> --
> 2.51.0
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/3] PCI: Universal error recoverability of devices
2025-11-19 8:50 [PATCH v2 0/3] PCI: Universal error recoverability of devices Lukas Wunner
` (2 preceding siblings ...)
2025-11-19 8:50 ` [PATCH v2 3/3] PCI/ERR: Ensure error recoverability at all times Lukas Wunner
@ 2025-11-25 21:03 ` Bjorn Helgaas
3 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2025-11-25 21:03 UTC (permalink / raw)
To: Lukas Wunner
Cc: Rafael J. Wysocki, Riana Tauro, Sean C. Dardis, Farhan Ali,
Benjamin Block, Niklas Schnelle, Alek Du, Mahesh J Salgaonkar,
Oliver OHalloran, linuxppc-dev, linux-pci, linux-pm
On Wed, Nov 19, 2025 at 09:50:00AM +0100, Lukas Wunner wrote:
> This series intends to replace commit 1dc302f7fccc ("PCI: Ensure error
> recoverability at all times") on the pci/err topic branch:
>
> https://git.kernel.org/pci/pci/c/1dc302f7fccc
>
> The commit is assigning "dev->state_saved = false" in pci_bus_add_device()
> and during review there were requests to explain the assignment more
> clearly in a code comment.
>
> However the assignment is (only) necessitated by missing assignments in
> pci_legacy_suspend() and pci_pm_freeze(), so I propose to instead add
> *those* assignments (patch [1/3]) and thus avoid the need for the
> assignment in pci_bus_add_device(), together with its code comment.
>
> Furthermore the commit is *removing* an assignment in pci_device_add().
> I am separating that out to new patch [2/3].
>
> So patch [3/3] is identical to the commit, but without the addition
> of an assignment in pci_bus_add_device() and without the removal
> of an assignment in pci_device_add().
>
> I am looking into improving the documentation on pci_save_state()
> in a separate series.
>
> Lukas Wunner (3):
> PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths
> PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
> PCI/ERR: Ensure error recoverability at all times
>
> drivers/pci/bus.c | 3 +++
> drivers/pci/pci-driver.c | 6 ++++--
> drivers/pci/pci.c | 3 ---
> drivers/pci/probe.c | 2 --
> 4 files changed, 7 insertions(+), 7 deletions(-)
Applied on pci/err for v6.19, thanks!
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths
2025-11-19 8:50 ` [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths Lukas Wunner
2025-11-19 21:08 ` Rafael J. Wysocki
@ 2025-11-25 23:18 ` Bjorn Helgaas
2025-11-26 12:49 ` Lukas Wunner
1 sibling, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2025-11-25 23:18 UTC (permalink / raw)
To: Lukas Wunner
Cc: Rafael J. Wysocki, Riana Tauro, Sean C. Dardis, Farhan Ali,
Benjamin Block, Niklas Schnelle, Alek Du, Mahesh J Salgaonkar,
Oliver OHalloran, linuxppc-dev, linux-pci, linux-pm
On Wed, Nov 19, 2025 at 09:50:01AM +0100, Lukas Wunner wrote:
> When a PCI device is suspended, it is normally the PCI core's job to save
> Config Space and put the device into a low power state. However drivers
> are allowed to assume these responsibilities. When they do, the PCI core
> can tell by looking at the state_saved flag in struct pci_dev: The flag
> is cleared before commencing the suspend sequence and it is set when
> pci_save_state() is called. If the PCI core finds the flag set late in
> the suspend sequence, it refrains from calling pci_save_state() itself.
This has been applied already, no issue there. I have a few questions
to help come up with a short higher-level merge commit log.
> But there are two corner cases where the PCI core neglects to clear the
> flag before commencing the suspend sequence:
>
> * If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects
> to clear the flag. The (stale) flag is subsequently queried by
> pci_legacy_suspend() itself and pci_legacy_suspend_late().
>
> * If a device has no driver or its driver has no PCI PM callbacks,
> pci_pm_freeze() neglects to clear the flag. The (stale) flag is
> subsequently queried by pci_pm_freeze_noirq().
>
> The flag may be set prior to suspend if the device went through error
> recovery: Drivers commonly invoke pci_restore_state() + pci_save_state()
> to restore Config Space after reset.
I guess the only point of pci_save_state() in this case is to set
state_saved again so a future pci_restore_state() will work, right?
The actual state being *saved* is pointless, assuming pci_save_state()
saves exactly the same data that pci_restore_state() restored.
And these are the pci_save_state() calls you removed with "treewide:
Drop pci_save_state() after pci_restore_state()". Too bad we have to
document the behavior we're about to change, but that's what we need
to do. It's just a little clutter to keep in mind for this release.
> The flag may also be set if drivers call pci_save_state() on probe to
> allow for recovery from subsequent errors.
>
> The result is that pci_legacy_suspend_late() and pci_pm_freeze_noirq()
> don't call pci_save_state() and so the state that will be restored on
> resume is the one recorded on last error recovery or on probe, not the one
> that the device had on suspend. If the two states happen to be identical,
> there's no problem.
So IIUC the effect is that after this change and the "treewide"
change,
- If the driver uses legacy PM, the state restored on resume will be
the state from suspend instead of the state on probe.
- For devices with no driver or a driver without PM, if the device
has already been runtime-suspended, we avoid a pointless
pci_save_state(), so it's an optimization and not logically
related to the legacy PM case.
Right?
I'm thinking of something like this for the merge commit and eventual
pull request; please correct me if this isn't right:
Restore the suspend config state, not the state from probe or error
recovery, for drivers using legacy PCI suspend.
Avoid saving config state again for devices without driver PM if
their state was already saved by runtime suspend.
> Reinstate clearing the flag in pci_legacy_suspend() and pci_pm_freeze().
> The two functions used to do that until commit 4b77b0a2ba27 ("PCI: Clear
> saved_state after the state has been restored") deemed it unnecessary
> because it assumed that it's sufficient to clear the flag on resume in
> pci_restore_state(). The commit seemingly did not take into account that
> pci_save_state() and pci_restore_state() are not only used by power
> management code, but also for error recovery.
>
> Devices without driver or whose driver has no PCI PM callbacks may be in
> runtime suspend when pci_pm_freeze() is called. Their state has already
> been saved, so don't clear the flag to skip a pointless pci_save_state()
> in pci_pm_freeze_noirq().
>
> None of the drivers with legacy PCI PM callbacks seem to use runtime PM,
> so clear the flag unconditionally in their case.
>
> Fixes: 4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored")
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v2.6.32+
> ---
> drivers/pci/pci-driver.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 302d61783f6c..327b21c48614 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -629,6 +629,8 @@ static int pci_legacy_suspend(struct device *dev, pm_message_t state)
> struct pci_dev *pci_dev = to_pci_dev(dev);
> struct pci_driver *drv = pci_dev->driver;
>
> + pci_dev->state_saved = false;
> +
> if (drv && drv->suspend) {
> pci_power_t prev = pci_dev->current_state;
> int error;
> @@ -1036,6 +1038,8 @@ static int pci_pm_freeze(struct device *dev)
>
> if (!pm) {
> pci_pm_default_suspend(pci_dev);
> + if (!pm_runtime_suspended(dev))
> + pci_dev->state_saved = false;
> return 0;
> }
>
> --
> 2.51.0
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths
2025-11-25 23:18 ` Bjorn Helgaas
@ 2025-11-26 12:49 ` Lukas Wunner
2025-11-26 23:46 ` Bjorn Helgaas
0 siblings, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2025-11-26 12:49 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Rafael J. Wysocki, Riana Tauro, Sean C. Dardis, Farhan Ali,
Benjamin Block, Niklas Schnelle, Alek Du, Mahesh J Salgaonkar,
Oliver OHalloran, linuxppc-dev, linux-pci, linux-pm
On Tue, Nov 25, 2025 at 05:18:46PM -0600, Bjorn Helgaas wrote:
> On Wed, Nov 19, 2025 at 09:50:01AM +0100, Lukas Wunner wrote:
> > But there are two corner cases where the PCI core neglects to clear the
> > flag before commencing the suspend sequence:
> >
> > * If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects
> > to clear the flag. The (stale) flag is subsequently queried by
> > pci_legacy_suspend() itself and pci_legacy_suspend_late().
> >
> > * If a device has no driver or its driver has no PCI PM callbacks,
> > pci_pm_freeze() neglects to clear the flag. The (stale) flag is
> > subsequently queried by pci_pm_freeze_noirq().
> >
> > The flag may be set prior to suspend if the device went through error
> > recovery: Drivers commonly invoke pci_restore_state() + pci_save_state()
> > to restore Config Space after reset.
>
> I guess the only point of pci_save_state() in this case is to set
> state_saved again so a future pci_restore_state() will work, right?
>
> The actual state being *saved* is pointless, assuming pci_save_state()
> saves exactly the same data that pci_restore_state() restored.
>
> And these are the pci_save_state() calls you removed with "treewide:
> Drop pci_save_state() after pci_restore_state()". Too bad we have to
> document the behavior we're about to change, but that's what we need
> to do. It's just a little clutter to keep in mind for this release.
Yes. All of your comments above are correct.
> > The flag may also be set if drivers call pci_save_state() on probe to
> > allow for recovery from subsequent errors.
> >
> > The result is that pci_legacy_suspend_late() and pci_pm_freeze_noirq()
> > don't call pci_save_state() and so the state that will be restored on
> > resume is the one recorded on last error recovery or on probe, not the one
> > that the device had on suspend. If the two states happen to be identical,
> > there's no problem.
>
> So IIUC the effect is that after this change and the "treewide"
> change,
>
> - If the driver uses legacy PM, the state restored on resume will be
> the state from suspend instead of the state on probe.
Right.
>
> - For devices with no driver or a driver without PM, if the device
> has already been runtime-suspended, we avoid a pointless
> pci_save_state(), so it's an optimization and not logically
> related to the legacy PM case.
It's slightly different:
- For devices with no driver or a driver without PM, the state restored
on "thaw" and "restore" will be the state from "freeze" instead of the
state on probe.
So the same problem that we have for drivers using legacy PM, we also
have for devices with no driver or a driver without (modern) PM callbacks,
but only in the "freeze" codepath (for hibernation).
In the patch, I made the "pci_dev->state_saved = false" assignment
conditional on !pm_runtime_suspended() in the "freeze" codepath.
I didn't do the same in the legacy codepath because none of the
drivers using legacy PM callbacks seem to be using runtime PM.
The purpose of making it conditional on !pm_runtime_suspended()
is just that we'd otherwise call pci_save_state() twice: Once in
pci_pm_runtime_suspend() and once more in pci_pm_freeze().
That would be pointless.
In the commit message, I provided a rationale for the conditionality,
but inadvertently caused confusion.
> I'm thinking of something like this for the merge commit and eventual
> pull request; please correct me if this isn't right:
>
> Restore the suspend config state, not the state from probe or error
> recovery, for drivers using legacy PCI suspend.
>
> Avoid saving config state again for devices without driver PM if
> their state was already saved by runtime suspend.
I'd suggest instead (feel free to wordsmith as you see fit):
Restore the suspend config state, not the state from probe or error
recovery, for drivers using legacy PCI suspend. [ <- unmodified ]
Same for devices with no driver or a driver without PM callbacks
when the system is hibernated. [ <- replacement ]
Mentioning the runtime PM conditionality in the high-level changelog
is probably not worth it.
Was I able to clarify all questions? Please ask again if not.
Also, in case the meaning of "freeze", "thaw", "restore" isn't clear,
here's the order of a hibernation sequence (suspend to disk):
pci_pm_prepare()
pci_pm_freeze()
pci_pm_freeze_noirq()
<system image is generated>
pci_pm_thaw_noirq()
pci_pm_thaw()
pci_pm_complete()
pci_pm_prepare()
pci_pm_poweroff()
pci_pm_poweroff_late()
pci_pm_poweroff_noirq()
<system is asleep, then restarted with boot kernel>
pci_pm_prepare()
pci_pm_freeze()
pci_pm_freeze_noirq()
<system image is restored, replacing the boot kernel>
pci_pm_restore_noirq()
pci_pm_restore()
pci_pm_complete()
Note that "freeze" happens twice in the whole sequence.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths
2025-11-26 12:49 ` Lukas Wunner
@ 2025-11-26 23:46 ` Bjorn Helgaas
2025-11-27 7:58 ` Lukas Wunner
0 siblings, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2025-11-26 23:46 UTC (permalink / raw)
To: Lukas Wunner
Cc: Rafael J. Wysocki, Riana Tauro, Sean C. Dardis, Farhan Ali,
Benjamin Block, Niklas Schnelle, Alek Du, Mahesh J Salgaonkar,
Oliver OHalloran, linuxppc-dev, linux-pci, linux-pm
On Wed, Nov 26, 2025 at 01:49:06PM +0100, Lukas Wunner wrote:
> On Tue, Nov 25, 2025 at 05:18:46PM -0600, Bjorn Helgaas wrote:
> > On Wed, Nov 19, 2025 at 09:50:01AM +0100, Lukas Wunner wrote:
> > > But there are two corner cases where the PCI core neglects to clear the
> > > flag before commencing the suspend sequence:
> > >
> > > * If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects
> > > to clear the flag. The (stale) flag is subsequently queried by
> > > pci_legacy_suspend() itself and pci_legacy_suspend_late().
> > >
> > > * If a device has no driver or its driver has no PCI PM callbacks,
> > > pci_pm_freeze() neglects to clear the flag. The (stale) flag is
> > > subsequently queried by pci_pm_freeze_noirq().
> > >
> > > The flag may be set prior to suspend if the device went through error
> > > recovery: Drivers commonly invoke pci_restore_state() + pci_save_state()
> > > to restore Config Space after reset.
> >
> > I guess the only point of pci_save_state() in this case is to set
> > state_saved again so a future pci_restore_state() will work, right?
> >
> > The actual state being *saved* is pointless, assuming pci_save_state()
> > saves exactly the same data that pci_restore_state() restored.
> >
> > And these are the pci_save_state() calls you removed with "treewide:
> > Drop pci_save_state() after pci_restore_state()". Too bad we have to
> > document the behavior we're about to change, but that's what we need
> > to do. It's just a little clutter to keep in mind for this release.
>
> Yes. All of your comments above are correct.
>
> > > The flag may also be set if drivers call pci_save_state() on probe to
> > > allow for recovery from subsequent errors.
> > >
> > > The result is that pci_legacy_suspend_late() and pci_pm_freeze_noirq()
> > > don't call pci_save_state() and so the state that will be restored on
> > > resume is the one recorded on last error recovery or on probe, not the one
> > > that the device had on suspend. If the two states happen to be identical,
> > > there's no problem.
> >
> > So IIUC the effect is that after this change and the "treewide"
> > change,
> >
> > - If the driver uses legacy PM, the state restored on resume will be
> > the state from suspend instead of the state on probe.
>
> Right.
>
> > - For devices with no driver or a driver without PM, if the device
> > has already been runtime-suspended, we avoid a pointless
> > pci_save_state(), so it's an optimization and not logically
> > related to the legacy PM case.
>
> It's slightly different:
>
> - For devices with no driver or a driver without PM, the state restored
> on "thaw" and "restore" will be the state from "freeze" instead of the
> state on probe.
>
> So the same problem that we have for drivers using legacy PM, we also
> have for devices with no driver or a driver without (modern) PM callbacks,
> but only in the "freeze" codepath (for hibernation).
>
> In the patch, I made the "pci_dev->state_saved = false" assignment
> conditional on !pm_runtime_suspended() in the "freeze" codepath.
> I didn't do the same in the legacy codepath because none of the
> drivers using legacy PM callbacks seem to be using runtime PM.
Maybe it's moot because we hope there will be no new users of PCI
legacy PM with runtime PM, but I don't think there's anything to
*prevent* that or to protect against out-of-tree drivers.
The implicit assumption that there are no such drivers makes it look
like there's something magic involving state_saved, legacy PM, and
runtime PM. It might be worth doing the same in the legacy PM path
just for readability.
> The purpose of making it conditional on !pm_runtime_suspended()
> is just that we'd otherwise call pci_save_state() twice: Once in
> pci_pm_runtime_suspend() and once more in pci_pm_freeze().
> That would be pointless.
>
> In the commit message, I provided a rationale for the conditionality,
> but inadvertently caused confusion.
>
> > I'm thinking of something like this for the merge commit and eventual
> > pull request; please correct me if this isn't right:
> >
> > Restore the suspend config state, not the state from probe or error
> > recovery, for drivers using legacy PCI suspend.
> >
> > Avoid saving config state again for devices without driver PM if
> > their state was already saved by runtime suspend.
>
> I'd suggest instead (feel free to wordsmith as you see fit):
>
> Restore the suspend config state, not the state from probe or error
> recovery, for drivers using legacy PCI suspend. [ <- unmodified ]
>
> Same for devices with no driver or a driver without PM callbacks
> when the system is hibernated. [ <- replacement ]
Stepping back, I guess that when drivers use generic PM, we already
save config state during suspend and restore that state during resume,
and do the same when entering/leaving hibernation.
And the point of this patch is to do the same when drivers lack PM or
use legacy PCI PM, or when devices have no driver?
Maybe third try is the charm?
For drivers using PCI legacy suspend, save config state at suspend
so that state, not any earlier state from enumeration, probe, or
error recovery, will be restored when resuming.
For devices with no driver or a driver that lacks PM, save config
state at hibernate so that state, not any earlier state from
enumeration, probe, or error recovery, will be restored when
resuming.
IIUC, after "Ensure error recoverability", the PCI core will always
save the state during enumeration, so drivers shouldn't use
pci_save_state() at all unless they make config changes that they want
restored during error recovery?
Or, I guess (sigh) if they do their own power management?
> Mentioning the runtime PM conditionality in the high-level changelog
> is probably not worth it.
>
> Was I able to clarify all questions? Please ask again if not.
>
> Also, in case the meaning of "freeze", "thaw", "restore" isn't clear,
> here's the order of a hibernation sequence (suspend to disk):
>
> pci_pm_prepare()
> pci_pm_freeze()
> pci_pm_freeze_noirq()
> <system image is generated>
> pci_pm_thaw_noirq()
> pci_pm_thaw()
> pci_pm_complete()
> pci_pm_prepare()
> pci_pm_poweroff()
> pci_pm_poweroff_late()
> pci_pm_poweroff_noirq()
> <system is asleep, then restarted with boot kernel>
> pci_pm_prepare()
> pci_pm_freeze()
> pci_pm_freeze_noirq()
> <system image is restored, replacing the boot kernel>
> pci_pm_restore_noirq()
> pci_pm_restore()
> pci_pm_complete()
>
> Note that "freeze" happens twice in the whole sequence.
Thanks, this is extremely helpful. Copied into my notes. I guess
this is essentially condensed from
Documentation/driver-api/pm/devices.rst, but it's very helpful to see
the bare bones and your annotations.
Bjorn
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths
2025-11-26 23:46 ` Bjorn Helgaas
@ 2025-11-27 7:58 ` Lukas Wunner
2025-11-27 12:51 ` Rafael J. Wysocki
0 siblings, 1 reply; 12+ messages in thread
From: Lukas Wunner @ 2025-11-27 7:58 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Rafael J. Wysocki, Riana Tauro, Sean C. Dardis, Farhan Ali,
Benjamin Block, Niklas Schnelle, Alek Du, Mahesh J Salgaonkar,
Oliver OHalloran, linuxppc-dev, linux-pci, linux-pm
On Wed, Nov 26, 2025 at 05:46:03PM -0600, Bjorn Helgaas wrote:
> On Wed, Nov 26, 2025 at 01:49:06PM +0100, Lukas Wunner wrote:
> > In the patch, I made the "pci_dev->state_saved = false" assignment
> > conditional on !pm_runtime_suspended() in the "freeze" codepath.
> > I didn't do the same in the legacy codepath because none of the
> > drivers using legacy PM callbacks seem to be using runtime PM.
>
> Maybe it's moot because we hope there will be no new users of PCI
> legacy PM with runtime PM, but I don't think there's anything to
> *prevent* that or to protect against out-of-tree drivers.
>
> The implicit assumption that there are no such drivers makes it look
> like there's something magic involving state_saved, legacy PM, and
> runtime PM. It might be worth doing the same in the legacy PM path
> just for readability.
Drivers having both legacy callbacks and modern callbacks (including
runtime PM callbacks) cause emission of a WARN splat in
pci_has_legacy_pm_support().
Drivers need to activate runtime PM by dropping a runtime PM reference
on probe (see the code comment in local_pci_probe()). In theory a
driver could have legacy callbacks but no modern callbacks and still
use runtime PM by calling pm_runtime_put_noidle() on probe. So I
compiled a list of drivers implementing legacy callbacks (included
at the end of this e-mail for reference), grep'ed through them
for any "pm_runtime" occurrences and found none.
Hence it seems very unlikely that drivers using legacy callbacks and
runtime PM exist. We probably shouldn't accommodate for such use cases
but should rather try to incentivize conversion to modern callbacks.
When compiling the list I sadly noticed that new drivers do exist
which use legacy callbacks. A case in point is:
drivers/net/ethernet/google/gve/gve_main.c
... which started using legacy callbacks in 2021 with commit 974365e51861
("gve: Implement suspend/resume/shutdown").
I guess there is no real incentive to convert to modern PM callbacks and
finding someone who has the hardware and can test patches is hard
(most drivers are for ATA, some for really old 1990s hardware).
Plus, a lot of detailed knowledge about PCI PM is necessary to avoid
breakage, making this a task that can't easily be delegated to new
contributors. And everyone with the knowledge is overworked already.
So we keep dragging this tech debt along which complicates codepaths. :(
> Stepping back, I guess that when drivers use generic PM, we already
> save config state during suspend and restore that state during resume,
> and do the same when entering/leaving hibernation.
>
> And the point of this patch is to do the same when drivers lack PM or
> use legacy PCI PM, or when devices have no driver?
Right. To have a consistent logic that state_saved is always cleared
before commencing the suspend sequence, in all codepaths.
> Maybe third try is the charm?
>
> For drivers using PCI legacy suspend, save config state at suspend
> so that state, not any earlier state from enumeration, probe, or
> error recovery, will be restored when resuming.
>
> For devices with no driver or a driver that lacks PM, save config
> state at hibernate so that state, not any earlier state from
> enumeration, probe, or error recovery, will be restored when
> resuming.
Perfect.
> IIUC, after "Ensure error recoverability", the PCI core will always
> save the state during enumeration, so drivers shouldn't use
> pci_save_state() at all unless they make config changes that they want
> restored during error recovery?
>
> Or, I guess (sigh) if they do their own power management?
Exactly right.
> > Also, in case the meaning of "freeze", "thaw", "restore" isn't clear,
> > here's the order of a hibernation sequence (suspend to disk):
[...]
> Thanks, this is extremely helpful.
FWIW, this is the sequence of suspend-to-memory, obviously much simpler:
pci_pm_prepare()
pci_pm_suspend()
pci_pm_suspend_late()
pci_pm_suspend_noirq()
pci_pm_resume_noirq()
pci_pm_resume_early()
pci_pm_resume()
pci_pm_complete()
And runtime PM:
pci_pm_runtime_suspend()
pci_pm_runtime_resume()
Thanks,
Lukas
-- >8 --
Drivers with legacy PCI PM callbacks:
drivers/ata/acard-ahci.c
drivers/ata/ata_generic.c
drivers/ata/ata_piix.c
drivers/ata/pata_acpi.c
drivers/ata/pata_ali.c
drivers/ata/pata_amd.c
drivers/ata/pata_artop.c
drivers/ata/pata_atiixp.c
drivers/ata/pata_atp867x.c
drivers/ata/pata_cmd640.c
drivers/ata/pata_cmd64x.c
drivers/ata/pata_cs5520.c
drivers/ata/pata_cs5530.c
drivers/ata/pata_cs5535.c
drivers/ata/pata_cs5536.c
drivers/ata/pata_cypress.c
drivers/ata/pata_efar.c
drivers/ata/pata_hpt366.c
drivers/ata/pata_hpt3x3.c
drivers/ata/pata_it8213.c
drivers/ata/pata_it821x.c
drivers/ata/pata_jmicron.c
drivers/ata/pata_macio.c
drivers/ata/pata_marvell.c
drivers/ata/pata_mpiix.c
drivers/ata/pata_netcell.c
drivers/ata/pata_ninja32.c
drivers/ata/pata_ns87410.c
drivers/ata/pata_ns87415.c
drivers/ata/pata_oldpiix.c
drivers/ata/pata_opti.c
drivers/ata/pata_optidma.c
drivers/ata/pata_pdc2027x.c
drivers/ata/pata_pdc202xx_old.c
drivers/ata/pata_piccolo.c
drivers/ata/pata_radisys.c
drivers/ata/pata_rdc.c
drivers/ata/pata_rz1000.c
drivers/ata/pata_sc1200.c
drivers/ata/pata_sch.c
drivers/ata/pata_serverworks.c
drivers/ata/pata_sil680.c
drivers/ata/pata_sis.c
drivers/ata/pata_sl82c105.c
drivers/ata/pata_triflex.c
drivers/ata/pata_via.c
drivers/ata/sata_inic162x.c
drivers/ata/sata_mv.c
drivers/ata/sata_nv.c
drivers/ata/sata_sil.c
drivers/ata/sata_sil24.c
drivers/ata/sata_sis.c
drivers/ata/sata_via.c
drivers/bluetooth/hci_bcm4377.c
drivers/gpio/gpio-bt8xx.c
drivers/message/fusion/mptfc.c
drivers/message/fusion/mptsas.c
drivers/message/fusion/mptspi.c
drivers/mtd/nand/raw/cafe_nand.c
drivers/net/ethernet/atheros/atl1e/atl1e_main.c
drivers/net/ethernet/atheros/atlx/atl2.c
drivers/net/ethernet/google/gve/gve_main.c
drivers/net/ethernet/microsoft/mana/gdma_main.c
drivers/net/ethernet/toshiba/tc35815.c
drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
drivers/net/wireless/mediatek/mt76/mt7615/pci.c
drivers/net/wireless/mediatek/mt76/mt76x0/pci.c
drivers/net/wireless/mediatek/mt76/mt76x2/pci.c
drivers/scsi/nsp32.c
drivers/scsi/qedf/qedf_main.c
drivers/scsi/qedi/qedi_main.c
drivers/scsi/stex.c
drivers/tty/serial/serial_txx9.c
drivers/video/fbdev/chipsfb.c
drivers/video/fbdev/i810/i810_main.c
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths
2025-11-27 7:58 ` Lukas Wunner
@ 2025-11-27 12:51 ` Rafael J. Wysocki
0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2025-11-27 12:51 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Rafael J. Wysocki, Riana Tauro, Sean C. Dardis,
Farhan Ali, Benjamin Block, Niklas Schnelle, Alek Du,
Mahesh J Salgaonkar, Oliver OHalloran, linuxppc-dev, linux-pci,
linux-pm
On Thu, Nov 27, 2025 at 8:58 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Wed, Nov 26, 2025 at 05:46:03PM -0600, Bjorn Helgaas wrote:
> > On Wed, Nov 26, 2025 at 01:49:06PM +0100, Lukas Wunner wrote:
> > > In the patch, I made the "pci_dev->state_saved = false" assignment
> > > conditional on !pm_runtime_suspended() in the "freeze" codepath.
> > > I didn't do the same in the legacy codepath because none of the
> > > drivers using legacy PM callbacks seem to be using runtime PM.
> >
> > Maybe it's moot because we hope there will be no new users of PCI
> > legacy PM with runtime PM, but I don't think there's anything to
> > *prevent* that or to protect against out-of-tree drivers.
> >
> > The implicit assumption that there are no such drivers makes it look
> > like there's something magic involving state_saved, legacy PM, and
> > runtime PM. It might be worth doing the same in the legacy PM path
> > just for readability.
>
> Drivers having both legacy callbacks and modern callbacks (including
> runtime PM callbacks) cause emission of a WARN splat in
> pci_has_legacy_pm_support().
>
> Drivers need to activate runtime PM by dropping a runtime PM reference
> on probe (see the code comment in local_pci_probe()). In theory a
> driver could have legacy callbacks but no modern callbacks and still
> use runtime PM by calling pm_runtime_put_noidle() on probe. So I
> compiled a list of drivers implementing legacy callbacks (included
> at the end of this e-mail for reference), grep'ed through them
> for any "pm_runtime" occurrences and found none.
>
> Hence it seems very unlikely that drivers using legacy callbacks and
> runtime PM exist. We probably shouldn't accommodate for such use cases
> but should rather try to incentivize conversion to modern callbacks.
Agreed.
What about adding a WARN_ON(pm_runtime_enabled(dev)) to the "legacy"
suspend/hibernation callback paths?
> When compiling the list I sadly noticed that new drivers do exist
> which use legacy callbacks. A case in point is:
>
> drivers/net/ethernet/google/gve/gve_main.c
>
> ... which started using legacy callbacks in 2021 with commit 974365e51861
> ("gve: Implement suspend/resume/shutdown").
>
> I guess there is no real incentive to convert to modern PM callbacks and
> finding someone who has the hardware and can test patches is hard
> (most drivers are for ATA, some for really old 1990s hardware).
> Plus, a lot of detailed knowledge about PCI PM is necessary to avoid
> breakage, making this a task that can't easily be delegated to new
> contributors. And everyone with the knowledge is overworked already.
> So we keep dragging this tech debt along which complicates codepaths. :(
While I agree that this is the case, I'm not sure what can be done to
address this problem, realistically.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-11-27 12:52 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-19 8:50 [PATCH v2 0/3] PCI: Universal error recoverability of devices Lukas Wunner
2025-11-19 8:50 ` [PATCH v2 1/3] PCI/PM: Reinstate clearing state_saved in legacy and !pm codepaths Lukas Wunner
2025-11-19 21:08 ` Rafael J. Wysocki
2025-11-25 23:18 ` Bjorn Helgaas
2025-11-26 12:49 ` Lukas Wunner
2025-11-26 23:46 ` Bjorn Helgaas
2025-11-27 7:58 ` Lukas Wunner
2025-11-27 12:51 ` Rafael J. Wysocki
2025-11-19 8:50 ` [PATCH v2 2/3] PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw Lukas Wunner
2025-11-19 21:09 ` Rafael J. Wysocki
2025-11-19 8:50 ` [PATCH v2 3/3] PCI/ERR: Ensure error recoverability at all times Lukas Wunner
2025-11-25 21:03 ` [PATCH v2 0/3] PCI: Universal error recoverability of devices Bjorn Helgaas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).