* [PATCH v2] PCI/PM: Prevent runtime suspend before devices are fully initialized
@ 2025-10-17 19:21 Brian Norris
2025-10-18 11:27 ` Rafael J. Wysocki
0 siblings, 1 reply; 3+ messages in thread
From: Brian Norris @ 2025-10-17 19:21 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Rafael J . Wysocki, linux-kernel, Ilpo Järvinen, linux-pm,
Lukas Wunner, linux-pci, Brian Norris
Today, it's possible for a PCI device to be created and
runtime-suspended before it is fully initialized. When that happens, the
device will remain in D0, but the suspend process may save an
intermediate version of that device's state -- for example, without
appropriate BAR configuration. When the device later resumes, we'll
restore invalid PCI state and the device may not function.
Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
until we've fully initialized the device.
More details on how exactly this may occur:
1. PCI device is created by pci_scan_slot() or similar
2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the
device starts "active" and we initially prevent (pm_runtime_forbid())
suspend -- but see [*] footnote
3. Underlying 'struct device' is added to the system (device_add());
runtime PM can now be configured by user space
4. PCI device receives BAR configuration
(pci_assign_unassigned_bus_resources(), etc.)
5. PCI device is added to the system in pci_bus_add_device()
The device may potentially suspend between #3 and #4.
[*] By default, pm_runtime_forbid() prevents suspending a device; but by
design, this can be overridden by user space policy via
echo auto > /sys/bus/pci/devices/.../power/control
Thus, the above #3/#4 sequence is racy with user space (udev or
similar).
Notably, many PCI devices are enumerated at subsys_initcall time and so
will not race with user space. However, there are several scenarios
where PCI devices are created later on, such as with hotplug or when
drivers (pwrctrl or controller drivers) are built as modules.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Cc: <stable@vger.kernel.org>
---
Changes in v2:
* Update CC list
* Rework problem description
* Update solution: defer pm_runtime_enable(), instead of trying to
get()/put()
drivers/pci/bus.c | 3 +++
drivers/pci/pci.c | 1 -
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index f26aec6ff588..fc66b6cb3a54 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -14,6 +14,7 @@
#include <linux/of.h>
#include <linux/of_platform.h>
#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
#include <linux/proc_fs.h>
#include <linux/slab.h>
@@ -375,6 +376,8 @@ void pci_bus_add_device(struct pci_dev *dev)
put_device(&pdev->dev);
}
+ pm_runtime_enable(&dev->dev);
+
if (!dn || of_device_is_available(dn))
pci_dev_allow_binding(dev);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006c..f792164fa297 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3226,7 +3226,6 @@ void pci_pm_init(struct pci_dev *dev)
pci_pm_power_up_and_verify_state(dev);
pm_runtime_forbid(&dev->dev);
pm_runtime_set_active(&dev->dev);
- pm_runtime_enable(&dev->dev);
}
static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v2] PCI/PM: Prevent runtime suspend before devices are fully initialized
2025-10-17 19:21 [PATCH v2] PCI/PM: Prevent runtime suspend before devices are fully initialized Brian Norris
@ 2025-10-18 11:27 ` Rafael J. Wysocki
2025-10-20 20:07 ` Brian Norris
0 siblings, 1 reply; 3+ messages in thread
From: Rafael J. Wysocki @ 2025-10-18 11:27 UTC (permalink / raw)
To: Brian Norris
Cc: Bjorn Helgaas, Rafael J . Wysocki, linux-kernel,
Ilpo Järvinen, linux-pm, Lukas Wunner, linux-pci
On Fri, Oct 17, 2025 at 9:22 PM Brian Norris <briannorris@chromium.org> wrote:
>
> Today, it's possible for a PCI device to be created and
> runtime-suspended before it is fully initialized. When that happens, the
> device will remain in D0, but the suspend process may save an
> intermediate version of that device's state -- for example, without
> appropriate BAR configuration. When the device later resumes, we'll
> restore invalid PCI state and the device may not function.
>
> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
> until we've fully initialized the device.
>
> More details on how exactly this may occur:
>
> 1. PCI device is created by pci_scan_slot() or similar
> 2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the
> device starts "active" and we initially prevent (pm_runtime_forbid())
> suspend -- but see [*] footnote
> 3. Underlying 'struct device' is added to the system (device_add());
> runtime PM can now be configured by user space
> 4. PCI device receives BAR configuration
> (pci_assign_unassigned_bus_resources(), etc.)
> 5. PCI device is added to the system in pci_bus_add_device()
>
> The device may potentially suspend between #3 and #4.
>
> [*] By default, pm_runtime_forbid() prevents suspending a device; but by
> design, this can be overridden by user space policy via
>
> echo auto > /sys/bus/pci/devices/.../power/control
>
> Thus, the above #3/#4 sequence is racy with user space (udev or
> similar).
>
> Notably, many PCI devices are enumerated at subsys_initcall time and so
> will not race with user space. However, there are several scenarios
> where PCI devices are created later on, such as with hotplug or when
> drivers (pwrctrl or controller drivers) are built as modules.
>
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> Cc: <stable@vger.kernel.org>
Can you please add a Link: pointer to the discussion on the previous
version of the patch?
With that
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
> ---
>
> Changes in v2:
> * Update CC list
> * Rework problem description
> * Update solution: defer pm_runtime_enable(), instead of trying to
> get()/put()
>
> drivers/pci/bus.c | 3 +++
> drivers/pci/pci.c | 1 -
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index f26aec6ff588..fc66b6cb3a54 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -14,6 +14,7 @@
> #include <linux/of.h>
> #include <linux/of_platform.h>
> #include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> #include <linux/proc_fs.h>
> #include <linux/slab.h>
>
> @@ -375,6 +376,8 @@ void pci_bus_add_device(struct pci_dev *dev)
> put_device(&pdev->dev);
> }
>
> + pm_runtime_enable(&dev->dev);
> +
> if (!dn || of_device_is_available(dn))
> pci_dev_allow_binding(dev);
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b14dd064006c..f792164fa297 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3226,7 +3226,6 @@ void pci_pm_init(struct pci_dev *dev)
> pci_pm_power_up_and_verify_state(dev);
> pm_runtime_forbid(&dev->dev);
> pm_runtime_set_active(&dev->dev);
> - pm_runtime_enable(&dev->dev);
> }
>
> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
> --
> 2.51.0.858.gf9c4a03a3a-goog
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v2] PCI/PM: Prevent runtime suspend before devices are fully initialized
2025-10-18 11:27 ` Rafael J. Wysocki
@ 2025-10-20 20:07 ` Brian Norris
0 siblings, 0 replies; 3+ messages in thread
From: Brian Norris @ 2025-10-20 20:07 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Bjorn Helgaas, linux-kernel, Ilpo Järvinen, linux-pm,
Lukas Wunner, linux-pci
Hi,
On Sat, Oct 18, 2025 at 01:27:13PM +0200, Rafael J. Wysocki wrote:
> On Fri, Oct 17, 2025 at 9:22 PM Brian Norris <briannorris@chromium.org> wrote:
> >
> > Today, it's possible for a PCI device to be created and
> > runtime-suspended before it is fully initialized. When that happens, the
> > device will remain in D0, but the suspend process may save an
> > intermediate version of that device's state -- for example, without
> > appropriate BAR configuration. When the device later resumes, we'll
> > restore invalid PCI state and the device may not function.
> >
> > Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
> > until we've fully initialized the device.
> >
> > More details on how exactly this may occur:
> >
> > 1. PCI device is created by pci_scan_slot() or similar
> > 2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the
> > device starts "active" and we initially prevent (pm_runtime_forbid())
> > suspend -- but see [*] footnote
> > 3. Underlying 'struct device' is added to the system (device_add());
> > runtime PM can now be configured by user space
> > 4. PCI device receives BAR configuration
> > (pci_assign_unassigned_bus_resources(), etc.)
> > 5. PCI device is added to the system in pci_bus_add_device()
> >
> > The device may potentially suspend between #3 and #4.
> >
> > [*] By default, pm_runtime_forbid() prevents suspending a device; but by
> > design, this can be overridden by user space policy via
> >
> > echo auto > /sys/bus/pci/devices/.../power/control
> >
> > Thus, the above #3/#4 sequence is racy with user space (udev or
> > similar).
> >
> > Notably, many PCI devices are enumerated at subsys_initcall time and so
> > will not race with user space. However, there are several scenarios
> > where PCI devices are created later on, such as with hotplug or when
> > drivers (pwrctrl or controller drivers) are built as modules.
> >
> > Signed-off-by: Brian Norris <briannorris@chromium.org>
> > Cc: <stable@vger.kernel.org>
>
> Can you please add a Link: pointer to the discussion on the previous
> version of the patch?
Ha, it sounds like you want me to get flamed by Linus :) I don't even
know how "Link:" is supposed to be used any more.
But in case Bjorn wants to apply this as-is, here's a reference to v1,
where that discussion happened:
Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
If I send a v3, I'll include that.
<soap-box while we're at it:>
I really wish this proposal didn't also get flamed out:
Subject: [Ksummit-discuss] Allowing something Change-Id (or something like it) in kernel commits
https://lore.kernel.org/all/CAD=FV=UPjPpUyFTPjF-Ogzj_6LJLE4PTxMhCoCEDmH1LXSSmpQ@mail.gmail.com/
If we did that, then no one would have to ask for series-history links,
because changes would already include such IDs so anyone could follow
them automatically.
As a compromise, I've been doing this:
https://lore.kernel.org/all/CAD=FV=VLMFxFt55oB4ERTFw3xnH4czUY5tXiqfY14NKZ8gqojA@mail.gmail.com/
i.e., my patches tend to have Message-Ids that look like
<$timestamp.$revision.$changeid@changeid>, and so you can cleanly
track this patch, including past and future versions, with:
https://lore.kernel.org/all/?q=I60a53c170a8596661883bd2b4ef475155c7aa72b%40changeid
</soap-box>
> With that
>
> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Thanks!
Brian
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-10-20 20:07 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-17 19:21 [PATCH v2] PCI/PM: Prevent runtime suspend before devices are fully initialized Brian Norris
2025-10-18 11:27 ` Rafael J. Wysocki
2025-10-20 20:07 ` Brian Norris
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).