* [PATCH v3] PCI/PM: Prevent runtime suspend before devices are fully initialized @ 2025-10-22 21:14 Brian Norris 2025-10-23 9:41 ` Rafael J. Wysocki 0 siblings, 1 reply; 5+ messages in thread From: Brian Norris @ 2025-10-22 21:14 UTC (permalink / raw) To: Bjorn Helgaas Cc: Ilpo Järvinen, linux-pm, linux-pci, linux-kernel, Rafael J . Wysocki, Lukas Wunner, Brian Norris Today, it's possible for a PCI device to be created and runtime-suspended before it is fully initialized. When that happens, the device will remain in D0, but the suspend process may save an intermediate version of that device's state -- for example, without appropriate BAR configuration. When the device later resumes, we'll restore invalid PCI state and the device may not function. Prevent runtime suspend for PCI devices by deferring pm_runtime_enable() until we've fully initialized the device. More details on how exactly this may occur: 1. PCI device is created by pci_scan_slot() or similar 2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the device starts "active" and we initially prevent (pm_runtime_forbid()) suspend -- but see [*] footnote 3. Underlying 'struct device' is added to the system (device_add()); runtime PM can now be configured by user space 4. PCI device receives BAR configuration (pci_assign_unassigned_bus_resources(), etc.) 5. PCI device is added to the system in pci_bus_add_device() The device may potentially suspend between #3 and #4. [*] By default, pm_runtime_forbid() prevents suspending a device; but by design [**], this can be overridden by user space policy via echo auto > /sys/bus/pci/devices/.../power/control Thus, the above #3/#4 sequence is racy with user space (udev or similar). Notably, many PCI devices are enumerated at subsys_initcall time and so will not race with user space. However, there are several scenarios where PCI devices are created later on, such as with hotplug or when drivers (pwrctrl or controller drivers) are built as modules. [**] The relationship between pm_runtime_forbid(), pm_runtime_allow(), /sys/.../power/control, and the runtime PM usage counter can be subtle. It appears that the intention of pm_runtime_forbid() / pm_runtime_allow() is twofold: 1. Allow the user to disable runtime_pm (force device to always be powered on) through sysfs. 2. Allow the driver to start with runtime_pm disabled (device forced on) and user space could later enable runtime_pm. This conclusion comes from reading `Documentation/power/runtime_pm.rst`, specifically the section starting "The user space can effectively disallow". This means that while pm_runtime_forbid() does technically increase the runtime PM usage counter, this usage counter is not a guarantee of functional correctness, because sysfs can decrease that count again. Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/ Signed-off-by: Brian Norris <briannorris@chromium.org> Cc: <stable@vger.kernel.org> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> --- Changes in v3: * Add Link to initial discussion * Add Rafael's Reviewed-by * Add lengthier footnotes about forbid vs allow vs sysfs Changes in v2: * Update CC list * Rework problem description * Update solution: defer pm_runtime_enable(), instead of trying to get()/put() drivers/pci/bus.c | 3 +++ drivers/pci/pci.c | 1 - 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index f26aec6ff588..fc66b6cb3a54 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -14,6 +14,7 @@ #include <linux/of.h> #include <linux/of_platform.h> #include <linux/platform_device.h> +#include <linux/pm_runtime.h> #include <linux/proc_fs.h> #include <linux/slab.h> @@ -375,6 +376,8 @@ void pci_bus_add_device(struct pci_dev *dev) put_device(&pdev->dev); } + pm_runtime_enable(&dev->dev); + if (!dn || of_device_is_available(dn)) pci_dev_allow_binding(dev); diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index b14dd064006c..f792164fa297 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3226,7 +3226,6 @@ void pci_pm_init(struct pci_dev *dev) pci_pm_power_up_and_verify_state(dev); pm_runtime_forbid(&dev->dev); pm_runtime_set_active(&dev->dev); - pm_runtime_enable(&dev->dev); } static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) -- 2.51.1.814.gb8fa24458f-goog ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v3] PCI/PM: Prevent runtime suspend before devices are fully initialized 2025-10-22 21:14 [PATCH v3] PCI/PM: Prevent runtime suspend before devices are fully initialized Brian Norris @ 2025-10-23 9:41 ` Rafael J. Wysocki 2025-10-23 18:34 ` Brian Norris 0 siblings, 1 reply; 5+ messages in thread From: Rafael J. Wysocki @ 2025-10-23 9:41 UTC (permalink / raw) To: Brian Norris Cc: Bjorn Helgaas, Ilpo Järvinen, linux-pm, linux-pci, linux-kernel, Rafael J . Wysocki, Lukas Wunner On Wed, Oct 22, 2025 at 11:15 PM Brian Norris <briannorris@chromium.org> wrote: > > Today, it's possible for a PCI device to be created and > runtime-suspended before it is fully initialized. When that happens, the > device will remain in D0, but the suspend process may save an > intermediate version of that device's state -- for example, without > appropriate BAR configuration. When the device later resumes, we'll > restore invalid PCI state and the device may not function. > > Prevent runtime suspend for PCI devices by deferring pm_runtime_enable() > until we've fully initialized the device. > > More details on how exactly this may occur: > > 1. PCI device is created by pci_scan_slot() or similar > 2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the > device starts "active" and we initially prevent (pm_runtime_forbid()) > suspend -- but see [*] footnote > 3. Underlying 'struct device' is added to the system (device_add()); > runtime PM can now be configured by user space > 4. PCI device receives BAR configuration > (pci_assign_unassigned_bus_resources(), etc.) > 5. PCI device is added to the system in pci_bus_add_device() > > The device may potentially suspend between #3 and #4. > > [*] By default, pm_runtime_forbid() prevents suspending a device; but by > design [**], this can be overridden by user space policy via > > echo auto > /sys/bus/pci/devices/.../power/control > > Thus, the above #3/#4 sequence is racy with user space (udev or > similar). > > Notably, many PCI devices are enumerated at subsys_initcall time and so > will not race with user space. However, there are several scenarios > where PCI devices are created later on, such as with hotplug or when > drivers (pwrctrl or controller drivers) are built as modules. > > [**] The relationship between pm_runtime_forbid(), pm_runtime_allow(), > /sys/.../power/control, and the runtime PM usage counter can be subtle. > It appears that the intention of pm_runtime_forbid() / > pm_runtime_allow() is twofold: > > 1. Allow the user to disable runtime_pm (force device to always be > powered on) through sysfs. > 2. Allow the driver to start with runtime_pm disabled (device forced > on) and user space could later enable runtime_pm. > > This conclusion comes from reading `Documentation/power/runtime_pm.rst`, > specifically the section starting "The user space can effectively > disallow". > > This means that while pm_runtime_forbid() does technically increase the > runtime PM usage counter, this usage counter is not a guarantee of > functional correctness, because sysfs can decrease that count again. > > Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/ > Signed-off-by: Brian Norris <briannorris@chromium.org> > Cc: <stable@vger.kernel.org> > Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> > --- > > Changes in v3: > * Add Link to initial discussion > * Add Rafael's Reviewed-by > * Add lengthier footnotes about forbid vs allow vs sysfs > > Changes in v2: > * Update CC list > * Rework problem description > * Update solution: defer pm_runtime_enable(), instead of trying to > get()/put() > > drivers/pci/bus.c | 3 +++ > drivers/pci/pci.c | 1 - > 2 files changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c > index f26aec6ff588..fc66b6cb3a54 100644 > --- a/drivers/pci/bus.c > +++ b/drivers/pci/bus.c > @@ -14,6 +14,7 @@ > #include <linux/of.h> > #include <linux/of_platform.h> > #include <linux/platform_device.h> > +#include <linux/pm_runtime.h> > #include <linux/proc_fs.h> > #include <linux/slab.h> > > @@ -375,6 +376,8 @@ void pci_bus_add_device(struct pci_dev *dev) > put_device(&pdev->dev); > } > > + pm_runtime_enable(&dev->dev); > + > if (!dn || of_device_is_available(dn)) > pci_dev_allow_binding(dev); > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index b14dd064006c..f792164fa297 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -3226,7 +3226,6 @@ void pci_pm_init(struct pci_dev *dev) > pci_pm_power_up_and_verify_state(dev); > pm_runtime_forbid(&dev->dev); > pm_runtime_set_active(&dev->dev); Actually, I think that the two statements above can be moved too. The pm_runtime_forbid() call doesn't matter until runtime PM is enabled and it is better to do pm_runtime_set_active() right before enabling runtime PM. > - pm_runtime_enable(&dev->dev); > } > > static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) > -- ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3] PCI/PM: Prevent runtime suspend before devices are fully initialized 2025-10-23 9:41 ` Rafael J. Wysocki @ 2025-10-23 18:34 ` Brian Norris 2025-10-23 18:42 ` Brian Norris 0 siblings, 1 reply; 5+ messages in thread From: Brian Norris @ 2025-10-23 18:34 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Bjorn Helgaas, Ilpo Järvinen, linux-pm, linux-pci, linux-kernel, Lukas Wunner On Thu, Oct 23, 2025 at 11:41:38AM +0200, Rafael J. Wysocki wrote: > On Wed, Oct 22, 2025 at 11:15 PM Brian Norris <briannorris@chromium.org> wrote: > > > > Today, it's possible for a PCI device to be created and > > runtime-suspended before it is fully initialized. When that happens, the > > device will remain in D0, but the suspend process may save an > > intermediate version of that device's state -- for example, without > > appropriate BAR configuration. When the device later resumes, we'll > > restore invalid PCI state and the device may not function. > > > > Prevent runtime suspend for PCI devices by deferring pm_runtime_enable() > > until we've fully initialized the device. > > > > More details on how exactly this may occur: > > > > 1. PCI device is created by pci_scan_slot() or similar > > 2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the > > device starts "active" and we initially prevent (pm_runtime_forbid()) > > suspend -- but see [*] footnote > > 3. Underlying 'struct device' is added to the system (device_add()); > > runtime PM can now be configured by user space > > 4. PCI device receives BAR configuration > > (pci_assign_unassigned_bus_resources(), etc.) > > 5. PCI device is added to the system in pci_bus_add_device() > > > > The device may potentially suspend between #3 and #4. > > > > [*] By default, pm_runtime_forbid() prevents suspending a device; but by > > design [**], this can be overridden by user space policy via > > > > echo auto > /sys/bus/pci/devices/.../power/control > > > > Thus, the above #3/#4 sequence is racy with user space (udev or > > similar). > > > > Notably, many PCI devices are enumerated at subsys_initcall time and so > > will not race with user space. However, there are several scenarios > > where PCI devices are created later on, such as with hotplug or when > > drivers (pwrctrl or controller drivers) are built as modules. > > > > [**] The relationship between pm_runtime_forbid(), pm_runtime_allow(), > > /sys/.../power/control, and the runtime PM usage counter can be subtle. > > It appears that the intention of pm_runtime_forbid() / > > pm_runtime_allow() is twofold: > > > > 1. Allow the user to disable runtime_pm (force device to always be > > powered on) through sysfs. > > 2. Allow the driver to start with runtime_pm disabled (device forced > > on) and user space could later enable runtime_pm. > > > > This conclusion comes from reading `Documentation/power/runtime_pm.rst`, > > specifically the section starting "The user space can effectively > > disallow". > > > > This means that while pm_runtime_forbid() does technically increase the > > runtime PM usage counter, this usage counter is not a guarantee of > > functional correctness, because sysfs can decrease that count again. > > > > Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/ > > Signed-off-by: Brian Norris <briannorris@chromium.org> > > Cc: <stable@vger.kernel.org> > > Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> > > --- > > > > Changes in v3: > > * Add Link to initial discussion > > * Add Rafael's Reviewed-by > > * Add lengthier footnotes about forbid vs allow vs sysfs > > > > Changes in v2: > > * Update CC list > > * Rework problem description > > * Update solution: defer pm_runtime_enable(), instead of trying to > > get()/put() > > > > drivers/pci/bus.c | 3 +++ > > drivers/pci/pci.c | 1 - > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c > > index f26aec6ff588..fc66b6cb3a54 100644 > > --- a/drivers/pci/bus.c > > +++ b/drivers/pci/bus.c > > @@ -14,6 +14,7 @@ > > #include <linux/of.h> > > #include <linux/of_platform.h> > > #include <linux/platform_device.h> > > +#include <linux/pm_runtime.h> > > #include <linux/proc_fs.h> > > #include <linux/slab.h> > > > > @@ -375,6 +376,8 @@ void pci_bus_add_device(struct pci_dev *dev) > > put_device(&pdev->dev); > > } > > > > + pm_runtime_enable(&dev->dev); > > + > > if (!dn || of_device_is_available(dn)) > > pci_dev_allow_binding(dev); > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > index b14dd064006c..f792164fa297 100644 > > --- a/drivers/pci/pci.c > > +++ b/drivers/pci/pci.c > > @@ -3226,7 +3226,6 @@ void pci_pm_init(struct pci_dev *dev) > > pci_pm_power_up_and_verify_state(dev); > > pm_runtime_forbid(&dev->dev); > > pm_runtime_set_active(&dev->dev); > > Actually, I think that the two statements above can be moved too. > > The pm_runtime_forbid() call doesn't matter until runtime PM is > enabled and it is better to do pm_runtime_set_active() right before > enabling runtime PM. Ehh, sure, I can do that I suppose. That signs me up for fixing some excessive Documentation/ that spells out exactly what goes on in pci_pm_init() for some reason, but I can do that too :) Brian > > - pm_runtime_enable(&dev->dev); > > } > > > > static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) > > -- ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3] PCI/PM: Prevent runtime suspend before devices are fully initialized 2025-10-23 18:34 ` Brian Norris @ 2025-10-23 18:42 ` Brian Norris 2025-10-23 18:49 ` Rafael J. Wysocki 0 siblings, 1 reply; 5+ messages in thread From: Brian Norris @ 2025-10-23 18:42 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Bjorn Helgaas, Ilpo Järvinen, linux-pm, linux-pci, linux-kernel, Lukas Wunner Hi again, On Thu, Oct 23, 2025 at 11:34:57AM -0700, Brian Norris wrote: > On Thu, Oct 23, 2025 at 11:41:38AM +0200, Rafael J. Wysocki wrote: > > On Wed, Oct 22, 2025 at 11:15 PM Brian Norris <briannorris@chromium.org> wrote: > > > > > > Today, it's possible for a PCI device to be created and > > > runtime-suspended before it is fully initialized. When that happens, the > > > device will remain in D0, but the suspend process may save an > > > intermediate version of that device's state -- for example, without > > > appropriate BAR configuration. When the device later resumes, we'll > > > restore invalid PCI state and the device may not function. > > > > > > Prevent runtime suspend for PCI devices by deferring pm_runtime_enable() > > > until we've fully initialized the device. > > > > > > More details on how exactly this may occur: > > > > > > 1. PCI device is created by pci_scan_slot() or similar > > > 2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the > > > device starts "active" and we initially prevent (pm_runtime_forbid()) > > > suspend -- but see [*] footnote > > > 3. Underlying 'struct device' is added to the system (device_add()); > > > runtime PM can now be configured by user space > > > 4. PCI device receives BAR configuration > > > (pci_assign_unassigned_bus_resources(), etc.) > > > 5. PCI device is added to the system in pci_bus_add_device() > > > > > > The device may potentially suspend between #3 and #4. > > > > > > [*] By default, pm_runtime_forbid() prevents suspending a device; but by > > > design [**], this can be overridden by user space policy via > > > > > > echo auto > /sys/bus/pci/devices/.../power/control > > > > > > Thus, the above #3/#4 sequence is racy with user space (udev or > > > similar). > > > > > > Notably, many PCI devices are enumerated at subsys_initcall time and so > > > will not race with user space. However, there are several scenarios > > > where PCI devices are created later on, such as with hotplug or when > > > drivers (pwrctrl or controller drivers) are built as modules. > > > > > > [**] The relationship between pm_runtime_forbid(), pm_runtime_allow(), > > > /sys/.../power/control, and the runtime PM usage counter can be subtle. > > > It appears that the intention of pm_runtime_forbid() / > > > pm_runtime_allow() is twofold: > > > > > > 1. Allow the user to disable runtime_pm (force device to always be > > > powered on) through sysfs. > > > 2. Allow the driver to start with runtime_pm disabled (device forced > > > on) and user space could later enable runtime_pm. > > > > > > This conclusion comes from reading `Documentation/power/runtime_pm.rst`, > > > specifically the section starting "The user space can effectively > > > disallow". > > > > > > This means that while pm_runtime_forbid() does technically increase the > > > runtime PM usage counter, this usage counter is not a guarantee of > > > functional correctness, because sysfs can decrease that count again. > > > > > > Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/ > > > Signed-off-by: Brian Norris <briannorris@chromium.org> > > > Cc: <stable@vger.kernel.org> > > > Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> > > > --- > > > > > > Changes in v3: > > > * Add Link to initial discussion > > > * Add Rafael's Reviewed-by > > > * Add lengthier footnotes about forbid vs allow vs sysfs > > > > > > Changes in v2: > > > * Update CC list > > > * Rework problem description > > > * Update solution: defer pm_runtime_enable(), instead of trying to > > > get()/put() > > > > > > drivers/pci/bus.c | 3 +++ > > > drivers/pci/pci.c | 1 - > > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c > > > index f26aec6ff588..fc66b6cb3a54 100644 > > > --- a/drivers/pci/bus.c > > > +++ b/drivers/pci/bus.c > > > @@ -14,6 +14,7 @@ > > > #include <linux/of.h> > > > #include <linux/of_platform.h> > > > #include <linux/platform_device.h> > > > +#include <linux/pm_runtime.h> > > > #include <linux/proc_fs.h> > > > #include <linux/slab.h> > > > > > > @@ -375,6 +376,8 @@ void pci_bus_add_device(struct pci_dev *dev) > > > put_device(&pdev->dev); > > > } > > > > > > + pm_runtime_enable(&dev->dev); > > > + > > > if (!dn || of_device_is_available(dn)) > > > pci_dev_allow_binding(dev); > > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > > index b14dd064006c..f792164fa297 100644 > > > --- a/drivers/pci/pci.c > > > +++ b/drivers/pci/pci.c > > > @@ -3226,7 +3226,6 @@ void pci_pm_init(struct pci_dev *dev) > > > pci_pm_power_up_and_verify_state(dev); > > > pm_runtime_forbid(&dev->dev); > > > pm_runtime_set_active(&dev->dev); > > > > Actually, I think that the two statements above can be moved too. > > > > The pm_runtime_forbid() call doesn't matter until runtime PM is > > enabled and it is better to do pm_runtime_set_active() right before > > enabling runtime PM. > > Ehh, sure, I can do that I suppose. That signs me up for fixing some > excessive Documentation/ that spells out exactly what goes on in > pci_pm_init() for some reason, but I can do that too :) Hmm, never mind, I think I agreed with you too soon. It should be ok to move pm_runtime_set_active(), but I believe moving pm_runtime_forbid() would introduce a similar race condition problem to the bug I'm resolving -- pci_bus_add_device() is run after device_add(), and so user space might already see the device and try to configure power/control. If we forbid() after that point, user space would be confused. Documentation/power/runtime_pm.rst even calls this out: "It should be noted, however, that if the user space has already intentionally changed the value of /sys/devices/.../power/control to "auto" to allow the driver to power manage the device at run time, the driver may confuse it by using pm_runtime_forbid() this way." So I'd rather not make this change. Brian > > > - pm_runtime_enable(&dev->dev); > > > } > > > > > > static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) > > > -- > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3] PCI/PM: Prevent runtime suspend before devices are fully initialized 2025-10-23 18:42 ` Brian Norris @ 2025-10-23 18:49 ` Rafael J. Wysocki 0 siblings, 0 replies; 5+ messages in thread From: Rafael J. Wysocki @ 2025-10-23 18:49 UTC (permalink / raw) To: Brian Norris Cc: Rafael J. Wysocki, Bjorn Helgaas, Ilpo Järvinen, linux-pm, linux-pci, linux-kernel, Lukas Wunner On Thu, Oct 23, 2025 at 8:42 PM Brian Norris <briannorris@chromium.org> wrote: > > Hi again, > > On Thu, Oct 23, 2025 at 11:34:57AM -0700, Brian Norris wrote: > > On Thu, Oct 23, 2025 at 11:41:38AM +0200, Rafael J. Wysocki wrote: > > > On Wed, Oct 22, 2025 at 11:15 PM Brian Norris <briannorris@chromium.org> wrote: > > > > > > > > Today, it's possible for a PCI device to be created and > > > > runtime-suspended before it is fully initialized. When that happens, the > > > > device will remain in D0, but the suspend process may save an > > > > intermediate version of that device's state -- for example, without > > > > appropriate BAR configuration. When the device later resumes, we'll > > > > restore invalid PCI state and the device may not function. > > > > > > > > Prevent runtime suspend for PCI devices by deferring pm_runtime_enable() > > > > until we've fully initialized the device. > > > > > > > > More details on how exactly this may occur: > > > > > > > > 1. PCI device is created by pci_scan_slot() or similar > > > > 2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the > > > > device starts "active" and we initially prevent (pm_runtime_forbid()) > > > > suspend -- but see [*] footnote > > > > 3. Underlying 'struct device' is added to the system (device_add()); > > > > runtime PM can now be configured by user space > > > > 4. PCI device receives BAR configuration > > > > (pci_assign_unassigned_bus_resources(), etc.) > > > > 5. PCI device is added to the system in pci_bus_add_device() > > > > > > > > The device may potentially suspend between #3 and #4. > > > > > > > > [*] By default, pm_runtime_forbid() prevents suspending a device; but by > > > > design [**], this can be overridden by user space policy via > > > > > > > > echo auto > /sys/bus/pci/devices/.../power/control > > > > > > > > Thus, the above #3/#4 sequence is racy with user space (udev or > > > > similar). > > > > > > > > Notably, many PCI devices are enumerated at subsys_initcall time and so > > > > will not race with user space. However, there are several scenarios > > > > where PCI devices are created later on, such as with hotplug or when > > > > drivers (pwrctrl or controller drivers) are built as modules. > > > > > > > > [**] The relationship between pm_runtime_forbid(), pm_runtime_allow(), > > > > /sys/.../power/control, and the runtime PM usage counter can be subtle. > > > > It appears that the intention of pm_runtime_forbid() / > > > > pm_runtime_allow() is twofold: > > > > > > > > 1. Allow the user to disable runtime_pm (force device to always be > > > > powered on) through sysfs. > > > > 2. Allow the driver to start with runtime_pm disabled (device forced > > > > on) and user space could later enable runtime_pm. > > > > > > > > This conclusion comes from reading `Documentation/power/runtime_pm.rst`, > > > > specifically the section starting "The user space can effectively > > > > disallow". > > > > > > > > This means that while pm_runtime_forbid() does technically increase the > > > > runtime PM usage counter, this usage counter is not a guarantee of > > > > functional correctness, because sysfs can decrease that count again. > > > > > > > > Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/ > > > > Signed-off-by: Brian Norris <briannorris@chromium.org> > > > > Cc: <stable@vger.kernel.org> > > > > Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> > > > > --- > > > > > > > > Changes in v3: > > > > * Add Link to initial discussion > > > > * Add Rafael's Reviewed-by > > > > * Add lengthier footnotes about forbid vs allow vs sysfs > > > > > > > > Changes in v2: > > > > * Update CC list > > > > * Rework problem description > > > > * Update solution: defer pm_runtime_enable(), instead of trying to > > > > get()/put() > > > > > > > > drivers/pci/bus.c | 3 +++ > > > > drivers/pci/pci.c | 1 - > > > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c > > > > index f26aec6ff588..fc66b6cb3a54 100644 > > > > --- a/drivers/pci/bus.c > > > > +++ b/drivers/pci/bus.c > > > > @@ -14,6 +14,7 @@ > > > > #include <linux/of.h> > > > > #include <linux/of_platform.h> > > > > #include <linux/platform_device.h> > > > > +#include <linux/pm_runtime.h> > > > > #include <linux/proc_fs.h> > > > > #include <linux/slab.h> > > > > > > > > @@ -375,6 +376,8 @@ void pci_bus_add_device(struct pci_dev *dev) > > > > put_device(&pdev->dev); > > > > } > > > > > > > > + pm_runtime_enable(&dev->dev); > > > > + > > > > if (!dn || of_device_is_available(dn)) > > > > pci_dev_allow_binding(dev); > > > > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > > > index b14dd064006c..f792164fa297 100644 > > > > --- a/drivers/pci/pci.c > > > > +++ b/drivers/pci/pci.c > > > > @@ -3226,7 +3226,6 @@ void pci_pm_init(struct pci_dev *dev) > > > > pci_pm_power_up_and_verify_state(dev); > > > > pm_runtime_forbid(&dev->dev); > > > > pm_runtime_set_active(&dev->dev); > > > > > > Actually, I think that the two statements above can be moved too. > > > > > > The pm_runtime_forbid() call doesn't matter until runtime PM is > > > enabled and it is better to do pm_runtime_set_active() right before > > > enabling runtime PM. > > > > Ehh, sure, I can do that I suppose. That signs me up for fixing some > > excessive Documentation/ that spells out exactly what goes on in > > pci_pm_init() for some reason, but I can do that too :) > > Hmm, never mind, I think I agreed with you too soon. > > It should be ok to move pm_runtime_set_active(), but I believe moving > pm_runtime_forbid() would introduce a similar race condition problem to > the bug I'm resolving -- pci_bus_add_device() is run after device_add(), > and so user space might already see the device and try to configure > power/control. If we forbid() after that point, user space would be > confused. > > Documentation/power/runtime_pm.rst even calls this out: > > "It should be noted, however, that if the user space has already > intentionally changed the value of /sys/devices/.../power/control to > "auto" to allow the driver to power manage the device at run time, the > driver may confuse it by using pm_runtime_forbid() this way." > > So I'd rather not make this change. OK, fair enough, so leave the pm_runtime_forbid() as is, but pm_runtime_set_active() should really be done right prior to enabling. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-10-23 18:49 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-10-22 21:14 [PATCH v3] PCI/PM: Prevent runtime suspend before devices are fully initialized Brian Norris 2025-10-23 9:41 ` Rafael J. Wysocki 2025-10-23 18:34 ` Brian Norris 2025-10-23 18:42 ` Brian Norris 2025-10-23 18:49 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox