[PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management
@ 2023-08-03 17:12 Alex Williamson
  2023-08-03 17:12 ` [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface Alex Williamson
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Alex Williamson @ 2023-08-03 17:12 UTC (permalink / raw)
  To: bhelgaas; +Cc: Alex Williamson, linux-pci, linux-kernel, eric.auger

Since v5.19, vfio-pci makes use of runtime power management on devices.
This has the effect of potentially putting entire sub-hierarchies into
lower power states, which has exposed some gaps in the PCI subsystem
around power management support.

The first issue is that lspci accesses the VPD sysfs interface, which
does not provide the same power management wrappers as general config
space.

The next covers PME, where we attempt to skip devices based on their PCI
power state, but don't protect changes to that state or look at the
overall runtime power management state of the device.

This latter patch addresses the issue noted by Eric in the follow-ups to
v1 linked below.

These patches are logically independent, but only together resolve an
issue on Eric's system where a pair of endpoints bound to vfio-pci and
unused by userspace drivers trigger faults through lspci and PME
polling.  Thanks,

Alex 

v1: https://lore.kernel.org/all/20230707151044.1311544-1-alex.williamson@redhat.com/

Alex Williamson (2):
  PCI/VPD: Add runtime power management to sysfs interface
  PCI: Fix runtime PM race with PME polling

 drivers/pci/pci.c | 23 ++++++++++++++++-------
 drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
 2 files changed, 48 insertions(+), 9 deletions(-)

-- 
2.40.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface
  2023-08-03 17:12 [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management Alex Williamson
@ 2023-08-03 17:12 ` Alex Williamson
  2023-08-10 15:59   ` Bjorn Helgaas
  2023-08-11 19:25   ` Bjorn Helgaas
  2023-08-03 17:12 ` [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling Alex Williamson
  2023-08-10 18:29 ` [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management Bjorn Helgaas
  2 siblings, 2 replies; 18+ messages in thread
From: Alex Williamson @ 2023-08-03 17:12 UTC (permalink / raw)
  To: bhelgaas; +Cc: Alex Williamson, linux-pci, linux-kernel, eric.auger

Unlike default access to config space through sysfs, the vpd read and
write function don't actively manage the runtime power management state
of the device during access.  Since commit 7ab5e10eda02 ("vfio/pci: Move
the unused device into low power state with runtime PM"), the vfio-pci
driver will use runtime power management and release unused devices to
make use of low power states.  Attempting to access VPD information in
this low power state can result in incorrect information or kernel
crashes depending on the system behavior.

Wrap the vpd read/write bin attribute handlers in runtime PM and take
into account the potential quirk to select the correct device to wake.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
index a4fc4d0690fe..81217dd4789f 100644
--- a/drivers/pci/vpd.c
+++ b/drivers/pci/vpd.c
@@ -275,8 +275,23 @@ static ssize_t vpd_read(struct file *filp, struct kobject *kobj,
 			size_t count)
 {
 	struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
+	struct pci_dev *vpd_dev = dev;
+	ssize_t ret;
+
+	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
+		vpd_dev = pci_get_func0_dev(dev);
+		if (!vpd_dev)
+			return -ENODEV;
+	}
+
+	pci_config_pm_runtime_get(vpd_dev);
+	ret = pci_read_vpd(vpd_dev, off, count, buf);
+	pci_config_pm_runtime_put(vpd_dev);
+
+	if (dev != vpd_dev)
+		pci_dev_put(vpd_dev);
 
-	return pci_read_vpd(dev, off, count, buf);
+	return ret;
 }
 
 static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
@@ -284,8 +299,23 @@ static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
 			 size_t count)
 {
 	struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
+	struct pci_dev *vpd_dev = dev;
+	ssize_t ret;
+
+	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
+		vpd_dev = pci_get_func0_dev(dev);
+		if (!vpd_dev)
+			return -ENODEV;
+	}
+
+	pci_config_pm_runtime_get(vpd_dev);
+	ret = pci_write_vpd(vpd_dev, off, count, buf);
+	pci_config_pm_runtime_put(vpd_dev);
+
+	if (dev != vpd_dev)
+		pci_dev_put(vpd_dev);
 
-	return pci_write_vpd(dev, off, count, buf);
+	return ret;
 }
 static BIN_ATTR(vpd, 0600, vpd_read, vpd_write, 0);
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface
  2023-08-03 17:12 ` [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface Alex Williamson
@ 2023-08-10 15:59   ` Bjorn Helgaas
  2023-08-10 16:26     ` Alex Williamson
  2023-08-11 19:25   ` Bjorn Helgaas
  1 sibling, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2023-08-10 15:59 UTC (permalink / raw)
  To: Alex Williamson; +Cc: bhelgaas, linux-pci, linux-kernel, eric.auger

On Thu, Aug 03, 2023 at 11:12:32AM -0600, Alex Williamson wrote:
> Unlike default access to config space through sysfs, the vpd read and
> write function don't actively manage the runtime power management state
> of the device during access.  Since commit 7ab5e10eda02 ("vfio/pci: Move
> the unused device into low power state with runtime PM"), the vfio-pci
> driver will use runtime power management and release unused devices to
> make use of low power states.  Attempting to access VPD information in
> this low power state can result in incorrect information or kernel
> crashes depending on the system behavior.
> 
> Wrap the vpd read/write bin attribute handlers in runtime PM and take
> into account the potential quirk to select the correct device to wake.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
> index a4fc4d0690fe..81217dd4789f 100644
> --- a/drivers/pci/vpd.c
> +++ b/drivers/pci/vpd.c
> @@ -275,8 +275,23 @@ static ssize_t vpd_read(struct file *filp, struct kobject *kobj,
>  			size_t count)
>  {
>  	struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> +	struct pci_dev *vpd_dev = dev;
> +	ssize_t ret;
> +
> +	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> +		vpd_dev = pci_get_func0_dev(dev);
> +		if (!vpd_dev)
> +			return -ENODEV;
> +	}
> +
> +	pci_config_pm_runtime_get(vpd_dev);
> +	ret = pci_read_vpd(vpd_dev, off, count, buf);
> +	pci_config_pm_runtime_put(vpd_dev);
> +
> +	if (dev != vpd_dev)
> +		pci_dev_put(vpd_dev);

I first thought this would leak a reference if dev was func0 and had
PCI_DEV_FLAGS_VPD_REF_F0 set, because in that case vpd_dev would be
the same as dev.

But I think that case can't happen because quirk_f0_vpd_link() does
nothing for func0 devices, so PCI_DEV_FLAGS_VPD_REF_F0 should never be
set for func0.  But it seems like this might be easier to analyze as:

  if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0)
    pci_dev_put(vpd_dev);

Or am I missing something?

> -	return pci_read_vpd(dev, off, count, buf);
> +	return ret;
>  }
>  
>  static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
> @@ -284,8 +299,23 @@ static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
>  			 size_t count)
>  {
>  	struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> +	struct pci_dev *vpd_dev = dev;
> +	ssize_t ret;
> +
> +	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> +		vpd_dev = pci_get_func0_dev(dev);
> +		if (!vpd_dev)
> +			return -ENODEV;
> +	}
> +
> +	pci_config_pm_runtime_get(vpd_dev);
> +	ret = pci_write_vpd(vpd_dev, off, count, buf);
> +	pci_config_pm_runtime_put(vpd_dev);
> +
> +	if (dev != vpd_dev)
> +		pci_dev_put(vpd_dev);
>  
> -	return pci_write_vpd(dev, off, count, buf);
> +	return ret;
>  }
>  static BIN_ATTR(vpd, 0600, vpd_read, vpd_write, 0);
>  
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface
  2023-08-10 15:59   ` Bjorn Helgaas
@ 2023-08-10 16:26     ` Alex Williamson
  0 siblings, 0 replies; 18+ messages in thread
From: Alex Williamson @ 2023-08-10 16:26 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: bhelgaas, linux-pci, linux-kernel, eric.auger

On Thu, 10 Aug 2023 10:59:26 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Thu, Aug 03, 2023 at 11:12:32AM -0600, Alex Williamson wrote:
> > Unlike default access to config space through sysfs, the vpd read and
> > write function don't actively manage the runtime power management state
> > of the device during access.  Since commit 7ab5e10eda02 ("vfio/pci: Move
> > the unused device into low power state with runtime PM"), the vfio-pci
> > driver will use runtime power management and release unused devices to
> > make use of low power states.  Attempting to access VPD information in
> > this low power state can result in incorrect information or kernel
> > crashes depending on the system behavior.
> > 
> > Wrap the vpd read/write bin attribute handlers in runtime PM and take
> > into account the potential quirk to select the correct device to wake.
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
> >  1 file changed, 32 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
> > index a4fc4d0690fe..81217dd4789f 100644
> > --- a/drivers/pci/vpd.c
> > +++ b/drivers/pci/vpd.c
> > @@ -275,8 +275,23 @@ static ssize_t vpd_read(struct file *filp, struct kobject *kobj,
> >  			size_t count)
> >  {
> >  	struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> > +	struct pci_dev *vpd_dev = dev;
> > +	ssize_t ret;
> > +
> > +	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> > +		vpd_dev = pci_get_func0_dev(dev);
> > +		if (!vpd_dev)
> > +			return -ENODEV;
> > +	}
> > +
> > +	pci_config_pm_runtime_get(vpd_dev);
> > +	ret = pci_read_vpd(vpd_dev, off, count, buf);
> > +	pci_config_pm_runtime_put(vpd_dev);
> > +
> > +	if (dev != vpd_dev)
> > +		pci_dev_put(vpd_dev);  
> 
> I first thought this would leak a reference if dev was func0 and had
> PCI_DEV_FLAGS_VPD_REF_F0 set, because in that case vpd_dev would be
> the same as dev.
> 
> But I think that case can't happen because quirk_f0_vpd_link() does
> nothing for func0 devices, so PCI_DEV_FLAGS_VPD_REF_F0 should never be
> set for func0.  But it seems like this might be easier to analyze as:
> 
>   if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0)
>     pci_dev_put(vpd_dev);
> 
> Or am I missing something?

Nope, your analysis is correct, it doesn't make any sense to have a
flag on func0 redirecting VPD access to func0 so vpd_dev can only be
different on non-zero functions.  The alternative test is equally
valid so if you think it's more intuitive, let's use it.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface
  2023-08-03 17:12 ` [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface Alex Williamson
  2023-08-10 15:59   ` Bjorn Helgaas
@ 2023-08-11 19:25   ` Bjorn Helgaas
  2023-08-11 19:56     ` Alex Williamson
  1 sibling, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2023-08-11 19:25 UTC (permalink / raw)
  To: Alex Williamson; +Cc: bhelgaas, linux-pci, linux-kernel, eric.auger

On Thu, Aug 03, 2023 at 11:12:32AM -0600, Alex Williamson wrote:
> Unlike default access to config space through sysfs, the vpd read and
> write function don't actively manage the runtime power management state
> of the device during access.  Since commit 7ab5e10eda02 ("vfio/pci: Move
> the unused device into low power state with runtime PM"), the vfio-pci
> driver will use runtime power management and release unused devices to
> make use of low power states.  Attempting to access VPD information in
> this low power state can result in incorrect information or kernel
> crashes depending on the system behavior.

I guess this specifically refers to D3cold, right?  VPD is accessed
via config space, which should work in all D0-D3hot states, but not in
D3cold.  I don't see anything in the spec about needing to be in D0 to
access VPD.

I assume there's no public problem report we could cite here?  I
suppose the behavior in D3cold is however the system handles a UR
error.

> Wrap the vpd read/write bin attribute handlers in runtime PM and take
> into account the potential quirk to select the correct device to wake.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
> index a4fc4d0690fe..81217dd4789f 100644
> --- a/drivers/pci/vpd.c
> +++ b/drivers/pci/vpd.c
> @@ -275,8 +275,23 @@ static ssize_t vpd_read(struct file *filp, struct kobject *kobj,
>  			size_t count)
>  {
>  	struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> +	struct pci_dev *vpd_dev = dev;
> +	ssize_t ret;
> +
> +	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> +		vpd_dev = pci_get_func0_dev(dev);
> +		if (!vpd_dev)
> +			return -ENODEV;
> +	}
> +
> +	pci_config_pm_runtime_get(vpd_dev);
> +	ret = pci_read_vpd(vpd_dev, off, count, buf);
> +	pci_config_pm_runtime_put(vpd_dev);
> +
> +	if (dev != vpd_dev)
> +		pci_dev_put(vpd_dev);
>  
> -	return pci_read_vpd(dev, off, count, buf);
> +	return ret;
>  }
>  
>  static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
> @@ -284,8 +299,23 @@ static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
>  			 size_t count)
>  {
>  	struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> +	struct pci_dev *vpd_dev = dev;
> +	ssize_t ret;
> +
> +	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> +		vpd_dev = pci_get_func0_dev(dev);
> +		if (!vpd_dev)
> +			return -ENODEV;
> +	}
> +
> +	pci_config_pm_runtime_get(vpd_dev);
> +	ret = pci_write_vpd(vpd_dev, off, count, buf);
> +	pci_config_pm_runtime_put(vpd_dev);
> +
> +	if (dev != vpd_dev)
> +		pci_dev_put(vpd_dev);
>  
> -	return pci_write_vpd(dev, off, count, buf);
> +	return ret;
>  }
>  static BIN_ATTR(vpd, 0600, vpd_read, vpd_write, 0);
>  
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface
  2023-08-11 19:25   ` Bjorn Helgaas
@ 2023-08-11 19:56     ` Alex Williamson
  0 siblings, 0 replies; 18+ messages in thread
From: Alex Williamson @ 2023-08-11 19:56 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: bhelgaas, linux-pci, linux-kernel, eric.auger

On Fri, 11 Aug 2023 14:25:43 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Thu, Aug 03, 2023 at 11:12:32AM -0600, Alex Williamson wrote:
> > Unlike default access to config space through sysfs, the vpd read and
> > write function don't actively manage the runtime power management state
> > of the device during access.  Since commit 7ab5e10eda02 ("vfio/pci: Move
> > the unused device into low power state with runtime PM"), the vfio-pci
> > driver will use runtime power management and release unused devices to
> > make use of low power states.  Attempting to access VPD information in
> > this low power state can result in incorrect information or kernel
> > crashes depending on the system behavior.  
> 
> I guess this specifically refers to D3cold, right?  VPD is accessed
> via config space, which should work in all D0-D3hot states, but not in
> D3cold.  I don't see anything in the spec about needing to be in D0 to
> access VPD.
> 
> I assume there's no public problem report we could cite here?  I
> suppose the behavior in D3cold is however the system handles a UR
> error.

Yes, Eric tested that pcie_port_pm=off is a viable workaround resolving
both the VPD and PME accesses, so I think the issue is actually that
the root port is in D3cold.  This aligns with commit 7ab5e10eda02
above, since prior to that we were only manipulating the endpoint power
state.

The oops indicates an "Internal error: synchronous external abort",
with a stack trace ending in pci_generic_config_read, so I suspect this
is a UR.

Unfortunately the bz is not currently public :-\  Thanks,

Alex
 
> > Wrap the vpd read/write bin attribute handlers in runtime PM and take
> > into account the potential quirk to select the correct device to wake.
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
> >  1 file changed, 32 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
> > index a4fc4d0690fe..81217dd4789f 100644
> > --- a/drivers/pci/vpd.c
> > +++ b/drivers/pci/vpd.c
> > @@ -275,8 +275,23 @@ static ssize_t vpd_read(struct file *filp, struct kobject *kobj,
> >  			size_t count)
> >  {
> >  	struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> > +	struct pci_dev *vpd_dev = dev;
> > +	ssize_t ret;
> > +
> > +	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> > +		vpd_dev = pci_get_func0_dev(dev);
> > +		if (!vpd_dev)
> > +			return -ENODEV;
> > +	}
> > +
> > +	pci_config_pm_runtime_get(vpd_dev);
> > +	ret = pci_read_vpd(vpd_dev, off, count, buf);
> > +	pci_config_pm_runtime_put(vpd_dev);
> > +
> > +	if (dev != vpd_dev)
> > +		pci_dev_put(vpd_dev);
> >  
> > -	return pci_read_vpd(dev, off, count, buf);
> > +	return ret;
> >  }
> >  
> >  static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
> > @@ -284,8 +299,23 @@ static ssize_t vpd_write(struct file *filp, struct kobject *kobj,
> >  			 size_t count)
> >  {
> >  	struct pci_dev *dev = to_pci_dev(kobj_to_dev(kobj));
> > +	struct pci_dev *vpd_dev = dev;
> > +	ssize_t ret;
> > +
> > +	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
> > +		vpd_dev = pci_get_func0_dev(dev);
> > +		if (!vpd_dev)
> > +			return -ENODEV;
> > +	}
> > +
> > +	pci_config_pm_runtime_get(vpd_dev);
> > +	ret = pci_write_vpd(vpd_dev, off, count, buf);
> > +	pci_config_pm_runtime_put(vpd_dev);
> > +
> > +	if (dev != vpd_dev)
> > +		pci_dev_put(vpd_dev);
> >  
> > -	return pci_write_vpd(dev, off, count, buf);
> > +	return ret;
> >  }
> >  static BIN_ATTR(vpd, 0600, vpd_read, vpd_write, 0);
> >  
> > -- 
> > 2.40.1
> >   
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2023-08-03 17:12 [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management Alex Williamson
  2023-08-03 17:12 ` [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface Alex Williamson
@ 2023-08-03 17:12 ` Alex Williamson
  2024-01-18 18:50   ` Alex Williamson
  2023-08-10 18:29 ` [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management Bjorn Helgaas
  2 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2023-08-03 17:12 UTC (permalink / raw)
  To: bhelgaas; +Cc: Alex Williamson, linux-pci, linux-kernel, eric.auger

Testing that a device is not currently in a low power state provides no
guarantees that the device is not immenently transitioning to such a state.
We need to increment the PM usage counter before accessing the device.
Since we don't wish to wake the device for PME polling, do so only if the
device is already active by using pm_runtime_get_if_active().

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/pci.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 60230da957e0..bc266f290b2c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2415,10 +2415,13 @@ static void pci_pme_list_scan(struct work_struct *work)
 
 	mutex_lock(&pci_pme_list_mutex);
 	list_for_each_entry_safe(pme_dev, n, &pci_pme_list, list) {
-		if (pme_dev->dev->pme_poll) {
-			struct pci_dev *bridge;
+		struct pci_dev *pdev = pme_dev->dev;
+
+		if (pdev->pme_poll) {
+			struct pci_dev *bridge = pdev->bus->self;
+			struct device *dev = &pdev->dev;
+			int pm_status;
 
-			bridge = pme_dev->dev->bus->self;
 			/*
 			 * If bridge is in low power state, the
 			 * configuration space of subordinate devices
@@ -2426,14 +2429,20 @@ static void pci_pme_list_scan(struct work_struct *work)
 			 */
 			if (bridge && bridge->current_state != PCI_D0)
 				continue;
+
 			/*
-			 * If the device is in D3cold it should not be
-			 * polled either.
+			 * If the device is in a low power state it
+			 * should not be polled either.
 			 */
-			if (pme_dev->dev->current_state == PCI_D3cold)
+			pm_status = pm_runtime_get_if_active(dev, true);
+			if (!pm_status)
 				continue;
 
-			pci_pme_wakeup(pme_dev->dev, NULL);
+			if (pdev->current_state != PCI_D3cold)
+				pci_pme_wakeup(pdev, NULL);
+
+			if (pm_status > 0)
+				pm_runtime_put(dev);
 		} else {
 			list_del(&pme_dev->list);
 			kfree(pme_dev);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2023-08-03 17:12 ` [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling Alex Williamson
@ 2024-01-18 18:50   ` Alex Williamson
  2024-01-22 22:17     ` Lukas Wunner
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2024-01-18 18:50 UTC (permalink / raw)
  To: linux-pci
  Cc: bhelgaas, linux-kernel, eric.auger, mika.westerberg, lukas,
	rafael.j.wysocki, Sanath.S

On Thu,  3 Aug 2023 11:12:33 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> Testing that a device is not currently in a low power state provides no
> guarantees that the device is not immenently transitioning to such a state.
> We need to increment the PM usage counter before accessing the device.
> Since we don't wish to wake the device for PME polling, do so only if the
> device is already active by using pm_runtime_get_if_active().
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/pci/pci.c | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)

Hey folks,

Resurrecting this patch (currently commit d3fcd7360338) for discussion
as it's been identified as the source of a regression in:

https://bugzilla.kernel.org/show_bug.cgi?id=218360

Copying Mika, Lukas, and Rafael as it's related to:

000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")

where we skip devices in D3cold when processing the PME list.

I think the issue in the above bz is that the downstream TB3/USB4 port
is in D3 (presumably D3hot) and I therefore infer the device is in state
RPM_SUSPENDED.  This commit is attempting to make sure the device power
state is stable across the call such that it does not transition into
D3cold while we're accessing it.

To do that I used pm_runtime_get_if_active(), but in retrospect this
requires the device to be in RPM_ACTIVE so we end up skipping anything
suspended or transitioning.

As reported in the above bz, I tried replacing this with:

	pm_runtime_get_noresume(dev);
	pm_runtime_barrier(dev);

The theory here being that the barrier would wait for any transitioning
states such that as far as runtime power management is concerned, the
device power state is stable.

This causes live locks where the barrier never returns.

Instead I'm considering that since we're polling the PME list, maybe we
could just defer devices in transition states, for instance something
that looks like pm_runtime_get_if_active(), but would return zero if
the device was in RPM_SUSPENDING or RPM_RESUMING rather than requiring
RPM_ACTIVE.

I'm not an expert in PME or runtime power management though, so I'm
looking for advice.  Thanks,

Alex

> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 60230da957e0..bc266f290b2c 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2415,10 +2415,13 @@ static void pci_pme_list_scan(struct work_struct *work)
>  
>  	mutex_lock(&pci_pme_list_mutex);
>  	list_for_each_entry_safe(pme_dev, n, &pci_pme_list, list) {
> -		if (pme_dev->dev->pme_poll) {
> -			struct pci_dev *bridge;
> +		struct pci_dev *pdev = pme_dev->dev;
> +
> +		if (pdev->pme_poll) {
> +			struct pci_dev *bridge = pdev->bus->self;
> +			struct device *dev = &pdev->dev;
> +			int pm_status;
>  
> -			bridge = pme_dev->dev->bus->self;
>  			/*
>  			 * If bridge is in low power state, the
>  			 * configuration space of subordinate devices
> @@ -2426,14 +2429,20 @@ static void pci_pme_list_scan(struct work_struct *work)
>  			 */
>  			if (bridge && bridge->current_state != PCI_D0)
>  				continue;
> +
>  			/*
> -			 * If the device is in D3cold it should not be
> -			 * polled either.
> +			 * If the device is in a low power state it
> +			 * should not be polled either.
>  			 */
> -			if (pme_dev->dev->current_state == PCI_D3cold)
> +			pm_status = pm_runtime_get_if_active(dev, true);
> +			if (!pm_status)
>  				continue;
>  
> -			pci_pme_wakeup(pme_dev->dev, NULL);
> +			if (pdev->current_state != PCI_D3cold)
> +				pci_pme_wakeup(pdev, NULL);
> +
> +			if (pm_status > 0)
> +				pm_runtime_put(dev);
>  		} else {
>  			list_del(&pme_dev->list);
>  			kfree(pme_dev);

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2024-01-18 18:50   ` Alex Williamson
@ 2024-01-22 22:17     ` Lukas Wunner
  2024-01-22 22:50       ` Alex Williamson
  0 siblings, 1 reply; 18+ messages in thread
From: Lukas Wunner @ 2024-01-22 22:17 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-pci, bhelgaas, linux-kernel, eric.auger, mika.westerberg,
	rafael.j.wysocki, Sanath.S

On Thu, Jan 18, 2024 at 11:50:49AM -0700, Alex Williamson wrote:
> On Thu,  3 Aug 2023 11:12:33 -0600 Alex Williamson <alex.williamson@redhat.com wrote:
> > Testing that a device is not currently in a low power state provides no
> > guarantees that the device is not immenently transitioning to such a state.
> > We need to increment the PM usage counter before accessing the device.
> > Since we don't wish to wake the device for PME polling, do so only if the
> > device is already active by using pm_runtime_get_if_active().
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  drivers/pci/pci.c | 23 ++++++++++++++++-------
> >  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> Resurrecting this patch (currently commit d3fcd7360338) for discussion
> as it's been identified as the source of a regression in:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=218360
> 
> Copying Mika, Lukas, and Rafael as it's related to:
> 
> 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
> 
> where we skip devices in D3cold when processing the PME list.
> 
> I think the issue in the above bz is that the downstream TB3/USB4 port
> is in D3 (presumably D3hot) and I therefore infer the device is in state
> RPM_SUSPENDED.  This commit is attempting to make sure the device power
> state is stable across the call such that it does not transition into
> D3cold while we're accessing it.
> 
> To do that I used pm_runtime_get_if_active(), but in retrospect this
> requires the device to be in RPM_ACTIVE so we end up skipping anything
> suspended or transitioning.

How about dropping the calls to pm_runtime_get_if_active() and
pm_runtime_put() and instead simply do:

			if (pm_runtime_suspended(&pdev->dev) &&
			    pdev->current_state != PCI_D3cold)
				pci_pme_wakeup(pdev, NULL);

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2024-01-22 22:17     ` Lukas Wunner
@ 2024-01-22 22:50       ` Alex Williamson
  2024-01-23  4:46         ` Alex Williamson
  2024-01-23 10:45         ` Lukas Wunner
  0 siblings, 2 replies; 18+ messages in thread
From: Alex Williamson @ 2024-01-22 22:50 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-pci, bhelgaas, linux-kernel, eric.auger, mika.westerberg,
	rafael.j.wysocki, Sanath.S

On Mon, 22 Jan 2024 23:17:30 +0100
Lukas Wunner <lukas@wunner.de> wrote:

> On Thu, Jan 18, 2024 at 11:50:49AM -0700, Alex Williamson wrote:
> > On Thu,  3 Aug 2023 11:12:33 -0600 Alex Williamson <alex.williamson@redhat.com wrote:  
> > > Testing that a device is not currently in a low power state provides no
> > > guarantees that the device is not immenently transitioning to such a state.
> > > We need to increment the PM usage counter before accessing the device.
> > > Since we don't wish to wake the device for PME polling, do so only if the
> > > device is already active by using pm_runtime_get_if_active().
> > > 
> > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > ---
> > >  drivers/pci/pci.c | 23 ++++++++++++++++-------
> > >  1 file changed, 16 insertions(+), 7 deletions(-)  
> > 
> > Resurrecting this patch (currently commit d3fcd7360338) for discussion
> > as it's been identified as the source of a regression in:
> > 
> > https://bugzilla.kernel.org/show_bug.cgi?id=218360
> > 
> > Copying Mika, Lukas, and Rafael as it's related to:
> > 
> > 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
> > 
> > where we skip devices in D3cold when processing the PME list.
> > 
> > I think the issue in the above bz is that the downstream TB3/USB4 port
> > is in D3 (presumably D3hot) and I therefore infer the device is in state
> > RPM_SUSPENDED.  This commit is attempting to make sure the device power
> > state is stable across the call such that it does not transition into
> > D3cold while we're accessing it.
> > 
> > To do that I used pm_runtime_get_if_active(), but in retrospect this
> > requires the device to be in RPM_ACTIVE so we end up skipping anything
> > suspended or transitioning.  
> 
> How about dropping the calls to pm_runtime_get_if_active() and
> pm_runtime_put() and instead simply do:
> 
> 			if (pm_runtime_suspended(&pdev->dev) &&
> 			    pdev->current_state != PCI_D3cold)
> 				pci_pme_wakeup(pdev, NULL);

Hi Lukas,

Do we require that the polled device is in the RPM_SUSPENDED state?
Also pm_runtime_suspended() can also only be trusted while holding the
device power.lock, we need a usage count reference to maintain that
state.

I'm also seeing cases where the bridge is power state D0, but PM state
RPM_SUSPENDING, so config space of the polled device becomes
inaccessible even while we're holding a reference once we allow polling
in RPM_SUSPENDED.

I'm currently working with the below patch, which I believe addresses
all these issues, but I'd welcome review and testing. Thanks,

Alex

commit 0a063b8e91d0bc807db712c81c8b270864f99ecb
Author: Alex Williamson <alex.williamson@redhat.com>
Date:   Tue Jan 16 13:28:33 2024 -0700

    PCI: Fix active state requirement in PME polling
    
    The commit noted in fixes added a bogus requirement that runtime PM
    managed devices need to be in the RPM_ACTIVE state for PME polling.
    In fact, there is no requirement of a specific runtime PM state, it
    is only required that the state is stable such that testing config
    space availability, ie. !D3cold, remains valid across the PME wakeup.
    
    To that effect, defer polling of runtime PM managed devices that are
    not in either the RPM_ACTIVE or RPM_SUSPENDED states.  Devices in
    transitional states remain on the pci_pme_list and will be re-queued.
    
    However in allowing polling of devices in the RPM_SUSPENDED state,
    the bridge state requires further refinement as it's possible to poll
    while the bridge is in D0, but the runtime PM state is RPM_SUSPENDING.
    An asynchronous completion of the bridge transition to a low power
    state can make config space of the subordinate device become
    unavailable.  A runtime PM reference to the bridge is therefore added
    with a supplementary requirement that the bridge is in the RPM_ACTIVE
    state.
    
    Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling")
    Reported-by: Sanath S <sanath.s@amd.com>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
    Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index bdbf8a94b4d0..31dbf1834b07 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2433,29 +2433,45 @@ static void pci_pme_list_scan(struct work_struct *work)
 		if (pdev->pme_poll) {
 			struct pci_dev *bridge = pdev->bus->self;
 			struct device *dev = &pdev->dev;
-			int pm_status;
+			struct device *bdev = bridge ? &bridge->dev : NULL;
 
 			/*
-			 * If bridge is in low power state, the
-			 * configuration space of subordinate devices
-			 * may be not accessible
+			 * If we have a bridge, it should be in an active/D0
+			 * state or the configuration space of subordinate
+			 * devices may not be accessible.
 			 */
-			if (bridge && bridge->current_state != PCI_D0)
-				continue;
+			if (bdev) {
+				spin_lock_irq(&bdev->power.lock);
+				if (!pm_runtime_active(bdev) ||
+				    bridge->current_state != PCI_D0) {
+					spin_unlock_irq(&bdev->power.lock);
+					continue;
+				}
+				pm_runtime_get_noresume(bdev);
+				spin_unlock_irq(&bdev->power.lock);
+			}
 
 			/*
-			 * If the device is in a low power state it
-			 * should not be polled either.
+			 * The device itself may be either in active or
+			 * suspended state, but must not be in D3cold so
+			 * that configuration space is accessible.  The
+			 * transitional resuming and suspending states are
+			 * skipped to avoid D3cold races.
 			 */
-			pm_status = pm_runtime_get_if_active(dev, true);
-			if (!pm_status)
-				continue;
-
-			if (pdev->current_state != PCI_D3cold)
+			spin_lock_irq(&dev->power.lock);
+			if ((pm_runtime_active(dev) ||
+			     pm_runtime_suspended(dev)) &&
+			    pdev->current_state != PCI_D3cold) {
+				pm_runtime_get_noresume(dev);
+				spin_unlock_irq(&dev->power.lock);
 				pci_pme_wakeup(pdev, NULL);
-
-			if (pm_status > 0)
 				pm_runtime_put(dev);
+			} else {
+				spin_unlock_irq(&dev->power.lock);
+			}
+
+			if (bdev)
+				pm_runtime_put(bdev);
 		} else {
 			list_del(&pme_dev->list);
 			kfree(pme_dev);


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2024-01-22 22:50       ` Alex Williamson
@ 2024-01-23  4:46         ` Alex Williamson
  2024-01-23  4:48           ` Sanath S
  2024-01-23 10:45         ` Lukas Wunner
  1 sibling, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2024-01-23  4:46 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-pci, bhelgaas, linux-kernel, eric.auger, mika.westerberg,
	rafael.j.wysocki, Sanath.S

On Mon, 22 Jan 2024 15:50:03 -0700
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Mon, 22 Jan 2024 23:17:30 +0100
> Lukas Wunner <lukas@wunner.de> wrote:
> 
> > On Thu, Jan 18, 2024 at 11:50:49AM -0700, Alex Williamson wrote:  
> > > On Thu,  3 Aug 2023 11:12:33 -0600 Alex Williamson <alex.williamson@redhat.com wrote:    
> > > > Testing that a device is not currently in a low power state provides no
> > > > guarantees that the device is not immenently transitioning to such a state.
> > > > We need to increment the PM usage counter before accessing the device.
> > > > Since we don't wish to wake the device for PME polling, do so only if the
> > > > device is already active by using pm_runtime_get_if_active().
> > > > 
> > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > > ---
> > > >  drivers/pci/pci.c | 23 ++++++++++++++++-------
> > > >  1 file changed, 16 insertions(+), 7 deletions(-)    
> > > 
> > > Resurrecting this patch (currently commit d3fcd7360338) for discussion
> > > as it's been identified as the source of a regression in:
> > > 
> > > https://bugzilla.kernel.org/show_bug.cgi?id=218360
> > > 
> > > Copying Mika, Lukas, and Rafael as it's related to:
> > > 
> > > 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
> > > 
> > > where we skip devices in D3cold when processing the PME list.
> > > 
> > > I think the issue in the above bz is that the downstream TB3/USB4 port
> > > is in D3 (presumably D3hot) and I therefore infer the device is in state
> > > RPM_SUSPENDED.  This commit is attempting to make sure the device power
> > > state is stable across the call such that it does not transition into
> > > D3cold while we're accessing it.
> > > 
> > > To do that I used pm_runtime_get_if_active(), but in retrospect this
> > > requires the device to be in RPM_ACTIVE so we end up skipping anything
> > > suspended or transitioning.    
> > 
> > How about dropping the calls to pm_runtime_get_if_active() and
> > pm_runtime_put() and instead simply do:
> > 
> > 			if (pm_runtime_suspended(&pdev->dev) &&
> > 			    pdev->current_state != PCI_D3cold)
> > 				pci_pme_wakeup(pdev, NULL);  
> 
> Hi Lukas,
> 
> Do we require that the polled device is in the RPM_SUSPENDED state?
> Also pm_runtime_suspended() can also only be trusted while holding the
> device power.lock, we need a usage count reference to maintain that
> state.
> 
> I'm also seeing cases where the bridge is power state D0, but PM state
> RPM_SUSPENDING, so config space of the polled device becomes
> inaccessible even while we're holding a reference once we allow polling
> in RPM_SUSPENDED.
> 
> I'm currently working with the below patch, which I believe addresses
> all these issues, but I'd welcome review and testing. Thanks,
> 
> Alex
> 
> commit 0a063b8e91d0bc807db712c81c8b270864f99ecb
> Author: Alex Williamson <alex.williamson@redhat.com>
> Date:   Tue Jan 16 13:28:33 2024 -0700
> 
>     PCI: Fix active state requirement in PME polling
>     
>     The commit noted in fixes added a bogus requirement that runtime PM
>     managed devices need to be in the RPM_ACTIVE state for PME polling.
>     In fact, there is no requirement of a specific runtime PM state, it
>     is only required that the state is stable such that testing config
>     space availability, ie. !D3cold, remains valid across the PME wakeup.
>     
>     To that effect, defer polling of runtime PM managed devices that are
>     not in either the RPM_ACTIVE or RPM_SUSPENDED states.  Devices in
>     transitional states remain on the pci_pme_list and will be re-queued.
>     
>     However in allowing polling of devices in the RPM_SUSPENDED state,
>     the bridge state requires further refinement as it's possible to poll
>     while the bridge is in D0, but the runtime PM state is RPM_SUSPENDING.
>     An asynchronous completion of the bridge transition to a low power
>     state can make config space of the subordinate device become
>     unavailable.  A runtime PM reference to the bridge is therefore added
>     with a supplementary requirement that the bridge is in the RPM_ACTIVE
>     state.
>     
>     Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling")
>     Reported-by: Sanath S <sanath.s@amd.com>
>     Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
>     Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index bdbf8a94b4d0..31dbf1834b07 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2433,29 +2433,45 @@ static void pci_pme_list_scan(struct work_struct *work)
>  		if (pdev->pme_poll) {
>  			struct pci_dev *bridge = pdev->bus->self;
>  			struct device *dev = &pdev->dev;
> -			int pm_status;
> +			struct device *bdev = bridge ? &bridge->dev : NULL;
>  
>  			/*
> -			 * If bridge is in low power state, the
> -			 * configuration space of subordinate devices
> -			 * may be not accessible
> +			 * If we have a bridge, it should be in an active/D0
> +			 * state or the configuration space of subordinate
> +			 * devices may not be accessible.
>  			 */
> -			if (bridge && bridge->current_state != PCI_D0)
> -				continue;
> +			if (bdev) {
> +				spin_lock_irq(&bdev->power.lock);

With the code as shown here I have one system that seems to be getting
contention when reading the vpd sysfs attribute when the endpoints
(QL41000) are bound to vfio-pci and unused, resulting in the root port
and endpoints being suspended.  A vpd read can take over a minute.
Seems to be resolved changing the above spin_lock to a spin_trylock:

				if (!spin_trylock_irq(&bdev->power.lock))
					continue;

The pm_runtime_barrier() as used in the vpd path can be prone to such
issues, I saw similar in the fix I previously proposed in the bz.

I'll continue to do more testing with this change and hopefully Sanath
can verify this resolves the bug report.  Thanks,

Alex

> +				if (!pm_runtime_active(bdev) ||
> +				    bridge->current_state != PCI_D0) {
> +					spin_unlock_irq(&bdev->power.lock);
> +					continue;
> +				}
> +				pm_runtime_get_noresume(bdev);
> +				spin_unlock_irq(&bdev->power.lock);
> +			}
>  
>  			/*
> -			 * If the device is in a low power state it
> -			 * should not be polled either.
> +			 * The device itself may be either in active or
> +			 * suspended state, but must not be in D3cold so
> +			 * that configuration space is accessible.  The
> +			 * transitional resuming and suspending states are
> +			 * skipped to avoid D3cold races.
>  			 */
> -			pm_status = pm_runtime_get_if_active(dev, true);
> -			if (!pm_status)
> -				continue;
> -
> -			if (pdev->current_state != PCI_D3cold)
> +			spin_lock_irq(&dev->power.lock);
> +			if ((pm_runtime_active(dev) ||
> +			     pm_runtime_suspended(dev)) &&
> +			    pdev->current_state != PCI_D3cold) {
> +				pm_runtime_get_noresume(dev);
> +				spin_unlock_irq(&dev->power.lock);
>  				pci_pme_wakeup(pdev, NULL);
> -
> -			if (pm_status > 0)
>  				pm_runtime_put(dev);
> +			} else {
> +				spin_unlock_irq(&dev->power.lock);
> +			}
> +
> +			if (bdev)
> +				pm_runtime_put(bdev);
>  		} else {
>  			list_del(&pme_dev->list);
>  			kfree(pme_dev);


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2024-01-23  4:46         ` Alex Williamson
@ 2024-01-23  4:48           ` Sanath S
  0 siblings, 0 replies; 18+ messages in thread
From: Sanath S @ 2024-01-23  4:48 UTC (permalink / raw)
  To: Alex Williamson, Lukas Wunner
  Cc: linux-pci, bhelgaas, linux-kernel, eric.auger, mika.westerberg,
	rafael.j.wysocki, Sanath.S


On 1/23/2024 10:16 AM, Alex Williamson wrote:
> On Mon, 22 Jan 2024 15:50:03 -0700
> Alex Williamson <alex.williamson@redhat.com> wrote:
>
>> On Mon, 22 Jan 2024 23:17:30 +0100
>> Lukas Wunner <lukas@wunner.de> wrote:
>>
>>> On Thu, Jan 18, 2024 at 11:50:49AM -0700, Alex Williamson wrote:
>>>> On Thu,  3 Aug 2023 11:12:33 -0600 Alex Williamson <alex.williamson@redhat.com wrote:
>>>>> Testing that a device is not currently in a low power state provides no
>>>>> guarantees that the device is not immenently transitioning to such a state.
>>>>> We need to increment the PM usage counter before accessing the device.
>>>>> Since we don't wish to wake the device for PME polling, do so only if the
>>>>> device is already active by using pm_runtime_get_if_active().
>>>>>
>>>>> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>>>>> ---
>>>>>   drivers/pci/pci.c | 23 ++++++++++++++++-------
>>>>>   1 file changed, 16 insertions(+), 7 deletions(-)
>>>> Resurrecting this patch (currently commit d3fcd7360338) for discussion
>>>> as it's been identified as the source of a regression in:
>>>>
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=218360
>>>>
>>>> Copying Mika, Lukas, and Rafael as it's related to:
>>>>
>>>> 000dd5316e1c ("PCI: Do not poll for PME if the device is in D3cold")
>>>>
>>>> where we skip devices in D3cold when processing the PME list.
>>>>
>>>> I think the issue in the above bz is that the downstream TB3/USB4 port
>>>> is in D3 (presumably D3hot) and I therefore infer the device is in state
>>>> RPM_SUSPENDED.  This commit is attempting to make sure the device power
>>>> state is stable across the call such that it does not transition into
>>>> D3cold while we're accessing it.
>>>>
>>>> To do that I used pm_runtime_get_if_active(), but in retrospect this
>>>> requires the device to be in RPM_ACTIVE so we end up skipping anything
>>>> suspended or transitioning.
>>> How about dropping the calls to pm_runtime_get_if_active() and
>>> pm_runtime_put() and instead simply do:
>>>
>>> 			if (pm_runtime_suspended(&pdev->dev) &&
>>> 			    pdev->current_state != PCI_D3cold)
>>> 				pci_pme_wakeup(pdev, NULL);
>> Hi Lukas,
>>
>> Do we require that the polled device is in the RPM_SUSPENDED state?
>> Also pm_runtime_suspended() can also only be trusted while holding the
>> device power.lock, we need a usage count reference to maintain that
>> state.
>>
>> I'm also seeing cases where the bridge is power state D0, but PM state
>> RPM_SUSPENDING, so config space of the polled device becomes
>> inaccessible even while we're holding a reference once we allow polling
>> in RPM_SUSPENDED.
>>
>> I'm currently working with the below patch, which I believe addresses
>> all these issues, but I'd welcome review and testing. Thanks,
>>
>> Alex
>>
>> commit 0a063b8e91d0bc807db712c81c8b270864f99ecb
>> Author: Alex Williamson <alex.williamson@redhat.com>
>> Date:   Tue Jan 16 13:28:33 2024 -0700
>>
>>      PCI: Fix active state requirement in PME polling
>>      
>>      The commit noted in fixes added a bogus requirement that runtime PM
>>      managed devices need to be in the RPM_ACTIVE state for PME polling.
>>      In fact, there is no requirement of a specific runtime PM state, it
>>      is only required that the state is stable such that testing config
>>      space availability, ie. !D3cold, remains valid across the PME wakeup.
>>      
>>      To that effect, defer polling of runtime PM managed devices that are
>>      not in either the RPM_ACTIVE or RPM_SUSPENDED states.  Devices in
>>      transitional states remain on the pci_pme_list and will be re-queued.
>>      
>>      However in allowing polling of devices in the RPM_SUSPENDED state,
>>      the bridge state requires further refinement as it's possible to poll
>>      while the bridge is in D0, but the runtime PM state is RPM_SUSPENDING.
>>      An asynchronous completion of the bridge transition to a low power
>>      state can make config space of the subordinate device become
>>      unavailable.  A runtime PM reference to the bridge is therefore added
>>      with a supplementary requirement that the bridge is in the RPM_ACTIVE
>>      state.
>>      
>>      Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling")
>>      Reported-by: Sanath S <sanath.s@amd.com>
>>      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
>>      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index bdbf8a94b4d0..31dbf1834b07 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -2433,29 +2433,45 @@ static void pci_pme_list_scan(struct work_struct *work)
>>   		if (pdev->pme_poll) {
>>   			struct pci_dev *bridge = pdev->bus->self;
>>   			struct device *dev = &pdev->dev;
>> -			int pm_status;
>> +			struct device *bdev = bridge ? &bridge->dev : NULL;
>>   
>>   			/*
>> -			 * If bridge is in low power state, the
>> -			 * configuration space of subordinate devices
>> -			 * may be not accessible
>> +			 * If we have a bridge, it should be in an active/D0
>> +			 * state or the configuration space of subordinate
>> +			 * devices may not be accessible.
>>   			 */
>> -			if (bridge && bridge->current_state != PCI_D0)
>> -				continue;
>> +			if (bdev) {
>> +				spin_lock_irq(&bdev->power.lock);
> With the code as shown here I have one system that seems to be getting
> contention when reading the vpd sysfs attribute when the endpoints
> (QL41000) are bound to vfio-pci and unused, resulting in the root port
> and endpoints being suspended.  A vpd read can take over a minute.
> Seems to be resolved changing the above spin_lock to a spin_trylock:
>
> 				if (!spin_trylock_irq(&bdev->power.lock))
> 					continue;
>
> The pm_runtime_barrier() as used in the vpd path can be prone to such
> issues, I saw similar in the fix I previously proposed in the bz.
>
> I'll continue to do more testing with this change and hopefully Sanath
> can verify this resolves the bug report.  Thanks,
>
> Alex
I'll verify it today and let you know the observations.
>> +				if (!pm_runtime_active(bdev) ||
>> +				    bridge->current_state != PCI_D0) {
>> +					spin_unlock_irq(&bdev->power.lock);
>> +					continue;
>> +				}
>> +				pm_runtime_get_noresume(bdev);
>> +				spin_unlock_irq(&bdev->power.lock);
>> +			}
>>   
>>   			/*
>> -			 * If the device is in a low power state it
>> -			 * should not be polled either.
>> +			 * The device itself may be either in active or
>> +			 * suspended state, but must not be in D3cold so
>> +			 * that configuration space is accessible.  The
>> +			 * transitional resuming and suspending states are
>> +			 * skipped to avoid D3cold races.
>>   			 */
>> -			pm_status = pm_runtime_get_if_active(dev, true);
>> -			if (!pm_status)
>> -				continue;
>> -
>> -			if (pdev->current_state != PCI_D3cold)
>> +			spin_lock_irq(&dev->power.lock);
>> +			if ((pm_runtime_active(dev) ||
>> +			     pm_runtime_suspended(dev)) &&
>> +			    pdev->current_state != PCI_D3cold) {
>> +				pm_runtime_get_noresume(dev);
>> +				spin_unlock_irq(&dev->power.lock);
>>   				pci_pme_wakeup(pdev, NULL);
>> -
>> -			if (pm_status > 0)
>>   				pm_runtime_put(dev);
>> +			} else {
>> +				spin_unlock_irq(&dev->power.lock);
>> +			}
>> +
>> +			if (bdev)
>> +				pm_runtime_put(bdev);
>>   		} else {
>>   			list_del(&pme_dev->list);
>>   			kfree(pme_dev);

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2024-01-22 22:50       ` Alex Williamson
  2024-01-23  4:46         ` Alex Williamson
@ 2024-01-23 10:45         ` Lukas Wunner
  2024-01-23 15:55           ` Alex Williamson
  1 sibling, 1 reply; 18+ messages in thread
From: Lukas Wunner @ 2024-01-23 10:45 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-pci, bhelgaas, linux-kernel, eric.auger, mika.westerberg,
	rafael.j.wysocki, Sanath.S

On Mon, Jan 22, 2024 at 03:50:03PM -0700, Alex Williamson wrote:
> On Mon, 22 Jan 2024 23:17:30 +0100 Lukas Wunner <lukas@wunner.de> wrote:
> > On Thu, Jan 18, 2024 at 11:50:49AM -0700, Alex Williamson wrote:
> > > To do that I used pm_runtime_get_if_active(), but in retrospect this
> > > requires the device to be in RPM_ACTIVE so we end up skipping anything
> > > suspended or transitioning.  
> > 
> > How about dropping the calls to pm_runtime_get_if_active() and
> > pm_runtime_put() and instead simply do:
> > 
> > 			if (pm_runtime_suspended(&pdev->dev) &&
> > 			    pdev->current_state != PCI_D3cold)
> > 				pci_pme_wakeup(pdev, NULL);
> 
> Do we require that the polled device is in the RPM_SUSPENDED state?

If the device is RPM_SUSPENDING, why immediately resume it for polling?
It's sufficient to poll it the next time around, i.e. 1 second later.

Likewise, if it's already RPM_RESUMING or RPM_ACTIVE anyway, no need
to poll PME.

This leaves RPM_SUSPENDED as the only state in which it makes sense to
poll.


> Also pm_runtime_suspended() can also only be trusted while holding the
> device power.lock, we need a usage count reference to maintain that
> state.

Why?  Let's say there's a race and the device resumes immediately after
we call pm_runtime_suspended() here.  So we might call pci_pme_wakeup()
gratuitouly.  So what?  No biggie.


> +			if (bdev) {
> +				spin_lock_irq(&bdev->power.lock);

Hm, I'd expect that lock to be internal to the PM core,
although there *are* a few stray users outside of it.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2024-01-23 10:45         ` Lukas Wunner
@ 2024-01-23 15:55           ` Alex Williamson
  2024-01-23 16:12             ` Lukas Wunner
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2024-01-23 15:55 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-pci, bhelgaas, linux-kernel, eric.auger, mika.westerberg,
	rafael.j.wysocki, Sanath.S

On Tue, 23 Jan 2024 11:45:19 +0100
Lukas Wunner <lukas@wunner.de> wrote:

> On Mon, Jan 22, 2024 at 03:50:03PM -0700, Alex Williamson wrote:
> > On Mon, 22 Jan 2024 23:17:30 +0100 Lukas Wunner <lukas@wunner.de> wrote:  
> > > On Thu, Jan 18, 2024 at 11:50:49AM -0700, Alex Williamson wrote:  
> > > > To do that I used pm_runtime_get_if_active(), but in retrospect this
> > > > requires the device to be in RPM_ACTIVE so we end up skipping anything
> > > > suspended or transitioning.    
> > > 
> > > How about dropping the calls to pm_runtime_get_if_active() and
> > > pm_runtime_put() and instead simply do:
> > > 
> > > 			if (pm_runtime_suspended(&pdev->dev) &&
> > > 			    pdev->current_state != PCI_D3cold)
> > > 				pci_pme_wakeup(pdev, NULL);  
> > 
> > Do we require that the polled device is in the RPM_SUSPENDED state?  
> 
> If the device is RPM_SUSPENDING, why immediately resume it for polling?
> It's sufficient to poll it the next time around, i.e. 1 second later.
> 
> Likewise, if it's already RPM_RESUMING or RPM_ACTIVE anyway, no need
> to poll PME.

I'm clearly not an expert on PME, but this is not obvious to me and
before the commit that went in through this thread, PME wakeup was
triggered regardless of the PM state.  I was trying to restore the
behavior of not requiring a specific PM state other than deferring
polling across transition states.

> This leaves RPM_SUSPENDED as the only state in which it makes sense to
> poll.
>
> > Also pm_runtime_suspended() can also only be trusted while holding the
> > device power.lock, we need a usage count reference to maintain that
> > state.  
> 
> Why?  Let's say there's a race and the device resumes immediately after
> we call pm_runtime_suspended() here.  So we might call pci_pme_wakeup()
> gratuitouly.  So what?  No biggie.

The issue I'm trying to address is that config space of the device can
become inaccessible while calling pci_pme_wakeup() on it, causing a
system fault on some hardware.  So a gratuitous pci_pme_wakeup() can be
detrimental.

We require the device config space to remain accessible, therefore the
instantaneous test against D3cold and that the parent bridge is in D0
is not sufficient.  I see traces where the parent bridge is in D0, but
the PM state is RPM_SUSPENDING and the endpoint device transitions to
D3cold while we're executing pci_pme_wakeup().

Therefore at a minimum, I think we need to enforce that the bridge is
in RPM_ACTIVE and remains in that state across pci_pme_wakeup(), which
means we need to hold a usage count reference, and that usage count
reference must be acquired under power.lock in RPM_ACTIVE state to be
effective.

> > +			if (bdev) {
> > +				spin_lock_irq(&bdev->power.lock);  
> 
> Hm, I'd expect that lock to be internal to the PM core,
> although there *are* a few stray users outside of it.

Right, there are.  It's possible that if we only need to hold a
reference on the bridge we can abstract this through
pm_runtime_get_if_active(), the semantics worked better to essentially
open code it in this iteration though.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2024-01-23 15:55           ` Alex Williamson
@ 2024-01-23 16:12             ` Lukas Wunner
  2024-01-23 16:47               ` Alex Williamson
  0 siblings, 1 reply; 18+ messages in thread
From: Lukas Wunner @ 2024-01-23 16:12 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linux-pci, bhelgaas, linux-kernel, eric.auger, mika.westerberg,
	rafael.j.wysocki, Sanath.S

On Tue, Jan 23, 2024 at 08:55:21AM -0700, Alex Williamson wrote:
> On Tue, 23 Jan 2024 11:45:19 +0100 Lukas Wunner <lukas@wunner.de> wrote:
> > If the device is RPM_SUSPENDING, why immediately resume it for polling?
> > It's sufficient to poll it the next time around, i.e. 1 second later.
> > 
> > Likewise, if it's already RPM_RESUMING or RPM_ACTIVE anyway, no need
> > to poll PME.
> 
> I'm clearly not an expert on PME, but this is not obvious to me and
> before the commit that went in through this thread, PME wakeup was
> triggered regardless of the PM state.  I was trying to restore the
> behavior of not requiring a specific PM state other than deferring
> polling across transition states.

There are broken devices which are incapable of signaling PME.
As a workaround, the kernel polls these devices once per second.
The first time the device signals PME, the kernel stops polling
that particular device because PME is clearly working.

So this is just a best-effort way to support PME for broken devices.
If it takes a little longer to detect that PME was signaled, it's not
a big deal.


> The issue I'm trying to address is that config space of the device can
> become inaccessible while calling pci_pme_wakeup() on it, causing a
> system fault on some hardware.  So a gratuitous pci_pme_wakeup() can be
> detrimental.
> 
> We require the device config space to remain accessible, therefore the
> instantaneous test against D3cold and that the parent bridge is in D0
> is not sufficient.  I see traces where the parent bridge is in D0, but
> the PM state is RPM_SUSPENDING and the endpoint device transitions to
> D3cold while we're executing pci_pme_wakeup().

We have pci_config_pm_runtime_{get,put}() helpers to ensure the parent
of a device is in D0 so that the device's config space is accessible.
So you may need to use that in pci_pme_wakeup().

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling
  2024-01-23 16:12             ` Lukas Wunner
@ 2024-01-23 16:47               ` Alex Williamson
  0 siblings, 0 replies; 18+ messages in thread
From: Alex Williamson @ 2024-01-23 16:47 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-pci, bhelgaas, linux-kernel, eric.auger, mika.westerberg,
	rafael.j.wysocki, Sanath.S

On Tue, 23 Jan 2024 17:12:39 +0100
Lukas Wunner <lukas@wunner.de> wrote:

> On Tue, Jan 23, 2024 at 08:55:21AM -0700, Alex Williamson wrote:
> > On Tue, 23 Jan 2024 11:45:19 +0100 Lukas Wunner <lukas@wunner.de> wrote:  
> > > If the device is RPM_SUSPENDING, why immediately resume it for polling?
> > > It's sufficient to poll it the next time around, i.e. 1 second later.
> > > 
> > > Likewise, if it's already RPM_RESUMING or RPM_ACTIVE anyway, no need
> > > to poll PME.  
> > 
> > I'm clearly not an expert on PME, but this is not obvious to me and
> > before the commit that went in through this thread, PME wakeup was
> > triggered regardless of the PM state.  I was trying to restore the
> > behavior of not requiring a specific PM state other than deferring
> > polling across transition states.  
> 
> There are broken devices which are incapable of signaling PME.
> As a workaround, the kernel polls these devices once per second.
> The first time the device signals PME, the kernel stops polling
> that particular device because PME is clearly working.
> 
> So this is just a best-effort way to support PME for broken devices.
> If it takes a little longer to detect that PME was signaled, it's not
> a big deal.
> 
> > The issue I'm trying to address is that config space of the device can
> > become inaccessible while calling pci_pme_wakeup() on it, causing a
> > system fault on some hardware.  So a gratuitous pci_pme_wakeup() can be
> > detrimental.
> > 
> > We require the device config space to remain accessible, therefore the
> > instantaneous test against D3cold and that the parent bridge is in D0
> > is not sufficient.  I see traces where the parent bridge is in D0, but
> > the PM state is RPM_SUSPENDING and the endpoint device transitions to
> > D3cold while we're executing pci_pme_wakeup().  
> 
> We have pci_config_pm_runtime_{get,put}() helpers to ensure the parent
> of a device is in D0 so that the device's config space is accessible.
> So you may need to use that in pci_pme_wakeup().

pci_config_pm_runtime_get() doesn't seem to align with our current
philosophy to defer polling devices that aren't in the correct power
state.  We require the bridge to be in D0, but we defer polling rather
than resume it otherwise.  We also defer device polling if the device
is in D3cold, whereas the above function would resume a device in that
state.  I think our bridge D0 test could be reliable if it were done
holding a reference acquired via pm_runtime_get_if_active().  Thanks,

Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management
  2023-08-03 17:12 [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management Alex Williamson
  2023-08-03 17:12 ` [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface Alex Williamson
  2023-08-03 17:12 ` [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling Alex Williamson
@ 2023-08-10 18:29 ` Bjorn Helgaas
  2023-08-10 18:54   ` Alex Williamson
  2 siblings, 1 reply; 18+ messages in thread
From: Bjorn Helgaas @ 2023-08-10 18:29 UTC (permalink / raw)
  To: Alex Williamson; +Cc: bhelgaas, linux-pci, linux-kernel, eric.auger

On Thu, Aug 03, 2023 at 11:12:31AM -0600, Alex Williamson wrote:
> Since v5.19, vfio-pci makes use of runtime power management on devices.
> This has the effect of potentially putting entire sub-hierarchies into
> lower power states, which has exposed some gaps in the PCI subsystem
> around power management support.
> 
> The first issue is that lspci accesses the VPD sysfs interface, which
> does not provide the same power management wrappers as general config
> space.
> 
> The next covers PME, where we attempt to skip devices based on their PCI
> power state, but don't protect changes to that state or look at the
> overall runtime power management state of the device.
> 
> This latter patch addresses the issue noted by Eric in the follow-ups to
> v1 linked below.
> 
> These patches are logically independent, but only together resolve an
> issue on Eric's system where a pair of endpoints bound to vfio-pci and
> unused by userspace drivers trigger faults through lspci and PME
> polling.  Thanks,
> 
> Alex 
> 
> v1: https://lore.kernel.org/all/20230707151044.1311544-1-alex.williamson@redhat.com/
> 
> Alex Williamson (2):
>   PCI/VPD: Add runtime power management to sysfs interface
>   PCI: Fix runtime PM race with PME polling
> 
>  drivers/pci/pci.c | 23 ++++++++++++++++-------
>  drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
>  2 files changed, 48 insertions(+), 9 deletions(-)

Applied with the tweak below to pci/vpd for v6.6, thanks!  The idea is
to match the pci_get_func0_dev() so the get/put balance is clear
without having to analyze PCI_DEV_FLAGS_VPD_REF_F0 usage:

-       if (dev != vpd_dev)
+       if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0)


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management
  2023-08-10 18:29 ` [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management Bjorn Helgaas
@ 2023-08-10 18:54   ` Alex Williamson
  0 siblings, 0 replies; 18+ messages in thread
From: Alex Williamson @ 2023-08-10 18:54 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: bhelgaas, linux-pci, linux-kernel, eric.auger

On Thu, 10 Aug 2023 13:29:44 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Thu, Aug 03, 2023 at 11:12:31AM -0600, Alex Williamson wrote:
> > Since v5.19, vfio-pci makes use of runtime power management on devices.
> > This has the effect of potentially putting entire sub-hierarchies into
> > lower power states, which has exposed some gaps in the PCI subsystem
> > around power management support.
> > 
> > The first issue is that lspci accesses the VPD sysfs interface, which
> > does not provide the same power management wrappers as general config
> > space.
> > 
> > The next covers PME, where we attempt to skip devices based on their PCI
> > power state, but don't protect changes to that state or look at the
> > overall runtime power management state of the device.
> > 
> > This latter patch addresses the issue noted by Eric in the follow-ups to
> > v1 linked below.
> > 
> > These patches are logically independent, but only together resolve an
> > issue on Eric's system where a pair of endpoints bound to vfio-pci and
> > unused by userspace drivers trigger faults through lspci and PME
> > polling.  Thanks,
> > 
> > Alex 
> > 
> > v1: https://lore.kernel.org/all/20230707151044.1311544-1-alex.williamson@redhat.com/
> > 
> > Alex Williamson (2):
> >   PCI/VPD: Add runtime power management to sysfs interface
> >   PCI: Fix runtime PM race with PME polling
> > 
> >  drivers/pci/pci.c | 23 ++++++++++++++++-------
> >  drivers/pci/vpd.c | 34 ++++++++++++++++++++++++++++++++--
> >  2 files changed, 48 insertions(+), 9 deletions(-)  
> 
> Applied with the tweak below to pci/vpd for v6.6, thanks!  The idea is
> to match the pci_get_func0_dev() so the get/put balance is clear
> without having to analyze PCI_DEV_FLAGS_VPD_REF_F0 usage:
> 
> -       if (dev != vpd_dev)
> +       if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0)
> 

Looks good, thanks!

Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-01-23 16:47 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-03 17:12 [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management Alex Williamson
2023-08-03 17:12 ` [PATCH v2 1/2] PCI/VPD: Add runtime power management to sysfs interface Alex Williamson
2023-08-10 15:59   ` Bjorn Helgaas
2023-08-10 16:26     ` Alex Williamson
2023-08-11 19:25   ` Bjorn Helgaas
2023-08-11 19:56     ` Alex Williamson
2023-08-03 17:12 ` [PATCH v2 2/2] PCI: Fix runtime PM race with PME polling Alex Williamson
2024-01-18 18:50   ` Alex Williamson
2024-01-22 22:17     ` Lukas Wunner
2024-01-22 22:50       ` Alex Williamson
2024-01-23  4:46         ` Alex Williamson
2024-01-23  4:48           ` Sanath S
2024-01-23 10:45         ` Lukas Wunner
2024-01-23 15:55           ` Alex Williamson
2024-01-23 16:12             ` Lukas Wunner
2024-01-23 16:47               ` Alex Williamson
2023-08-10 18:29 ` [PATCH v2 0/2] PCI: Protect VPD and PME accesses from power management Bjorn Helgaas
2023-08-10 18:54   ` Alex Williamson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).