linux-kernel.vger.kernel.org archive mirror
* [PATCH v3] PCI: Prevent power state transition of erroneous device
@ 2025-05-04  9:04 Raag Jadav
  2025-05-12 10:23 ` Raag Jadav
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Raag Jadav @ 2025-05-04  9:04 UTC (permalink / raw)
  To: rafael, mahesh, oohall, bhelgaas
  Cc: linux-pci, linux-kernel, ilpo.jarvinen, lukas, aravind.iddamsetty,
	Raag Jadav

If error flags are set on an AER capable device, most likely either the
device recovery is in progress or has already failed. Neither of the
cases are well suited for power state transition of the device, since
this can lead to unpredictable consequences like resume failure, or in
worst case the device is lost because of it. Leave the device in its
existing power state to avoid such issues.

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
---

v2: Synchronize AER handling with PCI PM (Rafael)
v3: Move pci_aer_in_progress() to pci_set_low_power_state() (Rafael)
    Elaborate "why" (Bjorn)

More discussion on [1].
[1] https://lore.kernel.org/all/CAJZ5v0g-aJXfVH+Uc=9eRPuW08t-6PwzdyMXsC6FZRKYJtY03Q@mail.gmail.com/

 drivers/pci/pci.c      | 12 ++++++++++++
 drivers/pci/pcie/aer.c | 11 +++++++++++
 include/linux/aer.h    |  2 ++
 3 files changed, 25 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 4d7c9f64ea24..25b2df34336c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -9,6 +9,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/aer.h>
 #include <linux/kernel.h>
 #include <linux/delay.h>
 #include <linux/dmi.h>
@@ -1539,6 +1540,17 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
 	   || (state == PCI_D2 && !dev->d2_support))
 		return -EIO;
 
+	/*
+	 * If error flags are set on an AER capable device, most likely either
+	 * the device recovery is in progress or has already failed. Neither of
+	 * the cases are well suited for power state transition of the device,
+	 * since this can lead to unpredictable consequences like resume
+	 * failure, or in worst case the device is lost because of it. Leave the
+	 * device in its existing power state to avoid such issues.
+	 */
+	if (pci_aer_in_progress(dev))
+		return -EIO;
+
 	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
 	if (PCI_POSSIBLE_ERROR(pmcsr)) {
 		pci_err(dev, "Unable to change power state from %s to %s, device inaccessible\n",
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a1cf8c7ef628..4040770df4f0 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -237,6 +237,17 @@ int pcie_aer_is_native(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_NS_GPL(pcie_aer_is_native, "CXL");
 
+bool pci_aer_in_progress(struct pci_dev *dev)
+{
+	u16 reg16;
+
+	if (!pcie_aer_is_native(dev))
+		return false;
+
+	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
+	return !!(reg16 & PCI_EXP_AER_FLAGS);
+}
+
 static int pci_enable_pcie_error_reporting(struct pci_dev *dev)
 {
 	int rc;
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 02940be66324..e6a380bb2e68 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -56,12 +56,14 @@ struct aer_capability_regs {
 #if defined(CONFIG_PCIEAER)
 int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
 int pcie_aer_is_native(struct pci_dev *dev);
+bool pci_aer_in_progress(struct pci_dev *dev);
 #else
 static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
 {
 	return -EINVAL;
 }
 static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
+static inline bool pci_aer_in_progress(struct pci_dev *dev) { return false; }
 #endif
 
 void pci_print_aer(struct pci_dev *dev, int aer_severity,
-- 
2.34.1
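
The check this patch adds can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel implementation: struct fake_pci_dev, aer_in_progress() and set_low_power_state() are hypothetical stand-ins, and the mask is assumed to cover the four "error detected" bits of the PCIe Device Status register (bits 0 to 3), which is what PCI_EXP_AER_FLAGS is taken to select in the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* "Error detected" bits of the PCIe Device Status register. */
#define DEVSTA_CED   0x0001  /* Correctable Error Detected */
#define DEVSTA_NFED  0x0002  /* Non-Fatal Error Detected */
#define DEVSTA_FED   0x0004  /* Fatal Error Detected */
#define DEVSTA_URD   0x0008  /* Unsupported Request Detected */

/* Assumption: the patch's mask selects exactly these four bits. */
#define DEVSTA_ERR_MASK (DEVSTA_CED | DEVSTA_NFED | DEVSTA_FED | DEVSTA_URD)

/* Hypothetical stand-in for struct pci_dev, for illustration only. */
struct fake_pci_dev {
	bool aer_native;   /* OS (not firmware) owns AER handling */
	uint16_t devsta;   /* snapshot of the Device Status register */
};

/* Mirrors the shape of pci_aer_in_progress(): only trust the status
 * bits when AER is handled natively, then test the error mask. */
static bool aer_in_progress(const struct fake_pci_dev *dev)
{
	if (!dev->aer_native)
		return false;

	return (dev->devsta & DEVSTA_ERR_MASK) != 0;
}

/* Mirrors the early return added to pci_set_low_power_state(): refuse
 * the transition and leave the device in its current power state. */
static int set_low_power_state(const struct fake_pci_dev *dev)
{
	if (aer_in_progress(dev))
		return -EIO;

	return 0;  /* the real function would go on to program PMCSR */
}
```

Note that in this sketch a device whose AER is firmware-owned is never blocked, even with error bits set, because the non-native case returns false before the status bits are looked at.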



* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-04  9:04 [PATCH v3] PCI: Prevent power state transition of erroneous device Raag Jadav
@ 2025-05-12 10:23 ` Raag Jadav
  2025-05-12 11:56 ` Rafael J. Wysocki
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Raag Jadav @ 2025-05-12 10:23 UTC (permalink / raw)
  To: rafael, mahesh, oohall, bhelgaas
  Cc: linux-pci, linux-kernel, ilpo.jarvinen, lukas, aravind.iddamsetty

On Sun, May 04, 2025 at 02:34:44PM +0530, Raag Jadav wrote:
> If error flags are set on an AER capable device, most likely either the
> device recovery is in progress or has already failed. Neither of the
> cases are well suited for power state transition of the device, since
> this can lead to unpredictable consequences like resume failure, or in
> worst case the device is lost because of it. Leave the device in its
> existing power state to avoid such issues.
> 
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
> 
> v2: Synchronize AER handling with PCI PM (Rafael)
> v3: Move pci_aer_in_progress() to pci_set_low_power_state() (Rafael)
>     Elaborate "why" (Bjorn)
> 
> More discussion on [1].
> [1] https://lore.kernel.org/all/CAJZ5v0g-aJXfVH+Uc=9eRPuW08t-6PwzdyMXsC6FZRKYJtY03Q@mail.gmail.com/

Bump. Anything I can do to move this forward?

Raag


* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-04  9:04 [PATCH v3] PCI: Prevent power state transition of erroneous device Raag Jadav
  2025-05-12 10:23 ` Raag Jadav
@ 2025-05-12 11:56 ` Rafael J. Wysocki
  2025-05-12 16:52   ` Raag Jadav
  2025-05-13  4:41 ` Lukas Wunner
  2025-05-14 16:29 ` Denis Benato
  3 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2025-05-12 11:56 UTC (permalink / raw)
  To: Raag Jadav
  Cc: rafael, mahesh, oohall, bhelgaas, linux-pci, linux-kernel,
	ilpo.jarvinen, lukas, aravind.iddamsetty

On Sun, May 4, 2025 at 11:06 AM Raag Jadav <raag.jadav@intel.com> wrote:
>
> If error flags are set on an AER capable device, most likely either the
> device recovery is in progress or has already failed. Neither of the
> cases are well suited for power state transition of the device, since
> this can lead to unpredictable consequences like resume failure, or in
> worst case the device is lost because of it. Leave the device in its
> existing power state to avoid such issues.
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
>
> v2: Synchronize AER handling with PCI PM (Rafael)
> v3: Move pci_aer_in_progress() to pci_set_low_power_state() (Rafael)
>     Elaborate "why" (Bjorn)

I think this is reasonable, so

Acked-by: Rafael J. Wysocki <rafael@kernel.org>

(and you might as well CC it to linux-pm@vger.kernel.org).

Thanks!

>
> More discussion on [1].
> [1] https://lore.kernel.org/all/CAJZ5v0g-aJXfVH+Uc=9eRPuW08t-6PwzdyMXsC6FZRKYJtY03Q@mail.gmail.com/
>
>  drivers/pci/pci.c      | 12 ++++++++++++
>  drivers/pci/pcie/aer.c | 11 +++++++++++
>  include/linux/aer.h    |  2 ++
>  3 files changed, 25 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 4d7c9f64ea24..25b2df34336c 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -9,6 +9,7 @@
>   */
>
>  #include <linux/acpi.h>
> +#include <linux/aer.h>
>  #include <linux/kernel.h>
>  #include <linux/delay.h>
>  #include <linux/dmi.h>
> @@ -1539,6 +1540,17 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
>            || (state == PCI_D2 && !dev->d2_support))
>                 return -EIO;
>
> +       /*
> +        * If error flags are set on an AER capable device, most likely either
> +        * the device recovery is in progress or has already failed. Neither of
> +        * the cases are well suited for power state transition of the device,
> +        * since this can lead to unpredictable consequences like resume
> +        * failure, or in worst case the device is lost because of it. Leave the
> +        * device in its existing power state to avoid such issues.
> +        */
> +       if (pci_aer_in_progress(dev))
> +               return -EIO;
> +
>         pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>         if (PCI_POSSIBLE_ERROR(pmcsr)) {
>                 pci_err(dev, "Unable to change power state from %s to %s, device inaccessible\n",
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index a1cf8c7ef628..4040770df4f0 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -237,6 +237,17 @@ int pcie_aer_is_native(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_NS_GPL(pcie_aer_is_native, "CXL");
>
> +bool pci_aer_in_progress(struct pci_dev *dev)
> +{
> +       u16 reg16;
> +
> +       if (!pcie_aer_is_native(dev))
> +               return false;
> +
> +       pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
> +       return !!(reg16 & PCI_EXP_AER_FLAGS);
> +}
> +
>  static int pci_enable_pcie_error_reporting(struct pci_dev *dev)
>  {
>         int rc;
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index 02940be66324..e6a380bb2e68 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -56,12 +56,14 @@ struct aer_capability_regs {
>  #if defined(CONFIG_PCIEAER)
>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>  int pcie_aer_is_native(struct pci_dev *dev);
> +bool pci_aer_in_progress(struct pci_dev *dev);
>  #else
>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>  {
>         return -EINVAL;
>  }
>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
> +static inline bool pci_aer_in_progress(struct pci_dev *dev) { return false; }
>  #endif
>
>  void pci_print_aer(struct pci_dev *dev, int aer_severity,
> --
> 2.34.1
>


* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-12 11:56 ` Rafael J. Wysocki
@ 2025-05-12 16:52   ` Raag Jadav
  0 siblings, 0 replies; 14+ messages in thread
From: Raag Jadav @ 2025-05-12 16:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: mahesh, oohall, bhelgaas, linux-pci, linux-kernel, ilpo.jarvinen,
	lukas, aravind.iddamsetty, linux-pm

On Mon, May 12, 2025 at 01:56:06PM +0200, Rafael J. Wysocki wrote:
> On Sun, May 4, 2025 at 11:06 AM Raag Jadav <raag.jadav@intel.com> wrote:
> >
> > If error flags are set on an AER capable device, most likely either the
> > device recovery is in progress or has already failed. Neither of the
> > cases are well suited for power state transition of the device, since
> > this can lead to unpredictable consequences like resume failure, or in
> > worst case the device is lost because of it. Leave the device in its
> > existing power state to avoid such issues.
> >
> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > ---
> >
> > v2: Synchronize AER handling with PCI PM (Rafael)
> > v3: Move pci_aer_in_progress() to pci_set_low_power_state() (Rafael)
> >     Elaborate "why" (Bjorn)
> 
> I think this is reasonable, so
> 
> Acked-by: Rafael J. Wysocki <rafael@kernel.org>

Thank you!

> (and you might as well CC it to linux-pm@vger.kernel.org).

Cc'ing linux-pm list as requested by Rafael.

> > More discussion on [1].
> > [1] https://lore.kernel.org/all/CAJZ5v0g-aJXfVH+Uc=9eRPuW08t-6PwzdyMXsC6FZRKYJtY03Q@mail.gmail.com/
> >
> >  drivers/pci/pci.c      | 12 ++++++++++++
> >  drivers/pci/pcie/aer.c | 11 +++++++++++
> >  include/linux/aer.h    |  2 ++
> >  3 files changed, 25 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 4d7c9f64ea24..25b2df34336c 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -9,6 +9,7 @@
> >   */
> >
> >  #include <linux/acpi.h>
> > +#include <linux/aer.h>
> >  #include <linux/kernel.h>
> >  #include <linux/delay.h>
> >  #include <linux/dmi.h>
> > @@ -1539,6 +1540,17 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
> >            || (state == PCI_D2 && !dev->d2_support))
> >                 return -EIO;
> >
> > +       /*
> > +        * If error flags are set on an AER capable device, most likely either
> > +        * the device recovery is in progress or has already failed. Neither of
> > +        * the cases are well suited for power state transition of the device,
> > +        * since this can lead to unpredictable consequences like resume
> > +        * failure, or in worst case the device is lost because of it. Leave the
> > +        * device in its existing power state to avoid such issues.
> > +        */
> > +       if (pci_aer_in_progress(dev))
> > +               return -EIO;
> > +
> >         pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> >         if (PCI_POSSIBLE_ERROR(pmcsr)) {
> >                 pci_err(dev, "Unable to change power state from %s to %s, device inaccessible\n",
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index a1cf8c7ef628..4040770df4f0 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -237,6 +237,17 @@ int pcie_aer_is_native(struct pci_dev *dev)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(pcie_aer_is_native, "CXL");
> >
> > +bool pci_aer_in_progress(struct pci_dev *dev)
> > +{
> > +       u16 reg16;
> > +
> > +       if (!pcie_aer_is_native(dev))
> > +               return false;
> > +
> > +       pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
> > +       return !!(reg16 & PCI_EXP_AER_FLAGS);
> > +}
> > +
> >  static int pci_enable_pcie_error_reporting(struct pci_dev *dev)
> >  {
> >         int rc;
> > diff --git a/include/linux/aer.h b/include/linux/aer.h
> > index 02940be66324..e6a380bb2e68 100644
> > --- a/include/linux/aer.h
> > +++ b/include/linux/aer.h
> > @@ -56,12 +56,14 @@ struct aer_capability_regs {
> >  #if defined(CONFIG_PCIEAER)
> >  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
> >  int pcie_aer_is_native(struct pci_dev *dev);
> > +bool pci_aer_in_progress(struct pci_dev *dev);
> >  #else
> >  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
> >  {
> >         return -EINVAL;
> >  }
> >  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
> > +static inline bool pci_aer_in_progress(struct pci_dev *dev) { return false; }
> >  #endif
> >
> >  void pci_print_aer(struct pci_dev *dev, int aer_severity,
> > --
> > 2.34.1
> >


* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-04  9:04 [PATCH v3] PCI: Prevent power state transition of erroneous device Raag Jadav
  2025-05-12 10:23 ` Raag Jadav
  2025-05-12 11:56 ` Rafael J. Wysocki
@ 2025-05-13  4:41 ` Lukas Wunner
  2025-05-13 15:17   ` Raag Jadav
  2025-05-14 16:29 ` Denis Benato
  3 siblings, 1 reply; 14+ messages in thread
From: Lukas Wunner @ 2025-05-13  4:41 UTC (permalink / raw)
  To: Raag Jadav
  Cc: rafael, mahesh, oohall, bhelgaas, linux-pci, linux-kernel,
	ilpo.jarvinen, aravind.iddamsetty

On Sun, May 04, 2025 at 02:34:44PM +0530, Raag Jadav wrote:
> If error flags are set on an AER capable device, most likely either the
> device recovery is in progress or has already failed. Neither of the
> cases are well suited for power state transition of the device, since
> this can lead to unpredictable consequences like resume failure, or in
> worst case the device is lost because of it. Leave the device in its
> existing power state to avoid such issues.

Have you witnessed this on a particular platform / hardware combination?
If so, it would be good to mention it.  If I happened to find this
commit in the future through "git blame", that's the first question
that would come to mind:  How and on what hardware was this actually
triggered, and how can I reproduce it?

> +	/*
> +	 * If error flags are set on an AER capable device, most likely either
> +	 * the device recovery is in progress or has already failed. Neither of
> +	 * the cases are well suited for power state transition of the device,
> +	 * since this can lead to unpredictable consequences like resume
> +	 * failure, or in worst case the device is lost because of it. Leave the
> +	 * device in its existing power state to avoid such issues.
> +	 */

That's quite verbose and merely a 1:1 repetition of the commit message.
I'd recommend a more condensed code comment and anyone interested in
further details may look them up in the commit message.

Thanks,

Lukas


* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-13  4:41 ` Lukas Wunner
@ 2025-05-13 15:17   ` Raag Jadav
  0 siblings, 0 replies; 14+ messages in thread
From: Raag Jadav @ 2025-05-13 15:17 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: rafael, mahesh, oohall, bhelgaas, linux-pci, linux-kernel,
	ilpo.jarvinen, aravind.iddamsetty

On Tue, May 13, 2025 at 06:41:31AM +0200, Lukas Wunner wrote:
> On Sun, May 04, 2025 at 02:34:44PM +0530, Raag Jadav wrote:
> > If error flags are set on an AER capable device, most likely either the
> > device recovery is in progress or has already failed. Neither of the
> > cases are well suited for power state transition of the device, since
> > this can lead to unpredictable consequences like resume failure, or in
> > worst case the device is lost because of it. Leave the device in its
> > existing power state to avoid such issues.
> 
> Have you witnessed this on a particular platform / hardware combination?
> If so, it would be good to mention it.  If I happened to find this
> commit in the future through "git blame", that's the first question
> that would come to mind:  How and on what hardware was this actually
> triggered, and how can I reproduce it?

We have a few reported issues[1] which are similar in nature. But since
they are not easily reproducible and are still under investigation, I'm
a bit hesitant to explicitly reference them.

[1] https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4395

> > +	/*
> > +	 * If error flags are set on an AER capable device, most likely either
> > +	 * the device recovery is in progress or has already failed. Neither of
> > +	 * the cases are well suited for power state transition of the device,
> > +	 * since this can lead to unpredictable consequences like resume
> > +	 * failure, or in worst case the device is lost because of it. Leave the
> > +	 * device in its existing power state to avoid such issues.
> > +	 */
> 
> That's quite verbose and merely a 1:1 repetition of the commit message.
> I'd recommend a more condensed code comment and anyone interested in
> further details may look them up in the commit message.

Sure, will update.

Raag


* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-04  9:04 [PATCH v3] PCI: Prevent power state transition of erroneous device Raag Jadav
                   ` (2 preceding siblings ...)
  2025-05-13  4:41 ` Lukas Wunner
@ 2025-05-14 16:29 ` Denis Benato
  2025-05-14 19:53   ` Mario Limonciello
  3 siblings, 1 reply; 14+ messages in thread
From: Denis Benato @ 2025-05-14 16:29 UTC (permalink / raw)
  To: Raag Jadav, rafael, mahesh, oohall, bhelgaas, Mario Limonciello
  Cc: linux-pci, linux-kernel, ilpo.jarvinen, lukas, aravind.iddamsetty

Hello,

Lately I have been experiencing a few problems related to PCI, Thunderbolt, or both, and Mario Limonciello pointed me to this patch.

You can follow an example of my problems in this bug report [1].

I tested this patch on top of 6.14.6 and it comes with a nasty regression: s2idle resume breaks all three of my GPUs, while, for example, the sound of a YT video resumes fine.

You can see the dmesg here: https://pastebin.com/Um7bmdWi

On the bright side, this patch makes my laptop behave better on boot: the amdgpu plugged into the Thunderbolt port is always enabled at power-on, whereas without this patch it is random whether it will be active immediately after the laptop has been turned on.


[1] https://lore.kernel.org/all/965c9753-f14b-4a87-9f6d-8798e09ad6f5@gmail.com/

On 5/4/25 11:04, Raag Jadav wrote:

> If error flags are set on an AER capable device, most likely either the
> device recovery is in progress or has already failed. Neither of the
> cases are well suited for power state transition of the device, since
> this can lead to unpredictable consequences like resume failure, or in
> worst case the device is lost because of it. Leave the device in its
> existing power state to avoid such issues.
>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
>
> v2: Synchronize AER handling with PCI PM (Rafael)
> v3: Move pci_aer_in_progress() to pci_set_low_power_state() (Rafael)
>     Elaborate "why" (Bjorn)
>
> More discussion on [1].
> [1] https://lore.kernel.org/all/CAJZ5v0g-aJXfVH+Uc=9eRPuW08t-6PwzdyMXsC6FZRKYJtY03Q@mail.gmail.com/
>
>  drivers/pci/pci.c      | 12 ++++++++++++
>  drivers/pci/pcie/aer.c | 11 +++++++++++
>  include/linux/aer.h    |  2 ++
>  3 files changed, 25 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 4d7c9f64ea24..25b2df34336c 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -9,6 +9,7 @@
>   */
>  
>  #include <linux/acpi.h>
> +#include <linux/aer.h>
>  #include <linux/kernel.h>
>  #include <linux/delay.h>
>  #include <linux/dmi.h>
> @@ -1539,6 +1540,17 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
>  	   || (state == PCI_D2 && !dev->d2_support))
>  		return -EIO;
>  
> +	/*
> +	 * If error flags are set on an AER capable device, most likely either
> +	 * the device recovery is in progress or has already failed. Neither of
> +	 * the cases are well suited for power state transition of the device,
> +	 * since this can lead to unpredictable consequences like resume
> +	 * failure, or in worst case the device is lost because of it. Leave the
> +	 * device in its existing power state to avoid such issues.
> +	 */
> +	if (pci_aer_in_progress(dev))
> +		return -EIO;
> +
>  	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>  	if (PCI_POSSIBLE_ERROR(pmcsr)) {
>  		pci_err(dev, "Unable to change power state from %s to %s, device inaccessible\n",
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index a1cf8c7ef628..4040770df4f0 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -237,6 +237,17 @@ int pcie_aer_is_native(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_NS_GPL(pcie_aer_is_native, "CXL");
>  
> +bool pci_aer_in_progress(struct pci_dev *dev)
> +{
> +	u16 reg16;
> +
> +	if (!pcie_aer_is_native(dev))
> +		return false;
> +
> +	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
> +	return !!(reg16 & PCI_EXP_AER_FLAGS);
> +}
> +
>  static int pci_enable_pcie_error_reporting(struct pci_dev *dev)
>  {
>  	int rc;
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index 02940be66324..e6a380bb2e68 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -56,12 +56,14 @@ struct aer_capability_regs {
>  #if defined(CONFIG_PCIEAER)
>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>  int pcie_aer_is_native(struct pci_dev *dev);
> +bool pci_aer_in_progress(struct pci_dev *dev);
>  #else
>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>  {
>  	return -EINVAL;
>  }
>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
> +static inline bool pci_aer_in_progress(struct pci_dev *dev) { return false; }
>  #endif
>  
>  void pci_print_aer(struct pci_dev *dev, int aer_severity,


* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-14 16:29 ` Denis Benato
@ 2025-05-14 19:53   ` Mario Limonciello
  2025-05-14 21:25     ` Denis Benato
  0 siblings, 1 reply; 14+ messages in thread
From: Mario Limonciello @ 2025-05-14 19:53 UTC (permalink / raw)
  To: Denis Benato, Raag Jadav, rafael, mahesh, oohall, bhelgaas
  Cc: linux-pci, linux-kernel, ilpo.jarvinen, lukas, aravind.iddamsetty

On 5/14/2025 11:29 AM, Denis Benato wrote:
> Hello,
> 
> Lately I have been experiencing a few problems related to PCI, Thunderbolt, or both, and Mario Limonciello pointed me to this patch.
> 
> You can follow an example of my problems in this bug report [1].
> 
> I tested this patch on top of 6.14.6 and it comes with a nasty regression: s2idle resume breaks all three of my GPUs, while, for example, the sound of a YT video resumes fine.
> 
> You can see the dmesg here: https://pastebin.com/Um7bmdWi
> 
> On the bright side, this patch makes my laptop behave better on boot: the amdgpu plugged into the Thunderbolt port is always enabled at power-on, whereas without this patch it is random whether it will be active immediately after the laptop has been turned on.
> 

Just for clarity - if you unplug your eGPU enclosure before suspend, is
everything OK?  I.e., does this patch only have an impact on the USB4/TBT3
PCIe tunnels?

The errors after resume in amdgpu /look/ like the device is "missing" 
from the bus or otherwise not responding.

I think it would be helpful to capture the kernel log with a baseline of 
6.14.6 but without this patch for comparison of what this patch is 
actually causing.
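
When comparing the two logs, it may also help to know which devices have error bits set in Device Status, since that is the register the patch keys on. The following is a rough user-space sketch under stated assumptions: it works on a buffer holding a raw dump of a device's PCI configuration space (for instance, the contents of a sysfs "config" file), and rd16(), find_pcie_cap() and read_devsta() are hypothetical helper names, not kernel or library APIs. The register offsets are the standard PCI ones.

```c
#include <stddef.h>
#include <stdint.h>

#define CFG_STATUS          0x06  /* Status register */
#define CFG_STATUS_CAP_LIST 0x10  /* capability list present */
#define CFG_CAP_PTR         0x34  /* first capability pointer */
#define CAP_ID_PCIE         0x10  /* PCI Express capability ID */
#define PCIE_DEVSTA         0x0a  /* Device Status, relative to the cap */

/* Config space is little-endian regardless of host byte order. */
static uint16_t rd16(const uint8_t *cfg, size_t off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Walk the capability list; return the PCIe capability offset or 0. */
static size_t find_pcie_cap(const uint8_t *cfg, size_t len)
{
	size_t pos;
	int ttl = 48;  /* guard against malformed, looping lists */

	if (len < 0x40 || !(rd16(cfg, CFG_STATUS) & CFG_STATUS_CAP_LIST))
		return 0;

	pos = cfg[CFG_CAP_PTR] & 0xfc;
	while (ttl-- > 0 && pos >= 0x40 && pos + 1 < len) {
		if (cfg[pos] == CAP_ID_PCIE)
			return pos;
		pos = cfg[pos + 1] & 0xfc;
	}
	return 0;
}

/* Store Device Status in *out; return 0 on success, -1 if not found. */
static int read_devsta(const uint8_t *cfg, size_t len, uint16_t *out)
{
	size_t cap = find_pcie_cap(cfg, len);

	if (!cap || cap + PCIE_DEVSTA + 1 >= len)
		return -1;

	*out = rd16(cfg, cap + PCIE_DEVSTA);
	return 0;
}
```

A nonzero value in the low four bits of the result (correctable, non-fatal, fatal, unsupported request) would be the condition under which the patch refuses a low-power transition, provided AER is handled natively for that device.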

> 
> [1] https://lore.kernel.org/all/965c9753-f14b-4a87-9f6d-8798e09ad6f5@gmail.com/
> 
> On 5/4/25 11:04, Raag Jadav wrote:
> 
>> If error flags are set on an AER capable device, most likely either the
>> device recovery is in progress or has already failed. Neither of the
>> cases are well suited for power state transition of the device, since
>> this can lead to unpredictable consequences like resume failure, or in
>> worst case the device is lost because of it. Leave the device in its
>> existing power state to avoid such issues.
>>
>> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
>> ---
>>
>> v2: Synchronize AER handling with PCI PM (Rafael)
>> v3: Move pci_aer_in_progress() to pci_set_low_power_state() (Rafael)
>>      Elaborate "why" (Bjorn)
>>
>> More discussion on [1].
>> [1] https://lore.kernel.org/all/CAJZ5v0g-aJXfVH+Uc=9eRPuW08t-6PwzdyMXsC6FZRKYJtY03Q@mail.gmail.com/
>>
>>   drivers/pci/pci.c      | 12 ++++++++++++
>>   drivers/pci/pcie/aer.c | 11 +++++++++++
>>   include/linux/aer.h    |  2 ++
>>   3 files changed, 25 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 4d7c9f64ea24..25b2df34336c 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -9,6 +9,7 @@
>>    */
>>   
>>   #include <linux/acpi.h>
>> +#include <linux/aer.h>
>>   #include <linux/kernel.h>
>>   #include <linux/delay.h>
>>   #include <linux/dmi.h>
>> @@ -1539,6 +1540,17 @@ static int pci_set_low_power_state(struct pci_dev *dev, pci_power_t state, bool
>>   	   || (state == PCI_D2 && !dev->d2_support))
>>   		return -EIO;
>>   
>> +	/*
>> +	 * If error flags are set on an AER capable device, most likely either
>> +	 * the device recovery is in progress or has already failed. Neither of
>> +	 * the cases are well suited for power state transition of the device,
>> +	 * since this can lead to unpredictable consequences like resume
>> +	 * failure, or in worst case the device is lost because of it. Leave the
>> +	 * device in its existing power state to avoid such issues.
>> +	 */
>> +	if (pci_aer_in_progress(dev))
>> +		return -EIO;
>> +
>>   	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>>   	if (PCI_POSSIBLE_ERROR(pmcsr)) {
>>   		pci_err(dev, "Unable to change power state from %s to %s, device inaccessible\n",
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index a1cf8c7ef628..4040770df4f0 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -237,6 +237,17 @@ int pcie_aer_is_native(struct pci_dev *dev)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(pcie_aer_is_native, "CXL");
>>   
>> +bool pci_aer_in_progress(struct pci_dev *dev)
>> +{
>> +	u16 reg16;
>> +
>> +	if (!pcie_aer_is_native(dev))
>> +		return false;
>> +
>> +	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &reg16);
>> +	return !!(reg16 & PCI_EXP_AER_FLAGS);
>> +}
>> +
>>   static int pci_enable_pcie_error_reporting(struct pci_dev *dev)
>>   {
>>   	int rc;
>> diff --git a/include/linux/aer.h b/include/linux/aer.h
>> index 02940be66324..e6a380bb2e68 100644
>> --- a/include/linux/aer.h
>> +++ b/include/linux/aer.h
>> @@ -56,12 +56,14 @@ struct aer_capability_regs {
>>   #if defined(CONFIG_PCIEAER)
>>   int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>>   int pcie_aer_is_native(struct pci_dev *dev);
>> +bool pci_aer_in_progress(struct pci_dev *dev);
>>   #else
>>   static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>>   {
>>   	return -EINVAL;
>>   }
>>   static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
>> +static inline bool pci_aer_in_progress(struct pci_dev *dev) { return false; }
>>   #endif
>>   
>>   void pci_print_aer(struct pci_dev *dev, int aer_severity,



* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-14 19:53   ` Mario Limonciello
@ 2025-05-14 21:25     ` Denis Benato
  2025-05-15 11:58       ` Raag Jadav
  0 siblings, 1 reply; 14+ messages in thread
From: Denis Benato @ 2025-05-14 21:25 UTC (permalink / raw)
  To: Mario Limonciello, Raag Jadav, rafael, mahesh, oohall, bhelgaas
  Cc: linux-pci, linux-kernel, ilpo.jarvinen, lukas, aravind.iddamsetty


On 5/14/25 21:53, Mario Limonciello wrote:
> On 5/14/2025 11:29 AM, Denis Benato wrote:
>> Hello,
>>
>> Lately I have been experiencing a few problems related to PCI, Thunderbolt, or both, and Mario Limonciello pointed me to this patch.
>>
>> You can follow an example of my problems in this bug report [1].
>>
>> I tested this patch on top of 6.14.6 and it comes with a nasty regression: s2idle resume breaks all three of my GPUs, while, for example, the sound of a YT video resumes fine.
>>
>> You can see the dmesg here: https://pastebin.com/Um7bmdWi
>>
>> On the bright side, this patch makes my laptop behave better on boot: the amdgpu plugged into the Thunderbolt port is always enabled at power-on, whereas without this patch it is random whether it will be active immediately after the laptop has been turned on.
>>
>
> Just for clarity - if you unplug your eGPU enclosure before suspend, is everything OK?  I.e., does this patch only have an impact on the USB4/TBT3 PCIe tunnels?
>
With this patch applied, the laptop seems to enter and exit s2idle fine when the Thunderbolt amdgpu is disconnected.

This probably either unveils a pre-existing Thunderbolt bug or creates a new one.  If you need assistance in finding the bug or investigating by any other means, let me know, as I want to see this patch merged once it stops regressing sleep with the eGPU.


As a visible effect, entering and exiting s2idle, even without the eGPU connected (so when sleep works), makes the screen backlight turn off and on rapidly about 6 times, which is a bit "concerning" to see. I also have the impression that it takes slightly longer to enter/exit s2idle.


> The errors after resume in amdgpu /look/ like the device is "missing" from the bus or otherwise not responding.
>
> I think it would be helpful to capture the kernel log with a baseline of 6.14.6 but without this patch for comparison of what this patch is actually causing.
>
I have a dmesg of the same 6.14.6 minus this patch ready: https://pastebin.com/kLZtibcD
>>
>> [1] https://lore.kernel.org/all/965c9753-f14b-4a87-9f6d-8798e09ad6f5@gmail.com/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-14 21:25     ` Denis Benato
@ 2025-05-15 11:58       ` Raag Jadav
  2025-05-15 12:26         ` Lukas Wunner
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Raag Jadav @ 2025-05-15 11:58 UTC (permalink / raw)
  To: Denis Benato
  Cc: Mario Limonciello, rafael, mahesh, oohall, bhelgaas, linux-pci,
	linux-kernel, ilpo.jarvinen, lukas, aravind.iddamsetty

On Wed, May 14, 2025 at 11:25:36PM +0200, Denis Benato wrote:
> On 5/14/25 21:53, Mario Limonciello wrote:
> > On 5/14/2025 11:29 AM, Denis Benato wrote:
> >> Hello,
> >>
> >> Lately I am experiencing a few problems related to either (one of or both) PCI and/or thunderbolt and Mario Limonciello pointed me to this patch.
> >>
> >> you can follow an example of my problems in this [1] bug report.
> >>
> >> I tested this patch on top of 6.14.6 and this patch comes with a nasty regression: s2idle resume breaks all my three GPUs, while for example the sound of a YT video resumes fine.
> >>
> >> You can see the dmesg here: https://pastebin.com/Um7bmdWi

Thanks for the report. From the logs it looks like a hotplug event is triggered
for presence detect, which disables the slot and in turn loses the device
on resume. The cause of it is unclear though (assuming it is not a manual
intervention).

> >> I will also say that, on the bright side, this patch makes my laptop behave better on boot as the amdgpu plugged on the thunderbolt port is always enabled on power on, while without this patch it is random if it will be active immediately after laptop has been turned on.
> >>
> >
> > Just for clarity - if you unplug your eGPU enclosure before suspend is everything OK?  IE this patch only has an impact to the USB4/TBT3 PCIe tunnels?
> >
> Laptop seems to enter and exit s2idle with the thunderbolt amdgpu disconnected using this patch too.
> 
> Probably this either unveils a pre-existing thunderbolt bug or creates a new one.  If you need assistance in finding the bug or investigating in any other mean let me know as I want to see this patch merged once it stops regressing sleep with egpu.

If you're observing this only on thunderbolt port, one experiment I could
think of is to configure the port power delivery to be always on during suspend
and observe. Perhaps enable both thunderbolt and PCI logging to help figure out
what's really happening.

> I will add that as a visible effect entering and exiting s2idle, even without the egpu connected (so when sleep works), makes the screen backlight to turn off and on rapidly about 6 times and it's a bit "concerning" to see, also I have the impression that it takes slightly longer to enter/exit s2idle.

Yes, I'm expecting a lot of hidden issues to be surfaced by this patch. Since
you've confirmed the machine itself is working fine, I'm hoping there are no
serious regressions.

Raag

> > The errors after resume in amdgpu /look/ like the device is "missing" from the bus or otherwise not responding.
> >
> > I think it would be helpful to capture the kernel log with a baseline of 6.14.6 but without this patch for comparison of what this patch is actually causing.
> >
> I have a dmesg of the same 6.14.6 minus this patch ready: https://pastebin.com/kLZtibcD
> >>
> >> [1] https://lore.kernel.org/all/965c9753-f14b-4a87-9f6d-8798e09ad6f5@gmail.com/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-15 11:58       ` Raag Jadav
@ 2025-05-15 12:26         ` Lukas Wunner
  2025-05-15 14:11         ` Denis Benato
  2025-05-21 13:33         ` Lukas Wunner
  2 siblings, 0 replies; 14+ messages in thread
From: Lukas Wunner @ 2025-05-15 12:26 UTC (permalink / raw)
  To: Raag Jadav
  Cc: Denis Benato, Mario Limonciello, rafael, mahesh, oohall, bhelgaas,
	linux-pci, linux-kernel, ilpo.jarvinen, aravind.iddamsetty

On Thu, May 15, 2025 at 02:58:25PM +0300, Raag Jadav wrote:
> On Wed, May 14, 2025 at 11:25:36PM +0200, Denis Benato wrote:
> > I tested this patch on top of 6.14.6 and this patch comes with a nasty regression: s2idle resume breaks all my three GPUs, while for example the sound of a YT video resumes fine.
> >
> > You can see the dmesg here: https://pastebin.com/Um7bmdWi
> 
> Thanks for the report. From logs it looks like a hotplug event is triggered
> for presence detect which is disabling the slot and in turn loosing the
> device on resume. The cause of it is unclear though (assuming it is not
> a manual intervention).

When an Endpoint transitions to D3cold, the link to the Endpoint goes
down.  If the Downstream Port above the Endpoint is hotplug-capable,
it will see a Data Link Layer State Changed event as a side effect.
If it doesn't support out-of-band presence detect, it will also see
a Presence Detect Changed event as a side effect.

As a workaround, graphics drivers invoke pci_ignore_hotplug() and that
will cause pciehp and acpiphp to permanently ignore any hotplug events.

In v6.16 there will be a new pci_hp_ignore_link_change() and
pci_hp_unignore_link_change() API to tell PCI hotplug drivers that
DLLSC and PDC events shall be ignored temporarily:

https://git.kernel.org/pci/pci/c/2af781a9edc4

I intend to replace pci_ignore_hotplug() with this new approach,
as explained here:

https://lore.kernel.org/r/Z_nfuGrVh_CO7vbe@wunner.de

I'm not sure though if that will help with the issue at hand.
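
To make the idea of temporarily ignoring link events concrete, here is a
minimal userspace sketch. It is a toy model and not the kernel
implementation: struct toy_port, toy_ignore_link_change(),
toy_unignore_link_change() and toy_handle_event() are hypothetical names
invented for illustration, and the real pci_hp_ignore_link_change() API may
well work differently internally. The model only shows that DLLSC and PDC
events arriving while the flag is set are dropped as expected side effects,
and are acted upon again once the flag is cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types a hotplug-capable Downstream Port may report. */
enum hp_event {
	HP_EVENT_DLLSC,	/* Data Link Layer State Changed */
	HP_EVENT_PDC,	/* Presence Detect Changed */
};

/* Toy stand-in for a hotplug port; this is not a kernel structure. */
struct toy_port {
	bool ignore_link_change;	/* set while a planned link-down is in flight */
	unsigned int handled;		/* events acted upon (slot would be disabled) */
	unsigned int ignored;		/* events dropped as expected side effects */
};

/* Start ignoring link change events on this port. */
void toy_ignore_link_change(struct toy_port *port)
{
	assert(port != NULL);
	port->ignore_link_change = true;
}

/* Stop ignoring link change events on this port. */
void toy_unignore_link_change(struct toy_port *port)
{
	assert(port != NULL);
	port->ignore_link_change = false;
}

/* Returns true if the event was acted upon, false if it was ignored. */
bool toy_handle_event(struct toy_port *port, enum hp_event ev)
{
	assert(port != NULL);
	(void)ev;	/* both event types are treated alike in this model */

	if (port->ignore_link_change) {
		port->ignored++;
		return false;
	}
	port->handled++;
	return true;
}
```

In this model a driver would bracket a planned D3cold transition with the
ignore/unignore pair, so that only unexpected events reach the hotplug
handling.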

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-15 11:58       ` Raag Jadav
  2025-05-15 12:26         ` Lukas Wunner
@ 2025-05-15 14:11         ` Denis Benato
  2025-05-15 14:41           ` Mario Limonciello
  2025-05-21 13:33         ` Lukas Wunner
  2 siblings, 1 reply; 14+ messages in thread
From: Denis Benato @ 2025-05-15 14:11 UTC (permalink / raw)
  To: Raag Jadav
  Cc: Mario Limonciello, rafael, mahesh, oohall, bhelgaas, linux-pci,
	linux-kernel, ilpo.jarvinen, lukas, aravind.iddamsetty


On 5/15/25 13:58, Raag Jadav wrote:
> On Wed, May 14, 2025 at 11:25:36PM +0200, Denis Benato wrote:
>> On 5/14/25 21:53, Mario Limonciello wrote:
>>> On 5/14/2025 11:29 AM, Denis Benato wrote:
>>>> Hello,
>>>>
>>>> Lately I am experiencing a few problems related to either (one of or both) PCI and/or thunderbolt and Mario Limonciello pointed me to this patch.
>>>>
>>>> you can follow an example of my problems in this [1] bug report.
>>>>
>>>> I tested this patch on top of 6.14.6 and this patch comes with a nasty regression: s2idle resume breaks all my three GPUs, while for example the sound of a YT video resumes fine.
>>>>
>>>> You can see the dmesg here: https://pastebin.com/Um7bmdWi
> Thanks for the report. From logs it looks like a hotplug event is triggered
> for presence detect which is disabling the slot and in turn loosing the device
> on resume. The cause of it is unclear though (assuming it is not a manual
> intervention).
No manual intervention: I do "sudo systemctl suspend", wait for the LED pattern of sleep and press space. Nothing more than this.

I also noticed that with this patch, while sleeping, the AMD GPU has its fans on, while this is not the case when sleeping without the patch.

>>>> I will also say that, on the bright side, this patch makes my laptop behave better on boot as the amdgpu plugged on the thunderbolt port is always enabled on power on, while without this patch it is random if it will be active immediately after laptop has been turned on.
>>>>
>>> Just for clarity - if you unplug your eGPU enclosure before suspend is everything OK?  IE this patch only has an impact to the USB4/TBT3 PCIe tunnels?
>>>
>> Laptop seems to enter and exit s2idle with the thunderbolt amdgpu disconnected using this patch too.
>>
>> Probably this either unveils a pre-existing thunderbolt bug or creates a new one.  If you need assistance in finding the bug or investigating in any other mean let me know as I want to see this patch merged once it stops regressing sleep with egpu.
> If you're observing this only on thunderbolt port, one experiment I could
> think of is to configure the port power delivery to be always on during suspend
> and observe. Perhaps enable both thunderbolt and PCI logging to help figure out
> what's really happening.
>
I have compiled the kernel with CONFIG_PCI_DEBUG=y and added to kernel cmdline "thunderbolt.dyndbg=+p pm_debug_messages" and here is the dmesg of a failed resume: https://pastebin.com/RsxXQQTm

Please let me know if this is not detailed enough, and how to enable more logging if you need it.
>> I will add that as a visible effect entering and exiting s2idle, even without the egpu connected (so when sleep works), makes the screen backlight to turn off and on rapidly about 6 times and it's a bit "concerning" to see, also I have the impression that it takes slightly longer to enter/exit s2idle.
> Yes, I'm expecting a lot of hidden issues to be surfaced by this patch. Since
> you've confirmed the machine itself is working fine, I'm hoping there are no
> serious regressions.
Except for Thunderbolt, nothing major stands out, but once that is solved I would run a test of s2idle power consumption because, as noted above, the amdgpu remains on during sleep and it might not be the only component.

Anyway thanks for your work and if you need more info just ask.

Denis

> Raag
>
>>> The errors after resume in amdgpu /look/ like the device is "missing" from the bus or otherwise not responding.
>>>
>>> I think it would be helpful to capture the kernel log with a baseline of 6.14.6 but without this patch for comparison of what this patch is actually causing.
>>>
>> I have a dmesg of the same 6.14.6 minus this patch ready: https://pastebin.com/kLZtibcD
>>>> [1] https://lore.kernel.org/all/965c9753-f14b-4a87-9f6d-8798e09ad6f5@gmail.com/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-15 14:11         ` Denis Benato
@ 2025-05-15 14:41           ` Mario Limonciello
  0 siblings, 0 replies; 14+ messages in thread
From: Mario Limonciello @ 2025-05-15 14:41 UTC (permalink / raw)
  To: Denis Benato, Raag Jadav
  Cc: rafael, mahesh, oohall, bhelgaas, linux-pci, linux-kernel,
	ilpo.jarvinen, lukas, aravind.iddamsetty

On 5/15/2025 9:11 AM, Denis Benato wrote:
> 
> On 5/15/25 13:58, Raag Jadav wrote:
>> On Wed, May 14, 2025 at 11:25:36PM +0200, Denis Benato wrote:
>>> On 5/14/25 21:53, Mario Limonciello wrote:
>>>> On 5/14/2025 11:29 AM, Denis Benato wrote:
>>>>> Hello,
>>>>>
>>>>> Lately I am experiencing a few problems related to either (one of or both) PCI and/or thunderbolt and Mario Limonciello pointed me to this patch.
>>>>>
>>>>> you can follow an example of my problems in this [1] bug report.
>>>>>
>>>>> I tested this patch on top of 6.14.6 and this patch comes with a nasty regression: s2idle resume breaks all my three GPUs, while for example the sound of a YT video resumes fine.
>>>>>
>>>>> You can see the dmesg here: https://pastebin.com/Um7bmdWi
>> Thanks for the report. From logs it looks like a hotplug event is triggered
>> for presence detect which is disabling the slot and in turn loosing the device
>> on resume. The cause of it is unclear though (assuming it is not a manual
>> intervention).
> No manual intervention: I do "sudo systemctl suspend", wait for the led pattern of sleep and press space. Nothing more than this.
> 
> I also noticed that with this patch, while sleeping, the amd gpu has fans on, while this is not the case sleeping without the patch.
> 
>>>>> I will also say that, on the bright side, this patch makes my laptop behave better on boot as the amdgpu plugged on the thunderbolt port is always enabled on power on, while without this patch it is random if it will be active immediately after laptop has been turned on.
>>>>>
>>>> Just for clarity - if you unplug your eGPU enclosure before suspend is everything OK?  IE this patch only has an impact to the USB4/TBT3 PCIe tunnels?
>>>>
>>> Laptop seems to enter and exit s2idle with the thunderbolt amdgpu disconnected using this patch too.
>>>
>>> Probably this either unveils a pre-existing thunderbolt bug or creates a new one.  If you need assistance in finding the bug or investigating in any other mean let me know as I want to see this patch merged once it stops regressing sleep with egpu.
>> If you're observing this only on thunderbolt port, one experiment I could
>> think of is to configure the port power delivery to be always on during suspend
>> and observe. Perhaps enable both thunderbolt and PCI logging to help figure out
>> what's really happening.
>>
> I have compiled the kernel with CONFIG_PCI_DEBUG=y and added to kernel cmdline "thunderbolt.dyndbg=+p pm_debug_messages" and here is the dmesg of a failed resume: https://pastebin.com/RsxXQQTm

What's really notable to me about this log is these two lines:

amdgpu 0000:09:00.0: PCI PM: Suspend power state: D0
amdgpu 0000:09:00.0: PCI PM: Skipped

The callpath is basically:

pci_pm_suspend_noirq()
->pci_prepare_to_sleep()
->->pci_set_power_state()
->->->__pci_set_power_state()
->->->->pci_set_low_power_state()

So the new pci_aer_in_progress() flags an error here and causes the dGPU 
in the eGPU enclosure to not go to suspend.

A simple workaround could be to skip the check in pci_aer_in_progress()
when pm_suspend_target_state is not PM_SUSPEND_ON, but that sounds like it
would be masking a problem.
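
The workaround described above can be sketched as a small userspace model.
This is only an illustration under stated assumptions, not a proposed kernel
change: struct toy_dev, enum toy_suspend_state and toy_aer_in_progress() are
hypothetical stand-ins for struct pci_dev, pm_suspend_target_state and
pci_aer_in_progress(), and TOY_AER_FLAGS is assumed to cover the four
error-detected bits of the PCIe Device Status register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error bits in the PCIe Device Status register (low four bits). */
#define TOY_DEVSTA_CED	0x0001	/* Correctable Error Detected */
#define TOY_DEVSTA_NFED	0x0002	/* Non-Fatal Error Detected */
#define TOY_DEVSTA_FED	0x0004	/* Fatal Error Detected */
#define TOY_DEVSTA_URD	0x0008	/* Unsupported Request Detected */
#define TOY_AER_FLAGS	(TOY_DEVSTA_CED | TOY_DEVSTA_NFED | \
			 TOY_DEVSTA_FED | TOY_DEVSTA_URD)

/* Hypothetical stand-in for the system-wide suspend target state. */
enum toy_suspend_state {
	TOY_SUSPEND_ON = 0,		/* no system sleep transition under way */
	TOY_SUSPEND_TO_IDLE = 1,	/* s2idle transition under way */
};

/* Hypothetical stand-in for the parts of a PCI device the check looks at. */
struct toy_dev {
	bool aer_native;	/* the OS owns AER for this device */
	uint16_t devsta;	/* last Device Status value read */
};

/*
 * Model of the check with the workaround applied: error flags only block
 * a power state transition outside of system suspend.
 */
bool toy_aer_in_progress(const struct toy_dev *dev,
			 enum toy_suspend_state target)
{
	assert(dev != NULL);

	if (!dev->aer_native)
		return false;

	/* Workaround under discussion: skip the check during system sleep. */
	if (target != TOY_SUSPEND_ON)
		return false;

	return (dev->devsta & TOY_AER_FLAGS) != 0;
}
```

With this model, a device that has an error bit set would still be refused a
runtime transition to a low-power state, but would be allowed to suspend
along with the rest of the system, which is exactly why it may hide the
underlying error rather than address it.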

> 
> Please let me know if this is not detailed enough, and how to enable more logging if you need it.
>>> I will add that as a visible effect entering and exiting s2idle, even without the egpu connected (so when sleep works), makes the screen backlight to turn off and on rapidly about 6 times and it's a bit "concerning" to see, also I have the impression that it takes slightly longer to enter/exit s2idle.
>> Yes, I'm expecting a lot of hidden issues to be surfaced by this patch. Since
>> you've confirmed the machine itself is working fine, I'm hoping there are no
>> serious regressions.
> Except that for thunderbolt nothing major stands out, but once that is solved I would conduct a test about s2idle power consumption because, as noted above, the amdgpu remains on during sleep and it might not be the only component.
> 
> Anyway thanks for your work and if you need more info just ask.
> 
> Denis
> 
>> Raag
>>
>>>> The errors after resume in amdgpu /look/ like the device is "missing" from the bus or otherwise not responding.
>>>>
>>>> I think it would be helpful to capture the kernel log with a baseline of 6.14.6 but without this patch for comparison of what this patch is actually causing.
>>>>
>>> I have a dmesg of the same 6.14.6 minus this patch ready: https://pastebin.com/kLZtibcD
>>>>> [1] https://lore.kernel.org/all/965c9753-f14b-4a87-9f6d-8798e09ad6f5@gmail.com/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3] PCI: Prevent power state transition of erroneous device
  2025-05-15 11:58       ` Raag Jadav
  2025-05-15 12:26         ` Lukas Wunner
  2025-05-15 14:11         ` Denis Benato
@ 2025-05-21 13:33         ` Lukas Wunner
  2 siblings, 0 replies; 14+ messages in thread
From: Lukas Wunner @ 2025-05-21 13:33 UTC (permalink / raw)
  To: Raag Jadav
  Cc: Denis Benato, Mario Limonciello, rafael, mahesh, oohall, bhelgaas,
	linux-pci, linux-kernel, ilpo.jarvinen, aravind.iddamsetty

On Thu, May 15, 2025 at 02:58:25PM +0300, Raag Jadav wrote:
> On Wed, May 14, 2025 at 11:25:36PM +0200, Denis Benato wrote:
> > >> You can see the dmesg here: https://pastebin.com/Um7bmdWi
> 
> Thanks for the report. From logs it looks like a hotplug event is triggered
> for presence detect which is disabling the slot and in turn loosing the
> device on resume. The cause of it is unclear though (assuming it is not
> a manual intervention).

Below the Root Port 0000:00:1a.0, there is a discrete Intel Thunderbolt
controller.  Attached to it is an ASMedia Thunderbolt controller.

The Presence Detect Changed event occurs at the Intel Thunderbolt
controller's Downstream Port.  Because PCIe is tunneled over Thunderbolt,
this means that the tunnel to the ASMedia controller could not be
re-established on resume, hence the Presence Detect Changed event.
This could be the result of the device being unplugged or some Thunderbolt
issue.

For the same reason, anything below the Intel Thunderbolt controller
is inaccessible on resume, hence the "Unable to change power state"
messages for the HDA controller on the AMD GPU.
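
As a rough illustration of why everything below a broken tunnel looks
inaccessible, the following userspace sketch models config reads through a
chain of ports. All names here (struct toy_node, toy_reachable(),
toy_read_pmcsr(), toy_possible_error()) are hypothetical; the only behavior
taken from the patch context is that a PMCSR read of all ones is treated as
"device inaccessible".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one device in the hierarchy; not a kernel structure. */
struct toy_node {
	const struct toy_node *parent;	/* upstream device, NULL at the root */
	bool link_up;			/* link (or tunnel) from parent to this node */
	uint16_t pmcsr;			/* value returned when the node is reachable */
};

/* A node is reachable only if every link on the path to the root is up. */
bool toy_reachable(const struct toy_node *n)
{
	for (; n != NULL; n = n->parent) {
		if (!n->link_up)
			return false;
	}
	return true;
}

/* Reads from an unreachable device come back as all ones. */
uint16_t toy_read_pmcsr(const struct toy_node *n)
{
	assert(n != NULL);
	return toy_reachable(n) ? n->pmcsr : 0xffff;
}

/* Mirrors the idea of an all-ones check on the value that was read. */
bool toy_possible_error(uint16_t val)
{
	return val == 0xffff;
}
```

In this model, marking the tunnel to the second Thunderbolt controller as
down makes every read from devices behind it return 0xffff, regardless of
the state those devices are actually in.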

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-05-21 13:33 UTC | newest]

Thread overview: 14+ messages
2025-05-04  9:04 [PATCH v3] PCI: Prevent power state transition of erroneous device Raag Jadav
2025-05-12 10:23 ` Raag Jadav
2025-05-12 11:56 ` Rafael J. Wysocki
2025-05-12 16:52   ` Raag Jadav
2025-05-13  4:41 ` Lukas Wunner
2025-05-13 15:17   ` Raag Jadav
2025-05-14 16:29 ` Denis Benato
2025-05-14 19:53   ` Mario Limonciello
2025-05-14 21:25     ` Denis Benato
2025-05-15 11:58       ` Raag Jadav
2025-05-15 12:26         ` Lukas Wunner
2025-05-15 14:11         ` Denis Benato
2025-05-15 14:41           ` Mario Limonciello
2025-05-21 13:33         ` Lukas Wunner
