public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6] PCI/PM: Skip resuming to D0 if disconnected
@ 2025-08-11 16:35 Mario Limonciello (AMD)
  2025-09-05 18:38 ` Mario Limonciello
  2025-09-08 21:03 ` Bjorn Helgaas
  0 siblings, 2 replies; 6+ messages in thread
From: Mario Limonciello (AMD) @ 2025-08-11 16:35 UTC (permalink / raw)
  To: Bjorn Helgaas, open list:PCI SUBSYSTEM
  Cc: linux-pm, Rafael J . Wysocki, Mario Limonciello, Lukas Wunner,
	Ilpo Järvinen, Rafael J . Wysocki

When a USB4 dock is unplugged the PCIe bridge it's connected to will
issue a "Link Down" and "Card not detected event". The PCI core will
treat this as a surprise hotplug event and unconfigure all downstream
devices. This involves setting the device error state to
`pci_channel_io_perm_failure` which pci_dev_is_disconnected() will check.

It doesn't make sense to runtime resume disconnected devices to D0 and
report the (expected) error, so bail early.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
---
v6:
 * rebase on v6.17-rc1
v5:
 * Pick up tags, rebase on linux-next
 * https://lore.kernel.org/linux-pci/20250709205948.3888045-1-superm1@kernel.org/T/#mbd784f786c50a3d1b5ab1833520995c01eae2fd2
---
 drivers/pci/pci.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b0f4d98036cdd..036511f5b2625 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1374,6 +1374,11 @@ int pci_power_up(struct pci_dev *dev)
 		return -EIO;
 	}
 
+	if (pci_dev_is_disconnected(dev)) {
+		dev->current_state = PCI_D3cold;
+		return -EIO;
+	}
+
 	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
 	if (PCI_POSSIBLE_ERROR(pmcsr)) {
 		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",

base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] PCI/PM: Skip resuming to D0 if disconnected
  2025-08-11 16:35 [PATCH v6] PCI/PM: Skip resuming to D0 if disconnected Mario Limonciello (AMD)
@ 2025-09-05 18:38 ` Mario Limonciello
  2025-09-08 21:03 ` Bjorn Helgaas
  1 sibling, 0 replies; 6+ messages in thread
From: Mario Limonciello @ 2025-09-05 18:38 UTC (permalink / raw)
  To: Bjorn Helgaas, open list:PCI SUBSYSTEM
  Cc: linux-pm, Rafael J . Wysocki, Lukas Wunner, Ilpo Järvinen,
	Rafael J . Wysocki

On 8/11/2025 11:35 AM, Mario Limonciello (AMD) wrote:
> When a USB4 dock is unplugged the PCIe bridge it's connected to will
> issue a "Link Down" and "Card not detected event". The PCI core will
> treat this as a surprise hotplug event and unconfigure all downstream
> devices. This involves setting the device error state to
> `pci_channel_io_perm_failure` which pci_dev_is_disconnected() will check.
> 
> It doesn't make sense to runtime resume disconnected devices to D0 and
> report the (expected) error, so bail early.
> 
> Suggested-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

Bjorn,

Can you take a look at this one?

> ---
> v6:
>   * rebase on v6.17-rc1
> v5:
>   * Pick up tags, rebase on linux-next
>   * https://lore.kernel.org/linux-pci/20250709205948.3888045-1-superm1@kernel.org/T/#mbd784f786c50a3d1b5ab1833520995c01eae2fd2
> ---
>   drivers/pci/pci.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b0f4d98036cdd..036511f5b2625 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1374,6 +1374,11 @@ int pci_power_up(struct pci_dev *dev)
>   		return -EIO;
>   	}
>   
> +	if (pci_dev_is_disconnected(dev)) {
> +		dev->current_state = PCI_D3cold;
> +		return -EIO;
> +	}
> +
>   	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>   	if (PCI_POSSIBLE_ERROR(pmcsr)) {
>   		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
> 
> base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] PCI/PM: Skip resuming to D0 if disconnected
  2025-08-11 16:35 [PATCH v6] PCI/PM: Skip resuming to D0 if disconnected Mario Limonciello (AMD)
  2025-09-05 18:38 ` Mario Limonciello
@ 2025-09-08 21:03 ` Bjorn Helgaas
  2025-09-08 21:29   ` Mario Limonciello (kernel.org)
  1 sibling, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2025-09-08 21:03 UTC (permalink / raw)
  To: Mario Limonciello (AMD)
  Cc: Bjorn Helgaas, open list:PCI SUBSYSTEM, linux-pm,
	Rafael J . Wysocki, Lukas Wunner, Ilpo Järvinen,
	Rafael J . Wysocki

On Mon, Aug 11, 2025 at 11:35:10AM -0500, Mario Limonciello (AMD) wrote:
> When a USB4 dock is unplugged the PCIe bridge it's connected to will
> issue a "Link Down" and "Card not detected event". The PCI core will
> treat this as a surprise hotplug event and unconfigure all downstream
> devices. This involves setting the device error state to
> `pci_channel_io_perm_failure` which pci_dev_is_disconnected() will check.

There's nothing special about USB4 here, right?  I guess the same
happens with any surprise hotplug remove?

pciehp_unconfigure_device() does the pci_dev_set_io_state(dev,
pci_channel_io_perm_failure) part on everything that got removed, so
that part is pretty straightforward.

> It doesn't make sense to runtime resume disconnected devices to D0 and
> report the (expected) error, so bail early.

Can you include a hint about where the runtime resume happens?  It
seems unintuitive to power up removed devices.

> Suggested-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
> ---
> v6:
>  * rebase on v6.17-rc1
> v5:
>  * Pick up tags, rebase on linux-next
>  * https://lore.kernel.org/linux-pci/20250709205948.3888045-1-superm1@kernel.org/T/#mbd784f786c50a3d1b5ab1833520995c01eae2fd2
> ---
>  drivers/pci/pci.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b0f4d98036cdd..036511f5b2625 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1374,6 +1374,11 @@ int pci_power_up(struct pci_dev *dev)
>  		return -EIO;
>  	}
>  
> +	if (pci_dev_is_disconnected(dev)) {
> +		dev->current_state = PCI_D3cold;
> +		return -EIO;
> +	}
> +
>  	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>  	if (PCI_POSSIBLE_ERROR(pmcsr)) {
>  		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
> 
> base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] PCI/PM: Skip resuming to D0 if disconnected
  2025-09-08 21:03 ` Bjorn Helgaas
@ 2025-09-08 21:29   ` Mario Limonciello (kernel.org)
  2025-09-08 21:48     ` Bjorn Helgaas
  0 siblings, 1 reply; 6+ messages in thread
From: Mario Limonciello (kernel.org) @ 2025-09-08 21:29 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, open list:PCI SUBSYSTEM, linux-pm,
	Rafael J . Wysocki, Lukas Wunner, Ilpo Järvinen,
	Rafael J . Wysocki



On 9/8/2025 4:03 PM, Bjorn Helgaas wrote:
> On Mon, Aug 11, 2025 at 11:35:10AM -0500, Mario Limonciello (AMD) wrote:
>> When a USB4 dock is unplugged the PCIe bridge it's connected to will
>> issue a "Link Down" and "Card not detected event". The PCI core will
>> treat this as a surprise hotplug event and unconfigure all downstream
>> devices. This involves setting the device error state to
>> `pci_channel_io_perm_failure` which pci_dev_is_disconnected() will check.
> 
> There's nothing special about USB4 here, right?  I guess the same
> happens with any surprise hotplug remove?

Correct.

> 
> pciehp_unconfigure_device() does the pci_dev_set_io_state(dev,
> pci_channel_io_perm_failure) part on everything that got removed, so
> that part is pretty straightforward.
> 
>> It doesn't make sense to runtime resume disconnected devices to D0 and
>> report the (expected) error, so bail early.
> 
> Can you include a hint about where the runtime resume happens?  It
> seems unintuitive to power up removed devices.

Here's the whole trace.  Basically as part of the base driver release 
the device is runtime resumed.

Do you want me to respin and try to incorporate a sentence or two into 
the commit text?

[   37.237841] dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
[   37.237858] pci_power_up.cold+0x86/0xc1
[   37.237867] ? __pfx_pci_power_up (drivers/pci/pci.c:1360)
[   37.237876] ? prb_commit (kernel/printk/printk_ringbuffer.c:1748)
[   37.237883] ? __pfx_prb_read_valid 
(kernel/printk/printk_ringbuffer.c:2184)
[   37.237889] pci_pm_power_up_and_verify_state (drivers/pci/pci.c:1146 
drivers/pci/pci.c:1201 drivers/pci/pci.c:3206)
[   37.237897] ? __pfx_pci_pm_power_up_and_verify_state 
(drivers/pci/pci.c:3204)
[   37.237903] ? __pfx_rpm_resume (drivers/base/power/runtime.c:785)
[   37.237911] pci_pm_runtime_resume (drivers/pci/pci-driver.c:561 
drivers/pci/pci-driver.c:1349)
[   37.237918] __rpm_callback (drivers/base/power/runtime.c:406)
[   37.237923] ? __pfx__raw_spin_lock (kernel/locking/spinlock.c:153)
[   37.237932] rpm_callback (drivers/base/power/runtime.c:444)
[   37.237936] ? __pfx_pci_pm_runtime_resume (drivers/pci/pci-driver.c:1338)
[   37.237941] rpm_resume (drivers/base/power/runtime.c:943)
[   37.237946] ? __pfx_rpm_resume (drivers/base/power/runtime.c:785)
[   37.237951] ? _raw_spin_lock_irqsave 
(./include/linux/instrumented.h:96 
./include/linux/atomic/atomic-instrumented.h:1301 
./include/asm-generic/qspinlock.h:111 ./include/linux/spinlock.h:187 
./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
[   37.237954] ? __pfx__raw_spin_lock_irqsave 
(kernel/locking/spinlock.c:161)
[   37.237958] ? mutex_lock (./include/linux/instrumented.h:96 
./include/linux/atomic/atomic-instrumented.h:4457 
kernel/locking/mutex.c:157 kernel/locking/mutex.c:273)
[   37.237965] __pm_runtime_resume (drivers/base/power/runtime.c:1192)
[   37.237971] device_release_driver_internal (drivers/base/dd.c:1111 
(discriminator 1) drivers/base/dd.c:1252 (discriminator 1) 
drivers/base/dd.c:1297 (discriminator 1))
[   37.237978] ? pci_pme_active (drivers/pci/pci.c:2521)
[   37.237984] pci_stop_bus_device (drivers/pci/remove.c:44 
drivers/pci/remove.c:107)
[   37.237992] pci_stop_bus_device (drivers/pci/remove.c:102 
(discriminator 1))
[   37.237997] pci_stop_and_remove_bus_device (drivers/pci/remove.c:142)
[   37.238003] pciehp_unconfigure_device 
(drivers/pci/hotplug/pciehp_pci.c:124)
[   37.238011] ? __pfx_pciehp_unconfigure_device 
(drivers/pci/hotplug/pciehp_pci.c:96)
[   37.238017] ? _dev_info (drivers/base/core.c:4983)
[   37.238025] pciehp_disable_slot 
(drivers/pci/hotplug/pciehp_ctrl.c:115 
drivers/pci/hotplug/pciehp_ctrl.c:355 drivers/pci/hotplug/pciehp_ctrl.c:364)
[   37.238030] ? __pfx_pciehp_disable_slot 
(drivers/pci/hotplug/pciehp_ctrl.c:360)
[   37.238035] ? __pfx_mutex_unlock (kernel/locking/mutex.c:531)
[   37.238039] ? mutex_lock_interruptible (kernel/locking/mutex.c:992)
[   37.238047] pciehp_handle_presence_or_link_change 
(drivers/pci/hotplug/pciehp_ctrl.c:253)
[   37.238054] ? down_read (kernel/locking/rwsem.c:174 
kernel/locking/rwsem.c:182 kernel/locking/rwsem.c:257 
kernel/locking/rwsem.c:249 kernel/locking/rwsem.c:1260 
kernel/locking/rwsem.c:1274 kernel/locking/rwsem.c:1539)
[   37.238061] ? __pfx_pciehp_handle_presence_or_link_change 
(drivers/pci/hotplug/pciehp_ctrl.c:232)
[   37.238068] ? __pfx_down_read (kernel/locking/rwsem.c:1535)
[   37.238074] ? __pfx_pciehp_ist (drivers/pci/hotplug/pciehp_hpc.c:728)
[   37.238079] pciehp_ist (drivers/pci/hotplug/pciehp_hpc.c:788)
[   37.238085] ? __pfx_pciehp_ist (drivers/pci/hotplug/pciehp_hpc.c:728)
[   37.238091] irq_thread_fn (kernel/irq/manage.c:1131)
[   37.238098] irq_thread (./arch/x86/include/asm/bitops.h:206 
./arch/x86/include/asm/bitops.h:238 
./include/asm-generic/bitops/instrumented-non-atomic.h:142 
kernel/irq/manage.c:1244)
[   37.238104] ? __pfx_irq_thread_fn (kernel/irq/manage.c:1130)
[   37.238110] ? __pfx_irq_thread (kernel/irq/manage.c:1233)
[   37.238115] ? __pfx_irq_thread_dtor (kernel/irq/manage.c:1167)
[   37.238123] ? __kthread_parkme (./arch/x86/include/asm/bitops.h:206 
./arch/x86/include/asm/bitops.h:238 
./include/asm-generic/bitops/instrumented-non-atomic.h:142 
kernel/kthread.c:290)
[   37.238130] ? __pfx_irq_thread (kernel/irq/manage.c:1233)
[   37.238136] ? __pfx_irq_thread (kernel/irq/manage.c:1233)
[   37.238141] kthread (kernel/kthread.c:463)
[   37.238147] ? __pfx_kthread (kernel/kthread.c:412)
[   37.238153] ? finish_task_switch.isra.0 
(./arch/x86/include/asm/irqflags.h:42 
./arch/x86/include/asm/irqflags.h:119 kernel/sched/sched.h:1531 
kernel/sched/core.c:5105 kernel/sched/core.c:5223)
[   37.238161] ? __pfx_kthread (kernel/kthread.c:412)
[   37.238166] ret_from_fork (arch/x86/kernel/process.c:154)
[   37.238175] ? __pfx_kthread (kernel/kthread.c:412)
[   37.238180] ? __pfx_kthread (kernel/kthread.c:412)
[   37.238184] ret_from_fork_asm (arch/x86/entry/entry_64.S:258)
[   37.238192]  </TASK>

> 
>> Suggested-by: Lukas Wunner <lukas@wunner.de>
>> Reviewed-by: Lukas Wunner <lukas@wunner.de>
>> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
>> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
>> ---
>> v6:
>>   * rebase on v6.17-rc1
>> v5:
>>   * Pick up tags, rebase on linux-next
>>   * https://lore.kernel.org/linux-pci/20250709205948.3888045-1-superm1@kernel.org/T/#mbd784f786c50a3d1b5ab1833520995c01eae2fd2
>> ---
>>   drivers/pci/pci.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index b0f4d98036cdd..036511f5b2625 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1374,6 +1374,11 @@ int pci_power_up(struct pci_dev *dev)
>>   		return -EIO;
>>   	}
>>   
>> +	if (pci_dev_is_disconnected(dev)) {
>> +		dev->current_state = PCI_D3cold;
>> +		return -EIO;
>> +	}
>> +
>>   	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>>   	if (PCI_POSSIBLE_ERROR(pmcsr)) {
>>   		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
>>
>> base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
>> -- 
>> 2.43.0
>>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] PCI/PM: Skip resuming to D0 if disconnected
  2025-09-08 21:29   ` Mario Limonciello (kernel.org)
@ 2025-09-08 21:48     ` Bjorn Helgaas
  2025-09-08 21:53       ` Mario Limonciello (kernel.org)
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2025-09-08 21:48 UTC (permalink / raw)
  To: Mario Limonciello (kernel.org)
  Cc: Bjorn Helgaas, open list:PCI SUBSYSTEM, linux-pm,
	Rafael J . Wysocki, Lukas Wunner, Ilpo Järvinen,
	Rafael J . Wysocki

On Mon, Sep 08, 2025 at 04:29:32PM -0500, Mario Limonciello (kernel.org) wrote:
> On 9/8/2025 4:03 PM, Bjorn Helgaas wrote:
> > On Mon, Aug 11, 2025 at 11:35:10AM -0500, Mario Limonciello (AMD) wrote:
> > > When a USB4 dock is unplugged the PCIe bridge it's connected to will
> > > issue a "Link Down" and "Card not detected event". The PCI core will
> > > treat this as a surprise hotplug event and unconfigure all downstream
> > > devices. This involves setting the device error state to
> > > `pci_channel_io_perm_failure` which pci_dev_is_disconnected() will check.
> > 
> > There's nothing special about USB4 here, right?  I guess the same
> > happens with any surprise hotplug remove?
> 
> Correct.
> 
> > pciehp_unconfigure_device() does the pci_dev_set_io_state(dev,
> > pci_channel_io_perm_failure) part on everything that got removed, so
> > that part is pretty straightforward.
> > 
> > > It doesn't make sense to runtime resume disconnected devices to D0 and
> > > report the (expected) error, so bail early.
> > 
> > Can you include a hint about where the runtime resume happens?  It
> > seems unintuitive to power up removed devices.
> 
> Here's the whole trace.  Basically as part of the base driver release the
> device is runtime resumed.
> 
> Do you want me to respin and try to incorporate a sentence or two into the
> commit text?

Yes, please.  I guess maybe we are basically here inside the
pm_runtime_get_sync() here?

  pciehp_unconfigure_device
    pci_stop_and_remove_bus_device
      device_release_driver
        device_remove
          pci_device_remove
            pm_runtime_get_sync    <--
              drv->remove

I guess it's the pm_runtime_get_sync() that eventually lands us in
pci_power_up()?

And I guess we can't really check whether the device is already gone
in pci_device_remove() because we still want to run the driver
.remove() method and other things to clean up.

I suppose everything we call just needs to be aware that its device
may be already gone?

> [   37.237841] dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
> [   37.237858] pci_power_up.cold+0x86/0xc1
> [   37.237867] ? __pfx_pci_power_up (drivers/pci/pci.c:1360)
> [   37.237876] ? prb_commit (kernel/printk/printk_ringbuffer.c:1748)
> [   37.237883] ? __pfx_prb_read_valid
> (kernel/printk/printk_ringbuffer.c:2184)
> [   37.237889] pci_pm_power_up_and_verify_state (drivers/pci/pci.c:1146
> drivers/pci/pci.c:1201 drivers/pci/pci.c:3206)
> [   37.237897] ? __pfx_pci_pm_power_up_and_verify_state
> (drivers/pci/pci.c:3204)
> [   37.237903] ? __pfx_rpm_resume (drivers/base/power/runtime.c:785)
> [   37.237911] pci_pm_runtime_resume (drivers/pci/pci-driver.c:561
> drivers/pci/pci-driver.c:1349)
> [   37.237918] __rpm_callback (drivers/base/power/runtime.c:406)
> [   37.237923] ? __pfx__raw_spin_lock (kernel/locking/spinlock.c:153)
> [   37.237932] rpm_callback (drivers/base/power/runtime.c:444)
> [   37.237936] ? __pfx_pci_pm_runtime_resume (drivers/pci/pci-driver.c:1338)
> [   37.237941] rpm_resume (drivers/base/power/runtime.c:943)
> [   37.237946] ? __pfx_rpm_resume (drivers/base/power/runtime.c:785)
> [   37.237951] ? _raw_spin_lock_irqsave (./include/linux/instrumented.h:96
> ./include/linux/atomic/atomic-instrumented.h:1301
> ./include/asm-generic/qspinlock.h:111 ./include/linux/spinlock.h:187
> ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
> [   37.237954] ? __pfx__raw_spin_lock_irqsave
> (kernel/locking/spinlock.c:161)
> [   37.237958] ? mutex_lock (./include/linux/instrumented.h:96
> ./include/linux/atomic/atomic-instrumented.h:4457 kernel/locking/mutex.c:157
> kernel/locking/mutex.c:273)
> [   37.237965] __pm_runtime_resume (drivers/base/power/runtime.c:1192)
> [   37.237971] device_release_driver_internal (drivers/base/dd.c:1111
> (discriminator 1) drivers/base/dd.c:1252 (discriminator 1)
> drivers/base/dd.c:1297 (discriminator 1))
> [   37.237978] ? pci_pme_active (drivers/pci/pci.c:2521)
> [   37.237984] pci_stop_bus_device (drivers/pci/remove.c:44
> drivers/pci/remove.c:107)
> [   37.237992] pci_stop_bus_device (drivers/pci/remove.c:102 (discriminator
> 1))
> [   37.237997] pci_stop_and_remove_bus_device (drivers/pci/remove.c:142)
> [   37.238003] pciehp_unconfigure_device
> (drivers/pci/hotplug/pciehp_pci.c:124)
> [   37.238011] ? __pfx_pciehp_unconfigure_device
> (drivers/pci/hotplug/pciehp_pci.c:96)
> [   37.238017] ? _dev_info (drivers/base/core.c:4983)
> [   37.238025] pciehp_disable_slot (drivers/pci/hotplug/pciehp_ctrl.c:115
> drivers/pci/hotplug/pciehp_ctrl.c:355 drivers/pci/hotplug/pciehp_ctrl.c:364)
> [   37.238030] ? __pfx_pciehp_disable_slot
> (drivers/pci/hotplug/pciehp_ctrl.c:360)
> [   37.238035] ? __pfx_mutex_unlock (kernel/locking/mutex.c:531)
> [   37.238039] ? mutex_lock_interruptible (kernel/locking/mutex.c:992)
> [   37.238047] pciehp_handle_presence_or_link_change
> (drivers/pci/hotplug/pciehp_ctrl.c:253)
> [   37.238054] ? down_read (kernel/locking/rwsem.c:174
> kernel/locking/rwsem.c:182 kernel/locking/rwsem.c:257
> kernel/locking/rwsem.c:249 kernel/locking/rwsem.c:1260
> kernel/locking/rwsem.c:1274 kernel/locking/rwsem.c:1539)
> [   37.238061] ? __pfx_pciehp_handle_presence_or_link_change
> (drivers/pci/hotplug/pciehp_ctrl.c:232)
> [   37.238068] ? __pfx_down_read (kernel/locking/rwsem.c:1535)
> [   37.238074] ? __pfx_pciehp_ist (drivers/pci/hotplug/pciehp_hpc.c:728)
> [   37.238079] pciehp_ist (drivers/pci/hotplug/pciehp_hpc.c:788)
> [   37.238085] ? __pfx_pciehp_ist (drivers/pci/hotplug/pciehp_hpc.c:728)
> [   37.238091] irq_thread_fn (kernel/irq/manage.c:1131)
> [   37.238098] irq_thread (./arch/x86/include/asm/bitops.h:206
> ./arch/x86/include/asm/bitops.h:238
> ./include/asm-generic/bitops/instrumented-non-atomic.h:142
> kernel/irq/manage.c:1244)
> [   37.238104] ? __pfx_irq_thread_fn (kernel/irq/manage.c:1130)
> [   37.238110] ? __pfx_irq_thread (kernel/irq/manage.c:1233)
> [   37.238115] ? __pfx_irq_thread_dtor (kernel/irq/manage.c:1167)
> [   37.238123] ? __kthread_parkme (./arch/x86/include/asm/bitops.h:206
> ./arch/x86/include/asm/bitops.h:238
> ./include/asm-generic/bitops/instrumented-non-atomic.h:142
> kernel/kthread.c:290)
> [   37.238130] ? __pfx_irq_thread (kernel/irq/manage.c:1233)
> [   37.238136] ? __pfx_irq_thread (kernel/irq/manage.c:1233)
> [   37.238141] kthread (kernel/kthread.c:463)
> [   37.238147] ? __pfx_kthread (kernel/kthread.c:412)
> [   37.238153] ? finish_task_switch.isra.0
> (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:119
> kernel/sched/sched.h:1531 kernel/sched/core.c:5105 kernel/sched/core.c:5223)
> [   37.238161] ? __pfx_kthread (kernel/kthread.c:412)
> [   37.238166] ret_from_fork (arch/x86/kernel/process.c:154)
> [   37.238175] ? __pfx_kthread (kernel/kthread.c:412)
> [   37.238180] ? __pfx_kthread (kernel/kthread.c:412)
> [   37.238184] ret_from_fork_asm (arch/x86/entry/entry_64.S:258)
> [   37.238192]  </TASK>
> 
> > 
> > > Suggested-by: Lukas Wunner <lukas@wunner.de>
> > > Reviewed-by: Lukas Wunner <lukas@wunner.de>
> > > Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > > Acked-by: Rafael J. Wysocki <rafael@kernel.org>
> > > Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
> > > ---
> > > v6:
> > >   * rebase on v6.17-rc1
> > > v5:
> > >   * Pick up tags, rebase on linux-next
> > >   * https://lore.kernel.org/linux-pci/20250709205948.3888045-1-superm1@kernel.org/T/#mbd784f786c50a3d1b5ab1833520995c01eae2fd2
> > > ---
> > >   drivers/pci/pci.c | 5 +++++
> > >   1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index b0f4d98036cdd..036511f5b2625 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -1374,6 +1374,11 @@ int pci_power_up(struct pci_dev *dev)
> > >   		return -EIO;
> > >   	}
> > > +	if (pci_dev_is_disconnected(dev)) {
> > > +		dev->current_state = PCI_D3cold;
> > > +		return -EIO;
> > > +	}
> > > +
> > >   	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> > >   	if (PCI_POSSIBLE_ERROR(pmcsr)) {
> > >   		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
> > > 
> > > base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
> > > -- 
> > > 2.43.0
> > > 
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6] PCI/PM: Skip resuming to D0 if disconnected
  2025-09-08 21:48     ` Bjorn Helgaas
@ 2025-09-08 21:53       ` Mario Limonciello (kernel.org)
  0 siblings, 0 replies; 6+ messages in thread
From: Mario Limonciello (kernel.org) @ 2025-09-08 21:53 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Bjorn Helgaas, open list:PCI SUBSYSTEM, linux-pm,
	Rafael J . Wysocki, Lukas Wunner, Ilpo Järvinen,
	Rafael J . Wysocki



On 9/8/2025 4:48 PM, Bjorn Helgaas wrote:
> On Mon, Sep 08, 2025 at 04:29:32PM -0500, Mario Limonciello (kernel.org) wrote:
>> On 9/8/2025 4:03 PM, Bjorn Helgaas wrote:
>>> On Mon, Aug 11, 2025 at 11:35:10AM -0500, Mario Limonciello (AMD) wrote:
>>>> When a USB4 dock is unplugged the PCIe bridge it's connected to will
>>>> issue a "Link Down" and "Card not detected event". The PCI core will
>>>> treat this as a surprise hotplug event and unconfigure all downstream
>>>> devices. This involves setting the device error state to
>>>> `pci_channel_io_perm_failure` which pci_dev_is_disconnected() will check.
>>>
>>> There's nothing special about USB4 here, right?  I guess the same
>>> happens with any surprise hotplug remove?
>>
>> Correct.
>>
>>> pciehp_unconfigure_device() does the pci_dev_set_io_state(dev,
>>> pci_channel_io_perm_failure) part on everything that got removed, so
>>> that part is pretty straightforward.
>>>
>>>> It doesn't make sense to runtime resume disconnected devices to D0 and
>>>> report the (expected) error, so bail early.
>>>
>>> Can you include a hint about where the runtime resume happens?  It
>>> seems unintuitive to power up removed devices.
>>
>> Here's the whole trace.  Basically as part of the base driver release the
>> device is runtime resumed.
>>
>> Do you want me to respin and try to incorporate a sentence or two into the
>> commit text?
> 
> Yes, please.  

OK, will do.
> I guess maybe we are basically here inside the
> pm_runtime_get_sync() here?
> 
>    pciehp_unconfigure_device
>      pci_stop_and_remove_bus_device
>        device_release_driver
>          device_remove
>            pci_device_remove
>              pm_runtime_get_sync    <--
>                drv->remove
> 
> I guess it's the pm_runtime_get_sync() that eventually lands us in
> pci_power_up()?
> 
> And I guess we can't really check whether the device is already gone
> in pci_device_remove() because we still want to run the driver
> .remove() method and other things to clean up.
> 
> I suppose everything we call just needs to be aware that its device
> may be already gone?

Yes, exactly.

> 
>> [   37.237841] dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
>> [   37.237858] pci_power_up.cold+0x86/0xc1
>> [   37.237867] ? __pfx_pci_power_up (drivers/pci/pci.c:1360)
>> [   37.237876] ? prb_commit (kernel/printk/printk_ringbuffer.c:1748)
>> [   37.237883] ? __pfx_prb_read_valid
>> (kernel/printk/printk_ringbuffer.c:2184)
>> [   37.237889] pci_pm_power_up_and_verify_state (drivers/pci/pci.c:1146
>> drivers/pci/pci.c:1201 drivers/pci/pci.c:3206)
>> [   37.237897] ? __pfx_pci_pm_power_up_and_verify_state
>> (drivers/pci/pci.c:3204)
>> [   37.237903] ? __pfx_rpm_resume (drivers/base/power/runtime.c:785)
>> [   37.237911] pci_pm_runtime_resume (drivers/pci/pci-driver.c:561
>> drivers/pci/pci-driver.c:1349)
>> [   37.237918] __rpm_callback (drivers/base/power/runtime.c:406)
>> [   37.237923] ? __pfx__raw_spin_lock (kernel/locking/spinlock.c:153)
>> [   37.237932] rpm_callback (drivers/base/power/runtime.c:444)
>> [   37.237936] ? __pfx_pci_pm_runtime_resume (drivers/pci/pci-driver.c:1338)
>> [   37.237941] rpm_resume (drivers/base/power/runtime.c:943)
>> [   37.237946] ? __pfx_rpm_resume (drivers/base/power/runtime.c:785)
>> [   37.237951] ? _raw_spin_lock_irqsave (./include/linux/instrumented.h:96
>> ./include/linux/atomic/atomic-instrumented.h:1301
>> ./include/asm-generic/qspinlock.h:111 ./include/linux/spinlock.h:187
>> ./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
>> [   37.237954] ? __pfx__raw_spin_lock_irqsave
>> (kernel/locking/spinlock.c:161)
>> [   37.237958] ? mutex_lock (./include/linux/instrumented.h:96
>> ./include/linux/atomic/atomic-instrumented.h:4457 kernel/locking/mutex.c:157
>> kernel/locking/mutex.c:273)
>> [   37.237965] __pm_runtime_resume (drivers/base/power/runtime.c:1192)
>> [   37.237971] device_release_driver_internal (drivers/base/dd.c:1111
>> (discriminator 1) drivers/base/dd.c:1252 (discriminator 1)
>> drivers/base/dd.c:1297 (discriminator 1))
>> [   37.237978] ? pci_pme_active (drivers/pci/pci.c:2521)
>> [   37.237984] pci_stop_bus_device (drivers/pci/remove.c:44
>> drivers/pci/remove.c:107)
>> [   37.237992] pci_stop_bus_device (drivers/pci/remove.c:102 (discriminator
>> 1))
>> [   37.237997] pci_stop_and_remove_bus_device (drivers/pci/remove.c:142)
>> [   37.238003] pciehp_unconfigure_device
>> (drivers/pci/hotplug/pciehp_pci.c:124)
>> [   37.238011] ? __pfx_pciehp_unconfigure_device
>> (drivers/pci/hotplug/pciehp_pci.c:96)
>> [   37.238017] ? _dev_info (drivers/base/core.c:4983)
>> [   37.238025] pciehp_disable_slot (drivers/pci/hotplug/pciehp_ctrl.c:115
>> drivers/pci/hotplug/pciehp_ctrl.c:355 drivers/pci/hotplug/pciehp_ctrl.c:364)
>> [   37.238030] ? __pfx_pciehp_disable_slot
>> (drivers/pci/hotplug/pciehp_ctrl.c:360)
>> [   37.238035] ? __pfx_mutex_unlock (kernel/locking/mutex.c:531)
>> [   37.238039] ? mutex_lock_interruptible (kernel/locking/mutex.c:992)
>> [   37.238047] pciehp_handle_presence_or_link_change
>> (drivers/pci/hotplug/pciehp_ctrl.c:253)
>> [   37.238054] ? down_read (kernel/locking/rwsem.c:174
>> kernel/locking/rwsem.c:182 kernel/locking/rwsem.c:257
>> kernel/locking/rwsem.c:249 kernel/locking/rwsem.c:1260
>> kernel/locking/rwsem.c:1274 kernel/locking/rwsem.c:1539)
>> [   37.238061] ? __pfx_pciehp_handle_presence_or_link_change
>> (drivers/pci/hotplug/pciehp_ctrl.c:232)
>> [   37.238068] ? __pfx_down_read (kernel/locking/rwsem.c:1535)
>> [   37.238074] ? __pfx_pciehp_ist (drivers/pci/hotplug/pciehp_hpc.c:728)
>> [   37.238079] pciehp_ist (drivers/pci/hotplug/pciehp_hpc.c:788)
>> [   37.238085] ? __pfx_pciehp_ist (drivers/pci/hotplug/pciehp_hpc.c:728)
>> [   37.238091] irq_thread_fn (kernel/irq/manage.c:1131)
>> [   37.238098] irq_thread (./arch/x86/include/asm/bitops.h:206
>> ./arch/x86/include/asm/bitops.h:238
>> ./include/asm-generic/bitops/instrumented-non-atomic.h:142
>> kernel/irq/manage.c:1244)
>> [   37.238104] ? __pfx_irq_thread_fn (kernel/irq/manage.c:1130)
>> [   37.238110] ? __pfx_irq_thread (kernel/irq/manage.c:1233)
>> [   37.238115] ? __pfx_irq_thread_dtor (kernel/irq/manage.c:1167)
>> [   37.238123] ? __kthread_parkme (./arch/x86/include/asm/bitops.h:206
>> ./arch/x86/include/asm/bitops.h:238
>> ./include/asm-generic/bitops/instrumented-non-atomic.h:142
>> kernel/kthread.c:290)
>> [   37.238130] ? __pfx_irq_thread (kernel/irq/manage.c:1233)
>> [   37.238136] ? __pfx_irq_thread (kernel/irq/manage.c:1233)
>> [   37.238141] kthread (kernel/kthread.c:463)
>> [   37.238147] ? __pfx_kthread (kernel/kthread.c:412)
>> [   37.238153] ? finish_task_switch.isra.0
>> (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:119
>> kernel/sched/sched.h:1531 kernel/sched/core.c:5105 kernel/sched/core.c:5223)
>> [   37.238161] ? __pfx_kthread (kernel/kthread.c:412)
>> [   37.238166] ret_from_fork (arch/x86/kernel/process.c:154)
>> [   37.238175] ? __pfx_kthread (kernel/kthread.c:412)
>> [   37.238180] ? __pfx_kthread (kernel/kthread.c:412)
>> [   37.238184] ret_from_fork_asm (arch/x86/entry/entry_64.S:258)
>> [   37.238192]  </TASK>
>>
>>>
>>>> Suggested-by: Lukas Wunner <lukas@wunner.de>
>>>> Reviewed-by: Lukas Wunner <lukas@wunner.de>
>>>> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>>> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
>>>> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
>>>> ---
>>>> v6:
>>>>    * rebase on v6.17-rc1
>>>> v5:
>>>>    * Pick up tags, rebase on linux-next
>>>>    * https://lore.kernel.org/linux-pci/20250709205948.3888045-1-superm1@kernel.org/T/#mbd784f786c50a3d1b5ab1833520995c01eae2fd2
>>>> ---
>>>>    drivers/pci/pci.c | 5 +++++
>>>>    1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>> index b0f4d98036cdd..036511f5b2625 100644
>>>> --- a/drivers/pci/pci.c
>>>> +++ b/drivers/pci/pci.c
>>>> @@ -1374,6 +1374,11 @@ int pci_power_up(struct pci_dev *dev)
>>>>    		return -EIO;
>>>>    	}
>>>> +	if (pci_dev_is_disconnected(dev)) {
>>>> +		dev->current_state = PCI_D3cold;
>>>> +		return -EIO;
>>>> +	}
>>>> +
>>>>    	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
>>>>    	if (PCI_POSSIBLE_ERROR(pmcsr)) {
>>>>    		pci_err(dev, "Unable to change power state from %s to D0, device inaccessible\n",
>>>>
>>>> base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
>>>> -- 
>>>> 2.43.0
>>>>
>>


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-09-08 21:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-11 16:35 [PATCH v6] PCI/PM: Skip resuming to D0 if disconnected Mario Limonciello (AMD)
2025-09-05 18:38 ` Mario Limonciello
2025-09-08 21:03 ` Bjorn Helgaas
2025-09-08 21:29   ` Mario Limonciello (kernel.org)
2025-09-08 21:48     ` Bjorn Helgaas
2025-09-08 21:53       ` Mario Limonciello (kernel.org)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox