* [PATCH] PCI: Fix AB-BA deadlock between aer_isr() and device_shutdown()
@ 2026-01-09 9:56 Ziming Du
2026-01-13 18:51 ` Bjorn Helgaas
2026-02-24 6:40 ` Lukas Wunner
0 siblings, 2 replies; 4+ messages in thread
From: Ziming Du @ 2026-01-09 9:56 UTC (permalink / raw)
To: bhelgaas, okaya, keith.busch
Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2
During system shutdown, a deadlock may occur between AER recovery process
and device shutdown as follows:
The device_shutdown path holds the device_lock throughout the entire
process and waits for the irq handlers to complete when release nodes:
device_shutdown
device_lock # A hold device_lock
pci_device_shutdown
pcie_port_device_remove
remove_iter
device_unregister
device_del
bus_remove_device
device_release_driver
devres_release_all
release_nodes # B wait for irq handlers
The aer_isr path will acquire device_lock in pci_bus_reset():
aer_isr # B execute irq process
aer_isr_one_error
aer_process_err_devices
handle_error_source
pcie_do_recovery
aer_root_reset
pci_bus_error_reset
pci_bus_reset # A acquire device_lock
The circular dependency causes system hang. Fix it by using
pci_bus_trylock() instead of pci_bus_lock() in pci_bus_reset(). When the
lock is unavailable, return -EAGAIN, as in similar cases.
Fixes: c4eed62a2143 ("PCI/ERR: Use slot reset if available")
Signed-off-by: Ziming Du <duziming2@huawei.com>
---
drivers/pci/pci.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 13dbb405dc31..7471bfa6f32e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5515,15 +5515,22 @@ static int pci_bus_reset(struct pci_bus *bus, bool probe)
if (probe)
return 0;
- pci_bus_lock(bus);
+ /*
+ * Replace blocking lock with trylock to prevent deadlock during bus reset.
+ * Same as above except return -EAGAIN if the bus cannot be locked.
+ */
+ if (pci_bus_trylock(bus)) {
- might_sleep();
+ might_sleep();
- ret = pci_bridge_secondary_bus_reset(bus->self);
+ ret = pci_bridge_secondary_bus_reset(bus->self);
- pci_bus_unlock(bus);
+ pci_bus_unlock(bus);
- return ret;
+ return ret;
+ }
+
+ return -EAGAIN;
}
/**
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] PCI: Fix AB-BA deadlock between aer_isr() and device_shutdown()
2026-01-09 9:56 [PATCH] PCI: Fix AB-BA deadlock between aer_isr() and device_shutdown() Ziming Du
@ 2026-01-13 18:51 ` Bjorn Helgaas
2026-01-15 2:50 ` duziming
2026-02-24 6:40 ` Lukas Wunner
1 sibling, 1 reply; 4+ messages in thread
From: Bjorn Helgaas @ 2026-01-13 18:51 UTC (permalink / raw)
To: Ziming Du
Cc: bhelgaas, okaya, keith.busch, linux-pci, linux-kernel,
liuyongqiang13
On Fri, Jan 09, 2026 at 05:56:03PM +0800, Ziming Du wrote:
> During system shutdown, a deadlock may occur between AER recovery process
> and device shutdown as follows:
>
> The device_shutdown path holds the device_lock throughout the entire
> process and waits for the irq handlers to complete when release nodes:
>
> device_shutdown
> device_lock # A hold device_lock
> pci_device_shutdown
> pcie_port_device_remove
> remove_iter
> device_unregister
> device_del
> bus_remove_device
> device_release_driver
> devres_release_all
> release_nodes # B wait for irq handlers
Can you add the wait location to these example? release_nodes()
doesn't wait itself, so I guess it must be in a dr->node.release()
function?
And I guess it must be related to something in the IRQ path that is
held while aer_isr() runs?
> The aer_isr path will acquire device_lock in pci_bus_reset():
>
> aer_isr # B execute irq process
> aer_isr_one_error
> aer_process_err_devices
> handle_error_source
> pcie_do_recovery
> aer_root_reset
> pci_bus_error_reset
> pci_bus_reset # A acquire device_lock
>
> The circular dependency causes system hang. Fix it by using
> pci_bus_trylock() instead of pci_bus_lock() in pci_bus_reset(). When the
> lock is unavailable, return -EAGAIN, as in similar cases.
pci_bus_error_reset() may use either pci_slot_reset() or
pci_bus_reset(), and this patch addresses only pci_bus_reset(). Is
the same deadlock possible in the pci_slot_reset() path?
> Fixes: c4eed62a2143 ("PCI/ERR: Use slot reset if available")
> Signed-off-by: Ziming Du <duziming2@huawei.com>
> ---
> drivers/pci/pci.c | 17 ++++++++++++-----
> 1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 13dbb405dc31..7471bfa6f32e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5515,15 +5515,22 @@ static int pci_bus_reset(struct pci_bus *bus, bool probe)
> if (probe)
> return 0;
>
> - pci_bus_lock(bus);
> + /*
> + * Replace blocking lock with trylock to prevent deadlock during bus reset.
> + * Same as above except return -EAGAIN if the bus cannot be locked.
Wrap this to fit in 80 columns like the rest of the file.
> + */
> + if (pci_bus_trylock(bus)) {
>
> - might_sleep();
> + might_sleep();
>
> - ret = pci_bridge_secondary_bus_reset(bus->self);
> + ret = pci_bridge_secondary_bus_reset(bus->self);
>
> - pci_bus_unlock(bus);
> + pci_bus_unlock(bus);
>
> - return ret;
> + return ret;
> + }
> +
> + return -EAGAIN;
> }
>
> /**
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] PCI: Fix AB-BA deadlock between aer_isr() and device_shutdown()
2026-01-13 18:51 ` Bjorn Helgaas
@ 2026-01-15 2:50 ` duziming
0 siblings, 0 replies; 4+ messages in thread
From: duziming @ 2026-01-15 2:50 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: bhelgaas, okaya, keith.busch, linux-pci, linux-kernel,
liuyongqiang13
在 2026/1/14 2:51, Bjorn Helgaas 写道:
> On Fri, Jan 09, 2026 at 05:56:03PM +0800, Ziming Du wrote:
>> During system shutdown, a deadlock may occur between AER recovery process
>> and device shutdown as follows:
>>
>> The device_shutdown path holds the device_lock throughout the entire
>> process and waits for the irq handlers to complete when release nodes:
>>
>> device_shutdown
>> device_lock # A hold device_lock
>> pci_device_shutdown
>> pcie_port_device_remove
>> remove_iter
>> device_unregister
>> device_del
>> bus_remove_device
>> device_release_driver
>> devres_release_all
>> release_nodes # B wait for irq handlers
> Can you add the wait location to these example? release_nodes()
> doesn't wait itself, so I guess it must be in a dr->node.release()
> function?
>
> And I guess it must be related to something in the IRQ path that is
> held while aer_isr() runs?
When releasing the interrupt resources, the process eventually calls
free_irq(), and then
__synchronize_irq () will be called to wait until all irq handlers have
finished.
>> The aer_isr path will acquire device_lock in pci_bus_reset():
>>
>> aer_isr # B execute irq process
>> aer_isr_one_error
>> aer_process_err_devices
>> handle_error_source
>> pcie_do_recovery
>> aer_root_reset
>> pci_bus_error_reset
>> pci_bus_reset # A acquire device_lock
>>
>> The circular dependency causes system hang. Fix it by using
>> pci_bus_trylock() instead of pci_bus_lock() in pci_bus_reset(). When the
>> lock is unavailable, return -EAGAIN, as in similar cases.
> pci_bus_error_reset() may use either pci_slot_reset() or
> pci_bus_reset(), and this patch addresses only pci_bus_reset(). Is
> the same deadlock possible in the pci_slot_reset() path?
Looking at the code flow, I agree that there is likely a potential issue
here.
Unfortunately, my current test environment does not support slot_reset, so
I haven't been able to reproduce this specific scenario locally. It would be
incredibly helpful if someone with a compatible setup could help verify
or reproduce this behavior.
>> Fixes: c4eed62a2143 ("PCI/ERR: Use slot reset if available")
>> Signed-off-by: Ziming Du <duziming2@huawei.com>
>> ---
>> drivers/pci/pci.c | 17 ++++++++++++-----
>> 1 file changed, 12 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 13dbb405dc31..7471bfa6f32e 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -5515,15 +5515,22 @@ static int pci_bus_reset(struct pci_bus *bus, bool probe)
>> if (probe)
>> return 0;
>>
>> - pci_bus_lock(bus);
>> + /*
>> + * Replace blocking lock with trylock to prevent deadlock during bus reset.
>> + * Same as above except return -EAGAIN if the bus cannot be locked.
> Wrap this to fit in 80 columns like the rest of the file.
>
>> + */
>> + if (pci_bus_trylock(bus)) {
>>
>> - might_sleep();
>> + might_sleep();
>>
>> - ret = pci_bridge_secondary_bus_reset(bus->self);
>> + ret = pci_bridge_secondary_bus_reset(bus->self);
>>
>> - pci_bus_unlock(bus);
>> + pci_bus_unlock(bus);
>>
>> - return ret;
>> + return ret;
>> + }
>> +
>> + return -EAGAIN;
>> }
>>
>> /**
>> --
>> 2.43.0
>>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] PCI: Fix AB-BA deadlock between aer_isr() and device_shutdown()
2026-01-09 9:56 [PATCH] PCI: Fix AB-BA deadlock between aer_isr() and device_shutdown() Ziming Du
2026-01-13 18:51 ` Bjorn Helgaas
@ 2026-02-24 6:40 ` Lukas Wunner
1 sibling, 0 replies; 4+ messages in thread
From: Lukas Wunner @ 2026-02-24 6:40 UTC (permalink / raw)
To: Ziming Du
Cc: bhelgaas, okaya, keith.busch, linux-pci, linux-kernel,
liuyongqiang13, Alex Williamson
On Fri, Jan 09, 2026 at 05:56:03PM +0800, Ziming Du wrote:
> During system shutdown, a deadlock may occur between AER recovery process
> and device shutdown as follows:
The subject is slightly misleading as this isn't an AB-BA deadlock,
which involves two locks. It's a deadlock involving a single lock
(device_lock), where one task (shutdown) acquires the lock, then
waits for the AER interrupt thread to finish, but that thread is
waiting on the lock.
device_shutdown() acquires the device_lock to avoid invoking a driver's
->shutdown() callback while its ->probe() callback is still running or
while the driver is being removed, cf. d1c6c030fcec. That seems
reasonable.
It's unclear why pci_bus_reset() needs to acquire device_lock. This was
introduced by 090a3c5322e9. I'm adding Alex (the author) to cc.
Another question to ask is whether it makes sense at all to attempt
error recovery when the system is shutting down. Maybe we should log
the errors, but no longer try to recover from them?
It's possible to determine whether shutdown is in progress by querying
system_state (set by kernel/reboot.c). However we can't just skip
calling pci_bus_error_reset() in aer_root_reset() if system_state
indicates shutdown because it would still be racy. The only race-free
solution would be to register a notifier with reboot_notifier_list
which sets a flag that shutdown is in progress and waits for the
interrupt thread to finish. It's quite a complicated solution just
to work around a deadlock, so I suggest to first look into removal of
device_lock acquisition in pci_bus_reset().
Simply using trylock doesn't seem bullet-proof.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-02-24 6:40 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-09 9:56 [PATCH] PCI: Fix AB-BA deadlock between aer_isr() and device_shutdown() Ziming Du
2026-01-13 18:51 ` Bjorn Helgaas
2026-01-15 2:50 ` duziming
2026-02-24 6:40 ` Lukas Wunner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox