Linux PCI subsystem development
 help / color / mirror / Atom feed
* [PATCH RESEND 0/1] pcie hotplug deadlock fix
@ 2025-10-16 19:30 Keith Busch
  2025-10-16 19:30 ` [PATCH RESEND 1/1] PCI: pciehp: fix concurrent sub-tree removal deadlock Keith Busch
  0 siblings, 1 reply; 4+ messages in thread
From: Keith Busch @ 2025-10-16 19:30 UTC (permalink / raw)
  To: linux-pci, lukas, helgaas; +Cc: Keith Busch

From: Keith Busch <kbusch@kernel.org>

It's been a while since the last resend, and no movement on this so far.
This problem has been reported multiple times over the years, first one
I can find is from here:

  https://lore.kernel.org/lkml/1519648875-38196-1-git-send-email-wangxiongfeng2@huawei.com/

Lacking any other progress, I once again ask that we can take fixes for
this issue, as perfection is the providing to be a blocker to
improvement. The patch from this series is what we run in the Meta fleet
already and it has proven tremendously valuable at maintaining uptime
across the machines. No issues have been found with running it, so
asking once again if we can consider it for upstream.

Keith Busch (1):
  PCI: pciehp: fix concurrent sub-tree removal deadlock

 drivers/pci/hotplug/pciehp_pci.c | 12 +++++++++---
 drivers/pci/pci.h                |  1 +
 drivers/pci/probe.c              | 25 +++++++++++++++++++++++++
 include/linux/pci.h              |  2 ++
 4 files changed, 37 insertions(+), 3 deletions(-)

-- 
2.47.3


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH RESEND 1/1] PCI: pciehp: fix concurrent sub-tree removal deadlock
  2025-10-16 19:30 [PATCH RESEND 0/1] pcie hotplug deadlock fix Keith Busch
@ 2025-10-16 19:30 ` Keith Busch
  2025-10-17 10:06   ` kernel test robot
  2025-10-17 10:06   ` kernel test robot
  0 siblings, 2 replies; 4+ messages in thread
From: Keith Busch @ 2025-10-16 19:30 UTC (permalink / raw)
  To: linux-pci, lukas, helgaas; +Cc: Keith Busch

From: Keith Busch <kbusch@kernel.org>

PCIe hotplug events modify the topology in their IRQ thread once it can
acquire the global pci_rescan_remove_lock.

If a different removal event happens to acquire that lock first, and
that removal event is for the parent device of the bridge processing the
other hotplug event, then we are deadlocked: the parent removal will
wait indefinitely on the child's IRQ thread because the parent is
holding the global lock the child thread needs to make forward progress.

Introduce a new locking function that aborts if the device is being
removed. The following are stack traces of the deadlock:

Task A:

  pciehp_unconfigure_device+0x41/0x120
  pciehp_disable_slot+0x3c/0xc0
  pciehp_handle_presence_or_link_change+0x28f/0x3e0
  pciehp_ist+0xc3/0x210
  irq_thread_fn+0x19/0x40

Task B:

  __synchronize_irq+0x5b/0x90
  free_irq+0x192/0x2e0
  pcie_shutdown_notification+0x3b/0x40
  pciehp_remove+0x23/0x50
  pcie_port_remove_service+0x2c/0x40
  device_release_driver_internal+0x11f/0x180
  bus_remove_device+0xc5/0x110
  device_del+0x126/0x340
  device_unregister+0x13/0x50
  remove_iter+0x17/0x20
  device_for_each_child+0x4a/0x70
  pcie_portdrv_remove+0x23/0x40
  pci_device_remove+0x24/0x60
  device_release_driver_internal+0x11f/0x180
  pci_stop_bus_device+0x57/0x80
  pci_stop_bus_device+0x2c/0x80
  pci_stop_and_remove_bus_device+0xe/0x20
  pciehp_unconfigure_device+0x76/0x120
  pciehp_disable_slot+0x3c/0xc0
  pciehp_handle_presence_or_link_change+0x28f/0x3e0
  pciehp_ist+0xc3/0x210
  irq_thread_fn+0x19/0x40

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/pci/hotplug/pciehp_pci.c | 12 +++++++++---
 drivers/pci/pci.h                |  1 +
 drivers/pci/probe.c              | 25 +++++++++++++++++++++++++
 include/linux/pci.h              |  2 ++
 4 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 65e50bee1a8c0..8ad4f8a931d0b 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -34,9 +34,12 @@ int pciehp_configure_device(struct controller *ctrl)
 	struct pci_dev *dev;
 	struct pci_dev *bridge = ctrl->pcie->port;
 	struct pci_bus *parent = bridge->subordinate;
-	int num, ret = 0;
+	int num, ret;
 
-	pci_lock_rescan_remove();
+	ret = pci_trylock_rescan_remove(bridge);
+	if (!ret)
+		return -ENODEV;
+	ret = 0;
 
 	dev = pci_get_slot(parent, PCI_DEVFN(0, 0));
 	if (dev) {
@@ -97,6 +100,7 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
 	struct pci_dev *dev, *temp;
 	struct pci_bus *parent = ctrl->pcie->port->subordinate;
 	u16 command;
+	int ret;
 
 	ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n",
 		 __func__, pci_domain_nr(parent), parent->number);
@@ -104,7 +108,9 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
 	if (!presence)
 		pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
 
-	pci_lock_rescan_remove();
+	ret = pci_trylock_rescan_remove(parent->self);
+	if (!ret)
+		return;
 
 	/*
 	 * Stopping an SR-IOV PF device removes all the associated VFs,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4492b809094b5..0b2f1f5f99ac7 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -650,6 +650,7 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
 {
 	pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
 	pci_doe_disconnected(dev);
+	pci_notify_disconnected();
 
 	return 0;
 }
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c83e75a0ec126..5086e23eb84d9 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3515,12 +3515,37 @@ void pci_lock_rescan_remove(void)
 }
 EXPORT_SYMBOL_GPL(pci_lock_rescan_remove);
 
+static DECLARE_WAIT_QUEUE_HEAD(pci_lock_wq);
+
+/*
+ * pci_trylock_rescan_remove() - keep trying to take the lock until successful
+ *				 or notified the device is disconnected
+ *
+ * Returns 1 if the lock was successfully taken, 0 otherwise.
+ */
+bool pci_trylock_rescan_remove(struct pci_dev *dev)
+{
+	int ret;
+
+	wait_event(pci_lock_wq,
+		   (ret = mutex_trylock(&pci_rescan_remove_lock)) == 1 ||
+		   pci_dev_is_disconnected(dev));
+
+	return ret;
+}
+
 void pci_unlock_rescan_remove(void)
 {
 	mutex_unlock(&pci_rescan_remove_lock);
+	wake_up_all(&pci_lock_wq);
 }
 EXPORT_SYMBOL_GPL(pci_unlock_rescan_remove);
 
+void pci_notify_disconnected(void)
+{
+	wake_up_all(&pci_lock_wq);
+}
+
 static int __init pci_sort_bf_cmp(const struct device *d_a,
 				  const struct device *d_b)
 {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d1fdf81fbe1e4..059d99c31204f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1478,7 +1478,9 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev);
 /* Functions for PCI Hotplug drivers to use */
 unsigned int pci_rescan_bus(struct pci_bus *bus);
 void pci_lock_rescan_remove(void);
+bool pci_trylock_rescan_remove(struct pci_dev *dev);
 void pci_unlock_rescan_remove(void);
+void pci_notify_disconnected(void);
 
 /* Vital Product Data routines */
 ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf);
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH RESEND 1/1] PCI: pciehp: fix concurrent sub-tree removal deadlock
  2025-10-16 19:30 ` [PATCH RESEND 1/1] PCI: pciehp: fix concurrent sub-tree removal deadlock Keith Busch
@ 2025-10-17 10:06   ` kernel test robot
  2025-10-17 10:06   ` kernel test robot
  1 sibling, 0 replies; 4+ messages in thread
From: kernel test robot @ 2025-10-17 10:06 UTC (permalink / raw)
  To: Keith Busch, linux-pci, lukas, helgaas; +Cc: llvm, oe-kbuild-all, Keith Busch

Hi Keith,

kernel test robot noticed the following build errors:

[auto build test ERROR on pci/next]
[also build test ERROR on pci/for-linus linus/master v6.18-rc1 next-20251016]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Keith-Busch/PCI-pciehp-fix-concurrent-sub-tree-removal-deadlock/20251017-033133
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next
patch link:    https://lore.kernel.org/r/20251016193049.881212-2-kbusch%40meta.com
patch subject: [PATCH RESEND 1/1] PCI: pciehp: fix concurrent sub-tree removal deadlock
config: arm-allnoconfig (https://download.01.org/0day-ci/archive/20251017/202510171720.AKF8sDMd-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 754ebc6ebb9fb9fbee7aef33478c74ea74949853)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251017/202510171720.AKF8sDMd-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510171720.AKF8sDMd-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/pci/of.c:18:
>> drivers/pci/pci.h:653:2: error: call to undeclared function 'pci_notify_disconnected'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     653 |         pci_notify_disconnected();
         |         ^
   drivers/pci/pci.h:653:2: note: did you mean 'pci_doe_disconnected'?
   drivers/pci/pci.h:597:20: note: 'pci_doe_disconnected' declared here
     597 | static inline void pci_doe_disconnected(struct pci_dev *pdev) { }
         |                    ^
   1 error generated.


vim +/pci_notify_disconnected +653 drivers/pci/pci.h

   648	
   649	static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
   650	{
   651		pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
   652		pci_doe_disconnected(dev);
 > 653		pci_notify_disconnected();
   654	
   655		return 0;
   656	}
   657	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH RESEND 1/1] PCI: pciehp: fix concurrent sub-tree removal deadlock
  2025-10-16 19:30 ` [PATCH RESEND 1/1] PCI: pciehp: fix concurrent sub-tree removal deadlock Keith Busch
  2025-10-17 10:06   ` kernel test robot
@ 2025-10-17 10:06   ` kernel test robot
  1 sibling, 0 replies; 4+ messages in thread
From: kernel test robot @ 2025-10-17 10:06 UTC (permalink / raw)
  To: Keith Busch, linux-pci, lukas, helgaas; +Cc: oe-kbuild-all, Keith Busch

Hi Keith,

kernel test robot noticed the following build errors:

[auto build test ERROR on pci/next]
[also build test ERROR on pci/for-linus linus/master v6.18-rc1 next-20251016]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Keith-Busch/PCI-pciehp-fix-concurrent-sub-tree-removal-deadlock/20251017-033133
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next
patch link:    https://lore.kernel.org/r/20251016193049.881212-2-kbusch%40meta.com
patch subject: [PATCH RESEND 1/1] PCI: pciehp: fix concurrent sub-tree removal deadlock
config: arc-allnoconfig (https://download.01.org/0day-ci/archive/20251017/202510171744.AnmSOlry-lkp@intel.com/config)
compiler: arc-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251017/202510171744.AnmSOlry-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510171744.AnmSOlry-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/pci/of.c:18:
   drivers/pci/pci.h: In function 'pci_dev_set_disconnected':
>> drivers/pci/pci.h:653:9: error: implicit declaration of function 'pci_notify_disconnected'; did you mean 'pci_doe_disconnected'? [-Wimplicit-function-declaration]
     653 |         pci_notify_disconnected();
         |         ^~~~~~~~~~~~~~~~~~~~~~~
         |         pci_doe_disconnected


vim +653 drivers/pci/pci.h

   648	
   649	static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
   650	{
   651		pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
   652		pci_doe_disconnected(dev);
 > 653		pci_notify_disconnected();
   654	
   655		return 0;
   656	}
   657	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-10-17 10:07 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-16 19:30 [PATCH RESEND 0/1] pcie hotplug deadlock fix Keith Busch
2025-10-16 19:30 ` [PATCH RESEND 1/1] PCI: pciehp: fix concurrent sub-tree removal deadlock Keith Busch
2025-10-17 10:06   ` kernel test robot
2025-10-17 10:06   ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox