linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes
@ 2025-07-15 21:31 Timothy Pearson
  2025-07-15 21:36 ` [PATCH v3 1/6] PCI: pnv_php: Properly clean up allocated IRQs on unplug Timothy Pearson
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Timothy Pearson @ 2025-07-15 21:31 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, linux-pci, Madhavan Srinivasan, Michael Ellerman,
	christophe leroy, Naveen N Rao, Bjorn Helgaas, Shawn Anastasio

Hello all,

This series includes several fixes for bugs in the PowerNV PCIe hotplug
driver that were discovered in testing with a Microsemi Switchtec PM8533
PFX 48xG3 PCIe switch on a PowerNV system, as well as one workaround for
PCIe switches that don't correctly implement slot presence detection
such as the aforementioned one. Without the workaround, the switch works
and downstream devices can be hot-unplugged, but the devices never come
back online after being plugged in again until the system is rebooted.
Other hotplug drivers (like pciehp_hpc) use a similar workaround.

Also included are fixes for the EEH driver to make it hotplug safe,
and a small patch to enable all three attention indicator states per
the PCIe specification.

Thanks,

Shawn Anastasio (2):
  PCI: pnv_php: Properly clean up allocated IRQs on unplug
  PCI: pnv_php: Work around switches with broken presence detection

Timothy Pearson (4):
  powerpc/eeh: Export eeh_unfreeze_pe()
  powerpc/eeh: Make EEH driver device hotplug safe
  PCI: pnv_php: Fix surprise plug detection and recovery
  PCI: pnv_php: Enable third attention indicator state

 arch/powerpc/kernel/eeh.c         |   1 +
 arch/powerpc/kernel/eeh_driver.c  |  48 ++++--
 arch/powerpc/kernel/eeh_pe.c      |  10 +-
 arch/powerpc/kernel/pci-hotplug.c |   3 +
 drivers/pci/hotplug/pnv_php.c     | 244 +++++++++++++++++++++++++++---
 5 files changed, 263 insertions(+), 43 deletions(-)

-- 
2.39.5


* [PATCH v3 1/6] PCI: pnv_php: Properly clean up allocated IRQs on unplug
  2025-07-15 21:31 [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
@ 2025-07-15 21:36 ` Timothy Pearson
  2025-07-15 21:36 ` [PATCH v3 2/6] PCI: pnv_php: Work around switches with broken presence detection Timothy Pearson
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Timothy Pearson @ 2025-07-15 21:36 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

In cases where the root of a nested PCIe bridge configuration is
unplugged, the pnv_php driver would leak the allocated IRQ resources for
the child bridges' hotplug event notifications, resulting in a panic.
Fix this by walking all child buses and deallocating all of their IRQ
resources before calling pci_hp_remove_devices().

Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so
that it is only destroyed in pnv_php_free_slot, instead of
pnv_php_disable_irq. This is required since pnv_php_disable_irq will now
be called by workers triggered by hot unplug interrupts, so the
workqueue needs to stay allocated.
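
As a rough illustration of the traversal described above (not part of the
patch), the following user-space sketch walks a toy bus tree depth-first and
disables the hotplug IRQ of every slot below a given bus while leaving that
bus's own slots alone. The toy_* types and functions are hypothetical
stand-ins for struct pci_bus and struct pnv_php_slot, not kernel interfaces.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a hotplug slot and a PCI bus. */
struct toy_slot {
	int irq_enabled;          /* 1 while the slot's hotplug IRQ is active */
	struct toy_slot *next;    /* next slot on the same bus */
};

struct toy_bus {
	struct toy_slot *slots;   /* slots attached to this bus */
	struct toy_bus *children; /* first child bus */
	struct toy_bus *sibling;  /* next bus under the same parent */
};

/* Disable IRQs on every slot of this bus and of all buses below it.
 * Returns the number of IRQs that were switched off. */
int toy_disable_all_irqs(struct toy_bus *bus)
{
	int disabled = 0;
	struct toy_bus *child;
	struct toy_slot *slot;

	/* Descend first so the deepest slots are handled before their parents. */
	for (child = bus->children; child; child = child->sibling)
		disabled += toy_disable_all_irqs(child);

	for (slot = bus->slots; slot; slot = slot->next) {
		if (slot->irq_enabled) {
			slot->irq_enabled = 0;
			disabled++;
		}
	}
	return disabled;
}

/* Disable IRQs below the given bus, leaving its own slots untouched so the
 * root slot can still signal a later (re)plug event. */
int toy_disable_downstream_irqs(struct toy_bus *bus)
{
	int disabled = 0;
	struct toy_bus *child;

	for (child = bus->children; child; child = child->sibling)
		disabled += toy_disable_all_irqs(child);
	return disabled;
}
```

Keeping the root slot's IRQ active is the reason for the two-function split:
only the downstream helper is meant to be called from the slot disable path.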

The abridged kernel panic that occurs without this patch is as follows:

  WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c
  CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2
  Call Trace:
   msi_device_data_release+0x34/0x9c (unreliable)
   release_nodes+0x64/0x13c
   devres_release_all+0xc0/0x140
   device_del+0x2d4/0x46c
   pci_destroy_dev+0x5c/0x194
   pci_hp_remove_devices+0x90/0x128
   pci_hp_remove_devices+0x44/0x128
   pnv_php_disable_slot+0x54/0xd4
   power_write_file+0xf8/0x18c
   pci_slot_attr_store+0x40/0x5c
   sysfs_kf_write+0x64/0x78
   kernfs_fop_write_iter+0x1b0/0x290
   vfs_write+0x3bc/0x50c
   ksys_write+0x84/0x140
   system_call_exception+0x124/0x230
   system_call_vectored_common+0x15c/0x2ec

Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 drivers/pci/hotplug/pnv_php.c | 94 ++++++++++++++++++++++++++++-------
 1 file changed, 75 insertions(+), 19 deletions(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 573a41869c15..aec0a6d594ac 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -3,6 +3,7 @@
  * PCI Hotplug Driver for PowerPC PowerNV platform.
  *
  * Copyright Gavin Shan, IBM Corporation 2016.
+ * Copyright (C) 2025 Raptor Engineering, LLC
  */
 
 #include <linux/bitfield.h>
@@ -36,8 +37,10 @@ static void pnv_php_register(struct device_node *dn);
 static void pnv_php_unregister_one(struct device_node *dn);
 static void pnv_php_unregister(struct device_node *dn);
 
+static void pnv_php_enable_irq(struct pnv_php_slot *php_slot);
+
 static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
-				bool disable_device)
+				bool disable_device, bool disable_msi)
 {
 	struct pci_dev *pdev = php_slot->pdev;
 	u16 ctrl;
@@ -53,19 +56,15 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
 		php_slot->irq = 0;
 	}
 
-	if (php_slot->wq) {
-		destroy_workqueue(php_slot->wq);
-		php_slot->wq = NULL;
-	}
-
-	if (disable_device) {
+	if (disable_device || disable_msi) {
 		if (pdev->msix_enabled)
 			pci_disable_msix(pdev);
 		else if (pdev->msi_enabled)
 			pci_disable_msi(pdev);
+	}
 
+	if (disable_device)
 		pci_disable_device(pdev);
-	}
 }
 
 static void pnv_php_free_slot(struct kref *kref)
@@ -74,7 +73,8 @@ static void pnv_php_free_slot(struct kref *kref)
 					struct pnv_php_slot, kref);
 
 	WARN_ON(!list_empty(&php_slot->children));
-	pnv_php_disable_irq(php_slot, false);
+	pnv_php_disable_irq(php_slot, false, false);
+	destroy_workqueue(php_slot->wq);
 	kfree(php_slot->name);
 	kfree(php_slot);
 }
@@ -561,8 +561,57 @@ static int pnv_php_reset_slot(struct hotplug_slot *slot, bool probe)
 static int pnv_php_enable_slot(struct hotplug_slot *slot)
 {
 	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
+	u32 prop32;
+	int ret;
+
+	ret = pnv_php_enable(php_slot, true);
+	if (ret)
+		return ret;
+
+	/* (Re-)enable interrupt if the slot supports surprise hotplug */
+	ret = of_property_read_u32(php_slot->dn, "ibm,slot-surprise-pluggable", &prop32);
+	if (!ret && prop32)
+		pnv_php_enable_irq(php_slot);
+
+	return 0;
+}
+
+/**
+ * Disable any hotplug interrupts for all slots on the provided bus, as well as
+ * all downstream slots in preparation for a hot unplug.
+ */
+static int pnv_php_disable_all_irqs(struct pci_bus *bus)
+{
+	struct pci_bus *child_bus;
+	struct pci_slot *cur_slot;
+
+	/* First go down child busses */
+	list_for_each_entry(child_bus, &bus->children, node)
+		pnv_php_disable_all_irqs(child_bus);
+
+	/* Disable IRQs for all pnv_php slots on this bus */
+	list_for_each_entry(cur_slot, &bus->slots, list) {
+		struct pnv_php_slot *php_slot = to_pnv_php_slot(cur_slot->hotplug);
+
+		pnv_php_disable_irq(php_slot, false, true);
+	}
 
-	return pnv_php_enable(php_slot, true);
+	return 0;
+}
+
+/**
+ * Disable any hotplug interrupts for all downstream slots on the provided bus in
+ * preparation for a hot unplug.
+ */
+static int pnv_php_disable_all_downstream_irqs(struct pci_bus *bus)
+{
+	struct pci_bus *child_bus;
+
+	/* Go down child busses, recursively deactivating their IRQs */
+	list_for_each_entry(child_bus, &bus->children, node)
+		pnv_php_disable_all_irqs(child_bus);
+
+	return 0;
 }
 
 static int pnv_php_disable_slot(struct hotplug_slot *slot)
@@ -579,6 +628,12 @@ static int pnv_php_disable_slot(struct hotplug_slot *slot)
 	    php_slot->state != PNV_PHP_STATE_REGISTERED)
 		return 0;
 
+	/* Free all IRQ resources from all child slots before remove.
+	 * Note that we do not disable the root slot IRQ here as that
+	 * would also deactivate the slot hot (re)plug interrupt!
+	 */
+	pnv_php_disable_all_downstream_irqs(php_slot->bus);
+
 	/* Remove all devices behind the slot */
 	pci_lock_rescan_remove();
 	pci_hp_remove_devices(php_slot->bus);
@@ -647,6 +702,15 @@ static struct pnv_php_slot *pnv_php_alloc_slot(struct device_node *dn)
 		return NULL;
 	}
 
+	/* Allocate workqueue for this slot's interrupt handling */
+	php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
+	if (!php_slot->wq) {
+		SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
+		kfree(php_slot->name);
+		kfree(php_slot);
+		return NULL;
+	}
+
 	if (dn->child && PCI_DN(dn->child))
 		php_slot->slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
 	else
@@ -843,14 +907,6 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot, int irq)
 	u16 sts, ctrl;
 	int ret;
 
-	/* Allocate workqueue */
-	php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
-	if (!php_slot->wq) {
-		SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
-		pnv_php_disable_irq(php_slot, true);
-		return;
-	}
-
 	/* Check PDC (Presence Detection Change) is broken or not */
 	ret = of_property_read_u32(php_slot->dn, "ibm,slot-broken-pdc",
 				   &broken_pdc);
@@ -869,7 +925,7 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot, int irq)
 	ret = request_irq(irq, pnv_php_interrupt, IRQF_SHARED,
 			  php_slot->name, php_slot);
 	if (ret) {
-		pnv_php_disable_irq(php_slot, true);
+		pnv_php_disable_irq(php_slot, true, true);
 		SLOT_WARN(php_slot, "Error %d enabling IRQ %d\n", ret, irq);
 		return;
 	}
-- 
2.39.5



* [PATCH v3 2/6] PCI: pnv_php: Work around switches with broken presence detection
  2025-07-15 21:31 [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
  2025-07-15 21:36 ` [PATCH v3 1/6] PCI: pnv_php: Properly clean up allocated IRQs on unplug Timothy Pearson
@ 2025-07-15 21:36 ` Timothy Pearson
  2025-07-15 21:37 ` [PATCH v3 3/6] powerpc/eeh: Export eeh_unfreeze_pe() Timothy Pearson
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Timothy Pearson @ 2025-07-15 21:36 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
was observed to incorrectly assert the Presence Detect State bit in its
capabilities when tested on a Raptor Computing Systems Blackbird system,
resulting in the hot insert path never attempting a rescan of the bus
and any downstream devices not being re-detected.

Work around this by additionally checking whether the PCIe data link is
active or not when performing presence detection on downstream switches'
ports, similar to the pciehp_hpc.c driver.
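
The decision logic can be sketched in plain user-space C as below (an
illustration only, not the driver code). The TOY_* constants are assumed to
mirror PCI_EXP_LNKSTA_DLLLA, PCI_EXP_TYPE_DOWNSTREAM and the OPAL presence
values; the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed to mirror the kernel/OPAL definitions; treat as illustrative. */
#define TOY_LNKSTA_DLLLA    0x2000u /* Data Link Layer Link Active */
#define TOY_TYPE_DOWNSTREAM 0x6     /* downstream port of a PCIe switch */
#define TOY_SLOT_EMPTY      0
#define TOY_SLOT_PRESENT    1

/* Returns 1 if the link is active, 0 if not, -1 if the read looks invalid
 * (an all-ones value usually means the device did not respond). */
int toy_link_active(uint16_t lnk_status)
{
	if (lnk_status == 0xffffu)
		return -1;
	return (lnk_status & TOY_LNKSTA_DLLLA) != 0;
}

/* Trust the firmware-reported presence, except when a downstream switch
 * port reports "empty" while its data link is demonstrably up. */
int toy_adapter_present(int port_type, int fw_presence, uint16_t lnk_status)
{
	if (port_type == TOY_TYPE_DOWNSTREAM &&
	    fw_presence == TOY_SLOT_EMPTY &&
	    toy_link_active(lnk_status) > 0)
		return TOY_SLOT_PRESENT;

	return fw_presence;
}
```

Note that the override only ever upgrades "empty" to "present"; a slot that
firmware already reports as present is never downgraded by the link check.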

Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 drivers/pci/hotplug/pnv_php.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index aec0a6d594ac..bac8af3df41a 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -391,6 +391,20 @@ static int pnv_php_get_power_state(struct hotplug_slot *slot, u8 *state)
 	return 0;
 }
 
+static int pcie_check_link_active(struct pci_dev *pdev)
+{
+	u16 lnk_status;
+	int ret;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
+	if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
+		return -ENODEV;
+
+	ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
+
+	return ret;
+}
+
 static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
 {
 	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
@@ -403,6 +417,19 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
 	 */
 	ret = pnv_pci_get_presence_state(php_slot->id, &presence);
 	if (ret >= 0) {
+		if (pci_pcie_type(php_slot->pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
+			presence == OPAL_PCI_SLOT_EMPTY) {
+			/*
+			 * Similar to pciehp_hpc, check whether the Link Active
+			 * bit is set to account for broken downstream bridges
+			 * that don't properly assert Presence Detect State, as
+			 * was observed on the Microsemi Switchtec PM8533 PFX
+			 * [11f8:8533].
+			 */
+			if (pcie_check_link_active(php_slot->pdev) > 0)
+				presence = OPAL_PCI_SLOT_PRESENT;
+		}
+
 		*state = presence;
 		ret = 0;
 	} else {
-- 
2.39.5



* [PATCH v3 3/6] powerpc/eeh: Export eeh_unfreeze_pe()
  2025-07-15 21:31 [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
  2025-07-15 21:36 ` [PATCH v3 1/6] PCI: pnv_php: Properly clean up allocated IRQs on unplug Timothy Pearson
  2025-07-15 21:36 ` [PATCH v3 2/6] PCI: pnv_php: Work around switches with broken presence detection Timothy Pearson
@ 2025-07-15 21:37 ` Timothy Pearson
  2025-07-15 21:38 ` [PATCH v3 4/6] powerpc/eeh: Make EEH driver device hotplug safe Timothy Pearson
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Timothy Pearson @ 2025-07-15 21:37 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

The PowerNV hotplug driver needs to be able to clear any frozen PE(s)
on the PHB after surprise removal of a downstream device.

Export the eeh_unfreeze_pe() symbol to allow implementation of this
functionality in the pnv_php module.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 arch/powerpc/kernel/eeh.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index ca7f7bb2b478..2b5f3323e107 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1139,6 +1139,7 @@ int eeh_unfreeze_pe(struct eeh_pe *pe)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(eeh_unfreeze_pe);
 
 
 static struct pci_device_id eeh_reset_ids[] = {
-- 
2.39.5



* [PATCH v3 4/6] powerpc/eeh: Make EEH driver device hotplug safe
  2025-07-15 21:31 [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
                   ` (2 preceding siblings ...)
  2025-07-15 21:37 ` [PATCH v3 3/6] powerpc/eeh: Export eeh_unfreeze_pe() Timothy Pearson
@ 2025-07-15 21:38 ` Timothy Pearson
  2025-07-15 21:39 ` [PATCH v3 5/6] PCI: pnv_php: Fix surprise plug detection and recovery Timothy Pearson
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Timothy Pearson @ 2025-07-15 21:38 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

Multiple race conditions existed between the PCIe hotplug driver and
the EEH driver, leading to a variety of kernel oopses of the same
general nature:

<pcie device unplug>
<eeh driver trigger>
<hotplug removal trigger>
<pcie tree reconfiguration>
<eeh recovery next step>
<oops in EEH driver bus iteration loop>

A second class of oops is also seen when the underlying bus disappears
during device recovery.

Refactor the EEH module to be PCI rescan and remove safe.  Also clean
up a few minor formatting / readability issues.
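
The locking discipline can be sketched in user-space C as follows (a toy
model, not the EEH code): each handler holds the rescan/remove lock for its
whole body, and the outer handler drops the lock around a call into a handler
that takes it itself, since the lock is assumed here to be non-recursive. All
toy_* names are hypothetical.

```c
/* Toy non-recursive lock standing in for pci_lock_rescan_remove(). */
int toy_lock_held;
int toy_lock_errors; /* counts double-lock and unlock-while-free misuse */

void toy_lock(void)
{
	if (toy_lock_held)
		toy_lock_errors++;
	toy_lock_held = 1;
}

void toy_unlock(void)
{
	if (!toy_lock_held)
		toy_lock_errors++;
	toy_lock_held = 0;
}

/* Holds the lock across the whole recovery so the bus cannot be removed
 * underneath it; every early return releases the lock first. */
int toy_handle_normal_event(int bus_present)
{
	toy_lock();
	if (!bus_present) {
		toy_unlock();
		return -1;
	}
	/* ... recovery steps that iterate over the bus would run here ... */
	toy_unlock();
	return 0;
}

/* Holds the lock for its own work, but drops it around the nested handler
 * because that handler acquires the same lock itself. */
int toy_handle_special_event(int frozen_pe, int bus_present)
{
	int rc = 0;

	toy_lock();
	if (frozen_pe) {
		toy_unlock();
		rc = toy_handle_normal_event(bus_present);
		toy_lock();
	}
	toy_unlock();
	return rc;
}
```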

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 arch/powerpc/kernel/eeh_driver.c | 48 +++++++++++++++++++++-----------
 arch/powerpc/kernel/eeh_pe.c     | 10 ++++---
 2 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 7efe04c68f0f..dd50de91c438 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -257,13 +257,12 @@ static void eeh_pe_report_edev(struct eeh_dev *edev, eeh_report_fn fn,
 	struct pci_driver *driver;
 	enum pci_ers_result new_result;
 
-	pci_lock_rescan_remove();
 	pdev = edev->pdev;
 	if (pdev)
 		get_device(&pdev->dev);
-	pci_unlock_rescan_remove();
 	if (!pdev) {
 		eeh_edev_info(edev, "no device");
+		*result = PCI_ERS_RESULT_DISCONNECT;
 		return;
 	}
 	device_lock(&pdev->dev);
@@ -304,8 +303,9 @@ static void eeh_pe_report(const char *name, struct eeh_pe *root,
 	struct eeh_dev *edev, *tmp;
 
 	pr_info("EEH: Beginning: '%s'\n", name);
-	eeh_for_each_pe(root, pe) eeh_pe_for_each_dev(pe, edev, tmp)
-		eeh_pe_report_edev(edev, fn, result);
+	eeh_for_each_pe(root, pe)
+		eeh_pe_for_each_dev(pe, edev, tmp)
+			eeh_pe_report_edev(edev, fn, result);
 	if (result)
 		pr_info("EEH: Finished:'%s' with aggregate recovery state:'%s'\n",
 			name, pci_ers_result_name(*result));
@@ -383,6 +383,8 @@ static void eeh_dev_restore_state(struct eeh_dev *edev, void *userdata)
 	if (!edev)
 		return;
 
+	pci_lock_rescan_remove();
+
 	/*
 	 * The content in the config space isn't saved because
 	 * the blocked config space on some adapters. We have
@@ -393,14 +395,19 @@ static void eeh_dev_restore_state(struct eeh_dev *edev, void *userdata)
 		if (list_is_last(&edev->entry, &edev->pe->edevs))
 			eeh_pe_restore_bars(edev->pe);
 
+		pci_unlock_rescan_remove();
 		return;
 	}
 
 	pdev = eeh_dev_to_pci_dev(edev);
-	if (!pdev)
+	if (!pdev) {
+		pci_unlock_rescan_remove();
 		return;
+	}
 
 	pci_restore_state(pdev);
+
+	pci_unlock_rescan_remove();
 }
 
 /**
@@ -647,9 +654,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 	if (any_passed || driver_eeh_aware || (pe->type & EEH_PE_VF)) {
 		eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
 	} else {
-		pci_lock_rescan_remove();
 		pci_hp_remove_devices(bus);
-		pci_unlock_rescan_remove();
 	}
 
 	/*
@@ -665,8 +670,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 	if (rc)
 		return rc;
 
-	pci_lock_rescan_remove();
-
 	/* Restore PE */
 	eeh_ops->configure_bridge(pe);
 	eeh_pe_restore_bars(pe);
@@ -674,7 +677,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 	/* Clear frozen state */
 	rc = eeh_clear_pe_frozen_state(pe, false);
 	if (rc) {
-		pci_unlock_rescan_remove();
 		return rc;
 	}
 
@@ -709,7 +711,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 	pe->tstamp = tstamp;
 	pe->freeze_count = cnt;
 
-	pci_unlock_rescan_remove();
 	return 0;
 }
 
@@ -843,10 +844,13 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		{LIST_HEAD_INIT(rmv_data.removed_vf_list), 0};
 	int devices = 0;
 
+	pci_lock_rescan_remove();
+
 	bus = eeh_pe_bus_get(pe);
 	if (!bus) {
 		pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n",
 			__func__, pe->phb->global_number, pe->addr);
+		pci_unlock_rescan_remove();
 		return;
 	}
 
@@ -1094,10 +1098,15 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true);
 		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
 
-		pci_lock_rescan_remove();
-		pci_hp_remove_devices(bus);
-		pci_unlock_rescan_remove();
+		bus = eeh_pe_bus_get(pe);
+		if (bus)
+			pci_hp_remove_devices(bus);
+		else
+			pr_err("%s: PCI bus for PHB#%x-PE#%x disappeared\n",
+				__func__, pe->phb->global_number, pe->addr);
+
 		/* The passed PE should no longer be used */
+		pci_unlock_rescan_remove();
 		return;
 	}
 
@@ -1114,6 +1123,8 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 			eeh_clear_slot_attention(edev->pdev);
 
 	eeh_pe_state_clear(pe, EEH_PE_RECOVERING, true);
+
+	pci_unlock_rescan_remove();
 }
 
 /**
@@ -1132,6 +1143,7 @@ void eeh_handle_special_event(void)
 	unsigned long flags;
 	int rc;
 
+	pci_lock_rescan_remove();
 
 	do {
 		rc = eeh_ops->next_error(&pe);
@@ -1171,10 +1183,12 @@ void eeh_handle_special_event(void)
 
 			break;
 		case EEH_NEXT_ERR_NONE:
+			pci_unlock_rescan_remove();
 			return;
 		default:
 			pr_warn("%s: Invalid value %d from next_error()\n",
 				__func__, rc);
+			pci_unlock_rescan_remove();
 			return;
 		}
 
@@ -1186,7 +1200,9 @@ void eeh_handle_special_event(void)
 		if (rc == EEH_NEXT_ERR_FROZEN_PE ||
 		    rc == EEH_NEXT_ERR_FENCED_PHB) {
 			eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
+			pci_unlock_rescan_remove();
 			eeh_handle_normal_event(pe);
+			pci_lock_rescan_remove();
 		} else {
 			eeh_for_each_pe(pe, tmp_pe)
 				eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev)
@@ -1199,7 +1215,6 @@ void eeh_handle_special_event(void)
 				eeh_report_failure, NULL);
 			eeh_set_channel_state(pe, pci_channel_io_perm_failure);
 
-			pci_lock_rescan_remove();
 			list_for_each_entry(hose, &hose_list, list_node) {
 				phb_pe = eeh_phb_pe_get(hose);
 				if (!phb_pe ||
@@ -1218,7 +1233,6 @@ void eeh_handle_special_event(void)
 				}
 				pci_hp_remove_devices(bus);
 			}
-			pci_unlock_rescan_remove();
 		}
 
 		/*
@@ -1228,4 +1242,6 @@ void eeh_handle_special_event(void)
 		if (rc == EEH_NEXT_ERR_DEAD_IOC)
 			break;
 	} while (rc != EEH_NEXT_ERR_NONE);
+
+	pci_unlock_rescan_remove();
 }
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index d283d281d28e..e740101fadf3 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -671,10 +671,12 @@ static void eeh_bridge_check_link(struct eeh_dev *edev)
 	eeh_ops->write_config(edev, cap + PCI_EXP_LNKCTL, 2, val);
 
 	/* Check link */
-	if (!edev->pdev->link_active_reporting) {
-		eeh_edev_dbg(edev, "No link reporting capability\n");
-		msleep(1000);
-		return;
+	if (edev->pdev) {
+		if (!edev->pdev->link_active_reporting) {
+			eeh_edev_dbg(edev, "No link reporting capability\n");
+			msleep(1000);
+			return;
+		}
 	}
 
 	/* Wait the link is up until timeout (5s) */
-- 
2.39.5


* [PATCH v3 5/6] PCI: pnv_php: Fix surprise plug detection and recovery
  2025-07-15 21:31 [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
                   ` (3 preceding siblings ...)
  2025-07-15 21:38 ` [PATCH v3 4/6] powerpc/eeh: Make EEH driver device hotplug safe Timothy Pearson
@ 2025-07-15 21:39 ` Timothy Pearson
  2025-07-17 23:27   ` Bjorn Helgaas
  2025-07-15 21:39 ` [PATCH v3 6/6] PCI: pnv_php: Enable third attention indicator state Timothy Pearson
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Timothy Pearson @ 2025-07-15 21:39 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

The existing PowerNV hotplug code did not handle surprise plug events
correctly, leading to a complete failure of the hotplug system after
device removal and a required reboot to detect new devices.

This comes down to two issues:
1.) When a device is surprise removed, oftentimes the bridge upstream
    port will cause a PE freeze on the PHB.  If this freeze is not
    cleared, the MSI interrupts from the bridge hotplug notification
    logic will not be received by the kernel, stalling all plug events
    on all slots associated with the PE.
2.) When a device is removed from a slot, regardless of surprise or
    programmatic removal, the associated PHB/PE is left frozen.
    If this freeze is not cleared via a fundamental reset, skiboot
    is unable to clear the freeze and cannot retrain / rescan the
    slot.  This also requires a reboot to clear the freeze and redetect
    the device in the slot.

Issue the appropriate unfreeze and rescan commands on hotplug events,
and don't oops on hotplug if pci_bus_to_OF_node() returns NULL.
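
The retry policy for slot activation can be sketched in user-space C as below
(an illustration under simplifying assumptions, not the driver code): try to
power the slot on once, and on failure issue a reset and retry, up to three
times. struct toy_fw and the toy_* functions are hypothetical stand-ins for
the firmware calls.

```c
#define TOY_MAX_RESETS 3

/* Hypothetical model of firmware behaviour for the sketch. */
struct toy_fw {
	int failures_left; /* power-on attempts that will still fail */
	int resets;        /* number of resets issued so far */
};

/* Stand-in for the power-on request; returns 0 on success. */
int toy_power_on(struct toy_fw *fw)
{
	if (fw->failures_left > 0) {
		fw->failures_left--;
		return -5;
	}
	return 0;
}

/* Stand-in for asserting and then deasserting a fundamental reset. */
void toy_fundamental_reset(struct toy_fw *fw)
{
	fw->resets++;
}

/* Power the slot on; if that fails, reset and retry a bounded number of
 * times. Returns 0 on success or the last error code on failure. */
int toy_activate_slot(struct toy_fw *fw)
{
	int ret, i;

	ret = toy_power_on(fw);
	for (i = 0; ret && i < TOY_MAX_RESETS; i++) {
		toy_fundamental_reset(fw);
		ret = toy_power_on(fw);
	}
	return ret;
}
```

Bounding the loop keeps a permanently faulty device from stalling the hotplug
worker indefinitely; the caller still sees the final error code.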

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 arch/powerpc/kernel/pci-hotplug.c |   3 +
 drivers/pci/hotplug/pnv_php.c     | 108 +++++++++++++++++++++++++++++-
 2 files changed, 108 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 9ea74973d78d..6f444d0822d8 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -141,6 +141,9 @@ void pci_hp_add_devices(struct pci_bus *bus)
 	struct pci_controller *phb;
 	struct device_node *dn = pci_bus_to_OF_node(bus);
 
+	if (!dn)
+		return;
+
 	phb = pci_bus_to_host(bus);
 
 	mode = PCI_PROBE_NORMAL;
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index bac8af3df41a..3533f7f23b71 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -4,12 +4,14 @@
  *
  * Copyright Gavin Shan, IBM Corporation 2016.
  * Copyright (C) 2025 Raptor Engineering, LLC
+ * Copyright (C) 2025 Raptor Computing Systems, LLC
  */
 
 #include <linux/bitfield.h>
 #include <linux/libfdt.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <linux/delay.h>
 #include <linux/pci_hotplug.h>
 #include <linux/of_fdt.h>
 
@@ -469,6 +471,59 @@ static int pnv_php_set_attention_state(struct hotplug_slot *slot, u8 state)
 	return 0;
 }
 
+static int pnv_php_activate_slot(struct pnv_php_slot *php_slot,
+				 struct hotplug_slot *slot)
+{
+	int ret, i;
+
+	/*
+	 * Issue initial slot activation command to firmware
+	 *
+	 * Firmware will power slot on, attempt to train the link, and discover any downstream devices
+	 * If this process fails, firmware will return an error code and an invalid device tree
+	 * Failure can be caused for multiple reasons, including a faulty downstream device,
+	 * poor connection to the downstream device, or a previously latched PHB fence.
+	 * On failure, issue fundamental reset up to three times before aborting.
+	 */
+	ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
+	if (ret) {
+		SLOT_WARN(
+			php_slot,
+			"PCI slot activation failed with error code %d, possible frozen PHB",
+			ret);
+		SLOT_WARN(
+			php_slot,
+			"Attempting complete PHB reset before retrying slot activation\n");
+		for (i = 0; i < 3; i++) {
+			/*
+			 * Slot activation failed, PHB may be fenced from a
+			 * prior device failure.
+			 *
+			 * Use the OPAL fundamental reset call to both try a
+			 * device reset and clear any potentially active PHB
+			 * fence / freeze.
+			 */
+			SLOT_WARN(php_slot, "Try %d...\n", i + 1);
+			pci_set_pcie_reset_state(php_slot->pdev,
+						 pcie_warm_reset);
+			msleep(250);
+			pci_set_pcie_reset_state(php_slot->pdev,
+						 pcie_deassert_reset);
+
+			ret = pnv_php_set_slot_power_state(
+				slot, OPAL_PCI_SLOT_POWER_ON);
+			if (!ret)
+				break;
+		}
+
+		if (i >= 3)
+			SLOT_WARN(php_slot,
+				  "Failed to bring slot online, aborting!\n");
+	}
+
+	return ret;
+}
+
 static int pnv_php_enable(struct pnv_php_slot *php_slot, bool rescan)
 {
 	struct hotplug_slot *slot = &php_slot->slot;
@@ -531,7 +586,7 @@ static int pnv_php_enable(struct pnv_php_slot *php_slot, bool rescan)
 		goto scan;
 
 	/* Power is off, turn it on and then scan the slot */
-	ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
+	ret = pnv_php_activate_slot(php_slot, slot);
 	if (ret)
 		return ret;
 
@@ -836,16 +891,63 @@ static int pnv_php_enable_msix(struct pnv_php_slot *php_slot)
 	return entry.vector;
 }
 
+static void
+pnv_php_detect_clear_suprise_removal_freeze(struct pnv_php_slot *php_slot)
+{
+	struct pci_dev *pdev = php_slot->pdev;
+	struct eeh_dev *edev;
+	struct eeh_pe *pe;
+	int i, rc;
+
+	/*
+	 * When a device is surprise removed from a downstream bridge slot,
+	 * the upstream bridge port can still end up frozen due to related EEH
+	 * events, which will in turn block the MSI interrupts for slot hotplug
+	 * detection.
+	 *
+	 * Detect and thaw any frozen upstream PE after slot deactivation...
+	 */
+	edev = pci_dev_to_eeh_dev(pdev);
+	pe = edev ? edev->pe : NULL;
+	rc = eeh_pe_get_state(pe);
+	if ((rc == -ENODEV) || (rc == -ENOENT)) {
+		SLOT_WARN(
+			php_slot,
+			"Upstream bridge PE state unknown, hotplug detect may fail\n");
+	} else {
+		if (pe->state & EEH_PE_ISOLATED) {
+			SLOT_WARN(
+				php_slot,
+				"Upstream bridge PE %02x frozen, thawing...\n",
+				pe->addr);
+			for (i = 0; i < 3; i++)
+				if (!eeh_unfreeze_pe(pe))
+					break;
+			if (i >= 3)
+				SLOT_WARN(
+					php_slot,
+					"Unable to thaw PE %02x, hotplug detect will fail!\n",
+					pe->addr);
+			else
+				SLOT_WARN(php_slot,
+					  "PE %02x thawed successfully\n",
+					  pe->addr);
+		}
+	}
+}
+
 static void pnv_php_event_handler(struct work_struct *work)
 {
 	struct pnv_php_event *event =
 		container_of(work, struct pnv_php_event, work);
 	struct pnv_php_slot *php_slot = event->php_slot;
 
-	if (event->added)
+	if (event->added) {
 		pnv_php_enable_slot(&php_slot->slot);
-	else
+	} else {
 		pnv_php_disable_slot(&php_slot->slot);
+		pnv_php_detect_clear_suprise_removal_freeze(php_slot);
+	}
 
 	kfree(event);
 }
-- 
2.39.5


* [PATCH v3 6/6] PCI: pnv_php: Enable third attention indicator state
  2025-07-15 21:31 [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
                   ` (4 preceding siblings ...)
  2025-07-15 21:39 ` [PATCH v3 5/6] PCI: pnv_php: Fix surprise plug detection and recovery Timothy Pearson
@ 2025-07-15 21:39 ` Timothy Pearson
  2025-07-17 23:27 ` [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Bjorn Helgaas
  2025-07-23 11:47 ` Ganesh G R
  7 siblings, 0 replies; 14+ messages in thread
From: Timothy Pearson @ 2025-07-15 21:39 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

The PCIe specification allows three attention indicator states,
on, off, and blink.  Enable all three states instead of basic
on / off control.

This changes the userspace API to match the behavior of pciehp.
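
For illustration only (not part of the patch), the mapping from the attention
value to the Attention Indicator Control field of the Slot Control register
can be sketched in user-space C as below. The TOY_* constants are assumed to
mirror PCI_EXP_SLTCTL_AIC and the PCI_EXP_SLTCTL_ATTN_IND_* values, and the
shift-and-mask stands in for FIELD_PREP(); the function names are
hypothetical.

```c
#include <stdint.h>

/* Assumed to mirror the Slot Control definitions in pci_regs.h. */
#define TOY_SLTCTL_AIC            0x00c0u /* Attention Indicator Control */
#define TOY_SLTCTL_AIC_SHIFT      6
#define TOY_SLTCTL_ATTN_IND_ON    0x0040u
#define TOY_SLTCTL_ATTN_IND_BLINK 0x0080u
#define TOY_SLTCTL_ATTN_IND_OFF   0x00c0u

/* Map the attention value (0 = off, 1 = on, 2 = blink) to AIC field bits.
 * Zero is special-cased because the field encoding for "off" is 0b11. */
uint16_t toy_attention_to_sltctl(uint8_t state)
{
	if (!state)
		return TOY_SLTCTL_ATTN_IND_OFF;

	return (uint16_t)(((uint16_t)state << TOY_SLTCTL_AIC_SHIFT) &
			  TOY_SLTCTL_AIC);
}

/* Replace only the AIC field of an existing Slot Control value. */
uint16_t toy_update_sltctl(uint16_t old, uint8_t state)
{
	return (uint16_t)((old & ~TOY_SLTCTL_AIC) |
			  toy_attention_to_sltctl(state));
}
```

With this encoding a value of 1 yields the "on" bits and 2 yields the "blink"
bits, while other Slot Control fields such as the power indicator are left
unchanged by the update helper.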

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 drivers/pci/hotplug/pnv_php.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 3533f7f23b71..c65460ced862 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -441,10 +441,23 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
 	return ret;
 }
 
+static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8 *state)
+{
+	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
+	struct pci_dev *bridge = php_slot->pdev;
+	u16 status;
+
+	pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
+	*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;
+	return 0;
+}
+
+
 static int pnv_php_get_attention_state(struct hotplug_slot *slot, u8 *state)
 {
 	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
 
+	pnv_php_get_raw_indicator_status(slot, &php_slot->attention_state);
 	*state = php_slot->attention_state;
 	return 0;
 }
@@ -462,7 +475,7 @@ static int pnv_php_set_attention_state(struct hotplug_slot *slot, u8 state)
 	mask = PCI_EXP_SLTCTL_AIC;
 
 	if (state)
-		new = PCI_EXP_SLTCTL_ATTN_IND_ON;
+		new = FIELD_PREP(PCI_EXP_SLTCTL_AIC, state);
 	else
 		new = PCI_EXP_SLTCTL_ATTN_IND_OFF;
 
-- 
2.39.5


* Re: [PATCH v3 5/6] PCI: pnv_php: Fix surprise plug detection and recovery
  2025-07-15 21:39 ` [PATCH v3 5/6] PCI: pnv_php: Fix surprise plug detection and recovery Timothy Pearson
@ 2025-07-17 23:27   ` Bjorn Helgaas
  2025-07-18  0:05     ` Timothy Pearson
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2025-07-17 23:27 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

On Tue, Jul 15, 2025 at 04:39:06PM -0500, Timothy Pearson wrote:
> The existing PowerNV hotplug code did not handle surprise plug events
> correctly, leading to a complete failure of the hotplug system after
> device removal and a required reboot to detect new devices.

> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -4,12 +4,14 @@
>   *
>   * Copyright Gavin Shan, IBM Corporation 2016.
>   * Copyright (C) 2025 Raptor Engineering, LLC
> + * Copyright (C) 2025 Raptor Computing Systems, LLC

Just to double-check that you want both copyright lines here?

* Re: [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes
  2025-07-15 21:31 [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
                   ` (5 preceding siblings ...)
  2025-07-15 21:39 ` [PATCH v3 6/6] PCI: pnv_php: Enable third attention indicator state Timothy Pearson
@ 2025-07-17 23:27 ` Bjorn Helgaas
  2025-07-22 20:47   ` Bjorn Helgaas
  2025-07-23 11:47 ` Ganesh G R
  7 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2025-07-17 23:27 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

On Tue, Jul 15, 2025 at 04:31:49PM -0500, Timothy Pearson wrote:
> Hello all,
> 
> This series includes several fixes for bugs in the PowerNV PCIe hotplug
> driver that were discovered in testing with a Microsemi Switchtec PM8533
> PFX 48xG3 PCIe switch on a PowerNV system, as well as one workaround for
> PCIe switches that don't correctly implement slot presence detection
> such as the aforementioned one. Without the workaround, the switch works
> and downstream devices can be hot-unplugged, but the devices never come
> back online after being plugged in again until the system is rebooted.
> Other hotplug drivers (like pciehp_hpc) use a similar workaround.
> 
> Also included are fixes for the EEH driver to make it hotplug safe,
> and a small patch to enable all three attention indicator states per
> the PCIe specification.
> 
> Thanks,
> 
> Shawn Anastasio (2):
>   PCI: pnv_php: Properly clean up allocated IRQs on unplug
>   PCI: pnv_php: Work around switches with broken presence detection
> 
> Timothy Pearson (4):
>   powerpc/eeh: Export eeh_unfreeze_pe()
>   powerpc/eeh: Make EEH driver device hotplug safe
>   PCI: pnv_php: Fix surprise plug detection and recovery
>   PCI: pnv_php: Enable third attention indicator state
> 
>  arch/powerpc/kernel/eeh.c         |   1 +
>  arch/powerpc/kernel/eeh_driver.c  |  48 ++++--
>  arch/powerpc/kernel/eeh_pe.c      |  10 +-
>  arch/powerpc/kernel/pci-hotplug.c |   3 +
>  drivers/pci/hotplug/pnv_php.c     | 244 +++++++++++++++++++++++++++---
>  5 files changed, 263 insertions(+), 43 deletions(-)

I'm OK with this from a PCI perspective, and I optimistically put it
on pci/hotplug.

I'm happy to merge via the PCI tree, but would need acks from the
powerpc folks for the arch/powerpc parts.

Alternatively it could be merged via powerpc with my ack on the
drivers/pci patches:

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

If you do merge via powerpc, I made some comment formatting and commit
log tweaks that I would like reflected in the drivers/pci part.  These
are on
https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=hotplug

Bjorn

* Re: [PATCH v3 5/6] PCI: pnv_php: Fix surprise plug detection and recovery
  2025-07-17 23:27   ` Bjorn Helgaas
@ 2025-07-18  0:05     ` Timothy Pearson
  0 siblings, 0 replies; 14+ messages in thread
From: Timothy Pearson @ 2025-07-18  0:05 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Timothy Pearson, linuxppc-dev, linux-kernel, linux-pci,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, Bjorn Helgaas, Shawn Anastasio



----- Original Message -----
> From: "Bjorn Helgaas" <helgaas@kernel.org>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
> Sent: Thursday, July 17, 2025 6:27:45 PM
> Subject: Re: [PATCH v3 5/6] PCI: pnv_php: Fix surprise plug detection and recovery

> On Tue, Jul 15, 2025 at 04:39:06PM -0500, Timothy Pearson wrote:
>> The existing PowerNV hotplug code did not handle surprise plug events
>> correctly, leading to a complete failure of the hotplug system after
>> device removal and a required reboot to detect new devices.
> 
>> +++ b/drivers/pci/hotplug/pnv_php.c
>> @@ -4,12 +4,14 @@
>>   *
>>   * Copyright Gavin Shan, IBM Corporation 2016.
>>   * Copyright (C) 2025 Raptor Engineering, LLC
>> + * Copyright (C) 2025 Raptor Computing Systems, LLC
> 
> Just to double-check that you want both copyright lines here?

Yes, both entities ended up sponsoring this part of the work over time.  Thank you for double checking!

* Re: [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes
  2025-07-17 23:27 ` [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Bjorn Helgaas
@ 2025-07-22 20:47   ` Bjorn Helgaas
  2025-07-23 11:00     ` Madhavan Srinivasan
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2025-07-22 20:47 UTC (permalink / raw)
  To: Madhavan Srinivasan, Michael Ellerman, Mahesh J Salgaonkar
  Cc: linuxppc-dev, linux-kernel, linux-pci, christophe leroy,
	Naveen N Rao, Bjorn Helgaas, Shawn Anastasio, Timothy Pearson

[-> to: Madhavan, Michael, Mahesh; seeking acks]

On Thu, Jul 17, 2025 at 06:27:52PM -0500, Bjorn Helgaas wrote:
> On Tue, Jul 15, 2025 at 04:31:49PM -0500, Timothy Pearson wrote:
> > Hello all,
> > 
> > This series includes several fixes for bugs in the PowerNV PCIe hotplug
> > driver that were discovered in testing with a Microsemi Switchtec PM8533
> > PFX 48xG3 PCIe switch on a PowerNV system, as well as one workaround for
> > PCIe switches that don't correctly implement slot presence detection
> > such as the aforementioned one. Without the workaround, the switch works
> > and downstream devices can be hot-unplugged, but the devices never come
> > back online after being plugged in again until the system is rebooted.
> > Other hotplug drivers (like pciehp_hpc) use a similar workaround.
> > 
> > Also included are fixes for the EEH driver to make it hotplug safe,
> > and a small patch to enable all three attention indicator states per
> > the PCIe specification.
> > 
> > Thanks,
> > 
> > Shawn Anastasio (2):
> >   PCI: pnv_php: Properly clean up allocated IRQs on unplug
> >   PCI: pnv_php: Work around switches with broken presence detection
> > 
> > Timothy Pearson (4):
> >   powerpc/eeh: Export eeh_unfreeze_pe()
> >   powerpc/eeh: Make EEH driver device hotplug safe
> >   PCI: pnv_php: Fix surprise plug detection and recovery
> >   PCI: pnv_php: Enable third attention indicator state
> > 
> >  arch/powerpc/kernel/eeh.c         |   1 +
> >  arch/powerpc/kernel/eeh_driver.c  |  48 ++++--
> >  arch/powerpc/kernel/eeh_pe.c      |  10 +-
> >  arch/powerpc/kernel/pci-hotplug.c |   3 +
> >  drivers/pci/hotplug/pnv_php.c     | 244 +++++++++++++++++++++++++++---
> >  5 files changed, 263 insertions(+), 43 deletions(-)
> 
> I'm OK with this from a PCI perspective, and I optimistically put it
> on pci/hotplug.
> 
> I'm happy to merge via the PCI tree, but would need acks from the
> powerpc folks for the arch/powerpc parts.
> 
> Alternatively it could be merged via powerpc with my ack on the
> drivers/pci patches:
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> 
> If you do merge via powerpc, I made some comment formatting and commit
> log tweaks that I would like reflected in the drivers/pci part.  These
> are on
> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=hotplug

Powerpc folks: let me know how you want to handle this.  I haven't
included it in pci/next yet because I don't have acks for the
arch/powerpc parts.

Bjorn

* Re: [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes
  2025-07-22 20:47   ` Bjorn Helgaas
@ 2025-07-23 11:00     ` Madhavan Srinivasan
  2025-07-23 11:39       ` Bjorn Helgaas
  0 siblings, 1 reply; 14+ messages in thread
From: Madhavan Srinivasan @ 2025-07-23 11:00 UTC (permalink / raw)
  To: Bjorn Helgaas, Michael Ellerman, Mahesh J Salgaonkar
  Cc: linuxppc-dev, linux-kernel, linux-pci, christophe leroy,
	Naveen N Rao, Bjorn Helgaas, Shawn Anastasio, Timothy Pearson



On 7/23/25 2:17 AM, Bjorn Helgaas wrote:
> [-> to: Madhavan, Michael, Mahesh; seeking acks]
> 
> On Thu, Jul 17, 2025 at 06:27:52PM -0500, Bjorn Helgaas wrote:
>> On Tue, Jul 15, 2025 at 04:31:49PM -0500, Timothy Pearson wrote:
>>> Hello all,
>>>
>>> This series includes several fixes for bugs in the PowerNV PCIe hotplug
>>> driver that were discovered in testing with a Microsemi Switchtec PM8533
>>> PFX 48xG3 PCIe switch on a PowerNV system, as well as one workaround for
>>> PCIe switches that don't correctly implement slot presence detection
>>> such as the aforementioned one. Without the workaround, the switch works
>>> and downstream devices can be hot-unplugged, but the devices never come
>>> back online after being plugged in again until the system is rebooted.
>>> Other hotplug drivers (like pciehp_hpc) use a similar workaround.
>>>
>>> Also included are fixes for the EEH driver to make it hotplug safe,
>>> and a small patch to enable all three attention indicator states per
>>> the PCIe specification.
>>>
>>> Thanks,
>>>
>>> Shawn Anastasio (2):
>>>   PCI: pnv_php: Properly clean up allocated IRQs on unplug
>>>   PCI: pnv_php: Work around switches with broken presence detection
>>>
>>> Timothy Pearson (4):
>>>   powerpc/eeh: Export eeh_unfreeze_pe()
>>>   powerpc/eeh: Make EEH driver device hotplug safe
>>>   PCI: pnv_php: Fix surprise plug detection and recovery
>>>   PCI: pnv_php: Enable third attention indicator state
>>>
>>>  arch/powerpc/kernel/eeh.c         |   1 +
>>>  arch/powerpc/kernel/eeh_driver.c  |  48 ++++--
>>>  arch/powerpc/kernel/eeh_pe.c      |  10 +-
>>>  arch/powerpc/kernel/pci-hotplug.c |   3 +
>>>  drivers/pci/hotplug/pnv_php.c     | 244 +++++++++++++++++++++++++++---
>>>  5 files changed, 263 insertions(+), 43 deletions(-)
>>
>> I'm OK with this from a PCI perspective, and I optimistically put it
>> on pci/hotplug.
>>
>> I'm happy to merge via the PCI tree, but would need acks from the
>> powerpc folks for the arch/powerpc parts.
>>
>> Alternatively it could be merged via powerpc with my ack on the
>> drivers/pci patches:
>>
>> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>>
>> If you do merge via powerpc, I made some comment formatting and commit
>> log tweaks that I would like reflected in the drivers/pci part.  These
>> are on
>> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=hotplug
> 
> Powerpc folks: let me know how you want to handle this.  I haven't
> included it in pci/next yet because I don't have acks for the
> arch/powerpc parts.
> 

Patchset looks fine to me.

I am fine to take it via my tree since I already have your Acked-by.

Maddy


> Bjorn
> 


* Re: [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes
  2025-07-23 11:00     ` Madhavan Srinivasan
@ 2025-07-23 11:39       ` Bjorn Helgaas
  0 siblings, 0 replies; 14+ messages in thread
From: Bjorn Helgaas @ 2025-07-23 11:39 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: Michael Ellerman, Mahesh J Salgaonkar, linuxppc-dev, linux-kernel,
	linux-pci, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio, Timothy Pearson

On Wed, Jul 23, 2025 at 04:30:18PM +0530, Madhavan Srinivasan wrote:
> 
> 
> On 7/23/25 2:17 AM, Bjorn Helgaas wrote:
> > On Thu, Jul 17, 2025 at 06:27:52PM -0500, Bjorn Helgaas wrote:
> >> On Tue, Jul 15, 2025 at 04:31:49PM -0500, Timothy Pearson wrote:
> >>> Hello all,
> >>>
> >>> This series includes several fixes for bugs in the PowerNV PCIe hotplug
> >>> driver that were discovered in testing with a Microsemi Switchtec PM8533
> >>> PFX 48xG3 PCIe switch on a PowerNV system, as well as one workaround for
> >>> PCIe switches that don't correctly implement slot presence detection
> >>> such as the aforementioned one. Without the workaround, the switch works
> >>> and downstream devices can be hot-unplugged, but the devices never come
> >>> back online after being plugged in again until the system is rebooted.
> >>> Other hotplug drivers (like pciehp_hpc) use a similar workaround.
> >>>
> >>> Also included are fixes for the EEH driver to make it hotplug safe,
> >>> and a small patch to enable all three attention indicator states per
> >>> the PCIe specification.
> >>>
> >>> Thanks,
> >>>
> >>> Shawn Anastasio (2):
> >>>   PCI: pnv_php: Properly clean up allocated IRQs on unplug
> >>>   PCI: pnv_php: Work around switches with broken presence detection
> >>>
> >>> Timothy Pearson (4):
> >>>   powerpc/eeh: Export eeh_unfreeze_pe()
> >>>   powerpc/eeh: Make EEH driver device hotplug safe
> >>>   PCI: pnv_php: Fix surprise plug detection and recovery
> >>>   PCI: pnv_php: Enable third attention indicator state
> >>>
> >>>  arch/powerpc/kernel/eeh.c         |   1 +
> >>>  arch/powerpc/kernel/eeh_driver.c  |  48 ++++--
> >>>  arch/powerpc/kernel/eeh_pe.c      |  10 +-
> >>>  arch/powerpc/kernel/pci-hotplug.c |   3 +
> >>>  drivers/pci/hotplug/pnv_php.c     | 244 +++++++++++++++++++++++++++---
> >>>  5 files changed, 263 insertions(+), 43 deletions(-)
> >>
> >> I'm OK with this from a PCI perspective, and I optimistically put it
> >> on pci/hotplug.
> >>
> >> I'm happy to merge via the PCI tree, but would need acks from the
> >> powerpc folks for the arch/powerpc parts.
> >>
> >> Alternatively it could be merged via powerpc with my ack on the
> >> drivers/pci patches:
> >>
> >> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> >>
> >> If you do merge via powerpc, I made some comment formatting and commit
> >> log tweaks that I would like reflected in the drivers/pci part.  These
> >> are on
> >> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=hotplug
> > 
> > Powerpc folks: let me know how you want to handle this.  I haven't
> > included it in pci/next yet because I don't have acks for the
> > arch/powerpc parts.
> 
> Patchset looks fine to me.
> 
> I am fine to take it via my tree since I already have your Acked-by.

OK, I'll assume this will be merged via your tree.  Please cherry-pick 
the drivers/pci patches from my tree to preserve my tweaks.  I moved
them from pci/hotplug to pci/hotplug-pnv_php:

  https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=hotplug-pnv_php

* Re: [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes
  2025-07-15 21:31 [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
                   ` (6 preceding siblings ...)
  2025-07-17 23:27 ` [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Bjorn Helgaas
@ 2025-07-23 11:47 ` Ganesh G R
  7 siblings, 0 replies; 14+ messages in thread
From: Ganesh G R @ 2025-07-23 11:47 UTC (permalink / raw)
  To: Timothy Pearson, linuxppc-dev
  Cc: linux-kernel, linux-pci, Madhavan Srinivasan, Michael Ellerman,
	christophe leroy, Naveen N Rao, Bjorn Helgaas, Shawn Anastasio

On 7/16/25 3:01 AM, Timothy Pearson wrote:
> Hello all,
> 
> This series includes several fixes for bugs in the PowerNV PCIe hotplug
> driver that were discovered in testing with a Microsemi Switchtec PM8533
> PFX 48xG3 PCIe switch on a PowerNV system, as well as one workaround for
> PCIe switches that don't correctly implement slot presence detection
> such as the aforementioned one. Without the workaround, the switch works
> and downstream devices can be hot-unplugged, but the devices never come
> back online after being plugged in again until the system is rebooted.
> Other hotplug drivers (like pciehp_hpc) use a similar workaround.
> 
> Also included are fixes for the EEH driver to make it hotplug safe,
> and a small patch to enable all three attention indicator states per
> the PCIe specification.
> 
> Thanks,
> 
> Shawn Anastasio (2):
>    PCI: pnv_php: Properly clean up allocated IRQs on unplug
>    PCI: pnv_php: Work around switches with broken presence detection
> 
> Timothy Pearson (4):
>    powerpc/eeh: Export eeh_unfreeze_pe()
>    powerpc/eeh: Make EEH driver device hotplug safe
>    PCI: pnv_php: Fix surprise plug detection and recovery
>    PCI: pnv_php: Enable third attention indicator state
> 
>   arch/powerpc/kernel/eeh.c         |   1 +
>   arch/powerpc/kernel/eeh_driver.c  |  48 ++++--
>   arch/powerpc/kernel/eeh_pe.c      |  10 +-
>   arch/powerpc/kernel/pci-hotplug.c |   3 +
>   drivers/pci/hotplug/pnv_php.c     | 244 +++++++++++++++++++++++++++---
>   5 files changed, 263 insertions(+), 43 deletions(-)
> 
Tested the patch series for EEH and hotplug on powernv; recovery is
working as expected, and the EEH changes look good to me.
Tested-by: Ganesh Goudar <ganeshgr@linux.ibm.com>

Thanks
Ganesh

Thread overview: 14+ messages
2025-07-15 21:31 [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
2025-07-15 21:36 ` [PATCH v3 1/6] PCI: pnv_php: Properly clean up allocated IRQs on unplug Timothy Pearson
2025-07-15 21:36 ` [PATCH v3 2/6] PCI: pnv_php: Work around switches with broken presence detection Timothy Pearson
2025-07-15 21:37 ` [PATCH v3 3/6] powerpc/eeh: Export eeh_unfreeze_pe() Timothy Pearson
2025-07-15 21:38 ` [PATCH v3 4/6] powerpc/eeh: Make EEH driver device hotplug safe Timothy Pearson
2025-07-15 21:39 ` [PATCH v3 5/6] PCI: pnv_php: Fix surprise plug detection and recovery Timothy Pearson
2025-07-17 23:27   ` Bjorn Helgaas
2025-07-18  0:05     ` Timothy Pearson
2025-07-15 21:39 ` [PATCH v3 6/6] PCI: pnv_php: Enable third attention indicator state Timothy Pearson
2025-07-17 23:27 ` [PATCH v3 0/6] PowerNV PCIe Hotplug Driver Fixes Bjorn Helgaas
2025-07-22 20:47   ` Bjorn Helgaas
2025-07-23 11:00     ` Madhavan Srinivasan
2025-07-23 11:39       ` Bjorn Helgaas
2025-07-23 11:47 ` Ganesh G R