linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] PowerNV PCIe Hotplug Driver Fixes
@ 2025-06-18 16:54 Timothy Pearson
  2025-06-18 16:56 ` [PATCH v2 1/6] pci/hotplug/pnv_php: Properly clean up allocated IRQs on Timothy Pearson
                   ` (5 more replies)
  0 siblings, 6 replies; 31+ messages in thread
From: Timothy Pearson @ 2025-06-18 16:54 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, linux-pci, Madhavan Srinivasan, Michael Ellerman,
	christophe leroy, Naveen N Rao, Bjorn Helgaas, Shawn Anastasio

Hello all,

This series includes several fixes for bugs in the PowerNV PCIe hotplug
driver that were discovered in testing with a Microsemi Switchtec PM8533
PFX 48xG3 PCIe switch on a PowerNV system, as well as one workaround for
PCIe switches that don't correctly implement slot presence detection
such as the aforementioned one. Without the workaround, the switch works
and downstream devices can be hot-unplugged, but the devices never come
back online after being plugged in again until the system is rebooted.
Other hotplug drivers (like pciehp_hpc) use a similar workaround.

Also included are fixes for the EEH driver to make it hotplug safe,
and a small patch to enable all three attention indicator states per
the PCIe specification.

Thanks,

Shawn Anastasio (2):
  pci/hotplug/pnv_php: Properly clean up allocated IRQs on unplug
  pci/hotplug/pnv_php: Work around switches with broken presence
    detection

Timothy Pearson (5):
  powerpc/pseries/eeh: Export eeh_unfreeze_pe() and eeh_ops
  powerpc/eeh: Make EEH driver device hotplug safe
  pci/hotplug/pnv_php: Fix surprise plug detection and recovery
  pci/hotplug/pnv_php: Enable third atetntion indicator state

 arch/powerpc/kernel/eeh.c                    |   2 +
 arch/powerpc/kernel/eeh_driver.c             |  48 ++++--
 arch/powerpc/kernel/eeh_pe.c                 |  10 +-
 arch/powerpc/kernel/pci-hotplug.c            |   3 +
 arch/powerpc/platforms/powernv/eeh-powernv.c |   1 +
 drivers/pci/hotplug/pciehp.h                 |   1 -
 drivers/pci/hotplug/pciehp_ctrl.c            |   2 +-
 drivers/pci/hotplug/pciehp_hpc.c             |  33 +---
 drivers/pci/hotplug/pnv_php.c                | 172 ++++++++++++++++---
 drivers/pci/pci.c                            |  31 +++-
 drivers/pci/pci.h                            |   1 +
 11 files changed, 228 insertions(+), 76 deletions(-)

-- 
2.39.5

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 1/6] pci/hotplug/pnv_php: Properly clean up allocated IRQs on
  2025-06-18 16:54 [PATCH v2 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
@ 2025-06-18 16:56 ` Timothy Pearson
  2025-06-18 16:56 ` [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken Timothy Pearson
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 31+ messages in thread
From: Timothy Pearson @ 2025-06-18 16:56 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

 unplug

In cases where the root of a nested PCIe bridge configuration is
unplugged, the pnv_php driver would leak the allocated IRQ resources for
the child bridges' hotplug event notifications, resulting in a panic.
Fix this by walking all child buses and deallocating all it's IRQ
resources before calling pci_hp_remove_devices.

Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so
that it is only destroyed in pnv_php_free_slot, instead of
pnv_php_disable_irq. This is required since pnv_php_disable_irq will now
be called by workers triggered by hot unplug interrupts, so the
workqueue needs to stay allocated.

The abridged kernel panic that occurs without this patch is as follows:

  WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c
  CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2
  Call Trace:
   msi_device_data_release+0x34/0x9c (unreliable)
   release_nodes+0x64/0x13c
   devres_release_all+0xc0/0x140
   device_del+0x2d4/0x46c
   pci_destroy_dev+0x5c/0x194
   pci_hp_remove_devices+0x90/0x128
   pci_hp_remove_devices+0x44/0x128
   pnv_php_disable_slot+0x54/0xd4
   power_write_file+0xf8/0x18c
   pci_slot_attr_store+0x40/0x5c
   sysfs_kf_write+0x64/0x78
   kernfs_fop_write_iter+0x1b0/0x290
   vfs_write+0x3bc/0x50c
   ksys_write+0x84/0x140
   system_call_exception+0x124/0x230
   system_call_vectored_common+0x15c/0x2ec

Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 drivers/pci/hotplug/pnv_php.c | 94 ++++++++++++++++++++++++++++-------
 1 file changed, 75 insertions(+), 19 deletions(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 573a41869c15..aec0a6d594ac 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -3,6 +3,7 @@
  * PCI Hotplug Driver for PowerPC PowerNV platform.
  *
  * Copyright Gavin Shan, IBM Corporation 2016.
+ * Copyright (C) 2025 Raptor Engineering, LLC
  */
 
 #include <linux/bitfield.h>
@@ -36,8 +37,10 @@ static void pnv_php_register(struct device_node *dn);
 static void pnv_php_unregister_one(struct device_node *dn);
 static void pnv_php_unregister(struct device_node *dn);
 
+static void pnv_php_enable_irq(struct pnv_php_slot *php_slot);
+
 static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
-				bool disable_device)
+				bool disable_device, bool disable_msi)
 {
 	struct pci_dev *pdev = php_slot->pdev;
 	u16 ctrl;
@@ -53,19 +56,15 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
 		php_slot->irq = 0;
 	}
 
-	if (php_slot->wq) {
-		destroy_workqueue(php_slot->wq);
-		php_slot->wq = NULL;
-	}
-
-	if (disable_device) {
+	if (disable_device || disable_msi) {
 		if (pdev->msix_enabled)
 			pci_disable_msix(pdev);
 		else if (pdev->msi_enabled)
 			pci_disable_msi(pdev);
+	}
 
+	if (disable_device)
 		pci_disable_device(pdev);
-	}
 }
 
 static void pnv_php_free_slot(struct kref *kref)
@@ -74,7 +73,8 @@ static void pnv_php_free_slot(struct kref *kref)
 					struct pnv_php_slot, kref);
 
 	WARN_ON(!list_empty(&php_slot->children));
-	pnv_php_disable_irq(php_slot, false);
+	pnv_php_disable_irq(php_slot, false, false);
+	destroy_workqueue(php_slot->wq);
 	kfree(php_slot->name);
 	kfree(php_slot);
 }
@@ -561,8 +561,57 @@ static int pnv_php_reset_slot(struct hotplug_slot *slot, bool probe)
 static int pnv_php_enable_slot(struct hotplug_slot *slot)
 {
 	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
+	u32 prop32;
+	int ret;
+
+	ret = pnv_php_enable(php_slot, true);
+	if (ret)
+		return ret;
+
+	/* (Re-)enable interrupt if the slot supports surprise hotplug */
+	ret = of_property_read_u32(php_slot->dn, "ibm,slot-surprise-pluggable", &prop32);
+	if (!ret && prop32)
+		pnv_php_enable_irq(php_slot);
+
+	return 0;
+}
+
+/**
+ * Disable any hotplug interrupts for all slots on the provided bus, as well as
+ * all downstream slots in preparation for a hot unplug.
+ */
+static int pnv_php_disable_all_irqs(struct pci_bus *bus)
+{
+	struct pci_bus *child_bus;
+	struct pci_slot *cur_slot;
+
+	/* First go down child busses */
+	list_for_each_entry(child_bus, &bus->children, node)
+		pnv_php_disable_all_irqs(child_bus);
+
+	/* Disable IRQs for all pnv_php slots on this bus */
+	list_for_each_entry(cur_slot, &bus->slots, list) {
+		struct pnv_php_slot *php_slot = to_pnv_php_slot(cur_slot->hotplug);
+
+		pnv_php_disable_irq(php_slot, false, true);
+	}
 
-	return pnv_php_enable(php_slot, true);
+	return 0;
+}
+
+/**
+ * Disable any hotplug interrupts for all downstream slots on the provided bus in
+ * preparation for a hot unplug.
+ */
+static int pnv_php_disable_all_downstream_irqs(struct pci_bus *bus)
+{
+	struct pci_bus *child_bus;
+
+	/* Go down child busses, recursively deactivating their IRQs */
+	list_for_each_entry(child_bus, &bus->children, node)
+		pnv_php_disable_all_irqs(child_bus);
+
+	return 0;
 }
 
 static int pnv_php_disable_slot(struct hotplug_slot *slot)
@@ -579,6 +628,12 @@ static int pnv_php_disable_slot(struct hotplug_slot *slot)
 	    php_slot->state != PNV_PHP_STATE_REGISTERED)
 		return 0;
 
+	/* Free all IRQ resources from all child slots before remove.
+	 * Note that we do not disable the root slot IRQ here as that
+	 * would also deactivate the slot hot (re)plug interrupt!
+	 */
+	pnv_php_disable_all_downstream_irqs(php_slot->bus);
+
 	/* Remove all devices behind the slot */
 	pci_lock_rescan_remove();
 	pci_hp_remove_devices(php_slot->bus);
@@ -647,6 +702,15 @@ static struct pnv_php_slot *pnv_php_alloc_slot(struct device_node *dn)
 		return NULL;
 	}
 
+	/* Allocate workqueue for this slot's interrupt handling */
+	php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
+	if (!php_slot->wq) {
+		SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
+		kfree(php_slot->name);
+		kfree(php_slot);
+		return NULL;
+	}
+
 	if (dn->child && PCI_DN(dn->child))
 		php_slot->slot_no = PCI_SLOT(PCI_DN(dn->child)->devfn);
 	else
@@ -843,14 +907,6 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot, int irq)
 	u16 sts, ctrl;
 	int ret;
 
-	/* Allocate workqueue */
-	php_slot->wq = alloc_workqueue("pciehp-%s", 0, 0, php_slot->name);
-	if (!php_slot->wq) {
-		SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
-		pnv_php_disable_irq(php_slot, true);
-		return;
-	}
-
 	/* Check PDC (Presence Detection Change) is broken or not */
 	ret = of_property_read_u32(php_slot->dn, "ibm,slot-broken-pdc",
 				   &broken_pdc);
@@ -869,7 +925,7 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_slot, int irq)
 	ret = request_irq(irq, pnv_php_interrupt, IRQF_SHARED,
 			  php_slot->name, php_slot);
 	if (ret) {
-		pnv_php_disable_irq(php_slot, true);
+		pnv_php_disable_irq(php_slot, true, true);
 		SLOT_WARN(php_slot, "Error %d enabling IRQ %d\n", ret, irq);
 		return;
 	}
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken
  2025-06-18 16:54 [PATCH v2 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
  2025-06-18 16:56 ` [PATCH v2 1/6] pci/hotplug/pnv_php: Properly clean up allocated IRQs on Timothy Pearson
@ 2025-06-18 16:56 ` Timothy Pearson
  2025-06-18 19:44   ` Bjorn Helgaas
  2025-06-18 16:57 ` [PATCH v2 3/6] powerpc/eeh: Export eeh_unfreeze_pe() Timothy Pearson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 31+ messages in thread
From: Timothy Pearson @ 2025-06-18 16:56 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

 presence detection

The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
was observed to incorrectly assert the Presence Detect Set bit in its
capabilities when tested on a Raptor Computing Systems Blackbird system,
resulting in the hot insert path never attempting a rescan of the bus
and any downstream devices not being re-detected.

Work around this by additionally checking whether the PCIe data link is
active or not when performing presence detection on downstream switches'
ports, similar to the pciehp_hpc.c driver.

Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 drivers/pci/hotplug/pnv_php.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index aec0a6d594ac..bac8af3df41a 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -391,6 +391,20 @@ static int pnv_php_get_power_state(struct hotplug_slot *slot, u8 *state)
 	return 0;
 }
 
+static int pcie_check_link_active(struct pci_dev *pdev)
+{
+	u16 lnk_status;
+	int ret;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
+	if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
+		return -ENODEV;
+
+	ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
+
+	return ret;
+}
+
 static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
 {
 	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
@@ -403,6 +417,19 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
 	 */
 	ret = pnv_pci_get_presence_state(php_slot->id, &presence);
 	if (ret >= 0) {
+		if (pci_pcie_type(php_slot->pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
+			presence == OPAL_PCI_SLOT_EMPTY) {
+			/*
+			 * Similar to pciehp_hpc, check whether the Link Active
+			 * bit is set to account for broken downstream bridges
+			 * that don't properly assert Presence Detect State, as
+			 * was observed on the Microsemi Switchtec PM8533 PFX
+			 * [11f8:8533].
+			 */
+			if (pcie_check_link_active(php_slot->pdev) > 0)
+				presence = OPAL_PCI_SLOT_PRESENT;
+		}
+
 		*state = presence;
 		ret = 0;
 	} else {
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 3/6] powerpc/eeh: Export eeh_unfreeze_pe()
  2025-06-18 16:54 [PATCH v2 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
  2025-06-18 16:56 ` [PATCH v2 1/6] pci/hotplug/pnv_php: Properly clean up allocated IRQs on Timothy Pearson
  2025-06-18 16:56 ` [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken Timothy Pearson
@ 2025-06-18 16:57 ` Timothy Pearson
  2025-06-18 16:57 ` [PATCH v2 4/6] powerpc/eeh: Make EEH driver device hotplug safe Timothy Pearson
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 31+ messages in thread
From: Timothy Pearson @ 2025-06-18 16:57 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

The PowerNV hotplug driver needs to be able to clear any frozen PE(s)
on the PHB after suprise removal of a downstream device.

Export the eeh_unfreeze_pe() symbol to allow implementation of this
functionality in the php_nv module.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 arch/powerpc/kernel/eeh.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 83fe99861eb1..3a2a926fbd64 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1139,6 +1139,7 @@ int eeh_unfreeze_pe(struct eeh_pe *pe)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(eeh_unfreeze_pe);
 
 
 static struct pci_device_id eeh_reset_ids[] = {
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 4/6] powerpc/eeh: Make EEH driver device hotplug safe
  2025-06-18 16:54 [PATCH v2 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
                   ` (2 preceding siblings ...)
  2025-06-18 16:57 ` [PATCH v2 3/6] powerpc/eeh: Export eeh_unfreeze_pe() Timothy Pearson
@ 2025-06-18 16:57 ` Timothy Pearson
  2025-06-18 16:58 ` [PATCH v2 5/6] pci/hotplug/pnv_php: Fix surprise plug detection and Timothy Pearson
  2025-06-18 16:58 ` [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator Timothy Pearson
  5 siblings, 0 replies; 31+ messages in thread
From: Timothy Pearson @ 2025-06-18 16:57 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

Multiple race conditions existed between the PCIe hotplug driver and
the EEH driver, leading to a variety of kernel oopses of the same
general nature:

<pcie device unplug>
<eeh driver trigger>
<hotplug removal trigger>
<pcie tree reconfiguration>
<eeh recovery next step>
<oops in EEH driver bus iteration loop>

A second class of oops is also seen when the underling bus disappears
during device recovery.

Refactor the EEH module to be PCI rescan and remove safe.  Also clean
up a few minor formatting / readability issues.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 arch/powerpc/kernel/eeh_driver.c | 48 +++++++++++++++++++++-----------
 arch/powerpc/kernel/eeh_pe.c     | 10 ++++---
 2 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 7efe04c68f0f..dd50de91c438 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -257,13 +257,12 @@ static void eeh_pe_report_edev(struct eeh_dev *edev, eeh_report_fn fn,
 	struct pci_driver *driver;
 	enum pci_ers_result new_result;
 
-	pci_lock_rescan_remove();
 	pdev = edev->pdev;
 	if (pdev)
 		get_device(&pdev->dev);
-	pci_unlock_rescan_remove();
 	if (!pdev) {
 		eeh_edev_info(edev, "no device");
+		*result = PCI_ERS_RESULT_DISCONNECT;
 		return;
 	}
 	device_lock(&pdev->dev);
@@ -304,8 +303,9 @@ static void eeh_pe_report(const char *name, struct eeh_pe *root,
 	struct eeh_dev *edev, *tmp;
 
 	pr_info("EEH: Beginning: '%s'\n", name);
-	eeh_for_each_pe(root, pe) eeh_pe_for_each_dev(pe, edev, tmp)
-		eeh_pe_report_edev(edev, fn, result);
+	eeh_for_each_pe(root, pe)
+		eeh_pe_for_each_dev(pe, edev, tmp)
+			eeh_pe_report_edev(edev, fn, result);
 	if (result)
 		pr_info("EEH: Finished:'%s' with aggregate recovery state:'%s'\n",
 			name, pci_ers_result_name(*result));
@@ -383,6 +383,8 @@ static void eeh_dev_restore_state(struct eeh_dev *edev, void *userdata)
 	if (!edev)
 		return;
 
+	pci_lock_rescan_remove();
+
 	/*
 	 * The content in the config space isn't saved because
 	 * the blocked config space on some adapters. We have
@@ -393,14 +395,19 @@ static void eeh_dev_restore_state(struct eeh_dev *edev, void *userdata)
 		if (list_is_last(&edev->entry, &edev->pe->edevs))
 			eeh_pe_restore_bars(edev->pe);
 
+		pci_unlock_rescan_remove();
 		return;
 	}
 
 	pdev = eeh_dev_to_pci_dev(edev);
-	if (!pdev)
+	if (!pdev) {
+		pci_unlock_rescan_remove();
 		return;
+	}
 
 	pci_restore_state(pdev);
+
+	pci_unlock_rescan_remove();
 }
 
 /**
@@ -647,9 +654,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 	if (any_passed || driver_eeh_aware || (pe->type & EEH_PE_VF)) {
 		eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data);
 	} else {
-		pci_lock_rescan_remove();
 		pci_hp_remove_devices(bus);
-		pci_unlock_rescan_remove();
 	}
 
 	/*
@@ -665,8 +670,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 	if (rc)
 		return rc;
 
-	pci_lock_rescan_remove();
-
 	/* Restore PE */
 	eeh_ops->configure_bridge(pe);
 	eeh_pe_restore_bars(pe);
@@ -674,7 +677,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 	/* Clear frozen state */
 	rc = eeh_clear_pe_frozen_state(pe, false);
 	if (rc) {
-		pci_unlock_rescan_remove();
 		return rc;
 	}
 
@@ -709,7 +711,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 	pe->tstamp = tstamp;
 	pe->freeze_count = cnt;
 
-	pci_unlock_rescan_remove();
 	return 0;
 }
 
@@ -843,10 +844,13 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		{LIST_HEAD_INIT(rmv_data.removed_vf_list), 0};
 	int devices = 0;
 
+	pci_lock_rescan_remove();
+
 	bus = eeh_pe_bus_get(pe);
 	if (!bus) {
 		pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n",
 			__func__, pe->phb->global_number, pe->addr);
+		pci_unlock_rescan_remove();
 		return;
 	}
 
@@ -1094,10 +1098,15 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 		eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true);
 		eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED);
 
-		pci_lock_rescan_remove();
-		pci_hp_remove_devices(bus);
-		pci_unlock_rescan_remove();
+		bus = eeh_pe_bus_get(pe);
+		if (bus)
+			pci_hp_remove_devices(bus);
+		else
+			pr_err("%s: PCI bus for PHB#%x-PE#%x disappeared\n",
+				__func__, pe->phb->global_number, pe->addr);
+
 		/* The passed PE should no longer be used */
+		pci_unlock_rescan_remove();
 		return;
 	}
 
@@ -1114,6 +1123,8 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
 			eeh_clear_slot_attention(edev->pdev);
 
 	eeh_pe_state_clear(pe, EEH_PE_RECOVERING, true);
+
+	pci_unlock_rescan_remove();
 }
 
 /**
@@ -1132,6 +1143,7 @@ void eeh_handle_special_event(void)
 	unsigned long flags;
 	int rc;
 
+	pci_lock_rescan_remove();
 
 	do {
 		rc = eeh_ops->next_error(&pe);
@@ -1171,10 +1183,12 @@ void eeh_handle_special_event(void)
 
 			break;
 		case EEH_NEXT_ERR_NONE:
+			pci_unlock_rescan_remove();
 			return;
 		default:
 			pr_warn("%s: Invalid value %d from next_error()\n",
 				__func__, rc);
+			pci_unlock_rescan_remove();
 			return;
 		}
 
@@ -1186,7 +1200,9 @@ void eeh_handle_special_event(void)
 		if (rc == EEH_NEXT_ERR_FROZEN_PE ||
 		    rc == EEH_NEXT_ERR_FENCED_PHB) {
 			eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
+			pci_unlock_rescan_remove();
 			eeh_handle_normal_event(pe);
+			pci_lock_rescan_remove();
 		} else {
 			eeh_for_each_pe(pe, tmp_pe)
 				eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev)
@@ -1199,7 +1215,6 @@ void eeh_handle_special_event(void)
 				eeh_report_failure, NULL);
 			eeh_set_channel_state(pe, pci_channel_io_perm_failure);
 
-			pci_lock_rescan_remove();
 			list_for_each_entry(hose, &hose_list, list_node) {
 				phb_pe = eeh_phb_pe_get(hose);
 				if (!phb_pe ||
@@ -1218,7 +1233,6 @@ void eeh_handle_special_event(void)
 				}
 				pci_hp_remove_devices(bus);
 			}
-			pci_unlock_rescan_remove();
 		}
 
 		/*
@@ -1228,4 +1242,6 @@ void eeh_handle_special_event(void)
 		if (rc == EEH_NEXT_ERR_DEAD_IOC)
 			break;
 	} while (rc != EEH_NEXT_ERR_NONE);
+
+	pci_unlock_rescan_remove();
 }
diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index d283d281d28e..e740101fadf3 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -671,10 +671,12 @@ static void eeh_bridge_check_link(struct eeh_dev *edev)
 	eeh_ops->write_config(edev, cap + PCI_EXP_LNKCTL, 2, val);
 
 	/* Check link */
-	if (!edev->pdev->link_active_reporting) {
-		eeh_edev_dbg(edev, "No link reporting capability\n");
-		msleep(1000);
-		return;
+	if (edev->pdev) {
+		if (!edev->pdev->link_active_reporting) {
+			eeh_edev_dbg(edev, "No link reporting capability\n");
+			msleep(1000);
+			return;
+		}
 	}
 
 	/* Wait the link is up until timeout (5s) */
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 5/6] pci/hotplug/pnv_php: Fix surprise plug detection and
  2025-06-18 16:54 [PATCH v2 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
                   ` (3 preceding siblings ...)
  2025-06-18 16:57 ` [PATCH v2 4/6] powerpc/eeh: Make EEH driver device hotplug safe Timothy Pearson
@ 2025-06-18 16:58 ` Timothy Pearson
  2025-06-18 19:15   ` Bjorn Helgaas
  2025-06-18 16:58 ` [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator Timothy Pearson
  5 siblings, 1 reply; 31+ messages in thread
From: Timothy Pearson @ 2025-06-18 16:58 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

 recovery

The existing PowerNV hotplug code did not handle suprise plug events
correctly, leading to a complete failure of the hotplug system after
device removal and a required reboot to detect new devices.

This comes down to two issues:
1.) When a device is suprise removed, oftentimes the bridge upstream
    port will cause a PE freeze on the PHB.  If this freeze is not
    cleared, the MSI interrupts from the bridge hotplug notification
    logic will not be received by the kernel, stalling all plug events
    on all slots associated with the PE.
2.) When a device is removed from a slot, regardless of suprise or
    programmatic removal, the associated PHB/PE ls left frozen.
    If this freeze is not cleared via a fundamental reset, skiboot
    is unable to clear the freeze and cannot retrain / rescan the
    slot.  This also requires a reboot to clear the freeze and redetect
    the device in the slot.

Issue the appropriate unfreeze and rescan commands on hotplug events,
and don't oops on hotplug if pci_bus_to_OF_node() returns NULL.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 arch/powerpc/kernel/pci-hotplug.c |  3 ++
 drivers/pci/hotplug/pnv_php.c     | 53 ++++++++++++++++++++++++++++++-
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 9ea74973d78d..6f444d0822d8 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -141,6 +141,9 @@ void pci_hp_add_devices(struct pci_bus *bus)
 	struct pci_controller *phb;
 	struct device_node *dn = pci_bus_to_OF_node(bus);
 
+	if (!dn)
+		return;
+
 	phb = pci_bus_to_host(bus);
 
 	mode = PCI_PROBE_NORMAL;
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index bac8af3df41a..0ceb4a2c3c79 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -10,6 +10,7 @@
 #include <linux/libfdt.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <linux/delay.h>
 #include <linux/pci_hotplug.h>
 #include <linux/of_fdt.h>
 
@@ -474,7 +475,7 @@ static int pnv_php_enable(struct pnv_php_slot *php_slot, bool rescan)
 	struct hotplug_slot *slot = &php_slot->slot;
 	uint8_t presence = OPAL_PCI_SLOT_EMPTY;
 	uint8_t power_status = OPAL_PCI_SLOT_POWER_ON;
-	int ret;
+	int ret, i;
 
 	/* Check if the slot has been configured */
 	if (php_slot->state != PNV_PHP_STATE_REGISTERED)
@@ -532,6 +533,27 @@ static int pnv_php_enable(struct pnv_php_slot *php_slot, bool rescan)
 
 	/* Power is off, turn it on and then scan the slot */
 	ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
+	if (ret) {
+		SLOT_WARN(php_slot, "PCI slot activation failed with error code %d, possible frozen PHB", ret);
+		SLOT_WARN(php_slot, "Attempting complete PHB reset before retrying slot activation\n");
+		for (i = 0; i < 3; i++) {
+			/* Slot activation failed, PHB may be fenced from a prior device failure
+			 * Use the OPAL fundamental reset call to both try a device reset and clear
+			 * any potentially active PHB fence / freeze
+			 */
+			SLOT_WARN(php_slot, "Try %d...\n", i + 1);
+			pci_set_pcie_reset_state(php_slot->pdev, pcie_warm_reset);
+			msleep(250);
+			pci_set_pcie_reset_state(php_slot->pdev, pcie_deassert_reset);
+
+			ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
+			if (!ret)
+				break;
+		}
+
+		if (i >= 3)
+			SLOT_WARN(php_slot, "Failed to bring slot online, aborting!\n");
+	}
 	if (ret)
 		return ret;
 
@@ -841,12 +863,41 @@ static void pnv_php_event_handler(struct work_struct *work)
 	struct pnv_php_event *event =
 		container_of(work, struct pnv_php_event, work);
 	struct pnv_php_slot *php_slot = event->php_slot;
+	struct pci_dev *pdev = php_slot->pdev;
+	struct eeh_dev *edev;
+	struct eeh_pe *pe;
+	int i, rc;
 
 	if (event->added)
 		pnv_php_enable_slot(&php_slot->slot);
 	else
 		pnv_php_disable_slot(&php_slot->slot);
 
+	if (!event->added) {
+		/* When a device is surprise removed from a downstream bridge slot, the upstream bridge port
+		 * can still end up frozen due to related EEH events, which will in turn block the MSI interrupts
+		 * for slot hotplug detection.  Detect and thaw any frozen upstream PE after slot deactivation...
+		 */
+		edev = pci_dev_to_eeh_dev(pdev);
+		pe = edev ? edev->pe : NULL;
+		rc = eeh_pe_get_state(pe);
+		if ((rc == -ENODEV) || (rc == -ENOENT)) {
+			SLOT_WARN(php_slot, "Upstream bridge PE state unknown, hotplug detect may fail\n");
+		}
+		else {
+			if (pe->state & EEH_PE_ISOLATED) {
+				SLOT_WARN(php_slot, "Upstream bridge PE %02x frozen, thawing...\n", pe->addr);
+				for (i = 0; i < 3; i++)
+					if (!eeh_unfreeze_pe(pe))
+						break;
+				if (i >= 3)
+					SLOT_WARN(php_slot, "Unable to thaw PE %02x, hotplug detect will fail!\n", pe->addr);
+				else
+					SLOT_WARN(php_slot, "PE %02x thawed successfully\n", pe->addr);
+			}
+		}
+	}
+
 	kfree(event);
 }
 
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-18 16:54 [PATCH v2 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
                   ` (4 preceding siblings ...)
  2025-06-18 16:58 ` [PATCH v2 5/6] pci/hotplug/pnv_php: Fix surprise plug detection and Timothy Pearson
@ 2025-06-18 16:58 ` Timothy Pearson
  2025-06-18 19:01   ` Bjorn Helgaas
  5 siblings, 1 reply; 31+ messages in thread
From: Timothy Pearson @ 2025-06-18 16:58 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

 state

The PCIe specification allows three attention indicator states,
on, off, and blink.  Enable all three states instead of basic
on / off control.

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
---
 drivers/pci/hotplug/pnv_php.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 0ceb4a2c3c79..c3005324be3d 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -440,10 +440,23 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
 	return ret;
 }
 
+static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8 *state)
+{
+	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
+	struct pci_dev *bridge = php_slot->pdev;
+	u16 status;
+
+	pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
+	*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;
+	return 0;
+}
+
+
 static int pnv_php_get_attention_state(struct hotplug_slot *slot, u8 *state)
 {
 	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
 
+	pnv_php_get_raw_indicator_status(slot, &php_slot->attention_state);
 	*state = php_slot->attention_state;
 	return 0;
 }
@@ -461,7 +474,7 @@ static int pnv_php_set_attention_state(struct hotplug_slot *slot, u8 state)
 	mask = PCI_EXP_SLTCTL_AIC;
 
 	if (state)
-		new = PCI_EXP_SLTCTL_ATTN_IND_ON;
+		new = FIELD_PREP(PCI_EXP_SLTCTL_AIC, state);
 	else
 		new = PCI_EXP_SLTCTL_ATTN_IND_OFF;
 
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-18 16:58 ` [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator Timothy Pearson
@ 2025-06-18 19:01   ` Bjorn Helgaas
  2025-06-19  0:37     ` Timothy Pearson
  0 siblings, 1 reply; 31+ messages in thread
From: Bjorn Helgaas @ 2025-06-18 19:01 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

On Wed, Jun 18, 2025 at 11:58:59AM -0500, Timothy Pearson wrote:
>  state

Weird wrapping of last word of subject to here.

> The PCIe specification allows three attention indicator states,
> on, off, and blink.  Enable all three states instead of basic
> on / off control.
> 
> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
> ---
>  drivers/pci/hotplug/pnv_php.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
> index 0ceb4a2c3c79..c3005324be3d 100644
> --- a/drivers/pci/hotplug/pnv_php.c
> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -440,10 +440,23 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
>  	return ret;
>  }
>  
> +static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8 *state)
> +{
> +	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
> +	struct pci_dev *bridge = php_slot->pdev;
> +	u16 status;
> +
> +	pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
> +	*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;

Should be able to do this with FIELD_GET().

Is the PCI_EXP_SLTCTL_PIC part needed?  It wasn't there before, commit
log doesn't mention it, and as far as I can tell, this would be the
only driver to do that.  Most expose only the attention status (0=off,
1=on, 2=identify/blink).

> +	return 0;
> +}
> +
> +
>  static int pnv_php_get_attention_state(struct hotplug_slot *slot, u8 *state)
>  {
>  	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
>  
> +	pnv_php_get_raw_indicator_status(slot, &php_slot->attention_state);

This is a change worth noting.  Previously we didn't read the AIC
state from PCI_EXP_SLTCTL at all; we used php_slot->attention_state to
keep track of whatever had been previously set via
pnv_php_set_attention_state().

Now we read the current state from PCI_EXP_SLTCTL.  It's not clear
that php_slot->attention_state is still needed at all.

Previously, the user could write any value at all to the sysfs
"attention" file and then read that same value back.  After this
patch, the user can still write anything, but reads will only return
values with PCI_EXP_SLTCTL_AIC and PCI_EXP_SLTCTL_PIC.

>  	*state = php_slot->attention_state;
>  	return 0;
>  }
> @@ -461,7 +474,7 @@ static int pnv_php_set_attention_state(struct hotplug_slot *slot, u8 state)
>  	mask = PCI_EXP_SLTCTL_AIC;
>  
>  	if (state)
> -		new = PCI_EXP_SLTCTL_ATTN_IND_ON;
> +		new = FIELD_PREP(PCI_EXP_SLTCTL_AIC, state);

This changes the behavior in some cases:

  write 0: previously turned indicator off, now writes reserved value
  write 2: previously turned indicator on, now sets to blink
  write 3: previously turned indicator on, now turns it off

>  	else
>  		new = PCI_EXP_SLTCTL_ATTN_IND_OFF;
>  
> -- 
> 2.39.5

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 5/6] pci/hotplug/pnv_php: Fix surprise plug detection and
  2025-06-18 16:58 ` [PATCH v2 5/6] pci/hotplug/pnv_php: Fix surprise plug detection and Timothy Pearson
@ 2025-06-18 19:15   ` Bjorn Helgaas
  2025-06-19 19:22     ` Timothy Pearson
  0 siblings, 1 reply; 31+ messages in thread
From: Bjorn Helgaas @ 2025-06-18 19:15 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

On Wed, Jun 18, 2025 at 11:58:23AM -0500, Timothy Pearson wrote:
>  recovery

Same weird subject/commit wrapping.

> The existing PowerNV hotplug code did not handle suprise plug events
> correctly, leading to a complete failure of the hotplug system after
> device removal and a required reboot to detect new devices.

s/suprise/surprise/ (also below)

> This comes down to two issues:
> 1.) When a device is suprise removed, oftentimes the bridge upstream
>     port will cause a PE freeze on the PHB.  If this freeze is not
>     cleared, the MSI interrupts from the bridge hotplug notification
>     logic will not be received by the kernel, stalling all plug events
>     on all slots associated with the PE.

I guess you mean the bridge *downstream* port that leads to the slot?

> 2.) When a device is removed from a slot, regardless of suprise or
>     programmatic removal, the associated PHB/PE ls left frozen.
>     If this freeze is not cleared via a fundamental reset, skiboot
>     is unable to clear the freeze and cannot retrain / rescan the
>     slot.  This also requires a reboot to clear the freeze and redetect
>     the device in the slot.
> 
> Issue the appropriate unfreeze and rescan commands on hotplug events,
> and don't oops on hotplug if pci_bus_to_OF_node() returns NULL.
> 
> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
> ---
>  arch/powerpc/kernel/pci-hotplug.c |  3 ++
>  drivers/pci/hotplug/pnv_php.c     | 53 ++++++++++++++++++++++++++++++-
>  2 files changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
> index 9ea74973d78d..6f444d0822d8 100644
> --- a/arch/powerpc/kernel/pci-hotplug.c
> +++ b/arch/powerpc/kernel/pci-hotplug.c
> @@ -141,6 +141,9 @@ void pci_hp_add_devices(struct pci_bus *bus)
>  	struct pci_controller *phb;
>  	struct device_node *dn = pci_bus_to_OF_node(bus);
>  
> +	if (!dn)
> +		return;
> +
>  	phb = pci_bus_to_host(bus);
>  
>  	mode = PCI_PROBE_NORMAL;
> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
> index bac8af3df41a..0ceb4a2c3c79 100644
> --- a/drivers/pci/hotplug/pnv_php.c
> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -10,6 +10,7 @@
>  #include <linux/libfdt.h>
>  #include <linux/module.h>
>  #include <linux/pci.h>
> +#include <linux/delay.h>
>  #include <linux/pci_hotplug.h>
>  #include <linux/of_fdt.h>
>  
> @@ -474,7 +475,7 @@ static int pnv_php_enable(struct pnv_php_slot *php_slot, bool rescan)
>  	struct hotplug_slot *slot = &php_slot->slot;
>  	uint8_t presence = OPAL_PCI_SLOT_EMPTY;
>  	uint8_t power_status = OPAL_PCI_SLOT_POWER_ON;
> -	int ret;
> +	int ret, i;
>  
>  	/* Check if the slot has been configured */
>  	if (php_slot->state != PNV_PHP_STATE_REGISTERED)
> @@ -532,6 +533,27 @@ static int pnv_php_enable(struct pnv_php_slot *php_slot, bool rescan)
>  
>  	/* Power is off, turn it on and then scan the slot */
>  	ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
> +	if (ret) {
> +		SLOT_WARN(php_slot, "PCI slot activation failed with error code %d, possible frozen PHB", ret);
> +		SLOT_WARN(php_slot, "Attempting complete PHB reset before retrying slot activation\n");
> +		for (i = 0; i < 3; i++) {
> +			/* Slot activation failed, PHB may be fenced from a prior device failure
> +			 * Use the OPAL fundamental reset call to both try a device reset and clear
> +			 * any potentially active PHB fence / freeze
> +			 */
> +			SLOT_WARN(php_slot, "Try %d...\n", i + 1);
> +			pci_set_pcie_reset_state(php_slot->pdev, pcie_warm_reset);
> +			msleep(250);

What is the source of the 250 value?  Is there a spec you can cite for
this?  Maybe add a #define if it makes sense?

> +			pci_set_pcie_reset_state(php_slot->pdev, pcie_deassert_reset);
> +
> +			ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);

Wrap the comment and non-printk lines to fit in 80 columns like the
rest of the file.  Preserve the messages as-is so grep finds them
easily.

Usual multi-line comment style is:

  /*
   * Text ...
   */

Possibly factor this warn/reset code into a helper function to
unclutter pnv_php_enable()?

> +			if (!ret)
> +				break;
> +		}
> +
> +		if (i >= 3)
> +			SLOT_WARN(php_slot, "Failed to bring slot online, aborting!\n");
> +	}
>  	if (ret)
>  		return ret;
>  
> @@ -841,12 +863,41 @@ static void pnv_php_event_handler(struct work_struct *work)
>  	struct pnv_php_event *event =
>  		container_of(work, struct pnv_php_event, work);
>  	struct pnv_php_slot *php_slot = event->php_slot;
> +	struct pci_dev *pdev = php_slot->pdev;
> +	struct eeh_dev *edev;
> +	struct eeh_pe *pe;
> +	int i, rc;
>  
>  	if (event->added)
>  		pnv_php_enable_slot(&php_slot->slot);
>  	else
>  		pnv_php_disable_slot(&php_slot->slot);
>  
> +	if (!event->added) {
> +		/* When a device is surprise removed from a downstream bridge slot, the upstream bridge port
> +		 * can still end up frozen due to related EEH events, which will in turn block the MSI interrupts
> +		 * for slot hotplug detection.  Detect and thaw any frozen upstream PE after slot deactivation...
> +		 */

Restyle and wrap comment.

s/upstream bridge port/bridge downstream port/ to avoid confusion.

> +		edev = pci_dev_to_eeh_dev(pdev);
> +		pe = edev ? edev->pe : NULL;
> +		rc = eeh_pe_get_state(pe);
> +		if ((rc == -ENODEV) || (rc == -ENOENT)) {
> +			SLOT_WARN(php_slot, "Upstream bridge PE state unknown, hotplug detect may fail\n");
> +		}
> +		else {
> +			if (pe->state & EEH_PE_ISOLATED) {
> +				SLOT_WARN(php_slot, "Upstream bridge PE %02x frozen, thawing...\n", pe->addr);
> +				for (i = 0; i < 3; i++)
> +					if (!eeh_unfreeze_pe(pe))
> +						break;
> +				if (i >= 3)
> +					SLOT_WARN(php_slot, "Unable to thaw PE %02x, hotplug detect will fail!\n", pe->addr);
> +				else
> +					SLOT_WARN(php_slot, "PE %02x thawed successfully\n", pe->addr);
> +			}
> +		}
> +	}

Possibly factor this out, too.  Then pnv_php_event_handler() could
look simpler:

  if (event->added) {
    pnv_php_enable_slot(&php_slot->slot);
  } else {
    pnv_php_disable_slot(&php_slot->slot);
    <new helper to check for surprise removal>
  }

  kfree(event);

>  	kfree(event);
>  }
>  
> -- 
> 2.39.5

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken
  2025-06-18 16:56 ` [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken Timothy Pearson
@ 2025-06-18 19:44   ` Bjorn Helgaas
  2025-06-18 19:50     ` Timothy Pearson
  0 siblings, 1 reply; 31+ messages in thread
From: Bjorn Helgaas @ 2025-06-18 19:44 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio, Lukas Wunner

[+cc Lukas, pciehp expert]

On Wed, Jun 18, 2025 at 11:56:54AM -0500, Timothy Pearson wrote:
>  presence detection

(subject/commit wrapping seems to be on all of these patches)

> The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
> was observed to incorrectly assert the Presence Detect Set bit in its
> capabilities when tested on a Raptor Computing Systems Blackbird system,
> resulting in the hot insert path never attempting a rescan of the bus
> and any downstream devices not being re-detected.

Seems like this switch supports standard PCIe hotplug?  Quite a bit of
this driver looks similar to things in pciehp.  Is there some reason
we can't use pciehp directly?  Maybe pciehp could work if there were
hooks for the PPC-specific bits?

> Work around this by additionally checking whether the PCIe data link is
> active or not when performing presence detection on downstream switches'
> ports, similar to the pciehp_hpc.c driver.
> 
> Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
> ---
>  drivers/pci/hotplug/pnv_php.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
> index aec0a6d594ac..bac8af3df41a 100644
> --- a/drivers/pci/hotplug/pnv_php.c
> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -391,6 +391,20 @@ static int pnv_php_get_power_state(struct hotplug_slot *slot, u8 *state)
>  	return 0;
>  }
>  
> +static int pcie_check_link_active(struct pci_dev *pdev)
> +{
> +	u16 lnk_status;
> +	int ret;
> +
> +	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
> +	if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
> +		return -ENODEV;
> +
> +	ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
> +
> +	return ret;
> +}
> +
>  static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
>  {
>  	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
> @@ -403,6 +417,19 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
>  	 */
>  	ret = pnv_pci_get_presence_state(php_slot->id, &presence);
>  	if (ret >= 0) {
> +		if (pci_pcie_type(php_slot->pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
> +			presence == OPAL_PCI_SLOT_EMPTY) {
> +			/*
> +			 * Similar to pciehp_hpc, check whether the Link Active
> +			 * bit is set to account for broken downstream bridges
> +			 * that don't properly assert Presence Detect State, as
> +			 * was observed on the Microsemi Switchtec PM8533 PFX
> +			 * [11f8:8533].
> +			 */
> +			if (pcie_check_link_active(php_slot->pdev) > 0)
> +				presence = OPAL_PCI_SLOT_PRESENT;
> +		}
> +
>  		*state = presence;
>  		ret = 0;
>  	} else {
> -- 
> 2.39.5

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken
  2025-06-18 19:44   ` Bjorn Helgaas
@ 2025-06-18 19:50     ` Timothy Pearson
  2025-06-18 20:17       ` Bjorn Helgaas
  0 siblings, 1 reply; 31+ messages in thread
From: Timothy Pearson @ 2025-06-18 19:50 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio, Lukas Wunner



----- Original Message -----
> From: "Bjorn Helgaas" <helgaas@kernel.org>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>, "Lukas Wunner" <lukas@wunner.de>
> Sent: Wednesday, June 18, 2025 2:44:00 PM
> Subject: Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken

> [+cc Lukas, pciehp expert]
> 
> On Wed, Jun 18, 2025 at 11:56:54AM -0500, Timothy Pearson wrote:
>>  presence detection
> 
> (subject/commit wrapping seems to be on all of these patches)
> 
>> The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
>> was observed to incorrectly assert the Presence Detect Set bit in its
>> capabilities when tested on a Raptor Computing Systems Blackbird system,
>> resulting in the hot insert path never attempting a rescan of the bus
>> and any downstream devices not being re-detected.
> 
> Seems like this switch supports standard PCIe hotplug?  Quite a bit of
> this driver looks similar to things in pciehp.  Is there some reason
> we can't use pciehp directly?  Maybe pciehp could work if there were
> hooks for the PPC-specific bits?

While that is a good long term goal that Raptor is willing to work toward, it is non-trivial and will require buy-in from other stakeholders (e.g. IBM).  If practical, I'd like to get this series merged first, to fix the broken hotplug on our hardware that is deployed worldwide, then in parallel see what can be done to merge PowerNV support into pciehp.  Would that work?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken
  2025-06-18 19:50     ` Timothy Pearson
@ 2025-06-18 20:17       ` Bjorn Helgaas
  2025-06-19 19:29         ` Timothy Pearson
  0 siblings, 1 reply; 31+ messages in thread
From: Bjorn Helgaas @ 2025-06-18 20:17 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio, Lukas Wunner

On Wed, Jun 18, 2025 at 02:50:04PM -0500, Timothy Pearson wrote:
> ----- Original Message -----
> > From: "Bjorn Helgaas" <helgaas@kernel.org>
> > To: "Timothy Pearson" <tpearson@raptorengineering.com>
> > Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> > <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> > "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> > <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>, "Lukas Wunner" <lukas@wunner.de>
> > Sent: Wednesday, June 18, 2025 2:44:00 PM
> > Subject: Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken
> 
> > [+cc Lukas, pciehp expert]
> > 
> > On Wed, Jun 18, 2025 at 11:56:54AM -0500, Timothy Pearson wrote:
> >>  presence detection
> > 
> > (subject/commit wrapping seems to be on all of these patches)
> > 
> >> The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
> >> was observed to incorrectly assert the Presence Detect Set bit in its
> >> capabilities when tested on a Raptor Computing Systems Blackbird system,
> >> resulting in the hot insert path never attempting a rescan of the bus
> >> and any downstream devices not being re-detected.
> > 
> > Seems like this switch supports standard PCIe hotplug?  Quite a bit of
> > this driver looks similar to things in pciehp.  Is there some reason
> > we can't use pciehp directly?  Maybe pciehp could work if there were
> > hooks for the PPC-specific bits?
> 
> While that is a good long term goal that Raptor is willing to work
> toward, it is non-trivial and will require buy-in from other
> stakeholders (e.g. IBM).  If practical, I'd like to get this series
> merged first, to fix the broken hotplug on our hardware that is
> deployed worldwide, then in parallel see what can be done to merge
> PowerNV support into pciehp.  Would that work?

Yeah, it wouldn't make sense to switch horses at this stage.

I guess I was triggered by this patch, which seems to be a workaround
for a defect in a device that is probably also used on non-PPC
systems, and pciehp would need a similar workaround.  But I guess you
go on to say that pciehp already does something similar, so it guess
it's already covered.

Bjorn

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-18 19:01   ` Bjorn Helgaas
@ 2025-06-19  0:37     ` Timothy Pearson
  2025-06-20  9:26       ` Krishna Kumar
  2025-06-24 22:34       ` Bjorn Helgaas
  0 siblings, 2 replies; 31+ messages in thread
From: Timothy Pearson @ 2025-06-19  0:37 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio



----- Original Message -----
> From: "Bjorn Helgaas" <helgaas@kernel.org>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
> Sent: Wednesday, June 18, 2025 2:01:46 PM
> Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator

> On Wed, Jun 18, 2025 at 11:58:59AM -0500, Timothy Pearson wrote:
>>  state
> 
> Weird wrapping of last word of subject to here.

I'll need to see what's up with my git format-patch setup. Apologies for that across the multiple series.

>> The PCIe specification allows three attention indicator states,
>> on, off, and blink.  Enable all three states instead of basic
>> on / off control.
>> 
>> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
>> ---
>>  drivers/pci/hotplug/pnv_php.c | 15 ++++++++++++++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
>> index 0ceb4a2c3c79..c3005324be3d 100644
>> --- a/drivers/pci/hotplug/pnv_php.c
>> +++ b/drivers/pci/hotplug/pnv_php.c
>> @@ -440,10 +440,23 @@ static int pnv_php_get_adapter_state(struct hotplug_slot
>> *slot, u8 *state)
>>  	return ret;
>>  }
>>  
>> +static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8
>> *state)
>> +{
>> +	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
>> +	struct pci_dev *bridge = php_slot->pdev;
>> +	u16 status;
>> +
>> +	pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
>> +	*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;
> 
> Should be able to do this with FIELD_GET().

I used the same overall structure as the pciehp_hpc driver here.  Do you want me to also fix up that driver with FIELD_GET()?

> Is the PCI_EXP_SLTCTL_PIC part needed?  It wasn't there before, commit
> log doesn't mention it, and as far as I can tell, this would be the
> only driver to do that.  Most expose only the attention status (0=off,
> 1=on, 2=identify/blink).
> 
>> +	return 0;
>> +}
>> +
>> +
>>  static int pnv_php_get_attention_state(struct hotplug_slot *slot, u8 *state)
>>  {
>>  	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
>>  
>> +	pnv_php_get_raw_indicator_status(slot, &php_slot->attention_state);
> 
> This is a change worth noting.  Previously we didn't read the AIC
> state from PCI_EXP_SLTCTL at all; we used php_slot->attention_state to
> keep track of whatever had been previously set via
> pnv_php_set_attention_state().
> 
> Now we read the current state from PCI_EXP_SLTCTL.  It's not clear
> that php_slot->attention_state is still needed at all.

It probably isn't.  It's unclear why IBM took this path at all, given pciehp's attention handlers predate pnv-php's by many years.

> Previously, the user could write any value at all to the sysfs
> "attention" file and then read that same value back.  After this
> patch, the user can still write anything, but reads will only return
> values with PCI_EXP_SLTCTL_AIC and PCI_EXP_SLTCTL_PIC.
> 
>>  	*state = php_slot->attention_state;
>>  	return 0;
>>  }
>> @@ -461,7 +474,7 @@ static int pnv_php_set_attention_state(struct hotplug_slot
>> *slot, u8 state)
>>  	mask = PCI_EXP_SLTCTL_AIC;
>>  
>>  	if (state)
>> -		new = PCI_EXP_SLTCTL_ATTN_IND_ON;
>> +		new = FIELD_PREP(PCI_EXP_SLTCTL_AIC, state);
> 
> This changes the behavior in some cases:
> 
>  write 0: previously turned indicator off, now writes reserved value
>  write 2: previously turned indicator on, now sets to blink
>  write 3: previously turned indicator on, now turns it off

If we're looking at normalizing with pciehp with an eye toward eventually deprecating / removing pnv-php, I can't think of a better time to change this behavior.  I suspect we're the only major user of this code path at the moment, with most software expecting to see pciehp-style handling.  Thoughts?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 5/6] pci/hotplug/pnv_php: Fix surprise plug detection and
  2025-06-18 19:15   ` Bjorn Helgaas
@ 2025-06-19 19:22     ` Timothy Pearson
  0 siblings, 0 replies; 31+ messages in thread
From: Timothy Pearson @ 2025-06-19 19:22 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio



----- Original Message -----
> From: "Bjorn Helgaas" <helgaas@kernel.org>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
> Sent: Wednesday, June 18, 2025 2:15:30 PM
> Subject: Re: [PATCH v2 5/6] pci/hotplug/pnv_php: Fix surprise plug detection and

> On Wed, Jun 18, 2025 at 11:58:23AM -0500, Timothy Pearson wrote:
>>  recovery
> 
> Same weird subject/commit wrapping.
> 
>> The existing PowerNV hotplug code did not handle suprise plug events
>> correctly, leading to a complete failure of the hotplug system after
>> device removal and a required reboot to detect new devices.
> 
> s/suprise/surprise/ (also below)
> 
>> This comes down to two issues:
>> 1.) When a device is suprise removed, oftentimes the bridge upstream
>>     port will cause a PE freeze on the PHB.  If this freeze is not
>>     cleared, the MSI interrupts from the bridge hotplug notification
>>     logic will not be received by the kernel, stalling all plug events
>>     on all slots associated with the PE.
> 
> I guess you mean the bridge *downstream* port that leads to the slot?

No, the upstream port leading to the PHB.  If it was just the downstream port, we'd still be receiving MSI interrupts from the bridge; the upstream PE also ends up frozen.  My best guess is that the downstream error is propagated upward by the PHB, and the hardware freezes the PE to avoid data corruption until the software (in this case the hotplug driver) can analyze the fault, see that it is expected, and thaw the PE.

>> 2.) When a device is removed from a slot, regardless of suprise or
>>     programmatic removal, the associated PHB/PE ls left frozen.
>>     If this freeze is not cleared via a fundamental reset, skiboot
>>     is unable to clear the freeze and cannot retrain / rescan the
>>     slot.  This also requires a reboot to clear the freeze and redetect
>>     the device in the slot.
>> 
>> Issue the appropriate unfreeze and rescan commands on hotplug events,
>> and don't oops on hotplug if pci_bus_to_OF_node() returns NULL.
>> 
>> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
>> ---
>>  arch/powerpc/kernel/pci-hotplug.c |  3 ++
>>  drivers/pci/hotplug/pnv_php.c     | 53 ++++++++++++++++++++++++++++++-
>>  2 files changed, 55 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/powerpc/kernel/pci-hotplug.c
>> b/arch/powerpc/kernel/pci-hotplug.c
>> index 9ea74973d78d..6f444d0822d8 100644
>> --- a/arch/powerpc/kernel/pci-hotplug.c
>> +++ b/arch/powerpc/kernel/pci-hotplug.c
>> @@ -141,6 +141,9 @@ void pci_hp_add_devices(struct pci_bus *bus)
>>  	struct pci_controller *phb;
>>  	struct device_node *dn = pci_bus_to_OF_node(bus);
>>  
>> +	if (!dn)
>> +		return;
>> +
>>  	phb = pci_bus_to_host(bus);
>>  
>>  	mode = PCI_PROBE_NORMAL;
>> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
>> index bac8af3df41a..0ceb4a2c3c79 100644
>> --- a/drivers/pci/hotplug/pnv_php.c
>> +++ b/drivers/pci/hotplug/pnv_php.c
>> @@ -10,6 +10,7 @@
>>  #include <linux/libfdt.h>
>>  #include <linux/module.h>
>>  #include <linux/pci.h>
>> +#include <linux/delay.h>
>>  #include <linux/pci_hotplug.h>
>>  #include <linux/of_fdt.h>
>>  
>> @@ -474,7 +475,7 @@ static int pnv_php_enable(struct pnv_php_slot *php_slot,
>> bool rescan)
>>  	struct hotplug_slot *slot = &php_slot->slot;
>>  	uint8_t presence = OPAL_PCI_SLOT_EMPTY;
>>  	uint8_t power_status = OPAL_PCI_SLOT_POWER_ON;
>> -	int ret;
>> +	int ret, i;
>>  
>>  	/* Check if the slot has been configured */
>>  	if (php_slot->state != PNV_PHP_STATE_REGISTERED)
>> @@ -532,6 +533,27 @@ static int pnv_php_enable(struct pnv_php_slot *php_slot,
>> bool rescan)
>>  
>>  	/* Power is off, turn it on and then scan the slot */
>>  	ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
>> +	if (ret) {
>> +		SLOT_WARN(php_slot, "PCI slot activation failed with error code %d, possible
>> frozen PHB", ret);
>> +		SLOT_WARN(php_slot, "Attempting complete PHB reset before retrying slot
>> activation\n");
>> +		for (i = 0; i < 3; i++) {
>> +			/* Slot activation failed, PHB may be fenced from a prior device failure
>> +			 * Use the OPAL fundamental reset call to both try a device reset and clear
>> +			 * any potentially active PHB fence / freeze
>> +			 */
>> +			SLOT_WARN(php_slot, "Try %d...\n", i + 1);
>> +			pci_set_pcie_reset_state(php_slot->pdev, pcie_warm_reset);
>> +			msleep(250);
> 
> What is the source of the 250 value?  Is there a spec you can cite for
> this?  Maybe add a #define if it makes sense?

It's a magic number used elsewhere in the tree, specifically by the EEH driver which also issues fundamental resets as part of its logic.  Unfortunately I don't know the origin of the magic number, and from my understanding of the PCIe specification it's probably non-critical -- long enough to be reasonably sure that the PCIe device has seen the reset strobe, but short enough not to stall the operation for an obnoxiously long interval.

>> +			pci_set_pcie_reset_state(php_slot->pdev, pcie_deassert_reset);
>> +
>> +			ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
> 
> Wrap the comment and non-printk lines to fit in 80 columns like the
> rest of the file.  Preserve the messages as-is so grep finds them
> easily.
> 
> Usual multi-line comment style is:
> 
>  /*
>   * Text ...
>   */

Will do.

> Possibly factor this warn/reset code into a helper function to
> unclutter pnv_php_enable()?

Sure, will do.

>> +			if (!ret)
>> +				break;
>> +		}
>> +
>> +		if (i >= 3)
>> +			SLOT_WARN(php_slot, "Failed to bring slot online, aborting!\n");
>> +	}
>>  	if (ret)
>>  		return ret;
>>  
>> @@ -841,12 +863,41 @@ static void pnv_php_event_handler(struct work_struct
>> *work)
>>  	struct pnv_php_event *event =
>>  		container_of(work, struct pnv_php_event, work);
>>  	struct pnv_php_slot *php_slot = event->php_slot;
>> +	struct pci_dev *pdev = php_slot->pdev;
>> +	struct eeh_dev *edev;
>> +	struct eeh_pe *pe;
>> +	int i, rc;
>>  
>>  	if (event->added)
>>  		pnv_php_enable_slot(&php_slot->slot);
>>  	else
>>  		pnv_php_disable_slot(&php_slot->slot);
>>  
>> +	if (!event->added) {
>> +		/* When a device is surprise removed from a downstream bridge slot, the
>> upstream bridge port
>> +		 * can still end up frozen due to related EEH events, which will in turn
>> block the MSI interrupts
>> +		 * for slot hotplug detection.  Detect and thaw any frozen upstream PE after
>> slot deactivation...
>> +		 */
> 
> Restyle and wrap comment.
> 
> s/upstream bridge port/bridge downstream port/ to avoid confusion.

It's the upstream port of the bridge (downstream slot of the PHB itself) that ends up as the frozen PE.

>> +		edev = pci_dev_to_eeh_dev(pdev);
>> +		pe = edev ? edev->pe : NULL;
>> +		rc = eeh_pe_get_state(pe);
>> +		if ((rc == -ENODEV) || (rc == -ENOENT)) {
>> +			SLOT_WARN(php_slot, "Upstream bridge PE state unknown, hotplug detect may
>> fail\n");
>> +		}
>> +		else {
>> +			if (pe->state & EEH_PE_ISOLATED) {
>> +				SLOT_WARN(php_slot, "Upstream bridge PE %02x frozen, thawing...\n",
>> pe->addr);
>> +				for (i = 0; i < 3; i++)
>> +					if (!eeh_unfreeze_pe(pe))
>> +						break;
>> +				if (i >= 3)
>> +					SLOT_WARN(php_slot, "Unable to thaw PE %02x, hotplug detect will fail!\n",
>> pe->addr);
>> +				else
>> +					SLOT_WARN(php_slot, "PE %02x thawed successfully\n", pe->addr);
>> +			}
>> +		}
>> +	}
> 
> Possibly factor this out, too.  Then pnv_php_event_handler() could
> look simpler:
> 
>  if (event->added) {
>    pnv_php_enable_slot(&php_slot->slot);
>  } else {
>    pnv_php_disable_slot(&php_slot->slot);
>    <new helper to check for surprise removal>
>  }

Fair enough, will do.

>  kfree(event);
> 
>>  	kfree(event);
>>  }
>>  
>> --
> > 2.39.5

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken
  2025-06-18 20:17       ` Bjorn Helgaas
@ 2025-06-19 19:29         ` Timothy Pearson
  2025-06-20  7:52           ` Lukas Wunner
  0 siblings, 1 reply; 31+ messages in thread
From: Timothy Pearson @ 2025-06-19 19:29 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio, Lukas Wunner



----- Original Message -----
> From: "Bjorn Helgaas" <helgaas@kernel.org>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>, "Lukas Wunner" <lukas@wunner.de>
> Sent: Wednesday, June 18, 2025 3:17:22 PM
> Subject: Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken

> On Wed, Jun 18, 2025 at 02:50:04PM -0500, Timothy Pearson wrote:
>> ----- Original Message -----
>> > From: "Bjorn Helgaas" <helgaas@kernel.org>
>> > To: "Timothy Pearson" <tpearson@raptorengineering.com>
>> > Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel"
>> > <linux-kernel@vger.kernel.org>, "linux-pci"
>> > <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>,
>> > "Michael Ellerman" <mpe@ellerman.id.au>,
>> > "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao"
>> > <naveen@kernel.org>, "Bjorn Helgaas"
>> > <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>,
>> > "Lukas Wunner" <lukas@wunner.de>
>> > Sent: Wednesday, June 18, 2025 2:44:00 PM
>> > Subject: Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with
>> > broken
>> 
>> > [+cc Lukas, pciehp expert]
>> > 
>> > On Wed, Jun 18, 2025 at 11:56:54AM -0500, Timothy Pearson wrote:
>> >>  presence detection
>> > 
>> > (subject/commit wrapping seems to be on all of these patches)
>> > 
>> >> The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system
>> >> was observed to incorrectly assert the Presence Detect Set bit in its
>> >> capabilities when tested on a Raptor Computing Systems Blackbird system,
>> >> resulting in the hot insert path never attempting a rescan of the bus
>> >> and any downstream devices not being re-detected.
>> > 
>> > Seems like this switch supports standard PCIe hotplug?  Quite a bit of
>> > this driver looks similar to things in pciehp.  Is there some reason
>> > we can't use pciehp directly?  Maybe pciehp could work if there were
>> > hooks for the PPC-specific bits?
>> 
>> While that is a good long term goal that Raptor is willing to work
>> toward, it is non-trivial and will require buy-in from other
>> stakeholders (e.g. IBM).  If practical, I'd like to get this series
>> merged first, to fix the broken hotplug on our hardware that is
>> deployed worldwide, then in parallel see what can be done to merge
>> PowerNV support into pciehp.  Would that work?
> 
> Yeah, it wouldn't make sense to switch horses at this stage.
> 
> I guess I was triggered by this patch, which seems to be a workaround
> for a defect in a device that is probably also used on non-PPC
> systems, and pciehp would need a similar workaround.  But I guess you
> go on to say that pciehp already does something similar, so it guess
> it's already covered.

No problem, I completely understand.  To be perfectly frank the existing code quality in this driver (and the associated EEH driver) is not the best, and it's been a frustrating experience trying to hack it into semi-stable operation.  I would vastly prefer to rewrite / integrate into the pciehp driver, and we have plans to do so, but that will take an unacceptable amount of time vs. trying to fix up the existing driver as a stopgap.

As you mentioned, pciehp already has this fix, so we just have to deal with the duplicated code until we (Raptor) figures out how to merge PowerNV support into pciehp.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken
  2025-06-19 19:29         ` Timothy Pearson
@ 2025-06-20  7:52           ` Lukas Wunner
  2025-06-20 16:45             ` Timothy Pearson
  0 siblings, 1 reply; 31+ messages in thread
From: Lukas Wunner @ 2025-06-20  7:52 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: Bjorn Helgaas, linuxppc-dev, linux-kernel, linux-pci,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, Bjorn Helgaas, Shawn Anastasio

On Thu, Jun 19, 2025 at 02:29:33PM -0500, Timothy Pearson wrote:
> To be perfectly frank the existing code quality in this driver
> (and the associated EEH driver) is not the best, and it's been
> a frustrating experience trying to hack it into semi-stable
> operation.
> 
> I would vastly prefer to rewrite / integrate into the pciehp driver,
> and we have plans to do so, but that will take an unacceptable amount
> of time vs. trying to fix up the existing driver as a stopgap.
> 
> As you mentioned, pciehp already has this fix, so we just have to
> deal with the duplicated code until we (Raptor) figures out how to
> merge PowerNV support into pciehp.

I don't know how much PCIe hotplug on PowerNV differs from native,
spec-compliant PCIe hotplug.  If the differences are vast (and I
get the feeling they might be if I read terms like "PHB" and
"EEH unfreeze", which sound completely foreign to me), it might
be easier to refactor pnv_php.c and copy patterns or code from
pciehp, than to integrate the functionality from pnv_php.c into
pciehp.

pciehp does carry some historic baggage of its own (such as poll mode),
which you may not want to deal with on PowerNV.

One thing I don't quite understand is, it sounds like you've
attached a PCIe switch to a Root Port and the hotplug ports
are on the PCIe switch.  Aren't those hotplug ports just
bog-standard ones that can be driven by pciehp?  My expectation
would have been that a PowerNV-specific hotplug driver would
only be necessary for hotplug-capable Root Ports.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-19  0:37     ` Timothy Pearson
@ 2025-06-20  9:26       ` Krishna Kumar
  2025-06-21  9:59         ` Lukas Wunner
  2025-06-21 15:05         ` Timothy Pearson
  2025-06-24 22:34       ` Bjorn Helgaas
  1 sibling, 2 replies; 31+ messages in thread
From: Krishna Kumar @ 2025-06-20  9:26 UTC (permalink / raw)
  To: linuxppc-dev, Timothy Pearson, Shawn Anastasio
  Cc: linuxppc-dev, linux-kernel, "linux-pci",,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, "Bjorn Helgaas",, Shawn Anastasio

Shawn, Timothy:

Thanks for posting the series of patch. Few things I wanted to do better understand Raptor problem -


1. Last time my two patches solved all the hotunplug operation and Kernel crash issue except nvme case. It did not work with

    NVME since dpc support was not there. I was not able to do that due to being  occupied in some other work.

2. I want to understand the delta from last yr problem to this problem. Is the PHB freeze or hotunplug failure happens

    only for particular Microsemi switch or it happens with all the switches. When did this problem started coming ? Till last yr 

    it was not there. Is it specific to particular Hardware ? Can I get your setup to test this problem and your patch ?

3. To me, hot unplug opertaion  --> AER triggering --> DPC support, this flow should mask the error to reach root port/cpu and it 

    should solve the PHB freeze/ hot unplug failure operation. If there are AER/EEH related synchronization issue we need to solve them. 

    Can you pls list the issue, I will pass it to EEH/AER team. But yeah, to me if AER implementation is correct and we add DPC support,

    all the error will be contained by switch itself. The PHB/root port/cpu will not be impacted by this and there should not be any freeze.

4. Ofcourse we can pick some of the fixes from pciehp driver if its missing in pnv_php.c. Also at the same time you have done

    some cleanup in hot unplug path and fixed the attenuation button related code. If these works fine, we can pick it. But I want to test it.

     Pls provide me setup.

5. If point 3 and 4 does not solve the problem, then only we should move to pciehp.c. But AFAIK, PPC/Powernv is DT based while pciehp.c

     may be only supporting acpi (I have to check it on this).  We need to provide PHB related information via DTB and maintain the related 

     topology information via dtb and then it can be doable. Also , we need to do thorough planning/testing if we think to choose pciehp.c. 

     But yeah, lets not jump here and lets try to fix the current issues via point 3 & 4. Point 5 will be our last option.


Best Regards,

Krishna




On 6/19/25 6:07 AM, Timothy Pearson wrote:
>
> ----- Original Message -----
>> From: "Bjorn Helgaas" <helgaas@kernel.org>
>> To: "Timothy Pearson" <tpearson@raptorengineering.com>
>> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
>> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
>> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
>> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
>> Sent: Wednesday, June 18, 2025 2:01:46 PM
>> Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
>> On Wed, Jun 18, 2025 at 11:58:59AM -0500, Timothy Pearson wrote:
>>>  state
>> Weird wrapping of last word of subject to here.
> I'll need to see what's up with my git format-patch setup. Apologies for that across the multiple series.
>
>>> The PCIe specification allows three attention indicator states,
>>> on, off, and blink.  Enable all three states instead of basic
>>> on / off control.
>>>
>>> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
>>> ---
>>>  drivers/pci/hotplug/pnv_php.c | 15 ++++++++++++++-
>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
>>> index 0ceb4a2c3c79..c3005324be3d 100644
>>> --- a/drivers/pci/hotplug/pnv_php.c
>>> +++ b/drivers/pci/hotplug/pnv_php.c
>>> @@ -440,10 +440,23 @@ static int pnv_php_get_adapter_state(struct hotplug_slot
>>> *slot, u8 *state)
>>>  	return ret;
>>>  }
>>>  
>>> +static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8
>>> *state)
>>> +{
>>> +	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
>>> +	struct pci_dev *bridge = php_slot->pdev;
>>> +	u16 status;
>>> +
>>> +	pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
>>> +	*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;
>> Should be able to do this with FIELD_GET().
> I used the same overall structure as the pciehp_hpc driver here.  Do you want me to also fix up that driver with FIELD_GET()?
>
>> Is the PCI_EXP_SLTCTL_PIC part needed?  It wasn't there before, commit
>> log doesn't mention it, and as far as I can tell, this would be the
>> only driver to do that.  Most expose only the attention status (0=off,
>> 1=on, 2=identify/blink).
>>
>>> +	return 0;
>>> +}
>>> +
>>> +
>>>  static int pnv_php_get_attention_state(struct hotplug_slot *slot, u8 *state)
>>>  {
>>>  	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
>>>  
>>> +	pnv_php_get_raw_indicator_status(slot, &php_slot->attention_state);
>> This is a change worth noting.  Previously we didn't read the AIC
>> state from PCI_EXP_SLTCTL at all; we used php_slot->attention_state to
>> keep track of whatever had been previously set via
>> pnv_php_set_attention_state().
>>
>> Now we read the current state from PCI_EXP_SLTCTL.  It's not clear
>> that php_slot->attention_state is still needed at all.
> It probably isn't.  It's unclear why IBM took this path at all, given pciehp's attention handlers predate pnv-php's by many years.
>
>> Previously, the user could write any value at all to the sysfs
>> "attention" file and then read that same value back.  After this
>> patch, the user can still write anything, but reads will only return
>> values with PCI_EXP_SLTCTL_AIC and PCI_EXP_SLTCTL_PIC.
>>
>>>  	*state = php_slot->attention_state;
>>>  	return 0;
>>>  }
>>> @@ -461,7 +474,7 @@ static int pnv_php_set_attention_state(struct hotplug_slot
>>> *slot, u8 state)
>>>  	mask = PCI_EXP_SLTCTL_AIC;
>>>  
>>>  	if (state)
>>> -		new = PCI_EXP_SLTCTL_ATTN_IND_ON;
>>> +		new = FIELD_PREP(PCI_EXP_SLTCTL_AIC, state);
>> This changes the behavior in some cases:
>>
>>  write 0: previously turned indicator off, now writes reserved value
>>  write 2: previously turned indicator on, now sets to blink
>>  write 3: previously turned indicator on, now turns it off
> If we're looking at normalizing with pciehp with an eye toward eventually deprecating / removing pnv-php, I can't think of a better time to change this behavior.  I suspect we're the only major user of this code path at the moment, with most software expecting to see pciehp-style handling.  Thoughts?
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken
  2025-06-20  7:52           ` Lukas Wunner
@ 2025-06-20 16:45             ` Timothy Pearson
  2025-06-25  8:45               ` Lukas Wunner
  0 siblings, 1 reply; 31+ messages in thread
From: Timothy Pearson @ 2025-06-20 16:45 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, linuxppc-dev, linux-kernel, linux-pci,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, Bjorn Helgaas, Shawn Anastasio



----- Original Message -----
> From: "Lukas Wunner" <lukas@wunner.de>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "Bjorn Helgaas" <helgaas@kernel.org>, "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel"
> <linux-kernel@vger.kernel.org>, "linux-pci" <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>,
> "Michael Ellerman" <mpe@ellerman.id.au>, "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao"
> <naveen@kernel.org>, "Bjorn Helgaas" <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
> Sent: Friday, June 20, 2025 2:52:55 AM
> Subject: Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken

> On Thu, Jun 19, 2025 at 02:29:33PM -0500, Timothy Pearson wrote:
>> To be perfectly frank the existing code quality in this driver
>> (and the associated EEH driver) is not the best, and it's been
>> a frustrating experience trying to hack it into semi-stable
>> operation.
>> 
>> I would vastly prefer to rewrite / integrate into the pciehp driver,
>> and we have plans to do so, but that will take an unacceptable amount
>> of time vs. trying to fix up the existing driver as a stopgap.
>> 
>> As you mentioned, pciehp already has this fix, so we just have to
>> deal with the duplicated code until we (Raptor) figures out how to
>> merge PowerNV support into pciehp.
> 
> I don't know how much PCIe hotplug on PowerNV differs from native,
> spec-compliant PCIe hotplug.  If the differences are vast (and I
> get the feeling they might be if I read terms like "PHB" and
> "EEH unfreeze", which sound completely foreign to me), it might
> be easier to refactor pnv_php.c and copy patterns or code from
> pciehp, than to integrate the functionality from pnv_php.c into
> pciehp.

The differences at the root level (PHB) are quite significant -- the controller is more advanced in many ways than standard PCIe root complexes [1] -- and those differences need very special handling.  Once we are looking at devices downstream of the root complex, standard PCIe hotplug and AER specifications apply, so we're in a strange spot of really wanting to use pciehp (to handle all nested downstream bridges), but pciehp still needs to understand how to deal with our unqiue root complex.

One idea I had was to use the existing modularity of pciehp's source and add a new pciehp_powernv.c file with all of our root complex handling methods.  We could then #ifdef swap the assocated root complex calls to that external file when compiled in PowerNV mode.

> pciehp does carry some historic baggage of its own (such as poll mode),
> which you may not want to deal with on PowerNV.

At the root level we probably don't care about polling mode, but in terms of nested downtream bridges polling may still be desireable.  I don't have a strong opinion either way.

> One thing I don't quite understand is, it sounds like you've
> attached a PCIe switch to a Root Port and the hotplug ports
> are on the PCIe switch.  Aren't those hotplug ports just
> bog-standard ones that can be driven by pciehp?  My expectation
> would have been that a PowerNV-specific hotplug driver would
> only be necessary for hotplug-capable Root Ports.

That's the problem -- the pciehp driver assumes x86 root ports, and the powernv driver ends up duplicating (badly) parts of the pciehp functionality for downstream bridges.  That's one reason I'd like to abstract the root port handling in pciehp and eventually move the PowerNV root port handling into that module.

> Thanks,
> 
> Lukas

[1] Among other interesting differences, it is both capable of and has been tested to properly block and report all invalid accesses from a PCIe device to system memory.  This breaks assumptions in many PCIe device drivers, but is also a significant security advantage.  EEH freeze is effectively this security mechanism kicking in -- on detecting an invalid access, the PHB itself will block the access and put the PE into a frozen state where no PCIe traffic is permitted.  It simultaneously rases an error to the host (colloquially referred to as EEH) and punts the decision making on whether to reset or resume the device to the kernel itself.  We've caught various cards doing fun things like reading host RAM or trying to write to 0x0 via this mechanism, one of the most egregious was a Chinese telecom card that apparently tries to reset the host with a write to low memory on detecting an interrupt servicing delay (!).

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-20  9:26       ` Krishna Kumar
@ 2025-06-21  9:59         ` Lukas Wunner
  2025-06-25  4:08           ` Krishna Kumar
  2025-06-21 15:05         ` Timothy Pearson
  1 sibling, 1 reply; 31+ messages in thread
From: Lukas Wunner @ 2025-06-21  9:59 UTC (permalink / raw)
  To: Krishna Kumar
  Cc: linuxppc-dev, Timothy Pearson, Shawn Anastasio, linux-kernel,
	"linux-pci",, Madhavan Srinivasan, Michael Ellerman,
	christophe leroy, Naveen N Rao, "Bjorn Helgaas",

On Fri, Jun 20, 2025 at 02:56:51PM +0530, Krishna Kumar wrote:
> 5. If point 3 and 4 does not solve the problem, then only we should
> move to pciehp.c. But AFAIK, PPC/Powernv is DT based while pciehp.c
> may be only supporting acpi (I have to check it on this). We need to
> provide PHB related information via DTB and maintain the related
> topology information via dtb and then it can be doable.

pciehp is not ACPI-specific.  The PCIe port service driver in
drivers/pci/pcie/portdrv.c binds to any PCIe port, examines the
port's capabilities (e.g. hotplug, AER, DPC, ...) and instantiates
sub-devices to which pciehp and the other drivers such as aer bind.

How do you prevent pciehp from driving hotplug on PowerNV anyway?
Do you disable CONFIG_HOTPLUG_PCI_PCIE?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-20  9:26       ` Krishna Kumar
  2025-06-21  9:59         ` Lukas Wunner
@ 2025-06-21 15:05         ` Timothy Pearson
  2025-06-24  7:07           ` Krishna Kumar
  1 sibling, 1 reply; 31+ messages in thread
From: Timothy Pearson @ 2025-06-21 15:05 UTC (permalink / raw)
  To: Krishna Kumar
  Cc: linuxppc-dev, Shawn Anastasio, linux-kernel, linux-pci,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, Bjorn Helgaas



----- Original Message -----
> From: "Krishna Kumar" <krishnak@linux.ibm.com>
> To: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "Timothy Pearson" <tpearson@raptorengineering.com>, "Shawn
> Anastasio" <sanastasio@raptorengineering.com>
> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
> Sent: Friday, June 20, 2025 4:26:51 AM
> Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator

> Shawn, Timothy:
> 
> Thanks for posting the series of patch. Few things I wanted to do better
> understand Raptor problem -
> 
> 
> 1. Last time my two patches solved all the hotunplug operation and Kernel crash
> issue except nvme case. It did not work with
> 
>    NVME since dpc support was not there. I was not able to do that due to being
>      occupied in some other work.

With the current series all hotplug is working correctly, including not only NVMe on root port and bridge ports, but also suprise plug of the entire PCIe switch at the root port.  The lack of DPC support *might* be related to the PE freeze, but in any case we prefer the hotplug driver to be able to recover from a PE freeze (e.g. if a bridge card is faulty and needs to be replaced) without also requiring a reboot, so I would consider DPC implementation orthogonal to this patch set.

> 2. I want to understand the delta from last yr problem to this problem. Is the
> PHB freeze or hotunplug failure happens
> 
>    only for particular Microsemi switch or it happens with all the switches. When
>    did this problem started coming ? Till last yr

Hotplug has never worked reliably for us, if it worked at all it was always rolling the dice on whether the kernel would oops and take down the host.  Even if the kernel didn't oops, suprise plug and auto-add / auto-remove never worked beyond one remove operation.

>    it was not there. Is it specific to particular Hardware ? Can I get your setup
>    to test this problem and your patch ?

Because you will need to be able to physically plug and unplug cards and drives this may be a bit tricky.  Do you have access to a POWER9 host system with a x16 PCIe slot?  If so, all you need is a Supermicro SLC-AO3G-8E2P card and some random U.2 NVMe drives -- these cards are readily available and provide relatively standardized OCuLink access to a Switchtec bridge.

If you don't have access to a POWER9 host, we can set you up with remote access, but it won't show all of the crashing and problems that occur with surprise plug unless we set up a live debug session (video call or similar).

> 3. To me, hot unplug opertaion  --> AER triggering --> DPC support, this flow
> should mask the error to reach root port/cpu and it
> 
>    should solve the PHB freeze/ hot unplug failure operation. If there are AER/EEH
>    related synchronization issue we need to solve them.
> 
>    Can you pls list the issue, I will pass it to EEH/AER team. But yeah, to me if
>    AER implementation is correct and we add DPC support,
> 
>    all the error will be contained by switch itself. The PHB/root port/cpu will not
>    be impacted by this and there should not be any freeze.

While this is a good goal to work toward, it only solves one possible fault mode.  The patch series posted here will handle the general case of a PE freeze without requiring a host reboot, which is great for high-reliability systems where there might be a desire to replace the entire switch card (this has been tested with the patch series and works perfectly).

> 4. Ofcourse we can pick some of the fixes from pciehp driver if its missing in
> pnv_php.c. Also at the same time you have done
> 
>    some cleanup in hot unplug path and fixed the attenuation button related code.
>    If these works fine, we can pick it. But I want to test it.
> 
>     Pls provide me setup.
> 
> 5. If point 3 and 4 does not solve the problem, then only we should move to
> pciehp.c. But AFAIK, PPC/Powernv is DT based while pciehp.c
> 
>     may be only supporting acpi (I have to check it on this).  We need to provide
>     PHB related information via DTB and maintain the related
> 
>     topology information via dtb and then it can be doable. Also , we need to do
>     thorough planning/testing if we think to choose pciehp.c.
> 
>     But yeah, lets not jump here and lets try to fix the current issues via point 3
>     & 4. Point 5 will be our last option.

If possible I would like to see this series merged vs. being blocked on DPC.  Again, from where I sit DPC is orthogonal; many events can cause a PE freeze and implementing DPC only solves one.  We do *not* want to require a host reboot in any situation whatsoever short of a complete failure of a critical element (e.g. the PHB itself or a CPU package); our use case as deployed is five nines critical infrastructure, and the broken hotplug has already been the sole reason we have not maintained 100% uptime on a key system.

Thanks!

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-21 15:05         ` Timothy Pearson
@ 2025-06-24  7:07           ` Krishna Kumar
  2025-06-24 16:34             ` Timothy Pearson
  0 siblings, 1 reply; 31+ messages in thread
From: Krishna Kumar @ 2025-06-24  7:07 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, Shawn Anastasio, linux-kernel, linux-pci,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, Bjorn Helgaas


On 6/21/25 8:35 PM, Timothy Pearson wrote:
>
> ----- Original Message -----
>> From: "Krishna Kumar" <krishnak@linux.ibm.com>
>> To: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "Timothy Pearson" <tpearson@raptorengineering.com>, "Shawn
>> Anastasio" <sanastasio@raptorengineering.com>
>> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
>> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
>> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
>> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
>> Sent: Friday, June 20, 2025 4:26:51 AM
>> Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
>> Shawn, Timothy:
>>
>> Thanks for posting the series of patch. Few things I wanted to do better
>> understand Raptor problem -
>>
>>
>> 1. Last time my two patches solved all the hotunplug operation and Kernel crash
>> issue except nvme case. It did not work with
>>
>>     NVME since dpc support was not there. I was not able to do that due to being
>>       occupied in some other work.
> With the current series all hotplug is working correctly, including not only NVMe on root port and bridge ports, but also suprise plug of the entire PCIe switch at the root port.  The lack of DPC support *might* be related to the PE freeze, but in any case we prefer the hotplug driver to be able to recover from a PE freeze (e.g. if a bridge card is faulty and needs to be replaced) without also requiring a reboot, so I would consider DPC implementation orthogonal to this patch set.
Sounds Good !!
>
>> 2. I want to understand the delta from last yr problem to this problem. Is the
>> PHB freeze or hotunplug failure happens
>>
>>     only for particular Microsemi switch or it happens with all the switches. When
>>     did this problem started coming ? Till last yr
> Hotplug has never worked reliably for us, if it worked at all it was always rolling the dice on whether the kernel would oops and take down the host.  Even if the kernel didn't oops, suprise plug and auto-add / auto-remove never worked beyond one remove operation.
I would like to see this problem may be during our zoom/teams meeting. Though I have not tested surprise plug/unplug and only tested via sysfs, you may be correct but I want to have a look of this problem.
>
>>     it was not there. Is it specific to particular Hardware ? Can I get your setup
>>     to test this problem and your patch ?
> Because you will need to be able to physically plug and unplug cards and drives this may be a bit tricky.  Do you have access to a POWER9 host system with a x16 PCIe slot?  If so, all you need is a Supermicro SLC-AO3G-8E2P card and some random U.2 NVMe drives -- these cards are readily available and provide relatively standardized OCuLink access to a Switchtec bridge.
>
> If you don't have access to a POWER9 host, we can set you up with remote access, but it won't show all of the crashing and problems that occur with surprise plug unless we set up a live debug session (video call or similar).


Video Call should be fine. During the call I will have a look of existing problem and fix along with driver/kernel logs.

>
>> 3. To me, hot unplug opertaion  --> AER triggering --> DPC support, this flow
>> should mask the error to reach root port/cpu and it
>>
>>     should solve the PHB freeze/ hot unplug failure operation. If there are AER/EEH
>>     related synchronization issue we need to solve them.
>>
>>     Can you pls list the issue, I will pass it to EEH/AER team. But yeah, to me if
>>     AER implementation is correct and we add DPC support,
>>
>>     all the error will be contained by switch itself. The PHB/root port/cpu will not
>>     be impacted by this and there should not be any freeze.
> While this is a good goal to work toward, it only solves one possible fault mode.  The patch series posted here will handle the general case of a PE freeze without requiring a host reboot, which is great for high-reliability systems where there might be a desire to replace the entire switch card (this has been tested with the patch series and works perfectly).


You may be correct on this and this is possible. If the driver and AER/EEH errors/events are properly 

handled then we may not need DPC in all cases. The point of DPC was to absorb the error at switch port 

itself so that it will not reach to PHB/Root-port/Cpu and will avoid further corruption. I was hoping that if 

DPC gets enabled, we may not need explicit reboot for drives to come up in case of surprise hot unplug.

But yeah, we can compare this with current result when this support will be enabled.

>
>> 4. Ofcourse we can pick some of the fixes from pciehp driver if its missing in
>> pnv_php.c. Also at the same time you have done
>>
>>     some cleanup in hot unplug path and fixed the attenuation button related code.
>>     If these works fine, we can pick it. But I want to test it.
>>
>>      Pls provide me setup.
>>
>> 5. If point 3 and 4 does not solve the problem, then only we should move to
>> pciehp.c. But AFAIK, PPC/Powernv is DT based while pciehp.c
>>
>>      may be only supporting acpi (I have to check it on this).  We need to provide
>>      PHB related information via DTB and maintain the related
>>
>>      topology information via dtb and then it can be doable. Also , we need to do
>>      thorough planning/testing if we think to choose pciehp.c.
>>
>>      But yeah, lets not jump here and lets try to fix the current issues via point 3
>>      & 4. Point 5 will be our last option.
> If possible I would like to see this series merged vs. being blocked on DPC.  Again, from where I sit DPC is orthogonal; many events can cause a PE freeze and implementing DPC only solves one.  We do *not* want to require a host reboot in any situation whatsoever short of a complete failure of a critical element (e.g. the PHB itself or a CPU package); our use case as deployed is five nines critical infrastructure, and the broken hotplug has already been the sole reason we have not maintained 100% uptime on a key system.

If you are in hurry and want to defer DPC for some time, I am fine with it since it serves larger purpose like PE freeze and NVME drives working 

along with surprise hotplug fixes.  I have gone through your pnv_php.c changes and I am mostly fine with it. But, I would like to review it again 

from larger prespective w.r.t to EEH & pciehp.c, so give me some time.  Also, if possible you can show me 

the problem/fix along with log  during video call. it would be great if we can meet sometimes next month in early first week may be on 5th of July.

I will request few of the EEH/AER developer to have a look into the patch and to join the meeting if they have bandwidth. Please shoot the 

mail/invite on krishna.kumar11@ibm.com along with this email id. I am based in Bangalore but can be available till night 10:00 pm.

>
> Thanks!


Thanks & Best Regards,

Krishna

>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-24  7:07           ` Krishna Kumar
@ 2025-06-24 16:34             ` Timothy Pearson
  0 siblings, 0 replies; 31+ messages in thread
From: Timothy Pearson @ 2025-06-24 16:34 UTC (permalink / raw)
  To: Krishna Kumar
  Cc: linuxppc-dev, Shawn Anastasio, linux-kernel, linux-pci,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, Bjorn Helgaas



----- Original Message -----
> From: "Krishna Kumar" <krishnak@linux.ibm.com>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "Shawn Anastasio" <sanastasio@raptorengineering.com>, "linux-kernel"
> <linux-kernel@vger.kernel.org>, "linux-pci" <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>,
> "Michael Ellerman" <mpe@ellerman.id.au>, "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao"
> <naveen@kernel.org>, "Bjorn Helgaas" <bhelgaas@google.com>
> Sent: Tuesday, June 24, 2025 2:07:30 AM
> Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator

> On 6/21/25 8:35 PM, Timothy Pearson wrote:
>>
>> ----- Original Message -----
>>> From: "Krishna Kumar" <krishnak@linux.ibm.com>
>>> To: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "Timothy Pearson"
>>> <tpearson@raptorengineering.com>, "Shawn
>>> Anastasio" <sanastasio@raptorengineering.com>
>>> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel"
>>> <linux-kernel@vger.kernel.org>, "linux-pci"
>>> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>,
>>> "Michael Ellerman" <mpe@ellerman.id.au>,
>>> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao"
>>> <naveen@kernel.org>, "Bjorn Helgaas"
>>> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
>>> Sent: Friday, June 20, 2025 4:26:51 AM
>>> Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention
>>> indicator
>>> Shawn, Timothy:
>>>
>>> Thanks for posting the series of patch. Few things I wanted to do better
>>> understand Raptor problem -
>>>
>>>
>>> 1. Last time my two patches solved all the hotunplug operation and Kernel crash
>>> issue except nvme case. It did not work with
>>>
>>>     NVME since dpc support was not there. I was not able to do that due to being
>>>       occupied in some other work.
>> With the current series all hotplug is working correctly, including not only
>> NVMe on root port and bridge ports, but also suprise plug of the entire PCIe
>> switch at the root port.  The lack of DPC support *might* be related to the PE
>> freeze, but in any case we prefer the hotplug driver to be able to recover from
>> a PE freeze (e.g. if a bridge card is faulty and needs to be replaced) without
>> also requiring a reboot, so I would consider DPC implementation orthogonal to
>> this patch set.
> Sounds Good !!
>>
>>> 2. I want to understand the delta from last yr problem to this problem. Is the
>>> PHB freeze or hotunplug failure happens
>>>
>>>     only for particular Microsemi switch or it happens with all the switches. When
>>>     did this problem started coming ? Till last yr
>> Hotplug has never worked reliably for us, if it worked at all it was always
>> rolling the dice on whether the kernel would oops and take down the host.  Even
>> if the kernel didn't oops, suprise plug and auto-add / auto-remove never worked
>> beyond one remove operation.
> I would like to see this problem may be during our zoom/teams meeting. Though I
> have not tested surprise plug/unplug and only tested via sysfs, you may be
> correct but I want to have a look of this problem.
>>
>>>     it was not there. Is it specific to particular Hardware ? Can I get your setup
>>>     to test this problem and your patch ?
>> Because you will need to be able to physically plug and unplug cards and drives
>> this may be a bit tricky.  Do you have access to a POWER9 host system with a
>> x16 PCIe slot?  If so, all you need is a Supermicro SLC-AO3G-8E2P card and some
>> random U.2 NVMe drives -- these cards are readily available and provide
>> relatively standardized OCuLink access to a Switchtec bridge.
>>
>> If you don't have access to a POWER9 host, we can set you up with remote access,
>> but it won't show all of the crashing and problems that occur with surprise
>> plug unless we set up a live debug session (video call or similar).
> 
> 
> Video Call should be fine. During the call I will have a look of existing
> problem and fix along with driver/kernel logs.

Sounds good.  We'll set up a machine in the DMZ for this session so you can also have access.  For anyone interested in logging on to the box for logs, can you send over an SSH public key to my Email address directly?  Will get everyone added with root access to the test box prior to the call start.

>>
>>> 3. To me, hot unplug opertaion  --> AER triggering --> DPC support, this flow
>>> should mask the error to reach root port/cpu and it
>>>
>>>     should solve the PHB freeze/ hot unplug failure operation. If there are AER/EEH
>>>     related synchronization issue we need to solve them.
>>>
>>>     Can you pls list the issue, I will pass it to EEH/AER team. But yeah, to me if
>>>     AER implementation is correct and we add DPC support,
>>>
>>>     all the error will be contained by switch itself. The PHB/root port/cpu will not
>>>     be impacted by this and there should not be any freeze.
>> While this is a good goal to work toward, it only solves one possible fault
>> mode.  The patch series posted here will handle the general case of a PE freeze
>> without requiring a host reboot, which is great for high-reliability systems
>> where there might be a desire to replace the entire switch card (this has been
>> tested with the patch series and works perfectly).
> 
> 
> You may be correct on this and this is possible. If the driver and AER/EEH
> errors/events are properly
> 
> handled then we may not need DPC in all cases. The point of DPC was to absorb
> the error at switch port
> 
> itself so that it will not reach to PHB/Root-port/Cpu and will avoid further
> corruption. I was hoping that if
> 
> DPC gets enabled, we may not need explicit reboot for drives to come up in case
> of surprise hot unplug.

I do understand the logic here, and it would theoretically work, but again it's a bit more fragile than the solution we're presenting here in that it relies on another chunk of device logic to work correctly in all cases, with the consequence of a failure being a forced reboot.

With our patch series here we can hot plug and hot unplug NVMe drives all day without requiring any reboots, including surprise plug.  DPC would simply make this process a little bit faster, in that we don't have to wait a few hundred milliseconds for the PE to unfreeze and the EEH driver to give up.

> But yeah, we can compare this with current result when this support will be
> enabled.
> 
>>
>>> 4. Ofcourse we can pick some of the fixes from pciehp driver if its missing in
>>> pnv_php.c. Also at the same time you have done
>>>
>>>     some cleanup in hot unplug path and fixed the attenuation button related code.
>>>     If these works fine, we can pick it. But I want to test it.
>>>
>>>      Pls provide me setup.
>>>
>>> 5. If point 3 and 4 does not solve the problem, then only we should move to
>>> pciehp.c. But AFAIK, PPC/Powernv is DT based while pciehp.c
>>>
>>>      may be only supporting acpi (I have to check it on this).  We need to provide
>>>      PHB related information via DTB and maintain the related
>>>
>>>      topology information via dtb and then it can be doable. Also , we need to do
>>>      thorough planning/testing if we think to choose pciehp.c.
>>>
>>>      But yeah, lets not jump here and lets try to fix the current issues via point 3
>>>      & 4. Point 5 will be our last option.
>> If possible I would like to see this series merged vs. being blocked on DPC.
>> Again, from where I sit DPC is orthogonal; many events can cause a PE freeze
>> and implementing DPC only solves one.  We do *not* want to require a host
>> reboot in any situation whatsoever short of a complete failure of a critical
>> element (e.g. the PHB itself or a CPU package); our use case as deployed is
>> five nines critical infrastructure, and the broken hotplug has already been the
>> sole reason we have not maintained 100% uptime on a key system.
> 
> If you are in hurry and want to defer DPC for some time, I am fine with it since
> it serves larger purpose like PE freeze and NVME drives working
> 
> along with surprise hotplug fixes.  I have gone through your pnv_php.c changes
> and I am mostly fine with it. But, I would like to review it again
> 
> from larger prespective w.r.t to EEH & pciehp.c, so give me some time.

Sure, please let me know if anything is concerning.  My goal would be to have this reviewed from a code quality perspective and a v2 posted at least a couple of days before the video call.

Also, if
> possible you can show me
> 
> the problem/fix along with log  during video call. it would be great if we can
> meet sometimes next month in early first week may be on 5th of July.

Let's plan for that -- this is somewhat urgent with Debian Trixie being released soon, and I want to see this repair backported.  We already ship Bookworm Debian kernels with completely broken VFIO and hotplug, our goal is to get that all fixed for Trixie and enable the functionality our customers are paying for.

> I will request few of the EEH/AER developer to have a look into the patch and to
> join the meeting if they have bandwidth. Please shoot the
> 
> mail/invite on krishna.kumar11@ibm.com along with this email id. I am based in
> Bangalore but can be available till night 10:00 pm.

Will do, thanks!

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-19  0:37     ` Timothy Pearson
  2025-06-20  9:26       ` Krishna Kumar
@ 2025-06-24 22:34       ` Bjorn Helgaas
  2025-07-07  8:01         ` Krishna Kumar
  1 sibling, 1 reply; 31+ messages in thread
From: Bjorn Helgaas @ 2025-06-24 22:34 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio

On Wed, Jun 18, 2025 at 07:37:54PM -0500, Timothy Pearson wrote:
> ----- Original Message -----
> > From: "Bjorn Helgaas" <helgaas@kernel.org>
> > To: "Timothy Pearson" <tpearson@raptorengineering.com>
> > Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> > <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> > "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> > <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
> > Sent: Wednesday, June 18, 2025 2:01:46 PM
> > Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
> 
> > On Wed, Jun 18, 2025 at 11:58:59AM -0500, Timothy Pearson wrote:
> >>  state
> > 
> > Weird wrapping of last word of subject to here.
> 
> I'll need to see what's up with my git format-patch setup. Apologies
> for that across the multiple series.

No worries.  If you can figure out how to make your mailer use the
normal "On xxx, somebody wrote:" attribution instead of duplicating
all those headers, that would be far more useful :)

> >> +static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8
> >> *state)
> >> +{
> >> +	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
> >> +	struct pci_dev *bridge = php_slot->pdev;
> >> +	u16 status;
> >> +
> >> +	pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
> >> +	*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;
> > 
> > Should be able to do this with FIELD_GET().
> 
> I used the same overall structure as the pciehp_hpc driver here.  Do
> you want me to also fix up that driver with FIELD_GET()?

Nope, I think it's fine to keep this looking like pciehp for now.
If somebody wants to use FIELD_GET() in pciehp, I'd probably be OK
with that, but no need for you to open that can of worms.

> > Is the PCI_EXP_SLTCTL_PIC part needed?  It wasn't there before, commit
> > log doesn't mention it, and as far as I can tell, this would be the
> > only driver to do that.  Most expose only the attention status (0=off,
> > 1=on, 2=identify/blink).
> > 
> >> +	return 0;
> >> +}
> >> +
> >> +
> >>  static int pnv_php_get_attention_state(struct hotplug_slot *slot, u8 *state)
> >>  {
> >>  	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
> >>  
> >> +	pnv_php_get_raw_indicator_status(slot, &php_slot->attention_state);
> > 
> > This is a change worth noting.  Previously we didn't read the AIC
> > state from PCI_EXP_SLTCTL at all; we used php_slot->attention_state to
> > keep track of whatever had been previously set via
> > pnv_php_set_attention_state().
> > 
> > Now we read the current state from PCI_EXP_SLTCTL.  It's not clear
> > that php_slot->attention_state is still needed at all.
> 
> It probably isn't.  It's unclear why IBM took this path at all,
> given pciehp's attention handlers predate pnv-php's by many years.
> 
> > Previously, the user could write any value at all to the sysfs
> > "attention" file and then read that same value back.  After this
> > patch, the user can still write anything, but reads will only return
> > values with PCI_EXP_SLTCTL_AIC and PCI_EXP_SLTCTL_PIC.
> > 
> >>  	*state = php_slot->attention_state;
> >>  	return 0;
> >>  }
> >> @@ -461,7 +474,7 @@ static int pnv_php_set_attention_state(struct hotplug_slot
> >> *slot, u8 state)
> >>  	mask = PCI_EXP_SLTCTL_AIC;
> >>  
> >>  	if (state)
> >> -		new = PCI_EXP_SLTCTL_ATTN_IND_ON;
> >> +		new = FIELD_PREP(PCI_EXP_SLTCTL_AIC, state);
> > 
> > This changes the behavior in some cases:
> > 
> >  write 0: previously turned indicator off, now writes reserved value
> >  write 2: previously turned indicator on, now sets to blink
> >  write 3: previously turned indicator on, now turns it off
> 
> If we're looking at normalizing with pciehp with an eye toward
> eventually deprecating / removing pnv-php, I can't think of a better
> time to change this behavior.  I suspect we're the only major user
> of this code path at the moment, with most software expecting to see
> pciehp-style handling.  Thoughts?

I'm OK with changing this, but I do think it would be worth calling
out the different behavior in the commit log.

Bjorn

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-21  9:59         ` Lukas Wunner
@ 2025-06-25  4:08           ` Krishna Kumar
  2025-06-25  8:08             ` Lukas Wunner
  0 siblings, 1 reply; 31+ messages in thread
From: Krishna Kumar @ 2025-06-25  4:08 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linuxppc-dev, Timothy Pearson, Shawn Anastasio, linux-kernel,
	"linux-pci",, Madhavan Srinivasan, Michael Ellerman,
	christophe leroy, Naveen N Rao, "Bjorn Helgaas",


On 6/21/25 3:29 PM, Lukas Wunner wrote:
> On Fri, Jun 20, 2025 at 02:56:51PM +0530, Krishna Kumar wrote:
>> 5. If point 3 and 4 does not solve the problem, then only we should
>> move to pciehp.c. But AFAIK, PPC/Powernv is DT based while pciehp.c
>> may be only supporting acpi (I have to check it on this). We need to
>> provide PHB related information via DTB and maintain the related
>> topology information via dtb and then it can be doable.
> pciehp is not ACPI-specific.  The PCIe port service driver in
> drivers/pci/pcie/portdrv.c binds to any PCIe port, examines the
> port's capabilities (e.g. hotplug, AER, DPC, ...) and instantiates
> sub-devices to which pciehp and the other drivers such as aer bind.

Good to know and thanks for the info. As I already told I did not go through pciehp.c,

and its internal working (since I did not needed it) I can only comment in concrete manner 

once I am done with its code review and internal logic.

What my concern was, you can comment on below point if I am wrong (I will learn something) -

1. If we get PHB info from mmcfg via acpi table in x86 and create a root port from there with 

some address/entity and if this Acpi and associated entity is not present for PPC, then it can be a problem.

2. PPC is normally based on DTB entity and it identifies PHB and pcie devices from there. If this all the information 

is correctly map via portdrv.c then there is no problem and whatever you are telling is correct and it will work.

3. But if point 2 is not handled correctly we need to just aligned with port related data structure to make it work. 

Not a big deal but need to put thorough logic and testing effort. If its already in place, we are good.


>
> How do you prevent pciehp from driving hotplug on PowerNV anyway?
> Do you disable CONFIG_HOTPLUG_PCI_PCIE?

I am not sure if at present pciehp is used for Powenv, and PPC uses this config or it has disabled it (since we rely on pnvphp.c for our hotplug operation, I did not checked it). 

If its going to work on PPC by enabling config and its giving correct output, I dont see any reason to not using it as an 

alternate driver where pnvphp.c fails. But yeah, as I told I can commnet on this once I do some experiment (going through 

the code and some testing). But yeah, its always good to hear from you. Thanks a bunch !!

>
> Thanks,
>
> Lukas

Best regards,

Krishna


>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-25  4:08           ` Krishna Kumar
@ 2025-06-25  8:08             ` Lukas Wunner
  2025-06-25 10:55               ` Krishna Kumar
  0 siblings, 1 reply; 31+ messages in thread
From: Lukas Wunner @ 2025-06-25  8:08 UTC (permalink / raw)
  To: Krishna Kumar
  Cc: linuxppc-dev, Timothy Pearson, Shawn Anastasio, linux-kernel,
	"linux-pci",, Madhavan Srinivasan, Michael Ellerman,
	christophe leroy, Naveen N Rao, "Bjorn Helgaas",

On Wed, Jun 25, 2025 at 09:38:19AM +0530, Krishna Kumar wrote:
> On 6/21/25 3:29 PM, Lukas Wunner wrote:
> > On Fri, Jun 20, 2025 at 02:56:51PM +0530, Krishna Kumar wrote:
> > > 5. If point 3 and 4 does not solve the problem, then only we should
> > > move to pciehp.c. But AFAIK, PPC/Powernv is DT based while pciehp.c
> > > may be only supporting acpi (I have to check it on this). We need to
> > > provide PHB related information via DTB and maintain the related
> > > topology information via dtb and then it can be doable.
> > 
> > pciehp is not ACPI-specific.  The PCIe port service driver in
> > drivers/pci/pcie/portdrv.c binds to any PCIe port, examines the
> > port's capabilities (e.g. hotplug, AER, DPC, ...) and instantiates
> > sub-devices to which pciehp and the other drivers such as aer bind.
> 
> 1. If we get PHB info from mmcfg via acpi table in x86 and create a
>    root port from there with some address/entity and if this Acpi and
>    associated entity is not present for PPC, then it can be a problem.
> 
> 2. PPC is normally based on DTB entity and it identifies PHB and pcie
>    devices from there. If this all the information is correctly map
>    via portdrv.c then there is no problem and whatever you are telling
>    is correct and it will work.
> 
> 3. But if point 2 is not handled correctly we need to just aligned with
>    port related data structure to make it work.

PCI devices do not have to be enumerated in the devicetree (or in ACPI
DSDT) because PCI is an enumerable bus (like USB).  Only the host bridge
has to be enumerated in the devicetree or DSDT.  The kernel can find the
PCI devices below the host bridge itself.  Hot-plugged devices are
usually not described in the devicetree or DSDT because one doesn't
know their properties in advance.

pnv_php.c seems to search the devicetree for hotplug slots and
instantiates them.  My expectation would be that any hotplug-capable
PCIe Root Port or Downstream Port, which is *not* described in the
devicetree such that pnv_php.c creates a slot for it, is handled by
pciehp.

Timothy was talking about a Microsemi PCIe switch below the Root Port.
My understanding is that the Downstream Ports of that switch are
hotplug-capable.  So unless you've disabled CONFIG_HOTPLUG_PCI_PCIE,
I'd expect those ports to be handled by pciehp.  Assuming they're not
described as a "ibm,ioda2-phb" compatible device in the devicetree,
but why would they?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken
  2025-06-20 16:45             ` Timothy Pearson
@ 2025-06-25  8:45               ` Lukas Wunner
  0 siblings, 0 replies; 31+ messages in thread
From: Lukas Wunner @ 2025-06-25  8:45 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: Bjorn Helgaas, linuxppc-dev, linux-kernel, linux-pci,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, Bjorn Helgaas, Shawn Anastasio

On Fri, Jun 20, 2025 at 11:45:45AM -0500, Timothy Pearson wrote:
> From: "Lukas Wunner" <lukas@wunner.de>
> > I don't know how much PCIe hotplug on PowerNV differs from native,
> > spec-compliant PCIe hotplug.  If the differences are vast (and I
> > get the feeling they might be if I read terms like "PHB" and
> > "EEH unfreeze", which sound completely foreign to me), it might
> > be easier to refactor pnv_php.c and copy patterns or code from
> > pciehp, than to integrate the functionality from pnv_php.c into
> > pciehp.
> 
> The differences at the root level (PHB) are quite significant -- the
> controller is more advanced in many ways than standard PCIe root
> complexes [1] -- and those differences need very special handling.
> Once we are looking at devices downstream of the root complex,
> standard PCIe hotplug and AER specifications apply, so we're in a
> strange spot of really wanting to use pciehp (to handle all nested
> downstream bridges), but pciehp still needs to understand how to deal
> with our unqiue root complex.
> 
> One idea I had was to use the existing modularity of pciehp's source
> and add a new pciehp_powernv.c file with all of our root complex
> handling methods.  We could then #ifdef swap the assocated root complex
> calls to that external file when compiled in PowerNV mode.

We've traditionally dealt with such issues by inserting pcibios_*()
hooks in generic code, with a __weak implementation (which is usually
an empty stub) and a custom implementation in arch/powerpc/.

The linker then chooses the custom implementation over the __weak one.

You can find the existing hooks with:

git grep "__weak .*pcibios" -- drivers/pci
git grep pcibios -- arch/powerpc

An alternative method is to add a callback to struct pci_host_bridge.

> > One thing I don't quite understand is, it sounds like you've
> > attached a PCIe switch to a Root Port and the hotplug ports
> > are on the PCIe switch.  Aren't those hotplug ports just
> > bog-standard ones that can be driven by pciehp?  My expectation
> > would have been that a PowerNV-specific hotplug driver would
> > only be necessary for hotplug-capable Root Ports.
> 
> That's the problem -- the pciehp driver assumes x86 root ports,

Nah, there's nothing x86-specific about it.  pciehp is used on all
kinds of arches, it's just an implementation of the PCIe Base Spec.

> and the powernv driver ends up duplicating (badly) parts of the pciehp
> functionality for downstream bridges.  That's one reason I'd like to
> abstract the root port handling in pciehp and eventually move the
> PowerNV root port handling into that module.

Sounds like you're currently describing the Switch Downstream Ports with
an "ibm,ioda-phb2" compatible string in the devicetree?  That would feel
wrong since they're not host bridges.

> [1] Among other interesting differences, it is both capable of and has
> been tested to properly block and report all invalid accesses from a
> PCIe device to system memory.  This breaks assumptions in many PCIe
> device drivers, but is also a significant security advantage.
> EEH freeze is effectively this security mechanism kicking in -- on
> detecting an invalid access, the PHB itself will block the access and
> put the PE into a frozen state where no PCIe traffic is permitted.

That sounds somewhat similar to what the PCIe Access Control Services
Capability provides.  I think at least some IOMMUs support raising an
exception on illegal accesses, so the EEH functionality is probably not
*that* special.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-25  8:08             ` Lukas Wunner
@ 2025-06-25 10:55               ` Krishna Kumar
  0 siblings, 0 replies; 31+ messages in thread
From: Krishna Kumar @ 2025-06-25 10:55 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linuxppc-dev, Timothy Pearson, Shawn Anastasio, linux-kernel,
	"linux-pci",, Madhavan Srinivasan, Michael Ellerman,
	christophe leroy, Naveen N Rao, "Bjorn Helgaas",


On 6/25/25 1:38 PM, Lukas Wunner wrote:
> On Wed, Jun 25, 2025 at 09:38:19AM +0530, Krishna Kumar wrote:
>> On 6/21/25 3:29 PM, Lukas Wunner wrote:
>>> On Fri, Jun 20, 2025 at 02:56:51PM +0530, Krishna Kumar wrote:
>>>> 5. If point 3 and 4 does not solve the problem, then only we should
>>>> move to pciehp.c. But AFAIK, PPC/Powernv is DT based while pciehp.c
>>>> may be only supporting acpi (I have to check it on this). We need to
>>>> provide PHB related information via DTB and maintain the related
>>>> topology information via dtb and then it can be doable.
>>> pciehp is not ACPI-specific.  The PCIe port service driver in
>>> drivers/pci/pcie/portdrv.c binds to any PCIe port, examines the
>>> port's capabilities (e.g. hotplug, AER, DPC, ...) and instantiates
>>> sub-devices to which pciehp and the other drivers such as aer bind.
>> 1. If we get PHB info from mmcfg via acpi table in x86 and create a
>>    root port from there with some address/entity and if this Acpi and
>>    associated entity is not present for PPC, then it can be a problem.
>>
>> 2. PPC is normally based on DTB entity and it identifies PHB and pcie
>>    devices from there. If this all the information is correctly map
>>    via portdrv.c then there is no problem and whatever you are telling
>>    is correct and it will work.
>>
>> 3. But if point 2 is not handled correctly we need to just aligned with
>>    port related data structure to make it work.
> PCI devices do not have to be enumerated in the devicetree (or in ACPI
> DSDT) because PCI is an enumerable bus (like USB).  Only the host bridge
> has to be enumerated in the devicetree or DSDT.  The kernel can find the
> PCI devices below the host bridge itself.
Yes in DFS manner (once it gets to know the PHB address via -acpi in X86 and via DTB in PPC)
>   Hot-plugged devices are
> usually not described in the devicetree or DSDT because one doesn't
> know their properties in advance.
>
> pnv_php.c seems to search the devicetree for hotplug slots and
> instantiates them.  My expectation would be that any hotplug-capable
> PCIe Root Port or Downstream Port, which is *not* described in the
> devicetree such that pnv_php.c creates a slot for it, is handled by
> pciehp.
Your are correct, pnv_php.c heavily depends on slot id and DTB nodes. Thats how its designed. Can we decouple it via DTB nodes, I will come back on this.
>
> Timothy was talking about a Microsemi PCIe switch below the Root Port.
> My understanding is that the Downstream Ports of that switch are
> hotplug-capable. 
I understand it.
>  So unless you've disabled CONFIG_HOTPLUG_PCI_PCIE,
> I'd expect those ports to be handled by pciehp. 
I need to check it and test it, but yeah- it may or maynot work and I will confirm it only after some study and testing (maybe in 1-2 week)
>  Assuming they're not
> described as a "ibm,ioda2-phb" compatible device in the devicetree,
> but why would they?
HW topology and DTB is based on IODA and it will keep changing, not frequently but eventually.
>
> Thanks,
>
> Lukas


Best Regards,

Krishna


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-06-24 22:34       ` Bjorn Helgaas
@ 2025-07-07  8:01         ` Krishna Kumar
  2025-07-11 18:18           ` Timothy Pearson
  0 siblings, 1 reply; 31+ messages in thread
From: Krishna Kumar @ 2025-07-07  8:01 UTC (permalink / raw)
  To: Bjorn Helgaas, Timothy Pearson
  Cc: linuxppc-dev, linux-kernel, linux-pci, Madhavan Srinivasan,
	Michael Ellerman, christophe leroy, Naveen N Rao, Bjorn Helgaas,
	Shawn Anastasio, Lukas Wunner

Thanks all for the review and Thanks a bunch to Timothy for fixing the PE Freeze issue. The hotplug issues are like you fix N issue and N+1 th issue will come with new HW.

We had a meeting of around 1.5 -2.0 hr with demo, code review and log review and we decided to let these fixes go ahead. I am consolidating what we discussed -


1. Let these fixes go ahead as it solves wider and much needed customer issue and its urgently needed for time being. We have verified in live demo that nothing is broken from legacy flow and 

PE/PHB freeze, race condition, hung, oops etc has been solved correctly. Basically it fixes below issues -

root-port(phb) -> switch(having4 port)--> 2 nvme devices

1st case - only removal of EP-nvme device (surprise hotplug disables msi at root port), soft removal may work
2nd case  - If we remove switch itself (surprise hotplug disable msi at root port) .


Some explanation of problem -

PHB Freeze (Interrupt is not reaching there) - Fence - Need to reset ||
EP device on removal generated msi - goes to cpu (via root port and then apic mmio region to cpu ),
PHB/root port itself got freeze and cpu never get interrupt - No wq/event going to work - driver is noit working


One area what I thought to fix it with OPAL call is below piece of code-

ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
+    if (ret) {
+        SLOT_WARN(php_slot, "PCI slot activation failed with error code %d, possible frozen PHB", ret);
+        SLOT_WARN(php_slot, "Attempting complete PHB reset before retrying slot activation\n");
+        for (i = 0; i < 3; i++) {
+            /* Slot activation failed, PHB may be fenced from a prior device failure
+             * Use the OPAL fundamental reset call to both try a device reset and clear
+             * any potentially active PHB fence / freeze
+             */
+            SLOT_WARN(php_slot, "Try %d...\n", i + 1);
+            pci_set_pcie_reset_state(php_slot->pdev, pcie_warm_reset);
+            msleep(250);
+            pci_set_pcie_reset_state(php_slot->pdev, pcie_deassert_reset);
+
+            ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
+            if (!ret)
+                break;
+        }


I would like to see this fix in skiboot, the opal call should reset it and it should work. Normally opal call is responsible for  link training and reset, so ideally it should  happens from there. As if now, Timothy has some explanation for it, so its fine for now. He can add his points for record.


2. In future we have decided to work on items like - removal of Work-queue with threaded irq, addition of dpc support in this driver.

3. We have also discussed if we want to move pciehpc.c driver in future, we have to keep below things in mind, Timothy can add some more points for record.

Device Node (DTB) & its relationship with slot, Can we decouple it and will pciehpc.c going to work correctly for this ?
Driver binding and unbinding based on hotplug event and its relationship with slot. Role of DTB in this. Our driver  also depends on OPAL call for link reset etc, can we decouple from it ? If we add some PPC specific code in pciehpc.c, how will it gets handled (by VFT/Function-Pointer or #ifdef or by seperate files ?)


Lukas has some points for above and I am somewhat aligned with below points, but it needs to be tested to see conceptually it fixes above issues, I am consolidating his points and he can add more  if needed-

Only the host bridge
has to be enumerated in the devicetree or DSDT.

pnv_php.c seems to search the devicetree for hotplug slots and
instantiates them.

We've traditionally dealt with such issues by inserting pcibios_*()
hooks in generic code, with a __weak implementation (which is usually
an empty stub) and a custom implementation in arch/powerpc/.

The linker then chooses the custom implementation over the __weak one.

You can find the existing hooks with:

git grep "__weak .*pcibios" -- drivers/pci
git grep pcibios -- arch/powerpc

An alternative method is to add a callback to struct pci_host_bridge.

pciehp is used on all kinds of arches, it's just an implementation of the PCIe Base Spec.



Best Regards,

Krishna




On 6/25/25 4:04 AM, Bjorn Helgaas wrote:
> On Wed, Jun 18, 2025 at 07:37:54PM -0500, Timothy Pearson wrote:
>> ----- Original Message -----
>>> From: "Bjorn Helgaas" <helgaas@kernel.org>
>>> To: "Timothy Pearson" <tpearson@raptorengineering.com>
>>> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
>>> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
>>> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
>>> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>
>>> Sent: Wednesday, June 18, 2025 2:01:46 PM
>>> Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
>>> On Wed, Jun 18, 2025 at 11:58:59AM -0500, Timothy Pearson wrote:
>>>>  state
>>> Weird wrapping of last word of subject to here.
>> I'll need to see what's up with my git format-patch setup. Apologies
>> for that across the multiple series.
> No worries.  If you can figure out how to make your mailer use the
> normal "On xxx, somebody wrote:" attribution instead of duplicating
> all those headers, that would be far more useful :)
>
>>>> +static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8
>>>> *state)
>>>> +{
>>>> +	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
>>>> +	struct pci_dev *bridge = php_slot->pdev;
>>>> +	u16 status;
>>>> +
>>>> +	pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
>>>> +	*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;
>>> Should be able to do this with FIELD_GET().
>> I used the same overall structure as the pciehp_hpc driver here.  Do
>> you want me to also fix up that driver with FIELD_GET()?
> Nope, I think it's fine to keep this looking like pciehp for now.
> If somebody wants to use FIELD_GET() in pciehp, I'd probably be OK
> with that, but no need for you to open that can of worms.
>
>>> Is the PCI_EXP_SLTCTL_PIC part needed?  It wasn't there before, commit
>>> log doesn't mention it, and as far as I can tell, this would be the
>>> only driver to do that.  Most expose only the attention status (0=off,
>>> 1=on, 2=identify/blink).
>>>
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +
>>>>  static int pnv_php_get_attention_state(struct hotplug_slot *slot, u8 *state)
>>>>  {
>>>>  	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
>>>>  
>>>> +	pnv_php_get_raw_indicator_status(slot, &php_slot->attention_state);
>>> This is a change worth noting.  Previously we didn't read the AIC
>>> state from PCI_EXP_SLTCTL at all; we used php_slot->attention_state to
>>> keep track of whatever had been previously set via
>>> pnv_php_set_attention_state().
>>>
>>> Now we read the current state from PCI_EXP_SLTCTL.  It's not clear
>>> that php_slot->attention_state is still needed at all.
>> It probably isn't.  It's unclear why IBM took this path at all,
>> given pciehp's attention handlers predate pnv-php's by many years.
>>
>>> Previously, the user could write any value at all to the sysfs
>>> "attention" file and then read that same value back.  After this
>>> patch, the user can still write anything, but reads will only return
>>> values with PCI_EXP_SLTCTL_AIC and PCI_EXP_SLTCTL_PIC.
>>>
>>>>  	*state = php_slot->attention_state;
>>>>  	return 0;
>>>>  }
>>>> @@ -461,7 +474,7 @@ static int pnv_php_set_attention_state(struct hotplug_slot
>>>> *slot, u8 state)
>>>>  	mask = PCI_EXP_SLTCTL_AIC;
>>>>  
>>>>  	if (state)
>>>> -		new = PCI_EXP_SLTCTL_ATTN_IND_ON;
>>>> +		new = FIELD_PREP(PCI_EXP_SLTCTL_AIC, state);
>>> This changes the behavior in some cases:
>>>
>>>  write 0: previously turned indicator off, now writes reserved value
>>>  write 2: previously turned indicator on, now sets to blink
>>>  write 3: previously turned indicator on, now turns it off
>> If we're looking at normalizing with pciehp with an eye toward
>> eventually deprecating / removing pnv-php, I can't think of a better
>> time to change this behavior.  I suspect we're the only major user
>> of this code path at the moment, with most software expecting to see
>> pciehp-style handling.  Thoughts?
> I'm OK with changing this, but I do think it would be worth calling
> out the different behavior in the commit log.
>
> Bjorn
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-07-07  8:01         ` Krishna Kumar
@ 2025-07-11 18:18           ` Timothy Pearson
  2025-07-11 21:05             ` Bjorn Helgaas
  0 siblings, 1 reply; 31+ messages in thread
From: Timothy Pearson @ 2025-07-11 18:18 UTC (permalink / raw)
  To: Krishna Kumar
  Cc: Bjorn Helgaas, linuxppc-dev, linux-kernel, linux-pci,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, Bjorn Helgaas, Shawn Anastasio, Lukas Wunner



----- Original Message -----
> From: "Krishna Kumar" <krishnak@linux.ibm.com>
> To: "Bjorn Helgaas" <helgaas@kernel.org>, "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>, "Lukas Wunner" <lukas@wunner.de>
> Sent: Monday, July 7, 2025 3:01:00 AM
> Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator

> Thanks all for the review and Thanks a bunch to Timothy for fixing the PE Freeze
> issue. The hotplug issues are like you fix N issue and N+1 th issue will come
> with new HW.
> 
> We had a meeting of around 1.5 -2.0 hr with demo, code review and log review and
> we decided to let these fixes go ahead. I am consolidating what we discussed -
> 
> 
> 1. Let these fixes go ahead as it solves wider and much needed customer issue
> and its urgently needed for time being. We have verified in live demo that
> nothing is broken from legacy flow and
> 
> PE/PHB freeze, race condition, hung, oops etc has been solved correctly.
> Basically it fixes below issues -
> 
> root-port(phb) -> switch(having4 port)--> 2 nvme devices
> 
> 1st case - only removal of EP-nvme device (surprise hotplug disables msi at root
> port), soft removal may work
> 2nd case  - If we remove switch itself (surprise hotplug disable msi at root
> port) .

Just a quick follow up to see if anything else is needed from my end before this merge can occur?

Thanks!

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-07-11 18:18           ` Timothy Pearson
@ 2025-07-11 21:05             ` Bjorn Helgaas
  2025-07-15 21:41               ` Timothy Pearson
  0 siblings, 1 reply; 31+ messages in thread
From: Bjorn Helgaas @ 2025-07-11 21:05 UTC (permalink / raw)
  To: Timothy Pearson
  Cc: Krishna Kumar, linuxppc-dev, linux-kernel, linux-pci,
	Madhavan Srinivasan, Michael Ellerman, christophe leroy,
	Naveen N Rao, Bjorn Helgaas, Shawn Anastasio, Lukas Wunner

On Fri, Jul 11, 2025 at 01:18:07PM -0500, Timothy Pearson wrote:
> ----- Original Message -----
> > From: "Krishna Kumar" <krishnak@linux.ibm.com>
> > To: "Bjorn Helgaas" <helgaas@kernel.org>, "Timothy Pearson" <tpearson@raptorengineering.com>
> > Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "linux-pci"
> > <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>, "Michael Ellerman" <mpe@ellerman.id.au>,
> > "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao" <naveen@kernel.org>, "Bjorn Helgaas"
> > <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>, "Lukas Wunner" <lukas@wunner.de>
> > Sent: Monday, July 7, 2025 3:01:00 AM
> > Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
> 
> > Thanks all for the review and Thanks a bunch to Timothy for fixing the PE Freeze
> > issue. The hotplug issues are like you fix N issue and N+1 th issue will come
> > with new HW.
> > 
> > We had a meeting of around 1.5 -2.0 hr with demo, code review and log review and
> > we decided to let these fixes go ahead. I am consolidating what we discussed -
> > 
> > 
> > 1. Let these fixes go ahead as it solves wider and much needed customer issue
> > and its urgently needed for time being. We have verified in live demo that
> > nothing is broken from legacy flow and
> > 
> > PE/PHB freeze, race condition, hung, oops etc has been solved correctly.
> > Basically it fixes below issues -
> > 
> > root-port(phb) -> switch(having4 port)--> 2 nvme devices
> > 
> > 1st case - only removal of EP-nvme device (surprise hotplug disables msi at root
> > port), soft removal may work
> > 2nd case  - If we remove switch itself (surprise hotplug disable msi at root
> > port) .
> 
> Just a quick follow up to see if anything else is needed from my end
> before this merge can occur?

I was waiting for a v3 to fix at least these:

  - Subject line style matching history in drivers/pci/hotplug/ (use
    "git log --oneline" to see it)

  - Broken subject line wrapping into commit log

  - Spelling fixes

  - Comment reformat to match prevailing style

  - Note attention indicator behavior changes in commit log

  - Some minor refactoring

Basically everything mentioned in the responses to the original
posting.  Or was there a v3 that I missed?

Bjorn

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator
  2025-07-11 21:05             ` Bjorn Helgaas
@ 2025-07-15 21:41               ` Timothy Pearson
  0 siblings, 0 replies; 31+ messages in thread
From: Timothy Pearson @ 2025-07-15 21:41 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Timothy Pearson, Krishna Kumar, linuxppc-dev, linux-kernel,
	linux-pci, Madhavan Srinivasan, Michael Ellerman,
	christophe leroy, Naveen N Rao, Bjorn Helgaas, Shawn Anastasio,
	Lukas Wunner

Apologies for that, I had meant to send v3 in and apparently it got dropped.  I believe I have addressed the comments in the v3 I just sent in.

Thank you!

----- Original Message -----
> From: "Bjorn Helgaas" <helgaas@kernel.org>
> To: "Timothy Pearson" <tpearson@raptorengineering.com>
> Cc: "Krishna Kumar" <krishnak@linux.ibm.com>, "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel"
> <linux-kernel@vger.kernel.org>, "linux-pci" <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>,
> "Michael Ellerman" <mpe@ellerman.id.au>, "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao"
> <naveen@kernel.org>, "Bjorn Helgaas" <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>,
> "Lukas Wunner" <lukas@wunner.de>
> Sent: Friday, July 11, 2025 4:05:10 PM
> Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator

> On Fri, Jul 11, 2025 at 01:18:07PM -0500, Timothy Pearson wrote:
>> ----- Original Message -----
>> > From: "Krishna Kumar" <krishnak@linux.ibm.com>
>> > To: "Bjorn Helgaas" <helgaas@kernel.org>, "Timothy Pearson"
>> > <tpearson@raptorengineering.com>
>> > Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "linux-kernel"
>> > <linux-kernel@vger.kernel.org>, "linux-pci"
>> > <linux-pci@vger.kernel.org>, "Madhavan Srinivasan" <maddy@linux.ibm.com>,
>> > "Michael Ellerman" <mpe@ellerman.id.au>,
>> > "christophe leroy" <christophe.leroy@csgroup.eu>, "Naveen N Rao"
>> > <naveen@kernel.org>, "Bjorn Helgaas"
>> > <bhelgaas@google.com>, "Shawn Anastasio" <sanastasio@raptorengineering.com>,
>> > "Lukas Wunner" <lukas@wunner.de>
>> > Sent: Monday, July 7, 2025 3:01:00 AM
>> > Subject: Re: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention
>> > indicator
>> 
>> > Thanks all for the review and Thanks a bunch to Timothy for fixing the PE Freeze
>> > issue. The hotplug issues are like you fix N issue and N+1 th issue will come
>> > with new HW.
>> > 
>> > We had a meeting of around 1.5 -2.0 hr with demo, code review and log review and
>> > we decided to let these fixes go ahead. I am consolidating what we discussed -
>> > 
>> > 
>> > 1. Let these fixes go ahead as it solves wider and much needed customer issue
>> > and its urgently needed for time being. We have verified in live demo that
>> > nothing is broken from legacy flow and
>> > 
>> > PE/PHB freeze, race condition, hung, oops etc has been solved correctly.
>> > Basically it fixes below issues -
>> > 
>> > root-port(phb) -> switch(having4 port)--> 2 nvme devices
>> > 
>> > 1st case - only removal of EP-nvme device (surprise hotplug disables msi at root
>> > port), soft removal may work
>> > 2nd case  - If we remove switch itself (surprise hotplug disable msi at root
>> > port) .
>> 
>> Just a quick follow up to see if anything else is needed from my end
>> before this merge can occur?
> 
> I was waiting for a v3 to fix at least these:
> 
>  - Subject line style matching history in drivers/pci/hotplug/ (use
>    "git log --oneline" to see it)
> 
>  - Broken subject line wrapping into commit log
> 
>  - Spelling fixes
> 
>  - Comment reformat to match prevailing style
> 
>  - Note attention indicator behavior changes in commit log
> 
>  - Some minor refactoring
> 
> Basically everything mentioned in the responses to the original
> posting.  Or was there a v3 that I missed?
> 
> Bjorn

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2025-07-15 21:41 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-18 16:54 [PATCH v2 0/6] PowerNV PCIe Hotplug Driver Fixes Timothy Pearson
2025-06-18 16:56 ` [PATCH v2 1/6] pci/hotplug/pnv_php: Properly clean up allocated IRQs on Timothy Pearson
2025-06-18 16:56 ` [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken Timothy Pearson
2025-06-18 19:44   ` Bjorn Helgaas
2025-06-18 19:50     ` Timothy Pearson
2025-06-18 20:17       ` Bjorn Helgaas
2025-06-19 19:29         ` Timothy Pearson
2025-06-20  7:52           ` Lukas Wunner
2025-06-20 16:45             ` Timothy Pearson
2025-06-25  8:45               ` Lukas Wunner
2025-06-18 16:57 ` [PATCH v2 3/6] powerpc/eeh: Export eeh_unfreeze_pe() Timothy Pearson
2025-06-18 16:57 ` [PATCH v2 4/6] powerpc/eeh: Make EEH driver device hotplug safe Timothy Pearson
2025-06-18 16:58 ` [PATCH v2 5/6] pci/hotplug/pnv_php: Fix surprise plug detection and Timothy Pearson
2025-06-18 19:15   ` Bjorn Helgaas
2025-06-19 19:22     ` Timothy Pearson
2025-06-18 16:58 ` [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator Timothy Pearson
2025-06-18 19:01   ` Bjorn Helgaas
2025-06-19  0:37     ` Timothy Pearson
2025-06-20  9:26       ` Krishna Kumar
2025-06-21  9:59         ` Lukas Wunner
2025-06-25  4:08           ` Krishna Kumar
2025-06-25  8:08             ` Lukas Wunner
2025-06-25 10:55               ` Krishna Kumar
2025-06-21 15:05         ` Timothy Pearson
2025-06-24  7:07           ` Krishna Kumar
2025-06-24 16:34             ` Timothy Pearson
2025-06-24 22:34       ` Bjorn Helgaas
2025-07-07  8:01         ` Krishna Kumar
2025-07-11 18:18           ` Timothy Pearson
2025-07-11 21:05             ` Bjorn Helgaas
2025-07-15 21:41               ` Timothy Pearson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).