Linux PCI subsystem development
* [PATCH v3] PCI: pnv_php: Add null checks for OpenCAPI PHBs
@ 2026-06-09  8:49 Aditya Gupta
  2026-06-09  9:05 ` sashiko-bot
  0 siblings, 1 reply; 2+ messages in thread
From: Aditya Gupta @ 2026-06-09  8:49 UTC
  To: linux-kernel, linuxppc-dev, linux-pci, Madhavan Srinivasan,
	Timothy Pearson, Bjorn Helgaas, Shawn Anastasio
  Cc: Bjorn Helgaas, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy (CS GROUP), stable

For OpenCAPI PHB direct slots, the .pdev of the php_slot is NULL.

Several code paths in pnv_php dereference it unconditionally and can
crash the kernel.

Originally, the issue was hit during boot:

  PowerPC PowerNV PCI Hotplug Driver version: 0.1
  BUG: Kernel NULL pointer dereference at 0x00000074
  Faulting instruction address: 0xc000000000b75fd0
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash  SMP NR_CPUS=2048 NUMA PowerNV
  ...
  NIP [c000000000b75fd0] pnv_php_get_adapter_state+0x60/0x154
  LR [c000000000b75fbc] pnv_php_get_adapter_state+0x4c/0x154
  Call Trace:
  [c000c0000688f990] [c000000000b75fbc] pnv_php_get_adapter_state+0x4c/0x154 (unreliable)
  [c000c0000688fa20] [c000000000b78bd0] pnv_php_enable+0x94/0x378
  [c000c0000688fac0] [c000000000b7912c] pnv_php_register_one.isra.0+0x11c/0x1e0

This occurs for hotplug slots on root buses where bus->self == NULL,
such as OpenCAPI PHB direct slots. An added debug print (not part of
this patch) confirmed it was OpenCAPI:

  pnv_php: slot 'OPENCAPI-0009' has NULL pdev (bus 0009:00, parent=NO (root bus))
  pnv_php: slot 'OPENCAPI-0009' dn->full_name='pciex@603a000000000', compatible='ibm,power10-pau-opencapi-pciex'

Fixing this required only a NULL check in 'pnv_php_get_adapter_state',
which allowed the kernel to boot.

Even with the 'pnv_php_get_adapter_state' NULL check, there are more
possible NULL dereferences, pointed out by sashiko, including cases
where userspace can crash the kernel, such as:

  $ cat /sys/bus/pci/slots/*/attention
  ...
  Kernel attempted to read user page (6e) - exploit attempt? (uid: 0)
  BUG: Kernel NULL pointer dereference on read at 0x0000006e
  Faulting instruction address: 0xc000000000a83334
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash  SMP NR_CPUS=2048 NUMA PowerNV
  ...
  [c000000046707a20] [c000000046707b90] 0xc000000046707b90 (unreliable)
  [c000000046707a70] [0000000000000001] 0x1
  [c000000046707ab0] [c000000000acb00c] attention_read_file+0x54/0xa8
  [c000000046707b30] [c000000000abfbfc] pci_slot_attr_show+0x3c/0x58
  [c000000046707b50] [c0000000008181ec] sysfs_kf_seq_show+0xd4/0x204
  [c000000046707be0] [c000000000815004] kernfs_seq_show+0x44/0x58

Add null checks to prevent the null dereferences.

Cc: stable@vger.kernel.org
Fixes: 80f9fc236279 ("PCI: pnv_php: Work around switches with broken presence detection")
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>

---
Changelog:
v3:
+ split the patch out of the v2 series, as it's independent
+ incorporate review comments from Bjorn to improve the description

v2:
+ sashiko pointed out various pre-existing NULL pointer dereferences
  that allow userspace to crash the kernel; fix them
---
 drivers/pci/hotplug/pnv_php.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index ff92a5c301b8..d0f5e8ad1f71 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -47,6 +47,9 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
 	struct pci_dev *pdev = php_slot->pdev;
 	u16 ctrl;
 
+	if (!pdev)
+		return;
+
 	if (php_slot->irq > 0) {
 		pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &ctrl);
 		ctrl &= ~(PCI_EXP_SLTCTL_HPIE |
@@ -414,7 +417,8 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
 	 */
 	ret = pnv_pci_get_presence_state(php_slot->id, &presence);
 	if (ret >= 0) {
-		if (pci_pcie_type(php_slot->pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
+		if (php_slot->pdev &&
+			pci_pcie_type(php_slot->pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
 			presence == OPAL_PCI_SLOT_EMPTY) {
 			/*
 			 * Similar to pciehp_hpc, check whether the Link Active
@@ -442,6 +446,11 @@ static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8 *state
 	struct pci_dev *bridge = php_slot->pdev;
 	u16 status;
 
+	if (!bridge) {
+		*state = 0;
+		return 0;
+	}
+
 	pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
 	*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;
 	return 0;
@@ -514,11 +523,13 @@ static int pnv_php_activate_slot(struct pnv_php_slot *php_slot,
 			 * fence / freeze.
 			 */
 			SLOT_WARN(php_slot, "Try %d...\n", i + 1);
-			pci_set_pcie_reset_state(php_slot->pdev,
-						 pcie_warm_reset);
-			msleep(250);
-			pci_set_pcie_reset_state(php_slot->pdev,
-						 pcie_deassert_reset);
+			if (php_slot->pdev) {
+				pci_set_pcie_reset_state(php_slot->pdev,
+							 pcie_warm_reset);
+				msleep(250);
+				pci_set_pcie_reset_state(php_slot->pdev,
+							 pcie_deassert_reset);
+			}
 
 			ret = pnv_php_set_slot_power_state(
 				slot, OPAL_PCI_SLOT_POWER_ON);
@@ -911,6 +922,9 @@ pnv_php_detect_clear_suprise_removal_freeze(struct pnv_php_slot *php_slot)
 	struct eeh_pe *pe;
 	int i, rc;
 
+	if (!pdev)
+		return;
+
 	/*
 	 * When a device is surprise removed from a downstream bridge slot,
 	 * the upstream bridge port can still end up frozen due to related EEH
@@ -1093,6 +1107,9 @@ static void pnv_php_enable_irq(struct pnv_php_slot *php_slot)
 	struct pci_dev *pdev = php_slot->pdev;
 	int irq, ret;
 
+	if (!pdev)
+		return;
+
 	/*
 	 * The MSI/MSIx interrupt might have been occupied by other
 	 * drivers. Don't populate the surprise hotplug capability
-- 
2.54.0



* Re: [PATCH v3] PCI: pnv_php: Add null checks for OpenCAPI PHBs
  2026-06-09  8:49 [PATCH v3] PCI: pnv_php: Add null checks for OpenCAPI PHBs Aditya Gupta
@ 2026-06-09  9:05 ` sashiko-bot
  0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-06-09  9:05 UTC
  To: Aditya Gupta; +Cc: linux-pci

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

Pre-existing issues:
- [High] NULL pointer dereference in `pnv_php_alloc_slot()` during error handling.
- [High] Lockless iteration over PCI bus lists risks memory corruption.
- [High] Data race on `php_slot->irq` can lead to a double `free_irq()`.
- [High] Lockless modification of `php_slot->state` leads to TOCTOU races.
--

commit 72170284e1b2c84d6ca9a8f9f30c4d6aaa2d9c12
Author: Aditya Gupta <adityag@linux.ibm.com>

PCI: pnv_php: Add null checks for OpenCAPI PHBs

For OpenCAPI phb direct slots, the .pdev for php_slots will be NULL.
Various sections of the code in pnv_php can do a null dereference and
crash the kernel. Add null checks to prevent the null dereferences.

Cc: stable@vger.kernel.org
Fixes: 80f9fc236279 ("PCI: pnv_php: Work around switches with broken presence detection")

> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
> index ff92a5c301b81..d0f5e8ad1f716 100644
> --- a/drivers/pci/hotplug/pnv_php.c
> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -47,6 +47,9 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
>  	struct pci_dev *pdev = php_slot->pdev;
>  	u16 ctrl;
>  
> +	if (!pdev)
> +		return;
> +
>  	if (php_slot->irq > 0) {
>  		pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &ctrl);
>  		ctrl &= ~(PCI_EXP_SLTCTL_HPIE |
>  			  PCI_EXP_SLTCTL_PDCE |
>  			  PCI_EXP_SLTCTL_DLLSCE);
>  		pcie_capability_write_word(pdev, PCI_EXP_SLTCTL, ctrl);
>  
>  		free_irq(php_slot->irq, php_slot);
>  		php_slot->irq = 0;
>  	}

[Severity: High]
This is a pre-existing issue, but can the lockless check of php_slot->irq
lead to a double free?

If two threads execute pnv_php_disable_irq() concurrently (such as from
multiple sysfs writes on parent and child slots), could they both observe
php_slot->irq > 0 and attempt to call free_irq() for the same IRQ before
it is cleared?
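
One possible way to close this window can be sketched in userspace C.
The sketch below is not the driver's code: model_slot, model_free_irq()
and model_disable_irq() are hypothetical stand-ins, and the atomic
exchange is an assumed technique chosen for illustration. It only shows
the idea that exactly one caller may take ownership of the IRQ number
before it is freed.

```c
#include <stdatomic.h>

/* Hypothetical userspace model of the slot; not struct pnv_php_slot. */
struct model_slot {
	atomic_int irq;		/* > 0 while an IRQ is registered */
	int free_calls;		/* how often the free_irq() stand-in ran */
};

/* Hypothetical stand-in for free_irq(); it only records the call. */
static void model_free_irq(struct model_slot *slot, int irq)
{
	(void)irq;
	slot->free_calls++;
}

static void model_disable_irq(struct model_slot *slot)
{
	/*
	 * Read the IRQ number and clear the field in one atomic step.
	 * Of any number of concurrent callers, only one can observe
	 * the non-zero value, so only one reaches the free.
	 */
	int irq = atomic_exchange(&slot->irq, 0);

	if (irq > 0)
		model_free_irq(slot, irq);
}
```

In the real driver the slot control register is also written before
free_irq(), so a lock covering the whole sequence may be the better
fit; the sketch only isolates the claim-once step.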

[ ... ]

[Severity: High]
This is a pre-existing issue, but does pnv_php_alloc_slot() crash if
alloc_workqueue() fails?

drivers/pci/hotplug/pnv_php.c:pnv_php_alloc_slot() {
	...
	if (!php_slot->wq) {
		SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
	...
}

If SLOT_WARN is invoked here, php_slot->bus is not yet initialized. The
SLOT_WARN macro evaluates to dev_warn(&(sl)->bus->dev, ...), which would take
the address of dev inside a NULL bus pointer. Could this trigger a kernel
Oops when dev_warn() attempts to read the device name?
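
The address arithmetic behind this question can be illustrated with a
small userspace model. All names below (model_device, model_bus,
model_slot and the two helpers) are hypothetical and the struct layout
is not that of struct pci_bus. The sketch assumes a platform where a
NULL pointer converts to the integer 0. It shows that taking the
address of a member through a NULL pointer yields a small non-zero
address equal to the member offset, and how a guarded accessor avoids
reading through it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; layouts do not match the kernel structs. */
struct model_device {
	const char *name;
};

struct model_bus {
	int number;
	struct model_device dev;
};

struct model_slot {
	struct model_bus *bus;
};

/*
 * The address that &slot->bus->dev would evaluate to, computed
 * without dereferencing anything. With bus == NULL this is just
 * the offset of 'dev' inside the bus structure.
 */
static uintptr_t model_dev_addr(const struct model_slot *slot)
{
	return (uintptr_t)slot->bus + offsetof(struct model_bus, dev);
}

/* Guarded accessor: never reads through a NULL bus pointer. */
static const char *model_dev_name(const struct model_slot *slot)
{
	return slot->bus ? slot->bus->dev.name : "(no bus)";
}
```

A fault at a small address rather than at 0 is therefore consistent
with a member access through a NULL base pointer.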

[Severity: High]
This is a pre-existing issue, but is it safe to traverse bus->children and
bus->slots without acquiring pci_bus_sem?

drivers/pci/hotplug/pnv_php.c:pnv_php_disable_all_irqs() {
	...
	/* First go down child buses */
	list_for_each_entry(child_bus, &bus->children, node)
		pnv_php_disable_all_irqs(child_bus);

	/* Disable IRQs for all pnv_php slots on this bus */
	list_for_each_entry(slot, &bus->slots, list) {
	...
}

Could a concurrent PCI bus rescan or device hot-unplug race with this
lockless traversal, leading to list corruption or a NULL pointer dereference?
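
A minimal userspace sketch of a locked traversal follows. It is an
illustration under stated assumptions, not a proposed fix:
model_bus_lock is a simple spin flag standing in for pci_bus_sem
(which is a reader/writer semaphore in the kernel), and model_bus,
model_walk_locked() and model_disable_all_irqs() are hypothetical
names. The point it shows is structural: the lock is taken once at
the entry point, and the recursive helper runs with it already held
instead of re-acquiring it at each level.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of a bus tree; not struct pci_bus. */
struct model_bus {
	struct model_bus *first_child;
	struct model_bus *next_sibling;
	int irq;			/* > 0 while a slot IRQ is armed */
};

/* Stand-in for pci_bus_sem; a spin flag is used only for brevity. */
static atomic_flag model_bus_lock = ATOMIC_FLAG_INIT;

/* Caller must hold model_bus_lock. Children first, then this bus. */
static void model_walk_locked(struct model_bus *bus, int *disabled)
{
	for (struct model_bus *c = bus->first_child; c; c = c->next_sibling)
		model_walk_locked(c, disabled);

	if (bus->irq > 0) {
		bus->irq = 0;
		(*disabled)++;
	}
}

/* Entry point: take the lock once, walk the tree, release it. */
static int model_disable_all_irqs(struct model_bus *root)
{
	int disabled = 0;

	while (atomic_flag_test_and_set_explicit(&model_bus_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */

	model_walk_locked(root, &disabled);

	atomic_flag_clear_explicit(&model_bus_lock, memory_order_release);
	return disabled;
}
```

Splitting the function into a locked helper and an outer wrapper
avoids nested acquisition of the same lock during recursion.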

[Severity: High]
This is a pre-existing issue, but is there a race condition when checking
php_slot->state locklessly?

drivers/pci/hotplug/pnv_php.c:pnv_php_enable() {
	...
	if (php_slot->state != PNV_PHP_STATE_REGISTERED)
		return 0;
	...
}

And in a concurrent path:

drivers/pci/hotplug/pnv_php.c:pnv_php_unregister_one() {
	...
	php_slot->state = PNV_PHP_STATE_OFFLINE;
	pci_hp_deregister(&php_slot->slot);
	...
}

If a concurrent driver unbind calls pnv_php_unregister_one(), it sets the
state to PNV_PHP_STATE_OFFLINE. If an asynchronous hotplug event executing
pnv_php_enable() has already passed the PNV_PHP_STATE_REGISTERED check,
could it erroneously operate on a slot that is being destroyed?
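
The check-then-act window described above can be modeled in userspace
C. The sketch is an assumption-laden illustration, not the driver's
state machine: the MODEL_* states, the extra MODEL_BUSY state and all
function names are hypothetical, and compare-and-swap is just one way
to make the check and the transition a single step (a mutex held
across both paths would be another).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states; MODEL_BUSY does not exist in the driver. */
enum model_state {
	MODEL_OFFLINE,
	MODEL_REGISTERED,
	MODEL_BUSY,
	MODEL_POPULATED,
};

struct model_slot {
	atomic_int state;
};

/* Enable path: proceed only if we win REGISTERED -> BUSY. */
static bool model_try_begin_enable(struct model_slot *slot)
{
	int expected = MODEL_REGISTERED;

	return atomic_compare_exchange_strong(&slot->state, &expected,
					      MODEL_BUSY);
}

/* Enable path finished: publish the final state. */
static void model_finish_enable(struct model_slot *slot)
{
	atomic_store(&slot->state, MODEL_POPULATED);
}

/*
 * Unregister path: go offline only from a quiescent state. While an
 * enable is in flight (MODEL_BUSY) this fails and the caller has to
 * wait or retry instead of tearing the slot down underneath it.
 */
static bool model_try_offline(struct model_slot *slot)
{
	int expected = MODEL_REGISTERED;

	if (atomic_compare_exchange_strong(&slot->state, &expected,
					   MODEL_OFFLINE))
		return true;

	expected = MODEL_POPULATED;
	return atomic_compare_exchange_strong(&slot->state, &expected,
					      MODEL_OFFLINE);
}
```

With this shape, an enable that has passed its check keeps the slot
from being taken offline until it calls model_finish_enable().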

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260609084903.1352581-1-adityag@linux.ibm.com?part=1
