* [PATCH v3] PCI: pnv_php: Add null checks for OpenCAPI PHBs
@ 2026-06-09 8:49 Aditya Gupta
From: Aditya Gupta @ 2026-06-09 8:49 UTC (permalink / raw)
To: linux-kernel, linuxppc-dev, linux-pci, Madhavan Srinivasan,
Timothy Pearson, Bjorn Helgaas, Shawn Anastasio
Cc: Bjorn Helgaas, Michael Ellerman, Nicholas Piggin,
Christophe Leroy (CS GROUP), stable
For OpenCAPI PHB direct slots, the .pdev of the php_slot is NULL.
Several code paths in pnv_php dereference it unconditionally and can
crash the kernel.
Originally, the issue was hit during boot:
PowerPC PowerNV PCI Hotplug Driver version: 0.1
BUG: Kernel NULL pointer dereference at 0x00000074
Faulting instruction address: 0xc000000000b75fd0
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
...
NIP [c000000000b75fd0] pnv_php_get_adapter_state+0x60/0x154
LR [c000000000b75fbc] pnv_php_get_adapter_state+0x4c/0x154
Call Trace:
[c000c0000688f990] [c000000000b75fbc] pnv_php_get_adapter_state+0x4c/0x154 (unreliable)
[c000c0000688fa20] [c000000000b78bd0] pnv_php_enable+0x94/0x378
[c000c0000688fac0] [c000000000b7912c] pnv_php_register_one.isra.0+0x11c/0x1e0
This occurs for hotplug slots on root buses where bus->self == NULL,
such as OpenCAPI PHB direct slots. An added debug print (not part of
this patch) confirmed it was OpenCAPI:
pnv_php: slot 'OPENCAPI-0009' has NULL pdev (bus 0009:00, parent=NO (root bus))
pnv_php: slot 'OPENCAPI-0009' dn->full_name='pciex@603a000000000', compatible='ibm,power10-pau-opencapi-pciex'
Getting the kernel to boot only required a NULL check in
'pnv_php_get_adapter_state'.
Even with that check in place, there are more possible NULL
dereferences, pointed out by sashiko, including cases where userspace
can crash the kernel, such as:
$ cat /sys/bus/pci/slots/*/attention
...
Kernel attempted to read user page (6e) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x0000006e
Faulting instruction address: 0xc000000000a83334
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
...
[c000000046707a20] [c000000046707b90] 0xc000000046707b90 (unreliable)
[c000000046707a70] [0000000000000001] 0x1
[c000000046707ab0] [c000000000acb00c] attention_read_file+0x54/0xa8
[c000000046707b30] [c000000000abfbfc] pci_slot_attr_show+0x3c/0x58
[c000000046707b50] [c0000000008181ec] sysfs_kf_seq_show+0xd4/0x204
[c000000046707be0] [c000000000815004] kernfs_seq_show+0x44/0x58
Add null checks to prevent the null dereferences.
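The guard used in 'pnv_php_get_adapter_state' relies on the short-circuit
evaluation of &&. The following is a minimal userspace sketch of that idea
only, not the driver code: struct model_pdev, struct model_slot,
needs_link_check() and the MODEL_* constants are hypothetical stand-ins
for the kernel types and values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct model_pdev {
	int pcie_type;
};

/* Hypothetical stand-in for struct pnv_php_slot. */
struct model_slot {
	struct model_pdev *pdev;	/* NULL for a PHB direct slot */
};

/* Placeholder values chosen for this sketch only. */
enum { MODEL_TYPE_DOWNSTREAM = 6 };
enum { MODEL_SLOT_EMPTY = 0, MODEL_SLOT_PRESENT = 1 };

/*
 * Returns 1 when the slot sits below a downstream port and firmware
 * reports it empty, i.e. when a further link check would be needed.
 */
int needs_link_check(const struct model_slot *slot, int presence)
{
	/* && short-circuits: pdev->pcie_type is read only if pdev != NULL. */
	return slot->pdev &&
	       slot->pdev->pcie_type == MODEL_TYPE_DOWNSTREAM &&
	       presence == MODEL_SLOT_EMPTY;
}
```

With pdev set to NULL the function returns 0 without touching the
pointer, which is the behaviour the added check is meant to provide for
root-bus slots.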
Cc: stable@vger.kernel.org
Fixes: 80f9fc236279 ("PCI: pnv_php: Work around switches with broken presence detection")
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
---
Changelog:
v3:
+ split the patch from v2 series, as it's independent
+ incorporate review comments from Bjorn to improve the description
v2:
+ sashiko pointed out various pre-existing null pointer derefs, which
can allow userspace to crash the kernel, fix them
---
---
drivers/pci/hotplug/pnv_php.c | 29 +++++++++++++++++++++++------
1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index ff92a5c301b8..d0f5e8ad1f71 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -47,6 +47,9 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
struct pci_dev *pdev = php_slot->pdev;
u16 ctrl;
+ if (!pdev)
+ return;
+
if (php_slot->irq > 0) {
pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &ctrl);
ctrl &= ~(PCI_EXP_SLTCTL_HPIE |
@@ -414,7 +417,8 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
*/
ret = pnv_pci_get_presence_state(php_slot->id, &presence);
if (ret >= 0) {
- if (pci_pcie_type(php_slot->pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
+ if (php_slot->pdev &&
+ pci_pcie_type(php_slot->pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
presence == OPAL_PCI_SLOT_EMPTY) {
/*
* Similar to pciehp_hpc, check whether the Link Active
@@ -442,6 +446,11 @@ static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8 *state
struct pci_dev *bridge = php_slot->pdev;
u16 status;
+ if (!bridge) {
+ *state = 0;
+ return 0;
+ }
+
pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;
return 0;
@@ -514,11 +523,13 @@ static int pnv_php_activate_slot(struct pnv_php_slot *php_slot,
* fence / freeze.
*/
SLOT_WARN(php_slot, "Try %d...\n", i + 1);
- pci_set_pcie_reset_state(php_slot->pdev,
- pcie_warm_reset);
- msleep(250);
- pci_set_pcie_reset_state(php_slot->pdev,
- pcie_deassert_reset);
+ if (php_slot->pdev) {
+ pci_set_pcie_reset_state(php_slot->pdev,
+ pcie_warm_reset);
+ msleep(250);
+ pci_set_pcie_reset_state(php_slot->pdev,
+ pcie_deassert_reset);
+ }
ret = pnv_php_set_slot_power_state(
slot, OPAL_PCI_SLOT_POWER_ON);
@@ -911,6 +922,9 @@ pnv_php_detect_clear_suprise_removal_freeze(struct pnv_php_slot *php_slot)
struct eeh_pe *pe;
int i, rc;
+ if (!pdev)
+ return;
+
/*
* When a device is surprise removed from a downstream bridge slot,
* the upstream bridge port can still end up frozen due to related EEH
@@ -1093,6 +1107,9 @@ static void pnv_php_enable_irq(struct pnv_php_slot *php_slot)
struct pci_dev *pdev = php_slot->pdev;
int irq, ret;
+ if (!pdev)
+ return;
+
/*
* The MSI/MSIx interrupt might have been occupied by other
* drivers. Don't populate the surprise hotplug capability
--
2.54.0
* Re: [PATCH v3] PCI: pnv_php: Add null checks for OpenCAPI PHBs
From: sashiko-bot @ 2026-06-09 9:05 UTC (permalink / raw)
To: Aditya Gupta; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
Pre-existing issues:
- [High] NULL pointer dereference in `pnv_php_alloc_slot()` during error handling.
- [High] Lockless iteration over PCI bus lists risks memory corruption.
- [High] Data race on `php_slot->irq` can lead to a double `free_irq()`.
- [High] Lockless modification of `php_slot->state` leads to TOCTOU races.
--
commit 72170284e1b2c84d6ca9a8f9f30c4d6aaa2d9c12
Author: Aditya Gupta <adityag@linux.ibm.com>
PCI: pnv_php: Add null checks for OpenCAPI PHBs
For OpenCAPI phb direct slots, the .pdev for php_slots will be NULL.
Various sections of the code in pnv_php can do a null dereference and
crash the kernel. Add null checks to prevent the null dereferences.
Cc: stable@vger.kernel.org
Fixes: 80f9fc236279 ("PCI: pnv_php: Work around switches with broken presence detection")
> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
> index ff92a5c301b81..d0f5e8ad1f716 100644
> --- a/drivers/pci/hotplug/pnv_php.c
> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -47,6 +47,9 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
> struct pci_dev *pdev = php_slot->pdev;
> u16 ctrl;
>
> + if (!pdev)
> + return;
> +
> if (php_slot->irq > 0) {
> pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &ctrl);
> ctrl &= ~(PCI_EXP_SLTCTL_HPIE |
> PCI_EXP_SLTCTL_PDCE |
> PCI_EXP_SLTCTL_DLLSCE);
> pcie_capability_write_word(pdev, PCI_EXP_SLTCTL, ctrl);
>
> free_irq(php_slot->irq, php_slot);
> php_slot->irq = 0;
> }
[Severity: High]
This is a pre-existing issue, but can the lockless check of php_slot->irq
lead to a double free?
If two threads execute pnv_php_disable_irq() concurrently (such as from
multiple sysfs writes on parent and child slots), could they both observe
php_slot->irq > 0 and attempt to call free_irq() for the same IRQ before
it is cleared?
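One way to picture the concern is a "claim once" pattern, sketched here in
userspace C11 under stated assumptions. struct irq_model,
model_free_irq() and model_disable_irq() are hypothetical names, and an
atomic exchange is only one possible approach; a kernel fix might equally
use a lock around the whole sequence, since the slot control writes would
also need to be serialized.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model of the slot fields involved in the race. */
struct irq_model {
	atomic_int irq;		/* > 0 while an IRQ is requested */
	int free_count;		/* how many times the IRQ was released */
};

/* Stand-in for free_irq(): only records that a release happened. */
void model_free_irq(struct irq_model *slot, int irq)
{
	(void)irq;
	slot->free_count++;
}

void model_disable_irq(struct irq_model *slot)
{
	/*
	 * Read and clear in one atomic step, so that at most one caller
	 * ever observes the non-zero IRQ number.
	 */
	int irq = atomic_exchange(&slot->irq, 0);

	if (irq > 0)
		model_free_irq(slot, irq);
}
```

Calling model_disable_irq() twice on the same slot releases the IRQ only
once, because the second caller reads back 0 from the exchange.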
[ ... ]
[Severity: High]
This is a pre-existing issue, but does pnv_php_alloc_slot() crash if
alloc_workqueue() fails?
drivers/pci/hotplug/pnv_php.c:pnv_php_alloc_slot() {
...
if (!php_slot->wq) {
SLOT_WARN(php_slot, "Cannot alloc workqueue\n");
...
}
If SLOT_WARN is invoked here, php_slot->bus is not yet initialized. The
SLOT_WARN macro evaluates to dev_warn(&(sl)->bus->dev, ...), which would take
the address of dev inside a NULL bus pointer. Could this trigger a kernel
Oops when dev_warn() attempts to read the device name?
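A defensive variant of the logging path could fall back to a fixed label
while the bus pointer is still unset. The sketch below is a userspace
illustration only; struct warn_dev, struct warn_bus, struct warn_slot and
slot_log_name() are hypothetical and do not correspond to the SLOT_WARN
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct device, pci_bus and pnv_php_slot. */
struct warn_dev {
	const char *name;
};

struct warn_bus {
	struct warn_dev dev;
};

struct warn_slot {
	struct warn_bus *bus;	/* may still be NULL early in allocation */
};

/* Pick a printable name without dereferencing a NULL bus pointer. */
const char *slot_log_name(const struct warn_slot *slot)
{
	if (!slot || !slot->bus || !slot->bus->dev.name)
		return "(unbound slot)";
	return slot->bus->dev.name;
}
```

A message printed before the bus is assigned would then carry the
fallback label instead of reading through an uninitialized pointer.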
[Severity: High]
This is a pre-existing issue, but is it safe to traverse bus->children and
bus->slots without acquiring pci_bus_sem?
drivers/pci/hotplug/pnv_php.c:pnv_php_disable_all_irqs() {
...
/* First go down child buses */
list_for_each_entry(child_bus, &bus->children, node)
pnv_php_disable_all_irqs(child_bus);
/* Disable IRQs for all pnv_php slots on this bus */
list_for_each_entry(slot, &bus->slots, list) {
...
}
Could a concurrent PCI bus rescan or device hot-unplug race with this
lockless traversal, leading to list corruption or a NULL pointer dereference?
[Severity: High]
This is a pre-existing issue, but is there a race condition when checking
php_slot->state locklessly?
drivers/pci/hotplug/pnv_php.c:pnv_php_enable() {
...
if (php_slot->state != PNV_PHP_STATE_REGISTERED)
return 0;
...
}
And in a concurrent path:
drivers/pci/hotplug/pnv_php.c:pnv_php_unregister_one() {
...
php_slot->state = PNV_PHP_STATE_OFFLINE;
pci_hp_deregister(&php_slot->slot);
...
}
If a concurrent driver unbind calls pnv_php_unregister_one(), it sets the
state to PNV_PHP_STATE_OFFLINE. If an asynchronous hotplug event executing
pnv_php_enable() has already passed the PNV_PHP_STATE_REGISTERED check,
could it erroneously operate on a slot that is being destroyed?
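The check-then-act window described above can be illustrated with a
compare-and-exchange that folds the check and the state change into one
step. This is a userspace C11 sketch under assumptions: the MODEL_ST_*
values, including the intermediate MODEL_ST_ENABLING state, and both
functions are hypothetical, and closing the window this way does not by
itself protect the rest of the enable path from a slot being torn down.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical slot states; MODEL_ST_ENABLING is invented for the sketch. */
enum {
	MODEL_ST_REGISTERED = 1,
	MODEL_ST_ENABLING = 2,
	MODEL_ST_OFFLINE = 3,
};

/*
 * Succeeds only if the slot is REGISTERED at the instant of the swap,
 * moving it to ENABLING so that no second caller can claim it.
 */
bool model_try_claim_enable(atomic_int *state)
{
	int expected = MODEL_ST_REGISTERED;

	return atomic_compare_exchange_strong(state, &expected,
					      MODEL_ST_ENABLING);
}

/* Marks the slot offline and returns the state it had before. */
int model_mark_offline(atomic_int *state)
{
	return atomic_exchange(state, MODEL_ST_OFFLINE);
}
```

An unregister path could inspect the value returned by
model_mark_offline() to learn whether an enable was in flight and needs
to be waited for before the slot is destroyed.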
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609084903.1352581-1-adityag@linux.ibm.com?part=1