* [PATCH 0/4] check returned value of pci_hp_add_bridge()
@ 2024-05-03 19:23 Nam Cao
2024-05-03 19:23 ` [PATCH 1/4] PCI: shpchp: bail out if pci_hp_add_bridge() fails Nam Cao
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Nam Cao @ 2024-05-03 19:23 UTC (permalink / raw)
To: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel
Cc: Nam Cao
Hi,
When I hot-plug a bridge, but there is no bus number available for its
downstreambus, the kernel crashes.
This can be reproduced with something like:
qemu-system-x86_64 -machine pc-q35-2.10 \
-kernel ../build-pci/arch/x86/boot/bzImage \
-drive "file=img,format=raw" \
-m 2048 -smp 2 -enable-kvm \
-append "console=ttyS0 root=/dev/sda" \
-nographic \
-device pcie-root-port,bus=pcie.0,id=rp1,slot=1,bus-reserve=0
Note the "bus-reserve=0": no bus number is reserved for hot-plugging a
bridge.
After booting is completed, a PCI bridge can be hot-added with the QEMU
command:
device_add pcie-pci-bridge,id=br1,bus=rp1
After this command, the kernel crashes (crash log below).
The reason is that, hot-plugging a bridge is done with pci_hp_add_bridge()
and this can fail. However, its returned value is not checked, and the
kernel proceeds despite the bridge was not added correctly. This results
in the crash.
Patch 1 and patch 2 fix the problem for shpchp and pciehp. The other
hotplug drivers have the same problem as well, but I cannot test them. So
this issue is noted in the TODO file in patch 3.
Patch 4 is an unrelated cleanup I noticed while working on this.
Best regards,
Nam
[ 77.763860] pcieport 0000:00:03.0: pciehp: Slot(1): Button press: will power on in 5 sec
[ 77.765343] pcieport 0000:00:03.0: pciehp: Slot(1): Card present
[ 77.766385] pcieport 0000:00:03.0: pciehp: Slot(1): Link Up
[ 78.881224] pci 0000:01:00.0: [1b36:000e] type 01 class 0x060400 PCIe to PCI/PCI-X bridge
[ 78.883650] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000000ff 64bit]
[ 78.884849] pci 0000:01:00.0: PCI bridge to [bus 00]
[ 78.886433] pci 0000:01:00.0: bridge window [io 0x0000-0x0fff]
[ 78.887541] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff]
[ 78.889479] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
[ 78.892464] pci 0000:01:00.0: No bus number available for hot-added bridge
[ 78.893717] pci 0000:01:00.0: BAR 0 [mem 0xfe800000-0xfe8000ff 64bit]: assigned
[ 78.895703] pcieport 0000:00:03.0: PCI bridge to [bus 01]
[ 78.896708] pcieport 0000:00:03.0: bridge window [io 0x1000-0x1fff]
[ 78.898878] pcieport 0000:00:03.0: bridge window [mem 0xfe800000-0xfe9fffff]
[ 78.900829] pcieport 0000:00:03.0: bridge window [mem 0xfe000000-0xfe1fffff 64bit pref]
[ 78.905378] shpchp 0000:01:00.0: HPC vendor_id 1b36 device_id e ss_vid 0 ss_did 0
[ 78.906729] shpchp 0000:01:00.0: enabling device (0000 -> 0002)
[ 78.910290] BUG: kernel NULL pointer dereference, address: 00000000000000da
[ 78.911539] #PF: supervisor write access in kernel mode
[ 78.912484] #PF: error_code(0x0002) - not-present page
[ 78.913407] PGD 0 P4D 0
[ 78.913871] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 78.914652] CPU: 0 PID: 45 Comm: irq/24-pciehp Not tainted 6.8.6 #31
[ 78.915774] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 78.917395] RIP: 0010:shpc_init+0x3fb/0x9d0
[ 78.918162] Code: 8b 48 08 40 80 ff 02 0f 84 15 04 00 00 f7 c2 00 00 00 1f 0f 84 44 02 00 00 b8 04 00 00 00 b9 04 00 0f
[ 78.921407] RSP: 0018:ffffc9000018fad8 EFLAGS: 00010246
[ 78.922330] RAX: 0000000000000000 RBX: ffff88800459ab00 RCX: 0000000000000000
[ 78.923591] RDX: 00000000000000ff RSI: 0000000000000000 RDI: ffffffff83015701
[ 78.924845] RBP: ffffc9000018fb20 R08: ffff888003658280 R09: 0000000000000000
[ 78.926093] R10: 0000000000000000 R11: ffff888006888780 R12: ffff8880042ff000
[ 78.927358] R13: 0000000000000000 R14: 000000007f000d0f R15: 000000000000001f
[ 78.928622] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 78.930040] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 78.931056] CR2: 00000000000000da CR3: 000000000471a000 CR4: 00000000000006f0
[ 78.932321] Call Trace:
[ 78.932770] <TASK>
[ 78.933175] ? show_regs+0x64/0x70
[ 78.933793] ? __die+0x23/0x70
[ 78.934351] ? page_fault_oops+0x17b/0x460
[ 78.935087] ? search_module_extables+0x18/0x60
[ 78.935907] ? shpc_init+0x3fb/0x9d0
[ 78.936548] ? kernelmode_fixup_or_oops+0x9d/0x120
[ 78.937420] ? __bad_area_nosemaphore+0x16b/0x220
[ 78.938272] ? bad_area_nosemaphore+0x11/0x20
[ 78.939068] ? do_user_addr_fault+0x28c/0x610
[ 78.939858] ? exc_page_fault+0x6e/0x160
[ 78.940566] ? asm_exc_page_fault+0x2b/0x30
[ 78.941332] ? shpc_init+0x3fb/0x9d0
[ 78.941976] ? shpc_init+0x569/0x9d0
[ 78.942618] shpc_probe+0x92/0x390
[ 78.943232] local_pci_probe+0x46/0xa0
[ 78.943922] pci_device_probe+0xb0/0x190
[ 78.944491] really_probe+0xc2/0x2d0
[ 78.944996] ? __pfx___device_attach_driver+0x10/0x10
[ 78.945693] __driver_probe_device+0x73/0x120
[ 78.946300] driver_probe_device+0x1f/0xf0
[ 78.946869] __device_attach_driver+0x8d/0x120
[ 78.947489] bus_for_each_drv+0x96/0xf0
[ 78.948031] __device_attach+0xae/0x1a0
[ 78.948571] device_attach+0xf/0x20
[ 78.949062] pci_bus_add_device+0x58/0x90
[ 78.949628] pci_bus_add_devices+0x30/0x70
[ 78.950201] pciehp_configure_device+0xa8/0x150
[ 78.950840] pciehp_handle_presence_or_link_change+0x161/0x4a0
[ 78.951655] pciehp_ist+0x20f/0x240
[ 78.952144] ? __pfx_irq_thread_fn+0x10/0x10
[ 78.952744] irq_thread_fn+0x23/0x60
[ 78.953245] irq_thread+0xfa/0x1c0
[ 78.953726] ? __pfx_irq_thread_dtor+0x10/0x10
[ 78.954346] ? __pfx_irq_thread+0x10/0x10
[ 78.955175] kthread+0xe0/0x110
[ 78.955631] ? __pfx_kthread+0x10/0x10
[ 78.956160] ret_from_fork+0x3c/0x60
[ 78.956661] ? __pfx_kthread+0x10/0x10
[ 78.957207] ret_from_fork_asm+0x1b/0x30
[ 78.957754] </TASK>
[ 78.958070] Modules linked in:
[ 78.958501] CR2: 00000000000000da
[ 78.958970] ---[ end trace 0000000000000000 ]---
[ 78.959615] RIP: 0010:shpc_init+0x3fb/0x9d0
[ 78.960201] Code: 8b 48 08 40 80 ff 02 0f 84 15 04 00 00 f7 c2 00 00 00 1f 0f 84 44 02 00 00 b8 04 00 00 00 b9 04 00 0f
[ 78.962745] RSP: 0018:ffffc9000018fad8 EFLAGS: 00010246
[ 78.963462] RAX: 0000000000000000 RBX: ffff88800459ab00 RCX: 0000000000000000
[ 78.964441] RDX: 00000000000000ff RSI: 0000000000000000 RDI: ffffffff83015701
[ 78.965469] RBP: ffffc9000018fb20 R08: ffff888003658280 R09: 0000000000000000
[ 78.966537] R10: 0000000000000000 R11: ffff888006888780 R12: ffff8880042ff000
[ 78.967531] R13: 0000000000000000 R14: 000000007f000d0f R15: 000000000000001f
[ 78.968539] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 78.969662] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 78.970472] CR2: 00000000000000da CR3: 000000000471a000 CR4: 00000000000006f0
[ 78.971449] note: irq/24-pciehp[45] exited with irqs disabled
[ 78.972281] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 78.973232] #PF: supervisor instruction fetch in kernel mode
[ 78.974012] #PF: error_code(0x0010) - not-present page
[ 78.974717] PGD 0 P4D 0
[ 78.975075] Oops: 0010 [#2] PREEMPT SMP NOPTI
[ 78.975686] CPU: 0 PID: 45 Comm: irq/24-pciehp Tainted: G D 6.8.6 #31
[ 78.976751] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 78.978011] RIP: 0010:0x0
[ 78.978383] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 78.979274] RSP: 0018:ffffc9000018fe98 EFLAGS: 00010286
[ 78.979996] RAX: 0000000000000000 RBX: ffff888003d2c740 RCX: 00000000000001c0
[ 78.980973] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffc9000018fea0
[ 78.981949] RBP: ffffc9000018feb8 R08: 0000000000009ffb R09: 00000000ffffdfff
[ 78.982924] R10: 0000000000000001 R11: ffffffff82a58aa0 R12: ffff888003d2c740
[ 78.983901] R13: ffff888003d2cf54 R14: ffff888003e78001 R15: 0000000000000000
[ 78.984884] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 78.985992] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 78.986786] CR2: ffffffffffffffd6 CR3: 000000000471a000 CR4: 00000000000006f0
[ 78.987793] Call Trace:
[ 78.988143] <TASK>
[ 78.988443] ? show_regs+0x64/0x70
[ 78.988924] ? __die+0x23/0x70
[ 78.989355] ? page_fault_oops+0x17b/0x460
[ 78.989925] ? do_user_addr_fault+0x2d1/0x610
[ 78.990539] ? _prb_read_valid+0x2e6/0x370
[ 78.991132] ? exc_page_fault+0x6e/0x160
[ 78.991688] ? asm_exc_page_fault+0x2b/0x30
[ 78.992318] task_work_run+0x60/0x90
[ 78.992798] do_exit+0x355/0xb00
[ 78.993234] make_task_dead+0x7e/0x160
[ 78.993708] rewind_stack_and_make_dead+0x17/0x20
[ 78.994303] </TASK>
[ 78.994589] Modules linked in:
[ 78.994984] CR2: 0000000000000000
[ 78.995408] ---[ end trace 0000000000000000 ]---
[ 78.995984] RIP: 0010:shpc_init+0x3fb/0x9d0
[ 78.996503] Code: 8b 48 08 40 80 ff 02 0f 84 15 04 00 00 f7 c2 00 00 00 1f 0f 84 44 02 00 00 b8 04 00 00 00 b9 04 00 0f
[ 78.998828] RSP: 0018:ffffc9000018fad8 EFLAGS: 00010246
[ 78.999500] RAX: 0000000000000000 RBX: ffff88800459ab00 RCX: 0000000000000000
[ 79.000406] RDX: 00000000000000ff RSI: 0000000000000000 RDI: ffffffff83015701
[ 79.001282] RBP: ffffc9000018fb20 R08: ffff888003658280 R09: 0000000000000000
[ 79.002159] R10: 0000000000000000 R11: ffff888006888780 R12: ffff8880042ff000
[ 79.003036] R13: 0000000000000000 R14: 000000007f000d0f R15: 000000000000001f
[ 79.003926] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 79.004921] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 79.005637] CR2: ffffffffffffffd6 CR3: 000000000471a000 CR4: 00000000000006f0
[ 79.006523] note: irq/24-pciehp[45] exited with irqs disabled
[ 79.007261] Fixing recursive fault but reboot is needed!
[ 79.007942] BUG: scheduling while atomic: irq/24-pciehp/45/0x00000000
[ 79.008740] Modules linked in:
[ 79.009151] CPU: 0 PID: 45 Comm: irq/24-pciehp Tainted: G D 6.8.6 #31
[ 79.010117] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 79.011255] Call Trace:
[ 79.011573] <TASK>
[ 79.011845] dump_stack_lvl+0x5f/0x80
[ 79.012310] dump_stack+0x14/0x20
[ 79.012730] __schedule_bug+0x51/0x70
[ 79.013195] __schedule+0x79c/0x890
[ 79.013634] ? vprintk+0x31/0x40
[ 79.014044] ? _printk+0x5f/0x80
[ 79.014456] do_task_dead+0x43/0x50
[ 79.014897] make_task_dead+0x142/0x160
[ 79.015378] rewind_stack_and_make_dead+0x17/0x20
[ 79.015971] </TASK>
Nam Cao (4):
PCI: shpchp: bail out if pci_hp_add_bridge() fails
PCI: pciehp: bail out if pci_hp_add_bridge() fails
PCI: hotplug: document unchecked return value of pci_hp_add_bridge()
PCI: hotplug: remove TODO notes for sgi_hotplug
drivers/pci/hotplug/TODO | 12 +++++-------
drivers/pci/hotplug/pciehp_pci.c | 9 +++++++--
drivers/pci/hotplug/shpchp_pci.c | 9 +++++++--
3 files changed, 19 insertions(+), 11 deletions(-)
--
2.39.2
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/4] PCI: shpchp: bail out if pci_hp_add_bridge() fails
2024-05-03 19:23 [PATCH 0/4] check returned value of pci_hp_add_bridge() Nam Cao
@ 2024-05-03 19:23 ` Nam Cao
2024-05-03 19:23 ` [PATCH 2/4] PCI: pciehp: " Nam Cao
` (3 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Nam Cao @ 2024-05-03 19:23 UTC (permalink / raw)
To: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel
Cc: Nam Cao, stable
If there is no bus number available for the downstream bus of the
hot-plugged bridge, pci_hp_add_bridge() will fail. The driver proceeds
regardless, and the kernel crashes.
Abort if pci_hp_add_bridge() fails.
Fixes: 7d01f70ac6f4 ("PCI: shpchp: use generic pci_hp_add_bridge()")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@vger.kernel.org>
---
drivers/pci/hotplug/shpchp_pci.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/hotplug/shpchp_pci.c b/drivers/pci/hotplug/shpchp_pci.c
index 36db0c3c4ea6..2ac98bdc83d9 100644
--- a/drivers/pci/hotplug/shpchp_pci.c
+++ b/drivers/pci/hotplug/shpchp_pci.c
@@ -48,8 +48,13 @@ int shpchp_configure_device(struct slot *p_slot)
}
for_each_pci_bridge(dev, parent) {
- if (PCI_SLOT(dev->devfn) == p_slot->device)
- pci_hp_add_bridge(dev);
+ if (PCI_SLOT(dev->devfn) == p_slot->device) {
+ if (pci_hp_add_bridge(dev)) {
+ pci_stop_and_remove_bus_device(dev);
+ ret = -EINVAL;
+ goto out;
+ }
+ }
}
pci_assign_unassigned_bridge_resources(bridge);
--
2.39.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/4] PCI: pciehp: bail out if pci_hp_add_bridge() fails
2024-05-03 19:23 [PATCH 0/4] check returned value of pci_hp_add_bridge() Nam Cao
2024-05-03 19:23 ` [PATCH 1/4] PCI: shpchp: bail out if pci_hp_add_bridge() fails Nam Cao
@ 2024-05-03 19:23 ` Nam Cao
2024-05-03 21:23 ` Bjorn Helgaas
2024-05-04 8:54 ` Lukas Wunner
2024-05-03 19:23 ` [PATCH 3/4] PCI: hotplug: document unchecked return value of pci_hp_add_bridge() Nam Cao
` (2 subsequent siblings)
4 siblings, 2 replies; 14+ messages in thread
From: Nam Cao @ 2024-05-03 19:23 UTC (permalink / raw)
To: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel
Cc: Nam Cao, stable
If there is no bus number available for the downstream bus of the
hot-plugged bridge, pci_hp_add_bridge() will fail. The driver proceeds
regardless, and the kernel crashes.
Abort if pci_hp_add_bridge() fails.
Fixes: 0eb3bcfd088e ("[PATCH] pciehp: allow bridged card hotplug")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: <stable@vger.kernel.org>
---
drivers/pci/hotplug/pciehp_pci.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index ad12515a4a12..faf4fcf2fbdf 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -58,8 +58,13 @@ int pciehp_configure_device(struct controller *ctrl)
goto out;
}
- for_each_pci_bridge(dev, parent)
- pci_hp_add_bridge(dev);
+ for_each_pci_bridge(dev, parent) {
+ if (pci_hp_add_bridge(dev)) {
+ pci_stop_and_remove_bus_device(dev);
+ ret = -EINVAL;
+ goto out;
+ }
+ }
pci_assign_unassigned_bridge_resources(bridge);
pcie_bus_configure_settings(parent);
--
2.39.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/4] PCI: hotplug: document unchecked return value of pci_hp_add_bridge()
2024-05-03 19:23 [PATCH 0/4] check returned value of pci_hp_add_bridge() Nam Cao
2024-05-03 19:23 ` [PATCH 1/4] PCI: shpchp: bail out if pci_hp_add_bridge() fails Nam Cao
2024-05-03 19:23 ` [PATCH 2/4] PCI: pciehp: " Nam Cao
@ 2024-05-03 19:23 ` Nam Cao
2024-05-03 19:23 ` [PATCH 4/4] PCI: hotplug: remove TODO notes for sgi_hotplug Nam Cao
2024-05-03 21:29 ` [PATCH 0/4] check returned value of pci_hp_add_bridge() Bjorn Helgaas
4 siblings, 0 replies; 14+ messages in thread
From: Nam Cao @ 2024-05-03 19:23 UTC (permalink / raw)
To: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel
Cc: Nam Cao
Some hotplug drivers do not check the returned value of
pci_hp_add_bridge(). This may be probmatic if the driver proceed while
pci_hp_add_bridge() fails.
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
drivers/pci/hotplug/TODO | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/pci/hotplug/TODO b/drivers/pci/hotplug/TODO
index fdb8dd6ea24d..f0a1746c3c88 100644
--- a/drivers/pci/hotplug/TODO
+++ b/drivers/pci/hotplug/TODO
@@ -6,6 +6,8 @@ cpcihp:
->set_power callbacks in struct cpci_hp_controller_ops. Why were they
introduced? Can they be removed from the struct?
+* Returned code from pci_hp_add_bridge() is not checked.
+
cpqphp:
* The driver spawns a kthread cpqhp_event_thread() which is woken by the
@@ -16,6 +18,8 @@ cpqphp:
* A large portion of cpqphp_ctrl.c and cpqphp_pci.c concerns resource
management. Doesn't this duplicate functionality in the core?
+* Returned code from pci_hp_add_bridge() is not checked.
+
ibmphp:
* Implementations of hotplug_slot_ops callbacks such as get_adapter_present()
@@ -43,6 +47,8 @@ ibmphp:
* A large portion of ibmphp_res.c and ibmphp_pci.c concerns resource
management. Doesn't this duplicate functionality in the core?
+* Returned code from pci_hp_add_bridge() is not checked.
+
sgi_hotplug:
* Several functions access the pci_slot member in struct hotplug_slot even
--
2.39.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/4] PCI: hotplug: remove TODO notes for sgi_hotplug
2024-05-03 19:23 [PATCH 0/4] check returned value of pci_hp_add_bridge() Nam Cao
` (2 preceding siblings ...)
2024-05-03 19:23 ` [PATCH 3/4] PCI: hotplug: document unchecked return value of pci_hp_add_bridge() Nam Cao
@ 2024-05-03 19:23 ` Nam Cao
2024-05-03 21:29 ` [PATCH 0/4] check returned value of pci_hp_add_bridge() Bjorn Helgaas
4 siblings, 0 replies; 14+ messages in thread
From: Nam Cao @ 2024-05-03 19:23 UTC (permalink / raw)
To: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel
Cc: Nam Cao
Commit c7532b601e77 ("PCI/hotplug: remove the sgi_hotplug driver")
deleted the driver.
Remove the remaining TODO notes as well.
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
drivers/pci/hotplug/TODO | 8 --------
1 file changed, 8 deletions(-)
diff --git a/drivers/pci/hotplug/TODO b/drivers/pci/hotplug/TODO
index f0a1746c3c88..9d428b0ea524 100644
--- a/drivers/pci/hotplug/TODO
+++ b/drivers/pci/hotplug/TODO
@@ -49,14 +49,6 @@ ibmphp:
* Returned code from pci_hp_add_bridge() is not checked.
-sgi_hotplug:
-
-* Several functions access the pci_slot member in struct hotplug_slot even
- though pci_hotplug.h declares it private. See sn_hp_destroy() for an
- example. Either the pci_slot member should no longer be declared private
- or sgi_hotplug should store a pointer to it in struct slot. Probably the
- former.
-
shpchp:
* There is only a single implementation of struct hpc_ops. Can the struct be
--
2.39.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] PCI: pciehp: bail out if pci_hp_add_bridge() fails
2024-05-03 19:23 ` [PATCH 2/4] PCI: pciehp: " Nam Cao
@ 2024-05-03 21:23 ` Bjorn Helgaas
2024-05-03 21:41 ` Nam Cao
2024-05-04 8:54 ` Lukas Wunner
1 sibling, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2024-05-03 21:23 UTC (permalink / raw)
To: Nam Cao
Cc: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel, stable, Lukas Wunner
[+cc Lukas]
On Fri, May 03, 2024 at 09:23:20PM +0200, Nam Cao wrote:
> If there is no bus number available for the downstream bus of the
> hot-plugged bridge, pci_hp_add_bridge() will fail. The driver proceeds
> regardless, and the kernel crashes.
>
> Abort if pci_hp_add_bridge() fails.
Thanks for this and for the details in the cover letter. The cover
letter doesn't get directly preserved and connected to the git commit,
so please include some of the details here in the commit log.
I don't think we need *everything* from the cover letter; just enough
of the messages to show what went wrong and how the kernel crashed, so
somebody who trips over this can connect the crash with this fix. And
the timestamps are not relevant, so you can strip them out. The qemu
repro case is useful too, thanks for that!
Same for the shpchp patch.
And use "git log --oneline drivers/pci/hotplug/pciehp_pci.c" and match
the formatting (in particular, the capitalization) of your subject
lines.
> Fixes: 0eb3bcfd088e ("[PATCH] pciehp: allow bridged card hotplug")
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> Cc: Rajesh Shah <rajesh.shah@intel.com>
> Cc: <stable@vger.kernel.org>
> ---
> drivers/pci/hotplug/pciehp_pci.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
> index ad12515a4a12..faf4fcf2fbdf 100644
> --- a/drivers/pci/hotplug/pciehp_pci.c
> +++ b/drivers/pci/hotplug/pciehp_pci.c
> @@ -58,8 +58,13 @@ int pciehp_configure_device(struct controller *ctrl)
> goto out;
> }
>
> - for_each_pci_bridge(dev, parent)
> - pci_hp_add_bridge(dev);
> + for_each_pci_bridge(dev, parent) {
> + if (pci_hp_add_bridge(dev)) {
> + pci_stop_and_remove_bus_device(dev);
> + ret = -EINVAL;
> + goto out;
> + }
> + }
>
> pci_assign_unassigned_bridge_resources(bridge);
> pcie_bus_configure_settings(parent);
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/4] check returned value of pci_hp_add_bridge()
2024-05-03 19:23 [PATCH 0/4] check returned value of pci_hp_add_bridge() Nam Cao
` (3 preceding siblings ...)
2024-05-03 19:23 ` [PATCH 4/4] PCI: hotplug: remove TODO notes for sgi_hotplug Nam Cao
@ 2024-05-03 21:29 ` Bjorn Helgaas
4 siblings, 0 replies; 14+ messages in thread
From: Bjorn Helgaas @ 2024-05-03 21:29 UTC (permalink / raw)
To: Nam Cao
Cc: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel
On Fri, May 03, 2024 at 09:23:18PM +0200, Nam Cao wrote:
> ...
> Nam Cao (4):
> PCI: shpchp: bail out if pci_hp_add_bridge() fails
> PCI: pciehp: bail out if pci_hp_add_bridge() fails
> PCI: hotplug: document unchecked return value of pci_hp_add_bridge()
> PCI: hotplug: remove TODO notes for sgi_hotplug
>
> drivers/pci/hotplug/TODO | 12 +++++-------
> drivers/pci/hotplug/pciehp_pci.c | 9 +++++++--
> drivers/pci/hotplug/shpchp_pci.c | 9 +++++++--
> 3 files changed, 19 insertions(+), 11 deletions(-)
I applied the last two (the TODO updates) to pci/hotplug for v6.10,
thanks!
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] PCI: pciehp: bail out if pci_hp_add_bridge() fails
2024-05-03 21:23 ` Bjorn Helgaas
@ 2024-05-03 21:41 ` Nam Cao
0 siblings, 0 replies; 14+ messages in thread
From: Nam Cao @ 2024-05-03 21:41 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel, stable, Lukas Wunner
On Fri, May 03, 2024 at 04:23:41PM -0500, Bjorn Helgaas wrote:
> On Fri, May 03, 2024 at 09:23:20PM +0200, Nam Cao wrote:
> > If there is no bus number available for the downstream bus of the
> > hot-plugged bridge, pci_hp_add_bridge() will fail. The driver proceeds
> > regardless, and the kernel crashes.
> >
> > Abort if pci_hp_add_bridge() fails.
>
> Thanks for this and for the details in the cover letter. The cover
> letter doesn't get directly preserved and connected to the git commit,
> so please include some of the details here in the commit log.
>
> I don't think we need *everything* from the cover letter; just enough
> of the messages to show what went wrong and how the kernel crashed, so
> somebody who trips over this can connect the crash with this fix. And
> the timestamps are not relevant, so you can strip them out. The qemu
> repro case is useful too, thanks for that!
>
> Same for the shpchp patch.
>
> And use "git log --oneline drivers/pci/hotplug/pciehp_pci.c" and match
> the formatting (in particular, the capitalization) of your subject
> lines.
Thanks for the detailed instructions. I will send v2 next week.
Best regards,
Nam
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] PCI: pciehp: bail out if pci_hp_add_bridge() fails
2024-05-03 19:23 ` [PATCH 2/4] PCI: pciehp: " Nam Cao
2024-05-03 21:23 ` Bjorn Helgaas
@ 2024-05-04 8:54 ` Lukas Wunner
2024-05-04 9:35 ` Nam Cao
1 sibling, 1 reply; 14+ messages in thread
From: Lukas Wunner @ 2024-05-04 8:54 UTC (permalink / raw)
To: Nam Cao
Cc: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel, stable
On Fri, May 03, 2024 at 09:23:20PM +0200, Nam Cao wrote:
> If there is no bus number available for the downstream bus of the
> hot-plugged bridge, pci_hp_add_bridge() will fail. The driver proceeds
> regardless, and the kernel crashes.
>
> Abort if pci_hp_add_bridge() fails.
[...]
> --- a/drivers/pci/hotplug/pciehp_pci.c
> +++ b/drivers/pci/hotplug/pciehp_pci.c
> @@ -58,8 +58,13 @@ int pciehp_configure_device(struct controller *ctrl)
> goto out;
> }
>
> - for_each_pci_bridge(dev, parent)
> - pci_hp_add_bridge(dev);
> + for_each_pci_bridge(dev, parent) {
> + if (pci_hp_add_bridge(dev)) {
> + pci_stop_and_remove_bus_device(dev);
> + ret = -EINVAL;
> + goto out;
> + }
> + }
Is the pci_stop_and_remove_bus_device() really necessary here?
Why not just leave the bridge as is, without any child devices?
Thanks,
Lukas
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] PCI: pciehp: bail out if pci_hp_add_bridge() fails
2024-05-04 8:54 ` Lukas Wunner
@ 2024-05-04 9:35 ` Nam Cao
2024-05-04 9:51 ` Lukas Wunner
0 siblings, 1 reply; 14+ messages in thread
From: Nam Cao @ 2024-05-04 9:35 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel, stable
On Sat, May 04, 2024 at 10:54:15AM +0200, Lukas Wunner wrote:
> On Fri, May 03, 2024 at 09:23:20PM +0200, Nam Cao wrote:
> > If there is no bus number available for the downstream bus of the
> > hot-plugged bridge, pci_hp_add_bridge() will fail. The driver proceeds
> > regardless, and the kernel crashes.
> >
> > Abort if pci_hp_add_bridge() fails.
> [...]
> > --- a/drivers/pci/hotplug/pciehp_pci.c
> > +++ b/drivers/pci/hotplug/pciehp_pci.c
> > @@ -58,8 +58,13 @@ int pciehp_configure_device(struct controller *ctrl)
> > goto out;
> > }
> >
> > - for_each_pci_bridge(dev, parent)
> > - pci_hp_add_bridge(dev);
> > + for_each_pci_bridge(dev, parent) {
> > + if (pci_hp_add_bridge(dev)) {
> > + pci_stop_and_remove_bus_device(dev);
> > + ret = -EINVAL;
> > + goto out;
> > + }
> > + }
>
> Is the pci_stop_and_remove_bus_device() really necessary here?
> Why not just leave the bridge as is, without any child devices?
pci_stop_and_remove_bus_device() is not necessary to prevent kernel
crashing. But without this, we cannot hot-plug any other devices to this
slot afterward, despite the bridge has already been removed. Below is what
happens without pci_stop_and_remove_bus_device().
First, we hotplug a bridge. That fails, so QEMU removes this bridge:
(qemu) device_add pci-bridge,id=br2,bus=br1,chassis_nr=19,addr=1
[ 9.289609] shpchp 0000:01:00.0: Latch close on Slot(1-1)
[ 9.291145] shpchp 0000:01:00.0: Button pressed on Slot(1-1)
[ 9.292705] shpchp 0000:01:00.0: Card present on Slot(1-1)
[ 9.294369] shpchp 0000:01:00.0: PCI slot #1-1 - powering on due to button press
[ 15.529997] pci 0000:02:01.0: [1b36:0001] type 01 class 0x060400 conventional PCI bridge
[ 15.533907] pci 0000:02:01.0: BAR 0 [mem 0x00000000-0x000000ff 64bit]
[ 15.535802] pci 0000:02:01.0: PCI bridge to [bus 00]
[ 15.538519] pci 0000:02:01.0: bridge window [io 0x0000-0x0fff]
[ 15.540261] pci 0000:02:01.0: bridge window [mem 0x00000000-0x000fffff]
[ 15.543486] pci 0000:02:01.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
[ 15.547151] pci 0000:02:01.0: No bus number available for hot-added bridge
[ 15.549067] shpchp 0000:01:00.0: Cannot add device at 0000:02:01
[ 15.553104] shpchp 0000:01:00.0: Latch open on Slot(1-1)
[ 15.555246] shpchp 0000:01:00.0: Card not present on Slot(1-1)
Then, hot-plug an ethernet device. But the kernel still incorrectly
thought the bridge is still there, and refuses this new ethernet device:
(qemu) device_add e1000,bus=br1,addr=1
[ 58.163529] shpchp 0000:01:00.0: Latch close on Slot(1-1)
[ 58.165076] shpchp 0000:01:00.0: Button pressed on Slot(1-1)
[ 58.166650] shpchp 0000:01:00.0: Card present on Slot(1-1)
[ 58.168287] shpchp 0000:01:00.0: PCI slot #1-1 - powering on due to button press
[ 64.677492] shpchp 0000:01:00.0: Device 0000:02:01.0 already exists at 0000:02:01, cannot hot-add
[ 64.680007] shpchp 0000:01:00.0: Cannot add device at 0000:02:01
[ 64.682802] shpchp 0000:01:00.0: Latch open on Slot(1-1)
[ 64.684353] shpchp 0000:01:00.0: Card not present on Slot(1-1)
Best regards,
Nam
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] PCI: pciehp: bail out if pci_hp_add_bridge() fails
2024-05-04 9:35 ` Nam Cao
@ 2024-05-04 9:51 ` Lukas Wunner
2024-05-04 10:56 ` Nam Cao
0 siblings, 1 reply; 14+ messages in thread
From: Lukas Wunner @ 2024-05-04 9:51 UTC (permalink / raw)
To: Nam Cao
Cc: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel, stable
On Sat, May 04, 2024 at 11:35:29AM +0200, Nam Cao wrote:
> On Sat, May 04, 2024 at 10:54:15AM +0200, Lukas Wunner wrote:
> > On Fri, May 03, 2024 at 09:23:20PM +0200, Nam Cao wrote:
> > > If there is no bus number available for the downstream bus of the
> > > hot-plugged bridge, pci_hp_add_bridge() will fail. The driver proceeds
> > > regardless, and the kernel crashes.
> > >
> > > Abort if pci_hp_add_bridge() fails.
> > [...]
> > > --- a/drivers/pci/hotplug/pciehp_pci.c
> > > +++ b/drivers/pci/hotplug/pciehp_pci.c
> > > @@ -58,8 +58,13 @@ int pciehp_configure_device(struct controller *ctrl)
> > > goto out;
> > > }
> > >
> > > - for_each_pci_bridge(dev, parent)
> > > - pci_hp_add_bridge(dev);
> > > + for_each_pci_bridge(dev, parent) {
> > > + if (pci_hp_add_bridge(dev)) {
> > > + pci_stop_and_remove_bus_device(dev);
> > > + ret = -EINVAL;
> > > + goto out;
> > > + }
> > > + }
> >
> > Is the pci_stop_and_remove_bus_device() really necessary here?
> > Why not just leave the bridge as is, without any child devices?
>
> pci_stop_and_remove_bus_device() is not necessary to prevent kernel
> crashing. But without this, we cannot hot-plug any other devices to this
> slot afterward, despite the bridge has already been removed. Below is what
> happens without pci_stop_and_remove_bus_device().
>
> First, we hotplug a bridge. That fails, so QEMU removes this bridge:
> (qemu) device_add pci-bridge,id=br2,bus=br1,chassis_nr=19,addr=1
> [ 9.289609] shpchp 0000:01:00.0: Latch close on Slot(1-1)
> [ 9.291145] shpchp 0000:01:00.0: Button pressed on Slot(1-1)
> [ 9.292705] shpchp 0000:01:00.0: Card present on Slot(1-1)
> [ 9.294369] shpchp 0000:01:00.0: PCI slot #1-1 - powering on due to button press
> [ 15.529997] pci 0000:02:01.0: [1b36:0001] type 01 class 0x060400 conventional PCI bridge
> [ 15.533907] pci 0000:02:01.0: BAR 0 [mem 0x00000000-0x000000ff 64bit]
> [ 15.535802] pci 0000:02:01.0: PCI bridge to [bus 00]
> [ 15.538519] pci 0000:02:01.0: bridge window [io 0x0000-0x0fff]
> [ 15.540261] pci 0000:02:01.0: bridge window [mem 0x00000000-0x000fffff]
> [ 15.543486] pci 0000:02:01.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
> [ 15.547151] pci 0000:02:01.0: No bus number available for hot-added bridge
> [ 15.549067] shpchp 0000:01:00.0: Cannot add device at 0000:02:01
> [ 15.553104] shpchp 0000:01:00.0: Latch open on Slot(1-1)
> [ 15.555246] shpchp 0000:01:00.0: Card not present on Slot(1-1)
I'm not familiar with shpchp, I don't understand why it's thinking
that there's no card after it failed to find a bus number.
Could you reproduce with pciehp instead of shpchp please?
Thanks,
Lukas
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] PCI: pciehp: bail out if pci_hp_add_bridge() fails
2024-05-04 9:51 ` Lukas Wunner
@ 2024-05-04 10:56 ` Nam Cao
2024-05-04 15:02 ` Lukas Wunner
0 siblings, 1 reply; 14+ messages in thread
From: Nam Cao @ 2024-05-04 10:56 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel, stable
On Sat, May 04, 2024 at 11:51:54AM +0200, Lukas Wunner wrote:
> On Sat, May 04, 2024 at 11:35:29AM +0200, Nam Cao wrote:
> > pci_stop_and_remove_bus_device() is not necessary to prevent kernel
> > crashing. But without this, we cannot hot-plug any other devices to this
> > slot afterward, despite the bridge has already been removed. Below is what
> > happens without pci_stop_and_remove_bus_device().
> >
> > First, we hotplug a bridge. That fails, so QEMU removes this bridge:
> > (qemu) device_add pci-bridge,id=br2,bus=br1,chassis_nr=19,addr=1
> > [ 9.289609] shpchp 0000:01:00.0: Latch close on Slot(1-1)
> > [ 9.291145] shpchp 0000:01:00.0: Button pressed on Slot(1-1)
> > [ 9.292705] shpchp 0000:01:00.0: Card present on Slot(1-1)
> > [ 9.294369] shpchp 0000:01:00.0: PCI slot #1-1 - powering on due to button press
> > [ 15.529997] pci 0000:02:01.0: [1b36:0001] type 01 class 0x060400 conventional PCI bridge
> > [ 15.533907] pci 0000:02:01.0: BAR 0 [mem 0x00000000-0x000000ff 64bit]
> > [ 15.535802] pci 0000:02:01.0: PCI bridge to [bus 00]
> > [ 15.538519] pci 0000:02:01.0: bridge window [io 0x0000-0x0fff]
> > [ 15.540261] pci 0000:02:01.0: bridge window [mem 0x00000000-0x000fffff]
> > [ 15.543486] pci 0000:02:01.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
> > [ 15.547151] pci 0000:02:01.0: No bus number available for hot-added bridge
> > [ 15.549067] shpchp 0000:01:00.0: Cannot add device at 0000:02:01
> > [ 15.553104] shpchp 0000:01:00.0: Latch open on Slot(1-1)
> > [ 15.555246] shpchp 0000:01:00.0: Card not present on Slot(1-1)
>
> I'm not familiar with shpchp, I don't understand why it's thinking
> that there's no card after it failed to find a bus number.
Sorry, I got mixed up between the two.
> Could you reproduce with pciehp instead of shpchp please?
Same thing for pciehp below. I think the problem is because without
pci_stop_and_remove_bus_device(), no one cleans up the device added in
pci_scan_slot(). When another device get hot-added, pci_get_slot() wrongly
thinks another device is already there, so the hot-plug fails.
Best regards,
Nam
(qemu) device_add pcie-pci-bridge,id=br1,bus=rp1
[ 19.840550] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0009 from Slot Status
[ 19.842843] pcieport 0000:00:03.0: pciehp: Slot(1): Button press: will power on in 5 sec
[ 19.845289] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0010 from Slot Status
[ 19.847502] pcieport 0000:00:03.0: pciehp: pciehp_set_indicators: SLOTCTRL 6c write cmd 2c0
[ 19.849876] pcieport 0000:00:03.0: pciehp: pciehp_check_link_active: lnk_status = 2011
[ 19.852094] pcieport 0000:00:03.0: pciehp: Slot(1): Card present
[ 19.853809] pcieport 0000:00:03.0: pciehp: Slot(1): Link Up
[ 19.855412] pcieport 0000:00:03.0: pciehp: pciehp_get_power_status: SLOTCTRL 6c value read 6f1
[ 19.857975] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0010 from Slot Status
[ 19.860199] pcieport 0000:00:03.0: pciehp: pciehp_power_on_slot: SLOTCTRL 6c write cmd 0
[ 19.862586] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0010 from Slot Status
[ 19.864806] pcieport 0000:00:03.0: pciehp: pciehp_set_indicators: SLOTCTRL 6c write cmd 200
[ 20.994936] pcieport 0000:00:03.0: pciehp: pciehp_check_link_status: lnk_status = 2011
[ 20.997463] pci 0000:01:00.0: [1b36:000e] type 01 class 0x060400 PCIe to PCI/PCI-X bridge
[ 21.001131] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000000ff 64bit]
[ 21.003071] pci 0000:01:00.0: PCI bridge to [bus 00]
[ 21.005417] pci 0000:01:00.0: bridge window [io 0x0000-0x0fff]
[ 21.007181] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff]
[ 21.010084] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
[ 21.014162] pci 0000:01:00.0: vgaarb: pci_notify
[ 21.015900] pci 0000:01:00.0: No bus number available for hot-added bridge
[ 21.017865] pcieport 0000:00:03.0: pciehp: Cannot add device at 0000:01:00
[ 21.019931] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0010 from Slot Status
[ 21.022178] pcieport 0000:00:03.0: pciehp: pciehp_power_off_slot: SLOTCTRL 6c write cmd 400
[ 22.084607] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0018 from Slot Status
[ 22.086845] pcieport 0000:00:03.0: pciehp: pciehp_set_indicators: SLOTCTRL 6c write cmd 340
[ 22.089323] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0010 from Slot Status
[ 22.091539] pcieport 0000:00:03.0: pciehp: pciehp_set_indicators: SLOTCTRL 6c write cmd 300
[ 22.093913] pcieport 0000:00:03.0: pciehp: pciehp_check_link_active: lnk_status = 11
(qemu) device_add e1000,bus=rp1,id=eth1
[ 58.389527] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0009 from Slot Status
[ 58.391789] pcieport 0000:00:03.0: pciehp: Slot(1): Button press: will power on in 5 sec
[ 58.394175] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0010 from Slot Status
[ 58.396365] pcieport 0000:00:03.0: pciehp: pciehp_set_indicators: SLOTCTRL 6c write cmd 2c0
[ 58.398681] pcieport 0000:00:03.0: pciehp: pciehp_check_link_active: lnk_status = 2011
[ 58.400871] pcieport 0000:00:03.0: pciehp: Slot(1): Card present
[ 58.402542] pcieport 0000:00:03.0: pciehp: Slot(1): Link Up
[ 58.404154] pcieport 0000:00:03.0: pciehp: pciehp_get_power_status: SLOTCTRL 6c value read 6f1
[ 58.406627] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0010 from Slot Status
[ 58.408798] pcieport 0000:00:03.0: pciehp: pciehp_power_on_slot: SLOTCTRL 6c write cmd 0
[ 58.411213] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0010 from Slot Status
[ 58.413386] pcieport 0000:00:03.0: pciehp: pciehp_set_indicators: SLOTCTRL 6c write cmd 200
[ 59.523011] pcieport 0000:00:03.0: pciehp: pciehp_check_link_status: lnk_status = 2011
[ 59.525256] pcieport 0000:00:03.0: pciehp: Device 0000:01:00.0 already exists at 0000:01:00, skipping hot-add
[ 59.528139] pcieport 0000:00:03.0: pciehp: pending interrupts 0x0010 from Slot Status
[ 59.530325] pcieport 0000:00:03.0: pciehp: pciehp_set_indicators: SLOTCTRL 6c write cmd 1c0
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] PCI: pciehp: bail out if pci_hp_add_bridge() fails
2024-05-04 10:56 ` Nam Cao
@ 2024-05-04 15:02 ` Lukas Wunner
2024-05-04 15:48 ` Nam Cao
0 siblings, 1 reply; 14+ messages in thread
From: Lukas Wunner @ 2024-05-04 15:02 UTC (permalink / raw)
To: Nam Cao
Cc: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel, stable
On Sat, May 04, 2024 at 12:56:30PM +0200, Nam Cao wrote:
> On Sat, May 04, 2024 at 11:51:54AM +0200, Lukas Wunner wrote:
> > Could you reproduce with pciehp instead of shpchp please?
>
> Same thing for pciehp below. I think the problem is because without
> pci_stop_and_remove_bus_device(), no one cleans up the device added in
> pci_scan_slot(). When another device get hot-added, pci_get_slot() wrongly
> thinks another device is already there, so the hot-plug fails.
pciehp powers down the slot because you're returning a negative errno
from pciehp_configure_device(). Please return 0 instead if
pci_hp_add_bridge() fails.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/4] PCI: pciehp: bail out if pci_hp_add_bridge() fails
2024-05-04 15:02 ` Lukas Wunner
@ 2024-05-04 15:48 ` Nam Cao
0 siblings, 0 replies; 14+ messages in thread
From: Nam Cao @ 2024-05-04 15:48 UTC (permalink / raw)
To: Lukas Wunner
Cc: Bjorn Helgaas, Yinghai Lu, Greg Kroah-Hartman, Rajesh Shah,
linux-pci, linux-kernel, stable
On Sat, May 04, 2024 at 05:02:34PM +0200, Lukas Wunner wrote:
> On Sat, May 04, 2024 at 12:56:30PM +0200, Nam Cao wrote:
> > On Sat, May 04, 2024 at 11:51:54AM +0200, Lukas Wunner wrote:
> > > Could you reproduce with pciehp instead of shpchp please?
> >
> > Same thing for pciehp below. I think the problem is because without
> > pci_stop_and_remove_bus_device(), no one cleans up the device added in
> > pci_scan_slot(). When another device get hot-added, pci_get_slot() wrongly
> > thinks another device is already there, so the hot-plug fails.
>
> pciehp powers down the slot because you're returning a negative errno
> from pciehp_configure_device(). Please return 0 instead if
> pci_hp_add_bridge() fails.
Thanks for the suggestion. This is applicable to shpchp as well.
Best regards,
Nam
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-05-04 15:48 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-03 19:23 [PATCH 0/4] check returned value of pci_hp_add_bridge() Nam Cao
2024-05-03 19:23 ` [PATCH 1/4] PCI: shpchp: bail out if pci_hp_add_bridge() fails Nam Cao
2024-05-03 19:23 ` [PATCH 2/4] PCI: pciehp: " Nam Cao
2024-05-03 21:23 ` Bjorn Helgaas
2024-05-03 21:41 ` Nam Cao
2024-05-04 8:54 ` Lukas Wunner
2024-05-04 9:35 ` Nam Cao
2024-05-04 9:51 ` Lukas Wunner
2024-05-04 10:56 ` Nam Cao
2024-05-04 15:02 ` Lukas Wunner
2024-05-04 15:48 ` Nam Cao
2024-05-03 19:23 ` [PATCH 3/4] PCI: hotplug: document unchecked return value of pci_hp_add_bridge() Nam Cao
2024-05-03 19:23 ` [PATCH 4/4] PCI: hotplug: remove TODO notes for sgi_hotplug Nam Cao
2024-05-03 21:29 ` [PATCH 0/4] check returned value of pci_hp_add_bridge() Bjorn Helgaas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox