* [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
@ 2024-06-18 10:54 Lukas Wunner
2024-06-18 11:37 ` Mika Westerberg
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Lukas Wunner @ 2024-06-18 10:54 UTC
To: Bjorn Helgaas, Keith Busch; +Cc: Mika Westerberg, linux-pci

Keith reports a use-after-free when a DPC event occurs concurrently to
hot-removal of the same portion of the hierarchy:

The dpc_handler() awaits readiness of the secondary bus below the
Downstream Port where the DPC event occurred. To do so, it polls the
config space of the first child device on the secondary bus. If that
child device is concurrently removed, accesses to its struct pci_dev
cause the kernel to oops.

That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
reference on the child device. Before v6.3, the function was only
called on resume from system sleep or on runtime resume. Holding a
reference wasn't necessary back then because the pciehp IRQ thread
could never run concurrently. (On resume from system sleep, IRQs are
not enabled until after the resume_noirq phase. And runtime resume is
always awaited before a PCI device is removed.)

However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
of secondary bus after reset"), which introduced that, failed to
appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
reference on the child device because dpc_handler() and pciehp may
indeed run concurrently. The commit was backported to v5.10+ stable
kernels, so that's the oldest one affected.

Add the missing reference acquisition.

Abridged stack trace:

  BUG: unable to handle page fault for address: 00000000091400c0
  CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
  RIP: pci_bus_read_config_dword+0x17/0x50
  pci_dev_wait()
  pci_bridge_wait_for_secondary_bus()
  dpc_reset_link()
  pcie_do_recovery()
  dpc_handler()

Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
Reported-by: Keith Busch <kbusch@kernel.org>
Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
Tested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Cc: stable@vger.kernel.org # v5.10+
---
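Note below the cut line, not part of the commit message: the pattern
applied by the patch can be modeled in user space. This is only a
minimal sketch under stated assumptions. struct fake_dev,
fake_dev_get(), fake_dev_put() and wait_with_reference() are
hypothetical stand-ins and not kernel APIs. The concurrent removal is
simulated sequentially instead of with a real second thread. The
"cleanup" variable attribute is a GCC/Clang extension, which is what
the kernel's __free() builds on.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: only a reference count. */
struct fake_dev {
	int refcount;
	int freed;	/* set once the last reference is dropped */
};

static struct fake_dev *fake_dev_get(struct fake_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void fake_dev_put(struct fake_dev *d)
{
	if (d && --d->refcount == 0)
		d->freed = 1;	/* a real put would free the object here */
}

/* Cleanup handler: the compiler passes a pointer to the variable. */
static void fake_dev_put_cleanup(struct fake_dev **p)
{
	fake_dev_put(*p);
}

#define __free_fake_dev __attribute__((cleanup(fake_dev_put_cleanup)))

/*
 * Models the fixed wait function: pin the child, let the simulated
 * hot-removal drop its own reference mid-wait, and report whether the
 * object was still alive while we were using it.
 */
static int wait_with_reference(struct fake_dev *first_child)
{
	struct fake_dev *child __free_fake_dev = NULL;

	/* In the patch this happens while pci_bus_sem is still held. */
	child = fake_dev_get(first_child);

	/* Simulated concurrent removal dropping the list's reference. */
	fake_dev_put(first_child);

	/* 1 means the object is still valid to dereference. */
	return child->freed == 0;
}
```

The reference pinned by the waiter is dropped automatically when
"child" goes out of scope, on every return path, so no exit path of
the function needs an explicit put.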
 drivers/pci/pci.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d2c388761ba9..0e7cb507a5d6 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4753,7 +4753,7 @@ static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
  */
 int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
 {
-	struct pci_dev *child;
+	struct pci_dev *child __free(pci_dev_put) = NULL;
 	int delay;
 
 	if (pci_dev_is_disconnected(dev))
@@ -4782,8 +4782,8 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
 		return 0;
 	}
 
-	child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
-				 bus_list);
+	child = pci_dev_get(list_first_entry(&dev->subordinate->devices,
+					     struct pci_dev, bus_list));
 	up_read(&pci_bus_sem);
 
 	/*
-- 
2.43.0
* Re: [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
From: Mika Westerberg @ 2024-06-18 11:37 UTC
To: Lukas Wunner; +Cc: Bjorn Helgaas, Keith Busch, linux-pci
On Tue, Jun 18, 2024 at 12:54:55PM +0200, Lukas Wunner wrote:
> Keith reports a use-after-free when a DPC event occurs concurrently to
> hot-removal of the same portion of the hierarchy:
>
> The dpc_handler() awaits readiness of the secondary bus below the
> Downstream Port where the DPC event occurred. To do so, it polls the
> config space of the first child device on the secondary bus. If that
> child device is concurrently removed, accesses to its struct pci_dev
> cause the kernel to oops.
>
> That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
> reference on the child device. Before v6.3, the function was only
> called on resume from system sleep or on runtime resume. Holding a
> reference wasn't necessary back then because the pciehp IRQ thread
> could never run concurrently. (On resume from system sleep, IRQs are
> not enabled until after the resume_noirq phase. And runtime resume is
> always awaited before a PCI device is removed.)
>
> However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
> called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
> of secondary bus after reset"), which introduced that, failed to
> appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
> reference on the child device because dpc_handler() and pciehp may
> indeed run concurrently. The commit was backported to v5.10+ stable
> kernels, so that's the oldest one affected.
>
> Add the missing reference acquisition.
>
> Abridged stack trace:
>
> BUG: unable to handle page fault for address: 00000000091400c0
> CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
> RIP: pci_bus_read_config_dword+0x17/0x50
> pci_dev_wait()
> pci_bridge_wait_for_secondary_bus()
> dpc_reset_link()
> pcie_do_recovery()
> dpc_handler()
>
> Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
> Reported-by: Keith Busch <kbusch@kernel.org>
> Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
> Tested-by: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
* Re: [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
From: Keith Busch @ 2024-06-18 16:12 UTC
To: Lukas Wunner; +Cc: Bjorn Helgaas, Mika Westerberg, linux-pci
On Tue, Jun 18, 2024 at 12:54:55PM +0200, Lukas Wunner wrote:
> However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
> called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
> of secondary bus after reset"), which introduced that, failed to
> appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
> reference on the child device because dpc_handler() and pciehp may
> indeed run concurrently. The commit was backported to v5.10+ stable
> kernels, so that's the oldest one affected.
Caution on applying this to 5.10 and 5.15 stable branches: they don't
have the fancy "__free" cleanup you're using here. The newer active
stables are okay, though.
> int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
> {
> - struct pci_dev *child;
> + struct pci_dev *child __free(pci_dev_put) = NULL;
> int delay;
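
For trees that lack __free(), the same fix could presumably be written
with an explicit put on every exit path. A user-space sketch of that
shape follows. It is an illustration only: struct demo_dev,
demo_dev_get(), demo_dev_put(), demo_poll_ready() and
wait_explicit_put() are hypothetical names, not the kernel functions,
and this is not the reworked patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct demo_dev {
	int refcount;
};

static struct demo_dev *demo_dev_get(struct demo_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void demo_dev_put(struct demo_dev *d)
{
	if (d)
		d->refcount--;
}

/* Hypothetical readiness poll: 0 when ready, negative on timeout. */
static int demo_poll_ready(const struct demo_dev *d, int ready)
{
	(void)d;
	return ready ? 0 : -1;
}

/*
 * Without scope-based cleanup, every path after the get must reach
 * the put, so the exits are funneled through a single label.
 */
static int wait_explicit_put(struct demo_dev *first_child, int ready)
{
	struct demo_dev *child;
	int ret;

	child = demo_dev_get(first_child);

	ret = demo_poll_ready(child, ready);
	if (ret)
		goto out;	/* error path still drops the reference */

	/* ... any further waiting on the child would go here ... */

out:
	demo_dev_put(child);
	return ret;
}
```

The reference count is balanced on both the success and the timeout
path. The cost compared to __free() is that each newly added early
return has to remember to go through the "out" label.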
* Re: [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
From: Lukas Wunner @ 2024-06-18 16:56 UTC
To: Keith Busch; +Cc: Bjorn Helgaas, Mika Westerberg, linux-pci
On Tue, Jun 18, 2024 at 10:12:32AM -0600, Keith Busch wrote:
> On Tue, Jun 18, 2024 at 12:54:55PM +0200, Lukas Wunner wrote:
> > However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
> > called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
> > of secondary bus after reset"), which introduced that, failed to
> > appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
> > reference on the child device because dpc_handler() and pciehp may
> > indeed run concurrently. The commit was backported to v5.10+ stable
> > kernels, so that's the oldest one affected.
>
> Caution on applying this to 5.10 and 5.15 stable branches: they don't
> have the fancy "__free" cleanup you're using here. The newer active
> stables are okay, though.
I'll let Greg & Sasha know when they start applying this to stable
kernels that ced085ef369a is a prerequisite for v5.10-stable and
v5.15-stable. I can rework the patch if they don't want to apply
ced085ef369a to these older versions.
Thanks,
Lukas
* Re: [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
From: Keith Busch @ 2024-06-26 19:38 UTC
To: Lukas Wunner; +Cc: Bjorn Helgaas, Mika Westerberg, linux-pci
On Tue, Jun 18, 2024 at 12:54:55PM +0200, Lukas Wunner wrote:
> Add the missing reference acquisition.
>
> Abridged stack trace:
>
> BUG: unable to handle page fault for address: 00000000091400c0
> CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
> RIP: pci_bus_read_config_dword+0x17/0x50
> pci_dev_wait()
> pci_bridge_wait_for_secondary_bus()
> dpc_reset_link()
> pcie_do_recovery()
> dpc_handler()
>
> Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
> Reported-by: Keith Busch <kbusch@kernel.org>
> Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
> Tested-by: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Keith Busch <kbusch@kernel.org>
> Cc: stable@vger.kernel.org # v5.10+
Hey, it's been some time, so just wanted to check on this patch status.
It's still a good fix.
* Re: [PATCH] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
From: Krzysztof Wilczyński @ 2024-07-01 20:26 UTC
To: Lukas Wunner; +Cc: Bjorn Helgaas, Keith Busch, Mika Westerberg, linux-pci
Hello,
> Keith reports a use-after-free when a DPC event occurs concurrently to
> hot-removal of the same portion of the hierarchy:
>
> The dpc_handler() awaits readiness of the secondary bus below the
> Downstream Port where the DPC event occurred. To do so, it polls the
> config space of the first child device on the secondary bus. If that
> child device is concurrently removed, accesses to its struct pci_dev
> cause the kernel to oops.
>
> That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
> reference on the child device. Before v6.3, the function was only
> called on resume from system sleep or on runtime resume. Holding a
> reference wasn't necessary back then because the pciehp IRQ thread
> could never run concurrently. (On resume from system sleep, IRQs are
> not enabled until after the resume_noirq phase. And runtime resume is
> always awaited before a PCI device is removed.)
>
> However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
> called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
> of secondary bus after reset"), which introduced that, failed to
> appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
> reference on the child device because dpc_handler() and pciehp may
> indeed run concurrently. The commit was backported to v5.10+ stable
> kernels, so that's the oldest one affected.
>
> Add the missing reference acquisition.
>
> Abridged stack trace:
>
> BUG: unable to handle page fault for address: 00000000091400c0
> CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
> RIP: pci_bus_read_config_dword+0x17/0x50
> pci_dev_wait()
> pci_bridge_wait_for_secondary_bus()
> dpc_reset_link()
> pcie_do_recovery()
> dpc_handler()
Applied to dpc, thank you!
[1/1] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
https://git.kernel.org/pci/pci/c/11a1f4bc4736
Krzysztof