public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-linus] PCI/bwctrl: Fix NULL pointer deref on unbind and bind
@ 2025-01-02 21:20 Lukas Wunner
  2025-01-02 23:22 ` Bjorn Helgaas
  2025-01-05 16:54 ` Ilpo Järvinen
  0 siblings, 2 replies; 7+ messages in thread
From: Lukas Wunner @ 2025-01-02 21:20 UTC (permalink / raw)
  To: Bjorn Helgaas, Krzysztof Wilczynski
  Cc: linux-pci, Niklas Schnelle, Ilpo Jarvinen, Jonathan Cameron,
	Mika Westerberg, Maciej W. Rozycki, Mario Limonciello,
	Evert Vorster

The interrupt handler for bandwidth notifications, pcie_bwnotif_irq(),
dereferences a "data" pointer.

On unbind, that pointer is set to NULL by pcie_bwnotif_remove().  However
the interrupt handler may still be invoked afterwards and will dereference
that NULL pointer.

That's because the interrupt is requested using a devm_*() helper and the
driver core releases devm_*() resources *after* calling ->remove().

pcie_bwnotif_remove() does clear the Link Bandwidth Management Interrupt
Enable and Link Autonomous Bandwidth Interrupt Enable bits in the Link
Control Register, but that won't prevent execution of pcie_bwnotif_irq():
The interrupt for bandwidth notifications may be shared with AER, DPC,
PME, and hotplug.  So pcie_bwnotif_irq() may be executed as long as the
interrupt is requested.

There's a similar race on bind:  pcie_bwnotif_probe() requests the
interrupt when the "data" pointer still points to NULL.  A NULL pointer
deref may thus likewise occur if AER, DPC, PME or hotplug raise an
interrupt in-between the bandwidth controller's call to devm_request_irq()
and assignment of the "data" pointer.

Drop the devm_*() usage and reorder requesting of the interrupt to fix the
issue.

While at it, drop a stray but harmless no_free_ptr() invocation when
assigning the "data" pointer in pcie_bwnotif_probe().

Evert reports a hang on shutdown of an ASUS ROG Strix SCAR 17 G733PYV.
The issue is no longer reproducible with the present commit.

Evert found that attaching a USB-C monitor prevented the hang.  The
machine contains an ASMedia USB 3.2 controller below a hotplug-capable
Root Port.  So one possible explanation is that the controller gets
hot-removed on shutdown unless something is connected.  And the ensuing
hotplug interrupt occurs exactly when the bandwidth controller is
unregistering.  The precise cause could not be determined because the
screen had already turned black when the hang occurred.

Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
Reported-by: Evert Vorster <evorster@gmail.com>
Tested-by: Evert Vorster <evorster@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219629
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/pci/pcie/bwctrl.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c
index b59cacc..7cd8211 100644
--- a/drivers/pci/pcie/bwctrl.c
+++ b/drivers/pci/pcie/bwctrl.c
@@ -303,17 +303,18 @@ static int pcie_bwnotif_probe(struct pcie_device *srv)
 	if (ret)
 		return ret;
 
-	ret = devm_request_irq(&srv->device, srv->irq, pcie_bwnotif_irq,
-			       IRQF_SHARED, "PCIe bwctrl", srv);
+	scoped_guard(rwsem_write, &pcie_bwctrl_setspeed_rwsem)
+		scoped_guard(rwsem_write, &pcie_bwctrl_lbms_rwsem)
+			port->link_bwctrl = data;
+
+	ret = request_irq(srv->irq, pcie_bwnotif_irq, IRQF_SHARED,
+			  "PCIe bwctrl", srv);
 	if (ret)
-		return ret;
+		goto err_reset_data;
 
-	scoped_guard(rwsem_write, &pcie_bwctrl_setspeed_rwsem) {
-		scoped_guard(rwsem_write, &pcie_bwctrl_lbms_rwsem) {
-			port->link_bwctrl = no_free_ptr(data);
+	scoped_guard(rwsem_write, &pcie_bwctrl_setspeed_rwsem)
+		scoped_guard(rwsem_write, &pcie_bwctrl_lbms_rwsem)
 			pcie_bwnotif_enable(srv);
-		}
-	}
 
 	pci_dbg(port, "enabled with IRQ %d\n", srv->irq);
 
@@ -323,6 +324,12 @@ static int pcie_bwnotif_probe(struct pcie_device *srv)
 		port->link_bwctrl->cdev = NULL;
 
 	return 0;
+
+err_reset_data:
+	scoped_guard(rwsem_write, &pcie_bwctrl_setspeed_rwsem)
+		scoped_guard(rwsem_write, &pcie_bwctrl_lbms_rwsem)
+			port->link_bwctrl = NULL;
+	return ret;
 }
 
 static void pcie_bwnotif_remove(struct pcie_device *srv)
@@ -333,6 +340,8 @@ static void pcie_bwnotif_remove(struct pcie_device *srv)
 
 	pcie_bwnotif_disable(srv->port);
 
+	free_irq(srv->irq, srv);
+
 	scoped_guard(rwsem_write, &pcie_bwctrl_setspeed_rwsem)
 		scoped_guard(rwsem_write, &pcie_bwctrl_lbms_rwsem)
 			srv->port->link_bwctrl = NULL;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-01-08 12:55 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-02 21:20 [PATCH for-linus] PCI/bwctrl: Fix NULL pointer deref on unbind and bind Lukas Wunner
2025-01-02 23:22 ` Bjorn Helgaas
2025-01-05 16:54 ` Ilpo Järvinen
2025-01-07  4:29   ` Lukas Wunner
2025-01-07 14:29     ` Ilpo Järvinen
2025-01-08 12:05       ` Lukas Wunner
2025-01-08 12:55         ` Ilpo Järvinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox