Linux CXL
* [PATCH] cxl/port: Fix use after free of parent_port in cxl_detach_ep()
@ 2026-02-18  6:15 Alison Schofield
  2026-02-18 15:30 ` Dave Jiang
  0 siblings, 1 reply; 2+ messages in thread
From: Alison Schofield @ 2026-02-18  6:15 UTC
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

cxl_detach_ep() is called during bottom-up removal when all CXL memory
devices beneath a switch port have been removed. For each port in the
hierarchy it locks both the port and its parent, removes the endpoint,
and if the port is now empty, marks it dead and unregisters the port
by calling delete_switch_port(). There are two places during this work
where parent_port may be used after it has been freed:

First, a concurrent detach may have already processed a port by the
time a second worker finds it via bus_find_device(). Without pinning
parent_port, it may already be freed when the second worker discovers
port->dead and attempts to unlock parent_port. In a production kernel
that is silent memory corruption. With lock debugging enabled, it
looks like this:

[]DEBUG_LOCKS_WARN_ON(__owner_task(owner) != get_current())
[]WARNING: kernel/locking/mutex.c:949 at __mutex_unlock_slowpath+0x1ee/0x310
[]Call Trace:
[]mutex_unlock+0xd/0x20
[]cxl_detach_ep+0x180/0x400 [cxl_core]
[]devm_action_release+0x10/0x20
[]devres_release_all+0xa8/0xe0
[]device_unbind_cleanup+0xd/0xa0
[]really_probe+0x1a6/0x3e0

Fix this first case by checking port->dead right after acquiring both
locks. If the port is already dead, unlock both devices and drop the
parent_port reference before continuing to the next port.
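
As an illustration only, the re-check-under-lock pattern can be modeled
in userspace C. This is a single-threaded sketch, not kernel code:
struct model_port, model_port_lock(), model_port_unlock() and
model_detach() are hypothetical names, and the "lock" is just a flag
with assertions standing in for device_lock()/device_unlock().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded stand-in for struct cxl_port. */
struct model_port {
	bool locked;
	bool dead;
	int endpoints;
};

/* Model of device_lock(): only tracks that the lock is held. */
static void model_port_lock(struct model_port *port)
{
	assert(!port->locked);
	port->locked = true;
}

/* Model of device_unlock(): must pair with a prior lock. */
static void model_port_unlock(struct model_port *port)
{
	assert(port->locked);
	port->locked = false;
}

/*
 * Returns true if this call marked the port dead, false if the port
 * was already dead (an earlier detach won the race) or still has
 * endpoints attached.
 */
static bool model_detach(struct model_port *port)
{
	bool died = false;

	model_port_lock(port);
	if (port->dead) {
		/* Already torn down: unlock and skip, touch nothing else. */
		model_port_unlock(port);
		return false;
	}
	if (port->endpoints > 0)
		port->endpoints--;
	if (port->endpoints == 0) {
		port->dead = true;
		died = true;
	}
	model_port_unlock(port);
	return died;
}
```

Calling model_detach() twice on a port with one endpoint models the
race: the first call marks the port dead, and the second call sees
port->dead under the lock and backs out without doing any work.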

Second, delete_switch_port() releases three devm actions registered
against parent_port. The last of those is unregister_port() and it
calls device_unregister() on the child port, which can cascade. If
parent_port is now also empty, the device core may unregister and free
it too. So by the time delete_switch_port() returns, parent_port may
have been freed, and the subsequent device_unlock(&parent_port->dev)
operates on freed memory. The kernel log looks the same as above, with
a different offset in cxl_detach_ep().

Fix this second issue by taking an extra reference on parent_port
before locking it, preventing the memory from being freed across
delete_switch_port(). Release it after device_unlock().
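
The pin-across-teardown idea can be sketched the same way in userspace
C. Everything here is an assumption made for illustration: struct
model_dev and the model_*() helpers are hypothetical, "released" is a
flag instead of a real free, and model_delete_child() merely stands in
for a cascade that drops the last registration reference on the parent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted struct device. */
struct model_dev {
	int refs;
	bool locked;
	bool released;	/* set when the last reference is dropped */
};

static void model_get(struct model_dev *dev)
{
	assert(!dev->released);
	dev->refs++;
}

static void model_put(struct model_dev *dev)
{
	assert(dev->refs > 0);
	if (--dev->refs == 0)
		dev->released = true;	/* real code would free here */
}

static void model_lock(struct model_dev *dev)
{
	assert(!dev->released && !dev->locked);
	dev->locked = true;
}

static void model_unlock(struct model_dev *dev)
{
	/* Unlocking a released object is the bug being modeled. */
	assert(!dev->released);
	assert(dev->locked);
	dev->locked = false;
}

/* Models a cascading teardown that drops the registration reference. */
static void model_delete_child(struct model_dev *parent)
{
	model_put(parent);
}

/* Returns true if parent was still alive at the point of unlock. */
static bool model_detach_pinned(struct model_dev *parent)
{
	bool alive_at_unlock;

	model_get(parent);		/* pin before taking the lock */
	model_lock(parent);
	model_delete_child(parent);	/* may drop the registration ref */
	alive_at_unlock = !parent->released;
	model_unlock(parent);
	model_put(parent);		/* final put may release parent */
	return alive_at_unlock;
}
```

Starting from a single registration reference, the extra get keeps the
count above zero across the teardown, so the unlock operates on a live
object and the release only happens at the final put.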

Both cases reproduce easily by reloading cxl_acpi in a QEMU
environment with CXL devices present.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---


This was found while trying out unit test cases to backstop DaveJ's
latest finding where QEMU devices with CXL unit tests exposed an
nvdimm bus race. I post this with some skepticism about how likely it
is to appear in the wild. Maybe it would, just not in the way my test
invokes it. A Fixes tag was not obvious, but I can find the best tag,
if any, in a v2.


 drivers/cxl/core/port.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index fea8d5f5f331..94cf6b248e0d 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1533,8 +1533,18 @@ static void cxl_detach_ep(void *data)
 		port = to_cxl_port(dev);
 
 		parent_port = to_cxl_port(port->dev.parent);
+		get_device(&parent_port->dev);
 		device_lock(&parent_port->dev);
 		device_lock(&port->dev);
+
+		/* A concurrent detach may have already removed this port */
+		if (port->dead) {
+			device_unlock(&port->dev);
+			device_unlock(&parent_port->dev);
+			put_device(&parent_port->dev);
+			continue;
+		}
+
 		ep = cxl_ep_load(port, cxlmd);
 		dev_dbg(&cxlmd->dev, "disconnect %s from %s\n",
 			ep ? dev_name(ep->ep) : "", dev_name(&port->dev));
@@ -1553,11 +1563,19 @@ static void cxl_detach_ep(void *data)
 		device_unlock(&port->dev);
 
 		if (died) {
+			/*
+			 * The reference taken on parent_port at the top of
+			 * the loop is held across delete_switch_port() since
+			 * unregister_port(port) may cascade and unregister
+			 * parent_port, freeing it before device_unlock().
+			 */
 			dev_dbg(&cxlmd->dev, "delete %s\n",
 				dev_name(&port->dev));
 			delete_switch_port(port);
 		}
+
 		device_unlock(&parent_port->dev);
+		put_device(&parent_port->dev);
 	}
 }
 

base-commit: 49d273f81f3dad288b7748c6cfb973705ae026d2
-- 
2.37.3

