* [PATCH 0/2] Fix port enumeration failure and NULL endpoint issue
@ 2026-02-01 9:30 Li Ming
2026-02-01 9:30 ` [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default Li Ming
2026-02-01 9:30 ` [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding Li Ming
0 siblings, 2 replies; 20+ messages in thread
From: Li Ming @ 2026-02-01 9:30 UTC (permalink / raw)
To: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
Cc: linux-cxl, linux-kernel, Li Ming
I ran CXL mock testing with next branch, I usually hit the following
call trace.
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
CPU: 3 UID: 0 PID: 42 Comm: kworker/u16:1 Tainted: G O J 6.19.0-rc5-cxl+ #4 PREEMPT(voluntary)
Tainted: [O]=OOT_MODULE, [J]=FWCTL
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
Workqueue: async async_run_entry_fn
RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
Call Trace:
<TASK>
cxl_event_trace_record+0xd1/0xa70 [cxl_core]
__cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
cxl_mem_get_records_log+0x261/0x500 [cxl_core]
cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
platform_probe+0x9d/0x130
really_probe+0x1c8/0x960
driver_probe_device+0x45/0x120
__device_attach_driver+0x15d/0x280
bus_for_each_drv+0x100/0x180
__device_attach_async_helper+0x199/0x250
async_run_entry_fn+0x95/0x430
process_one_work+0x7db/0x1940
After detailed debugging, I identified two independent issues that
together leads to the problem.
Issue 1:
cxlmd->endpoint is initialized to ERR_PTR(-ENXIO) during cxlmd creation,
but cxl subsystem usually checks endpoint availability by checking
whether it is NULL. As a result, if endpoint port creation fails, some
code paths may incorrectly treat the endpoint as available. In the
call trace above, endpoint port creation fails but cxl_dpa_to_region()
still considers that is available.
Patch #1 is used to fix it, the solution is initializing cxlmd->endpoint
to NULL by default.
Issue 2:
The second issue is why CXL port enumeration could be failure. What I
observed is when two memdev were trying to enumerate a same port, the
first memdev was responsible for port creation and attaching. However,
there is a small window between the point where the new port becomes
visible(after being added to the device list of cxl bus) and when it is
bound to the port driver. During this window, the second memdev may
discover the port and acquire its lock while attempting to add its
dport, which blocks bus_probe_device() inside device_add(). As a result,
the second memdev observes the port as unbound and fails to add its
dport.
Patch #2 fixes this race by holding the grandparent port lock during
dport addition, preventing premature access before driver binding
completed.
base-commit: 63050be0bfe0b280cce5d701b31940fd84858609 cxl/next
Li Ming (2):
cxl/core: Set cxlmd->endpoint to NULL by default
cxl/core: Hold grandparent port lock for dport adding.
drivers/cxl/core/memdev.c | 2 +-
drivers/cxl/core/port.c | 6 +++++-
2 files changed, 6 insertions(+), 2 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
2026-02-01 9:30 [PATCH 0/2] Fix port enumeration failure and NULL endpoint issue Li Ming
@ 2026-02-01 9:30 ` Li Ming
2026-02-02 14:41 ` Jonathan Cameron
` (2 more replies)
2026-02-01 9:30 ` [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding Li Ming
1 sibling, 3 replies; 20+ messages in thread
From: Li Ming @ 2026-02-01 9:30 UTC (permalink / raw)
To: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
Cc: linux-cxl, linux-kernel, Li Ming
CXL testing environment can trigger following trace
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
Call Trace:
<TASK>
cxl_event_trace_record+0xd1/0xa70 [cxl_core]
__cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
cxl_mem_get_records_log+0x261/0x500 [cxl_core]
cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
platform_probe+0x9d/0x130
really_probe+0x1c8/0x960
__driver_probe_device+0x187/0x3e0
driver_probe_device+0x45/0x120
__device_attach_driver+0x15d/0x280
commit 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
initializes cxlmd->endpoint to ERR_PTR(-ENXIO) in cxl_memdev_alloc().
However, cxl_dpa_to_region() treats a non-NULL cxlmd->endpoint as a
valid endpoint.
Across the CXL core, endpoint availability is generally determined by
checking whether it is NULL. Align with this convention by initializing
cxlmd->endpoint to NULL by default.
Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
Signed-off-by: Li Ming <ming.li@zohomail.com>
---
drivers/cxl/core/memdev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index af3d0cc65138..41a507b5daa4 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -675,7 +675,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
cxlmd->id = rc;
cxlmd->depth = -1;
cxlmd->attach = attach;
- cxlmd->endpoint = ERR_PTR(-ENXIO);
+ cxlmd->endpoint = NULL;
dev = &cxlmd->dev;
device_initialize(dev);
--
2.43.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-01 9:30 [PATCH 0/2] Fix port enumeration failure and NULL endpoint issue Li Ming
2026-02-01 9:30 ` [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default Li Ming
@ 2026-02-01 9:30 ` Li Ming
2026-02-02 15:39 ` Jonathan Cameron
` (2 more replies)
1 sibling, 3 replies; 20+ messages in thread
From: Li Ming @ 2026-02-01 9:30 UTC (permalink / raw)
To: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
Cc: linux-cxl, linux-kernel, Li Ming
When CXL subsystem adds a cxl port to a hierarchy, there is a small
window where the new port becomes visible before it is bound to a
driver. This happens because device_add() adds a device to bus device
list before bus_probe_device() binds it to a driver.
So if two cxl memdevs are trying to add a dport to a same port via
devm_cxl_enumerate_ports(), the second cxl memdev may observe the port
and attempt to add a dport, but fails because the port has not yet been
attached to cxl port driver.
the sequence is like:
CPU 0 CPU 1
devm_cxl_enumerate_ports()
# port not found, add it
add_port_attach_ep()
# hold the parent port lock
# to add the new port
devm_cxl_create_port()
device_add()
# Add dev to bus devs list
bus_add_device()
devm_cxl_enumerate_ports()
# found the port
find_cxl_port_by_uport()
# hold port lock to add a dport
device_lock(the port)
find_or_add_dport()
cxl_port_add_dport()
return -ENXIO because port->dev.driver is NULL
device_unlock(the port)
bus_probe_device()
# hold the port lock
# for attaching
device_lock(the port)
attaching the new port
device_unlock(the port)
To fix this race, require that dport addition holds the parent port lock
of the target port. The CXL subsystem already requires holding the
parent port lock while attaching a new port. Therefore, successfully
acquiring the parent port lock ganrantees that port attaching has
completed.
Fixes: 4f06d81e7c6a ("cxl: Defer dport allocation for switch ports")
Signed-off-by: Li Ming <ming.li@zohomail.com>
---
drivers/cxl/core/port.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 54f72452fb06..fef2fe913e1f 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1817,8 +1817,12 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
/*
* RP port enumerated by cxl_acpi without dport will
* have the dport added here.
+ *
+ * Hold the parent port lock here to in case that the
+ * port can be observed but has not been attached yet.
*/
- scoped_guard(device, &port->dev) {
+ scoped_guard(device, &parent_port_of(port)->dev) {
+ guard(device)(&port->dev);
dport = find_or_add_dport(port, dport_dev);
if (IS_ERR(dport)) {
if (PTR_ERR(dport) == -EAGAIN)
--
2.43.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
2026-02-01 9:30 ` [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default Li Ming
@ 2026-02-02 14:41 ` Jonathan Cameron
2026-02-02 15:48 ` Gregory Price
2026-02-03 14:15 ` Li Ming
2026-02-02 21:04 ` Dave Jiang
2026-02-03 0:01 ` dan.j.williams
2 siblings, 2 replies; 20+ messages in thread
From: Jonathan Cameron @ 2026-02-02 14:41 UTC (permalink / raw)
To: Li Ming
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, linux-cxl, linux-kernel
On Sun, 1 Feb 2026 17:30:01 +0800
Li Ming <ming.li@zohomail.com> wrote:
> CXL testing environment can trigger following trace
>
> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
> RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
> Call Trace:
> <TASK>
> cxl_event_trace_record+0xd1/0xa70 [cxl_core]
> __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
> cxl_mem_get_records_log+0x261/0x500 [cxl_core]
> cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
> cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
> platform_probe+0x9d/0x130
> really_probe+0x1c8/0x960
> __driver_probe_device+0x187/0x3e0
> driver_probe_device+0x45/0x120
> __device_attach_driver+0x15d/0x280
>
> commit 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> initializes cxlmd->endpoint to ERR_PTR(-ENXIO) in cxl_memdev_alloc().
> However, cxl_dpa_to_region() treats a non-NULL cxlmd->endpoint as a
> valid endpoint.
>
> Across the CXL core, endpoint availability is generally determined by
> checking whether it is NULL. Align with this convention by initializing
> cxlmd->endpoint to NULL by default.
I had a look at whether it made sense to use use IS_ERR_OR_NULL() to check
for validity of the endpoint, but it would be somewhat fiddly and I think
you are correct that convention here seems to be NULL means not set.
We don't need the error code. One comment inline.
Either way nice catch
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>
> Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> Signed-off-by: Li Ming <ming.li@zohomail.com>
> ---
> drivers/cxl/core/memdev.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index af3d0cc65138..41a507b5daa4 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -675,7 +675,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> cxlmd->id = rc;
> cxlmd->depth = -1;
> cxlmd->attach = attach;
> - cxlmd->endpoint = ERR_PTR(-ENXIO);
> + cxlmd->endpoint = NULL;
cxlmd has just been allocated with kzalloc so I'd argue we don't need this to be explicitly
set at all. Seems like a natural and safe default.
>
> dev = &cxlmd->dev;
> device_initialize(dev);
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-01 9:30 ` [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding Li Ming
@ 2026-02-02 15:39 ` Jonathan Cameron
2026-02-03 14:23 ` Li Ming
2026-02-02 16:31 ` Gregory Price
2026-02-03 0:07 ` dan.j.williams
2 siblings, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2026-02-02 15:39 UTC (permalink / raw)
To: Li Ming
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, linux-cxl, linux-kernel
On Sun, 1 Feb 2026 17:30:02 +0800
Li Ming <ming.li@zohomail.com> wrote:
> When CXL subsystem adds a cxl port to a hierarchy, there is a small
> window where the new port becomes visible before it is bound to a
> driver. This happens because device_add() adds a device to bus device
> list before bus_probe_device() binds it to a driver.
> So if two cxl memdevs are trying to add a dport to a same port via
> devm_cxl_enumerate_ports(), the second cxl memdev may observe the port
> and attempt to add a dport, but fails because the port has not yet been
> attached to cxl port driver.
> the sequence is like:
>
> CPU 0 CPU 1
> devm_cxl_enumerate_ports()
> # port not found, add it
> add_port_attach_ep()
> # hold the parent port lock
> # to add the new port
> devm_cxl_create_port()
> device_add()
> # Add dev to bus devs list
> bus_add_device()
> devm_cxl_enumerate_ports()
> # found the port
Indenting not consistent here as this call is in devm_cxl_enumerate_ports()
> find_cxl_port_by_uport()
> # hold port lock to add a dport
> device_lock(the port)
> find_or_add_dport()
> cxl_port_add_dport()
> return -ENXIO because port->dev.driver is NULL
> device_unlock(the port)
> bus_probe_device()
> # hold the port lock
> # for attaching
> device_lock(the port)
> attaching the new port
> device_unlock(the port)
>
> To fix this race, require that dport addition holds the parent port lock
> of the target port. The CXL subsystem already requires holding the
> parent port lock while attaching a new port. Therefore, successfully
> acquiring the parent port lock ganrantees that port attaching has
Spell check. Guarantees
> completed.
>
> Fixes: 4f06d81e7c6a ("cxl: Defer dport allocation for switch ports")
> Signed-off-by: Li Ming <ming.li@zohomail.com>
Analysis looks reasonable to me, but I'm not hugely confident on this
one so would like others to take a close look as well.
Question inline.
> ---
> drivers/cxl/core/port.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 54f72452fb06..fef2fe913e1f 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1817,8 +1817,12 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> /*
> * RP port enumerated by cxl_acpi without dport will
> * have the dport added here.
> + *
> + * Hold the parent port lock here to in case that the
> + * port can be observed but has not been attached yet.
> */
> - scoped_guard(device, &port->dev) {
> + scoped_guard(device, &parent_port_of(port)->dev) {
I'm nervous about whether this is the right lock. For unregister_port()
(which is easier to track down that the add path locking) the lock
taken depends on where the port is that is being unregistered.
Specifically root ports are unregistered under parent->uport_dev, not
parent->dev.
> + guard(device)(&port->dev);
> dport = find_or_add_dport(port, dport_dev);
> if (IS_ERR(dport)) {
> if (PTR_ERR(dport) == -EAGAIN)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
2026-02-02 14:41 ` Jonathan Cameron
@ 2026-02-02 15:48 ` Gregory Price
2026-02-03 14:15 ` Li Ming
1 sibling, 0 replies; 20+ messages in thread
From: Gregory Price @ 2026-02-02 15:48 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Li Ming, dave, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams, linux-cxl, linux-kernel
On Mon, Feb 02, 2026 at 02:41:03PM +0000, Jonathan Cameron wrote:
> On Sun, 1 Feb 2026 17:30:01 +0800
> Li Ming <ming.li@zohomail.com> wrote:
>
> > CXL testing environment can trigger following trace
> >
> > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
> > KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
> > RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
> > Call Trace:
> > <TASK>
> > cxl_event_trace_record+0xd1/0xa70 [cxl_core]
> > __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
> > cxl_mem_get_records_log+0x261/0x500 [cxl_core]
> > cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
> > cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
> > platform_probe+0x9d/0x130
> > really_probe+0x1c8/0x960
> > __driver_probe_device+0x187/0x3e0
> > driver_probe_device+0x45/0x120
> > __device_attach_driver+0x15d/0x280
> >
> > commit 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> > initializes cxlmd->endpoint to ERR_PTR(-ENXIO) in cxl_memdev_alloc().
> > However, cxl_dpa_to_region() treats a non-NULL cxlmd->endpoint as a
> > valid endpoint.
> >
> > Across the CXL core, endpoint availability is generally determined by
> > checking whether it is NULL. Align with this convention by initializing
> > cxlmd->endpoint to NULL by default.
>
> I had a look at whether it made sense to use use IS_ERR_OR_NULL() to check
> for validity of the endpoint, but it would be somewhat fiddly and I think
> you are correct that convention here seems to be NULL means not set.
> We don't need the error code. One comment inline.
>
doing validity checks on pointers by checking for null is a pretty
common convention kernel-wide, I would consider setting some structure's
value to an ERR_PTR to be the aberration.
So yeah, good catch
Reviewed-by: Gregory Price <gourry@gourry.net>
~Gregory
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-01 9:30 ` [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding Li Ming
2026-02-02 15:39 ` Jonathan Cameron
@ 2026-02-02 16:31 ` Gregory Price
2026-02-03 14:33 ` Li Ming
2026-02-03 0:07 ` dan.j.williams
2 siblings, 1 reply; 20+ messages in thread
From: Gregory Price @ 2026-02-02 16:31 UTC (permalink / raw)
To: Li Ming
Cc: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams, linux-cxl,
linux-kernel
On Sun, Feb 01, 2026 at 05:30:02PM +0800, Li Ming wrote:
> When CXL subsystem adds a cxl port to a hierarchy, there is a small
> window where the new port becomes visible before it is bound to a
> driver. This happens because device_add() adds a device to bus device
> list before bus_probe_device() binds it to a driver.
> So if two cxl memdevs are trying to add a dport to a same port via
> devm_cxl_enumerate_ports(), the second cxl memdev may observe the port
> and attempt to add a dport, but fails because the port has not yet been
> attached to cxl port driver.
> the sequence is like:
>
> CPU 0 CPU 1
> devm_cxl_enumerate_ports()
> # port not found, add it
> add_port_attach_ep()
> # hold the parent port lock
> # to add the new port
> devm_cxl_create_port()
> device_add()
> # Add dev to bus devs list
> bus_add_device()
> devm_cxl_enumerate_ports()
> # found the port
> find_cxl_port_by_uport()
> # hold port lock to add a dport
> device_lock(the port)
> find_or_add_dport()
> cxl_port_add_dport()
> return -ENXIO because port->dev.driver is NULL
> device_unlock(the port)
> bus_probe_device()
> # hold the port lock
> # for attaching
> device_lock(the port)
> attaching the new port
> device_unlock(the port)
>
> To fix this race, require that dport addition holds the parent port lock
> of the target port. The CXL subsystem already requires holding the
> parent port lock while attaching a new port. Therefore, successfully
> acquiring the parent port lock ganrantees that port attaching has
> completed.
>
With just a a cursory look, I'm immediately concerned that you're fixing
a race condition with a lock inversion.
Can you guarantee the following is not happening
Thread A Thread B
----------------------------
lock(parent) lock(port)
lock(port) lock(parent)
~Gregory
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
2026-02-01 9:30 ` [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default Li Ming
2026-02-02 14:41 ` Jonathan Cameron
@ 2026-02-02 21:04 ` Dave Jiang
2026-02-03 15:04 ` Li Ming
2026-02-03 0:01 ` dan.j.williams
2 siblings, 1 reply; 20+ messages in thread
From: Dave Jiang @ 2026-02-02 21:04 UTC (permalink / raw)
To: Li Ming, dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, dan.j.williams
Cc: linux-cxl, linux-kernel
On 2/1/26 2:30 AM, Li Ming wrote:
> CXL testing environment can trigger following trace
>
> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
> RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
> Call Trace:
> <TASK>
> cxl_event_trace_record+0xd1/0xa70 [cxl_core]
> __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
> cxl_mem_get_records_log+0x261/0x500 [cxl_core]
> cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
> cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
> platform_probe+0x9d/0x130
> really_probe+0x1c8/0x960
> __driver_probe_device+0x187/0x3e0
> driver_probe_device+0x45/0x120
> __device_attach_driver+0x15d/0x280
So I talked to Dan and the assigning of ERR_PTR(-ENXIO) is from a previous revision of implementation and isn't intended to be there. So removing the line entirely (which would mean endpoint == NULL from kzalloc) is probably the right change. However, I'm not able to reproduce the error. How is it triggered? Also, by setting it to NULL, are we covering up a different issue? Are you able to determine whether cxlmd->endpoint at that point is unexpected and perhaps triggered by a different bug?
DJ
>
> commit 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> initializes cxlmd->endpoint to ERR_PTR(-ENXIO) in cxl_memdev_alloc().
> However, cxl_dpa_to_region() treats a non-NULL cxlmd->endpoint as a
> valid endpoint.
>
> Across the CXL core, endpoint availability is generally determined by
> checking whether it is NULL. Align with this convention by initializing
> cxlmd->endpoint to NULL by default.
>
> Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> Signed-off-by: Li Ming <ming.li@zohomail.com>
> ---
> drivers/cxl/core/memdev.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index af3d0cc65138..41a507b5daa4 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -675,7 +675,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> cxlmd->id = rc;
> cxlmd->depth = -1;
> cxlmd->attach = attach;
> - cxlmd->endpoint = ERR_PTR(-ENXIO);
> + cxlmd->endpoint = NULL;
>
> dev = &cxlmd->dev;
> device_initialize(dev);
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
2026-02-01 9:30 ` [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default Li Ming
2026-02-02 14:41 ` Jonathan Cameron
2026-02-02 21:04 ` Dave Jiang
@ 2026-02-03 0:01 ` dan.j.williams
2026-02-03 15:15 ` Li Ming
2 siblings, 1 reply; 20+ messages in thread
From: dan.j.williams @ 2026-02-03 0:01 UTC (permalink / raw)
To: Li Ming, dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
Cc: linux-cxl, linux-kernel, Li Ming
Li Ming wrote:
> CXL testing environment can trigger following trace
>
> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
> RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
> Call Trace:
> <TASK>
> cxl_event_trace_record+0xd1/0xa70 [cxl_core]
> __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
> cxl_mem_get_records_log+0x261/0x500 [cxl_core]
> cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
> cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
Hooray for unit tests, but I wonder why this failure is so reliable for
you and absent for me? I retried with latest cxl/next, no luck.
No matter, it still looks like something worth addressing, but not with
setting ->endpoint to NULL.
> platform_probe+0x9d/0x130
> really_probe+0x1c8/0x960
> __driver_probe_device+0x187/0x3e0
> driver_probe_device+0x45/0x120
> __device_attach_driver+0x15d/0x280
>
> commit 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> initializes cxlmd->endpoint to ERR_PTR(-ENXIO) in cxl_memdev_alloc().
> However, cxl_dpa_to_region() treats a non-NULL cxlmd->endpoint as a
> valid endpoint.
>
> Across the CXL core, endpoint availability is generally determined by
> checking whether it is NULL. Align with this convention by initializing
> cxlmd->endpoint to NULL by default.
>
> Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> Signed-off-by: Li Ming <ming.li@zohomail.com>
> ---
> drivers/cxl/core/memdev.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index af3d0cc65138..41a507b5daa4 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -675,7 +675,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> cxlmd->id = rc;
> cxlmd->depth = -1;
> cxlmd->attach = attach;
> - cxlmd->endpoint = ERR_PTR(-ENXIO);
> + cxlmd->endpoint = NULL;
So, this does not look like a fix to me, it looks like a bug report
against all the code paths that want to assume that a mere NULL check of
->endpoint is sufficient.
I think this unintentional ERR_PTR() trip hazard is now a *feature* in
retrospect to catch those cases. If something depends on ->endpoint
being valid, it depends on ->endpoint *staying* valid.
A proposed fix for this case is below (passes cxl_test), and if other
ERR_PTR() crashes show up I expect they need similar inspection. This
probably wants additional cleanup to consolidate boiler-plate and make
the critical change to cxl_dpa_to_region() easier to see:
-- 8< --
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 422531799af2..e652980098cf 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -39,7 +39,7 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
int cxl_region_init(void);
void cxl_region_exit(void);
int cxl_get_poison_by_endpoint(struct cxl_port *port);
-struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
+struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa);
u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa);
@@ -50,7 +50,7 @@ static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
return ULLONG_MAX;
}
static inline
-struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
+struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
{
return NULL;
}
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index ef202b34e5ea..610ed6744ddc 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -864,7 +864,7 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
unsigned long *cmds);
void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
+void cxl_event_trace_record(struct cxl_memdev *cxlmd,
enum cxl_event_log_type type,
enum cxl_event_type event_type,
const uuid_t *uuid, union cxl_event *evt);
diff --git a/include/linux/device.h b/include/linux/device.h
index 0be95294b6e6..4fafee80524b 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -911,6 +911,7 @@ static inline void device_unlock(struct device *dev)
}
DEFINE_GUARD(device, struct device *, device_lock(_T), device_unlock(_T))
+DEFINE_GUARD_COND(device, _intr, device_lock_interruptible(_T), _RET == 0)
static inline void device_lock_assert(struct device *dev)
{
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index fa6dd0c94656..7b923408b6c5 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -886,7 +886,7 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
}
EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, "CXL");
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
+void cxl_event_trace_record(struct cxl_memdev *cxlmd,
enum cxl_event_log_type type,
enum cxl_event_type event_type,
const uuid_t *uuid, union cxl_event *evt)
@@ -913,6 +913,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
* translations. Take topology mutation locks and lookup
* { HPA, REGION } from { DPA, MEMDEV } in the event record.
*/
+ guard(device)(&cxlmd->dev);
guard(rwsem_read)(&cxl_rwsem.region);
guard(rwsem_read)(&cxl_rwsem.dpa);
@@ -961,7 +962,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
}
EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, "CXL");
-static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
+static void __cxl_event_trace_record(struct cxl_memdev *cxlmd,
enum cxl_event_log_type type,
struct cxl_event_record_raw *record)
{
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index af3d0cc65138..d2bee5608e50 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -331,6 +331,10 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
{
int rc;
+ ACQUIRE(device_intr, devlock)(&cxlmd->dev);
+ if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
+ return rc;
+
ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
return rc;
@@ -355,6 +359,7 @@ int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
+ device_lock_assert(&cxlmd->dev);
lockdep_assert_held(&cxl_rwsem.dpa);
lockdep_assert_held(&cxl_rwsem.region);
@@ -400,6 +405,10 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
{
int rc;
+ ACQUIRE(device_intr, devlock)(&cxlmd->dev);
+ if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
+ return rc;
+
ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
return rc;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 96888d87a8df..a7d391757e16 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2936,7 +2936,7 @@ static int __cxl_dpa_to_region(struct device *dev, void *arg)
return 1;
}
-struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
+struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
{
struct cxl_dpa_to_region_context ctx;
struct cxl_port *port;
@@ -2944,8 +2944,12 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
ctx = (struct cxl_dpa_to_region_context) {
.dpa = dpa,
};
+
+ device_lock_assert(&cxlmd->dev);
+ if (!cxlmd->dev.driver)
+ return NULL;
port = cxlmd->endpoint;
- if (port && is_cxl_endpoint(port) && cxl_num_decoders_committed(port))
+ if (cxl_num_decoders_committed(port))
device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region);
return ctx.cxlr;
@@ -4004,10 +4008,9 @@ static int validate_region_offset(struct cxl_region *cxlr, u64 offset)
return 0;
}
-static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
+static int region_poison_lookup(struct cxl_region *cxlr, u64 offset,
+ struct dpa_result *result)
{
- struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
- struct cxl_region *cxlr = data;
int rc;
ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
@@ -4022,8 +4025,8 @@ static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
return -EINVAL;
offset -= cxlr->params.cache_size;
- rc = region_offset_to_dpa_result(cxlr, offset, &result);
- if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
+ rc = region_offset_to_dpa_result(cxlr, offset, result);
+ if (rc || !result->cxlmd || result->dpa == ULLONG_MAX) {
dev_dbg(&cxlr->dev,
"Failed to resolve DPA for region offset %#llx rc %d\n",
offset, rc);
@@ -4031,7 +4034,43 @@ static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
return rc ? rc : -EINVAL;
}
- return cxl_inject_poison_locked(result.cxlmd, result.dpa);
+ return 0;
+}
+
+static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
+{
+ struct dpa_result result1 = { .dpa = ULLONG_MAX };
+ struct dpa_result result2 = { .dpa = ULLONG_MAX };
+ struct cxl_region *cxlr = data;
+ struct cxl_memdev *cxlmd;
+ int rc;
+
+ rc = region_poison_lookup(cxlr, offset, &result1);
+ if (rc)
+ return rc;
+
+ cxlmd = result1.cxlmd;
+ ACQUIRE(device_intr, devlock)(&cxlmd->dev);
+ if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
+ return rc;
+
+ ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
+ return rc;
+
+ ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+ return rc;
+
+ offset -= cxlr->params.cache_size;
+ rc = region_offset_to_dpa_result(cxlr, offset, &result2);
+ if (rc || memcmp(&result1, &result2, sizeof(result1) != 0)) {
+ dev_dbg(&cxlr->dev,
+ "Error injection raced region reconfiguration: %d\n", rc);
+ return rc ? rc : -ENXIO;
+ }
+
+ return cxl_inject_poison_locked(result1.cxlmd, result1.dpa);
}
DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
@@ -4039,10 +4078,21 @@ DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
{
- struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
+ struct dpa_result result1 = { .dpa = ULLONG_MAX };
+ struct dpa_result result2 = { .dpa = ULLONG_MAX };
struct cxl_region *cxlr = data;
+ struct cxl_memdev *cxlmd;
int rc;
+ rc = region_poison_lookup(cxlr, offset, &result1);
+ if (rc)
+ return rc;
+
+ cxlmd = result1.cxlmd;
+ ACQUIRE(device_intr, devlock)(&cxlmd->dev);
+ if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
+ return rc;
+
ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
return rc;
@@ -4051,20 +4101,15 @@ static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
return rc;
- if (validate_region_offset(cxlr, offset))
- return -EINVAL;
-
offset -= cxlr->params.cache_size;
- rc = region_offset_to_dpa_result(cxlr, offset, &result);
- if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
+ rc = region_offset_to_dpa_result(cxlr, offset, &result2);
+ if (rc || memcmp(&result1, &result2, sizeof(result1) != 0)) {
dev_dbg(&cxlr->dev,
- "Failed to resolve DPA for region offset %#llx rc %d\n",
- offset, rc);
-
- return rc ? rc : -EINVAL;
+ "Error clearing raced region reconfiguration: %d\n", rc);
+ return rc ? rc : -ENXIO;
}
- return cxl_clear_poison_locked(result.cxlmd, result.dpa);
+ return cxl_clear_poison_locked(result1.cxlmd, result1.dpa);
}
DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
--
2.52.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-01 9:30 ` [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding Li Ming
2026-02-02 15:39 ` Jonathan Cameron
2026-02-02 16:31 ` Gregory Price
@ 2026-02-03 0:07 ` dan.j.williams
2026-02-03 15:21 ` Li Ming
2 siblings, 1 reply; 20+ messages in thread
From: dan.j.williams @ 2026-02-03 0:07 UTC (permalink / raw)
To: Li Ming, dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams
Cc: linux-cxl, linux-kernel, Li Ming
Li Ming wrote:
> When CXL subsystem adds a cxl port to a hierarchy, there is a small
> window where the new port becomes visible before it is bound to a
> driver. This happens because device_add() adds a device to bus device
> list before bus_probe_device() binds it to a driver.
> So if two cxl memdevs are trying to add a dport to a same port via
> devm_cxl_enumerate_ports(), the second cxl memdev may observe the port
> and attempt to add a dport, but fails because the port has not yet been
> attached to cxl port driver.
> the sequence is like:
>
> CPU 0 CPU 1
> devm_cxl_enumerate_ports()
> # port not found, add it
> add_port_attach_ep()
> # hold the parent port lock
> # to add the new port
> devm_cxl_create_port()
> device_add()
> # Add dev to bus devs list
> bus_add_device()
> devm_cxl_enumerate_ports()
> # found the port
> find_cxl_port_by_uport()
> # hold port lock to add a dport
> device_lock(the port)
> find_or_add_dport()
> cxl_port_add_dport()
> return -ENXIO because port->dev.driver is NULL
> device_unlock(the port)
> bus_probe_device()
> # hold the port lock
> # for attaching
> device_lock(the port)
> attaching the new port
> device_unlock(the port)
>
> To fix this race, require that dport addition holds the parent port lock
> of the target port. The CXL subsystem already requires holding the
> parent port lock while attaching a new port. Therefore, successfully
> acquiring the parent port lock ganrantees that port attaching has
> completed.
Are you seeing this case fail permanently? The expectation is that the
one that loses the race iterates up the topology and retries.
So yes, you can lose this race once, but not twice is the expectation.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
2026-02-02 14:41 ` Jonathan Cameron
2026-02-02 15:48 ` Gregory Price
@ 2026-02-03 14:15 ` Li Ming
1 sibling, 0 replies; 20+ messages in thread
From: Li Ming @ 2026-02-03 14:15 UTC (permalink / raw)
To: Jonathan Cameron
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, linux-cxl, linux-kernel
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: "Li Ming"<ming.li@zohomail.com>
Cc: <dave@stgolabs.net>, <dave.jiang@intel.com>, <alison.schofield@intel.com>, <vishal.l.verma@intel.com>, <ira.weiny@intel.com>, <dan.j.williams@intel.com>, <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Date: Mon, 02 Feb 2026 22:41:03 +0800
Subject: Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
> On Sun, 1 Feb 2026 17:30:01 +0800
> Li Ming <ming.li@zohomail.com> wrote:
>
> > CXL testing environment can trigger following trace
> >
> > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
> > KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
> > RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
> > Call Trace:
> > <TASK>
> > cxl_event_trace_record+0xd1/0xa70 [cxl_core]
> > __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
> > cxl_mem_get_records_log+0x261/0x500 [cxl_core]
> > cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
> > cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
> > platform_probe+0x9d/0x130
> > really_probe+0x1c8/0x960
> > __driver_probe_device+0x187/0x3e0
> > driver_probe_device+0x45/0x120
> > __device_attach_driver+0x15d/0x280
> >
> > commit 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> > initializes cxlmd->endpoint to ERR_PTR(-ENXIO) in cxl_memdev_alloc().
> > However, cxl_dpa_to_region() treats a non-NULL cxlmd->endpoint as a
> > valid endpoint.
> >
> > Across the CXL core, endpoint availability is generally determined by
> > checking whether it is NULL. Align with this convention by initializing
> > cxlmd->endpoint to NULL by default.
>
> I had a look at whether it made sense to use use IS_ERR_OR_NULL() to check
> for validity of the endpoint, but it would be somewhat fiddly and I think
> you are correct that convention here seems to be NULL means not set.
> We don't need the error code. One comment inline.
>
> Either way nice catch
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Thanks for the review.
>
>
> >
> > Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> > Signed-off-by: Li Ming <ming.li@zohomail.com>
> > ---
> > drivers/cxl/core/memdev.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index af3d0cc65138..41a507b5daa4 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -675,7 +675,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> > cxlmd->id = rc;
> > cxlmd->depth = -1;
> > cxlmd->attach = attach;
> > - cxlmd->endpoint = ERR_PTR(-ENXIO);
> > + cxlmd->endpoint = NULL;
> cxlmd has just been allocated with kzalloc so I'd argue we don't need this to be explicitly
> set at all. Seems like a natural and safe default.
Yes, I forgot it. Thanks for pointing it out.
Ming
>
> >
> > dev = &cxlmd->dev;
> > device_initialize(dev);
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-02 15:39 ` Jonathan Cameron
@ 2026-02-03 14:23 ` Li Ming
2026-02-03 21:14 ` dan.j.williams
0 siblings, 1 reply; 20+ messages in thread
From: Li Ming @ 2026-02-03 14:23 UTC (permalink / raw)
To: Jonathan Cameron
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, linux-cxl, linux-kernel
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: "Li Ming"<ming.li@zohomail.com>
Cc: <dave@stgolabs.net>, <dave.jiang@intel.com>, <alison.schofield@intel.com>, <vishal.l.verma@intel.com>, <ira.weiny@intel.com>, <dan.j.williams@intel.com>, <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Date: Mon, 02 Feb 2026 23:39:24 +0800
Subject: Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
> On Sun, 1 Feb 2026 17:30:02 +0800
> Li Ming <ming.li@zohomail.com> wrote:
>
> > When CXL subsystem adds a cxl port to a hierarchy, there is a small
> > window where the new port becomes visible before it is bound to a
> > driver. This happens because device_add() adds a device to bus device
> > list before bus_probe_device() binds it to a driver.
> > So if two cxl memdevs are trying to add a dport to a same port via
> > devm_cxl_enumerate_ports(), the second cxl memdev may observe the port
> > and attempt to add a dport, but fails because the port has not yet been
> > attached to cxl port driver.
> > the sequence is like:
> >
> > CPU 0 CPU 1
> > devm_cxl_enumerate_ports()
> > # port not found, add it
> > add_port_attach_ep()
> > # hold the parent port lock
> > # to add the new port
> > devm_cxl_create_port()
> > device_add()
> > # Add dev to bus devs list
> > bus_add_device()
> > devm_cxl_enumerate_ports()
> > # found the port
>
> Indenting not consistent here as this call is in devm_cxl_enumerate_ports()
Thanks for that, will fix it in next version.
>
> > find_cxl_port_by_uport()
> > # hold port lock to add a dport
> > device_lock(the port)
> > find_or_add_dport()
> > cxl_port_add_dport()
> > return -ENXIO because port->dev.driver is NULL
> > device_unlock(the port)
> > bus_probe_device()
> > # hold the port lock
> > # for attaching
> > device_lock(the port)
> > attaching the new port
> > device_unlock(the port)
> >
> > To fix this race, require that dport addition holds the parent port lock
> > of the target port. The CXL subsystem already requires holding the
> > parent port lock while attaching a new port. Therefore, successfully
> > acquiring the parent port lock ganrantees that port attaching has
>
> Spell check. Guarantees
Got it.
>
> > completed.
> >
> > Fixes: 4f06d81e7c6a ("cxl: Defer dport allocation for switch ports")
> > Signed-off-by: Li Ming <ming.li@zohomail.com>
>
> Analysis looks reasonable to me, but I'm not hugely confident on this
> one so would like others to take a close look as well.
> Question inline.
>
>
> > ---
> > drivers/cxl/core/port.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > index 54f72452fb06..fef2fe913e1f 100644
> > --- a/drivers/cxl/core/port.c
> > +++ b/drivers/cxl/core/port.c
> > @@ -1817,8 +1817,12 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> > /*
> > * RP port enumerated by cxl_acpi without dport will
> > * have the dport added here.
> > + *
> > + * Hold the parent port lock here to in case that the
> > + * port can be observed but has not been attached yet.
> > */
> > - scoped_guard(device, &port->dev) {
> > + scoped_guard(device, &parent_port_of(port)->dev) {
> I'm nervous about whether this is the right lock. For unregister_port()
> (which is easier to track down that the add path locking) the lock
> taken depends on where the port is that is being unregistered.
> Specifically root ports are unregistered under parent->uport_dev, not
> parent->dev.
You are right.
When cxl acpi driver attempts to add a cxl host bridge port, it will hold cxl_root->uport_dev.
Otherwide, hold parent_port->dev lock for the new port addition.
So I think it is possible that memdev can observe a cxl host bridge port but it has not been attach yet.
Maybe I should hold the lock of the new port's host.
Let's see if other reviewers have more comments on that.
Ming
>
> > + guard(device)(&port->dev);
> > dport = find_or_add_dport(port, dport_dev);
> > if (IS_ERR(dport)) {
> > if (PTR_ERR(dport) == -EAGAIN)
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-02 16:31 ` Gregory Price
@ 2026-02-03 14:33 ` Li Ming
0 siblings, 0 replies; 20+ messages in thread
From: Li Ming @ 2026-02-03 14:33 UTC (permalink / raw)
To: Gregory Price
Cc: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, dan.j.williams, linux-cxl,
linux-kernel
From: Gregory Price <gourry@gourry.net>
To: "Li Ming"<ming.li@zohomail.com>
Cc: <dave@stgolabs.net>, <jonathan.cameron@huawei.com>, <dave.jiang@intel.com>, <alison.schofield@intel.com>, <vishal.l.verma@intel.com>, <ira.weiny@intel.com>, <dan.j.williams@intel.com>, <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Date: Tue, 03 Feb 2026 00:31:45 +0800
Subject: Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
> On Sun, Feb 01, 2026 at 05:30:02PM +0800, Li Ming wrote:
> > When CXL subsystem adds a cxl port to a hierarchy, there is a small
> > window where the new port becomes visible before it is bound to a
> > driver. This happens because device_add() adds a device to bus device
> > list before bus_probe_device() binds it to a driver.
> > So if two cxl memdevs are trying to add a dport to a same port via
> > devm_cxl_enumerate_ports(), the second cxl memdev may observe the port
> > and attempt to add a dport, but fails because the port has not yet been
> > attached to cxl port driver.
> > the sequence is like:
> >
> > CPU 0 CPU 1
> > devm_cxl_enumerate_ports()
> > # port not found, add it
> > add_port_attach_ep()
> > # hold the parent port lock
> > # to add the new port
> > devm_cxl_create_port()
> > device_add()
> > # Add dev to bus devs list
> > bus_add_device()
> > devm_cxl_enumerate_ports()
> > # found the port
> > find_cxl_port_by_uport()
> > # hold port lock to add a dport
> > device_lock(the port)
> > find_or_add_dport()
> > cxl_port_add_dport()
> > return -ENXIO because port->dev.driver is NULL
> > device_unlock(the port)
> > bus_probe_device()
> > # hold the port lock
> > # for attaching
> > device_lock(the port)
> > attaching the new port
> > device_unlock(the port)
> >
> > To fix this race, require that dport addition holds the parent port lock
> > of the target port. The CXL subsystem already requires holding the
> > parent port lock while attaching a new port. Therefore, successfully
> > acquiring the parent port lock ganrantees that port attaching has
> > completed.
> >
>
> With just a a cursory look, I'm immediately concerned that you're fixing
> a race condition with a lock inversion.
>
> Can you guarantee the following is not happening
>
> Thread A Thread B
> ----------------------------
> lock(parent) lock(port)
> lock(port) lock(parent)
>
> ~Gregory
>
Hi Gregory,
I think no other scenario where driver needs to hold a child port lock together with its parent port lock than during a new port attaching or a port removal.
After re-reading a new port attaching and a port removal flows, I believe both operations acquire the parent port first before acquiring a child port lock. So I think the thread B case would not happen. If I miss something please correct me. Thanks
Ming
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
2026-02-02 21:04 ` Dave Jiang
@ 2026-02-03 15:04 ` Li Ming
0 siblings, 0 replies; 20+ messages in thread
From: Li Ming @ 2026-02-03 15:04 UTC (permalink / raw)
To: Dave Jiang
Cc: dave, jonathan.cameron, alison.schofield, vishal.l.verma,
ira.weiny, linux-cxl, linux-kernel, dan.j.williams
From: Dave Jiang <dave.jiang@intel.com>
To: "Li Ming"<ming.li@zohomail.com>, <dave@stgolabs.net>, <jonathan.cameron@huawei.com>, <alison.schofield@intel.com>, <vishal.l.verma@intel.com>, <ira.weiny@intel.com>, <dan.j.williams@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Date: Tue, 03 Feb 2026 05:04:44 +0800
Subject: Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
>
>
> On 2/1/26 2:30 AM, Li Ming wrote:
> > CXL testing environment can trigger following trace
> >
> > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
> > KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
> > RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
> > Call Trace:
> > <TASK>
> > cxl_event_trace_record+0xd1/0xa70 [cxl_core]
> > __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
> > cxl_mem_get_records_log+0x261/0x500 [cxl_core]
> > cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
> > cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
> > platform_probe+0x9d/0x130
> > really_probe+0x1c8/0x960
> > __driver_probe_device+0x187/0x3e0
> > driver_probe_device+0x45/0x120
> > __device_attach_driver+0x15d/0x280
>
> So I talked to Dan and the assigning of ERR_PTR(-ENXIO) is from a previous revision of implementation and isn't intended to be there. So removing the line entirely (which would mean endpoint == NULL from kzalloc) is probably the right change. However, I'm not able to reproduce the error. How is it triggered? Also, by setting it to NULL, are we covering up a different issue? Are you able to determine whether cxlmd->endpoint at that point is unexpected and perhaps triggered by a different bug?
>
> DJ
Hi Dave,
As I implied in cover-letter, When I trigger issue #2, it always trigger this issue.
I have a simple script to run cxl test over and over again in a VM. I usually reproduce it in one hour with KASAN enabled.
When I reproduced it, I dumpped out the value of cxlmd->endpoint, it is -ENXIO. That means it was not updated in cxl_port_add() yet at that time. I cannot reproduce this issue only with patch #2.
Ming
>
> >
> > commit 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> > initializes cxlmd->endpoint to ERR_PTR(-ENXIO) in cxl_memdev_alloc().
> > However, cxl_dpa_to_region() treats a non-NULL cxlmd->endpoint as a
> > valid endpoint.
> >
> > Across the CXL core, endpoint availability is generally determined by
> > checking whether it is NULL. Align with this convention by initializing
> > cxlmd->endpoint to NULL by default.
> >
> > Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> > Signed-off-by: Li Ming <ming.li@zohomail.com>
> > ---
> > drivers/cxl/core/memdev.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index af3d0cc65138..41a507b5daa4 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -675,7 +675,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> > cxlmd->id = rc;
> > cxlmd->depth = -1;
> > cxlmd->attach = attach;
> > - cxlmd->endpoint = ERR_PTR(-ENXIO);
> > + cxlmd->endpoint = NULL;
> >
> > dev = &cxlmd->dev;
> > device_initialize(dev);
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
2026-02-03 0:01 ` dan.j.williams
@ 2026-02-03 15:15 ` Li Ming
2026-02-03 22:37 ` dan.j.williams
0 siblings, 1 reply; 20+ messages in thread
From: Li Ming @ 2026-02-03 15:15 UTC (permalink / raw)
To: danjwilliams
Cc: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, linux-cxl, linux-kernel
From: <dan.j.williams@intel.com>
To: "Li Ming"<ming.li@zohomail.com>, <dave@stgolabs.net>, <jonathan.cameron@huawei.com>, <dave.jiang@intel.com>, <alison.schofield@intel.com>, <vishal.l.verma@intel.com>, <ira.weiny@intel.com>, <dan.j.williams@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>, "Li Ming"<ming.li@zohomail.com>
Date: Tue, 03 Feb 2026 08:01:04 +0800
Subject: Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
> Li Ming wrote:
> > CXL testing environment can trigger following trace
> >
> > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
> > KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
> > RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
> > Call Trace:
> > <TASK>
> > cxl_event_trace_record+0xd1/0xa70 [cxl_core]
> > __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
> > cxl_mem_get_records_log+0x261/0x500 [cxl_core]
> > cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
> > cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
>
> Hooray for unit tests, but I wonder why this failure is so reliable for
> you and absent for me? I retried with latest cxl/next, no luck.
>
> No matter, it still looks like something worth addressing, but not with
> setting ->endpoint to NULL.
>
> > platform_probe+0x9d/0x130
> > really_probe+0x1c8/0x960
> > __driver_probe_device+0x187/0x3e0
> > driver_probe_device+0x45/0x120
> > __device_attach_driver+0x15d/0x280
> >
> > commit 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> > initializes cxlmd->endpoint to ERR_PTR(-ENXIO) in cxl_memdev_alloc().
> > However, cxl_dpa_to_region() treats a non-NULL cxlmd->endpoint as a
> > valid endpoint.
> >
> > Across the CXL core, endpoint availability is generally determined by
> > checking whether it is NULL. Align with this convention by initializing
> > cxlmd->endpoint to NULL by default.
> >
> > Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
> > Signed-off-by: Li Ming <ming.li@zohomail.com>
> > ---
> > drivers/cxl/core/memdev.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index af3d0cc65138..41a507b5daa4 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> > @@ -675,7 +675,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> > cxlmd->id = rc;
> > cxlmd->depth = -1;
> > cxlmd->attach = attach;
> > - cxlmd->endpoint = ERR_PTR(-ENXIO);
> > + cxlmd->endpoint = NULL;
>
> So, this does not look like a fix to me, it looks like a bug report
> against all the code paths that want to assume that a mere NULL check of
> ->endpoint is sufficient.
>
> I think this unintentional ERR_PTR() trip hazard is now a *feature* in
> retrospect to catch those cases. If something depends on ->endpoint
> being valid, it depends on ->endpoint *staying* valid.
>
> A proposed fix for this case is below (passes cxl_test), and if other
> ERR_PTR() crashes show up I expect they need similar inspection. This
> probably wants additional cleanup to consolidate boiler-plate and make
> the critical change to cxl_dpa_to_region() easier to see:
>
Hi Dan,
Thanks for your proposal, I think your change can solve this problem too.
But the change is a lot, and need more time to review all driver code to confirm
if there are other places needed such checking. (I found that cxl_reset_done also needs some changes like you mentioned)
Maybe we can consider my change as a quick fix? Then I can prepare a new patchset for the consolidation.
Ming
> -- 8< --
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 422531799af2..e652980098cf 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -39,7 +39,7 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
> int cxl_region_init(void);
> void cxl_region_exit(void);
> int cxl_get_poison_by_endpoint(struct cxl_port *port);
> -struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
> +struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa);
> u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> u64 dpa);
>
> @@ -50,7 +50,7 @@ static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
> return ULLONG_MAX;
> }
> static inline
> -struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
> +struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
> {
> return NULL;
> }
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index ef202b34e5ea..610ed6744ddc 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -864,7 +864,7 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
> unsigned long *cmds);
> void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> +void cxl_event_trace_record(struct cxl_memdev *cxlmd,
> enum cxl_event_log_type type,
> enum cxl_event_type event_type,
> const uuid_t *uuid, union cxl_event *evt);
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 0be95294b6e6..4fafee80524b 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -911,6 +911,7 @@ static inline void device_unlock(struct device *dev)
> }
>
> DEFINE_GUARD(device, struct device *, device_lock(_T), device_unlock(_T))
> +DEFINE_GUARD_COND(device, _intr, device_lock_interruptible(_T), _RET == 0)
>
> static inline void device_lock_assert(struct device *dev)
> {
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index fa6dd0c94656..7b923408b6c5 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -886,7 +886,7 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
> }
> EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, "CXL");
>
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> +void cxl_event_trace_record(struct cxl_memdev *cxlmd,
> enum cxl_event_log_type type,
> enum cxl_event_type event_type,
> const uuid_t *uuid, union cxl_event *evt)
> @@ -913,6 +913,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> * translations. Take topology mutation locks and lookup
> * { HPA, REGION } from { DPA, MEMDEV } in the event record.
> */
> + guard(device)(&cxlmd->dev);
> guard(rwsem_read)(&cxl_rwsem.region);
> guard(rwsem_read)(&cxl_rwsem.dpa);
>
> @@ -961,7 +962,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> }
> EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, "CXL");
>
> -static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> +static void __cxl_event_trace_record(struct cxl_memdev *cxlmd,
> enum cxl_event_log_type type,
> struct cxl_event_record_raw *record)
> {
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index af3d0cc65138..d2bee5608e50 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -331,6 +331,10 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> {
> int rc;
>
> + ACQUIRE(device_intr, devlock)(&cxlmd->dev);
> + if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
> + return rc;
> +
> ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
> return rc;
> @@ -355,6 +359,7 @@ int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
> if (!IS_ENABLED(CONFIG_DEBUG_FS))
> return 0;
>
> + device_lock_assert(&cxlmd->dev);
> lockdep_assert_held(&cxl_rwsem.dpa);
> lockdep_assert_held(&cxl_rwsem.region);
>
> @@ -400,6 +405,10 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
> {
> int rc;
>
> + ACQUIRE(device_intr, devlock)(&cxlmd->dev);
> + if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
> + return rc;
> +
> ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
> return rc;
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 96888d87a8df..a7d391757e16 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2936,7 +2936,7 @@ static int __cxl_dpa_to_region(struct device *dev, void *arg)
> return 1;
> }
>
> -struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
> +struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
> {
> struct cxl_dpa_to_region_context ctx;
> struct cxl_port *port;
> @@ -2944,8 +2944,12 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
> ctx = (struct cxl_dpa_to_region_context) {
> .dpa = dpa,
> };
> +
> + device_lock_assert(&cxlmd->dev);
> + if (!cxlmd->dev.driver)
> + return NULL;
> port = cxlmd->endpoint;
> - if (port && is_cxl_endpoint(port) && cxl_num_decoders_committed(port))
> + if (cxl_num_decoders_committed(port))
> device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region);
>
> return ctx.cxlr;
> @@ -4004,10 +4008,9 @@ static int validate_region_offset(struct cxl_region *cxlr, u64 offset)
> return 0;
> }
>
> -static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
> +static int region_poison_lookup(struct cxl_region *cxlr, u64 offset,
> + struct dpa_result *result)
> {
> - struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
> - struct cxl_region *cxlr = data;
> int rc;
>
> ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> @@ -4022,8 +4025,8 @@ static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
> return -EINVAL;
>
> offset -= cxlr->params.cache_size;
> - rc = region_offset_to_dpa_result(cxlr, offset, &result);
> - if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
> + rc = region_offset_to_dpa_result(cxlr, offset, result);
> + if (rc || !result->cxlmd || result->dpa == ULLONG_MAX) {
> dev_dbg(&cxlr->dev,
> "Failed to resolve DPA for region offset %#llx rc %d\n",
> offset, rc);
> @@ -4031,7 +4034,43 @@ static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
> return rc ? rc : -EINVAL;
> }
>
> - return cxl_inject_poison_locked(result.cxlmd, result.dpa);
> + return 0;
> +}
> +
> +static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
> +{
> + struct dpa_result result1 = { .dpa = ULLONG_MAX };
> + struct dpa_result result2 = { .dpa = ULLONG_MAX };
> + struct cxl_region *cxlr = data;
> + struct cxl_memdev *cxlmd;
> + int rc;
> +
> + rc = region_poison_lookup(cxlr, offset, &result1);
> + if (rc)
> + return rc;
> +
> + cxlmd = result1.cxlmd;
> + ACQUIRE(device_intr, devlock)(&cxlmd->dev);
> + if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
> + return rc;
> +
> + ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
> + return rc;
> +
> + ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> + return rc;
> +
> + offset -= cxlr->params.cache_size;
> + rc = region_offset_to_dpa_result(cxlr, offset, &result2);
> + if (rc || memcmp(&result1, &result2, sizeof(result1) != 0)) {
> + dev_dbg(&cxlr->dev,
> + "Error injection raced region reconfiguration: %d\n", rc);
> + return rc ? rc : -ENXIO;
> + }
> +
> + return cxl_inject_poison_locked(result1.cxlmd, result1.dpa);
> }
>
> DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
> @@ -4039,10 +4078,21 @@ DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
>
> static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
> {
> - struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
> + struct dpa_result result1 = { .dpa = ULLONG_MAX };
> + struct dpa_result result2 = { .dpa = ULLONG_MAX };
> struct cxl_region *cxlr = data;
> + struct cxl_memdev *cxlmd;
> int rc;
>
> + rc = region_poison_lookup(cxlr, offset, &result1);
> + if (rc)
> + return rc;
> +
> + cxlmd = result1.cxlmd;
> + ACQUIRE(device_intr, devlock)(&cxlmd->dev);
> + if ((rc = ACQUIRE_ERR(device_intr, &devlock)))
> + return rc;
> +
> ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
> return rc;
> @@ -4051,20 +4101,15 @@ static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
> if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> return rc;
>
> - if (validate_region_offset(cxlr, offset))
> - return -EINVAL;
> -
> offset -= cxlr->params.cache_size;
> - rc = region_offset_to_dpa_result(cxlr, offset, &result);
> - if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
> + rc = region_offset_to_dpa_result(cxlr, offset, &result2);
> + if (rc || memcmp(&result1, &result2, sizeof(result1) != 0)) {
> dev_dbg(&cxlr->dev,
> - "Failed to resolve DPA for region offset %#llx rc %d\n",
> - offset, rc);
> -
> - return rc ? rc : -EINVAL;
> + "Error clearing raced region reconfiguration: %d\n", rc);
> + return rc ? rc : -ENXIO;
> }
>
> - return cxl_clear_poison_locked(result.cxlmd, result.dpa);
> + return cxl_clear_poison_locked(result1.cxlmd, result1.dpa);
> }
>
> DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
> --
> 2.52.0
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-03 0:07 ` dan.j.williams
@ 2026-02-03 15:21 ` Li Ming
2026-02-03 22:25 ` dan.j.williams
0 siblings, 1 reply; 20+ messages in thread
From: Li Ming @ 2026-02-03 15:21 UTC (permalink / raw)
To: danjwilliams
Cc: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, linux-cxl, linux-kernel
From: <dan.j.williams@intel.com>
To: "Li Ming"<ming.li@zohomail.com>, <dave@stgolabs.net>, <jonathan.cameron@huawei.com>, <dave.jiang@intel.com>, <alison.schofield@intel.com>, <vishal.l.verma@intel.com>, <ira.weiny@intel.com>, <dan.j.williams@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>, "Li Ming"<ming.li@zohomail.com>
Date: Tue, 03 Feb 2026 08:07:13 +0800
Subject: Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
> Li Ming wrote:
> > When CXL subsystem adds a cxl port to a hierarchy, there is a small
> > window where the new port becomes visible before it is bound to a
> > driver. This happens because device_add() adds a device to bus device
> > list before bus_probe_device() binds it to a driver.
> > So if two cxl memdevs are trying to add a dport to a same port via
> > devm_cxl_enumerate_ports(), the second cxl memdev may observe the port
> > and attempt to add a dport, but fails because the port has not yet been
> > attached to cxl port driver.
> > the sequence is like:
> >
> > CPU 0 CPU 1
> > devm_cxl_enumerate_ports()
> > # port not found, add it
> > add_port_attach_ep()
> > # hold the parent port lock
> > # to add the new port
> > devm_cxl_create_port()
> > device_add()
> > # Add dev to bus devs list
> > bus_add_device()
> > devm_cxl_enumerate_ports()
> > # found the port
> > find_cxl_port_by_uport()
> > # hold port lock to add a dport
> > device_lock(the port)
> > find_or_add_dport()
> > cxl_port_add_dport()
> > return -ENXIO because port->dev.driver is NULL
> > device_unlock(the port)
> > bus_probe_device()
> > # hold the port lock
> > # for attaching
> > device_lock(the port)
> > attaching the new port
> > device_unlock(the port)
> >
> > To fix this race, require that dport addition holds the parent port lock
> > of the target port. The CXL subsystem already requires holding the
> > parent port lock while attaching a new port. Therefore, successfully
> > acquiring the parent port lock ganrantees that port attaching has
> > completed.
>
> Are you seeing this case fail permanently? The expectation is that the
> one that loses the race iterates up the topology and retries.
>
> So yes, you can lose this race once, but not twice is the expectation.
>
Hi Dan,
My understanding is that would not trigger enumeration retry, because enumerating ports flow retries the enumeration only when find_or_add_dport() returns a -EAGAIN. but the port's driver checking in cxl_port_add_dport() returns a -ENXIO, so it makes devm_cxl_enumerate_ports() failure directly.
Ming
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-03 14:23 ` Li Ming
@ 2026-02-03 21:14 ` dan.j.williams
0 siblings, 0 replies; 20+ messages in thread
From: dan.j.williams @ 2026-02-03 21:14 UTC (permalink / raw)
To: Li Ming, Jonathan Cameron
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
dan.j.williams, linux-cxl, linux-kernel
Li Ming wrote:
[..]
> > > - scoped_guard(device, &port->dev) {
> > > + scoped_guard(device, &parent_port_of(port)->dev) {
> > I'm nervous about whether this is the right lock. For unregister_port()
> > (which is easier to track down that the add path locking) the lock
> > taken depends on where the port is that is being unregistered.
> > Specifically root ports are unregistered under parent->uport_dev, not
> > parent->dev.
> You are right.
> When cxl acpi driver attempts to add a cxl host bridge port, it will hold cxl_root->uport_dev.
> Otherwide, hold parent_port->dev lock for the new port addition.
> So I think it is possible that memdev can observe a cxl host bridge port but it has not been attach yet.
> Maybe I should hold the lock of the new port's host.
> Let's see if other reviewers have more comments on that.
So, I think current code is ok, if cxl_mem_probe() races
cxl_acpi_probe() in the same way that case is already caught by the
cxl_bus_rescan() at the end of cxl_acpi_probe().
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-03 15:21 ` Li Ming
@ 2026-02-03 22:25 ` dan.j.williams
2026-02-04 13:51 ` Li Ming
0 siblings, 1 reply; 20+ messages in thread
From: dan.j.williams @ 2026-02-03 22:25 UTC (permalink / raw)
To: Li Ming, danjwilliams
Cc: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, linux-cxl, linux-kernel
Li Ming wrote:
[..]
> > >
> > > To fix this race, require that dport addition holds the parent port lock
> > > of the target port. The CXL subsystem already requires holding the
> > > parent port lock while attaching a new port. Therefore, successfully
> > > acquiring the parent port lock ganrantees that port attaching has
> > > completed.
> >
> > Are you seeing this case fail permanently? The expectation is that the
> > one that loses the race iterates up the topology and retries.
> >
> > So yes, you can lose this race once, but not twice is the expectation.
> >
> Hi Dan,
>
> My understanding is that would not trigger enumeration retry, because
> enumerating ports flow retries the enumeration only when
> find_or_add_dport() returns a -EAGAIN. but the port's driver checking
> in cxl_port_add_dport() returns a -ENXIO, so it makes
> devm_cxl_enumerate_ports() failure directly.
Ah, true, my mental model was still stuck in the old top-down dport
enumeration scheme.
So, yes, we do need to make sure that switch port creation does not race
port lookup. However, I think the scoped_guard() tends to make code less
readable and in this case hides the opportunity for more comments to
explain what is happening.
I also think, per that observation from Jonathan, that we can save the
cxl_bus_resan() violence by taking the CXL platform device lock.
So, please move the locking internal to find_or_add_dport(), so that
plain guard() can be used. Add comments for the fact that
devm_cxl_create_port() and the CXL platform init path need to be flushed
by taking the device lock. And explain why the device to lock is
different dependening on whether the parent_port is the cxl_root or a
descendant port.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default
2026-02-03 15:15 ` Li Ming
@ 2026-02-03 22:37 ` dan.j.williams
0 siblings, 0 replies; 20+ messages in thread
From: dan.j.williams @ 2026-02-03 22:37 UTC (permalink / raw)
To: Li Ming, danjwilliams
Cc: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, linux-cxl, linux-kernel
Li Ming wrote:
[..]
> Hi Dan,
>
> Thanks for your proposal, I think your change can solve this problem too.
> But the change is a lot, and need more time to review all driver code
> to confirm if there are other places needed such checking. (I found
> that cxl_reset_done also needs some changes like you mentioned) Maybe
> we can consider my change as a quick fix? Then I can prepare a new
> patchset for the consolidation.
I am not convinced that it is a fix. The fact that you say the bug
disappears when patch2 is applied leads me to believe that is
potentially the only bug.
I.e. it may be the case that cxl_dpa_to_region() is safe to assume that
a valid ->endpoint pointer will remain valid once the port
bus_add_device() vs bus_probe_device() hole is plugged.
I would say start with patch2 by itself. Then circle back to prove
that a mere NULL check is a fix or just makes the vulnerability window
smaller and the locking rework is needed.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
2026-02-03 22:25 ` dan.j.williams
@ 2026-02-04 13:51 ` Li Ming
0 siblings, 0 replies; 20+ messages in thread
From: Li Ming @ 2026-02-04 13:51 UTC (permalink / raw)
To: danjwilliams
Cc: dave, jonathan.cameron, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, linux-cxl, linux-kernel
From: <dan.j.williams@intel.com>
To: "Li Ming"<ming.li@zohomail.com>, "danjwilliams"<dan.j.williams@intel.com>
Cc: "dave"<dave@stgolabs.net>, "jonathan.cameron"<jonathan.cameron@huawei.com>, "dave.jiang"<dave.jiang@intel.com>, "alison.schofield"<alison.schofield@intel.com>, "vishal.l.verma"<vishal.l.verma@intel.com>, "ira.weiny"<ira.weiny@intel.com>, "linux-cxl"<linux-cxl@vger.kernel.org>, "linux-kernel"<linux-kernel@vger.kernel.org>
Date: Wed, 04 Feb 2026 06:25:00 +0800
Subject: Re: [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding
> Li Ming wrote:
> [..]
> > > >
> > > > To fix this race, require that dport addition holds the parent port lock
> > > > of the target port. The CXL subsystem already requires holding the
> > > > parent port lock while attaching a new port. Therefore, successfully
> > > > acquiring the parent port lock ganrantees that port attaching has
> > > > completed.
> > >
> > > Are you seeing this case fail permanently? The expectation is that the
> > > one that loses the race iterates up the topology and retries.
> > >
> > > So yes, you can lose this race once, but not twice is the expectation.
> > >
> > Hi Dan,
> >
> > My understanding is that would not trigger enumeration retry, because
> > enumerating ports flow retries the enumeration only when
> > find_or_add_dport() returns a -EAGAIN. but the port's driver checking
> > in cxl_port_add_dport() returns a -ENXIO, so it makes
> > devm_cxl_enumerate_ports() failure directly.
>
> Ah, true, my mental model was still stuck in the old top-down dport
> enumeration scheme.
>
> So, yes, we do need to make sure that switch port creation does not race
> port lookup. However, I think the scoped_guard() tends to make code less
> readable and in this case hides the opportunity for more comments to
> explain what is happening.
>
> I also think, per that observation from Jonathan, that we can save the
> cxl_bus_resan() violence by taking the CXL platform device lock.
>
> So, please move the locking internal to find_or_add_dport(), so that
> plain guard() can be used. Add comments for the fact that
> devm_cxl_create_port() and the CXL platform init path need to be flushed
> by taking the device lock. And explain why the device to lock is
> different dependening on whether the parent_port is the cxl_root or a
> descendant port.
>
>
Got it, will do that in V2, also remove PATCH #1 as you mentioned.
Ming
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2026-02-04 13:51 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-01 9:30 [PATCH 0/2] Fix port enumeration failure and NULL endpoint issue Li Ming
2026-02-01 9:30 ` [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default Li Ming
2026-02-02 14:41 ` Jonathan Cameron
2026-02-02 15:48 ` Gregory Price
2026-02-03 14:15 ` Li Ming
2026-02-02 21:04 ` Dave Jiang
2026-02-03 15:04 ` Li Ming
2026-02-03 0:01 ` dan.j.williams
2026-02-03 15:15 ` Li Ming
2026-02-03 22:37 ` dan.j.williams
2026-02-01 9:30 ` [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding Li Ming
2026-02-02 15:39 ` Jonathan Cameron
2026-02-03 14:23 ` Li Ming
2026-02-03 21:14 ` dan.j.williams
2026-02-02 16:31 ` Gregory Price
2026-02-03 14:33 ` Li Ming
2026-02-03 0:07 ` dan.j.williams
2026-02-03 15:21 ` Li Ming
2026-02-03 22:25 ` dan.j.williams
2026-02-04 13:51 ` Li Ming
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox