* [PATCH v2 0/2] Fix port enumeration failure
@ 2026-02-07 13:35 Li Ming
  2026-02-07 13:35 ` [PATCH v2 1/2] cxl/port: Hold port host lock while dport adding Li Ming
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Li Ming @ 2026-02-07 13:35 UTC (permalink / raw)
  To: dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, dan.j.williams
  Cc: linux-cxl, linux-kernel

While running the CXL mock tests on the cxl/next branch, I frequently
hit the following call trace.

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
 KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
 CPU: 3 UID: 0 PID: 42 Comm: kworker/u16:1 Tainted: G           O      J 6.19.0-rc5-cxl+ #4 PREEMPT(voluntary)
 Tainted: [O]=OOT_MODULE, [J]=FWCTL
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
 Workqueue: async async_run_entry_fn
 RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
 Call Trace:
  <TASK>
  cxl_event_trace_record+0xd1/0xa70 [cxl_core]
  __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
  cxl_mem_get_records_log+0x261/0x500 [cxl_core]
  cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
  cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
  platform_probe+0x9d/0x130
  really_probe+0x1c8/0x960
  driver_probe_device+0x45/0x120
  __device_attach_driver+0x15d/0x280
  bus_for_each_drv+0x100/0x180
  __device_attach_async_helper+0x199/0x250
  async_run_entry_fn+0x95/0x430
  process_one_work+0x7db/0x1940

After detailed debugging, I found that a failure to add a dport leads to
the problem.
What I observed is that when two memdevs try to enumerate the same port,
the first memdev is responsible for creating the port and binding it to
the cxl port driver. However, there is a small window between the point
where the new port becomes visible (after being added to the device list
of the cxl bus) and the point where it is bound to the port driver.
During this window, the second memdev may discover the port and acquire
its lock while attempting to add its dport, which blocks
bus_probe_device() inside device_add(). As a result, the second memdev
observes the port as unbound and fails to add its dport. The second
memdev's memdev->endpoint is therefore never updated, which triggers the
trace above.

The fix is to hold the host lock of the target port during dport
addition, which prevents access to the port before driver binding has
completed.
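
To make the window concrete, here is a minimal single-threaded user-space
sketch that replays the interleaving. It is a toy model, not the patch
and not kernel code: every toy_* name is hypothetical, the lock is a
plain flag rather than a real device lock, and it assumes the port
creator holds the host lock across both registration and driver binding.
Where a real lock would sleep, the toy returns -EAGAIN.

```c
/* Toy user-space model of the race; all toy_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Single-threaded stand-in for a device lock: trylock fails while held. */
struct toy_lock { bool held; };

static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct toy_port {
	struct toy_lock host_lock; /* lock of the port's host device */
	bool visible;              /* port has been added to the bus */
	bool bound;                /* port driver has finished binding */
	int nr_dports;
};

/* Second memdev: try to add a dport to a port it found on the bus. */
static int toy_add_dport(struct toy_port *port, bool take_host_lock)
{
	int rc = 0;

	if (!port->visible)
		return -ENOENT;
	/* A real lock would sleep here until the creator is done. */
	if (take_host_lock && !toy_trylock(&port->host_lock))
		return -EAGAIN;

	if (!port->bound)
		rc = -ENXIO;       /* port seen as unbound: dport not added */
	else
		port->nr_dports++;

	if (take_host_lock)
		toy_unlock(&port->host_lock);
	return rc;
}

/*
 * Replay the interleaving. Returns the second memdev's result inside the
 * visible-but-unbound window; *rc_after is its result after binding.
 */
static int toy_replay(bool take_host_lock, int *rc_after)
{
	struct toy_port port = { 0 };
	bool locked = toy_trylock(&port.host_lock); /* first memdev (assumed) */
	int rc_in_window;

	assert(locked);
	(void)locked;
	port.visible = true;                        /* port added to the bus */
	rc_in_window = toy_add_dport(&port, take_host_lock);
	port.bound = true;                          /* driver binding done */
	toy_unlock(&port.host_lock);

	*rc_after = toy_add_dport(&port, take_host_lock);
	return rc_in_window;
}
```

Without the host lock, the attempt inside the window fails with -ENXIO,
which corresponds to the failed dport addition described above. With the
host lock, the attempt cannot proceed until binding has completed, after
which it succeeds.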

base-commit: 63fbf275fa9f18f7020fb8acf54fa107e51d0f23 cxl/next

Changes from V1:
- Remove the patch of initializing memdev->endpoint to NULL. (Dan)
- Fix typos. (Jonathan)
- Introduce a helper called to_port_host().
- unregister_port() cleanup.

Li Ming (2):
  cxl/port: Hold port host lock while dport adding.
  cxl/port: unregister_port() cleanup

 drivers/cxl/core/port.c | 47 +++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 18 deletions(-)

-- 
2.43.0



Thread overview: 10+ messages
2026-02-07 13:35 [PATCH v2 0/2] Fix port enumeration failure Li Ming
2026-02-07 13:35 ` [PATCH v2 1/2] cxl/port: Hold port host lock while dport adding Li Ming
2026-02-07 17:28   ` dan.j.williams
2026-02-08 12:05     ` Li Ming
2026-02-08 12:20     ` Li Ming
2026-02-07 13:35 ` [PATCH v2 2/2] cxl/port: unregister_port() cleanup Li Ming
2026-02-10  3:04 ` [PATCH v2 0/2] Fix port enumeration failure Alison Schofield
2026-02-10 11:56   ` Li Ming
2026-02-10 15:00     ` Dave Jiang
2026-02-11 11:49       ` Li Ming
