* [PATCH 0/2] Fix port enumeration failure and NULL endpoint issue
@ 2026-02-01  9:30 Li Ming
  2026-02-01  9:30 ` [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default Li Ming
  2026-02-01  9:30 ` [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding Li Ming
  0 siblings, 2 replies; 20+ messages in thread
From: Li Ming @ 2026-02-01  9:30 UTC
  To: dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, dan.j.williams
  Cc: linux-cxl, linux-kernel, Li Ming

When running the CXL mock tests on the next branch, I usually hit the
following call trace.

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
 KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
 CPU: 3 UID: 0 PID: 42 Comm: kworker/u16:1 Tainted: G           O      J 6.19.0-rc5-cxl+ #4 PREEMPT(voluntary) 
 Tainted: [O]=OOT_MODULE, [J]=FWCTL
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
 Workqueue: async async_run_entry_fn
 RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
 Call Trace:
  <TASK>
  cxl_event_trace_record+0xd1/0xa70 [cxl_core]
  __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
  cxl_mem_get_records_log+0x261/0x500 [cxl_core]
  cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
  cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
  platform_probe+0x9d/0x130
  really_probe+0x1c8/0x960
  driver_probe_device+0x45/0x120
  __device_attach_driver+0x15d/0x280
  bus_for_each_drv+0x100/0x180
  __device_attach_async_helper+0x199/0x250
  async_run_entry_fn+0x95/0x430
  process_one_work+0x7db/0x1940

After detailed debugging, I identified two independent issues that
together lead to the problem.

Issue 1:
cxlmd->endpoint is initialized to ERR_PTR(-ENXIO) during cxlmd creation,
but the cxl subsystem usually checks endpoint availability by checking
whether it is NULL. As a result, if endpoint port creation fails, some
code paths may incorrectly treat the endpoint as available. In the
call trace above, endpoint port creation fails but cxl_dpa_to_region()
still treats the endpoint as available.
Patch #1 fixes this by initializing cxlmd->endpoint to NULL by default.
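
The mismatch described in Issue 1 can be illustrated with a small
userspace sketch. It is not the driver code: ERR_PTR() and IS_ERR() are
reimplemented here as simplified stand-ins for the kernel helpers, and
struct port, struct memdev and endpoint_available() are hypothetical
names that exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR()/IS_ERR(). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, heavily reduced stand-ins for the CXL structures. */
struct port {
	int id;
};

struct memdev {
	struct port *endpoint;
};

/* Mirrors the "is it NULL?" style of availability check. */
static bool endpoint_available(const struct memdev *md)
{
	return md->endpoint != NULL;
}
```

With an ERR_PTR(-ENXIO) default, endpoint_available() returns true even
though the pointer is not usable. With a NULL default, the same check
returns false, which is the behaviour the NULL-based callers expect.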

Issue 2:
The second issue is why CXL port enumeration can fail. What I
observed is that when two memdevs try to enumerate the same port, the
first memdev is responsible for creating and attaching the port. However,
there is a small window between the point where the new port becomes
visible (after being added to the device list of the cxl bus) and the
point where it is bound to the port driver. During this window, the
second memdev may discover the port and acquire its lock while
attempting to add its dport, which blocks bus_probe_device() inside
device_add(). As a result, the second memdev observes the port as
unbound and fails to add its dport.
Patch #2 fixes this race by holding the grandparent port lock during
dport addition, preventing premature access before driver binding
completes.
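
The window described in Issue 2 can be replayed with a deterministic,
single-threaded model. This is only a sketch under stated assumptions:
struct port, port_trylock(), add_dport() and replay_race() are
hypothetical names, not kernel APIs. The model assumes that the creating
path holds the parent port's lock from the moment the new port becomes
visible until it is bound, and it represents blocking on a lock as
-EAGAIN followed by a retry.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of a port; not the kernel's struct cxl_port. */
struct port {
	struct port *parent;
	bool locked;	/* models the port's device lock */
	bool visible;	/* added to the bus device list */
	bool bound;	/* port driver has finished binding */
	int nr_dports;
};

static bool port_trylock(struct port *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

static void port_unlock(struct port *p)
{
	p->locked = false;
}

/*
 * Path of the second memdev. Returns 0 on success, -ENXIO if the port is
 * found unbound, and -EAGAIN if a required lock is busy (real code would
 * sleep on the lock instead of returning).
 */
static int add_dport(struct port *port, bool hold_parent_lock)
{
	int rc = 0;

	if (!port->visible)
		return -ENODEV;
	if (hold_parent_lock && !port_trylock(port->parent))
		return -EAGAIN;
	if (!port_trylock(port)) {
		rc = -EAGAIN;
		goto out_parent;
	}
	if (!port->bound)
		rc = -ENXIO;
	else
		port->nr_dports++;
	port_unlock(port);
out_parent:
	if (hold_parent_lock)
		port_unlock(port->parent);
	return rc;
}

/* Replays: publish port, second memdev runs in the window, then bind. */
static int replay_race(bool hold_parent_lock)
{
	struct port parent = { .visible = true, .bound = true };
	struct port port = { .parent = &parent };
	int rc;

	/* Creator: assumed to hold the parent lock across add and bind. */
	port_trylock(&parent);
	port.visible = true;		/* the window opens here */

	rc = add_dport(&port, hold_parent_lock);	/* second memdev */

	port_trylock(&port);		/* creator binds the port driver */
	port.bound = true;
	port_unlock(&port);
	port_unlock(&parent);

	/* A blocked second memdev proceeds once the creator is done. */
	if (rc == -EAGAIN)
		rc = add_dport(&port, hold_parent_lock);
	return rc;
}
```

In this model, replay_race(false) returns -ENXIO because the second
memdev sees a visible but unbound port, while replay_race(true) returns
0 because the second memdev cannot inspect the port until the creator
has released the parent lock.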

base-commit: 63050be0bfe0b280cce5d701b31940fd84858609 cxl/next

Li Ming (2):
  cxl/core: Set cxlmd->endpoint to NULL by default
  cxl/core: Hold grandparent port lock for dport adding.

 drivers/cxl/core/memdev.c | 2 +-
 drivers/cxl/core/port.c   | 6 +++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

-- 
2.43.0



Thread overview: 20+ messages
2026-02-01  9:30 [PATCH 0/2] Fix port enumeration failure and NULL endpoint issue Li Ming
2026-02-01  9:30 ` [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default Li Ming
2026-02-02 14:41   ` Jonathan Cameron
2026-02-02 15:48     ` Gregory Price
2026-02-03 14:15     ` Li Ming
2026-02-02 21:04   ` Dave Jiang
2026-02-03 15:04     ` Li Ming
2026-02-03  0:01   ` dan.j.williams
2026-02-03 15:15     ` Li Ming
2026-02-03 22:37       ` dan.j.williams
2026-02-01  9:30 ` [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding Li Ming
2026-02-02 15:39   ` Jonathan Cameron
2026-02-03 14:23     ` Li Ming
2026-02-03 21:14       ` dan.j.williams
2026-02-02 16:31   ` Gregory Price
2026-02-03 14:33     ` Li Ming
2026-02-03  0:07   ` dan.j.williams
2026-02-03 15:21     ` Li Ming
2026-02-03 22:25       ` dan.j.williams
2026-02-04 13:51         ` Li Ming
