From: Li Ming <ming.li@zohomail.com>
To: dave@stgolabs.net, jonathan.cameron@huawei.com,
	dave.jiang@intel.com, alison.schofield@intel.com,
	vishal.l.verma@intel.com, ira.weiny@intel.com,
	dan.j.williams@intel.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
	Li Ming <ming.li@zohomail.com>
Subject: [PATCH 0/2] Fix port enumeration failure and NULL endpoint issue
Date: Sun,  1 Feb 2026 17:30:00 +0800
Message-ID: <20260201093002.1281858-1-ming.li@zohomail.com>

When I run the CXL mock tests on the next branch, I usually hit the
following call trace.

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000092: 0000 [#1] SMP KASAN NOPTI
 KASAN: null-ptr-deref in range [0x0000000000000490-0x0000000000000497]
 CPU: 3 UID: 0 PID: 42 Comm: kworker/u16:1 Tainted: G           O      J 6.19.0-rc5-cxl+ #4 PREEMPT(voluntary) 
 Tainted: [O]=OOT_MODULE, [J]=FWCTL
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
 Workqueue: async async_run_entry_fn
 RIP: 0010:cxl_dpa_to_region+0x105/0x1f0 [cxl_core]
 Call Trace:
  <TASK>
  cxl_event_trace_record+0xd1/0xa70 [cxl_core]
  __cxl_event_trace_record+0x12f/0x1e0 [cxl_core]
  cxl_mem_get_records_log+0x261/0x500 [cxl_core]
  cxl_mem_get_event_records+0x7c/0xc0 [cxl_core]
  cxl_mock_mem_probe+0xd38/0x1c60 [cxl_mock_mem]
  platform_probe+0x9d/0x130
  really_probe+0x1c8/0x960
  driver_probe_device+0x45/0x120
  __device_attach_driver+0x15d/0x280
  bus_for_each_drv+0x100/0x180
  __device_attach_async_helper+0x199/0x250
  async_run_entry_fn+0x95/0x430
  process_one_work+0x7db/0x1940

After detailed debugging, I identified two independent issues that
together lead to the problem.

Issue 1:
cxlmd->endpoint is initialized to ERR_PTR(-ENXIO) when the cxlmd is
created, but the CXL subsystem usually checks endpoint availability by
testing whether the pointer is NULL. As a result, if endpoint port
creation fails, some code paths incorrectly treat the endpoint as
available. In the call trace above, endpoint port creation failed but
cxl_dpa_to_region() still considered the endpoint available.
Patch #1 fixes this by initializing cxlmd->endpoint to NULL by default.
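
To illustrate why the NULL check is fooled, here is a minimal userspace
sketch. It is not the driver code: ERR_PTR() and IS_ERR_OR_NULL() are
re-created locally to mimic the kernel helpers, and struct memdev_sketch,
endpoint_available() and endpoint_check_demo() are hypothetical names
used only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins that mimic the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, trimmed-down memdev: only the field under discussion. */
struct memdev_sketch {
	void *endpoint;
};

/* The NULL-only availability test described above. */
static bool endpoint_available(const struct memdev_sketch *md)
{
	return md->endpoint != NULL;
}

/* Compare the two possible default values under the NULL-only test. */
static void endpoint_check_demo(void)
{
	struct memdev_sketch md = { .endpoint = ERR_PTR(-ENXIO) };

	/* An error pointer is not NULL, so it is reported as available. */
	assert(endpoint_available(&md));
	/* A check that also recognises error pointers would catch it. */
	assert(IS_ERR_OR_NULL(md.endpoint));

	/* With NULL as the default, the same test gives the right answer. */
	md.endpoint = NULL;
	assert(!endpoint_available(&md));
}
```

Either direction would close the gap in this sketch: default to NULL, or
make every checker error-pointer aware. Patch #1 takes the first option.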

Issue 2:
The second issue is why CXL port enumeration can fail in the first place.
What I observed is that when two memdevs try to enumerate the same port,
the first memdev is responsible for creating and attaching the port.
However, there is a small window between the point where the new port
becomes visible (after being added to the device list of the cxl bus) and
the point where it is bound to the port driver. During this window, the
second memdev may discover the port and acquire its lock while attempting
to add its dport, which blocks bus_probe_device() inside device_add(). As
a result, the second memdev observes the port as unbound and fails to add
its dport.
Patch #2 fixes this race by holding the grandparent port lock during
dport addition, preventing premature access before driver binding has
completed.
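
The ordering can be walked through with a small single-threaded model.
This is only a sketch under assumptions, not the kernel code: locks are
plain flags, a would-block case is reported as an error instead of
sleeping, the creator is assumed to hold the grandparent lock from port
addition until binding is done, and every name and error code below is
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a port; not the kernel's struct cxl_port. */
struct port_sketch {
	bool locked;   /* stands in for the device lock */
	bool visible;  /* already on the bus device list */
	bool bound;    /* port driver has finished binding */
};

static bool sketch_trylock(struct port_sketch *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

static void sketch_unlock(struct port_sketch *p)
{
	p->locked = false;
}

/* First memdev, step 1: take the grandparent lock, make the port visible. */
static void creator_add_port(struct port_sketch *gp, struct port_sketch *port)
{
	gp->locked = true;
	port->visible = true;
}

/* First memdev, step 2: binding needs the port lock; then drop gp lock. */
static int creator_bind_port(struct port_sketch *gp, struct port_sketch *port)
{
	if (!sketch_trylock(port))
		return -EBUSY;  /* the real probe would block here instead */
	port->bound = true;
	sketch_unlock(port);
	sketch_unlock(gp);
	return 0;
}

/* Second memdev adding its dport, with or without the gp lock. */
static int second_add_dport(struct port_sketch *gp, struct port_sketch *port,
			    bool take_gp_lock)
{
	int rc;

	if (!port->visible)
		return -ENOENT;
	if (take_gp_lock && !sketch_trylock(gp))
		return -EAGAIN;  /* the real code would sleep on the lock */
	if (!sketch_trylock(port)) {
		if (take_gp_lock)
			sketch_unlock(gp);
		return -EBUSY;
	}
	rc = port->bound ? 0 : -ENXIO;
	sketch_unlock(port);
	if (take_gp_lock)
		sketch_unlock(gp);
	return rc;
}

/* Replay the window: port visible but not yet bound. */
static void race_walkthrough(void)
{
	struct port_sketch gp = { 0 }, port = { 0 };

	creator_add_port(&gp, &port);
	/* Without the gp lock the second memdev sees an unbound port. */
	assert(second_add_dport(&gp, &port, false) == -ENXIO);
	/* With the gp lock it has to wait for the creator. */
	assert(second_add_dport(&gp, &port, true) == -EAGAIN);
	assert(creator_bind_port(&gp, &port) == 0);
	/* Once binding is done, the dport addition succeeds. */
	assert(second_add_dport(&gp, &port, true) == 0);
}
```

In this model the second memdev can only reach the port after the creator
has released the grandparent lock, which closes the visible-but-unbound
window.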

base-commit: 63050be0bfe0b280cce5d701b31940fd84858609 cxl/next

Li Ming (2):
  cxl/core: Set cxlmd->endpoint to NULL by default
  cxl/core: Hold grandparent port lock for dport adding.

 drivers/cxl/core/memdev.c | 2 +-
 drivers/cxl/core/port.c   | 6 +++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

-- 
2.43.0


Thread overview: 20+ messages
2026-02-01  9:30 Li Ming [this message]
2026-02-01  9:30 ` [PATCH 1/2] cxl/core: Set cxlmd->endpoint to NULL by default Li Ming
2026-02-02 14:41   ` Jonathan Cameron
2026-02-02 15:48     ` Gregory Price
2026-02-03 14:15     ` Li Ming
2026-02-02 21:04   ` Dave Jiang
2026-02-03 15:04     ` Li Ming
2026-02-03  0:01   ` dan.j.williams
2026-02-03 15:15     ` Li Ming
2026-02-03 22:37       ` dan.j.williams
2026-02-01  9:30 ` [PATCH 2/2] cxl/core: Hold grandparent port lock while dport adding Li Ming
2026-02-02 15:39   ` Jonathan Cameron
2026-02-03 14:23     ` Li Ming
2026-02-03 21:14       ` dan.j.williams
2026-02-02 16:31   ` Gregory Price
2026-02-03 14:33     ` Li Ming
2026-02-03  0:07   ` dan.j.williams
2026-02-03 15:21     ` Li Ming
2026-02-03 22:25       ` dan.j.williams
2026-02-04 13:51         ` Li Ming
