From: Dan Williams <dan.j.williams@intel.com>
To: dave.jiang@intel.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
Smita.KoralahalliChannabasappa@amd.com,
alison.schofield@intel.com, terry.bowman@amd.com,
alejandro.lucero-palau@amd.com, linux-pci@vger.kernel.org,
Jonathan.Cameron@huawei.com, Alejandro Lucero <alucerop@amd.com>,
Shiju Jose <shiju.jose@huawei.com>
Subject: [PATCH 0/6] cxl: Initialization reworks in support Soft Reserve Recovery and Accelerator Memory
Date: Wed, 3 Dec 2025 18:21:30 -0800
Message-ID: <20251204022136.2573521-1-dan.j.williams@intel.com>
The CXL subsystem is modular. That modularity is a benefit for
separation of concerns and testing. It is generally appropriate for this
class of devices that support hotplug and can dynamically add a CXL
personality alongside their PCI personality. However, a cost of modules
is ambiguity about when devices (cxl_memdevs, cxl_ports, cxl_regions)
have had a chance to attach to their corresponding drivers on
@cxl_bus_type.
The inability to reliably tell whether a device has had a chance to
attach to its driver, or is still waiting for the module to load, is a
problem shared by the "Soft Reserve Recovery" [1] and "Accelerator
Memory" [2] enabling efforts.
"Soft Reserve Recovery" wants to use wait_for_device_probe() as a sync
point for when CXL devices present at boot have had a chance to attach
to the cxl_pci driver (generic CXL memory expansion class driver). That
breaks down if wait_for_device_probe() only flushes PCI device probe,
but not the cxl_mem_probe() of the cxl_memdev that cxl_pci_probe()
creates.
For "Accelerator Memory", the driver is not cxl_pci, but any potential
PCI driver that wants to use the devm_cxl_add_memdev() ABI to attach to
the CXL memory domain. Those drivers want to know if the CXL link is
live end-to-end (from endpoint, through switches, to the host bridge)
and CXL memory operations are enabled. If not, a CXL accelerator may be
able to fall back to PCI-only operation. Similar to "Soft Reserve
Recovery", such a driver needs to know that the CXL subsystem has had a
chance to probe the ancestor topology of the device, so that it can make
a synchronous decision about CXL operation.
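
To make the ordering hazard concrete, here is a minimal userspace model.
It is only a sketch: every toy_* name is hypothetical and none of this is
kernel API. It assumes a bus where adding a device runs the driver's
probe synchronously only if a driver is already registered, so a creator
that adds a device before the driver exists cannot tell "not probed yet"
from "probe failed".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum toy_mode { TOY_PCI_ONLY, TOY_CXL };

struct toy_driver {
	int (*probe)(void *dev_data);
};

struct toy_bus {
	const struct toy_driver *drv;	/* NULL until the "module" has loaded */
};

struct toy_device {
	void *data;
	bool bound;			/* set when probe ran and succeeded */
};

static int toy_probe_ok(void *dev_data)
{
	(void)dev_data;
	return 0;
}

static const struct toy_driver toy_mem_driver = { .probe = toy_probe_ok };

/* Adding a device binds it only if a driver is already registered. */
static void toy_device_add(struct toy_bus *bus, struct toy_device *dev)
{
	dev->bound = bus->drv && bus->drv->probe(dev->data) == 0;
}

/* The creator decides right after the add returns, with no waiting. */
static enum toy_mode toy_accel_mode(const struct toy_device *dev)
{
	return dev->bound ? TOY_CXL : TOY_PCI_ONLY;
}

/* Walk both orderings; returns the mode seen with the driver present. */
static enum toy_mode toy_demo(void)
{
	struct toy_bus bus = { .drv = NULL };
	struct toy_device early = { 0 }, late = { 0 };

	toy_device_add(&bus, &early);	/* driver absent: state is ambiguous */
	assert(!early.bound);

	bus.drv = &toy_mem_driver;	/* driver registered first */
	toy_device_add(&bus, &late);
	return toy_accel_mode(&late);
}
```

In this model, guaranteeing that the driver is registered before any
device is added is what turns the bound/unbound state into something the
creator can act on immediately.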
In support of those efforts:
* Clean up some resource lifetime issues in the current code

* Move some object creation symbols (devm_cxl_add_memdev() and
  devm_cxl_add_endpoint()) into the cxl_mem.ko and cxl_port.ko objects.
  Implicitly guarantee that cxl_mem_driver and cxl_port_driver have been
  registered prior to any device objects being registered. This is
  preferred over explicit open-coded request_module().

* Use scope-based cleanup before adding more resource management in
  devm_cxl_add_memdev()

* Give an accelerator the opportunity to run setup operations in
  cxl_mem_probe() so it can further probe whether the CXL configuration
  matches its needs.
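
For readers unfamiliar with the scope-based cleanup pattern mentioned in
the list above, the following userspace sketch shows the general idea
using the compiler's cleanup attribute (GCC and Clang), which is the
mechanism the kernel's cleanup helpers build on. The toy_* names are
hypothetical and this is not the code from the series; it only shows how
an error path can return early without a goto-based unwind.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object that owns resources. */
struct toy_memdev {
	int id;
};

static int toy_live_objects;	/* allocations not yet freed */

static struct toy_memdev *toy_memdev_alloc(int id)
{
	struct toy_memdev *md = calloc(1, sizeof(*md));

	if (md) {
		md->id = id;
		toy_live_objects++;
	}
	return md;
}

static void toy_memdev_free(struct toy_memdev *md)
{
	if (md) {
		toy_live_objects--;
		free(md);
	}
}

/* Called automatically when the annotated variable leaves scope. */
static void toy_memdev_cleanup(struct toy_memdev **mdp)
{
	toy_memdev_free(*mdp);
}

#define TOY_AUTO_FREE __attribute__((cleanup(toy_memdev_cleanup)))

/* Hand ownership to the caller and disarm the automatic cleanup. */
static struct toy_memdev *toy_take(struct toy_memdev **mdp)
{
	struct toy_memdev *md = *mdp;

	*mdp = NULL;
	return md;
}

static struct toy_memdev *toy_memdev_create(int id, int fail_setup)
{
	struct toy_memdev *md TOY_AUTO_FREE = toy_memdev_alloc(id);

	if (!md)
		return NULL;
	if (fail_setup)
		return NULL;	/* md is freed on scope exit, no unwind label */
	return toy_take(&md);	/* success: caller now owns the object */
}
```

The point of converting before adding more resource management is that
each new failure path then only needs an early return, rather than
another label in an unwind chain.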
Some of these previously appeared on a branch as an RFC [3] and left
"Soft Reserve Recovery" and "Accelerator Memory" to jockey for ordering.
Instead, create a shared topic branch for both of those efforts to
import. The main changes since that RFC are fixing a bug and reducing
the amount of refactoring (which contributed to hiding the bug).
[1]: http://lore.kernel.org/20251120031925.87762-1-Smita.KoralahalliChannabasappa@amd.com
[2]: http://lore.kernel.org/20251119192236.2527305-1-alejandro.lucero-palau@amd.com
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.18/cxl-probe-order
Dan Williams (6):
  cxl/mem: Fix devm_cxl_memdev_edac_release() confusion
  cxl/mem: Arrange for always-synchronous memdev attach
  cxl/port: Arrange for always synchronous endpoint attach
  cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup
  cxl/mem: Drop @host argument to devm_cxl_add_memdev()
  cxl/mem: Introduce a memdev creation ->probe() operation
 drivers/cxl/Kconfig          |   2 +-
 drivers/cxl/cxl.h            |   2 +
 drivers/cxl/cxlmem.h         |  17 ++++--
 drivers/cxl/core/edac.c      |  64 ++++++++++++---------
 drivers/cxl/core/memdev.c    | 104 ++++++++++++++++++++++++-----------
 drivers/cxl/mem.c            |  69 +++++++++--------------
 drivers/cxl/pci.c            |   2 +-
 drivers/cxl/port.c           |  40 ++++++++++++++
 tools/testing/cxl/test/mem.c |   2 +-
 9 files changed, 192 insertions(+), 110 deletions(-)
base-commit: ea5514e300568cbe8f19431c3e424d4791db8291
--
2.51.1