From: Dave Jiang <dave.jiang@intel.com>
To: Dan Williams <djbw@kernel.org>, linux-cxl@vger.kernel.org
Cc: alejandro.lucero-palau@amd.com, jic23@kernel.org,
Alejandro Lucero <alucerop@amd.com>
Subject: Re: [PATCH 5/5] cxl/region: Introduce devm_cxl_probe_mem()
Date: Wed, 20 May 2026 10:37:19 -0700 [thread overview]
Message-ID: <1c4db903-290e-4aa0-b3f2-c4e7688eeb2e@intel.com> (raw)
In-Reply-To: <20260519210158.1499795-6-djbw@kernel.org>
On 5/19/26 2:01 PM, Dan Williams wrote:
> To date, platform firmware maps accelerator memory and accelerator drivers
> simply want an address range that they can map themselves. This typically
> results in a single region being auto-assembled upon registration of a
> memory device. Use the @attach mechanism of devm_cxl_add_memdev()
> parameter to retrieve that region while also adhering to CXL subsystem
> locking and lifetime rules. As part of adhering to current object lifetime
> rules, if the region or the CXL port topology is invalidated, the CXL core
> arranges for the accelertor driver to be detached as well.
>
> The locking and lifetime rules were validated with Dave's work-in-progress
> cxl-type-2 support for cxl_test.
>
> devm_cxl_add_classdev() supports the general memory expansion flow where
> region assembly is optional, dynamic, and user controlled.
>
> Cc: Alejandro Lucero <alucerop@amd.com>
> Signed-off-by: Dan Williams <djbw@kernel.org>
> ---
> drivers/cxl/cxlmem.h | 27 ++++++--
> include/cxl/cxl.h | 3 +
> drivers/cxl/core/region.c | 124 +++++++++++++++++++++++++++++++++++
> drivers/cxl/mem.c | 50 +++++++++++---
> drivers/cxl/pci.c | 2 +-
> tools/testing/cxl/test/mem.c | 2 +-
> 6 files changed, 191 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 776c50d1db51..d3bdd00f94b3 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -34,10 +34,6 @@
> (FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) != \
> CXLMDEV_RESET_NEEDED_NOT)
>
> -struct cxl_memdev_attach {
> - int (*probe)(struct cxl_memdev *cxlmd);
> -};
> -
> /**
> * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device
> * @dev: driver core device object
> @@ -101,10 +97,29 @@ static inline bool is_cxl_endpoint(struct cxl_port *port)
> return is_cxl_memdev(port->uport_dev);
> }
>
> +struct cxl_memdev_attach {
> + int (*probe)(struct cxl_memdev *cxlmd);
> +};
> +
> +/**
> + * struct cxl_attach_region - coordinate mapping a region at memdev registration
> + * @attach: common core attachment descriptor
> + * @hpa_range: physical address range of the region
> + *
> + * For the common simple case of a CXL device with private (non-general purpose
> + * / "accelerator") memory, enumerate firmware instantiated region, or
> + * instantiate a region for the device's capacity. Destroy the region on detach.
> + */
> +struct cxl_attach_region {
> + struct cxl_memdev_attach attach;
> + struct range hpa_range;
> +};
> +
> +int cxl_memdev_attach_region(struct cxl_memdev *cxlmd);
> +
> +struct cxl_memdev *devm_cxl_add_classdev(struct cxl_dev_state *cxlds);
> struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
> const struct cxl_memdev_attach *attach);
> -struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
> - const struct cxl_memdev_attach *attach);
> int devm_cxl_sanitize_setup_notifier(struct device *host,
> struct cxl_memdev *cxlmd);
> struct cxl_memdev_state;
> diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
> index fa7269154620..016c74fb747c 100644
> --- a/include/cxl/cxl.h
> +++ b/include/cxl/cxl.h
> @@ -223,4 +223,7 @@ struct cxl_dev_state *_devm_cxl_dev_state_create(struct device *dev,
> (drv_struct *)_devm_cxl_dev_state_create(parent, type, serial, dvsec, \
> sizeof(drv_struct), mbox); \
> })
> +
> +struct cxl_memdev *devm_cxl_probe_mem(struct cxl_dev_state *cxlds,
> + struct range *range);
> #endif /* __CXL_CXL_H__ */
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index faf9785c0509..ce99f0650764 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1148,6 +1148,19 @@ static int cxl_rr_assign_decoder(struct cxl_port *port, struct cxl_region *cxlr,
> static void cxl_region_setup_flags(struct cxl_region *cxlr,
> struct cxl_decoder *cxld)
> {
> + if (is_endpoint_decoder(&cxld->dev)) {
> + struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(&cxld->dev);
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +
> + /*
> + * When a region's memdevs specify an @attach method the attach
> + * provider is responsible for dispositioning the region for
> + * both probe and userspace management
> + */
> + if (cxlmd->attach)
> + set_bit(CXL_REGION_F_LOCK, &cxlr->flags);
> + }
> +
> if (cxld->flags & CXL_DECODER_F_LOCK) {
> set_bit(CXL_REGION_F_LOCK, &cxlr->flags);
> clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
> @@ -2560,6 +2573,17 @@ static void unregister_region(struct cxl_region *cxlr)
> put_device(&cxlr->dev);
> }
>
> +static void endpoint_unregister_region(void *_cxlr)
> +{
> + struct cxl_region *cxlr = _cxlr;
> + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> +
> + guard(mutex)(&cxlrd->regions_lock);
> + if (xa_load(&cxlrd->regions, cxlr->id))
> + unregister_region(cxlr);
> + put_device(&cxlr->dev);
Claude mentioned that there may be a refcount underflow happening here when xa_load() succeeds, where unregister_region() calls put_device(&cxlr->dev) and then another put_device(&cxlr->dev) in endpoint_unregister_region().
DJ
> +}
> +
> static struct lock_class_key cxl_region_key;
>
> static struct cxl_region *cxl_region_alloc(struct cxl_root_decoder *cxlrd, int id)
> @@ -4057,6 +4081,103 @@ static int cxl_region_can_probe(struct cxl_region *cxlr)
> return 0;
> }
>
> +static int first_mapped_decoder(struct device *dev, const void *data)
> +{
> + struct cxl_endpoint_decoder *cxled;
> +
> + if (!is_endpoint_decoder(dev))
> + return 0;
> +
> + cxled = to_cxl_endpoint_decoder(dev);
> + if (cxled->cxld.region)
> + return 1;
> +
> + return 0;
> +}
> +
> +/*
> + * Runs in cxl_mem_probe context after successful endpoint probe, assumes the
> + * simple case of single mapped decoder per memdev.
> + */
> +int cxl_memdev_attach_region(struct cxl_memdev *cxlmd)
> +{
> + struct cxl_attach_region *attach =
> + container_of(cxlmd->attach, typeof(*attach), attach);
> + struct cxl_port *endpoint = cxlmd->endpoint;
> + struct cxl_endpoint_decoder *cxled;
> + struct cxl_region *cxlr;
> + int rc;
> +
> + /* hold endpoint lock to setup autoremove of the region */
> + guard(device)(&endpoint->dev);
> + if (!endpoint->dev.driver)
> + return -ENXIO;
> + guard(rwsem_read)(&cxl_rwsem.region);
> + guard(rwsem_read)(&cxl_rwsem.dpa);
> +
> + /*
> + * TODO auto-instantiate a region, for now assume this will find an
> + * auto-region
> + */
> + struct device *dev __free(put_device) =
> + device_find_child(&endpoint->dev, NULL, first_mapped_decoder);
> +
> + if (!dev) {
> + dev_dbg(cxlmd->cxlds->dev, "no region found for memdev %s\n",
> + dev_name(&cxlmd->dev));
> + return -ENXIO;
> + }
> +
> + cxled = to_cxl_endpoint_decoder(dev);
> + cxlr = cxled->cxld.region;
> +
> + if (cxlr->params.state < CXL_CONFIG_COMMIT) {
> + dev_dbg(cxlmd->cxlds->dev,
> + "region %s not committed for memdev %s\n",
> + dev_name(&cxlr->dev), dev_name(&cxlmd->dev));
> + return -ENXIO;
> + }
> +
> + if (cxlr->params.nr_targets > 1) {
> + dev_dbg(cxlmd->cxlds->dev,
> + "Only attach to local non-interleaved region\n");
> + return -ENXIO;
> + }
> +
> + /* Only teardown regions that pass validation, ignore the rest */
> + get_device(&cxlr->dev);
> + rc = devm_add_action_or_reset(&endpoint->dev,
> + endpoint_unregister_region, cxlr);
> + if (rc)
> + return rc;
> +
> + attach->hpa_range = (struct range) {
> + .start = cxlr->params.res->start,
> + .end = cxlr->params.res->end,
> + };
> + return 0;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(cxl_memdev_attach_region, "cxl_mem");
> +
> +/*
> + * The presence of an attach method indicates that the region is designated for
> + * a purpose outside of CXL core memory expansion defaults.
> + */
> +static bool cxl_region_has_memdev_attach(struct cxl_region *cxlr)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> +
> + for (int i = 0; i < p->nr_targets; i++) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +
> + if (cxlmd->attach)
> + return true;
> + }
> +
> + return false;
> +}
> +
> static int cxl_region_probe(struct device *dev)
> {
> struct cxl_region *cxlr = to_cxl_region(dev);
> @@ -4088,6 +4209,9 @@ static int cxl_region_probe(struct device *dev)
> if (rc)
> return rc;
>
> + if (cxl_region_has_memdev_attach(cxlr))
> + return 0;
> +
> switch (cxlr->mode) {
> case CXL_PARTMODE_PMEM:
> rc = devm_cxl_region_edac_register(cxlr);
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index ff858318091f..67d31482f06b 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -183,27 +183,59 @@ static int cxl_mem_probe(struct device *dev)
> }
>
> /**
> - * devm_cxl_add_memdev - Add a CXL memory device
> + * devm_cxl_add_classdev - Add a CXL memory class-code device
> * @cxlds: CXL device state to associate with the memdev
> - * @attach: Caller depends on CXL topology attachment
> *
> * Upon return the device will have had a chance to attach to the
> * cxl_mem driver, but may fail to attach if the CXL topology is not ready
> * (hardware CXL link down, or software platform CXL root not attached).
> *
> - * When @attach is NULL it indicates the caller wants the memdev to remain
> - * registered even if it does not immediately attach to the CXL hierarchy. When
> - * @attach is provided a cxl_mem_probe() failure leads to failure of this routine.
> + * The parent of the resulting device and the devm context for allocations is
> + * @cxlds->dev.
> + */
> +struct cxl_memdev *devm_cxl_add_classdev(struct cxl_dev_state *cxlds)
> +{
> + return __devm_cxl_add_memdev(cxlds, NULL);
> +}
> +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_classdev, "CXL");
> +
> +/**
> + * devm_cxl_probe_mem - Add a CXL memory device and probe its region
> + * @cxlds: CXL device state to associate with the memdev
> + * @hpa_range: CXL.mem physical address range result
> + *
> + * Upon return the device will have had a chance to attach to the
> + * cxl_mem driver, but may fail to attach if the CXL topology is not ready
> + * (hardware CXL link down, or software platform CXL root not attached).
> + *
> + * Failure to probe the memdev and/or setup a region for the memdev
> + * results in this function failing.
> *
> * The parent of the resulting device and the devm context for allocations is
> * @cxlds->dev.
> */
> -struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
> - const struct cxl_memdev_attach *attach)
> +struct cxl_memdev *devm_cxl_probe_mem(struct cxl_dev_state *cxlds,
> + struct range *hpa_range)
> {
> - return __devm_cxl_add_memdev(cxlds, attach);
> + struct cxl_attach_region *attach =
> + devm_kmalloc(cxlds->dev, sizeof(*attach), GFP_KERNEL);
> + struct cxl_memdev *cxlmd;
> +
> + if (!attach)
> + return ERR_PTR(-ENOMEM);
> +
> + *attach = (struct cxl_attach_region) {
> + .attach = {
> + .probe = cxl_memdev_attach_region,
> + },
> + .hpa_range = { 0, -1 },
> + };
> +
> + cxlmd = __devm_cxl_add_memdev(cxlds, &attach->attach);
> + *hpa_range = attach->hpa_range;
> + return cxlmd;
> }
> -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");
> +EXPORT_SYMBOL_NS_GPL(devm_cxl_probe_mem, "CXL");
>
> static ssize_t trigger_poison_list_store(struct device *dev,
> struct device_attribute *attr,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index bace662dc988..267c679b0b3c 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -878,7 +878,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> if (rc)
> dev_dbg(&pdev->dev, "No CXL Features discovered\n");
>
> - cxlmd = devm_cxl_add_memdev(cxlds, NULL);
> + cxlmd = devm_cxl_add_classdev(cxlds);
> if (IS_ERR(cxlmd))
> return PTR_ERR(cxlmd);
>
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index 271c7ad8cc32..095ca544ac02 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -1769,7 +1769,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>
> cxl_mock_add_event_logs(&mdata->mes);
>
> - cxlmd = devm_cxl_add_memdev(cxlds, NULL);
> + cxlmd = devm_cxl_add_classdev(cxlds);
> if (IS_ERR(cxlmd))
> return PTR_ERR(cxlmd);
>
next prev parent reply other threads:[~2026-05-20 17:37 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-19 21:01 [PATCH 0/5] cxl: Introduce devm_cxl_probe_mem() and fix region deletion Dan Williams
2026-05-19 21:01 ` [PATCH 1/5] cxl/region: Block region delete during region creation Dan Williams
2026-05-19 21:59 ` Dave Jiang
2026-05-21 14:32 ` Alejandro Lucero Palau
2026-05-19 21:01 ` [PATCH 2/5] cxl/region: Resolve region deletion races Dan Williams
2026-05-19 22:17 ` Dave Jiang
2026-05-21 14:33 ` Alejandro Lucero Palau
2026-05-19 21:01 ` [PATCH 3/5] cxl/memdev: Pin parents for entire memdev lifetime Dan Williams
2026-05-19 22:24 ` Dave Jiang
2026-05-21 14:33 ` Alejandro Lucero Palau
2026-05-19 21:01 ` [PATCH 4/5] cxl/memdev: Introduce cxl_class_memdev_type Dan Williams
2026-05-19 22:50 ` Dave Jiang
2026-05-21 14:33 ` Alejandro Lucero Palau
2026-05-19 21:01 ` [PATCH 5/5] cxl/region: Introduce devm_cxl_probe_mem() Dan Williams
2026-05-19 22:56 ` Dave Jiang
2026-05-20 17:37 ` Dave Jiang [this message]
2026-05-21 14:44 ` Alejandro Lucero Palau
2026-05-21 14:31 ` [PATCH 0/5] cxl: Introduce devm_cxl_probe_mem() and fix region deletion Alejandro Lucero Palau
2026-06-05 7:56 ` Alejandro Lucero Palau
2026-06-05 15:20 ` Dave Jiang
2026-06-06 6:48 ` Alejandro Lucero Palau
2026-06-08 19:30 ` Dave Jiang
2026-06-12 20:52 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1c4db903-290e-4aa0-b3f2-c4e7688eeb2e@intel.com \
--to=dave.jiang@intel.com \
--cc=alejandro.lucero-palau@amd.com \
--cc=alucerop@amd.com \
--cc=djbw@kernel.org \
--cc=jic23@kernel.org \
--cc=linux-cxl@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox