public inbox for linux-cxl@vger.kernel.org
From: "Cheatham, Benjamin" <benjamin.cheatham@amd.com>
To: Dan Williams <dan.j.williams@intel.com>, <dave.jiang@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<alejandro.lucero-palau@amd.com>
Subject: Re: [RFC PATCH 4/4] cxl/region: Introduce cxl_memdev_attach_region
Date: Thu, 9 Apr 2026 17:02:49 -0500	[thread overview]
Message-ID: <59f69b87-e37e-44c7-8c15-c332118622b5@amd.com> (raw)
In-Reply-To: <20260403210050.1058650-5-dan.j.williams@intel.com>

On 4/3/2026 4:00 PM, Dan Williams wrote:
> To date, platform firmware maps accelerator memory and accelerator drivers
> simply want an address range that they can map themselves. This typically
> results in a single region being auto-assembled upon registration of a
> memory device.  Use the @attach mechanism of devm_cxl_add_memdev()
> parameter to retrieve that region while also adhering to CXL subsystem
> locking and lifetime rules. As part of adhering to current object lifetime
> rules, if the region or the CXL port topology is invalidated, the CXL core
> arranges for the accelerator driver to be detached as well.

I'm probably missing something, but where does the core detach the accelerator driver?
I assume it's part of unregister_region(), but I'm not seeing any driver_detach() calls
on the accelerator device.

I don't think there's any demand for this at the moment, so it's more of a "food for thought"
kind of thing, but why not allow for a fallback mode of operation instead of detaching
the accelerator driver? The first thing that comes to mind is an optional callback in
struct cxl_memdev_attach that the core calls during CXL removal. The assumption would
be that an accelerator driver providing the callback cleans up all of its CXL usage in
the callback and then continues in a PCIe-only mode of operation (hardware willing).
When the callback isn't provided, you get the behavior in this patch. Again, this isn't
needed here, but it may be a nice middle ground later on down the line.

> 
> The locking and lifetime rules were validated by this local change to
> cxl_mock_mem. This tests all the lock acquisition and cleanup at
> modprobe -r cxl_test time. More work needed to also test the full positive
> case.
> 
>     struct cxl_attach_region
>             attach = { .attach = {
>                                .probe = cxl_memdev_attach_region,
>                        } };
>     cxlmd = devm_cxl_add_memdev(cxlds, &attach.attach);
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  include/cxl/cxl.h         |  16 +++++
>  drivers/cxl/core/region.c | 125 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 141 insertions(+)
> 
> diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
> index 10a9b8fa2f6b..1698d15ec1ca 100644
> --- a/include/cxl/cxl.h
> +++ b/include/cxl/cxl.h
> @@ -153,6 +153,22 @@ struct cxl_memdev_attach {
>  	int (*probe)(struct cxl_memdev *cxlmd);
>  };
>  
> +/**
> + * struct cxl_attach_region - coordinate mapping a region at memdev registration
> + * @attach: common core attachment descriptor
> + * @region: physical address range of the region
> + *
> + * For the common simple case of a CXL device with private (non-general purpose
> + * / "accelerator") memory, enumerate firmware instantiated region, or
> + * instantiate a region for the device's capacity. Destroy the region on detach.
> + */
> +struct cxl_attach_region {
> +	struct cxl_memdev_attach attach;
> +	struct range region;
> +};
> +
> +int cxl_memdev_attach_region(struct cxl_memdev *cxlmd);
> +
>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 11bc0b88b05f..090f52392b20 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1123,6 +1123,19 @@ static int cxl_rr_assign_decoder(struct cxl_port *port, struct cxl_region *cxlr,
>  static void cxl_region_setup_flags(struct cxl_region *cxlr,
>  				   struct cxl_decoder *cxld)
>  {
> +	if (is_endpoint_decoder(&cxld->dev)) {
> +		struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(&cxld->dev);
> +		struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +
> +		/*
> +		 * When a region's memdevs specify an @attach method the attach
> +		 * provider is responsible for dispositioning the region for
> +		 * both probe and userspace management
> +		 */
> +		if (cxlmd->attach)
> +			set_bit(CXL_REGION_F_LOCK, &cxlr->flags);
> +	}
> +

It seems unfortunate that you set the region lock bit here and then the existing code
below immediately checks whether to set the same bit again. I can't think of a way to
leverage cxld->flags for type 2 in an ergonomic way though, so I guess it's fine.
>  	if (cxld->flags & CXL_DECODER_F_LOCK) {
>  		set_bit(CXL_REGION_F_LOCK, &cxlr->flags);
>  		clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
> @@ -4226,6 +4239,115 @@ static int cxl_region_can_probe(struct cxl_region *cxlr)
>  	return 0;
>  }
>  
> +static int first_mapped_decoder(struct device *dev, const void *data)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	if (cxled->cxld.region)
> +		return 1;
> +
> +	return 0;
> +}
> +
> +/*
> + * As this is running in endpoint port remove context it does not race cxl_root
> + * destruction since port topologies are always removed depth first.
> + */
> +static void cxl_endpoint_region_autoremove(void *_cxlr)
> +{
> +	struct cxl_region *cxlr = _cxlr;
> +	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
> +	struct cxl_port *port = cxlrd_to_port(cxlrd);
> +
> +	devm_release_action(port->uport_dev, unregister_region, cxlr);
> +}
> +
> +/*
> + * Runs in cxl_mem_probe context after successful endpoint probe, assumes the
> + * simple case of single mapped decoder per memdev.
> + */
> +int cxl_memdev_attach_region(struct cxl_memdev *cxlmd)
> +{
> +	struct cxl_attach_region *attach =
> +		container_of(cxlmd->attach, typeof(*attach), attach);
> +	struct cxl_port *endpoint = cxlmd->endpoint;
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_region *cxlr;
> +	int rc;
> +
> +	/* hold endpoint lock to setup autoremove of the region */
> +	guard(device)(&endpoint->dev);
> +	if (!endpoint->dev.driver)
> +		return -ENXIO;
> +	guard(rwsem_read)(&cxl_rwsem.region);
> +	guard(rwsem_read)(&cxl_rwsem.dpa);
> +
> +	/*
> +	 * TODO auto-instantiate a region, for now assume this will find an
> +	 * auto-region
> +	 */
> +	struct device *dev __free(put_device) =
> +		device_find_child(&endpoint->dev, NULL, first_mapped_decoder);
> +
> +	if (!dev) {
> +		dev_dbg(cxlmd->cxlds->dev, "no region found for memdev %s\n",
> +			dev_name(&cxlmd->dev));
> +		return -ENXIO;
> +	}
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	cxlr = cxled->cxld.region;
> +
> +	if (cxlr->params.state < CXL_CONFIG_COMMIT) {
> +		dev_dbg(cxlmd->cxlds->dev,
> +			"region %s not committed for memdev %s\n",
> +			dev_name(&cxlr->dev), dev_name(&cxlmd->dev));
> +		return -ENXIO;
> +	}
> +
> +	if (cxlr->params.nr_targets > 1) {
> +		dev_dbg(cxlmd->cxlds->dev,
> +			"Only attach to local non-interleaved region\n");
> +		return -ENXIO;
> +	}
> +
> +	/* Only teardown regions that pass validation, ignore the rest */
> +	rc = devm_add_action_or_reset(&endpoint->dev,
> +				      cxl_endpoint_region_autoremove, cxlr);
> +	if (rc)
> +		return rc;
> +
> +	attach->region = (struct range) {
> +		.start = cxlr->params.res->start,
> +		.end = cxlr->params.res->end,
> +	};
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_memdev_attach_region, "CXL");
> +
> +/*
> + * The presence of an attach method indicates that the region is designated for
> + * a purpose outside of CXL core memory expansion defaults.
> + */
> +static bool cxl_region_has_memdev_attach(struct cxl_region *cxlr)
> +{

I think this should be renamed to something like "cxl_region_is_private()"; it
better matches the intended use of the function while lessening the burden on
a non-type 2 educated reader. Also has the added benefit of being one less
change site if the mechanism for determining if a region belongs to an accelerator
changes in the future.

Thanks,
Ben
