Linux CXL
 help / color / mirror / Atom feed
From: "Cheatham, Benjamin" <benjamin.cheatham@amd.com>
To: <alejandro.lucero-palau@amd.com>, <linux-cxl@vger.kernel.org>,
	<djbw@kernel.org>, <edward.cree@amd.com>, <davem@davemloft.net>,
	<kuba@kernel.org>, <pabeni@redhat.com>, <edumazet@google.com>,
	<dave.jiang@intel.com>
Cc: Alejandro Lucero <alucerop@amd.com>
Subject: Re: [PATCH v26 6/8] cxl: attach region to an accelerator/type2 memdev
Date: Wed, 29 Apr 2026 16:14:51 -0500	[thread overview]
Message-ID: <a8390b56-e5a9-4e06-b8f8-e66a8bd9e824@amd.com> (raw)
In-Reply-To: <20260423180528.17166-7-alejandro.lucero-palau@amd.com>

On 4/23/2026 1:05 PM, alejandro.lucero-palau@amd.com wrote:
> From: Alejandro Lucero <alucerop@amd.com>
> 
> Support an accelerator driver to safely work with an autodiscovered
> region from a committed HDM decoder through:
> 
>         1) an accelerator driver cxl_attach_region struct with attach
>            and detach callbacks.
> 
>         2) a specific function, cxl_memdev_attach_region() keeping the
>            required locks for finding a region linked to the memdev
>            endpoint, and
> 
>         3) invoking attach callback while keeping the locking allowing to
>            work (ioremap and other internal stuff) with the related physical
>            range by the accelerator driver, and
> 
>         4) linking a detach callback to the endpoint device removal where
>            the accelerator driver can stop using the region range.
> 
> This covers the cases of a potential removal of cxl_acpi module or a
> accelerator memdev unbinding from cxl_mem driver through sysfs.
> 
> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> ---
>  drivers/cxl/core/region.c          | 118 ++++++++++++++++++++++++++++-
>  drivers/net/ethernet/sfc/efx_cxl.c |  37 +++++++++
>  drivers/net/ethernet/sfc/efx_cxl.h |   2 +
>  include/cxl/cxl.h                  |  17 +++++
>  4 files changed, 171 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index e50dc716d4e8..68f5a1fd1b1c 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2711,9 +2711,16 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
>  	if (rc)
>  		goto err;
>  
> -	rc = devm_add_action_or_reset(port->uport_dev, unregister_region, cxlr);
> -	if (rc)
> -		return ERR_PTR(rc);
> +	/*
> +	 * For accelerators/type2, region release linked to endpoint device.
> +	 * See handling of cxl_endpoint_region_autoremove() below by
> +	 * cxl_memdev_attach_region().
> +	 */
> +	if (type == CXL_DECODER_HOSTONLYMEM) {
> +		rc = devm_add_action_or_reset(port->uport_dev, unregister_region, cxlr);
> +		if (rc)
> +			return ERR_PTR(rc);
> +	}
>  
>  	dev_dbg(port->uport_dev, "%s: created %s\n",
>  		dev_name(&cxlrd->cxlsd.cxld.dev), dev_name(dev));
> @@ -4043,6 +4050,111 @@ static int cxl_region_can_probe(struct cxl_region *cxlr)
>  	return 0;
>  }
>  
> +static int first_mapped_decoder(struct device *dev, const void *data)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	if (cxled->cxld.region)
> +		return 1;
> +
> +	return 0;
> +}
> +
> +/*
> + * As this is running in endpoint port remove context it does not race cxl_root
> + * destruction since port topologies are always removed depth first.
> + */
> +static void cxl_endpoint_region_autoremove(void *_cxlr)
> +{
> +	unregister_region(_cxlr);
> +}
> +
> +/**
> + * cxl_memdev_attach_region - bind region to accelerator memdev
> + *
> + * @cxlmd: a pointer to cxl_memdev to use
> + * @attach: a pointer to region attach struct with callbacks for
> + * 	    safely working with a region range by the caller
> + *
> + * Returns 0 or error.
> + */
> +int cxl_memdev_attach_region(struct cxl_memdev *cxlmd,
> +			     struct cxl_attach_region *attach)
> +{
> +	struct cxl_port *endpoint = cxlmd->endpoint;
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_region *cxlr;
> +	int rc;
> +
> +	/* hold endpoint lock to setup autoremove of the region */
> +	guard(device)(&endpoint->dev);
> +	if (!endpoint->dev.driver)
> +		return -ENXIO;
> +	guard(rwsem_read)(&cxl_rwsem.region);
> +	guard(rwsem_read)(&cxl_rwsem.dpa);
> +
> +	/*
> +	 * TODO auto-instantiate a region, for now assume this will find an
> +	 * auto-region
> +	 */
> +	struct device *dev __free(put_device) =
> +		device_find_child(&endpoint->dev, NULL, first_mapped_decoder);
> +
> +	if (!dev) {
> +		dev_dbg(cxlmd->cxlds->dev, "no region found for memdev %s\n",
> +			dev_name(&cxlmd->dev));
> +		return -ENXIO;
> +	}
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	cxlr = cxled->cxld.region;
> +
> +	if (cxlr->params.state < CXL_CONFIG_COMMIT) {
> +		dev_dbg(cxlmd->cxlds->dev,
> +			"region %s not committed for memdev %s\n",
> +			dev_name(&cxlr->dev), dev_name(&cxlmd->dev));
> +		return -ENXIO;
> +	}
> +
> +	if (cxlr->params.nr_targets > 1) {
> +		dev_dbg(cxlmd->cxlds->dev,
> +			"Only attach to local non-interleaved region\n");
> +		return -ENXIO;
> +	}
> +
> +	attach->region = (struct range) {
> +		.start = cxlr->params.res->start,
> +		.end = cxlr->params.res->end,
> +	};
> +
> +	/*
> +	 * With endpoint locked leave the caller to safely work with the region
> +	 * range.
> +	 */

This reads a bit weirdly, how about: "Let the caller do its region-specific setup"? I
don't think it's important to mention the endpoint being locked here.

> +	rc = attach->attach(attach->data);
> +	if (rc)
> +		return rc;
> +
> +	/* Only teardown regions that pass validation, ignore the rest */
> +	rc = devm_add_action_or_reset(&endpoint->dev,
> +				      cxl_endpoint_region_autoremove, cxlr);
> +	if (rc)
> +		return rc;
> +
> +	/* Link type2 driver callback for stopping use of the region range. */
> +	rc = devm_add_action_or_reset(&endpoint->dev,
> +				      attach->detach, attach->data);
> +	if (rc)
> +		return rc;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_memdev_attach_region, "CXL");
> +
>  static int cxl_region_probe(struct device *dev)
>  {
>  	struct cxl_region *cxlr = to_cxl_region(dev);
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
> index 7d8a6c2133c8..9ca16b051d62 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> @@ -14,6 +14,36 @@
>  
>  #define EFX_CTPIO_BUFFER_SIZE	SZ_256M
>  
> +/* Called with cxl endpoint device locked for precluding potential related
> + * cxl region removal triggered from user space, allowing safely mapping of
> + * such cxl region by the sfc driver.
> + */
> +static int efx_cxl_map_region(void *data) {
> +	struct efx_probe_data *probe_data = data;
> +	struct efx_nic *efx = &probe_data->efx;
> +	struct pci_dev *pci_dev = efx->pci_dev;
> +	struct efx_cxl *cxl = probe_data->cxl;
> +	struct range *cxl_pio_range = &cxl->attach_region.region;
> +
> +	cxl->ctpio_cxl = ioremap(cxl_pio_range->start,
> +				 cxl_pio_range->end - cxl_pio_range->start + 1);
> +	if (!cxl->ctpio_cxl) {
> +		pci_err(pci_dev, "CXL ioremap region (%pra) failed\n",
> +				 cxl_pio_range);
> +		return -ENOMEM;
> +	}
> +	probe_data->cxl_pio_initialised = true;
> +	return 0;
> +}
> +
> +/* Called at driver exit or when user space triggers cxl region removal. */
> +static void efx_cxl_unmap_region(void *data) {
> +	struct efx_probe_data *probe_data = data;
> +
> +	probe_data->cxl_pio_initialised = false;
> +	iounmap(probe_data->cxl->ctpio_cxl);
> +}
> +
>  int efx_cxl_init(struct efx_probe_data *probe_data)
>  {
>  	struct efx_nic *efx = &probe_data->efx;
> @@ -81,6 +111,13 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
>  		return PTR_ERR(cxl->cxlmd);
>  	}
>  
> +	probe_data->cxl->attach_region.attach = efx_cxl_map_region;
> +	probe_data->cxl->attach_region.detach = efx_cxl_unmap_region;
> +	probe_data->cxl->attach_region.data = probe_data;
> +
> +	cxl_memdev_attach_region(cxl->cxlmd,
> +				 &probe_data->cxl->attach_region);

I would think you want to check the return code here, no? I looked at the last patch
and don't think this will cause issues, but I would at least expect some debug output
here.
> +
>  	probe_data->cxl = cxl;
>  
>  	return 0;
> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
> index 04e46278464d..1c294cd1df56 100644
> --- a/drivers/net/ethernet/sfc/efx_cxl.h
> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
> @@ -20,6 +20,8 @@ struct efx_probe_data;
>  struct efx_cxl {
>  	struct cxl_dev_state cxlds;
>  	struct cxl_memdev *cxlmd;
> +	struct cxl_attach_region attach_region;
> +	void __iomem *ctpio_cxl;
>  };
>  
>  int efx_cxl_init(struct efx_probe_data *probe_data);
> diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
> index 10a9b8fa2f6b..22d9435b351f 100644
> --- a/include/cxl/cxl.h
> +++ b/include/cxl/cxl.h
> @@ -7,6 +7,7 @@
>  
>  #include <linux/node.h>
>  #include <linux/ioport.h>
> +#include <linux/pci.h>
>  #include <cxl/mailbox.h>
>  
>  /**
> @@ -153,6 +154,20 @@ struct cxl_memdev_attach {
>  	int (*probe)(struct cxl_memdev *cxlmd);
>  };
>  
> +/**
> + * struct cxl_attach_region - accelerator region handling
> + * @attach: invoked at cxl_memdev_attach_region() with endpoint device locked.
> + * @detach: invoked at endpoint release.
> + * @data: pointer referencing accelerator data for attach and detach calls.
> + * @region: initialised with autodiscovered region values linked to memdev.
> + */
> +struct cxl_attach_region {
> +	int (*attach)(void *);
> +	void (*detach)(void *);
> +	void *data;
> +	struct range region;
> +};

I like your approach here, but have a problem with this. It seems like an attempt to circumvent the cxl_memdev_attach
struct above, which makes it borderline useless (as of now). Would you be opposed to folding it into cxl_memdev_attach
instead? I'm thinking doing something like:

/**
 * struct cxl_memdev_attach - CXL memdev driver and region attach handlers
 *
 * @mem_probe: Callback used during cxl_mem::probe for memdev specific setup
 * @region_attach: Device specific setup invoked during cxl_memdev_attach_region() with endpoint device locked
 * @region_detach: Device specific cleanup invoked at endpoint release
 * @region_data: Pointer passed to region_attach and region_detach calls
 * @region: Memory range covered by autodiscovered region linked to memdev
 */
struct cxl_memdev_attach {
	int (*mem_probe)(struct cxl_memdev *cxlmd);
	int (*region_attach)(void *data);
	int (*region_detach)(void *data);
	void *region_data;
	struct range region;
};

and then replace cxl_attach_region with cxl_memdev_attach accordingly. I think that's a 3-way win: You get
what you want to do, Dan's work isn't invalidated, and cxl_memdev_attach becomes a one-stop shop for all
CXL accelerator attach shenanigans.

> +
>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> @@ -231,4 +246,6 @@ struct cxl_dev_state *_devm_cxl_dev_state_create(struct device *dev,
>  int cxl_set_capacity(struct cxl_dev_state *cxlds, u64 capacity);
>  struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
>  				       const struct cxl_memdev_attach *attach);
> +struct cxl_region;
> +int cxl_memdev_attach_region(struct cxl_memdev *cxlmd, struct cxl_attach_region *attach);
>  #endif /* __CXL_CXL_H__ */


  reply	other threads:[~2026-04-29 21:14 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-23 18:05 [PATCH v26 0/8] Type2 device basic support alejandro.lucero-palau
2026-04-23 18:05 ` [PATCH v26 1/8] sfc: add cxl support alejandro.lucero-palau
2026-04-29 21:14   ` Cheatham, Benjamin
2026-05-01 10:07     ` Alejandro Lucero Palau
2026-04-23 18:05 ` [PATCH v26 2/8] cxl/sfc: Map cxl regs alejandro.lucero-palau
2026-04-23 18:05 ` [PATCH v26 3/8] cxl/sfc: Initialize dpa without a mailbox alejandro.lucero-palau
2026-04-23 18:05 ` [PATCH v26 4/8] cxl: Prepare memdev creation for type2 alejandro.lucero-palau
2026-04-30 23:23   ` Dan Williams (nvidia)
2026-04-23 18:05 ` [PATCH v26 5/8] sfc: create type2 cxl memdev alejandro.lucero-palau
2026-04-23 18:05 ` [PATCH v26 6/8] cxl: attach region to an accelerator/type2 memdev alejandro.lucero-palau
2026-04-29 21:14   ` Cheatham, Benjamin [this message]
2026-05-01 10:35     ` Alejandro Lucero Palau
2026-05-01  2:00   ` Dan Williams (nvidia)
2026-05-01 10:59     ` Alejandro Lucero Palau
2026-05-02  0:46       ` Dan Williams (nvidia)
2026-05-05 20:51         ` Alejandro Lucero Palau
2026-04-23 18:05 ` [PATCH v26 7/8] cxl: Avoid dax creation for accelerators alejandro.lucero-palau
2026-04-29 21:14   ` Cheatham, Benjamin
2026-04-23 18:05 ` [PATCH v26 8/8] sfc: support pio mapping based on cxl alejandro.lucero-palau
2026-04-23 22:07 ` [PATCH v26 0/8] Type2 device basic support Dave Jiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a8390b56-e5a9-4e06-b8f8-e66a8bd9e824@amd.com \
    --to=benjamin.cheatham@amd.com \
    --cc=alejandro.lucero-palau@amd.com \
    --cc=alucerop@amd.com \
    --cc=dave.jiang@intel.com \
    --cc=davem@davemloft.net \
    --cc=djbw@kernel.org \
    --cc=edumazet@google.com \
    --cc=edward.cree@amd.com \
    --cc=kuba@kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox