From: Dave Jiang <dave.jiang@intel.com>
To: "Cheatham, Benjamin" <benjamin.cheatham@amd.com>,
alejandro.lucero-palau@amd.com
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
linux-cxl@vger.kernel.org, netdev@vger.kernel.org,
dan.j.williams@intel.com, edward.cree@amd.com,
davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
edumazet@google.com
Subject: Re: [PATCH v19 18/22] cxl: Allow region creation by type2 drivers
Date: Wed, 15 Oct 2025 14:42:03 -0700 [thread overview]
Message-ID: <aa942655-d740-4052-8ddc-13540b06ef14@intel.com> (raw)
In-Reply-To: <c42081c1-09e6-45be-8f9e-e4eea0eb1296@amd.com>
On 10/9/25 1:56 PM, Cheatham, Benjamin wrote:
> On 10/6/2025 5:01 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Creating a CXL region requires userspace intervention through the cxl
>> sysfs files. Type2 support should allow accelerator drivers to create
>> such cxl region from kernel code.
>>
>> Adding that functionality and integrating it with current support for
>> memory expanders.
>>
>> Support an action by the type2 driver to be linked to the created region
>> for unwinding the resources allocated properly.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592159835.1948938.1647215579839222774.stgit@dwillia2-xfh.jf.intel.com/
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> ---
>
> Fix for this one should be split between 13/22 and this patch, but the majority of it is in this one. The idea is
> if we don't find a free decoder we check for pre-programmed decoders and use that instead. Unfortunately, this
> invalidates some of the assumptions made by __construct_new_region().
Wouldn't you look for a pre-programmed decoder first and construct the auto region before you try to manually create one? Also for a type 2 device, would the driver know what it wants and what the region configuration should look like? Would it be a single region either it's auto or manual, or would there be a configuration of multiple regions possible? To me a type 2 region is more intentional where the driver would know exactly what it needs and thus trying to get that from the cxl core.
DJ
>
> __construct_new_region() assumes that 1) the underlying HPA is unallocated and 2) the HDM decoders aren't programmed. Neither
> of those are true for a decoder that's programmed by BIOS. The HPA is allocated as part of endpoint_port_probe()
> (see devm_cxl_enumerate_decoders() in cxl/core/hdm.c specifically) and the HDM decoders are enabled and committed by BIOS before
> we ever see them. So the idea here is to split the second half of __construct_new_region() into the 2 cases: un-programmed decoders
> (__setup_new_region()) and pre-programmed decoders (__setup_new_auto_region()). The main differences between the two is we don't
> allocate the HPA region or commit the HDM decoders and just insert the region resource below the CXL window instead in the auto case.
>
> I'm not sure if I've done everything correctly, but I don't see any errors and get the following iomem tree:
> 1050000000-144fffffff : CXL Window 0
> 1050000000-144fffffff : region0
> 1050000000-144fffffff : Soft Reserved
>
> ---
>
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 4af5de5e0a44..a5fa8dd0e63f 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -137,6 +137,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> struct cxl_endpoint_dvsec_info *info);
> int cxl_port_get_possible_dports(struct cxl_port *port);
>
> +bool is_auto_decoder(struct cxl_endpoint_decoder *cxled);
> +
> #ifdef CONFIG_CXL_FEATURES
> struct cxl_feat_entry *
> cxl_feature_info(struct cxl_features_state *cxlfs, const uuid_t *uuid);
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 1f7aa79c1541..8f6236a88c0b 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -712,16 +712,33 @@ static int find_free_decoder(struct device *dev, const void *data)
> return 1;
> }
>
> +bool is_auto_decoder(struct cxl_endpoint_decoder *cxled)
> +{
> + return cxled->state == CXL_DECODER_STATE_AUTO && cxled->pos < 0 &&
> + (cxled->cxld.flags & CXL_DECODER_F_ENABLE);
> +}
> +
> +static int find_auto_decoder(struct device *dev, const void *data)
> +{
> + if (!is_endpoint_decoder(dev))
> + return 0;
> +
> + return is_auto_decoder(to_cxl_endpoint_decoder(dev));
> +}
> +
> static struct cxl_endpoint_decoder *
> cxl_find_free_decoder(struct cxl_memdev *cxlmd)
> {
> struct cxl_port *endpoint = cxlmd->endpoint;
> struct device *dev;
>
> - scoped_guard(rwsem_read, &cxl_rwsem.dpa) {
> - dev = device_find_child(&endpoint->dev, NULL,
> - find_free_decoder);
> - }
> + guard(rwsem_read)(&cxl_rwsem.dpa);
> + dev = device_find_child(&endpoint->dev, NULL,
> + find_free_decoder);
> + if (dev)
> + return to_cxl_endpoint_decoder(dev);
> +
> + dev = device_find_child(&endpoint->dev, NULL, find_auto_decoder);
> if (dev)
> return to_cxl_endpoint_decoder(dev);
>
> @@ -761,6 +778,9 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_memdev *cxlmd,
> if (!cxled)
> return ERR_PTR(-ENODEV);
>
> + if (is_auto_decoder(cxled))
> + return_ptr(cxled);
> +
> rc = cxl_dpa_set_part(cxled, mode);
> if (rc)
> return ERR_PTR(rc);
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 2d60131edff3..004e01ad0e5f 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3699,48 +3699,74 @@ cxl_find_region_by_range(struct cxl_root_decoder *cxlrd, struct range *hpa)
> }
>
> static struct cxl_region *
> -__construct_new_region(struct cxl_root_decoder *cxlrd,
> - struct cxl_endpoint_decoder **cxled, int ways)
> +__setup_new_auto_region(struct cxl_region *cxlr, struct cxl_root_decoder *cxlrd,
> + struct cxl_endpoint_decoder **cxled, int ways)
> {
> - struct cxl_memdev *cxlmd = cxled_to_memdev(cxled[0]);
> - struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> - struct cxl_region_params *p;
> + struct range *hpa = &cxled[0]->cxld.hpa_range;
> + struct cxl_region_params *p = &cxlr->params;
> resource_size_t size = 0;
> - struct cxl_region *cxlr;
> - int rc, i;
> + struct resource *res;
> + int rc = -EINVAL, i = 0;
>
> - cxlr = construct_region_begin(cxlrd, cxled[0]);
> - if (IS_ERR(cxlr))
> - return cxlr;
> + scoped_guard(rwsem_read, &cxl_rwsem.dpa)
> + {
> + for (i = 0; i < ways; i++) {
> + if (!cxled[i]->dpa_res)
> + goto err;
>
> - guard(rwsem_write)(&cxl_rwsem.region);
> + if (!is_auto_decoder(cxled[i]))
> + goto err;
>
> - /*
> - * Sanity check. This should not happen with an accel driver handling
> - * the region creation.
> - */
> - p = &cxlr->params;
> - if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
> - dev_err(cxlmd->dev.parent,
> - "%s:%s: %s unexpected region state\n",
> - dev_name(&cxlmd->dev), dev_name(&cxled[0]->cxld.dev),
> - __func__);
> - rc = -EBUSY;
> - goto err;
> + size += resource_size(cxled[i]->dpa_res);
> + }
> }
>
> - rc = set_interleave_ways(cxlr, ways);
> - if (rc)
> + set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
> +
> + p->res = kmalloc(sizeof(*res), GFP_KERNEL);
> + if (!p->res) {
> + rc = -ENOMEM;
> goto err;
> + }
>
> - rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
> + *p->res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
> + dev_name(&cxlr->dev));
> +
> + rc = insert_resource(cxlrd->res, p->res);
> if (rc)
> - goto err;
> + dev_warn(&cxlr->dev, "Could not insert resource\n");
> +
> + p->state = CXL_CONFIG_INTERLEAVE_ACTIVE;
> + scoped_guard(rwsem_read, &cxl_rwsem.dpa)
> + {
> + for (i = 0; i < ways; i++) {
> + rc = cxl_region_attach(cxlr, cxled[i], -1);
> + if (rc)
> + goto err;
> + }
> + }
> +
> + return cxlr;
> +
> +err:
> + drop_region(cxlr);
> + return ERR_PTR(rc);
> +}
> +
> +static struct cxl_region *
> +__setup_new_region(struct cxl_region *cxlr, struct cxl_root_decoder *cxlrd,
> + struct cxl_endpoint_decoder **cxled, int ways)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> + resource_size_t size = 0;
> + int rc = -EINVAL, i = 0;
>
> - scoped_guard(rwsem_read, &cxl_rwsem.dpa) {
> + scoped_guard(rwsem_read, &cxl_rwsem.dpa)
> + {
> for (i = 0; i < ways; i++) {
> if (!cxled[i]->dpa_res)
> break;
> +
> size += resource_size(cxled[i]->dpa_res);
> }
> }
> @@ -3752,7 +3778,8 @@ __construct_new_region(struct cxl_root_decoder *cxlrd,
> if (rc)
> goto err;
>
> - scoped_guard(rwsem_read, &cxl_rwsem.dpa) {
> + scoped_guard(rwsem_read, &cxl_rwsem.dpa)
> + {
> for (i = 0; i < ways; i++) {
> rc = cxl_region_attach(cxlr, cxled[i], 0);
> if (rc)
> @@ -3760,16 +3787,61 @@ __construct_new_region(struct cxl_root_decoder *cxlrd,
> }
> }
>
> + rc = cxl_region_decode_commit(cxlr);
> if (rc)
> goto err;
>
> - rc = cxl_region_decode_commit(cxlr);
> + p->state = CXL_CONFIG_COMMIT;
> + return cxlr;
> +
> +err:
> + drop_region(cxlr);
> + return ERR_PTR(rc);
> +}
> +
> +static struct cxl_region *
> +__construct_new_region(struct cxl_root_decoder *cxlrd,
> + struct cxl_endpoint_decoder **cxled, int ways)
> +{
> + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled[0]);
> + struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> + struct cxl_region_params *p;
> + struct cxl_region *cxlr;
> + int rc;
> +
> + cxlr = construct_region_begin(cxlrd, cxled[0]);
> + if (IS_ERR(cxlr))
> + return cxlr;
> +
> + guard(rwsem_write)(&cxl_rwsem.region);
> +
> + /*
> + * Sanity check. This should not happen with an accel driver handling
> + * the region creation.
> + */
> + p = &cxlr->params;
> + if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
> + dev_err(cxlmd->dev.parent,
> + "%s:%s: %s unexpected region state\n",
> + dev_name(&cxlmd->dev), dev_name(&cxled[0]->cxld.dev),
> + __func__);
> + rc = -EBUSY;
> + goto err;
> + }
> +
> + rc = set_interleave_ways(cxlr, ways);
> if (rc)
> goto err;
>
> - p->state = CXL_CONFIG_COMMIT;
> + rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
> + if (rc)
> + goto err;
> +
> + if (is_auto_decoder(cxled[0]))
> + return __setup_new_auto_region(cxlr, cxlrd, cxled, ways);
> + else
> + return __setup_new_region(cxlr, cxlrd, cxled, ways);
>
> - return cxlr;
> err:
> drop_region(cxlr);
> return ERR_PTR(rc);
next prev parent reply other threads:[~2025-10-15 21:42 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-06 10:01 [PATCH v19 00/22] Type2 device basic support alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 01/22] cxl/mem: Arrange for always-synchronous memdev attach alejandro.lucero-palau
2025-10-07 12:40 ` Jonathan Cameron
2025-10-07 12:42 ` Jonathan Cameron
2025-10-10 23:11 ` Dave Jiang
2025-10-29 11:20 ` Alejandro Lucero Palau
2025-10-30 19:57 ` Koralahalli Channabasappa, Smita
2025-11-10 10:43 ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 02/22] cxl/port: Arrange for always synchronous endpoint attach alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 03/22] cxl/mem: Introduce a memdev creation ->probe() operation alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 04/22] cxl: Add type2 device basic support alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 05/22] sfc: add cxl support alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 06/22] cxl: Move pci generic code alejandro.lucero-palau
2025-10-07 13:01 ` Jonathan Cameron
2025-11-10 11:23 ` Alejandro Lucero Palau
2025-11-11 13:41 ` Jonathan Cameron
2025-10-06 10:01 ` [PATCH v19 07/22] cxl: allow Type2 drivers to map cxl component regs alejandro.lucero-palau
2025-10-07 13:18 ` Jonathan Cameron
2025-11-10 11:28 ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 08/22] cxl: Support dpa initialization without a mailbox alejandro.lucero-palau
2025-10-07 13:22 ` Jonathan Cameron
2025-11-10 11:28 ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 09/22] cxl: Prepare memdev creation for type2 alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 10/22] sfc: create type2 cxl memdev alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 11/22] cxl: Define a driver interface for HPA free space enumeration alejandro.lucero-palau
2025-10-07 13:43 ` Jonathan Cameron
2025-11-10 11:46 ` Alejandro Lucero Palau
2025-10-09 20:55 ` Cheatham, Benjamin
2025-10-10 11:16 ` Alejandro Lucero Palau
2025-10-15 17:52 ` Dave Jiang
2025-10-15 18:17 ` Dave Jiang
2025-11-10 11:57 ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 12/22] sfc: get root decoder alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 13/22] cxl: Define a driver interface for DPA allocation alejandro.lucero-palau
2025-10-07 13:52 ` Jonathan Cameron
2025-10-15 20:07 ` Dave Jiang
2025-11-10 12:02 ` Alejandro Lucero Palau
2025-10-15 20:08 ` Dave Jiang
2025-11-10 12:04 ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 14/22] sfc: get endpoint decoder alejandro.lucero-palau
2025-10-15 20:15 ` Dave Jiang
2025-11-10 12:08 ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 15/22] cxl: Make region type based on endpoint type alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 16/22] cxl/region: Factor out interleave ways setup alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 17/22] cxl/region: Factor out interleave granularity setup alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 18/22] cxl: Allow region creation by type2 drivers alejandro.lucero-palau
2025-10-07 14:11 ` Jonathan Cameron
2025-11-10 13:47 ` Alejandro Lucero Palau
2025-11-11 14:04 ` Jonathan Cameron
2025-10-09 20:56 ` Cheatham, Benjamin
2025-10-15 21:42 ` Dave Jiang [this message]
2025-10-16 13:23 ` Cheatham, Benjamin
2025-10-20 13:24 ` Alejandro Lucero Palau
2025-10-20 13:59 ` Dave Jiang
2025-10-20 14:59 ` Alejandro Lucero Palau
2025-10-15 21:36 ` Dave Jiang
2025-10-20 13:04 ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 19/22] cxl: Avoid dax creation for accelerators alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 20/22] sfc: create cxl region alejandro.lucero-palau
2025-10-07 14:13 ` Jonathan Cameron
2025-10-06 10:01 ` [PATCH v19 21/22] cxl: Add function for obtaining region range alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 22/22] sfc: support pio mapping based on cxl alejandro.lucero-palau
2025-10-07 14:48 ` Jonathan Cameron
2025-11-10 14:54 ` Alejandro Lucero Palau
2025-10-07 23:41 ` [PATCH v19 00/22] Type2 device basic support Dave Jiang
2025-10-10 10:39 ` Alejandro Lucero Palau
2025-10-10 15:57 ` Dave Jiang
2025-10-10 16:54 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aa942655-d740-4052-8ddc-13540b06ef14@intel.com \
--to=dave.jiang@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=alejandro.lucero-palau@amd.com \
--cc=benjamin.cheatham@amd.com \
--cc=dan.j.williams@intel.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=edward.cree@amd.com \
--cc=kuba@kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox