Re: [PATCH v19 18/22] cxl: Allow region creation by type2 drivers

From: Dave Jiang <dave.jiang@intel.com>
To: "Cheatham, Benjamin" <benjamin.cheatham@amd.com>,
	alejandro.lucero-palau@amd.com
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	linux-cxl@vger.kernel.org, netdev@vger.kernel.org,
	dan.j.williams@intel.com, edward.cree@amd.com,
	davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	edumazet@google.com
Subject: Re: [PATCH v19 18/22] cxl: Allow region creation by type2 drivers
Date: Wed, 15 Oct 2025 14:42:03 -0700
Message-ID: <aa942655-d740-4052-8ddc-13540b06ef14@intel.com>
In-Reply-To: <c42081c1-09e6-45be-8f9e-e4eea0eb1296@amd.com>



On 10/9/25 1:56 PM, Cheatham, Benjamin wrote:
> On 10/6/2025 5:01 AM, alejandro.lucero-palau@amd.com wrote:
>> From: Alejandro Lucero <alucerop@amd.com>
>>
>> Creating a CXL region requires userspace intervention through the cxl
>> sysfs files. Type2 support should allow accelerator drivers to create
>> such a cxl region from kernel code.
>>
>> Add that functionality and integrate it with the current support for
>> memory expanders.
>>
>> Support linking an action by the type2 driver to the created region
>> so that the allocated resources are unwound properly.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592159835.1948938.1647215579839222774.stgit@dwillia2-xfh.jf.intel.com/
>>
>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> ---
> 
> The fix for this one should be split between 13/22 and this patch, but the majority of it is in this one. The idea is
> that if we don't find a free decoder, we check for pre-programmed decoders and use one of those instead. Unfortunately,
> this invalidates some of the assumptions made by __construct_new_region().

Wouldn't you look for a pre-programmed decoder first and construct the auto region before you try to create one manually? Also, for a type 2 device, would the driver know what it wants and what the region configuration should look like? Would it be a single region, whether auto or manual, or would a configuration with multiple regions be possible? To me a type 2 region is more intentional: the driver would know exactly what it needs and would request that from the cxl core.
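
To make the ordering question concrete, here is a minimal userspace sketch, not kernel code. Every toy_* name is hypothetical, and the "free" test in particular is an assumption rather than what find_free_decoder() actually checks. Only the auto test mirrors is_auto_decoder() from the quoted diff. It shows a lookup that prefers a pre-programmed decoder and falls back to a free one, the reverse of the order in the proposed cxl_find_free_decoder().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not struct cxl_endpoint_decoder. */
enum toy_state { TOY_STATE_MANUAL, TOY_STATE_AUTO };

struct toy_decoder {
	enum toy_state state;	/* how the decoder was set up */
	bool enabled;		/* models CXL_DECODER_F_ENABLE */
	int pos;		/* < 0: not yet attached to a region */
	bool has_dpa;		/* models a non-NULL dpa_res */
};

/* Mirrors the is_auto_decoder() condition from the quoted diff. */
static bool toy_is_auto(const struct toy_decoder *d)
{
	return d->state == TOY_STATE_AUTO && d->pos < 0 && d->enabled;
}

/* Assumption for illustration: "free" means no DPA and not enabled. */
static bool toy_is_free(const struct toy_decoder *d)
{
	return !d->has_dpa && !d->enabled;
}

/* Prefer a pre-programmed decoder; fall back to a free one. */
static const struct toy_decoder *
toy_pick_decoder(const struct toy_decoder *decs, size_t n)
{
	size_t i;

	assert(decs != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (toy_is_auto(&decs[i]))
			return &decs[i];

	for (i = 0; i < n; i++)
		if (toy_is_free(&decs[i]))
			return &decs[i];

	return NULL;
}
```

With this ordering a BIOS-programmed decoder is always consumed before a manual region is attempted, which is the behavior the question above is probing for.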

DJ


> 
> __construct_new_region() assumes that 1) the underlying HPA is unallocated and 2) the HDM decoders aren't programmed. Neither
> of those is true for a decoder that's programmed by BIOS. The HPA is allocated as part of endpoint_port_probe()
> (see devm_cxl_enumerate_decoders() in cxl/core/hdm.c specifically) and the HDM decoders are enabled and committed by BIOS before
> we ever see them. So the idea here is to split the second half of __construct_new_region() into two cases: un-programmed decoders
> (__setup_new_region()) and pre-programmed decoders (__setup_new_auto_region()). The main difference between the two is that in the
> auto case we don't allocate the HPA region or commit the HDM decoders; we just insert the region resource below the CXL window instead.
> 
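
The split described above can be modeled in userspace as follows. This is a sketch under stated assumptions, not the kernel implementation: the toy_* types and toy_plan_region() are hypothetical, and placing a manual region at the start of the window is a simplification that assumes the window is otherwise empty. It only captures which steps each case needs: the auto case reuses the HPA range already programmed into the decoder, while the manual case must allocate HPA and commit the decoders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range with inclusive bounds. */
struct toy_range {
	uint64_t start;
	uint64_t end;
};

static uint64_t toy_range_len(struct toy_range r)
{
	return r.end - r.start + 1;
}

static bool toy_range_contains(struct toy_range outer, struct toy_range inner)
{
	return inner.start >= outer.start && inner.end <= outer.end;
}

/* What region setup has to do in each case (illustrative only). */
struct toy_plan {
	struct toy_range hpa;	/* where the region ends up */
	bool alloc_hpa;		/* HPA must be allocated from the window */
	bool commit_decoders;	/* HDM decoders must be programmed/committed */
};

/* Returns 0 on success, -1 if the request cannot fit the window. */
static int toy_plan_region(struct toy_range window, bool is_auto,
			   struct toy_range programmed, uint64_t size,
			   struct toy_plan *plan)
{
	assert(plan != NULL);

	if (is_auto) {
		/* BIOS already chose the range; just validate and reuse it. */
		if (!toy_range_contains(window, programmed))
			return -1;
		plan->hpa = programmed;
		plan->alloc_hpa = false;
		plan->commit_decoders = false;
		return 0;
	}

	/* Simplification: carve from the start of an empty window. */
	if (size == 0 || size > toy_range_len(window))
		return -1;
	plan->hpa.start = window.start;
	plan->hpa.end = window.start + size - 1;
	plan->alloc_hpa = true;
	plan->commit_decoders = true;
	return 0;
}
```

Feeding it the window from the iomem tree quoted below (1050000000-144fffffff) with an identical programmed range yields a plan that covers the whole window with no allocation and no commit.
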
> I'm not sure if I've done everything correctly, but I don't see any errors and get the following iomem tree:
>   1050000000-144fffffff : CXL Window 0
>     1050000000-144fffffff : region0
>       1050000000-144fffffff : Soft Reserved
> 
> ---
> 
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 4af5de5e0a44..a5fa8dd0e63f 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -137,6 +137,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
>                         struct cxl_endpoint_dvsec_info *info);
>  int cxl_port_get_possible_dports(struct cxl_port *port);
> 
> +bool is_auto_decoder(struct cxl_endpoint_decoder *cxled);
> +
>  #ifdef CONFIG_CXL_FEATURES
>  struct cxl_feat_entry *
>  cxl_feature_info(struct cxl_features_state *cxlfs, const uuid_t *uuid);
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 1f7aa79c1541..8f6236a88c0b 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -712,16 +712,33 @@ static int find_free_decoder(struct device *dev, const void *data)
>         return 1;
>  }
> 
> +bool is_auto_decoder(struct cxl_endpoint_decoder *cxled)
> +{
> +       return cxled->state == CXL_DECODER_STATE_AUTO && cxled->pos < 0 &&
> +              (cxled->cxld.flags & CXL_DECODER_F_ENABLE);
> +}
> +
> +static int find_auto_decoder(struct device *dev, const void *data)
> +{
> +       if (!is_endpoint_decoder(dev))
> +               return 0;
> +
> +       return is_auto_decoder(to_cxl_endpoint_decoder(dev));
> +}
> +
>  static struct cxl_endpoint_decoder *
>  cxl_find_free_decoder(struct cxl_memdev *cxlmd)
>  {
>         struct cxl_port *endpoint = cxlmd->endpoint;
>         struct device *dev;
> 
> -       scoped_guard(rwsem_read, &cxl_rwsem.dpa) {
> -               dev = device_find_child(&endpoint->dev, NULL,
> -                                       find_free_decoder);
> -       }
> +       guard(rwsem_read)(&cxl_rwsem.dpa);
> +       dev = device_find_child(&endpoint->dev, NULL,
> +                               find_free_decoder);
> +       if (dev)
> +               return to_cxl_endpoint_decoder(dev);
> +
> +       dev = device_find_child(&endpoint->dev, NULL, find_auto_decoder);
>         if (dev)
>                 return to_cxl_endpoint_decoder(dev);
> 
> @@ -761,6 +778,9 @@ struct cxl_endpoint_decoder *cxl_request_dpa(struct cxl_memdev *cxlmd,
>         if (!cxled)
>                 return ERR_PTR(-ENODEV);
> 
> +       if (is_auto_decoder(cxled))
> +               return_ptr(cxled);
> +
>         rc = cxl_dpa_set_part(cxled, mode);
>         if (rc)
>                 return ERR_PTR(rc);
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 2d60131edff3..004e01ad0e5f 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3699,48 +3699,74 @@ cxl_find_region_by_range(struct cxl_root_decoder *cxlrd, struct range *hpa)
>  }
> 
>  static struct cxl_region *
> -__construct_new_region(struct cxl_root_decoder *cxlrd,
> -                      struct cxl_endpoint_decoder **cxled, int ways)
> +__setup_new_auto_region(struct cxl_region *cxlr, struct cxl_root_decoder *cxlrd,
> +                       struct cxl_endpoint_decoder **cxled, int ways)
>  {
> -       struct cxl_memdev *cxlmd = cxled_to_memdev(cxled[0]);
> -       struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> -       struct cxl_region_params *p;
> +       struct range *hpa = &cxled[0]->cxld.hpa_range;
> +       struct cxl_region_params *p = &cxlr->params;
>         resource_size_t size = 0;
> -       struct cxl_region *cxlr;
> -       int rc, i;
> +       struct resource *res;
> +       int rc = -EINVAL, i = 0;
> 
> -       cxlr = construct_region_begin(cxlrd, cxled[0]);
> -       if (IS_ERR(cxlr))
> -               return cxlr;
> +       scoped_guard(rwsem_read, &cxl_rwsem.dpa)
> +       {
> +               for (i = 0; i < ways; i++) {
> +                       if (!cxled[i]->dpa_res)
> +                               goto err;
> 
> -       guard(rwsem_write)(&cxl_rwsem.region);
> +                       if (!is_auto_decoder(cxled[i]))
> +                               goto err;
> 
> -       /*
> -        * Sanity check. This should not happen with an accel driver handling
> -        * the region creation.
> -        */
> -       p = &cxlr->params;
> -       if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
> -               dev_err(cxlmd->dev.parent,
> -                       "%s:%s: %s  unexpected region state\n",
> -                       dev_name(&cxlmd->dev), dev_name(&cxled[0]->cxld.dev),
> -                       __func__);
> -               rc = -EBUSY;
> -               goto err;
> +                       size += resource_size(cxled[i]->dpa_res);
> +               }
>         }
> 
> -       rc = set_interleave_ways(cxlr, ways);
> -       if (rc)
> +       set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
> +
> +       p->res = kmalloc(sizeof(*res), GFP_KERNEL);
> +       if (!p->res) {
> +               rc = -ENOMEM;
>                 goto err;
> +       }
> 
> -       rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
> +       *p->res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
> +                                      dev_name(&cxlr->dev));
> +
> +       rc = insert_resource(cxlrd->res, p->res);
>         if (rc)
> -               goto err;
> +               dev_warn(&cxlr->dev, "Could not insert resource\n");
> +
> +       p->state = CXL_CONFIG_INTERLEAVE_ACTIVE;
> +       scoped_guard(rwsem_read, &cxl_rwsem.dpa)
> +       {
> +               for (i = 0; i < ways; i++) {
> +                       rc = cxl_region_attach(cxlr, cxled[i], -1);
> +                       if (rc)
> +                               goto err;
> +               }
> +       }
> +
> +       return cxlr;
> +
> +err:
> +       drop_region(cxlr);
> +       return ERR_PTR(rc);
> +}
> +
> +static struct cxl_region *
> +__setup_new_region(struct cxl_region *cxlr, struct cxl_root_decoder *cxlrd,
> +                  struct cxl_endpoint_decoder **cxled, int ways)
> +{
> +       struct cxl_region_params *p = &cxlr->params;
> +       resource_size_t size = 0;
> +       int rc = -EINVAL, i = 0;
> 
> -       scoped_guard(rwsem_read, &cxl_rwsem.dpa) {
> +       scoped_guard(rwsem_read, &cxl_rwsem.dpa)
> +       {
>                 for (i = 0; i < ways; i++) {
>                         if (!cxled[i]->dpa_res)
>                                 break;
> +
>                         size += resource_size(cxled[i]->dpa_res);
>                 }
>         }
> @@ -3752,7 +3778,8 @@ __construct_new_region(struct cxl_root_decoder *cxlrd,
>         if (rc)
>                 goto err;
> 
> -       scoped_guard(rwsem_read, &cxl_rwsem.dpa) {
> +       scoped_guard(rwsem_read, &cxl_rwsem.dpa)
> +       {
>                 for (i = 0; i < ways; i++) {
>                         rc = cxl_region_attach(cxlr, cxled[i], 0);
>                         if (rc)
> @@ -3760,16 +3787,61 @@ __construct_new_region(struct cxl_root_decoder *cxlrd,
>                 }
>         }
> 
> +       rc = cxl_region_decode_commit(cxlr);
>         if (rc)
>                 goto err;
> 
> -       rc = cxl_region_decode_commit(cxlr);
> +       p->state = CXL_CONFIG_COMMIT;
> +       return cxlr;
> +
> +err:
> +       drop_region(cxlr);
> +       return ERR_PTR(rc);
> +}
> +
> +static struct cxl_region *
> +__construct_new_region(struct cxl_root_decoder *cxlrd,
> +                      struct cxl_endpoint_decoder **cxled, int ways)
> +{
> +       struct cxl_memdev *cxlmd = cxled_to_memdev(cxled[0]);
> +       struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
> +       struct cxl_region_params *p;
> +       struct cxl_region *cxlr;
> +       int rc;
> +
> +       cxlr = construct_region_begin(cxlrd, cxled[0]);
> +       if (IS_ERR(cxlr))
> +               return cxlr;
> +
> +       guard(rwsem_write)(&cxl_rwsem.region);
> +
> +       /*
> +        * Sanity check. This should not happen with an accel driver handling
> +        * the region creation.
> +        */
> +       p = &cxlr->params;
> +       if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) {
> +               dev_err(cxlmd->dev.parent,
> +                       "%s:%s: %s  unexpected region state\n",
> +                       dev_name(&cxlmd->dev), dev_name(&cxled[0]->cxld.dev),
> +                       __func__);
> +               rc = -EBUSY;
> +               goto err;
> +       }
> +
> +       rc = set_interleave_ways(cxlr, ways);
>         if (rc)
>                 goto err;
> 
> -       p->state = CXL_CONFIG_COMMIT;
> +       rc = set_interleave_granularity(cxlr, cxld->interleave_granularity);
> +       if (rc)
> +               goto err;
> +
> +       if (is_auto_decoder(cxled[0]))
> +               return __setup_new_auto_region(cxlr, cxlrd, cxled, ways);
> +       else
> +               return __setup_new_region(cxlr, cxlrd, cxled, ways);
> 
> -       return cxlr;
>  err:
>         drop_region(cxlr);
>         return ERR_PTR(rc);


Thread overview: 68+ messages
2025-10-06 10:01 [PATCH v19 00/22] Type2 device basic support alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 01/22] cxl/mem: Arrange for always-synchronous memdev attach alejandro.lucero-palau
2025-10-07 12:40   ` Jonathan Cameron
2025-10-07 12:42     ` Jonathan Cameron
2025-10-10 23:11       ` Dave Jiang
2025-10-29 11:20     ` Alejandro Lucero Palau
2025-10-30 19:57       ` Koralahalli Channabasappa, Smita
2025-11-10 10:43         ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 02/22] cxl/port: Arrange for always synchronous endpoint attach alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 03/22] cxl/mem: Introduce a memdev creation ->probe() operation alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 04/22] cxl: Add type2 device basic support alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 05/22] sfc: add cxl support alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 06/22] cxl: Move pci generic code alejandro.lucero-palau
2025-10-07 13:01   ` Jonathan Cameron
2025-11-10 11:23     ` Alejandro Lucero Palau
2025-11-11 13:41       ` Jonathan Cameron
2025-10-06 10:01 ` [PATCH v19 07/22] cxl: allow Type2 drivers to map cxl component regs alejandro.lucero-palau
2025-10-07 13:18   ` Jonathan Cameron
2025-11-10 11:28     ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 08/22] cxl: Support dpa initialization without a mailbox alejandro.lucero-palau
2025-10-07 13:22   ` Jonathan Cameron
2025-11-10 11:28     ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 09/22] cxl: Prepare memdev creation for type2 alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 10/22] sfc: create type2 cxl memdev alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 11/22] cxl: Define a driver interface for HPA free space enumeration alejandro.lucero-palau
2025-10-07 13:43   ` Jonathan Cameron
2025-11-10 11:46     ` Alejandro Lucero Palau
2025-10-09 20:55   ` Cheatham, Benjamin
2025-10-10 11:16     ` Alejandro Lucero Palau
2025-10-15 17:52       ` Dave Jiang
2025-10-15 18:17   ` Dave Jiang
2025-11-10 11:57     ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 12/22] sfc: get root decoder alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 13/22] cxl: Define a driver interface for DPA allocation alejandro.lucero-palau
2025-10-07 13:52   ` Jonathan Cameron
2025-10-15 20:07     ` Dave Jiang
2025-11-10 12:02     ` Alejandro Lucero Palau
2025-10-15 20:08   ` Dave Jiang
2025-11-10 12:04     ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 14/22] sfc: get endpoint decoder alejandro.lucero-palau
2025-10-15 20:15   ` Dave Jiang
2025-11-10 12:08     ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 15/22] cxl: Make region type based on endpoint type alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 16/22] cxl/region: Factor out interleave ways setup alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 17/22] cxl/region: Factor out interleave granularity setup alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 18/22] cxl: Allow region creation by type2 drivers alejandro.lucero-palau
2025-10-07 14:11   ` Jonathan Cameron
2025-11-10 13:47     ` Alejandro Lucero Palau
2025-11-11 14:04       ` Jonathan Cameron
2025-10-09 20:56   ` Cheatham, Benjamin
2025-10-15 21:42     ` Dave Jiang [this message]
2025-10-16 13:23       ` Cheatham, Benjamin
2025-10-20 13:24         ` Alejandro Lucero Palau
2025-10-20 13:59           ` Dave Jiang
2025-10-20 14:59             ` Alejandro Lucero Palau
2025-10-15 21:36   ` Dave Jiang
2025-10-20 13:04     ` Alejandro Lucero Palau
2025-10-06 10:01 ` [PATCH v19 19/22] cxl: Avoid dax creation for accelerators alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 20/22] sfc: create cxl region alejandro.lucero-palau
2025-10-07 14:13   ` Jonathan Cameron
2025-10-06 10:01 ` [PATCH v19 21/22] cxl: Add function for obtaining region range alejandro.lucero-palau
2025-10-06 10:01 ` [PATCH v19 22/22] sfc: support pio mapping based on cxl alejandro.lucero-palau
2025-10-07 14:48   ` Jonathan Cameron
2025-11-10 14:54     ` Alejandro Lucero Palau
2025-10-07 23:41 ` [PATCH v19 00/22] Type2 device basic support Dave Jiang
2025-10-10 10:39   ` Alejandro Lucero Palau
2025-10-10 15:57     ` Dave Jiang
2025-10-10 16:54       ` Dave Jiang