From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: <linux-cxl@vger.kernel.org>, Dave Jiang <dave.jiang@intel.com>,
"Alejandro Lucero" <alucerop@amd.com>,
Ira Weiny <ira.weiny@intel.com>
Subject: Re: [PATCH 4/4] cxl: Make cxl_dpa_alloc() DPA partition number agnostic
Date: Fri, 17 Jan 2025 11:12:15 +0000 [thread overview]
Message-ID: <20250117111215.000061c7@huawei.com> (raw)
In-Reply-To: <173709425022.753996.16667967718406367188.stgit@dwillia2-xfh.jf.intel.com>
On Thu, 16 Jan 2025 22:10:50 -0800
Dan Williams <dan.j.williams@intel.com> wrote:
> cxl_dpa_alloc() is a hard coded nest of assumptions around PMEM
> allocations being distinct from RAM allocations in specific ways when in
> practice the allocation rules are only relative to DPA partition index.
>
> The rules for cxl_dpa_alloc() are:
>
> - allocations can only come from 1 partition
>
> - if allocating at partition-index-N, all free space in partitions less
> than partition-index-N must be skipped over
>
> Use the new 'struct cxl_dpa_partition' array to support allocation with
> an arbitrary number of DPA partitions on the device.
>
> A follow-on patch can go further to cleanup 'enum cxl_decoder_mode'
> concept and supersede it with looking up the memory properties from
> partition metadata.
If we'd move to meta data and these were tightly packed then I'd be fine
with nr_partitions. Until that step, I find it confusing.
A few comments inline. This series does bring some advantages though
at cost of code that needs a bit more documentation at the very least.
>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Alejandro Lucero <alucerop@amd.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> drivers/cxl/core/hdm.c | 167 +++++++++++++++++++++++++++++++++---------------
> drivers/cxl/cxlmem.h | 9 +++
> 2 files changed, 125 insertions(+), 51 deletions(-)
>
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 7e1559b3ed88..4a2816102a1e 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -223,6 +223,30 @@ void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds)
> }
> EXPORT_SYMBOL_NS_GPL(cxl_dpa_debug, "CXL");
>
Some documentation would be useful. I'm not sure I understand
this algorithm correctly.
I think this complexity is all about ensuring that skip regions have
their resources broken up on partition boundaries?
Can we potentially relax constraints a little more to make this
easier to read by not caring on the ordering? Find overlap
of skip region with any partition and remove that bit unconditionally.
> +static void release_skip(struct cxl_dev_state *cxlds,
> + const resource_size_t skip_base,
> + const resource_size_t skip_len)
> +{
> + resource_size_t skip_start = skip_base, skip_rem = skip_len;
> +
> + for (int i = 0; i < cxlds->nr_partitions; i++) {
> + const struct resource *part_res = &cxlds->part[i].res;
> + resource_size_t skip_end, skip_size;
> +
> + if (skip_start < part_res->start || skip_start > part_res->end)
> + continue;
> +
> + skip_end = min(part_res->end, skip_start + skip_rem - 1);
> + skip_size = skip_end - skip_start + 1;
> + __release_region(&cxlds->dpa_res, skip_start, skip_size);
> + skip_start += skip_size;
> + skip_rem -= skip_size;
> +
> + if (!skip_rem)
> + break;
> + }
> +}
> +
> /*
> * Must be called in a context that synchronizes against this decoder's
> * port ->remove() callback (like an endpoint decoder sysfs attribute)
> @@ -241,7 +265,7 @@ static void __cxl_dpa_release(struct cxl_endpoint_decoder *cxled)
> skip_start = res->start - cxled->skip;
> __release_region(&cxlds->dpa_res, res->start, resource_size(res));
> if (cxled->skip)
> - __release_region(&cxlds->dpa_res, skip_start, cxled->skip);
> + release_skip(cxlds, skip_start, cxled->skip);
> cxled->skip = 0;
> cxled->dpa_res = NULL;
> put_device(&cxled->cxld.dev);
> @@ -268,6 +292,47 @@ static void devm_cxl_dpa_release(struct cxl_endpoint_decoder *cxled)
> __cxl_dpa_release(cxled);
> }
>
> +static int request_skip(struct cxl_dev_state *cxlds,
> + struct cxl_endpoint_decoder *cxled,
> + const resource_size_t skip_base,
> + const resource_size_t skip_len)
> +{
> + resource_size_t skip_start = skip_base, skip_rem = skip_len;
> +
> + for (int i = 0; i < cxlds->nr_partitions; i++) {
Likewise, if we relax a constraint on ordering can we make this simpler?
Would just need to keep track on whether we had reserved enough. I'm not
100% sure that is sufficient for the final error check.
> + const struct resource *part_res = &cxlds->part[i].res;
> + struct cxl_port *port = cxled_to_port(cxled);
> + resource_size_t skip_end, skip_size;
> + struct resource *res;
> +
> + if (skip_start < part_res->start || skip_start > part_res->end)
> + continue;
> +
> + skip_end = min(part_res->end, skip_start + skip_rem - 1);
> + skip_size = skip_end - skip_start + 1;
> +
> + res = __request_region(&cxlds->dpa_res, skip_start, skip_size,
> + dev_name(&cxled->cxld.dev), 0);
> + if (!res) {
> + dev_dbg(cxlds->dev,
> + "decoder%d.%d: failed to reserve skipped space\n",
> + port->id, cxled->cxld.id);
> + break;
> + }
> + skip_start += skip_size;
> + skip_rem -= skip_size;
> + if (!skip_rem)
> + break;
> + }
> +
> + if (skip_rem == 0)
> + return 0;
> +
> + release_skip(cxlds, skip_base, skip_len - skip_rem);
Ah, this complicates possibility of relaxations as we'd need to pass in what
partion number we'd reached when fail occurred.
Maybe this is the best algorithm, but I'd definitely like docs for this
function to make it clear what it's assumptions are (paritions in order of DPA etc)
> +
> + return -EBUSY;
> +}
> +
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 2e728d4b7327..43acd48b300f 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -515,6 +515,15 @@ static inline resource_size_t cxl_pmem_size(struct cxl_dev_state *cxlds)
> return resource_size(res);
> }
>
> +static inline bool cxl_partition_contains(struct cxl_dev_state *cxlds,
> + unsigned int part,
> + struct resource *res)
> +{
> + if (part >= cxlds->nr_partitions)
> + return false;
> + return resource_contains(&cxlds->part[part].res, res);
As per previous review. zero size resource never contains, so can drop the check
on nr_partitions and instead check against MAX that the core initializes to empty
(and might be overwritten by the drivers).
> +}
> +
> static inline struct cxl_dev_state *mbox_to_cxlds(struct cxl_mailbox *cxl_mbox)
> {
> return dev_get_drvdata(cxl_mbox->host);
>
>
next prev parent reply other threads:[~2025-01-17 11:12 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-17 6:10 [PATCH 0/4] cxl: DPA partition metadata is a mess Dan Williams
2025-01-17 6:10 ` [PATCH 1/4] cxl: Remove the CXL_DECODER_MIXED mistake Dan Williams
2025-01-17 10:03 ` Jonathan Cameron
2025-01-17 17:47 ` Dan Williams
2025-01-17 10:24 ` Alejandro Lucero Palau
2025-01-17 17:54 ` Dan Williams
2025-01-17 18:45 ` Ira Weiny
2025-01-17 6:10 ` [PATCH 2/4] cxl: Introduce to_{ram,pmem}_{res,perf}() helpers Dan Williams
2025-01-17 10:20 ` Jonathan Cameron
2025-01-17 10:23 ` Jonathan Cameron
2025-01-17 17:55 ` Dan Williams
2025-01-17 13:33 ` Alejandro Lucero Palau
2025-01-17 20:47 ` Dan Williams
2025-01-17 6:10 ` [PATCH 3/4] cxl: Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info' Dan Williams
2025-01-17 10:52 ` Jonathan Cameron
2025-01-17 13:38 ` Alejandro Lucero Palau
2025-01-17 18:23 ` Dan Williams
2025-01-17 20:32 ` Ira Weiny
2025-01-20 12:24 ` Alejandro Lucero Palau
2025-01-31 23:54 ` Dan Williams
2025-01-17 15:58 ` Alejandro Lucero Palau
2025-01-17 22:52 ` Dan Williams
2025-01-17 20:42 ` Ira Weiny
2025-01-17 22:08 ` Ira Weiny
2025-01-31 23:39 ` Dan Williams
2025-01-17 6:10 ` [PATCH 4/4] cxl: Make cxl_dpa_alloc() DPA partition number agnostic Dan Williams
2025-01-17 11:12 ` Jonathan Cameron [this message]
2025-01-17 18:37 ` Dan Williams
2025-01-17 15:42 ` Alejandro Lucero Palau
2025-01-17 20:57 ` Dan Williams
2025-01-20 12:39 ` Alejandro Lucero Palau
2025-02-01 0:08 ` Dan Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250117111215.000061c7@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alucerop@amd.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox