From: Robert Richter <rrichter@amd.com>
To: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>,
Jonathan Cameron <jonathan.cameron@huawei.com>,
Dave Jiang <dave.jiang@intel.com>,
Alison Schofield <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
ming.li@zohomail.com, linux-kernel@vger.kernel.org,
linux-cxl@vger.kernel.org
Subject: Re: [PATCH 3/4 v3] cxl/core: Enable Region creation on x86 with Low Memory Hole
Date: Fri, 21 Mar 2025 11:29:49 +0100
Message-ID: <Z90_ncUzI20TGfq4@rric.localdomain>
In-Reply-To: <20250314113708.759808-4-fabio.m.de.francesco@linux.intel.com>
On 14.03.25 12:36:32, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
>
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. In some cases the size
> of that hole is not compatible with the CXL hardware decoder constraint
> that the size is always aligned to 256M * Interleave Ways.
>
> On those systems, BIOS publishes CFMWS which communicate the active System
> Physical Address (SPA) ranges that map to a subset of the Host Physical
> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> the endpoint is lost with no SPA to map to CXL HPA in that hole.
>
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, cxl_add_to_region() fails and returns an error
> because it can't find any CXL Window that matches a given CXL Endpoint
> Decoder.
>
> Detect a Low Memory Hole by comparing Root Decoders and Endpoint Decoders
> ranges with the use of arch_match_{spa,region}() helpers.
>
> Match Root Decoders and CXL Regions with corresponding CXL Endpoint
> Decoders. Currently a Low Memory Hole would prevent the matching functions
> from returning true.
>
> Construct CXL Regions with HPA range's end adjusted to the matching SPA.
>
> Allow the attach target process to complete by allowing Regions to not
> comply with alignment constraints (i.e., alignment to NIW * 256M rule).
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/cxl/Kconfig | 5 ++++
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/region.c | 56 +++++++++++++++++++++++++++++++++------
> tools/testing/cxl/Kbuild | 1 +
> 4 files changed, 55 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 205547e5543a..3bb282ef01df 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -139,6 +139,11 @@ config CXL_REGION
>
> If unsure say 'y'
>
> +config CXL_ARCH_LOW_MEMORY_HOLE
> + def_bool y
> + depends on CXL_REGION
> + depends on X86
> +
> config CXL_REGION_INVALIDATION_TEST
> bool "CXL: Region Cache Management Bypass (TEST)"
> depends on CXL_REGION
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 139b349b3a52..3dccd3c224f1 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -17,6 +17,7 @@ cxl_core-y += cdat.o
> cxl_core-y += acpi.o
> cxl_core-y += ras.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> cxl_core-$(CONFIG_CXL_FEATURES) += features.o
> cxl_core-$(CONFIG_CXL_MCE) += mce.o
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 97122d645cc1..9eb23ecedecf 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -13,6 +13,7 @@
> #include <cxlmem.h>
> #include <cxl.h>
> #include "core.h"
> +#include "lmh.h"
>
> /**
> * DOC: cxl core region
> @@ -835,6 +836,8 @@ static int match_free_decoder(struct device *dev, const void *data)
> static bool region_res_match_cxl_range(const struct cxl_region_params *p,
> struct range *range)
> {
> + struct cxl_decoder *cxld;
> +
> if (!p->res)
> return false;
>
> @@ -843,8 +846,15 @@ static bool region_res_match_cxl_range(const struct cxl_region_params *p,
> * to be fronted by the DRAM range in current known implementation.
> * This assumption will be made until a variant implementation exists.
> */
> - return p->res->start + p->cache_size == range->start &&
> - p->res->end == range->end;
> + if (p->res->start + p->cache_size == range->start &&
> + p->res->end == range->end)
> + return true;
> +
> + cxld = container_of(range, struct cxl_decoder, hpa_range);
> + if (arch_match_region(p, cxld))
> + return true;
> +
> + return false;
This reaches a level of complexity that will become unmaintainable within
a couple of years, or even months. We need a maintainable solution for
all this. In particular, callbacks or handlers that are enabled by
platform checks would help to better isolate the code.
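To illustrate the callback idea, here is a minimal user-space sketch, not
a proposed implementation. Everything except the shape of range_contains()
is an assumption made up for illustration: struct cxl_platform_ops,
cxl_platform_setup(), cxl_root_matches_endpoint() and the simplified
lmh_match_spa() condition do not exist in the CXL core, and struct range
is a stand-in for the kernel type. The point is only that callers use one
entry point and the platform quirk lives behind a handler that a platform
check installs once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct range (inclusive bounds). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical per-platform hooks; not an existing CXL core interface. */
struct cxl_platform_ops {
	/* True if root window and endpoint range match despite a quirk. */
	bool (*match_spa)(const struct range *root,
			  const struct range *endpoint);
};

/* NULL on platforms without quirks; set once by a platform check. */
static const struct cxl_platform_ops *platform_ops;

static bool range_contains(const struct range *r1, const struct range *r2)
{
	return r1->start <= r2->start && r1->end >= r2->end;
}

/*
 * Illustrative condition only (an assumption, not the real LMH rule):
 * same start, and the endpoint range extends past the trimmed window.
 */
static bool lmh_match_spa(const struct range *root,
			  const struct range *endpoint)
{
	return root->start == endpoint->start && root->end < endpoint->end;
}

static const struct cxl_platform_ops lmh_ops = {
	.match_spa = lmh_match_spa,
};

/* Hypothetical setup hook, called once after platform detection. */
static void cxl_platform_setup(bool has_low_memory_hole)
{
	platform_ops = has_low_memory_hole ? &lmh_ops : NULL;
}

/* Single entry point for callers; the quirk stays behind the callback. */
static bool cxl_root_matches_endpoint(const struct range *root,
				      const struct range *endpoint)
{
	assert(root && endpoint);

	if (range_contains(root, endpoint))
		return true;
	if (platform_ops && platform_ops->match_spa)
		return platform_ops->match_spa(root, endpoint);
	return false;
}
```

With something along these lines, the match functions would each make a
single call, and platforms without the quirk would never run the extra
check.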
> }
>
> static int match_auto_decoder(struct device *dev, const void *data)
> @@ -1760,6 +1770,7 @@ static int match_switch_decoder_by_range(struct device *dev,
> {
> const struct cxl_endpoint_decoder *cxled = data;
> struct cxl_switch_decoder *cxlsd;
> + struct cxl_root_decoder *cxlrd;
> const struct range *r1, *r2;
>
> if (!is_switch_decoder(dev))
> @@ -1769,8 +1780,13 @@ static int match_switch_decoder_by_range(struct device *dev,
> r1 = &cxlsd->cxld.hpa_range;
> r2 = &cxled->cxld.hpa_range;
>
> - if (is_root_decoder(dev))
> - return range_contains(r1, r2);
> + if (is_root_decoder(dev)) {
> + if (range_contains(r1, r2))
> + return 1;
> + cxlrd = to_cxl_root_decoder(dev);
> + if (arch_match_spa(cxlrd, cxled))
> + return 1;
See my other comment on patch #2 about simplifying the match functions.
It applies to the checks below too.
> + }
> return (r1->start == r2->start && r1->end == r2->end);
> }
>
> @@ -1978,7 +1994,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> }
>
> if (resource_size(cxled->dpa_res) * p->interleave_ways + p->cache_size !=
> - resource_size(p->res)) {
> + resource_size(p->res) && !arch_match_spa(cxlrd, cxled)) {
> dev_dbg(&cxlr->dev,
> "%s:%s-size-%#llx * ways-%d + cache-%#llx != region-size-%#llx\n",
> dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> @@ -3213,7 +3229,12 @@ static int match_root_decoder_by_range(struct device *dev,
> r1 = &cxlrd->cxlsd.cxld.hpa_range;
> r2 = &cxled->cxld.hpa_range;
>
> - return range_contains(r1, r2);
> + if (range_contains(r1, r2))
> + return true;
> + if (arch_match_spa(cxlrd, cxled))
> + return true;
> +
> + return false;
> }
>
> static int match_region_by_range(struct device *dev, const void *data)
> @@ -3230,8 +3251,12 @@ static int match_region_by_range(struct device *dev, const void *data)
> p = &cxlr->params;
>
> guard(rwsem_read)(&cxl_region_rwsem);
> - if (p->res && p->res->start == r->start && p->res->end == r->end)
> - return 1;
> + if (p->res) {
> + if (p->res->start == r->start && p->res->end == r->end)
> + return 1;
> + if (arch_match_region(p, &cxled->cxld))
> + return 1;
> + }
>
> return 0;
> }
> @@ -3319,6 +3344,21 @@ static int __construct_region(struct cxl_region *cxlr,
> "Extended linear cache calculation failed rc:%d\n", rc);
> }
>
> + /*
> + * Trim the HPA retrieved from hardware to fit the SPA mapped by the
> + * platform
> + */
> + if (arch_match_spa(cxlrd, cxled)) {
> + dev_dbg(cxlmd->dev.parent, "(LMH) Resource (%s: %pr)\n",
> + dev_name(&cxled->cxld.dev), res);
> +
> + arch_adjust_region_resource(res, cxlrd);
> +
> + dev_dbg(cxlmd->dev.parent,
> + "(LMH) has been adjusted (%s: %pr)\n",
> + dev_name(&cxled->cxld.dev), res);
> + }
See my earlier comment on squashing both functions into one.
-Robert
> +
> rc = insert_resource(cxlrd->res, res);
> if (rc) {
> /*
> diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
> index 4efcc0606bd6..3b3c24b1496e 100644
> --- a/tools/testing/cxl/Kbuild
> +++ b/tools/testing/cxl/Kbuild
> @@ -64,6 +64,7 @@ cxl_core-y += $(CXL_CORE_SRC)/cdat.o
> cxl_core-y += $(CXL_CORE_SRC)/acpi.o
> cxl_core-y += $(CXL_CORE_SRC)/ras.o
> cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o
> +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += $(CXL_CORE_SRC)/lmh.o
> cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o
> cxl_core-$(CONFIG_CXL_FEATURES) += $(CXL_CORE_SRC)/features.o
> cxl_core-$(CONFIG_CXL_MCE) += $(CXL_CORE_SRC)/mce.o
> --
> 2.48.1
>