* [PATCH 0/4 v4] cxl/core: Enable Region creation/attach on x86 with LMH
@ 2025-07-24 14:20 Fabio M. De Francesco
2025-07-24 14:20 ` [PATCH 1/4 v4] cxl/core: Change match_*_by_range() signatures Fabio M. De Francesco
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Fabio M. De Francesco @ 2025-07-24 14:20 UTC (permalink / raw)
To: linux-cxl
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, Fabio M. De Francesco
The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
Physical Address (HPA) windows that are associated with each CXL Host
Bridge. Each window represents a contiguous HPA that may be interleaved
with one or more targets (CXL v3.1 - 9.18.1.3).
The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
memory to which systems cannot send transactions. On those systems, BIOS
publishes CFMWS which communicate the active System Physical Address (SPA)
ranges that map to a subset of the Host Physical Address (HPA) ranges. The
SPA range trims out the hole, and capacity in the endpoint is lost with no
SPA to map to CXL HPA in that hole.
In the early stages of CXL Regions construction and attach on platforms
with Low Memory Holes, the driver fails and returns an error because it
expects that the CXL Endpoint Decoder range is a subset of the Root
Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
Decoders or already made CXL Regions and Decoders to allow the
construction of new CXL Regions and the attachment of Endpoint Decoders,
even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
range end to match Root Decoder's.
- Patch 1/4 changes the calling conventions of three match_*_by_range()
helpers in preparation of 3/4.
- Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
the HPA range end for CXL Regions construction.
- Patch 3/4 enables CXL Regions construction and Endpoint Decoders
attachment by matching Root Decoders or Regions with Endpoint
Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
constraints while Endpoints decoders' attachment.
- Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.
Many thanks to Alison, Dan, Dave and Ira for their help.
Commenting on v1, Alison wrote a couple of observations on what users
will see. I suggest anyone interested to see how this series affect
users to take a look at her observations.[0] Thank you!
Changes for v4:
Re-base on top of
"cxl: Address translation support, part 1: Cleanups and refactoring";[1]
Drop no more necessary 2/4;
Drop collected tags because of major changes throughout the series.
1/3 - Adjust Endpoint Decoders dpa_res->end (Alison) [2]
3/3 - Use weak/strong mechanism (Dan) [3]
Changes for v3:
Re-base the series on cxl/next.
1/4 - 2/4:
Constify local variables.
3/4:
Call arch_match_region() from region_res_match_cxl_range().
4/4:
arch_match_region() - Check that region end is under start + 4G;
arch_match_spa() - Check that SPA range start is cfmws_range_start.
v3 - https://lore.kernel.org/linux-cxl/20250314113708.759808-1-fabio.m.de.francesco@linux.intel.com/
v2 - https://lore.kernel.org/linux-cxl/20250114203432.31861-1-fabio.m.de.francesco@linux.intel.com/
v1 - https://lore.kernel.org/all/20241122155226.2068287-1-fabio.m.de.francesco@linux.intel.com/
[0] - https://lore.kernel.org/all/Z0Tzif55CcHuujJ-@aschofie-mobl2.lan/
[1] - https://lore.kernel.org/linux-cxl/20250509150700.2817697-1-rrichter@amd.com/
[2] - https://lore.kernel.org/linux-cxl/Z9tzZkn1rqd2Uk_6@aschofie-mobl2.lan/
[3] - https://lore.kernel.org/linux-cxl/67ee07cd4f8ec_1c2c6294d5@dwillia2-xfh.jf.intel.com.notmuch/
Fabio M. De Francesco (4):
cxl/core: Change match_*_by_range() signatures
cxl/core: Add helpers to detect Low Memory Holes on x86
cxl/core: Enable Region creation on x86 with LMH
cxl/test: Simulate an x86 Low Memory Hole for tests
drivers/cxl/Kconfig | 5 ++
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/platform.c | 86 +++++++++++++++++++
drivers/cxl/core/platform.h | 32 +++++++
drivers/cxl/core/region.c | 113 ++++++++++++++++--------
tools/testing/cxl/Kbuild | 2 +
tools/testing/cxl/mock_platform.c | 137 ++++++++++++++++++++++++++++++
tools/testing/cxl/test/cxl.c | 10 +++
tools/testing/cxl/test/mock.h | 1 +
9 files changed, 352 insertions(+), 35 deletions(-)
create mode 100644 drivers/cxl/core/platform.c
create mode 100644 drivers/cxl/core/platform.h
create mode 100644 tools/testing/cxl/mock_platform.c
base-commit: acc2913692413df9d1
--
2.50.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/4 v4] cxl/core: Change match_*_by_range() signatures
2025-07-24 14:20 [PATCH 0/4 v4] cxl/core: Enable Region creation/attach on x86 with LMH Fabio M. De Francesco
@ 2025-07-24 14:20 ` Fabio M. De Francesco
2025-08-01 20:04 ` Cheatham, Benjamin
2025-08-13 12:27 ` Jonathan Cameron
2025-07-24 14:20 ` [PATCH 2/4 v4] cxl/core: Add helpers to detect Low Memory Holes on x86 Fabio M. De Francesco
` (2 subsequent siblings)
3 siblings, 2 replies; 12+ messages in thread
From: Fabio M. De Francesco @ 2025-07-24 14:20 UTC (permalink / raw)
To: linux-cxl
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, Fabio M. De Francesco
Replace struct range parameter with struct cxl_endpoint_decoder of
which range is a member in the match_*_by_range() functions and rename
them according to their semantics.
This is in preparation for expanding these helpers to perform arch
specific Root Decoders and Region matchings with
cxl_endpoint_decoder(s).
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/cxl/core/region.c | 60 +++++++++++++++++++++------------------
1 file changed, 33 insertions(+), 27 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 6e5e1460068d..f607e7f97184 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1759,27 +1759,29 @@ static int cmp_interleave_pos(const void *a, const void *b)
return cxled_a->pos - cxled_b->pos;
}
-static int match_switch_decoder_by_range(struct device *dev,
- const void *data)
+static int match_switch_and_ep_decoders(struct device *dev, const void *data)
{
+ const struct cxl_endpoint_decoder *cxled = data;
struct cxl_switch_decoder *cxlsd;
- const struct range *r1, *r2 = data;
-
+ const struct range *r1, *r2;
if (!is_switch_decoder(dev))
return 0;
cxlsd = to_cxl_switch_decoder(dev);
r1 = &cxlsd->cxld.hpa_range;
+ r2 = &cxled->cxld.hpa_range;
if (is_root_decoder(dev))
return range_contains(r1, r2);
return (r1->start == r2->start && r1->end == r2->end);
}
-static int find_pos_and_ways(struct cxl_port *port, struct range *range,
- int *pos, int *ways)
+static int find_pos_and_ways(struct cxl_port *port,
+ struct cxl_endpoint_decoder *cxled, int *pos,
+ int *ways)
{
+ struct range *range = &cxled->cxld.hpa_range;
struct cxl_switch_decoder *cxlsd;
struct cxl_port *parent;
struct device *dev;
@@ -1789,8 +1791,8 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
if (!parent)
return rc;
- dev = device_find_child(&parent->dev, range,
- match_switch_decoder_by_range);
+ dev = device_find_child(&parent->dev, cxled,
+ match_switch_and_ep_decoders);
if (!dev) {
dev_err(port->uport_dev,
"failed to find decoder mapping %#llx-%#llx\n",
@@ -1876,7 +1878,7 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
if (is_cxl_root(iter))
break;
- rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
+ rc = find_pos_and_ways(iter, cxled, &parent_pos, &parent_ways);
if (rc)
return rc;
@@ -3215,24 +3217,28 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
return rc;
}
-static int match_decoder_by_range(struct device *dev, const void *data)
+static int match_root_and_ep_decoders(struct device *dev, const void *data)
{
- const struct range *r1, *r2 = data;
- struct cxl_decoder *cxld;
+ const struct cxl_endpoint_decoder *cxled = data;
+ const struct range *r1, *r2;
+ struct cxl_root_decoder *cxlrd;
- if (!is_switch_decoder(dev))
+ if (!is_root_decoder(dev))
return 0;
- cxld = to_cxl_decoder(dev);
- r1 = &cxld->hpa_range;
+ cxlrd = to_cxl_root_decoder(dev);
+ r1 = &cxlrd->cxlsd.cxld.hpa_range;
+ r2 = &cxled->cxld.hpa_range;
+
return range_contains(r1, r2);
}
static struct cxl_decoder *
-cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
+cxl_port_find_root_decoder(struct cxl_port *port,
+ struct cxl_endpoint_decoder *cxled)
{
- struct device *cxld_dev = device_find_child(&port->dev, hpa,
- match_decoder_by_range);
+ struct device *cxld_dev = device_find_child(&port->dev, cxled,
+ match_root_and_ep_decoders);
return cxld_dev ? to_cxl_decoder(cxld_dev) : NULL;
}
@@ -3244,9 +3250,8 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
struct cxl_port *port = cxled_to_port(cxled);
struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
struct cxl_decoder *root, *cxld = &cxled->cxld;
- struct range *hpa = &cxld->hpa_range;
- root = cxl_port_find_switch_decoder(&cxl_root->port, hpa);
+ root = cxl_port_find_root_decoder(&cxl_root->port, cxled);
if (!root) {
dev_err(cxlmd->dev.parent,
"%s:%s no CXL window for range %#llx:%#llx\n",
@@ -3258,11 +3263,12 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
return to_cxl_root_decoder(&root->dev);
}
-static int match_region_by_range(struct device *dev, const void *data)
+static int match_region_and_ep_decoder(struct device *dev, const void *data)
{
+ const struct cxl_endpoint_decoder *cxled = data;
+ const struct range *r = &cxled->cxld.hpa_range;
struct cxl_region_params *p;
struct cxl_region *cxlr;
- const struct range *r = data;
if (!is_cxl_region(dev))
return 0;
@@ -3425,12 +3431,13 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
}
static struct cxl_region *
-cxl_find_region_by_range(struct cxl_root_decoder *cxlrd, struct range *hpa)
+cxl_find_region_by_range(struct cxl_root_decoder *cxlrd,
+ struct cxl_endpoint_decoder *cxled)
{
struct device *region_dev;
- region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa,
- match_region_by_range);
+ region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, cxled,
+ match_region_and_ep_decoder);
if (!region_dev)
return NULL;
@@ -3439,7 +3446,6 @@ cxl_find_region_by_range(struct cxl_root_decoder *cxlrd, struct range *hpa)
int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
{
- struct range *hpa = &cxled->cxld.hpa_range;
struct cxl_region_params *p;
bool attach = false;
int rc;
@@ -3455,7 +3461,7 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
*/
mutex_lock(&cxlrd->range_lock);
struct cxl_region *cxlr __free(put_cxl_region) =
- cxl_find_region_by_range(cxlrd, hpa);
+ cxl_find_region_by_range(cxlrd, cxled);
if (!cxlr)
cxlr = construct_region(cxlrd, cxled);
mutex_unlock(&cxlrd->range_lock);
--
2.50.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/4 v4] cxl/core: Add helpers to detect Low Memory Holes on x86
2025-07-24 14:20 [PATCH 0/4 v4] cxl/core: Enable Region creation/attach on x86 with LMH Fabio M. De Francesco
2025-07-24 14:20 ` [PATCH 1/4 v4] cxl/core: Change match_*_by_range() signatures Fabio M. De Francesco
@ 2025-07-24 14:20 ` Fabio M. De Francesco
2025-08-01 20:04 ` Cheatham, Benjamin
2025-07-24 14:20 ` [PATCH 3/4 v4] cxl/core: Enable Region creation on x86 with LMH Fabio M. De Francesco
2025-07-24 14:20 ` [PATCH 4/4 v4] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
3 siblings, 1 reply; 12+ messages in thread
From: Fabio M. De Francesco @ 2025-07-24 14:20 UTC (permalink / raw)
To: linux-cxl
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, Fabio M. De Francesco
In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
ranges are always guaranteed to align to the NIW * 256M rule.
In order to construct Regions and attach Decoders, the driver needs to
match Root Decoders and Regions with Endpoint Decoders, but it fails and
the entire process returns errors because it doesn't expect to deal with
SPA range lengths smaller than corresponding HPA's.
Introduce functions that indirectly detect x86 LMH's by comparing SPA's
with corresponding HPA's. They will be used in the process of Regions
creation and Endpoint attachments to prevent driver failures in a few
steps of the above-mentioned process.
The helpers return true when HPA/SPA misalignments are detected under
specific conditions: both the SPA and HPA ranges must start at
LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
be less than HPA's, SPA's range's size be less than 4G, HPA's size be
aligned to the NIW * 256M rule.
Also introduce a function to adjust the range end of the Regions to be
created on x86 with LMH's.
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/cxl/Kconfig | 5 +++
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/platform.c | 85 +++++++++++++++++++++++++++++++++++++
drivers/cxl/core/platform.h | 32 ++++++++++++++
4 files changed, 123 insertions(+)
create mode 100644 drivers/cxl/core/platform.c
create mode 100644 drivers/cxl/core/platform.h
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 48b7314afdb8..eca90baeac10 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -211,6 +211,11 @@ config CXL_REGION
If unsure say 'y'
+config CXL_PLATFORM_QUIRKS
+ def_bool y
+ depends on CXL_REGION
+ depends on X86
+
config CXL_REGION_INVALIDATION_TEST
bool "CXL: Region Cache Management Bypass (TEST)"
depends on CXL_REGION
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 79e2ef81fde8..4be729fb7d64 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -18,6 +18,7 @@ cxl_core-y += ras.o
cxl_core-y += acpi.o
cxl_core-$(CONFIG_TRACING) += trace.o
cxl_core-$(CONFIG_CXL_REGION) += region.o
+cxl_core-$(CONFIG_CXL_PLATFORM_QUIRKS) += platform.o
cxl_core-$(CONFIG_CXL_MCE) += mce.o
cxl_core-$(CONFIG_CXL_FEATURES) += features.o
cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
diff --git a/drivers/cxl/core/platform.c b/drivers/cxl/core/platform.c
new file mode 100644
index 000000000000..8202750742d0
--- /dev/null
+++ b/drivers/cxl/core/platform.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/range.h>
+#include "platform.h"
+#include "cxlmem.h"
+#include "core.h"
+
+/* Start of CFMWS range that end before x86 Low Memory Holes */
+#define LMH_CFMWS_RANGE_START 0x0ULL
+
+/*
+ * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
+ *
+ * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
+ * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
+ * the given endpoint decoder HPA range size is always expected aligned and
+ * also larger than that of the matching root decoder. If there are LMH's,
+ * the root decoder range end is always less than SZ_4G.
+ */
+bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
+{
+ const struct range *r1, *r2;
+ int niw;
+
+ r1 = &cxlrd->cxlsd.cxld.hpa_range;
+ r2 = &cxled->cxld.hpa_range;
+ niw = cxled->cxld.interleave_ways;
+
+ if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
+ r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
+ IS_ALIGNED(range_len(r2), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+/*
+ * Similar to platform_root_decoder_contains(), it matches regions and
+ * decoders
+ */
+bool platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
+{
+ const struct range *r = &cxld->hpa_range;
+ const struct resource *res = p->res;
+ int niw = cxld->interleave_ways;
+
+ if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
+ res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
+ IS_ALIGNED(range_len(r), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+void platform_res_adjust(struct resource *res,
+ struct cxl_endpoint_decoder *cxled,
+ const struct cxl_root_decoder *cxlrd)
+{
+ if (!platform_root_decoder_contains(cxlrd, cxled))
+ return;
+
+ guard(rwsem_write)(&cxl_dpa_rwsem);
+ dev_info(cxled_to_memdev(cxled)->dev.parent,
+ "(LMH) Resources were (%s: %pr, %pr)\n",
+ dev_name(&cxled->cxld.dev), res, cxled->dpa_res);
+ if (res) {
+ /*
+ * A region must be constructed with Endpoint Decoder's
+ * HPA range end adjusted to Root Decoder's resource end
+ */
+ res->end = cxlrd->res->end;
+ }
+ /*
+ * The Endpoint Decoder's dpa_res->end must be adjusted with Root
+ * Decoder's resource end
+ */
+ cxled->dpa_res->end =
+ cxled->dpa_res->start +
+ resource_size(cxlrd->res) / cxled->cxld.interleave_ways - 1;
+ dev_info(cxled_to_memdev(cxled)->dev.parent,
+ "(LMH) Resources have been adjusted (%s: %pr, %pr)\n",
+ dev_name(&cxled->cxld.dev), res, cxled->dpa_res);
+}
diff --git a/drivers/cxl/core/platform.h b/drivers/cxl/core/platform.h
new file mode 100644
index 000000000000..0baa39938729
--- /dev/null
+++ b/drivers/cxl/core/platform.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include "cxl.h"
+
+#ifdef CONFIG_CXL_PLATFORM_QUIRKS
+bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled);
+bool platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld);
+void platform_res_adjust(struct resource *res,
+ struct cxl_endpoint_decoder *cxled,
+ const struct cxl_root_decoder *cxlrd);
+#else
+static bool
+platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
+{
+ return false;
+}
+
+static bool platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
+{
+ return false;
+}
+
+void platform_res_adjust(struct resource *res,
+ struct cxl_endpoint_decoder *cxled,
+ const struct cxl_root_decoder *cxlrd)
+{
+}
+#endif /* CONFIG_CXL_PLATFORM_QUIRKS */
--
2.50.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/4 v4] cxl/core: Enable Region creation on x86 with LMH
2025-07-24 14:20 [PATCH 0/4 v4] cxl/core: Enable Region creation/attach on x86 with LMH Fabio M. De Francesco
2025-07-24 14:20 ` [PATCH 1/4 v4] cxl/core: Change match_*_by_range() signatures Fabio M. De Francesco
2025-07-24 14:20 ` [PATCH 2/4 v4] cxl/core: Add helpers to detect Low Memory Holes on x86 Fabio M. De Francesco
@ 2025-07-24 14:20 ` Fabio M. De Francesco
2025-08-01 20:04 ` Cheatham, Benjamin
2025-07-24 14:20 ` [PATCH 4/4 v4] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
3 siblings, 1 reply; 12+ messages in thread
From: Fabio M. De Francesco @ 2025-07-24 14:20 UTC (permalink / raw)
To: linux-cxl
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, Fabio M. De Francesco
The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
Physical Address (HPA) windows that are associated with each CXL Host
Bridge. Each window represents a contiguous HPA that may be interleaved
with one or more targets (CXL v3.1 - 9.18.1.3).
The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
memory to which systems cannot send transactions. In some cases the size
of that hole is not compatible with the CXL hardware decoder constraint
that the size is always aligned to 256M * Interleave Ways.
On those systems, BIOS publishes CFMWS which communicate the active System
Physical Address (SPA) ranges that map to a subset of the Host Physical
Address (HPA) ranges. The SPA range trims out the hole, and capacity in
the endpoint is lost with no SPA to map to CXL HPA in that hole.
In the early stages of CXL Regions construction and attach on platforms
with Low Memory Holes, cxl_add_to_region() fails and returns an error
because it can't find any CXL Window that matches a given CXL Endpoint
Decoder.
Detect a Low Memory Hole by comparing Root Decoders and Endpoint Decoders
ranges with the use of arch_match_{spa,region}() helpers.
Match Root Decoders and CXL Regions with corresponding CXL Endpoint
Decoders. Currently a Low Memory Holes would prevent the matching functions
to return true.
Construct CXL Regions with HPA range's end adjusted to the matching SPA.
Allow the attach target process to complete by allowing Regions to not
fit with alignment constraints (i.e., alignment to NIW * 256M rule).
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/cxl/core/region.c | 53 +++++++++++++++++++++++++++++++++------
tools/testing/cxl/Kbuild | 1 +
2 files changed, 46 insertions(+), 8 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index f607e7f97184..b7fdf9c4393d 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -12,6 +12,7 @@
#include <linux/memory-tiers.h>
#include <cxlmem.h>
#include <cxl.h>
+#include "platform.h"
#include "core.h"
/**
@@ -834,6 +835,8 @@ static int match_free_decoder(struct device *dev, const void *data)
static bool region_res_match_cxl_range(const struct cxl_region_params *p,
struct range *range)
{
+ struct cxl_decoder *cxld;
+
if (!p->res)
return false;
@@ -842,8 +845,15 @@ static bool region_res_match_cxl_range(const struct cxl_region_params *p,
* to be fronted by the DRAM range in current known implementation.
* This assumption will be made until a variant implementation exists.
*/
- return p->res->start + p->cache_size == range->start &&
- p->res->end == range->end;
+ if (p->res->start + p->cache_size == range->start &&
+ p->res->end == range->end)
+ return true;
+
+ cxld = container_of(range, struct cxl_decoder, hpa_range);
+ if (platform_region_contains(p, cxld))
+ return true;
+
+ return false;
}
static int match_auto_decoder(struct device *dev, const void *data)
@@ -1763,6 +1773,7 @@ static int match_switch_and_ep_decoders(struct device *dev, const void *data)
{
const struct cxl_endpoint_decoder *cxled = data;
struct cxl_switch_decoder *cxlsd;
+ struct cxl_root_decoder *cxlrd;
const struct range *r1, *r2;
if (!is_switch_decoder(dev))
@@ -1772,8 +1783,13 @@ static int match_switch_and_ep_decoders(struct device *dev, const void *data)
r1 = &cxlsd->cxld.hpa_range;
r2 = &cxled->cxld.hpa_range;
- if (is_root_decoder(dev))
- return range_contains(r1, r2);
+ if (is_root_decoder(dev)) {
+ if (range_contains(r1, r2))
+ return 1;
+ cxlrd = to_cxl_root_decoder(dev);
+ if (platform_root_decoder_contains(cxlrd, cxled))
+ return 1;
+ }
return (r1->start == r2->start && r1->end == r2->end);
}
@@ -1990,7 +2006,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
}
if (resource_size(cxled->dpa_res) * p->interleave_ways + p->cache_size !=
- resource_size(p->res)) {
+ resource_size(p->res) && !platform_root_decoder_contains(cxlrd, cxled)) {
dev_dbg(&cxlr->dev,
"%s:%s-size-%#llx * ways-%d + cache-%#llx != region-size-%#llx\n",
dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
@@ -3230,7 +3246,12 @@ static int match_root_and_ep_decoders(struct device *dev, const void *data)
r1 = &cxlrd->cxlsd.cxld.hpa_range;
r2 = &cxled->cxld.hpa_range;
- return range_contains(r1, r2);
+ if (range_contains(r1, r2))
+ return true;
+ if (platform_root_decoder_contains(cxlrd, cxled))
+ return true;
+
+ return false;
}
static struct cxl_decoder *
@@ -3277,8 +3298,12 @@ static int match_region_and_ep_decoder(struct device *dev, const void *data)
p = &cxlr->params;
guard(rwsem_read)(&cxl_region_rwsem);
- if (p->res && p->res->start == r->start && p->res->end == r->end)
- return 1;
+ if (p->res) {
+ if (p->res->start == r->start && p->res->end == r->end)
+ return 1;
+ if (platform_region_contains(p, &cxled->cxld))
+ return 1;
+ }
return 0;
}
@@ -3355,6 +3380,12 @@ static int __construct_region(struct cxl_region *cxlr,
*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
dev_name(&cxlr->dev));
+ /*
+ * Trim the HPA retrieved from hardware to fit the SPA mapped by the
+ * platform
+ */
+ platform_res_adjust(res, cxled, cxlrd);
+
rc = cxl_extended_linear_cache_resize(cxlr, res);
if (rc && rc != -EOPNOTSUPP) {
/*
@@ -3464,6 +3495,12 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
cxl_find_region_by_range(cxlrd, cxled);
if (!cxlr)
cxlr = construct_region(cxlrd, cxled);
+ else
+ /*
+ * Adjust the Endpoint Decoder's dpa_res to fit the Region which
+ * it has to be attached to
+ */
+ platform_res_adjust(NULL, cxled, cxlrd);
mutex_unlock(&cxlrd->range_lock);
rc = PTR_ERR_OR_ZERO(cxlr);
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 31a2d73c963f..77e392c4b541 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -65,6 +65,7 @@ cxl_core-y += $(CXL_CORE_SRC)/ras.o
cxl_core-y += $(CXL_CORE_SRC)/acpi.o
cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o
cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o
+cxl_core-$(CONFIG_CXL_PLATFORM_QUIRKS) += $(CXL_CORE_SRC)/platform.o
cxl_core-$(CONFIG_CXL_MCE) += $(CXL_CORE_SRC)/mce.o
cxl_core-$(CONFIG_CXL_FEATURES) += $(CXL_CORE_SRC)/features.o
cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += $(CXL_CORE_SRC)/edac.o
--
2.50.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/4 v4] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-07-24 14:20 [PATCH 0/4 v4] cxl/core: Enable Region creation/attach on x86 with LMH Fabio M. De Francesco
` (2 preceding siblings ...)
2025-07-24 14:20 ` [PATCH 3/4 v4] cxl/core: Enable Region creation on x86 with LMH Fabio M. De Francesco
@ 2025-07-24 14:20 ` Fabio M. De Francesco
2025-08-20 21:08 ` Alison Schofield
3 siblings, 1 reply; 12+ messages in thread
From: Fabio M. De Francesco @ 2025-07-24 14:20 UTC (permalink / raw)
To: linux-cxl
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, Fabio M. De Francesco
Simulate an x86 Low Memory Hole for the CXL tests by changing the first
mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
to 1GB.
The auto-created region of cxl-test uses mock_cfmws[0], therefore the LMH
path in the CXL Driver will be exercised every time the cxl-test module is
loaded. Executing unit test: cxl-topology.sh, confirms the region created
successfully with a LMH.
Since mock_cfmws[0] range base address is typically different from the one
published by the BIOS on real hardware, the driver would fail to create and
attach CXL Regions when it's run on the mock environment created by
cxl-tests.
Therefore, save the mock_cfmsw[0] range base_hpa and reuse it to match CXL
Root Decoders and Regions with Endpoint Decoders when the driver is run on
mock devices.
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/cxl/core/platform.c | 9 +-
tools/testing/cxl/Kbuild | 1 +
tools/testing/cxl/mock_platform.c | 137 ++++++++++++++++++++++++++++++
tools/testing/cxl/test/cxl.c | 10 +++
tools/testing/cxl/test/mock.h | 1 +
5 files changed, 154 insertions(+), 4 deletions(-)
create mode 100644 tools/testing/cxl/mock_platform.c
diff --git a/drivers/cxl/core/platform.c b/drivers/cxl/core/platform.c
index 8202750742d0..ba1dafece495 100644
--- a/drivers/cxl/core/platform.c
+++ b/drivers/cxl/core/platform.c
@@ -17,8 +17,9 @@
* also larger than that of the matching root decoder. If there are LMH's,
* the root decoder range end is always less than SZ_4G.
*/
-bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
- const struct cxl_endpoint_decoder *cxled)
+__weak bool
+platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
{
const struct range *r1, *r2;
int niw;
@@ -39,8 +40,8 @@ bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
* Similar to platform_root_decoder_contains(), it matches regions and
* decoders
*/
-bool platform_region_contains(const struct cxl_region_params *p,
- const struct cxl_decoder *cxld)
+__weak bool platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
{
const struct range *r = &cxld->hpa_range;
const struct resource *res = p->res;
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 77e392c4b541..64c5c8c34805 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -66,6 +66,7 @@ cxl_core-y += $(CXL_CORE_SRC)/acpi.o
cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o
cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o
cxl_core-$(CONFIG_CXL_PLATFORM_QUIRKS) += $(CXL_CORE_SRC)/platform.o
+cxl_core-$(CONFIG_CXL_PLATFORM_QUIRKS) += mock_platform.o
cxl_core-$(CONFIG_CXL_MCE) += $(CXL_CORE_SRC)/mce.o
cxl_core-$(CONFIG_CXL_FEATURES) += $(CXL_CORE_SRC)/features.o
cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += $(CXL_CORE_SRC)/edac.o
diff --git a/tools/testing/cxl/mock_platform.c b/tools/testing/cxl/mock_platform.c
new file mode 100644
index 000000000000..1775c64b3c7c
--- /dev/null
+++ b/tools/testing/cxl/mock_platform.c
@@ -0,0 +1,137 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/completion.h>
+#include <linux/module.h>
+#include <linux/range.h>
+#include <linux/pci.h>
+
+#include <cxlmem.h>
+#include <platform.h>
+#include "test/mock.h"
+
+static u64 mock_cfmws0_range_start;
+
+void set_mock_cfmws0_range_start(u64 start)
+{
+ mock_cfmws0_range_start = start;
+}
+EXPORT_SYMBOL_NS_GPL(set_mock_cfmws0_range_start, "CXL");
+
+static bool is_mock_port(struct device *dev)
+{
+ struct cxl_mock_ops *(*get_ops_fn)(int *index);
+ struct cxl_mock_ops *ops = NULL;
+ void (*put_ops_fn)(int index);
+ bool is_mock = false;
+ int index;
+
+ get_ops_fn = symbol_get(get_cxl_mock_ops);
+ if (!get_ops_fn)
+ return false;
+ put_ops_fn = symbol_get(put_cxl_mock_ops);
+ if (!put_ops_fn)
+ goto out;
+
+ ops = get_ops_fn(&index);
+ if (ops)
+ is_mock = ops->is_mock_port(dev);
+ put_ops_fn(index);
+
+out:
+ symbol_put(get_cxl_mock_ops);
+
+ return is_mock;
+}
+
+/* Start of CFMWS range that end before x86 Low Memory Holes */
+#define LMH_CFMWS_RANGE_START 0x0ULL
+
+static bool
+real_platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
+{
+ const struct range *r1, *r2;
+ int niw;
+
+ r1 = &cxlrd->cxlsd.cxld.hpa_range;
+ r2 = &cxled->cxld.hpa_range;
+ niw = cxled->cxld.interleave_ways;
+
+ if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
+ r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
+ IS_ALIGNED(range_len(r2), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+static bool
+cxl_test_platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
+{
+ const struct range *r1, *r2;
+ int niw;
+
+ r1 = &cxlrd->cxlsd.cxld.hpa_range;
+ r2 = &cxled->cxld.hpa_range;
+ niw = cxled->cxld.interleave_ways;
+
+ if (r1->start == mock_cfmws0_range_start && r1->start == r2->start &&
+ r1->end < (mock_cfmws0_range_start + SZ_4G) && r1->end < r2->end &&
+ IS_ALIGNED(range_len(r2), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
+{
+ struct cxl_port *port = to_cxl_port(cxled->cxld.dev.parent);
+
+ if (is_mock_port(port->uport_dev))
+ return cxl_test_platform_root_decoder_contains(cxlrd, cxled);
+
+ return real_platform_root_decoder_contains(cxlrd, cxled);
+}
+
+static bool real_platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
+{
+ const struct range *r = &cxld->hpa_range;
+ const struct resource *res = p->res;
+ int niw = cxld->interleave_ways;
+
+ if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
+ res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
+ IS_ALIGNED(range_len(r), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+static bool cxl_test_platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
+{
+ const struct range *r = &cxld->hpa_range;
+ const struct resource *res = p->res;
+ int niw = cxld->interleave_ways;
+
+ if (res->start == mock_cfmws0_range_start && res->start == r->start &&
+ res->end < (mock_cfmws0_range_start + SZ_4G) && res->end < r->end &&
+ IS_ALIGNED(range_len(r), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+bool platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
+{
+ struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+
+ if (is_mock_port(port->uport_dev))
+ return cxl_test_platform_region_contains(p, cxld);
+
+ return real_platform_region_contains(p, cxld);
+}
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 8a5815ca870d..a411c055d390 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -212,7 +212,11 @@ static struct {
.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
.qtg_id = FAKE_QTG_ID,
+#if defined(CONFIG_CXL_PLATFORM_QUIRKS)
+ .window_size = SZ_256M * 3UL,
+#else
.window_size = SZ_256M * 4UL,
+#endif
},
.target = { 0 },
},
@@ -453,6 +457,8 @@ static int populate_cedt(void)
if (!res)
return -ENOMEM;
window->base_hpa = res->range.start;
+ if (i == 0)
+ set_mock_cfmws0_range_start(res->range.start);
}
return 0;
@@ -744,7 +750,11 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
struct cxl_endpoint_decoder *cxled;
struct cxl_switch_decoder *cxlsd;
struct cxl_port *port, *iter;
+#if defined(CONFIG_CXL_PLATFORM_QUIRKS)
+ const int size = SZ_1G;
+#else
const int size = SZ_512M;
+#endif
struct cxl_memdev *cxlmd;
struct cxl_dport *dport;
struct device *dev;
diff --git a/tools/testing/cxl/test/mock.h b/tools/testing/cxl/test/mock.h
index d1b0271d2822..792eabbd0f18 100644
--- a/tools/testing/cxl/test/mock.h
+++ b/tools/testing/cxl/test/mock.h
@@ -32,3 +32,4 @@ void register_cxl_mock_ops(struct cxl_mock_ops *ops);
void unregister_cxl_mock_ops(struct cxl_mock_ops *ops);
struct cxl_mock_ops *get_cxl_mock_ops(int *index);
void put_cxl_mock_ops(int index);
+void set_mock_cfmws0_range_start(u64 start);
--
2.50.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/4 v4] cxl/core: Change match_*_by_range() signatures
2025-07-24 14:20 ` [PATCH 1/4 v4] cxl/core: Change match_*_by_range() signatures Fabio M. De Francesco
@ 2025-08-01 20:04 ` Cheatham, Benjamin
2025-08-13 12:27 ` Jonathan Cameron
1 sibling, 0 replies; 12+ messages in thread
From: Cheatham, Benjamin @ 2025-08-01 20:04 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, linux-cxl
On 7/24/2025 9:20 AM, Fabio M. De Francesco wrote:
> Replace struct range parameter with struct cxl_endpoint_decoder of
> which range is a member in the match_*_by_range() functions and rename
> them according to their semantics.
>
> This is in preparation for expanding these helpers to perform arch
> specific Root Decoders and Region matchings with
> cxl_endpoint_decoder(s).
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/cxl/core/region.c | 60 +++++++++++++++++++++------------------
> 1 file changed, 33 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6e5e1460068d..f607e7f97184 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1759,27 +1759,29 @@ static int cmp_interleave_pos(const void *a, const void *b)
> return cxled_a->pos - cxled_b->pos;
> }
>
> -static int match_switch_decoder_by_range(struct device *dev,
> - const void *data)
> +static int match_switch_and_ep_decoders(struct device *dev, const void *data)
> {
> + const struct cxl_endpoint_decoder *cxled = data;
> struct cxl_switch_decoder *cxlsd;
> - const struct range *r1, *r2 = data;
> -
> + const struct range *r1, *r2;
>
> if (!is_switch_decoder(dev))
> return 0;
>
> cxlsd = to_cxl_switch_decoder(dev);
> r1 = &cxlsd->cxld.hpa_range;
> + r2 = &cxled->cxld.hpa_range;
>
> if (is_root_decoder(dev))
> return range_contains(r1, r2);
> return (r1->start == r2->start && r1->end == r2->end);
> }
>
> -static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> - int *pos, int *ways)
> +static int find_pos_and_ways(struct cxl_port *port,
> + struct cxl_endpoint_decoder *cxled, int *pos,
> + int *ways)
> {
> + struct range *range = &cxled->cxld.hpa_range;
> struct cxl_switch_decoder *cxlsd;
> struct cxl_port *parent;
> struct device *dev;
> @@ -1789,8 +1791,8 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> if (!parent)
> return rc;
>
> - dev = device_find_child(&parent->dev, range,
> - match_switch_decoder_by_range);
> + dev = device_find_child(&parent->dev, cxled,
> + match_switch_and_ep_decoders);
> if (!dev) {
> dev_err(port->uport_dev,
> "failed to find decoder mapping %#llx-%#llx\n",
> @@ -1876,7 +1878,7 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
> if (is_cxl_root(iter))
> break;
>
> - rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
> + rc = find_pos_and_ways(iter, cxled, &parent_pos, &parent_ways);
> if (rc)
> return rc;
>
> @@ -3215,24 +3217,28 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> return rc;
> }
>
> -static int match_decoder_by_range(struct device *dev, const void *data)
> +static int match_root_and_ep_decoders(struct device *dev, const void *data)
> {
> - const struct range *r1, *r2 = data;
> - struct cxl_decoder *cxld;
> + const struct cxl_endpoint_decoder *cxled = data;
> + const struct range *r1, *r2;
> + struct cxl_root_decoder *cxlrd;
Nit: cxlrd should be above the r1, r2 declaration (reverse christmas tree).
>
> - if (!is_switch_decoder(dev))
> + if (!is_root_decoder(dev))
> return 0;
>
> - cxld = to_cxl_decoder(dev);
> - r1 = &cxld->hpa_range;
> + cxlrd = to_cxl_root_decoder(dev);
> + r1 = &cxlrd->cxlsd.cxld.hpa_range;
> + r2 = &cxled->cxld.hpa_range;
> +
> return range_contains(r1, r2);
> }
You could reuse the code from match_switch_and_ep_decoders() here by doing:
static int match_root_and_ep_decoders(struct device *dev, const void *data)
{
if (!is_root_decoder(dev))
return 0;
return match_switch_and_ep_decoders(dev, data);
}
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/4 v4] cxl/core: Add helpers to detect Low Memory Holes on x86
2025-07-24 14:20 ` [PATCH 2/4 v4] cxl/core: Add helpers to detect Low Memory Holes on x86 Fabio M. De Francesco
@ 2025-08-01 20:04 ` Cheatham, Benjamin
2025-08-21 13:25 ` Fabio M. De Francesco
0 siblings, 1 reply; 12+ messages in thread
From: Cheatham, Benjamin @ 2025-08-01 20:04 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, linux-cxl
On 7/24/2025 9:20 AM, Fabio M. De Francesco wrote:
> In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
s/publishes/publish
> SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> ranges are always guaranteed to align to the NIW * 256M rule.
s/to align to/to align due to/
Also a spec reference for the rule would be helpful (same with next patch).
>
> In order to construct Regions and attach Decoders, the driver needs to
> match Root Decoders and Regions with Endpoint Decoders, but it fails and
> the entire process returns errors because it doesn't expect to deal with
> SPA range lengths smaller than corresponding HPA's.
>
> Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> with corresponding HPA's. They will be used in the process of Regions
> creation and Endpoint attachments to prevent driver failures in a few
> steps of the above-mentioned process.
>
> The helpers return true when HPA/SPA misalignments are detected under
> specific conditions: both the SPA and HPA ranges must start at
> LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
maybe sub "that in x86 with LMH's is 0x0" with "0x0 on x86 with LMH's" also
s/SPA range sizes/SPA range's size/.
> be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> aligned to the NIW * 256M rule.
s/be/is/ in above list.
>
> Also introduce a function to adjust the range end of the Regions to be
> created on x86 with LMH's.
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/cxl/Kconfig | 5 +++
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/platform.c | 85 +++++++++++++++++++++++++++++++++++++
> drivers/cxl/core/platform.h | 32 ++++++++++++++
> 4 files changed, 123 insertions(+)
> create mode 100644 drivers/cxl/core/platform.c
> create mode 100644 drivers/cxl/core/platform.h
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 48b7314afdb8..eca90baeac10 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -211,6 +211,11 @@ config CXL_REGION
>
> If unsure say 'y'
>
> +config CXL_PLATFORM_QUIRKS
> + def_bool y
> + depends on CXL_REGION
> + depends on X86
> +
I'm confused on the intention behind this symbol. The naming suggests it's for all platform quirks,
but the code and dependencies make this x86-specific.
I'm going to suggest making this x86-specific for now. I'm not aware of any other platforms with quirks
(someone correct me if I'm wrong), so making this x86-specific is fine for now. I would rename this
symbol to CXL_X86_QUIRKS, leave dependencies as-is, and rename platform.c to something like platform_x86.c.
Then, if someone comes along with other platform quirks they can do their own symbol and file (or come
up with a generic scheme).
> config CXL_REGION_INVALIDATION_TEST
> bool "CXL: Region Cache Management Bypass (TEST)"
> depends on CXL_REGION
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 79e2ef81fde8..4be729fb7d64 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -18,6 +18,7 @@ cxl_core-y += ras.o
> cxl_core-y += acpi.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> +cxl_core-$(CONFIG_CXL_PLATFORM_QUIRKS) += platform.o
> cxl_core-$(CONFIG_CXL_MCE) += mce.o
> cxl_core-$(CONFIG_CXL_FEATURES) += features.o
> cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
> diff --git a/drivers/cxl/core/platform.c b/drivers/cxl/core/platform.c
> new file mode 100644
> index 000000000000..8202750742d0
> --- /dev/null
> +++ b/drivers/cxl/core/platform.c
> @@ -0,0 +1,85 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/range.h>
> +#include "platform.h"
> +#include "cxlmem.h"
> +#include "core.h"
> +
> +/* Start of CFMWS range that end before x86 Low Memory Holes */
> +#define LMH_CFMWS_RANGE_START 0x0ULL
> +
> +/*
> + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> + *
> + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> + * the given endpoint decoder HPA range size is always expected aligned and
> + * also larger than that of the matching root decoder. If there are LMH's,
> + * the root decoder range end is always less than SZ_4G.
> + */
Where does the SZ_4G restraint come from?
Also, might as well make this a kdoc comment since you've already done the majority of the work.
> +bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
> + const struct cxl_endpoint_decoder *cxled)
> +{
Assuming you take renaming suggestion above, these functions should probably be start with
"platform_x86_*" instead.
> + const struct range *r1, *r2;
> + int niw;
> +
> + r1 = &cxlrd->cxlsd.cxld.hpa_range;
> + r2 = &cxled->cxld.hpa_range;
> + niw = cxled->cxld.interleave_ways;
> +
> + if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> + r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> + IS_ALIGNED(range_len(r2), niw * SZ_256M))
> + return true;
> +
> + return false;
> +}
> +
> +/*
> + * Similar to platform_root_decoder_contains(), it matches regions and
> + * decoders
> + */
> +bool platform_region_contains(const struct cxl_region_params *p,
> + const struct cxl_decoder *cxld)
> +{
> + const struct range *r = &cxld->hpa_range;
> + const struct resource *res = p->res;
> + int niw = cxld->interleave_ways;
> +
> + if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> + res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> + IS_ALIGNED(range_len(r), niw * SZ_256M))
> + return true;
> +
This if condition could probably move into its own helper function that takes the ranges and
interleave ways. My only hang up is that these functions become 2-3 lines each after doing so.
You could also get rid of these two functions and just export the "helper" function instead.
It would probably add some bloat to patch 3/4, so it's your call here.
> + return false;
> +}
> +
> +void platform_res_adjust(struct resource *res,
> + struct cxl_endpoint_decoder *cxled,
> + const struct cxl_root_decoder *cxlrd)
> +{
> + if (!platform_root_decoder_contains(cxlrd, cxled))
> + return;
> +
> + guard(rwsem_write)(&cxl_dpa_rwsem);
> + dev_info(cxled_to_memdev(cxled)->dev.parent,
> + "(LMH) Resources were (%s: %pr, %pr)\n",
> + dev_name(&cxled->cxld.dev), res, cxled->dpa_res);
> + if (res) {
> + /*
> + * A region must be constructed with Endpoint Decoder's
> + * HPA range end adjusted to Root Decoder's resource end
> + */
This comment (and one below) are confusing to me. I'd reword as "Trim region resource
overlap with LMH".
> + res->end = cxlrd->res->end;
> + }
> + /*
> + * The Endpoint Decoder's dpa_res->end must be adjusted with Root
> + * Decoder's resource end
> + */
and reword this one to "Match endpoint decoder's DPA resource to root decoder's". You could
also leave out this comment altogether, the below is self-explanatory IMO.
> + cxled->dpa_res->end =
> + cxled->dpa_res->start +
> + resource_size(cxlrd->res) / cxled->cxld.interleave_ways - 1;
> + dev_info(cxled_to_memdev(cxled)->dev.parent,
> + "(LMH) Resources have been adjusted (%s: %pr, %pr)\n",
> + dev_name(&cxled->cxld.dev), res, cxled->dpa_res);
I think this function is a bit too noisy. I'd recommend having just this print and
have it say something like "Low memory hole overlap detected, trimmed DPA to %pr".
You could also include the amount trimmed, but that may not be very useful info.
I'd make the first print a dev_dbg() and spell out LMH at the very least. If it's a
dev_info() it should be relatively easy to tell what's going on for a layman.
> +}
> diff --git a/drivers/cxl/core/platform.h b/drivers/cxl/core/platform.h
> new file mode 100644
> index 000000000000..0baa39938729
> --- /dev/null
> +++ b/drivers/cxl/core/platform.h
This is a new file so that these functions can hook into cxl_test, right?
If not, I'd move the below into cxl/core/core.h and remove this file.
> @@ -0,0 +1,32 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include "cxl.h"
> +
> +#ifdef CONFIG_CXL_PLATFORM_QUIRKS
> +bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
> + const struct cxl_endpoint_decoder *cxled);
> +bool platform_region_contains(const struct cxl_region_params *p,
> + const struct cxl_decoder *cxld);
> +void platform_res_adjust(struct resource *res,
> + struct cxl_endpoint_decoder *cxled,
> + const struct cxl_root_decoder *cxlrd);
> +#else
> +static bool
> +platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
> + const struct cxl_endpoint_decoder *cxled)
> +{
> + return false;
> +}
> +
> +static bool platform_region_contains(const struct cxl_region_params *p,
> + const struct cxl_decoder *cxld)
> +{
> + return false;
> +}
> +
> +void platform_res_adjust(struct resource *res,
> + struct cxl_endpoint_decoder *cxled,
> + const struct cxl_root_decoder *cxlrd)
> +{
> +}
Don't these need "inline" in the function signatures?
> +#endif /* CONFIG_CXL_PLATFORM_QUIRKS */
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/4 v4] cxl/core: Enable Region creation on x86 with LMH
2025-07-24 14:20 ` [PATCH 3/4 v4] cxl/core: Enable Region creation on x86 with LMH Fabio M. De Francesco
@ 2025-08-01 20:04 ` Cheatham, Benjamin
2025-08-21 15:15 ` Fabio M. De Francesco
0 siblings, 1 reply; 12+ messages in thread
From: Cheatham, Benjamin @ 2025-08-01 20:04 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, linux-cxl
On 7/24/2025 9:20 AM, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
Update to 3.2 spec? Sorry I forgot to mention it earlier, but you'll probably
want to do this for the whole series.
>
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. In some cases the size
> of that hole is not compatible with the CXL hardware decoder constraint
> that the size is always aligned to 256M * Interleave Ways.
Spec ref here.
>
> On those systems, BIOS publishes CFMWS which communicate the active System
> Physical Address (SPA) ranges that map to a subset of the Host Physical
> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> the endpoint is lost with no SPA to map to CXL HPA in that hole.
>
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, cxl_add_to_region() fails and returns an error
> because it can't find any CXL Window that matches a given CXL Endpoint
> Decoder.
>
> Detect a Low Memory Hole by comparing Root Decoders and Endpoint Decoders
> ranges with the use of arch_match_{spa,region}() helpers.
>
> Match Root Decoders and CXL Regions with corresponding CXL Endpoint
> Decoders. Currently a Low Memory Holes would prevent the matching functions
> to return true.
>
> Construct CXL Regions with HPA range's end adjusted to the matching SPA.
>
> Allow the attach target process to complete by allowing Regions to not
> fit with alignment constraints (i.e., alignment to NIW * 256M rule).
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/cxl/core/region.c | 53 +++++++++++++++++++++++++++++++++------
> tools/testing/cxl/Kbuild | 1 +
> 2 files changed, 46 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index f607e7f97184..b7fdf9c4393d 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -12,6 +12,7 @@
> #include <linux/memory-tiers.h>
> #include <cxlmem.h>
> #include <cxl.h>
> +#include "platform.h"
> #include "core.h"
>
> /**
> @@ -834,6 +835,8 @@ static int match_free_decoder(struct device *dev, const void *data)
> static bool region_res_match_cxl_range(const struct cxl_region_params *p,
> struct range *range)
> {
> + struct cxl_decoder *cxld;
> +
> if (!p->res)
> return false;
>
> @@ -842,8 +845,15 @@ static bool region_res_match_cxl_range(const struct cxl_region_params *p,
> * to be fronted by the DRAM range in current known implementation.
> * This assumption will be made until a variant implementation exists.
> */
> - return p->res->start + p->cache_size == range->start &&
> - p->res->end == range->end;
> + if (p->res->start + p->cache_size == range->start &&
> + p->res->end == range->end)
> + return true;
> +
> + cxld = container_of(range, struct cxl_decoder, hpa_range);
> + if (platform_region_contains(p, cxld))
> + return true;
> +
> + return false;
Can just return result of platform_region_contains() here.
> }
>
> static int match_auto_decoder(struct device *dev, const void *data)
> @@ -1763,6 +1773,7 @@ static int match_switch_and_ep_decoders(struct device *dev, const void *data)
> {
> const struct cxl_endpoint_decoder *cxled = data;
> struct cxl_switch_decoder *cxlsd;
> + struct cxl_root_decoder *cxlrd;
> const struct range *r1, *r2;
>
> if (!is_switch_decoder(dev))
> @@ -1772,8 +1783,13 @@ static int match_switch_and_ep_decoders(struct device *dev, const void *data)
> r1 = &cxlsd->cxld.hpa_range;
> r2 = &cxled->cxld.hpa_range;
>
> - if (is_root_decoder(dev))
> - return range_contains(r1, r2);
> + if (is_root_decoder(dev)) {
> + if (range_contains(r1, r2))
> + return 1;
> + cxlrd = to_cxl_root_decoder(dev);
> + if (platform_root_decoder_contains(cxlrd, cxled))
> + return 1;
> + }
I don't think it's possible, but the way this is written allows falling through
to the return statement below. This if statement should probably be:
if (is_root_decoder(dev)) {
cxlrd = to_cxl_root_decoder(dev);
return range_contains(r1, r2) || platform_root_decoder_contains(cxlrd, cxled));
}
> return (r1->start == r2->start && r1->end == r2->end);
> }
>
> @@ -1990,7 +2006,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> }
>
> if (resource_size(cxled->dpa_res) * p->interleave_ways + p->cache_size !=
> - resource_size(p->res)) {
> + resource_size(p->res) && !platform_root_decoder_contains(cxlrd, cxled)) {
> dev_dbg(&cxlr->dev,
> "%s:%s-size-%#llx * ways-%d + cache-%#llx != region-size-%#llx\n",
> dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> @@ -3230,7 +3246,12 @@ static int match_root_and_ep_decoders(struct device *dev, const void *data)
> r1 = &cxlrd->cxlsd.cxld.hpa_range;
> r2 = &cxled->cxld.hpa_range;
>
> - return range_contains(r1, r2);
> + if (range_contains(r1, r2))
> + return true;
> + if (platform_root_decoder_contains(cxlrd, cxled))
> + return true;
> +
> + return false;
Can just return the || of the two above if statements.
> }
>
> static struct cxl_decoder *
> @@ -3277,8 +3298,12 @@ static int match_region_and_ep_decoder(struct device *dev, const void *data)
> p = &cxlr->params;
>
> guard(rwsem_read)(&cxl_region_rwsem);
> - if (p->res && p->res->start == r->start && p->res->end == r->end)
> - return 1;
> + if (p->res) {
> + if (p->res->start == r->start && p->res->end == r->end)
> + return 1;
> + if (platform_region_contains(p, &cxled->cxld))
> + return 1;
> + }
Same thing here.
>
> return 0;
> }
> @@ -3355,6 +3380,12 @@ static int __construct_region(struct cxl_region *cxlr,
> *res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
> dev_name(&cxlr->dev));
>
> + /*
> + * Trim the HPA retrieved from hardware to fit the SPA mapped by the
> + * platform
> + */
> + platform_res_adjust(res, cxled, cxlrd);
> +
> rc = cxl_extended_linear_cache_resize(cxlr, res);
> if (rc && rc != -EOPNOTSUPP) {
> /*
> @@ -3464,6 +3495,12 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
> cxl_find_region_by_range(cxlrd, cxled);
> if (!cxlr)
> cxlr = construct_region(cxlrd, cxled);
> + else
> + /*
> + * Adjust the Endpoint Decoder's dpa_res to fit the Region which
> + * it has to be attached to
> + */
> + platform_res_adjust(NULL, cxled, cxlrd);
I'm 50/50 on whether these comments are unnecessary. The routine is pretty well documented
and also has an explanatory comment above the definition in platform.c. I think you
can probably remove them, but I'll defer to your/someone else's judgement here.
Thanks,
Ben
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/4 v4] cxl/core: Change match_*_by_range() signatures
2025-07-24 14:20 ` [PATCH 1/4 v4] cxl/core: Change match_*_by_range() signatures Fabio M. De Francesco
2025-08-01 20:04 ` Cheatham, Benjamin
@ 2025-08-13 12:27 ` Jonathan Cameron
1 sibling, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2025-08-13 12:27 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: linux-cxl, Davidlohr Bueso, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel
On Thu, 24 Jul 2025 16:20:31 +0200
"Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> wrote:
> Replace struct range parameter with struct cxl_endpoint_decoder of
> which range is a member in the match_*_by_range() functions and rename
> them according to their semantics.
>
> This is in preparation for expanding these helpers to perform arch
> specific Root Decoders and Region matchings with
> cxl_endpoint_decoder(s).
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 4/4 v4] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-07-24 14:20 ` [PATCH 4/4 v4] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
@ 2025-08-20 21:08 ` Alison Schofield
0 siblings, 0 replies; 12+ messages in thread
From: Alison Schofield @ 2025-08-20 21:08 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: linux-cxl, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel
On Thu, Jul 24, 2025 at 04:20:34PM +0200, Fabio M. De Francesco wrote:
> Simulate an x86 Low Memory Hole for the CXL tests by changing the first
> mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
> to 1GB.
>
> The auto-created region of cxl-test uses mock_cfmws[0], therefore the LMH
> path in the CXL Driver will be exercised every time the cxl-test module is
> loaded. Executing unit test: cxl-topology.sh, confirms the region created
> successfully with a LMH.
>
> Since mock_cfmws[0] range base address is typically different from the one
> published by the BIOS on real hardware, the driver would fail to create and
> attach CXL Regions when it's run on the mock environment created by
> cxl-tests.
Can you add a unit test that in addition to confirming that the auto region
appears, (like cxl-topology.sh does) also verifies region and decoder
settings and address translations.
That is work I did in a prior review and I was remiss in not asking for
a unit test back then. I suspect it entails turning what you examine
during your testing into a unit test script. cxl-poison.sh shows how to send
clear-poison commands to validate address translations.
Related topic - if/when you get on real hardware with an LMH it would be
useful to collect address translation samples to add to the cxl-translate.sh
collection. (cxl-translate.sh is in review, not yet merged, see on the
mailing list)
snip
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/4 v4] cxl/core: Add helpers to detect Low Memory Holes on x86
2025-08-01 20:04 ` Cheatham, Benjamin
@ 2025-08-21 13:25 ` Fabio M. De Francesco
0 siblings, 0 replies; 12+ messages in thread
From: Fabio M. De Francesco @ 2025-08-21 13:25 UTC (permalink / raw)
To: Cheatham, Benjamin
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, linux-cxl
On Friday, August 1, 2025 10:04:16 PM Central European Summer Time Cheatham, Benjamin wrote:
> On 7/24/2025 9:20 AM, Fabio M. De Francesco wrote:
> > In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
>
> s/publishes/publish
>
> > SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> > HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> > ranges are always guaranteed to align to the NIW * 256M rule.
>
> s/to align to/to align due to/
>
> Also a spec reference for the rule would be helpful (same with next patch).
>
> >
> > In order to construct Regions and attach Decoders, the driver needs to
> > match Root Decoders and Regions with Endpoint Decoders, but it fails and
> > the entire process returns errors because it doesn't expect to deal with
> > SPA range lengths smaller than corresponding HPA's.
> >
> > Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> > with corresponding HPA's. They will be used in the process of Regions
> > creation and Endpoint attachments to prevent driver failures in a few
> > steps of the above-mentioned process.
> >
> > The helpers return true when HPA/SPA misalignments are detected under
> > specific conditions: both the SPA and HPA ranges must start at
> > LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
>
> maybe sub "that in x86 with LMH's is 0x0" with "0x0 on x86 with LMH's" also
> s/SPA range sizes/SPA range's size/.
>
> > be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> > aligned to the NIW * 256M rule.
>
> s/be/is/ in above list.
>
I'll change the lines you corrected. Thank you.
> >
> > Also introduce a function to adjust the range end of the Regions to be
> > created on x86 with LMH's.
> >
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Dave Jiang <dave.jiang@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> > drivers/cxl/Kconfig | 5 +++
> > drivers/cxl/core/Makefile | 1 +
> > drivers/cxl/core/platform.c | 85 +++++++++++++++++++++++++++++++++++++
> > drivers/cxl/core/platform.h | 32 ++++++++++++++
> > 4 files changed, 123 insertions(+)
> > create mode 100644 drivers/cxl/core/platform.c
> > create mode 100644 drivers/cxl/core/platform.h
> >
> > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > index 48b7314afdb8..eca90baeac10 100644
> > --- a/drivers/cxl/Kconfig
> > +++ b/drivers/cxl/Kconfig
> > @@ -211,6 +211,11 @@ config CXL_REGION
> >
> > If unsure say 'y'
> >
> > +config CXL_PLATFORM_QUIRKS
> > + def_bool y
> > + depends on CXL_REGION
> > + depends on X86
> > +
>
> I'm confused on the intention behind this symbol. The naming suggests it's for all platform quirks,
> but the code and dependencies make this x86-specific.
>
I'll remove this dependence on X86; it's neither needed nor wanted.
>
> I'm going to suggest making this x86-specific for now. I'm not aware of any other platforms with quirks
> (someone correct me if I'm wrong), so making this x86-specific is fine for now. I would rename this
> symbol to CXL_X86_QUIRKS, leave dependencies as-is, and rename platform.c to something like platform_x86.c.
> Then, if someone comes along with other platform quirks they can do their own symbol and file (or come
> up with a generic scheme).
>
Dan suggested this approach and I agree with him.[1]
>
> > config CXL_REGION_INVALIDATION_TEST
> > bool "CXL: Region Cache Management Bypass (TEST)"
> > depends on CXL_REGION
> > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > index 79e2ef81fde8..4be729fb7d64 100644
> > --- a/drivers/cxl/core/Makefile
> > +++ b/drivers/cxl/core/Makefile
> > @@ -18,6 +18,7 @@ cxl_core-y += ras.o
> > cxl_core-y += acpi.o
> > cxl_core-$(CONFIG_TRACING) += trace.o
> > cxl_core-$(CONFIG_CXL_REGION) += region.o
> > +cxl_core-$(CONFIG_CXL_PLATFORM_QUIRKS) += platform.o
> > cxl_core-$(CONFIG_CXL_MCE) += mce.o
> > cxl_core-$(CONFIG_CXL_FEATURES) += features.o
> > cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
> > diff --git a/drivers/cxl/core/platform.c b/drivers/cxl/core/platform.c
> > new file mode 100644
> > index 000000000000..8202750742d0
> > --- /dev/null
> > +++ b/drivers/cxl/core/platform.c
> > @@ -0,0 +1,85 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +
> > +#include <linux/range.h>
> > +#include "platform.h"
> > +#include "cxlmem.h"
> > +#include "core.h"
> > +
> > +/* Start of CFMWS range that end before x86 Low Memory Holes */
> > +#define LMH_CFMWS_RANGE_START 0x0ULL
> > +
> > +/*
> > + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > + *
> > + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> > + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> > + * the given endpoint decoder HPA range size is always expected aligned and
> > + * also larger than that of the matching root decoder. If there are LMH's,
> > + * the root decoder range end is always less than SZ_4G.
> > + */
>
> Where does the SZ_4G restraint come from?
>
The hole is in low memory.
>
> Also, might as well make this a kdoc comment since you've already done the majority of the work.
>
> > +bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
> > + const struct cxl_endpoint_decoder *cxled)
> > +{
>
> Assuming you take renaming suggestion above, these functions should probably be start with
> "platform_x86_*" instead.
>
> > + const struct range *r1, *r2;
> > + int niw;
> > +
> > + r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > + r2 = &cxled->cxld.hpa_range;
> > + niw = cxled->cxld.interleave_ways;
> > +
> > + if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > + r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> > + IS_ALIGNED(range_len(r2), niw * SZ_256M))
> > + return true;
> > +
> > + return false;
> > +}
> > +
> > +/*
> > + * Similar to platform_root_decoder_contains(), it matches regions and
> > + * decoders
> > + */
> > +bool platform_region_contains(const struct cxl_region_params *p,
> > + const struct cxl_decoder *cxld)
> > +{
> > + const struct range *r = &cxld->hpa_range;
> > + const struct resource *res = p->res;
> > + int niw = cxld->interleave_ways;
> > +
> > + if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > + res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> > + IS_ALIGNED(range_len(r), niw * SZ_256M))
> > + return true;
> > +
>
> This if condition could probably move into its own helper function that takes the ranges and
> interleave ways. My only hang up is that these functions become 2-3 lines each after doing so.
>
> You could also get rid of these two functions and just export the "helper" function instead.
> It would probably add some bloat to patch 3/4, so it's your call here.
>
I'd rather leave these functions as they are.
>
> > + return false;
> > +}
> > +
> > +void platform_res_adjust(struct resource *res,
> > + struct cxl_endpoint_decoder *cxled,
> > + const struct cxl_root_decoder *cxlrd)
> > +{
> > + if (!platform_root_decoder_contains(cxlrd, cxled))
> > + return;
> > +
> > + guard(rwsem_write)(&cxl_dpa_rwsem);
> > + dev_info(cxled_to_memdev(cxled)->dev.parent,
> > + "(LMH) Resources were (%s: %pr, %pr)\n",
> > + dev_name(&cxled->cxld.dev), res, cxled->dpa_res);
> > + if (res) {
> > + /*
> > + * A region must be constructed with Endpoint Decoder's
> > + * HPA range end adjusted to Root Decoder's resource end
> > + */
>
> This comment (and one below) are confusing to me. I'd reword as "Trim region resource
> overlap with LMH".
>
Okay, thanks.
>
> > + res->end = cxlrd->res->end;
> > + }
> > + /*
> > + * The Endpoint Decoder's dpa_res->end must be adjusted with Root
> > + * Decoder's resource end
> > + */
>
> and reword this one to "Match endpoint decoder's DPA resource to root decoder's". You could
> also leave out this comment altogether, the below is self-explanatory IMO.
>
Okay.
>
> > + cxled->dpa_res->end =
> > + cxled->dpa_res->start +
> > + resource_size(cxlrd->res) / cxled->cxld.interleave_ways - 1;
> > + dev_info(cxled_to_memdev(cxled)->dev.parent,
> > + "(LMH) Resources have been adjusted (%s: %pr, %pr)\n",
> > + dev_name(&cxled->cxld.dev), res, cxled->dpa_res);
>
> I think this function is a bit too noisy. I'd recommend having just this print and
> have it say something like "Low memory hole overlap detected, trimmed DPA to %pr".
> You could also include the amount trimmed, but that may not be very useful info.
>
> I'd make the first print a dev_dbg() and spell out LMH at the very least. If it's a
> dev_info() it should be relatively easy to tell what's going on for a layman.
>
I'm using dev_info() according to what Alison suggested.[2]
>
> > +}
> > diff --git a/drivers/cxl/core/platform.h b/drivers/cxl/core/platform.h
> > new file mode 100644
> > index 000000000000..0baa39938729
> > --- /dev/null
> > +++ b/drivers/cxl/core/platform.h
>
> This is a new file so that these functions can hook into cxl_test, right?
> If not, I'd move the below into cxl/core/core.h and remove this file.
>
> > @@ -0,0 +1,32 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include "cxl.h"
> > +
> > +#ifdef CONFIG_CXL_PLATFORM_QUIRKS
> > +bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
> > + const struct cxl_endpoint_decoder *cxled);
> > +bool platform_region_contains(const struct cxl_region_params *p,
> > + const struct cxl_decoder *cxld);
> > +void platform_res_adjust(struct resource *res,
> > + struct cxl_endpoint_decoder *cxled,
> > + const struct cxl_root_decoder *cxlrd);
> > +#else
> > +static bool
> > +platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
> > + const struct cxl_endpoint_decoder *cxled)
> > +{
> > + return false;
> > +}
> > +
> > +static bool platform_region_contains(const struct cxl_region_params *p,
> > + const struct cxl_decoder *cxld)
> > +{
> > + return false;
> > +}
> > +
> > +void platform_res_adjust(struct resource *res,
> > + struct cxl_endpoint_decoder *cxled,
> > + const struct cxl_root_decoder *cxlrd)
> > +{
> > +}
>
> Don't these need "inline" in the function signatures?
>
Yes, sure. They'll be inline in the next version.
>
> > +#endif /* CONFIG_CXL_PLATFORM_QUIRKS */
>
Thank you,
Fabio
[1] https://lore.kernel.org/linux-cxl/67ee07cd4f8ec_1c2c6294d5@dwillia2-xfh.jf.intel.com.notmuch
[2] https://lore.kernel.org/linux-cxl/Z9xaBaM8Mzc8rQ90@aschofie-mobl2.lan/
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/4 v4] cxl/core: Enable Region creation on x86 with LMH
2025-08-01 20:04 ` Cheatham, Benjamin
@ 2025-08-21 15:15 ` Fabio M. De Francesco
0 siblings, 0 replies; 12+ messages in thread
From: Fabio M. De Francesco @ 2025-08-21 15:15 UTC (permalink / raw)
To: Cheatham, Benjamin
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, linux-cxl
On Friday, August 1, 2025 10:04:18 PM Central European Summer Time Cheatham, Benjamin wrote:
> On 7/24/2025 9:20 AM, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
>
> Update to 3.2 spec? Sorry I forgot to mention it earlier, but you'll probably
> want to do this for the whole series.
>
Sure, thanks.
> >
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. In some cases the size
> > of that hole is not compatible with the CXL hardware decoder constraint
> > that the size is always aligned to 256M * Interleave Ways.
>
> Spec ref here.
>
Okay.
> >
> > On those systems, BIOS publishes CFMWS which communicate the active System
> > Physical Address (SPA) ranges that map to a subset of the Host Physical
> > Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> > the endpoint is lost with no SPA to map to CXL HPA in that hole.
> >
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, cxl_add_to_region() fails and returns an error
> > because it can't find any CXL Window that matches a given CXL Endpoint
> > Decoder.
> >
> > Detect a Low Memory Hole by comparing Root Decoders and Endpoint Decoders
> > ranges with the use of arch_match_{spa,region}() helpers.
> >
> > Match Root Decoders and CXL Regions with corresponding CXL Endpoint
> > Decoders. Currently a Low Memory Holes would prevent the matching functions
> > to return true.
> >
> > Construct CXL Regions with HPA range's end adjusted to the matching SPA.
> >
> > Allow the attach target process to complete by allowing Regions to not
> > fit with alignment constraints (i.e., alignment to NIW * 256M rule).
> >
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Dave Jiang <dave.jiang@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> > drivers/cxl/core/region.c | 53 +++++++++++++++++++++++++++++++++------
> > tools/testing/cxl/Kbuild | 1 +
> > 2 files changed, 46 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index f607e7f97184..b7fdf9c4393d 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -12,6 +12,7 @@
> > #include <linux/memory-tiers.h>
> > #include <cxlmem.h>
> > #include <cxl.h>
> > +#include "platform.h"
> > #include "core.h"
> >
> > /**
> > @@ -834,6 +835,8 @@ static int match_free_decoder(struct device *dev, const void *data)
> > static bool region_res_match_cxl_range(const struct cxl_region_params *p,
> > struct range *range)
> > {
> > + struct cxl_decoder *cxld;
> > +
> > if (!p->res)
> > return false;
> >
> > @@ -842,8 +845,15 @@ static bool region_res_match_cxl_range(const struct cxl_region_params *p,
> > * to be fronted by the DRAM range in current known implementation.
> > * This assumption will be made until a variant implementation exists.
> > */
> > - return p->res->start + p->cache_size == range->start &&
> > - p->res->end == range->end;
> > + if (p->res->start + p->cache_size == range->start &&
> > + p->res->end == range->end)
> > + return true;
> > +
> > + cxld = container_of(range, struct cxl_decoder, hpa_range);
> > + if (platform_region_contains(p, cxld))
> > + return true;
> > +
> > + return false;
>
> Can just return result of platform_region_contains() here.
>
Okay.
>
> > }
> >
> > static int match_auto_decoder(struct device *dev, const void *data)
> > @@ -1763,6 +1773,7 @@ static int match_switch_and_ep_decoders(struct device *dev, const void *data)
> > {
> > const struct cxl_endpoint_decoder *cxled = data;
> > struct cxl_switch_decoder *cxlsd;
> > + struct cxl_root_decoder *cxlrd;
> > const struct range *r1, *r2;
> >
> > if (!is_switch_decoder(dev))
> > @@ -1772,8 +1783,13 @@ static int match_switch_and_ep_decoders(struct device *dev, const void *data)
> > r1 = &cxlsd->cxld.hpa_range;
> > r2 = &cxled->cxld.hpa_range;
> >
> > - if (is_root_decoder(dev))
> > - return range_contains(r1, r2);
> > + if (is_root_decoder(dev)) {
> > + if (range_contains(r1, r2))
> > + return 1;
> > + cxlrd = to_cxl_root_decoder(dev);
> > + if (platform_root_decoder_contains(cxlrd, cxled))
> > + return 1;
> > + }
>
> I don't think it's possible, but the way this is written allows falling through
> to the return statement below.
>
It's possible. It can happen and it's harmless yet unnecessary.
> This if statement should probably be:
>
> if (is_root_decoder(dev)) {
> cxlrd = to_cxl_root_decoder(dev);
> return range_contains(r1, r2) || platform_root_decoder_contains(cxlrd, cxled));
> }
>
> > return (r1->start == r2->start && r1->end == r2->end);
> > }
> >
And since it's unnecessary, I agree with you and rewrite it.
I just wanted, out of my personal preference, to highlight that calls
of platform_root_decoder_contains() happen only to allow the driver
to succeed also on a less common case, that is on platforms with LMH.
>
> > @@ -1990,7 +2006,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> > }
> >
> > if (resource_size(cxled->dpa_res) * p->interleave_ways + p->cache_size !=
> > - resource_size(p->res)) {
> > + resource_size(p->res) && !platform_root_decoder_contains(cxlrd, cxled)) {
> > dev_dbg(&cxlr->dev,
> > "%s:%s-size-%#llx * ways-%d + cache-%#llx != region-size-%#llx\n",
> > dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> > @@ -3230,7 +3246,12 @@ static int match_root_and_ep_decoders(struct device *dev, const void *data)
> > r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > r2 = &cxled->cxld.hpa_range;
> >
> > - return range_contains(r1, r2);
> > + if (range_contains(r1, r2))
> > + return true;
> > + if (platform_root_decoder_contains(cxlrd, cxled))
> > + return true;
> > +
> > + return false;
>
> Can just return the || of the two above if statements.
>
Sure, as it is going to be with the other function.
> > }
> >
> > static struct cxl_decoder *
> > @@ -3277,8 +3298,12 @@ static int match_region_and_ep_decoder(struct device *dev, const void *data)
> > p = &cxlr->params;
> >
> > guard(rwsem_read)(&cxl_region_rwsem);
> > - if (p->res && p->res->start == r->start && p->res->end == r->end)
> > - return 1;
> > + if (p->res) {
> > + if (p->res->start == r->start && p->res->end == r->end)
> > + return 1;
> > + if (platform_region_contains(p, &cxled->cxld))
> > + return 1;
> > + }
>
> Same thing here.
>
> >
> > return 0;
> > }
> > @@ -3355,6 +3380,12 @@ static int __construct_region(struct cxl_region *cxlr,
> > *res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
> > dev_name(&cxlr->dev));
> >
> > + /*
> > + * Trim the HPA retrieved from hardware to fit the SPA mapped by the
> > + * platform
> > + */
> > + platform_res_adjust(res, cxled, cxlrd);
> > +
> > rc = cxl_extended_linear_cache_resize(cxlr, res);
> > if (rc && rc != -EOPNOTSUPP) {
> > /*
> > @@ -3464,6 +3495,12 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
> > cxl_find_region_by_range(cxlrd, cxled);
> > if (!cxlr)
> > cxlr = construct_region(cxlrd, cxled);
> > + else
> > + /*
> > + * Adjust the Endpoint Decoder's dpa_res to fit the Region which
> > + * it has to be attached to
> > + */
> > + platform_res_adjust(NULL, cxled, cxlrd);
>
> I'm 50/50 on whether these comments are unnecessary. The routine is pretty well documented
> and also has an explanatory comment above the definition in platform.c. I think you
> can probably remove them, but I'll defer to your/someone else's judgement here.
>
I'll think about it. Anyway, I wanted to clarify that the two call sites
serve two different purposes (i.e., for already constructed regions we may
still need to adjust dpa_res).
Thanks,
Fabio
>
> Thanks,
> Ben
>
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-08-21 15:15 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-24 14:20 [PATCH 0/4 v4] cxl/core: Enable Region creation/attach on x86 with LMH Fabio M. De Francesco
2025-07-24 14:20 ` [PATCH 1/4 v4] cxl/core: Change match_*_by_range() signatures Fabio M. De Francesco
2025-08-01 20:04 ` Cheatham, Benjamin
2025-08-13 12:27 ` Jonathan Cameron
2025-07-24 14:20 ` [PATCH 2/4 v4] cxl/core: Add helpers to detect Low Memory Holes on x86 Fabio M. De Francesco
2025-08-01 20:04 ` Cheatham, Benjamin
2025-08-21 13:25 ` Fabio M. De Francesco
2025-07-24 14:20 ` [PATCH 3/4 v4] cxl/core: Enable Region creation on x86 with LMH Fabio M. De Francesco
2025-08-01 20:04 ` Cheatham, Benjamin
2025-08-21 15:15 ` Fabio M. De Francesco
2025-07-24 14:20 ` [PATCH 4/4 v4] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
2025-08-20 21:08 ` Alison Schofield
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).