* [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
@ 2025-03-14 11:36 Fabio M. De Francesco
2025-03-14 11:36 ` [PATCH 1/4 v3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
` (6 more replies)
0 siblings, 7 replies; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-14 11:36 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
Physical Address (HPA) windows that are associated with each CXL Host
Bridge. Each window represents a contiguous HPA range that may be interleaved
with one or more targets (CXL v3.1 - 9.18.1.3).
The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
memory to which systems cannot send transactions. On those systems, BIOS
publishes CFMWS which communicate the active System Physical Address (SPA)
ranges that map to a subset of the Host Physical Address (HPA) ranges. The
SPA range trims out the hole, and capacity in the endpoint is lost with no
SPA to map to CXL HPA in that hole.
In the early stages of CXL Region construction and attach on platforms
with Low Memory Holes, the driver fails and returns an error because it
expects the CXL Endpoint Decoder range to be a subset of the Root
Decoder's (SPA >= HPA). On x86 with LMH's, instead, SPA < HPA.
Therefore, detect x86 Low Memory Holes, and match CXL Root Decoders, or
already constructed CXL Regions, with Endpoint Decoders to allow the
construction of new CXL Regions and the attachment of Endpoint Decoders
even if SPA < HPA. If needed because of an LMH, adjust the Endpoint
Decoder range end to match the Root Decoder's.
- Patch 1/4 changes the calling convention of three match_*_by_range()
  helpers in preparation for 3/4.
- Patch 2/4 introduces helpers to detect LMH's, and also one to adjust
  the HPA range end for CXL Region construction.
- Patch 3/4 enables CXL Region construction and Endpoint Decoder
  attachment by matching Root Decoders or Regions with Endpoint
  Decoders, adjusting the Endpoint Decoder HPA range end, and relaxing
  constraints during Endpoint Decoder attachment.
- Patch 4/4 simulates an LMH for the CXL tests on the patched CXL driver.
Many thanks to Alison, Dan, and Ira for their help and for their reviews
of my RFC on Intel's internal ML.
Commenting on v1, Alison wrote a couple of observations on what users
will see. I suggest that anyone interested in how this series affects
users take a look at her observations.[0] Thank you!
Changes for v3:
Re-base the series on cxl/next.
1/4 - 2/4:
Constify local variables.
3/4:
Call arch_match_region() from region_res_match_cxl_range().
4/4:
arch_match_region() - Check that region end is under start + 4G;
arch_match_spa() - Check that SPA range start is cfmws_range_start.
base-commit: acc2913692413df9d1
v2 - https://lore.kernel.org/linux-cxl/20250114203432.31861-1-fabio.m.de.francesco@linux.intel.com/
v1 - https://lore.kernel.org/all/20241122155226.2068287-1-fabio.m.de.francesco@linux.intel.com/
[0] - https://lore.kernel.org/all/Z0Tzif55CcHuujJ-@aschofie-mobl2.lan/
[1] - https://lore.kernel.org/oe-kbuild-all/202411252126.T7xKY88q-lkp@intel.com/
Fabio M. De Francesco (4):
cxl/core: Change match_*_by_range() calling convention
cxl/core: Add helpers to detect Low memory Holes on x86
cxl/core: Enable Region creation on x86 with Low Memory Hole
cxl/test: Simulate an x86 Low Memory Hole for tests
drivers/cxl/Kconfig | 5 ++
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/lmh.c | 83 ++++++++++++++++++++++++++++
drivers/cxl/core/lmh.h | 31 +++++++++++
drivers/cxl/core/region.c | 82 +++++++++++++++++++++------
tools/testing/cxl/Kbuild | 1 +
tools/testing/cxl/cxl_core_exports.c | 2 +
tools/testing/cxl/test/cxl.c | 10 ++++
8 files changed, 197 insertions(+), 18 deletions(-)
create mode 100644 drivers/cxl/core/lmh.c
create mode 100644 drivers/cxl/core/lmh.h
--
2.48.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/4 v3] cxl/core: Change match_*_by_range() calling convention
2025-03-14 11:36 [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
@ 2025-03-14 11:36 ` Fabio M. De Francesco
2025-03-21 15:43 ` Dave Jiang
2025-03-14 11:36 ` [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86 Fabio M. De Francesco
` (5 subsequent siblings)
6 siblings, 1 reply; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-14 11:36 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
In the match_*_by_range() functions, replace the struct range parameter
with the struct cxl_endpoint_decoder of which that range is a member.
This is in preparation for expanding these helpers to perform arch-specific
region matching that requires a cxl_endpoint_decoder.
No functional changes.
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/cxl/core/region.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index b3260d433ec7..97122d645cc1 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1758,24 +1758,27 @@ static struct cxl_port *next_port(struct cxl_port *port)
static int match_switch_decoder_by_range(struct device *dev,
const void *data)
{
+ const struct cxl_endpoint_decoder *cxled = data;
struct cxl_switch_decoder *cxlsd;
- const struct range *r1, *r2 = data;
-
+ const struct range *r1, *r2;
if (!is_switch_decoder(dev))
return 0;
cxlsd = to_cxl_switch_decoder(dev);
r1 = &cxlsd->cxld.hpa_range;
+ r2 = &cxled->cxld.hpa_range;
if (is_root_decoder(dev))
return range_contains(r1, r2);
return (r1->start == r2->start && r1->end == r2->end);
}
-static int find_pos_and_ways(struct cxl_port *port, struct range *range,
+static int find_pos_and_ways(struct cxl_port *port,
+ struct cxl_endpoint_decoder *cxled,
int *pos, int *ways)
{
+ struct range *range = &cxled->cxld.hpa_range;
struct cxl_switch_decoder *cxlsd;
struct cxl_port *parent;
struct device *dev;
@@ -1785,7 +1788,7 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
if (!parent)
return rc;
- dev = device_find_child(&parent->dev, range,
+ dev = device_find_child(&parent->dev, cxled,
match_switch_decoder_by_range);
if (!dev) {
dev_err(port->uport_dev,
@@ -1865,7 +1868,7 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
if (is_cxl_root(iter))
break;
- rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
+ rc = find_pos_and_ways(iter, cxled, &parent_pos, &parent_ways);
if (rc)
return rc;
@@ -3199,22 +3202,26 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
static int match_root_decoder_by_range(struct device *dev,
const void *data)
{
- const struct range *r1, *r2 = data;
+ const struct cxl_endpoint_decoder *cxled = data;
struct cxl_root_decoder *cxlrd;
+ const struct range *r1, *r2;
if (!is_root_decoder(dev))
return 0;
cxlrd = to_cxl_root_decoder(dev);
r1 = &cxlrd->cxlsd.cxld.hpa_range;
+ r2 = &cxled->cxld.hpa_range;
+
return range_contains(r1, r2);
}
static int match_region_by_range(struct device *dev, const void *data)
{
+ const struct cxl_endpoint_decoder *cxled = data;
+ const struct range *r = &cxled->cxld.hpa_range;
struct cxl_region_params *p;
struct cxl_region *cxlr;
- const struct range *r = data;
if (!is_cxl_region(dev))
return 0;
@@ -3382,7 +3389,6 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
{
struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
- struct range *hpa = &cxled->cxld.hpa_range;
struct cxl_decoder *cxld = &cxled->cxld;
struct device *cxlrd_dev, *region_dev;
struct cxl_root_decoder *cxlrd;
@@ -3391,7 +3397,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
bool attach = false;
int rc;
- cxlrd_dev = device_find_child(&root->dev, &cxld->hpa_range,
+ cxlrd_dev = device_find_child(&root->dev, cxled,
match_root_decoder_by_range);
if (!cxlrd_dev) {
dev_err(cxlmd->dev.parent,
@@ -3408,7 +3414,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
* one does the construction and the others add to that.
*/
mutex_lock(&cxlrd->range_lock);
- region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa,
+ region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, cxled,
match_region_by_range);
if (!region_dev) {
cxlr = construct_region(cxlrd, cxled);
--
2.48.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86
2025-03-14 11:36 [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
2025-03-14 11:36 ` [PATCH 1/4 v3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
@ 2025-03-14 11:36 ` Fabio M. De Francesco
2025-03-18 15:15 ` Ira Weiny
` (2 more replies)
2025-03-14 11:36 ` [PATCH 3/4 v3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
` (4 subsequent siblings)
6 siblings, 3 replies; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-14 11:36 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
On x86 with a Low Memory Hole, the BIOS may publish CFMWS that describe
SPA ranges which are subsets of the corresponding CXL Endpoint Decoder
HPA ranges, because a CFMWS never intersects an LMH while EP Decoder HPA
ranges are always guaranteed to align to the NIW * 256M rule.
In order to construct Regions and attach Decoders, the driver needs to
match Root Decoders and Regions with Endpoint Decoders, but it fails and
the entire process returns errors because it doesn't expect to deal with
SPA range lengths smaller than the corresponding HPA's.
Introduce functions that indirectly detect x86 LMH's by comparing SPA's
with the corresponding HPA's. They will be used during Region creation
and Endpoint attachment to prevent driver failures at a few steps of the
above-mentioned process.
The helpers return true when HPA/SPA misalignments are detected under
specific conditions: both the SPA and HPA ranges start at
LMH_CFMWS_RANGE_START (which on x86 with an LMH is 0x0), the SPA range
size is less than the HPA's, the SPA range size is less than 4G, and the
HPA size is aligned to the NIW * 256M rule.
Also introduce a function to adjust the range end of the Regions to be
created on x86 with LMH's.
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
2 files changed, 85 insertions(+)
create mode 100644 drivers/cxl/core/lmh.c
create mode 100644 drivers/cxl/core/lmh.h
diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
new file mode 100644
index 000000000000..2e32f867eb94
--- /dev/null
+++ b/drivers/cxl/core/lmh.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/range.h>
+#include "lmh.h"
+
+/* Start of CFMWS range that ends before x86 Low Memory Holes */
+#define LMH_CFMWS_RANGE_START 0x0ULL
+
+/*
+ * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
+ *
+ * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
+ * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
+ * the given endpoint decoder HPA range size is always expected aligned and
+ * also larger than that of the matching root decoder. If there are LMH's,
+ * the root decoder range end is always less than SZ_4G.
+ */
+bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
+{
+ const struct range *r1, *r2;
+ int niw;
+
+ r1 = &cxlrd->cxlsd.cxld.hpa_range;
+ r2 = &cxled->cxld.hpa_range;
+ niw = cxled->cxld.interleave_ways;
+
+ if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
+ r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
+ IS_ALIGNED(range_len(r2), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+/* Similar to arch_match_spa(), it matches regions and decoders */
+bool arch_match_region(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
+{
+ const struct range *r = &cxld->hpa_range;
+ const struct resource *res = p->res;
+ int niw = cxld->interleave_ways;
+
+ if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
+ res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
+ IS_ALIGNED(range_len(r), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+void arch_adjust_region_resource(struct resource *res,
+ struct cxl_root_decoder *cxlrd)
+{
+ res->end = cxlrd->res->end;
+}
diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
new file mode 100644
index 000000000000..16746ceac1ed
--- /dev/null
+++ b/drivers/cxl/core/lmh.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include "cxl.h"
+
+#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
+bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled);
+bool arch_match_region(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld);
+void arch_adjust_region_resource(struct resource *res,
+ struct cxl_root_decoder *cxlrd);
+#else
+static inline bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
+{
+ return false;
+}
+
+static inline bool arch_match_region(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
+{
+ return false;
+}
+
+static inline void arch_adjust_region_resource(struct resource *res,
+ struct cxl_root_decoder *cxlrd)
+{
+}
+#endif /* CONFIG_CXL_ARCH_LOW_MEMORY_HOLE */
--
2.48.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 3/4 v3] cxl/core: Enable Region creation on x86 with Low Memory Hole
2025-03-14 11:36 [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
2025-03-14 11:36 ` [PATCH 1/4 v3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
2025-03-14 11:36 ` [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86 Fabio M. De Francesco
@ 2025-03-14 11:36 ` Fabio M. De Francesco
2025-03-18 20:35 ` Ira Weiny
2025-03-21 10:29 ` Robert Richter
2025-03-14 11:36 ` [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
` (3 subsequent siblings)
6 siblings, 2 replies; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-14 11:36 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
Physical Address (HPA) windows that are associated with each CXL Host
Bridge. Each window represents a contiguous HPA range that may be interleaved
with one or more targets (CXL v3.1 - 9.18.1.3).
The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
memory to which systems cannot send transactions. In some cases the size
of that hole is not compatible with the CXL hardware decoder constraint
that the size is always aligned to 256M * Interleave Ways.
On those systems, BIOS publishes CFMWS which communicate the active System
Physical Address (SPA) ranges that map to a subset of the Host Physical
Address (HPA) ranges. The SPA range trims out the hole, and capacity in
the endpoint is lost with no SPA to map to CXL HPA in that hole.
In the early stages of CXL Regions construction and attach on platforms
with Low Memory Holes, cxl_add_to_region() fails and returns an error
because it can't find any CXL Window that matches a given CXL Endpoint
Decoder.
Detect a Low Memory Hole by comparing Root Decoder and Endpoint Decoder
ranges with the use of the arch_match_{spa,region}() helpers.
Match Root Decoders and CXL Regions with the corresponding CXL Endpoint
Decoders. Currently, a Low Memory Hole would prevent the matching functions
from returning true.
Construct CXL Regions with the HPA range end adjusted to the matching SPA.
Allow the attach target process to complete by allowing Regions not to
comply with alignment constraints (i.e., alignment to the NIW * 256M rule).
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/cxl/Kconfig | 5 ++++
drivers/cxl/core/Makefile | 1 +
drivers/cxl/core/region.c | 56 +++++++++++++++++++++++++++++++++------
tools/testing/cxl/Kbuild | 1 +
4 files changed, 55 insertions(+), 8 deletions(-)
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 205547e5543a..3bb282ef01df 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -139,6 +139,11 @@ config CXL_REGION
If unsure say 'y'
+config CXL_ARCH_LOW_MEMORY_HOLE
+ def_bool y
+ depends on CXL_REGION
+ depends on X86
+
config CXL_REGION_INVALIDATION_TEST
bool "CXL: Region Cache Management Bypass (TEST)"
depends on CXL_REGION
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 139b349b3a52..3dccd3c224f1 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -17,6 +17,7 @@ cxl_core-y += cdat.o
cxl_core-y += acpi.o
cxl_core-y += ras.o
cxl_core-$(CONFIG_TRACING) += trace.o
+cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
cxl_core-$(CONFIG_CXL_REGION) += region.o
cxl_core-$(CONFIG_CXL_FEATURES) += features.o
cxl_core-$(CONFIG_CXL_MCE) += mce.o
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 97122d645cc1..9eb23ecedecf 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -13,6 +13,7 @@
#include <cxlmem.h>
#include <cxl.h>
#include "core.h"
+#include "lmh.h"
/**
* DOC: cxl core region
@@ -835,6 +836,8 @@ static int match_free_decoder(struct device *dev, const void *data)
static bool region_res_match_cxl_range(const struct cxl_region_params *p,
struct range *range)
{
+ struct cxl_decoder *cxld;
+
if (!p->res)
return false;
@@ -843,8 +846,15 @@ static bool region_res_match_cxl_range(const struct cxl_region_params *p,
* to be fronted by the DRAM range in current known implementation.
* This assumption will be made until a variant implementation exists.
*/
- return p->res->start + p->cache_size == range->start &&
- p->res->end == range->end;
+ if (p->res->start + p->cache_size == range->start &&
+ p->res->end == range->end)
+ return true;
+
+ cxld = container_of(range, struct cxl_decoder, hpa_range);
+ if (arch_match_region(p, cxld))
+ return true;
+
+ return false;
}
static int match_auto_decoder(struct device *dev, const void *data)
@@ -1760,6 +1770,7 @@ static int match_switch_decoder_by_range(struct device *dev,
{
const struct cxl_endpoint_decoder *cxled = data;
struct cxl_switch_decoder *cxlsd;
+ struct cxl_root_decoder *cxlrd;
const struct range *r1, *r2;
if (!is_switch_decoder(dev))
@@ -1769,8 +1780,13 @@ static int match_switch_decoder_by_range(struct device *dev,
r1 = &cxlsd->cxld.hpa_range;
r2 = &cxled->cxld.hpa_range;
- if (is_root_decoder(dev))
- return range_contains(r1, r2);
+ if (is_root_decoder(dev)) {
+ if (range_contains(r1, r2))
+ return 1;
+ cxlrd = to_cxl_root_decoder(dev);
+ if (arch_match_spa(cxlrd, cxled))
+ return 1;
+ }
return (r1->start == r2->start && r1->end == r2->end);
}
@@ -1978,7 +1994,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
}
if (resource_size(cxled->dpa_res) * p->interleave_ways + p->cache_size !=
- resource_size(p->res)) {
+ resource_size(p->res) && !arch_match_spa(cxlrd, cxled)) {
dev_dbg(&cxlr->dev,
"%s:%s-size-%#llx * ways-%d + cache-%#llx != region-size-%#llx\n",
dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
@@ -3213,7 +3229,12 @@ static int match_root_decoder_by_range(struct device *dev,
r1 = &cxlrd->cxlsd.cxld.hpa_range;
r2 = &cxled->cxld.hpa_range;
- return range_contains(r1, r2);
+ if (range_contains(r1, r2))
+ return true;
+ if (arch_match_spa(cxlrd, cxled))
+ return true;
+
+ return false;
}
static int match_region_by_range(struct device *dev, const void *data)
@@ -3230,8 +3251,12 @@ static int match_region_by_range(struct device *dev, const void *data)
p = &cxlr->params;
guard(rwsem_read)(&cxl_region_rwsem);
- if (p->res && p->res->start == r->start && p->res->end == r->end)
- return 1;
+ if (p->res) {
+ if (p->res->start == r->start && p->res->end == r->end)
+ return 1;
+ if (arch_match_region(p, &cxled->cxld))
+ return 1;
+ }
return 0;
}
@@ -3319,6 +3344,21 @@ static int __construct_region(struct cxl_region *cxlr,
"Extended linear cache calculation failed rc:%d\n", rc);
}
+ /*
+ * Trim the HPA retrieved from hardware to fit the SPA mapped by the
+ * platform
+ */
+ if (arch_match_spa(cxlrd, cxled)) {
+ dev_dbg(cxlmd->dev.parent, "(LMH) Resource (%s: %pr)\n",
+ dev_name(&cxled->cxld.dev), res);
+
+ arch_adjust_region_resource(res, cxlrd);
+
+ dev_dbg(cxlmd->dev.parent,
+ "(LMH) has been adjusted (%s: %pr)\n",
+ dev_name(&cxled->cxld.dev), res);
+ }
+
rc = insert_resource(cxlrd->res, res);
if (rc) {
/*
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 4efcc0606bd6..3b3c24b1496e 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -64,6 +64,7 @@ cxl_core-y += $(CXL_CORE_SRC)/cdat.o
cxl_core-y += $(CXL_CORE_SRC)/acpi.o
cxl_core-y += $(CXL_CORE_SRC)/ras.o
cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o
+cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += $(CXL_CORE_SRC)/lmh.o
cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o
cxl_core-$(CONFIG_CXL_FEATURES) += $(CXL_CORE_SRC)/features.o
cxl_core-$(CONFIG_CXL_MCE) += $(CXL_CORE_SRC)/mce.o
--
2.48.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-03-14 11:36 [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
` (2 preceding siblings ...)
2025-03-14 11:36 ` [PATCH 3/4 v3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
@ 2025-03-14 11:36 ` Fabio M. De Francesco
2025-03-18 21:16 ` Ira Weiny
` (2 more replies)
2025-03-20 1:46 ` [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Alison Schofield
` (2 subsequent siblings)
6 siblings, 3 replies; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-14 11:36 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
Simulate an x86 Low Memory Hole for the CXL tests by changing the first
mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
to 1GB.
Since the auto-created region of cxl-test uses mock_cfmws[0], whose range
base address is typically different from the one published by the BIOS on
real hardware, the driver would fail to create and attach CXL Regions if
it were run in the mock environment created by cxl-test.
Therefore, save the mock_cfmws[0] range base_hpa and reuse it to match CXL
Root Decoders and Regions with Endpoint Decoders when the driver is run on
mock devices.
Since the auto-created region of cxl-test uses mock_cfmws[0], the LMH path
in the CXL driver is exercised every time the cxl-test module is loaded.
Executing the unit test cxl-topology.sh confirms that the region is created
successfully with an LMH.
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/cxl/core/lmh.c | 35 ++++++++++++++++++++++++----
drivers/cxl/core/lmh.h | 2 ++
tools/testing/cxl/cxl_core_exports.c | 2 ++
tools/testing/cxl/test/cxl.c | 10 ++++++++
4 files changed, 45 insertions(+), 4 deletions(-)
diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
index 2e32f867eb94..9c55670c1c84 100644
--- a/drivers/cxl/core/lmh.c
+++ b/drivers/cxl/core/lmh.c
@@ -1,11 +1,28 @@
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/range.h>
+#include <linux/pci.h>
+
#include "lmh.h"
/* Start of CFMWS range that ends before x86 Low Memory Holes */
#define LMH_CFMWS_RANGE_START 0x0ULL
+static u64 mock_cfmws0_range_start = ULLONG_MAX;
+
+void set_mock_cfmws0_range_start(const u64 start)
+{
+ mock_cfmws0_range_start = start;
+}
+
+static u64 get_cfmws_range_start(const struct device *dev)
+{
+ if (dev_is_pci(dev))
+ return LMH_CFMWS_RANGE_START;
+
+ return mock_cfmws0_range_start;
+}
+
/*
* Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
*
@@ -19,14 +36,19 @@ bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
const struct cxl_endpoint_decoder *cxled)
{
const struct range *r1, *r2;
+ u64 cfmws_range_start;
int niw;
+ cfmws_range_start = get_cfmws_range_start(&cxled->cxld.dev);
+ if (cfmws_range_start == ULLONG_MAX)
+ return false;
+
r1 = &cxlrd->cxlsd.cxld.hpa_range;
r2 = &cxled->cxld.hpa_range;
niw = cxled->cxld.interleave_ways;
- if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
- r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
+ if (r1->start == cfmws_range_start && r1->start == r2->start &&
+ r1->end < (cfmws_range_start + SZ_4G) && r1->end < r2->end &&
IS_ALIGNED(range_len(r2), niw * SZ_256M))
return true;
@@ -40,9 +62,14 @@ bool arch_match_region(const struct cxl_region_params *p,
const struct range *r = &cxld->hpa_range;
const struct resource *res = p->res;
int niw = cxld->interleave_ways;
+ u64 cfmws_range_start;
+
+ cfmws_range_start = get_cfmws_range_start(&cxld->dev);
+ if (cfmws_range_start == ULLONG_MAX)
+ return false;
- if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
- res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
+ if (res->start == cfmws_range_start && res->start == r->start &&
+ res->end < (cfmws_range_start + SZ_4G) && res->end < r->end &&
IS_ALIGNED(range_len(r), niw * SZ_256M))
return true;
diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
index 16746ceac1ed..b6337120ee17 100644
--- a/drivers/cxl/core/lmh.h
+++ b/drivers/cxl/core/lmh.h
@@ -2,6 +2,8 @@
#include "cxl.h"
+void set_mock_cfmws0_range_start(u64 start);
+
#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
const struct cxl_endpoint_decoder *cxled);
diff --git a/tools/testing/cxl/cxl_core_exports.c b/tools/testing/cxl/cxl_core_exports.c
index f088792a8925..7b20f9fcf0d7 100644
--- a/tools/testing/cxl/cxl_core_exports.c
+++ b/tools/testing/cxl/cxl_core_exports.c
@@ -2,6 +2,8 @@
/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
#include "cxl.h"
+#include "lmh.h"
/* Exporting of cxl_core symbols that are only used by cxl_test */
EXPORT_SYMBOL_NS_GPL(cxl_num_decoders_committed, "CXL");
+EXPORT_SYMBOL_NS_GPL(set_mock_cfmws0_range_start, "CXL");
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 1c3336095923..8c69ce0a272f 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -9,6 +9,7 @@
#include <linux/pci.h>
#include <linux/mm.h>
#include <cxlmem.h>
+#include <core/lmh.h>
#include "../watermark.h"
#include "mock.h"
@@ -212,7 +213,11 @@ static struct {
.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
.qtg_id = FAKE_QTG_ID,
+#if defined(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE)
+ .window_size = SZ_256M * 3UL,
+#else
.window_size = SZ_256M * 4UL,
+#endif
},
.target = { 0 },
},
@@ -454,6 +459,7 @@ static int populate_cedt(void)
return -ENOMEM;
window->base_hpa = res->range.start;
}
+ set_mock_cfmws0_range_start(mock_cfmws[0]->base_hpa);
return 0;
}
@@ -744,7 +750,11 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
struct cxl_endpoint_decoder *cxled;
struct cxl_switch_decoder *cxlsd;
struct cxl_port *port, *iter;
+#if defined(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE)
+ const int size = SZ_1G;
+#else
const int size = SZ_512M;
+#endif
struct cxl_memdev *cxlmd;
struct cxl_dport *dport;
struct device *dev;
--
2.48.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86
2025-03-14 11:36 ` [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86 Fabio M. De Francesco
@ 2025-03-18 15:15 ` Ira Weiny
2025-03-21 10:21 ` Robert Richter
2025-03-28 23:40 ` Dan Williams
2 siblings, 0 replies; 32+ messages in thread
From: Ira Weiny @ 2025-03-18 15:15 UTC (permalink / raw)
To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
Fabio M. De Francesco wrote:
> In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
> SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> ranges are always guaranteed to align to the NIW * 256M rule.
>
> In order to construct Regions and attach Decoders, the driver needs to
> match Root Decoders and Regions with Endpoint Decoders, but it fails and
> the entire process returns errors because it doesn't expect to deal with
> SPA range lengths smaller than corresponding HPA's.
>
> Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> with corresponding HPA's. They will be used in the process of Regions
> creation and Endpoint attachments to prevent driver failures in a few
> steps of the above-mentioned process.
>
> The helpers return true when HPA/SPA misalignments are detected under
> specific conditions: both the SPA and HPA ranges must start at
> LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
> be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> aligned to the NIW * 256M rule.
>
> Also introduce a function to adjust the range end of the Regions to be
> created on x86 with LMH's.
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
[snip]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4 v3] cxl/core: Enable Region creation on x86 with Low Memory Hole
2025-03-14 11:36 ` [PATCH 3/4 v3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
@ 2025-03-18 20:35 ` Ira Weiny
2025-03-21 10:29 ` Robert Richter
1 sibling, 0 replies; 32+ messages in thread
From: Ira Weiny @ 2025-03-18 20:35 UTC (permalink / raw)
To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
>
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. In some cases the size
> of that hole is not compatible with the CXL hardware decoder constraint
> that the size is always aligned to 256M * Interleave Ways.
>
> On those systems, BIOS publishes CFMWS which communicate the active System
> Physical Address (SPA) ranges that map to a subset of the Host Physical
> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> the endpoint is lost with no SPA to map to CXL HPA in that hole.
>
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, cxl_add_to_region() fails and returns an error
> because it can't find any CXL Window that matches a given CXL Endpoint
> Decoder.
>
> Detect a Low Memory Hole by comparing Root Decoders and Endpoint Decoders
> ranges with the use of arch_match_{spa,region}() helpers.
>
> Match Root Decoders and CXL Regions with corresponding CXL Endpoint
> Decoders. Currently, a Low Memory Hole would prevent the matching
> functions from returning true.
>
> Construct CXL Regions with HPA range's end adjusted to the matching SPA.
>
> Allow the target attach process to complete by permitting Regions that
> do not comply with the alignment constraint (i.e., the NIW * 256M rule).
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
[snip]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-03-14 11:36 ` [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
@ 2025-03-18 21:16 ` Ira Weiny
2025-03-21 10:42 ` Robert Richter
2025-03-28 23:40 ` Dan Williams
2 siblings, 0 replies; 32+ messages in thread
From: Ira Weiny @ 2025-03-18 21:16 UTC (permalink / raw)
To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
Fabio M. De Francesco wrote:
> Simulate an x86 Low Memory Hole for the CXL tests by changing the first
> mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
> to 1GB.
>
> Since the auto-created region of cxl-test uses mock_cfmws[0], whose range
> base address is typically different from the one published by the BIOS on
> real hardware, the driver would fail to create and attach CXL Regions if
> it were run in the mock environment created by cxl-test.
>
> Therefore, save the mock_cfmws[0] range base_hpa and reuse it to match CXL
> Root Decoders and Regions with Endpoint Decoders when the driver is run on
> mock devices.
>
> Since the auto-created region of cxl-test uses mock_cfmws[0], the
> LMH path in the CXL driver will be exercised every time the cxl-test
> module is loaded. Executing the unit test cxl-topology.sh confirms that
> the region is created successfully with an LMH.
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
[snip]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-03-14 11:36 [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
` (3 preceding siblings ...)
2025-03-14 11:36 ` [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
@ 2025-03-20 1:46 ` Alison Schofield
2025-03-26 16:23 ` Fabio M. De Francesco
2025-03-20 18:10 ` Alison Schofield
2025-03-21 10:34 ` Robert Richter
6 siblings, 1 reply; 32+ messages in thread
From: Alison Schofield @ 2025-03-20 1:46 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, Robert Richter, ming.li, linux-kernel,
linux-cxl
On Fri, Mar 14, 2025 at 12:36:29PM +0100, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
>
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. On those systems, BIOS
> publishes CFMWS which communicate the active System Physical Address (SPA)
> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> SPA range trims out the hole, and capacity in the endpoint is lost with no
> SPA to map to CXL HPA in that hole.
>
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, the driver fails and returns an error because it
> expects that the CXL Endpoint Decoder range is a subset of the Root
> Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
>
> Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> Decoders or already made CXL Regions and Decoders to allow the
> construction of new CXL Regions and the attachment of Endpoint Decoders,
> even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> range end to match Root Decoder's.
I think the dpa_res field of the endpoint decoder needs adjusting.
After the region is set up, cxled->dpa_res has the unadjusted value,
and that leads to a region warning and an address translation failure:
the driver 'thinks' the DPA is within a region, but when it tries
to translate it to an HPA in that region, it fails.
Here's where I looked at it: using the cxl-test LMH auto-region (nice!)
each endpoint decoder is programmed to contribute 512MB to the 1024MB region.
The LMH adjustment shrunk the region to 768MB, so each endpoint is only
contributing 384MB to the region.
DPA->HPA address translations of DPA addresses in the 384->512 gap cause
a problem. The driver will needlessly warn that they are in a region for
any poison inject or clear, and will fail address translations for any
poison, general media or dram event.
I think this check in region.c: __cxl_dpa_to_region() should reject it:
if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
return 0;
For that check to fire, the LMH code needs to adjust cxled->dpa_res too.
To test this using clear_poison you can:
# echo 536866816 > /sys/kernel/debug/cxl/mem1/clear_poison
(536866816 = 512MB - 4096)
[ ] cxl_core:__cxl_dpa_to_region:2860: cxl decoder18.0: dpa:0x1ffff000 mapped in region:region0
[ ] cxl_core:cxl_dpa_to_hpa:2963: cxl_region region0: Addr trans fail: hpa 0x3ff04fffe000 not in region
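The arithmetic behind that failure can be checked in a self-contained
sketch (plain user-space C; lmh_adjust_dpa_res() is a hypothetical helper
modeling the fix suggested above, not existing kernel code): when the
region shrinks from 1024MB to 768MB across 2 interleave ways, each
endpoint's contribution drops from 512MB to 384MB, so DPAs in the
384MB-512MB gap must be rejected once dpa_res is adjusted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MB(x) ((uint64_t)(x) << 20)

struct dpa_res { uint64_t start, end; };

/* Mirrors the __cxl_dpa_to_region() bounds check quoted above. */
static bool dpa_in_res(const struct dpa_res *res, uint64_t dpa)
{
	return dpa >= res->start && dpa <= res->end;
}

/*
 * Hypothetical LMH fixup: shrink the per-endpoint DPA window to
 * region_size / interleave_ways.
 */
static void lmh_adjust_dpa_res(struct dpa_res *res, uint64_t region_size,
			       int niw)
{
	res->end = res->start + region_size / niw - 1;
}
```

Using the clear_poison example address (536866816 = 512MB - 4096), the
unadjusted window wrongly accepts the DPA, while the adjusted one rejects
it.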
snip
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-03-14 11:36 [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
` (4 preceding siblings ...)
2025-03-20 1:46 ` [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Alison Schofield
@ 2025-03-20 18:10 ` Alison Schofield
2025-03-26 16:24 ` Fabio M. De Francesco
2025-03-21 10:34 ` Robert Richter
6 siblings, 1 reply; 32+ messages in thread
From: Alison Schofield @ 2025-03-20 18:10 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, Robert Richter, ming.li, linux-kernel,
linux-cxl
On Fri, Mar 14, 2025 at 12:36:29PM +0100, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
>
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. On those systems, BIOS
> publishes CFMWS which communicate the active System Physical Address (SPA)
> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> SPA range trims out the hole, and capacity in the endpoint is lost with no
> SPA to map to CXL HPA in that hole.
>
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, the driver fails and returns an error because it
> expects that the CXL Endpoint Decoder range is a subset of the Root
> Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
>
> Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> Decoders or already made CXL Regions and Decoders to allow the
> construction of new CXL Regions and the attachment of Endpoint Decoders,
> even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> range end to match Root Decoder's.
Unless dev_dbg() is on, LMH handling won't be apparent. How about adding
a dev_info() that informs the user that a region was adjusted to account
for an LMH? I'm foreseeing looking at logs and divining an LMH, when a
simple log message would be a nice shortcut.
CXL subsystem is quiet with only 8 dev_info()'s, so if this breaks some
vow of silence, nevermind :)
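A sketch of what such a message might look like (user-space model with
dev_info() stubbed via snprintf so it can be exercised; the wording is
only a suggestion, not an existing kernel message):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static char logbuf[256];

/* Stand-in for the kernel's dev_info(); logs into a buffer for the test. */
#define dev_info(dev, fmt, ...) \
	snprintf(logbuf, sizeof(logbuf), "%s: " fmt, (dev), __VA_ARGS__)

/* Suggested one-time notice emitted when an LMH adjustment is applied. */
static void lmh_notify(const char *dev, uint64_t from, uint64_t to)
{
	dev_info(dev, "region trimmed for low memory hole: size %#llx -> %#llx\n",
		 (unsigned long long)from, (unsigned long long)to);
}
```

In the driver the same call would name the region device and run at
dev_info() level, so the adjustment shows up in default logs.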
-snip
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86
2025-03-14 11:36 ` [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86 Fabio M. De Francesco
2025-03-18 15:15 ` Ira Weiny
@ 2025-03-21 10:21 ` Robert Richter
2025-03-26 16:47 ` Fabio M. De Francesco
2025-03-28 23:40 ` Dan Williams
2 siblings, 1 reply; 32+ messages in thread
From: Robert Richter @ 2025-03-21 10:21 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
On 14.03.25 12:36:31, Fabio M. De Francesco wrote:
> On x86 with a Low Memory Hole, the BIOS may publish CFMWS that describe
> SPA ranges which are subsets of the corresponding CXL Endpoint Decoder
> HPA ranges, because the CFMWS never intersects LMHs while EP Decoder HPA
> ranges are always guaranteed to align to the NIW * 256M rule.
>
> In order to construct Regions and attach Decoders, the driver needs to
> match Root Decoders and Regions with Endpoint Decoders, but it fails and
> the entire process returns errors because it doesn't expect to deal with
> SPA range lengths smaller than corresponding HPA's.
>
> Introduce functions that indirectly detect x86 LMHs by comparing SPA
> ranges with their corresponding HPA ranges. They will be used during
> Region creation and Endpoint attachment to prevent driver failures at
> several steps of that process.
>
> The helpers return true when HPA/SPA misalignments are detected under
> specific conditions: both the SPA and HPA ranges must start at
> LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
"... that for Intel with LMHs is 0x0", not true for AMD.
> must be less than the HPA's, the SPA range size must be less than 4G,
> and the HPA size must be aligned to the NIW * 256M rule.
>
> Also introduce a function to adjust the range end of the Regions to be
> created on x86 with LMH's.
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
The split of the code between patch #2 and patch #3 does not make sense.
The "interface" you introduce here is out of context; its context is
patch #3. And since the functions are actually used in patch #3, the
reader needs to switch back to this patch. Beyond that, the code is not
enabled here at all; it is not even built.
> 2 files changed, 85 insertions(+)
> create mode 100644 drivers/cxl/core/lmh.c
> create mode 100644 drivers/cxl/core/lmh.h
>
> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> new file mode 100644
> index 000000000000..2e32f867eb94
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.c
> @@ -0,0 +1,56 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/range.h>
> +#include "lmh.h"
> +
> +/* Start of CFMWS ranges that end before x86 Low Memory Holes */
> +#define LMH_CFMWS_RANGE_START 0x0ULL
This looks arbitrary. An endpoint decoder's zero base address could
have other causes too, e.g. the need for address translation. So you
need a different check for the mem hole.
In my previous review comment I requested a platform check to enable
this code.
> +
> +/*
> + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> + *
> + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> + * the given endpoint decoder HPA range size is always expected aligned and
> + * also larger than that of the matching root decoder. If there are LMH's,
> + * the root decoder range end is always less than SZ_4G.
> + */
> +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> + const struct cxl_endpoint_decoder *cxled)
> +{
> + const struct range *r1, *r2;
> + int niw;
> +
> + r1 = &cxlrd->cxlsd.cxld.hpa_range;
> + r2 = &cxled->cxld.hpa_range;
> + niw = cxled->cxld.interleave_ways;
> +
> + if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> + r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
How about removing LMH_CFMWS_RANGE_START altogether? It is zero anyway,
and dropping it would make this check much simpler.
Can this be modified so the code can be reused for "holes" other than
the one below 4G?
> + IS_ALIGNED(range_len(r2), niw * SZ_256M))
> + return true;
> +
> + return false;
> +}
> +
> +/* Similar to arch_match_spa(), it matches regions and decoders */
> +bool arch_match_region(const struct cxl_region_params *p,
> + const struct cxl_decoder *cxld)
> +{
> + const struct range *r = &cxld->hpa_range;
> + const struct resource *res = p->res;
> + int niw = cxld->interleave_ways;
> +
> + if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> + res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
Same here.
> + IS_ALIGNED(range_len(r), niw * SZ_256M))
> + return true;
> +
> + return false;
> +}
Right now the default check is:
(p->res->start == r->start && p->res->end == r->end)
Can't we just calculate a matching spa range for the decoder and
region and then check if both match? Both match functions would be
obsolete then.
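That idea can be sketched stand-alone (plain C; trim_to_spa() is a
hypothetical helper, not existing kernel code): derive the expected SPA
range from the decoder's HPA range once, then both match sites collapse to
a plain range-equality check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4G 0x100000000ULL

struct range { uint64_t start, end; };

/*
 * Hypothetical helper: compute the SPA range a decoder's HPA range is
 * expected to map to. On an LMH platform, a window starting at 0 is
 * trimmed at the published SPA end; otherwise the HPA range is returned
 * unchanged.
 */
static struct range trim_to_spa(struct range hpa, uint64_t spa_end)
{
	if (hpa.start == 0 && spa_end < SZ_4G && spa_end < hpa.end)
		hpa.end = spa_end;
	return hpa;
}

/* Both match sites reduce to one equality check. */
static bool ranges_match(struct range a, struct range b)
{
	return a.start == b.start && a.end == b.end;
}
```

A low 1GB decoder window is trimmed to the 768MB SPA and then matches it,
while a window above 4G passes through untouched, so no special-case match
function is needed.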
> +
> +void arch_adjust_region_resource(struct resource *res,
> + struct cxl_root_decoder *cxlrd)
> +{
> + res->end = cxlrd->res->end;
> +}
This should be squashed with arch_match_spa(): same style and
interface as for cxl_extended_linear_cache_resize(). Please generalize
the interface of cxl_extended_linear_cache_resize() first.
> diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> new file mode 100644
> index 000000000000..16746ceac1ed
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include "cxl.h"
> +
> +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> + const struct cxl_endpoint_decoder *cxled);
> +bool arch_match_region(const struct cxl_region_params *p,
> + const struct cxl_decoder *cxld);
> +void arch_adjust_region_resource(struct resource *res,
> + struct cxl_root_decoder *cxlrd);
> +#else
> +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> + struct cxl_endpoint_decoder *cxled)
> +{
> + return false;
> +}
> +
> +static bool arch_match_region(struct cxl_region_params *p,
> + struct cxl_decoder *cxld)
> +{
> + return false;
> +}
> +
> +static void arch_adjust_region_resource(struct resource *res,
> + struct cxl_root_decoder *cxlrd)
> +{
> +}
> +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
I don't think we will need all these helpers if there are
platform-specific callbacks, as suggested in a comment on v1.
-Robert
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4 v3] cxl/core: Enable Region creation on x86 with Low Memory Hole
2025-03-14 11:36 ` [PATCH 3/4 v3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
2025-03-18 20:35 ` Ira Weiny
@ 2025-03-21 10:29 ` Robert Richter
1 sibling, 0 replies; 32+ messages in thread
From: Robert Richter @ 2025-03-21 10:29 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
On 14.03.25 12:36:32, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
>
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. In some cases the size
> of that hole is not compatible with the CXL hardware decoder constraint
> that the size is always aligned to 256M * Interleave Ways.
>
> On those systems, BIOS publishes CFMWS which communicate the active System
> Physical Address (SPA) ranges that map to a subset of the Host Physical
> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> the endpoint is lost with no SPA to map to CXL HPA in that hole.
>
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, cxl_add_to_region() fails and returns an error
> because it can't find any CXL Window that matches a given CXL Endpoint
> Decoder.
>
> Detect a Low Memory Hole by comparing Root Decoders and Endpoint Decoders
> ranges with the use of arch_match_{spa,region}() helpers.
>
> Match Root Decoders and CXL Regions with corresponding CXL Endpoint
> Decoders. Currently, a Low Memory Hole would prevent the matching
> functions from returning true.
>
> Construct CXL Regions with HPA range's end adjusted to the matching SPA.
>
> Allow the target attach process to complete by permitting Regions that
> do not comply with the alignment constraint (i.e., the NIW * 256M rule).
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/cxl/Kconfig | 5 ++++
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/region.c | 56 +++++++++++++++++++++++++++++++++------
> tools/testing/cxl/Kbuild | 1 +
> 4 files changed, 55 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 205547e5543a..3bb282ef01df 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -139,6 +139,11 @@ config CXL_REGION
>
> If unsure say 'y'
>
> +config CXL_ARCH_LOW_MEMORY_HOLE
> + def_bool y
> + depends on CXL_REGION
> + depends on X86
> +
> config CXL_REGION_INVALIDATION_TEST
> bool "CXL: Region Cache Management Bypass (TEST)"
> depends on CXL_REGION
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 139b349b3a52..3dccd3c224f1 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -17,6 +17,7 @@ cxl_core-y += cdat.o
> cxl_core-y += acpi.o
> cxl_core-y += ras.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> cxl_core-$(CONFIG_CXL_FEATURES) += features.o
> cxl_core-$(CONFIG_CXL_MCE) += mce.o
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 97122d645cc1..9eb23ecedecf 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -13,6 +13,7 @@
> #include <cxlmem.h>
> #include <cxl.h>
> #include "core.h"
> +#include "lmh.h"
>
> /**
> * DOC: cxl core region
> @@ -835,6 +836,8 @@ static int match_free_decoder(struct device *dev, const void *data)
> static bool region_res_match_cxl_range(const struct cxl_region_params *p,
> struct range *range)
> {
> + struct cxl_decoder *cxld;
> +
> if (!p->res)
> return false;
>
> @@ -843,8 +846,15 @@ static bool region_res_match_cxl_range(const struct cxl_region_params *p,
> * to be fronted by the DRAM range in current known implementation.
> * This assumption will be made until a variant implementation exists.
> */
> - return p->res->start + p->cache_size == range->start &&
> - p->res->end == range->end;
> + if (p->res->start + p->cache_size == range->start &&
> + p->res->end == range->end)
> + return true;
> +
> + cxld = container_of(range, struct cxl_decoder, hpa_range);
> + if (arch_match_region(p, cxld))
> + return true;
> +
> + return false;
This reaches a complexity that will not be maintainable within a couple
of years, or even months. We need a maintainable solution for all this.
In particular, the use of callbacks or handlers enabled by platform
checks would help to better isolate the code.
> }
>
> static int match_auto_decoder(struct device *dev, const void *data)
> @@ -1760,6 +1770,7 @@ static int match_switch_decoder_by_range(struct device *dev,
> {
> const struct cxl_endpoint_decoder *cxled = data;
> struct cxl_switch_decoder *cxlsd;
> + struct cxl_root_decoder *cxlrd;
> const struct range *r1, *r2;
>
> if (!is_switch_decoder(dev))
> @@ -1769,8 +1780,13 @@ static int match_switch_decoder_by_range(struct device *dev,
> r1 = &cxlsd->cxld.hpa_range;
> r2 = &cxled->cxld.hpa_range;
>
> - if (is_root_decoder(dev))
> - return range_contains(r1, r2);
> + if (is_root_decoder(dev)) {
> + if (range_contains(r1, r2))
> + return 1;
> + cxlrd = to_cxl_root_decoder(dev);
> + if (arch_match_spa(cxlrd, cxled))
> + return 1;
See my other comment on patch #2 about simplifying the match functions.
This applies to the checks below too.
> + }
> return (r1->start == r2->start && r1->end == r2->end);
> }
>
> @@ -1978,7 +1994,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> }
>
> if (resource_size(cxled->dpa_res) * p->interleave_ways + p->cache_size !=
> - resource_size(p->res)) {
> + resource_size(p->res) && !arch_match_spa(cxlrd, cxled)) {
> dev_dbg(&cxlr->dev,
> "%s:%s-size-%#llx * ways-%d + cache-%#llx != region-size-%#llx\n",
> dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> @@ -3213,7 +3229,12 @@ static int match_root_decoder_by_range(struct device *dev,
> r1 = &cxlrd->cxlsd.cxld.hpa_range;
> r2 = &cxled->cxld.hpa_range;
>
> - return range_contains(r1, r2);
> + if (range_contains(r1, r2))
> + return true;
> + if (arch_match_spa(cxlrd, cxled))
> + return true;
> +
> + return false;
> }
>
> static int match_region_by_range(struct device *dev, const void *data)
> @@ -3230,8 +3251,12 @@ static int match_region_by_range(struct device *dev, const void *data)
> p = &cxlr->params;
>
> guard(rwsem_read)(&cxl_region_rwsem);
> - if (p->res && p->res->start == r->start && p->res->end == r->end)
> - return 1;
> + if (p->res) {
> + if (p->res->start == r->start && p->res->end == r->end)
> + return 1;
> + if (arch_match_region(p, &cxled->cxld))
> + return 1;
> + }
>
> return 0;
> }
> @@ -3319,6 +3344,21 @@ static int __construct_region(struct cxl_region *cxlr,
> "Extended linear cache calculation failed rc:%d\n", rc);
> }
>
> + /*
> + * Trim the HPA retrieved from hardware to fit the SPA mapped by the
> + * platform
> + */
> + if (arch_match_spa(cxlrd, cxled)) {
> + dev_dbg(cxlmd->dev.parent, "(LMH) Resource (%s: %pr)\n",
> + dev_name(&cxled->cxld.dev), res);
> +
> + arch_adjust_region_resource(res, cxlrd);
> +
> + dev_dbg(cxlmd->dev.parent,
> + "(LMH) has been adjusted (%s: %pr)\n",
> + dev_name(&cxled->cxld.dev), res);
> + }
See my earlier comment on squashing both functions into one.
-Robert
> +
> rc = insert_resource(cxlrd->res, res);
> if (rc) {
> /*
> diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
> index 4efcc0606bd6..3b3c24b1496e 100644
> --- a/tools/testing/cxl/Kbuild
> +++ b/tools/testing/cxl/Kbuild
> @@ -64,6 +64,7 @@ cxl_core-y += $(CXL_CORE_SRC)/cdat.o
> cxl_core-y += $(CXL_CORE_SRC)/acpi.o
> cxl_core-y += $(CXL_CORE_SRC)/ras.o
> cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o
> +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += $(CXL_CORE_SRC)/lmh.o
> cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o
> cxl_core-$(CONFIG_CXL_FEATURES) += $(CXL_CORE_SRC)/features.o
> cxl_core-$(CONFIG_CXL_MCE) += $(CXL_CORE_SRC)/mce.o
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-03-14 11:36 [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
` (5 preceding siblings ...)
2025-03-20 18:10 ` Alison Schofield
@ 2025-03-21 10:34 ` Robert Richter
2025-03-25 16:13 ` Fabio M. De Francesco
6 siblings, 1 reply; 32+ messages in thread
From: Robert Richter @ 2025-03-21 10:34 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
Fabio,
On 14.03.25 12:36:29, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
>
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. On those systems, BIOS
> publishes CFMWS which communicate the active System Physical Address (SPA)
> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> SPA range trims out the hole, and capacity in the endpoint is lost with no
> SPA to map to CXL HPA in that hole.
>
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, the driver fails and returns an error because it
> expects that the CXL Endpoint Decoder range is a subset of the Root
> Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
>
> Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> Decoders or already made CXL Regions and Decoders to allow the
> construction of new CXL Regions and the attachment of Endpoint Decoders,
> even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> range end to match Root Decoder's.
>
> - Patch 1/4 changes the calling conventions of three match_*_by_range()
> helpers in preparation of 3/4.
> - Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
> the HPA range end for CXL Regions construction.
> - Patch 3/4 enables CXL Regions construction and Endpoint Decoders
> attachment by matching Root Decoders or Regions with Endpoint
> Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
> constraints while Endpoints decoders' attachment.
> - Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.
>
> Many thanks to Alison, Dan, and Ira for their help and for their reviews
> of my RFC on Intel's internal ML.
>
> Commenting on v1, Alison wrote a couple of observations on what users
> will see. I suggest anyone interested to see how this series affect
> users to take a look at her observations.[0] Thank you!
>
> Changes for v3:
>
> Re-base the series on cxl/next.
>
> 1/4 - 2/4:
> Constify local variables.
> 3/4:
> Call arch_match_region() from region_res_match_cxl_range().
> 4/4:
> arch_match_region() - Check that region end is under start + 4G;
> arch_match_spa() - Check that SPA range start is cfmws_range_start.
I have sent comments for version 1 and suggested a simpler approach to
implement this. My comments haven't been addressed yet; we need better
isolation to reduce interference with other platforms and archs. Please
take another look.
Many thanks,
-Robert
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-03-14 11:36 ` [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
2025-03-18 21:16 ` Ira Weiny
@ 2025-03-21 10:42 ` Robert Richter
2025-03-26 16:58 ` Fabio M. De Francesco
2025-03-28 23:40 ` Dan Williams
2 siblings, 1 reply; 32+ messages in thread
From: Robert Richter @ 2025-03-21 10:42 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
On 14.03.25 12:36:33, Fabio M. De Francesco wrote:
> Simulate an x86 Low Memory Hole for the CXL tests by changing the first
> mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
> to 1GB.
>
> Since the auto-created region of cxl-test uses mock_cfmws[0], whose range
> base address is typically different from the one published by the BIOS on
> real hardware, the driver would fail to create and attach CXL Regions if
> it were run in the mock environment created by cxl-test.
>
> Therefore, save the mock_cfmws[0] range base_hpa and reuse it to match CXL
> Root Decoders and Regions with Endpoint Decoders when the driver is run on
> mock devices.
>
> Since the auto-created region of cxl-test uses mock_cfmws[0], the
> LMH path in the CXL driver will be exercised every time the cxl-test
> module is loaded. Executing the unit test cxl-topology.sh confirms that
> the region is created successfully with an LMH.
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/cxl/core/lmh.c | 35 ++++++++++++++++++++++++----
> drivers/cxl/core/lmh.h | 2 ++
Can you take a look at moving all those changes to testing/? This
indicates that the interface of your mock functions needs improvement.
-Robert
> tools/testing/cxl/cxl_core_exports.c | 2 ++
> tools/testing/cxl/test/cxl.c | 10 ++++++++
> 4 files changed, 45 insertions(+), 4 deletions(-)
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/4 v3] cxl/core: Change match_*_by_range() calling convention
2025-03-14 11:36 ` [PATCH 1/4 v3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
@ 2025-03-21 15:43 ` Dave Jiang
0 siblings, 0 replies; 32+ messages in thread
From: Dave Jiang @ 2025-03-21 15:43 UTC (permalink / raw)
To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl
On 3/14/25 4:36 AM, Fabio M. De Francesco wrote:
> Replace struct range parameter with struct cxl_endpoint_decoder of
> which range is a member in the match_*_by_range() functions.
>
> This is in preparation for expanding these helpers to perform arch
> specific region matching that requires a cxl_endpoint_decoder.
>
> No functional changes.
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Alison Schofield <alison.schofield@intel.com>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/cxl/core/region.c | 26 ++++++++++++++++----------
> 1 file changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index b3260d433ec7..97122d645cc1 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1758,24 +1758,27 @@ static struct cxl_port *next_port(struct cxl_port *port)
> static int match_switch_decoder_by_range(struct device *dev,
> const void *data)
> {
> + const struct cxl_endpoint_decoder *cxled = data;
> struct cxl_switch_decoder *cxlsd;
> - const struct range *r1, *r2 = data;
> -
> + const struct range *r1, *r2;
>
> if (!is_switch_decoder(dev))
> return 0;
>
> cxlsd = to_cxl_switch_decoder(dev);
> r1 = &cxlsd->cxld.hpa_range;
> + r2 = &cxled->cxld.hpa_range;
>
> if (is_root_decoder(dev))
> return range_contains(r1, r2);
> return (r1->start == r2->start && r1->end == r2->end);
> }
>
> -static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> +static int find_pos_and_ways(struct cxl_port *port,
> + struct cxl_endpoint_decoder *cxled,
> int *pos, int *ways)
> {
> + struct range *range = &cxled->cxld.hpa_range;
> struct cxl_switch_decoder *cxlsd;
> struct cxl_port *parent;
> struct device *dev;
> @@ -1785,7 +1788,7 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
> if (!parent)
> return rc;
>
> - dev = device_find_child(&parent->dev, range,
> + dev = device_find_child(&parent->dev, cxled,
> match_switch_decoder_by_range);
> if (!dev) {
> dev_err(port->uport_dev,
> @@ -1865,7 +1868,7 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
> if (is_cxl_root(iter))
> break;
>
> - rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
> + rc = find_pos_and_ways(iter, cxled, &parent_pos, &parent_ways);
> if (rc)
> return rc;
>
> @@ -3199,22 +3202,26 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
> static int match_root_decoder_by_range(struct device *dev,
> const void *data)
> {
> - const struct range *r1, *r2 = data;
> + const struct cxl_endpoint_decoder *cxled = data;
> struct cxl_root_decoder *cxlrd;
> + const struct range *r1, *r2;
>
> if (!is_root_decoder(dev))
> return 0;
>
> cxlrd = to_cxl_root_decoder(dev);
> r1 = &cxlrd->cxlsd.cxld.hpa_range;
> + r2 = &cxled->cxld.hpa_range;
> +
> return range_contains(r1, r2);
> }
>
> static int match_region_by_range(struct device *dev, const void *data)
> {
> + const struct cxl_endpoint_decoder *cxled = data;
> + const struct range *r = &cxled->cxld.hpa_range;
> struct cxl_region_params *p;
> struct cxl_region *cxlr;
> - const struct range *r = data;
>
> if (!is_cxl_region(dev))
> return 0;
> @@ -3382,7 +3389,6 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
> {
> struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> - struct range *hpa = &cxled->cxld.hpa_range;
> struct cxl_decoder *cxld = &cxled->cxld;
> struct device *cxlrd_dev, *region_dev;
> struct cxl_root_decoder *cxlrd;
> @@ -3391,7 +3397,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
> bool attach = false;
> int rc;
>
> - cxlrd_dev = device_find_child(&root->dev, &cxld->hpa_range,
> + cxlrd_dev = device_find_child(&root->dev, cxled,
> match_root_decoder_by_range);
> if (!cxlrd_dev) {
> dev_err(cxlmd->dev.parent,
> @@ -3408,7 +3414,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
> * one does the construction and the others add to that.
> */
> mutex_lock(&cxlrd->range_lock);
> - region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa,
> + region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, cxled,
> match_region_by_range);
> if (!region_dev) {
> cxlr = construct_region(cxlrd, cxled);
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-03-21 10:34 ` Robert Richter
@ 2025-03-25 16:13 ` Fabio M. De Francesco
2025-03-28 9:02 ` Robert Richter
0 siblings, 1 reply; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-25 16:13 UTC (permalink / raw)
To: Robert Richter
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
On Friday, March 21, 2025 11:34:35 AM Central European Standard Time Robert Richter wrote:
> Fabio,
>
> On 14.03.25 12:36:29, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> >
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. On those systems, BIOS
> > publishes CFMWS which communicate the active System Physical Address (SPA)
> > ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> > SPA range trims out the hole, and capacity in the endpoint is lost with no
> > SPA to map to CXL HPA in that hole.
> >
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, the driver fails and returns an error because it
> > expects that the CXL Endpoint Decoder range is a subset of the Root
> > Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> >
> > Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> > Decoders or already made CXL Regions and Decoders to allow the
> > construction of new CXL Regions and the attachment of Endpoint Decoders,
> > even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> > range end to match Root Decoder's.
> >
> > - Patch 1/4 changes the calling conventions of three match_*_by_range()
> > helpers in preparation of 3/4.
> > - Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
> > the HPA range end for CXL Regions construction.
> > - Patch 3/4 enables CXL Regions construction and Endpoint Decoders
> > attachment by matching Root Decoders or Regions with Endpoint
> > Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
> > constraints while Endpoints decoders' attachment.
> > - Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.
> >
> > Many thanks to Alison, Dan, and Ira for their help and for their reviews
> > of my RFC on Intel's internal ML.
> >
> > Commenting on v1, Alison wrote a couple of observations on what users
> > will see. I suggest anyone interested to see how this series affects
> > users to take a look at her observations.[0] Thank you!
> >
> > Changes for v3:
> >
> > Re-base the series on cxl/next.
> >
> > 1/4 - 2/4:
> > Constify local variables.
> > 3/4:
> > Call arch_match_region() from region_res_match_cxl_range().
> > 4/4:
> > arch_match_region() - Check that region end is under start + 4G;
> > arch_match_spa() - Check that SPA range start is cfmws_range_start.
>
> I have sent comments for version 1 and suggested a simpler approach
> for this to implement.
>
I replied to your comments for version 1. Please find my message at
https://lore.kernel.org/linux-cxl/9930375.eNJFYEL58v@fdefranc-mobl3/.
>
> My comments haven't been addressed yet,
>
I think it's not possible to detect an LMH while still in cxl_port_add().
Therefore, I think that this is the best way to go. It prevents the
driver from failing to create Regions and attach Endpoint Decoders on
x86 with Low Memory Holes, provided that the lower SPA range starts at
0x0. On platforms with other kinds of holes, the driver will continue
to fail as it currently does.
>
> but we
> need better isolation to reduce interference with other platforms and
> archs. Please take a look again.
>
Interference? Do you mean that this series would make the driver fail on
other platforms?
Of course I don't want anything like that. I'm not clear about it...
Would you please describe how this series would interfere and what
would happen on other platforms?
Thanks,
Fabio
>
> Many thanks,
>
> -Robert
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-03-20 1:46 ` [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Alison Schofield
@ 2025-03-26 16:23 ` Fabio M. De Francesco
0 siblings, 0 replies; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-26 16:23 UTC (permalink / raw)
To: Alison Schofield
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, Robert Richter, ming.li, linux-kernel,
linux-cxl
On Thursday, March 20, 2025 2:46:14 AM Central European Standard Time Alison Schofield wrote:
> On Fri, Mar 14, 2025 at 12:36:29PM +0100, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> >
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. On those systems, BIOS
> > publishes CFMWS which communicate the active System Physical Address (SPA)
> > ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> > SPA range trims out the hole, and capacity in the endpoint is lost with no
> > SPA to map to CXL HPA in that hole.
> >
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, the driver fails and returns an error because it
> > expects that the CXL Endpoint Decoder range is a subset of the Root
> > Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> >
> > Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> > Decoders or already made CXL Regions and Decoders to allow the
> > construction of new CXL Regions and the attachment of Endpoint Decoders,
> > even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> > range end to match Root Decoder's.
>
> I think the dpa_res field of the endpoint decoder needs adjusting.
> After the region is setup, the cxled->dpa_res has the unadjusted value
> and that leads to region warning and address translation failure because
> the driver 'thinks' that DPA is within a region, but when it tries
> to translate to an HPA in that region, it fails.
>
> Here's where I looked at it: using the cxl-test LMH auto-region (nice!)
> each endpoint decoder is programmed to contribute 512MB to the 1024MB region.
> The LMH adjustment shrunk the region to 768MB, so each endpoint is only
> contributing 384MB to the region.
>
> DPA->HPA address translations of DPA addresses in the 384->512 gap cause
> a problem. The driver will needlessly warn that they are in a region for
> any poison inject or clear, and will fail address translations for any
> poison, general media or dram event.
>
> I think this should fail in region.c: __cxl_dpa_to_region()
> if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
> return 0;
>
> For that to fail, LMH code needs to adjust cxled->dpa_res too.
>
>
> To test is using clear_poison you can:
> # echo 536866816 > /sys/kernel/debug/cxl/mem1/clear_poison
> (536866816 = 512MB - 4096)
>
> [ ] cxl_core:__cxl_dpa_to_region:2860: cxl decoder18.0: dpa:0x1ffff000 mapped in region:region0
> [ ] cxl_core:cxl_dpa_to_hpa:2963: cxl_region region0: Addr trans fail: hpa 0x3ff04fffe000 not in region
>
>
> snip
> >
>
Alison,
I'll adjust cxled->dpa_res too.
Thank you for noticing this issue.
Fabio
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-03-20 18:10 ` Alison Schofield
@ 2025-03-26 16:24 ` Fabio M. De Francesco
0 siblings, 0 replies; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-26 16:24 UTC (permalink / raw)
To: Alison Schofield
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
Ira Weiny, Dan Williams, Robert Richter, ming.li, linux-kernel,
linux-cxl
[-- Attachment #1: Type: text/plain, Size: 1990 bytes --]
On Thursday, March 20, 2025 7:10:13 PM Central European Standard Time Alison Schofield wrote:
> On Fri, Mar 14, 2025 at 12:36:29PM +0100, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> >
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. On those systems, BIOS
> > publishes CFMWS which communicate the active System Physical Address (SPA)
> > ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> > SPA range trims out the hole, and capacity in the endpoint is lost with no
> > SPA to map to CXL HPA in that hole.
> >
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, the driver fails and returns an error because it
> > expects that the CXL Endpoint Decoder range is a subset of the Root
> > Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> >
> > Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> > Decoders or already made CXL Regions and Decoders to allow the
> > construction of new CXL Regions and the attachment of Endpoint Decoders,
> > even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> > range end to match Root Decoder's.
>
> Unless dev_dbg() is on, LMH handling won't be apparent. How about adding
> a dev_info() that informs the user that a region was adjusted to account
> for a LMH. I'm foreseeing looking at logs and divining a LMH, when a
> simple log message would be a nice shortcut.
>
> CXL subsystem is quiet with only 8 dev_info()'s, so if this breaks some
> vow of silence, nevermind :)
>
Yes, sure. It sounds good.
Thanks,
Fabio
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86
2025-03-21 10:21 ` Robert Richter
@ 2025-03-26 16:47 ` Fabio M. De Francesco
2025-03-28 10:26 ` Robert Richter
0 siblings, 1 reply; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-26 16:47 UTC (permalink / raw)
To: Robert Richter
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
On Friday, March 21, 2025 11:21:44 AM Central European Standard Time Robert Richter wrote:
> On 14.03.25 12:36:31, Fabio M. De Francesco wrote:
> > On x86 with a Low Memory Hole, the BIOS may publish CFMWS that describe
> > SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> > HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> > ranges are always guaranteed to align to the NIW * 256M rule.
> >
> > In order to construct Regions and attach Decoders, the driver needs to
> > match Root Decoders and Regions with Endpoint Decoders, but it fails and
> > the entire process returns errors because it doesn't expect to deal with
> > SPA range lengths smaller than corresponding HPA's.
> >
> > Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> > with corresponding HPA's. They will be used in the process of Regions
> > creation and Endpoint attachments to prevent driver failures in a few
> > steps of the above-mentioned process.
> >
> > The helpers return true when HPA/SPA misalignments are detected under
> > specific conditions: both the SPA and HPA ranges must start at
> > LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
>
> "... that for Intel with LMHs is 0x0", not true for AMD.
>
> > be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> > aligned to the NIW * 256M rule.
> >
> > Also introduce a function to adjust the range end of the Regions to be
> > created on x86 with LMH's.
> >
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> > drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> > drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
>
> The split of the code in patch #2 and #3 does not make sense. The
> "interface" you introduce here is out of context which is patch #3.
> And in patch #3 the functions are actually used, so you need to switch
> back to this one. Other than that, the code is not enabled here at
> all, it is even not built.
>
I prefer to split the introduction of helpers and the use of them in
separate patches.
>
> > 2 files changed, 85 insertions(+)
> > create mode 100644 drivers/cxl/core/lmh.c
> > create mode 100644 drivers/cxl/core/lmh.h
> >
> > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > new file mode 100644
> > index 000000000000..2e32f867eb94
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.c
> > @@ -0,0 +1,56 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +
> > +#include <linux/range.h>
> > +#include "lmh.h"
> > +
> > +/* Start of CFMWS range that end before x86 Low Memory Holes */
> > +#define LMH_CFMWS_RANGE_START 0x0ULL
>
> This looks arbitrary. An endpoint decoder's zero base address could
> have other reasons too, e.g. the need of address translation. So you
> need a different check for the mem hole.
>
> In my previous review comment I have requested a platform check for
> this code to enable.
>
> > +
> > +/*
> > + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > + *
> > + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> > + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> > + * the given endpoint decoder HPA range size is always expected aligned and
> > + * also larger than that of the matching root decoder. If there are LMH's,
> > + * the root decoder range end is always less than SZ_4G.
> > + */
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > + const struct cxl_endpoint_decoder *cxled)
> > +{
> > + const struct range *r1, *r2;
> > + int niw;
> > +
> > + r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > + r2 = &cxled->cxld.hpa_range;
> > + niw = cxled->cxld.interleave_ways;
> > +
> > + if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > + r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
>
> How about removing LMH_CFMWS_RANGE_START at all? It is zero anyway and
> would make this check much easier.
>
I think that in the next version I'll check that the range start is a multiple
of the Number of Interleave Ways * 256M (which would include 0x0).
>
> Can this being modified to reuse this codes for "holes" other than
> below 4G?
>
This series enables CXL Region creation and Endpoint Decoder attach for x86 Low
Memory Holes. I prefer not to enable regions on platforms with memory holes
beyond 4G until I'm sure it's needed. arch_match_spa() and arch_match_region()
can be easily modified if in future they need to be.
>
> > + IS_ALIGNED(range_len(r2), niw * SZ_256M))
> > + return true;
> > +
> > + return false;
> > +}
> > +
> > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > +bool arch_match_region(const struct cxl_region_params *p,
> > + const struct cxl_decoder *cxld)
> > +{
> > + const struct range *r = &cxld->hpa_range;
> > + const struct resource *res = p->res;
> > + int niw = cxld->interleave_ways;
> > +
> > + if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > + res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
>
> Same here.
>
> > + IS_ALIGNED(range_len(r), niw * SZ_256M))
> > + return true;
> > +
> > + return false;
> > +}
>
> Right now the default check is:
>
> (p->res->start == r->start && p->res->end == r->end)
>
> Can't we just calculate a matching spa range for the decoder and
> region and then check if both match? Both match functions would be
> obsolete then.
>
I prefer to keep the design untouched. Anyway, as said above, I'll check
for other aligned res->start values.
>
> > +
> > +void arch_adjust_region_resource(struct resource *res,
> > + struct cxl_root_decoder *cxlrd)
> > +{
> > + res->end = cxlrd->res->end;
> > +}
>
> This should be squashed with arch_match_spa(): same style and
> interface as for cxl_extended_linear_cache_resize(). Please generalize
> the interface of cxl_extended_linear_cache_resize() first.
>
Do you mean that arch_match_spa() should adjust res->end? I don't think
it should.
>
> > diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> > new file mode 100644
> > index 000000000000..16746ceac1ed
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include "cxl.h"
> > +
> > +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > + const struct cxl_endpoint_decoder *cxled);
> > +bool arch_match_region(const struct cxl_region_params *p,
> > + const struct cxl_decoder *cxld);
> > +void arch_adjust_region_resource(struct resource *res,
> > + struct cxl_root_decoder *cxlrd);
> > +#else
> > +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > + struct cxl_endpoint_decoder *cxled)
> > +{
> > + return false;
> > +}
> > +
> > +static bool arch_match_region(struct cxl_region_params *p,
> > + struct cxl_decoder *cxld)
> > +{
> > + return false;
> > +}
> > +
> > +static void arch_adjust_region_resource(struct resource *res,
> > + struct cxl_root_decoder *cxlrd)
> > +{
> > +}
> > +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
>
> I don't think we will need all this helpers if there are platform
> specific callbacks as suggested in a comment to v1.
>
> -Robert
>
As said in a reply to 0/4, I'm not yet aware of issues that would interfere
with non-Intel x86. Anyway, I'll think a bit more about using platform-specific
callbacks, even though currently I don't think they are necessary.
>
Thanks,
Fabio
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-03-21 10:42 ` Robert Richter
@ 2025-03-26 16:58 ` Fabio M. De Francesco
2025-03-28 10:52 ` Robert Richter
0 siblings, 1 reply; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-26 16:58 UTC (permalink / raw)
To: Robert Richter
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
[-- Attachment #1: Type: text/plain, Size: 2215 bytes --]
On Friday, March 21, 2025 11:42:09 AM Central European Standard Time Robert Richter wrote:
> On 14.03.25 12:36:33, Fabio M. De Francesco wrote:
> > Simulate an x86 Low Memory Hole for the CXL tests by changing the first
> > mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
> > to 1GB.
> >
> > Since the auto-created region of cxl-test uses mock_cfmws[0], whose range
> > base address is typically different from the one published by the BIOS on
> > real hardware, the driver would fail to create and attach CXL Regions if
> > it was run on the mock environment created by cxl-tests.
> >
> > Therefore, save the mock_cfmws[0] range base_hpa and reuse it to match CXL
> > Root Decoders and Regions with Endpoint Decoders when the driver is run on
> > mock devices.
> >
> > Since the auto-created region of cxl-test uses mock_cfmws[0], the
> > LMH path in the CXL Driver will be exercised every time the cxl-test
> > module is loaded. Executing unit test: cxl-topology.sh, confirms the
> > region created successfully with a LMH.
> >
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> > drivers/cxl/core/lmh.c | 35 ++++++++++++++++++++++++----
> > drivers/cxl/core/lmh.h | 2 ++
>
> Can you take a look to move all those changes to testing/? This
> indicates the interface of your mock functions needs improvement.
>
> -Robert
>
> > tools/testing/cxl/cxl_core_exports.c | 2 ++
> > tools/testing/cxl/test/cxl.c | 10 ++++++++
> > 4 files changed, 45 insertions(+), 4 deletions(-)
>
Your comment is not very detailed, but I think you are suggesting
overriding the lmh functions in testing. If that is the case, I don't
think that any function which is exported from and also called by
cxl-core can be overridden by testing with the strong/weak mechanism.
But I'm not an expert on linker-related topics, and I'm not even sure
that I understood what you suggested.
Would you please elaborate more?
Thanks,
Fabio
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-03-25 16:13 ` Fabio M. De Francesco
@ 2025-03-28 9:02 ` Robert Richter
2025-03-28 21:10 ` Dave Jiang
0 siblings, 1 reply; 32+ messages in thread
From: Robert Richter @ 2025-03-28 9:02 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
On 25.03.25 17:13:50, Fabio M. De Francesco wrote:
> On Friday, March 21, 2025 11:34:35 AM Central European Standard Time Robert Richter wrote:
> > Fabio,
> >
> > On 14.03.25 12:36:29, Fabio M. De Francesco wrote:
> > > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > > Physical Address (HPA) windows that are associated with each CXL Host
> > > Bridge. Each window represents a contiguous HPA that may be interleaved
> > > with one or more targets (CXL v3.1 - 9.18.1.3).
> > >
> > > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > > memory to which systems cannot send transactions. On those systems, BIOS
> > > publishes CFMWS which communicate the active System Physical Address (SPA)
> > > ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> > > SPA range trims out the hole, and capacity in the endpoint is lost with no
> > > SPA to map to CXL HPA in that hole.
> > >
> > > In the early stages of CXL Regions construction and attach on platforms
> > > with Low Memory Holes, the driver fails and returns an error because it
> > > expects that the CXL Endpoint Decoder range is a subset of the Root
> > > Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> > >
> > > Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> > > Decoders or already made CXL Regions and Decoders to allow the
> > > construction of new CXL Regions and the attachment of Endpoint Decoders,
> > > even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> > > range end to match Root Decoder's.
> > >
> > > - Patch 1/4 changes the calling conventions of three match_*_by_range()
> > > helpers in preparation of 3/4.
> > > - Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
> > > the HPA range end for CXL Regions construction.
> > > - Patch 3/4 enables CXL Regions construction and Endpoint Decoders
> > > attachment by matching Root Decoders or Regions with Endpoint
> > > Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
> > > constraints while Endpoints decoders' attachment.
> > > - Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.
> > >
> > > Many thanks to Alison, Dan, and Ira for their help and for their reviews
> > > of my RFC on Intel's internal ML.
> > >
> > > Commenting on v1, Alison wrote a couple of observations on what users
> > > will see. I suggest anyone interested to see how this series affect
> > > users to take a look at her observations.[0] Thank you!
> > >
> > > Changes for v3:
> > >
> > > Re-base the series on cxl/next.
> > >
> > > 1/4 - 2/4:
> > > Constify local variables.
> > > 3/4:
> > > Call arch_match_region() from region_res_match_cxl_range().
> > > 4/4:
> > > arch_match_region() - Check that region end is under start + 4G;
> > > arch_match_spa() - Check that SPA range start is cfmws_range_start.
> >
> > I have sent comments for version 1 and suggested a simpler approach
> > for this to implement.
> >
> I replied to your comments for version 1. Please find my message at
> https://lore.kernel.org/linux-cxl/9930375.eNJFYEL58v@fdefranc-mobl3/.
The outcome was "I'll think about it.".
> >
> > My comments haven't been addressed yet,
> >
> I think it's not possible to detect an LMH while still in cxl_port_add().
Why is that not possible? I have outlined a solution before: Implement
a function to run platform-specific port setup. Run a platform check.
Enable a platform-dependent callback. Set up the port using platform
specifics.
> Therefore, I think that this is the best way to go. It works to prevent
> the driver to fail to create Regions and attach Endpoint Decoders on x86
> with Low Memory Holes, provided that the lower SPA range starts at 0x0.
An alternative works as well, that is not an argument.
>
> On platforms with other kind of holes, the driver will continue to fail
> as it currently does.
And those platforms will then add more specific code and more
complexity in the main path others need to run? See, the code needs to
be encapsulated.
> >
> > but we
> > need better isolation to reduce interference with other platforms and
> > archs. Please take a look again.
> >
> Interference? Do you mean that this series would make the driver fail on
> other platforms?
No, other platforms must deal with that specific code and constraints.
>
> Of course I don't want anything like that. I'm not clear about it...
> Would you please describe how would this series interfere and what
> would happen on other platforms?
Other platforms should not care about the platform specifics of others. So
again, use a platform check and only enable that code where necessary.
And this requires a well-defined interface to common code.
Thanks,
-Robert
>
> Thanks,
>
> Fabio
> >
> > Many thanks,
> >
> > -Robert
> >
>
>
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86
2025-03-26 16:47 ` Fabio M. De Francesco
@ 2025-03-28 10:26 ` Robert Richter
0 siblings, 0 replies; 32+ messages in thread
From: Robert Richter @ 2025-03-28 10:26 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
On 26.03.25 17:47:26, Fabio M. De Francesco wrote:
> On Friday, March 21, 2025 11:21:44 AM Central European Standard Time Robert Richter wrote:
> > On 14.03.25 12:36:31, Fabio M. De Francesco wrote:
> > > On x86 with a Low Memory Hole, the BIOS may publish CFMWS that describe
> > > SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> > > HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> > > ranges are always guaranteed to align to the NIW * 256M rule.
> > >
> > > In order to construct Regions and attach Decoders, the driver needs to
> > > match Root Decoders and Regions with Endpoint Decoders, but it fails and
> > > the entire process returns errors because it doesn't expect to deal with
> > > SPA range lengths smaller than corresponding HPA's.
> > >
> > > Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> > > with corresponding HPA's. They will be used in the process of Regions
> > > creation and Endpoint attachments to prevent driver failures in a few
> > > steps of the above-mentioned process.
> > >
> > > The helpers return true when HPA/SPA misalignments are detected under
> > > specific conditions: both the SPA and HPA ranges must start at
> > > LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
> >
> > "... that for Intel with LMHs is 0x0", not true for AMD.
> >
> > > be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> > > aligned to the NIW * 256M rule.
> > >
> > > Also introduce a function to adjust the range end of the Regions to be
> > > created on x86 with LMH's.
> > >
> > > Cc: Alison Schofield <alison.schofield@intel.com>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > > ---
> > > drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> > > drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
> >
> > The split of the code in patch #2 and #3 does not make sense. The
> > "interface" you introduce here is out of context which is patch #3.
> > And in patch #3 the functions are actually used, so you need to switch
> > back to this one. Other than that, the code is not enabled here at
> > all, it is even not built.
> >
> I prefer to split the introduction of helpers and the use of them in
> separate patches.
You introduce dead code here. If you want to split it, then add an
empty stub function that is actually called in the first patch and add
the actual implementation in the second.
> >
> > > 2 files changed, 85 insertions(+)
> > > create mode 100644 drivers/cxl/core/lmh.c
> > > create mode 100644 drivers/cxl/core/lmh.h
> > >
> > > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > > new file mode 100644
> > > index 000000000000..2e32f867eb94
> > > --- /dev/null
> > > +++ b/drivers/cxl/core/lmh.c
> > > @@ -0,0 +1,56 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +
> > > +#include <linux/range.h>
> > > +#include "lmh.h"
> > > +
> > > +/* Start of CFMWS range that ends before x86 Low Memory Holes */
> > > +#define LMH_CFMWS_RANGE_START 0x0ULL
> >
> > This looks arbitrary. An endpoint decoder's zero base address could
> > have other reasons too, e.g. the need of address translation. So you
> > need a different check for the mem hole.
> >
> > In my previous review comment I have requested a platform check for
> > this code to enable.
> >
> > > +
> > > +/*
> > > + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > > + *
> > > + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> > > + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> > > + * the given endpoint decoder HPA range size is always expected aligned and
> > > + * also larger than that of the matching root decoder. If there are LMH's,
> > > + * the root decoder range end is always less than SZ_4G.
> > > + */
> > > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > > + const struct cxl_endpoint_decoder *cxled)
> > > +{
> > > + const struct range *r1, *r2;
> > > + int niw;
> > > +
> > > + r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > > + r2 = &cxled->cxld.hpa_range;
> > > + niw = cxled->cxld.interleave_ways;
> > > +
> > > + if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > > + r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> >
> > How about removing LMH_CFMWS_RANGE_START at all? It is zero anyway and
> > would make this check much easier.
> >
> I think that in the next version I'll check that the range start is a multiple
> of Number of Interleave Ways * 256M (it would include 0x0).
The code suggests that r2->start is zero, but shouldn't the base of the
endpoint be non-zero? And r2->end is greater than SZ_4G, which is the
case for all ranges above 4G. Confused, how do you determine the
endpoint fits into the LMH range?
> >
> > Can this being modified to reuse this codes for "holes" other than
> > below 4G?
> >
> This series enables CXL Regions creation and Endpoints attach for x86 LOW
> Memory Holes. I prefer not to enable regions on platforms with memory holes
> beyond 4G until I'm sure it's needed. arch_match_spa() and arch_match_region()
> can be easily modified if in future they need to be.
> >
> > > + IS_ALIGNED(range_len(r2), niw * SZ_256M))
Isn't that length check of the endpoint always true, since this is a
CXL range?
> > > + return true;
The whole check looks odd, please clarify.
> > > +
> > > + return false;
> > > +}
> > > +
> > > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > > +bool arch_match_region(const struct cxl_region_params *p,
> > > + const struct cxl_decoder *cxld)
> > > +{
> > > + const struct range *r = &cxld->hpa_range;
> > > + const struct resource *res = p->res;
> > > + int niw = cxld->interleave_ways;
> > > +
> > > + if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > > + res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> >
> > Same here.
> >
> > > + IS_ALIGNED(range_len(r), niw * SZ_256M))
> > > + return true;
> > > +
> > > + return false;
> > > +}
> >
> > Right now the default check is:
> >
> > (p->res->start == r->start && p->res->end == r->end)
> >
> > Can't we just calculate a matching spa range for the decoder and
> > region and then check if both match? Both match functions would be
> > obsolete then.
> >
> I prefer to keep the design untouched. Anyway, I'll check for other aligned
> res->start, as said above.
> >
> > > +
> > > +void arch_adjust_region_resource(struct resource *res,
> > > + struct cxl_root_decoder *cxlrd)
> > > +{
> > > + res->end = cxlrd->res->end;
> > > +}
> >
> > This should be squashed with arch_match_spa(): same style and
> > interface as for cxl_extended_linear_cache_resize(). Please generalize
> > the interface of cxl_extended_linear_cache_resize() first.
> >
> Do you mean that arch_match_spa() should adjust res->end? I don't think
> it should.
Yes, it should follow the style of already existing code with unified
interfaces.
> >
> > > diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> > > new file mode 100644
> > > index 000000000000..16746ceac1ed
> > > --- /dev/null
> > > +++ b/drivers/cxl/core/lmh.h
> > > @@ -0,0 +1,29 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > +
> > > +#include "cxl.h"
> > > +
> > > +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> > > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > > + const struct cxl_endpoint_decoder *cxled);
> > > +bool arch_match_region(const struct cxl_region_params *p,
> > > + const struct cxl_decoder *cxld);
> > > +void arch_adjust_region_resource(struct resource *res,
> > > + struct cxl_root_decoder *cxlrd);
> > > +#else
> > > +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > > + struct cxl_endpoint_decoder *cxled)
> > > +{
> > > + return false;
> > > +}
> > > +
> > > +static bool arch_match_region(struct cxl_region_params *p,
> > > + struct cxl_decoder *cxld)
> > > +{
> > > + return false;
> > > +}
> > > +
> > > +static void arch_adjust_region_resource(struct resource *res,
> > > + struct cxl_root_decoder *cxlrd)
> > > +{
> > > +}
> > > +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
> >
> > I don't think we will need all this helpers if there are platform
> > specific callbacks as suggested in a comment to v1.
> >
> > -Robert
> >
> As said in a reply to 0/4, I'm not yet aware of issues that would interfere
> with non Intel x86. Anyway, I'll think a bit more about using platform
> specific callbacks, even if currently I don't think they are necessary.
Please do, thanks.
-Robert
> >
> Thanks,
>
> Fabio
>
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-03-26 16:58 ` Fabio M. De Francesco
@ 2025-03-28 10:52 ` Robert Richter
0 siblings, 0 replies; 32+ messages in thread
From: Robert Richter @ 2025-03-28 10:52 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, ming.li, linux-kernel,
linux-cxl
On 26.03.25 17:58:50, Fabio M. De Francesco wrote:
> On Friday, March 21, 2025 11:42:09 AM Central European Standard Time Robert Richter wrote:
> > On 14.03.25 12:36:33, Fabio M. De Francesco wrote:
> > > Simulate an x86 Low Memory Hole for the CXL tests by changing the first
> > > mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
> > > to 1GB.
> > >
> > > Since the auto-created region of cxl-test uses mock_cfmws[0], whose range
> > > base address is typically different from the one published by the BIOS on
> > > real hardware, the driver would fail to create and attach CXL Regions if
> > > it was run on the mock environment created by cxl-tests.
> > >
> > > Therefore, save the mock_cfmws[0] range base_hpa and reuse it to match CXL
> > > Root Decoders and Regions with Endpoint Decoders when the driver is run on
> > > mock devices.
> > >
> > > Since the auto-created region of cxl-test uses mock_cfmws[0], the
> > > LMH path in the CXL Driver will be exercised every time the cxl-test
> > > module is loaded. Executing unit test: cxl-topology.sh, confirms the
> > > region created successfully with a LMH.
> > >
> > > Cc: Alison Schofield <alison.schofield@intel.com>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > > ---
> > > drivers/cxl/core/lmh.c | 35 ++++++++++++++++++++++++----
> > > drivers/cxl/core/lmh.h | 2 ++
> >
> > Can you take a look to move all those changes to testing/? This
> > indicates the interface of your mock functions need improvement.
> >
> > -Robert
> >
> > > tools/testing/cxl/cxl_core_exports.c | 2 ++
> > > tools/testing/cxl/test/cxl.c | 10 ++++++++
> > > 4 files changed, 45 insertions(+), 4 deletions(-)
> >
> Your comment is not very detailed but I think you are suggesting to
> override the lmh functions in testing. If that is the case, I don't think
> that any function which is exported from and also called by cxl-core
> can be overridden by testing with the strong/weak mechanism.
I think you need to modify cxl_test to create a LMH hole (register a
mock bridge and root port with the necessary parameters, etc.) and
then run cxl_core unmodified.
-Robert
>
> But I'm not an expert in linker-related topics and I'm not even sure that I
> understood what you suggested.
>
> Would you please elaborate more?
>
> Thanks,
>
> Fabio
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-03-28 9:02 ` Robert Richter
@ 2025-03-28 21:10 ` Dave Jiang
2025-04-02 11:51 ` Robert Richter
0 siblings, 1 reply; 32+ messages in thread
From: Dave Jiang @ 2025-03-28 21:10 UTC (permalink / raw)
To: Robert Richter, Fabio M. De Francesco
Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, ming.li, linux-kernel, linux-cxl
On 3/28/25 2:02 AM, Robert Richter wrote:
> On 25.03.25 17:13:50, Fabio M. De Francesco wrote:
>> On Friday, March 21, 2025 11:34:35 AM Central European Standard Time Robert Richter wrote:
>>> Fabio,
>>>
>>> On 14.03.25 12:36:29, Fabio M. De Francesco wrote:
>>>> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
>>>> Physical Address (HPA) windows that are associated with each CXL Host
>>>> Bridge. Each window represents a contiguous HPA that may be interleaved
>>>> with one or more targets (CXL v3.1 - 9.18.1.3).
>>>>
>>>> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
>>>> memory to which systems cannot send transactions. On those systems, BIOS
>>>> publishes CFMWS which communicate the active System Physical Address (SPA)
>>>> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
>>>> SPA range trims out the hole, and capacity in the endpoint is lost with no
>>>> SPA to map to CXL HPA in that hole.
>>>>
>>>> In the early stages of CXL Regions construction and attach on platforms
>>>> with Low Memory Holes, the driver fails and returns an error because it
>>>> expects that the CXL Endpoint Decoder range is a subset of the Root
>>>> Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
>>>>
>>>> Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
>>>> Decoders or already made CXL Regions and Decoders to allow the
>>>> construction of new CXL Regions and the attachment of Endpoint Decoders,
>>>> even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
>>>> range end to match Root Decoder's.
>>>>
>>>> - Patch 1/4 changes the calling conventions of three match_*_by_range()
>>>> helpers in preparation of 3/4.
>>>> - Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
>>>> the HPA range end for CXL Regions construction.
>>>> - Patch 3/4 enables CXL Regions construction and Endpoint Decoders
>>>> attachment by matching Root Decoders or Regions with Endpoint
>>>> Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
>>>> constraints while Endpoints decoders' attachment.
>>>> - Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.
>>>>
>>>> Many thanks to Alison, Dan, and Ira for their help and for their reviews
>>>> of my RFC on Intel's internal ML.
>>>>
>>>> Commenting on v1, Alison wrote a couple of observations on what users
>>>> will see. I suggest anyone interested to see how this series affect
>>>> users to take a look at her observations.[0] Thank you!
>>>>
>>>> Changes for v3:
>>>>
>>>> Re-base the series on cxl/next.
>>>>
>>>> 1/4 - 2/4:
>>>> Constify local variables.
>>>> 3/4:
>>>> Call arch_match_region() from region_res_match_cxl_range().
>>>> 4/4:
>>>> arch_match_region() - Check that region end is under start + 4G;
>>>> arch_match_spa() - Check that SPA range start is cfmws_range_start.
>>>
>>> I have sent comments for version 1 and suggested a simpler approach
>>> for this to implement.
>>>
>> I replied to your comments for version 1. Please find my message at
>> https://lore.kernel.org/linux-cxl/9930375.eNJFYEL58v@fdefranc-mobl3/.
>
> The outcome was "I'll think about it.".
>
>>>
>>> My comments haven't been addressed yet,
>>>
>> I think it's not possible to detect an LMH while still in cxl_port_add().
>
> Why is that not possible? I have outlined a solution before: Implement
> a function to run platform specific port setup. Run a platform check.
> Enable a platform-dependent callback. Set up the port using platform
> specifics.
>
>> Therefore, I think that this is the best way to go. It works to prevent
>> the driver to fail to create Regions and attach Endpoint Decoders on x86
>> with Low Memory Holes, provided that the lower SPA range starts at 0x0.
>
> An alternative works as well, that is not an argument.
>
>>
>> On platforms with other kind of holes, the driver will continue to fail
>> as it currently does.
>
> And those platforms will then add more specific code and more
> complexity to the main path others need to run? See, the code needs to
> be encapsulated.
>
>>>
>>> but we
>>> need better isolation to reduce interference with other platforms and
>>> archs. Please take a look again.
>>>
>> Interference? Do you mean that this series would make the driver fail on
>> other platforms?
>
> No, other platforms must deal with that specific code and constraints.
>
>>
>> Of course I don't want anything like that. I'm not clear about it...
>> Would you please describe how would this series interfere and what
>> would happen on other platforms?
>
> Other platforms should not care about platform-specifics of others. So
> again, use a platform check and only enable that code where necessary.
> And this requires a well defined interface to common code.
Hi Robert,
Can you please share more background information and/or your specific concerns about possible memory holes on other platforms that need to be considered and are not covered by Fabio's code? Let's all get on the same page about what specifics we need to consider to make this work. Preferably I want to avoid arch- and platform-specific code in CXL if possible. Of course that may not always be possible. I would like to see if we can avoid a bunch of #ifdef X86/ARM/INTEL/AMD and do it more cleanly. But fully understanding the situation first would be helpful to determine that. Thank you!
DJ
>
> Thanks,
>
> -Robert
>
>>
>> Thanks,
>>
>> Fabio
>>>
>>> Many thanks,
>>>
>>> -Robert
>>>
>>
>>
>>
>>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86
2025-03-14 11:36 ` [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86 Fabio M. De Francesco
2025-03-18 15:15 ` Ira Weiny
2025-03-21 10:21 ` Robert Richter
@ 2025-03-28 23:40 ` Dan Williams
2025-03-29 10:05 ` Fabio M. De Francesco
2 siblings, 1 reply; 32+ messages in thread
From: Dan Williams @ 2025-03-28 23:40 UTC (permalink / raw)
To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
Fabio M. De Francesco wrote:
> In x86 with Low memory Hole, the BIOS may publish CFMWS that describe
> SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> ranges are always guaranteed to align to the NIW * 256M rule.
>
> In order to construct Regions and attach Decoders, the driver needs to
> match Root Decoders and Regions with Endpoint Decoders, but it fails and
> the entire process returns errors because it doesn't expect to deal with
> SPA range lengths smaller than corresponding HPA's.
>
> Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> with corresponding HPA's. They will be used in the process of Regions
> creation and Endpoint attachments to prevent driver failures in a few
> steps of the above-mentioned process.
>
> The helpers return true when HPA/SPA misalignments are detected under
> specific conditions: both the SPA and HPA ranges must start at
> LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
> be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> aligned to the NIW * 256M rule.
>
> Also introduce a function to adjust the range end of the Regions to be
> created on x86 with LMH's.
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
> 2 files changed, 85 insertions(+)
> create mode 100644 drivers/cxl/core/lmh.c
> create mode 100644 drivers/cxl/core/lmh.h
>
> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> new file mode 100644
> index 000000000000..2e32f867eb94
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.c
> @@ -0,0 +1,56 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/range.h>
> +#include "lmh.h"
> +
> > +/* Start of CFMWS range that ends before x86 Low Memory Holes */
> +#define LMH_CFMWS_RANGE_START 0x0ULL
> +
> +/*
> + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> + *
> + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> + * the given endpoint decoder HPA range size is always expected aligned and
> + * also larger than that of the matching root decoder. If there are LMH's,
> + * the root decoder range end is always less than SZ_4G.
> + */
> +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> + const struct cxl_endpoint_decoder *cxled)
> +{
> + const struct range *r1, *r2;
> + int niw;
> +
> + r1 = &cxlrd->cxlsd.cxld.hpa_range;
> + r2 = &cxled->cxld.hpa_range;
> + niw = cxled->cxld.interleave_ways;
> +
> + if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> + r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> + IS_ALIGNED(range_len(r2), niw * SZ_256M))
> + return true;
> +
> + return false;
> +}
> +
> +/* Similar to arch_match_spa(), it matches regions and decoders */
> +bool arch_match_region(const struct cxl_region_params *p,
> + const struct cxl_decoder *cxld)
> +{
> + const struct range *r = &cxld->hpa_range;
> + const struct resource *res = p->res;
> + int niw = cxld->interleave_ways;
> +
> + if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> + res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> + IS_ALIGNED(range_len(r), niw * SZ_256M))
> + return true;
> +
> + return false;
> +}
> +
> +void arch_adjust_region_resource(struct resource *res,
> + struct cxl_root_decoder *cxlrd)
> +{
> + res->end = cxlrd->res->end;
> +}
> diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> new file mode 100644
> index 000000000000..16746ceac1ed
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include "cxl.h"
> +
> +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> + const struct cxl_endpoint_decoder *cxled);
> +bool arch_match_region(const struct cxl_region_params *p,
> + const struct cxl_decoder *cxld);
> +void arch_adjust_region_resource(struct resource *res,
> + struct cxl_root_decoder *cxlrd);
> +#else
> +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> + struct cxl_endpoint_decoder *cxled)
> +{
> + return false;
I would have expected the default match routines to do the default
matching, not return false.
This can document the common expectation on architectures that do not
need to account for decoders not aligning to window boundaries due to
holes.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-03-14 11:36 ` [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
2025-03-18 21:16 ` Ira Weiny
2025-03-21 10:42 ` Robert Richter
@ 2025-03-28 23:40 ` Dan Williams
2025-03-29 10:16 ` Fabio M. De Francesco
2025-04-03 4:00 ` Dan Williams
2 siblings, 2 replies; 32+ messages in thread
From: Dan Williams @ 2025-03-28 23:40 UTC (permalink / raw)
To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
Fabio M. De Francesco wrote:
> Simulate an x86 Low Memory Hole for the CXL tests by changing the first
> mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
> to 1GB.
>
> Since the auto-created region of cxl-test uses mock_cfmws[0], whose range
> base address is typically different from the one published by the BIOS on
> real hardware, the driver would fail to create and attach CXL Regions if
> it was run on the mock environment created by cxl-tests.
>
> Therefore, save the mock_cfmws[0] range base_hpa and reuse it to match CXL
> Root Decoders and Regions with Endpoint Decoders when the driver is run on
> mock devices.
>
> Since the auto-created region of cxl-test uses mock_cfmws[0], the
> LMH path in the CXL Driver will be exercised every time the cxl-test
> module is loaded. Executing unit test: cxl-topology.sh, confirms the
> region created successfully with a LMH.
>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/cxl/core/lmh.c | 35 ++++++++++++++++++++++++----
> drivers/cxl/core/lmh.h | 2 ++
> tools/testing/cxl/cxl_core_exports.c | 2 ++
> tools/testing/cxl/test/cxl.c | 10 ++++++++
> 4 files changed, 45 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> index 2e32f867eb94..9c55670c1c84 100644
> --- a/drivers/cxl/core/lmh.c
> +++ b/drivers/cxl/core/lmh.c
> @@ -1,11 +1,28 @@
> // SPDX-License-Identifier: GPL-2.0-only
>
> #include <linux/range.h>
> +#include <linux/pci.h>
> +
> #include "lmh.h"
>
> /* Start of CFMWS range that ends before x86 Low Memory Holes */
> #define LMH_CFMWS_RANGE_START 0x0ULL
>
> +static u64 mock_cfmws0_range_start = ULLONG_MAX;
> +
> +void set_mock_cfmws0_range_start(const u64 start)
> +{
> + mock_cfmws0_range_start = start;
> +}
> +
> +static u64 get_cfmws_range_start(const struct device *dev)
> +{
> + if (dev_is_pci(dev))
> + return LMH_CFMWS_RANGE_START;
> +
> + return mock_cfmws0_range_start;
> +}
> +
cxl_test should never result in "mock" infrastructure appearing outside
of tools/testing/cxl/
> /*
> * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> *
> @@ -19,14 +36,19 @@ bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> const struct cxl_endpoint_decoder *cxled)
> {
> const struct range *r1, *r2;
> + u64 cfmws_range_start;
> int niw;
>
> + cfmws_range_start = get_cfmws_range_start(&cxled->cxld.dev);
> + if (cfmws_range_start == ULLONG_MAX)
> + return false;
> +
> r1 = &cxlrd->cxlsd.cxld.hpa_range;
> r2 = &cxled->cxld.hpa_range;
> niw = cxled->cxld.interleave_ways;
>
> - if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> - r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> + if (r1->start == cfmws_range_start && r1->start == r2->start &&
> + r1->end < (cfmws_range_start + SZ_4G) && r1->end < r2->end &&
> IS_ALIGNED(range_len(r2), niw * SZ_256M))
> return true;
>
> @@ -40,9 +62,14 @@ bool arch_match_region(const struct cxl_region_params *p,
> const struct range *r = &cxld->hpa_range;
> const struct resource *res = p->res;
> int niw = cxld->interleave_ways;
> + u64 cfmws_range_start;
> +
> + cfmws_range_start = get_cfmws_range_start(&cxld->dev);
> + if (cfmws_range_start == ULLONG_MAX)
> + return false;
>
> - if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> - res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> + if (res->start == cfmws_range_start && res->start == r->start &&
> + res->end < (cfmws_range_start + SZ_4G) && res->end < r->end &&
> IS_ALIGNED(range_len(r), niw * SZ_256M))
> return true;
Someone should be able to read the straight line CXL driver code and
never know that an alternate implementation exists for changing these
details.
So, the mock interface for this stuff should intercept at the
arch_match_spa() and arch_match_region() level.
To me that looks like marking these implementations with the __mock
attribute, similar to to_cxl_host_bridge(). Then define strong versions
in tools/testing/cxl/mock_lmh.c.
The strong versions would apply memory hole semantics to both windows
starting at zero and whatever cxl_test window you choose.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86
2025-03-28 23:40 ` Dan Williams
@ 2025-03-29 10:05 ` Fabio M. De Francesco
0 siblings, 0 replies; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-29 10:05 UTC (permalink / raw)
To: Dan Williams
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Robert Richter, ming.li,
linux-kernel, linux-cxl
On Saturday, March 29, 2025 12:40:34 AM Central European Standard Time Dan Williams wrote:
> Fabio M. De Francesco wrote:
> > In x86 with Low memory Hole, the BIOS may publish CFMWS that describe
> > SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> > HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> > ranges are always guaranteed to align to the NIW * 256M rule.
> >
> > In order to construct Regions and attach Decoders, the driver needs to
> > match Root Decoders and Regions with Endpoint Decoders, but it fails and
> > the entire process returns errors because it doesn't expect to deal with
> > SPA range lengths smaller than corresponding HPA's.
> >
> > Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> > with corresponding HPA's. They will be used in the process of Regions
> > creation and Endpoint attachments to prevent driver failures in a few
> > steps of the above-mentioned process.
> >
> > The helpers return true when HPA/SPA misalignments are detected under
> > specific conditions: both the SPA and HPA ranges must start at
> > LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
> > be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> > aligned to the NIW * 256M rule.
> >
> > Also introduce a function to adjust the range end of the Regions to be
> > created on x86 with LMH's.
> >
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> > drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> > drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
> > 2 files changed, 85 insertions(+)
> > create mode 100644 drivers/cxl/core/lmh.c
> > create mode 100644 drivers/cxl/core/lmh.h
> >
> > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > new file mode 100644
> > index 000000000000..2e32f867eb94
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.c
> > @@ -0,0 +1,56 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +
> > +#include <linux/range.h>
> > +#include "lmh.h"
> > +
> > +/* Start of CFMWS range that ends before x86 Low Memory Holes */
> > +#define LMH_CFMWS_RANGE_START 0x0ULL
> > +
> > +/*
> > + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > + *
> > + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> > + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> > + * the given endpoint decoder HPA range size is always expected aligned and
> > + * also larger than that of the matching root decoder. If there are LMH's,
> > + * the root decoder range end is always less than SZ_4G.
> > + */
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > + const struct cxl_endpoint_decoder *cxled)
> > +{
> > + const struct range *r1, *r2;
> > + int niw;
> > +
> > + r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > + r2 = &cxled->cxld.hpa_range;
> > + niw = cxled->cxld.interleave_ways;
> > +
> > + if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > + r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> > + IS_ALIGNED(range_len(r2), niw * SZ_256M))
> > + return true;
> > +
> > + return false;
> > +}
> > +
> > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > +bool arch_match_region(const struct cxl_region_params *p,
> > + const struct cxl_decoder *cxld)
> > +{
> > + const struct range *r = &cxld->hpa_range;
> > + const struct resource *res = p->res;
> > + int niw = cxld->interleave_ways;
> > +
> > + if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > + res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> > + IS_ALIGNED(range_len(r), niw * SZ_256M))
> > + return true;
> > +
> > + return false;
> > +}
> > +
> > +void arch_adjust_region_resource(struct resource *res,
> > + struct cxl_root_decoder *cxlrd)
> > +{
> > + res->end = cxlrd->res->end;
> > +}
> > diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> > new file mode 100644
> > index 000000000000..16746ceac1ed
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include "cxl.h"
> > +
> > +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > + const struct cxl_endpoint_decoder *cxled);
> > +bool arch_match_region(const struct cxl_region_params *p,
> > + const struct cxl_decoder *cxld);
> > +void arch_adjust_region_resource(struct resource *res,
> > + struct cxl_root_decoder *cxlrd);
> > +#else
> > +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > + struct cxl_endpoint_decoder *cxled)
> > +{
> > + return false;
>
> I would have expected the default match routines to do the default
> matching, not return false.
>
> This can document the common expectation on architectures that do not
> need to account for decoders not aligning to window boundaries due to
> holes.
>
Hi Dan,
A typical example of arch_match_spa() use is in match_root_decoder_by_range();
arch_match_spa() returns false on platforms that don't enable support for the
low memory hole. Therefore, the default behavior is to fail the matching by
returning false.
This is how arch_match_spa() is used to detect a hole and allow the matching:
static int match_root_decoder_by_range(struct device *dev,
                                       const void *data)
{
        const struct cxl_endpoint_decoder *cxled = data;
        struct cxl_root_decoder *cxlrd;
        const struct range *r1, *r2;

        if (!is_root_decoder(dev))
                return 0;

        cxlrd = to_cxl_root_decoder(dev);
        r1 = &cxlrd->cxlsd.cxld.hpa_range;
        r2 = &cxled->cxld.hpa_range;

        if (range_contains(r1, r2))
                return true;

        if (arch_match_spa(cxlrd, cxled))
                return true;

        return false;
}
Currently the default behavior is for match_root_decoder_by_range() to
not match root and endpoint decoders. I left that default unchanged for
all platforms and architectures that don't enable LMH support.
Thanks,
Fabio
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-03-28 23:40 ` Dan Williams
@ 2025-03-29 10:16 ` Fabio M. De Francesco
2025-03-29 22:01 ` Fabio M. De Francesco
2025-04-03 4:00 ` Dan Williams
1 sibling, 1 reply; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-29 10:16 UTC (permalink / raw)
To: Dan Williams
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Robert Richter, ming.li, linux-kernel,
linux-cxl
[-- Attachment #1: Type: text/plain, Size: 5375 bytes --]
On Saturday, March 29, 2025 12:40:47 AM Central European Standard Time Dan Williams wrote:
> Fabio M. De Francesco wrote:
> > Simulate an x86 Low Memory Hole for the CXL tests by changing the first
> > mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
> > to 1GB.
> >
> > Since the auto-created region of cxl-test uses mock_cfmws[0], whose range
> > base address is typically different from the one published by the BIOS on
> > real hardware, the driver would fail to create and attach CXL Regions if
> > it was run on the mock environment created by cxl-tests.
> >
> > Therefore, save the mock_cfmws[0] range base_hpa and reuse it to match CXL
> > Root Decoders and Regions with Endpoint Decoders when the driver is run on
> > mock devices.
> >
> > Since the auto-created region of cxl-test uses mock_cfmws[0], the
> > LMH path in the CXL Driver will be exercised every time the cxl-test
> > module is loaded. Executing unit test: cxl-topology.sh, confirms the
> > region created successfully with a LMH.
> >
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> > drivers/cxl/core/lmh.c | 35 ++++++++++++++++++++++++----
> > drivers/cxl/core/lmh.h | 2 ++
> > tools/testing/cxl/cxl_core_exports.c | 2 ++
> > tools/testing/cxl/test/cxl.c | 10 ++++++++
> > 4 files changed, 45 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > index 2e32f867eb94..9c55670c1c84 100644
> > --- a/drivers/cxl/core/lmh.c
> > +++ b/drivers/cxl/core/lmh.c
> > @@ -1,11 +1,28 @@
> > // SPDX-License-Identifier: GPL-2.0-only
> >
> > #include <linux/range.h>
> > +#include <linux/pci.h>
> > +
> > #include "lmh.h"
> >
> > /* Start of CFMWS range that end before x86 Low Memory Holes */
> > #define LMH_CFMWS_RANGE_START 0x0ULL
> >
> > +static u64 mock_cfmws0_range_start = ULLONG_MAX;
> > +
> > +void set_mock_cfmws0_range_start(const u64 start)
> > +{
> > + mock_cfmws0_range_start = start;
> > +}
> > +
> > +static u64 get_cfmws_range_start(const struct device *dev)
> > +{
> > + if (dev_is_pci(dev))
> > + return LMH_CFMWS_RANGE_START;
> > +
> > + return mock_cfmws0_range_start;
> > +}
> > +
>
> cxl_test should never result in "mock" infrastructure appearing outside
> of tools/testing/cxl/
>
> > /*
> > * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > *
> > @@ -19,14 +36,19 @@ bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > const struct cxl_endpoint_decoder *cxled)
> > {
> > const struct range *r1, *r2;
> > + u64 cfmws_range_start;
> > int niw;
> >
> > + cfmws_range_start = get_cfmws_range_start(&cxled->cxld.dev);
> > + if (cfmws_range_start == ULLONG_MAX)
> > + return false;
> > +
> > r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > r2 = &cxled->cxld.hpa_range;
> > niw = cxled->cxld.interleave_ways;
> >
> > - if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > - r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> > + if (r1->start == cfmws_range_start && r1->start == r2->start &&
> > + r1->end < (cfmws_range_start + SZ_4G) && r1->end < r2->end &&
> > IS_ALIGNED(range_len(r2), niw * SZ_256M))
> > return true;
> >
> > @@ -40,9 +62,14 @@ bool arch_match_region(const struct cxl_region_params *p,
> > const struct range *r = &cxld->hpa_range;
> > const struct resource *res = p->res;
> > int niw = cxld->interleave_ways;
> > + u64 cfmws_range_start;
> > +
> > + cfmws_range_start = get_cfmws_range_start(&cxld->dev);
> > + if (cfmws_range_start == ULLONG_MAX)
> > + return false;
> >
> > - if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > - res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> > + if (res->start == cfmws_range_start && res->start == r->start &&
> > + res->end < (cfmws_range_start + SZ_4G) && res->end < r->end &&
> > IS_ALIGNED(range_len(r), niw * SZ_256M))
> > return true;
>
> Someone should be able to read the straight line CXL driver code and
> never know that an alternate implementation exists for changing these
> details.
>
> So, the mock interface for this stuff should intercept at the
> arch_match_spa() and arch_match_region() level.
>
> To me that looks like mark these implementations with the __mock
> attribute, similar to to_cxl_host_bridge(). Then define strong versions
> in tools/testing/cxl/mock_lmh.c.
>
> The strong versions would apply memory hole semantics to both windows
> starting at zero and whatever cxl_test window you choose.
>
I thought the same and wanted to use the strong/weak mechanism, but then
I noticed that the strong version (in tools/testing/cxl/mock_lmh.c) was never
called. I think it never happens because the weak version is called from
within cxl_core. It seems that functions called from inside cxl_core can't
be overridden from cxl_test.
Is that deduction unfounded? Am I missing something?
Thanks,
Fabio
P.S.: Please notice that to_cxl_host_bridge() is never used in cxl_core.
* Re: [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-03-29 10:16 ` Fabio M. De Francesco
@ 2025-03-29 22:01 ` Fabio M. De Francesco
0 siblings, 0 replies; 32+ messages in thread
From: Fabio M. De Francesco @ 2025-03-29 22:01 UTC (permalink / raw)
To: Dan Williams
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Robert Richter, ming.li, linux-kernel,
linux-cxl
[-- Attachment #1: Type: text/plain, Size: 6027 bytes --]
On Saturday, March 29, 2025 11:16:09 AM Central European Standard Time Fabio M. De Francesco wrote:
> On Saturday, March 29, 2025 12:40:47 AM Central European Standard Time Dan Williams wrote:
> > Fabio M. De Francesco wrote:
> > > Simulate an x86 Low Memory Hole for the CXL tests by changing the first
> > > mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
> > > to 1GB.
> > >
> > > Since the auto-created region of cxl-test uses mock_cfmws[0], whose range
> > > base address is typically different from the one published by the BIOS on
> > > real hardware, the driver would fail to create and attach CXL Regions if
> > > it was run on the mock environment created by cxl-tests.
> > >
> > > Therefore, save the mock_cfmws[0] range base_hpa and reuse it to match CXL
> > > Root Decoders and Regions with Endpoint Decoders when the driver is run on
> > > mock devices.
> > >
> > > Since the auto-created region of cxl-test uses mock_cfmws[0], the
> > > LMH path in the CXL Driver will be exercised every time the cxl-test
> > > module is loaded. Executing unit test: cxl-topology.sh, confirms the
> > > region created successfully with a LMH.
> > >
> > > Cc: Alison Schofield <alison.schofield@intel.com>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > > ---
> > > drivers/cxl/core/lmh.c | 35 ++++++++++++++++++++++++----
> > > drivers/cxl/core/lmh.h | 2 ++
> > > tools/testing/cxl/cxl_core_exports.c | 2 ++
> > > tools/testing/cxl/test/cxl.c | 10 ++++++++
> > > 4 files changed, 45 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > > index 2e32f867eb94..9c55670c1c84 100644
> > > --- a/drivers/cxl/core/lmh.c
> > > +++ b/drivers/cxl/core/lmh.c
> > > @@ -1,11 +1,28 @@
> > > // SPDX-License-Identifier: GPL-2.0-only
> > >
> > > #include <linux/range.h>
> > > +#include <linux/pci.h>
> > > +
> > > #include "lmh.h"
> > >
> > > /* Start of CFMWS range that end before x86 Low Memory Holes */
> > > #define LMH_CFMWS_RANGE_START 0x0ULL
> > >
> > > +static u64 mock_cfmws0_range_start = ULLONG_MAX;
> > > +
> > > +void set_mock_cfmws0_range_start(const u64 start)
> > > +{
> > > + mock_cfmws0_range_start = start;
> > > +}
> > > +
> > > +static u64 get_cfmws_range_start(const struct device *dev)
> > > +{
> > > + if (dev_is_pci(dev))
> > > + return LMH_CFMWS_RANGE_START;
> > > +
> > > + return mock_cfmws0_range_start;
> > > +}
> > > +
> >
> > cxl_test should never result in "mock" infrastructure appearing outside
> > of tools/testing/cxl/
> >
> > > /*
> > > * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > > *
> > > @@ -19,14 +36,19 @@ bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > > const struct cxl_endpoint_decoder *cxled)
> > > {
> > > const struct range *r1, *r2;
> > > + u64 cfmws_range_start;
> > > int niw;
> > >
> > > + cfmws_range_start = get_cfmws_range_start(&cxled->cxld.dev);
> > > + if (cfmws_range_start == ULLONG_MAX)
> > > + return false;
> > > +
> > > r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > > r2 = &cxled->cxld.hpa_range;
> > > niw = cxled->cxld.interleave_ways;
> > >
> > > - if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > > - r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> > > + if (r1->start == cfmws_range_start && r1->start == r2->start &&
> > > + r1->end < (cfmws_range_start + SZ_4G) && r1->end < r2->end &&
> > > IS_ALIGNED(range_len(r2), niw * SZ_256M))
> > > return true;
> > >
> > > @@ -40,9 +62,14 @@ bool arch_match_region(const struct cxl_region_params *p,
> > > const struct range *r = &cxld->hpa_range;
> > > const struct resource *res = p->res;
> > > int niw = cxld->interleave_ways;
> > > + u64 cfmws_range_start;
> > > +
> > > + cfmws_range_start = get_cfmws_range_start(&cxld->dev);
> > > + if (cfmws_range_start == ULLONG_MAX)
> > > + return false;
> > >
> > > - if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > > - res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> > > + if (res->start == cfmws_range_start && res->start == r->start &&
> > > + res->end < (cfmws_range_start + SZ_4G) && res->end < r->end &&
> > > IS_ALIGNED(range_len(r), niw * SZ_256M))
> > > return true;
> >
> > Someone should be able to read the straight line CXL driver code and
> > never know that an alternate implementation exists for changing these
> > details.
> >
> > So, the mock interface for this stuff should intercept at the
> > arch_match_spa() and arch_match_region() level.
> >
> > To me that looks like mark these implementations with the __mock
> > attribute, similar to to_cxl_host_bridge(). Then define strong versions
> > in tools/testing/cxl/mock_lmh.c.
> >
> > The strong versions would apply memory hole semantics to both windows
> > starting at zero and whatever cxl_test window you choose.
> >
> I thought the same and wanted to use the strong/weak mechanism, but then
> I noticed that the strong version (in tools/testing/cxl/mock_lmh.c) was never
> called. I think it never happens because the weak version is called from
> within cxl_core. It seems that functions called from inside cxl_core can't
> be overridden from cxl_test.
>
> Is that deduction unfounded? Am I missing something?
>
> Thanks,
>
> Fabio
>
> P.S.: Please notice that to_cxl_host_bridge() is never used in cxl_core.
>
I mistakenly thought you were suggesting something like the wrap approach,
which is not possible if the caller of the wrapped function is internal to
the CXL core.[1]
Fabio
[1] https://lore.kernel.org/all/6711b7c0c0b53_3ee2294a6@dwillia2-xfh.jf.intel.com.notmuch/
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-03-28 21:10 ` Dave Jiang
@ 2025-04-02 11:51 ` Robert Richter
2025-04-02 15:31 ` Dave Jiang
0 siblings, 1 reply; 32+ messages in thread
From: Robert Richter @ 2025-04-02 11:51 UTC (permalink / raw)
To: Dave Jiang
Cc: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams, ming.li,
linux-kernel, linux-cxl
Dave,
thank you for your answer.
On 28.03.25 14:10:00, Dave Jiang wrote:
>
>
> On 3/28/25 2:02 AM, Robert Richter wrote:
> > On 25.03.25 17:13:50, Fabio M. De Francesco wrote:
> >> Interference? Do you mean that this series would make the driver fail on
> >> other platforms?
> >
> > No, other platforms must deal with that specific code and constrains.
> >
> >>
> >> Of course I don't want anything like that. I'm not clear about it...
> >> Would you please describe how would this series interfere and what
> >> would happen on other platforms?
> >
> > Other platforms should not care about platform-specifics of others. So
> > again, use a platform check and only enable that code there necessary.
> > And this requires a well defined interface to common code.
>
> Hi Robert,
> Can you please share more on the background information and/or your
> specific concerns on the possible memory holes in the other
> platforms that need to be considered and not covered by Fabio's
> code? Let's all get on the same page of what specifics we need to
> consider to make this work. Preferably I want to avoid arch and
> platform specific code in CXL if possible. Of course that may not
always be possible. I would like to see if we can avoid a bunch of #ifdef
X86/ARM/INTEL/AMD and do it more cleanly. But fully understanding the
situation first would be helpful in determining that. Thank you!
We implement a "special" case in the main path. This adds unnecessary
complexity to the code, makes it hard to maintain, change or even to
understand in the future. It becomes more error-prone. Though it is
limited to the x86 arch, the code runs on all platforms. Reusing it for
other archs would enable it for all platforms of those archs too.
This general approach to add "special" cases does not scale. We see
this already with the "extended linear cache" and now the "low mem
hole". While I am fine with all those special cases (AMD address
translation is another), we need a proper way to enable and implement
those by reducing complexity and with a good isolation. This makes
future changes easier and reduces conflicts with other
implementations.
The change in this series does not do much: it just finds a CFMWS region that
is unaligned to the EP decoder's range and then shrinks the used
SPA range of the EP to match that region. That can be implemented in a
very simple way if we introduce a spa_range parameter plus a custom
port setup. The generalized part of my address translation work already
implements this; it can be reused here. To implement LMH support, only
the following is needed:
* add a setup function with a platform check to add a custom
callback,
* the callback checks for the LMH range and adjusts the spa_range.
The modified spa_range then matches the region range (no changes
needed here). That's it.
I can help making this work.
I hope that makes sense?
Thanks,
-Robert
* Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole
2025-04-02 11:51 ` Robert Richter
@ 2025-04-02 15:31 ` Dave Jiang
0 siblings, 0 replies; 32+ messages in thread
From: Dave Jiang @ 2025-04-02 15:31 UTC (permalink / raw)
To: Robert Richter
Cc: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams, ming.li,
linux-kernel, linux-cxl
On 4/2/25 4:51 AM, Robert Richter wrote:
> Dave,
>
> thank you for your answer.
>
> On 28.03.25 14:10:00, Dave Jiang wrote:
>>
>>
>> On 3/28/25 2:02 AM, Robert Richter wrote:
>>> On 25.03.25 17:13:50, Fabio M. De Francesco wrote:
>
>>>> Interference? Do you mean that this series would make the driver fail on
>>>> other platforms?
>>>
>>> No, other platforms must deal with that specific code and constrains.
>>>
>>>>
>>>> Of course I don't want anything like that. I'm not clear about it...
>>>> Would you please describe how would this series interfere and what
>>>> would happen on other platforms?
>>>
>>> Other platforms should not care about platform-specifics of others. So
>>> again, use a platform check and only enable that code there necessary.
>>> And this requires a well defined interface to common code.
>>
>> Hi Robert,
>
>> Can you please share more on the background information and/or your
>> specific concerns on the possible memory holes in the other
>> platforms that need to be considered and not covered by Fabio's
>> code? Let's all get on the same page of what specifics we need to
>> consider to make this work. Preferably I want to avoid arch and
>> platform specific code in CXL if possible. Of course that may not
>> always be possible. I would like to see if we can avoid a bunch of #ifdef
>> X86/ARM/INTEL/AMD and do it more cleanly. But fully understanding the
>> situation first would be helpful in determining that. Thank you!
>
> We implement a "special" case in the main path. This adds unnecessary
> complexity to the code, makes it hard to maintain, change or even to
> understand in the future. It becomes more error-prone. Though it is
> limited to the x86 arch, the code runs on all platforms. Reusing it for
> other archs would enable it for all platforms of those archs too.
>
> This general approach to add "special" cases does not scale. We see
> this already with the "extended linear cache" and now the "low mem
> hole". While I am fine with all those special cases (AMD address
> translation is another), we need a proper way to enable and implement
> those by reducing complexity and with a good isolation. This makes
> future changes easier and reduces conflicts with other
> implementations.
I'm more of the thinking that if those special cases are detectable, rather than governed by a set of ambiguous rules, then we might address those quirks in a way better than #ifdefs. For "extended linear cache", it is detected via a HMAT spec change. So while only Intel implements this right now on a platform, other vendors can and may in the future. The LMH is more difficult as there are no standard ways to enumerate it. Hopefully a set of clear rules will define this. It does look like Dan is trying to get the CXL spec to clearly define this, and discussion in the WG is coming in the next couple of weeks. The AMD translation can be detected by seeing if certain ACPI callback methods exist, right? Is there more required to detect that the special translation needs to be applied? But I do agree with reducing complexity for maintenance and future implementations.
>
> The change in this series does not do much: it just finds a CFMWS region that
> is unaligned to the EP decoder's range and then shrinks the used
> SPA range of the EP to match that region. That can be implemented in a
> very simple way if we introduce a spa_range parameter plus a custom
> port setup. The generalized part of my address translation work already
> implements this; it can be reused here. To implement LMH support, only
> the following is needed:
>
> * add a setup function with a platform check to add a custom
> callback,
>
> * the callback checks for the LMH range and adjusts the spa_range.
>
> The modified spa_range matches then with the region range (no changes
> needed here). That's it.
>
> I can help making this work.
Code is always welcome :)
>
> I hope that makes sense?
Yes. Appreciate you explaining. thank you!
>
> Thanks,
>
> -Robert
>
* Re: [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests
2025-03-28 23:40 ` Dan Williams
2025-03-29 10:16 ` Fabio M. De Francesco
@ 2025-04-03 4:00 ` Dan Williams
1 sibling, 0 replies; 32+ messages in thread
From: Dan Williams @ 2025-04-03 4:00 UTC (permalink / raw)
To: Dan Williams, Fabio M. De Francesco, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny
Cc: Robert Richter, ming.li, linux-kernel, linux-cxl,
Fabio M. De Francesco
Dan Williams wrote:
> Fabio M. De Francesco wrote:
> > Simulate an x86 Low Memory Hole for the CXL tests by changing the first
> > mock CFMWS range size to 768MB and the CXL Endpoint Decoder HPA range sizes
> > to 1GB.
> >
> > Since the auto-created region of cxl-test uses mock_cfmws[0], whose range
> > base address is typically different from the one published by the BIOS on
> > real hardware, the driver would fail to create and attach CXL Regions if
> > it was run on the mock environment created by cxl-tests.
> >
> > Therefore, save the mock_cfmws[0] range base_hpa and reuse it to match CXL
> > Root Decoders and Regions with Endpoint Decoders when the driver is run on
> > mock devices.
> >
> > Since the auto-created region of cxl-test uses mock_cfmws[0], the
> > LMH path in the CXL Driver will be exercised every time the cxl-test
> > module is loaded. Executing unit test: cxl-topology.sh, confirms the
> > region created successfully with a LMH.
> >
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> > drivers/cxl/core/lmh.c | 35 ++++++++++++++++++++++++----
> > drivers/cxl/core/lmh.h | 2 ++
> > tools/testing/cxl/cxl_core_exports.c | 2 ++
> > tools/testing/cxl/test/cxl.c | 10 ++++++++
> > 4 files changed, 45 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > index 2e32f867eb94..9c55670c1c84 100644
> > --- a/drivers/cxl/core/lmh.c
> > +++ b/drivers/cxl/core/lmh.c
[..]
>
> Someone should be able to read the straight line CXL driver code and
> never know that an alternate implementation exists for changing these
> details.
>
> So, the mock interface for this stuff should intercept at the
> arch_match_spa() and arch_match_region() level.
>
> To me that looks like mark these implementations with the __mock
> attribute, similar to to_cxl_host_bridge(). Then define strong versions
> in tools/testing/cxl/mock_lmh.c.
>
> The strong versions would apply memory hole semantics to both windows
> starting at zero and whatever cxl_test window you choose.
In the interests of moving this forward and because cxl_test, while
useful, is a problem I created, here is a rough mockup of this proposal.
The observation is that this hole detection logic is self-contained and
suitable to be duplicated in the test code.
The other proposal in this mockup is that the names of these
augmentation functions be changed from "arch" to "platform" as this hole
is a total system / platform quirk not limited to a CPU arch.
Lastly, while I did not make that change I think lmh.c should probably
be renamed platform.c on the expectation that all future platform
quirkiness can land in that file. CONFIG_CXL_PLATFORM_QUIRKS can be
selected in the x86 case, but something tells me every arch that ships
CXL will eventually select CONFIG_CXL_PLATFORM_QUIRKS which might make
the config option moot in the future.
This builds and loads CXL test, and I see the mock versions of these
platform helpers being called.
-- 8< --
diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
index 9c55670c1c84..1bfe516aacf3 100644
--- a/drivers/cxl/core/lmh.c
+++ b/drivers/cxl/core/lmh.c
@@ -3,26 +3,12 @@
#include <linux/range.h>
#include <linux/pci.h>
+#include "cxlmem.h"
#include "lmh.h"
/* Start of CFMWS range that end before x86 Low Memory Holes */
#define LMH_CFMWS_RANGE_START 0x0ULL
-static u64 mock_cfmws0_range_start = ULLONG_MAX;
-
-void set_mock_cfmws0_range_start(const u64 start)
-{
- mock_cfmws0_range_start = start;
-}
-
-static u64 get_cfmws_range_start(const struct device *dev)
-{
- if (dev_is_pci(dev))
- return LMH_CFMWS_RANGE_START;
-
- return mock_cfmws0_range_start;
-}
-
/*
* Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
*
@@ -32,52 +18,48 @@ static u64 get_cfmws_range_start(const struct device *dev)
* also larger than that of the matching root decoder. If there are LMH's,
* the root decoder range end is always less than SZ_4G.
*/
-bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
- const struct cxl_endpoint_decoder *cxled)
+__weak bool
+platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
{
const struct range *r1, *r2;
- u64 cfmws_range_start;
int niw;
- cfmws_range_start = get_cfmws_range_start(&cxled->cxld.dev);
- if (cfmws_range_start == ULLONG_MAX)
- return false;
-
r1 = &cxlrd->cxlsd.cxld.hpa_range;
r2 = &cxled->cxld.hpa_range;
niw = cxled->cxld.interleave_ways;
- if (r1->start == cfmws_range_start && r1->start == r2->start &&
- r1->end < (cfmws_range_start + SZ_4G) && r1->end < r2->end &&
+ if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
+ r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
IS_ALIGNED(range_len(r2), niw * SZ_256M))
return true;
return false;
}
-/* Similar to arch_match_spa(), it matches regions and decoders */
-bool arch_match_region(const struct cxl_region_params *p,
- const struct cxl_decoder *cxld)
+__weak bool platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
{
const struct range *r = &cxld->hpa_range;
const struct resource *res = p->res;
int niw = cxld->interleave_ways;
- u64 cfmws_range_start;
-
- cfmws_range_start = get_cfmws_range_start(&cxld->dev);
- if (cfmws_range_start == ULLONG_MAX)
- return false;
- if (res->start == cfmws_range_start && res->start == r->start &&
- res->end < (cfmws_range_start + SZ_4G) && res->end < r->end &&
+ if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
+ res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
IS_ALIGNED(range_len(r), niw * SZ_256M))
return true;
return false;
}
-void arch_adjust_region_resource(struct resource *res,
- struct cxl_root_decoder *cxlrd)
+void platform_region_adjust(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled,
+ struct resource *res)
{
- res->end = cxlrd->res->end;
+ if (platform_root_decoder_contains(cxlrd, cxled)) {
+ res->end = cxlrd->res->end;
+ dev_dbg(cxled_to_memdev(cxled)->dev.parent,
+ "(LMH) has been adjusted (%s: %pr)\n",
+ dev_name(&cxled->cxld.dev), res);
+ }
}
diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
index b6337120ee17..34f88b9e8290 100644
--- a/drivers/cxl/core/lmh.h
+++ b/drivers/cxl/core/lmh.h
@@ -2,30 +2,31 @@
#include "cxl.h"
-void set_mock_cfmws0_range_start(u64 start);
-
#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
-bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
- const struct cxl_endpoint_decoder *cxled);
-bool arch_match_region(const struct cxl_region_params *p,
- const struct cxl_decoder *cxld);
-void arch_adjust_region_resource(struct resource *res,
- struct cxl_root_decoder *cxlrd);
+bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled);
+bool platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld);
+void platform_region_adjust(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled,
+ struct resource *res);
#else
-static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
- struct cxl_endpoint_decoder *cxled)
+static bool platform_root_decoder_contains(struct cxl_root_decoder *cxlrd,
+ struct cxl_endpoint_decoder *cxled)
{
return false;
}
-static bool arch_match_region(struct cxl_region_params *p,
- struct cxl_decoder *cxld)
+static bool platform_region_contains(struct cxl_region_params *p,
+ struct cxl_decoder *cxld)
{
return false;
}
-static void arch_adjust_region_resource(struct resource *res,
- struct cxl_root_decoder *cxlrd)
+static inline void
+platform_region_adjust(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled,
+ struct resource *res)
{
}
#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 9eb23ecedecf..7822e726ea8c 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -851,7 +851,7 @@ static bool region_res_match_cxl_range(const struct cxl_region_params *p,
return true;
cxld = container_of(range, struct cxl_decoder, hpa_range);
- if (arch_match_region(p, cxld))
+ if (platform_region_contains(p, cxld))
return true;
return false;
@@ -1784,7 +1784,7 @@ static int match_switch_decoder_by_range(struct device *dev,
if (range_contains(r1, r2))
return 1;
cxlrd = to_cxl_root_decoder(dev);
- if (arch_match_spa(cxlrd, cxled))
+ if (platform_root_decoder_contains(cxlrd, cxled))
return 1;
}
return (r1->start == r2->start && r1->end == r2->end);
@@ -1994,7 +1994,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
}
if (resource_size(cxled->dpa_res) * p->interleave_ways + p->cache_size !=
- resource_size(p->res) && !arch_match_spa(cxlrd, cxled)) {
+ resource_size(p->res) && !platform_root_decoder_contains(cxlrd, cxled)) {
dev_dbg(&cxlr->dev,
"%s:%s-size-%#llx * ways-%d + cache-%#llx != region-size-%#llx\n",
dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
@@ -3231,7 +3231,7 @@ static int match_root_decoder_by_range(struct device *dev,
if (range_contains(r1, r2))
return true;
- if (arch_match_spa(cxlrd, cxled))
+ if (platform_root_decoder_contains(cxlrd, cxled))
return true;
return false;
@@ -3254,7 +3254,7 @@ static int match_region_by_range(struct device *dev, const void *data)
if (p->res) {
if (p->res->start == r->start && p->res->end == r->end)
return 1;
- if (arch_match_region(p, &cxled->cxld))
+ if (platform_region_contains(p, &cxled->cxld))
return 1;
}
@@ -3348,16 +3348,7 @@ static int __construct_region(struct cxl_region *cxlr,
* Trim the HPA retrieved from hardware to fit the SPA mapped by the
* platform
*/
- if (arch_match_spa(cxlrd, cxled)) {
- dev_dbg(cxlmd->dev.parent, "(LMH) Resource (%s: %pr)\n",
- dev_name(&cxled->cxld.dev), res);
-
- arch_adjust_region_resource(res, cxlrd);
-
- dev_dbg(cxlmd->dev.parent,
- "(LMH) has been adjusted (%s: %pr)\n",
- dev_name(&cxled->cxld.dev), res);
- }
+ platform_region_adjust(cxlrd, cxled, res);
rc = insert_resource(cxlrd->res, res);
if (rc) {
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 3ec6b906371b..1b3a97ef3f11 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -74,7 +74,7 @@ static inline struct cxl_port *cxlrd_to_port(struct cxl_root_decoder *cxlrd)
}
static inline struct cxl_memdev *
-cxled_to_memdev(struct cxl_endpoint_decoder *cxled)
+cxled_to_memdev(const struct cxl_endpoint_decoder *cxled)
{
struct cxl_port *port = to_cxl_port(cxled->cxld.dev.parent);
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 3b3c24b1496e..9d24a7fd2536 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -65,6 +65,7 @@ cxl_core-y += $(CXL_CORE_SRC)/acpi.o
cxl_core-y += $(CXL_CORE_SRC)/ras.o
cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o
cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += $(CXL_CORE_SRC)/lmh.o
+cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += mock_lmh.o
cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o
cxl_core-$(CONFIG_CXL_FEATURES) += $(CXL_CORE_SRC)/features.o
cxl_core-$(CONFIG_CXL_MCE) += $(CXL_CORE_SRC)/mce.o
diff --git a/tools/testing/cxl/cxl_core_exports.c b/tools/testing/cxl/cxl_core_exports.c
index 7b20f9fcf0d7..f088792a8925 100644
--- a/tools/testing/cxl/cxl_core_exports.c
+++ b/tools/testing/cxl/cxl_core_exports.c
@@ -2,8 +2,6 @@
/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
#include "cxl.h"
-#include "lmh.h"
/* Exporting of cxl_core symbols that are only used by cxl_test */
EXPORT_SYMBOL_NS_GPL(cxl_num_decoders_committed, "CXL");
-EXPORT_SYMBOL_NS_GPL(set_mock_cfmws0_range_start, "CXL");
diff --git a/tools/testing/cxl/mock_lmh.c b/tools/testing/cxl/mock_lmh.c
new file mode 100644
index 000000000000..b849166ba86a
--- /dev/null
+++ b/tools/testing/cxl/mock_lmh.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/module.h>
+#include <linux/range.h>
+#include <linux/pci.h>
+
+#include <cxlmem.h>
+#include <lmh.h>
+#include "test/mock.h"
+
+static bool is_mock_dev(struct device *dev)
+{
+ struct cxl_mock_ops *(*get_ops_fn)(int *index);
+ struct cxl_mock_ops *ops = NULL;
+ void (*put_ops_fn)(int index);
+ bool is_mock = false;
+ int index;
+
+ get_ops_fn = symbol_get(get_cxl_mock_ops);
+ if (!get_ops_fn)
+ return false;
+ put_ops_fn = symbol_get(put_cxl_mock_ops);
+ if (!put_ops_fn)
+ goto out;
+
+ ops = get_ops_fn(&index);
+ if (ops)
+ is_mock = ops->is_mock_dev(dev);
+ put_ops_fn(index);
+
+out:
+ symbol_put(get_cxl_mock_ops);
+
+ return is_mock;
+}
+
+/* Start of CFMWS ranges that end before an x86 Low Memory Hole */
+#define LMH_CFMWS_RANGE_START 0x0ULL
+
+static bool
+real_platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
+{
+ const struct range *r1, *r2;
+ int niw;
+
+ r1 = &cxlrd->cxlsd.cxld.hpa_range;
+ r2 = &cxled->cxld.hpa_range;
+ niw = cxled->cxld.interleave_ways;
+
+ if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
+ r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
+ IS_ALIGNED(range_len(r2), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+bool platform_root_decoder_contains(const struct cxl_root_decoder *cxlrd,
+ const struct cxl_endpoint_decoder *cxled)
+{
+ if (is_mock_dev(cxled_to_memdev(cxled)->dev.parent)) {
+ /* cxl_test_platform_root_decoder_contains(...) */
+ return false;
+ }
+
+ return real_platform_root_decoder_contains(cxlrd, cxled);
+}
+
+static bool real_platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
+{
+ const struct range *r = &cxld->hpa_range;
+ const struct resource *res = p->res;
+ int niw = cxld->interleave_ways;
+
+ if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
+ res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
+ IS_ALIGNED(range_len(r), niw * SZ_256M))
+ return true;
+
+ return false;
+}
+
+bool platform_region_contains(const struct cxl_region_params *p,
+ const struct cxl_decoder *cxld)
+{
+ struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+
+ if (is_mock_dev(port->uport_dev)) {
+ /* cxl_test_platform_region_contains(...) */
+ return false;
+ }
+
+ return real_platform_region_contains(p, cxld);
+}
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 8c69ce0a272f..f54a648d2268 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -459,7 +459,6 @@ static int populate_cedt(void)
return -ENOMEM;
window->base_hpa = res->range.start;
}
- set_mock_cfmws0_range_start(mock_cfmws[0]->base_hpa);
return 0;
}
Thread overview: 32+ messages
2025-03-14 11:36 [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
2025-03-14 11:36 ` [PATCH 1/4 v3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
2025-03-21 15:43 ` Dave Jiang
2025-03-14 11:36 ` [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86 Fabio M. De Francesco
2025-03-18 15:15 ` Ira Weiny
2025-03-21 10:21 ` Robert Richter
2025-03-26 16:47 ` Fabio M. De Francesco
2025-03-28 10:26 ` Robert Richter
2025-03-28 23:40 ` Dan Williams
2025-03-29 10:05 ` Fabio M. De Francesco
2025-03-14 11:36 ` [PATCH 3/4 v3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
2025-03-18 20:35 ` Ira Weiny
2025-03-21 10:29 ` Robert Richter
2025-03-14 11:36 ` [PATCH 4/4 v3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
2025-03-18 21:16 ` Ira Weiny
2025-03-21 10:42 ` Robert Richter
2025-03-26 16:58 ` Fabio M. De Francesco
2025-03-28 10:52 ` Robert Richter
2025-03-28 23:40 ` Dan Williams
2025-03-29 10:16 ` Fabio M. De Francesco
2025-03-29 22:01 ` Fabio M. De Francesco
2025-04-03 4:00 ` Dan Williams
2025-03-20 1:46 ` [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole Alison Schofield
2025-03-26 16:23 ` Fabio M. De Francesco
2025-03-20 18:10 ` Alison Schofield
2025-03-26 16:24 ` Fabio M. De Francesco
2025-03-21 10:34 ` Robert Richter
2025-03-25 16:13 ` Fabio M. De Francesco
2025-03-28 9:02 ` Robert Richter
2025-03-28 21:10 ` Dave Jiang
2025-04-02 11:51 ` Robert Richter
2025-04-02 15:31 ` Dave Jiang