public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole
@ 2024-11-22 15:51 Fabio M. De Francesco
  2024-11-22 15:51 ` [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Fabio M. De Francesco @ 2024-11-22 15:51 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Fabio M. De Francesco,
	Huang Ying, Yao Xingtao, Li Ming, linux-kernel, linux-cxl

The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
Physical Address (HPA) windows that are associated with each CXL Host
Bridge. Each window represents a contiguous HPA that may be interleaved
with one or more targets (CXL v3.1 - 9.18.1.3).

The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
memory to which systems cannot send transactions. On those systems, BIOS
publishes CFMWS which communicate the active System Physical Address (SPA)
ranges that map to a subset of the Host Physical Address (HPA) ranges. The
SPA range trims out the hole, and capacity in the endpoint is lost with no
SPA to map to CXL HPA in that hole.

In the early stages of CXL Regions construction and attach on platforms
with Low Memory Holes, the driver fails and returns an error because it
expects that the CXL Endpoint Decoder range is a subset of the Root
Decoder's.

Then detect SPA/HPA misalignment and allow CXL Regions construction and 
attach if and only if the misalignment is due to x86 Low Memory Holes.

- Patch 1/3 changes the calling conventions of three match_*_by_range()
  helpers in preparation of 2/3.
- Patch 2/3 detects x86 LMH and enables CXL Regions construction and
  attach by trimming HPA by SPA.
- Patch 3/3 simulates a LMH for running the CXL tests on patched driver.

Many thanks to Alison, Dan, and Ira for their help and for their reviews
of my RFC on Intel's internal ML.

Fabio M. De Francesco (3):
  cxl/core: Change match_*_by_range() calling convention
  cxl/core: Enable Region creation on x86 with Low Memory Hole
  cxl/test: Simulate an x86 Low Memory Hole for tests

 drivers/cxl/Kconfig          |  5 +++
 drivers/cxl/core/Makefile    |  1 +
 drivers/cxl/core/lmh.c       | 58 ++++++++++++++++++++++++++
 drivers/cxl/core/region.c    | 80 ++++++++++++++++++++++++++++--------
 drivers/cxl/cxl.h            | 32 +++++++++++++++
 tools/testing/cxl/Kbuild     |  1 +
 tools/testing/cxl/test/cxl.c |  4 +-
 7 files changed, 161 insertions(+), 20 deletions(-)
 create mode 100644 drivers/cxl/core/lmh.c

-- 
2.46.2


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention
  2024-11-22 15:51 [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
@ 2024-11-22 15:51 ` Fabio M. De Francesco
  2024-11-22 17:28   ` Ira Weiny
  2024-11-25 21:10   ` Alison Schofield
  2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 25+ messages in thread
From: Fabio M. De Francesco @ 2024-11-22 15:51 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Fabio M. De Francesco,
	Huang Ying, Yao Xingtao, Li Ming, linux-kernel, linux-cxl

In preparation for a patch that supports matching between Decoders that may
have larger HPA ranges than their corresponding Root Decoders or Regions,
change the calling convention of three match_*_by_range() to take CXL
Endpoint Decoders as their second arguments which later will be reused by
two new functions (arch_match_spa() and arch_match_region()).

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
 drivers/cxl/core/region.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index dff618c708dc..ac2c486c16e9 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1724,23 +1724,27 @@ static struct cxl_port *next_port(struct cxl_port *port)
 
 static int match_switch_decoder_by_range(struct device *dev, void *data)
 {
+	struct cxl_endpoint_decoder *cxled = data;
 	struct cxl_switch_decoder *cxlsd;
-	struct range *r1, *r2 = data;
+	struct range *r1, *r2;
 
 	if (!is_switch_decoder(dev))
 		return 0;
 
 	cxlsd = to_cxl_switch_decoder(dev);
 	r1 = &cxlsd->cxld.hpa_range;
+	r2 = &cxled->cxld.hpa_range;
 
 	if (is_root_decoder(dev))
 		return range_contains(r1, r2);
 	return (r1->start == r2->start && r1->end == r2->end);
 }
 
-static int find_pos_and_ways(struct cxl_port *port, struct range *range,
+static int find_pos_and_ways(struct cxl_port *port,
+			     struct cxl_endpoint_decoder *cxled,
 			     int *pos, int *ways)
 {
+	struct range *range = &cxled->cxld.hpa_range;
 	struct cxl_switch_decoder *cxlsd;
 	struct cxl_port *parent;
 	struct device *dev;
@@ -1750,7 +1754,7 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
 	if (!parent)
 		return rc;
 
-	dev = device_find_child(&parent->dev, range,
+	dev = device_find_child(&parent->dev, cxled,
 				match_switch_decoder_by_range);
 	if (!dev) {
 		dev_err(port->uport_dev,
@@ -1830,7 +1834,7 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
 		if (is_cxl_root(iter))
 			break;
 
-		rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
+		rc = find_pos_and_ways(iter, cxled, &parent_pos, &parent_ways);
 		if (rc)
 			return rc;
 
@@ -3184,22 +3188,26 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
 
 static int match_root_decoder_by_range(struct device *dev, void *data)
 {
-	struct range *r1, *r2 = data;
+	struct cxl_endpoint_decoder *cxled = data;
 	struct cxl_root_decoder *cxlrd;
+	struct range *r1, *r2;
 
 	if (!is_root_decoder(dev))
 		return 0;
 
 	cxlrd = to_cxl_root_decoder(dev);
 	r1 = &cxlrd->cxlsd.cxld.hpa_range;
+	r2 = &cxled->cxld.hpa_range;
+
 	return range_contains(r1, r2);
 }
 
 static int match_region_by_range(struct device *dev, void *data)
 {
+	struct cxl_endpoint_decoder *cxled = data;
+	struct range *r = &cxled->cxld.hpa_range;
 	struct cxl_region_params *p;
 	struct cxl_region *cxlr;
-	struct range *r = data;
 	int rc = 0;
 
 	if (!is_cxl_region(dev))
@@ -3303,7 +3311,6 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
-	struct range *hpa = &cxled->cxld.hpa_range;
 	struct cxl_decoder *cxld = &cxled->cxld;
 	struct device *cxlrd_dev, *region_dev;
 	struct cxl_root_decoder *cxlrd;
@@ -3312,7 +3319,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
 	bool attach = false;
 	int rc;
 
-	cxlrd_dev = device_find_child(&root->dev, &cxld->hpa_range,
+	cxlrd_dev = device_find_child(&root->dev, cxled,
 				      match_root_decoder_by_range);
 	if (!cxlrd_dev) {
 		dev_err(cxlmd->dev.parent,
@@ -3329,7 +3336,7 @@ int cxl_add_to_region(struct cxl_port *root, struct cxl_endpoint_decoder *cxled)
 	 * one does the construction and the others add to that.
 	 */
 	mutex_lock(&cxlrd->range_lock);
-	region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa,
+	region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, cxled,
 				       match_region_by_range);
 	if (!region_dev) {
 		cxlr = construct_region(cxlrd, cxled);
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-22 15:51 [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
  2024-11-22 15:51 ` [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
@ 2024-11-22 15:51 ` Fabio M. De Francesco
  2024-11-22 17:25   ` Ira Weiny
                     ` (6 more replies)
  2024-11-22 15:51 ` [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
                   ` (2 subsequent siblings)
  4 siblings, 7 replies; 25+ messages in thread
From: Fabio M. De Francesco @ 2024-11-22 15:51 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Fabio M. De Francesco,
	Huang Ying, Yao Xingtao, Li Ming, linux-kernel, linux-cxl

The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
Physical Address (HPA) windows that are associated with each CXL Host
Bridge. Each window represents a contiguous HPA that may be interleaved
with one or more targets (CXL v3.1 - 9.18.1.3).

The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
memory to which systems cannot send transactions. In some cases the size
of that hole is not compatible with the CXL hardware decoder constraint
that the size is always aligned to 256M * Interleave Ways.

On those systems, BIOS publishes CFMWS which communicate the active System
Physical Address (SPA) ranges that map to a subset of the Host Physical
Address (HPA) ranges. The SPA range trims out the hole, and capacity in
the endpoint is lost with no SPA to map to CXL HPA in that hole.

In the early stages of CXL Regions construction and attach on platforms
with Low Memory Holes, cxl_add_to_region() fails and returns an error
because it can't find any CXL Window that matches a given CXL Endpoint
Decoder.

Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
ranges: both must start at physical address 0 and end below 4 GB, while
the Root Decoder ranges end at lower addresses than the matching Endpoint
Decoders which instead must always respect the above-mentioned CXL hardware
decoders HPA alignment constraint.

Match misaligned CFMWS and CXL Regions with corresponding CXL Endpoint
Decoders if driver detects Low Memory Holes.

Construct CXL Regions with HPA range's end trimmed by matching SPA.

Allow the attach target process to complete by relaxing Decoder constraints
which would lead to failures.

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
 drivers/cxl/Kconfig       |  5 ++++
 drivers/cxl/core/Makefile |  1 +
 drivers/cxl/core/lmh.c    | 53 +++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++++++++-------
 drivers/cxl/cxl.h         | 25 ++++++++++++++++++
 5 files changed, 130 insertions(+), 9 deletions(-)
 create mode 100644 drivers/cxl/core/lmh.c

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 876469e23f7a..07b87f217e59 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -128,6 +128,11 @@ config CXL_REGION
 
 	  If unsure say 'y'
 
+config CXL_ARCH_LOW_MEMORY_HOLE
+	def_bool y
+	depends on CXL_REGION
+	depends on X86
+
 config CXL_REGION_INVALIDATION_TEST
 	bool "CXL: Region Cache Management Bypass (TEST)"
 	depends on CXL_REGION
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 9259bcc6773c..6e80215e8444 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -15,4 +15,5 @@ cxl_core-y += hdm.o
 cxl_core-y += pmu.o
 cxl_core-y += cdat.o
 cxl_core-$(CONFIG_TRACING) += trace.o
+cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
new file mode 100644
index 000000000000..da76b2a534ec
--- /dev/null
+++ b/drivers/cxl/core/lmh.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/range.h>
+#include "cxl.h"
+
+/* In x86 with memory hole, misaligned CFMWS range starts at 0x0 */
+#define MISALIGNED_CFMWS_RANGE_BASE 0x0
+
+/*
+ * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
+ *
+ * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
+ * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
+ * the given endpoint decoder HPA range size is always expected aligned and
+ * also larger than that of the matching root decoder
+ */
+bool arch_match_spa(struct cxl_root_decoder *cxlrd,
+		    struct cxl_endpoint_decoder *cxled)
+{
+	struct range *r1, *r2;
+	int niw;
+
+	r1 = &cxlrd->cxlsd.cxld.hpa_range;
+	r2 = &cxled->cxld.hpa_range;
+	niw = cxled->cxld.interleave_ways;
+
+	if (r1->start == MISALIGNED_CFMWS_RANGE_BASE &&
+	    r1->start == r2->start && r1->end < r2->end &&
+	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
+		return true;
+	return false;
+}
+
+/* Similar to arch_match_spa(), it matches regions and decoders */
+bool arch_match_region(struct cxl_region_params *p,
+		       struct cxl_decoder *cxld)
+{
+	struct range *r = &cxld->hpa_range;
+	struct resource *res = p->res;
+	int niw = cxld->interleave_ways;
+
+	if (res->start == MISALIGNED_CFMWS_RANGE_BASE &&
+	    res->start == r->start && res->end < r->end &&
+	    IS_ALIGNED(range_len(r), niw * SZ_256M))
+		return true;
+	return false;
+}
+
+void arch_trim_hpa_by_spa(struct resource *res,
+			  struct cxl_root_decoder *cxlrd)
+{
+	res->end = cxlrd->res->end;
+}
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index ac2c486c16e9..3cb5a69e9731 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -836,8 +836,12 @@ static int match_auto_decoder(struct device *dev, void *data)
 	cxld = to_cxl_decoder(dev);
 	r = &cxld->hpa_range;
 
-	if (p->res && p->res->start == r->start && p->res->end == r->end)
-		return 1;
+	if (p->res) {
+		if (p->res->start == r->start && p->res->end == r->end)
+			return 1;
+		if (arch_match_region(p, cxld))
+			return 1;
+	}
 
 	return 0;
 }
@@ -1414,7 +1418,8 @@ static int cxl_port_setup_targets(struct cxl_port *port,
 		if (cxld->interleave_ways != iw ||
 		    cxld->interleave_granularity != ig ||
 		    cxld->hpa_range.start != p->res->start ||
-		    cxld->hpa_range.end != p->res->end ||
+		    (cxld->hpa_range.end != p->res->end &&
+		     !arch_match_region(p, cxld)) ||
 		    ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
 			dev_err(&cxlr->dev,
 				"%s:%s %s expected iw: %d ig: %d %pr\n",
@@ -1726,6 +1731,7 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
 {
 	struct cxl_endpoint_decoder *cxled = data;
 	struct cxl_switch_decoder *cxlsd;
+	struct cxl_root_decoder *cxlrd;
 	struct range *r1, *r2;
 
 	if (!is_switch_decoder(dev))
@@ -1735,8 +1741,13 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
 	r1 = &cxlsd->cxld.hpa_range;
 	r2 = &cxled->cxld.hpa_range;
 
-	if (is_root_decoder(dev))
-		return range_contains(r1, r2);
+	if (is_root_decoder(dev)) {
+		if (range_contains(r1, r2))
+			return 1;
+		cxlrd = to_cxl_root_decoder(dev);
+		if (arch_match_spa(cxlrd, cxled))
+			return 1;
+	}
 	return (r1->start == r2->start && r1->end == r2->end);
 }
 
@@ -1943,7 +1954,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 	}
 
 	if (resource_size(cxled->dpa_res) * p->interleave_ways !=
-	    resource_size(p->res)) {
+	    resource_size(p->res) && !arch_match_spa(cxlrd, cxled)) {
 		dev_dbg(&cxlr->dev,
 			"%s:%s: decoder-size-%#llx * ways-%d != region-size-%#llx\n",
 			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
@@ -3199,7 +3210,13 @@ static int match_root_decoder_by_range(struct device *dev, void *data)
 	r1 = &cxlrd->cxlsd.cxld.hpa_range;
 	r2 = &cxled->cxld.hpa_range;
 
-	return range_contains(r1, r2);
+	if (range_contains(r1, r2))
+		return true;
+
+	if (arch_match_spa(cxlrd, cxled))
+		return true;
+
+	return false;
 }
 
 static int match_region_by_range(struct device *dev, void *data)
@@ -3217,8 +3234,12 @@ static int match_region_by_range(struct device *dev, void *data)
 	p = &cxlr->params;
 
 	down_read(&cxl_region_rwsem);
-	if (p->res && p->res->start == r->start && p->res->end == r->end)
-		rc = 1;
+	if (p->res) {
+		if (p->res->start == r->start && p->res->end == r->end)
+			rc = 1;
+		if (arch_match_region(p, &cxled->cxld))
+			rc = 1;
+	}
 	up_read(&cxl_region_rwsem);
 
 	return rc;
@@ -3270,6 +3291,22 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 
 	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
 				    dev_name(&cxlr->dev));
+
+	/*
+	 * Trim the HPA retrieved from hardware to fit the SPA mapped by the
+	 * platform
+	 */
+	if (arch_match_spa(cxlrd, cxled)) {
+		struct range *range = &cxlrd->cxlsd.cxld.hpa_range;
+
+		arch_trim_hpa_by_spa(res, cxlrd);
+		dev_dbg(cxlmd->dev.parent,
+			"%s: Trim HPA (%s: %pr) by SPA (%s: %pr)\n",
+			__func__,
+			dev_name(&cxlrd->cxlsd.cxld.dev), range,
+			dev_name(&cxled->cxld.dev), hpa);
+	}
+
 	rc = insert_resource(cxlrd->res, res);
 	if (rc) {
 		/*
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 5406e3ab3d4a..a5ad4499381e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -902,6 +902,31 @@ void cxl_coordinates_combine(struct access_coordinate *out,
 
 bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
 
+#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
+bool arch_match_spa(struct cxl_root_decoder *cxlrd,
+		    struct cxl_endpoint_decoder *cxled);
+bool arch_match_region(struct cxl_region_params *p,
+		       struct cxl_decoder *cxld);
+void arch_trim_hpa_by_spa(struct resource *res,
+			  struct cxl_root_decoder *cxlrd);
+#else
+bool arch_match_spa(struct cxl_root_decoder *cxlrd,
+		    struct cxl_endpoint_decoder *cxled)
+{
+	return false;
+}
+
+bool arch_match_region(struct cxl_region_params *p,
+		       struct cxl_decoder *cxld)
+{
+	return false;
+}
+
+void arch_trim_hpa_by_spa(struct resource *res,
+			  struct cxl_root_decoder *cxlrd)
+{ }
+#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests
  2024-11-22 15:51 [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
  2024-11-22 15:51 ` [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
  2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
@ 2024-11-22 15:51 ` Fabio M. De Francesco
  2024-11-22 17:26   ` Ira Weiny
                     ` (2 more replies)
  2024-11-22 19:46 ` [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Gregory Price
  2024-11-25 22:00 ` Alison Schofield
  4 siblings, 3 replies; 25+ messages in thread
From: Fabio M. De Francesco @ 2024-11-22 15:51 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Fabio M. De Francesco,
	Huang Ying, Yao Xingtao, Li Ming, linux-kernel, linux-cxl

Simulate an x86 Low Memory Hole for the CXL tests by changing
mock_cfmws[0] range size to 768MB and CXL Endpoint Decoder HPA range size
to 1GB and have get_cfmws_range_start() return two different addresses
which depend on whether the passed device is real or mock.

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
 drivers/cxl/core/lmh.c       | 21 +++++++++++++--------
 drivers/cxl/cxl.h            |  7 +++++++
 tools/testing/cxl/Kbuild     |  1 +
 tools/testing/cxl/test/cxl.c |  4 ++--
 4 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
index da76b2a534ec..350008324bdc 100644
--- a/drivers/cxl/core/lmh.c
+++ b/drivers/cxl/core/lmh.c
@@ -1,10 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include <linux/range.h>
+#include <linux/pci.h>
 #include "cxl.h"
 
-/* In x86 with memory hole, misaligned CFMWS range starts at 0x0 */
-#define MISALIGNED_CFMWS_RANGE_BASE 0x0
+u64 get_cfmws_range_start(struct device *dev)
+{
+	if (dev_is_pci(dev))
+		return MISALIGNED_CFMWS_RANGE_START;
+	return MISALIGNED_MOCK_CFMWS_RANGE_START;
+}
 
 /*
  * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
@@ -17,6 +22,7 @@
 bool arch_match_spa(struct cxl_root_decoder *cxlrd,
 		    struct cxl_endpoint_decoder *cxled)
 {
+	u64 cfmws_range_start = get_cfmws_range_start(&cxled->cxld.dev);
 	struct range *r1, *r2;
 	int niw;
 
@@ -24,9 +30,8 @@ bool arch_match_spa(struct cxl_root_decoder *cxlrd,
 	r2 = &cxled->cxld.hpa_range;
 	niw = cxled->cxld.interleave_ways;
 
-	if (r1->start == MISALIGNED_CFMWS_RANGE_BASE &&
-	    r1->start == r2->start && r1->end < r2->end &&
-	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
+	if (r1->start == cfmws_range_start && r1->start == r2->start &&
+	    r1->end < r2->end && IS_ALIGNED(range_len(r2), niw * SZ_256M))
 		return true;
 	return false;
 }
@@ -35,13 +40,13 @@ bool arch_match_spa(struct cxl_root_decoder *cxlrd,
 bool arch_match_region(struct cxl_region_params *p,
 		       struct cxl_decoder *cxld)
 {
+	u64 cfmws_range_start = get_cfmws_range_start(&cxld->dev);
 	struct range *r = &cxld->hpa_range;
 	struct resource *res = p->res;
 	int niw = cxld->interleave_ways;
 
-	if (res->start == MISALIGNED_CFMWS_RANGE_BASE &&
-	    res->start == r->start && res->end < r->end &&
-	    IS_ALIGNED(range_len(r), niw * SZ_256M))
+	if (res->start == cfmws_range_start && res->start == r->start &&
+	    res->end < r->end && IS_ALIGNED(range_len(r), niw * SZ_256M))
 		return true;
 	return false;
 }
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index a5ad4499381e..51dc80f8e50c 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -903,12 +903,19 @@ void cxl_coordinates_combine(struct access_coordinate *out,
 bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
 
 #ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
+
+/* Range start address of misaligned CFMWS in x86 with LMH */
+#define MISALIGNED_CFMWS_RANGE_START 0x0
+/* Range start address of mock misaligned CFMWS for tests */
+#define MISALIGNED_MOCK_CFMWS_RANGE_START 0xf010000000
+
 bool arch_match_spa(struct cxl_root_decoder *cxlrd,
 		    struct cxl_endpoint_decoder *cxled);
 bool arch_match_region(struct cxl_region_params *p,
 		       struct cxl_decoder *cxld);
 void arch_trim_hpa_by_spa(struct resource *res,
 			  struct cxl_root_decoder *cxlrd);
+u64 get_cfmws_range_start(struct device *dev);
 #else
 bool arch_match_spa(struct cxl_root_decoder *cxlrd,
 		    struct cxl_endpoint_decoder *cxled)
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index b1256fee3567..fe9c4480f758 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -62,6 +62,7 @@ cxl_core-y += $(CXL_CORE_SRC)/hdm.o
 cxl_core-y += $(CXL_CORE_SRC)/pmu.o
 cxl_core-y += $(CXL_CORE_SRC)/cdat.o
 cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o
+cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += $(CXL_CORE_SRC)/lmh.o
 cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o
 cxl_core-y += config_check.o
 cxl_core-y += cxl_core_test.o
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 050725afa45d..b61c3d78fed3 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -212,7 +212,7 @@ static struct {
 			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
 					ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
 			.qtg_id = FAKE_QTG_ID,
-			.window_size = SZ_256M * 4UL,
+			.window_size = SZ_256M * 3UL,
 		},
 		.target = { 0 },
 	},
@@ -744,7 +744,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
 	struct cxl_endpoint_decoder *cxled;
 	struct cxl_switch_decoder *cxlsd;
 	struct cxl_port *port, *iter;
-	const int size = SZ_512M;
+	const int size = SZ_1G;
 	struct cxl_memdev *cxlmd;
 	struct cxl_dport *dport;
 	struct device *dev;
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
@ 2024-11-22 17:25   ` Ira Weiny
  2024-11-25 11:23     ` Li Ming
  2024-11-25  8:41   ` kernel test robot
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Ira Weiny @ 2024-11-22 17:25 UTC (permalink / raw)
  To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
	Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Huang Ying, Yao Xingtao, Li Ming, linux-kernel,
	linux-cxl

Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
> 
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. In some cases the size
> of that hole is not compatible with the CXL hardware decoder constraint
> that the size is always aligned to 256M * Interleave Ways.
> 
> On those systems, BIOS publishes CFMWS which communicate the active System
> Physical Address (SPA) ranges that map to a subset of the Host Physical
> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> the endpoint is lost with no SPA to map to CXL HPA in that hole.
> 
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, cxl_add_to_region() fails and returns an error
> because it can't find any CXL Window that matches a given CXL Endpoint
> Decoder.
> 
> Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
> ranges: both must start at physical address 0 and end below 4 GB, while
> the Root Decoder ranges end at lower addresses than the matching Endpoint
> Decoders which instead must always respect the above-mentioned CXL hardware
> decoders HPA alignment constraint.
> 
> Match misaligned CFMWS and CXL Regions with corresponding CXL Endpoint
> Decoders if driver detects Low Memory Holes.
> 
> Construct CXL Regions with HPA range's end trimmed by matching SPA.
> 
> Allow the attach target process to complete by relaxing Decoder constraints
> which would lead to failures.
> 
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>

Over all this looks good but I wonder if there is a slight optimization
which could be done.  See below.

> ---
>  drivers/cxl/Kconfig       |  5 ++++
>  drivers/cxl/core/Makefile |  1 +
>  drivers/cxl/core/lmh.c    | 53 +++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++++++++-------
>  drivers/cxl/cxl.h         | 25 ++++++++++++++++++
>  5 files changed, 130 insertions(+), 9 deletions(-)
>  create mode 100644 drivers/cxl/core/lmh.c
> 
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 876469e23f7a..07b87f217e59 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -128,6 +128,11 @@ config CXL_REGION
>  
>  	  If unsure say 'y'
>  
> +config CXL_ARCH_LOW_MEMORY_HOLE
> +	def_bool y
> +	depends on CXL_REGION
> +	depends on X86
> +
>  config CXL_REGION_INVALIDATION_TEST
>  	bool "CXL: Region Cache Management Bypass (TEST)"
>  	depends on CXL_REGION
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 9259bcc6773c..6e80215e8444 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -15,4 +15,5 @@ cxl_core-y += hdm.o
>  cxl_core-y += pmu.o
>  cxl_core-y += cdat.o
>  cxl_core-$(CONFIG_TRACING) += trace.o
> +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
>  cxl_core-$(CONFIG_CXL_REGION) += region.o
> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> new file mode 100644
> index 000000000000..da76b2a534ec
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.c
> @@ -0,0 +1,53 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/range.h>
> +#include "cxl.h"
> +
> +/* In x86 with memory hole, misaligned CFMWS range starts at 0x0 */
> +#define MISALIGNED_CFMWS_RANGE_BASE 0x0
> +
> +/*
> + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> + *
> + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> + * the given endpoint decoder HPA range size is always expected aligned and
> + * also larger than that of the matching root decoder
> + */
> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +		    struct cxl_endpoint_decoder *cxled)
> +{
> +	struct range *r1, *r2;
> +	int niw;
> +
> +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> +	r2 = &cxled->cxld.hpa_range;
> +	niw = cxled->cxld.interleave_ways;
> +
> +	if (r1->start == MISALIGNED_CFMWS_RANGE_BASE &&
> +	    r1->start == r2->start && r1->end < r2->end &&

Should this be 'r1->end <= r2->end'?

And could that simplify the logic in match_switch_decoder_by_range()?  See
below...

> +	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
> +		return true;
> +	return false;
> +}
> +
> +/* Similar to arch_match_spa(), it matches regions and decoders */
> +bool arch_match_region(struct cxl_region_params *p,
> +		       struct cxl_decoder *cxld)
> +{
> +	struct range *r = &cxld->hpa_range;
> +	struct resource *res = p->res;
> +	int niw = cxld->interleave_ways;
> +
> +	if (res->start == MISALIGNED_CFMWS_RANGE_BASE &&
> +	    res->start == r->start && res->end < r->end &&
> +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> +		return true;
> +	return false;
> +}
> +
> +void arch_trim_hpa_by_spa(struct resource *res,
> +			  struct cxl_root_decoder *cxlrd)
> +{
> +	res->end = cxlrd->res->end;
> +}
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index ac2c486c16e9..3cb5a69e9731 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -836,8 +836,12 @@ static int match_auto_decoder(struct device *dev, void *data)
>  	cxld = to_cxl_decoder(dev);
>  	r = &cxld->hpa_range;
>  
> -	if (p->res && p->res->start == r->start && p->res->end == r->end)
> -		return 1;
> +	if (p->res) {
> +		if (p->res->start == r->start && p->res->end == r->end)
> +			return 1;
> +		if (arch_match_region(p, cxld))
> +			return 1;
> +	}
>  
>  	return 0;
>  }
> @@ -1414,7 +1418,8 @@ static int cxl_port_setup_targets(struct cxl_port *port,
>  		if (cxld->interleave_ways != iw ||
>  		    cxld->interleave_granularity != ig ||
>  		    cxld->hpa_range.start != p->res->start ||
> -		    cxld->hpa_range.end != p->res->end ||
> +		    (cxld->hpa_range.end != p->res->end &&
> +		     !arch_match_region(p, cxld)) ||
>  		    ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
>  			dev_err(&cxlr->dev,
>  				"%s:%s %s expected iw: %d ig: %d %pr\n",
> @@ -1726,6 +1731,7 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>  {
>  	struct cxl_endpoint_decoder *cxled = data;
>  	struct cxl_switch_decoder *cxlsd;
> +	struct cxl_root_decoder *cxlrd;
>  	struct range *r1, *r2;
>  
>  	if (!is_switch_decoder(dev))
> @@ -1735,8 +1741,13 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>  	r1 = &cxlsd->cxld.hpa_range;
>  	r2 = &cxled->cxld.hpa_range;
>  
> -	if (is_root_decoder(dev))
> -		return range_contains(r1, r2);
> +	if (is_root_decoder(dev)) {
> +		if (range_contains(r1, r2))
> +			return 1;
> +		cxlrd = to_cxl_root_decoder(dev);
> +		if (arch_match_spa(cxlrd, cxled))
> +			return 1;
> +	}

If the logic in arch_match_spa() was changed to allow for an equal end
value could this block be simplified to?

	if (is_root_decoder(dev)) {
		cxlrd = to_cxl_root_decoder(dev);
		if (arch_match_spa(cxlrd, cxled))
			return 1;
	}

>  	return (r1->start == r2->start && r1->end == r2->end);
>  }
>  
> @@ -1943,7 +1954,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	}
>  
>  	if (resource_size(cxled->dpa_res) * p->interleave_ways !=
> -	    resource_size(p->res)) {
> +	    resource_size(p->res) && !arch_match_spa(cxlrd, cxled)) {
>  		dev_dbg(&cxlr->dev,
>  			"%s:%s: decoder-size-%#llx * ways-%d != region-size-%#llx\n",
>  			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> @@ -3199,7 +3210,13 @@ static int match_root_decoder_by_range(struct device *dev, void *data)
>  	r1 = &cxlrd->cxlsd.cxld.hpa_range;
>  	r2 = &cxled->cxld.hpa_range;
>  
> -	return range_contains(r1, r2);
> +	if (range_contains(r1, r2))
> +		return true;
> +
> +	if (arch_match_spa(cxlrd, cxled))
> +		return true;

Same question/simplification here?

Ira

> +
> +	return false;
>  }
>  
>  static int match_region_by_range(struct device *dev, void *data)
> @@ -3217,8 +3234,12 @@ static int match_region_by_range(struct device *dev, void *data)
>  	p = &cxlr->params;
>  
>  	down_read(&cxl_region_rwsem);
> -	if (p->res && p->res->start == r->start && p->res->end == r->end)
> -		rc = 1;
> +	if (p->res) {
> +		if (p->res->start == r->start && p->res->end == r->end)
> +			rc = 1;
> +		if (arch_match_region(p, &cxled->cxld))
> +			rc = 1;
> +	}
>  	up_read(&cxl_region_rwsem);
>  
>  	return rc;
> @@ -3270,6 +3291,22 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  
>  	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
>  				    dev_name(&cxlr->dev));
> +
> +	/*
> +	 * Trim the HPA retrieved from hardware to fit the SPA mapped by the
> +	 * platform
> +	 */
> +	if (arch_match_spa(cxlrd, cxled)) {
> +		struct range *range = &cxlrd->cxlsd.cxld.hpa_range;
> +
> +		arch_trim_hpa_by_spa(res, cxlrd);
> +		dev_dbg(cxlmd->dev.parent,
> +			"%s: Trim HPA (%s: %pr) by SPA (%s: %pr)\n",
> +			__func__,
> +			dev_name(&cxlrd->cxlsd.cxld.dev), range,
> +			dev_name(&cxled->cxld.dev), hpa);
> +	}
> +
>  	rc = insert_resource(cxlrd->res, res);
>  	if (rc) {
>  		/*
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5406e3ab3d4a..a5ad4499381e 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -902,6 +902,31 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>  
>  bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>  
> +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +		    struct cxl_endpoint_decoder *cxled);
> +bool arch_match_region(struct cxl_region_params *p,
> +		       struct cxl_decoder *cxld);
> +void arch_trim_hpa_by_spa(struct resource *res,
> +			  struct cxl_root_decoder *cxlrd);
> +#else
> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +		    struct cxl_endpoint_decoder *cxled)
> +{
> +	return false;
> +}
> +
> +bool arch_match_region(struct cxl_region_params *p,
> +		       struct cxl_decoder *cxld)
> +{
> +	return false;
> +}
> +
> +void arch_trim_hpa_by_spa(struct resource *res,
> +			  struct cxl_root_decoder *cxlrd)
> +{ }
> +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
> +
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
>   * of these symbols in tools/testing/cxl/.
> -- 
> 2.46.2
> 



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests
  2024-11-22 15:51 ` [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
@ 2024-11-22 17:26   ` Ira Weiny
  2024-11-25 20:46   ` Alison Schofield
  2024-11-26 15:00   ` Jonathan Cameron
  2 siblings, 0 replies; 25+ messages in thread
From: Ira Weiny @ 2024-11-22 17:26 UTC (permalink / raw)
  To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
	Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Huang Ying, Yao Xingtao, Li Ming, linux-kernel,
	linux-cxl

Fabio M. De Francesco wrote:
> Simulate an x86 Low Memory Hole for the CXL tests by changing
> mock_cfmws[0] range size to 768MB and CXL Endpoint Decoder HPA range size
> to 1GB and have get_cfmws_range_start() return two different addresses
> which depend on whether the passed device is real or mock.
> 
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>

Nice!

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention
  2024-11-22 15:51 ` [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
@ 2024-11-22 17:28   ` Ira Weiny
  2024-11-25 21:10   ` Alison Schofield
  1 sibling, 0 replies; 25+ messages in thread
From: Ira Weiny @ 2024-11-22 17:28 UTC (permalink / raw)
  To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
	Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Huang Ying, Yao Xingtao, Li Ming, linux-kernel,
	linux-cxl

Fabio M. De Francesco wrote:
> In preparation for a patch that supports matching between Decoders that may
> have larger HPA ranges than their corresponding Root Decoders or Regions,
> change the calling convention of three match_*_by_range() to take CXL
> Endpoint Decoders as their second arguments which later will be reused by
> two new functions (arch_match_spa() and arch_match_region()).
> 
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

[snip]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole
  2024-11-22 15:51 [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
                   ` (2 preceding siblings ...)
  2024-11-22 15:51 ` [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
@ 2024-11-22 19:46 ` Gregory Price
  2024-11-25 22:00 ` Alison Schofield
  4 siblings, 0 replies; 25+ messages in thread
From: Gregory Price @ 2024-11-22 19:46 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Huang Ying, Yao Xingtao,
	Li Ming, linux-kernel, linux-cxl, rrichter, terry.bowman

On Fri, Nov 22, 2024 at 04:51:51PM +0100, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
> 
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. On those systems, BIOS
> publishes CFMWS which communicate the active System Physical Address (SPA)
> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> SPA range trims out the hole, and capacity in the endpoint is lost with no
> SPA to map to CXL HPA in that hole.
> 
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, the driver fails and returns an error because it
> expects that the CXL Endpoint Decoder range is a subset of the Root
> Decoder's.
> 
> Then detect SPA/HPA misalignment and allow CXL Regions construction and 
> attach if and only if the misalignment is due to x86 Low Memory Holes.

+cc Robert Richter and Terry Bowman

This is not the only memory-hole possibility. We may need something
more robust, rather than optimizing for a single memory hole solution.

Robert and Terry may have some additional context here.

~Gregory

> 
> - Patch 1/3 changes the calling conventions of three match_*_by_range()
>   helpers in preparation of 2/3.
> - Patch 2/3 detects x86 LMH and enables CXL Regions construction and
>   attach by trimming HPA by SPA.
> - Patch 3/3 simulates a LMH for running the CXL tests on patched driver.
> 
> Many thanks to Alison, Dan, and Ira for their help and for their reviews
> of my RFC on Intel's internal ML.
> 
> Fabio M. De Francesco (3):
>   cxl/core: Change match_*_by_range() calling convention
>   cxl/core: Enable Region creation on x86 with Low Memory Hole
>   cxl/test: Simulate an x86 Low Memory Hole for tests
> 
>  drivers/cxl/Kconfig          |  5 +++
>  drivers/cxl/core/Makefile    |  1 +
>  drivers/cxl/core/lmh.c       | 58 ++++++++++++++++++++++++++
>  drivers/cxl/core/region.c    | 80 ++++++++++++++++++++++++++++--------
>  drivers/cxl/cxl.h            | 32 +++++++++++++++
>  tools/testing/cxl/Kbuild     |  1 +
>  tools/testing/cxl/test/cxl.c |  4 +-
>  7 files changed, 161 insertions(+), 20 deletions(-)
>  create mode 100644 drivers/cxl/core/lmh.c
> 
> -- 
> 2.46.2
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
  2024-11-22 17:25   ` Ira Weiny
@ 2024-11-25  8:41   ` kernel test robot
  2024-11-25  8:42   ` Li Ming
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2024-11-25  8:41 UTC (permalink / raw)
  To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
	Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Huang Ying, Yao Xingtao, Li Ming, linux-kernel,
	linux-cxl
  Cc: oe-kbuild-all

Hi Fabio,

kernel test robot noticed the following build warnings:

[auto build test WARNING on cxl/next]
[also build test WARNING on linus/master cxl/pending v6.12 next-20241125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Fabio-M-De-Francesco/cxl-core-Change-match_-_by_range-calling-convention/20241125-102754
base:   https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git next
patch link:    https://lore.kernel.org/r/20241122155226.2068287-3-fabio.m.de.francesco%40linux.intel.com
patch subject: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
config: i386-buildonly-randconfig-001-20241125 (https://download.01.org/0day-ci/archive/20241125/202411251632.4eenIlnF-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241125/202411251632.4eenIlnF-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411251632.4eenIlnF-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from drivers/cxl/cxlmem.h:12,
                    from drivers/cxl/port.c:7:
>> drivers/cxl/cxl.h:921:6: warning: no previous prototype for 'arch_match_spa' [-Wmissing-prototypes]
     921 | bool arch_match_spa(struct cxl_root_decoder *cxlrd,
         |      ^~~~~~~~~~~~~~
>> drivers/cxl/cxl.h:927:6: warning: no previous prototype for 'arch_match_region' [-Wmissing-prototypes]
     927 | bool arch_match_region(struct cxl_region_params *p,
         |      ^~~~~~~~~~~~~~~~~
>> drivers/cxl/cxl.h:933:6: warning: no previous prototype for 'arch_trim_hpa_by_spa' [-Wmissing-prototypes]
     933 | void arch_trim_hpa_by_spa(struct resource *res,
         |      ^~~~~~~~~~~~~~~~~~~~

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for GET_FREE_REGION
   Depends on [n]: SPARSEMEM [=n]
   Selected by [y]:
   - RESOURCE_KUNIT_TEST [=y] && RUNTIME_TESTING_MENU [=y] && KUNIT [=y]


vim +/arch_match_spa +921 drivers/cxl/cxl.h

   912	
   913	#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
   914	bool arch_match_spa(struct cxl_root_decoder *cxlrd,
   915			    struct cxl_endpoint_decoder *cxled);
   916	bool arch_match_region(struct cxl_region_params *p,
   917			       struct cxl_decoder *cxld);
   918	void arch_trim_hpa_by_spa(struct resource *res,
   919				  struct cxl_root_decoder *cxlrd);
   920	#else
 > 921	bool arch_match_spa(struct cxl_root_decoder *cxlrd,
   922			    struct cxl_endpoint_decoder *cxled)
   923	{
   924		return false;
   925	}
   926	
 > 927	bool arch_match_region(struct cxl_region_params *p,
   928			       struct cxl_decoder *cxld)
   929	{
   930		return false;
   931	}
   932	
 > 933	void arch_trim_hpa_by_spa(struct resource *res,
   934				  struct cxl_root_decoder *cxlrd)
   935	{ }
   936	#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
   937	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
  2024-11-22 17:25   ` Ira Weiny
  2024-11-25  8:41   ` kernel test robot
@ 2024-11-25  8:42   ` Li Ming
  2024-11-25 17:22     ` Fabio M. De Francesco
  2024-11-25 13:37   ` kernel test robot
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Li Ming @ 2024-11-25  8:42 UTC (permalink / raw)
  To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
	Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Huang Ying, Yao Xingtao, Li Ming, linux-kernel,
	linux-cxl



On 11/22/2024 11:51 PM, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
> 
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. In some cases the size
> of that hole is not compatible with the CXL hardware decoder constraint
> that the size is always aligned to 256M * Interleave Ways.
> 
> On those systems, BIOS publishes CFMWS which communicate the active System
> Physical Address (SPA) ranges that map to a subset of the Host Physical
> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> the endpoint is lost with no SPA to map to CXL HPA in that hole.
> 
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, cxl_add_to_region() fails and returns an error
> because it can't find any CXL Window that matches a given CXL Endpoint
> Decoder.
> 
> Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
> ranges: both must start at physical address 0 and end below 4 GB, while
> the Root Decoder ranges end at lower addresses than the matching Endpoint
> Decoders which instead must always respect the above-mentioned CXL hardware
> decoders HPA alignment constraint.

Hi Fabio,

Here mentions that the end address must be below 4GB, but I don't find any checking about that in the implementation, is it not needed?

[snip]


> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index ac2c486c16e9..3cb5a69e9731 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -836,8 +836,12 @@ static int match_auto_decoder(struct device *dev, void *data)
>  	cxld = to_cxl_decoder(dev);
>  	r = &cxld->hpa_range;
>  
> -	if (p->res && p->res->start == r->start && p->res->end == r->end)
> -		return 1;
> +	if (p->res) {
> +		if (p->res->start == r->start && p->res->end == r->end)
> +			return 1;
> +		if (arch_match_region(p, cxld))
> +			return 1;
> +	}

I think that it is better to check LMH cases before checking (p->res->start == r->start && p->res->end == r->end).
Per the changelog and the implementation of arch_match_region(), below case should fails but current checking will succeed:
the value of 'p->res->start' is MISALIGNED_CFMWS_RANGE_BASE and the the value of 'p->res->end' is equals to the value of 'r->end'.


>  
>  	return 0;
>  }
> @@ -1414,7 +1418,8 @@ static int cxl_port_setup_targets(struct cxl_port *port,
>  		if (cxld->interleave_ways != iw ||
>  		    cxld->interleave_granularity != ig ||
>  		    cxld->hpa_range.start != p->res->start ||
> -		    cxld->hpa_range.end != p->res->end ||
> +		    (cxld->hpa_range.end != p->res->end &&
> +		     !arch_match_region(p, cxld)) ||
>  		    ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
>  			dev_err(&cxlr->dev,
>  				"%s:%s %s expected iw: %d ig: %d %pr\n",
> @@ -1726,6 +1731,7 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>  {
>  	struct cxl_endpoint_decoder *cxled = data;
>  	struct cxl_switch_decoder *cxlsd;
> +	struct cxl_root_decoder *cxlrd;
>  	struct range *r1, *r2;
>  
>  	if (!is_switch_decoder(dev))
> @@ -1735,8 +1741,13 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>  	r1 = &cxlsd->cxld.hpa_range;
>  	r2 = &cxled->cxld.hpa_range;
>  
> -	if (is_root_decoder(dev))
> -		return range_contains(r1, r2);
> +	if (is_root_decoder(dev)) {
> +		if (range_contains(r1, r2))
> +			return 1;
> +		cxlrd = to_cxl_root_decoder(dev);
> +		if (arch_match_spa(cxlrd, cxled))
> +			return 1;

Same as above, what will happen if the root decoder's address range still contains endpoint decoder's address range in LMH cases? should fails or succeed?

> +	}
>  	return (r1->start == r2->start && r1->end == r2->end);
>  }
>  
> @@ -1943,7 +1954,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	}
>  
>  	if (resource_size(cxled->dpa_res) * p->interleave_ways !=
> -	    resource_size(p->res)) {
> +	    resource_size(p->res) && !arch_match_spa(cxlrd, cxled)) {
>  		dev_dbg(&cxlr->dev,
>  			"%s:%s: decoder-size-%#llx * ways-%d != region-size-%#llx\n",
>  			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> @@ -3199,7 +3210,13 @@ static int match_root_decoder_by_range(struct device *dev, void *data)
>  	r1 = &cxlrd->cxlsd.cxld.hpa_range;
>  	r2 = &cxled->cxld.hpa_range;
>  
> -	return range_contains(r1, r2);
> +	if (range_contains(r1, r2))
> +		return true;
> +
> +	if (arch_match_spa(cxlrd, cxled))
> +		return true;
> +
> +	return false;

Same as above.

>  }
>  
>  static int match_region_by_range(struct device *dev, void *data)
> @@ -3217,8 +3234,12 @@ static int match_region_by_range(struct device *dev, void *data)
>  	p = &cxlr->params;
>  
>  	down_read(&cxl_region_rwsem);
> -	if (p->res && p->res->start == r->start && p->res->end == r->end)
> -		rc = 1;
> +	if (p->res) {
> +		if (p->res->start == r->start && p->res->end == r->end)
> +			rc = 1;
> +		if (arch_match_region(p, &cxled->cxld))
> +			rc = 1;
> +	}

Same as above.

Thanks
Ming


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-22 17:25   ` Ira Weiny
@ 2024-11-25 11:23     ` Li Ming
  0 siblings, 0 replies; 25+ messages in thread
From: Li Ming @ 2024-11-25 11:23 UTC (permalink / raw)
  To: Ira Weiny, Fabio M. De Francesco, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
	Dan Williams, Huang Ying, Yao Xingtao, Li Ming, linux-kernel,
	linux-cxl



On 11/23/2024 1:25 AM, Ira Weiny wrote:
> Fabio M. De Francesco wrote:
>> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
>> Physical Address (HPA) windows that are associated with each CXL Host
>> Bridge. Each window represents a contiguous HPA that may be interleaved
>> with one or more targets (CXL v3.1 - 9.18.1.3).
>>
>> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
>> memory to which systems cannot send transactions. In some cases the size
>> of that hole is not compatible with the CXL hardware decoder constraint
>> that the size is always aligned to 256M * Interleave Ways.
>>
>> On those systems, BIOS publishes CFMWS which communicate the active System
>> Physical Address (SPA) ranges that map to a subset of the Host Physical
>> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
>> the endpoint is lost with no SPA to map to CXL HPA in that hole.
>>
>> In the early stages of CXL Regions construction and attach on platforms
>> with Low Memory Holes, cxl_add_to_region() fails and returns an error
>> because it can't find any CXL Window that matches a given CXL Endpoint
>> Decoder.
>>
>> Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
>> ranges: both must start at physical address 0 and end below 4 GB, while
>> the Root Decoder ranges end at lower addresses than the matching Endpoint
>> Decoders which instead must always respect the above-mentioned CXL hardware
>> decoders HPA alignment constraint.
>>
>> Match misaligned CFMWS and CXL Regions with corresponding CXL Endpoint
>> Decoders if driver detects Low Memory Holes.
>>
>> Construct CXL Regions with HPA range's end trimmed by matching SPA.
>>
>> Allow the attach target process to complete by relaxing Decoder constraints
>> which would lead to failures.
>>
>> Cc: Alison Schofield <alison.schofield@intel.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> 
> Over all this looks good but I wonder if there is a slight optimization
> which could be done.  See below.
> 
>> ---
>>  drivers/cxl/Kconfig       |  5 ++++
>>  drivers/cxl/core/Makefile |  1 +
>>  drivers/cxl/core/lmh.c    | 53 +++++++++++++++++++++++++++++++++++++
>>  drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++++++++-------
>>  drivers/cxl/cxl.h         | 25 ++++++++++++++++++
>>  5 files changed, 130 insertions(+), 9 deletions(-)
>>  create mode 100644 drivers/cxl/core/lmh.c
>>
>> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
>> index 876469e23f7a..07b87f217e59 100644
>> --- a/drivers/cxl/Kconfig
>> +++ b/drivers/cxl/Kconfig
>> @@ -128,6 +128,11 @@ config CXL_REGION
>>  
>>  	  If unsure say 'y'
>>  
>> +config CXL_ARCH_LOW_MEMORY_HOLE
>> +	def_bool y
>> +	depends on CXL_REGION
>> +	depends on X86
>> +
>>  config CXL_REGION_INVALIDATION_TEST
>>  	bool "CXL: Region Cache Management Bypass (TEST)"
>>  	depends on CXL_REGION
>> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
>> index 9259bcc6773c..6e80215e8444 100644
>> --- a/drivers/cxl/core/Makefile
>> +++ b/drivers/cxl/core/Makefile
>> @@ -15,4 +15,5 @@ cxl_core-y += hdm.o
>>  cxl_core-y += pmu.o
>>  cxl_core-y += cdat.o
>>  cxl_core-$(CONFIG_TRACING) += trace.o
>> +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
>>  cxl_core-$(CONFIG_CXL_REGION) += region.o
>> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
>> new file mode 100644
>> index 000000000000..da76b2a534ec
>> --- /dev/null
>> +++ b/drivers/cxl/core/lmh.c
>> @@ -0,0 +1,53 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +
>> +#include <linux/range.h>
>> +#include "cxl.h"
>> +
>> +/* In x86 with memory hole, misaligned CFMWS range starts at 0x0 */
>> +#define MISALIGNED_CFMWS_RANGE_BASE 0x0
>> +
>> +/*
>> + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
>> + *
>> + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
>> + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
>> + * the given endpoint decoder HPA range size is always expected aligned and
>> + * also larger than that of the matching root decoder
>> + */
>> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
>> +		    struct cxl_endpoint_decoder *cxled)
>> +{
>> +	struct range *r1, *r2;
>> +	int niw;
>> +
>> +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
>> +	r2 = &cxled->cxld.hpa_range;
>> +	niw = cxled->cxld.interleave_ways;
>> +
>> +	if (r1->start == MISALIGNED_CFMWS_RANGE_BASE &&
>> +	    r1->start == r2->start && r1->end < r2->end &&
> 
> Should this be 'r1->end <= r2->end'?
> 

Hi Ira,

I think that cannot just simply use 'r1->end <= r2->end' to instead of 'r1->end < r2->end' here, because there is a 'r1->start == MISALIGNED_CFMWS_RANGE_BASE' checking which means the 'if' checking is only for LMH cases. 
But maybe can put both LMH cases checkings and non-LMH cases checkings into one function.

Thanks
Ming

[snip]



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
                     ` (2 preceding siblings ...)
  2024-11-25  8:42   ` Li Ming
@ 2024-11-25 13:37   ` kernel test robot
  2024-11-25 20:35   ` Alison Schofield
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2024-11-25 13:37 UTC (permalink / raw)
  To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
	Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Huang Ying, Yao Xingtao, Li Ming, linux-kernel,
	linux-cxl
  Cc: oe-kbuild-all

Hi Fabio,

kernel test robot noticed the following build errors:

[auto build test ERROR on cxl/next]
[also build test ERROR on linus/master cxl/pending v6.12 next-20241125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Fabio-M-De-Francesco/cxl-core-Change-match_-_by_range-calling-convention/20241125-102754
base:   https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git next
patch link:    https://lore.kernel.org/r/20241122155226.2068287-3-fabio.m.de.francesco%40linux.intel.com
patch subject: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
config: s390-allyesconfig (https://download.01.org/0day-ci/archive/20241125/202411252126.T7xKY88q-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241125/202411252126.T7xKY88q-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411252126.T7xKY88q-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   In file included from drivers/dax/cxl.c:6:
>> drivers/dax/../cxl/cxl.h:921:6: warning: no previous prototype for 'arch_match_spa' [-Wmissing-prototypes]
     921 | bool arch_match_spa(struct cxl_root_decoder *cxlrd,
         |      ^~~~~~~~~~~~~~
>> drivers/dax/../cxl/cxl.h:927:6: warning: no previous prototype for 'arch_match_region' [-Wmissing-prototypes]
     927 | bool arch_match_region(struct cxl_region_params *p,
         |      ^~~~~~~~~~~~~~~~~
>> drivers/dax/../cxl/cxl.h:933:6: warning: no previous prototype for 'arch_trim_hpa_by_spa' [-Wmissing-prototypes]
     933 | void arch_trim_hpa_by_spa(struct resource *res,
         |      ^~~~~~~~~~~~~~~~~~~~
--
   s390-linux-ld: drivers/cxl/core/port.o: in function `arch_match_spa':
>> port.c:(.text+0x87c0): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/port.o: in function `arch_match_region':
>> port.c:(.text+0x8840): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/port.o: in function `arch_trim_hpa_by_spa':
>> port.c:(.text+0x88c0): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/pmem.o: in function `arch_match_spa':
   pmem.c:(.text+0xe00): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/pmem.o: in function `arch_match_region':
   pmem.c:(.text+0xe80): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/pmem.o: in function `arch_trim_hpa_by_spa':
   pmem.c:(.text+0xf00): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/regs.o: in function `arch_match_spa':
   regs.c:(.text+0x1740): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/regs.o: in function `arch_match_region':
   regs.c:(.text+0x17c0): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/regs.o: in function `arch_trim_hpa_by_spa':
   regs.c:(.text+0x1840): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/memdev.o: in function `arch_match_spa':
   memdev.c:(.text+0x3ec0): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/memdev.o: in function `arch_match_region':
   memdev.c:(.text+0x3f40): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/memdev.o: in function `arch_trim_hpa_by_spa':
   memdev.c:(.text+0x3fc0): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/mbox.o: in function `arch_match_spa':
   mbox.c:(.text+0x4f80): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/mbox.o: in function `arch_match_region':
   mbox.c:(.text+0x5000): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/mbox.o: in function `arch_trim_hpa_by_spa':
   mbox.c:(.text+0x5080): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/pci.o: in function `arch_match_spa':
   pci.c:(.text+0x3e40): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/pci.o: in function `arch_match_region':
   pci.c:(.text+0x3ec0): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/pci.o: in function `arch_trim_hpa_by_spa':
   pci.c:(.text+0x3f40): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/hdm.o: in function `arch_match_spa':
   hdm.c:(.text+0x4880): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/hdm.o: in function `arch_match_region':
   hdm.c:(.text+0x4900): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/hdm.o: in function `arch_trim_hpa_by_spa':
   hdm.c:(.text+0x4980): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/pmu.o: in function `arch_match_spa':
   pmu.c:(.text+0x340): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/pmu.o: in function `arch_match_region':
   pmu.c:(.text+0x3c0): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/pmu.o: in function `arch_trim_hpa_by_spa':
   pmu.c:(.text+0x440): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/cdat.o: in function `arch_match_spa':
   cdat.c:(.text+0x19c0): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/cdat.o: in function `arch_match_region':
   cdat.c:(.text+0x1a40): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/cdat.o: in function `arch_trim_hpa_by_spa':
   cdat.c:(.text+0x1ac0): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/trace.o: in function `arch_match_spa':
   trace.c:(.text+0x8ec0): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/trace.o: in function `arch_match_region':
   trace.c:(.text+0x8f40): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/trace.o: in function `arch_trim_hpa_by_spa':
   trace.c:(.text+0x8fc0): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/core/region.o: in function `arch_match_spa':
   region.c:(.text+0xca00): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/core/region.o: in function `arch_match_region':
   region.c:(.text+0xce00): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/core/region.o: in function `arch_trim_hpa_by_spa':
   region.c:(.text+0x11a80): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/pci.o: in function `arch_match_spa':
   pci.c:(.text+0x4cc0): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/pci.o: in function `arch_match_region':
   pci.c:(.text+0x4d40): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/pci.o: in function `arch_trim_hpa_by_spa':
   pci.c:(.text+0x4dc0): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/mem.o: in function `arch_match_spa':
   mem.c:(.text+0xe40): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/mem.o: in function `arch_match_region':
   mem.c:(.text+0xec0): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/mem.o: in function `arch_trim_hpa_by_spa':
   mem.c:(.text+0xf40): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/pmem.o: in function `arch_match_spa':
   pmem.c:(.text+0x18c0): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/pmem.o: in function `arch_match_region':
   pmem.c:(.text+0x1940): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/pmem.o: in function `arch_trim_hpa_by_spa':
   pmem.c:(.text+0x19c0): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/security.o: in function `arch_match_spa':
   security.c:(.text+0xb00): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/security.o: in function `arch_match_region':
   security.c:(.text+0xb80): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/security.o: in function `arch_trim_hpa_by_spa':
   security.c:(.text+0xc00): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/cxl/port.o: in function `arch_match_spa':
   port.c:(.text+0x840): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/cxl/port.o: in function `arch_match_region':
   port.c:(.text+0x8c0): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/cxl/port.o: in function `arch_trim_hpa_by_spa':
   port.c:(.text+0x940): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here
   s390-linux-ld: drivers/perf/cxl_pmu.o: in function `arch_match_spa':
   cxl_pmu.c:(.text+0x3380): multiple definition of `arch_match_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x200): first defined here
   s390-linux-ld: drivers/perf/cxl_pmu.o: in function `arch_match_region':
   cxl_pmu.c:(.text+0x3400): multiple definition of `arch_match_region'; drivers/dax/cxl.o:cxl.c:(.text+0x280): first defined here
   s390-linux-ld: drivers/perf/cxl_pmu.o: in function `arch_trim_hpa_by_spa':
   cxl_pmu.c:(.text+0x3480): multiple definition of `arch_trim_hpa_by_spa'; drivers/dax/cxl.o:cxl.c:(.text+0x300): first defined here

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-25  8:42   ` Li Ming
@ 2024-11-25 17:22     ` Fabio M. De Francesco
  2024-11-26  2:13       ` Li Ming
  0 siblings, 1 reply; 25+ messages in thread
From: Fabio M. De Francesco @ 2024-11-25 17:22 UTC (permalink / raw)
  To: Li Ming, Li Ming
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Huang Ying, Yao Xingtao,
	linux-kernel, linux-cxl

On Monday, November 25, 2024 9:42:56 AM GMT+1 Li Ming wrote:
> 
> On 11/22/2024 11:51 PM, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> > 
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. In some cases the size
> > of that hole is not compatible with the CXL hardware decoder constraint
> > that the size is always aligned to 256M * Interleave Ways.
> > 
> > On those systems, BIOS publishes CFMWS which communicate the active System
> > Physical Address (SPA) ranges that map to a subset of the Host Physical
> > Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> > the endpoint is lost with no SPA to map to CXL HPA in that hole.
> > 
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, cxl_add_to_region() fails and returns an error
> > because it can't find any CXL Window that matches a given CXL Endpoint
> > Decoder.
> > 
> > Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
> > ranges: both must start at physical address 0 and end below 4 GB, while
> > the Root Decoder ranges end at lower addresses than the matching Endpoint
> > Decoders which instead must always respect the above-mentioned CXL hardware
> > decoders HPA alignment constraint.
> 
> Hi Fabio,
> 
> Here mentions that the end address must be below 4GB, but I don't find any checking about that in the implementation, is it not needed?
> 
Hi Ming,

Good question, thanks!

While a first version of arch_match_spa() had a check of 'r2->end < SZ_4G',
I dropped it for the final one. It ended up out of sync with the commit message.

I think that we don't want that check. I'll rework the commit message for v2.

If the hardware decoders HPA ranges extended beyond the end of Low 
Memory, the LMH would still need to be detected and the decoders capacity 
would still need to be trimmed to match their corresponding CFMWS range end. 
>
> [snip]
> 
> 
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index ac2c486c16e9..3cb5a69e9731 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -836,8 +836,12 @@ static int match_auto_decoder(struct device *dev, void *data)
> >  	cxld = to_cxl_decoder(dev);
> >  	r = &cxld->hpa_range;
> >  
> > -	if (p->res && p->res->start == r->start && p->res->end == r->end)
> > -		return 1;
> > +	if (p->res) {
> > +		if (p->res->start == r->start && p->res->end == r->end)
> > +			return 1;
> > +		if (arch_match_region(p, cxld))
> > +			return 1;
> > +	}
> 
> I think that it is better to check LMH cases before checking (p->res->start == r->start && p->res->end == r->end).
> Per the changelog and the implementation of arch_match_region(), below case should fails but current checking will succeed:
> the value of 'p->res->start' is MISALIGNED_CFMWS_RANGE_BASE and the the value of 'p->res->end' is equals to the value of 'r->end'.
> 
I think that the expected "normal" case should always come first. In the expected
scenarios the driver deals either with SPA range == HPA range 
(e.g, in match_auto_decoder()) or with SPA range that fully contains the HPA range
(match_decoder_by_range()). 

If either one of those two cases holds, arch_match_*() helper doesn't need to be
called and the match must succeed. 

Besides, other architectures are free to define holes with many constraints that 
the driver doesn't want to check first if the expected case is already met.
> 
> >  
> >  	return 0;
> >  }
> > @@ -1414,7 +1418,8 @@ static int cxl_port_setup_targets(struct cxl_port *port,
> >  		if (cxld->interleave_ways != iw ||
> >  		    cxld->interleave_granularity != ig ||
> >  		    cxld->hpa_range.start != p->res->start ||
> > -		    cxld->hpa_range.end != p->res->end ||
> > +		    (cxld->hpa_range.end != p->res->end &&
> > +		     !arch_match_region(p, cxld)) ||
> >  		    ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
> >  			dev_err(&cxlr->dev,
> >  				"%s:%s %s expected iw: %d ig: %d %pr\n",
> > @@ -1726,6 +1731,7 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
> >  {
> >  	struct cxl_endpoint_decoder *cxled = data;
> >  	struct cxl_switch_decoder *cxlsd;
> > +	struct cxl_root_decoder *cxlrd;
> >  	struct range *r1, *r2;
> >  
> >  	if (!is_switch_decoder(dev))
> > @@ -1735,8 +1741,13 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
> >  	r1 = &cxlsd->cxld.hpa_range;
> >  	r2 = &cxled->cxld.hpa_range;
> >  
> > -	if (is_root_decoder(dev))
> > -		return range_contains(r1, r2);
> > +	if (is_root_decoder(dev)) {
> > +		if (range_contains(r1, r2))
> > +			return 1;
> > +		cxlrd = to_cxl_root_decoder(dev);
> > +		if (arch_match_spa(cxlrd, cxled))
> > +			return 1;
> 
> Same as above, what will happen if the root decoder's address range still contains endpoint decoder's address range in LMH cases? should fails or succeed?
> 
If r1 contains r2, there is no LMH and the driver is dealing with the regular, 
expected, case. It must succeed.

Think of the arch_match_*() helpers like something that avoid unwanted
failures if arch permits exceptions. Before returning errors when the "normal"
tests fail, check if the arch allows any exceptions and make the driver
ignore that "strange" SPA/HPA misalignment.
>
> [snip]
>
Thanks,

Fabio






^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
                     ` (3 preceding siblings ...)
  2024-11-25 13:37   ` kernel test robot
@ 2024-11-25 20:35   ` Alison Schofield
  2024-11-25 22:44   ` kernel test robot
  2024-12-16 21:30   ` Robert Richter
  6 siblings, 0 replies; 25+ messages in thread
From: Alison Schofield @ 2024-11-25 20:35 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, Dan Williams, Huang Ying, Yao Xingtao, Li Ming,
	linux-kernel, linux-cxl

On Fri, Nov 22, 2024 at 04:51:53PM +0100, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
> 
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. In some cases the size
> of that hole is not compatible with the CXL hardware decoder constraint
> that the size is always aligned to 256M * Interleave Ways.
> 
> On those systems, BIOS publishes CFMWS which communicate the active System
> Physical Address (SPA) ranges that map to a subset of the Host Physical
> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> the endpoint is lost with no SPA to map to CXL HPA in that hole.
> 
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, cxl_add_to_region() fails and returns an error
> because it can't find any CXL Window that matches a given CXL Endpoint
> Decoder.
> 
> Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
> ranges: both must start at physical address 0 and end below 4 GB, while
> the Root Decoder ranges end at lower addresses than the matching Endpoint
> Decoders which instead must always respect the above-mentioned CXL hardware
> decoders HPA alignment constraint.
> 
> Match misaligned CFMWS and CXL Regions with corresponding CXL Endpoint
> Decoders if driver detects Low Memory Holes.
> 
> Construct CXL Regions with HPA range's end trimmed by matching SPA.
> 
> Allow the attach target process to complete by relaxing Decoder constraints
> which would lead to failures.

This may be crisper in 2 patches so that we can separate
a) introducing a method to handle arch specific constraints in region creation 
from
b) use method a) to handle the x86 LMH specific constraint.

Sometimes with an add like this, we'll say your the first, so go for it,
and the next one will have to generic-ize. This time, we are aware that
other holes may be lurking, so perhaps we can driver that generic soln
from the start.

snip
> 
> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c

snip

> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +		    struct cxl_endpoint_decoder *cxled)
> +{
> +	struct range *r1, *r2;
> +	int niw;
> +
> +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> +	r2 = &cxled->cxld.hpa_range;
> +	niw = cxled->cxld.interleave_ways;
> +
> +	if (r1->start == MISALIGNED_CFMWS_RANGE_BASE &&
> +	    r1->start == r2->start && r1->end < r2->end &&
> +	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
> +		return true;

space before final return please

> +	return false;
> +}
> +
> +/* Similar to arch_match_spa(), it matches regions and decoders */
> +bool arch_match_region(struct cxl_region_params *p,
> +		       struct cxl_decoder *cxld)
> +{
> +	struct range *r = &cxld->hpa_range;
> +	struct resource *res = p->res;
> +	int niw = cxld->interleave_ways;
> +
> +	if (res->start == MISALIGNED_CFMWS_RANGE_BASE &&
> +	    res->start == r->start && res->end < r->end &&
> +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> +		return true;

space before final return please

> +	return false;
> +}
> +
> +void arch_trim_hpa_by_spa(struct resource *res,
> +			  struct cxl_root_decoder *cxlrd)
> +{
> +	res->end = cxlrd->res->end;
> +}

I'm not so sure about the above function name.

trim_by_spa implies to me that the hpa is trimmed by the amount of the spa.
The hpa is trimmed to align with the spa and that is special to this LMH
case.

Would something completely generic suffice here, like:
arch_adjust_region_resource()


> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c

snip

> @@ -3270,6 +3291,22 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  
>  	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
>  				    dev_name(&cxlr->dev));
> +
> +	/*
> +	 * Trim the HPA retrieved from hardware to fit the SPA mapped by the
> +	 * platform
> +	 */
> +	if (arch_match_spa(cxlrd, cxled)) {
> +		struct range *range = &cxlrd->cxlsd.cxld.hpa_range;
> +
> +		arch_trim_hpa_by_spa(res, cxlrd);
> +		dev_dbg(cxlmd->dev.parent,
> +			"%s: Trim HPA (%s: %pr) by SPA (%s: %pr)\n",
> +			__func__,
> +			dev_name(&cxlrd->cxlsd.cxld.dev), range,
> +			dev_name(&cxled->cxld.dev), hpa);
> +	}

no need for __func__ as the dev_dbg() includes that.

%pr works for struct resource, like the HPA, but not struct range.
I'm guessing you'll v2 after v6.13-rc1 is out and then you can
use the new struct range print format Ira added.

When I tried this out I spew'd more. I'm not suggesting you print all
I printed but maybe find a hint in here of what someone debugging will
want to know. Maybe some of the debug should emit from the the lmh.c
function so that it can be more LMH special>

[] cxl_core:construct_region:3302: cxl region0: LMH: res was [mem 0xf010000000-0xf04fffffff flags 0x200]
[] cxl_core:construct_region:3306: cxl region0: LMH: res now trimmed to [mem 0xf010000000-0xf03fffffff flags 0x200]
[] cxl_core:construct_region:3307: cxl region0: LMH: res now matches root decoder decoder0.0 [0xf010000000 - 0xf03fffffff]
[] cxl_core:construct_region:3309: cxl region0: LMH: res does not match endpoint decoder decoder15.0 [0xf010000000 - 0xf04fffffff]

I wonder if we are overdoing the HPA SPA language to the point of confusion
(maybe just me). By this point we've decided the root decoder range is good
so we are trimming the region resource (typically derived from the endpoint
decoder range) to match the root decoder range.

snip

> +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +		    struct cxl_endpoint_decoder *cxled);
> +bool arch_match_region(struct cxl_region_params *p,
> +		       struct cxl_decoder *cxld);
> +void arch_trim_hpa_by_spa(struct resource *res,
> +			  struct cxl_root_decoder *cxlrd);
> +#else
> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +		    struct cxl_endpoint_decoder *cxled)
> +{
> +	return false;
> +}
> +
> +bool arch_match_region(struct cxl_region_params *p,
> +		       struct cxl_decoder *cxld)
> +{
> +	return false;
> +}
> +
> +void arch_trim_hpa_by_spa(struct resource *res,
> +			  struct cxl_root_decoder *cxlrd)
> +{ }
> +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */

Some longish lines are needlessly split.
git clang-format can help with that.

-- Alison

snip to end.
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests
  2024-11-22 15:51 ` [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
  2024-11-22 17:26   ` Ira Weiny
@ 2024-11-25 20:46   ` Alison Schofield
  2024-11-26 15:00   ` Jonathan Cameron
  2 siblings, 0 replies; 25+ messages in thread
From: Alison Schofield @ 2024-11-25 20:46 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, Dan Williams, Huang Ying, Yao Xingtao, Li Ming,
	linux-kernel, linux-cxl

On Fri, Nov 22, 2024 at 04:51:54PM +0100, Fabio M. De Francesco wrote:
> Simulate an x86 Low Memory Hole for the CXL tests by changing
> mock_cfmws[0] range size to 768MB and CXL Endpoint Decoder HPA range size
> to 1GB and have get_cfmws_range_start() return two different addresses
> which depend on whether the passed device is real or mock.

How about adding:

Since the auto-created region of cxl-test uses mock_cfmws[0], the
LMH path in the CXL Driver will be exercised every time the cxl-test
module is loaded. Executing unit test: cxl-topology.sh, confirms the
region created succesfully with a LMH.

> 
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/cxl/core/lmh.c       | 21 +++++++++++++--------
>  drivers/cxl/cxl.h            |  7 +++++++
>  tools/testing/cxl/Kbuild     |  1 +
>  tools/testing/cxl/test/cxl.c |  4 ++--
>  4 files changed, 23 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> index da76b2a534ec..350008324bdc 100644
> --- a/drivers/cxl/core/lmh.c
> +++ b/drivers/cxl/core/lmh.c
> @@ -1,10 +1,15 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  
>  #include <linux/range.h>
> +#include <linux/pci.h>
>  #include "cxl.h"
>  
> -/* In x86 with memory hole, misaligned CFMWS range starts at 0x0 */
> -#define MISALIGNED_CFMWS_RANGE_BASE 0x0
> +u64 get_cfmws_range_start(struct device *dev)

Can this func be static, and then keep the #defines here, in lmh.c,
rather than move to cxl.h ?


> +{
> +	if (dev_is_pci(dev))
> +		return MISALIGNED_CFMWS_RANGE_START;

space before final return please

> +	return MISALIGNED_MOCK_CFMWS_RANGE_START;
> +}
>  

snip


> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index a5ad4499381e..51dc80f8e50c 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -903,12 +903,19 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>  bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>  
>  #ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> +
> +/* Range start address of misaligned CFMWS in x86 with LMH */
> +#define MISALIGNED_CFMWS_RANGE_START 0x0
> +/* Range start address of mock misaligned CFMWS for tests */
> +#define MISALIGNED_MOCK_CFMWS_RANGE_START 0xf010000000
> +

As noted above, wondering why these need to be in cxl.h

Why 'MISALIGNED_ and not 'LMH_' , especially if you can move to lmh.c.

Is it guaranteed that MOCK_CFMWS_RANGE_START is at 0xf010000000?

-- Alison


snip to end


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention
  2024-11-22 15:51 ` [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
  2024-11-22 17:28   ` Ira Weiny
@ 2024-11-25 21:10   ` Alison Schofield
  1 sibling, 0 replies; 25+ messages in thread
From: Alison Schofield @ 2024-11-25 21:10 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, Dan Williams, Huang Ying, Yao Xingtao, Li Ming,
	linux-kernel, linux-cxl

On Fri, Nov 22, 2024 at 04:51:52PM +0100, Fabio M. De Francesco wrote:
> In preparation for a patch that supports matching between Decoders that may
> have larger HPA ranges than their corresponding Root Decoders or Regions,
> change the calling convention of three match_*_by_range() to take CXL
> Endpoint Decoders as their second arguments which later will be reused by
> two new functions (arch_match_spa() and arch_match_region()).

Suggest not telling the future, esp future func names.
Here's a suggestion:

Replace struct range parameter with struct cxl_endpoint_decoder of
which range is a member in the match_*_by_range() functions.

This is in preparation for expanding these helpers to perform arch
specific region matching that requires a cxl_endpoint_decoder.

No functional change. (right ?)


Code looks good!

Reviewed-by: Alison Schofield <alison.schofield@intel.com>

snip to end
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole
  2024-11-22 15:51 [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
                   ` (3 preceding siblings ...)
  2024-11-22 19:46 ` [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Gregory Price
@ 2024-11-25 22:00 ` Alison Schofield
  2024-12-03 18:23   ` Fabio M. De Francesco
  4 siblings, 1 reply; 25+ messages in thread
From: Alison Schofield @ 2024-11-25 22:00 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, Dan Williams, Huang Ying, Yao Xingtao, Li Ming,
	linux-kernel, linux-cxl

On Fri, Nov 22, 2024 at 04:51:51PM +0100, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
> 
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. On those systems, BIOS
> publishes CFMWS which communicate the active System Physical Address (SPA)
> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> SPA range trims out the hole, and capacity in the endpoint is lost with no
> SPA to map to CXL HPA in that hole.
> 
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, the driver fails and returns an error because it
> expects that the CXL Endpoint Decoder range is a subset of the Root
> Decoder's.
> 
> Then detect SPA/HPA misalignment and allow CXL Regions construction and 
> attach if and only if the misalignment is due to x86 Low Memory Holes.
> 

Hi Fabio,

I took this for a test drive in cxl-test - thanks for that patch!

Here's a couple of observations on what users will see. Just stirring
the pot here, not knowing if there is, or even needs to be an explanation
to userspace about LMH.

1) Users will see that the endpoint decoders intend to map more than the
root decoder. Users may question their trimmed region size.

2) At least in this example, I don't think users can re-create this
region in place, ie via hotplug.  Once this region is destroyed, we
default to creating a smaller, aligned region, in its place.

cxl-cli output is appended showing the auto created region, it's decoders,
and then the creation of a user requested region, not exactly in its
place.


Upon load of cxl-test:

# cxl list -r region0 --decoders -u
[
  {
    "root decoders":[
      {
        "decoder":"decoder0.0",
        "resource":"0xf010000000",
        "size":"768.00 MiB (805.31 MB)",
        "interleave_ways":1,
        "max_available_extent":0,
        "volatile_capable":true,
        "qos_class":42,
        "nr_targets":1
      }
    ]
  },
  {
    "port decoders":[
      {
        "decoder":"decoder1.0",
        "resource":"0xf010000000",
        "size":"1024.00 MiB (1073.74 MB)",
        "interleave_ways":1,
        "region":"region0",
        "nr_targets":1
      },
      {
        "decoder":"decoder6.0",
        "resource":"0xf010000000",
        "size":"1024.00 MiB (1073.74 MB)",
        "interleave_ways":2,
        "interleave_granularity":4096,
        "region":"region0",
        "nr_targets":2
      }
    ]
  },
  {
    "endpoint decoders":[
      {
        "decoder":"decoder10.0",
        "resource":"0xf010000000",
        "size":"1024.00 MiB (1073.74 MB)",
        "interleave_ways":2,
        "interleave_granularity":4096,
        "region":"region0",
        "dpa_resource":"0",
        "dpa_size":"512.00 MiB (536.87 MB)",
        "mode":"ram"
      },
      {
        "decoder":"decoder13.0",
        "resource":"0xf010000000",
        "size":"1024.00 MiB (1073.74 MB)",
        "interleave_ways":2,
        "interleave_granularity":4096,
        "region":"region0",
        "dpa_resource":"0",
        "dpa_size":"512.00 MiB (536.87 MB)",
        "mode":"ram"
      }
    ]
  }
]

After destroying the auto region, root decoder show the 768MiB available:

# cxl list -d decoder0.0 -u
{
  "decoder":"decoder0.0",
  "resource":"0xf010000000",
  "size":"768.00 MiB (805.31 MB)",
  "interleave_ways":1,
  "max_available_extent":"768.00 MiB (805.31 MB)",
  "volatile_capable":true,
  "qos_class":42,
  "nr_targets":1
}


# cxl create-region -d decoder0.0 -m mem5 mem4
{
  "region":"region0",
  "resource":"0xf010000000",
  "size":"512.00 MiB (536.87 MB)",
  "type":"ram",
  "interleave_ways":2,
  "interleave_granularity":256,
  "decode_state":"commit",

snip

# cxl list -r region0 --decoders -u
[
  {
    "root decoders":[
      {
        "decoder":"decoder0.0",
        "resource":"0xf010000000",
        "size":"768.00 MiB (805.31 MB)",
        "interleave_ways":1,
        "max_available_extent":"256.00 MiB (268.44 MB)",
        "volatile_capable":true,
        "qos_class":42,
        "nr_targets":1
      }
    ]
  },
  {
    "port decoders":[
      {
        "decoder":"decoder1.0",
        "resource":"0xf010000000",
        "size":"512.00 MiB (536.87 MB)",
        "interleave_ways":1,
        "region":"region0",
        "nr_targets":1
      },
      {
        "decoder":"decoder6.0",
        "resource":"0xf010000000",
        "size":"512.00 MiB (536.87 MB)",
        "interleave_ways":2,
        "interleave_granularity":256,
        "region":"region0",
        "nr_targets":2
      }
    ]
  },
  {
    "endpoint decoders":[
      {
        "decoder":"decoder10.0",
        "resource":"0xf010000000",
        "size":"512.00 MiB (536.87 MB)",
        "interleave_ways":2,
        "interleave_granularity":256,
        "region":"region0",
        "dpa_resource":"0",
        "dpa_size":"256.00 MiB (268.44 MB)",
        "mode":"ram"
      },
      {
        "decoder":"decoder13.0",
        "resource":"0xf010000000",
        "size":"512.00 MiB (536.87 MB)",
        "interleave_ways":2,
        "interleave_granularity":256,
        "region":"region0",
        "dpa_resource":"0",
        "dpa_size":"256.00 MiB (268.44 MB)",
        "mode":"ram"
      }
    ]
  }
]


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
                     ` (4 preceding siblings ...)
  2024-11-25 20:35   ` Alison Schofield
@ 2024-11-25 22:44   ` kernel test robot
  2024-12-16 21:30   ` Robert Richter
  6 siblings, 0 replies; 25+ messages in thread
From: kernel test robot @ 2024-11-25 22:44 UTC (permalink / raw)
  To: Fabio M. De Francesco, Davidlohr Bueso, Jonathan Cameron,
	Dave Jiang, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Huang Ying, Yao Xingtao, Li Ming, linux-kernel,
	linux-cxl
  Cc: oe-kbuild-all

Hi Fabio,

kernel test robot noticed the following build errors:

[auto build test ERROR on cxl/next]
[also build test ERROR on linus/master cxl/pending v6.12 next-20241125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Fabio-M-De-Francesco/cxl-core-Change-match_-_by_range-calling-convention/20241125-102754
base:   https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git next
patch link:    https://lore.kernel.org/r/20241122155226.2068287-3-fabio.m.de.francesco%40linux.intel.com
patch subject: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
config: i386-buildonly-randconfig-001-20241125 (https://download.01.org/0day-ci/archive/20241126/202411260633.Oq6ps76M-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241126/202411260633.Oq6ps76M-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411260633.Oq6ps76M-lkp@intel.com/

All errors (new ones prefixed by >>):

   ld: drivers/cxl/core/pmem.o: in function `arch_match_spa':
>> drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/core/pmem.o: in function `arch_match_region':
>> drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/core/pmem.o: in function `arch_trim_hpa_by_spa':
>> drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/cxl/core/regs.o: in function `arch_match_spa':
>> drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/core/regs.o: in function `arch_match_region':
>> drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/core/regs.o: in function `arch_trim_hpa_by_spa':
>> drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/cxl/core/memdev.o: in function `arch_match_spa':
>> drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/core/memdev.o: in function `arch_match_region':
>> drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/core/memdev.o: in function `arch_trim_hpa_by_spa':
>> drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/cxl/core/mbox.o: in function `arch_match_spa':
>> drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/core/mbox.o: in function `arch_match_region':
>> drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/core/mbox.o: in function `arch_trim_hpa_by_spa':
>> drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/cxl/core/pci.o: in function `arch_match_spa':
>> drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/core/pci.o: in function `arch_match_region':
>> drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/core/pci.o: in function `arch_trim_hpa_by_spa':
>> drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/cxl/core/hdm.o: in function `arch_match_spa':
>> drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/core/hdm.o: in function `arch_match_region':
>> drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/core/hdm.o: in function `arch_trim_hpa_by_spa':
>> drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/cxl/core/pmu.o: in function `arch_match_spa':
>> drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/core/pmu.o: in function `arch_match_region':
>> drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/core/pmu.o: in function `arch_trim_hpa_by_spa':
   drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/cxl/core/cdat.o: in function `arch_match_spa':
   drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/core/cdat.o: in function `arch_match_region':
   drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/core/cdat.o: in function `arch_trim_hpa_by_spa':
   drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/cxl/core/trace.o: in function `arch_match_spa':
   drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/core/trace.o: in function `arch_match_region':
   drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/core/trace.o: in function `arch_trim_hpa_by_spa':
   drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/cxl/port.o: in function `arch_match_spa':
   drivers/cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/cxl/port.o: in function `arch_match_region':
   drivers/cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/cxl/port.o: in function `arch_trim_hpa_by_spa':
   drivers/cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here
   ld: drivers/perf/cxl_pmu.o: in function `arch_match_spa':
   drivers/perf/../cxl/cxl.h:923: multiple definition of `arch_match_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:923: first defined here
   ld: drivers/perf/cxl_pmu.o: in function `arch_match_region':
   drivers/perf/../cxl/cxl.h:927: multiple definition of `arch_match_region'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:927: first defined here
   ld: drivers/perf/cxl_pmu.o: in function `arch_trim_hpa_by_spa':
   drivers/perf/../cxl/cxl.h:935: multiple definition of `arch_trim_hpa_by_spa'; drivers/cxl/core/port.o:drivers/cxl/cxl.h:935: first defined here

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for GET_FREE_REGION
   Depends on [n]: SPARSEMEM [=n]
   Selected by [y]:
   - RESOURCE_KUNIT_TEST [=y] && RUNTIME_TESTING_MENU [=y] && KUNIT [=y]


vim +923 drivers/cxl/cxl.h

   912	
   913	#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
   914	bool arch_match_spa(struct cxl_root_decoder *cxlrd,
   915			    struct cxl_endpoint_decoder *cxled);
   916	bool arch_match_region(struct cxl_region_params *p,
   917			       struct cxl_decoder *cxld);
   918	void arch_trim_hpa_by_spa(struct resource *res,
   919				  struct cxl_root_decoder *cxlrd);
   920	#else
   921	bool arch_match_spa(struct cxl_root_decoder *cxlrd,
   922			    struct cxl_endpoint_decoder *cxled)
 > 923	{
   924		return false;
   925	}
   926	
 > 927	bool arch_match_region(struct cxl_region_params *p,
   928			       struct cxl_decoder *cxld)
   929	{
   930		return false;
   931	}
   932	
   933	void arch_trim_hpa_by_spa(struct resource *res,
   934				  struct cxl_root_decoder *cxlrd)
 > 935	{ }
   936	#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
   937	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-25 17:22     ` Fabio M. De Francesco
@ 2024-11-26  2:13       ` Li Ming
  0 siblings, 0 replies; 25+ messages in thread
From: Li Ming @ 2024-11-26  2:13 UTC (permalink / raw)
  To: Fabio M. De Francesco, Li Ming
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Huang Ying, Yao Xingtao,
	linux-kernel, linux-cxl



On 11/26/2024 1:22 AM, Fabio M. De Francesco wrote:
> On Monday, November 25, 2024 9:42:56 AM GMT+1 Li Ming wrote:
>>
>> On 11/22/2024 11:51 PM, Fabio M. De Francesco wrote:
>>> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
>>> Physical Address (HPA) windows that are associated with each CXL Host
>>> Bridge. Each window represents a contiguous HPA that may be interleaved
>>> with one or more targets (CXL v3.1 - 9.18.1.3).
>>>
>>> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
>>> memory to which systems cannot send transactions. In some cases the size
>>> of that hole is not compatible with the CXL hardware decoder constraint
>>> that the size is always aligned to 256M * Interleave Ways.
>>>
>>> On those systems, BIOS publishes CFMWS which communicate the active System
>>> Physical Address (SPA) ranges that map to a subset of the Host Physical
>>> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
>>> the endpoint is lost with no SPA to map to CXL HPA in that hole.
>>>
>>> In the early stages of CXL Regions construction and attach on platforms
>>> with Low Memory Holes, cxl_add_to_region() fails and returns an error
>>> because it can't find any CXL Window that matches a given CXL Endpoint
>>> Decoder.
>>>
>>> Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
>>> ranges: both must start at physical address 0 and end below 4 GB, while
>>> the Root Decoder ranges end at lower addresses than the matching Endpoint
>>> Decoders which instead must always respect the above-mentioned CXL hardware
>>> decoders HPA alignment constraint.
>>
>> Hi Fabio,
>>
>> Here mentions that the end address must be below 4GB, but I don't find any checking about that in the implementation, is it not needed?
>>
> Hi Ming,
> 
> Good question, thanks!
> 
> While a first version of arch_match_spa() had a check of 'r2->end < SZ_4G',
> I dropped it for the final one. It ended up out of sync with the commit message.
> 
> I think that we don't want that check. I'll rework the commit message for v2.
> 
> If the hardware decoders HPA ranges extended beyond the end of Low 
> Memory, the LMH would still need to be detected and the decoders capacity 
> would still need to be trimmed to match their corresponding CFMWS range end. 
>>
>> [snip]
>>
>>
>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>> index ac2c486c16e9..3cb5a69e9731 100644
>>> --- a/drivers/cxl/core/region.c
>>> +++ b/drivers/cxl/core/region.c
>>> @@ -836,8 +836,12 @@ static int match_auto_decoder(struct device *dev, void *data)
>>>  	cxld = to_cxl_decoder(dev);
>>>  	r = &cxld->hpa_range;
>>>  
>>> -	if (p->res && p->res->start == r->start && p->res->end == r->end)
>>> -		return 1;
>>> +	if (p->res) {
>>> +		if (p->res->start == r->start && p->res->end == r->end)
>>> +			return 1;
>>> +		if (arch_match_region(p, cxld))
>>> +			return 1;
>>> +	}
>>
>> I think that it is better to check LMH cases before checking (p->res->start == r->start && p->res->end == r->end).
>> Per the changelog and the implementation of arch_match_region(), below case should fails but current checking will succeed:
>> the value of 'p->res->start' is MISALIGNED_CFMWS_RANGE_BASE and the the value of 'p->res->end' is equals to the value of 'r->end'.
>>
> I think that the expected "normal" case should always come first. In the expected
> scenarios the driver deals either with SPA range == HPA range 
> (e.g, in match_auto_decoder()) or with SPA range that fully contains the HPA range
> (match_decoder_by_range()). 
> 
> If either one of those two cases holds, arch_match_*() helper doesn't need to be
> called and the match must succeed. 
> 
> Besides, other architectures are free to define holes with many constraints that 
> the driver doesn't want to check first if the expected case is already met.
>>
>>>  
>>>  	return 0;
>>>  }
>>> @@ -1414,7 +1418,8 @@ static int cxl_port_setup_targets(struct cxl_port *port,
>>>  		if (cxld->interleave_ways != iw ||
>>>  		    cxld->interleave_granularity != ig ||
>>>  		    cxld->hpa_range.start != p->res->start ||
>>> -		    cxld->hpa_range.end != p->res->end ||
>>> +		    (cxld->hpa_range.end != p->res->end &&
>>> +		     !arch_match_region(p, cxld)) ||
>>>  		    ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
>>>  			dev_err(&cxlr->dev,
>>>  				"%s:%s %s expected iw: %d ig: %d %pr\n",
>>> @@ -1726,6 +1731,7 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>>>  {
>>>  	struct cxl_endpoint_decoder *cxled = data;
>>>  	struct cxl_switch_decoder *cxlsd;
>>> +	struct cxl_root_decoder *cxlrd;
>>>  	struct range *r1, *r2;
>>>  
>>>  	if (!is_switch_decoder(dev))
>>> @@ -1735,8 +1741,13 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>>>  	r1 = &cxlsd->cxld.hpa_range;
>>>  	r2 = &cxled->cxld.hpa_range;
>>>  
>>> -	if (is_root_decoder(dev))
>>> -		return range_contains(r1, r2);
>>> +	if (is_root_decoder(dev)) {
>>> +		if (range_contains(r1, r2))
>>> +			return 1;
>>> +		cxlrd = to_cxl_root_decoder(dev);
>>> +		if (arch_match_spa(cxlrd, cxled))
>>> +			return 1;
>>
>> Same as above, what will happen if the root decoder's address range still contains endpoint decoder's address range in LMH cases? should fails or succeed?
>>
> If r1 contains r2, there is no LMH and the driver is dealing with the regular, 
> expected, case. It must succeed.
> 
> Think of the arch_match_*() helpers like something that avoid unwanted
> failures if arch permits exceptions. Before returning errors when the "normal"
> tests fail, check if the arch allows any exceptions and make the driver
> ignore that "strange" SPA/HPA misalignment.
>>
>> [snip]
>>
> Thanks,
> 
> Fabio
> 
> 

Understand it, thanks for your explanation.

Ming

> 
> 
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests
  2024-11-22 15:51 ` [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
  2024-11-22 17:26   ` Ira Weiny
  2024-11-25 20:46   ` Alison Schofield
@ 2024-11-26 15:00   ` Jonathan Cameron
  2 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2024-11-26 15:00 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Huang Ying, Yao Xingtao, Li Ming,
	linux-kernel, linux-cxl

On Fri, 22 Nov 2024 16:51:54 +0100
"Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> wrote:

> Simulate an x86 Low Memory Hole for the CXL tests by changing
> mock_cfmws[0] range size to 768MB and CXL Endpoint Decoder HPA range size
> to 1GB and have get_cfmws_range_start() return two different addresses
> which depend on whether the passed device is real or mock.
> 
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>

> diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
> index 050725afa45d..b61c3d78fed3 100644
> --- a/tools/testing/cxl/test/cxl.c
> +++ b/tools/testing/cxl/test/cxl.c
> @@ -212,7 +212,7 @@ static struct {
>  			.restrictions = ACPI_CEDT_CFMWS_RESTRICT_TYPE3 |
>  					ACPI_CEDT_CFMWS_RESTRICT_VOLATILE,
>  			.qtg_id = FAKE_QTG_ID,
> -			.window_size = SZ_256M * 4UL,
> +			.window_size = SZ_256M * 3UL,
This is changing the test whether or not we have ARCH_LOW_MEMORY_HOLE.
Doesn't that result in failures on arm64 etc?

Jonathan

>  		},
>  		.target = { 0 },
>  	},
> @@ -744,7 +744,7 @@ static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
>  	struct cxl_endpoint_decoder *cxled;
>  	struct cxl_switch_decoder *cxlsd;
>  	struct cxl_port *port, *iter;
> -	const int size = SZ_512M;
> +	const int size = SZ_1G;
>  	struct cxl_memdev *cxlmd;
>  	struct cxl_dport *dport;
>  	struct device *dev;


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole
  2024-11-25 22:00 ` Alison Schofield
@ 2024-12-03 18:23   ` Fabio M. De Francesco
  0 siblings, 0 replies; 25+ messages in thread
From: Fabio M. De Francesco @ 2024-12-03 18:23 UTC (permalink / raw)
  To: Alison Schofield, linux-cxl
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Vishal Verma,
	Ira Weiny, Dan Williams, Yao Xingtao, linux-kernel, Li Ming

On Monday, November 25, 2024 11:00:41 PM GMT+1 Alison Schofield wrote:
> On Fri, Nov 22, 2024 at 04:51:51PM +0100, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> > 
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. On those systems, BIOS
> > publishes CFMWS which communicate the active System Physical Address (SPA)
> > ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> > SPA range trims out the hole, and capacity in the endpoint is lost with no
> > SPA to map to CXL HPA in that hole.
> > 
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, the driver fails and returns an error because it
> > expects that the CXL Endpoint Decoder range is a subset of the Root
> > Decoder's.
> > 
> > Then detect SPA/HPA misalignment and allow CXL Regions construction and 
> > attach if and only if the misalignment is due to x86 Low Memory Holes.
> > 
> 
> Hi Fabio,
> 
> I took this for a test drive in cxl-test - thanks for that patch!
> 
> Here's a couple of observations on what users will see. Just stirring
> the pot here, not knowing if there is, or even needs to be an explanation
> to userspace about LMH.
> 
> 1) Users will see that the endpoint decoders intend to map more than the
> root decoder. Users may question their trimmed region size.
> 
> 2) At least in this example, I don't think users can re-create this
> region in place, ie via hotplug.  Once this region is destroyed, we
> default to creating a smaller, aligned region, in its place.
> 
Hi Alison,

Thank you for your valuable comments and suggestions on my series. I will 
take them into account for the next version (v2).

I also appreciate your observations regarding the consistency between the two 
methods of creating regions. I agree that we should aim for more uniformity in
this area and we will consider that in the future but we will only create spec 
standard aligned  user created regions now.

Thanks again for your insights.

Fabio
>
> cxl-cli output is appended showing the auto created region, it's decoders,
> and then the creation of a user requested region, not exactly in its
> place.
> 
> 
> Upon load of cxl-test:
> 
> # cxl list -r region0 --decoders -u
> [
>   {
>     "root decoders":[
>       {
>         "decoder":"decoder0.0",
>         "resource":"0xf010000000",
>         "size":"768.00 MiB (805.31 MB)",
>         "interleave_ways":1,
>         "max_available_extent":0,
>         "volatile_capable":true,
>         "qos_class":42,
>         "nr_targets":1
>       }
>     ]
>   },
>   {
>     "port decoders":[
>       {
>         "decoder":"decoder1.0",
>         "resource":"0xf010000000",
>         "size":"1024.00 MiB (1073.74 MB)",
>         "interleave_ways":1,
>         "region":"region0",
>         "nr_targets":1
>       },
>       {
>         "decoder":"decoder6.0",
>         "resource":"0xf010000000",
>         "size":"1024.00 MiB (1073.74 MB)",
>         "interleave_ways":2,
>         "interleave_granularity":4096,
>         "region":"region0",
>         "nr_targets":2
>       }
>     ]
>   },
>   {
>     "endpoint decoders":[
>       {
>         "decoder":"decoder10.0",
>         "resource":"0xf010000000",
>         "size":"1024.00 MiB (1073.74 MB)",
>         "interleave_ways":2,
>         "interleave_granularity":4096,
>         "region":"region0",
>         "dpa_resource":"0",
>         "dpa_size":"512.00 MiB (536.87 MB)",
>         "mode":"ram"
>       },
>       {
>         "decoder":"decoder13.0",
>         "resource":"0xf010000000",
>         "size":"1024.00 MiB (1073.74 MB)",
>         "interleave_ways":2,
>         "interleave_granularity":4096,
>         "region":"region0",
>         "dpa_resource":"0",
>         "dpa_size":"512.00 MiB (536.87 MB)",
>         "mode":"ram"
>       }
>     ]
>   }
> ]
> 
> After destroying the auto region, root decoder show the 768MiB available:
> 
> # cxl list -d decoder0.0 -u
> {
>   "decoder":"decoder0.0",
>   "resource":"0xf010000000",
>   "size":"768.00 MiB (805.31 MB)",
>   "interleave_ways":1,
>   "max_available_extent":"768.00 MiB (805.31 MB)",
>   "volatile_capable":true,
>   "qos_class":42,
>   "nr_targets":1
> }
> 
> 
> # cxl create-region -d decoder0.0 -m mem5 mem4
> {
>   "region":"region0",
>   "resource":"0xf010000000",
>   "size":"512.00 MiB (536.87 MB)",
>   "type":"ram",
>   "interleave_ways":2,
>   "interleave_granularity":256,
>   "decode_state":"commit",
> 
> snip
> 
> # cxl list -r region0 --decoders -u
> [
>   {
>     "root decoders":[
>       {
>         "decoder":"decoder0.0",
>         "resource":"0xf010000000",
>         "size":"768.00 MiB (805.31 MB)",
>         "interleave_ways":1,
>         "max_available_extent":"256.00 MiB (268.44 MB)",
>         "volatile_capable":true,
>         "qos_class":42,
>         "nr_targets":1
>       }
>     ]
>   },
>   {
>     "port decoders":[
>       {
>         "decoder":"decoder1.0",
>         "resource":"0xf010000000",
>         "size":"512.00 MiB (536.87 MB)",
>         "interleave_ways":1,
>         "region":"region0",
>         "nr_targets":1
>       },
>       {
>         "decoder":"decoder6.0",
>         "resource":"0xf010000000",
>         "size":"512.00 MiB (536.87 MB)",
>         "interleave_ways":2,
>         "interleave_granularity":256,
>         "region":"region0",
>         "nr_targets":2
>       }
>     ]
>   },
>   {
>     "endpoint decoders":[
>       {
>         "decoder":"decoder10.0",
>         "resource":"0xf010000000",
>         "size":"512.00 MiB (536.87 MB)",
>         "interleave_ways":2,
>         "interleave_granularity":256,
>         "region":"region0",
>         "dpa_resource":"0",
>         "dpa_size":"256.00 MiB (268.44 MB)",
>         "mode":"ram"
>       },
>       {
>         "decoder":"decoder13.0",
>         "resource":"0xf010000000",
>         "size":"512.00 MiB (536.87 MB)",
>         "interleave_ways":2,
>         "interleave_granularity":256,
>         "region":"region0",
>         "dpa_resource":"0",
>         "dpa_size":"256.00 MiB (268.44 MB)",
>         "mode":"ram"
>       }
>     ]
>   }
> ]
> 
> 





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
                     ` (5 preceding siblings ...)
  2024-11-25 22:44   ` kernel test robot
@ 2024-12-16 21:30   ` Robert Richter
  2025-01-08 14:48     ` Fabio M. De Francesco
  6 siblings, 1 reply; 25+ messages in thread
From: Robert Richter @ 2024-12-16 21:30 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Huang Ying, Yao Xingtao,
	Li Ming, linux-kernel, linux-cxl

Hi Fabio,

On 22.11.24 16:51:53, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
> 
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. In some cases the size
> of that hole is not compatible with the CXL hardware decoder constraint
> that the size is always aligned to 256M * Interleave Ways.
> 
> On those systems, BIOS publishes CFMWS which communicate the active System
> Physical Address (SPA) ranges that map to a subset of the Host Physical
> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> the endpoint is lost with no SPA to map to CXL HPA in that hole.
> 
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, cxl_add_to_region() fails and returns an error
> because it can't find any CXL Window that matches a given CXL Endpoint
> Decoder.
> 
> Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
> ranges: both must start at physical address 0 and end below 4 GB, while
> the Root Decoder ranges end at lower addresses than the matching Endpoint
> Decoders which instead must always respect the above-mentioned CXL hardware
> decoders HPA alignment constraint.
> 
> Match misaligned CFMWS and CXL Regions with corresponding CXL Endpoint
> Decoders if driver detects Low Memory Holes.
> 
> Construct CXL Regions with HPA range's end trimmed by matching SPA.
> 
> Allow the attach target process to complete by relaxing Decoder constraints
> which would lead to failures.

I am implementing an address translation series and will post the
patches in a couple of days. It solves a similar problem to remap
(zero-based) endpoint address ranges to a different SPA range. I
implemented a generic approach that can easily be reused here. With
only a few changes to your series it could be integrated there (I
could rebase my code onto your's then in a later version of my
series).

The change I suggest to your series will also significantly simplify
your code and the function interface. The general idea is to have a
.spa_range attribute in struct cxl_endpoint_decoder:

@@ -428,6 +428,7 @@ struct cxl_endpoint_decoder {
        struct cxl_decoder cxld;
        struct cxl_root_decoder *cxlrd;
        struct resource *dpa_res;
+       struct range spa_range;
        resource_size_t skip;
        enum cxl_decoder_mode mode;
        enum cxl_decoder_state state;

Now, then an endpoint is detected, the spa_range is assigned,
typically this is cxled->spa_range = cxled->cxld.hpa_range (SPA ==
HPA). Whenever a port is added, a function cxl_port_platform_setup()
is called which could also contain a call to
cxl_port_setup_intel_lmh() or so.

@@ -865,6 +870,8 @@ static int cxl_port_add(struct cxl_port *port,
                        return rc;
        }
 
+       cxl_port_platform_setup(port);
+
        rc = device_add(dev);
        if (rc)
                return rc;

cxl_port_setup_intel_lmh() does all the platform checks and
recalculates the spa_range for the endpoints based on the code you
already have. The endpoint's SPA address range (instead of hpa_range)
is now used to construct the region by changing
match_region_by_range() accordingly. This avoids the instrumentation
of all the other checks you needed to add in this patch (use of
arch_match_region() and arch_match_spa() etc.).

The only function that needs to be exported is
cxl_port_setup_intel_lmh() then. Everything else is generic code. This
is also a good encapsulation to other archs, vendors, platforms etc. I
think that would really improve your series. Let me know if you have
questions.

See also inline below, I have some questions and comments.

> 
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/cxl/Kconfig       |  5 ++++
>  drivers/cxl/core/Makefile |  1 +
>  drivers/cxl/core/lmh.c    | 53 +++++++++++++++++++++++++++++++++++++

Could you name this file intel.c and have necessary Kconfig options?
If code should be reused later we still can move it to common.c or
similar. It would be good to guard that having a pci_device_id check
to a root port or a vendor cpu check.

See my comment no enablement above.

>  drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++++++++-------
>  drivers/cxl/cxl.h         | 25 ++++++++++++++++++
>  5 files changed, 130 insertions(+), 9 deletions(-)
>  create mode 100644 drivers/cxl/core/lmh.c
> 
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 876469e23f7a..07b87f217e59 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -128,6 +128,11 @@ config CXL_REGION
>  
>  	  If unsure say 'y'
>  
> +config CXL_ARCH_LOW_MEMORY_HOLE
> +	def_bool y
> +	depends on CXL_REGION
> +	depends on X86
> +
>  config CXL_REGION_INVALIDATION_TEST
>  	bool "CXL: Region Cache Management Bypass (TEST)"
>  	depends on CXL_REGION
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 9259bcc6773c..6e80215e8444 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -15,4 +15,5 @@ cxl_core-y += hdm.o
>  cxl_core-y += pmu.o
>  cxl_core-y += cdat.o
>  cxl_core-$(CONFIG_TRACING) += trace.o
> +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
>  cxl_core-$(CONFIG_CXL_REGION) += region.o
> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> new file mode 100644
> index 000000000000..da76b2a534ec
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.c
> @@ -0,0 +1,53 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/range.h>
> +#include "cxl.h"
> +
> +/* In x86 with memory hole, misaligned CFMWS range starts at 0x0 */
> +#define MISALIGNED_CFMWS_RANGE_BASE 0x0

Could you more elaborate on this? Why is the base zero?

> +
> +/*
> + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> + *
> + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> + * the given endpoint decoder HPA range size is always expected aligned and
> + * also larger than that of the matching root decoder
> + */
> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +		    struct cxl_endpoint_decoder *cxled)
> +{
> +	struct range *r1, *r2;
> +	int niw;
> +
> +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> +	r2 = &cxled->cxld.hpa_range;
> +	niw = cxled->cxld.interleave_ways;
> +
> +	if (r1->start == MISALIGNED_CFMWS_RANGE_BASE &&
> +	    r1->start == r2->start && r1->end < r2->end &&
> +	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
> +		return true;
> +	return false;
> +}
> +
> +/* Similar to arch_match_spa(), it matches regions and decoders */
> +bool arch_match_region(struct cxl_region_params *p,
> +		       struct cxl_decoder *cxld)
> +{
> +	struct range *r = &cxld->hpa_range;
> +	struct resource *res = p->res;
> +	int niw = cxld->interleave_ways;
> +
> +	if (res->start == MISALIGNED_CFMWS_RANGE_BASE &&
> +	    res->start == r->start && res->end < r->end &&
> +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> +		return true;

This basically means that the region base is zero? How do you
determine the actual base address of the range then? The cxl test
patch contains a non-zero address. How is BIOS supposed to configure
this trimmed mem hole regions? Could you share a snippet of a CFMWS
entry for this?

Thanks,

-Robert

> +	return false;
> +}
> +
> +void arch_trim_hpa_by_spa(struct resource *res,
> +			  struct cxl_root_decoder *cxlrd)
> +{
> +	res->end = cxlrd->res->end;
> +}
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index ac2c486c16e9..3cb5a69e9731 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -836,8 +836,12 @@ static int match_auto_decoder(struct device *dev, void *data)
>  	cxld = to_cxl_decoder(dev);
>  	r = &cxld->hpa_range;
>  
> -	if (p->res && p->res->start == r->start && p->res->end == r->end)
> -		return 1;
> +	if (p->res) {
> +		if (p->res->start == r->start && p->res->end == r->end)
> +			return 1;
> +		if (arch_match_region(p, cxld))
> +			return 1;
> +	}
>  
>  	return 0;
>  }
> @@ -1414,7 +1418,8 @@ static int cxl_port_setup_targets(struct cxl_port *port,
>  		if (cxld->interleave_ways != iw ||
>  		    cxld->interleave_granularity != ig ||
>  		    cxld->hpa_range.start != p->res->start ||
> -		    cxld->hpa_range.end != p->res->end ||
> +		    (cxld->hpa_range.end != p->res->end &&
> +		     !arch_match_region(p, cxld)) ||
>  		    ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
>  			dev_err(&cxlr->dev,
>  				"%s:%s %s expected iw: %d ig: %d %pr\n",
> @@ -1726,6 +1731,7 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>  {
>  	struct cxl_endpoint_decoder *cxled = data;
>  	struct cxl_switch_decoder *cxlsd;
> +	struct cxl_root_decoder *cxlrd;
>  	struct range *r1, *r2;
>  
>  	if (!is_switch_decoder(dev))
> @@ -1735,8 +1741,13 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>  	r1 = &cxlsd->cxld.hpa_range;
>  	r2 = &cxled->cxld.hpa_range;
>  
> -	if (is_root_decoder(dev))
> -		return range_contains(r1, r2);
> +	if (is_root_decoder(dev)) {
> +		if (range_contains(r1, r2))
> +			return 1;
> +		cxlrd = to_cxl_root_decoder(dev);
> +		if (arch_match_spa(cxlrd, cxled))
> +			return 1;
> +	}
>  	return (r1->start == r2->start && r1->end == r2->end);
>  }
>  
> @@ -1943,7 +1954,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  	}
>  
>  	if (resource_size(cxled->dpa_res) * p->interleave_ways !=
> -	    resource_size(p->res)) {
> +	    resource_size(p->res) && !arch_match_spa(cxlrd, cxled)) {
>  		dev_dbg(&cxlr->dev,
>  			"%s:%s: decoder-size-%#llx * ways-%d != region-size-%#llx\n",
>  			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> @@ -3199,7 +3210,13 @@ static int match_root_decoder_by_range(struct device *dev, void *data)
>  	r1 = &cxlrd->cxlsd.cxld.hpa_range;
>  	r2 = &cxled->cxld.hpa_range;
>  
> -	return range_contains(r1, r2);
> +	if (range_contains(r1, r2))
> +		return true;
> +
> +	if (arch_match_spa(cxlrd, cxled))
> +		return true;
> +
> +	return false;
>  }
>  
>  static int match_region_by_range(struct device *dev, void *data)
> @@ -3217,8 +3234,12 @@ static int match_region_by_range(struct device *dev, void *data)
>  	p = &cxlr->params;
>  
>  	down_read(&cxl_region_rwsem);
> -	if (p->res && p->res->start == r->start && p->res->end == r->end)
> -		rc = 1;
> +	if (p->res) {
> +		if (p->res->start == r->start && p->res->end == r->end)
> +			rc = 1;
> +		if (arch_match_region(p, &cxled->cxld))
> +			rc = 1;
> +	}
>  	up_read(&cxl_region_rwsem);
>  
>  	return rc;
> @@ -3270,6 +3291,22 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
>  
>  	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
>  				    dev_name(&cxlr->dev));
> +
> +	/*
> +	 * Trim the HPA retrieved from hardware to fit the SPA mapped by the
> +	 * platform
> +	 */
> +	if (arch_match_spa(cxlrd, cxled)) {
> +		struct range *range = &cxlrd->cxlsd.cxld.hpa_range;
> +
> +		arch_trim_hpa_by_spa(res, cxlrd);
> +		dev_dbg(cxlmd->dev.parent,
> +			"%s: Trim HPA (%s: %pr) by SPA (%s: %pr)\n",
> +			__func__,
> +			dev_name(&cxlrd->cxlsd.cxld.dev), range,
> +			dev_name(&cxled->cxld.dev), hpa);
> +	}
> +
>  	rc = insert_resource(cxlrd->res, res);
>  	if (rc) {
>  		/*
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5406e3ab3d4a..a5ad4499381e 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -902,6 +902,31 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>  
>  bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>  
> +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +		    struct cxl_endpoint_decoder *cxled);
> +bool arch_match_region(struct cxl_region_params *p,
> +		       struct cxl_decoder *cxld);
> +void arch_trim_hpa_by_spa(struct resource *res,
> +			  struct cxl_root_decoder *cxlrd);
> +#else
> +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +		    struct cxl_endpoint_decoder *cxled)
> +{
> +	return false;
> +}
> +
> +bool arch_match_region(struct cxl_region_params *p,
> +		       struct cxl_decoder *cxld)
> +{
> +	return false;
> +}
> +
> +void arch_trim_hpa_by_spa(struct resource *res,
> +			  struct cxl_root_decoder *cxlrd)
> +{ }
> +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
> +
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
>   * of these symbols in tools/testing/cxl/.
> -- 
> 2.46.2
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2024-12-16 21:30   ` Robert Richter
@ 2025-01-08 14:48     ` Fabio M. De Francesco
  2025-01-09 10:58       ` Robert Richter
  0 siblings, 1 reply; 25+ messages in thread
From: Fabio M. De Francesco @ 2025-01-08 14:48 UTC (permalink / raw)
  To: Robert Richter
  Cc: linux-kernel, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Yao Xingtao, linux-cxl, Li Ming

Hi Robert,

On Monday, December 16, 2024 10:30:11 PM GMT+1 Robert Richter wrote:
> Hi Fabio,
> 
> On 22.11.24 16:51:53, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> > 
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. In some cases the size
> > of that hole is not compatible with the CXL hardware decoder constraint
> > that the size is always aligned to 256M * Interleave Ways.
> > 
> > On those systems, BIOS publishes CFMWS which communicate the active System
> > Physical Address (SPA) ranges that map to a subset of the Host Physical
> > Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> > the endpoint is lost with no SPA to map to CXL HPA in that hole.
> > 
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, cxl_add_to_region() fails and returns an error
> > because it can't find any CXL Window that matches a given CXL Endpoint
> > Decoder.
> > 
> > Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
> > ranges: both must start at physical address 0 and end below 4 GB, while
> > the Root Decoder ranges end at lower addresses than the matching Endpoint
> > Decoders which instead must always respect the above-mentioned CXL hardware
> > decoders HPA alignment constraint.
> > 
> > Match misaligned CFMWS and CXL Regions with corresponding CXL Endpoint
> > Decoders if driver detects Low Memory Holes.
> > 
> > Construct CXL Regions with HPA range's end trimmed by matching SPA.
> > 
> > Allow the attach target process to complete by relaxing Decoder constraints
> > which would lead to failures.
> 
> I am implementing an address translation series and will post the
> patches in a couple of days. It solves a similar problem to remap
> (zero-based) endpoint address ranges to a different SPA range. I
> implemented a generic approach that can easily be reused here. With
> only a few changes to your series it could be integrated there (I
> could rebase my code onto your's then in a later version of my
> series).
> 
> The change I suggest to your series will also significantly simplify
> your code and the function interface. The general idea is to have a
> .spa_range attribute in struct cxl_endpoint_decoder:
> 
> @@ -428,6 +428,7 @@ struct cxl_endpoint_decoder {
>         struct cxl_decoder cxld;
>         struct cxl_root_decoder *cxlrd;
>         struct resource *dpa_res;
> +       struct range spa_range;
>         resource_size_t skip;
>         enum cxl_decoder_mode mode;
>         enum cxl_decoder_state state;
> 
> Now, then an endpoint is detected, the spa_range is assigned,
> typically this is cxled->spa_range = cxled->cxld.hpa_range (SPA ==
> HPA). Whenever a port is added, a function cxl_port_platform_setup()
> is called which could also contain a call to
> cxl_port_setup_intel_lmh() or so.
> 
> @@ -865,6 +870,8 @@ static int cxl_port_add(struct cxl_port *port,
>                         return rc;
>         }
>  
> +       cxl_port_platform_setup(port);
> +
>         rc = device_add(dev);
>         if (rc)
>                 return rc;
> 
> cxl_port_setup_intel_lmh() does all the platform checks and
> recalculates the spa_range for the endpoints based on the code you
> already have. 
>
> The endpoint's SPA address range (instead of hpa_range)
> is now used to construct the region by changing
> match_region_by_range() accordingly. This avoids the instrumentation
> of all the other checks you needed to add in this patch (use of
> arch_match_region() and arch_match_spa() etc.).
> 
> The only function that needs to be exported is
> cxl_port_setup_intel_lmh() then. Everything else is generic code. This
> is also a good encapsulation to other archs, vendors, platforms etc. I
> think that would really improve your series. Let me know if you have
> questions.
>
Sorry for this late reply. I just come back from vacation :)

It's too early for asking questions. I'll wait until I read your series.
I just saw you posted it yesterday.

I'll rework my series and just export cxl_port_setup_intel_lmh() as you
suggested, if it would result into a better solution.

I can't yet easily figure out how to detect an LMH while still in cxl_port_add(),
I'll think about it.
>   
> See also inline below, I have some questions and comments.
> 
> > 
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> >  drivers/cxl/Kconfig       |  5 ++++
> >  drivers/cxl/core/Makefile |  1 +
> >  drivers/cxl/core/lmh.c    | 53 +++++++++++++++++++++++++++++++++++++
> 
> Could you name this file intel.c and have necessary Kconfig options?
>
I named it x86.c, but in internal review Dan asked to rename it lmh.c
> 
> If code should be reused later we still can move it to common.c or
> similar. It would be good to guard that having a pci_device_id check
> to a root port or a vendor cpu check.
> 
> See my comment no enablement above.
> 
> >  drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++++++++-------
> >  drivers/cxl/cxl.h         | 25 ++++++++++++++++++
> >  5 files changed, 130 insertions(+), 9 deletions(-)
> >  create mode 100644 drivers/cxl/core/lmh.c
> > 
> > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > index 876469e23f7a..07b87f217e59 100644
> > --- a/drivers/cxl/Kconfig
> > +++ b/drivers/cxl/Kconfig
> > @@ -128,6 +128,11 @@ config CXL_REGION
> >  
> >  	  If unsure say 'y'
> >  
> > +config CXL_ARCH_LOW_MEMORY_HOLE
> > +	def_bool y
> > +	depends on CXL_REGION
> > +	depends on X86
> > +
> >  config CXL_REGION_INVALIDATION_TEST
> >  	bool "CXL: Region Cache Management Bypass (TEST)"
> >  	depends on CXL_REGION
> > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > index 9259bcc6773c..6e80215e8444 100644
> > --- a/drivers/cxl/core/Makefile
> > +++ b/drivers/cxl/core/Makefile
> > @@ -15,4 +15,5 @@ cxl_core-y += hdm.o
> >  cxl_core-y += pmu.o
> >  cxl_core-y += cdat.o
> >  cxl_core-$(CONFIG_TRACING) += trace.o
> > +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
> >  cxl_core-$(CONFIG_CXL_REGION) += region.o
> > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > new file mode 100644
> > index 000000000000..da76b2a534ec
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.c
> > @@ -0,0 +1,53 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +
> > +#include <linux/range.h>
> > +#include "cxl.h"
> > +
> > +/* In x86 with memory hole, misaligned CFMWS range starts at 0x0 */
> > +#define MISALIGNED_CFMWS_RANGE_BASE 0x0
> 
> Could you more elaborate on this? Why is the base zero?
> 
I know it because I saw it's zero in all configurations showed by and discussed with 
BIOS engineers.
>
> > +
> > +/*
> > + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > + *
> > + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> > + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> > + * the given endpoint decoder HPA range size is always expected aligned and
> > + * also larger than that of the matching root decoder
> > + */
> > +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > +		    struct cxl_endpoint_decoder *cxled)
> > +{
> > +	struct range *r1, *r2;
> > +	int niw;
> > +
> > +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > +	r2 = &cxled->cxld.hpa_range;
> > +	niw = cxled->cxld.interleave_ways;
> > +
> > +	if (r1->start == MISALIGNED_CFMWS_RANGE_BASE &&
> > +	    r1->start == r2->start && r1->end < r2->end &&
> > +	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
> > +		return true;
> > +	return false;
> > +}
> > +
> > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > +bool arch_match_region(struct cxl_region_params *p,
> > +		       struct cxl_decoder *cxld)
> > +{
> > +	struct range *r = &cxld->hpa_range;
> > +	struct resource *res = p->res;
> > +	int niw = cxld->interleave_ways;
> > +
> > +	if (res->start == MISALIGNED_CFMWS_RANGE_BASE &&
> > +	    res->start == r->start && res->end < r->end &&
> > +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> > +		return true;
> 
> This basically means that the region base is zero?
>
Yes, it is. 
>
> How do you
> determine the actual base address of the range then? The cxl test
> patch contains a non-zero address.
>
Next version of this series is going to use 0x0 for real hardware and 
mock_cfmws[0]->base_hpa for the topology created by cxl tests.
>
> this trimmed mem hole regions? Could you share a snippet of a CFMWS
> entry for this?
>
Sorry, I don't have it because no real hardware was required to make and test
my series. 
> 
> Thanks,
> 
> -Robert
> 
> > +	return false;
> > +}
> > +
> > +void arch_trim_hpa_by_spa(struct resource *res,
> > +			  struct cxl_root_decoder *cxlrd)
> > +{
> > +	res->end = cxlrd->res->end;
> > +}
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index ac2c486c16e9..3cb5a69e9731 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -836,8 +836,12 @@ static int match_auto_decoder(struct device *dev, void *data)
> >  	cxld = to_cxl_decoder(dev);
> >  	r = &cxld->hpa_range;
> >  
> > -	if (p->res && p->res->start == r->start && p->res->end == r->end)
> > -		return 1;
> > +	if (p->res) {
> > +		if (p->res->start == r->start && p->res->end == r->end)
> > +			return 1;
> > +		if (arch_match_region(p, cxld))
> > +			return 1;
> > +	}
> >  
> >  	return 0;
> >  }
> > @@ -1414,7 +1418,8 @@ static int cxl_port_setup_targets(struct cxl_port *port,
> >  		if (cxld->interleave_ways != iw ||
> >  		    cxld->interleave_granularity != ig ||
> >  		    cxld->hpa_range.start != p->res->start ||
> > -		    cxld->hpa_range.end != p->res->end ||
> > +		    (cxld->hpa_range.end != p->res->end &&
> > +		     !arch_match_region(p, cxld)) ||
> >  		    ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
> >  			dev_err(&cxlr->dev,
> >  				"%s:%s %s expected iw: %d ig: %d %pr\n",
> > @@ -1726,6 +1731,7 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
> >  {
> >  	struct cxl_endpoint_decoder *cxled = data;
> >  	struct cxl_switch_decoder *cxlsd;
> > +	struct cxl_root_decoder *cxlrd;
> >  	struct range *r1, *r2;
> >  
> >  	if (!is_switch_decoder(dev))
> > @@ -1735,8 +1741,13 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
> >  	r1 = &cxlsd->cxld.hpa_range;
> >  	r2 = &cxled->cxld.hpa_range;
> >  
> > -	if (is_root_decoder(dev))
> > -		return range_contains(r1, r2);
> > +	if (is_root_decoder(dev)) {
> > +		if (range_contains(r1, r2))
> > +			return 1;
> > +		cxlrd = to_cxl_root_decoder(dev);
> > +		if (arch_match_spa(cxlrd, cxled))
> > +			return 1;
> > +	}
> >  	return (r1->start == r2->start && r1->end == r2->end);
> >  }
> >  
> > @@ -1943,7 +1954,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
> >  	}
> >  
> >  	if (resource_size(cxled->dpa_res) * p->interleave_ways !=
> > -	    resource_size(p->res)) {
> > +	    resource_size(p->res) && !arch_match_spa(cxlrd, cxled)) {
> >  		dev_dbg(&cxlr->dev,
> >  			"%s:%s: decoder-size-%#llx * ways-%d != region-size-%#llx\n",
> >  			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
> > @@ -3199,7 +3210,13 @@ static int match_root_decoder_by_range(struct device *dev, void *data)
> >  	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> >  	r2 = &cxled->cxld.hpa_range;
> >  
> > -	return range_contains(r1, r2);
> > +	if (range_contains(r1, r2))
> > +		return true;
> > +
> > +	if (arch_match_spa(cxlrd, cxled))
> > +		return true;
> > +
> > +	return false;
> >  }
> >  
> >  static int match_region_by_range(struct device *dev, void *data)
> > @@ -3217,8 +3234,12 @@ static int match_region_by_range(struct device *dev, void *data)
> >  	p = &cxlr->params;
> >  
> >  	down_read(&cxl_region_rwsem);
> > -	if (p->res && p->res->start == r->start && p->res->end == r->end)
> > -		rc = 1;
> > +	if (p->res) {
> > +		if (p->res->start == r->start && p->res->end == r->end)
> > +			rc = 1;
> > +		if (arch_match_region(p, &cxled->cxld))
> > +			rc = 1;
> > +	}
> >  	up_read(&cxl_region_rwsem);
> >  
> >  	return rc;
> > @@ -3270,6 +3291,22 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
> >  
> >  	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
> >  				    dev_name(&cxlr->dev));
> > +
> > +	/*
> > +	 * Trim the HPA retrieved from hardware to fit the SPA mapped by the
> > +	 * platform
> > +	 */
> > +	if (arch_match_spa(cxlrd, cxled)) {
> > +		struct range *range = &cxlrd->cxlsd.cxld.hpa_range;
> > +
> > +		arch_trim_hpa_by_spa(res, cxlrd);
> > +		dev_dbg(cxlmd->dev.parent,
> > +			"%s: Trim HPA (%s: %pr) by SPA (%s: %pr)\n",
> > +			__func__,
> > +			dev_name(&cxlrd->cxlsd.cxld.dev), range,
> > +			dev_name(&cxled->cxld.dev), hpa);
> > +	}
> > +
> >  	rc = insert_resource(cxlrd->res, res);
> >  	if (rc) {
> >  		/*
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 5406e3ab3d4a..a5ad4499381e 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -902,6 +902,31 @@ void cxl_coordinates_combine(struct access_coordinate *out,
> >  
> >  bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
> >  
> > +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> > +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > +		    struct cxl_endpoint_decoder *cxled);
> > +bool arch_match_region(struct cxl_region_params *p,
> > +		       struct cxl_decoder *cxld);
> > +void arch_trim_hpa_by_spa(struct resource *res,
> > +			  struct cxl_root_decoder *cxlrd);
> > +#else
> > +bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > +		    struct cxl_endpoint_decoder *cxled)
> > +{
> > +	return false;
> > +}
> > +
> > +bool arch_match_region(struct cxl_region_params *p,
> > +		       struct cxl_decoder *cxld)
> > +{
> > +	return false;
> > +}
> > +
> > +void arch_trim_hpa_by_spa(struct resource *res,
> > +			  struct cxl_root_decoder *cxlrd)
> > +{ }
> > +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
> > +
> >  /*
> >   * Unit test builds overrides this to __weak, find the 'strong' version
> >   * of these symbols in tools/testing/cxl/.
> 
> 





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2025-01-08 14:48     ` Fabio M. De Francesco
@ 2025-01-09 10:58       ` Robert Richter
  2025-01-10 16:06         ` Fabio M. De Francesco
  0 siblings, 1 reply; 25+ messages in thread
From: Robert Richter @ 2025-01-09 10:58 UTC (permalink / raw)
  To: Fabio M. De Francesco
  Cc: linux-kernel, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Yao Xingtao, linux-cxl, Li Ming

Fabio,

On 08.01.25 15:48:34, Fabio M. De Francesco wrote:
> Hi Robert,
> 
> On Monday, December 16, 2024 10:30:11 PM GMT+1 Robert Richter wrote:
> > Hi Fabio,
> > 
> > On 22.11.24 16:51:53, Fabio M. De Francesco wrote:
> > > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > > Physical Address (HPA) windows that are associated with each CXL Host
> > > Bridge. Each window represents a contiguous HPA that may be interleaved
> > > with one or more targets (CXL v3.1 - 9.18.1.3).
> > > 
> > > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > > memory to which systems cannot send transactions. In some cases the size
> > > of that hole is not compatible with the CXL hardware decoder constraint
> > > that the size is always aligned to 256M * Interleave Ways.
> > > 
> > > On those systems, BIOS publishes CFMWS which communicate the active System
> > > Physical Address (SPA) ranges that map to a subset of the Host Physical
> > > Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> > > the endpoint is lost with no SPA to map to CXL HPA in that hole.
> > > 
> > > In the early stages of CXL Regions construction and attach on platforms
> > > with Low Memory Holes, cxl_add_to_region() fails and returns an error
> > > because it can't find any CXL Window that matches a given CXL Endpoint
> > > Decoder.
> > > 
> > > Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
> > > ranges: both must start at physical address 0 and end below 4 GB, while
> > > the Root Decoder ranges end at lower addresses than the matching Endpoint
> > > Decoders which instead must always respect the above-mentioned CXL hardware
> > > decoders HPA alignment constraint.
> > > 
> > > Match misaligned CFMWS and CXL Regions with corresponding CXL Endpoint
> > > Decoders if driver detects Low Memory Holes.
> > > 
> > > Construct CXL Regions with HPA range's end trimmed by matching SPA.
> > > 
> > > Allow the attach target process to complete by relaxing Decoder constraints
> > > which would lead to failures.
> > 
> > I am implementing an address translation series and will post the
> > patches in a couple of days. It solves a similar problem to remap
> > (zero-based) endpoint address ranges to a different SPA range. I
> > implemented a generic approach that can easily be reused here. With
> > only a few changes to your series it could be integrated there (I
> > could rebase my code onto your's then in a later version of my
> > series).
> > 
> > The change I suggest to your series will also significantly simplify
> > your code and the function interface. The general idea is to have a
> > .spa_range attribute in struct cxl_endpoint_decoder:
> > 
> > @@ -428,6 +428,7 @@ struct cxl_endpoint_decoder {
> >         struct cxl_decoder cxld;
> >         struct cxl_root_decoder *cxlrd;
> >         struct resource *dpa_res;
> > +       struct range spa_range;
> >         resource_size_t skip;
> >         enum cxl_decoder_mode mode;
> >         enum cxl_decoder_state state;
> > 
> > Now, then an endpoint is detected, the spa_range is assigned,
> > typically this is cxled->spa_range = cxled->cxld.hpa_range (SPA ==
> > HPA). Whenever a port is added, a function cxl_port_platform_setup()
> > is called which could also contain a call to
> > cxl_port_setup_intel_lmh() or so.
> > 
> > @@ -865,6 +870,8 @@ static int cxl_port_add(struct cxl_port *port,
> >                         return rc;
> >         }
> >  
> > +       cxl_port_platform_setup(port);
> > +
> >         rc = device_add(dev);
> >         if (rc)
> >                 return rc;
> > 
> > cxl_port_setup_intel_lmh() does all the platform checks and
> > recalculates the spa_range for the endpoints based on the code you
> > already have. 
> >
> > The endpoint's SPA address range (instead of hpa_range)
> > is now used to construct the region by changing
> > match_region_by_range() accordingly. This avoids the instrumentation
> > of all the other checks you needed to add in this patch (use of
> > arch_match_region() and arch_match_spa() etc.).
> > 
> > The only function that needs to be exported is
> > cxl_port_setup_intel_lmh() then. Everything else is generic code. This
> > is also a good encapsulation to other archs, vendors, platforms etc. I
> > think that would really improve your series. Let me know if you have
> > questions.
> >
> Sorry for this late reply. I just come back from vacation :)
> 
> It's too early for asking questions. I'll wait until I read your series.
> I just saw you posted it yesterday.

np, my bad, it took longer than expected to send this out. Take your
time.

> 
> I'll rework my series and just export cxl_port_setup_intel_lmh() as you
> suggested, if it would result into a better solution.

In a v2 I am going to reorder and possibly split the patches for you
to reuse the first part then.

> 
> I can't yet easily figure out how to detect an LMH while still in cxl_port_add(),

I think you need to detect it close before region setup when decoders
are enumerated. I will revisit your patch again and take look to find
a solution.

> I'll think about it.
> >   
> > See also inline below, I have some questions and comments.
> > 
> > > 
> > > Cc: Alison Schofield <alison.schofield@intel.com>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > > ---
> > >  drivers/cxl/Kconfig       |  5 ++++
> > >  drivers/cxl/core/Makefile |  1 +
> > >  drivers/cxl/core/lmh.c    | 53 +++++++++++++++++++++++++++++++++++++
> > 
> > Could you name this file intel.c and have necessary Kconfig options?
> >
> I named it x86.c, but in internal review Dan asked to rename it lmh.c

The LMH detection is unclear yet, just checking for zero is IMO not
sufficient (looked at your answer below). As long as there is no
generic check a platform check is needed in addition.

> > 
> > If code should be reused later we still can move it to common.c or
> > similar. It would be good to guard that having a pci_device_id check
> > to a root port or a vendor cpu check.
> > 
> > See my comment no enablement above.
> > 
> > >  drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++++++++-------
> > >  drivers/cxl/cxl.h         | 25 ++++++++++++++++++
> > >  5 files changed, 130 insertions(+), 9 deletions(-)
> > >  create mode 100644 drivers/cxl/core/lmh.c
> > > 
> > > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > > index 876469e23f7a..07b87f217e59 100644
> > > --- a/drivers/cxl/Kconfig
> > > +++ b/drivers/cxl/Kconfig
> > > @@ -128,6 +128,11 @@ config CXL_REGION
> > >  
> > >  	  If unsure say 'y'
> > >  
> > > +config CXL_ARCH_LOW_MEMORY_HOLE
> > > +	def_bool y
> > > +	depends on CXL_REGION
> > > +	depends on X86
> > > +
> > >  config CXL_REGION_INVALIDATION_TEST
> > >  	bool "CXL: Region Cache Management Bypass (TEST)"
> > >  	depends on CXL_REGION
> > > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > > index 9259bcc6773c..6e80215e8444 100644
> > > --- a/drivers/cxl/core/Makefile
> > > +++ b/drivers/cxl/core/Makefile
> > > @@ -15,4 +15,5 @@ cxl_core-y += hdm.o
> > >  cxl_core-y += pmu.o
> > >  cxl_core-y += cdat.o
> > >  cxl_core-$(CONFIG_TRACING) += trace.o
> > > +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
> > >  cxl_core-$(CONFIG_CXL_REGION) += region.o
> > > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > > new file mode 100644
> > > index 000000000000..da76b2a534ec
> > > --- /dev/null
> > > +++ b/drivers/cxl/core/lmh.c
> > > @@ -0,0 +1,53 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +
> > > +#include <linux/range.h>
> > > +#include "cxl.h"
> > > +
> > > +/* In x86 with memory hole, misaligned CFMWS range starts at 0x0 */
> > > +#define MISALIGNED_CFMWS_RANGE_BASE 0x0
> > 
> > Could you more elaborate on this? Why is the base zero?
> > 
> I know it because I saw it's zero in all configurations showed by and discussed with 
> BIOS engineers.

As said above, an additional check is needed here.

[...]

> > > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > > +bool arch_match_region(struct cxl_region_params *p,
> > > +		       struct cxl_decoder *cxld)
> > > +{
> > > +	struct range *r = &cxld->hpa_range;
> > > +	struct resource *res = p->res;
> > > +	int niw = cxld->interleave_ways;
> > > +
> > > +	if (res->start == MISALIGNED_CFMWS_RANGE_BASE &&
> > > +	    res->start == r->start && res->end < r->end &&
> > > +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> > > +		return true;
> > 
> > This basically means that the region base is zero?
> >
> Yes, it is. 
> >
> > How do you
> > determine the actual base address of the range then? The cxl test
> > patch contains a non-zero address.
> >
> Next version of this series is going to use 0x0 for real hardware and 
> mock_cfmws[0]->base_hpa for the topology created by cxl tests.
> >
> > this trimmed mem hole regions? Could you share a snippet of a CFMWS
> > entry for this?
> >
> Sorry, I don't have it because no real hardware was required to make and test
> my series. 

Ok, will reorder my patches in v2 for you to reuse and take a look at
your patch and the base_hpa setup. Will comment there again.

Thanks,

-Robert

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
  2025-01-09 10:58       ` Robert Richter
@ 2025-01-10 16:06         ` Fabio M. De Francesco
  0 siblings, 0 replies; 25+ messages in thread
From: Fabio M. De Francesco @ 2025-01-10 16:06 UTC (permalink / raw)
  To: Robert Richter
  Cc: linux-kernel, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Yao Xingtao, linux-cxl, Li Ming

Robert,

On Thursday, January 9, 2025 11:58:37 AM GMT+1 Robert Richter wrote:
> Fabio,
> 
> On 08.01.25 15:48:34, Fabio M. De Francesco wrote:
> > Hi Robert,
> > 
> > On Monday, December 16, 2024 10:30:11 PM GMT+1 Robert Richter wrote:
> > > Hi Fabio,
> > > 
> > > On 22.11.24 16:51:53, Fabio M. De Francesco wrote:
> > > > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > > > Physical Address (HPA) windows that are associated with each CXL Host
> > > > Bridge. Each window represents a contiguous HPA that may be interleaved
> > > > with one or more targets (CXL v3.1 - 9.18.1.3).
> > > > 
> > > > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > > > memory to which systems cannot send transactions. In some cases the size
> > > > of that hole is not compatible with the CXL hardware decoder constraint
> > > > that the size is always aligned to 256M * Interleave Ways.
> > > > 
> > > > On those systems, BIOS publishes CFMWS which communicate the active System
> > > > Physical Address (SPA) ranges that map to a subset of the Host Physical
> > > > Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> > > > the endpoint is lost with no SPA to map to CXL HPA in that hole.
> > > > 
> > > > In the early stages of CXL Regions construction and attach on platforms
> > > > with Low Memory Holes, cxl_add_to_region() fails and returns an error
> > > > because it can't find any CXL Window that matches a given CXL Endpoint
> > > > Decoder.
> > > > 
> > > > Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
> > > > ranges: both must start at physical address 0 and end below 4 GB, while
> > > > the Root Decoder ranges end at lower addresses than the matching Endpoint
> > > > Decoders which instead must always respect the above-mentioned CXL hardware
> > > > decoders HPA alignment constraint.
> > > > 
> > > > Match misaligned CFMWS and CXL Regions with corresponding CXL Endpoint
> > > > Decoders if driver detects Low Memory Holes.
> > > > 
> > > > Construct CXL Regions with HPA range's end trimmed by matching SPA.
> > > > 
> > > > Allow the attach target process to complete by relaxing Decoder constraints
> > > > which would lead to failures.
> > > 
> > >  [...]
> > >

> > > I am implementing an address translation series and will post the
> > > patches in a couple of days. It solves a similar problem to remap
> > > (zero-based) endpoint address ranges to a different SPA range. I
> > > implemented a generic approach that can easily be reused here. With
> > > only a few changes to your series it could be integrated there (I
> > > could rebase my code onto your's then in a later version of my
> > > series).
> > > 
> > > The change I suggest to your series will also significantly simplify
> > > your code and the function interface. The general idea is to have a
> > > .spa_range attribute in struct cxl_endpoint_decoder:
> > > 
> > > @@ -428,6 +428,7 @@ struct cxl_endpoint_decoder {
> > >         struct cxl_decoder cxld;
> > >         struct cxl_root_decoder *cxlrd;
> > >         struct resource *dpa_res;
> > > +       struct range spa_range;
> > >         resource_size_t skip;
> > >         enum cxl_decoder_mode mode;
> > >         enum cxl_decoder_state state;
> > > 
> > > Now, then an endpoint is detected, the spa_range is assigned,
> > > typically this is cxled->spa_range = cxled->cxld.hpa_range (SPA ==
> > > HPA). Whenever a port is added, a function cxl_port_platform_setup()
> > > is called which could also contain a call to
> > > cxl_port_setup_intel_lmh() or so.
> > > 
> > > @@ -865,6 +870,8 @@ static int cxl_port_add(struct cxl_port *port,
> > >                         return rc;
> > >         }
> > >  
> > > +       cxl_port_platform_setup(port);
> > > +
> > >         rc = device_add(dev);
> > >         if (rc)
> > >                 return rc;
> > > 
> > > cxl_port_setup_intel_lmh() does all the platform checks and
> > > recalculates the spa_range for the endpoints based on the code you
> > > already have. 
> > >
> > > The endpoint's SPA address range (instead of hpa_range)
> > > is now used to construct the region by changing
> > > match_region_by_range() accordingly. This avoids the instrumentation
> > > of all the other checks you needed to add in this patch (use of
> > > arch_match_region() and arch_match_spa() etc.).
> > > 
> > > The only function that needs to be exported is
> > > cxl_port_setup_intel_lmh() then. Everything else is generic code. This
> > > is also a good encapsulation to other archs, vendors, platforms etc. I
> > > think that would really improve your series. Let me know if you have
> > > questions.
> > >
> > Sorry for this late reply. I just come back from vacation :)
> > 
> > It's too early for asking questions. I'll wait until I read your series.
> > I just saw you posted it yesterday.
> 
> np, my bad, it took longer than expected to send this out. Take your
> time.
>
No problem. Yesterday, I began to read your series. I'll comment and 
review there as soon as possible. 

Thank you for adding me to the list of recipients.
> 
> > 
> > I'll rework my series and just export cxl_port_setup_intel_lmh() as you
> > suggested, if it would result into a better solution.
> 
> In a v2 I am going to reorder and possibly split the patches for you
> to reuse the first part then.
> 
Thanks for that, but I can't yet have visibility on what is going to happen.
Internally, I've been suggested to post my v2 as is, not re-based on your 
series for now. A few colleagues want to have the opportunity to read it.
Later we will see which the best course of action will be.
> > 
> > I can't yet easily figure out how to detect an LMH while still in cxl_port_add(),
> 
> I think you need to detect it close before region setup when decoders
> are enumerated. I will revisit your patch again and take look to find
> a solution.
> 
Thanks, but I think you missed something. More on this below...
>
> > I'll think about it.
> > >   
> > > See also inline below, I have some questions and comments.
> > > 
> > > > 
> > > > Cc: Alison Schofield <alison.schofield@intel.com>
> > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > Cc: Ira Weiny <ira.weiny@intel.com>
> > > > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > > > ---
> > > >  drivers/cxl/Kconfig       |  5 ++++
> > > >  drivers/cxl/core/Makefile |  1 +
> > > >  drivers/cxl/core/lmh.c    | 53 +++++++++++++++++++++++++++++++++++++
> > > 
> > > Could you name this file intel.c and have necessary Kconfig options?
> > >
> > I named it x86.c, but in internal review Dan asked to rename it lmh.c
> 
> The LMH detection is unclear yet, just checking for zero is IMO not
> sufficient (looked at your answer below). As long as there is no
> generic check a platform check is needed in addition.
>
Thank you for providing your proposed response. I'll now review it for 
clarity and fluency, and offer suggestions for improvement:

It's not just checking for zero. In the 'if' statements of 
arch_match_spa() and arch_match_region(), there's a bit of more complexity 
involved.

These functions essentially test CFMWS and corresponding EP Decoders 
when both ranges start at zero. In such cases, if SPA is a subset of EP 
Decoder HPA, and that HPA aligns with the 256M * NIW constraints, we assume
an LMH is truncating the CFMWS range. We then return true to match Root 
Decoders to Endpoint Decoders, or Regions to EP Decoders, preventing driver 
failures while it constructs Regions and attach EP to them.

I believe I discussed this with Ming Li, who had similar questions. However,
the lack of immediate clarity suggests that I should rework these 'if' 
statements. 

A revised approach for v2 might test whether an SPA range that starts at 
zero is shorter than the corresponding EP Deccoder HPA range AND ends before 
4G. Again, if it holds true, assume an LMH is truncating that published CFMWS
and so return true and preventing driver failures.

I'll think about it for v2. Anyway I'm going to submit it in one or two days.
> 
> > > 
> > > If code should be reused later we still can move it to common.c or
> > > similar. It would be good to guard that having a pci_device_id check
> > > to a root port or a vendor cpu check.

That sounds good. I'll think about it. Sorry for missing it in my previous
reply.

> > > 
> > > See my comment no enablement above.
> > > 
> > > >  drivers/cxl/core/region.c | 55 ++++++++++++++++++++++++++++++++-------
> > > >  drivers/cxl/cxl.h         | 25 ++++++++++++++++++
> > > >  5 files changed, 130 insertions(+), 9 deletions(-)
> > > >  create mode 100644 drivers/cxl/core/lmh.c
> > > > 
> > > > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > > > index 876469e23f7a..07b87f217e59 100644
> > > > --- a/drivers/cxl/Kconfig
> > > > +++ b/drivers/cxl/Kconfig
> > > > @@ -128,6 +128,11 @@ config CXL_REGION
> > > >  
> > > >  	  If unsure say 'y'
> > > >  
> > > > +config CXL_ARCH_LOW_MEMORY_HOLE
> > > > +	def_bool y
> > > > +	depends on CXL_REGION
> > > > +	depends on X86
> > > > +
> > > >  config CXL_REGION_INVALIDATION_TEST
> > > >  	bool "CXL: Region Cache Management Bypass (TEST)"
> > > >  	depends on CXL_REGION
> > > > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > > > index 9259bcc6773c..6e80215e8444 100644
> > > > --- a/drivers/cxl/core/Makefile
> > > > +++ b/drivers/cxl/core/Makefile
> > > > @@ -15,4 +15,5 @@ cxl_core-y += hdm.o
> > > >  cxl_core-y += pmu.o
> > > >  cxl_core-y += cdat.o
> > > >  cxl_core-$(CONFIG_TRACING) += trace.o
> > > > +cxl_core-$(CONFIG_CXL_ARCH_LOW_MEMORY_HOLE) += lmh.o
> > > >  cxl_core-$(CONFIG_CXL_REGION) += region.o
> > > > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > > > new file mode 100644
> > > > index 000000000000..da76b2a534ec
> > > > --- /dev/null
> > > > +++ b/drivers/cxl/core/lmh.c
> > > > @@ -0,0 +1,53 @@
> > > > +// SPDX-License-Identifier: GPL-2.0-only
> > > > +
> > > > +#include <linux/range.h>
> > > > +#include "cxl.h"
> > > > +
> > > > +/* In x86 with memory hole, misaligned CFMWS range starts at 0x0 */
> > > > +#define MISALIGNED_CFMWS_RANGE_BASE 0x0
> > > 
> > > Could you more elaborate on this? Why is the base zero?
> > > 
> > I know it because I saw it's zero in all configurations showed by and discussed with 
> > BIOS engineers.
> 
> As said above, an additional check is needed here.
> 
Okay.
>
> [...]
> 
> > > > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > > > +bool arch_match_region(struct cxl_region_params *p,
> > > > +		       struct cxl_decoder *cxld)
> > > > +{
> > > > +	struct range *r = &cxld->hpa_range;
> > > > +	struct resource *res = p->res;
> > > > +	int niw = cxld->interleave_ways;
> > > > +
> > > > +	if (res->start == MISALIGNED_CFMWS_RANGE_BASE &&
> > > > +	    res->start == r->start && res->end < r->end &&
> > > > +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> > > > +		return true;
> > > 
> > > This basically means that the region base is zero?
> > >
> > Yes, it is. 
> > >
> > > How do you
> > > determine the actual base address of the range then? The cxl test
> > > patch contains a non-zero address.
> > >
> > Next version of this series is going to use 0x0 for real hardware and 
> > mock_cfmws[0]->base_hpa for the topology created by cxl tests.
> > >
> > > this trimmed mem hole regions? Could you share a snippet of a CFMWS
> > > entry for this?
> > >
> > Sorry, I don't have it because no real hardware was required to make and test
> > my series. 
> 
> Ok, will reorder my patches in v2 for you to reuse and take a look at
> your patch and the base_hpa setup. Will comment there again.
> 
> Thanks,
> 
> -Robert
> 
Thanks,

Fabio




^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2025-01-10 16:06 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-22 15:51 [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
2024-11-22 15:51 ` [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
2024-11-22 17:28   ` Ira Weiny
2024-11-25 21:10   ` Alison Schofield
2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
2024-11-22 17:25   ` Ira Weiny
2024-11-25 11:23     ` Li Ming
2024-11-25  8:41   ` kernel test robot
2024-11-25  8:42   ` Li Ming
2024-11-25 17:22     ` Fabio M. De Francesco
2024-11-26  2:13       ` Li Ming
2024-11-25 13:37   ` kernel test robot
2024-11-25 20:35   ` Alison Schofield
2024-11-25 22:44   ` kernel test robot
2024-12-16 21:30   ` Robert Richter
2025-01-08 14:48     ` Fabio M. De Francesco
2025-01-09 10:58       ` Robert Richter
2025-01-10 16:06         ` Fabio M. De Francesco
2024-11-22 15:51 ` [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
2024-11-22 17:26   ` Ira Weiny
2024-11-25 20:46   ` Alison Schofield
2024-11-26 15:00   ` Jonathan Cameron
2024-11-22 19:46 ` [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Gregory Price
2024-11-25 22:00 ` Alison Schofield
2024-12-03 18:23   ` Fabio M. De Francesco

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox