linux-doc.vger.kernel.org archive mirror
* [PATCH v4] cxl: docs/driver-api/conventions resolve conflicts between CFMWS, LMH, Decoders
From: Fabio M. De Francesco @ 2025-08-20 15:06 UTC
  To: linux-cxl
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Jonathan Corbet, linux-doc,
	linux-kernel, ALOK TIWARI, Randy Dunlap, Gregory Price,
	Fabio M. De Francesco

Add documentation on how to resolve conflicts between CXL Fixed Memory
Windows, Platform Low Memory Holes, and intermediate Switch and Endpoint
Decoders.

Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---

v3 -> v4: Show and explain how CFMWS, Root Decoders, Intermediate
	  Switch and Endpoint Decoders match and attach Regions in
	  x86 platforms with Low Memory Holes (Dave, Gregory, Ira)
	  Remove a wrong argument about large interleaves (Jonathan)

v2 -> v3: Rework a few phrases for better clarity.
	  Fix grammar and syntactic errors (Randy, Alok).
	  Fix semantic errors ("size does not comply", Alok).
	  Fix technical errors ("decoder's total memory?", Alok).
	  
v1 -> v2: Rewrite "Summary of the Change" section, 3rd paragraph.

 Documentation/driver-api/cxl/conventions.rst | 111 +++++++++++++++++++
 1 file changed, 111 insertions(+)

diff --git a/Documentation/driver-api/cxl/conventions.rst b/Documentation/driver-api/cxl/conventions.rst
index da347a81a237..714240ed2e04 100644
--- a/Documentation/driver-api/cxl/conventions.rst
+++ b/Documentation/driver-api/cxl/conventions.rst
@@ -45,3 +45,114 @@ Detailed Description of the Change
 ----------------------------------
 
 <Propose spec language that corrects the conflict.>
+
+
+Resolve conflict between CFMWS, Platform Memory Holes, and Endpoint Decoders
+============================================================================
+
+Document
+--------
+
+CXL Revision 3.2, Version 1.0
+
+License
+-------
+
+SPDX-License-Identifier: CC-BY-4.0
+
+Creator/Contributors
+--------------------
+
+- Fabio M. De Francesco, Intel
+- Dan J. Williams, Intel
+- Mahesh Natu, Intel
+
+Summary of the Change
+---------------------
+
+According to the current CXL Specifications (Revision 3.2, Version 1.0),
+the CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
+Physical Address (HPA) windows associated with each CXL Host Bridge. Each
+window represents a contiguous HPA range that may be interleaved across
+one or more targets, including CXL Host Bridges. Each window has a set of
+restrictions that govern its usage. It is the OSPM’s responsibility to
+utilize each window for the specified use.
+
+Table 9-22 states the Window Size field contains the total number of
+consecutive bytes of HPA this window represents. This value must be a
+multiple of the Number of Interleave Ways (NIW) * 256 MB.
+
+Platform Firmware (BIOS) might reserve physical addresses below 4 GB,
+such as the Low Memory Hole (LMH) for PCIe MMIO. In such cases, the CFMWS
+Window Size may not adhere to the NIW * 256 MB rule.
+
+On these systems, BIOS publishes CFMWS to communicate the active System
+Physical Address (SPA) ranges that map to a subset of the Host Physical
+Address (HPA) ranges. The SPA range trims out the hole, so any endpoint
+capacity whose CXL HPA range exceeds the matching CFMWS range has no SPA
+mapped to it and is lost.
+
+For example, on a real x86 platform with two CFMWS, 384 GB of total memory,
+and an LMH starting at 2 GB::
+
+  Window | CFMWS Base | CFMWS Size | HDM Decoder Base | HDM Decoder Size | Ways | Granularity
+    0    |   0 GB     |     2 GB   |      0 GB        |       3 GB       |  12  |    256
+    1    |   4 GB     |   380 GB   |      0 GB        |     380 GB       |  12  |    256
+
+The HDM Decoder Base and HDM Decoder Size columns describe all 12 Endpoint
+Decoders of a 12-way Region and all the intermediate Switch Decoders.
+The BIOS configures them according to the NIW * 256 MB rule, which for
+Window 0 results in an HPA range size of 3 GB (12 * 256 MB).
+
+The CFMWS Base and CFMWS Size are used to configure the Root Decoder HPA
+range base and size. Because a CFMWS cannot intersect Memory Holes, the
+CFMWS[0] size (2 GB) is smaller than that of the Switch and Endpoint
+Decoders that make up the hierarchy (3 GB).
+
+On that platform, only the first 2 GB are potentially usable but, because
+of the current specification, Linux fails to make them available to users.
+The driver expects the Root Decoder HPA size, which is equal to the size of
+the CFMWS from which it is configured, to be greater than or equal to that
+of the matching Switch and Endpoint HDM Decoders.
+
+As a result, the CXL driver fails to construct Regions and to attach
+Endpoint and intermediate Switch Decoders to those Regions.
+
+For Region construction and Decoder attachment to succeed, Linux must
+construct Regions with the Root Decoders' size and then attach to them all
+the intermediate Switch and Endpoint Decoders that are part of the
+hierarchy, even though the Decoders' HPA range sizes may be larger than
+those of Regions whose sizes are trimmed by Low Memory Holes.
+
+Benefits of the Change
+----------------------
+
+Without this change, the OSPM wouldn't match intermediate Switch and
+Endpoint Decoders with Root Decoders configured with CFMWS HPA sizes that
+don't align with the NIW * 256 MB constraint, leading to lost memdev
+capacity. This change allows the OSPM to construct Regions and attach
+intermediate Switch and Endpoint Decoders to them, so that the addressable
+part of the memory devices' total capacity is not lost.
+
+References
+----------
+
+Compute Express Link Specification Revision 3.2, Version 1.0
+<https://www.computeexpresslink.org/>
+
+Detailed Description of the Change
+----------------------------------
+
+The description of the Window Size field in Table 9-22 needs to account
+for platforms with Low Memory Holes, where SPA ranges might be subsets of
+the endpoints' HPA. Therefore, it has to be changed to the following:
+
+"The total number of consecutive bytes of HPA this window represents.
+This value shall be a multiple of NIW * 256 MB. On platforms that reserve
+physical addresses below 4 GB, such as the Low Memory Hole for PCIe MMIO
+on x86 or a requirement for greater than 8-way interleave CXL Regions
+starting at address 0, an instance of CFMWS whose Base HPA is 0 might have
+a window size that doesn't align with the NIW * 256 MB constraint. Note
+that the matching intermediate Switch and Endpoint Decoders' HPA range
+sizes must still align to the above-mentioned rule, but the memory capacity
+that exceeds the CFMWS window size will not be accessible."
-- 
2.50.1
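
The NIW * 256 MB arithmetic behind Window 0 of the example table in the
patch above can be checked with a minimal Python sketch. The helper names
(is_niw_aligned, usable_size) are hypothetical and exist only for this
illustration; they are not part of the CXL driver or of the specification.
The sketch assumes, as the Summary of the Change describes, that the
capacity reachable through a window is bounded by the CFMWS size even when
the matching HDM decoders cover a larger HPA range.

```python
MB = 1 << 20
GB = 1 << 30
NIW_UNIT = 256 * MB  # per-way unit of the Window Size rule


def is_niw_aligned(size: int, ways: int) -> bool:
    """Return True if size is a multiple of ways * 256 MB."""
    return size % (ways * NIW_UNIT) == 0


def usable_size(cfmws_size: int, decoder_size: int) -> int:
    """Assumed model: capacity beyond the CFMWS window is not accessible."""
    return min(cfmws_size, decoder_size)


# Window 0 of the example: 12-way interleave, LMH starting at 2 GB.
ways = 12
cfmws0_size = 2 * GB     # trimmed by the Low Memory Hole
decoder0_size = 3 * GB   # configured by BIOS per the NIW * 256 MB rule

print(is_niw_aligned(decoder0_size, ways))              # True
print(is_niw_aligned(cfmws0_size, ways))                # False
print(usable_size(cfmws0_size, decoder0_size) // GB)    # 2
```

The decoder size passes the check because 12 * 256 MB is exactly 3 GB,
while the 2 GB CFMWS size does not, which is the mismatch the proposed
Window Size wording is meant to permit.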


end of thread, other threads:[~2025-09-01 15:23 UTC | newest]

Thread overview: 9+ messages
2025-08-20 15:06 [PATCH v4] cxl: docs/driver-api/conventions resolve conflicts between CFMWS, LMH, Decoders Fabio M. De Francesco
2025-08-21 15:22 ` Dave Jiang
2025-08-22  1:55 ` Bagas Sanjaya
2025-08-26 13:49 ` Robert Richter
2025-09-01 12:22   ` Fabio M. De Francesco
2025-09-01 15:23     ` Robert Richter
2025-08-27 20:23 ` Gregory Price
2025-09-01 12:26   ` Fabio M. De Francesco
  -- strict thread matches above, loose matches on Subject: below --
2025-08-20 14:55 Fabio M. De Francesco
