[PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
@ 2026-01-10 11:46 Robert Richter
  2026-01-10 11:46 ` [PATCH v9 01/13] cxl/region: Rename misleading variable name @hpa to @hpa_range Robert Richter
                   ` (13 more replies)
  0 siblings, 14 replies; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

This patch set adds support for address translation using ACPI PRM and
enables this for AMD Zen5 platforms. The current approach bases on v4
and is in response to earlier attempts to implement CXL address
translation:

 * v1: [1] and the comments on it, esp. Dan's [2],
 * v2: [3] and comments on [4], esp. Dave's [5],
 * v3: [6] and comments on it, esp. Dave's [7],
 * v4: [8].

This version addresses Alison's review comments to change the
implementation to disable HPA/SPA translation handler. There are a
view minor but no major changes otherwise. See the changelog for
details. Thank you all for your reviews and testing.

Documentation of CXL Address Translation Support will be added to the
Kernel's "Compute Express Link: Linux Conventions". This patch
submission will be the base for a documentation patch that describes CXL
Address Translation support accordingly.

The CXL driver currently does not implement address translation which
assumes the host physical addresses (HPA) and system physical
addresses (SPA) are equal.

Systems with different HPA and SPA addresses need address translation.
If this is the case, the hardware addresses esp. used in the HDM
decoder configurations are different to the system's or parent port
address ranges. E.g. AMD Zen5 systems may be configured to use
'Normalized addresses'. Then, CXL endpoints have their own physical
address base which is not the same as the SPA used by the CXL host
bridge. Thus, addresses need to be translated from the endpoint's to
its CXL host bridge's address range.

To enable address translation, the endpoint's HPA range must be
translated to the CXL host bridge's address range. A callback is
introduced to translate a decoder's HPA to the CXL host bridge's
address range. The callback is then used to determine the region
parameters which includes the SPA translated address range of the
endpoint decoder and the interleaving configuration. This is stored in
struct cxl_region which allows an endpoint decoder to determine that
parameters based on its assigned region.

Note that only auto-discovery of decoders is supported. Thus, decoders
are locked and cannot be configured manually.

Finally, Zen5 address translation is enabled using ACPI PRMT.

This series bases on v6.19-rc1.

V9:
 * rebased onto v6.19-rc1,
 * updated sob-chains,
 * removed alignment check in cxl_prm_setup_root() for DPA ranges,
 * moved assignment to variable len in cxl_prm_setup_root() closer to user,
 * removed patch from series (Alison):
   [PATCH v8 12/13] cxl: Check if ULLONG_MAX was returned from translation functions
 * added patch to factor out poison setup code,
 * changed implementation to disable HPA/SPA translation handlers (Alison),

V8:
 * rebased onto cxl-for-6.19,
 * updated sob-chains,
 * renamed cxl_root callback to translation_setup_root,
 * renamed functions to cxl_root_setup_translation and cxl_prm_setup_root,
 * added comment around cxl_root_setup_translation(),
 * added check for ULLONG_MAX of return value of translation functions,
 * added callback to setup translation for regions
   (cxl_region_setup_translation, cxl_prm_setup_region),
 * add HPA/SPA callback handlers that return ULLONG_MAX (Alison),

V7:
 * rebased onto cxl/for-6.19/cxl-prm,
 * reworded comment and description of 11/11 (decoder lock),

V6:
 * rebased onto v6.18-rc5 and CXL updates for v6.19,
 * note: applies on top of: [PATCH v3 0/3] CXL updates for v6.19,

V5:
 * fixed build error with !CXL_REGION (kbot),
 * updated sob-chains,
 * added note to get_cxl_root_decoder() to drop reference after use
   (Dave),
 * moved initialization of base* variables in
   cxl_prm_translate_hpa_range() (Dave, Jonathan),
 * fixed initialization of cxlr->hpa_range for the non-auto case
   (Alison),
 * added description of the @hpa_range arg to
   cxl_calc_interleave_pos() (kbot),
 * removed optional patches 12-14 to send them separately (Alison,
   Dave),
 * reordered patches 1-6 to reduce dependencies between them and give
   way for early pick up candidates,
 * rebased onto cxl/next (c692f5a947ad),
 * added commas in comment in cxl_add_to_region() (Jonathan),
 * removed cxlmd from struct cxl_region_context (Dave, Jonathan),
 * removed use of PTR_ERR_OR_ZERO() (Jonathan),
 * increased wrap width to 80 chars for comments in cxl_atl.c (Jonathan),
 * moved (ways > 1) check out of while loop in cxl_prm_translate_hpa_range()
   (Jonathan),
 * removed trailing comma in struct prm_cxl_dpa_spa_data initializer (Jonathan),
 * updated patch description on locking the decoders (Dave, Jonathan),
 * spell fix in patch description (Jonathan),

V4:
 * rebased onto v6.18-rc2 (cxl/next),
 * updated sob-chain,
 * reworked and simplified code to use an address translation callback
   bound to the root port,
 * moved all address translation code to core/atl.c,
 * cxlr->cxlrd change, updated patch description (Alison),
 * use DEFINE_RANGE() (Jonathan),
 * change name to @hpa_range (Dave, Jonathan),
 * updated patch description if there is a no-op (Gregory),
 * use Designated initializers for struct cxl_region_context (Dave),
 * move callback handler to struct cxl_root_ops (Dave),
 * move handler initialization to acpi_probe() (Dave),
 * updated comment where Normalized Addressing is checked (Dave),
 * limit PRM enablement only to AMD supported kernel configs (AMD_NB)
   (Jonathan),
 * added 3 related optional cleanup patches at the end of the series,

V3:
 * rebased onto cxl/next,
 * complete rework to reduce number of required changes/patches and to
   remove platform specific code (Dan and Dave),
 * changed implementation allowing to add address translation to the
   CXL specification (documention patch in preparation),
 * simplified and generalized determination of interleaving
   parameters using the address translation callback,
 * depend only on the existence of the ACPI PRM GUID for CXL Address
   Translation enablement, removed platform checks,
 * small changes to region code only which does not require a full
   rework and refactoring of the code, just separating region
   parameter setup and region construction,
 * moved code to new core/atl.c file,
 * fixed subsys_initcall order dependency of EFI runtime services
   (Gregory and Joshua),

V2:
 * rebased onto cxl/next,
 * split of v1 in two parts:
   * removed cleanups and updates from this series to post them as a
     separate series (Dave),
   * this part 2 applies on top of part 1, v3,
 * added tags to SOB chain,
 * reworked architecture, vendor and platform setup (Jonathan):
   * added patch "cxl/x86: Prepare for architectural platform setup",
   * added function arch_cxl_port_platform_setup() plus a __weak
     versions for archs other than x86,
   * moved code to core/x86,
 * added comment to cxl_to_hpa_fn (Ben),
 * updated year in copyright statement (Ben),
 * cxl_port_calc_hpa(): Removed HPA check for zero (Jonathan), return
   1 if modified,
 * cxl_port_calc_pos(): Updated description and wording (Ben),
 * added several patches around interleaving and SPA calculation in
   cxl_endpoint_decoder_initialize(),
 * reworked iterator in cxl_endpoint_decoder_initialize() (Gregory),
 * fixed region interleaving parameters() (Alison),
 * fixed check in cxl_region_attach() (Alison),
 * Clarified in coverletter that not all ports in a system must
   implement the to_hpa() callback (Terry).

[1] https://lore.kernel.org/linux-cxl/20240701174754.967954-1-rrichter@amd.com/
[2] https://lore.kernel.org/linux-cxl/669086821f136_5fffa29473@dwillia2-xfh.jf.intel.com.notmuch/
[3] https://patchwork.kernel.org/project/cxl/cover/20250218132356.1809075-1-rrichter@amd.com/
[4] https://patchwork.kernel.org/project/cxl/cover/20250715191143.1023512-1-rrichter@amd.com/
[5] https://lore.kernel.org/all/78284b12-3e0b-4758-af18-397f32136c3f@intel.com/
[6] https://patchwork.kernel.org/project/cxl/cover/20250912144514.526441-1-rrichter@amd.com/
[7] https://lore.kernel.org/all/20250912144514.526441-8-rrichter@amd.com/T/#m23c2adb9d1e20770ccd5d11475288bda382b0af5
[8] https://patchwork.kernel.org/project/cxl/cover/20251103184804.509762-1-rrichter@amd.com/

Robert Richter (13):
  cxl/region: Rename misleading variable name @hpa to @hpa_range
  cxl/region: Store root decoder in struct cxl_region
  cxl/region: Store HPA range in struct cxl_region
  cxl: Simplify cxl_root_ops allocation and handling
  cxl/region: Separate region parameter setup and region construction
  cxl/region: Add @hpa_range argument to function
    cxl_calc_interleave_pos()
  cxl/region: Use region data to get the root decoder
  cxl: Introduce callback for HPA address ranges translation
  cxl/acpi: Prepare use of EFI runtime services
  cxl: Enable AMD Zen5 address translation using ACPI PRMT
  cxl/atl: Lock decoders that need address translation
  cxl/region: Factor out code into cxl_region_setup_poison()
  cxl: Disable HPA/SPA translation handlers for Normalized Addressing

 drivers/cxl/Kconfig       |   5 +
 drivers/cxl/acpi.c        |  17 +--
 drivers/cxl/core/Makefile |   1 +
 drivers/cxl/core/atl.c    | 211 ++++++++++++++++++++++++++++++++
 drivers/cxl/core/cdat.c   |   8 +-
 drivers/cxl/core/core.h   |   8 ++
 drivers/cxl/core/port.c   |   8 +-
 drivers/cxl/core/region.c | 247 ++++++++++++++++++++++++--------------
 drivers/cxl/cxl.h         |  40 ++++--
 9 files changed, 426 insertions(+), 119 deletions(-)
 create mode 100644 drivers/cxl/core/atl.c


base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
-- 
2.47.3


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH v9 01/13] cxl/region: Rename misleading variable name @hpa to @hpa_range
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-14  3:12   ` Alison Schofield
  2026-01-10 11:46 ` [PATCH v9 02/13] cxl/region: Store root decoder in struct cxl_region Robert Richter
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

@hpa is actually a @hpa_range, rename variables accordingly.

Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/region.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index ae899f68551f..51f1a5545324 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3474,9 +3474,9 @@ static int match_decoder_by_range(struct device *dev, const void *data)
 }
 
 static struct cxl_decoder *
-cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
+cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa_range)
 {
-	struct device *cxld_dev = device_find_child(&port->dev, hpa,
+	struct device *cxld_dev = device_find_child(&port->dev, hpa_range,
 						    match_decoder_by_range);
 
 	return cxld_dev ? to_cxl_decoder(cxld_dev) : NULL;
@@ -3489,14 +3489,14 @@ cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
 	struct cxl_port *port = cxled_to_port(cxled);
 	struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
 	struct cxl_decoder *root, *cxld = &cxled->cxld;
-	struct range *hpa = &cxld->hpa_range;
+	struct range *hpa_range = &cxld->hpa_range;
 
-	root = cxl_port_find_switch_decoder(&cxl_root->port, hpa);
+	root = cxl_port_find_switch_decoder(&cxl_root->port, hpa_range);
 	if (!root) {
 		dev_err(cxlmd->dev.parent,
 			"%s:%s no CXL window for range %#llx:%#llx\n",
 			dev_name(&cxlmd->dev), dev_name(&cxld->dev),
-			cxld->hpa_range.start, cxld->hpa_range.end);
+			hpa_range->start, hpa_range->end);
 		return NULL;
 	}
 
@@ -3562,7 +3562,7 @@ static int __construct_region(struct cxl_region *cxlr,
 			      struct cxl_endpoint_decoder *cxled)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
-	struct range *hpa = &cxled->cxld.hpa_range;
+	struct range *hpa_range = &cxled->cxld.hpa_range;
 	struct cxl_region_params *p;
 	struct resource *res;
 	int rc;
@@ -3583,7 +3583,7 @@ static int __construct_region(struct cxl_region *cxlr,
 	if (!res)
 		return -ENOMEM;
 
-	*res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa),
+	*res = DEFINE_RES_MEM_NAMED(hpa_range->start, range_len(hpa_range),
 				    dev_name(&cxlr->dev));
 
 	rc = cxl_extended_linear_cache_resize(cxlr, res);
@@ -3666,11 +3666,12 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 }
 
 static struct cxl_region *
-cxl_find_region_by_range(struct cxl_root_decoder *cxlrd, struct range *hpa)
+cxl_find_region_by_range(struct cxl_root_decoder *cxlrd,
+			 struct range *hpa_range)
 {
 	struct device *region_dev;
 
-	region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa,
+	region_dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa_range,
 				       match_region_by_range);
 	if (!region_dev)
 		return NULL;
@@ -3680,7 +3681,7 @@ cxl_find_region_by_range(struct cxl_root_decoder *cxlrd, struct range *hpa)
 
 int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
 {
-	struct range *hpa = &cxled->cxld.hpa_range;
+	struct range *hpa_range = &cxled->cxld.hpa_range;
 	struct cxl_region_params *p;
 	bool attach = false;
 	int rc;
@@ -3691,12 +3692,13 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
 		return -ENXIO;
 
 	/*
-	 * Ensure that if multiple threads race to construct_region() for @hpa
-	 * one does the construction and the others add to that.
+	 * Ensure that, if multiple threads race to construct_region()
+	 * for the HPA range, one does the construction and the others
+	 * add to that.
 	 */
 	mutex_lock(&cxlrd->range_lock);
 	struct cxl_region *cxlr __free(put_cxl_region) =
-		cxl_find_region_by_range(cxlrd, hpa);
+		cxl_find_region_by_range(cxlrd, hpa_range);
 	if (!cxlr)
 		cxlr = construct_region(cxlrd, cxled);
 	mutex_unlock(&cxlrd->range_lock);
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 02/13] cxl/region: Store root decoder in struct cxl_region
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
  2026-01-10 11:46 ` [PATCH v9 01/13] cxl/region: Rename misleading variable name @hpa to @hpa_range Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-14  3:13   ` Alison Schofield
  2026-01-10 11:46 ` [PATCH v9 03/13] cxl/region: Store HPA range " Robert Richter
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

A region is always bound to a root decoder. The region's associated
root decoder is often needed. Add it to struct cxl_region.

This simplifies the code by removing dynamic lookups and the root
decoder argument from the function argument list where possible.

Patch is a prerequisite to implement address translation which uses
struct cxl_region to store all relevant region and interleaving
parameters. It changes the argument list of __construct_region() in
preparation of adding a context argument. Additionally the arg list of
cxl_region_attach_position() is simplified and the use of
to_cxl_root_decoder() removed, which always reconstructs and checks
the pointer. The pointer never changes and is frequently used. Code
becomes more readable as this amphazises the binding between both
objects.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/region.c | 37 +++++++++++++++++++------------------
 drivers/cxl/cxl.h         |  2 ++
 2 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 51f1a5545324..22bd8ff37cef 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -489,9 +489,9 @@ static ssize_t interleave_ways_store(struct device *dev,
 				     struct device_attribute *attr,
 				     const char *buf, size_t len)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
-	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
 	struct cxl_region *cxlr = to_cxl_region(dev);
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
+	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
 	struct cxl_region_params *p = &cxlr->params;
 	unsigned int val, save;
 	int rc;
@@ -552,9 +552,9 @@ static ssize_t interleave_granularity_store(struct device *dev,
 					    struct device_attribute *attr,
 					    const char *buf, size_t len)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
-	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
 	struct cxl_region *cxlr = to_cxl_region(dev);
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
+	struct cxl_decoder *cxld = &cxlrd->cxlsd.cxld;
 	struct cxl_region_params *p = &cxlr->params;
 	int rc, val;
 	u16 ig;
@@ -628,7 +628,7 @@ static DEVICE_ATTR_RO(mode);
 
 static int alloc_hpa(struct cxl_region *cxlr, resource_size_t size)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	struct cxl_region_params *p = &cxlr->params;
 	struct resource *res;
 	u64 remainder = 0;
@@ -1373,7 +1373,7 @@ static int cxl_port_setup_targets(struct cxl_port *port,
 				  struct cxl_region *cxlr,
 				  struct cxl_endpoint_decoder *cxled)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	int parent_iw, parent_ig, ig, iw, rc, pos = cxled->pos;
 	struct cxl_port *parent_port = to_cxl_port(port->dev.parent);
 	struct cxl_region_ref *cxl_rr = cxl_rr_load(port, cxlr);
@@ -1731,10 +1731,10 @@ static int cxl_region_validate_position(struct cxl_region *cxlr,
 }
 
 static int cxl_region_attach_position(struct cxl_region *cxlr,
-				      struct cxl_root_decoder *cxlrd,
 				      struct cxl_endpoint_decoder *cxled,
 				      const struct cxl_dport *dport, int pos)
 {
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	struct cxl_switch_decoder *cxlsd = &cxlrd->cxlsd;
 	struct cxl_decoder *cxld = &cxlsd->cxld;
@@ -1971,7 +1971,7 @@ static int cxl_region_sort_targets(struct cxl_region *cxlr)
 static int cxl_region_attach(struct cxl_region *cxlr,
 			     struct cxl_endpoint_decoder *cxled, int pos)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct cxl_region_params *p = &cxlr->params;
@@ -2076,8 +2076,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 			ep_port = cxled_to_port(cxled);
 			dport = cxl_find_dport_by_dev(root_port,
 						      ep_port->host_bridge);
-			rc = cxl_region_attach_position(cxlr, cxlrd, cxled,
-							dport, i);
+			rc = cxl_region_attach_position(cxlr, cxled, dport, i);
 			if (rc)
 				return rc;
 		}
@@ -2100,7 +2099,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 	if (rc)
 		return rc;
 
-	rc = cxl_region_attach_position(cxlr, cxlrd, cxled, dport, pos);
+	rc = cxl_region_attach_position(cxlr, cxled, dport, pos);
 	if (rc)
 		return rc;
 
@@ -2396,8 +2395,8 @@ static const struct attribute_group *region_groups[] = {
 
 static void cxl_region_release(struct device *dev)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev->parent);
 	struct cxl_region *cxlr = to_cxl_region(dev);
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	int id = atomic_read(&cxlrd->region_id);
 
 	/*
@@ -2480,10 +2479,12 @@ static struct cxl_region *cxl_region_alloc(struct cxl_root_decoder *cxlrd, int i
 	 * region id allocations
 	 */
 	get_device(dev->parent);
+	cxlr->cxlrd = cxlrd;
+	cxlr->id = id;
+
 	device_set_pm_not_required(dev);
 	dev->bus = &cxl_bus_type;
 	dev->type = &cxl_region_type;
-	cxlr->id = id;
 	cxl_region_set_lock(cxlr, &cxlrd->cxlsd.cxld);
 
 	return cxlr;
@@ -3115,7 +3116,7 @@ EXPORT_SYMBOL_FOR_MODULES(cxl_calculate_hpa_offset, "cxl_translate");
 u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
 		   u64 dpa)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	struct cxl_region_params *p = &cxlr->params;
 	struct cxl_endpoint_decoder *cxled = NULL;
 	u64 dpa_offset, hpa_offset, hpa;
@@ -3168,7 +3169,7 @@ static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset,
 				       struct dpa_result *result)
 {
 	struct cxl_region_params *p = &cxlr->params;
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	struct cxl_endpoint_decoder *cxled;
 	u64 hpa, hpa_offset, dpa_offset;
 	u16 eig = 0;
@@ -3522,7 +3523,7 @@ static int match_region_by_range(struct device *dev, const void *data)
 static int cxl_extended_linear_cache_resize(struct cxl_region *cxlr,
 					    struct resource *res)
 {
-	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	struct cxl_region_params *p = &cxlr->params;
 	resource_size_t size = resource_size(res);
 	resource_size_t cache_size, start;
@@ -3558,9 +3559,9 @@ static int cxl_extended_linear_cache_resize(struct cxl_region *cxlr,
 }
 
 static int __construct_region(struct cxl_region *cxlr,
-			      struct cxl_root_decoder *cxlrd,
 			      struct cxl_endpoint_decoder *cxled)
 {
+	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	struct range *hpa_range = &cxled->cxld.hpa_range;
 	struct cxl_region_params *p;
@@ -3656,7 +3657,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 		return cxlr;
 	}
 
-	rc = __construct_region(cxlr, cxlrd, cxled);
+	rc = __construct_region(cxlr, cxled);
 	if (rc) {
 		devm_release_action(port->uport_dev, unregister_region, cxlr);
 		return ERR_PTR(rc);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index ba17fa86d249..10ce9c3a8a55 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -529,6 +529,7 @@ enum cxl_partition_mode {
  * struct cxl_region - CXL region
  * @dev: This region's device
  * @id: This region's id. Id is globally unique across all regions
+ * @cxlrd: Region's root decoder
  * @mode: Operational mode of the mapped capacity
  * @type: Endpoint decoder target type
  * @cxl_nvb: nvdimm bridge for coordinating @cxlr_pmem setup / shutdown
@@ -542,6 +543,7 @@ enum cxl_partition_mode {
 struct cxl_region {
 	struct device dev;
 	int id;
+	struct cxl_root_decoder *cxlrd;
 	enum cxl_partition_mode mode;
 	enum cxl_decoder_type type;
 	struct cxl_nvdimm_bridge *cxl_nvb;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 03/13] cxl/region: Store HPA range in struct cxl_region
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
  2026-01-10 11:46 ` [PATCH v9 01/13] cxl/region: Rename misleading variable name @hpa to @hpa_range Robert Richter
  2026-01-10 11:46 ` [PATCH v9 02/13] cxl/region: Store root decoder in struct cxl_region Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-14  3:14   ` Alison Schofield
  2026-01-10 11:46 ` [PATCH v9 04/13] cxl: Simplify cxl_root_ops allocation and handling Robert Richter
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

Each region has a known host physical address (HPA) range it is
assigned to. Endpoint decoders assigned to a region share the same HPA
range. The region's address range is the system's physical address
(SPA) range.

Endpoint decoders in systems that need address translation use HPAs
which are not SPAs. To make the SPA range accessible to the endpoint
decoders, store and track the region's SPA range in struct cxl_region.
Introduce the @hpa_range member to the struct. Now, the SPA range of
an endpoint decoder can be determined based on its assigned region.

Patch is a prerequisite to implement address translation which uses
struct cxl_region to store all relevant region and interleaving
parameters.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/region.c | 7 +++++++
 drivers/cxl/cxl.h         | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 22bd8ff37cef..04c3ff66ec81 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -664,6 +664,8 @@ static int alloc_hpa(struct cxl_region *cxlr, resource_size_t size)
 		return PTR_ERR(res);
 	}
 
+	cxlr->hpa_range = DEFINE_RANGE(res->start, res->end);
+
 	p->res = res;
 	p->state = CXL_CONFIG_INTERLEAVE_ACTIVE;
 
@@ -700,6 +702,8 @@ static int free_hpa(struct cxl_region *cxlr)
 	if (p->state >= CXL_CONFIG_ACTIVE)
 		return -EBUSY;
 
+	cxlr->hpa_range = DEFINE_RANGE(0, -1);
+
 	cxl_region_iomem_release(cxlr);
 	p->state = CXL_CONFIG_IDLE;
 	return 0;
@@ -2453,6 +2457,8 @@ static void unregister_region(void *_cxlr)
 	for (i = 0; i < p->interleave_ways; i++)
 		detach_target(cxlr, i);
 
+	cxlr->hpa_range = DEFINE_RANGE(0, -1);
+
 	cxl_region_iomem_release(cxlr);
 	put_device(&cxlr->dev);
 }
@@ -3579,6 +3585,7 @@ static int __construct_region(struct cxl_region *cxlr,
 	}
 
 	set_bit(CXL_REGION_F_AUTO, &cxlr->flags);
+	cxlr->hpa_range = *hpa_range;
 
 	res = kmalloc(sizeof(*res), GFP_KERNEL);
 	if (!res)
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 10ce9c3a8a55..3a5ca1936ed1 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -530,6 +530,7 @@ enum cxl_partition_mode {
  * @dev: This region's device
  * @id: This region's id. Id is globally unique across all regions
  * @cxlrd: Region's root decoder
+ * @hpa_range: Address range occupied by the region
  * @mode: Operational mode of the mapped capacity
  * @type: Endpoint decoder target type
  * @cxl_nvb: nvdimm bridge for coordinating @cxlr_pmem setup / shutdown
@@ -544,6 +545,7 @@ struct cxl_region {
 	struct device dev;
 	int id;
 	struct cxl_root_decoder *cxlrd;
+	struct range hpa_range;
 	enum cxl_partition_mode mode;
 	enum cxl_decoder_type type;
 	struct cxl_nvdimm_bridge *cxl_nvb;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 04/13] cxl: Simplify cxl_root_ops allocation and handling
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (2 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 03/13] cxl/region: Store HPA range " Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-14  3:16   ` Alison Schofield
  2026-01-10 11:46 ` [PATCH v9 05/13] cxl/region: Separate region parameter setup and region construction Robert Richter
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

A root port's callback handlers are collected in struct cxl_root_ops.
The structure is dynamically allocated, though it contains only a
single pointer in it. This also requires to check two pointers to
check for the existance of a callback.

Simplify the allocation, release and handler check by embedding the
ops statically in struct cxl_root.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/acpi.c      |  7 ++-----
 drivers/cxl/core/cdat.c |  8 ++++----
 drivers/cxl/core/port.c |  8 ++------
 drivers/cxl/cxl.h       | 19 ++++++++++---------
 4 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 77ac940e3013..b4bed40ef7c0 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -318,10 +318,6 @@ static int cxl_acpi_qos_class(struct cxl_root *cxl_root,
 	return cxl_acpi_evaluate_qtg_dsm(handle, coord, entries, qos_class);
 }
 
-static const struct cxl_root_ops acpi_root_ops = {
-	.qos_class = cxl_acpi_qos_class,
-};
-
 static void del_cxl_resource(struct resource *res)
 {
 	if (!res)
@@ -923,9 +919,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	cxl_res->end = -1;
 	cxl_res->flags = IORESOURCE_MEM;
 
-	cxl_root = devm_cxl_add_root(host, &acpi_root_ops);
+	cxl_root = devm_cxl_add_root(host);
 	if (IS_ERR(cxl_root))
 		return PTR_ERR(cxl_root);
+	cxl_root->ops.qos_class = cxl_acpi_qos_class;
 	root_port = &cxl_root->port;
 
 	rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index 7120b5f2e31f..18f0f2a25113 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -213,7 +213,7 @@ static int cxl_port_perf_data_calculate(struct cxl_port *port,
 	if (!cxl_root)
 		return -ENODEV;
 
-	if (!cxl_root->ops || !cxl_root->ops->qos_class)
+	if (!cxl_root->ops.qos_class)
 		return -EOPNOTSUPP;
 
 	xa_for_each(dsmas_xa, index, dent) {
@@ -221,9 +221,9 @@ static int cxl_port_perf_data_calculate(struct cxl_port *port,
 
 		cxl_coordinates_combine(dent->coord, dent->cdat_coord, ep_c);
 		dent->entries = 1;
-		rc = cxl_root->ops->qos_class(cxl_root,
-					      &dent->coord[ACCESS_COORDINATE_CPU],
-					      1, &qos_class);
+		rc = cxl_root->ops.qos_class(cxl_root,
+					     &dent->coord[ACCESS_COORDINATE_CPU],
+					     1, &qos_class);
 		if (rc != 1)
 			continue;
 
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index fef3aa0c6680..2338d146577c 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -954,19 +954,15 @@ struct cxl_port *devm_cxl_add_port(struct device *host,
 }
 EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, "CXL");
 
-struct cxl_root *devm_cxl_add_root(struct device *host,
-				   const struct cxl_root_ops *ops)
+struct cxl_root *devm_cxl_add_root(struct device *host)
 {
-	struct cxl_root *cxl_root;
 	struct cxl_port *port;
 
 	port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
 	if (IS_ERR(port))
 		return ERR_CAST(port);
 
-	cxl_root = to_cxl_root(port);
-	cxl_root->ops = ops;
-	return cxl_root;
+	return to_cxl_root(port);
 }
 EXPORT_SYMBOL_NS_GPL(devm_cxl_add_root, "CXL");
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 3a5ca1936ed1..0e15dc6e169f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -646,6 +646,14 @@ struct cxl_port {
 	resource_size_t component_reg_phys;
 };
 
+struct cxl_root;
+
+struct cxl_root_ops {
+	int (*qos_class)(struct cxl_root *cxl_root,
+			 struct access_coordinate *coord, int entries,
+			 int *qos_class);
+};
+
 /**
  * struct cxl_root - logical collection of root cxl_port items
  *
@@ -654,7 +662,7 @@ struct cxl_port {
  */
 struct cxl_root {
 	struct cxl_port port;
-	const struct cxl_root_ops *ops;
+	struct cxl_root_ops ops;
 };
 
 static inline struct cxl_root *
@@ -663,12 +671,6 @@ to_cxl_root(const struct cxl_port *port)
 	return container_of(port, struct cxl_root, port);
 }
 
-struct cxl_root_ops {
-	int (*qos_class)(struct cxl_root *cxl_root,
-			 struct access_coordinate *coord, int entries,
-			 int *qos_class);
-};
-
 static inline struct cxl_dport *
 cxl_find_dport_by_dev(struct cxl_port *port, const struct device *dport_dev)
 {
@@ -782,8 +784,7 @@ struct cxl_port *devm_cxl_add_port(struct device *host,
 				   struct device *uport_dev,
 				   resource_size_t component_reg_phys,
 				   struct cxl_dport *parent_dport);
-struct cxl_root *devm_cxl_add_root(struct device *host,
-				   const struct cxl_root_ops *ops);
+struct cxl_root *devm_cxl_add_root(struct device *host);
 struct cxl_root *find_cxl_root(struct cxl_port *port);
 
 DEFINE_FREE(put_cxl_root, struct cxl_root *, if (_T) put_device(&_T->port.dev))
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 05/13] cxl/region: Separate region parameter setup and region construction
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (3 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 04/13] cxl: Simplify cxl_root_ops allocation and handling Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-14  3:17   ` Alison Schofield
  2026-01-10 11:46 ` [PATCH v9 06/13] cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() Robert Richter
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

To construct a region, the region parameters such as address range and
interleaving config need to be determined. This is done while
constructing the region by inspecting the endpoint decoder
configuration. The endpoint decoder is passed as a function argument.

With address translation the endpoint decoder data is no longer
sufficient to extract the region parameters as some of the information
is obtained using other methods such as using firmware calls.

In a first step, separate code to determine the region parameters from
the region construction. Temporarily store all the data to create the
region in the new struct cxl_region_context. Once the region data is
determined and struct cxl_region_context is filled, construct the
region.

Patch is a prerequisite to implement address translation. The code
separation helps to later extend it to determine region parameters
using other methods as needed, esp. to support address translation.

Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/core.h   |  8 ++++++++
 drivers/cxl/core/region.c | 27 ++++++++++++++++++---------
 2 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 1fb66132b777..ae9e1bb51562 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -19,6 +19,14 @@ enum cxl_detach_mode {
 };
 
 #ifdef CONFIG_CXL_REGION
+
+struct cxl_region_context {
+	struct cxl_endpoint_decoder *cxled;
+	struct range hpa_range;
+	int interleave_ways;
+	int interleave_granularity;
+};
+
 extern struct device_attribute dev_attr_create_pmem_region;
 extern struct device_attribute dev_attr_create_ram_region;
 extern struct device_attribute dev_attr_delete_region;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 04c3ff66ec81..5ae77e9feb4d 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3565,11 +3565,12 @@ static int cxl_extended_linear_cache_resize(struct cxl_region *cxlr,
 }
 
 static int __construct_region(struct cxl_region *cxlr,
-			      struct cxl_endpoint_decoder *cxled)
+			      struct cxl_region_context *ctx)
 {
+	struct cxl_endpoint_decoder *cxled = ctx->cxled;
 	struct cxl_root_decoder *cxlrd = cxlr->cxlrd;
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
-	struct range *hpa_range = &cxled->cxld.hpa_range;
+	struct range *hpa_range = &ctx->hpa_range;
 	struct cxl_region_params *p;
 	struct resource *res;
 	int rc;
@@ -3622,8 +3623,8 @@ static int __construct_region(struct cxl_region *cxlr,
 	}
 
 	p->res = res;
-	p->interleave_ways = cxled->cxld.interleave_ways;
-	p->interleave_granularity = cxled->cxld.interleave_granularity;
+	p->interleave_ways = ctx->interleave_ways;
+	p->interleave_granularity = ctx->interleave_granularity;
 	p->state = CXL_CONFIG_INTERLEAVE_ACTIVE;
 
 	rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group());
@@ -3643,8 +3644,9 @@ static int __construct_region(struct cxl_region *cxlr,
 
 /* Establish an empty region covering the given HPA range */
 static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
-					   struct cxl_endpoint_decoder *cxled)
+					   struct cxl_region_context *ctx)
 {
+	struct cxl_endpoint_decoder *cxled = ctx->cxled;
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	struct cxl_port *port = cxlrd_to_port(cxlrd);
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
@@ -3664,7 +3666,7 @@ static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd,
 		return cxlr;
 	}
 
-	rc = __construct_region(cxlr, cxled);
+	rc = __construct_region(cxlr, ctx);
 	if (rc) {
 		devm_release_action(port->uport_dev, unregister_region, cxlr);
 		return ERR_PTR(rc);
@@ -3689,11 +3691,18 @@ cxl_find_region_by_range(struct cxl_root_decoder *cxlrd,
 
 int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
 {
-	struct range *hpa_range = &cxled->cxld.hpa_range;
+	struct cxl_region_context ctx;
 	struct cxl_region_params *p;
 	bool attach = false;
 	int rc;
 
+	ctx = (struct cxl_region_context) {
+		.cxled = cxled,
+		.hpa_range = cxled->cxld.hpa_range,
+		.interleave_ways = cxled->cxld.interleave_ways,
+		.interleave_granularity = cxled->cxld.interleave_granularity,
+	};
+
 	struct cxl_root_decoder *cxlrd __free(put_cxl_root_decoder) =
 		cxl_find_root_decoder(cxled);
 	if (!cxlrd)
@@ -3706,9 +3715,9 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
 	 */
 	mutex_lock(&cxlrd->range_lock);
 	struct cxl_region *cxlr __free(put_cxl_region) =
-		cxl_find_region_by_range(cxlrd, hpa_range);
+		cxl_find_region_by_range(cxlrd, &ctx.hpa_range);
 	if (!cxlr)
-		cxlr = construct_region(cxlrd, cxled);
+		cxlr = construct_region(cxlrd, &ctx);
 	mutex_unlock(&cxlrd->range_lock);
 
 	rc = PTR_ERR_OR_ZERO(cxlr);
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 06/13] cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (4 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 05/13] cxl/region: Separate region parameter setup and region construction Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-14  3:17   ` Alison Schofield
  2026-01-10 11:46 ` [PATCH v9 07/13] cxl/region: Use region data to get the root decoder Robert Richter
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

cxl_calc_interleave_pos() uses the endpoint decoder's HPA range to
determine its interleaving position. This requires the endpoint
decoders to be an SPA, which is not the case for systems that need
address translation.

Add a separate @hpa_range argument to function
cxl_calc_interleave_pos() to specify the address range. Now it is
possible to pass the SPA translated address range of an endpoint
decoder to function cxl_calc_interleave_pos().

Refactor only, no functional changes.

Patch is a prerequisite to implement address translation.

Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/region.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5ae77e9feb4d..60d2d1dae2aa 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1878,6 +1878,7 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
 /**
  * cxl_calc_interleave_pos() - calculate an endpoint position in a region
  * @cxled: endpoint decoder member of given region
+ * @hpa_range: translated HPA range of the endpoint
  *
  * The endpoint position is calculated by traversing the topology from
  * the endpoint to the root decoder and iteratively applying this
@@ -1890,11 +1891,11 @@ static int find_pos_and_ways(struct cxl_port *port, struct range *range,
  * Return: position >= 0 on success
  *	   -ENXIO on failure
  */
-static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
+static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled,
+				   struct range *hpa_range)
 {
 	struct cxl_port *iter, *port = cxled_to_port(cxled);
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
-	struct range *range = &cxled->cxld.hpa_range;
 	int parent_ways = 0, parent_pos = 0, pos = 0;
 	int rc;
 
@@ -1932,7 +1933,8 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
 		if (is_cxl_root(iter))
 			break;
 
-		rc = find_pos_and_ways(iter, range, &parent_pos, &parent_ways);
+		rc = find_pos_and_ways(iter, hpa_range, &parent_pos,
+				       &parent_ways);
 		if (rc)
 			return rc;
 
@@ -1942,7 +1944,7 @@ static int cxl_calc_interleave_pos(struct cxl_endpoint_decoder *cxled)
 	dev_dbg(&cxlmd->dev,
 		"decoder:%s parent:%s port:%s range:%#llx-%#llx pos:%d\n",
 		dev_name(&cxled->cxld.dev), dev_name(cxlmd->dev.parent),
-		dev_name(&port->dev), range->start, range->end, pos);
+		dev_name(&port->dev), hpa_range->start, hpa_range->end, pos);
 
 	return pos;
 }
@@ -1955,7 +1957,7 @@ static int cxl_region_sort_targets(struct cxl_region *cxlr)
 	for (i = 0; i < p->nr_targets; i++) {
 		struct cxl_endpoint_decoder *cxled = p->targets[i];
 
-		cxled->pos = cxl_calc_interleave_pos(cxled);
+		cxled->pos = cxl_calc_interleave_pos(cxled, &cxlr->hpa_range);
 		/*
 		 * Record that sorting failed, but still continue to calc
 		 * cxled->pos so that follow-on code paths can reliably
@@ -2139,7 +2141,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 		struct cxl_endpoint_decoder *cxled = p->targets[i];
 		int test_pos;
 
-		test_pos = cxl_calc_interleave_pos(cxled);
+		test_pos = cxl_calc_interleave_pos(cxled, &cxlr->hpa_range);
 		dev_dbg(&cxled->cxld.dev,
 			"Test cxl_calc_interleave_pos(): %s test_pos:%d cxled->pos:%d\n",
 			(test_pos == cxled->pos) ? "success" : "fail",
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 07/13] cxl/region: Use region data to get the root decoder
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (5 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 06/13] cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-14  3:19   ` Alison Schofield
  2026-01-10 11:46 ` [PATCH v9 08/13] cxl: Introduce callback for HPA address ranges translation Robert Richter
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

To find a region's root decoder, the endpoint's HPA range is used to
search the matching decoder by its range. With address translation the
endpoint decoder's range is in a different address space and thus
cannot be used to determine the root decoder.

The region parameters are encapsulated within struc cxl_region_context
and may include the translated Host Physical Address (HPA) range. Use
this context to identify the root decoder rather than relying on the
endpoint.

Modify cxl_find_root_decoder() and add the region context as
parameter. Rename this function to get_cxl_root_decoder() as a
counterpart to put_cxl_root_decoder(). Simplify the implementation by
removing function cxl_port_find_switch_decode(). The function is
unnecessary because it is not referenced or utilized elsewhere in the
code.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/region.c | 50 +++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 26 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 60d2d1dae2aa..912796fd708e 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3469,47 +3469,44 @@ static int devm_cxl_add_dax_region(struct cxl_region *cxlr)
 	return rc;
 }
 
-static int match_decoder_by_range(struct device *dev, const void *data)
+static int match_root_decoder(struct device *dev, const void *data)
 {
 	const struct range *r1, *r2 = data;
-	struct cxl_decoder *cxld;
+	struct cxl_root_decoder *cxlrd;
 
-	if (!is_switch_decoder(dev))
+	if (!is_root_decoder(dev))
 		return 0;
 
-	cxld = to_cxl_decoder(dev);
-	r1 = &cxld->hpa_range;
-	return range_contains(r1, r2);
-}
-
-static struct cxl_decoder *
-cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa_range)
-{
-	struct device *cxld_dev = device_find_child(&port->dev, hpa_range,
-						    match_decoder_by_range);
+	cxlrd = to_cxl_root_decoder(dev);
+	r1 = &cxlrd->cxlsd.cxld.hpa_range;
 
-	return cxld_dev ? to_cxl_decoder(cxld_dev) : NULL;
+	return range_contains(r1, r2);
 }
 
+/*
+ * Note, when finished with the device, drop the reference with
+ * put_device() or use the put_cxl_root_decoder helper.
+ */
 static struct cxl_root_decoder *
-cxl_find_root_decoder(struct cxl_endpoint_decoder *cxled)
+get_cxl_root_decoder(struct cxl_endpoint_decoder *cxled,
+		     struct cxl_region_context *ctx)
 {
 	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 	struct cxl_port *port = cxled_to_port(cxled);
 	struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
-	struct cxl_decoder *root, *cxld = &cxled->cxld;
-	struct range *hpa_range = &cxld->hpa_range;
+	struct device *cxlrd_dev;
 
-	root = cxl_port_find_switch_decoder(&cxl_root->port, hpa_range);
-	if (!root) {
+	cxlrd_dev = device_find_child(&cxl_root->port.dev, &ctx->hpa_range,
+				      match_root_decoder);
+	if (!cxlrd_dev) {
 		dev_err(cxlmd->dev.parent,
 			"%s:%s no CXL window for range %#llx:%#llx\n",
-			dev_name(&cxlmd->dev), dev_name(&cxld->dev),
-			hpa_range->start, hpa_range->end);
-		return NULL;
+			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
+			ctx->hpa_range.start, ctx->hpa_range.end);
+		return ERR_PTR(-ENXIO);
 	}
 
-	return to_cxl_root_decoder(&root->dev);
+	return to_cxl_root_decoder(cxlrd_dev);
 }
 
 static int match_region_by_range(struct device *dev, const void *data)
@@ -3706,9 +3703,10 @@ int cxl_add_to_region(struct cxl_endpoint_decoder *cxled)
 	};
 
 	struct cxl_root_decoder *cxlrd __free(put_cxl_root_decoder) =
-		cxl_find_root_decoder(cxled);
-	if (!cxlrd)
-		return -ENXIO;
+		get_cxl_root_decoder(cxled, &ctx);
+
+	if (IS_ERR(cxlrd))
+		return PTR_ERR(cxlrd);
 
 	/*
 	 * Ensure that, if multiple threads race to construct_region()
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 08/13] cxl: Introduce callback for HPA address ranges translation
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (6 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 07/13] cxl/region: Use region data to get the root decoder Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-14  3:20   ` Alison Schofield
  2026-01-10 11:46 ` [PATCH v9 09/13] cxl/acpi: Prepare use of EFI runtime services Robert Richter
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

Introduce a callback to translate an endpoint's HPA range to the
address range of the root port which is the System Physical Address
(SPA) range used by a region. The callback can be set if a platform
needs to handle address translation.

The callback is attached to the root port. An endpoint's root port can
easily be determined in the PCI hierarchy without any CXL specific
knowledge. This allows the early use of address translation for CXL
enumeration. Address translation is esp. needed for the detection of
the root decoders. Thus, the callback is embedded in struct
cxl_root_ops instead of struct cxl_rd_ops.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/region.c | 24 ++++++++++++++++++++++++
 drivers/cxl/cxl.h         |  1 +
 2 files changed, 25 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 912796fd708e..ed8469fa55a9 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3483,6 +3483,15 @@ static int match_root_decoder(struct device *dev, const void *data)
 	return range_contains(r1, r2);
 }
 
+static int cxl_root_setup_translation(struct cxl_root *cxl_root,
+				      struct cxl_region_context *ctx)
+{
+	if (!cxl_root->ops.translation_setup_root)
+		return 0;
+
+	return cxl_root->ops.translation_setup_root(cxl_root, ctx);
+}
+
 /*
  * Note, when finished with the device, drop the reference with
  * put_device() or use the put_cxl_root_decoder helper.
@@ -3495,6 +3504,21 @@ get_cxl_root_decoder(struct cxl_endpoint_decoder *cxled,
 	struct cxl_port *port = cxled_to_port(cxled);
 	struct cxl_root *cxl_root __free(put_cxl_root) = find_cxl_root(port);
 	struct device *cxlrd_dev;
+	int rc;
+
+	/*
+	 * Adjust the endpoint's HPA range and interleaving
+	 * configuration to the root decoder’s memory space before
+	 * setting up the root decoder.
+	 */
+	rc = cxl_root_setup_translation(cxl_root, ctx);
+	if (rc) {
+		dev_err(cxlmd->dev.parent,
+			"%s:%s Failed to setup translation for address range %#llx:%#llx\n",
+			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
+			ctx->hpa_range.start, ctx->hpa_range.end);
+		return ERR_PTR(rc);
+	}
 
 	cxlrd_dev = device_find_child(&cxl_root->port.dev, &ctx->hpa_range,
 				      match_root_decoder);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 0e15dc6e169f..8ea334d81edf 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -652,6 +652,7 @@ struct cxl_root_ops {
 	int (*qos_class)(struct cxl_root *cxl_root,
 			 struct access_coordinate *coord, int entries,
 			 int *qos_class);
+	int (*translation_setup_root)(struct cxl_root *cxl_root, void *data);
 };
 
 /**
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 09/13] cxl/acpi: Prepare use of EFI runtime services
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (7 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 08/13] cxl: Introduce callback for HPA address ranges translation Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-10 11:46 ` [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT Robert Richter
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

In order to use EFI runtime services, esp. ACPI PRM which uses the
efi_rts_wq workqueue, initialize EFI before CXL ACPI.

There is a subsys_initcall order dependency if driver is builtin:

 subsys_initcall(cxl_acpi_init);
 subsys_initcall(efisubsys_init);

Prevent the efi_rts_wq workqueue being used by cxl_acpi_init() before
its allocation. Use subsys_initcall_sync(cxl_acpi_init) to always run
efisubsys_init() first.

Reported-by: Gregory Price <gourry@gourry.net>
Tested-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/acpi.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index b4bed40ef7c0..a31d0f97f916 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -1005,8 +1005,12 @@ static void __exit cxl_acpi_exit(void)
 	cxl_bus_drain();
 }
 
-/* load before dax_hmem sees 'Soft Reserved' CXL ranges */
-subsys_initcall(cxl_acpi_init);
+/*
+ * Load before dax_hmem sees 'Soft Reserved' CXL ranges. Use
+ * subsys_initcall_sync() since there is an order dependency with
+ * subsys_initcall(efisubsys_init), which must run first.
+ */
+subsys_initcall_sync(cxl_acpi_init);
 
 /*
  * Arrange for host-bridge ports to be active synchronous with
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (8 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 09/13] cxl/acpi: Prepare use of EFI runtime services Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-14  7:47   ` Ard Biesheuvel
  2026-01-10 11:46 ` [PATCH v9 11/13] cxl/atl: Lock decoders that need address translation Robert Richter
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

Add AMD Zen5 support for address translation.

Zen5 systems may be configured to use 'Normalized addresses'. Then,
host physical addresses (HPA) are different from their system physical
addresses (SPA). The endpoint has its own physical address space and
an incoming HPA is already converted to the device's physical address
(DPA). Thus it has interleaving disabled and CXL endpoints are
programmed passthrough (DPA == HPA).

Host Physical Addresses (HPAs) need to be translated from the endpoint
to its CXL host bridge, esp. to identify the endpoint's root decoder
and region's address range. ACPI Platform Runtime Mechanism (PRM)
provides a handler to translate the DPA to its SPA. This is documented
in:

 AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
 ACPI v6.5 Porting Guide, Publication # 58088
 https://www.amd.com/en/search/documentation/hub.html

With Normalized Addressing this PRM handler must be used to translate
an HPA of an endpoint to its SPA.

Do the following to implement AMD Zen5 address translation:

Introduce a new file core/atl.c to handle ACPI PRM specific address
translation code. Naming is loosely related to the kernel's AMD
Address Translation Library (CONFIG_AMD_ATL) but implementation does
not depend on it, nor it is vendor specific. Use Kbuild and Kconfig
options respectively to enable the code depending on architecture and
platform options.

AMD Zen5 systems support the ACPI PRM CXL Address Translation firmware
call (see ACPI v6.5 Porting Guide, Address Translation - CXL DPA to
System Physical Address). Firmware enables the PRM handler if the
platform has address translation implemented. Check firmware and
kernel support of ACPI PRM using the specific GUID. On success enable
address translation by setting up the earlier introduced root port
callback, see function cxl_prm_setup_translation(). Setup is done in
cxl_setup_prm_address_translation(), it is the only function that
needs to be exported. For low level PRM firmware calls, use the ACPI
framework.

Identify the region's interleaving ways by inspecting the address
ranges. Also determine the interleaving granularity using the address
translation callback. Note that the position of the chunk from one
interleaving block to the next may vary and thus cannot be considered
constant. Address offsets larger than the interleaving block size
cannot be used to calculate the granularity. Thus, probe the
granularity using address translation for various HPAs in the same
interleaving block.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/Kconfig       |   5 +
 drivers/cxl/acpi.c        |   2 +
 drivers/cxl/core/Makefile |   1 +
 drivers/cxl/core/atl.c    | 190 ++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h         |   7 ++
 5 files changed, 205 insertions(+)
 create mode 100644 drivers/cxl/core/atl.c

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 48b7314afdb8..103950a9b73e 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -233,4 +233,9 @@ config CXL_MCE
 	def_bool y
 	depends on X86_MCE && MEMORY_FAILURE
 
+config CXL_ATL
+	def_bool y
+	depends on CXL_REGION
+	depends on ACPI_PRMT && AMD_NB
+
 endif
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index a31d0f97f916..50c2987e0459 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -925,6 +925,8 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	cxl_root->ops.qos_class = cxl_acpi_qos_class;
 	root_port = &cxl_root->port;
 
+	cxl_setup_prm_address_translation(cxl_root);
+
 	rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
 			      add_host_bridge_dport);
 	if (rc < 0)
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 5ad8fef210b5..11fe272a6e29 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -20,3 +20,4 @@ cxl_core-$(CONFIG_CXL_REGION) += region.o
 cxl_core-$(CONFIG_CXL_MCE) += mce.o
 cxl_core-$(CONFIG_CXL_FEATURES) += features.o
 cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
+cxl_core-$(CONFIG_CXL_ATL) += atl.o
diff --git a/drivers/cxl/core/atl.c b/drivers/cxl/core/atl.c
new file mode 100644
index 000000000000..c36984686fb0
--- /dev/null
+++ b/drivers/cxl/core/atl.c
@@ -0,0 +1,190 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2025 Advanced Micro Devices, Inc.
+ */
+
+#include <linux/prmt.h>
+#include <linux/pci.h>
+#include <linux/acpi.h>
+
+#include <cxlmem.h>
+#include "core.h"
+
+/*
+ * PRM Address Translation - CXL DPA to System Physical Address
+ *
+ * Reference:
+ *
+ * AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
+ * ACPI v6.5 Porting Guide, Publication # 58088
+ */
+
+static const guid_t prm_cxl_dpa_spa_guid =
+	GUID_INIT(0xee41b397, 0x25d4, 0x452c, 0xad, 0x54, 0x48, 0xc6, 0xe3,
+		  0x48, 0x0b, 0x94);
+
+struct prm_cxl_dpa_spa_data {
+	u64 dpa;
+	u8 reserved;
+	u8 devfn;
+	u8 bus;
+	u8 segment;
+	u64 *spa;
+} __packed;
+
+static u64 prm_cxl_dpa_spa(struct pci_dev *pci_dev, u64 dpa)
+{
+	struct prm_cxl_dpa_spa_data data;
+	u64 spa;
+	int rc;
+
+	data = (struct prm_cxl_dpa_spa_data) {
+		.dpa     = dpa,
+		.devfn   = pci_dev->devfn,
+		.bus     = pci_dev->bus->number,
+		.segment = pci_domain_nr(pci_dev->bus),
+		.spa     = &spa,
+	};
+
+	rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
+	if (rc) {
+		pci_dbg(pci_dev, "failed to get SPA for %#llx: %d\n", dpa, rc);
+		return ULLONG_MAX;
+	}
+
+	pci_dbg(pci_dev, "PRM address translation: DPA -> SPA: %#llx -> %#llx\n", dpa, spa);
+
+	return spa;
+}
+
+static int cxl_prm_setup_root(struct cxl_root *cxl_root, void *data)
+{
+	struct cxl_region_context *ctx = data;
+	struct cxl_endpoint_decoder *cxled = ctx->cxled;
+	struct cxl_decoder *cxld = &cxled->cxld;
+	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+	struct range hpa_range = ctx->hpa_range;
+	struct pci_dev *pci_dev;
+	u64 spa_len, len;
+	u64 addr, base_spa, base;
+	int ways, gran;
+
+	/*
+	 * When Normalized Addressing is enabled, the endpoint maintains a 1:1
+	 * mapping between HPA and DPA. If disabled, skip address translation
+	 * and perform only a range check.
+	 */
+	if (hpa_range.start != cxled->dpa_res->start)
+		return 0;
+
+	/*
+	 * Endpoints are programmed passthrough in Normalized Addressing mode.
+	 */
+	if (ctx->interleave_ways != 1) {
+		dev_dbg(&cxld->dev, "unexpected interleaving config: ways: %d granularity: %d\n",
+			ctx->interleave_ways, ctx->interleave_granularity);
+		return -ENXIO;
+	}
+
+	if (!cxlmd || !dev_is_pci(cxlmd->dev.parent)) {
+		dev_dbg(&cxld->dev, "No endpoint found: %s, range %#llx-%#llx\n",
+			dev_name(cxld->dev.parent), hpa_range.start,
+			hpa_range.end);
+		return -ENXIO;
+	}
+
+	pci_dev = to_pci_dev(cxlmd->dev.parent);
+
+	/* Translate HPA range to SPA. */
+	base = hpa_range.start;
+	hpa_range.start = prm_cxl_dpa_spa(pci_dev, hpa_range.start);
+	hpa_range.end = prm_cxl_dpa_spa(pci_dev, hpa_range.end);
+	base_spa = hpa_range.start;
+
+	if (hpa_range.start == ULLONG_MAX || hpa_range.end == ULLONG_MAX) {
+		dev_dbg(cxld->dev.parent,
+			"CXL address translation: Failed to translate HPA range: %#llx-%#llx:%#llx-%#llx(%s)\n",
+			hpa_range.start, hpa_range.end, ctx->hpa_range.start,
+			ctx->hpa_range.end, dev_name(&cxld->dev));
+		return -ENXIO;
+	}
+
+	/*
+	 * Since translated addresses include the interleaving offsets, align
+	 * the range to 256 MB.
+	 */
+	hpa_range.start = ALIGN_DOWN(hpa_range.start, SZ_256M);
+	hpa_range.end = ALIGN(hpa_range.end, SZ_256M) - 1;
+
+	len = range_len(&ctx->hpa_range);
+	spa_len = range_len(&hpa_range);
+	if (!len || !spa_len || spa_len % len) {
+		dev_dbg(cxld->dev.parent,
+			"CXL address translation: HPA range not contiguous: %#llx-%#llx:%#llx-%#llx(%s)\n",
+			hpa_range.start, hpa_range.end, ctx->hpa_range.start,
+			ctx->hpa_range.end, dev_name(&cxld->dev));
+		return -ENXIO;
+	}
+
+	ways = spa_len / len;
+	gran = SZ_256;
+
+	/*
+	 * Determine interleave granularity
+	 *
+	 * Note: The position of the chunk from one interleaving block to the
+	 * next may vary and thus cannot be considered constant. Address offsets
+	 * larger than the interleaving block size cannot be used to calculate
+	 * the granularity.
+	 */
+	if (ways > 1) {
+		while (gran <= SZ_16M) {
+			addr = prm_cxl_dpa_spa(pci_dev, base + gran);
+			if (addr != base_spa + gran)
+				break;
+			gran <<= 1;
+		}
+	}
+
+	if (gran > SZ_16M) {
+		dev_dbg(cxld->dev.parent,
+			"CXL address translation: Cannot determine granularity: %#llx-%#llx:%#llx-%#llx(%s)\n",
+			hpa_range.start, hpa_range.end, ctx->hpa_range.start,
+			ctx->hpa_range.end, dev_name(&cxld->dev));
+		return -ENXIO;
+	}
+
+	ctx->hpa_range = hpa_range;
+	ctx->interleave_ways = ways;
+	ctx->interleave_granularity = gran;
+
+	dev_dbg(&cxld->dev,
+		"address mapping found for %s (hpa -> spa): %#llx+%#llx -> %#llx+%#llx ways:%d granularity:%d\n",
+		dev_name(cxlmd->dev.parent), base, len, hpa_range.start,
+		spa_len, ways, gran);
+
+	return 0;
+}
+
+void cxl_setup_prm_address_translation(struct cxl_root *cxl_root)
+{
+	struct device *host = cxl_root->port.uport_dev;
+	u64 spa;
+	struct prm_cxl_dpa_spa_data data = { .spa = &spa };
+	int rc;
+
+	/*
+	 * Applies only to PCIe Host Bridges which are children of the CXL Root
+	 * Device (HID=“ACPI0017”). Check this and drop cxl_test instances.
+	 */
+	if (!acpi_match_device(host->driver->acpi_match_table, host))
+		return;
+
+	/* Check kernel (-EOPNOTSUPP) and firmware support (-ENODEV) */
+	rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
+	if (rc == -EOPNOTSUPP || rc == -ENODEV)
+		return;
+
+	cxl_root->ops.translation_setup_root = cxl_prm_setup_root;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_setup_prm_address_translation, "CXL");
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 8ea334d81edf..20b0fd43fa7b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -817,6 +817,13 @@ static inline void cxl_dport_init_ras_reporting(struct cxl_dport *dport,
 						struct device *host) { }
 #endif
 
+#ifdef CONFIG_CXL_ATL
+void cxl_setup_prm_address_translation(struct cxl_root *cxl_root);
+#else
+static inline
+void cxl_setup_prm_address_translation(struct cxl_root *cxl_root) {}
+#endif
+
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
 struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 11/13] cxl/atl: Lock decoders that need address translation
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (9 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-10 11:46 ` [PATCH v9 12/13] cxl/region: Factor out code into cxl_region_setup_poison() Robert Richter
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

The current kernel implementation does not support endpoint setup with
Normalized Addressing. It only translates an endpoint's DPA to the SPA
range of the host bridge. Therefore, the endpoint address range cannot
be determined, making a non-auto setup impossible. If a decoder
requires address translation, reprogramming should be disabled and the
decoder locked.

The BIOS, however, provides all the necessary address translation
data, which the kernel can use to reconfigure endpoint decoders with
normalized addresses. Locking the decoders in the BIOS would prevent a
capable kernel (or other operating systems) from shutting down
auto-generated regions and managing resources dynamically.

Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/atl.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/cxl/core/atl.c b/drivers/cxl/core/atl.c
index c36984686fb0..09d0ea1792d9 100644
--- a/drivers/cxl/core/atl.c
+++ b/drivers/cxl/core/atl.c
@@ -154,6 +154,24 @@ static int cxl_prm_setup_root(struct cxl_root *cxl_root, void *data)
 		return -ENXIO;
 	}
 
+	/*
+	 * The current kernel implementation does not support endpoint
+	 * setup with Normalized Addressing. It only translates an
+	 * endpoint's DPA to the SPA range of the host bridge.
+	 * Therefore, the endpoint address range cannot be determined,
+	 * making a non-auto setup impossible. If a decoder requires
+	 * address translation, reprogramming should be disabled and
+	 * the decoder locked.
+	 *
+	 * The BIOS, however, provides all the necessary address
+	 * translation data, which the kernel can use to reconfigure
+	 * endpoint decoders with normalized addresses. Locking the
+	 * decoders in the BIOS would prevent a capable kernel (or
+	 * other operating systems) from shutting down auto-generated
+	 * regions and managing resources dynamically.
+	 */
+	cxld->flags |= CXL_DECODER_F_LOCK;
+
 	ctx->hpa_range = hpa_range;
 	ctx->interleave_ways = ways;
 	ctx->interleave_granularity = gran;
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 12/13] cxl/region: Factor out code into cxl_region_setup_poison()
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (10 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 11/13] cxl/atl: Lock decoders that need address translation Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-13 22:39   ` Dave Jiang
  2026-01-14  3:32   ` Alison Schofield
  2026-01-10 11:46 ` [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing Robert Richter
  2026-02-03 18:52 ` [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Dave Jiang
  13 siblings, 2 replies; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

Poison injection setup code is embedded in cxl_region_probe(). For
improved encapsulation, readability, and maintainability, factor out
code into function cxl_region_setup_poison().

This patch is a prerequisit to disable poison injection for Normalized
Addressing.

No functional changes.

Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/region.c | 53 +++++++++++++++++++++------------------
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index ed8469fa55a9..80cd77f0842e 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3916,6 +3916,31 @@ static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
 DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
 			 cxl_region_debugfs_poison_clear, "%llx\n");
 
+static int cxl_region_setup_poison(struct cxl_region *cxlr)
+{
+	struct device *dev = &cxlr->dev;
+	struct cxl_region_params *p = &cxlr->params;
+	struct dentry *dentry;
+
+	/* Create poison attributes if all memdevs support the capabilities */
+	for (int i = 0; i < p->nr_targets; i++) {
+		struct cxl_endpoint_decoder *cxled = p->targets[i];
+		struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+
+		if (!cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_INJECT) ||
+		    !cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_CLEAR))
+			return 0;
+	}
+
+	dentry = cxl_debugfs_create_dir(dev_name(dev));
+	debugfs_create_file("inject_poison", 0200, dentry, cxlr,
+			    &cxl_poison_inject_fops);
+	debugfs_create_file("clear_poison", 0200, dentry, cxlr,
+			    &cxl_poison_clear_fops);
+
+	return devm_add_action_or_reset(dev, remove_debugfs, dentry);
+}
+
 static int cxl_region_can_probe(struct cxl_region *cxlr)
 {
 	struct cxl_region_params *p = &cxlr->params;
@@ -3945,7 +3970,6 @@ static int cxl_region_probe(struct device *dev)
 {
 	struct cxl_region *cxlr = to_cxl_region(dev);
 	struct cxl_region_params *p = &cxlr->params;
-	bool poison_supported = true;
 	int rc;
 
 	rc = cxl_region_can_probe(cxlr);
@@ -3969,30 +3993,9 @@ static int cxl_region_probe(struct device *dev)
 	if (rc)
 		return rc;
 
-	/* Create poison attributes if all memdevs support the capabilities */
-	for (int i = 0; i < p->nr_targets; i++) {
-		struct cxl_endpoint_decoder *cxled = p->targets[i];
-		struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
-
-		if (!cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_INJECT) ||
-		    !cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_CLEAR)) {
-			poison_supported = false;
-			break;
-		}
-	}
-
-	if (poison_supported) {
-		struct dentry *dentry;
-
-		dentry = cxl_debugfs_create_dir(dev_name(dev));
-		debugfs_create_file("inject_poison", 0200, dentry, cxlr,
-				    &cxl_poison_inject_fops);
-		debugfs_create_file("clear_poison", 0200, dentry, cxlr,
-				    &cxl_poison_clear_fops);
-		rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
-		if (rc)
-			return rc;
-	}
+	rc = cxl_region_setup_poison(cxlr);
+	if (rc)
+		return rc;
 
 	switch (cxlr->mode) {
 	case CXL_PARTMODE_PMEM:
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (11 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 12/13] cxl/region: Factor out code into cxl_region_setup_poison() Robert Richter
@ 2026-01-10 11:46 ` Robert Richter
  2026-01-13 23:15   ` Dave Jiang
                     ` (2 more replies)
  2026-02-03 18:52 ` [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Dave Jiang
  13 siblings, 3 replies; 51+ messages in thread
From: Robert Richter @ 2026-01-10 11:46 UTC (permalink / raw)
  To: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Robert Richter

The root decoder provides the callbacks hpa_to_spa and spa_to_hpa to
perform Host Physical Address (HPA) and System Physical Address
translations, respectively. The callbacks are required to convert
addresses when HPA != SPA. XOR interleaving depends on this mechanism,
and the necessary handlers are implemented.

The translation handlers are used for poison injection
(trace_cxl_poison, cxl_poison_inject_fops) and error handling
(cxl_event_trace_record).

In AMD Zen5 systems with Normalized Addressing, endpoint addresses are
not SPAs, and translation handlers are required for these features to
function correctly.

Now, as ACPI PRM translation could be expensive in tracing or error
handling code paths, do not yet enable translations to avoid its
intensive use. Instead, disable those features which are used only for
debugging and enhanced logging.

Introduce the flag CXL_REGION_F_NORM_ADDR that indicates Normalized
Addressing for a region and use it to disable poison injection and DPA
to HPA conversion.

Note: Dropped unused CXL_DECODER_F_MASK macro.

Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/atl.c    |  3 +++
 drivers/cxl/core/region.c | 33 +++++++++++++++++++++++++--------
 drivers/cxl/cxl.h         |  9 ++++++++-
 3 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/atl.c b/drivers/cxl/core/atl.c
index 09d0ea1792d9..13f1118dc026 100644
--- a/drivers/cxl/core/atl.c
+++ b/drivers/cxl/core/atl.c
@@ -169,8 +169,11 @@ static int cxl_prm_setup_root(struct cxl_root *cxl_root, void *data)
 	 * decoders in the BIOS would prevent a capable kernel (or
 	 * other operating systems) from shutting down auto-generated
 	 * regions and managing resources dynamically.
+	 *
+	 * Indicate that Normalized Addressing is enabled.
 	 */
 	cxld->flags |= CXL_DECODER_F_LOCK;
+	cxld->flags |= CXL_DECODER_F_NORM_ADDR;
 
 	ctx->hpa_range = hpa_range;
 	ctx->interleave_ways = ways;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 80cd77f0842e..8b68ccce1554 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1097,14 +1097,16 @@ static int cxl_rr_assign_decoder(struct cxl_port *port, struct cxl_region *cxlr,
 	return 0;
 }
 
-static void cxl_region_set_lock(struct cxl_region *cxlr,
-				struct cxl_decoder *cxld)
+static void cxl_region_setup_flags(struct cxl_region *cxlr,
+				   struct cxl_decoder *cxld)
 {
-	if (!test_bit(CXL_DECODER_F_LOCK, &cxld->flags))
-		return;
+	if (test_bit(CXL_DECODER_F_LOCK, &cxld->flags)) {
+		set_bit(CXL_REGION_F_LOCK, &cxlr->flags);
+		clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
+	}
 
-	set_bit(CXL_REGION_F_LOCK, &cxlr->flags);
-	clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
+	if (test_bit(CXL_DECODER_F_NORM_ADDR, &cxld->flags))
+		set_bit(CXL_REGION_F_NORM_ADDR, &cxlr->flags);
 }
 
 /**
@@ -1218,7 +1220,7 @@ static int cxl_port_attach_region(struct cxl_port *port,
 		}
 	}
 
-	cxl_region_set_lock(cxlr, cxld);
+	cxl_region_setup_flags(cxlr, cxld);
 
 	rc = cxl_rr_ep_add(cxl_rr, cxled);
 	if (rc) {
@@ -2493,7 +2495,7 @@ static struct cxl_region *cxl_region_alloc(struct cxl_root_decoder *cxlrd, int i
 	device_set_pm_not_required(dev);
 	dev->bus = &cxl_bus_type;
 	dev->type = &cxl_region_type;
-	cxl_region_set_lock(cxlr, &cxlrd->cxlsd.cxld);
+	cxl_region_setup_flags(cxlr, &cxlrd->cxlsd.cxld);
 
 	return cxlr;
 }
@@ -3132,6 +3134,13 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
 	u8 eiw = 0;
 	int pos;
 
+	/*
+	 * Conversion between SPA and DPA is not supported in
+	 * Normalized Address mode.
+	 */
+	if (test_bit(CXL_REGION_F_NORM_ADDR, &cxlr->flags))
+		return ULLONG_MAX;
+
 	for (int i = 0; i < p->nr_targets; i++) {
 		if (cxlmd == cxled_to_memdev(p->targets[i])) {
 			cxled = p->targets[i];
@@ -3922,6 +3931,14 @@ static int cxl_region_setup_poison(struct cxl_region *cxlr)
 	struct cxl_region_params *p = &cxlr->params;
 	struct dentry *dentry;
 
+	/*
+	 * Do not enable poison injection in Normalized Address mode.
+	 * Conversion between SPA and DPA is required for this, but it is
+	 * not supported in this mode.
+	 */
+	if (test_bit(CXL_REGION_F_NORM_ADDR, &cxlr->flags))
+		return 0;
+
 	/* Create poison attributes if all memdevs support the capabilities */
 	for (int i = 0; i < p->nr_targets; i++) {
 		struct cxl_endpoint_decoder *cxled = p->targets[i];
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 20b0fd43fa7b..0ab0a86e1d4f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -332,7 +332,7 @@ int cxl_dport_map_rcd_linkcap(struct pci_dev *pdev, struct cxl_dport *dport);
 #define CXL_DECODER_F_TYPE3 BIT(3)
 #define CXL_DECODER_F_LOCK  BIT(4)
 #define CXL_DECODER_F_ENABLE    BIT(5)
-#define CXL_DECODER_F_MASK  GENMASK(5, 0)
+#define CXL_DECODER_F_NORM_ADDR BIT(6)
 
 enum cxl_decoder_type {
 	CXL_DECODER_DEVMEM = 2,
@@ -525,6 +525,13 @@ enum cxl_partition_mode {
  */
 #define CXL_REGION_F_LOCK 2
 
+/*
+ * Indicate Normalized Addressing. Use it to disable SPA conversion if
+ * HPA != SPA and an address translation callback handler does not
+ * exist. Flag is needed by AMD Zen5 platforms.
+ */
+#define CXL_REGION_F_NORM_ADDR 3
+
 /**
  * struct cxl_region - CXL region
  * @dev: This region's device
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 12/13] cxl/region: Factor out code into cxl_region_setup_poison()
  2026-01-10 11:46 ` [PATCH v9 12/13] cxl/region: Factor out code into cxl_region_setup_poison() Robert Richter
@ 2026-01-13 22:39   ` Dave Jiang
  2026-01-14  3:32   ` Alison Schofield
  1 sibling, 0 replies; 51+ messages in thread
From: Dave Jiang @ 2026-01-13 22:39 UTC (permalink / raw)
  To: Robert Richter, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Jonathan Cameron, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn



On 1/10/26 4:46 AM, Robert Richter wrote:
> Poison injection setup code is embedded in cxl_region_probe(). For
> improved encapsulation, readability, and maintainability, factor out
> code into function cxl_region_setup_poison().
> 
> This patch is a prerequisit to disable poison injection for Normalized
> Addressing.
> 
> No functional changes.
> 
> Signed-off-by: Robert Richter <rrichter@amd.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

Funny how I created the same patch yesterday for something else and didn't realize you already posted it.

> ---
>  drivers/cxl/core/region.c | 53 +++++++++++++++++++++------------------
>  1 file changed, 28 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index ed8469fa55a9..80cd77f0842e 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3916,6 +3916,31 @@ static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
>  DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
>  			 cxl_region_debugfs_poison_clear, "%llx\n");
>  
> +static int cxl_region_setup_poison(struct cxl_region *cxlr)
> +{
> +	struct device *dev = &cxlr->dev;
> +	struct cxl_region_params *p = &cxlr->params;
> +	struct dentry *dentry;
> +
> +	/* Create poison attributes if all memdevs support the capabilities */
> +	for (int i = 0; i < p->nr_targets; i++) {
> +		struct cxl_endpoint_decoder *cxled = p->targets[i];
> +		struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +
> +		if (!cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_INJECT) ||
> +		    !cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_CLEAR))
> +			return 0;
> +	}
> +
> +	dentry = cxl_debugfs_create_dir(dev_name(dev));
> +	debugfs_create_file("inject_poison", 0200, dentry, cxlr,
> +			    &cxl_poison_inject_fops);
> +	debugfs_create_file("clear_poison", 0200, dentry, cxlr,
> +			    &cxl_poison_clear_fops);
> +
> +	return devm_add_action_or_reset(dev, remove_debugfs, dentry);
> +}
> +
>  static int cxl_region_can_probe(struct cxl_region *cxlr)
>  {
>  	struct cxl_region_params *p = &cxlr->params;
> @@ -3945,7 +3970,6 @@ static int cxl_region_probe(struct device *dev)
>  {
>  	struct cxl_region *cxlr = to_cxl_region(dev);
>  	struct cxl_region_params *p = &cxlr->params;
> -	bool poison_supported = true;
>  	int rc;
>  
>  	rc = cxl_region_can_probe(cxlr);
> @@ -3969,30 +3993,9 @@ static int cxl_region_probe(struct device *dev)
>  	if (rc)
>  		return rc;
>  
> -	/* Create poison attributes if all memdevs support the capabilities */
> -	for (int i = 0; i < p->nr_targets; i++) {
> -		struct cxl_endpoint_decoder *cxled = p->targets[i];
> -		struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> -
> -		if (!cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_INJECT) ||
> -		    !cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_CLEAR)) {
> -			poison_supported = false;
> -			break;
> -		}
> -	}
> -
> -	if (poison_supported) {
> -		struct dentry *dentry;
> -
> -		dentry = cxl_debugfs_create_dir(dev_name(dev));
> -		debugfs_create_file("inject_poison", 0200, dentry, cxlr,
> -				    &cxl_poison_inject_fops);
> -		debugfs_create_file("clear_poison", 0200, dentry, cxlr,
> -				    &cxl_poison_clear_fops);
> -		rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
> -		if (rc)
> -			return rc;
> -	}
> +	rc = cxl_region_setup_poison(cxlr);
> +	if (rc)
> +		return rc;
>  
>  	switch (cxlr->mode) {
>  	case CXL_PARTMODE_PMEM:


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  2026-01-10 11:46 ` [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing Robert Richter
@ 2026-01-13 23:15   ` Dave Jiang
  2026-01-14  3:59   ` Alison Schofield
  2026-01-14 18:22   ` Jonathan Cameron
  2 siblings, 0 replies; 51+ messages in thread
From: Dave Jiang @ 2026-01-13 23:15 UTC (permalink / raw)
  To: Robert Richter, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Jonathan Cameron, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn



On 1/10/26 4:46 AM, Robert Richter wrote:
> The root decoder provides the callbacks hpa_to_spa and spa_to_hpa to
> perform Host Physical Address (HPA) and System Physical Address
> translations, respectively. The callbacks are required to convert
> addresses when HPA != SPA. XOR interleaving depends on this mechanism,
> and the necessary handlers are implemented.
> 
> The translation handlers are used for poison injection
> (trace_cxl_poison, cxl_poison_inject_fops) and error handling
> (cxl_event_trace_record).
> 
> In AMD Zen5 systems with Normalized Addressing, endpoint addresses are
> not SPAs, and translation handlers are required for these features to
> function correctly.
> 
> Now, as ACPI PRM translation could be expensive in tracing or error
> handling code paths, do not yet enable translations to avoid its
> intensive use. Instead, disable those features which are used only for
> debugging and enhanced logging.
> 
> Introduce the flag CXL_REGION_F_NORM_ADDR that indicates Normalized
> Addressing for a region and use it to disable poison injection and DPA
> to HPA conversion.
> 
> Note: Dropped unused CXL_DECODER_F_MASK macro.
> 
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
>  drivers/cxl/core/atl.c    |  3 +++
>  drivers/cxl/core/region.c | 33 +++++++++++++++++++++++++--------
>  drivers/cxl/cxl.h         |  9 ++++++++-
>  3 files changed, 36 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/cxl/core/atl.c b/drivers/cxl/core/atl.c
> index 09d0ea1792d9..13f1118dc026 100644
> --- a/drivers/cxl/core/atl.c
> +++ b/drivers/cxl/core/atl.c
> @@ -169,8 +169,11 @@ static int cxl_prm_setup_root(struct cxl_root *cxl_root, void *data)
>  	 * decoders in the BIOS would prevent a capable kernel (or
>  	 * other operating systems) from shutting down auto-generated
>  	 * regions and managing resources dynamically.
> +	 *
> +	 * Indicate that Normalized Addressing is enabled.
>  	 */
>  	cxld->flags |= CXL_DECODER_F_LOCK;
> +	cxld->flags |= CXL_DECODER_F_NORM_ADDR;

IMO,
Spelling out NORMALIZED probably make the flag clearer the address is normalized vs a normal address.

DJ
>  
>  	ctx->hpa_range = hpa_range;
>  	ctx->interleave_ways = ways;
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 80cd77f0842e..8b68ccce1554 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1097,14 +1097,16 @@ static int cxl_rr_assign_decoder(struct cxl_port *port, struct cxl_region *cxlr,
>  	return 0;
>  }
>  
> -static void cxl_region_set_lock(struct cxl_region *cxlr,
> -				struct cxl_decoder *cxld)
> +static void cxl_region_setup_flags(struct cxl_region *cxlr,
> +				   struct cxl_decoder *cxld)
>  {
> -	if (!test_bit(CXL_DECODER_F_LOCK, &cxld->flags))
> -		return;
> +	if (test_bit(CXL_DECODER_F_LOCK, &cxld->flags)) {
> +		set_bit(CXL_REGION_F_LOCK, &cxlr->flags);
> +		clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
> +	}
>  
> -	set_bit(CXL_REGION_F_LOCK, &cxlr->flags);
> -	clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
> +	if (test_bit(CXL_DECODER_F_NORM_ADDR, &cxld->flags))
> +		set_bit(CXL_REGION_F_NORM_ADDR, &cxlr->flags);
>  }
>  
>  /**
> @@ -1218,7 +1220,7 @@ static int cxl_port_attach_region(struct cxl_port *port,
>  		}
>  	}
>  
> -	cxl_region_set_lock(cxlr, cxld);
> +	cxl_region_setup_flags(cxlr, cxld);
>  
>  	rc = cxl_rr_ep_add(cxl_rr, cxled);
>  	if (rc) {
> @@ -2493,7 +2495,7 @@ static struct cxl_region *cxl_region_alloc(struct cxl_root_decoder *cxlrd, int i
>  	device_set_pm_not_required(dev);
>  	dev->bus = &cxl_bus_type;
>  	dev->type = &cxl_region_type;
> -	cxl_region_set_lock(cxlr, &cxlrd->cxlsd.cxld);
> +	cxl_region_setup_flags(cxlr, &cxlrd->cxlsd.cxld);
>  
>  	return cxlr;
>  }
> @@ -3132,6 +3134,13 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
>  	u8 eiw = 0;
>  	int pos;
>  
> +	/*
> +	 * Conversion between SPA and DPA is not supported in
> +	 * Normalized Address mode.
> +	 */
> +	if (test_bit(CXL_REGION_F_NORM_ADDR, &cxlr->flags))
> +		return ULLONG_MAX;
> +
>  	for (int i = 0; i < p->nr_targets; i++) {
>  		if (cxlmd == cxled_to_memdev(p->targets[i])) {
>  			cxled = p->targets[i];
> @@ -3922,6 +3931,14 @@ static int cxl_region_setup_poison(struct cxl_region *cxlr)
>  	struct cxl_region_params *p = &cxlr->params;
>  	struct dentry *dentry;
>  
> +	/*
> +	 * Do not enable poison injection in Normalized Address mode.
> +	 * Conversion between SPA and DPA is required for this, but it is
> +	 * not supported in this mode.
> +	 */
> +	if (test_bit(CXL_REGION_F_NORM_ADDR, &cxlr->flags))
> +		return 0;
> +
>  	/* Create poison attributes if all memdevs support the capabilities */
>  	for (int i = 0; i < p->nr_targets; i++) {
>  		struct cxl_endpoint_decoder *cxled = p->targets[i];
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 20b0fd43fa7b..0ab0a86e1d4f 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -332,7 +332,7 @@ int cxl_dport_map_rcd_linkcap(struct pci_dev *pdev, struct cxl_dport *dport);
>  #define CXL_DECODER_F_TYPE3 BIT(3)
>  #define CXL_DECODER_F_LOCK  BIT(4)
>  #define CXL_DECODER_F_ENABLE    BIT(5)
> -#define CXL_DECODER_F_MASK  GENMASK(5, 0)
> +#define CXL_DECODER_F_NORM_ADDR BIT(6)
>  
>  enum cxl_decoder_type {
>  	CXL_DECODER_DEVMEM = 2,
> @@ -525,6 +525,13 @@ enum cxl_partition_mode {
>   */
>  #define CXL_REGION_F_LOCK 2
>  
> +/*
> + * Indicate Normalized Addressing. Use it to disable SPA conversion if
> + * HPA != SPA and an address translation callback handler does not
> + * exist. Flag is needed by AMD Zen5 platforms.
> + */
> +#define CXL_REGION_F_NORM_ADDR 3
> +
>  /**
>   * struct cxl_region - CXL region
>   * @dev: This region's device


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 01/13] cxl/region: Rename misleading variable name @hpa to @hpa_range
  2026-01-10 11:46 ` [PATCH v9 01/13] cxl/region: Rename misleading variable name @hpa to @hpa_range Robert Richter
@ 2026-01-14  3:12   ` Alison Schofield
  0 siblings, 0 replies; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:12 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:46PM +0100, Robert Richter wrote:
> @hpa is actually a @hpa_range, rename variables accordingly.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 02/13] cxl/region: Store root decoder in struct cxl_region
  2026-01-10 11:46 ` [PATCH v9 02/13] cxl/region: Store root decoder in struct cxl_region Robert Richter
@ 2026-01-14  3:13   ` Alison Schofield
  0 siblings, 0 replies; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:13 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:47PM +0100, Robert Richter wrote:
> A region is always bound to a root decoder. The region's associated
> root decoder is often needed. Add it to struct cxl_region.
> 
> This simplifies the code by removing dynamic lookups and the root
> decoder argument from the function argument list where possible.
> 
> Patch is a prerequisite to implement address translation which uses
> struct cxl_region to store all relevant region and interleaving
> parameters. It changes the argument list of __construct_region() in
> preparation of adding a context argument. Additionally the arg list of
> cxl_region_attach_position() is simplified and the use of
> to_cxl_root_decoder() removed, which always reconstructs and checks
> the pointer. The pointer never changes and is frequently used. Code
> becomes more readable as this amphazises the binding between both
> objects.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 03/13] cxl/region: Store HPA range in struct cxl_region
  2026-01-10 11:46 ` [PATCH v9 03/13] cxl/region: Store HPA range " Robert Richter
@ 2026-01-14  3:14   ` Alison Schofield
  0 siblings, 0 replies; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:14 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:48PM +0100, Robert Richter wrote:
> Each region has a known host physical address (HPA) range it is
> assigned to. Endpoint decoders assigned to a region share the same HPA
> range. The region's address range is the system's physical address
> (SPA) range.
> 
> Endpoint decoders in systems that need address translation use HPAs
> which are not SPAs. To make the SPA range accessible to the endpoint
> decoders, store and track the region's SPA range in struct cxl_region.
> Introduce the @hpa_range member to the struct. Now, the SPA range of
> an endpoint decoder can be determined based on its assigned region.
> 
> Patch is a prerequisite to implement address translation which uses
> struct cxl_region to store all relevant region and interleaving
> parameters.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 04/13] cxl: Simplify cxl_root_ops allocation and handling
  2026-01-10 11:46 ` [PATCH v9 04/13] cxl: Simplify cxl_root_ops allocation and handling Robert Richter
@ 2026-01-14  3:16   ` Alison Schofield
  0 siblings, 0 replies; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:16 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:49PM +0100, Robert Richter wrote:
> A root port's callback handlers are collected in struct cxl_root_ops.
> The structure is dynamically allocated, though it contains only a
> single pointer in it. This also requires to check two pointers to
> check for the existance of a callback.
> 
> Simplify the allocation, release and handler check by embedding the
> ops statically in struct cxl_root.
> 

Reviewed-by: Alison Schofield <alison.schofield@intel.com>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 05/13] cxl/region: Separate region parameter setup and region construction
  2026-01-10 11:46 ` [PATCH v9 05/13] cxl/region: Separate region parameter setup and region construction Robert Richter
@ 2026-01-14  3:17   ` Alison Schofield
  0 siblings, 0 replies; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:17 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:50PM +0100, Robert Richter wrote:
> To construct a region, the region parameters such as address range and
> interleaving config need to be determined. This is done while
> constructing the region by inspecting the endpoint decoder
> configuration. The endpoint decoder is passed as a function argument.
> 
> With address translation the endpoint decoder data is no longer
> sufficient to extract the region parameters as some of the information
> is obtained using other methods such as using firmware calls.
> 
> In a first step, separate code to determine the region parameters from
> the region construction. Temporarily store all the data to create the
> region in the new struct cxl_region_context. Once the region data is
> determined and struct cxl_region_context is filled, construct the
> region.
> 
> Patch is a prerequisite to implement address translation. The code
> separation helps to later extend it to determine region parameters
> using other methods as needed, esp. to support address translation.
> 

Reviewed-by: Alison Schofield <alison.schofield@intel.com>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 06/13] cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
  2026-01-10 11:46 ` [PATCH v9 06/13] cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() Robert Richter
@ 2026-01-14  3:17   ` Alison Schofield
  0 siblings, 0 replies; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:17 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:51PM +0100, Robert Richter wrote:
> cxl_calc_interleave_pos() uses the endpoint decoder's HPA range to
> determine its interleaving position. This requires the endpoint
> decoders to be an SPA, which is not the case for systems that need
> address translation.
> 
> Add a separate @hpa_range argument to function
> cxl_calc_interleave_pos() to specify the address range. Now it is
> possible to pass the SPA translated address range of an endpoint
> decoder to function cxl_calc_interleave_pos().
> 
> Refactor only, no functional changes.
> 
> Patch is a prerequisite to implement address translation.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 07/13] cxl/region: Use region data to get the root decoder
  2026-01-10 11:46 ` [PATCH v9 07/13] cxl/region: Use region data to get the root decoder Robert Richter
@ 2026-01-14  3:19   ` Alison Schofield
  0 siblings, 0 replies; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:19 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:52PM +0100, Robert Richter wrote:
> To find a region's root decoder, the endpoint's HPA range is used to
> search the matching decoder by its range. With address translation the
> endpoint decoder's range is in a different address space and thus
> cannot be used to determine the root decoder.
> 
> The region parameters are encapsulated within struc cxl_region_context

maybe s/struc/struct upon applying 

> and may include the translated Host Physical Address (HPA) range. Use
> this context to identify the root decoder rather than relying on the
> endpoint.
> 
> Modify cxl_find_root_decoder() and add the region context as
> parameter. Rename this function to get_cxl_root_decoder() as a
> counterpart to put_cxl_root_decoder(). Simplify the implementation by
> removing function cxl_port_find_switch_decode(). The function is
> unnecessary because it is not referenced or utilized elsewhere in the
> code.
> 

Reviewed-by: Alison Schofield <alison.schofield@intel.com>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 08/13] cxl: Introduce callback for HPA address ranges translation
  2026-01-10 11:46 ` [PATCH v9 08/13] cxl: Introduce callback for HPA address ranges translation Robert Richter
@ 2026-01-14  3:20   ` Alison Schofield
  0 siblings, 0 replies; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:20 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:53PM +0100, Robert Richter wrote:
> Introduce a callback to translate an endpoint's HPA range to the
> address range of the root port which is the System Physical Address
> (SPA) range used by a region. The callback can be set if a platform
> needs to handle address translation.
> 
> The callback is attached to the root port. An endpoint's root port can
> easily be determined in the PCI hierarchy without any CXL specific
> knowledge. This allows the early use of address translation for CXL
> enumeration. Address translation is esp. needed for the detection of
> the root decoders. Thus, the callback is embedded in struct
> cxl_root_ops instead of struct cxl_rd_ops.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 12/13] cxl/region: Factor out code into cxl_region_setup_poison()
  2026-01-10 11:46 ` [PATCH v9 12/13] cxl/region: Factor out code into cxl_region_setup_poison() Robert Richter
  2026-01-13 22:39   ` Dave Jiang
@ 2026-01-14  3:32   ` Alison Schofield
  2026-01-14 18:17     ` Jonathan Cameron
  1 sibling, 1 reply; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:32 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:57PM +0100, Robert Richter wrote:
> Poison injection setup code is embedded in cxl_region_probe(). For
> improved encapsulation, readability, and maintainability, factor out
> code into function cxl_region_setup_poison().
> 
> This patch is a prerequisit to disable poison injection for Normalized
> Addressing.

I prefer we be clearer about intent. Poison injection altogether is not
being disabled, only by region offset. 

Replace this:
> This patch is a prerequisit to disable poison injection for Normalized
> Addressing.

With this:
This patch is a prerequisite to disable poison by region offset for
Normalized Addressing.

With that,
Reviewed-by: Alison Schofield <alison.schofield@intel.com>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  2026-01-10 11:46 ` [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing Robert Richter
  2026-01-13 23:15   ` Dave Jiang
@ 2026-01-14  3:59   ` Alison Schofield
  2026-01-14 11:32     ` Robert Richter
  2026-01-14 18:22   ` Jonathan Cameron
  2 siblings, 1 reply; 51+ messages in thread
From: Alison Schofield @ 2026-01-14  3:59 UTC (permalink / raw)
  To: Robert Richter
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, Jan 10, 2026 at 12:46:58PM +0100, Robert Richter wrote:
> The root decoder provides the callbacks hpa_to_spa and spa_to_hpa to
> perform Host Physical Address (HPA) and System Physical Address
> translations, respectively. The callbacks are required to convert
> addresses when HPA != SPA. XOR interleaving depends on this mechanism,
> and the necessary handlers are implemented.
> 
> The translation handlers are used for poison injection
> (trace_cxl_poison, cxl_poison_inject_fops) and error handling
> (cxl_event_trace_record).
> 
> In AMD Zen5 systems with Normalized Addressing, endpoint addresses are
> not SPAs, and translation handlers are required for these features to
> function correctly.
> 
> Now, as ACPI PRM translation could be expensive in tracing or error
> handling code paths, do not yet enable translations to avoid its
> intensive use. Instead, disable those features which are used only for
> debugging and enhanced logging.
> 
> Introduce the flag CXL_REGION_F_NORM_ADDR that indicates Normalized
> Addressing for a region and use it to disable poison injection and DPA
> to HPA conversion.
> 
> Note: Dropped unused CXL_DECODER_F_MASK macro.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>

So not for the commit log, but for my closure ;) - 

A system with normalized addressing:

will still:
	support poison listings by memdev and by region
	support poison inject and clear by memdev

they'll be different, in that:
	if a DPA address maps into a region, the region SPA mapping will
	always be ULLONG_MAX. The region name will still be available and
	valid. That same difference applies for General Media and DRAM events.
	(This is the 'enhanced logging' referred to in the commit log.)

will not:
	support poison inject or clear by region.
	(This is the 'debugging' referred to in commit log.)


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-10 11:46 ` [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT Robert Richter
@ 2026-01-14  7:47   ` Ard Biesheuvel
  2026-01-14 14:00     ` Robert Richter
  0 siblings, 1 reply; 51+ messages in thread
From: Ard Biesheuvel @ 2026-01-14  7:47 UTC (permalink / raw)
  To: Robert Richter, Peter Zijlstra
  Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Dave Jiang, Davidlohr Bueso, linux-cxl,
	linux-kernel, Gregory Price, Fabio M. De Francesco, Terry Bowman,
	Joshua Hahn

(cc Peter)

On Sat, 10 Jan 2026 at 12:46, Robert Richter <rrichter@amd.com> wrote:
>
> Add AMD Zen5 support for address translation.
>
...
> Do the following to implement AMD Zen5 address translation:
>
> Introduce a new file core/atl.c to handle ACPI PRM specific address
> translation code. Naming is loosely related to the kernel's AMD
> Address Translation Library (CONFIG_AMD_ATL) but implementation does
> not depend on it, nor it is vendor specific. Use Kbuild and Kconfig
> options respectively to enable the code depending on architecture and
> platform options.
>
> AMD Zen5 systems support the ACPI PRM CXL Address Translation firmware
> call (see ACPI v6.5 Porting Guide, Address Translation - CXL DPA to
> System Physical Address). Firmware enables the PRM handler if the
> platform has address translation implemented. Check firmware and
> kernel support of ACPI PRM using the specific GUID. On success enable
> address translation by setting up the earlier introduced root port
> callback, see function cxl_prm_setup_translation(). Setup is done in
> cxl_setup_prm_address_translation(), it is the only function that
> needs to be exported. For low level PRM firmware calls, use the ACPI
> framework.
>

Does the PRM service in question tolerate being invoked unprivileged?
The PRM spec requires this, and this is something we may need to
enforce at some point.

cc'ing Peter with whom I've discussed this just recently.



> Identify the region's interleaving ways by inspecting the address
> ranges. Also determine the interleaving granularity using the address
> translation callback. Note that the position of the chunk from one
> interleaving block to the next may vary and thus cannot be considered
> constant. Address offsets larger than the interleaving block size
> cannot be used to calculate the granularity. Thus, probe the
> granularity using address translation for various HPAs in the same
> interleaving block.
>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Tested-by: Gregory Price <gourry@gourry.net>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
>  drivers/cxl/Kconfig       |   5 +
>  drivers/cxl/acpi.c        |   2 +
>  drivers/cxl/core/Makefile |   1 +
>  drivers/cxl/core/atl.c    | 190 ++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h         |   7 ++
>  5 files changed, 205 insertions(+)
>  create mode 100644 drivers/cxl/core/atl.c
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 48b7314afdb8..103950a9b73e 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -233,4 +233,9 @@ config CXL_MCE
>         def_bool y
>         depends on X86_MCE && MEMORY_FAILURE
>
> +config CXL_ATL
> +       def_bool y
> +       depends on CXL_REGION
> +       depends on ACPI_PRMT && AMD_NB
> +
>  endif
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index a31d0f97f916..50c2987e0459 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -925,6 +925,8 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>         cxl_root->ops.qos_class = cxl_acpi_qos_class;
>         root_port = &cxl_root->port;
>
> +       cxl_setup_prm_address_translation(cxl_root);
> +
>         rc = bus_for_each_dev(adev->dev.bus, NULL, root_port,
>                               add_host_bridge_dport);
>         if (rc < 0)
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 5ad8fef210b5..11fe272a6e29 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -20,3 +20,4 @@ cxl_core-$(CONFIG_CXL_REGION) += region.o
>  cxl_core-$(CONFIG_CXL_MCE) += mce.o
>  cxl_core-$(CONFIG_CXL_FEATURES) += features.o
>  cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += edac.o
> +cxl_core-$(CONFIG_CXL_ATL) += atl.o
> diff --git a/drivers/cxl/core/atl.c b/drivers/cxl/core/atl.c
> new file mode 100644
> index 000000000000..c36984686fb0
> --- /dev/null
> +++ b/drivers/cxl/core/atl.c
> @@ -0,0 +1,190 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2025 Advanced Micro Devices, Inc.
> + */
> +
> +#include <linux/prmt.h>
> +#include <linux/pci.h>
> +#include <linux/acpi.h>
> +
> +#include <cxlmem.h>
> +#include "core.h"
> +
> +/*
> + * PRM Address Translation - CXL DPA to System Physical Address
> + *
> + * Reference:
> + *
> + * AMD Family 1Ah Models 00h–0Fh and Models 10h–1Fh
> + * ACPI v6.5 Porting Guide, Publication # 58088
> + */
> +
> +static const guid_t prm_cxl_dpa_spa_guid =
> +       GUID_INIT(0xee41b397, 0x25d4, 0x452c, 0xad, 0x54, 0x48, 0xc6, 0xe3,
> +                 0x48, 0x0b, 0x94);
> +
> +struct prm_cxl_dpa_spa_data {
> +       u64 dpa;
> +       u8 reserved;
> +       u8 devfn;
> +       u8 bus;
> +       u8 segment;
> +       u64 *spa;
> +} __packed;
> +
> +static u64 prm_cxl_dpa_spa(struct pci_dev *pci_dev, u64 dpa)
> +{
> +       struct prm_cxl_dpa_spa_data data;
> +       u64 spa;
> +       int rc;
> +
> +       data = (struct prm_cxl_dpa_spa_data) {
> +               .dpa     = dpa,
> +               .devfn   = pci_dev->devfn,
> +               .bus     = pci_dev->bus->number,
> +               .segment = pci_domain_nr(pci_dev->bus),
> +               .spa     = &spa,
> +       };
> +
> +       rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
> +       if (rc) {
> +               pci_dbg(pci_dev, "failed to get SPA for %#llx: %d\n", dpa, rc);
> +               return ULLONG_MAX;
> +       }
> +
> +       pci_dbg(pci_dev, "PRM address translation: DPA -> SPA: %#llx -> %#llx\n", dpa, spa);
> +
> +       return spa;
> +}
> +
> +static int cxl_prm_setup_root(struct cxl_root *cxl_root, void *data)
> +{
> +       struct cxl_region_context *ctx = data;
> +       struct cxl_endpoint_decoder *cxled = ctx->cxled;
> +       struct cxl_decoder *cxld = &cxled->cxld;
> +       struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +       struct range hpa_range = ctx->hpa_range;
> +       struct pci_dev *pci_dev;
> +       u64 spa_len, len;
> +       u64 addr, base_spa, base;
> +       int ways, gran;
> +
> +       /*
> +        * When Normalized Addressing is enabled, the endpoint maintains a 1:1
> +        * mapping between HPA and DPA. If disabled, skip address translation
> +        * and perform only a range check.
> +        */
> +       if (hpa_range.start != cxled->dpa_res->start)
> +               return 0;
> +
> +       /*
> +        * Endpoints are programmed passthrough in Normalized Addressing mode.
> +        */
> +       if (ctx->interleave_ways != 1) {
> +               dev_dbg(&cxld->dev, "unexpected interleaving config: ways: %d granularity: %d\n",
> +                       ctx->interleave_ways, ctx->interleave_granularity);
> +               return -ENXIO;
> +       }
> +
> +       if (!cxlmd || !dev_is_pci(cxlmd->dev.parent)) {
> +               dev_dbg(&cxld->dev, "No endpoint found: %s, range %#llx-%#llx\n",
> +                       dev_name(cxld->dev.parent), hpa_range.start,
> +                       hpa_range.end);
> +               return -ENXIO;
> +       }
> +
> +       pci_dev = to_pci_dev(cxlmd->dev.parent);
> +
> +       /* Translate HPA range to SPA. */
> +       base = hpa_range.start;
> +       hpa_range.start = prm_cxl_dpa_spa(pci_dev, hpa_range.start);
> +       hpa_range.end = prm_cxl_dpa_spa(pci_dev, hpa_range.end);
> +       base_spa = hpa_range.start;
> +
> +       if (hpa_range.start == ULLONG_MAX || hpa_range.end == ULLONG_MAX) {
> +               dev_dbg(cxld->dev.parent,
> +                       "CXL address translation: Failed to translate HPA range: %#llx-%#llx:%#llx-%#llx(%s)\n",
> +                       hpa_range.start, hpa_range.end, ctx->hpa_range.start,
> +                       ctx->hpa_range.end, dev_name(&cxld->dev));
> +               return -ENXIO;
> +       }
> +
> +       /*
> +        * Since translated addresses include the interleaving offsets, align
> +        * the range to 256 MB.
> +        */
> +       hpa_range.start = ALIGN_DOWN(hpa_range.start, SZ_256M);
> +       hpa_range.end = ALIGN(hpa_range.end, SZ_256M) - 1;
> +
> +       len = range_len(&ctx->hpa_range);
> +       spa_len = range_len(&hpa_range);
> +       if (!len || !spa_len || spa_len % len) {
> +               dev_dbg(cxld->dev.parent,
> +                       "CXL address translation: HPA range not contiguous: %#llx-%#llx:%#llx-%#llx(%s)\n",
> +                       hpa_range.start, hpa_range.end, ctx->hpa_range.start,
> +                       ctx->hpa_range.end, dev_name(&cxld->dev));
> +               return -ENXIO;
> +       }
> +
> +       ways = spa_len / len;
> +       gran = SZ_256;
> +
> +       /*
> +        * Determine interleave granularity
> +        *
> +        * Note: The position of the chunk from one interleaving block to the
> +        * next may vary and thus cannot be considered constant. Address offsets
> +        * larger than the interleaving block size cannot be used to calculate
> +        * the granularity.
> +        */
> +       if (ways > 1) {
> +               while (gran <= SZ_16M) {
> +                       addr = prm_cxl_dpa_spa(pci_dev, base + gran);
> +                       if (addr != base_spa + gran)
> +                               break;
> +                       gran <<= 1;
> +               }
> +       }
> +
> +       if (gran > SZ_16M) {
> +               dev_dbg(cxld->dev.parent,
> +                       "CXL address translation: Cannot determine granularity: %#llx-%#llx:%#llx-%#llx(%s)\n",
> +                       hpa_range.start, hpa_range.end, ctx->hpa_range.start,
> +                       ctx->hpa_range.end, dev_name(&cxld->dev));
> +               return -ENXIO;
> +       }
> +
> +       ctx->hpa_range = hpa_range;
> +       ctx->interleave_ways = ways;
> +       ctx->interleave_granularity = gran;
> +
> +       dev_dbg(&cxld->dev,
> +               "address mapping found for %s (hpa -> spa): %#llx+%#llx -> %#llx+%#llx ways:%d granularity:%d\n",
> +               dev_name(cxlmd->dev.parent), base, len, hpa_range.start,
> +               spa_len, ways, gran);
> +
> +       return 0;
> +}
> +
> +void cxl_setup_prm_address_translation(struct cxl_root *cxl_root)
> +{
> +       struct device *host = cxl_root->port.uport_dev;
> +       u64 spa;
> +       struct prm_cxl_dpa_spa_data data = { .spa = &spa };
> +       int rc;
> +
> +       /*
> +        * Applies only to PCIe Host Bridges which are children of the CXL Root
> +        * Device (HID=“ACPI0017”). Check this and drop cxl_test instances.
> +        */
> +       if (!acpi_match_device(host->driver->acpi_match_table, host))
> +               return;
> +
> +       /* Check kernel (-EOPNOTSUPP) and firmware support (-ENODEV) */
> +       rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
> +       if (rc == -EOPNOTSUPP || rc == -ENODEV)
> +               return;
> +
> +       cxl_root->ops.translation_setup_root = cxl_prm_setup_root;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_setup_prm_address_translation, "CXL");
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 8ea334d81edf..20b0fd43fa7b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -817,6 +817,13 @@ static inline void cxl_dport_init_ras_reporting(struct cxl_dport *dport,
>                                                 struct device *host) { }
>  #endif
>
> +#ifdef CONFIG_CXL_ATL
> +void cxl_setup_prm_address_translation(struct cxl_root *cxl_root);
> +#else
> +static inline
> +void cxl_setup_prm_address_translation(struct cxl_root *cxl_root) {}
> +#endif
> +
>  struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
>  struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
> --
> 2.47.3
>
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  2026-01-14  3:59   ` Alison Schofield
@ 2026-01-14 11:32     ` Robert Richter
  0 siblings, 0 replies; 51+ messages in thread
From: Robert Richter @ 2026-01-14 11:32 UTC (permalink / raw)
  To: Alison Schofield
  Cc: Vishal Verma, Ira Weiny, Dan Williams, Jonathan Cameron,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On 13.01.26 19:59:05, Alison Schofield wrote:
> On Sat, Jan 10, 2026 at 12:46:58PM +0100, Robert Richter wrote:
> > The root decoder provides the callbacks hpa_to_spa and spa_to_hpa to
> > perform Host Physical Address (HPA) and System Physical Address
> > translations, respectively. The callbacks are required to convert
> > addresses when HPA != SPA. XOR interleaving depends on this mechanism,
> > and the necessary handlers are implemented.
> > 
> > The translation handlers are used for poison injection
> > (trace_cxl_poison, cxl_poison_inject_fops) and error handling
> > (cxl_event_trace_record).
> > 
> > In AMD Zen5 systems with Normalized Addressing, endpoint addresses are
> > not SPAs, and translation handlers are required for these features to
> > function correctly.
> > 
> > Now, as ACPI PRM translation could be expensive in tracing or error
> > handling code paths, do not yet enable translations to avoid its
> > intensive use. Instead, disable those features which are used only for
> > debugging and enhanced logging.
> > 
> > Introduce the flag CXL_REGION_F_NORM_ADDR that indicates Normalized
> > Addressing for a region and use it to disable poison injection and DPA
> > to HPA conversion.
> > 
> > Note: Dropped unused CXL_DECODER_F_MASK macro.
> 
> Reviewed-by: Alison Schofield <alison.schofield@intel.com>
> 
> So not for the commit log, but for my closure ;) - 
> 
> A system with normalized addressing:
> 
> will still:
> 	support poison listings by memdev and by region
> 	support poison inject and clear by memdev
> 
> they'll be different, in that:
> 	if a DPA address maps into a region, the region SPA mapping will
> 	always be ULLONG_MAX. The region name will still be available and
> 	valid. That same difference applies for General Media and DRAM events.
> 	(This is the 'enhanced logging' referred to in the commit log.)
> 
> will not:
> 	support poison inject or clear by region.
> 	(This is the 'debugging' referred to in commit log.)
> 

Good conclusion. :-) Thanks for review.

-Robert

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-14  7:47   ` Ard Biesheuvel
@ 2026-01-14 14:00     ` Robert Richter
  2026-01-14 15:21       ` Ard Biesheuvel
  0 siblings, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-01-14 14:00 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Peter Zijlstra, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso,
	linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn

On Wed, Jan 14, 2026 at 08:47:22AM +0100, Ard Biesheuvel wrote:
> (cc Peter)
> 
> On Sat, 10 Jan 2026 at 12:46, Robert Richter <rrichter@amd.com> wrote:
> >
> > Add AMD Zen5 support for address translation.
> >
> ...
> > Do the following to implement AMD Zen5 address translation:
> >
> > Introduce a new file core/atl.c to handle ACPI PRM specific address
> > translation code. Naming is loosely related to the kernel's AMD
> > Address Translation Library (CONFIG_AMD_ATL) but implementation does
> > not depend on it, nor it is vendor specific. Use Kbuild and Kconfig
> > options respectively to enable the code depending on architecture and
> > platform options.
> >
> > AMD Zen5 systems support the ACPI PRM CXL Address Translation firmware
> > call (see ACPI v6.5 Porting Guide, Address Translation - CXL DPA to
> > System Physical Address). Firmware enables the PRM handler if the
> > platform has address translation implemented. Check firmware and
> > kernel support of ACPI PRM using the specific GUID. On success enable
> > address translation by setting up the earlier introduced root port
> > callback, see function cxl_prm_setup_translation(). Setup is done in
> > cxl_setup_prm_address_translation(), it is the only function that
> > needs to be exported. For low level PRM firmware calls, use the ACPI
> > framework.
> >
> 
> Does the PRM service in question tolerate being invoked unprivileged?
> The PRM spec requires this, and this is something we may need to
> enforce at some point.
> 
> cc'ing Peter with whom I've discussed this just recently.

Interesting appoach, need to check if that works. I haven't tried that
yet. Though, that needs some rework of the kernel code as some high
priority code depends on the translation and that would cause kind of
priority inversion. E.g. an interrupt handler cannot wait until a
dpa-to-spa conversion is done.

For CXL it is only used for region setup in the init path and process
context. For tracing and error handling those translations are
disabled. See patch 13/13.

Thanks,

-Robert

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-14 14:00     ` Robert Richter
@ 2026-01-14 15:21       ` Ard Biesheuvel
  2026-01-14 18:08         ` Jonathan Cameron
  0 siblings, 1 reply; 51+ messages in thread
From: Ard Biesheuvel @ 2026-01-14 15:21 UTC (permalink / raw)
  To: Robert Richter
  Cc: Peter Zijlstra, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Jonathan Cameron, Dave Jiang, Davidlohr Bueso,
	linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn

On Wed, 14 Jan 2026 at 15:00, Robert Richter <rrichter@amd.com> wrote:
>
> On Wed, Jan 14, 2026 at 08:47:22AM +0100, Ard Biesheuvel wrote:
> > (cc Peter)
> >
> > On Sat, 10 Jan 2026 at 12:46, Robert Richter <rrichter@amd.com> wrote:
> > >
> > > Add AMD Zen5 support for address translation.
> > >
> > ...
> > > Do the following to implement AMD Zen5 address translation:
> > >
> > > Introduce a new file core/atl.c to handle ACPI PRM specific address
> > > translation code. Naming is loosely related to the kernel's AMD
> > > Address Translation Library (CONFIG_AMD_ATL) but implementation does
> > > not depend on it, nor it is vendor specific. Use Kbuild and Kconfig
> > > options respectively to enable the code depending on architecture and
> > > platform options.
> > >
> > > AMD Zen5 systems support the ACPI PRM CXL Address Translation firmware
> > > call (see ACPI v6.5 Porting Guide, Address Translation - CXL DPA to
> > > System Physical Address). Firmware enables the PRM handler if the
> > > platform has address translation implemented. Check firmware and
> > > kernel support of ACPI PRM using the specific GUID. On success enable
> > > address translation by setting up the earlier introduced root port
> > > callback, see function cxl_prm_setup_translation(). Setup is done in
> > > cxl_setup_prm_address_translation(), it is the only function that
> > > needs to be exported. For low level PRM firmware calls, use the ACPI
> > > framework.
> > >
> >
> > Does the PRM service in question tolerate being invoked unprivileged?
> > The PRM spec requires this, and this is something we may need to
> > enforce at some point.
> >
> > cc'ing Peter with whom I've discussed this just recently.
>
> Interesting appoach, need to check if that works. I haven't tried that
> yet. Though, that needs some rework of the kernel code as some high
> priority code depends on the translation and that would cause kind of
> priority inversion. E.g. an interrupt handler cannot wait until a
> dpa-to-spa conversion is done.
>

This is not about running it in user space, but about running the code
in an unprivileged sandbox. So scheduling wpuldn't really come into
play here.

> For CXL it is only used for region setup in the init path and process
> context. For tracing and error handling those translations are
> disabled. See patch 13/13.
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-14 15:21       ` Ard Biesheuvel
@ 2026-01-14 18:08         ` Jonathan Cameron
  2026-01-15  8:04           ` Peter Zijlstra
  0 siblings, 1 reply; 51+ messages in thread
From: Jonathan Cameron @ 2026-01-14 18:08 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Robert Richter, Peter Zijlstra, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Dave Jiang, Davidlohr Bueso, linux-cxl,
	linux-kernel, Gregory Price, Fabio M. De Francesco, Terry Bowman,
	Joshua Hahn

On Wed, 14 Jan 2026 16:21:03 +0100
Ard Biesheuvel <ardb@kernel.org> wrote:

> On Wed, 14 Jan 2026 at 15:00, Robert Richter <rrichter@amd.com> wrote:
> >
> > On Wed, Jan 14, 2026 at 08:47:22AM +0100, Ard Biesheuvel wrote:  
> > > (cc Peter)
> > >
> > > On Sat, 10 Jan 2026 at 12:46, Robert Richter <rrichter@amd.com> wrote:  
> > > >
> > > > Add AMD Zen5 support for address translation.
> > > >  
> > > ...  
> > > > Do the following to implement AMD Zen5 address translation:
> > > >
> > > > Introduce a new file core/atl.c to handle ACPI PRM specific address
> > > > translation code. Naming is loosely related to the kernel's AMD
> > > > Address Translation Library (CONFIG_AMD_ATL) but implementation does
> > > > not depend on it, nor it is vendor specific. Use Kbuild and Kconfig
> > > > options respectively to enable the code depending on architecture and
> > > > platform options.
> > > >
> > > > AMD Zen5 systems support the ACPI PRM CXL Address Translation firmware
> > > > call (see ACPI v6.5 Porting Guide, Address Translation - CXL DPA to
> > > > System Physical Address). Firmware enables the PRM handler if the
> > > > platform has address translation implemented. Check firmware and
> > > > kernel support of ACPI PRM using the specific GUID. On success enable
> > > > address translation by setting up the earlier introduced root port
> > > > callback, see function cxl_prm_setup_translation(). Setup is done in
> > > > cxl_setup_prm_address_translation(), it is the only function that
> > > > needs to be exported. For low level PRM firmware calls, use the ACPI
> > > > framework.
> > > >  
> > >
> > > Does the PRM service in question tolerate being invoked unprivileged?
> > > The PRM spec requires this, and this is something we may need to
> > > enforce at some point.
> > >
> > > cc'ing Peter with whom I've discussed this just recently.  
> >
> > Interesting appoach, need to check if that works. I haven't tried that
> > yet. Though, that needs some rework of the kernel code as some high
> > priority code depends on the translation and that would cause kind of
> > priority inversion. E.g. an interrupt handler cannot wait until a
> > dpa-to-spa conversion is done.
> >  
> 
> This is not about running it in user space, but about running the code
> in an unprivileged sandbox. So scheduling wpuldn't really come into
> play here.

Hi Ard,

I haven't looked into the background yet, so a naive question:

Do we have a potential issue wrt to merging this as it stands and improving
on it later?  i.e. Is this a blocking issue for this patch set?

Thanks,

Jonathan

> 
> > For CXL it is only used for region setup in the init path and process
> > context. For tracing and error handling those translations are
> > disabled. See patch 13/13.
> >  
> 


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 12/13] cxl/region: Factor out code into cxl_region_setup_poison()
  2026-01-14  3:32   ` Alison Schofield
@ 2026-01-14 18:17     ` Jonathan Cameron
  0 siblings, 0 replies; 51+ messages in thread
From: Jonathan Cameron @ 2026-01-14 18:17 UTC (permalink / raw)
  To: Alison Schofield
  Cc: Robert Richter, Vishal Verma, Ira Weiny, Dan Williams, Dave Jiang,
	Davidlohr Bueso, linux-cxl, linux-kernel, Gregory Price,
	Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Tue, 13 Jan 2026 19:32:54 -0800
Alison Schofield <alison.schofield@intel.com> wrote:

> On Sat, Jan 10, 2026 at 12:46:57PM +0100, Robert Richter wrote:
> > Poison injection setup code is embedded in cxl_region_probe(). For
> > improved encapsulation, readability, and maintainability, factor out
> > code into function cxl_region_setup_poison().
> > 
> > This patch is a prerequisit to disable poison injection for Normalized
> > Addressing.  
> 
> I prefer we be clearer about intent. Poison injection altogether is not
> being disabled, only by region offset. 
> 
> Replace this:
> > This patch is a prerequisit to disable poison injection for Normalized
> > Addressing.  
> 
> With this:
> This patch is a prerequisite to disable poison by region offset for
> Normalized Addressing.
> 
> With that,
> Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Alison's request seems sensible to me.  Otherwise looks good.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>

> 
> 
> 


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  2026-01-10 11:46 ` [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing Robert Richter
  2026-01-13 23:15   ` Dave Jiang
  2026-01-14  3:59   ` Alison Schofield
@ 2026-01-14 18:22   ` Jonathan Cameron
  2 siblings, 0 replies; 51+ messages in thread
From: Jonathan Cameron @ 2026-01-14 18:22 UTC (permalink / raw)
  To: Robert Richter
  Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Dave Jiang, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Sat, 10 Jan 2026 12:46:58 +0100
Robert Richter <rrichter@amd.com> wrote:

> The root decoder provides the callbacks hpa_to_spa and spa_to_hpa to
> perform Host Physical Address (HPA) and System Physical Address
> translations, respectively. The callbacks are required to convert
> addresses when HPA != SPA. XOR interleaving depends on this mechanism,
> and the necessary handlers are implemented.
> 
> The translation handlers are used for poison injection
> (trace_cxl_poison, cxl_poison_inject_fops) and error handling
> (cxl_event_trace_record).
> 
> In AMD Zen5 systems with Normalized Addressing, endpoint addresses are
> not SPAs, and translation handlers are required for these features to
> function correctly.
> 
> Now, as ACPI PRM translation could be expensive in tracing or error
> handling code paths, do not yet enable translations to avoid its
> intensive use. Instead, disable those features which are used only for
> debugging and enhanced logging.
> 
> Introduce the flag CXL_REGION_F_NORM_ADDR that indicates Normalized
> Addressing for a region and use it to disable poison injection and DPA
> to HPA conversion.
> 
> Note: Dropped unused CXL_DECODER_F_MASK macro.

Meh, ideally that would be a precusor patch that Dave could pick up immediately.

Hopefully the whole thing merges though and so we don't have to care
about that.

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-14 18:08         ` Jonathan Cameron
@ 2026-01-15  8:04           ` Peter Zijlstra
  2026-01-15  8:30             ` Ard Biesheuvel
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2026-01-15  8:04 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Ard Biesheuvel, Robert Richter, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Dave Jiang, Davidlohr Bueso, linux-cxl,
	linux-kernel, Gregory Price, Fabio M. De Francesco, Terry Bowman,
	Joshua Hahn

On Wed, Jan 14, 2026 at 06:08:59PM +0000, Jonathan Cameron wrote:

> Do we have a potential issue wrt to merging this as it stands and improving
> on it later?  i.e. Is this a blocking issue for this patch set?

Well, why do you *have* to use PRMT at all? And this is a serious
question; PRMT is basically injecting unaudited magic code into the
kernel, and that is a security risk.

Worse, in order to run this shit, we have to lower or disable various
security measures.

If I had my way, we would WARN and TAINT the kernel whenever such
garbage got used.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-15  8:04           ` Peter Zijlstra
@ 2026-01-15  8:30             ` Ard Biesheuvel
  2026-01-16 14:38               ` Peter Zijlstra
  0 siblings, 1 reply; 51+ messages in thread
From: Ard Biesheuvel @ 2026-01-15  8:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jonathan Cameron, Robert Richter, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Dave Jiang, Davidlohr Bueso, linux-cxl,
	linux-kernel, Gregory Price, Fabio M. De Francesco, Terry Bowman,
	Joshua Hahn

On Thu, 15 Jan 2026 at 09:04, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jan 14, 2026 at 06:08:59PM +0000, Jonathan Cameron wrote:
>
> > Do we have a potential issue wrt to merging this as it stands and improving
> > on it later?  i.e. Is this a blocking issue for this patch set?
>
> Well, why do you *have* to use PRMT at all? And this is a serious
> question; PRMT is basically injecting unaudited magic code into the
> kernel, and that is a security risk.
>
> Worse, in order to run this shit, we have to lower or disable various
> security measures.
>

Only if we decide to keep running it privileged, which the PRM spec no
longer requires (as you have confirmed yourself when we last discussed
this, right?)

> If I had my way, we would WARN and TAINT the kernel whenever such
> garbage got used.

These are things that used to live in SMM, requiring all CPUs to
disappear into SMM mode in a way that was completely opaque to the OS.

PRM runs under the control of the OS, does not require privileges and
only needs MMIO access to the regions it describes in its manifest
(which the OS can inspect, if desired). So if there are security
concerns with PRM today, it is because we were lazy and did not
implement PRM securely from the beginning.

In my defense, I wasn't aware of the unprivileged requirement until
you spotted it recently: it was something I had asked for when the PRM
spec was put up for "review" by the Intel and MS authors, and they
told me they couldn't possibly make any changes at that point, because
it had already gone into production. But as it turns out, the change
was made after all.

I am a total noob when it comes to how x86 does its ring0/ring3
switching, but with some help, I should be able to prototype something
to call into the PRM service unprivileged, running under the efi_mm.

Would that allay your concerns?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-15  8:30             ` Ard Biesheuvel
@ 2026-01-16 14:38               ` Peter Zijlstra
  2026-01-19 14:33                 ` Robert Richter
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2026-01-16 14:38 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jonathan Cameron, Robert Richter, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Dave Jiang, Davidlohr Bueso, linux-cxl,
	linux-kernel, Gregory Price, Fabio M. De Francesco, Terry Bowman,
	Joshua Hahn

On Thu, Jan 15, 2026 at 09:30:10AM +0100, Ard Biesheuvel wrote:
> On Thu, 15 Jan 2026 at 09:04, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Wed, Jan 14, 2026 at 06:08:59PM +0000, Jonathan Cameron wrote:
> >
> > > Do we have a potential issue wrt to merging this as it stands and improving
> > > on it later?  i.e. Is this a blocking issue for this patch set?
> >
> > Well, why do you *have* to use PRMT at all? And this is a serious
> > question; PRMT is basically injecting unaudited magic code into the
> > kernel, and that is a security risk.
> >
> > Worse, in order to run this shit, we have to lower or disable various
> > security measures.
> >
> 
> Only if we decide to keep running it privileged, which the PRM spec no
> longer requires (as you have confirmed yourself when we last discussed
> this, right?)

Indeed. But those very constraints also make me wonder why we would ever
bother with PRM at all, and not simply require a native driver. Then you
actually *know* what the thing does and can debug/fix it without having
to rely on BIOS updates and whatnot.

Worse, you might have to deal with various incompatible buggy PRM
versions because BIOS :/

> > If I had my way, we would WARN and TAINT the kernel whenever such
> > garbage got used.
> 
> These are things that used to live in SMM, requiring all CPUs to
> disappear into SMM mode in a way that was completely opaque to the OS.
> 
> PRM runs under the control of the OS, does not require privileges and
> only needs MMIO access to the regions it describes in its manifest
> (which the OS can inspect, if desired). So if there are security
> concerns with PRM today, it is because we were lazy and did not
> implement PRM securely from the beginning.
> 
> In my defense, I wasn't aware of the unprivileged requirement until
> you spotted it recently: it was something I had asked for when the PRM
> spec was put up for "review" by the Intel and MS authors, and they
> told me they couldn't possibly make any changes at that point, because
> it had already gone into production. But as it turns out, the change
> was made after all.
> 
> I am a total noob when it comes to how x86 does its ring0/ring3
> switching, but with some help, I should be able to prototype something
> to call into the PRM service unprivileged, running under the efi_mm.

The ring transition itself is done using IRET; create a iret frame with
userspace CS and the right IP (and flag etc.) and off you go. The
problem is getting back in the kernel I suppose. All the 'normal' kernel
entry points assume the kernel stack is empty and all that.

The whole usermodehelper stuff creates a whole extra thread, sets
everything up and drops into userspace. Perhaps that is the easiest
solution. Basically you set the thread's mm to efi_mm, populate
task_pt_regs() with the right bits and simply drop into 'userspace'.

Then it can complete by terminating itself (sys_exit()) and the calling
context reaps the thing and continues.

> Would that allay your concerns?

Yeah, running it as userspace would be fine; we don't trust that.

But again; a native driver is ever so much better than relying on PRM.

In this case it is AMD doing a driver for their own chips, they know how
they work, they should be able to write this natively.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-16 14:38               ` Peter Zijlstra
@ 2026-01-19 14:33                 ` Robert Richter
  2026-01-19 15:00                   ` Gregory Price
                                     ` (3 more replies)
  0 siblings, 4 replies; 51+ messages in thread
From: Robert Richter @ 2026-01-19 14:33 UTC (permalink / raw)
  To: Peter Zijlstra, Dan Williams, Dave Jiang
  Cc: Ard Biesheuvel, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn,
	Borislav Petkov, Yazen Ghannam, Rafael J. Wysocki, John Allen

(+Rafael and some AMD folks)

Hi Peter,

On Fri, Jan 16, 2026 at 03:38:38PM +0100, Peter Zijlstra wrote:
> On Thu, Jan 15, 2026 at 09:30:10AM +0100, Ard Biesheuvel wrote:
> > On Thu, 15 Jan 2026 at 09:04, Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Wed, Jan 14, 2026 at 06:08:59PM +0000, Jonathan Cameron wrote:
> > >
> > > > Do we have a potential issue wrt to merging this as it stands and improving
> > > > on it later?  i.e. Is this a blocking issue for this patch set?
> > >
> > > Well, why do you *have* to use PRMT at all? And this is a serious
> > > question; PRMT is basically injecting unaudited magic code into the
> > > kernel, and that is a security risk.
> > >
> > > Worse, in order to run this shit, we have to lower or disable various
> > > security measures.
> > >
> > 
> > Only if we decide to keep running it privileged, which the PRM spec no
> > longer requires (as you have confirmed yourself when we last discussed
> > this, right?)
> 
> Indeed. But those very constraints also make me wonder why we would ever
> bother with PRM at all, and not simply require a native driver. Then you
> actually *know* what the thing does and can debug/fix it without having
> to rely on BIOS updates and whatnot.

an address translation driver needs the configuration data from the
Data Fabric, which is only known to firmware but not to the kernel.
Other ways would be necessary to expose and calculate that data, if it
is even feasible to make this information available.

So using PRM looks reasonable to me as this abstracts the logic and
data behind a method, same as doing a library call. Of course, you
don't want to trust that, but that could be addressed running it
unprivileged.

> Worse, you might have to deal with various incompatible buggy PRM
> versions because BIOS :/

The address translation functions are straight forward. I haven't
experienced any issues here. If there would be any, this will be
solvable, e.g. by requiring a specific minimum version or uuid to run
PRM.

> 
> > > If I had my way, we would WARN and TAINT the kernel whenever such
> > > garbage got used.
> > 
> > These are things that used to live in SMM, requiring all CPUs to
> > disappear into SMM mode in a way that was completely opaque to the OS.
> > 
> > PRM runs under the control of the OS, does not require privileges and
> > only needs MMIO access to the regions it describes in its manifest
> > (which the OS can inspect, if desired). So if there are security
> > concerns with PRM today, it is because we were lazy and did not
> > implement PRM securely from the beginning.
> > 
> > In my defense, I wasn't aware of the unprivileged requirement until
> > you spotted it recently: it was something I had asked for when the PRM
> > spec was put up for "review" by the Intel and MS authors, and they
> > told me they couldn't possibly make any changes at that point, because
> > it had already gone into production. But as it turns out, the change
> > was made after all.
> > 
> > I am a total noob when it comes to how x86 does its ring0/ring3
> > switching, but with some help, I should be able to prototype something
> > to call into the PRM service unprivileged, running under the efi_mm.
> 
> The ring transition itself is done using IRET; create a iret frame with
> userspace CS and the right IP (and flag etc.) and off you go. The
> problem is getting back in the kernel I suppose. All the 'normal' kernel
> entry points assume the kernel stack is empty and all that.
> 
> The whole usermodehelper stuff creates a whole extra thread, sets
> everything up and drops into userspace. Perhaps that is the easiest
> solution. Basically you set the thread's mm to efi_mm, populate
> task_pt_regs() with the right bits and simply drop into 'userspace'.
> 
> Then it can complete by terminating itself (sys_exit()) and the calling
> context reaps the thing and continues.

I can help with testing and also work on securing the PRM calls.
Thanks Ard for also looking into this.

> 
> > Would that allay your concerns?
> 
> Yeah, running it as userspace would be fine; we don't trust that.
> 
> But again; a native driver is ever so much better than relying on PRM.
> 
> In this case it is AMD doing a driver for their own chips, they know how
> they work, they should be able to write this natively.

Since a native driver introduces additional issues, as explained
above, I would prefer to use PRM for address translation and instead
ensure the PRM call is secure.

Dan, Dave, regarding this series, the cxl driver just uses existing
PRM kernel code and does not implement anything new here. Is there
anything that would prevent this series from being accepted? We are
already at v10 and review is complete:

https://patchwork.kernel.org/project/cxl/list/?series=1042412

I will follow up with working on unprivileged PRM calls. I think, that
will be the best solution here.

Thanks,

-Robert

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-19 14:33                 ` Robert Richter
@ 2026-01-19 15:00                   ` Gregory Price
  2026-01-19 15:15                   ` Dave Jiang
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 51+ messages in thread
From: Gregory Price @ 2026-01-19 15:00 UTC (permalink / raw)
  To: Robert Richter
  Cc: Peter Zijlstra, Dan Williams, Dave Jiang, Ard Biesheuvel,
	Jonathan Cameron, Alison Schofield, Vishal Verma, Ira Weiny,
	Davidlohr Bueso, linux-cxl, linux-kernel, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn, Borislav Petkov, Yazen Ghannam,
	Rafael J. Wysocki, John Allen

On Mon, Jan 19, 2026 at 03:33:33PM +0100, Robert Richter wrote:
> Dan, Dave, regarding this series, the cxl driver just uses existing
> PRM kernel code and does not implement anything new here. Is there
> anything that would prevent this series from being accepted? We are
> already at v10 and review is complete:
> 
> https://patchwork.kernel.org/project/cxl/list/?series=1042412
> 

I will also add that this code has been heavily tested version to
version on many thousands of boxes for over a year.

~Gregory

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-19 14:33                 ` Robert Richter
  2026-01-19 15:00                   ` Gregory Price
@ 2026-01-19 15:15                   ` Dave Jiang
  2026-01-19 16:03                   ` Yazen Ghannam
  2026-01-20 21:23                   ` dan.j.williams
  3 siblings, 0 replies; 51+ messages in thread
From: Dave Jiang @ 2026-01-19 15:15 UTC (permalink / raw)
  To: Robert Richter, Peter Zijlstra, Dan Williams
  Cc: Ard Biesheuvel, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn,
	Borislav Petkov, Yazen Ghannam, Rafael J. Wysocki, John Allen



On 1/19/26 7:33 AM, Robert Richter wrote:
> (+Rafael and some AMD folks)
> 
> Hi Peter,
> 
> On Fri, Jan 16, 2026 at 03:38:38PM +0100, Peter Zijlstra wrote:
>> On Thu, Jan 15, 2026 at 09:30:10AM +0100, Ard Biesheuvel wrote:
>>> On Thu, 15 Jan 2026 at 09:04, Peter Zijlstra <peterz@infradead.org> wrote:
>>>>
>>>> On Wed, Jan 14, 2026 at 06:08:59PM +0000, Jonathan Cameron wrote:
>>>>
>>>>> Do we have a potential issue wrt to merging this as it stands and improving
>>>>> on it later?  i.e. Is this a blocking issue for this patch set?
>>>>
>>>> Well, why do you *have* to use PRMT at all? And this is a serious
>>>> question; PRMT is basically injecting unaudited magic code into the
>>>> kernel, and that is a security risk.
>>>>
>>>> Worse, in order to run this shit, we have to lower or disable various
>>>> security measures.
>>>>
>>>
>>> Only if we decide to keep running it privileged, which the PRM spec no
>>> longer requires (as you have confirmed yourself when we last discussed
>>> this, right?)
>>
>> Indeed. But those very constraints also make me wonder why we would ever
>> bother with PRM at all, and not simply require a native driver. Then you
>> actually *know* what the thing does and can debug/fix it without having
>> to rely on BIOS updates and whatnot.
> 
> an address translation driver needs the configuration data from the
> Data Fabric, which is only known to firmware but not to the kernel.
> Other ways would be necessary to expose and calculate that data, if it
> is even feasible to make this information available.
> 
> So using PRM looks reasonable to me as this abstracts the logic and
> data behind a method, same as doing a library call. Of course, you
> don't want to trust that, but that could be addressed running it
> unprivileged.
> 
>> Worse, you might have to deal with various incompatible buggy PRM
>> versions because BIOS :/
> 
> The address translation functions are straight forward. I haven't
> experienced any issues here. If there would be any, this will be
> solvable, e.g. by requiring a specific minimum version or uuid to run
> PRM.
> 
>>
>>>> If I had my way, we would WARN and TAINT the kernel whenever such
>>>> garbage got used.
>>>
>>> These are things that used to live in SMM, requiring all CPUs to
>>> disappear into SMM mode in a way that was completely opaque to the OS.
>>>
>>> PRM runs under the control of the OS, does not require privileges and
>>> only needs MMIO access to the regions it describes in its manifest
>>> (which the OS can inspect, if desired). So if there are security
>>> concerns with PRM today, it is because we were lazy and did not
>>> implement PRM securely from the beginning.
>>>
>>> In my defense, I wasn't aware of the unprivileged requirement until
>>> you spotted it recently: it was something I had asked for when the PRM
>>> spec was put up for "review" by the Intel and MS authors, and they
>>> told me they couldn't possibly make any changes at that point, because
>>> it had already gone into production. But as it turns out, the change
>>> was made after all.
>>>
>>> I am a total noob when it comes to how x86 does its ring0/ring3
>>> switching, but with some help, I should be able to prototype something
>>> to call into the PRM service unprivileged, running under the efi_mm.
>>
>> The ring transition itself is done using IRET; create a iret frame with
>> userspace CS and the right IP (and flag etc.) and off you go. The
>> problem is getting back in the kernel I suppose. All the 'normal' kernel
>> entry points assume the kernel stack is empty and all that.
>>
>> The whole usermodehelper stuff creates a whole extra thread, sets
>> everything up and drops into userspace. Perhaps that is the easiest
>> solution. Basically you set the thread's mm to efi_mm, populate
>> task_pt_regs() with the right bits and simply drop into 'userspace'.
>>
>> Then it can complete by terminating itself (sys_exit()) and the calling
>> context reaps the thing and continues.
> 
> I can help with testing and also work on securing the PRM calls.
> Thanks Ard for also looking into this.
> 
>>
>>> Would that allay your concerns?
>>
>> Yeah, running it as userspace would be fine; we don't trust that.
>>
>> But again; a native driver is ever so much better than relying on PRM.
>>
>> In this case it is AMD doing a driver for their own chips, they know how
>> they work, they should be able to write this natively.
> 
> Since a native driver introduces additional issues, as explained
> above, I would prefer to use PRM for address translation and instead
> ensure the PRM call is secure.
> 
> Dan, Dave, regarding this series, the cxl driver just uses existing
> PRM kernel code and does not implement anything new here. Is there
> anything that would prevent this series from being accepted? We are
> already at v10 and review is complete:
> 
> https://patchwork.kernel.org/project/cxl/list/?series=1042412
> 
> I will follow up with working on unprivileged PRM calls. I think, that
> will be the best solution here.

I have no objections with the promise of work on unprivileged PRM call. Please rev the convention doc with Dan's request and we can get this merged.

> 
> Thanks,
> 
> -Robert


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-19 14:33                 ` Robert Richter
  2026-01-19 15:00                   ` Gregory Price
  2026-01-19 15:15                   ` Dave Jiang
@ 2026-01-19 16:03                   ` Yazen Ghannam
  2026-01-21  0:35                     ` dan.j.williams
  2026-01-20 21:23                   ` dan.j.williams
  3 siblings, 1 reply; 51+ messages in thread
From: Yazen Ghannam @ 2026-01-19 16:03 UTC (permalink / raw)
  To: Robert Richter
  Cc: Peter Zijlstra, Dan Williams, Dave Jiang, Ard Biesheuvel,
	Jonathan Cameron, Alison Schofield, Vishal Verma, Ira Weiny,
	Davidlohr Bueso, linux-cxl, linux-kernel, Gregory Price,
	Fabio M. De Francesco, Terry Bowman, Joshua Hahn, Borislav Petkov,
	Rafael J. Wysocki, John Allen

On Mon, Jan 19, 2026 at 03:33:33PM +0100, Robert Richter wrote:
> (+Rafael and some AMD folks)
> 
> Hi Peter,
> 
> On Fri, Jan 16, 2026 at 03:38:38PM +0100, Peter Zijlstra wrote:
> > On Thu, Jan 15, 2026 at 09:30:10AM +0100, Ard Biesheuvel wrote:
> > > On Thu, 15 Jan 2026 at 09:04, Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > On Wed, Jan 14, 2026 at 06:08:59PM +0000, Jonathan Cameron wrote:
> > > >
> > > > > Do we have a potential issue wrt to merging this as it stands and improving
> > > > > on it later?  i.e. Is this a blocking issue for this patch set?
> > > >
> > > > Well, why do you *have* to use PRMT at all? And this is a serious
> > > > question; PRMT is basically injecting unaudited magic code into the
> > > > kernel, and that is a security risk.
> > > >
> > > > Worse, in order to run this shit, we have to lower or disable various
> > > > security measures.
> > > >
> > > 
> > > Only if we decide to keep running it privileged, which the PRM spec no
> > > longer requires (as you have confirmed yourself when we last discussed
> > > this, right?)
> > 
> > Indeed. But those very constraints also make me wonder why we would ever
> > bother with PRM at all, and not simply require a native driver. Then you
> > actually *know* what the thing does and can debug/fix it without having
> > to rely on BIOS updates and whatnot.
> 
> an address translation driver needs the configuration data from the
> Data Fabric, which is only known to firmware but not to the kernel.
> Other ways would be necessary to expose and calculate that data, if it
> is even feasible to make this information available.
> 
> So using PRM looks reasonable to me as this abstracts the logic and
> data behind a method, same as doing a library call. Of course, you
> don't want to trust that, but that could be addressed running it
> unprivileged.
> 

Additionally, the same translation code can be used in multiple places
(tools, FW, kernel, etc.). Most consumers treat the code like a library
that they include. It's coded once and bugs can be fixed in one place.

However, with a native kernel driver, we have to re-write everything to
match coding style, licensing, etc.

Also, new hardware may need changes to the code (sometimes major). So
there's upstream work, backporting (more testing), and so on.

See the AMD Address Translation Library at drivers/ras/amd/atl/.

> > Worse, you might have to deal with various incompatible buggy PRM
> > versions because BIOS :/
> 
> The address translation functions are straight forward. I haven't
> experienced any issues here. If there would be any, this will be
> solvable, e.g. by requiring a specific minimum version or uuid to run
> PRM.
> 

This is a good point, and I've brought this up with some of my
colleagues.

The PRM methods are supposed to be able to be updated at runtime by the
OS. We could think of this as a similar flow to microcode.

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-19 14:33                 ` Robert Richter
                                     ` (2 preceding siblings ...)
  2026-01-19 16:03                   ` Yazen Ghannam
@ 2026-01-20 21:23                   ` dan.j.williams
  3 siblings, 0 replies; 51+ messages in thread
From: dan.j.williams @ 2026-01-20 21:23 UTC (permalink / raw)
  To: Robert Richter, Peter Zijlstra, Dan Williams, Dave Jiang
  Cc: Ard Biesheuvel, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn,
	Borislav Petkov, Yazen Ghannam, Rafael J. Wysocki, John Allen

Robert Richter wrote:
[..]
> > Indeed. But those very constraints also make me wonder why we would ever
> > bother with PRM at all, and not simply require a native driver. Then you
> > actually *know* what the thing does and can debug/fix it without having
> > to rely on BIOS updates and whatnot.
> 
> an address translation driver needs the configuration data from the
> Data Fabric, which is only known to firmware but not to the kernel.
> Other ways would be necessary to expose and calculate that data, if it
> is even feasible to make this information available.

If it is just data is it amenable to put into a table?

Look at the complexity of the XOR addressing mode already defined in the
CEDT.CFMWS table, is the complexity significantly different than that?
 
> So using PRM looks reasonable to me as this abstracts the logic and
> data behind a method, same as doing a library call. Of course, you
> don't want to trust that, but that could be addressed running it
> unprivileged.

PRM should always be a last resort relative to an open specification
with a native driver implementation.

At a minimum Peter's feedback reiginited my simmering concerns with PRM
as a system-software design tool, and this should be a test case for
what Linux is willing and not willing to accept moving forward.

> > Worse, you might have to deal with various incompatible buggy PRM
> > versions because BIOS :/
> 
> The address translation functions are straight forward. I haven't
> experienced any issues here. If there would be any, this will be
> solvable, e.g. by requiring a specific minimum version or uuid to run
> PRM.

Can you publish the source to the PRM handler?

[..]
> > The whole usermodehelper stuff creates a whole extra thread, sets
> > everything up and drops into userspace. Perhaps that is the easiest
> > solution. Basically you set the thread's mm to efi_mm, populate
> > task_pt_regs() with the right bits and simply drop into 'userspace'.
> > 
> > Then it can complete by terminating itself (sys_exit()) and the calling
> > context reaps the thing and continues.
> 
> I can help with testing and also work on securing the PRM calls.
> Thanks Ard for also looking into this.
> 
> > 
> > > Would that allay your concerns?
> > 
> > Yeah, running it as userspace would be fine; we don't trust that.
> > 
> > But again; a native driver is ever so much better than relying on PRM.
> > 
> > In this case it is AMD doing a driver for their own chips, they know how
> > they work, they should be able to write this natively.
> 
> Since a native driver introduces additional issues, as explained
> above, I would prefer to use PRM for address translation and instead
> ensure the PRM call is secure.

How is this case outside of the typical issues that kernel and its ABI
are meant to abstract?

> Dan, Dave, regarding this series, the cxl driver just uses existing
> PRM kernel code and does not implement anything new here. Is there
> anything that would prevent this series from being accepted? We are
> already at v10 and review is complete:
> 
> https://patchwork.kernel.org/project/cxl/list/?series=1042412
> 
> I will follow up with working on unprivileged PRM calls. I think, that
> will be the best solution here.

The PRM to ring3 work is important for the PRM handlers that are
converting existing SMM flows to use PRM. For new DSMs the answer to the
"why not a native driver?" question needs to be clear.

That said, I am also interested in the PRM to ring3 work and did some
investigation there especially when the threat of runtime updates to PRM
handlers was being proposed. I think it is an important capability that
might also get some reuse with the confidential computing case for some
interactions with platform security services, but that is separate from
the primary question of enabling wider deployment of PRM solutions.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-19 16:03                   ` Yazen Ghannam
@ 2026-01-21  0:35                     ` dan.j.williams
  2026-01-21 14:58                       ` Yazen Ghannam
  0 siblings, 1 reply; 51+ messages in thread
From: dan.j.williams @ 2026-01-21  0:35 UTC (permalink / raw)
  To: Yazen Ghannam, Robert Richter
  Cc: Peter Zijlstra, Dan Williams, Dave Jiang, Ard Biesheuvel,
	Jonathan Cameron, Alison Schofield, Vishal Verma, Ira Weiny,
	Davidlohr Bueso, linux-cxl, linux-kernel, Gregory Price,
	Fabio M. De Francesco, Terry Bowman, Joshua Hahn, Borislav Petkov,
	Rafael J. Wysocki, John Allen

Yazen Ghannam wrote:
[..]
> Additionally, the same translation code can be used in multiple places
> (tools, FW, kernel, etc.). Most consumers treat the code like a library
> that they include. It's coded once and bugs can be fixed in one place.
> 
> However, with a native kernel driver, we have to re-write everything to
> match coding style, licensing, etc.
> 
> Also, new hardware may need changes to the code (sometimes major). So
> there's upstream work, backporting (more testing), and so on.
> 
> See the AMD Address Translation Library at drivers/ras/amd/atl/.

There is more nuance here.

There are indeed cases where there are high degrees of non-architectural
details in flux from one product to the next. For example, the details
that EDAC no longer needs to chase because the ADXL DSM exists are a
solution to the problem of shifting and complicated memory topology
details.

CXL is a standard that this architecture at issue decided to inject
software-model-destroying artificats like CXL-endpoint-HPA to
CXL-Host-Bridge-SPA (Normalized Addressing) translation.

A Normalized Address looks like a static offset per host bridge, not a
method call round trip to a runtime firmware service.

Note that there are other platforms that break basic HPA-to-SPA
assumptions, but those have been handled with native driver support via
XOR interleave, and non-CXL-Host-Bridge target updates to the
ACPI.CEDT.CFMWS table.

> > > Worse, you might have to deal with various incompatible buggy PRM
> > > versions because BIOS :/
> > 
> > The address translation functions are straight forward. I haven't
> > experienced any issues here. If there would be any, this will be
> > solvable, e.g. by requiring a specific minimum version or uuid to run
> > PRM.
> > 
> 
> This is a good point, and I've brought this up with some of my
> colleagues.

The more that software bugs leak into this interface requiring
consideration of versions and the like, the louder the requests for
"please move this to a driver" will become.

> The PRM methods are supposed to be able to be updated at runtime by the
> OS. We could think of this as a similar flow to microcode.

No, at the point where runtime updates are needed outside of a BIOS
update we have crossed the threshold into Linux actively taking on new
maintenance burden to enable hardware platforms to avoid the discipline
of architectural solutions.

Microcode is a confined solution space. PRM is unbounded.

Now, stepping back, this specific Zen5 support has been a long time
coming. Specifically, there are shipping platforms where Linux is unable
to use any of its CXL RAS support because it gets tripped up on this
fundamental step. I would like to see exact details on what this PRM
handler is doing so that we, linux-cxl community, can make a
determination about:

    "yes this algorithm is so tiny and static, PRM not indicated"

    "no, this is complicated and guaranteed to keep shifting product to
     product, Linux is better off with a PRM helper"

...but still merge this PRM call, regardless of the determination. Put
the next potential use of PRM on notice that native drivers are required
outside of meeting the "complicated + shifting" criteria that indicate
PRM.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-21  0:35                     ` dan.j.williams
@ 2026-01-21 14:58                       ` Yazen Ghannam
  2026-01-21 22:09                         ` dan.j.williams
  0 siblings, 1 reply; 51+ messages in thread
From: Yazen Ghannam @ 2026-01-21 14:58 UTC (permalink / raw)
  To: dan.j.williams
  Cc: Robert Richter, Peter Zijlstra, Dave Jiang, Ard Biesheuvel,
	Jonathan Cameron, Alison Schofield, Vishal Verma, Ira Weiny,
	Davidlohr Bueso, linux-cxl, linux-kernel, Gregory Price,
	Fabio M. De Francesco, Terry Bowman, Joshua Hahn, Borislav Petkov,
	Rafael J. Wysocki, John Allen

On Tue, Jan 20, 2026 at 04:35:57PM -0800, dan.j.williams@intel.com wrote:
> Yazen Ghannam wrote:
> [..]
> > Additionally, the same translation code can be used in multiple places
> > (tools, FW, kernel, etc.). Most consumers treat the code like a library
> > that they include. It's coded once and bugs can be fixed in one place.
> > 
> > However, with a native kernel driver, we have to re-write everything to
> > match coding style, licensing, etc.
> > 
> > Also, new hardware may need changes to the code (sometimes major). So
> > there's upstream work, backporting (more testing), and so on.
> > 
> > See the AMD Address Translation Library at drivers/ras/amd/atl/.
> 
> There is more nuance here.
> 
> There are indeed cases where there are high degrees of non-architectural
> details in flux from one product to the next. For example, the details
> that EDAC no longer needs to chase because the ADXL DSM exists are a
> solution to the problem of shifting and complicated memory topology
> details.
> 

Right, this is the intended use case. 

> CXL is a standard that this architecture at issue decided to inject
> software-model-destroying artificats like CXL-endpoint-HPA to
> CXL-Host-Bridge-SPA (Normalized Addressing) translation.
> 
> A Normalized Address looks like a static offset per host bridge, not a
> method call round trip to a runtime firmware service.
> 
> Note that there are other platforms that break basic HPA-to-SPA
> assumptions, but those have been handled with native driver support via
> XOR interleave, and non-CXL-Host-Bridge target updates to the
> ACPI.CEDT.CFMWS table.
> 

I see. So the concern is including model-specific methods that would
modify the CXL standard flow, correct?

Or, more specifically, is it reliance on external/system-specific
information?

Or the time spent on a round trip call to another service?

> > > > Worse, you might have to deal with various incompatible buggy PRM
> > > > versions because BIOS :/
> > > 
> > > The address translation functions are straight forward. I haven't
> > > experienced any issues here. If there would be any, this will be
> > > solvable, e.g. by requiring a specific minimum version or uuid to run
> > > PRM.
> > > 
> > 
> > This is a good point, and I've brought this up with some of my
> > colleagues.
> 
> The more that software bugs leak into this interface requiring
> consideration of versions and the like, the louder the requests for
> "please move this to a driver" will become.
> 

Yes, ack.

> > The PRM methods are supposed to be able to be updated at runtime by the
> > OS. We could think of this as a similar flow to microcode.
> 
> No, at the point where runtime updates are needed outside of a BIOS
> update we have crossed the threshold into Linux actively taking on new
> maintenance burden to enable hardware platforms to avoid the discipline
> of architectural solutions.
> 
> Microcode is a confined solution space. PRM is unbounded.
> 
> Now, stepping back, this specific Zen5 support has been a long time
> coming. Specifically, there are shipping platforms where Linux is unable
> to use any of its CXL RAS support because it gets tripped up on this
> fundamental step. I would like to see exact details on what this PRM
> handler is doing so that we, linux-cxl community, can make a
> determination about:
> 
>     "yes this algorithm is so tiny and static, PRM not indicated"
> 
>     "no, this is complicated and guaranteed to keep shifting product to
>      product, Linux is better off with a PRM helper"
> 
> ...but still merge this PRM call, regardless of the determination. Put
> the next potential use of PRM on notice that native drivers are required
> outside of meeting the "complicated + shifting" criteria that indicate
> PRM.

I can give a general overview. The AMD CXL address translation flows are
an extension of the AMD Data Fabric address translation flows.
Specifically for Zen5, it would be "DF v4.5" with adjustments for CXL.

The "DF 4.5" translation is upstream in the AMD Address Translation
Library. See code examples with "git grep -i df4p5".

I would consider this "complicated + shifting". This is true for general
memory errors reported through MCA/EDAC.

I defer to my CXL colleagues if the "shifting" criteria applies to
future CXL systems.

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-21 14:58                       ` Yazen Ghannam
@ 2026-01-21 22:09                         ` dan.j.williams
  2026-01-21 23:12                           ` Gregory Price
  0 siblings, 1 reply; 51+ messages in thread
From: dan.j.williams @ 2026-01-21 22:09 UTC (permalink / raw)
  To: Yazen Ghannam, dan.j.williams
  Cc: Robert Richter, Peter Zijlstra, Dave Jiang, Ard Biesheuvel,
	Jonathan Cameron, Alison Schofield, Vishal Verma, Ira Weiny,
	Davidlohr Bueso, linux-cxl, linux-kernel, Gregory Price,
	Fabio M. De Francesco, Terry Bowman, Joshua Hahn, Borislav Petkov,
	Rafael J. Wysocki, John Allen

Yazen Ghannam wrote:
> On Tue, Jan 20, 2026 at 04:35:57PM -0800, dan.j.williams@intel.com wrote:
> > Yazen Ghannam wrote:
> > [..]
> > > Additionally, the same translation code can be used in multiple places
> > > (tools, FW, kernel, etc.). Most consumers treat the code like a library
> > > that they include. It's coded once and bugs can be fixed in one place.
> > > 
> > > However, with a native kernel driver, we have to re-write everything to
> > > match coding style, licensing, etc.
> > > 
> > > Also, new hardware may need changes to the code (sometimes major). So
> > > there's upstream work, backporting (more testing), and so on.
> > > 
> > > See the AMD Address Translation Library at drivers/ras/amd/atl/.
> > 
> > There is more nuance here.
> > 
> > There are indeed cases where there are high degrees of non-architectural
> > details in flux from one product to the next. For example, the details
> > that EDAC no longer needs to chase because the ADXL DSM exists are a
> > solution to the problem of shifting and complicated memory topology
> > details.
> > 
> 
> Right, this is the intended use case. 
> 
> > CXL is a standard that this architecture at issue decided to inject
> > software-model-destroying artificats like CXL-endpoint-HPA to
> > CXL-Host-Bridge-SPA (Normalized Addressing) translation.
> > 
> > A Normalized Address looks like a static offset per host bridge, not a
> > method call round trip to a runtime firmware service.
> > 
> > Note that there are other platforms that break basic HPA-to-SPA
> > assumptions, but those have been handled with native driver support via
> > XOR interleave, and non-CXL-Host-Bridge target updates to the
> > ACPI.CEDT.CFMWS table.
> > 
> 
> I see. So the concern is including model-specific methods that would
> modify the CXL standard flow, correct?

Yes, but more than that, Linux benefits from one vendor's model-specific
feature being upleveled into a standard concept.

With ACPI there is a Code First process to get clarifications and small
features into the specification for situations like this. For CXL we can
only approximate that with documenting "conventions" for shipping
platforms [1]. The request for CXL is document the driver-breaking
platform features in a way that at least gives Linux a way to say "oh,
hey $HW_VENDOR, you seem to be taking the same liberties with the
specification as $OTHER_HW_VENDOR. Please implement it the same way
while working a change to the CXL specification on the backend."

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7ac6612d6b79

As I told Robert, I want a generic "Normalized Address" facility of
which Zen5 is the first user.

> Or, more specifically, is it reliance on external/system-specific
> information?

Reliance on system information is not a problem. ACPI is great at
distilling platform degrees of freedom into static tables and shared
concepts.

> Or the time spent on a round trip call to another service?

No, overhead is not the concern, opaqueness, complexity, and security
implications of sprinkling runtime service calls for what amounts to "do
some limited address math" is the problem. Static tables can carry a
large problem space without all the pitfalls of runtime service calls.
Examples are "CXL XOR Interleave Math Structure" and "Interleave Set
spans non-CXL domains" feature of the ACPI.CEDT

> > > The PRM methods are supposed to be able to be updated at runtime by the
> > > OS. We could think of this as a similar flow to microcode.
> > 
> > No, at the point where runtime updates are needed outside of a BIOS
> > update we have crossed the threshold into Linux actively taking on new
> > maintenance burden to enable hardware platforms to avoid the discipline
> > of architectural solutions.
> > 
> > Microcode is a confined solution space. PRM is unbounded.
> > 
> > Now, stepping back, this specific Zen5 support has been a long time
> > coming. Specifically, there are shipping platforms where Linux is unable
> > to use any of its CXL RAS support because it gets tripped up on this
> > fundamental step. I would like to see exact details on what this PRM
> > handler is doing so that we, linux-cxl community, can make a
> > determination about:
> > 
> >     "yes this algorithm is so tiny and static, PRM not indicated"
> > 
> >     "no, this is complicated and guaranteed to keep shifting product to
> >      product, Linux is better off with a PRM helper"
> > 
> > ...but still merge this PRM call, regardless of the determination. Put
> > the next potential use of PRM on notice that native drivers are required
> > outside of meeting the "complicated + shifting" criteria that indicate
> > PRM.
> 
> I can give a general overview. The AMD CXL address translation flows are
> an extension of the AMD Data Fabric address translation flows.
> Specifically for Zen5, it would be "DF v4.5" with adjustments for CXL.
> 
> The "DF 4.5" translation is upstream in the AMD Address Translation
> Library. See code examples with "git grep -i df4p5".

Right, that looks like all the same complexity that the Intel ADXL DSM
deals with, but ADXL only needs to handle the "complicated + shifting"
nature of product-to-product DRAM architecture changes. CXL address
translation is left to the OS driver because CXL is standardized (can
not shift).

> I would consider this "complicated + shifting". This is true for general
> memory errors reported through MCA/EDAC.
> 
> I defer to my CXL colleagues if the "shifting" criteria applies to
> future CXL systems.

My hypothesis is that it was convenient for $HW_VENDOR to glomm this
small subset of "CXL Normalized Address" into existing firmware method
infrastructure. It did so at the expense of exporting the complexity of
yet one more PRM method call to Linux.

A static table is unplanned work for $HW_VENDOR, comparable of amount of
work for Linux, and lower amount of risk to mitigate from PRM exposure
for Linux.

My goal here is to have an archived message to point to the next time
someone wants to reach for the "PRM" tool and understand that Linux has
a high bar for new invocations.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-21 22:09                         ` dan.j.williams
@ 2026-01-21 23:12                           ` Gregory Price
  2026-01-22  2:05                             ` dan.j.williams
  0 siblings, 1 reply; 51+ messages in thread
From: Gregory Price @ 2026-01-21 23:12 UTC (permalink / raw)
  To: dan.j.williams
  Cc: Yazen Ghannam, Robert Richter, Peter Zijlstra, Dave Jiang,
	Ard Biesheuvel, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Davidlohr Bueso, linux-cxl, linux-kernel,
	Fabio M. De Francesco, Terry Bowman, Joshua Hahn, Borislav Petkov,
	Rafael J. Wysocki, John Allen

On Wed, Jan 21, 2026 at 02:09:27PM -0800, dan.j.williams@intel.com wrote:
> > 
> > I see. So the concern is including model-specific methods that would
> > modify the CXL standard flow, correct?
> 
...
> 
> As I told Robert, I want a generic "Normalized Address" facility of
> which Zen5 is the first user.
> 

Isn't that what this patch functionally is w/ a specific PRM function?

   rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);

Or is the request now: replace this with static table data?


point of ignorance: what facility would you use to expose such tables?

-----

When i initialially hacked up driver support for this mode before
getting PRM support, the "hacked up translation code" I was this:

  /* Find 0-based offset into whole interleave region */
  dev = (pdev->bus->number == 0xe1) ? 0 : 1;
  offset = (0x100 * (((norm_addr >> 8) * 2) + dev)) + (norm_addr & 0xff);

  /* Find the SPA base for the address */
  for (idx = 0; idx < cfmws_nr; idx++) {
      size = cxl_get_cfmws_size(idx);
      /* We may have a gap in the CFMWS */
      if (offset < size) {
          *sys_addr = cxl_get_cfmws_base(idx) + offset;
          return 0;
      }
      offset -= size;
   }

------

This makes hard-assumptions about two things:

  device interleave index  - pcidev(0xe1) => 0
  cfmws base               - all CFMWS are used for this one region

cxl_get_cfmws_base() was a call into ACPI code, and the acpi code just
kept a global cache of the raw CEDT CFMWS structures (base + size);

So, assuming you had such tables, it would need to be like:

                  Normalized Decoders Table
    --------------------------------------------------------
    | CXL PCIDev | Decoder  | CFMW SPAN  |  Interleave IDX |
    --------------------------------------------------------
    |     d1     |    0     |    1,2     |        0        |
    |     e1     |    0     |    1,2     |        1        |
    --------------------------------------------------------
  --------------------------------^
  |            CFMW Index Table
  |  -----------------------------------------
  |  | CFMW ID |     BASE       |    SIZE    |
  |  -----------------------------------------
  |  |    0    | 0xb00000....   |     ...    |
  |->|    1    | 0xc05000....   |            |
  |->|    2    | 0x100500....   |            |
     |    3    | 0x200000....   |     ...    |
     -----------------------------------------

-------

The code above turns into

int cxl_normal_translate(pdev, norm_addr, u64* sys_addr)
{
    int i_idx = cxl_nrm_decoder_interleave_index(pdev);
    int span, i;
    u64 offset;

    if (i_idx < 0)
    	return -EINVAL;

    span = cxl_nrm_decoder_window_span(pdev);

    /* Normalized offset into whole region */
    offset = (0x100 * (((norm_addr >> 8) * 2) + i_idx)) + (norm_addr & 0xff);

    /* Find actual CFMW Base (might cross multiple w/ gaps) */
    for (i = 0; i < span; i++) {
        u64 base, size;
	int id;

        id = cxl_nrm_decoder_cfmws_id(i);
	if (id < 0)
	   return -EINVAL;

        if (!cxl_nrm_decoder_cfmws_data(id, &base, &size))
	   return -EINVAL;

	if (offset < size) {
	    *sys_addr = cxl_get_cfmws_base(id) + offset;
	    return 0;
	}
        offset -= size;
    }
    return -EINVAL;
}

Where the cxl_nrm_*() functions just query the exposed tables - however
that actually happens.

--------

I don't know whether the above math is actually true, it's basically
just the simply interleave maths. If something else is going on, then
this whole table thing might not actually work.

The rest of the patch set would more or less stay the same.

~Gregory

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-21 23:12                           ` Gregory Price
@ 2026-01-22  2:05                             ` dan.j.williams
  2026-01-22  6:09                               ` dan.j.williams
  0 siblings, 1 reply; 51+ messages in thread
From: dan.j.williams @ 2026-01-22  2:05 UTC (permalink / raw)
  To: Gregory Price, dan.j.williams
  Cc: Yazen Ghannam, Robert Richter, Peter Zijlstra, Dave Jiang,
	Ard Biesheuvel, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Davidlohr Bueso, linux-cxl, linux-kernel,
	Fabio M. De Francesco, Terry Bowman, Joshua Hahn, Borislav Petkov,
	Rafael J. Wysocki, John Allen

Gregory Price wrote:
> On Wed, Jan 21, 2026 at 02:09:27PM -0800, dan.j.williams@intel.com wrote:
> > > 
> > > I see. So the concern is including model-specific methods that would
> > > modify the CXL standard flow, correct?
> > 
> ...
> > 
> > As I told Robert, I want a generic "Normalized Address" facility of
> > which Zen5 is the first user.
> > 
> 
> Isn't that what this patch functionally is w/ a specific PRM function?
> 
>    rc = acpi_call_prm_handler(prm_cxl_dpa_spa_guid, &data);
> 
> Or is the request now: replace this with static table data?

As I mentioned at the bottom of this message to Yazen [1], the request
is to prove or disprove the hypothesis that a table would have sufficed,
but otherwise go ahead with merging this handler. Set a precedent that
the next attempt to solve a problem like this with PRM will face a
higher bar.

[1]: http://lore.kernel.org/69701f6de978_1d6f1001e@dwillia2-mobl4.notmuch

> point of ignorance: what facility would you use to expose such tables?

New sub-structure of the CEDT similar to the CXIMS.

> -----
> 
> When i initialially hacked up driver support for this mode before
> getting PRM support, the "hacked up translation code" I was this:
> 
>   /* Find 0-based offset into whole interleave region */
>   dev = (pdev->bus->number == 0xe1) ? 0 : 1;
>   offset = (0x100 * (((norm_addr >> 8) * 2) + dev)) + (norm_addr & 0xff);
> 
>   /* Find the SPA base for the address */
>   for (idx = 0; idx < cfmws_nr; idx++) {
>       size = cxl_get_cfmws_size(idx);
>       /* We may have a gap in the CFMWS */
>       if (offset < size) {
>           *sys_addr = cxl_get_cfmws_base(idx) + offset;
>           return 0;
>       }
>       offset -= size;
>    }
> 
> ------
> 
> This makes hard-assumptions about two things:
> 
>   device interleave index  - pcidev(0xe1) => 0
>   cfmws base               - all CFMWS are used for this one region
> 
> cxl_get_cfmws_base() was a call into ACPI code, and the acpi code just
> kept a global cache of the raw CEDT CFMWS structures (base + size);
> 
> So, assuming you had such tables, it would need to be like:
> 
>                   Normalized Decoders Table
>     --------------------------------------------------------
>     | CXL PCIDev | Decoder  | CFMW SPAN  |  Interleave IDX |
>     --------------------------------------------------------
>     |     d1     |    0     |    1,2     |        0        |
>     |     e1     |    0     |    1,2     |        1        |
>     --------------------------------------------------------
>   --------------------------------^
>   |            CFMW Index Table
>   |  -----------------------------------------
>   |  | CFMW ID |     BASE       |    SIZE    |
>   |  -----------------------------------------
>   |  |    0    | 0xb00000....   |     ...    |
>   |->|    1    | 0xc05000....   |            |
>   |->|    2    | 0x100500....   |            |
>      |    3    | 0x200000....   |     ...    |
>      -----------------------------------------
> 
> -------
> 
> The code above turns into
> 
> int cxl_normal_translate(pdev, norm_addr, u64* sys_addr)
> {
>     int i_idx = cxl_nrm_decoder_interleave_index(pdev);
>     int span, i;
>     u64 offset;
> 
>     if (i_idx < 0)
>     	return -EINVAL;
> 
>     span = cxl_nrm_decoder_window_span(pdev);
> 
>     /* Normalized offset into whole region */
>     offset = (0x100 * (((norm_addr >> 8) * 2) + i_idx)) + (norm_addr & 0xff);
> 
>     /* Find actual CFMW Base (might cross multiple w/ gaps) */
>     for (i = 0; i < span; i++) {
>         u64 base, size;
> 	int id;
> 
>         id = cxl_nrm_decoder_cfmws_id(i);
> 	if (id < 0)
> 	   return -EINVAL;
> 
>         if (!cxl_nrm_decoder_cfmws_data(id, &base, &size))
> 	   return -EINVAL;
> 
> 	if (offset < size) {
> 	    *sys_addr = cxl_get_cfmws_base(id) + offset;
> 	    return 0;
> 	}
>         offset -= size;
>     }
>     return -EINVAL;
> }
> 
> Where the cxl_nrm_*() functions just query the exposed tables - however
> that actually happens.
> 
> --------
> 
> I don't know whether the above math is actually true, it's basically
> just the simply interleave maths. If something else is going on, then
> this whole table thing might not actually work.
> 
> The rest of the patch set would more or less stay the same.

If the above is even close to being correct, I would merge that in a
heartbeat over this PRM proposal.

Robert, do you really want to be spending time on trying moving PRM to
userspace vs just doing the above?

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT
  2026-01-22  2:05                             ` dan.j.williams
@ 2026-01-22  6:09                               ` dan.j.williams
  0 siblings, 0 replies; 51+ messages in thread
From: dan.j.williams @ 2026-01-22  6:09 UTC (permalink / raw)
  To: dan.j.williams, Gregory Price, dan.j.williams
  Cc: Yazen Ghannam, Robert Richter, Peter Zijlstra, Dave Jiang,
	Ard Biesheuvel, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Davidlohr Bueso, linux-cxl, linux-kernel,
	Fabio M. De Francesco, Terry Bowman, Joshua Hahn, Borislav Petkov,
	Rafael J. Wysocki, John Allen

dan.j.williams@ wrote:
[..]
> If the above is even close to being correct, I would merge that in a
> heartbeat over this PRM proposal.
> 
> Robert, do you really want to be spending time on trying moving PRM to
> userspace vs just doing the above?

To be clear I am still of the opinion that even if it is confirmed that
Gregory's algorithm would have done the trick with a new table, proceed with
the PRM solution. The PRM method appears to be already shipping and it fixes a
long overdue problem causing end user pain. The request is do not plan to ship
new PRM without clarity on why a native driver approach can not work.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
  2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
                   ` (12 preceding siblings ...)
  2026-01-10 11:46 ` [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing Robert Richter
@ 2026-02-03 18:52 ` Dave Jiang
  2026-02-03 21:35   ` Gregory Price
  2026-02-04 12:58   ` Robert Richter
  13 siblings, 2 replies; 51+ messages in thread
From: Dave Jiang @ 2026-02-03 18:52 UTC (permalink / raw)
  To: Robert Richter, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Jonathan Cameron, Davidlohr Bueso
  Cc: linux-cxl, linux-kernel, Gregory Price, Fabio M. De Francesco,
	Terry Bowman, Joshua Hahn



On 1/10/26 4:46 AM, Robert Richter wrote:
> This patch set adds support for address translation using ACPI PRM and
> enables this for AMD Zen5 platforms. The current approach bases on v4
> and is in response to earlier attempts to implement CXL address
> translation:
> 
>  * v1: [1] and the comments on it, esp. Dan's [2],
>  * v2: [3] and comments on [4], esp. Dave's [5],
>  * v3: [6] and comments on it, esp. Dave's [7],
>  * v4: [8].
> 
> This version addresses Alison's review comments to change the
> implementation to disable HPA/SPA translation handler. There are a
> view minor but no major changes otherwise. See the changelog for
> details. Thank you all for your reviews and testing.
> 
> Documentation of CXL Address Translation Support will be added to the
> Kernel's "Compute Express Link: Linux Conventions". This patch
> submission will be the base for a documentation patch that describes CXL
> Address Translation support accordingly.
> 
> The CXL driver currently does not implement address translation which
> assumes the host physical addresses (HPA) and system physical
> addresses (SPA) are equal.
> 
> Systems with different HPA and SPA addresses need address translation.
> If this is the case, the hardware addresses esp. used in the HDM
> decoder configurations are different to the system's or parent port
> address ranges. E.g. AMD Zen5 systems may be configured to use
> 'Normalized addresses'. Then, CXL endpoints have their own physical
> address base which is not the same as the SPA used by the CXL host
> bridge. Thus, addresses need to be translated from the endpoint's to
> its CXL host bridge's address range.
> 
> To enable address translation, the endpoint's HPA range must be
> translated to the CXL host bridge's address range. A callback is
> introduced to translate a decoder's HPA to the CXL host bridge's
> address range. The callback is then used to determine the region
> parameters which includes the SPA translated address range of the
> endpoint decoder and the interleaving configuration. This is stored in
> struct cxl_region which allows an endpoint decoder to determine that
> parameters based on its assigned region.
> 
> Note that only auto-discovery of decoders is supported. Thus, decoders
> are locked and cannot be configured manually.
> 
> Finally, Zen5 address translation is enabled using ACPI PRMT.
> 
> This series bases on v6.19-rc1.

Applied to cxl/next. Including the conventions doc.
00bc604c96bb762f0f050460e25de2729edb1699

> 
> V9:
>  * rebased onto v6.19-rc1,
>  * updated sob-chains,
>  * removed alignment check in cxl_prm_setup_root() for DPA ranges,
>  * moved assignment to variable len in cxl_prm_setup_root() closer to user,
>  * removed patch from series (Alison):
>    [PATCH v8 12/13] cxl: Check if ULLONG_MAX was returned from translation functions
>  * added patch to factor out poison setup code,
>  * changed implementation to disable HPA/SPA translation handlers (Alison),
> 
> V8:
>  * rebased onto cxl-for-6.19,
>  * updated sob-chains,
>  * renamed cxl_root callback to translation_setup_root,
>  * renamed functions to cxl_root_setup_translation and cxl_prm_setup_root,
>  * added comment around cxl_root_setup_translation(),
>  * added check for ULLONG_MAX of return value of translation functions,
>  * added callback to setup translation for regions
>    (cxl_region_setup_translation, cxl_prm_setup_region),
>  * add HPA/SPA callback handlers that return ULLONG_MAX (Alison),
> 
> V7:
>  * rebased onto cxl/for-6.19/cxl-prm,
>  * reworded comment and description of 11/11 (decoder lock),
> 
> V6:
>  * rebased onto v6.18-rc5 and CXL updates for v6.19,
>  * note: applies on top of: [PATCH v3 0/3] CXL updates for v6.19,
> 
> V5:
>  * fixed build error with !CXL_REGION (kbot),
>  * updated sob-chains,
>  * added note to get_cxl_root_decoder() to drop reference after use
>    (Dave),
>  * moved initialization of base* variables in
>    cxl_prm_translate_hpa_range() (Dave, Jonathan),
>  * fixed initialization of cxlr->hpa_range for the non-auto case
>    (Alison),
>  * added description of the @hpa_range arg to
>    cxl_calc_interleave_pos() (kbot),
>  * removed optional patches 12-14 to send them separately (Alison,
>    Dave),
>  * reordered patches 1-6 to reduce dependencies between them and give
>    way for early pick up candidates,
>  * rebased onto cxl/next (c692f5a947ad),
>  * added commas in comment in cxl_add_to_region() (Jonathan),
>  * removed cxlmd from struct cxl_region_context (Dave, Jonathan),
>  * removed use of PTR_ERR_OR_ZERO() (Jonathan),
>  * increased wrap width to 80 chars for comments in cxl_atl.c (Jonathan),
>  * moved (ways > 1) check out of while loop in cxl_prm_translate_hpa_range()
>    (Jonathan),
>  * removed trailing comma in struct prm_cxl_dpa_spa_data initializer (Jonathan),
>  * updated patch description on locking the decoders (Dave, Jonathan),
>  * spell fix in patch description (Jonathan),
> 
> V4:
>  * rebased onto v6.18-rc2 (cxl/next),
>  * updated sob-chain,
>  * reworked and simplified code to use an address translation callback
>    bound to the root port,
>  * moved all address translation code to core/atl.c,
>  * cxlr->cxlrd change, updated patch description (Alison),
>  * use DEFINE_RANGE() (Jonathan),
>  * change name to @hpa_range (Dave, Jonathan),
>  * updated patch description if there is a no-op (Gregory),
>  * use Designated initializers for struct cxl_region_context (Dave),
>  * move callback handler to struct cxl_root_ops (Dave),
>  * move handler initialization to acpi_probe() (Dave),
>  * updated comment where Normalized Addressing is checked (Dave),
>  * limit PRM enablement only to AMD supported kernel configs (AMD_NB)
>    (Jonathan),
>  * added 3 related optional cleanup patches at the end of the series,
> 
> V3:
>  * rebased onto cxl/next,
>  * complete rework to reduce number of required changes/patches and to
>    remove platform specific code (Dan and Dave),
>  * changed implementation allowing to add address translation to the
>    CXL specification (documention patch in preparation),
>  * simplified and generalized determination of interleaving
>    parameters using the address translation callback,
>  * depend only on the existence of the ACPI PRM GUID for CXL Address
>    Translation enablement, removed platform checks,
>  * small changes to region code only which does not require a full
>    rework and refactoring of the code, just separating region
>    parameter setup and region construction,
>  * moved code to new core/atl.c file,
>  * fixed subsys_initcall order dependency of EFI runtime services
>    (Gregory and Joshua),
> 
> V2:
>  * rebased onto cxl/next,
>  * split of v1 in two parts:
>    * removed cleanups and updates from this series to post them as a
>      separate series (Dave),
>    * this part 2 applies on top of part 1, v3,
>  * added tags to SOB chain,
>  * reworked architecture, vendor and platform setup (Jonathan):
>    * added patch "cxl/x86: Prepare for architectural platform setup",
>    * added function arch_cxl_port_platform_setup() plus a __weak
>      versions for archs other than x86,
>    * moved code to core/x86,
>  * added comment to cxl_to_hpa_fn (Ben),
>  * updated year in copyright statement (Ben),
>  * cxl_port_calc_hpa(): Removed HPA check for zero (Jonathan), return
>    1 if modified,
>  * cxl_port_calc_pos(): Updated description and wording (Ben),
>  * added several patches around interleaving and SPA calculation in
>    cxl_endpoint_decoder_initialize(),
>  * reworked iterator in cxl_endpoint_decoder_initialize() (Gregory),
>  * fixed region interleaving parameters() (Alison),
>  * fixed check in cxl_region_attach() (Alison),
>  * Clarified in coverletter that not all ports in a system must
>    implement the to_hpa() callback (Terry).
> 
> [1] https://lore.kernel.org/linux-cxl/20240701174754.967954-1-rrichter@amd.com/
> [2] https://lore.kernel.org/linux-cxl/669086821f136_5fffa29473@dwillia2-xfh.jf.intel.com.notmuch/
> [3] https://patchwork.kernel.org/project/cxl/cover/20250218132356.1809075-1-rrichter@amd.com/
> [4] https://patchwork.kernel.org/project/cxl/cover/20250715191143.1023512-1-rrichter@amd.com/
> [5] https://lore.kernel.org/all/78284b12-3e0b-4758-af18-397f32136c3f@intel.com/
> [6] https://patchwork.kernel.org/project/cxl/cover/20250912144514.526441-1-rrichter@amd.com/
> [7] https://lore.kernel.org/all/20250912144514.526441-8-rrichter@amd.com/T/#m23c2adb9d1e20770ccd5d11475288bda382b0af5
> [8] https://patchwork.kernel.org/project/cxl/cover/20251103184804.509762-1-rrichter@amd.com/
> 
> Robert Richter (13):
>   cxl/region: Rename misleading variable name @hpa to @hpa_range
>   cxl/region: Store root decoder in struct cxl_region
>   cxl/region: Store HPA range in struct cxl_region
>   cxl: Simplify cxl_root_ops allocation and handling
>   cxl/region: Separate region parameter setup and region construction
>   cxl/region: Add @hpa_range argument to function
>     cxl_calc_interleave_pos()
>   cxl/region: Use region data to get the root decoder
>   cxl: Introduce callback for HPA address ranges translation
>   cxl/acpi: Prepare use of EFI runtime services
>   cxl: Enable AMD Zen5 address translation using ACPI PRMT
>   cxl/atl: Lock decoders that need address translation
>   cxl/region: Factor out code into cxl_region_setup_poison()
>   cxl: Disable HPA/SPA translation handlers for Normalized Addressing
> 
>  drivers/cxl/Kconfig       |   5 +
>  drivers/cxl/acpi.c        |  17 +--
>  drivers/cxl/core/Makefile |   1 +
>  drivers/cxl/core/atl.c    | 211 ++++++++++++++++++++++++++++++++
>  drivers/cxl/core/cdat.c   |   8 +-
>  drivers/cxl/core/core.h   |   8 ++
>  drivers/cxl/core/port.c   |   8 +-
>  drivers/cxl/core/region.c | 247 ++++++++++++++++++++++++--------------
>  drivers/cxl/cxl.h         |  40 ++++--
>  9 files changed, 426 insertions(+), 119 deletions(-)
>  create mode 100644 drivers/cxl/core/atl.c
> 
> 
> base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
  2026-02-03 18:52 ` [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Dave Jiang
@ 2026-02-03 21:35   ` Gregory Price
  2026-02-04 12:58   ` Robert Richter
  1 sibling, 0 replies; 51+ messages in thread
From: Gregory Price @ 2026-02-03 21:35 UTC (permalink / raw)
  To: Dave Jiang
  Cc: Robert Richter, Alison Schofield, Vishal Verma, Ira Weiny,
	Dan Williams, Jonathan Cameron, Davidlohr Bueso, linux-cxl,
	linux-kernel, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

On Tue, Feb 03, 2026 at 11:52:02AM -0700, Dave Jiang wrote:
> > 
> > Finally, Zen5 address translation is enabled using ACPI PRMT.
> > 
> > This series bases on v6.19-rc1.
> 
> Applied to cxl/next. Including the conventions doc.
> 00bc604c96bb762f0f050460e25de2729edb1699
> 

Congrats Robert! And thank you!

~Gregory

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
  2026-02-03 18:52 ` [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Dave Jiang
  2026-02-03 21:35   ` Gregory Price
@ 2026-02-04 12:58   ` Robert Richter
  2026-02-04 17:56     ` Dave Jiang
  1 sibling, 1 reply; 51+ messages in thread
From: Robert Richter @ 2026-02-04 12:58 UTC (permalink / raw)
  To: Dave Jiang
  Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn

Hi Dave,

On 03.02.26 11:52:02, Dave Jiang wrote:
> 
> 
> On 1/10/26 4:46 AM, Robert Richter wrote:
> > This patch set adds support for address translation using ACPI PRM and
> > enables this for AMD Zen5 platforms. The current approach bases on v4
> > and is in response to earlier attempts to implement CXL address
> > translation:
> > 
> >  * v1: [1] and the comments on it, esp. Dan's [2],
> >  * v2: [3] and comments on [4], esp. Dave's [5],
> >  * v3: [6] and comments on it, esp. Dave's [7],
> >  * v4: [8].
> > 
> > This version addresses Alison's review comments to change the
> > implementation to disable HPA/SPA translation handler. There are a
> > view minor but no major changes otherwise. See the changelog for
> > details. Thank you all for your reviews and testing.
> > 
> > Documentation of CXL Address Translation Support will be added to the
> > Kernel's "Compute Express Link: Linux Conventions". This patch
> > submission will be the base for a documentation patch that describes CXL
> > Address Translation support accordingly.
> > 
> > The CXL driver currently does not implement address translation which
> > assumes the host physical addresses (HPA) and system physical
> > addresses (SPA) are equal.
> > 
> > Systems with different HPA and SPA addresses need address translation.
> > If this is the case, the hardware addresses esp. used in the HDM
> > decoder configurations are different to the system's or parent port
> > address ranges. E.g. AMD Zen5 systems may be configured to use
> > 'Normalized addresses'. Then, CXL endpoints have their own physical
> > address base which is not the same as the SPA used by the CXL host
> > bridge. Thus, addresses need to be translated from the endpoint's to
> > its CXL host bridge's address range.
> > 
> > To enable address translation, the endpoint's HPA range must be
> > translated to the CXL host bridge's address range. A callback is
> > introduced to translate a decoder's HPA to the CXL host bridge's
> > address range. The callback is then used to determine the region
> > parameters which includes the SPA translated address range of the
> > endpoint decoder and the interleaving configuration. This is stored in
> > struct cxl_region which allows an endpoint decoder to determine that
> > parameters based on its assigned region.
> > 
> > Note that only auto-discovery of decoders is supported. Thus, decoders
> > are locked and cannot be configured manually.
> > 
> > Finally, Zen5 address translation is enabled using ACPI PRMT.
> > 
> > This series bases on v6.19-rc1.
> 
> Applied to cxl/next. Including the conventions doc.
> 00bc604c96bb762f0f050460e25de2729edb1699

Thank you for applying the series, I noticed wrong authorship of
a0a135b410f57702ac6a463005c656f291eb7b90, could you fix that?

Thank you,

-Robert

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
  2026-02-04 12:58   ` Robert Richter
@ 2026-02-04 17:56     ` Dave Jiang
  0 siblings, 0 replies; 51+ messages in thread
From: Dave Jiang @ 2026-02-04 17:56 UTC (permalink / raw)
  To: Robert Richter
  Cc: Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Jonathan Cameron, Davidlohr Bueso, linux-cxl, linux-kernel,
	Gregory Price, Fabio M. De Francesco, Terry Bowman, Joshua Hahn



On 2/4/26 5:58 AM, Robert Richter wrote:
> Hi Dave,
> 
> On 03.02.26 11:52:02, Dave Jiang wrote:
>>
>>
>> On 1/10/26 4:46 AM, Robert Richter wrote:
>>> This patch set adds support for address translation using ACPI PRM and
>>> enables this for AMD Zen5 platforms. The current approach bases on v4
>>> and is in response to earlier attempts to implement CXL address
>>> translation:
>>>
>>>  * v1: [1] and the comments on it, esp. Dan's [2],
>>>  * v2: [3] and comments on [4], esp. Dave's [5],
>>>  * v3: [6] and comments on it, esp. Dave's [7],
>>>  * v4: [8].
>>>
>>> This version addresses Alison's review comments to change the
>>> implementation to disable HPA/SPA translation handler. There are a
>>> view minor but no major changes otherwise. See the changelog for
>>> details. Thank you all for your reviews and testing.
>>>
>>> Documentation of CXL Address Translation Support will be added to the
>>> Kernel's "Compute Express Link: Linux Conventions". This patch
>>> submission will be the base for a documentation patch that describes CXL
>>> Address Translation support accordingly.
>>>
>>> The CXL driver currently does not implement address translation which
>>> assumes the host physical addresses (HPA) and system physical
>>> addresses (SPA) are equal.
>>>
>>> Systems with different HPA and SPA addresses need address translation.
>>> If this is the case, the hardware addresses esp. used in the HDM
>>> decoder configurations are different to the system's or parent port
>>> address ranges. E.g. AMD Zen5 systems may be configured to use
>>> 'Normalized addresses'. Then, CXL endpoints have their own physical
>>> address base which is not the same as the SPA used by the CXL host
>>> bridge. Thus, addresses need to be translated from the endpoint's to
>>> its CXL host bridge's address range.
>>>
>>> To enable address translation, the endpoint's HPA range must be
>>> translated to the CXL host bridge's address range. A callback is
>>> introduced to translate a decoder's HPA to the CXL host bridge's
>>> address range. The callback is then used to determine the region
>>> parameters which includes the SPA translated address range of the
>>> endpoint decoder and the interleaving configuration. This is stored in
>>> struct cxl_region which allows an endpoint decoder to determine that
>>> parameters based on its assigned region.
>>>
>>> Note that only auto-discovery of decoders is supported. Thus, decoders
>>> are locked and cannot be configured manually.
>>>
>>> Finally, Zen5 address translation is enabled using ACPI PRMT.
>>>
>>> This series bases on v6.19-rc1.
>>
>> Applied to cxl/next. Including the conventions doc.
>> 00bc604c96bb762f0f050460e25de2729edb1699
> 
> Thank you for applying the series, I noticed wrong authorship of
> a0a135b410f57702ac6a463005c656f291eb7b90, could you fix that?

Sorry about that. Not sure how it happened. Fixed now. cxl/next repushed.
63fbf275fa9f18f7020fb8acf54fa107e51d0f23
> 
> Thank you,
> 
> -Robert


^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2026-02-04 17:56 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-10 11:46 [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Robert Richter
2026-01-10 11:46 ` [PATCH v9 01/13] cxl/region: Rename misleading variable name @hpa to @hpa_range Robert Richter
2026-01-14  3:12   ` Alison Schofield
2026-01-10 11:46 ` [PATCH v9 02/13] cxl/region: Store root decoder in struct cxl_region Robert Richter
2026-01-14  3:13   ` Alison Schofield
2026-01-10 11:46 ` [PATCH v9 03/13] cxl/region: Store HPA range " Robert Richter
2026-01-14  3:14   ` Alison Schofield
2026-01-10 11:46 ` [PATCH v9 04/13] cxl: Simplify cxl_root_ops allocation and handling Robert Richter
2026-01-14  3:16   ` Alison Schofield
2026-01-10 11:46 ` [PATCH v9 05/13] cxl/region: Separate region parameter setup and region construction Robert Richter
2026-01-14  3:17   ` Alison Schofield
2026-01-10 11:46 ` [PATCH v9 06/13] cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() Robert Richter
2026-01-14  3:17   ` Alison Schofield
2026-01-10 11:46 ` [PATCH v9 07/13] cxl/region: Use region data to get the root decoder Robert Richter
2026-01-14  3:19   ` Alison Schofield
2026-01-10 11:46 ` [PATCH v9 08/13] cxl: Introduce callback for HPA address ranges translation Robert Richter
2026-01-14  3:20   ` Alison Schofield
2026-01-10 11:46 ` [PATCH v9 09/13] cxl/acpi: Prepare use of EFI runtime services Robert Richter
2026-01-10 11:46 ` [PATCH v9 10/13] cxl: Enable AMD Zen5 address translation using ACPI PRMT Robert Richter
2026-01-14  7:47   ` Ard Biesheuvel
2026-01-14 14:00     ` Robert Richter
2026-01-14 15:21       ` Ard Biesheuvel
2026-01-14 18:08         ` Jonathan Cameron
2026-01-15  8:04           ` Peter Zijlstra
2026-01-15  8:30             ` Ard Biesheuvel
2026-01-16 14:38               ` Peter Zijlstra
2026-01-19 14:33                 ` Robert Richter
2026-01-19 15:00                   ` Gregory Price
2026-01-19 15:15                   ` Dave Jiang
2026-01-19 16:03                   ` Yazen Ghannam
2026-01-21  0:35                     ` dan.j.williams
2026-01-21 14:58                       ` Yazen Ghannam
2026-01-21 22:09                         ` dan.j.williams
2026-01-21 23:12                           ` Gregory Price
2026-01-22  2:05                             ` dan.j.williams
2026-01-22  6:09                               ` dan.j.williams
2026-01-20 21:23                   ` dan.j.williams
2026-01-10 11:46 ` [PATCH v9 11/13] cxl/atl: Lock decoders that need address translation Robert Richter
2026-01-10 11:46 ` [PATCH v9 12/13] cxl/region: Factor out code into cxl_region_setup_poison() Robert Richter
2026-01-13 22:39   ` Dave Jiang
2026-01-14  3:32   ` Alison Schofield
2026-01-14 18:17     ` Jonathan Cameron
2026-01-10 11:46 ` [PATCH v9 13/13] cxl: Disable HPA/SPA translation handlers for Normalized Addressing Robert Richter
2026-01-13 23:15   ` Dave Jiang
2026-01-14  3:59   ` Alison Schofield
2026-01-14 11:32     ` Robert Richter
2026-01-14 18:22   ` Jonathan Cameron
2026-02-03 18:52 ` [PATCH v9 00/13] cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Dave Jiang
2026-02-03 21:35   ` Gregory Price
2026-02-04 12:58   ` Robert Richter
2026-02-04 17:56     ` Dave Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox