linux-cxl.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] cxl: Support Poison Inject & Clear by Region Offset
@ 2025-07-03  4:03 alison.schofield
  2025-07-03  4:03 ` [PATCH v2 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math alison.schofield
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: alison.schofield @ 2025-07-03  4:03 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

From: Alison Schofield <alison.schofield@intel.com>

Beware the rename of patchset and individual patches. Mostly just doing
HPA->Region Offset, and HPA->SPA.

Changes in v2:

Rebased onto cxl repo branch for-6.17/cxl-acquire
As the ACQUIRE() set noted, ACQUIRE() syntax leads to a checkpatch
ERROR: do not use assignment in if condition
We'll have to endure that noise here until that set is merged with
a checkpatch.pl update or a change to the syntax.

Changelog appears per patch also:
Simplify return using test_bit() in cxl_memdev_has_poison() (Jonathan)
Use ACQUIRE() in the debugfs inject and clear funcs (DaveJ, Jonathan)
Shift by (eig + eiw) not (eig + 8) in MOD 3 interleaves** (Jonathan)
Simplify bottom bit handling by saving and restoring (Jonathan)
Return the rc from locked variants in cxl_clear/inject_poison
Fail if offset is in the extended linear cache (Jonathan)
Calculate the pos and dpa_offset inline, not in a helper
Remove the not customary __ prefix from locked variants
Added 'cxl' to an existing ABI for memdev clear_poison
Add and use a root decoder callback for spa_to_hpa()
Redefine ABI to take a region offset, not SPA (Dan)
Use div64_u64 instead of / to fix 32-bit ARM (lkp)
Use div64_u64_rem instead of % for arch safety
Remove KernelVersion field in ABI doc (Dan)
Pass pointer to results structures (DaveJ)
Add spec references and comments (DaveJ)
Warn against misuse in ABI doc (Dan)
Add validate_region_offset() helper


Begin (updated) Cover Letter

This series allows expert users to inject and clear poison by writing a
Host Physical Address (HPA) to a region debugfs files. At the core of this
new functionality is a helper that translates an HPA into a Device Physical
Address (DPA) and a memdev based on the region's decoder configuration.

The set is not merely a convenience wrapper for these region poison
operations as it enables these operations for XOR interleaved regions
where they were previously impossible.

Patch 1 defines a SPA->CXL HPA root decoder callback for XOR Math. It's a
restructuring and renaming exercise that enables the reuse of an existing
xormap function in either direction SPA<-->CXL HPA. It gets used in Patch 2.

Patch 2 introduces the translation logic capable of retrieving the memdev
and a DPA for a region offset.

Patch 4 adds a locked variant of the inject and clear poison ops to
support callers that must hold locks during the entire translation and
operation sequence. It gets used in Patch 4.

Patch 4 exposes the capability through region debugfs attributes that 
only appear when all participating memdevs support the poison commands.

At the end of Patch 4 a region offset has been translated to a memdev
and a DPA and can simply be passed through to the pre-existing per memdev
inject and clear poison routines.

A CXL Unit Test update to cxl-poison.sh is posted separately.


Alison Schofield (4):
  cxl: Define a SPA->CXL HPA root decoder callback for XOR Math
  cxl/region: Introduce SPA to DPA address translation
  cxl/core: Add locked variants of the poison inject and clear funcs
  cxl/region: Add inject and clear poison by region offset

 Documentation/ABI/testing/debugfs-cxl |  89 +++++++++-
 drivers/cxl/acpi.c                    |  30 ++--
 drivers/cxl/core/core.h               |   4 +
 drivers/cxl/core/memdev.c             |  68 +++++---
 drivers/cxl/core/region.c             | 225 ++++++++++++++++++++++++++
 drivers/cxl/cxl.h                     |   3 +
 drivers/cxl/cxlmem.h                  |   2 +
 7 files changed, 390 insertions(+), 31 deletions(-)


base-commit: 6c90a41a7833a6bb2fb17b9f3cafda66ebdf259b
-- 
2.37.3


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math
  2025-07-03  4:03 [PATCH v2 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
@ 2025-07-03  4:03 ` alison.schofield
  2025-07-03 13:43   ` Jonathan Cameron
  2025-07-03  4:03 ` [PATCH v2 2/4] cxl/region: Introduce SPA to DPA address translation alison.schofield
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: alison.schofield @ 2025-07-03  4:03 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

From: Alison Schofield <alison.schofield@intel.com>

When DPA->SPA translation was introduced, it included a helper that
applied the XOR maps to do the CXL HPA -> SPA translation for XOR
region interleaves. In preparation for adding SPA->DPA address
translation, introduce the reverse callback.

The root decoder callback is defined generically and not all usages
may be self inverting like this XOR function. Add another root decoder
callback that is the spa_to_hpa function.

Update the existing cxl_xor_hpa_to_spa() with a name that reflects
what it does without directionality: cxl_apply_xor_maps(), a generic
parameter: addr replaces hpa, and code comments stating that the
function supports the translation in either direction.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---

Changes in v2:
This is a new patch in v2 of the set.


 drivers/cxl/acpi.c | 30 ++++++++++++++++++------------
 drivers/cxl/cxl.h  |  3 +++
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index a1a99ec3f12c..fd12b9a96246 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -21,7 +21,7 @@ static const guid_t acpi_cxl_qtg_id_guid =
 		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
 
 
-static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
+static u64 cxl_apply_xor_maps(struct cxl_root_decoder *cxlrd, u64 addr)
 {
 	struct cxl_cxims_data *cximsd = cxlrd->platform_data;
 	int hbiw = cxlrd->cxlsd.nr_targets;
@@ -30,19 +30,23 @@ static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
 
 	/* No xormaps for host bridge interleave ways of 1 or 3 */
 	if (hbiw == 1 || hbiw == 3)
-		return hpa;
+		return addr;
 
 	/*
-	 * For root decoders using xormaps (hbiw: 2,4,6,8,12,16) restore
-	 * the position bit to its value before the xormap was applied at
-	 * HPA->DPA translation.
+	 * In regions using XOR interleave arithmetic the CXL HPA may not
+	 * be the same as the SPA. This helper performs the SPA->CXL HPA
+	 * or the CXL HPA->SPA translation. Since XOR is self-inverting,
+	 * so is this function.
+	 *
+	 * For root decoders using xormaps (hbiw: 2,4,6,8,12,16) applying the
+	 * xormaps will toggle a position bit.
 	 *
 	 * pos is the lowest set bit in an XORMAP
-	 * val is the XORALLBITS(HPA & XORMAP)
+	 * val is the XORALLBITS(addr & XORMAP)
 	 *
 	 * XORALLBITS: The CXL spec (3.1 Table 9-22) defines XORALLBITS
 	 * as an operation that outputs a single bit by XORing all the
-	 * bits in the input (hpa & xormap). Implement XORALLBITS using
+	 * bits in the input (addr & xormap). Implement XORALLBITS using
 	 * hweight64(). If the hamming weight is even the XOR of those
 	 * bits results in val==0, if odd the XOR result is val==1.
 	 */
@@ -51,11 +55,11 @@ static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
 		if (!cximsd->xormaps[i])
 			continue;
 		pos = __ffs(cximsd->xormaps[i]);
-		val = (hweight64(hpa & cximsd->xormaps[i]) & 1);
-		hpa = (hpa & ~(1ULL << pos)) | (val << pos);
+		val = (hweight64(addr & cximsd->xormaps[i]) & 1);
+		addr = (addr & ~(1ULL << pos)) | (val << pos);
 	}
 
-	return hpa;
+	return addr;
 }
 
 struct cxl_cxims_context {
@@ -413,8 +417,10 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
 
 	cxlrd->qos_class = cfmws->qtg_id;
 
-	if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR)
-		cxlrd->hpa_to_spa = cxl_xor_hpa_to_spa;
+	if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR) {
+		cxlrd->hpa_to_spa = cxl_apply_xor_maps;
+		cxlrd->spa_to_hpa = cxl_apply_xor_maps;
+	}
 
 	rc = cxl_decoder_add(cxld, target_map);
 	if (rc)
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 50799a681231..75e60a313be5 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -419,12 +419,14 @@ struct cxl_switch_decoder {
 
 struct cxl_root_decoder;
 typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
+typedef u64 (*cxl_spa_to_hpa_fn)(struct cxl_root_decoder *cxlrd, u64 spa);
 
 /**
  * struct cxl_root_decoder - Static platform CXL address decoder
  * @res: host / parent resource for region allocations
  * @region_id: region id for next region provisioning event
  * @hpa_to_spa: translate CXL host-physical-address to Platform system-physical-address
+ * @spa_to_hpa: translate Platform system-physical-address to CXL host-physical-address
  * @platform_data: platform specific configuration data
  * @range_lock: sync region autodiscovery by address range
  * @qos_class: QoS performance class cookie
@@ -434,6 +436,7 @@ struct cxl_root_decoder {
 	struct resource *res;
 	atomic_t region_id;
 	cxl_hpa_to_spa_fn hpa_to_spa;
+	cxl_spa_to_hpa_fn spa_to_hpa;
 	void *platform_data;
 	struct mutex range_lock;
 	int qos_class;
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/4] cxl/region: Introduce SPA to DPA address translation
  2025-07-03  4:03 [PATCH v2 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
  2025-07-03  4:03 ` [PATCH v2 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math alison.schofield
@ 2025-07-03  4:03 ` alison.schofield
  2025-07-03 14:23   ` Jonathan Cameron
  2025-07-03  4:03 ` [PATCH v2 3/4] cxl/core: Add locked variants of the poison inject and clear funcs alison.schofield
  2025-07-03  4:03 ` [PATCH v2 4/4] cxl/region: Add inject and clear poison by region offset alison.schofield
  3 siblings, 1 reply; 13+ messages in thread
From: alison.schofield @ 2025-07-03  4:03 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

From: Alison Schofield <alison.schofield@intel.com>

Add infrastructure to translate System Physical Addresses (HPA) to
Device Physical Addresses (DPA) within CXL regions. This capability
will be used by follow-on patches that add poison inject and clear
operations at the region level.

The SPA-to-DPA translation process follows these steps:
1. Apply root decoder transformations (SPA to HPA) if configured.
2. Extract the position in region interleave from the HPA offset.
3. Extract the DPA offset from the HPA offset.
4. Use position to find endpoint decoder.
5. Use endpoint decoder to find memdev and calculate DPA from offset.
6. Return the result - a memdev and a DPA.

It is Step 1 above that makes this a driver level operation and not
work we can push to user space. Rather than exporting the XOR maps for
root decoders configured with XOR interleave, the driver performs this
complex calculation for the user.

Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
Implementation Note: Device Decode Logic.

These calculations mirror much of the logic introduced earlier in DPA
to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
reverse the spec defined steps. This case is not the reversal, it
follows the device decode logic per the spec.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---

Changes in v2:
Shift by (eig + eiw) not (eig + 8) in MOD 3 interleaves (Jonathan)
Simplify bottom bit handling by saving and restoring (Jonathan)
Calculate the pos and dpa_offset inline, not in a helper
Use the new root decoder callback for spa_to_hpa()
Use div64_u64 instead of / to fix 32-bit ARM (lkp)
Use div64_u64_rem instead of % for arch safety
Pass pointer to results structures (DaveJ)
Add spec references and comments (DaveJ)
Add validate_region_offset() helper


 drivers/cxl/core/region.c | 100 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index a2ba19151d4f..d965f07ba8a8 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2956,6 +2956,106 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
 	return hpa;
 }
 
+struct dpa_result {
+	struct cxl_memdev *cxlmd;
+	u64 dpa;
+};
+
+static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
+						      u64 offset,
+						      struct dpa_result *result)
+{
+	struct cxl_region_params *p = &cxlr->params;
+	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+	struct cxl_endpoint_decoder *cxled;
+	u64 hpa, hpa_offset, dpa_offset;
+	u64 bits_upper, bits_lower;
+	u64 shifted, rem;
+	u16 eig = 0;
+	u8 eiw = 0;
+	int pos;
+
+	lockdep_assert_held(&cxl_rwsem.region);
+	lockdep_assert_held(&cxl_rwsem.dpa);
+
+	ways_to_eiw(p->interleave_ways, &eiw);
+	granularity_to_eig(p->interleave_granularity, &eig);
+
+	/*
+	 * If the root decoder has SPA to CXL HPA callback,
+	 * use it. Otherwise CXL HPA is assumed to equal SPA.
+	 */
+	if (cxlrd->spa_to_hpa) {
+		hpa = cxlrd->spa_to_hpa(cxlrd, p->res->start + offset);
+		hpa_offset = hpa - p->res->start;
+	} else {
+		hpa_offset = offset;
+	}
+	/*
+	 * Extracting the interleave position and the DPA offset
+	 * is based on the steps defined in CXL Spec 3.2 Section
+	 * 8.2.4.20.13 Implementation Note: Device Decode Logic
+	 */
+	/*
+	 * Interleave position:
+	 * eiw < 8
+	 *	Position is in the IW bits at HPA_OFFSET[IG+8+eiw-1:IG+8].
+	 *	Per spec "remove IW bits starting with bit position IG+8"
+	 * eiw >= 8
+	 *	Position is not explicitly stored in HPA_OFFSET. It is
+	 *	derived from the modulo-3 remainder of the upper bits.
+	 */
+	if (eiw < 8) {
+		pos = (hpa_offset >> (eig + 8)) & GENMASK(eiw - 1, 0);
+	} else {
+		shifted = hpa_offset >> (eig + eiw);
+		div64_u64_rem(shifted, 3, &rem);
+		pos = rem;
+	}
+	if (pos < 0 || pos >= p->nr_targets) {
+		dev_dbg(&cxlr->dev, "Invalid position %d for %d targets\n",
+			pos, p->nr_targets);
+		return -ENXIO;
+	}
+	/*
+	 * DPA offset:
+	 * Lower bits [IG+7:0] pass through unchanged
+	 * (eiw < 8)
+	 *	Per spec: DPAOffset[51:IG+8] = (HPAOffset[51:IG+IW+8] >> IW)
+	 *	Clear the position bits to isolate upper section, then
+	 *	reverse the left shift by eiw that occurred during DPA->HPA
+	 * (eiw >= 8)
+	 *	Per spec: DPAOffset[51:IG+8] = HPAOffset[51:IG+IW] / 3
+	 *	Extract upper bits from the correct bit range and divide by 3
+	 *	to recover the original DPA upper bits
+	 */
+
+	bits_lower = hpa_offset & GENMASK_ULL(eig + 7, 0);
+	if (eiw < 8) {
+		hpa_offset &= ~((u64)GENMASK(eiw - 1, 0) << (eig + 8));
+		dpa_offset = hpa_offset >> eiw;
+	} else {
+		bits_upper = div64_u64(hpa_offset >> (eig + eiw), 3);
+		dpa_offset = bits_upper << (eig + 8);
+	}
+	dpa_offset |= bits_lower;
+
+	/* Look-up and return the result: a memdev and a DPA */
+	for (int i = 0; i < p->nr_targets; i++) {
+		cxled = p->targets[i];
+		if (cxled->pos == pos) {
+			result->cxlmd = cxled_to_memdev(cxled);
+			result->dpa =
+				cxl_dpa_resource_start(cxled) + dpa_offset;
+
+			return 0;
+		}
+	}
+	dev_err(&cxlr->dev, "No device found for position %d\n", pos);
+
+	return -ENXIO;
+}
+
 static struct lock_class_key cxl_pmem_region_key;
 
 static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/4] cxl/core: Add locked variants of the poison inject and clear funcs
  2025-07-03  4:03 [PATCH v2 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
  2025-07-03  4:03 ` [PATCH v2 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math alison.schofield
  2025-07-03  4:03 ` [PATCH v2 2/4] cxl/region: Introduce SPA to DPA address translation alison.schofield
@ 2025-07-03  4:03 ` alison.schofield
  2025-07-03 14:25   ` Jonathan Cameron
  2025-07-03  4:03 ` [PATCH v2 4/4] cxl/region: Add inject and clear poison by region offset alison.schofield
  3 siblings, 1 reply; 13+ messages in thread
From: alison.schofield @ 2025-07-03  4:03 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

From: Alison Schofield <alison.schofield@intel.com>

The core functions that validate and send inject and clear commands
to the memdev devices require holding both the dpa_rwsem and the
region_rwsem.

In preparation for another caller of these functions that must hold
the locks upon entry, split the work into a locked and unlocked pair.

Consideration was given to moving the locking to both callers,
however, the existing caller is not in the core (mem.c) and cannot
access the locks.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---

Changes in v2:
Return the rc from locked variants in cxl_clear/inject_poison
Remove the not customary __ prefix from locked variants


 drivers/cxl/core/memdev.c | 60 +++++++++++++++++++++++++++------------
 drivers/cxl/cxlmem.h      |  2 ++
 2 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index f5fbd34310fd..7af626f69ac4 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -276,7 +276,7 @@ static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
 	return 0;
 }
 
-int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
+int cxl_inject_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
 {
 	struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
 	struct cxl_mbox_inject_poison inject;
@@ -288,13 +288,8 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 	if (!IS_ENABLED(CONFIG_DEBUG_FS))
 		return 0;
 
-	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
-	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
-		return rc;
-
-	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
-	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
-		return rc;
+	lockdep_assert_held(&cxl_rwsem.dpa);
+	lockdep_assert_held(&cxl_rwsem.region);
 
 	rc = cxl_validate_poison_dpa(cxlmd, dpa);
 	if (rc)
@@ -322,11 +317,28 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 	};
 	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
 
-	return 0;
+	return rc;
+}
+
+int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	int rc;
+
+	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
+		return rc;
+
+	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+		return rc;
+
+	rc = cxl_inject_poison_locked(cxlmd, dpa);
+
+	return rc;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, "CXL");
 
-int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
+int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
 {
 	struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
 	struct cxl_mbox_clear_poison clear;
@@ -338,13 +350,8 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
 	if (!IS_ENABLED(CONFIG_DEBUG_FS))
 		return 0;
 
-	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
-	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
-		return rc;
-
-	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
-	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
-		return rc;
+	lockdep_assert_held(&cxl_rwsem.dpa);
+	lockdep_assert_held(&cxl_rwsem.region);
 
 	rc = cxl_validate_poison_dpa(cxlmd, dpa);
 	if (rc)
@@ -381,7 +388,24 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
 	};
 	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_CLEAR);
 
-	return 0;
+	return rc;
+}
+
+int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	int rc;
+
+	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
+		return rc;
+
+	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+		return rc;
+
+	rc = cxl_clear_poison_locked(cxlmd, dpa);
+
+	return rc;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, "CXL");
 
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index f5b20641e57c..869c73de12c8 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -861,6 +861,8 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
+int cxl_inject_poison_locked(struct cxl_memdev *cxlmd, u64 dpa);
+int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa);
 
 #ifdef CONFIG_CXL_EDAC_MEM_FEATURES
 int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 4/4] cxl/region: Add inject and clear poison by region offset
  2025-07-03  4:03 [PATCH v2 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
                   ` (2 preceding siblings ...)
  2025-07-03  4:03 ` [PATCH v2 3/4] cxl/core: Add locked variants of the poison inject and clear funcs alison.schofield
@ 2025-07-03  4:03 ` alison.schofield
  2025-07-03 14:31   ` Jonathan Cameron
  3 siblings, 1 reply; 13+ messages in thread
From: alison.schofield @ 2025-07-03  4:03 UTC (permalink / raw)
  To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams
  Cc: linux-cxl

From: Alison Schofield <alison.schofield@intel.com>

Add CXL region debugfs attributes to inject and clear poison based
on an offset into the region. These new interfaces allow users to
operate on poison at the region level without needing to resolve
Device Physical Addresses (DPA) or target individual memdevs.

The implementation uses a new helper, region_offset_to_dpa_result()
that applies decoder interleave logic, including XOR-based address
decoding when applicable. Note that XOR decodes rely on driver
internal xormaps which are not exposed to userspace. So, this support
is not only a simplification of poison operations that could be done
using existing per memdev operations, but also it enables this
functionality for XOR interleaved regions for the first time.

New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/:
inject_poison and clear_poison. These are only exposed if all memdevs
participating in the region support both inject and clear commands,
ensuring consistent and reliable behavior across multi-device regions.

If tracing is enabled, these operations are logged as cxl_poison
events in /sys/kernel/tracing/trace.

The ABI documentation warns users of the significant risks that
come with using these capabilities.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---

Changes in v2:
Simplify return using test_bit() in cxl_memdev_has_poison() (Jonathan)
Use ACQUIRE() in the debugfs inject and clear funcs (DaveJ, Jonathan)
Fail if offset is in the extended linear cache (Jonathan)
Added 'cxl' to an existing ABI for memdev clear_poison
Redefine ABI to take a region offset, not SPA (Dan)
Remove KernelVersion field in ABI doc (Dan)
Warn against misuse in ABI doc (Dan)
Add validate_region_offset() helper


 Documentation/ABI/testing/debugfs-cxl |  89 ++++++++++++++++-
 drivers/cxl/core/core.h               |   4 +
 drivers/cxl/core/memdev.c             |   8 ++
 drivers/cxl/core/region.c             | 131 +++++++++++++++++++++++++-
 4 files changed, 228 insertions(+), 4 deletions(-)

diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
index 12488c14be64..2989d4da96c1 100644
--- a/Documentation/ABI/testing/debugfs-cxl
+++ b/Documentation/ABI/testing/debugfs-cxl
@@ -19,8 +19,22 @@ Description:
 		is returned to the user. The inject_poison attribute is only
 		visible for devices supporting the capability.
 
+		TEST-ONLY INTERFACE: This interface is intended for testing
+		and validation purposes only. It is not a data repair mechanism
+		and should never be used on production systems or live data.
 
-What:		/sys/kernel/debug/memX/clear_poison
+		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
+		poison injection can result in permanent data loss. Injected
+		poison may render data permanently inaccessible even after
+		clearing, as the clear operation writes zeros and does not
+		recover original data.
+
+		SYSTEM STABILITY RISK: For volatile memory, poison injection
+		can cause kernel crashes, system instability, or unpredictable
+		behavior if the poisoned addresses are accessed by running code
+		or critical kernel structures.
+
+What:		/sys/kernel/debug/cxl/memX/clear_poison
 Date:		April, 2023
 KernelVersion:	v6.4
 Contact:	linux-cxl@vger.kernel.org
@@ -35,6 +49,79 @@ Description:
 		The clear_poison attribute is only visible for devices
 		supporting the capability.
 
+		TEST-ONLY INTERFACE: This interface is intended for testing
+		and validation purposes only. It is not a data repair mechanism
+		and should never be used on production systems or live data.
+
+		CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
+		specified address range and removes the address from the poison
+		list. It does NOT recover or restore original data that may have
+		been present before poison injection. Any original data at the
+		cleared address is permanently lost and replaced with zeros.
+
+		CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
+		purposes only and should not be used as a data repair tool.
+		Clearing poison is fundamentally different from data recovery
+		or error correction.
+
+What:		/sys/kernel/debug/cxl/regionX/inject_poison
+Date:		August, 2025
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(WO) When a Host Physical Address (HPA) is written to this
+		attribute, the region driver translates it to a Device
+		Physical Address (DPA) and identifies the corresponding
+		memdev. It then sends an inject poison command to that memdev
+		at the translated DPA. Refer to the memdev ABI entry at:
+		/sys/kernel/debug/cxl/memX/inject_poison for the detailed
+		behavior. This attribute is only visible if all memdevs
+		participating in the region support both inject and clear
+		poison commands.
+
+		TEST-ONLY INTERFACE: This interface is intended for testing
+		and validation purposes only. It is not a data repair mechanism
+		and should never be used on production systems or live data.
+
+		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
+		poison injection can result in permanent data loss. Injected
+		poison may render data permanently inaccessible even after
+		clearing, as the clear operation writes zeros and does not
+		recover original data.
+
+		SYSTEM STABILITY RISK: For volatile memory, poison injection
+		can cause kernel crashes, system instability, or unpredictable
+		behavior if the poisoned addresses are accessed by running code
+		or critical kernel structures.
+
+What:		/sys/kernel/debug/cxl/regionX/clear_poison
+Date:		August, 2025
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(WO) When a Host Physical Address (HPA) is written to this
+		attribute, the region driver translates it to a Device
+		Physical Address (DPA) and identifies the corresponding
+		memdev. It then sends a clear poison command to that memdev
+		at the translated DPA. Refer to the memdev ABI entry at:
+		/sys/kernel/debug/cxl/memX/clear_poison for the detailed
+		behavior. This attribute is only visible if all memdevs
+		participating in the region support both inject and clear
+		poison commands.
+
+		TEST-ONLY INTERFACE: This interface is intended for testing
+		and validation purposes only. It is not a data repair mechanism
+		and should never be used on production systems or live data.
+
+		CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
+		specified address range and removes the address from the poison
+		list. It does NOT recover or restore original data that may have
+		been present before poison injection. Any original data at the
+		cleared address is permanently lost and replaced with zeros.
+
+		CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
+		purposes only and should not be used as a data repair tool.
+		Clearing poison is fundamentally different from data recovery
+		or error correction.
+
 What:		/sys/kernel/debug/cxl/einj_types
 Date:		January, 2024
 KernelVersion:	v6.9
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index ed7d08244542..9b4c305e5e5e 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -161,6 +161,10 @@ enum cxl_poison_trace_type {
 	CXL_POISON_TRACE_CLEAR,
 };
 
+enum poison_cmd_enabled_bits;
+bool cxl_memdev_has_poison_cmd(struct cxl_memdev *cxlmd,
+			       enum poison_cmd_enabled_bits cmd);
+
 long cxl_pci_get_latency(struct pci_dev *pdev);
 int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
 int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 7af626f69ac4..49b518b4c042 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -200,6 +200,14 @@ static ssize_t security_erase_store(struct device *dev,
 static struct device_attribute dev_attr_security_erase =
 	__ATTR(erase, 0200, NULL, security_erase_store);
 
+bool cxl_memdev_has_poison_cmd(struct cxl_memdev *cxlmd,
+			       enum poison_cmd_enabled_bits cmd)
+{
+	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
+
+	return test_bit(cmd, mds->poison.enabled_cmds);
+}
+
 static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
 {
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index d965f07ba8a8..cc26623250bf 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2,6 +2,7 @@
 /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
 #include <linux/memregion.h>
 #include <linux/genalloc.h>
+#include <linux/debugfs.h>
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/memory.h>
@@ -2961,9 +2962,8 @@ struct dpa_result {
 	u64 dpa;
 };
 
-static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
-						      u64 offset,
-						      struct dpa_result *result)
+static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset,
+				       struct dpa_result *result)
 {
 	struct cxl_region_params *p = &cxlr->params;
 	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
@@ -3615,6 +3615,105 @@ static void shutdown_notifiers(void *_cxlr)
 	unregister_mt_adistance_algorithm(&cxlr->adist_notifier);
 }
 
+static void remove_debugfs(void *dentry)
+{
+	debugfs_remove_recursive(dentry);
+}
+
+static int validate_region_offset(struct cxl_region *cxlr, u64 offset)
+{
+	struct cxl_region_params *p = &cxlr->params;
+	resource_size_t region_size;
+	u64 hpa;
+
+	if (offset < p->cache_size) {
+		dev_err(&cxlr->dev,
+			"Offset %#llx is within extended linear cache %#llx\n",
+			offset, p->cache_size);
+		return -EINVAL;
+	}
+
+	region_size = resource_size(p->res);
+	if (offset >= region_size) {
+		dev_err(&cxlr->dev, "Offset %#llx exceeds region size %#llx\n",
+			offset, region_size);
+		return -EINVAL;
+	}
+
+	hpa = p->res->start + offset;
+	if (hpa < p->res->start || hpa > p->res->end) {
+		dev_err(&cxlr->dev, "HPA %#llx not in region %pr\n", hpa,
+			p->res);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
+{
+	struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
+	struct cxl_region *cxlr = data;
+	int rc;
+
+	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
+		return rc;
+
+	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+		return rc;
+
+	if (validate_region_offset(cxlr, offset))
+		return -EINVAL;
+
+	rc = region_offset_to_dpa_result(cxlr, offset, &result);
+	if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
+		dev_dbg(&cxlr->dev,
+			"Failed to resolve DPA for region offset %#llx\n",
+			offset);
+
+		return -EINVAL;
+	}
+
+	return cxl_inject_poison_locked(result.cxlmd, result.dpa);
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
+			 cxl_region_debugfs_poison_inject, "%llx\n");
+
+static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
+{
+	struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
+	struct cxl_region *cxlr = data;
+	int rc;
+
+	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
+		return rc;
+
+	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+		return rc;
+
+	if (validate_region_offset(cxlr, offset))
+		return -EINVAL;
+
+	rc = region_offset_to_dpa_result(cxlr, offset, &result);
+	if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
+		dev_dbg(&cxlr->dev,
+			"Failed to resolve DPA for region offset %#llx\n",
+			offset);
+
+		return -EINVAL;
+	}
+
+	return cxl_clear_poison_locked(result.cxlmd, result.dpa);
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
+			 cxl_region_debugfs_poison_clear, "%llx\n");
+
 static int cxl_region_can_probe(struct cxl_region *cxlr)
 {
 	struct cxl_region_params *p = &cxlr->params;
@@ -3644,6 +3743,7 @@ static int cxl_region_probe(struct device *dev)
 {
 	struct cxl_region *cxlr = to_cxl_region(dev);
 	struct cxl_region_params *p = &cxlr->params;
+	bool poison_supported = true;
 	int rc;
 
 	rc = cxl_region_can_probe(cxlr);
@@ -3667,6 +3767,31 @@ static int cxl_region_probe(struct device *dev)
 	if (rc)
 		return rc;
 
+	/* Create poison attributes if all memdevs support the capabilities */
+	for (int i = 0; i < p->nr_targets; i++) {
+		struct cxl_endpoint_decoder *cxled = p->targets[i];
+		struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+
+		if (!cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_INJECT) ||
+		    !cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_CLEAR)) {
+			poison_supported = false;
+			break;
+		}
+	}
+
+	if (poison_supported) {
+		struct dentry *dentry;
+
+		dentry = cxl_debugfs_create_dir(dev_name(dev));
+		debugfs_create_file("inject_poison", 0200, dentry, cxlr,
+				    &cxl_poison_inject_fops);
+		debugfs_create_file("clear_poison", 0200, dentry, cxlr,
+				    &cxl_poison_clear_fops);
+		rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
+		if (rc)
+			return rc;
+	}
+
 	switch (cxlr->mode) {
 	case CXL_PARTMODE_PMEM:
 		rc = devm_cxl_region_edac_register(cxlr);
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math
  2025-07-03  4:03 ` [PATCH v2 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math alison.schofield
@ 2025-07-03 13:43   ` Jonathan Cameron
  0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Cameron @ 2025-07-03 13:43 UTC (permalink / raw)
  To: alison.schofield
  Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

On Wed,  2 Jul 2025 21:03:20 -0700
alison.schofield@intel.com wrote:

> From: Alison Schofield <alison.schofield@intel.com>
> 
> When DPA->SPA translation was introduced, it included a helper that
> applied the XOR maps to do the CXL HPA -> SPA translation for XOR
> region interleaves. In preparation for adding SPA->DPA address
> translation, introduce the reverse callback.
> 
> The root decoder callback is defined generically and not all usages
> may be self inverting like this XOR function. Add another root decoder
> callback that is the spa_to_hpa function.
> 
> Update the existing cxl_xor_hpa_to_spa() with a name that reflects
> what it does without directionality: cxl_apply_xor_maps(), a generic
> parameter: addr replaces hpa, and code comments stating that the
> function supports the translation in either direction.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/4] cxl/region: Introduce SPA to DPA address translation
  2025-07-03  4:03 ` [PATCH v2 2/4] cxl/region: Introduce SPA to DPA address translation alison.schofield
@ 2025-07-03 14:23   ` Jonathan Cameron
  2025-07-12  4:17     ` Alison Schofield
  0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Cameron @ 2025-07-03 14:23 UTC (permalink / raw)
  To: alison.schofield
  Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

On Wed,  2 Jul 2025 21:03:21 -0700
alison.schofield@intel.com wrote:

> From: Alison Schofield <alison.schofield@intel.com>
> 
> Add infrastructure to translate System Physical Addresses (HPA) to
> Device Physical Addresses (DPA) within CXL regions. This capability
> will be used by follow-on patches that add poison inject and clear
> operations at the region level.
> 
> The SPA-to-DPA translation process follows these steps:
> 1. Apply root decoder transformations (SPA to HPA) if configured.
> 2. Extract the position in region interleave from the HPA offset.
> 3. Extract the DPA offset from the HPA offset.
> 4. Use position to find endpoint decoder.
> 5. Use endpoint decoder to find memdev and calculate DPA from offset.
> 6. Return the result - a memdev and a DPA.
> 
> It is Step 1 above that makes this a driver level operation and not
> work we can push to user space. Rather than exporting the XOR maps for
> root decoders configured with XOR interleave, the driver performs this
> complex calculation for the user.
> 
> Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
> Implementation Note: Device Decode Logic.
> 
> These calculations mirror much of the logic introduced earlier in DPA
> to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
> reverse the spec defined steps. This case is not the reversal, it
> follows the device decode logic per the spec.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> ---
> 
> Changes in v2:
> Shift by (eig + eiw) not (eig + 8) in MOD 3 interleaves (Jonathan)
> Simplify bottom bit handling by saving and restoring (Jonathan)
> Calculate the pos and dpa_offset inline, not in a helper
> Use the new root decoder callback for spa_to_hpa()
> Use div64_u64 instead of / to fix 32-bit ARM (lkp)
> Use div64_u64_rem instead of % for arch safety
> Pass pointer to results structures (DaveJ)
> Add spec references and comments (DaveJ)
> Add validate_region_offset() helper
> 
> 
>  drivers/cxl/core/region.c | 100 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 100 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index a2ba19151d4f..d965f07ba8a8 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2956,6 +2956,106 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
>  	return hpa;
>  }
>  
> +struct dpa_result {
> +	struct cxl_memdev *cxlmd;
> +	u64 dpa;
> +};
> +
> +static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
> +						      u64 offset,
> +						      struct dpa_result *result)
> +{
> +	struct cxl_region_params *p = &cxlr->params;
> +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> +	struct cxl_endpoint_decoder *cxled;
> +	u64 hpa, hpa_offset, dpa_offset;
> +	u64 bits_upper, bits_lower;
> +	u64 shifted, rem;
> +	u16 eig = 0;
> +	u8 eiw = 0;
> +	int pos;
> +
> +	lockdep_assert_held(&cxl_rwsem.region);
> +	lockdep_assert_held(&cxl_rwsem.dpa);
> +
> +	ways_to_eiw(p->interleave_ways, &eiw);

Maybe exercise paranoia and check return values from this

> +	granularity_to_eig(p->interleave_granularity, &eig);

and this.  That would mostly be to avoid reviewers having to
consider if errors are possible rather than because we think
we can get any.

> +
> +	/*
> +	 * If the root decoder has SPA to CXL HPA callback,
> +	 * use it. Otherwise CXL HPA is assumed to equal SPA.

Short wrap.  Go nearer 80 chars.

> +	 */
> +	if (cxlrd->spa_to_hpa) {
> +		hpa = cxlrd->spa_to_hpa(cxlrd, p->res->start + offset);
> +		hpa_offset = hpa - p->res->start;
> +	} else {
> +		hpa_offset = offset;
> +	}
> +	/*
> +	 * Extracting the interleave position and the DPA offset
> +	 * is based on the steps defined in CXL Spec 3.2 Section
> +	 * 8.2.4.20.13 Implementation Note: Device Decode Logic

Also rather short wrap.
	 * Extracting the interleave position and the DPA offset is based on
         * the steps defined in CXL Spec 3.2 Section 8.2.4.20.13 
	 * Implementation Note: Device Decode Logic

probably better.

> +	 */

Comment next to comment is a bit ugly.  Maybe combine them or stick a blank
line in between.

> +	/*
> +	 * Interleave position:
> +	 * eiw < 8
> +	 *	Position is in the IW bits at HPA_OFFSET[IG+8+eiw-1:IG+8].

Odd combination of IG and eiw in these comments.
I'd either go with spec terms IG and IW or kernel eig and eiw

> +	 *	Per spec "remove IW bits starting with bit position IG+8"
> +	 * eiw >= 8
> +	 *	Position is not explicitly stored in HPA_OFFSET. It is
> +	 *	derived from the modulo-3 remainder of the upper bits.
> +	 */
> +	if (eiw < 8) {
> +		pos = (hpa_offset >> (eig + 8)) & GENMASK(eiw - 1, 0);
> +	} else {
> +		shifted = hpa_offset >> (eig + eiw);
> +		div64_u64_rem(shifted, 3, &rem);
> +		pos = rem;

Same as:
	
		pos = shifted % 3;

How does this work for 6 or 12 way?

> +	}
> +	if (pos < 0 || pos >= p->nr_targets) {
> +		dev_dbg(&cxlr->dev, "Invalid position %d for %d targets\n",
> +			pos, p->nr_targets);
> +		return -ENXIO;
> +	}
> +	/*
> +	 * DPA offset:
> +	 * Lower bits [IG+7:0] pass through unchanged
> +	 * (eiw < 8)
> +	 *	Per spec: DPAOffset[51:IG+8] = (HPAOffset[51:IG+IW+8] >> IW)
> +	 *	Clear the position bits to isolate upper section, then
> +	 *	reverse the left shift by eiw that occurred during DPA->HPA
> +	 * (eiw >= 8)
> +	 *	Per spec: DPAOffset[51:IG+8] = HPAOffset[51:IG+IW] / 3
> +	 *	Extract upper bits from the correct bit range and divide by 3
> +	 *	to recover the original DPA upper bits
> +	 */
> +
> +	bits_lower = hpa_offset & GENMASK_ULL(eig + 7, 0);
> +	if (eiw < 8) {
> +		hpa_offset &= ~((u64)GENMASK(eiw - 1, 0) << (eig + 8));
Bottom bits already established, so I'd mask out all the top part.

		hpa_offset &= ~((u64)GENMASK(eiw + eig + 8 - 1, 0));

As you have it here I think you leave lower bits in place and or them with
themselves.

> +		dpa_offset = hpa_offset >> eiw;
> +	} else {
> +		bits_upper = div64_u64(hpa_offset >> (eig + eiw), 3);
> +		dpa_offset = bits_upper << (eig + 8);
> +	}
> +	dpa_offset |= bits_lower;
> +
> +	/* Look-up and return the result: a memdev and a DPA */
> +	for (int i = 0; i < p->nr_targets; i++) {
> +		cxled = p->targets[i];

It's getting a little deeply nested so maybe

		if (cxled->pos != pos)
			continue;

		result->cxlmd = cxled...
...

		return 0;
	}

> +		if (cxled->pos == pos) {
> +			result->cxlmd = cxled_to_memdev(cxled);
> +			result->dpa =
> +				cxl_dpa_resource_start(cxled) + dpa_offset;
> +
> +			return 0;
> +		}
> +	}
> +	dev_err(&cxlr->dev, "No device found for position %d\n", pos);
> +
> +	return -ENXIO;
> +}
> +
>  static struct lock_class_key cxl_pmem_region_key;
>  
>  static int cxl_pmem_region_alloc(struct cxl_region *cxlr)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 3/4] cxl/core: Add locked variants of the poison inject and clear funcs
  2025-07-03  4:03 ` [PATCH v2 3/4] cxl/core: Add locked variants of the poison inject and clear funcs alison.schofield
@ 2025-07-03 14:25   ` Jonathan Cameron
  2025-07-12  4:23     ` Alison Schofield
  0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Cameron @ 2025-07-03 14:25 UTC (permalink / raw)
  To: alison.schofield
  Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

On Wed,  2 Jul 2025 21:03:22 -0700
alison.schofield@intel.com wrote:

> From: Alison Schofield <alison.schofield@intel.com>
> 
> The core functions that validate and send inject and clear commands
> to the memdev devices require holding both the dpa_rwsem and the
> region_rwsem.
> 
> In preparation for another caller of these functions that must hold
> the locks upon entry, split the work into a locked and unlocked pair.
> 
> Consideration was given to moving the locking to both callers,
> however, the existing caller is not in the core (mem.c) and cannot
> access the locks.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
A few minor things inline.

> ---
> 
> Changes in v2:
> Return the rc from locked variants in cxl_clear/inject_poison
> Remove the not customary __ prefix from locked variants
> 
> 
>  drivers/cxl/core/memdev.c | 60 +++++++++++++++++++++++++++------------
>  drivers/cxl/cxlmem.h      |  2 ++
>  2 files changed, 44 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index f5fbd34310fd..7af626f69ac4 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -276,7 +276,7 @@ static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
>  	return 0;
>  }
>  
> -int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +int cxl_inject_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
>  {
>  	struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
>  	struct cxl_mbox_inject_poison inject;
> @@ -288,13 +288,8 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
>  	if (!IS_ENABLED(CONFIG_DEBUG_FS))
>  		return 0;
>  
> -	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> -	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> -		return rc;
> -
> -	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> -	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> -		return rc;
> +	lockdep_assert_held(&cxl_rwsem.dpa);
> +	lockdep_assert_held(&cxl_rwsem.region);
>  
>  	rc = cxl_validate_poison_dpa(cxlmd, dpa);
>  	if (rc)
> @@ -322,11 +317,28 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
>  	};
>  	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
>  
> -	return 0;
> +	return rc;
> +}
> +
> +int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +{
> +	int rc;
> +
> +	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> +	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> +		return rc;
> +
> +	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> +	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> +		return rc;
> +
> +	rc = cxl_inject_poison_locked(cxlmd, dpa);
> +
> +	return rc;

	return cxl_inject_poison_locked(cxlmd, dpa);

>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, "CXL");
>  
> -int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
>  {
>  	struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
>  	struct cxl_mbox_clear_poison clear;
> @@ -338,13 +350,8 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
>  	if (!IS_ENABLED(CONFIG_DEBUG_FS))
>  		return 0;
>  
> -	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> -	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> -		return rc;
> -
> -	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> -	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> -		return rc;
> +	lockdep_assert_held(&cxl_rwsem.dpa);
> +	lockdep_assert_held(&cxl_rwsem.region);
>  
>  	rc = cxl_validate_poison_dpa(cxlmd, dpa);
>  	if (rc)
> @@ -381,7 +388,24 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
>  	};
>  	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_CLEAR);
>  
> -	return 0;
> +	return rc;

Why this change?  Nothing locally visible seems to make rc != 0 in
a fashion that wasn't true before.


> +}
> +
> +int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +{
> +	int rc;
> +
> +	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> +	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> +		return rc;
> +
> +	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> +	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> +		return rc;
> +
> +	rc = cxl_clear_poison_locked(cxlmd, dpa);
> +
> +	return rc;

	return cxl_clear_poison_locked();

>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, "CXL");
>  
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index f5b20641e57c..869c73de12c8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -861,6 +861,8 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>  int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
>  int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
>  int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
> +int cxl_inject_poison_locked(struct cxl_memdev *cxlmd, u64 dpa);
> +int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa);
>  
>  #ifdef CONFIG_CXL_EDAC_MEM_FEATURES
>  int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd);


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/4] cxl/region: Add inject and clear poison by region offset
  2025-07-03  4:03 ` [PATCH v2 4/4] cxl/region: Add inject and clear poison by region offset alison.schofield
@ 2025-07-03 14:31   ` Jonathan Cameron
  2025-07-12  4:40     ` Alison Schofield
  0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Cameron @ 2025-07-03 14:31 UTC (permalink / raw)
  To: alison.schofield
  Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

On Wed,  2 Jul 2025 21:03:23 -0700
alison.schofield@intel.com wrote:

> From: Alison Schofield <alison.schofield@intel.com>
> 
> Add CXL region debugfs attributes to inject and clear poison based
> on an offset into the region. These new interfaces allow users to
> operate on poison at the region level without needing to resolve
> Device Physical Addresses (DPA) or target individual memdevs.
> 
> The implementation uses a new helper, region_offset_to_dpa_result()
> that applies decoder interleave logic, including XOR-based address
> decoding when applicable. Note that XOR decodes rely on driver
> internal xormaps which are not exposed to userspace. So, this support
> is not only a simplification of poison operations that could be done
> using existing per memdev operations, but also it enables this
> functionality for XOR interleaved regions for the first time.
> 
> New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/:
> inject_poison and clear_poison. These are only exposed if all memdevs
> participating in the region support both inject and clear commands,
> ensuring consistent and reliable behavior across multi-device regions.
> 
> If tracing is enabled, these operations are logged as cxl_poison
> events in /sys/kernel/tracing/trace.
> 
> The ABI documentation warns users of the significant risks that
> come with using these capabilities.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> ---
> 
> Changes in v2:
> Simplify return using test_bit() in cxl_memdev_has_poison() (Jonathan)
> Use ACQUIRE() in the debugfs inject and clear funcs (DaveJ, Jonathan)
> Fail if offset is in the extended linear cache (Jonathan)
> Added 'cxl' to an existing ABI for memdev clear_poison
> Redefine ABI to take a region offset, not SPA (Dan)
> Remove KernelVersion field in ABI doc (Dan)
> Warn against misuse in ABI doc (Dan)
> Add validate_region_offset() helper
> 
> 
>  Documentation/ABI/testing/debugfs-cxl |  89 ++++++++++++++++-
>  drivers/cxl/core/core.h               |   4 +
>  drivers/cxl/core/memdev.c             |   8 ++
>  drivers/cxl/core/region.c             | 131 +++++++++++++++++++++++++-
>  4 files changed, 228 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
> index 12488c14be64..2989d4da96c1 100644
> --- a/Documentation/ABI/testing/debugfs-cxl
> +++ b/Documentation/ABI/testing/debugfs-cxl
> @@ -19,8 +19,22 @@ Description:
>  		is returned to the user. The inject_poison attribute is only
>  		visible for devices supporting the capability.
>  
> +		TEST-ONLY INTERFACE: This interface is intended for testing
> +		and validation purposes only. It is not a data repair mechanism
> +		and should never be used on production systems or live data.
>  
> -What:		/sys/kernel/debug/memX/clear_poison
> +		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
> +		poison injection can result in permanent data loss. Injected
> +		poison may render data permanently inaccessible even after
> +		clearing, as the clear operation writes zeros and does not
> +		recover original data.
> +
> +		SYSTEM STABILITY RISK: For volatile memory, poison injection
> +		can cause kernel crashes, system instability, or unpredictable
> +		behavior if the poisoned addresses are accessed by running code
> +		or critical kernel structures.
> +
> +What:		/sys/kernel/debug/cxl/memX/clear_poison

Moved or previously a bug?  Looks like a bug, so separate patch.
Not sure we will bother asking for a backport, but having it in here made
me stare at that line for an unhealthy game of spot the difference.

>  Date:		April, 2023
>  KernelVersion:	v6.4
>  Contact:	linux-cxl@vger.kernel.org



> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index d965f07ba8a8..cc26623250bf 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2,6 +2,7 @@



> +static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
> +{
> +	struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
> +	struct cxl_region *cxlr = data;
> +	int rc;
> +
> +	ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> +	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> +		return rc;
> +
> +	ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> +	if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> +		return rc;
> +
> +	if (validate_region_offset(cxlr, offset))
> +		return -EINVAL;
> +
> +	rc = region_offset_to_dpa_result(cxlr, offset, &result);
> +	if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
> +		dev_dbg(&cxlr->dev,
> +			"Failed to resolve DPA for region offset %#llx\n",
> +			offset);
> +
> +		return -EINVAL;

If rc was set, preferable to return that. I'd split the condition
into two parts to avoid this.

> +	}
> +


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/4] cxl/region: Introduce SPA to DPA address translation
  2025-07-03 14:23   ` Jonathan Cameron
@ 2025-07-12  4:17     ` Alison Schofield
  0 siblings, 0 replies; 13+ messages in thread
From: Alison Schofield @ 2025-07-12  4:17 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

On Thu, Jul 03, 2025 at 03:23:23PM +0100, Jonathan Cameron wrote:
> On Wed,  2 Jul 2025 21:03:21 -0700
> alison.schofield@intel.com wrote:
> 
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > Add infrastructure to translate System Physical Addresses (HPA) to
> > Device Physical Addresses (DPA) within CXL regions. This capability
> > will be used by follow-on patches that add poison inject and clear
> > operations at the region level.
> > 
> > The SPA-to-DPA translation process follows these steps:
> > 1. Apply root decoder transformations (SPA to HPA) if configured.
> > 2. Extract the position in region interleave from the HPA offset.
> > 3. Extract the DPA offset from the HPA offset.
> > 4. Use position to find endpoint decoder.
> > 5. Use endpoint decoder to find memdev and calculate DPA from offset.
> > 6. Return the result - a memdev and a DPA.
> > 
> > It is Step 1 above that makes this a driver level operation and not
> > work we can push to user space. Rather than exporting the XOR maps for
> > root decoders configured with XOR interleave, the driver performs this
> > complex calculation for the user.
> > 
> > Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
> > Implementation Note: Device Decode Logic.
> > 
> > These calculations mirror much of the logic introduced earlier in DPA
> > to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
> > reverse the spec defined steps. This case is not the reversal, it
> > follows the device decode logic per the spec.
> > 
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > ---
> > 
> > Changes in v2:
> > Shift by (eig + eiw) not (eig + 8) in MOD 3 interleaves (Jonathan)
> > Simplify bottom bit handling by saving and restoring (Jonathan)
> > Calculate the pos and dpa_offset inline, not in a helper
> > Use the new root decoder callback for spa_to_hpa()
> > Use div64_u64 instead of / to fix 32-bit ARM (lkp)
> > Use div64_u64_rem instead of % for arch safety
> > Pass pointer to results structures (DaveJ)
> > Add spec references and comments (DaveJ)
> > Add validate_region_offset() helper
> > 
> > 
> >  drivers/cxl/core/region.c | 100 ++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 100 insertions(+)
> > 
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index a2ba19151d4f..d965f07ba8a8 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -2956,6 +2956,106 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> >  	return hpa;
> >  }
> >  
> > +struct dpa_result {
> > +	struct cxl_memdev *cxlmd;
> > +	u64 dpa;
> > +};
> > +
> > +static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
> > +						      u64 offset,
> > +						      struct dpa_result *result)
> > +{
> > +	struct cxl_region_params *p = &cxlr->params;
> > +	struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> > +	struct cxl_endpoint_decoder *cxled;
> > +	u64 hpa, hpa_offset, dpa_offset;
> > +	u64 bits_upper, bits_lower;
> > +	u64 shifted, rem;
> > +	u16 eig = 0;
> > +	u8 eiw = 0;
> > +	int pos;
> > +
> > +	lockdep_assert_held(&cxl_rwsem.region);
> > +	lockdep_assert_held(&cxl_rwsem.dpa);
> > +
> > +	ways_to_eiw(p->interleave_ways, &eiw);
> 
> Maybe exercise paranoia and check return values from this
> 
> > +	granularity_to_eig(p->interleave_granularity, &eig);
> 
> and this.  That would mostly be to avoid reviewers having to
> consider if errors are possible rather than because we think
> we can get any.

Done

> 
> > +
> > +	/*
> > +	 * If the root decoder has SPA to CXL HPA callback,
> > +	 * use it. Otherwise CXL HPA is assumed to equal SPA.
> 
> Short wrap.  Go nearer 80 chars.

Done


> 
> > +	 */
> > +	if (cxlrd->spa_to_hpa) {
> > +		hpa = cxlrd->spa_to_hpa(cxlrd, p->res->start + offset);
> > +		hpa_offset = hpa - p->res->start;
> > +	} else {
> > +		hpa_offset = offset;
> > +	}
> > +	/*
> > +	 * Extracting the interleave position and the DPA offset
> > +	 * is based on the steps defined in CXL Spec 3.2 Section
> > +	 * 8.2.4.20.13 Implementation Note: Device Decode Logic
> 
> Also rather short wrap.
> 	 * Extracting the interleave position and the DPA offset is based on
>          * the steps defined in CXL Spec 3.2 Section 8.2.4.20.13 
> 	 * Implementation Note: Device Decode Logic
> 
> probably better.
> 
> > +	 */
> 
> Comment next to comment is a bit ugly.  Maybe combine them or stick a blank
> line in between.
Sort of done. Got rid of the above general comment and put the spec
reference in each comment section below.


> 
> > +	/*
> > +	 * Interleave position:
> > +	 * eiw < 8
> > +	 *	Position is in the IW bits at HPA_OFFSET[IG+8+eiw-1:IG+8].
> 
> Odd combination of IG and eiw in these comments.
> I'd either go with spec terms IG and IW or kernel eig and eiw

Done. Using spec terms in these.

> 
> > +	 *	Per spec "remove IW bits starting with bit position IG+8"
> > +	 * eiw >= 8
> > +	 *	Position is not explicitly stored in HPA_OFFSET. It is
> > +	 *	derived from the modulo-3 remainder of the upper bits.
> > +	 */
> > +	if (eiw < 8) {
> > +		pos = (hpa_offset >> (eig + 8)) & GENMASK(eiw - 1, 0);
> > +	} else {
> > +		shifted = hpa_offset >> (eig + eiw);
> > +		div64_u64_rem(shifted, 3, &rem);
> > +		pos = rem;
> 
> Same as:
> 	
> 		pos = shifted % 3;
> 

I had stayed away from the % but I'm back to using it here. I think this
MOD is OK. The shift operation makes the divident much smaller and the
divisor is tiny.

> How does this work for 6 or 12 way?

It doesn't. This does: 
	 * eiw >= 8
         *      Position is not explicitly stored in HPA_OFFSET bits. It is
         *      derived from the modulo operation of the upper bits using
         *      the total number of interleave ways.
         */

         pos = (hpa_offset >> (eig + 8)) % p->interleave_ways;


Jonathan - I added all this to a test program that runs known good
data sets. I'd used it for DPA->HPA->SPA back in the day. Now it
does the reverse. I'll post link to it in the v3 patchset too.

CXL stand-alone address translation test SPA<=>HPA<=>DPA
https://github.com/pmem/ndctl/commit/0b5ae6882d882145f671c6d6bd9a4fc1edb4b99d

> 
> > +	}
> > +	if (pos < 0 || pos >= p->nr_targets) {
> > +		dev_dbg(&cxlr->dev, "Invalid position %d for %d targets\n",
> > +			pos, p->nr_targets);
> > +		return -ENXIO;
> > +	}
> > +	/*
> > +	 * DPA offset:
> > +	 * Lower bits [IG+7:0] pass through unchanged
> > +	 * (eiw < 8)
> > +	 *	Per spec: DPAOffset[51:IG+8] = (HPAOffset[51:IG+IW+8] >> IW)
> > +	 *	Clear the position bits to isolate upper section, then
> > +	 *	reverse the left shift by eiw that occurred during DPA->HPA
> > +	 * (eiw >= 8)
> > +	 *	Per spec: DPAOffset[51:IG+8] = HPAOffset[51:IG+IW] / 3
> > +	 *	Extract upper bits from the correct bit range and divide by 3
> > +	 *	to recover the original DPA upper bits
> > +	 */
> > +
> > +	bits_lower = hpa_offset & GENMASK_ULL(eig + 7, 0);
> > +	if (eiw < 8) {
> > +		hpa_offset &= ~((u64)GENMASK(eiw - 1, 0) << (eig + 8));
> Bottom bits already established, so I'd mask out all the top part.

Done

> 
> 		hpa_offset &= ~((u64)GENMASK(eiw + eig + 8 - 1, 0));
> 
> As you have it here I think you leave lower bits in place and or them with
> themselves.
> 
> > +		dpa_offset = hpa_offset >> eiw;
> > +	} else {
> > +		bits_upper = div64_u64(hpa_offset >> (eig + eiw), 3);
> > +		dpa_offset = bits_upper << (eig + 8);
> > +	}
> > +	dpa_offset |= bits_lower;
> > +
> > +	/* Look-up and return the result: a memdev and a DPA */
> > +	for (int i = 0; i < p->nr_targets; i++) {
> > +		cxled = p->targets[i];
> 
> It's getting a little deeply nested so maybe

Done



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 3/4] cxl/core: Add locked variants of the poison inject and clear funcs
  2025-07-03 14:25   ` Jonathan Cameron
@ 2025-07-12  4:23     ` Alison Schofield
  0 siblings, 0 replies; 13+ messages in thread
From: Alison Schofield @ 2025-07-12  4:23 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

On Thu, Jul 03, 2025 at 03:25:34PM +0100, Jonathan Cameron wrote:
> On Wed,  2 Jul 2025 21:03:22 -0700
> alison.schofield@intel.com wrote:
> 
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > The core functions that validate and send inject and clear commands
> > to the memdev devices require holding both the dpa_rwsem and the
> > region_rwsem.
> > 
> > In preparation for another caller of these functions that must hold
> > the locks upon entry, split the work into a locked and unlocked pair.
> > 
> > Consideration was given to moving the locking to both callers,
> > however, the existing caller is not in the core (mem.c) and cannot
> > access the locks.
> > 
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> A few minor things inline.
> 
> > ---
> > 
> > Changes in v2:
> > Return the rc from locked variants in cxl_clear/inject_poison

Undid ^

> > Remove the not customary __ prefix from locked variants
> > 
> > 
> >  drivers/cxl/core/memdev.c | 60 +++++++++++++++++++++++++++------------
> >  drivers/cxl/cxlmem.h      |  2 ++
> >  2 files changed, 44 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c

snip

> >  
> > -	return 0;
> > +	return rc;
> 
> Why this change?  Nothing locally visible seems to make rc != 0 in
> a fashion that wasn't true before.

Yeah, that was a late hasty add that I didn't think through.
Undone.

snip

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/4] cxl/region: Add inject and clear poison by region offset
  2025-07-03 14:31   ` Jonathan Cameron
@ 2025-07-12  4:40     ` Alison Schofield
  2025-07-15 15:20       ` Jonathan Cameron
  0 siblings, 1 reply; 13+ messages in thread
From: Alison Schofield @ 2025-07-12  4:40 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

On Thu, Jul 03, 2025 at 03:31:38PM +0100, Jonathan Cameron wrote:
> On Wed,  2 Jul 2025 21:03:23 -0700
> alison.schofield@intel.com wrote:
> 
> > From: Alison Schofield <alison.schofield@intel.com>
> > 
> > Add CXL region debugfs attributes to inject and clear poison based
> > on an offset into the region. These new interfaces allow users to
> > operate on poison at the region level without needing to resolve
> > Device Physical Addresses (DPA) or target individual memdevs.
> > 
> > The implementation uses a new helper, region_offset_to_dpa_result()
> > that applies decoder interleave logic, including XOR-based address
> > decoding when applicable. Note that XOR decodes rely on driver
> > internal xormaps which are not exposed to userspace. So, this support
> > is not only a simplification of poison operations that could be done
> > using existing per memdev operations, but also it enables this
> > functionality for XOR interleaved regions for the first time.
> > 
> > New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/:
> > inject_poison and clear_poison. These are only exposed if all memdevs
> > participating in the region support both inject and clear commands,
> > ensuring consistent and reliable behavior across multi-device regions.
> > 
> > If tracing is enabled, these operations are logged as cxl_poison
> > events in /sys/kernel/tracing/trace.
> > 
> > The ABI documentation warns users of the significant risks that
> > come with using these capabilities.
> > 
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > ---
> > 
> > Changes in v2:
> > Simplify return using test_bit() in cxl_memdev_has_poison() (Jonathan)
> > Use ACQUIRE() in the debugfs inject and clear funcs (DaveJ, Jonathan)
> > Fail if offset is in the extended linear cache (Jonathan)
> > Added 'cxl' to an existing ABI for memdev clear_poison
> > Redefine ABI to take a region offset, not SPA (Dan)
> > Remove KernelVersion field in ABI doc (Dan)
> > Warn against misuse in ABI doc (Dan)
> > Add validate_region_offset() helper
> > 
> > 
> >  Documentation/ABI/testing/debugfs-cxl |  89 ++++++++++++++++-
> >  drivers/cxl/core/core.h               |   4 +
> >  drivers/cxl/core/memdev.c             |   8 ++
> >  drivers/cxl/core/region.c             | 131 +++++++++++++++++++++++++-
> >  4 files changed, 228 insertions(+), 4 deletions(-)
> > 
> > diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
> > index 12488c14be64..2989d4da96c1 100644
> > --- a/Documentation/ABI/testing/debugfs-cxl
> > +++ b/Documentation/ABI/testing/debugfs-cxl
> > @@ -19,8 +19,22 @@ Description:
> >  		is returned to the user. The inject_poison attribute is only
> >  		visible for devices supporting the capability.
> >  
> > +		TEST-ONLY INTERFACE: This interface is intended for testing
> > +		and validation purposes only. It is not a data repair mechanism
> > +		and should never be used on production systems or live data.
> >  
> > -What:		/sys/kernel/debug/memX/clear_poison
> > +		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
> > +		poison injection can result in permanent data loss. Injected
> > +		poison may render data permanently inaccessible even after
> > +		clearing, as the clear operation writes zeros and does not
> > +		recover original data.
> > +
> > +		SYSTEM STABILITY RISK: For volatile memory, poison injection
> > +		can cause kernel crashes, system instability, or unpredictable
> > +		behavior if the poisoned addresses are accessed by running code
> > +		or critical kernel structures.
> > +
> > +What:		/sys/kernel/debug/cxl/memX/clear_poison
> 
> Moved or previously a bug?  Looks like a bug, so separate patch.
> Not sure we will bother asking for a backport, but having it in here made
> me stare at that line for an unhealthy game of spot the difference.

Previous bug. Will fixup separately.

snip

> > +
> > +	rc = region_offset_to_dpa_result(cxlr, offset, &result);
> > +	if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
> > +		dev_dbg(&cxlr->dev,
> > +			"Failed to resolve DPA for region offset %#llx\n",
> > +			offset);
> > +
> > +		return -EINVAL;
> 
> If rc was set, preferable to return that. I'd split the condition
> into two parts to avoid this.

I want to dev_dbg() regardless, so how about adding the rc to dev_dbg()
and returning like this:

        rc = region_offset_to_dpa_result(cxlr, offset, &result);
        if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
                dev_dbg(&cxlr->dev,
-                       "Failed to resolve DPA for region offset %#llx\n",
-                       offset);
+                       "Failed to resolve DPA for region offset %#llx rc %d\n",
+                       offset, rc);
 
-               return -EINVAL;
+               return rc ? rc : -EINVAL;
        }



> 
> > +	}
> > +
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/4] cxl/region: Add inject and clear poison by region offset
  2025-07-12  4:40     ` Alison Schofield
@ 2025-07-15 15:20       ` Jonathan Cameron
  0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Cameron @ 2025-07-15 15:20 UTC (permalink / raw)
  To: Alison Schofield
  Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
	Dan Williams, linux-cxl

> > > +
> > > +	rc = region_offset_to_dpa_result(cxlr, offset, &result);
> > > +	if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
> > > +		dev_dbg(&cxlr->dev,
> > > +			"Failed to resolve DPA for region offset %#llx\n",
> > > +			offset);
> > > +
> > > +		return -EINVAL;  
> > 
> > If rc was set, preferable to return that. I'd split the condition
> > into two parts to avoid this.  
> 
> I want to dev_dbg() regardless, so how about adding the rc to dev_dbg()
> and returning like this:
> 
>         rc = region_offset_to_dpa_result(cxlr, offset, &result);
>         if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
>                 dev_dbg(&cxlr->dev,
> -                       "Failed to resolve DPA for region offset %#llx\n",
> -                       offset);
> +                       "Failed to resolve DPA for region offset %#llx rc %d\n",
> +                       offset, rc);
>  
> -               return -EINVAL;
> +               return rc ? rc : -EINVAL;
>         }
> 
LGTM

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-07-15 15:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-03  4:03 [PATCH v2 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
2025-07-03  4:03 ` [PATCH v2 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math alison.schofield
2025-07-03 13:43   ` Jonathan Cameron
2025-07-03  4:03 ` [PATCH v2 2/4] cxl/region: Introduce SPA to DPA address translation alison.schofield
2025-07-03 14:23   ` Jonathan Cameron
2025-07-12  4:17     ` Alison Schofield
2025-07-03  4:03 ` [PATCH v2 3/4] cxl/core: Add locked variants of the poison inject and clear funcs alison.schofield
2025-07-03 14:25   ` Jonathan Cameron
2025-07-12  4:23     ` Alison Schofield
2025-07-03  4:03 ` [PATCH v2 4/4] cxl/region: Add inject and clear poison by region offset alison.schofield
2025-07-03 14:31   ` Jonathan Cameron
2025-07-12  4:40     ` Alison Schofield
2025-07-15 15:20       ` Jonathan Cameron

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).