* [PATCH v3 0/4] cxl: Support Poison Inject & Clear by Region Offset
@ 2025-07-13 2:37 alison.schofield
2025-07-13 2:37 ` [PATCH v3 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math alison.schofield
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: alison.schofield @ 2025-07-13 2:37 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl
From: Alison Schofield <alison.schofield@intel.com>
Built upon cxl repo branch for-6.17/cxl-acquire.
Please excuse the checkpatch errors in the new ACQUIRE syntax.
A stand-alone test of the SPA<=>HPA<=>DPA calculations is here:
https://github.com/pmem/ndctl/commit/0b5ae6882d882145f671c6d6bd9a4fc1edb4b99d
Changes in v3:
Patch 2
Check return of ways_to_eiw() & granularity_to_eig() (Jonathan)
Collapse a header comment into pos and offset blocks (Jonathan)
Calc pos for 3,6,12 using actual ways, not always 3 (Jonathan)
Mask off bottom bits in < 8 offset case (Jonathan)
Wrap code comments closer to column 80 (Jonathan)
Use a continue to un-nest a for loop. (Jonathan)
Patch 3
Undo v2 return of rc from locked variants. Return 0. (Jonathan)
Patch 4
Return rc from region_offset_to_dpa_result() if present (Jonathan)
Remove the errant Documentation/ABI fixup (Jonathan)
Add the above rc to the dev_dbg message
Changes in v2:
Rebased onto cxl repo branch for-6.17/cxl-acquire
As the ACQUIRE() set noted, ACQUIRE() syntax leads to a checkpatch
ERROR: do not use assignment in if condition
We'll have to endure that noise here until that set is merged with
a checkpatch.pl update or a change to the syntax.
Changelog appears per patch also:
Simplify return using test_bit() in cxl_memdev_has_poison() (Jonathan)
Use ACQUIRE() in the debugfs inject and clear funcs (DaveJ, Jonathan)
Shift by (eig + eiw) not (eig + 8) in MOD 3 interleaves** (Jonathan)
Simplify bottom bit handling by saving and restoring (Jonathan)
Return the rc from locked variants in cxl_clear/inject_poison
Fail if offset is in the extended linear cache (Jonathan)
Calculate the pos and dpa_offset inline, not in a helper
Remove the not customary __ prefix from locked variants
Added 'cxl' to an existing ABI for memdev clear_poison
Add and use a root decoder callback for spa_to_hpa()
Redefine ABI to take a region offset, not SPA (Dan)
Use div64_u64 instead of / to fix 32-bit ARM (lkp)
Use div64_u64_rem instead of % for arch safety
Remove KernelVersion field in ABI doc (Dan)
Pass pointer to results structures (DaveJ)
Add spec references and comments (DaveJ)
Warn against misuse in ABI doc (Dan)
Add validate_region_offset() helper
Begin Cover Letter (no change since v2)
This series allows expert users to inject and clear poison by writing a
Host Physical Address (HPA) to region debugfs files. At the core of this
new functionality is a helper that translates an HPA into a Device Physical
Address (DPA) and a memdev based on the region's decoder configuration.
The set is not merely a convenience wrapper for these region poison
operations: it also enables these operations for XOR interleaved
regions, where they were previously impossible.
Patch 1 defines a SPA->CXL HPA root decoder callback for XOR Math. It's a
restructuring and renaming exercise that enables the reuse of an existing
xormap function in either direction SPA<-->CXL HPA. It gets used in Patch 2.
Patch 2 introduces the translation logic capable of retrieving the memdev
and a DPA for a region offset.
Patch 3 adds a locked variant of the inject and clear poison ops to
support callers that must hold locks during the entire translation and
operation sequence. It gets used in Patch 4.
Patch 4 exposes the capability through region debugfs attributes that
only appear when all participating memdevs support the poison commands.
At the end of Patch 4 a region offset has been translated to a memdev
and a DPA and can simply be passed through to the pre-existing per memdev
inject and clear poison routines.
A CXL Unit Test update to cxl-poison.sh is posted separately.
Alison Schofield (4):
cxl: Define a SPA->CXL HPA root decoder callback for XOR Math
cxl/region: Introduce SPA to DPA address translation
cxl/core: Add locked variants of the poison inject and clear funcs
cxl/region: Add inject and clear poison by region offset
Documentation/ABI/testing/debugfs-cxl | 87 ++++++++++
drivers/cxl/acpi.c | 30 ++--
drivers/cxl/core/core.h | 4 +
drivers/cxl/core/memdev.c | 64 ++++++--
drivers/cxl/core/region.c | 218 ++++++++++++++++++++++++++
drivers/cxl/cxl.h | 3 +
drivers/cxl/cxlmem.h | 2 +
7 files changed, 380 insertions(+), 28 deletions(-)
base-commit: 6c90a41a7833a6bb2fb17b9f3cafda66ebdf259b
--
2.37.3
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v3 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math
2025-07-13 2:37 [PATCH v3 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
@ 2025-07-13 2:37 ` alison.schofield
2025-07-13 2:37 ` [PATCH v3 2/4] cxl/region: Introduce SPA to DPA address translation alison.schofield
` (2 subsequent siblings)
3 siblings, 0 replies; 14+ messages in thread
From: alison.schofield @ 2025-07-13 2:37 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl
From: Alison Schofield <alison.schofield@intel.com>
When DPA->SPA translation was introduced, it included a helper that
applied the XOR maps to do the CXL HPA -> SPA translation for XOR
region interleaves. In preparation for adding SPA->DPA address
translation, introduce the reverse callback.
The root decoder callback is defined generically, and not all usages
may be self-inverting like this XOR function, so add a separate
spa_to_hpa root decoder callback rather than reusing hpa_to_spa.
Rename the existing cxl_xor_hpa_to_spa() to cxl_apply_xor_maps(), a
name that reflects what it does without implying a direction, replace
the hpa parameter with a generic addr, and add code comments stating
that the function supports the translation in either direction.
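The self-inverting property relied on here can be sketched with a small
userspace model of the xormap application (the xormap values and helper
name below are illustrative, not taken from real CFMWS data):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of cxl_apply_xor_maps(): for each non-zero xormap,
 * set the map's lowest set bit in addr to the parity (XORALLBITS) of
 * (addr & xormap). Because XOR is self-inverting, applying the maps a
 * second time restores the original address, which is why one function
 * can serve both the HPA->SPA and SPA->HPA directions.
 */
static uint64_t apply_xor_maps(const uint64_t *maps, int n, uint64_t addr)
{
	for (int i = 0; i < n; i++) {
		if (!maps[i])
			continue;
		/* lowest set bit, the kernel's __ffs() */
		int pos = __builtin_ctzll(maps[i]);
		/* XORALLBITS via popcount parity, the kernel's hweight64() */
		uint64_t val = __builtin_popcountll(addr & maps[i]) & 1;

		addr = (addr & ~(1ULL << pos)) | (val << pos);
	}
	return addr;
}
```

A round trip through the function twice returns the starting address,
matching the "Since XOR is self-inverting, so is this function" comment
in the patch.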
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
---
No changes in v3
drivers/cxl/acpi.c | 30 ++++++++++++++++++------------
drivers/cxl/cxl.h | 3 +++
2 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index a1a99ec3f12c..fd12b9a96246 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -21,7 +21,7 @@ static const guid_t acpi_cxl_qtg_id_guid =
0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
-static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
+static u64 cxl_apply_xor_maps(struct cxl_root_decoder *cxlrd, u64 addr)
{
struct cxl_cxims_data *cximsd = cxlrd->platform_data;
int hbiw = cxlrd->cxlsd.nr_targets;
@@ -30,19 +30,23 @@ static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
/* No xormaps for host bridge interleave ways of 1 or 3 */
if (hbiw == 1 || hbiw == 3)
- return hpa;
+ return addr;
/*
- * For root decoders using xormaps (hbiw: 2,4,6,8,12,16) restore
- * the position bit to its value before the xormap was applied at
- * HPA->DPA translation.
+ * In regions using XOR interleave arithmetic the CXL HPA may not
+ * be the same as the SPA. This helper performs the SPA->CXL HPA
+ * or the CXL HPA->SPA translation. Since XOR is self-inverting,
+ * so is this function.
+ *
+ * For root decoders using xormaps (hbiw: 2,4,6,8,12,16) applying the
+ * xormaps will toggle a position bit.
*
* pos is the lowest set bit in an XORMAP
- * val is the XORALLBITS(HPA & XORMAP)
+ * val is the XORALLBITS(addr & XORMAP)
*
* XORALLBITS: The CXL spec (3.1 Table 9-22) defines XORALLBITS
* as an operation that outputs a single bit by XORing all the
- * bits in the input (hpa & xormap). Implement XORALLBITS using
+ * bits in the input (addr & xormap). Implement XORALLBITS using
* hweight64(). If the hamming weight is even the XOR of those
* bits results in val==0, if odd the XOR result is val==1.
*/
@@ -51,11 +55,11 @@ static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
if (!cximsd->xormaps[i])
continue;
pos = __ffs(cximsd->xormaps[i]);
- val = (hweight64(hpa & cximsd->xormaps[i]) & 1);
- hpa = (hpa & ~(1ULL << pos)) | (val << pos);
+ val = (hweight64(addr & cximsd->xormaps[i]) & 1);
+ addr = (addr & ~(1ULL << pos)) | (val << pos);
}
- return hpa;
+ return addr;
}
struct cxl_cxims_context {
@@ -413,8 +417,10 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
cxlrd->qos_class = cfmws->qtg_id;
- if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR)
- cxlrd->hpa_to_spa = cxl_xor_hpa_to_spa;
+ if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR) {
+ cxlrd->hpa_to_spa = cxl_apply_xor_maps;
+ cxlrd->spa_to_hpa = cxl_apply_xor_maps;
+ }
rc = cxl_decoder_add(cxld, target_map);
if (rc)
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 50799a681231..75e60a313be5 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -419,12 +419,14 @@ struct cxl_switch_decoder {
struct cxl_root_decoder;
typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
+typedef u64 (*cxl_spa_to_hpa_fn)(struct cxl_root_decoder *cxlrd, u64 spa);
/**
* struct cxl_root_decoder - Static platform CXL address decoder
* @res: host / parent resource for region allocations
* @region_id: region id for next region provisioning event
* @hpa_to_spa: translate CXL host-physical-address to Platform system-physical-address
+ * @spa_to_hpa: translate Platform system-physical-address to CXL host-physical-address
* @platform_data: platform specific configuration data
* @range_lock: sync region autodiscovery by address range
* @qos_class: QoS performance class cookie
@@ -434,6 +436,7 @@ struct cxl_root_decoder {
struct resource *res;
atomic_t region_id;
cxl_hpa_to_spa_fn hpa_to_spa;
+ cxl_spa_to_hpa_fn spa_to_hpa;
void *platform_data;
struct mutex range_lock;
int qos_class;
--
2.37.3
* [PATCH v3 2/4] cxl/region: Introduce SPA to DPA address translation
2025-07-13 2:37 [PATCH v3 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
2025-07-13 2:37 ` [PATCH v3 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math alison.schofield
@ 2025-07-13 2:37 ` alison.schofield
2025-07-16 10:52 ` Jonathan Cameron
2025-07-13 2:37 ` [PATCH v3 3/4] cxl/core: Add locked variants of the poison inject and clear funcs alison.schofield
2025-07-13 2:37 ` [PATCH v3 4/4] cxl/region: Add inject and clear poison by region offset alison.schofield
3 siblings, 1 reply; 14+ messages in thread
From: alison.schofield @ 2025-07-13 2:37 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl
From: Alison Schofield <alison.schofield@intel.com>
Add infrastructure to translate System Physical Addresses (SPA) to
Device Physical Addresses (DPA) within CXL regions. This capability
will be used by follow-on patches that add poison inject and clear
operations at the region level.
The SPA-to-DPA translation process follows these steps:
1. Apply root decoder transformations (SPA to HPA) if configured.
2. Extract the position in region interleave from the HPA offset.
3. Extract the DPA offset from the HPA offset.
4. Use position to find endpoint decoder.
5. Use endpoint decoder to find memdev and calculate DPA from offset.
6. Return the result - a memdev and a DPA.
It is Step 1 above that makes this a driver level operation and not
work we can push to user space. Rather than exporting the XOR maps for
root decoders configured with XOR interleave, the driver performs this
complex calculation for the user.
Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
Implementation Note: Device Decode Logic.
These calculations mirror much of the logic introduced earlier for DPA
to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
reverse the spec-defined 'Device Decode Logic'.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---
Changes in v3:
Check return of ways_to_eiw() & granularity_to_eig() (Jonathan)
Collapse a header comment into pos and offset blocks (Jonathan)
Calc pos for 3,6,12 using actual ways, not always 3 (Jonathan)
Mask off bottom bits in < 8 offset case (Jonathan)
Wrap code comments closer to column 80 (Jonathan)
Use a continue to un-nest a for loop. (Jonathan)
drivers/cxl/core/region.c | 93 +++++++++++++++++++++++++++++++++++++++
1 file changed, 93 insertions(+)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index a2ba19151d4f..404f69864d61 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2956,6 +2956,99 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
return hpa;
}
+struct dpa_result {
+ struct cxl_memdev *cxlmd;
+ u64 dpa;
+};
+
+static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
+ u64 offset,
+ struct dpa_result *result)
+{
+ struct cxl_region_params *p = &cxlr->params;
+ struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
+ struct cxl_endpoint_decoder *cxled;
+ u64 hpa, hpa_offset, dpa_offset;
+ u64 bits_upper, bits_lower;
+ u16 eig = 0;
+ u8 eiw = 0;
+ int pos;
+
+ lockdep_assert_held(&cxl_rwsem.region);
+ lockdep_assert_held(&cxl_rwsem.dpa);
+
+ if (granularity_to_eig(p->interleave_granularity, &eig) ||
+ ways_to_eiw(p->interleave_ways, &eiw))
+ return -ENXIO;
+
+ /*
+ * If the root decoder has SPA to CXL HPA callback, use it. Otherwise
+ * CXL HPA is assumed to equal SPA.
+ */
+ if (cxlrd->spa_to_hpa) {
+ hpa = cxlrd->spa_to_hpa(cxlrd, p->res->start + offset);
+ hpa_offset = hpa - p->res->start;
+ } else {
+ hpa_offset = offset;
+ }
+ /*
+ * Interleave position: CXL Spec 3.2 Section 8.2.4.20.13
+ * eiw < 8
+ * Position is in the IW bits at HPA_OFFSET[IG+8+IW-1:IG+8].
+ * Per spec "remove IW bits starting with bit position IG+8"
+ * eiw >= 8
+ * Position is not explicitly stored in HPA_OFFSET bits. It is
+ * derived from the modulo operation of the upper bits using
+ * the total number of interleave ways.
+ */
+ if (eiw < 8)
+ pos = (hpa_offset >> (eig + 8)) & GENMASK(eiw - 1, 0);
+ else
+ pos = (hpa_offset >> (eig + 8)) % p->interleave_ways;
+
+ if (pos < 0 || pos >= p->nr_targets) {
+ dev_dbg(&cxlr->dev, "Invalid position %d for %d targets\n",
+ pos, p->nr_targets);
+ return -ENXIO;
+ }
+ /*
+ * DPA offset: CXL Spec 3.2 Section 8.2.4.20.13
+ * Lower bits [IG+7:0] pass through unchanged
+ * (eiw < 8)
+ * Per spec: DPAOffset[51:IG+8] = (HPAOffset[51:IG+IW+8] >> IW)
+ * Clear the position bits to isolate upper section, then
+ * reverse the left shift by eiw that occurred during DPA->HPA
+ * (eiw >= 8)
+ * Per spec: DPAOffset[51:IG+8] = HPAOffset[51:IG+IW] / 3
+ * Extract upper bits from the correct bit range and divide by 3
+ * to recover the original DPA upper bits
+ */
+
+ bits_lower = hpa_offset & GENMASK_ULL(eig + 7, 0);
+ if (eiw < 8) {
+ hpa_offset &= ~((u64)GENMASK(eiw + eig + 8 - 1, 0));
+ dpa_offset = hpa_offset >> eiw;
+ } else {
+ bits_upper = div64_u64(hpa_offset >> (eig + eiw), 3);
+ dpa_offset = bits_upper << (eig + 8);
+ }
+ dpa_offset |= bits_lower;
+
+ /* Look-up and return the result: a memdev and a DPA */
+ for (int i = 0; i < p->nr_targets; i++) {
+ cxled = p->targets[i];
+ if (cxled->pos != pos)
+ continue;
+ result->cxlmd = cxled_to_memdev(cxled);
+ result->dpa = cxl_dpa_resource_start(cxled) + dpa_offset;
+
+ return 0;
+ }
+ dev_err(&cxlr->dev, "No device found for position %d\n", pos);
+
+ return -ENXIO;
+}
+
static struct lock_class_key cxl_pmem_region_key;
static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
--
2.37.3
* [PATCH v3 3/4] cxl/core: Add locked variants of the poison inject and clear funcs
2025-07-13 2:37 [PATCH v3 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
2025-07-13 2:37 ` [PATCH v3 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math alison.schofield
2025-07-13 2:37 ` [PATCH v3 2/4] cxl/region: Introduce SPA to DPA address translation alison.schofield
@ 2025-07-13 2:37 ` alison.schofield
2025-07-15 21:33 ` Dave Jiang
2025-07-16 10:58 ` Jonathan Cameron
2025-07-13 2:37 ` [PATCH v3 4/4] cxl/region: Add inject and clear poison by region offset alison.schofield
3 siblings, 2 replies; 14+ messages in thread
From: alison.schofield @ 2025-07-13 2:37 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl
From: Alison Schofield <alison.schofield@intel.com>
The core functions that validate and send inject and clear commands
to the memdev devices require holding both the dpa_rwsem and the
region_rwsem.
In preparation for another caller of these functions that must hold
the locks upon entry, split the work into a locked and unlocked pair.
Consideration was given to moving the locking into both callers;
however, the existing caller is not in the core (mem.c) and cannot
access the locks.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---
Changes in v3:
Undo v2 return of rc from locked variants. Return 0. (Jonathan)
drivers/cxl/core/memdev.c | 56 ++++++++++++++++++++++++++++-----------
drivers/cxl/cxlmem.h | 2 ++
2 files changed, 42 insertions(+), 16 deletions(-)
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index f5fbd34310fd..097fc9828f3a 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -276,7 +276,7 @@ static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
return 0;
}
-int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
+int cxl_inject_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
{
struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
struct cxl_mbox_inject_poison inject;
@@ -288,13 +288,8 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
- ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
- if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
- return rc;
-
- ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
- if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
- return rc;
+ lockdep_assert_held(&cxl_rwsem.dpa);
+ lockdep_assert_held(&cxl_rwsem.region);
rc = cxl_validate_poison_dpa(cxlmd, dpa);
if (rc)
@@ -324,9 +319,26 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
return 0;
}
+
+int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
+{
+ int rc;
+
+ ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
+ return rc;
+
+ ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+ return rc;
+
+ rc = cxl_inject_poison_locked(cxlmd, dpa);
+
+ return rc;
+}
EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, "CXL");
-int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
+int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
{
struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
struct cxl_mbox_clear_poison clear;
@@ -338,13 +350,8 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
- ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
- if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
- return rc;
-
- ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
- if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
- return rc;
+ lockdep_assert_held(&cxl_rwsem.dpa);
+ lockdep_assert_held(&cxl_rwsem.region);
rc = cxl_validate_poison_dpa(cxlmd, dpa);
if (rc)
@@ -383,6 +390,23 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
return 0;
}
+
+int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
+{
+ int rc;
+
+ ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
+ return rc;
+
+ ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+ return rc;
+
+ rc = cxl_clear_poison_locked(cxlmd, dpa);
+
+ return rc;
+}
EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, "CXL");
static struct attribute *cxl_memdev_attributes[] = {
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index f5b20641e57c..869c73de12c8 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -861,6 +861,8 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
+int cxl_inject_poison_locked(struct cxl_memdev *cxlmd, u64 dpa);
+int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa);
#ifdef CONFIG_CXL_EDAC_MEM_FEATURES
int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd);
--
2.37.3
* [PATCH v3 4/4] cxl/region: Add inject and clear poison by region offset
2025-07-13 2:37 [PATCH v3 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
` (2 preceding siblings ...)
2025-07-13 2:37 ` [PATCH v3 3/4] cxl/core: Add locked variants of the poison inject and clear funcs alison.schofield
@ 2025-07-13 2:37 ` alison.schofield
2025-07-13 17:23 ` kernel test robot
2025-07-16 11:02 ` Jonathan Cameron
3 siblings, 2 replies; 14+ messages in thread
From: alison.schofield @ 2025-07-13 2:37 UTC (permalink / raw)
To: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams
Cc: linux-cxl
From: Alison Schofield <alison.schofield@intel.com>
Add CXL region debugfs attributes to inject and clear poison based
on an offset into the region. These new interfaces allow users to
operate on poison at the region level without needing to resolve
Device Physical Addresses (DPA) or target individual memdevs.
The implementation uses a new helper, region_offset_to_dpa_result()
that applies decoder interleave logic, including XOR-based address
decoding when applicable. Note that XOR decodes rely on driver
internal xormaps which are not exposed to userspace. This support is
therefore not only a simplification of poison operations that could
already be done using the existing per-memdev operations; it also
enables this functionality for XOR interleaved regions for the first
time.
New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/:
inject_poison and clear_poison. These are only exposed if all memdevs
participating in the region support both inject and clear commands,
ensuring consistent and reliable behavior across multi-device regions.
If tracing is enabled, these operations are logged as cxl_poison
events in /sys/kernel/tracing/trace.
The ABI documentation warns users of the significant risks that
come with using these capabilities.
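A hedged usage sketch of the new attributes (the region name, base
address, and target address below are illustrative; the debugfs writes
require root and a mounted debugfs, so they are shown commented out):

```shell
# The attributes take an offset into the region, so subtract the
# region's base HPA first (base shown here is hypothetical, e.g.
# taken from /proc/iomem for the region's resource).
region_base=0x1000000000
hpa=0x1000004200
offset=$(( hpa - region_base ))
printf 'region offset: %#x\n' "$offset"
# printf '%#x' "$offset" > /sys/kernel/debug/cxl/region0/inject_poison
# printf '%#x' "$offset" > /sys/kernel/debug/cxl/region0/clear_poison
```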
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
---
Changes in v3:
Return rc from region_offset_to_dpa_result() if present (Jonathan)
Remove the errant Documentation/ABI fixup (Jonathan)
Add the above rc to the dev_dbg message
Documentation/ABI/testing/debugfs-cxl | 87 +++++++++++++++++
drivers/cxl/core/core.h | 4 +
drivers/cxl/core/memdev.c | 8 ++
drivers/cxl/core/region.c | 131 +++++++++++++++++++++++++-
4 files changed, 227 insertions(+), 3 deletions(-)
diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
index 12488c14be64..e1f69e727190 100644
--- a/Documentation/ABI/testing/debugfs-cxl
+++ b/Documentation/ABI/testing/debugfs-cxl
@@ -19,6 +19,20 @@ Description:
is returned to the user. The inject_poison attribute is only
visible for devices supporting the capability.
+ TEST-ONLY INTERFACE: This interface is intended for testing
+ and validation purposes only. It is not a data repair mechanism
+ and should never be used on production systems or live data.
+
+ DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
+ poison injection can result in permanent data loss. Injected
+ poison may render data permanently inaccessible even after
+ clearing, as the clear operation writes zeros and does not
+ recover original data.
+
+ SYSTEM STABILITY RISK: For volatile memory, poison injection
+ can cause kernel crashes, system instability, or unpredictable
+ behavior if the poisoned addresses are accessed by running code
+ or critical kernel structures.
What: /sys/kernel/debug/memX/clear_poison
Date: April, 2023
@@ -35,6 +49,79 @@ Description:
The clear_poison attribute is only visible for devices
supporting the capability.
+ TEST-ONLY INTERFACE: This interface is intended for testing
+ and validation purposes only. It is not a data repair mechanism
+ and should never be used on production systems or live data.
+
+ CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
+ specified address range and removes the address from the poison
+ list. It does NOT recover or restore original data that may have
+ been present before poison injection. Any original data at the
+ cleared address is permanently lost and replaced with zeros.
+
+ CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
+ purposes only and should not be used as a data repair tool.
+ Clearing poison is fundamentally different from data recovery
+ or error correction.
+
+What: /sys/kernel/debug/cxl/regionX/inject_poison
+Date: August, 2025
+Contact: linux-cxl@vger.kernel.org
+Description:
+ (WO) When a Host Physical Address (HPA) is written to this
+ attribute, the region driver translates it to a Device
+ Physical Address (DPA) and identifies the corresponding
+ memdev. It then sends an inject poison command to that memdev
+ at the translated DPA. Refer to the memdev ABI entry at:
+ /sys/kernel/debug/cxl/memX/inject_poison for the detailed
+ behavior. This attribute is only visible if all memdevs
+ participating in the region support both inject and clear
+ poison commands.
+
+ TEST-ONLY INTERFACE: This interface is intended for testing
+ and validation purposes only. It is not a data repair mechanism
+ and should never be used on production systems or live data.
+
+ DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
+ poison injection can result in permanent data loss. Injected
+ poison may render data permanently inaccessible even after
+ clearing, as the clear operation writes zeros and does not
+ recover original data.
+
+ SYSTEM STABILITY RISK: For volatile memory, poison injection
+ can cause kernel crashes, system instability, or unpredictable
+ behavior if the poisoned addresses are accessed by running code
+ or critical kernel structures.
+
+What: /sys/kernel/debug/cxl/regionX/clear_poison
+Date: August, 2025
+Contact: linux-cxl@vger.kernel.org
+Description:
+ (WO) When a Host Physical Address (HPA) is written to this
+ attribute, the region driver translates it to a Device
+ Physical Address (DPA) and identifies the corresponding
+ memdev. It then sends a clear poison command to that memdev
+ at the translated DPA. Refer to the memdev ABI entry at:
+ /sys/kernel/debug/cxl/memX/clear_poison for the detailed
+ behavior. This attribute is only visible if all memdevs
+ participating in the region support both inject and clear
+ poison commands.
+
+ TEST-ONLY INTERFACE: This interface is intended for testing
+ and validation purposes only. It is not a data repair mechanism
+ and should never be used on production systems or live data.
+
+ CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
+ specified address range and removes the address from the poison
+ list. It does NOT recover or restore original data that may have
+ been present before poison injection. Any original data at the
+ cleared address is permanently lost and replaced with zeros.
+
+ CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
+ purposes only and should not be used as a data repair tool.
+ Clearing poison is fundamentally different from data recovery
+ or error correction.
+
What: /sys/kernel/debug/cxl/einj_types
Date: January, 2024
KernelVersion: v6.9
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index ed7d08244542..9b4c305e5e5e 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -161,6 +161,10 @@ enum cxl_poison_trace_type {
CXL_POISON_TRACE_CLEAR,
};
+enum poison_cmd_enabled_bits;
+bool cxl_memdev_has_poison_cmd(struct cxl_memdev *cxlmd,
+ enum poison_cmd_enabled_bits cmd);
+
long cxl_pci_get_latency(struct pci_dev *pdev);
int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
int cxl_update_hmat_access_coordinates(int nid, struct cxl_region *cxlr,
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 097fc9828f3a..4ce8a40960b8 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -200,6 +200,14 @@ static ssize_t security_erase_store(struct device *dev,
static struct device_attribute dev_attr_security_erase =
__ATTR(erase, 0200, NULL, security_erase_store);
+bool cxl_memdev_has_poison_cmd(struct cxl_memdev *cxlmd,
+ enum poison_cmd_enabled_bits cmd)
+{
+ struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
+
+ return test_bit(cmd, mds->poison.enabled_cmds);
+}
+
static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
{
struct cxl_dev_state *cxlds = cxlmd->cxlds;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 404f69864d61..d0ea82e92faf 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2,6 +2,7 @@
/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
#include <linux/memregion.h>
#include <linux/genalloc.h>
+#include <linux/debugfs.h>
#include <linux/device.h>
#include <linux/module.h>
#include <linux/memory.h>
@@ -2961,9 +2962,8 @@ struct dpa_result {
u64 dpa;
};
-static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
- u64 offset,
- struct dpa_result *result)
+static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset,
+ struct dpa_result *result)
{
struct cxl_region_params *p = &cxlr->params;
struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
@@ -3608,6 +3608,105 @@ static void shutdown_notifiers(void *_cxlr)
unregister_mt_adistance_algorithm(&cxlr->adist_notifier);
}
+static void remove_debugfs(void *dentry)
+{
+ debugfs_remove_recursive(dentry);
+}
+
+static int validate_region_offset(struct cxl_region *cxlr, u64 offset)
+{
+ struct cxl_region_params *p = &cxlr->params;
+ resource_size_t region_size;
+ u64 hpa;
+
+ if (offset < p->cache_size) {
+ dev_err(&cxlr->dev,
+ "Offset %#llx is within extended linear cache %#llx\n",
+ offset, p->cache_size);
+ return -EINVAL;
+ }
+
+ region_size = resource_size(p->res);
+ if (offset >= region_size) {
+ dev_err(&cxlr->dev, "Offset %#llx exceeds region size %#llx\n",
+ offset, region_size);
+ return -EINVAL;
+ }
+
+ hpa = p->res->start + offset;
+ if (hpa < p->res->start || hpa > p->res->end) {
+ dev_err(&cxlr->dev, "HPA %#llx not in region %pr\n", hpa,
+ p->res);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int cxl_region_debugfs_poison_inject(void *data, u64 offset)
+{
+ struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
+ struct cxl_region *cxlr = data;
+ int rc;
+
+ ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, ®ion_rwsem)))
+ return rc;
+
+ ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+ return rc;
+
+ if (validate_region_offset(cxlr, offset))
+ return -EINVAL;
+
+ rc = region_offset_to_dpa_result(cxlr, offset, &result);
+ if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
+ dev_dbg(&cxlr->dev,
+ "Failed to resolve DPA for region offset %#llx rc %d\n",
+ offset, rc);
+
+ return rc ? rc : -EINVAL;
+ }
+
+ return cxl_inject_poison_locked(result.cxlmd, result.dpa);
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
+ cxl_region_debugfs_poison_inject, "%llx\n");
+
+static int cxl_region_debugfs_poison_clear(void *data, u64 offset)
+{
+ struct dpa_result result = { .dpa = ULLONG_MAX, .cxlmd = NULL };
+ struct cxl_region *cxlr = data;
+ int rc;
+
+ ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
+ return rc;
+
+ ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
+ return rc;
+
+ if (validate_region_offset(cxlr, offset))
+ return -EINVAL;
+
+ rc = region_offset_to_dpa_result(cxlr, offset, &result);
+ if (rc || !result.cxlmd || result.dpa == ULLONG_MAX) {
+ dev_dbg(&cxlr->dev,
+ "Failed to resolve DPA for region offset %#llx rc %d\n",
+ offset, rc);
+
+ return rc ? rc : -EINVAL;
+ }
+
+ return cxl_clear_poison_locked(result.cxlmd, result.dpa);
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
+ cxl_region_debugfs_poison_clear, "%llx\n");
+
static int cxl_region_can_probe(struct cxl_region *cxlr)
{
struct cxl_region_params *p = &cxlr->params;
@@ -3637,6 +3736,7 @@ static int cxl_region_probe(struct device *dev)
{
struct cxl_region *cxlr = to_cxl_region(dev);
struct cxl_region_params *p = &cxlr->params;
+ bool poison_supported = true;
int rc;
rc = cxl_region_can_probe(cxlr);
@@ -3660,6 +3760,31 @@ static int cxl_region_probe(struct device *dev)
if (rc)
return rc;
+ /* Create poison attributes if all memdevs support the capabilities */
+ for (int i = 0; i < p->nr_targets; i++) {
+ struct cxl_endpoint_decoder *cxled = p->targets[i];
+ struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
+
+ if (!cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_INJECT) ||
+ !cxl_memdev_has_poison_cmd(cxlmd, CXL_POISON_ENABLED_CLEAR)) {
+ poison_supported = false;
+ break;
+ }
+ }
+
+ if (poison_supported) {
+ struct dentry *dentry;
+
+ dentry = cxl_debugfs_create_dir(dev_name(dev));
+ debugfs_create_file("inject_poison", 0200, dentry, cxlr,
+ &cxl_poison_inject_fops);
+ debugfs_create_file("clear_poison", 0200, dentry, cxlr,
+ &cxl_poison_clear_fops);
+ rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
+ if (rc)
+ return rc;
+ }
+
switch (cxlr->mode) {
case CXL_PARTMODE_PMEM:
rc = devm_cxl_region_edac_register(cxlr);
--
2.37.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v3 4/4] cxl/region: Add inject and clear poison by region offset
2025-07-13 2:37 ` [PATCH v3 4/4] cxl/region: Add inject and clear poison by region offset alison.schofield
@ 2025-07-13 17:23 ` kernel test robot
2025-07-16 11:02 ` Jonathan Cameron
1 sibling, 0 replies; 14+ messages in thread
From: kernel test robot @ 2025-07-13 17:23 UTC (permalink / raw)
To: alison.schofield, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
Vishal Verma, Ira Weiny, Dan Williams
Cc: oe-kbuild-all, linux-cxl
Hi,
kernel test robot noticed the following build errors:
[auto build test ERROR on 6c90a41a7833a6bb2fb17b9f3cafda66ebdf259b]
url: https://github.com/intel-lab-lkp/linux/commits/alison-schofield-intel-com/cxl-Define-a-SPA-CXL-HPA-root-decoder-callback-for-XOR-Math/20250713-104013
base: 6c90a41a7833a6bb2fb17b9f3cafda66ebdf259b
patch link: https://lore.kernel.org/r/612f019902bfb63ab22d7c4cf0f2f52fbc3ba5b8.1752365427.git.alison.schofield%40intel.com
patch subject: [PATCH v3 4/4] cxl/region: Add inject and clear poison by region offset
config: i386-randconfig-001-20250713 (https://download.01.org/0day-ci/archive/20250714/202507140133.Qi2pYjES-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14+deb12u1) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250714/202507140133.Qi2pYjES-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202507140133.Qi2pYjES-lkp@intel.com/
All errors (new ones prefixed by >>):
ld: drivers/cxl/core/region.o: in function `region_offset_to_dpa_result':
>> drivers/cxl/core/region.c:3007: undefined reference to `__umoddi3'
vim +3007 drivers/cxl/core/region.c
f7f82c631d48c8 Alison Schofield 2025-07-12 2964
f15c7deb79e3c8 Alison Schofield 2025-07-12 2965 static int region_offset_to_dpa_result(struct cxl_region *cxlr, u64 offset,
f7f82c631d48c8 Alison Schofield 2025-07-12 2966 struct dpa_result *result)
f7f82c631d48c8 Alison Schofield 2025-07-12 2967 {
f7f82c631d48c8 Alison Schofield 2025-07-12 2968 struct cxl_region_params *p = &cxlr->params;
f7f82c631d48c8 Alison Schofield 2025-07-12 2969 struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
f7f82c631d48c8 Alison Schofield 2025-07-12 2970 struct cxl_endpoint_decoder *cxled;
f7f82c631d48c8 Alison Schofield 2025-07-12 2971 u64 hpa, hpa_offset, dpa_offset;
f7f82c631d48c8 Alison Schofield 2025-07-12 2972 u64 bits_upper, bits_lower;
f7f82c631d48c8 Alison Schofield 2025-07-12 2973 u16 eig = 0;
f7f82c631d48c8 Alison Schofield 2025-07-12 2974 u8 eiw = 0;
f7f82c631d48c8 Alison Schofield 2025-07-12 2975 int pos;
f7f82c631d48c8 Alison Schofield 2025-07-12 2976
f7f82c631d48c8 Alison Schofield 2025-07-12 2977 lockdep_assert_held(&cxl_rwsem.region);
f7f82c631d48c8 Alison Schofield 2025-07-12 2978 lockdep_assert_held(&cxl_rwsem.dpa);
f7f82c631d48c8 Alison Schofield 2025-07-12 2979
f7f82c631d48c8 Alison Schofield 2025-07-12 2980 if (granularity_to_eig(p->interleave_granularity, &eig) ||
f7f82c631d48c8 Alison Schofield 2025-07-12 2981 ways_to_eiw(p->interleave_ways, &eiw))
f7f82c631d48c8 Alison Schofield 2025-07-12 2982 return -ENXIO;
f7f82c631d48c8 Alison Schofield 2025-07-12 2983
f7f82c631d48c8 Alison Schofield 2025-07-12 2984 /*
f7f82c631d48c8 Alison Schofield 2025-07-12 2985 * If the root decoder has SPA to CXL HPA callback, use it. Otherwise
f7f82c631d48c8 Alison Schofield 2025-07-12 2986 * CXL HPA is assumed to equal SPA.
f7f82c631d48c8 Alison Schofield 2025-07-12 2987 */
f7f82c631d48c8 Alison Schofield 2025-07-12 2988 if (cxlrd->spa_to_hpa) {
f7f82c631d48c8 Alison Schofield 2025-07-12 2989 hpa = cxlrd->spa_to_hpa(cxlrd, p->res->start + offset);
f7f82c631d48c8 Alison Schofield 2025-07-12 2990 hpa_offset = hpa - p->res->start;
f7f82c631d48c8 Alison Schofield 2025-07-12 2991 } else {
f7f82c631d48c8 Alison Schofield 2025-07-12 2992 hpa_offset = offset;
f7f82c631d48c8 Alison Schofield 2025-07-12 2993 }
f7f82c631d48c8 Alison Schofield 2025-07-12 2994 /*
f7f82c631d48c8 Alison Schofield 2025-07-12 2995 * Interleave position: CXL Spec 3.2 Section 8.2.4.20.13
f7f82c631d48c8 Alison Schofield 2025-07-12 2996 * eiw < 8
f7f82c631d48c8 Alison Schofield 2025-07-12 2997 * Position is in the IW bits at HPA_OFFSET[IG+8+IW-1:IG+8].
f7f82c631d48c8 Alison Schofield 2025-07-12 2998 * Per spec "remove IW bits starting with bit position IG+8"
f7f82c631d48c8 Alison Schofield 2025-07-12 2999 * eiw >= 8
f7f82c631d48c8 Alison Schofield 2025-07-12 3000 * Position is not explicitly stored in HPA_OFFSET bits. It is
f7f82c631d48c8 Alison Schofield 2025-07-12 3001 * derived from the modulo operation of the upper bits using
f7f82c631d48c8 Alison Schofield 2025-07-12 3002 * the total number of interleave ways.
f7f82c631d48c8 Alison Schofield 2025-07-12 3003 */
f7f82c631d48c8 Alison Schofield 2025-07-12 3004 if (eiw < 8)
f7f82c631d48c8 Alison Schofield 2025-07-12 3005 pos = (hpa_offset >> (eig + 8)) & GENMASK(eiw - 1, 0);
f7f82c631d48c8 Alison Schofield 2025-07-12 3006 else
f7f82c631d48c8 Alison Schofield 2025-07-12 @3007 pos = (hpa_offset >> (eig + 8)) % p->interleave_ways;
f7f82c631d48c8 Alison Schofield 2025-07-12 3008
f7f82c631d48c8 Alison Schofield 2025-07-12 3009 if (pos < 0 || pos >= p->nr_targets) {
f7f82c631d48c8 Alison Schofield 2025-07-12 3010 dev_dbg(&cxlr->dev, "Invalid position %d for %d targets\n",
f7f82c631d48c8 Alison Schofield 2025-07-12 3011 pos, p->nr_targets);
f7f82c631d48c8 Alison Schofield 2025-07-12 3012 return -ENXIO;
f7f82c631d48c8 Alison Schofield 2025-07-12 3013 }
f7f82c631d48c8 Alison Schofield 2025-07-12 3014 /*
f7f82c631d48c8 Alison Schofield 2025-07-12 3015 * DPA offset: CXL Spec 3.2 Section 8.2.4.20.13
f7f82c631d48c8 Alison Schofield 2025-07-12 3016 * Lower bits [IG+7:0] pass through unchanged
f7f82c631d48c8 Alison Schofield 2025-07-12 3017 * (eiw < 8)
f7f82c631d48c8 Alison Schofield 2025-07-12 3018 * Per spec: DPAOffset[51:IG+8] = (HPAOffset[51:IG+IW+8] >> IW)
f7f82c631d48c8 Alison Schofield 2025-07-12 3019 * Clear the position bits to isolate upper section, then
f7f82c631d48c8 Alison Schofield 2025-07-12 3020 * reverse the left shift by eiw that occurred during DPA->HPA
f7f82c631d48c8 Alison Schofield 2025-07-12 3021 * (eiw >= 8)
f7f82c631d48c8 Alison Schofield 2025-07-12 3022 * Per spec: DPAOffset[51:IG+8] = HPAOffset[51:IG+IW] / 3
f7f82c631d48c8 Alison Schofield 2025-07-12 3023 * Extract upper bits from the correct bit range and divide by 3
f7f82c631d48c8 Alison Schofield 2025-07-12 3024 * to recover the original DPA upper bits
f7f82c631d48c8 Alison Schofield 2025-07-12 3025 */
f7f82c631d48c8 Alison Schofield 2025-07-12 3026
f7f82c631d48c8 Alison Schofield 2025-07-12 3027 bits_lower = hpa_offset & GENMASK_ULL(eig + 7, 0);
f7f82c631d48c8 Alison Schofield 2025-07-12 3028 if (eiw < 8) {
f7f82c631d48c8 Alison Schofield 2025-07-12 3029 hpa_offset &= ~((u64)GENMASK(eiw + eig + 8 - 1, 0));
f7f82c631d48c8 Alison Schofield 2025-07-12 3030 dpa_offset = hpa_offset >> eiw;
f7f82c631d48c8 Alison Schofield 2025-07-12 3031 } else {
f7f82c631d48c8 Alison Schofield 2025-07-12 3032 bits_upper = div64_u64(hpa_offset >> (eig + eiw), 3);
f7f82c631d48c8 Alison Schofield 2025-07-12 3033 dpa_offset = bits_upper << (eig + 8);
f7f82c631d48c8 Alison Schofield 2025-07-12 3034 }
f7f82c631d48c8 Alison Schofield 2025-07-12 3035 dpa_offset |= bits_lower;
f7f82c631d48c8 Alison Schofield 2025-07-12 3036
f7f82c631d48c8 Alison Schofield 2025-07-12 3037 /* Look-up and return the result: a memdev and a DPA */
f7f82c631d48c8 Alison Schofield 2025-07-12 3038 for (int i = 0; i < p->nr_targets; i++) {
f7f82c631d48c8 Alison Schofield 2025-07-12 3039 cxled = p->targets[i];
f7f82c631d48c8 Alison Schofield 2025-07-12 3040 if (cxled->pos != pos)
f7f82c631d48c8 Alison Schofield 2025-07-12 3041 continue;
f7f82c631d48c8 Alison Schofield 2025-07-12 3042 result->cxlmd = cxled_to_memdev(cxled);
f7f82c631d48c8 Alison Schofield 2025-07-12 3043 result->dpa = cxl_dpa_resource_start(cxled) + dpa_offset;
f7f82c631d48c8 Alison Schofield 2025-07-12 3044
f7f82c631d48c8 Alison Schofield 2025-07-12 3045 return 0;
f7f82c631d48c8 Alison Schofield 2025-07-12 3046 }
f7f82c631d48c8 Alison Schofield 2025-07-12 3047 dev_err(&cxlr->dev, "No device found for position %d\n", pos);
f7f82c631d48c8 Alison Schofield 2025-07-12 3048
f7f82c631d48c8 Alison Schofield 2025-07-12 3049 return -ENXIO;
f7f82c631d48c8 Alison Schofield 2025-07-12 3050 }
f7f82c631d48c8 Alison Schofield 2025-07-12 3051
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v3 3/4] cxl/core: Add locked variants of the poison inject and clear funcs
2025-07-13 2:37 ` [PATCH v3 3/4] cxl/core: Add locked variants of the poison inject and clear funcs alison.schofield
@ 2025-07-15 21:33 ` Dave Jiang
2025-07-16 10:58 ` Jonathan Cameron
1 sibling, 0 replies; 14+ messages in thread
From: Dave Jiang @ 2025-07-15 21:33 UTC (permalink / raw)
To: alison.schofield, Davidlohr Bueso, Jonathan Cameron, Vishal Verma,
Ira Weiny, Dan Williams
Cc: linux-cxl
On 7/12/25 7:37 PM, alison.schofield@intel.com wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> The core functions that validate and send inject and clear commands
> to the memdev devices require holding both the dpa_rwsem and the
> region_rwsem.
>
> In preparation for another caller of these functions that must hold
> the locks upon entry, split the work into a locked and unlocked pair.
>
> Consideration was given to moving the locking to both callers,
> however, the existing caller is not in the core (mem.c) and cannot
> access the locks.
>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
>
> Changes in v3:
> Undo v2 return of rc from locked variants. Return 0. (Jonathan)
>
> drivers/cxl/core/memdev.c | 56 ++++++++++++++++++++++++++++-----------
> drivers/cxl/cxlmem.h | 2 ++
> 2 files changed, 42 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index f5fbd34310fd..097fc9828f3a 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -276,7 +276,7 @@ static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
> return 0;
> }
>
> -int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +int cxl_inject_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
> {
> struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
> struct cxl_mbox_inject_poison inject;
> @@ -288,13 +288,8 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> if (!IS_ENABLED(CONFIG_DEBUG_FS))
> return 0;
>
> - ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> - if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> - return rc;
> -
> - ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> - if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> - return rc;
> + lockdep_assert_held(&cxl_rwsem.dpa);
> + lockdep_assert_held(&cxl_rwsem.region);
>
> rc = cxl_validate_poison_dpa(cxlmd, dpa);
> if (rc)
> @@ -324,9 +319,26 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
>
> return 0;
> }
> +
> +int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +{
> + int rc;
> +
> + ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> + return rc;
> +
> + ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> + return rc;
> +
> + rc = cxl_inject_poison_locked(cxlmd, dpa);
> +
> + return rc;
> +}
> EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, "CXL");
>
> -int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa)
> {
> struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
> struct cxl_mbox_clear_poison clear;
> @@ -338,13 +350,8 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
> if (!IS_ENABLED(CONFIG_DEBUG_FS))
> return 0;
>
> - ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> - if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> - return rc;
> -
> - ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> - if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> - return rc;
> + lockdep_assert_held(&cxl_rwsem.dpa);
> + lockdep_assert_held(&cxl_rwsem.region);
>
> rc = cxl_validate_poison_dpa(cxlmd, dpa);
> if (rc)
> @@ -383,6 +390,23 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
>
> return 0;
> }
> +
> +int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +{
> + int rc;
> +
> + ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> + return rc;
> +
> + ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> + return rc;
> +
> + rc = cxl_clear_poison_locked(cxlmd, dpa);
> +
> + return rc;
> +}
> EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, "CXL");
>
> static struct attribute *cxl_memdev_attributes[] = {
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index f5b20641e57c..869c73de12c8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -861,6 +861,8 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
> int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
> int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
> +int cxl_inject_poison_locked(struct cxl_memdev *cxlmd, u64 dpa);
> +int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u64 dpa);
>
> #ifdef CONFIG_CXL_EDAC_MEM_FEATURES
> int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd);
* Re: [PATCH v3 2/4] cxl/region: Introduce SPA to DPA address translation
2025-07-13 2:37 ` [PATCH v3 2/4] cxl/region: Introduce SPA to DPA address translation alison.schofield
@ 2025-07-16 10:52 ` Jonathan Cameron
2025-07-16 21:26 ` Alison Schofield
2025-07-22 0:49 ` Alison Schofield
0 siblings, 2 replies; 14+ messages in thread
From: Jonathan Cameron @ 2025-07-16 10:52 UTC (permalink / raw)
To: alison.schofield
Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl
On Sat, 12 Jul 2025 19:37:55 -0700
alison.schofield@intel.com wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> Add infrastructure to translate System Physical Addresses (SPA) to
> Device Physical Addresses (DPA) within CXL regions. This capability
> will be used by follow-on patches that add poison inject and clear
> operations at the region level.
>
> The SPA-to-DPA translation process follows these steps:
> 1. Apply root decoder transformations (SPA to HPA) if configured.
> 2. Extract the position in region interleave from the HPA offset.
> 3. Extract the DPA offset from the HPA offset.
> 4. Use position to find endpoint decoder.
> 5. Use endpoint decoder to find memdev and calculate DPA from offset.
> 6. Return the result - a memdev and a DPA.
>
> It is Step 1 above that makes this a driver level operation and not
> work we can push to user space. Rather than exporting the XOR maps for
> root decoders configured with XOR interleave, the driver performs this
> complex calculation for the user.
>
> Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
> Implementation Note: Device Decode Logic.
>
> These calculations mirror much of the logic introduced earlier in DPA
> to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
> reverse the spec defined 'Deivce Decode Logic'.
Device
>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Hi Alison,
A few minor things inline.
> ---
>
> Changes in v3:
> Check return of ways_to_eiw() & granularity_to_eig() (Jonathan)
> Collapse a header comment into pos and offset blocks (Jonathan)
> Calc pos for 3,6,12 using actual ways, not always 3 (Jonathan)
> Mask off bottom bits in < 8 offset case (Jonathan)
> Wrap code comments closer to column 80 (Jonathan)
> Use a continue to un-nest a for loop. (Jonathan)
>
> drivers/cxl/core/region.c | 93 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 93 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index a2ba19151d4f..404f69864d61 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2956,6 +2956,99 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> return hpa;
> }
>
> +struct dpa_result {
> + struct cxl_memdev *cxlmd;
> + u64 dpa;
> +};
> +
> +static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
> + u64 offset,
> + struct dpa_result *result)
> +{
> + struct cxl_region_params *p = &cxlr->params;
> + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> + struct cxl_endpoint_decoder *cxled;
> + u64 hpa, hpa_offset, dpa_offset;
> + u64 bits_upper, bits_lower;
> + u16 eig = 0;
> + u8 eiw = 0;
> + int pos;
> +
> + lockdep_assert_held(&cxl_rwsem.region);
> + lockdep_assert_held(&cxl_rwsem.dpa);
> +
> + if (granularity_to_eig(p->interleave_granularity, &eig) ||
> + ways_to_eiw(p->interleave_ways, &eiw))
> + return -ENXIO;
Why eat return values to replace with ENXIO? I'd prefer
rc = granularity_to_eig(p->interleave_granularity, &eig);
if (rc)
return rc;
rc = ways_to_eiw(p->interleave_ways, &eiw);
if (rc)
return rc;
> +
> + /*
> + * If the root decoder has SPA to CXL HPA callback, use it. Otherwise
> + * CXL HPA is assumed to equal SPA.
> + */
> + if (cxlrd->spa_to_hpa) {
> + hpa = cxlrd->spa_to_hpa(cxlrd, p->res->start + offset);
> + hpa_offset = hpa - p->res->start;
> + } else {
> + hpa_offset = offset;
> + }
> + /*
> + * Interleave position: CXL Spec 3.2 Section 8.2.4.20.13
> + * eiw < 8
> + * Position is in the IW bits at HPA_OFFSET[IG+8+IW-1:IG+8].
> + * Per spec "remove IW bits starting with bit position IG+8"
> + * eiw >= 8
> + * Position is not explicitly stored in HPA_OFFSET bits. It is
> + * derived from the modulo operation of the upper bits using
> + * the total number of interleave ways.
> + */
> + if (eiw < 8)
> + pos = (hpa_offset >> (eig + 8)) & GENMASK(eiw - 1, 0);
> + else
> + pos = (hpa_offset >> (eig + 8)) % p->interleave_ways;
> +
> + if (pos < 0 || pos >= p->nr_targets) {
> + dev_dbg(&cxlr->dev, "Invalid position %d for %d targets\n",
> + pos, p->nr_targets);
> + return -ENXIO;
> + }
Maybe a blank line here.
> + /*
> + * DPA offset: CXL Spec 3.2 Section 8.2.4.20.13
> + * Lower bits [IG+7:0] pass through unchanged
> + * (eiw < 8)
> + * Per spec: DPAOffset[51:IG+8] = (HPAOffset[51:IG+IW+8] >> IW)
> + * Clear the position bits to isolate upper section, then
> + * reverse the left shift by eiw that occurred during DPA->HPA
> + * (eiw >= 8)
> + * Per spec: DPAOffset[51:IG+8] = HPAOffset[51:IG+IW] / 3
> + * Extract upper bits from the correct bit range and divide by 3
> + * to recover the original DPA upper bits
> + */
> +
and not one here to keep this grouped slightly more closely with the
code it is talking about.
> + bits_lower = hpa_offset & GENMASK_ULL(eig + 7, 0);
> + if (eiw < 8) {
> + hpa_offset &= ~((u64)GENMASK(eiw + eig + 8 - 1, 0));
Really trivial but I'd use eig + eiw + 8 - 1 to align with the parameter
order in the comment. With matching that comment in mind why not,
dpa_offset = (hpa_offset & GENMASK_ULL(51, eig + eiw + 8)) >> eiw;
(Go long on this line for readability.) Or use 63 if you prefer for the upper
bound; it should remain safe.
Not using hpa_offset for the intermediate value avoids ambiguity about what
the value is at that stage (it's not the hpa_offset any more).
> + dpa_offset = hpa_offset >> eiw;
> + } else {
> + bits_upper = div64_u64(hpa_offset >> (eig + eiw), 3);
> + dpa_offset = bits_upper << (eig + 8);
> + }
> + dpa_offset |= bits_lower;
> +
> + /* Look-up and return the result: a memdev and a DPA */
> + for (int i = 0; i < p->nr_targets; i++) {
> + cxled = p->targets[i];
> + if (cxled->pos != pos)
> + continue;
> + result->cxlmd = cxled_to_memdev(cxled);
> + result->dpa = cxl_dpa_resource_start(cxled) + dpa_offset;
> +
> + return 0;
> + }
> + dev_err(&cxlr->dev, "No device found for position %d\n", pos);
> +
> + return -ENXIO;
> +}
> +
> static struct lock_class_key cxl_pmem_region_key;
>
> static int cxl_pmem_region_alloc(struct cxl_region *cxlr)
* Re: [PATCH v3 3/4] cxl/core: Add locked variants of the poison inject and clear funcs
2025-07-13 2:37 ` [PATCH v3 3/4] cxl/core: Add locked variants of the poison inject and clear funcs alison.schofield
2025-07-15 21:33 ` Dave Jiang
@ 2025-07-16 10:58 ` Jonathan Cameron
2025-07-22 1:11 ` Alison Schofield
1 sibling, 1 reply; 14+ messages in thread
From: Jonathan Cameron @ 2025-07-16 10:58 UTC (permalink / raw)
To: alison.schofield
Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl
On Sat, 12 Jul 2025 19:37:56 -0700
alison.schofield@intel.com wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> The core functions that validate and send inject and clear commands
> to the memdev devices require holding both the dpa_rwsem and the
> region_rwsem.
>
> In preparation for another caller of these functions that must hold
> the locks upon entry, split the work into a locked and unlocked pair.
>
> Consideration was given to moving the locking to both callers,
> however, the existing caller is not in the core (mem.c) and cannot
> access the locks.
>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Trivial comment to cleanup inline.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> @@ -324,9 +319,26 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
>
> return 0;
> }
> +
> +int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +{
> + int rc;
> +
> + ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> + return rc;
> +
> + ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> + return rc;
> +
> + rc = cxl_inject_poison_locked(cxlmd, dpa);
> +
> + return rc;
return cxl_inject_poison_locked(cxlmd, dpa);
> +}
> EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, "CXL");
> +
> +int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +{
> + int rc;
> +
> + ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> + return rc;
> +
> + ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> + return rc;
> +
> + rc = cxl_clear_poison_locked(cxlmd, dpa);
> +
> + return rc;
> +}
return cxl_clear_poison_locked(cxlmd, dpa);
I wondered if this was to save churn in a later patch, but
not seeing anything related in patch 4.
> EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, "CXL");
* Re: [PATCH v3 4/4] cxl/region: Add inject and clear poison by region offset
2025-07-13 2:37 ` [PATCH v3 4/4] cxl/region: Add inject and clear poison by region offset alison.schofield
2025-07-13 17:23 ` kernel test robot
@ 2025-07-16 11:02 ` Jonathan Cameron
1 sibling, 0 replies; 14+ messages in thread
From: Jonathan Cameron @ 2025-07-16 11:02 UTC (permalink / raw)
To: alison.schofield
Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl
On Sat, 12 Jul 2025 19:37:57 -0700
alison.schofield@intel.com wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> Add CXL region debugfs attributes to inject and clear poison based
> on an offset into the region. These new interfaces allow users to
> operate on poison at the region level without needing to resolve
> Device Physical Addresses (DPA) or target individual memdevs.
>
> The implementation uses a new helper, region_offset_to_dpa_result()
> that applies decoder interleave logic, including XOR-based address
> decoding when applicable. Note that XOR decodes rely on driver
> internal xormaps which are not exposed to userspace. So, this support
> is not only a simplification of poison operations that could be done
> using existing per memdev operations, but also it enables this
> functionality for XOR interleaved regions for the first time.
>
> New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/:
> inject_poison and clear_poison. These are only exposed if all memdevs
> participating in the region support both inject and clear commands,
> ensuring consistent and reliable behavior across multi-device regions.
>
> If tracing is enabled, these operations are logged as cxl_poison
> events in /sys/kernel/tracing/trace.
>
> The ABI documentation warns users of the significant risks that
> come with using these capabilities.
>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
* Re: [PATCH v3 2/4] cxl/region: Introduce SPA to DPA address translation
2025-07-16 10:52 ` Jonathan Cameron
@ 2025-07-16 21:26 ` Alison Schofield
2025-07-21 10:35 ` Jonathan Cameron
2025-07-22 0:49 ` Alison Schofield
1 sibling, 1 reply; 14+ messages in thread
From: Alison Schofield @ 2025-07-16 21:26 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl
On Wed, Jul 16, 2025 at 11:52:26AM +0100, Jonathan Cameron wrote:
> On Sat, 12 Jul 2025 19:37:55 -0700
> alison.schofield@intel.com wrote:
>
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > Add infrastructure to translate System Physical Addresses (SPA) to
> > Device Physical Addresses (DPA) within CXL regions. This capability
> > will be used by follow-on patches that add poison inject and clear
> > operations at the region level.
> >
> > The SPA-to-DPA translation process follows these steps:
> > 1. Apply root decoder transformations (SPA to HPA) if configured.
> > 2. Extract the position in region interleave from the HPA offset.
> > 3. Extract the DPA offset from the HPA offset.
> > 4. Use position to find endpoint decoder.
> > 5. Use endpoint decoder to find memdev and calculate DPA from offset.
> > 6. Return the result - a memdev and a DPA.
> >
> > It is Step 1 above that makes this a driver level operation and not
> > work we can push to user space. Rather than exporting the XOR maps for
> > root decoders configured with XOR interleave, the driver performs this
> > complex calculation for the user.
> >
> > Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
> > Implementation Note: Device Decode Logic.
> >
> > These calculations mirror much of the logic introduced earlier in DPA
> > to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
> > reverse the spec defined 'Deivce Decode Logic'.
>
> Device
>
> >
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Hi Alison,
>
> A few minor things inline.
> > ---
> >
> > Changes in v3:
> > Check return of ways_to_eiw() & granularity_to_eig() (Jonathan)
> > Collapse a header comment into pos and offset blocks (Jonathan)
> > Calc pos for 3,6,12 using actual ways, not always 3 (Jonathan)
> > Mask off bottom bits in < 8 offset case (Jonathan)
> > Wrap code comments closer to column 80 (Jonathan)
> > Use a continue to un-nest a for loop. (Jonathan)
> >
> > drivers/cxl/core/region.c | 93 +++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 93 insertions(+)
> >
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index a2ba19151d4f..404f69864d61 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -2956,6 +2956,99 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> > return hpa;
> > }
> >
> > +struct dpa_result {
> > + struct cxl_memdev *cxlmd;
> > + u64 dpa;
> > +};
> > +
> > +static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
> > + u64 offset,
> > + struct dpa_result *result)
> > +{
> > + struct cxl_region_params *p = &cxlr->params;
> > + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> > + struct cxl_endpoint_decoder *cxled;
> > + u64 hpa, hpa_offset, dpa_offset;
> > + u64 bits_upper, bits_lower;
> > + u16 eig = 0;
> > + u8 eiw = 0;
> > + int pos;
> > +
> > + lockdep_assert_held(&cxl_rwsem.region);
> > + lockdep_assert_held(&cxl_rwsem.dpa);
> > +
> > + if (granularity_to_eig(p->interleave_granularity, &eig) ||
> > + ways_to_eiw(p->interleave_ways, &eiw))
> > + return -ENXIO;
>
> Why eat return values to replace with ENXIO? I'd prefer
>
> rc = granularity_to_eig(p->interleave_granularity, &eig);
> if (rc)
> return rc;
>
> rc = ways_to_eiw(p->interleave_ways, &eiw);
> if (rc)
> return rc;
Ah, I should never have tried to appease you with a shortcut.
Let's revisit.
In v1,v2 it was this-
+ ways_to_eiw(p->interleave_ways, &eiw);
+ granularity_to_eig(p->interleave_granularity, &eig);
And you asked to check the return value. I *should* have said "No, the
input validation ensures these are valid. That is, when storing these
values in the region params, we do eiw_to_ways() and check that rc
value when storing. I'll note that this is not the first instance of
using these functions without checking return value.
How about one of these:
Original:
+ ways_to_eiw(p->interleave_ways, &eiw);
+ granularity_to_eig(p->interleave_granularity, &eig);
Original w comment:
+ /* Input validation ensures valid ways and gran */
+ ways_to_eiw(p->interleave_ways, &eiw);
+ granularity_to_eig(p->interleave_granularity, &eig);
Original w caution:
+ if (!p->interleave_ways || !p->interleave_granularity)
+ return -ENXIO;
+ ways_to_eiw(p->interleave_ways, &eiw);
+ granularity_to_eig(p->interleave_granularity, &eig);
-- Alison
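[Editorial note: the `ways_to_eiw()` / `granularity_to_eig()` helpers being debated encode the CXL interleave parameters. A minimal stand-alone sketch of those encodings per CXL 3.2 (granularity = 256 << eig; ways = 1 << eiw for powers of two, or 3 << (eiw - 8) for the 3/6/12-way interleaves) follows. These are simplified illustrations, not the in-tree implementations:]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified stand-ins for the kernel's granularity_to_eig() and
 * ways_to_eiw() helpers. Encodings per CXL 3.2: granularity is
 * 256 << eig (eig 0..6), ways is 1 << eiw for eiw 0..4, or
 * 3 << (eiw - 8) for eiw 8..10. Illustration only; the in-tree
 * versions differ in detail.
 */
static int granularity_to_eig(unsigned int granularity, uint16_t *eig)
{
	unsigned int g = 256;

	for (uint16_t e = 0; g <= 16384; e++, g <<= 1) {
		if (g == granularity) {
			*eig = e;
			return 0;
		}
	}
	return -1;	/* the kernel helper returns -EINVAL */
}

static int ways_to_eiw(unsigned int ways, uint8_t *eiw)
{
	/* powers of 2: eiw 0..4 encodes 1, 2, 4, 8, 16 ways */
	for (uint8_t e = 0; e <= 4; e++) {
		if (ways == 1u << e) {
			*eiw = e;
			return 0;
		}
	}
	/* multiples of 3: eiw 8..10 encodes 3, 6, 12 ways */
	for (uint8_t e = 8; e <= 10; e++) {
		if (ways == 3u << (e - 8)) {
			*eiw = e;
			return 0;
		}
	}
	return -1;
}
```

Because region creation only stores granularity/ways values that round-trip through these encodings, the lookups cannot fail on a committed region, which is the point being made above.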
* Re: [PATCH v3 2/4] cxl/region: Introduce SPA to DPA address translation
2025-07-16 21:26 ` Alison Schofield
@ 2025-07-21 10:35 ` Jonathan Cameron
0 siblings, 0 replies; 14+ messages in thread
From: Jonathan Cameron @ 2025-07-21 10:35 UTC (permalink / raw)
To: Alison Schofield
Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl
On Wed, 16 Jul 2025 14:26:41 -0700
Alison Schofield <alison.schofield@intel.com> wrote:
> On Wed, Jul 16, 2025 at 11:52:26AM +0100, Jonathan Cameron wrote:
> > On Sat, 12 Jul 2025 19:37:55 -0700
> > alison.schofield@intel.com wrote:
> >
> > > From: Alison Schofield <alison.schofield@intel.com>
> > >
> > > Add infrastructure to translate System Physical Addresses (SPA) to
> > > Device Physical Addresses (DPA) within CXL regions. This capability
> > > will be used by follow-on patches that add poison inject and clear
> > > operations at the region level.
> > >
> > > The SPA-to-DPA translation process follows these steps:
> > > 1. Apply root decoder transformations (SPA to HPA) if configured.
> > > 2. Extract the position in region interleave from the HPA offset.
> > > 3. Extract the DPA offset from the HPA offset.
> > > 4. Use position to find endpoint decoder.
> > > 5. Use endpoint decoder to find memdev and calculate DPA from offset.
> > > 6. Return the result - a memdev and a DPA.
> > >
> > > It is Step 1 above that makes this a driver level operation and not
> > > work we can push to user space. Rather than exporting the XOR maps for
> > > root decoders configured with XOR interleave, the driver performs this
> > > complex calculation for the user.
> > >
> > > Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
> > > Implementation Note: Device Decode Logic.
> > >
> > > These calculations mirror much of the logic introduced earlier in DPA
> > > to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
> > > reverse the spec defined 'Deivce Decode Logic'.
> >
> > Device
> >
> > >
> > > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > Hi Alison,
> >
> > A few minor things inline.
> > > ---
> > >
> > > Changes in v3:
> > > Check return of ways_to_eiw() & granularity_to_eig() (Jonathan)
> > > Collapse a header comment into pos and offset blocks (Jonathan)
> > > Calc pos for 3,6,12 using actual ways, not always 3 (Jonathan)
> > > Mask off bottom bits in < 8 offset case (Jonathan)
> > > Wrap code comments closer to column 80 (Jonathan)
> > > Use a continue to un-nest a for loop. (Jonathan)
> > >
> > > drivers/cxl/core/region.c | 93 +++++++++++++++++++++++++++++++++++++++
> > > 1 file changed, 93 insertions(+)
> > >
> > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > > index a2ba19151d4f..404f69864d61 100644
> > > --- a/drivers/cxl/core/region.c
> > > +++ b/drivers/cxl/core/region.c
> > > @@ -2956,6 +2956,99 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
> > > return hpa;
> > > }
> > >
> > > +struct dpa_result {
> > > + struct cxl_memdev *cxlmd;
> > > + u64 dpa;
> > > +};
> > > +
> > > +static int __maybe_unused region_offset_to_dpa_result(struct cxl_region *cxlr,
> > > + u64 offset,
> > > + struct dpa_result *result)
> > > +{
> > > + struct cxl_region_params *p = &cxlr->params;
> > > + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
> > > + struct cxl_endpoint_decoder *cxled;
> > > + u64 hpa, hpa_offset, dpa_offset;
> > > + u64 bits_upper, bits_lower;
> > > + u16 eig = 0;
> > > + u8 eiw = 0;
> > > + int pos;
> > > +
> > > + lockdep_assert_held(&cxl_rwsem.region);
> > > + lockdep_assert_held(&cxl_rwsem.dpa);
> > > +
> > > + if (granularity_to_eig(p->interleave_granularity, &eig) ||
> > > + ways_to_eiw(p->interleave_ways, &eiw))
> > > + return -ENXIO;
> >
> > Why eat return values to replace with ENXIO? I'd prefer
> >
> > rc = granularity_to_eig(p->interleave_granularity, &eig);
> > if (rc)
> > return rc;
> >
> > rc = ways_to_eiw(p->interleave_ways, &eiw);
> > if (rc)
> > return rc;
>
> Ah, I should never have tried to appease you with a shortcut.
> Let's revisit.
>
> In v1,v2 it was this-
> + ways_to_eiw(p->interleave_ways, &eiw);
> + granularity_to_eig(p->interleave_granularity, &eig);
>
> And you asked to check the return value. I *should* have said "No, the
> input validation ensures these are valid." That is, when storing these
> values in the region params, we do eiw_to_ways() and check that rc
> value when storing. I'll note that this is not the first instance of
> using these functions without checking return value.
>
> How about one of these:
>
> Original:
> + ways_to_eiw(p->interleave_ways, &eiw);
> + granularity_to_eig(p->interleave_granularity, &eig);
>
> Original w comment:
> + /* Input validation ensures valid ways and gran */
> + ways_to_eiw(p->interleave_ways, &eiw);
> + granularity_to_eig(p->interleave_granularity, &eig);
That's fine.
Personally I'd just check them separately and return the errors.
Sometimes it's less effort to add long hand error checks that
never fail (in slow paths!) than it is to explain why not ;)
Jonathan
>
> Original w caution:
> + if (!p->interleave_ways || !p->interleave_granularity)
> + return -ENXIO;
> + ways_to_eiw(p->interleave_ways, &eiw);
> + granularity_to_eig(p->interleave_granularity, &eig);
>
>
> -- Alison
>
>
* Re: [PATCH v3 2/4] cxl/region: Introduce SPA to DPA address translation
2025-07-16 10:52 ` Jonathan Cameron
2025-07-16 21:26 ` Alison Schofield
@ 2025-07-22 0:49 ` Alison Schofield
1 sibling, 0 replies; 14+ messages in thread
From: Alison Schofield @ 2025-07-22 0:49 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl
On Wed, Jul 16, 2025 at 11:52:26AM +0100, Jonathan Cameron wrote:
> On Sat, 12 Jul 2025 19:37:55 -0700
> alison.schofield@intel.com wrote:
>
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > Add infrastructure to translate System Physical Addresses (SPA) to
> > Device Physical Addresses (DPA) within CXL regions. This capability
> > will be used by follow-on patches that add poison inject and clear
> > operations at the region level.
> >
> > The SPA-to-DPA translation process follows these steps:
> > 1. Apply root decoder transformations (SPA to HPA) if configured.
> > 2. Extract the position in region interleave from the HPA offset.
> > 3. Extract the DPA offset from the HPA offset.
> > 4. Use position to find endpoint decoder.
> > 5. Use endpoint decoder to find memdev and calculate DPA from offset.
> > 6. Return the result - a memdev and a DPA.
> >
> > It is Step 1 above that makes this a driver level operation and not
> > work we can push to user space. Rather than exporting the XOR maps for
> > root decoders configured with XOR interleave, the driver performs this
> > complex calculation for the user.
> >
> > Steps 2 and 3 follow the CXL Spec 3.2 Section 8.2.4.20.13
> > Implementation Note: Device Decode Logic.
> >
> > These calculations mirror much of the logic introduced earlier in DPA
> > to SPA translation, see cxl_dpa_to_hpa(), where the driver needed to
> > reverse the spec defined 'Deivce Decode Logic'.
>
> Device
Done
>
> >
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> Hi Alison,
>
> A few minor things inline.
A few responses that I've addressed in v4 -
> > +
> > + if (pos < 0 || pos >= p->nr_targets) {
> > + dev_dbg(&cxlr->dev, "Invalid position %d for %d targets\n",
> > + pos, p->nr_targets);
> > + return -ENXIO;
> > + }
> Maybe a blank line here.
Done
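[Editorial note: the `pos` being range-checked above is extracted from the HPA offset per the spec's device decode. A hedged stand-alone sketch of that extraction (simplified; the in-tree calculation may differ in detail, and `hpa_offset_to_pos()` is an illustrative name):]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Recover the interleave position from an HPA offset.
 * Per CXL 3.2 device decode: the target-select bits sit above
 * the granularity bits [eig+7:0]. For power-of-2 interleaves
 * the low eiw bits of that field select the target; for the
 * 3, 6, 12-way interleaves the selection is modulo the actual
 * number of ways (not always 3).
 */
static int hpa_offset_to_pos(uint64_t hpa_offset, uint16_t eig, uint8_t eiw,
			     unsigned int ways)
{
	uint64_t sel = hpa_offset >> (eig + 8);

	if (eiw < 8)			/* power-of-2 ways */
		return sel & (ways - 1);
	return sel % ways;		/* 3, 6, 12 ways */
}
```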
> > + /*
> > + * DPA offset: CXL Spec 3.2 Section 8.2.4.20.13
> > + * Lower bits [IG+7:0] pass through unchanged
> > + * (eiw < 8)
> > + * Per spec: DPAOffset[51:IG+8] = (HPAOffset[51:IG+IW+8] >> IW)
> > + * Clear the position bits to isolate upper section, then
> > + * reverse the left shift by eiw that occurred during DPA->HPA
> > + * (eiw >= 8)
> > + * Per spec: DPAOffset[51:IG+8] = HPAOffset[51:IG+IW] / 3
> > + * Extract upper bits from the correct bit range and divide by 3
> > + * to recover the original DPA upper bits
> > + */
> > +
>
> and not one here to keep this grouped slightly more closely with the
> code it is talking about.
>
> > + bits_lower = hpa_offset & GENMASK_ULL(eig + 7, 0);
> > + if (eiw < 8) {
> > + hpa_offset &= ~((u64)GENMASK(eiw + eig + 8 - 1, 0));
>
> Really trivial but I'd use eig + eiw + 8 - 1 to align with the parameter
> order in the comment. With matching that comment in mind why not,
>
> dpa_offset = (hpa_offset & GENMASK_ULL(51, eig + eiw + 8)) >> eiw;
> (Go long on this line for readability.) Or use 63 if you prefer for the
> upper bound; it should remain safe.
>
> Not using hpa_offset for the intermediate value avoids ambiguity about
> what the value is at that stage (it's not the hpa_offset anymore).
>
I've flipped to eig + eiw.
I see the ambiguity introduced by modifying hpa_offset in place.
However, I don't think doing the work in one line is needed. The work
flows like the comment:
* Clear the position bits to isolate upper section
temp = hpa_offset & ~((u64)GENMASK(eig + eiw + 8 - 1, 0));
* then reverse the left shift by eiw that occurred during DPA->HPA
dpa_offset = temp >> eiw;
See what you think in next rev.
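[Editorial note: putting the two branches of this discussion together, the offset recovery can be sketched stand-alone as follows. This is illustrative only, under the assumption that it follows CXL 3.2 Section 8.2.4.20.13; `hpa_to_dpa_offset()` is a made-up name and the in-tree code differs in form:]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Recover a DPA offset from an HPA offset. Lower bits [eig+7:0]
 * pass through unchanged. For eiw < 8, clear the position bits
 * and undo the left shift by eiw applied during DPA->HPA. For
 * eiw >= 8 (3, 6, 12-way), the upper bits were multiplied by 3,
 * so take HPAOffset[51:eig+eiw] and divide by 3.
 */
static uint64_t hpa_to_dpa_offset(uint64_t hpa_offset, uint16_t eig, uint8_t eiw)
{
	uint64_t bits_lower = hpa_offset & ((1ull << (eig + 8)) - 1);
	uint64_t temp, dpa_offset;

	if (eiw < 8) {
		/* clear position bits, then reverse the shift by eiw */
		temp = hpa_offset & ~((1ull << (eig + eiw + 8)) - 1);
		dpa_offset = temp >> eiw;
	} else {
		/* drop position bits, divide the upper bits by 3 */
		temp = hpa_offset >> (eig + eiw);
		dpa_offset = (temp / 3) << (eig + 8);
	}
	return dpa_offset | bits_lower;
}
```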
>
* Re: [PATCH v3 3/4] cxl/core: Add locked variants of the poison inject and clear funcs
2025-07-16 10:58 ` Jonathan Cameron
@ 2025-07-22 1:11 ` Alison Schofield
0 siblings, 0 replies; 14+ messages in thread
From: Alison Schofield @ 2025-07-22 1:11 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
Dan Williams, linux-cxl
On Wed, Jul 16, 2025 at 11:58:09AM +0100, Jonathan Cameron wrote:
> On Sat, 12 Jul 2025 19:37:56 -0700
> alison.schofield@intel.com wrote:
>
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > The core functions that validate and send inject and clear commands
> > to the memdev devices require holding both the dpa_rwsem and the
> > region_rwsem.
> >
> > In preparation for another caller of these functions that must hold
> > the locks upon entry, split the work into a locked and unlocked pair.
> >
> > Consideration was given to moving the locking to both callers,
> > however, the existing caller is not in the core (mem.c) and cannot
> > access the locks.
> >
> > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
>
> Trivial comment to cleanup inline.
>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>
> > @@ -324,9 +319,26 @@ int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> >
> > return 0;
> > }
> > +
> > +int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
> > +{
> > + int rc;
> > +
> > + ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> > + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> > + return rc;
> > +
> > + ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> > + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> > + return rc;
> > +
> > + rc = cxl_inject_poison_locked(cxlmd, dpa);
> > +
> > + return rc;
>
> return cxl_inject_poison_locked(cxlmd, dpa);
>
> > +}
> > EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, "CXL");
>
> > +
> > +int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
> > +{
> > + int rc;
> > +
> > + ACQUIRE(rwsem_read_intr, region_rwsem)(&cxl_rwsem.region);
> > + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &region_rwsem)))
> > + return rc;
> > +
> > + ACQUIRE(rwsem_read_intr, dpa_rwsem)(&cxl_rwsem.dpa);
> > + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &dpa_rwsem)))
> > + return rc;
> > +
> > + rc = cxl_clear_poison_locked(cxlmd, dpa);
> > +
> > + return rc;
> > +}
>
> return cxl_clear_poison_locked(cxlmd, dpa);
>
> I wondered if this was to save churn in a later patch, but
> not seeing anything related in patch 4.
No savings. I'll collapse the returns for both funcs.
Thanks!
>
> > EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, "CXL");
>
>
Thread overview: 14+ messages
2025-07-13 2:37 [PATCH v3 0/4] cxl: Support Poison Inject & Clear by Region Offset alison.schofield
2025-07-13 2:37 ` [PATCH v3 1/4] cxl: Define a SPA->CXL HPA root decoder callback for XOR Math alison.schofield
2025-07-13 2:37 ` [PATCH v3 2/4] cxl/region: Introduce SPA to DPA address translation alison.schofield
2025-07-16 10:52 ` Jonathan Cameron
2025-07-16 21:26 ` Alison Schofield
2025-07-21 10:35 ` Jonathan Cameron
2025-07-22 0:49 ` Alison Schofield
2025-07-13 2:37 ` [PATCH v3 3/4] cxl/core: Add locked variants of the poison inject and clear funcs alison.schofield
2025-07-15 21:33 ` Dave Jiang
2025-07-16 10:58 ` Jonathan Cameron
2025-07-22 1:11 ` Alison Schofield
2025-07-13 2:37 ` [PATCH v3 4/4] cxl/region: Add inject and clear poison by region offset alison.schofield
2025-07-13 17:23 ` kernel test robot
2025-07-16 11:02 ` Jonathan Cameron