[PATCH v4 0/2] Support zero-sized HDM decoders

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v4 0/2] Support zero-sized HDM decoders
@ 2026-06-07  8:13 Richard Cheng
  2026-06-07  8:13 ` [PATCH v4 1/2] cxl/hdm: Allow zero sized " Richard Cheng
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Richard Cheng @ 2026-06-07  8:13 UTC (permalink / raw)
  To: dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, dan.j.williams
  Cc: linux-cxl, linux-kernel, newtonl, kristinc, kaihengf, kobak,
	vaslot, smadhavan, Richard Cheng

Hello,

This v4 continues Vishal Aslot's "Support zero-sized decoders" series [1]
and addresses the v3 review of patch 1's port->hdm_end handling [2].

CXL r3.2 §8.2.4.20.12 and §14.13.10 permit committing an HDM decoder with
size 0. BIOS commits and LOCKs such decoders to burn the trailing, unused
slots so the OS cannot program regions through them, e.g. a Type 3 device
in a Trusted Computing Base (TCB) established via the Trusted Security
Protocol (TSP). init_hdm_decoder() rejected these with -ENXIO during port
enumeration and aborted the whole port, so affected systems showed nothing
under 'cxl list'.

Patch 1 enumerates the decoder into the topology with its HW-reported LOCK
state and skips the DPA reservation it does not need.

On port->hdm_end (the v3 review): v3 advanced the watermark for the
zero-size decoder. sashiko correctly noted the write was outside
cxl_rwsem.dpa, and that advancing it without a balanced release strands
hdm_end -- cxl_dpa_free() returns early on !dpa_res, so it can never be
decremented past the zero-size id, breaking LIFO teardown of lower
decoders. v4 therefore does not touch hdm_end at all. The in-order check
in __cxl_dpa_reserve() is its only consumer and is never legitimately
reached past such a decoder: the burned slots are trailing, so enumeration
reserves no committed decoder after one, and the OS must not program a
region through a locked slot. hdm_end stays at the last sized reservation,
which is accurate. IMHO, if a non-trailing zero-size layout ever needs
support, the check should key off commit_end rather than hdm_end,
out of scope here.

Patch 1 also carries the poison fix: commit_end may now reference a decoder
with no DPA resource, so poison_by_decoder() falls through to
cxl_get_poison_unmapped() and collects poison from the unmapped DPA tail.

Patch 2 adds cxl_test coverage, gated behind a new mock_zero_size_decoders
module parameter (default off). v2 committed the mock slots unconditionally
on the host-bridge0 auto-region endpoints, which the region tests reuse,
regressing 7 of 17 cxl unit tests; defaulting off leaves the shared
topology untouched, and enabling it reproduces the BIOS layout.

Tested with the ndctl cxl unit suite: param off, no regressions (13 pass,
1 environmental skip); param on, cxl-poison's by-memdev-by-dpa case returns
all 4 records. cxl_test exercises the topology
and poison handling; the init_hdm_decoder enumeration path is validated on
real hardware.

The result is in the following:
"""
$ sudo env "PATH=$PATH" meson test -C build --suite cxl \
> --num-processes 1 -t 6 \
> --print-errorlogs
ninja: Entering directory `/home/nvidia/ndctl/build'
[1/50] Generating version.h with a custom command
 1/14 ndctl:cxl / cxl-topology.sh               OK                3.38s
 2/14 ndctl:cxl / cxl-region-sysfs.sh           OK                2.60s
 3/14 ndctl:cxl / cxl-labels.sh                 OK                2.53s
 4/14 ndctl:cxl / cxl-create-region.sh          OK                3.25s
 5/14 ndctl:cxl / cxl-xor-region.sh             OK                2.61s
 6/14 ndctl:cxl / cxl-events.sh                 OK                2.46s
 7/14 ndctl:cxl / cxl-sanitize.sh               OK                5.42s
 8/14 ndctl:cxl / cxl-destroy-region.sh         OK                2.47s
 9/14 ndctl:cxl / cxl-qos-class.sh              OK                2.54s
10/14 ndctl:cxl / cxl-translate.sh              OK                0.78s
11/14 ndctl:cxl / cxl-elc.sh                    OK                2.45s
12/14 ndctl:cxl / cxl-security.sh               SKIP              0.02s   exit status 77
13/14 ndctl:cxl / cxl-features.sh               OK                1.27s
14/14 ndctl:cxl / cxl-poison.sh                 OK                7.91s

Ok:                 13  
Expected Fail:      0   
Fail:               0   
Unexpected Pass:    0   
Skipped:            1   
Timeout:            0   

Full log written to /home/nvidia/ndctl/build/meson-logs/testlog.txt
"""

And the tests Alison mentioned.
"""
$ sudo env "PATH=$PATH" meson test -C build --num-processes 1 -t 6 --print-errorlogs \
> cxl-region-sysfs.sh cxl-create-region.sh cxl-xor-region.sh \
    cxl-destroy-region.sh cxl-qos-class.sh cxl-poison.sh
ninja: Entering directory `/home/nvidia/ndctl/build'
[1/50] Generating version.h with a custom command
1/6 ndctl:cxl / cxl-region-sysfs.sh          OK                2.62s
2/6 ndctl:cxl / cxl-create-region.sh         OK                3.15s
3/6 ndctl:cxl / cxl-xor-region.sh            OK                2.61s
4/6 ndctl:cxl / cxl-destroy-region.sh        OK                2.39s
5/6 ndctl:cxl / cxl-qos-class.sh             OK                2.47s
6/6 ndctl:cxl / cxl-poison.sh                OK                7.80s

Ok:                 6   
Expected Fail:      0   
Fail:               0   
Unexpected Pass:    0   
Skipped:            0   
Timeout:            0   

Full log written to /home/nvidia/ndctl/build/meson-logs/testlog.txt
"""

[1] https://lore.kernel.org/all/20251015024019.1189713-1-vaslot@nvidia.com/T/#u
[2] https://lore.kernel.org/all/20260607053837.4389-1-icheng@nvidia.com/

Richard Cheng (2):
  cxl/hdm: Allow zero sized HDM decoders
  tools/testing/cxl: Enable zero sized decoder under hb0

 drivers/cxl/core/hdm.c       | 23 +++++++---
 drivers/cxl/core/region.c    | 42 +++++++++---------
 tools/testing/cxl/test/cxl.c | 83 +++++++++++++++++++++++++++++++++---
 3 files changed, 115 insertions(+), 33 deletions(-)

base-commit: ddd664bbff63e09e7a7f9acae9c43605d4cf185f
-- 
2.43.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v4 1/2] cxl/hdm: Allow zero sized HDM decoders
  2026-06-07  8:13 [PATCH v4 0/2] Support zero-sized HDM decoders Richard Cheng
@ 2026-06-07  8:13 ` Richard Cheng
  2026-06-07  8:30   ` sashiko-bot
  2026-06-07  8:13 ` [PATCH v4 2/2] tools/testing/cxl: Enable zero sized decoder under hb0 Richard Cheng
  2026-06-09 23:13 ` [PATCH v4 0/2] Support zero-sized HDM decoders Dan Williams (nvidia)
  2 siblings, 1 reply; 7+ messages in thread
From: Richard Cheng @ 2026-06-07  8:13 UTC (permalink / raw)
  To: dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, dan.j.williams
  Cc: linux-cxl, linux-kernel, newtonl, kristinc, kaihengf, kobak,
	vaslot, smadhavan, Richard Cheng

CXL r3.2 8.2.4.20.12 and 14.13.10 permit committing an HDM decoder with
size 0. BIOS commits and locks such decoders to burn the trailing, unused
slots so the OS cannot program regions through them, e.g. a Type 3 device
in a TSP-established TCB. init_hdm_decoder() rejected these with -ENXIO and
aborted port enumeration, so "cxl list" showed nothing.

Enumerate the decoder with its hardware LOCK state and skip the DPA
reservation it does not need.

commit_end may now reference a decoder with no DPA resource, so
poison_by_decoder() must still scan the unmapped DPA tail.

Signed-off-by: Vishal Aslot <vaslot@nvidia.com>
Signed-off-by: Richard Cheng <icheng@nvidia.com>
---
v3->v4:
      - Drop the port->hdm_end advance for zero-size decoders (sashiko AI
        review). The write was outside cxl_rwsem.dpa, and advancing the
        watermark without a balanced release strands hdm_end and breaks
        LIFO teardown. It is also unnecessary -- zero-size committed slots
        are trailing and locked, so the in-order check in
        __cxl_dpa_reserve() is never reached past them.

v2->v3:
      - Advance port->hdm_end for the committed zero-size decoder so a
        following committed decoder still passes the in-order check in
        __cxl_dpa_reserve() (it was left un-incremented in v2).
      - Fold the poison fix into this patch: commit_end may now reference a
        zero-size decoder with no DPA resource, so poison_by_decoder() falls
        through to run cxl_get_poison_unmapped() and scan the unmapped DPA
        tail.

v1->v2:
      - Add zero-size committed decoders to the topology instead of
        skipping them. Drop v1's -ENOSPC sentinel and the matching
        "continue" in devm_cxl_enumerate_decoders(); fall through so
        add_hdm_decoder() registers the decoder.
      - Set port->commit_end unconditionally for any committed decoder,
        not only non-zero-size ones, so subsequent decoders satisfy the
        out-of-order check.
      - Add an explicit early-return before devm_cxl_dpa_reserve() in the
        endpoint-decoder path. __cxl_dpa_reserve() rejects zero-size
        decoders.
      - Spell out TSP and TCB and cite spec sections in commit message.
      - Reorder series, implementation first.
---
 drivers/cxl/core/hdm.c    | 23 +++++++++++++++------
 drivers/cxl/core/region.c | 42 +++++++++++++++++++--------------------
 2 files changed, 38 insertions(+), 27 deletions(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 0c80b76a5f9b..b61f51134551 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -1031,13 +1031,17 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 			return -ENXIO;
 		}
 
-		if (size == 0) {
-			dev_warn(&port->dev,
-				 "decoder%d.%d: Committed with zero size\n",
-				 port->id, cxld->id);
-			return -ENXIO;
-		}
 		port->commit_end = cxld->id;
+
+		/*
+		 * CXL r3.2 8.2.4.20.12 permits committing an HDM decoder with
+		 * size 0. Enumerate it into the topology with its HW-reported
+		 * LOCK state instead of aborting the port.
+		 */
+		if (size == 0)
+			dev_dbg(&port->dev,
+				"decoder%d.%d: Committed with zero size\n",
+				port->id, cxld->id);
 	} else {
 		if (cxled) {
 			struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
@@ -1096,6 +1100,13 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 	if (!committed)
 		return 0;
 
+	/*
+	 * A committed zero-size decoder reserves no DPA. Leave port->hdm_end
+	 * untouched.
+	 */
+	if (size == 0)
+		return 0;
+
 	dpa_size = div_u64_rem(size, cxld->interleave_ways, &remainder);
 	if (remainder) {
 		dev_err(&port->dev,
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index e50dc716d4e8..a353d8e7489d 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2907,38 +2907,38 @@ static int poison_by_decoder(struct device *dev, void *arg)
 		return rc;
 
 	cxled = to_cxl_endpoint_decoder(dev);
-	if (!cxled->dpa_res)
-		return rc;
 
-	cxlmd = cxled_to_memdev(cxled);
-	cxlds = cxlmd->cxlds;
-	mode = cxlds->part[cxled->part].mode;
+	if (cxled->dpa_res) {
+		cxlmd = cxled_to_memdev(cxled);
+		cxlds = cxlmd->cxlds;
+		mode = cxlds->part[cxled->part].mode;
+
+		if (cxled->skip) {
+			offset = cxled->dpa_res->start - cxled->skip;
+			length = cxled->skip;
+			rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+			if (rc == -EFAULT && mode == CXL_PARTMODE_RAM)
+				rc = 0;
+			if (rc)
+				return rc;
+		}
 
-	if (cxled->skip) {
-		offset = cxled->dpa_res->start - cxled->skip;
-		length = cxled->skip;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		offset = cxled->dpa_res->start;
+		length = cxled->dpa_res->end - offset + 1;
+		rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region);
 		if (rc == -EFAULT && mode == CXL_PARTMODE_RAM)
 			rc = 0;
 		if (rc)
 			return rc;
-	}
 
-	offset = cxled->dpa_res->start;
-	length = cxled->dpa_res->end - offset + 1;
-	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region);
-	if (rc == -EFAULT && mode == CXL_PARTMODE_RAM)
-		rc = 0;
-	if (rc)
-		return rc;
-
-	/* Iterate until commit_end is reached */
-	if (cxled->cxld.id == ctx->port->commit_end) {
 		ctx->offset = cxled->dpa_res->end + 1;
 		ctx->part = cxled->part;
-		return 1;
 	}
 
+	/* Iterate until commit_end is reached */
+	if (cxled->cxld.id == ctx->port->commit_end)
+		return 1;
+
 	return 0;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v4 2/2] tools/testing/cxl: Enable zero sized decoder under hb0
  2026-06-07  8:13 [PATCH v4 0/2] Support zero-sized HDM decoders Richard Cheng
  2026-06-07  8:13 ` [PATCH v4 1/2] cxl/hdm: Allow zero sized " Richard Cheng
@ 2026-06-07  8:13 ` Richard Cheng
  2026-06-09 23:13 ` [PATCH v4 0/2] Support zero-sized HDM decoders Dan Williams (nvidia)
  2 siblings, 0 replies; 7+ messages in thread
From: Richard Cheng @ 2026-06-07  8:13 UTC (permalink / raw)
  To: dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, dan.j.williams
  Cc: linux-cxl, linux-kernel, newtonl, kristinc, kaihengf, kobak,
	vaslot, smadhavan, Richard Cheng

The kernel now allows committed HDM decoders of zero size so BIOS can
burn slots with LOCK; cxl_test needs to exercise that path.

Add a mock_zero_size_decoders module parameter (default off). When set,
the special endpoints under host-bridge0 (cxl_mem.0 and cxl_mem.4) commit
decoders 1 and 2 as zero-size + locked above the decoder[0] auto-region,
mirrored on the parent switch and host bridge. commit_end then lands on a
decoder with no DPA resource, exercising the new enumeration and
poison-by-endpoint paths.

Gating keeps the default topology, shared by the rest of the cxl suite,
unchanged.

Signed-off-by: Vishal Aslot <vaslot@nvidia.com>
Signed-off-by: Richard Cheng <icheng@nvidia.com>
---
v3->v4:
      - No change.

v2->v3:
      - Gate the zero-size + locked decoder injection behind a new
        mock_zero_size_decoders module parameter (default off). v2 applied
        it unconditionally on the host-bridge0 auto-region endpoints, which
        the region test suite reuses, regressing 7 of 17 cxl unit tests;
        defaulting off leaves the shared topology untouched.

v1->v2:
      - Replace second_decoder(), third_decoder() with a single
        match_decoder_by_index() helper, so all lookups share one matcher.
      - Use DEFINE_RANGE() for the empty range instead of an open-coded
        struct.
      - Set cxled->state = CXL_DECODER_STATE_MANUAL rather than STATE_AUTO.
      - Set CXL_DECODER_F_LOCK on the mock zero-size decoders to model the
        BIOS-burns-slots case.
---
 tools/testing/cxl/test/cxl.c | 83 +++++++++++++++++++++++++++++++++---
 1 file changed, 77 insertions(+), 6 deletions(-)

diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 296516eecfd6..190cb18d6932 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -17,6 +17,7 @@
 static int interleave_arithmetic;
 static bool extended_linear_cache;
 static bool fail_autoassemble;
+static bool mock_zero_size_decoders;
 
 #define FAKE_QTG_ID	42
 
@@ -1041,16 +1042,47 @@ static void default_mock_decoder(struct cxl_decoder *cxld)
 	WARN_ON_ONCE(!cxld_registry_new(cxld));
 }
 
-static int first_decoder(struct device *dev, const void *data)
+static int match_decoder_by_index(struct device *dev, const void *data)
 {
+	int target_id = *(const int *)data;
 	struct cxl_decoder *cxld;
 
 	if (!is_switch_decoder(dev))
 		return 0;
 	cxld = to_cxl_decoder(dev);
-	if (cxld->id == 0)
-		return 1;
-	return 0;
+	return cxld->id == target_id;
+}
+
+/*
+ * Mock a BIOS-burnt slot: a committed, locked, zero-size decoder
+ * (CXL r3.2 8.2.4.20.12). Gated by the mock_zero_size_decoders module
+ * param so the default cxl_test topology, shared by the region test
+ * suite, is left undisturbed.
+ */
+static void size_zero_mock_decoder_ep(struct cxl_decoder *cxld, u64 base)
+{
+	struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(&cxld->dev);
+
+	cxld->hpa_range = DEFINE_RANGE(base, base - 1);
+	cxld->interleave_ways = 2;
+	cxld->interleave_granularity = 4096;
+	cxld->target_type = CXL_DECODER_HOSTONLYMEM;
+	cxld->flags = CXL_DECODER_F_ENABLE | CXL_DECODER_F_LOCK;
+	cxled->state = CXL_DECODER_STATE_MANUAL;
+	cxld->commit = mock_decoder_commit;
+	cxld->reset = mock_decoder_reset;
+}
+
+static void size_zero_mock_decoder_sw(struct cxl_decoder *cxld, u64 base,
+				      int level)
+{
+	cxld->flags = CXL_DECODER_F_ENABLE | CXL_DECODER_F_LOCK;
+	cxld->target_type = CXL_DECODER_HOSTONLYMEM;
+	cxld->interleave_ways = level == 0 ? 2 : 1;
+	cxld->interleave_granularity = 4096;
+	cxld->hpa_range = DEFINE_RANGE(base, base - 1);
+	cxld->commit = mock_decoder_commit;
+	cxld->reset = mock_decoder_reset;
 }
 
 /*
@@ -1131,7 +1163,7 @@ static bool mock_init_hdm_decoder(struct cxl_decoder *cxld)
 	 * See 'cxl list -BMPu -m cxl_mem.0,cxl_mem.4'
 	 */
 	if (!is_endpoint_decoder(&cxld->dev) || !hb0 || pdev->id % 4 ||
-	    pdev->id > 4 || cxld->id > 0) {
+	    pdev->id > 4 || cxld->id > (mock_zero_size_decoders ? 2 : 0)) {
 		default_mock_decoder(cxld);
 		return false;
 	}
@@ -1145,6 +1177,20 @@ static bool mock_init_hdm_decoder(struct cxl_decoder *cxld)
 	base = window->base_hpa;
 	if (extended_linear_cache)
 		base += mock_auto_region_size;
+
+	/*
+	 * With mock_zero_size_decoders, decoders 1 and 2 of the special
+	 * endpoints mock BIOS-burnt zero-size + locked slots above the
+	 * decoder[0] auto-region (CXL r3.2 8.2.4.20.12). commit_end then
+	 * points at a decoder with no DPA resource, exercising the
+	 * zero-size enumeration and poison-by-endpoint code paths.
+	 */
+	if (cxld->id == 1 || cxld->id == 2) {
+		size_zero_mock_decoder_ep(cxld, base);
+		port->commit_end = cxld->id;
+		WARN_ON_ONCE(!cxld_registry_new(cxld));
+		return false;
+	}
 	cxld->hpa_range = (struct range) {
 		.start = base,
 		.end = base + mock_auto_region_size - 1,
@@ -1168,9 +1214,11 @@ static bool mock_init_hdm_decoder(struct cxl_decoder *cxld)
 	 */
 	iter = port;
 	for (i = 0; i < 2; i++) {
+		int id = 0;
+
 		dport = iter->parent_dport;
 		iter = dport->port;
-		dev = device_find_child(&iter->dev, NULL, first_decoder);
+		dev = device_find_child(&iter->dev, &id, match_decoder_by_index);
 		/*
 		 * Ancestor ports are guaranteed to be enumerated before
 		 * @port, and all ports have at least one decoder.
@@ -1214,6 +1262,26 @@ static bool mock_init_hdm_decoder(struct cxl_decoder *cxld)
 
 		cxld_registry_update(cxld);
 		put_device(dev);
+
+		if (!mock_zero_size_decoders)
+			continue;
+
+		/*
+		 * Mirror the endpoint: commit the next two switch decoders
+		 * as zero-size + locked so the burnt-slot layout extends
+		 * end-to-end through the switch and host bridge.
+		 */
+		for (id = 1; id <= 2; id++) {
+			dev = device_find_child(&iter->dev, &id,
+						match_decoder_by_index);
+			if (WARN_ON(!dev))
+				continue;
+			cxld = to_cxl_decoder(dev);
+			size_zero_mock_decoder_sw(cxld, base, i);
+			iter->commit_end = id;
+			cxld_registry_update(cxld);
+			put_device(dev);
+		}
 	}
 
 	return false;
@@ -2030,6 +2098,9 @@ module_param(extended_linear_cache, bool, 0444);
 MODULE_PARM_DESC(extended_linear_cache, "Enable extended linear cache support");
 module_param(fail_autoassemble, bool, 0444);
 MODULE_PARM_DESC(fail_autoassemble, "Simulate missing member of an auto-region");
+module_param(mock_zero_size_decoders, bool, 0444);
+MODULE_PARM_DESC(mock_zero_size_decoders,
+		 "Mock BIOS-burnt committed zero-size locked decoders under host-bridge0");
 module_init(cxl_test_init);
 module_exit(cxl_test_exit);
 MODULE_LICENSE("GPL v2");
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/2] cxl/hdm: Allow zero sized HDM decoders
  2026-06-07  8:13 ` [PATCH v4 1/2] cxl/hdm: Allow zero sized " Richard Cheng
@ 2026-06-07  8:30   ` sashiko-bot
  2026-06-09 23:06     ` Dan Williams (nvidia)
  0 siblings, 1 reply; 7+ messages in thread
From: sashiko-bot @ 2026-06-07  8:30 UTC (permalink / raw)
  To: Richard Cheng; +Cc: linux-cxl

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

New issues:
- [High] Logic error in poison_by_decoder silently skips unmapped DPA poison collection for endpoints with a zero-sized first decoder.

Pre-existing issues:
- [High] Stack buffer over-read in init_hdm_decoder leaks kernel stack memory to userspace.
--

commit b5ef0cd71d8dea661d93a7fc863b0b6ecc394576
Author: Richard Cheng <icheng@nvidia.com>

cxl/hdm: Allow zero sized HDM decoders

CXL r3.2 8.2.4.20.12 and 14.13.10 permit committing an HDM decoder with
size 0. This patch allows init_hdm_decoder() to enumerate these decoders
with their hardware LOCK state instead of aborting the port. It also
modifies poison_by_decoder() to scan the unmapped DPA tail when a
decoder has no DPA resource.

> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 0c80b76a5f9b4..b61f51134551d 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c

[ ... ]

> @@ -1096,6 +1100,13 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
>  	if (!committed)
>  		return 0;
>  
> +	/*
> +	 * A committed zero-size decoder reserves no DPA. Leave port->hdm_end
> +	 * untouched.
> +	 */
> +	if (size == 0)
> +		return 0;
> +
>  	dpa_size = div_u64_rem(size, cxld->interleave_ways, &remainder);

[Severity: High]
This is a pre-existing issue, but does the interleave ways configuration
here allow reading out of bounds of the target_list union higher up in
init_hdm_decoder()?

drivers/cxl/core/hdm.c:init_hdm_decoder() {
    ...
    union {
        u64 value;
        unsigned char target_id[8];
    } target_list;
    ...
    if (!cxled) {
        ...
        for (i = 0; i < cxld->interleave_ways; i++)
            cxld->target_map[i] = target_list.target_id[i];
        return 0;
    }
    ...
}

Since CXL 3.0 supports up to 16-way interleaving, if interleave_ways is
greater than 8, will this loop read past the 8-byte target_list.target_id
array and copy uninitialized kernel stack memory into target_map?

> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index e50dc716d4e82..a353d8e7489d3 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c

[ ... ]

> @@ -2907,38 +2907,38 @@ static int poison_by_decoder(struct device *dev, void *arg)

[ ... ]

> -	/* Iterate until commit_end is reached */
> -	if (cxled->cxld.id == ctx->port->commit_end) {
>  		ctx->offset = cxled->dpa_res->end + 1;
>  		ctx->part = cxled->part;
> -		return 1;
>  	}
>  
> +	/* Iterate until commit_end is reached */
> +	if (cxled->cxld.id == ctx->port->commit_end)
> +		return 1;

[Severity: High]
If this decoder is zero-sized and has no DPA resource, we skip updating
ctx->part and return 1 here.

If this is the first decoder on the device, ctx->part remains -1. Will
this cause cxl_get_poison_unmapped() to immediately return 0 when it
checks for part < 0:

drivers/cxl/core/region.c:cxl_get_poison_unmapped() {
    ...
    if (ctx->part < 0)
        return 0;
    ...
}

Could this result in silently skipping the poison collection for the entire
unmapped DPA space instead of scanning it as intended?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260607081345.61954-1-icheng@nvidia.com?part=1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/2] cxl/hdm: Allow zero sized HDM decoders
  2026-06-07  8:30   ` sashiko-bot
@ 2026-06-09 23:06     ` Dan Williams (nvidia)
  0 siblings, 0 replies; 7+ messages in thread
From: Dan Williams (nvidia) @ 2026-06-09 23:06 UTC (permalink / raw)
  To: sashiko-bot, Richard Cheng; +Cc: linux-cxl

sashiko-bot@ wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> 
> New issues:
> - [High] Logic error in poison_by_decoder silently skips unmapped DPA poison collection for endpoints with a zero-sized first decoder.
> 
> Pre-existing issues:
> - [High] Stack buffer over-read in init_hdm_decoder leaks kernel stack memory to userspace.
> --
> 
> commit b5ef0cd71d8dea661d93a7fc863b0b6ecc394576
> Author: Richard Cheng <icheng@nvidia.com>
> 
> cxl/hdm: Allow zero sized HDM decoders
> 
> CXL r3.2 8.2.4.20.12 and 14.13.10 permit committing an HDM decoder with
> size 0. This patch allows init_hdm_decoder() to enumerate these decoders
> with their hardware LOCK state instead of aborting the port. It also
> modifies poison_by_decoder() to scan the unmapped DPA tail when a
> decoder has no DPA resource.
> 
> > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > index 0c80b76a5f9b4..b61f51134551d 100644
> > --- a/drivers/cxl/core/hdm.c
> > +++ b/drivers/cxl/core/hdm.c
> 
> [ ... ]
> 
> > @@ -1096,6 +1100,13 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
> >  	if (!committed)
> >  		return 0;
> >  
> > +	/*
> > +	 * A committed zero-size decoder reserves no DPA. Leave port->hdm_end
> > +	 * untouched.
> > +	 */
> > +	if (size == 0)
> > +		return 0;
> > +
> >  	dpa_size = div_u64_rem(size, cxld->interleave_ways, &remainder);
> 
> [Severity: High]
> This is a pre-existing issue, but does the interleave ways configuration
> here allow reading out of bounds of the target_list union higher up in
> init_hdm_decoder()?
> 
> drivers/cxl/core/hdm.c:init_hdm_decoder() {
>     ...
>     union {
>         u64 value;
>         unsigned char target_id[8];
>     } target_list;
>     ...
>     if (!cxled) {
>         ...
>         for (i = 0; i < cxld->interleave_ways; i++)
>             cxld->target_map[i] = target_list.target_id[i];
>         return 0;
>     }
>     ...
> }
> 
> Since CXL 3.0 supports up to 16-way interleaving, if interleave_ways is
> greater than 8, will this loop read past the 8-byte target_list.target_id
> array and copy uninitialized kernel stack memory into target_map?

No, hardware should never return an IW setting > 8 for a switch port,
but that could be made more explicit in the code.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 0/2] Support zero-sized HDM decoders
  2026-06-07  8:13 [PATCH v4 0/2] Support zero-sized HDM decoders Richard Cheng
  2026-06-07  8:13 ` [PATCH v4 1/2] cxl/hdm: Allow zero sized " Richard Cheng
  2026-06-07  8:13 ` [PATCH v4 2/2] tools/testing/cxl: Enable zero sized decoder under hb0 Richard Cheng
@ 2026-06-09 23:13 ` Dan Williams (nvidia)
  2026-06-09 23:37   ` Dan Williams (nvidia)
  2 siblings, 1 reply; 7+ messages in thread
From: Dan Williams (nvidia) @ 2026-06-09 23:13 UTC (permalink / raw)
  To: Richard Cheng, dave, jonathan.cameron, dave.jiang,
	alison.schofield, vishal.l.verma, ira.weiny, dan.j.williams
  Cc: linux-cxl, linux-kernel, newtonl, kristinc, kaihengf, kobak,
	vaslot, smadhavan, Richard Cheng

Richard Cheng wrote:
> Hello,
> 
> This v4 continues Vishal Aslot's "Support zero-sized decoders" series [1]
> and addresses the v3 review of patch 1's port->hdm_end handling [2].
> 
> CXL r3.2 §8.2.4.20.12 and §14.13.10 permit committing an HDM decoder with
> size 0. BIOS commits and LOCKs such decoders to burn the trailing, unused
> slots so the OS cannot program regions through them, e.g. a Type 3 device
> in a Trusted Computing Base (TCB) established via the Trusted Security
> Protocol (TSP). init_hdm_decoder() rejected these with -ENXIO during port
> enumeration and aborted the whole port, so affected systems showed nothing
> under 'cxl list'.
> 
> Patch 1 enumerates the decoder into the topology with its HW-reported LOCK
> state and skips the DPA reservation it does not need.
> 
> On port->hdm_end (the v3 review): v3 advanced the watermark for the
> zero-size decoder. sashiko correctly noted the write was outside
> cxl_rwsem.dpa, and that advancing it without a balanced release strands
> hdm_end -- cxl_dpa_free() returns early on !dpa_res, so it can never be
> decremented past the zero-size id, breaking LIFO teardown of lower
> decoders. v4 therefore does not touch hdm_end at all. The in-order check
> in __cxl_dpa_reserve() is its only consumer and is never legitimately
> reached past such a decoder: the burned slots are trailing, so enumeration
> reserves no committed decoder after one, and the OS must not program a
> region through a locked slot. hdm_end stays at the last sized reservation,
> which is accurate. IMHO, if a non-trailing zero-size layout ever needs
> support, the check should key off commit_end rather than hdm_end,
> out of scope here.

I am not comfortable with this outcome. It assumes that zero-sized
decoders are always committed. I would much rather keep the meaning of
hdm_end as the marker of the last decoder set aside for a reservation.

Consider the case of a zero-sized decoder in the PMEM partition with the
RAM partition not fully allocated. That decoder would have a non-zero
skip and zero sized DPA allocation to arrange for the next decoder to
start at the beginning of the PMEM partition. Otherwise, it would seem
to need more special casing to understand which decoder is responsible
for carrying the skip.

I think the way to solve this is something like below (untested). It
keeps @hdm_end aligned with the available decoders, and tracks the start
of zero allocations relative to their skip. I believe it may also
address the Sashiko report.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 0/2] Support zero-sized HDM decoders
  2026-06-09 23:13 ` [PATCH v4 0/2] Support zero-sized HDM decoders Dan Williams (nvidia)
@ 2026-06-09 23:37   ` Dan Williams (nvidia)
  0 siblings, 0 replies; 7+ messages in thread
From: Dan Williams (nvidia) @ 2026-06-09 23:37 UTC (permalink / raw)
  To: Dan Williams (nvidia), Richard Cheng, dave, jonathan.cameron,
	dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
	dan.j.williams
  Cc: linux-cxl, linux-kernel, newtonl, kristinc, kaihengf, kobak,
	vaslot, smadhavan, Richard Cheng

Dan Williams (nvidia) wrote:
[..]
> I think the way to solve this is something like below (untested).

...whoops and unset apparently.

> It keeps @hdm_end aligned with the available decoders, and tracks the
> start of zero allocations relative to their skip. I believe it may
> also address the Sashiko report.

-- >8 --
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 0c80b76a5f9b..3f9d97c9a3b7 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -209,6 +209,14 @@ void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_dpa_debug, "CXL");
 
+static void cxl_dpa_release_region(struct resource *parent, struct resource *res)
+{
+	/* zero sized decoders are not tracked in tree */
+	if (resource_size(res) == 0)
+		kfree(res);
+	__release_region(parent, res->start, resource_size(res));
+}
+
 /* See request_skip() kernel-doc */
 static resource_size_t __adjust_skip(struct cxl_dev_state *cxlds,
 				     const resource_size_t skip_base,
@@ -256,7 +264,7 @@ static void __cxl_dpa_release(struct cxl_endpoint_decoder *cxled)
 
 	/* save @skip_start, before @res is released */
 	skip_start = res->start - cxled->skip;
-	__release_region(&cxlds->dpa_res, res->start, resource_size(res));
+	cxl_dpa_release_region(&cxlds->dpa_res, res);
 	if (cxled->skip)
 		release_skip(cxlds, skip_start, cxled->skip);
 	cxled->skip = 0;
@@ -336,6 +344,22 @@ static int request_skip(struct cxl_dev_state *cxlds,
 	return -EBUSY;
 }
 
+static struct resource *cxl_dpa_request_region(struct resource *parent,
+					       resource_size_t start,
+					       resource_size_t n,
+					       const char *name)
+{
+	if (!n) {
+		struct resource *res = kmalloc_obj(*res);
+
+		if (!res)
+			return NULL;
+		*res = DEFINE_RES_NAMED(start, n, name, 0);
+		return res;
+	}
+	return __request_region(parent, start, n, name, 0);
+}
+
 static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 			     resource_size_t base, resource_size_t len,
 			     resource_size_t skipped)
@@ -349,12 +373,6 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 
 	lockdep_assert_held_write(&cxl_rwsem.dpa);
 
-	if (!len) {
-		dev_warn(dev, "decoder%d.%d: empty reservation attempted\n",
-			 port->id, cxled->cxld.id);
-		return -EINVAL;
-	}
-
 	if (cxled->dpa_res) {
 		dev_dbg(dev, "decoder%d.%d: existing allocation %pr assigned\n",
 			port->id, cxled->cxld.id, cxled->dpa_res);
@@ -378,8 +396,8 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 		if (rc)
 			return rc;
 	}
-	res = __request_region(&cxlds->dpa_res, base, len,
-			       dev_name(&cxled->cxld.dev), 0);
+	res = cxl_dpa_request_region(&cxlds->dpa_res, base, len,
+				     dev_name(&cxled->cxld.dev));
 	if (!res) {
 		dev_dbg(dev, "decoder%d.%d: failed to reserve allocation\n",
 			port->id, cxled->cxld.id);
@@ -545,7 +563,7 @@ int cxl_dpa_free(struct cxl_endpoint_decoder *cxled)
 	struct device *dev = &cxled->cxld.dev;
 
 	guard(rwsem_write)(&cxl_rwsem.dpa);
-	if (!cxled->dpa_res)
+	if (!cxled->dpa_res || !resource_size(cxled->dpa_res))
 		return 0;
 	if (cxled->cxld.region) {
 		dev_dbg(dev, "decoder assigned to: %s\n",
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 7c6c5b7450a5..6f4d634b2cfb 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1433,6 +1433,9 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 	int nr_records = 0;
 	int rc;
 
+	if (!len)
+		return 0;
+
 	ACQUIRE(mutex_intr, lock)(&mds->poison.mutex);
 	if ((rc = ACQUIRE_ERR(mutex_intr, &lock)))
 		return rc;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index e50dc716d4e8..0d03e3bedb40 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2090,7 +2090,7 @@ static int cxl_region_attach(struct cxl_region *cxlr,
 		return -ENXIO;
 	}
 
-	if (!cxled->dpa_res) {
+	if (!cxled->dpa_res || !resource_size(cxled->dpa_res)) {
 		dev_dbg(&cxlr->dev, "%s:%s: missing DPA allocation.\n",
 			dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev));
 		return -ENXIO;

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-06-09 23:37 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-07  8:13 [PATCH v4 0/2] Support zero-sized HDM decoders Richard Cheng
2026-06-07  8:13 ` [PATCH v4 1/2] cxl/hdm: Allow zero sized " Richard Cheng
2026-06-07  8:30   ` sashiko-bot
2026-06-09 23:06     ` Dan Williams (nvidia)
2026-06-07  8:13 ` [PATCH v4 2/2] tools/testing/cxl: Enable zero sized decoder under hb0 Richard Cheng
2026-06-09 23:13 ` [PATCH v4 0/2] Support zero-sized HDM decoders Dan Williams (nvidia)
2026-06-09 23:37   ` Dan Williams (nvidia)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.