Linux Tegra architecture development
* [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices
@ 2026-06-23  3:24 Srirangan Madhavan
  2026-06-23  3:24 ` [PATCH v7 01/11] cxl: Split decoder programming into a reusable helper Srirangan Madhavan
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Hi folks!

This series adds CXL Reset support for CXL Type 2 devices through the
existing PCI reset_method ABI. The reset sequence follows the CXL 4.0
specification [1], including CXL.cache disable, optional cache
writeback, CXL Reset initiation, ResetComplete polling, and ResetError
reporting.
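
For illustration only, the ordering of those steps can be modeled in a small
userspace sketch. Everything in it is hypothetical: the mock_dev structure,
the MOCK_* bit names, and the poll budget are stand-ins chosen for the
example, not the DVSEC register layout or the helpers added by this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical control/status bits, not the real CXL Device DVSEC layout. */
#define MOCK_CTRL_DISABLE_CACHING (1u << 0)
#define MOCK_CTRL_INIT_WBI        (1u << 1)
#define MOCK_CTRL_INIT_RESET      (1u << 2)
#define MOCK_STS_CACHE_INVALID    (1u << 0)
#define MOCK_STS_RESET_COMPLETE   (1u << 1)
#define MOCK_STS_RESET_ERROR      (1u << 2)

/* Mock device: plain fields instead of config-space registers. */
struct mock_dev {
	uint16_t ctrl;
	uint16_t status;
	bool wbi_capable;     /* supports cache writeback and invalidate */
	bool fail_reset;      /* knob: report ResetError instead of ResetComplete */
	int polls_until_done; /* status reads that return "still busy" */
};

/* Simulated status read: the mock hardware reacts to the control bits. */
static uint16_t mock_read_status(struct mock_dev *dev)
{
	if ((dev->ctrl & MOCK_CTRL_INIT_RESET) && dev->polls_until_done-- <= 0)
		dev->status |= dev->fail_reset ? MOCK_STS_RESET_ERROR
					       : MOCK_STS_RESET_COMPLETE;
	if (dev->ctrl & MOCK_CTRL_INIT_WBI)
		dev->status |= MOCK_STS_CACHE_INVALID;
	return dev->status;
}

/* Disable caching, optionally write back, start reset, poll for the result. */
static int mock_cxl_reset(struct mock_dev *dev, int max_polls)
{
	assert(dev);

	dev->ctrl |= MOCK_CTRL_DISABLE_CACHING;

	if (dev->wbi_capable) {
		dev->ctrl |= MOCK_CTRL_INIT_WBI;
		if (!(mock_read_status(dev) & MOCK_STS_CACHE_INVALID))
			return -EIO;
	}

	dev->ctrl |= MOCK_CTRL_INIT_RESET;

	for (int i = 0; i < max_polls; i++) {
		uint16_t sts = mock_read_status(dev);

		if (sts & MOCK_STS_RESET_ERROR)
			return -EIO;
		if (sts & MOCK_STS_RESET_COMPLETE)
			return 0;
	}

	return -ETIMEDOUT;
}
```

The error check comes before the completion check in the poll loop so that a
reported ResetError is never masked by a completion bit set in the same read.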

The userspace ABI is the existing PCI reset interface:

    /sys/bus/pci/devices/.../reset_method
    /sys/bus/pci/devices/.../reset

Userspace can select "cxl_reset" in reset_method and then trigger reset
through the existing reset attribute.

Following Dan's v6 feedback, this replaces the proposed memdev sysfs ABI
with the existing PCI reset_method interface.

v7 changes from v6 [2]:
- Move the ABI from a CXL memdev attribute to PCI reset_method.
- Drop the memdev dependency from reset entry; advertise cxl_reset for
  Type 2 functions that report CXL Reset support in the CXL Device DVSEC.
- Incorporate Dan's HDM reset refactor: shared decoder settings,
  pci_dev->hdm cached state, and built-in CONFIG_CXL_HDM helpers.
- Cache endpoint HDM settings during PCI enumeration when MMIO decoding
  is already enabled, and let CXL core refresh the same cache later.
- Reduce the earlier PCI/CXL save/restore series [3] to the HDM state
  cache and restore infrastructure needed by this reset flow.
- Use cached HDM ranges to reject reset while affected ranges are busy
  and to invalidate CPU caches before reset.
- Discover the CXL reset scope with the Non-CXL Function Map and CXL
  cache/mem capability bits.
- Quiesce affected sibling functions with PCI save/disable and IOMMU
  reset prepare/done before executing reset.
- Restore cached HDM decoder state after reset before completing PCI
  reset recovery.
- Keep CXL Reset Memory Clear disabled.

Motivation:
-----------
- Type 2 devices need a CXL-specific reset mechanism beyond existing PCI
  reset methods.

- FLR does not reset CXL.cache or CXL.mem protocol state. CXL Reset is
  the architectural reset mechanism for those protocols.

- The PCI reset_method ABI lets userspace select this narrower CXL reset
  before falling back to broader bus reset methods.

Change Description:
-------------------

Patch 1: cxl/hdm: Split decoder programming into a reusable helper
- Move shared decoder settings to include/cxl/cxl.h.
- Factor low-level HDM register programming into cxl_commit().

Patch 2: cxl/hdm: Cache decoder settings on PCI devices
- Cache CXL core HDM decoder settings in pci_dev->hdm.
- Refresh the cache as decoders are enumerated, committed, or reset.

Patch 3: cxl/hdm: Cache endpoint decoder settings during PCI enumeration
- Snapshot endpoint HDM state during PCI capability initialization when
  memory decoding is already enabled.
- Reuse the same cache when CXL core later enumerates the device.

Patch 4: PCI: Export pci_dev_save_and_disable() and pci_dev_restore()
- Export PCI reset lifecycle helpers for CXL reset orchestration.

Patch 5: PCI/CXL: Add CXL Device Reset helper
- Add the internal DVSEC reset sequence.
- Disable CXL.cache, perform cache writeback where supported, initiate
  CXL Reset, and wait for completion.

Patch 6: PCI/CXL: Validate HDM ranges before CXL reset
- Collect enabled cached HDM ranges.
- Reject reset if affected ranges are busy and invalidate CPU caches.

Patch 7: PCI/CXL: Discover the CXL reset scope
- Discover same-scope CXL functions with the Non-CXL Function Map and
  CXL cache/mem capability bits.

Patch 8: PCI/CXL: Coordinate sibling functions for CXL reset
- Lock, save, disable, and IOMMU-block affected sibling functions.
- Include mem-capable siblings in HDM range validation and cache flush.

Patch 9: cxl/pci: Restore CXL HDM state after PCI reset
- Restore cached global and per-decoder HDM state after reset.
- Keep IOMMU reset blocks active until HDM restore completes.

Patch 10: PCI/CXL: Expose CXL Reset as a PCI reset method
- Add "cxl_reset" to the PCI reset_method table for Type 2 reset-capable
  CXL devices.

Patch 11: Documentation/ABI: Document CXL Reset PCI reset method
- Document the new reset_method value and reset behavior.

The CPU cache invalidation step depends on
cpu_cache_invalidate_memregion() support for the affected address ranges.
If no provider is available, reset fails before hardware reset is
requested.

Example:

    echo cxl_reset > /sys/bus/pci/devices/0000:bb:dd.f/reset_method
    echo 1 > /sys/bus/pci/devices/0000:bb:dd.f/reset
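
The same two writes can be issued from a C program. This is a minimal sketch,
not part of the series: sysfs_write() and cxl_reset_via_sysfs() are
hypothetical helper names, and the caller supplies the device directory
(the bb:dd.f above is a placeholder).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Write @value to the attribute at @path; returns 0 or a negative errno. */
static int sysfs_write(const char *path, const char *value)
{
	FILE *f;
	int rc = 0;

	assert(path && value);

	f = fopen(path, "w");
	if (!f)
		return -errno;
	if (fputs(value, f) == EOF)
		rc = -EIO;
	/* Buffered data is flushed here, so a rejected write shows up at fclose(). */
	if (fclose(f) == EOF && !rc)
		rc = -errno;
	return rc;
}

/* @dev_dir is the device's sysfs directory, e.g. /sys/bus/pci/devices/<BDF>. */
static int cxl_reset_via_sysfs(const char *dev_dir)
{
	char path[512];
	int rc;

	/* Select the reset method first, then trigger the reset. */
	snprintf(path, sizeof(path), "%s/reset_method", dev_dir);
	rc = sysfs_write(path, "cxl_reset");
	if (rc)
		return rc;

	snprintf(path, sizeof(path), "%s/reset", dev_dir);
	return sysfs_write(path, "1");
}
```

If the reset_method write fails, the sketch does not trigger a reset at all,
so the device is never reset through a different method than intended.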

Basic CXL DVSEC reset testing was done on a CXL Type 2 device. The reset
sequence completed successfully and ResetComplete was observed.

References:
[1] https://computeexpresslink.org/wp-content/uploads/2026/02/CXL-Specification_rev4p0_ver1p0_2026February26_clean_evalcopy_v2.pdf
[2] https://lore.kernel.org/linux-cxl/20260528083154.137979-1-smadhavan@nvidia.com/
[3] https://lore.kernel.org/linux-cxl/20260306080026.116789-1-smadhavan@nvidia.com/

Srirangan Madhavan (11):
  cxl/hdm: Split decoder programming into a reusable helper
  cxl/hdm: Cache decoder settings on PCI devices
  cxl/hdm: Cache endpoint decoder settings during PCI enumeration
  PCI: Export pci_dev_save_and_disable() and pci_dev_restore()
  PCI/CXL: Add CXL Device Reset helper
  PCI/CXL: Validate HDM ranges before CXL reset
  PCI/CXL: Discover the CXL reset scope
  PCI/CXL: Coordinate sibling functions for CXL reset
  cxl/pci: Restore CXL HDM state after PCI reset
  PCI/CXL: Expose CXL Reset as a PCI reset method
  Documentation/ABI: Document CXL Reset PCI reset method

 Documentation/ABI/testing/sysfs-bus-pci |   14 +
 drivers/cxl/Kconfig                     |    4 +
 drivers/cxl/core/Makefile               |    2 +-
 drivers/cxl/core/hdm.c                  |  234 ++---
 drivers/cxl/core/region.c               |    6 +-
 drivers/cxl/core/reset.c                | 1276 +++++++++++++++++++++++
 drivers/cxl/cxl.h                       |   43 -
 drivers/pci/pci.c                       |   25 +-
 drivers/pci/probe.c                     |    2 +
 include/cxl/cxl.h                       |   85 +-
 include/linux/pci.h                     |   10 +-
 include/uapi/linux/pci_regs.h           |   15 +
 tools/testing/cxl/test/cxl.c            |   10 +-
 13 files changed, 1554 insertions(+), 172 deletions(-)
 create mode 100644 drivers/cxl/core/reset.c

base-commit: 72afdd8181219f459142e571999b3b44ef7b85fb
-- 
2.43.0


* [PATCH v7 01/11] cxl: Split decoder programming into a reusable helper
  2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
@ 2026-06-23  3:24 ` Srirangan Madhavan
  2026-06-23  3:24 ` [PATCH v7 02/11] cxl: Cache decoder settings on PCI devices Srirangan Madhavan
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Move common HDM decoder settings to include/cxl/cxl.h and route the
register programming sequence through cxl_commit(). This lets reset code
restore cached HDM state without depending on private cxl_core types while
keeping hdm.c in charge of the existing commit policy checks.
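
As a rough illustration of the layout this relies on, the userspace sketch
below mimics what struct_group_tagged() provides: the same members are
reachable directly on the decoder and through a standalone settings type that
can be passed around on its own. The DECODER_SETTINGS_MEMBERS macro and the
simplified struct range, struct decoder, and settings_len() are hypothetical
stand-ins, not the kernel definitions. It needs C11 anonymous structs and
unions.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct range (inclusive end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* One member list, expanded twice so both views share an identical layout. */
#define DECODER_SETTINGS_MEMBERS \
	int id;                      \
	struct range hpa_range;      \
	union {                      \
		uint64_t skip;       \
		uint64_t targets;    \
	};                           \
	int interleave_ways;         \
	int interleave_granularity;  \
	unsigned long flags;

/* Standalone type that reset code can use without knowing struct decoder. */
struct decoder_settings {
	DECODER_SETTINGS_MEMBERS
};

struct decoder {
	union {
		struct {
			DECODER_SETTINGS_MEMBERS
		};                                /* direct access: d.id */
		struct decoder_settings settings; /* grouped access: d.settings.id */
	};
	void *region; /* stays outside the settings group */
};

/* Helper that only sees the settings view. */
static uint64_t settings_len(const struct decoder_settings *s)
{
	assert(s);
	return s->hpa_range.end - s->hpa_range.start + 1;
}
```

Because skip and targets share one union slot, a value stored through either
name reads back through the other, which matches the single register slot
that the patch programs for both decoder kinds.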

Build the low-level HDM helper under the bool CONFIG_CXL_HDM so it is built
in and available even when cxl_core is modular.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
 drivers/cxl/Kconfig          |   4 ++
 drivers/cxl/core/Makefile    |   1 +
 drivers/cxl/core/hdm.c       | 122 ++++-------------------------------
 drivers/cxl/core/region.c    |   6 +-
 drivers/cxl/core/reset.c     | 118 +++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h            |  43 ------------
 include/cxl/cxl.h            |  55 +++++++++++++++-
 tools/testing/cxl/test/cxl.c |  10 +--
 8 files changed, 197 insertions(+), 162 deletions(-)
 create mode 100644 drivers/cxl/core/reset.c

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 80aeb0d556bd..87d719ea1e14 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -6,6 +6,7 @@ menuconfig CXL_BUS
 	select FW_UPLOAD
 	select PCI_DOE
 	select FIRMWARE_TABLE
+	select CXL_HDM
 	select NUMA_KEEP_MEMINFO if NUMA_MEMBLKS
 	select FWCTL if CXL_FEATURES
 	help
@@ -243,4 +244,7 @@ config CXL_ATL
 	depends on CXL_REGION
 	depends on ACPI_PRMT && AMD_NB
 
+config CXL_HDM
+	bool
+
 endif
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index ce7213818d3c..dc075cee0450 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += cxl_core.o
+obj-$(CONFIG_CXL_HDM) += reset.o
 obj-$(CONFIG_CXL_SUSPEND) += suspend.o
 
 ccflags-y += -I$(srctree)/drivers/cxl
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 0c80b76a5f9b..fa978c297546 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -16,11 +16,6 @@
  * for enumerating these registers and capabilities.
  */
 
-struct cxl_rwsem cxl_rwsem = {
-	.region = __RWSEM_INITIALIZER(cxl_rwsem.region),
-	.dpa = __RWSEM_INITIALIZER(cxl_rwsem.dpa),
-};
-
 static int add_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld)
 {
 	int rc;
@@ -255,11 +250,11 @@ static void __cxl_dpa_release(struct cxl_endpoint_decoder *cxled)
 	lockdep_assert_held_write(&cxl_rwsem.dpa);
 
 	/* save @skip_start, before @res is released */
-	skip_start = res->start - cxled->skip;
+	skip_start = res->start - cxled->cxld.skip;
 	__release_region(&cxlds->dpa_res, res->start, resource_size(res));
-	if (cxled->skip)
-		release_skip(cxlds, skip_start, cxled->skip);
-	cxled->skip = 0;
+	if (cxled->cxld.skip)
+		release_skip(cxlds, skip_start, cxled->cxld.skip);
+	cxled->cxld.skip = 0;
 	cxled->dpa_res = NULL;
 	put_device(&cxled->cxld.dev);
 	port->hdm_end--;
@@ -388,7 +383,7 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 		return -EBUSY;
 	}
 	cxled->dpa_res = res;
-	cxled->skip = skipped;
+	cxled->cxld.skip = skipped;
 
 	/*
 	 * When allocating new capacity, ->part is already set, when
@@ -679,35 +674,6 @@ int cxl_dpa_alloc(struct cxl_endpoint_decoder *cxled, u64 size)
 	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
 }
 
-static void cxld_set_interleave(struct cxl_decoder *cxld, u32 *ctrl)
-{
-	u16 eig;
-	u8 eiw;
-
-	/*
-	 * Input validation ensures these warns never fire, but otherwise
-	 * suppress unititalized variable usage warnings.
-	 */
-	if (WARN_ONCE(ways_to_eiw(cxld->interleave_ways, &eiw),
-		      "invalid interleave_ways: %d\n", cxld->interleave_ways))
-		return;
-	if (WARN_ONCE(granularity_to_eig(cxld->interleave_granularity, &eig),
-		      "invalid interleave_granularity: %d\n",
-		      cxld->interleave_granularity))
-		return;
-
-	u32p_replace_bits(ctrl, eig, CXL_HDM_DECODER0_CTRL_IG_MASK);
-	u32p_replace_bits(ctrl, eiw, CXL_HDM_DECODER0_CTRL_IW_MASK);
-	*ctrl |= CXL_HDM_DECODER0_CTRL_COMMIT;
-}
-
-static void cxld_set_type(struct cxl_decoder *cxld, u32 *ctrl)
-{
-	u32p_replace_bits(ctrl,
-			  !!(cxld->target_type == CXL_DECODER_HOSTONLYMEM),
-			  CXL_HDM_DECODER0_CTRL_HOSTONLY);
-}
-
 static void cxlsd_set_targets(struct cxl_switch_decoder *cxlsd, u64 *tgt)
 {
 	struct cxl_dport **t = &cxlsd->target[0];
@@ -730,73 +696,6 @@ static void cxlsd_set_targets(struct cxl_switch_decoder *cxlsd, u64 *tgt)
 		*tgt |= FIELD_PREP(GENMASK_ULL(63, 56), t[7]->port_id);
 }
 
-/*
- * Per CXL 2.0 8.2.5.12.20 Committing Decoder Programming, hardware must set
- * committed or error within 10ms, but just be generous with 20ms to account for
- * clock skew and other marginal behavior
- */
-#define COMMIT_TIMEOUT_MS 20
-static int cxld_await_commit(void __iomem *hdm, int id)
-{
-	u32 ctrl;
-	int i;
-
-	for (i = 0; i < COMMIT_TIMEOUT_MS; i++) {
-		ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
-		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMIT_ERROR, ctrl)) {
-			ctrl &= ~CXL_HDM_DECODER0_CTRL_COMMIT;
-			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
-			return -EIO;
-		}
-		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
-			return 0;
-		fsleep(1000);
-	}
-
-	return -ETIMEDOUT;
-}
-
-static void setup_hw_decoder(struct cxl_decoder *cxld, void __iomem *hdm)
-{
-	int id = cxld->id;
-	u64 base, size;
-	u32 ctrl;
-
-	/* common decoder settings */
-	ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
-	cxld_set_interleave(cxld, &ctrl);
-	cxld_set_type(cxld, &ctrl);
-	base = cxld->hpa_range.start;
-	size = range_len(&cxld->hpa_range);
-
-	writel(upper_32_bits(base), hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(id));
-	writel(lower_32_bits(base), hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id));
-	writel(upper_32_bits(size), hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(id));
-	writel(lower_32_bits(size), hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(id));
-
-	if (is_switch_decoder(&cxld->dev)) {
-		struct cxl_switch_decoder *cxlsd =
-			to_cxl_switch_decoder(&cxld->dev);
-		void __iomem *tl_hi = hdm + CXL_HDM_DECODER0_TL_HIGH(id);
-		void __iomem *tl_lo = hdm + CXL_HDM_DECODER0_TL_LOW(id);
-		u64 targets;
-
-		cxlsd_set_targets(cxlsd, &targets);
-		writel(upper_32_bits(targets), tl_hi);
-		writel(lower_32_bits(targets), tl_lo);
-	} else {
-		struct cxl_endpoint_decoder *cxled =
-			to_cxl_endpoint_decoder(&cxld->dev);
-		void __iomem *sk_hi = hdm + CXL_HDM_DECODER0_SKIP_HIGH(id);
-		void __iomem *sk_lo = hdm + CXL_HDM_DECODER0_SKIP_LOW(id);
-
-		writel(upper_32_bits(cxled->skip), sk_hi);
-		writel(lower_32_bits(cxled->skip), sk_lo);
-	}
-
-	writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
-}
-
 static int cxl_decoder_commit(struct cxl_decoder *cxld)
 {
 	struct cxl_port *port = to_cxl_port(cxld->dev.parent);
@@ -834,17 +733,20 @@ static int cxl_decoder_commit(struct cxl_decoder *cxld)
 		}
 	}
 
-	scoped_guard(rwsem_read, &cxl_rwsem.dpa)
-		setup_hw_decoder(cxld, hdm);
+	if (is_switch_decoder(&cxld->dev)) {
+		struct cxl_switch_decoder *cxlsd =
+			to_cxl_switch_decoder(&cxld->dev);
+
+		cxlsd_set_targets(cxlsd, &cxld->targets);
+	}
 
-	rc = cxld_await_commit(hdm, cxld->id);
+	rc = cxl_commit(&cxld->settings, hdm);
 	if (rc) {
 		dev_dbg(&port->dev, "%s: error %d committing decoder\n",
 			dev_name(&cxld->dev), rc);
 		return rc;
 	}
 	port->commit_end++;
-	cxld->flags |= CXL_DECODER_F_ENABLE;
 
 	return 0;
 }
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index f5cd20f48d2b..adaf583ccbab 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2912,9 +2912,9 @@ static int poison_by_decoder(struct device *dev, void *arg)
 	cxlds = cxlmd->cxlds;
 	mode = cxlds->part[cxled->part].mode;
 
-	if (cxled->skip) {
-		offset = cxled->dpa_res->start - cxled->skip;
-		length = cxled->skip;
+	if (cxled->cxld.skip) {
+		offset = cxled->dpa_res->start - cxled->cxld.skip;
+		length = cxled->cxld.skip;
 		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
 		if (rc == -EFAULT && mode == CXL_PARTMODE_RAM)
 			rc = 0;
diff --git a/drivers/cxl/core/reset.c b/drivers/cxl/core/reset.c
new file mode 100644
index 000000000000..14f024098e82
--- /dev/null
+++ b/drivers/cxl/core/reset.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2026 NVIDIA Corporation. All rights reserved. */
+#include <linux/delay.h>
+#include <linux/bug.h>
+#include <linux/errno.h>
+#include <linux/export.h>
+#include <linux/kernel.h>
+
+#include "cxl.h"
+#include "core.h"
+
+struct cxl_rwsem cxl_rwsem = {
+	.region = __RWSEM_INITIALIZER(cxl_rwsem.region),
+	.dpa = __RWSEM_INITIALIZER(cxl_rwsem.dpa),
+};
+EXPORT_SYMBOL_FOR_MODULES(cxl_rwsem, "cxl_core");
+
+static void cxld_set_interleave(struct cxl_decoder_settings *settings, u32 *ctrl)
+{
+	u16 eig;
+	u8 eiw;
+
+	/*
+	 * Input validation ensures these warns never fire, but otherwise
+	 * suppress uninitialized variable usage warnings.
+	 */
+	if (WARN_ONCE(ways_to_eiw(settings->interleave_ways, &eiw),
+		      "invalid interleave_ways: %d\n",
+		      settings->interleave_ways))
+		return;
+	if (WARN_ONCE(granularity_to_eig(settings->interleave_granularity, &eig),
+		      "invalid interleave_granularity: %d\n",
+		      settings->interleave_granularity))
+		return;
+
+	u32p_replace_bits(ctrl, eig, CXL_HDM_DECODER0_CTRL_IG_MASK);
+	u32p_replace_bits(ctrl, eiw, CXL_HDM_DECODER0_CTRL_IW_MASK);
+	*ctrl |= CXL_HDM_DECODER0_CTRL_COMMIT;
+}
+
+static void cxld_set_type(struct cxl_decoder_settings *settings, u32 *ctrl)
+{
+	u32p_replace_bits(ctrl,
+			  !!(settings->target_type == CXL_DECODER_HOSTONLYMEM),
+			  CXL_HDM_DECODER0_CTRL_HOSTONLY);
+}
+
+/*
+ * Per CXL 2.0 8.2.5.12.20 Committing Decoder Programming, hardware must set
+ * committed or error within 10ms, but just be generous with 20ms to account for
+ * clock skew and other marginal behavior.
+ */
+#define COMMIT_TIMEOUT_MS 20
+static int cxld_await_commit(void __iomem *hdm, int id)
+{
+	u32 ctrl;
+	int i;
+
+	for (i = 0; i < COMMIT_TIMEOUT_MS; i++) {
+		ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMIT_ERROR, ctrl)) {
+			ctrl &= ~CXL_HDM_DECODER0_CTRL_COMMIT;
+			writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+			return -EIO;
+		}
+		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
+			return 0;
+		fsleep(1000);
+	}
+
+	return -ETIMEDOUT;
+}
+
+static void setup_hw_decoder(struct cxl_decoder_settings *settings,
+			     void __iomem *hdm)
+{
+	int id = settings->id;
+	u64 target_or_skip;
+	u64 base, size;
+	u32 ctrl;
+
+	ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+	cxld_set_interleave(settings, &ctrl);
+	cxld_set_type(settings, &ctrl);
+	base = settings->hpa_range.start;
+	size = range_len(&settings->hpa_range);
+	target_or_skip = settings->targets;
+
+	writel(upper_32_bits(base), hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(id));
+	writel(lower_32_bits(base), hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id));
+	writel(upper_32_bits(size), hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(id));
+	writel(lower_32_bits(size), hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(id));
+	/* Target-list and endpoint-skip registers alias the same slot. */
+	writel(upper_32_bits(target_or_skip),
+	       hdm + CXL_HDM_DECODER0_TL_HIGH(id));
+	writel(lower_32_bits(target_or_skip),
+	       hdm + CXL_HDM_DECODER0_TL_LOW(id));
+
+	writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+}
+
+int cxl_commit(struct cxl_decoder_settings *settings, void __iomem *hdm)
+{
+	int rc;
+
+	scoped_guard(rwsem_read, &cxl_rwsem.dpa) {
+		setup_hw_decoder(settings, hdm);
+	}
+
+	rc = cxld_await_commit(hdm, settings->id);
+	if (rc)
+		return rc;
+
+	settings->flags |= CXL_DECODER_F_ENABLE;
+
+	return 0;
+}
+EXPORT_SYMBOL_FOR_MODULES(cxl_commit, "cxl_core");
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 21fc89d3aeea..95bab833fc80 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -262,49 +262,8 @@ int cxl_dport_map_rcd_linkcap(struct pci_dev *pdev, struct cxl_dport *dport);
 #define CXL_DECODER_F_NORMALIZED_ADDRESSING BIT(6)
 #define CXL_DECODER_F_RESET_MASK (CXL_DECODER_F_ENABLE | CXL_DECODER_F_LOCK)
 
-enum cxl_decoder_type {
-	CXL_DECODER_DEVMEM = 2,
-	CXL_DECODER_HOSTONLYMEM = 3,
-};
-
-/*
- * Current specification goes up to 8, double that seems a reasonable
- * software max for the foreseeable future
- */
-#define CXL_DECODER_MAX_INTERLEAVE 16
-
 #define CXL_QOS_CLASS_INVALID -1
 
-/**
- * struct cxl_decoder - Common CXL HDM Decoder Attributes
- * @dev: this decoder's device
- * @id: kernel device name id
- * @hpa_range: Host physical address range mapped by this decoder
- * @interleave_ways: number of cxl_dports in this decode
- * @interleave_granularity: data stride per dport
- * @target_type: accelerator vs expander (type2 vs type3) selector
- * @region: currently assigned region for this decoder
- * @flags: memory type capabilities and locking
- * @target_map: cached copy of hardware port-id list, available at init
- *              before all @dport objects have been instantiated. While
- *              dport id is 8bit, CFMWS interleave targets are 32bits.
- * @commit: device/decoder-type specific callback to commit settings to hw
- * @reset: device/decoder-type specific callback to reset hw settings
-*/
-struct cxl_decoder {
-	struct device dev;
-	int id;
-	struct range hpa_range;
-	int interleave_ways;
-	int interleave_granularity;
-	enum cxl_decoder_type target_type;
-	struct cxl_region *region;
-	unsigned long flags;
-	u32 target_map[CXL_DECODER_MAX_INTERLEAVE];
-	int (*commit)(struct cxl_decoder *cxld);
-	void (*reset)(struct cxl_decoder *cxld);
-};
-
 /*
  * Track whether this decoder is free for userspace provisioning, reserved for
  * region autodiscovery, whether it is started connecting (awaiting other
@@ -320,7 +279,6 @@ enum cxl_decoder_state {
  * struct cxl_endpoint_decoder - Endpoint  / SPA to DPA decoder
  * @cxld: base cxl_decoder_object
  * @dpa_res: actively claimed DPA span of this decoder
- * @skip: offset into @dpa_res where @cxld.hpa_range maps
  * @state: autodiscovery state
  * @part: partition index this decoder maps
  * @pos: interleave position in @cxld.region
@@ -328,7 +286,6 @@ enum cxl_decoder_state {
 struct cxl_endpoint_decoder {
 	struct cxl_decoder cxld;
 	struct resource *dpa_res;
-	resource_size_t skip;
 	enum cxl_decoder_state state;
 	int part;
 	int pos;
diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
index fa7269154620..93d4eba6947a 100644
--- a/include/cxl/cxl.h
+++ b/include/cxl/cxl.h
@@ -5,8 +5,10 @@
 #ifndef __CXL_CXL_H__
 #define __CXL_CXL_H__
 
+#include <linux/device.h>
 #include <linux/node.h>
 #include <linux/ioport.h>
+#include <linux/range.h>
 #include <cxl/mailbox.h>
 
 /**
@@ -23,7 +25,56 @@ enum cxl_devtype {
 	CXL_DEVTYPE_CLASSMEM,
 };
 
-struct device;
+struct cxl_region;
+
+enum cxl_decoder_type {
+	CXL_DECODER_DEVMEM = 2,
+	CXL_DECODER_HOSTONLYMEM = 3,
+};
+
+/*
+ * Current specification goes up to 8, double that seems a reasonable
+ * software max for the foreseeable future
+ */
+#define CXL_DECODER_MAX_INTERLEAVE 16
+
+/**
+ * struct cxl_decoder - Common CXL HDM Decoder Attributes
+ * @dev: this decoder's device
+ * @id: kernel device name id
+ * @hpa_range: Host physical address range mapped by this decoder
+ * @skip: offset into @dpa_res where @cxld.hpa_range maps (endpoint)
+ * @targets: interleave position to dport mapping (switch)
+ * @interleave_ways: number of cxl_dports in this decode
+ * @interleave_granularity: data stride per dport
+ * @target_type: accelerator vs expander (type2 vs type3) selector
+ * @flags: memory type capabilities and locking
+ * @region: currently assigned region for this decoder
+ * @target_map: cached copy of hardware port-id list, available at init
+ *              before all @dport objects have been instantiated. While
+ *              dport id is 8bit, CFMWS interleave targets are 32bits.
+ * @commit: device/decoder-type specific callback to commit settings to hw
+ * @reset: device/decoder-type specific callback to reset hw settings
+ */
+struct cxl_decoder {
+	struct device dev;
+
+	struct_group_tagged(cxl_decoder_settings, settings, int id;
+		struct range hpa_range;
+		union {
+			u64 skip;
+			u64 targets;
+		};
+		int interleave_ways;
+		int interleave_granularity;
+		enum cxl_decoder_type target_type;
+		unsigned long flags;
+	);
+	struct cxl_region *region;
+	u32 target_map[CXL_DECODER_MAX_INTERLEAVE];
+	int (*commit)(struct cxl_decoder *cxld);
+	void (*reset)(struct cxl_decoder *cxld);
+};
 
 /*
  * Using struct_group() allows for per register-block-type helper routines,
@@ -70,6 +121,8 @@ struct cxl_regs {
 	);
 };
 
+int cxl_commit(struct cxl_decoder_settings *settings, void __iomem *hdm);
+
 struct cxl_reg_map {
 	bool valid;
 	int id;
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 4281d34cd0e7..244e8ec28769 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -843,11 +843,11 @@ static int cxld_registry_restore(struct cxl_decoder *cxld,
 		dbg_cxld(port, "restore", &td->cxled.cxld);
 		cxld_copy(cxld, &td->cxled.cxld);
 		cxled->state = td->cxled.state;
-		cxled->skip = td->cxled.skip;
+		cxld->skip = td->cxled.cxld.skip;
 		if (range_len(&td->dpa_range)) {
 			rc = devm_cxl_dpa_reserve(cxled, td->dpa_range.start,
 						  range_len(&td->dpa_range),
-						  td->cxled.skip);
+						  td->cxled.cxld.skip);
 			if (rc) {
 				init_disabled_mock_decoder(cxld);
 				return rc;
@@ -885,7 +885,7 @@ static void __cxld_registry_save(struct cxl_test_decoder *td,
 
 		cxld_copy(&td->cxled.cxld, cxld);
 		td->cxled.state = cxled->state;
-		td->cxled.skip = cxled->skip;
+		td->cxled.cxld.skip = cxled->cxld.skip;
 
 		if (!(cxld->flags & CXL_DECODER_F_ENABLE)) {
 			td->dpa_range.start = 0;
@@ -973,7 +973,7 @@ static void mock_decoder_reset(struct cxl_decoder *cxld)
 			to_cxl_endpoint_decoder(&cxld->dev);
 
 		cxled->state = CXL_DECODER_STATE_MANUAL;
-		cxled->skip = 0;
+		cxled->cxld.skip = 0;
 	}
 	if (decoder_reset_preserve_registry)
 		dev_dbg(port->uport_dev, "decoder%d: skip registry update\n",
@@ -1024,7 +1024,7 @@ static void init_disabled_mock_decoder(struct cxl_decoder *cxld)
 			to_cxl_endpoint_decoder(&cxld->dev);
 
 		cxled->state = CXL_DECODER_STATE_MANUAL;
-		cxled->skip = 0;
+		cxled->cxld.skip = 0;
 	}
 }
 
-- 
2.43.0



* [PATCH v7 02/11] cxl: Cache decoder settings on PCI devices
  2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
  2026-06-23  3:24 ` [PATCH v7 01/11] cxl: Split decoder programming into a reusable helper Srirangan Madhavan
@ 2026-06-23  3:24 ` Srirangan Madhavan
  2026-06-23  3:24 ` [PATCH v7 03/11] cxl: Cache endpoint decoder settings during PCI enumeration Srirangan Madhavan
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Cache CXL core's HDM decoder settings in pci_dev->hdm as decoders are
enumerated, committed, or reset. PCI reset paths can use this snapshot to
restore HDM programming without walking CXL topology during reset recovery.
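
The update rule for the cache can be sketched in userspace as follows. This
is a simplified model under stated assumptions: the mock_* types, the
MOCK_F_ENABLE flag, and calloc() stand in for the kernel structures,
CXL_DECODER_F_ENABLE, and devm allocation. An enabled decoder's settings are
copied into its slot, while a disabled decoder leaves only its id behind.

```c
#include <assert.h>
#include <stdlib.h>

#define MOCK_F_ENABLE 0x1UL /* hypothetical stand-in for the enable flag */

struct mock_settings {
	int id;
	unsigned long long base;
	unsigned long long size;
	unsigned long flags;
};

/* Per-device cache: a count followed by one settings slot per decoder. */
struct mock_hdm_info {
	int decoder_count;
	struct mock_settings settings[];
};

static struct mock_hdm_info *mock_hdm_info_alloc(int count)
{
	struct mock_hdm_info *info;

	assert(count >= 0);
	info = calloc(1, sizeof(*info) +
			 (size_t)count * sizeof(info->settings[0]));
	if (info)
		info->decoder_count = count;
	return info;
}

/* Refresh one slot after a decoder is enumerated, committed, or reset. */
static void mock_hdm_info_set(struct mock_hdm_info *info,
			      const struct mock_settings *s)
{
	if (!info || s->id < 0 || s->id >= info->decoder_count)
		return; /* no cache, or decoder outside the cached range */

	if (s->flags & MOCK_F_ENABLE)
		info->settings[s->id] = *s;
	else
		info->settings[s->id] = (struct mock_settings){ .id = s->id };
}
```

Clearing the slot for a disabled decoder means a later restore pass only has
to look at the enable flag to decide whether a slot holds state worth
reprogramming.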

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
 drivers/cxl/core/hdm.c | 81 +++++++++++++++++++++++++++++++++++++++++-
 include/cxl/cxl.h      | 12 +++++++
 include/linux/pci.h    |  6 ++++
 3 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index fa978c297546..83cda63f76a5 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -84,6 +84,76 @@ static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
 		cxlhdm->iw_cap_mask |= BIT(16);
 }
 
+static void clear_hdm_info(void *data)
+{
+	struct pci_dev *pdev = data;
+
+	WRITE_ONCE(pdev->hdm, NULL);
+}
+
+static int devm_cxl_pci_setup_hdm_info(struct cxl_hdm *cxlhdm)
+{
+	struct cxl_port *port = cxlhdm->port;
+	struct cxl_hdm_info *info;
+	struct pci_dev *pdev;
+	struct device *uport;
+
+	if (is_cxl_endpoint(port)) {
+		struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport_dev);
+
+		uport = cxlmd->dev.parent;
+	} else {
+		uport = port->uport_dev;
+	}
+
+	if (!dev_is_pci(uport))
+		return 0;
+
+	pdev = to_pci_dev(uport);
+	info = devm_kzalloc(&pdev->dev,
+			    struct_size(info, settings, cxlhdm->decoder_count),
+			    GFP_KERNEL);
+	if (!info)
+		return -ENOMEM;
+
+	info->decoder_count = cxlhdm->decoder_count;
+	WRITE_ONCE(pdev->hdm, info);
+
+	return devm_add_action_or_reset(&pdev->dev, clear_hdm_info, pdev);
+}
+
+static void cxl_hdm_info_set_decoder(struct cxl_hdm *cxlhdm,
+				     struct cxl_decoder *cxld)
+{
+	struct cxl_port *port = cxlhdm->port;
+	struct cxl_hdm_info *info;
+	struct pci_dev *pdev;
+	struct device *uport;
+
+	if (is_cxl_endpoint(port)) {
+		struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport_dev);
+
+		uport = cxlmd->dev.parent;
+	} else {
+		uport = port->uport_dev;
+	}
+
+	if (!dev_is_pci(uport))
+		return;
+
+	pdev = to_pci_dev(uport);
+	info = READ_ONCE(pdev->hdm);
+	if (!info || cxld->id >= info->decoder_count)
+		return;
+
+	if (cxld->flags & CXL_DECODER_F_ENABLE)
+		info->settings[cxld->id] = cxld->settings;
+	else
+		info->settings[cxld->id] = (struct cxl_decoder_settings) {
+			.id = cxld->id,
+		};
+}
+
 static bool should_emulate_decoders(struct cxl_endpoint_dvsec_info *info)
 {
 	struct cxl_hdm *cxlhdm;
@@ -747,6 +817,7 @@ static int cxl_decoder_commit(struct cxl_decoder *cxld)
 		return rc;
 	}
 	port->commit_end++;
+	cxl_hdm_info_set_decoder(cxlhdm, cxld);
 
 	return 0;
 }
@@ -819,6 +890,7 @@ static void cxl_decoder_reset(struct cxl_decoder *cxld)
 	writel(0, hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id));
 
 	cxld->flags &= ~CXL_DECODER_F_ENABLE;
+	cxl_hdm_info_set_decoder(cxlhdm, cxld);
 
 	/* Userspace is now responsible for reconfiguring this decoder */
 	if (is_endpoint_decoder(&cxld->dev)) {
@@ -989,6 +1061,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 		lo = readl(hdm + CXL_HDM_DECODER0_TL_LOW(which));
 		hi = readl(hdm + CXL_HDM_DECODER0_TL_HIGH(which));
 		target_list.value = (hi << 32) + lo;
+		cxld->targets = target_list.value;
 		for (i = 0; i < cxld->interleave_ways; i++)
 			cxld->target_map[i] = target_list.target_id[i];
 
@@ -1062,11 +1135,16 @@ static int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
 	struct cxl_port *port = cxlhdm->port;
 	int i;
 	u64 dpa_base = 0;
+	int rc;
 
 	cxl_settle_decoders(cxlhdm);
 
+	rc = devm_cxl_pci_setup_hdm_info(cxlhdm);
+	if (rc)
+		return rc;
+
 	for (i = 0; i < cxlhdm->decoder_count; i++) {
-		int rc, target_count = cxlhdm->target_count;
+		int target_count = cxlhdm->target_count;
 		struct cxl_decoder *cxld;
 
 		if (is_cxl_endpoint(port)) {
@@ -1101,6 +1179,7 @@ static int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
 			put_device(&cxld->dev);
 			return rc;
 		}
+		cxl_hdm_info_set_decoder(cxlhdm, cxld);
 		rc = add_hdm_decoder(port, cxld);
 		if (rc) {
 			dev_warn(&port->dev,
diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
index 93d4eba6947a..cc933379f67b 100644
--- a/include/cxl/cxl.h
+++ b/include/cxl/cxl.h
@@ -121,6 +121,18 @@ struct cxl_regs {
 	);
 };
 
+/**
+ * struct cxl_hdm_info - PCI device HDM decoder programming cache
+ * @decoder_count: number of decoder settings entries
+ * @regs: mapped CXL component registers for this HDM decoder block
+ * @settings: cached per-decoder programming state
+ */
+struct cxl_hdm_info {
+	int decoder_count;
+	struct cxl_component_regs regs;
+	struct cxl_decoder_settings settings[] __counted_by(decoder_count);
+};
+
 int cxl_commit(struct cxl_decoder_settings *settings, void __iomem *hdm);
 
 struct cxl_reg_map {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c4454583c11..7db2daf8597c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -334,6 +334,9 @@ struct pcie_link_state;
 struct pci_sriov;
 struct pci_p2pdma;
 struct rcec_ea;
+#ifdef CONFIG_CXL_HDM
+struct cxl_hdm_info;
+#endif
 
 /* struct pci_dev - describes a PCI device
  *
@@ -563,6 +566,9 @@ struct pci_dev {
 #ifdef CONFIG_PCI_DOE
 	struct xarray	doe_mbs;	/* Data Object Exchange mailboxes */
 #endif
+#ifdef CONFIG_CXL_HDM
+	struct cxl_hdm_info *hdm;	/* CXL HDM decoder reset state */
+#endif
 #ifdef CONFIG_PCI_NPEM
 	struct npem	*npem;		/* Native PCIe Enclosure Management */
 #endif
-- 
2.43.0



* [PATCH v7 03/11] cxl: Cache endpoint decoder settings during PCI enumeration
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC (permalink / raw)
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Populate pci_dev->hdm during PCI capability initialization when a
CXL.mem-capable function already has memory decoding enabled
(PCI_COMMAND_MEMORY set). Reset paths that run without a bound driver then
get an early HDM snapshot, and enumeration never has to enable BAR decoding
itself.

CXL core later reuses and refreshes the same cache. Move the register
helpers (regs.o) out of cxl_core and into the built-in CONFIG_CXL_HDM
objects so the early cache path is available without cxl_core.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
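Note: the snapshot depends on turning pairs of 32-bit decoder registers
into one validated 64-bit range. The following is a minimal userspace
sketch of that step, assuming the same checks as
cxl_pci_hdm_read_decoder() in the diff below. The names hpa_range_sketch,
combine_halves() and decode_range() are hypothetical and exist only for
illustration; they are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the cached range; not the kernel's struct range. */
struct hpa_range_sketch {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Combine two 32-bit register halves into one 64-bit value. */
static uint64_t combine_halves(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/*
 * Apply the same range checks the patch applies to a committed decoder:
 * reject a zero size, all-ones values, and base + size - 1 wrapping past
 * UINT64_MAX. Returns true and fills *out on success.
 */
static bool decode_range(uint32_t base_lo, uint32_t base_hi,
			 uint32_t size_lo, uint32_t size_hi,
			 struct hpa_range_sketch *out)
{
	uint64_t base = combine_halves(base_lo, base_hi);
	uint64_t size = combine_halves(size_lo, size_hi);

	assert(out != NULL);

	if (!size || base == UINT64_MAX || size == UINT64_MAX ||
	    base > UINT64_MAX - (size - 1))
		return false;

	out->start = base;
	out->end = base + size - 1;
	return true;
}
```

The wrap test is written as a subtraction so that the comparison itself
cannot overflow.
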
 drivers/cxl/core/Makefile |   3 +-
 drivers/cxl/core/hdm.c    |  59 +++++++----
 drivers/cxl/core/reset.c  | 202 ++++++++++++++++++++++++++++++++++++++
 drivers/pci/probe.c       |   2 +
 include/cxl/cxl.h         |   9 ++
 5 files changed, 252 insertions(+), 23 deletions(-)

diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index dc075cee0450..69cf2ea7ee74 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += cxl_core.o
-obj-$(CONFIG_CXL_HDM) += reset.o
+obj-$(CONFIG_CXL_HDM) += regs.o reset.o
 obj-$(CONFIG_CXL_SUSPEND) += suspend.o
 
 ccflags-y += -I$(srctree)/drivers/cxl
@@ -8,7 +8,6 @@ CFLAGS_trace.o = -DTRACE_INCLUDE_PATH=. -I$(src)
 
 cxl_core-y := port.o
 cxl_core-y += pmem.o
-cxl_core-y += regs.o
 cxl_core-y += memdev.o
 cxl_core-y += mbox.o
 cxl_core-y += pci.o
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 83cda63f76a5..0230ebfada42 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -91,11 +91,9 @@ static void clear_hdm_info(void *data)
 	WRITE_ONCE(pdev->hdm, NULL);
 }
 
-static int devm_cxl_pci_setup_hdm_info(struct cxl_hdm *cxlhdm)
+static struct pci_dev *cxl_hdm_to_pci_dev(struct cxl_hdm *cxlhdm)
 {
 	struct cxl_port *port = cxlhdm->port;
-	struct cxl_hdm_info *info;
-	struct pci_dev *pdev;
 	struct device *uport;
 
 	if (is_cxl_endpoint(port)) {
@@ -107,9 +105,27 @@ static int devm_cxl_pci_setup_hdm_info(struct cxl_hdm *cxlhdm)
 	}
 
 	if (!dev_is_pci(uport))
+		return NULL;
+
+	return to_pci_dev(uport);
+}
+
+static int devm_cxl_pci_setup_hdm_info(struct cxl_hdm *cxlhdm)
+{
+	struct cxl_hdm_info *info;
+	struct pci_dev *pdev;
+
+	pdev = cxl_hdm_to_pci_dev(cxlhdm);
+	if (!pdev)
+		return 0;
+
+	info = READ_ONCE(pdev->hdm);
+	if (info) {
+		if (info->decoder_count != cxlhdm->decoder_count)
+			return -ENXIO;
 		return 0;
+	}
 
-	pdev = to_pci_dev(uport);
 	info = devm_kzalloc(&pdev->dev,
 			    struct_size(info, settings, cxlhdm->decoder_count),
 			    GFP_KERNEL);
@@ -125,23 +141,13 @@ static int devm_cxl_pci_setup_hdm_info(struct cxl_hdm *cxlhdm)
 static void cxl_hdm_info_set_decoder(struct cxl_hdm *cxlhdm,
 				     struct cxl_decoder *cxld)
 {
-	struct cxl_port *port = cxlhdm->port;
 	struct cxl_hdm_info *info;
 	struct pci_dev *pdev;
-	struct device *uport;
-
-	if (is_cxl_endpoint(port)) {
-		struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport_dev);
-
-		uport = cxlmd->dev.parent;
-	} else {
-		uport = port->uport_dev;
-	}
 
-	if (!dev_is_pci(uport))
+	pdev = cxl_hdm_to_pci_dev(cxlhdm);
+	if (!pdev)
 		return;
 
-	pdev = to_pci_dev(uport);
 	info = READ_ONCE(pdev->hdm);
 	if (!info || cxld->id >= info->decoder_count)
 		return;
@@ -202,6 +208,7 @@ static struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
 	struct cxl_register_map *reg_map = &port->reg_map;
 	struct device *dev = &port->dev;
 	struct cxl_hdm *cxlhdm;
+	struct pci_dev *pdev;
 	int rc;
 
 	cxlhdm = devm_kzalloc(dev, sizeof(*cxlhdm), GFP_KERNEL);
@@ -227,11 +234,21 @@ static struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
 		return ERR_PTR(-ENODEV);
 	}
 
-	rc = cxl_map_component_regs(reg_map, &cxlhdm->regs,
-				    BIT(CXL_CM_CAP_CAP_ID_HDM));
-	if (rc) {
-		dev_err(dev, "Failed to map HDM capability.\n");
-		return ERR_PTR(rc);
+	pdev = cxl_hdm_to_pci_dev(cxlhdm);
+	if (pdev) {
+		struct cxl_hdm_info *info = READ_ONCE(pdev->hdm);
+
+		if (info && info->regs.hdm_decoder)
+			cxlhdm->regs = info->regs;
+	}
+
+	if (!cxlhdm->regs.hdm_decoder) {
+		rc = cxl_map_component_regs(reg_map, &cxlhdm->regs,
+					    BIT(CXL_CM_CAP_CAP_ID_HDM));
+		if (rc) {
+			dev_err(dev, "Failed to map HDM capability.\n");
+			return ERR_PTR(rc);
+		}
 	}
 
 	parse_hdm_decoder_caps(cxlhdm);
diff --git a/drivers/cxl/core/reset.c b/drivers/cxl/core/reset.c
index 14f024098e82..fc52d3abdb5b 100644
--- a/drivers/cxl/core/reset.c
+++ b/drivers/cxl/core/reset.c
@@ -2,9 +2,16 @@
 /* Copyright(c) 2026 NVIDIA Corporation. All rights reserved. */
 #include <linux/delay.h>
 #include <linux/bug.h>
+#include <linux/bitfield.h>
 #include <linux/errno.h>
 #include <linux/export.h>
+#include <linux/io.h>
+#include <linux/ioport.h>
 #include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+
+#include <cxlpci.h>
 
 #include "cxl.h"
 #include "core.h"
@@ -116,3 +123,198 @@ int cxl_commit(struct cxl_decoder_settings *settings, void __iomem *hdm)
 	return 0;
 }
 EXPORT_SYMBOL_FOR_MODULES(cxl_commit, "cxl_core");
+
+#define CXL_HDM_DECODER_MAX_COUNT 32
+
+static void cxl_pci_hdm_clear(void *data)
+{
+	struct pci_dev *pdev = data;
+
+	WRITE_ONCE(pdev->hdm, NULL);
+}
+
+static void cxl_pci_hdm_unmap(struct pci_dev *pdev,
+			      struct cxl_component_regs *regs,
+			      struct cxl_register_map *map)
+{
+	struct cxl_reg_map *hdm_map = &map->component_map.hdm_decoder;
+
+	if (!regs->hdm_decoder)
+		return;
+
+	devm_iounmap(&pdev->dev, regs->hdm_decoder);
+	devm_release_mem_region(&pdev->dev, map->resource + hdm_map->offset,
+				hdm_map->size);
+}
+
+static int cxl_pci_hdm_read_decoder(struct pci_dev *pdev,
+				    struct cxl_decoder_settings *settings,
+				    void __iomem *hdm, int id)
+{
+	u64 target_or_skip, base, size;
+	u32 ctrl, lo, hi;
+	int rc;
+
+	*settings = (struct cxl_decoder_settings) {
+		.id = id,
+	};
+
+	ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+	if (!(ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED))
+		return 0;
+
+	lo = readl(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(id));
+	hi = readl(hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(id));
+	base = ((u64)hi << 32) | lo;
+
+	lo = readl(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(id));
+	hi = readl(hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(id));
+	size = ((u64)hi << 32) | lo;
+
+	if (!size || base == U64_MAX || size == U64_MAX ||
+	    base > U64_MAX - (size - 1)) {
+		pci_err(pdev, "CXL HDM decoder %d has invalid range\n", id);
+		return -ENXIO;
+	}
+
+	lo = readl(hdm + CXL_HDM_DECODER0_TL_LOW(id));
+	hi = readl(hdm + CXL_HDM_DECODER0_TL_HIGH(id));
+	target_or_skip = ((u64)hi << 32) | lo;
+
+	settings->hpa_range = (struct range) {
+		.start = base,
+		.end = base + size - 1,
+	};
+	settings->targets = target_or_skip;
+	settings->target_type = FIELD_GET(CXL_HDM_DECODER0_CTRL_HOSTONLY, ctrl) ?
+				CXL_DECODER_HOSTONLYMEM : CXL_DECODER_DEVMEM;
+	settings->flags = CXL_DECODER_F_ENABLE;
+	if (ctrl & CXL_HDM_DECODER0_CTRL_LOCK)
+		settings->flags |= CXL_DECODER_F_LOCK;
+
+	rc = eiw_to_ways(FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl),
+			 &settings->interleave_ways);
+	if (rc)
+		return rc;
+
+	return eig_to_granularity(FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK,
+					    ctrl),
+				  &settings->interleave_granularity);
+}
+
+static int cxl_pci_hdm_capable(struct pci_dev *pdev)
+{
+	u16 cap;
+	int dvsec;
+	int rc;
+
+	dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
+					  PCI_DVSEC_CXL_DEVICE);
+	if (!dvsec)
+		return -ENOTTY;
+
+	rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_CAP, &cap);
+	if (rc)
+		return pcibios_err_to_errno(rc);
+
+	if (!(cap & PCI_DVSEC_CXL_MEM_CAPABLE))
+		return -ENOTTY;
+
+	return 0;
+}
+
+int pci_cxl_hdm_init(struct pci_dev *pdev)
+{
+	struct cxl_decoder_settings *settings;
+	struct cxl_component_regs regs = { 0 };
+	struct cxl_register_map map = { 0 };
+	struct cxl_hdm_info *info;
+	bool allocated_info = false;
+	int decoder_count;
+	u16 command;
+	int rc;
+
+	info = READ_ONCE(pdev->hdm);
+	if (info && info->regs.hdm_decoder)
+		return 0;
+
+	rc = cxl_pci_hdm_capable(pdev);
+	if (rc)
+		return rc;
+
+	rc = pci_read_config_word(pdev, PCI_COMMAND, &command);
+	if (rc)
+		return pcibios_err_to_errno(rc);
+
+	if (!(command & PCI_COMMAND_MEMORY))
+		return -ENOTTY;
+
+	if (!info) {
+		info = devm_kzalloc(&pdev->dev,
+				    struct_size(info, settings,
+						CXL_HDM_DECODER_MAX_COUNT),
+				    GFP_KERNEL);
+		if (!info)
+			return -ENOMEM;
+		allocated_info = true;
+	}
+
+	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+	if (rc)
+		return rc;
+
+	rc = cxl_setup_regs(&map);
+	if (rc)
+		return rc;
+
+	if (!map.component_map.hdm_decoder.valid) {
+		rc = -ENODEV;
+		return rc;
+	}
+
+	rc = cxl_map_component_regs(&map, &regs, BIT(CXL_CM_CAP_CAP_ID_HDM));
+	if (rc)
+		return rc;
+
+	decoder_count = cxl_hdm_decoder_count(readl(regs.hdm_decoder +
+						    CXL_HDM_DECODER_CAP_OFFSET));
+	if (decoder_count < 0) {
+		rc = decoder_count;
+		goto out_unmap;
+	}
+
+	if (decoder_count > CXL_HDM_DECODER_MAX_COUNT) {
+		rc = -ENXIO;
+		goto out_unmap;
+	}
+
+	if (info->decoder_count && info->decoder_count != decoder_count) {
+		rc = -ENXIO;
+		goto out_unmap;
+	}
+
+	info->decoder_count = decoder_count;
+	info->regs = regs;
+
+	settings = info->settings;
+	for (int i = 0; i < info->decoder_count; i++) {
+		rc = cxl_pci_hdm_read_decoder(pdev, &settings[i],
+					      regs.hdm_decoder, i);
+		if (rc)
+			goto out_unmap;
+	}
+
+	WRITE_ONCE(pdev->hdm, info);
+	if (allocated_info) {
+		rc = devm_add_action(&pdev->dev, cxl_pci_hdm_clear, pdev);
+		if (rc) {
+			WRITE_ONCE(pdev->hdm, NULL);
+			goto out_unmap;
+		}
+	}
+	return 0;
+
+out_unmap:
+	cxl_pci_hdm_unmap(pdev, &regs, &map);
+	return rc;
+}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index b63cd0c310bc..9e214446fd42 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -24,6 +24,7 @@
 #include <linux/pm_runtime.h>
 #include <linux/bitfield.h>
 #include <trace/events/pci.h>
+#include <cxl/cxl.h>
 #include "pci.h"
 
 static struct resource busn_resource = {
@@ -2679,6 +2680,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
 	pci_rebar_init(dev);		/* Resizable BAR */
 	pci_dev3_init(dev);		/* Device 3 capabilities */
 	pci_ide_init(dev);		/* Link Integrity and Data Encryption */
+	pci_cxl_hdm_init(dev);		/* CXL HDM Decoder Capability */
 
 	pcie_report_downtraining(dev);
 	pci_init_reset_methods(dev);
diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
index cc933379f67b..e3087b7517e8 100644
--- a/include/cxl/cxl.h
+++ b/include/cxl/cxl.h
@@ -26,6 +26,7 @@ enum cxl_devtype {
 };
 
 struct cxl_region;
+struct pci_dev;
 
 enum cxl_decoder_type {
 	CXL_DECODER_DEVMEM = 2,
@@ -134,6 +135,14 @@ struct cxl_hdm_info {
 };
 
 int cxl_commit(struct cxl_decoder_settings *settings, void __iomem *hdm);
+#ifdef CONFIG_CXL_HDM
+int pci_cxl_hdm_init(struct pci_dev *pdev);
+#else
+static inline int pci_cxl_hdm_init(struct pci_dev *pdev)
+{
+	return -ENOTTY;
+}
+#endif
 
 struct cxl_reg_map {
 	bool valid;
-- 
2.43.0



* [PATCH v7 04/11] PCI: Export pci_dev_save_and_disable() and pci_dev_restore()
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC (permalink / raw)
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Export the standard PCI reset save/disable and restore helpers so CXL reset
can split that lifecycle around CXL-specific sequencing while preserving
driver reset_prepare/reset_done callbacks and PCI config-state handling.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
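Note: the exported helpers are meant to bracket a reset. The following is
a minimal userspace sketch of that bracket, assuming the same
save/reset/restore ordering that the existing PCI function-reset path
uses. Everything here (mock_dev, mock_save_and_disable(), mock_restore(),
mock_reset(), bracketed_reset()) is a hypothetical stand-in; how the CXL
caller actually sequences these steps is not shown in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mock device that records the order of lifecycle steps. */
struct mock_dev {
	char trace[64];
};

static void trace_step(struct mock_dev *dev, const char *step)
{
	assert(dev != NULL);
	/* Append the step name, always leaving room for the terminator. */
	strncat(dev->trace, step, sizeof(dev->trace) - strlen(dev->trace) - 1);
}

/* Stand-in for pci_dev_save_and_disable(). */
static void mock_save_and_disable(struct mock_dev *dev)
{
	trace_step(dev, "save;");
}

/* Stand-in for pci_dev_restore(). */
static void mock_restore(struct mock_dev *dev)
{
	trace_step(dev, "restore;");
}

/* Stand-in for bus-specific reset sequencing; 0 on success. */
static int mock_reset(struct mock_dev *dev, int fail)
{
	trace_step(dev, "reset;");
	return fail ? -EIO : 0;
}

/* Restore runs whether or not the reset step succeeded. */
static int bracketed_reset(struct mock_dev *dev, int fail)
{
	int rc;

	mock_save_and_disable(dev);
	rc = mock_reset(dev, fail);
	mock_restore(dev);
	return rc;
}
```

Running the restore step unconditionally keeps the saved configuration
state and the driver callbacks paired even when the reset itself fails.
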
 drivers/pci/pci.c   | 23 +++++++++++++++++++++--
 include/linux/pci.h |  2 ++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d34266651ad0..360f2aaee10c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5003,7 +5003,15 @@ void pci_dev_unlock(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_dev_unlock);
 
-static void pci_dev_save_and_disable(struct pci_dev *dev)
+/**
+ * pci_dev_save_and_disable - Save device state and disable it
+ * @dev: PCI device to save and disable
+ *
+ * Save the PCI configuration state, invoke the driver's reset_prepare()
+ * callback if present, and disable the device by clearing the Command
+ * register. The device lock must be held by the caller.
+ */
+void pci_dev_save_and_disable(struct pci_dev *dev)
 {
 	const struct pci_error_handlers *err_handler =
 			dev->driver ? dev->driver->err_handler : NULL;
@@ -5036,12 +5044,22 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
 	 */
 	pci_write_config_word(dev, PCI_COMMAND, PCI_COMMAND_INTX_DISABLE);
 }
+EXPORT_SYMBOL_GPL(pci_dev_save_and_disable);
 
-static void pci_dev_restore(struct pci_dev *dev)
+/**
+ * pci_dev_restore - Restore device state after reset
+ * @dev: PCI device to restore
+ *
+ * Restore the saved PCI configuration state and invoke the driver's
+ * reset_done() callback if present. The device lock must be held by the
+ * caller.
+ */
+void pci_dev_restore(struct pci_dev *dev)
 {
 	const struct pci_error_handlers *err_handler =
 			dev->driver ? dev->driver->err_handler : NULL;
 
+	device_lock_assert(&dev->dev);
 	pci_restore_state(dev);
 
 	/*
@@ -5054,6 +5072,7 @@ static void pci_dev_restore(struct pci_dev *dev)
 	else if (dev->driver)
 		pci_warn(dev, "reset done");
 }
+EXPORT_SYMBOL_GPL(pci_dev_restore);
 
 /* dev->reset_methods[] is a 0-terminated list of indices into this array */
 const struct pci_reset_fn_method pci_reset_fn_methods[] = {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7db2daf8597c..4df030837a3a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2018,6 +2018,8 @@ void pci_dev_lock(struct pci_dev *dev);
 int pci_dev_trylock(struct pci_dev *dev);
 void pci_dev_unlock(struct pci_dev *dev);
 DEFINE_GUARD(pci_dev, struct pci_dev *, pci_dev_lock(_T), pci_dev_unlock(_T))
+void pci_dev_save_and_disable(struct pci_dev *dev);
+void pci_dev_restore(struct pci_dev *dev);
 
 /*
  * PCI domain support.  Sometimes called PCI segment (eg by ACPI),
-- 
2.43.0



* [PATCH v7 05/11] cxl: Add CXL Device Reset helper
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC (permalink / raw)
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Add an internal CXL Device Reset helper for Type 2 functions that advertise
CXL Reset in the CXL Device DVSEC. The helper disables CXL.cache, performs
cache writeback when supported, initiates reset with Memory Clear disabled,
waits for completion, and re-enables CXL.cache on exit.

Leave the helper unregistered until range validation and reset-scope
coordination are in place.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
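Note: the completion wait is bounded by a timeout decoded from the
capability word. The following is a minimal userspace sketch of that
selection, assuming the table values, the bits 10:8 field position, and
the 100 ms floor used by cxl_reset_wait_done() in the diff below. The
names reset_timeout_from_cap(), RST_TIMEOUT_SHIFT, RST_TIMEOUT_MASK and
RESET_MIN_WAIT_MS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the patch: timeout table and the initial wait. */
static const uint32_t reset_timeout_ms[] = { 10, 100, 1000, 10000, 100000 };
#define RESET_MIN_WAIT_MS 100u

/* Bits 10:8 of the capability word hold the timeout encoding. */
#define RST_TIMEOUT_SHIFT 8
#define RST_TIMEOUT_MASK  0x7u

static uint32_t reset_timeout_from_cap(uint16_t cap)
{
	size_t count = sizeof(reset_timeout_ms) / sizeof(reset_timeout_ms[0]);
	size_t idx = (cap >> RST_TIMEOUT_SHIFT) & RST_TIMEOUT_MASK;
	uint32_t timeout;

	assert(count > 0);

	/* Unknown encodings fall back to the longest known timeout. */
	if (idx >= count)
		idx = count - 1;

	timeout = reset_timeout_ms[idx];

	/* Never use a deadline shorter than the initial 100 ms wait. */
	return timeout < RESET_MIN_WAIT_MS ? RESET_MIN_WAIT_MS : timeout;
}
```

With this floor, a device that advertises the 10 ms encoding is still
given the full 100 ms before the first status read decides the outcome.
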
 drivers/cxl/core/reset.c      | 221 ++++++++++++++++++++++++++++++++++
 include/cxl/cxl.h             |   7 ++
 include/uapi/linux/pci_regs.h |  14 +++
 3 files changed, 242 insertions(+)

diff --git a/drivers/cxl/core/reset.c b/drivers/cxl/core/reset.c
index fc52d3abdb5b..fdfcc9e825e0 100644
--- a/drivers/cxl/core/reset.c
+++ b/drivers/cxl/core/reset.c
@@ -7,6 +7,8 @@
 #include <linux/export.h>
 #include <linux/io.h>
 #include <linux/ioport.h>
+#include <linux/iommu.h>
+#include <linux/jiffies.h>
 #include <linux/kernel.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
@@ -318,3 +320,222 @@ int pci_cxl_hdm_init(struct pci_dev *pdev)
 	cxl_pci_hdm_unmap(pdev, &regs, &map);
 	return rc;
 }
+
+/*
+ * CXL r4.0 sec 9.7.2 defines the reset completion timeout encodings.
+ * Sec 9.7.3 leaves config-space access behavior undefined for 100 ms after
+ * initiating CXL Reset, then limits software to CXL Status2 access until
+ * reset completion, timeout, or error.
+ */
+#define CXL_RESET_RRS_WAIT_MS 100
+#define CXL_RESET_STATUS_POLL_MS 20
+static const u32 cxl_reset_timeout_ms[] = {
+	10, 100, 1000, 10000, 100000,
+};
+
+#define CXL_CACHE_WBI_TIMEOUT_US 100000
+#define CXL_CACHE_WBI_POLL_US 100
+
+static int cxl_reset_dvsec(struct pci_dev *pdev)
+{
+	int dvsec, rc;
+	u16 cap;
+
+	dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
+					  PCI_DVSEC_CXL_DEVICE);
+	if (!dvsec)
+		return -ENOTTY;
+
+	rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_CAP, &cap);
+	if (rc)
+		return pcibios_err_to_errno(rc);
+
+	if ((cap & (PCI_DVSEC_CXL_CACHE_CAPABLE |
+		    PCI_DVSEC_CXL_MEM_CAPABLE)) !=
+	    (PCI_DVSEC_CXL_CACHE_CAPABLE | PCI_DVSEC_CXL_MEM_CAPABLE))
+		return -ENOTTY;
+
+	if (!(cap & PCI_DVSEC_CXL_RST_CAPABLE))
+		return -ENOTTY;
+
+	return dvsec;
+}
+
+static int cxl_reset_update_ctrl2(struct pci_dev *pdev, int dvsec, u16 set,
+				  u16 clear)
+{
+	u16 ctrl2;
+	int rc;
+
+	rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_CTRL2, &ctrl2);
+	if (rc)
+		return pcibios_err_to_errno(rc);
+
+	ctrl2 |= set;
+	ctrl2 &= ~clear;
+
+	rc = pci_write_config_word(pdev, dvsec + PCI_DVSEC_CXL_CTRL2, ctrl2);
+	if (rc)
+		return pcibios_err_to_errno(rc);
+
+	return 0;
+}
+
+static int cxl_reset_enable_cache(struct pci_dev *pdev, int dvsec)
+{
+	return cxl_reset_update_ctrl2(pdev, dvsec, 0,
+				      PCI_DVSEC_CXL_DISABLE_CACHING);
+}
+
+static int cxl_reset_disable_cache(struct pci_dev *pdev, int dvsec, u16 cap)
+{
+	int remaining_us = CXL_CACHE_WBI_TIMEOUT_US;
+	u16 status2;
+	int rc, rc2;
+
+	rc = cxl_reset_update_ctrl2(pdev, dvsec,
+				    PCI_DVSEC_CXL_DISABLE_CACHING, 0);
+	if (rc)
+		return rc;
+
+	if (!(cap & PCI_DVSEC_CXL_CACHE_WBI_CAPABLE))
+		return 0;
+
+	rc = cxl_reset_update_ctrl2(pdev, dvsec,
+				    PCI_DVSEC_CXL_INIT_CACHE_WBI, 0);
+	if (rc)
+		goto err_enable_cache;
+
+	do {
+		usleep_range(CXL_CACHE_WBI_POLL_US, CXL_CACHE_WBI_POLL_US + 1);
+		remaining_us -= CXL_CACHE_WBI_POLL_US;
+
+		rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_STATUS2,
+					  &status2);
+		if (rc) {
+			rc = pcibios_err_to_errno(rc);
+			goto err_enable_cache;
+		}
+	} while (!(status2 & PCI_DVSEC_CXL_CACHE_INV) && remaining_us > 0);
+
+	if (!(status2 & PCI_DVSEC_CXL_CACHE_INV)) {
+		rc = -ETIMEDOUT;
+		goto err_enable_cache;
+	}
+
+	return 0;
+
+err_enable_cache:
+	/*
+	 * DISABLE_CACHING can be rolled back here. INIT_CACHE_WBI is
+	 * self-clearing on completion, so leave any in-flight writeback alone.
+	 */
+	rc2 = cxl_reset_enable_cache(pdev, dvsec);
+	if (rc2)
+		pci_warn(pdev, "failed to re-enable CXL caching: %d\n", rc2);
+	return rc;
+}
+
+static int cxl_reset_wait_done(struct pci_dev *pdev, int dvsec, u16 cap)
+{
+	unsigned long deadline;
+	u32 timeout_ms;
+	u16 status2;
+	int idx, rc;
+
+	idx = FIELD_GET(PCI_DVSEC_CXL_RST_TIMEOUT, cap);
+	if (idx >= ARRAY_SIZE(cxl_reset_timeout_ms)) {
+		int last = ARRAY_SIZE(cxl_reset_timeout_ms) - 1;
+
+		pci_warn(pdev,
+			 "unknown CXL reset timeout encoding %d; using %u ms\n",
+			 idx, cxl_reset_timeout_ms[last]);
+		idx = last;
+	}
+
+	timeout_ms = max_t(u32, cxl_reset_timeout_ms[idx],
+			   CXL_RESET_RRS_WAIT_MS);
+	deadline = jiffies + msecs_to_jiffies(timeout_ms);
+	msleep(CXL_RESET_RRS_WAIT_MS);
+
+	do {
+		rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_STATUS2,
+					  &status2);
+		if (rc)
+			return pcibios_err_to_errno(rc);
+
+		if (status2 & PCI_DVSEC_CXL_RST_ERR)
+			return -EIO;
+
+		if (status2 & PCI_DVSEC_CXL_RST_DONE)
+			return 0;
+
+		if (time_after_eq(jiffies, deadline))
+			return -ETIMEDOUT;
+
+		msleep(CXL_RESET_STATUS_POLL_MS);
+	} while (true);
+}
+
+static int cxl_reset_execute(struct pci_dev *pdev, int dvsec)
+{
+	bool cache_disabled = false;
+	u16 cap;
+	int rc;
+
+	rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_CAP, &cap);
+	if (rc)
+		return pcibios_err_to_errno(rc);
+
+	if (!pci_wait_for_pending_transaction(pdev))
+		pci_err(pdev, "timed out waiting for pending transactions\n");
+
+	rc = pci_dev_reset_iommu_prepare(pdev);
+	if (rc) {
+		pci_err(pdev, "failed to stop IOMMU for CXL reset: %d\n", rc);
+		return rc;
+	}
+
+	rc = cxl_reset_disable_cache(pdev, dvsec, cap);
+	if (rc)
+		goto out;
+	cache_disabled = true;
+
+	rc = cxl_reset_update_ctrl2(pdev, dvsec, PCI_DVSEC_CXL_INIT_CXL_RST,
+				    PCI_DVSEC_CXL_RST_MEM_CLR_EN);
+	if (rc)
+		goto out;
+
+	rc = cxl_reset_wait_done(pdev, dvsec, cap);
+	if (rc)
+		goto out;
+
+out:
+	if (cache_disabled) {
+		int rc2;
+
+		rc2 = cxl_reset_enable_cache(pdev, dvsec);
+		if (rc2 && rc)
+			pci_warn(pdev, "failed to re-enable CXL caching: %d\n",
+				 rc2);
+		else if (rc2)
+			rc = rc2;
+	}
+
+	pci_dev_reset_iommu_done(pdev);
+	return rc;
+}
+
+int cxl_reset_function(struct pci_dev *pdev, bool probe)
+{
+	int dvsec;
+
+	dvsec = cxl_reset_dvsec(pdev);
+	if (dvsec < 0)
+		return dvsec;
+
+	if (probe)
+		return 0;
+
+	return cxl_reset_execute(pdev, dvsec);
+}
diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
index e3087b7517e8..1fe606f15733 100644
--- a/include/cxl/cxl.h
+++ b/include/cxl/cxl.h
@@ -9,6 +9,7 @@
 #include <linux/node.h>
 #include <linux/ioport.h>
 #include <linux/range.h>
+#include <linux/errno.h>
 #include <cxl/mailbox.h>
 
 /**
@@ -137,11 +138,17 @@ struct cxl_hdm_info {
 int cxl_commit(struct cxl_decoder_settings *settings, void __iomem *hdm);
 #ifdef CONFIG_CXL_HDM
 int pci_cxl_hdm_init(struct pci_dev *pdev);
+int cxl_reset_function(struct pci_dev *pdev, bool probe);
 #else
 static inline int pci_cxl_hdm_init(struct pci_dev *pdev)
 {
 	return -ENOTTY;
 }
+
+static inline int cxl_reset_function(struct pci_dev *pdev, bool probe)
+{
+	return -ENOTTY;
+}
 #endif
 
 struct cxl_reg_map {
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 14f634ab9350..194ae56b4404 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -1349,10 +1349,24 @@
 /* CXL r4.0, 8.1.3: PCIe DVSEC for CXL Device */
 #define PCI_DVSEC_CXL_DEVICE				0
 #define  PCI_DVSEC_CXL_CAP				0xA
+#define   PCI_DVSEC_CXL_CACHE_CAPABLE			_BITUL(0)
 #define   PCI_DVSEC_CXL_MEM_CAPABLE			_BITUL(2)
 #define   PCI_DVSEC_CXL_HDM_COUNT			__GENMASK(5, 4)
+#define   PCI_DVSEC_CXL_CACHE_WBI_CAPABLE		_BITUL(6)
+#define   PCI_DVSEC_CXL_RST_CAPABLE			_BITUL(7)
+#define   PCI_DVSEC_CXL_RST_TIMEOUT			__GENMASK(10, 8)
+#define   PCI_DVSEC_CXL_RST_MEM_CLR_CAPABLE		_BITUL(11)
 #define  PCI_DVSEC_CXL_CTRL				0xC
 #define   PCI_DVSEC_CXL_MEM_ENABLE			_BITUL(2)
+#define  PCI_DVSEC_CXL_CTRL2				0x10
+#define   PCI_DVSEC_CXL_DISABLE_CACHING			_BITUL(0)
+#define   PCI_DVSEC_CXL_INIT_CACHE_WBI			_BITUL(1)
+#define   PCI_DVSEC_CXL_INIT_CXL_RST			_BITUL(2)
+#define   PCI_DVSEC_CXL_RST_MEM_CLR_EN			_BITUL(3)
+#define  PCI_DVSEC_CXL_STATUS2				0x12
+#define   PCI_DVSEC_CXL_CACHE_INV			_BITUL(0)
+#define   PCI_DVSEC_CXL_RST_DONE			_BITUL(1)
+#define   PCI_DVSEC_CXL_RST_ERR			_BITUL(2)
 #define  PCI_DVSEC_CXL_RANGE_SIZE_HIGH(i)		(0x18 + (i * 0x10))
 #define  PCI_DVSEC_CXL_RANGE_SIZE_LOW(i)		(0x1C + (i * 0x10))
 #define   PCI_DVSEC_CXL_MEM_INFO_VALID			_BITUL(0)
-- 
2.43.0



* [PATCH v7 06/11] cxl: Validate HDM ranges before CXL reset
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC (permalink / raw)
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Before reset, collect the HPA ranges of the enabled decoders in the HDM
cache, reserve them with request_mem_region(), and invalidate CPU caches
for them. Reset fails with -EBUSY while any affected CXL memory is in use,
and the reservations are held until the reset attempt finishes so the
ranges cannot be claimed mid-reset.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
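Note: range collection skips exact duplicates and the length helper
refuses ranges whose length cannot be represented. The following is a
minimal userspace sketch of both checks, assuming the behavior of
cxl_hdm_range_add() and cxl_hdm_range_len() in the diff below. It uses a
fixed-size array instead of a linked list, and the names range_sketch,
range_set, range_set_add() and range_len_checked() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct range_sketch {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

#define MAX_RANGES 8

struct range_set {
	struct range_sketch r[MAX_RANGES];
	size_t n;
};

/* Add a range unless an identical one is already present. */
static int range_set_add(struct range_set *set, struct range_sketch r)
{
	assert(set != NULL);

	if (r.end < r.start)
		return -EINVAL;

	/* Only exact matches are merged; overlapping ranges stay separate. */
	for (size_t i = 0; i < set->n; i++)
		if (set->r[i].start == r.start && set->r[i].end == r.end)
			return 0;

	if (set->n == MAX_RANGES)
		return -ENOSPC;

	set->r[set->n++] = r;
	return 0;
}

/* Length of an inclusive range; fails if it does not fit in 64 bits. */
static bool range_len_checked(struct range_sketch r, uint64_t *len)
{
	assert(len != NULL);

	if (r.end < r.start)
		return false;

	/* [0, UINT64_MAX] has 2^64 bytes, which would wrap to zero. */
	if (r.start == 0 && r.end == UINT64_MAX)
		return false;

	*len = r.end - r.start + 1;
	return true;
}
```

Deduplicating first means each distinct range is reserved and flushed
once, which matters because a second request for a region that is already
reserved would be reported as busy.
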
 drivers/cxl/core/reset.c | 239 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 238 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/reset.c b/drivers/cxl/core/reset.c
index fdfcc9e825e0..786d1060e40d 100644
--- a/drivers/cxl/core/reset.c
+++ b/drivers/cxl/core/reset.c
@@ -10,6 +10,8 @@
 #include <linux/iommu.h>
 #include <linux/jiffies.h>
 #include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/memregion.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
 
@@ -336,6 +338,230 @@ static const u32 cxl_reset_timeout_ms[] = {
 #define CXL_CACHE_WBI_TIMEOUT_US 100000
 #define CXL_CACHE_WBI_POLL_US 100
 
+struct cxl_hdm_range {
+	struct list_head list;
+	struct pci_dev *pdev;
+	struct range hpa_range;
+	struct resource *res;
+};
+
+struct cxl_hdm_range_context {
+	struct list_head ranges;
+};
+
+static void cxl_hdm_range_context_init(struct cxl_hdm_range_context *ctx)
+{
+	INIT_LIST_HEAD(&ctx->ranges);
+}
+
+static void cxl_hdm_range_context_destroy(struct cxl_hdm_range_context *ctx)
+{
+	struct cxl_hdm_range *range, *next;
+
+	list_for_each_entry_safe(range, next, &ctx->ranges, list) {
+		list_del(&range->list);
+		if (range->res)
+			release_mem_region(range->hpa_range.start,
+					   resource_size(range->res));
+		kfree(range);
+	}
+}
+
+static int cxl_hdm_range_add(struct cxl_hdm_range_context *ctx,
+			     struct pci_dev *pdev, const struct range *hpa_range)
+{
+	struct cxl_hdm_range *range;
+
+	if (hpa_range->end < hpa_range->start)
+		return -EINVAL;
+
+	list_for_each_entry(range, &ctx->ranges, list)
+		if (range->hpa_range.start == hpa_range->start &&
+		    range->hpa_range.end == hpa_range->end)
+			return 0;
+
+	range = kzalloc_obj(*range);
+	if (!range)
+		return -ENOMEM;
+
+	range->pdev = pdev;
+	range->hpa_range = *hpa_range;
+	list_add_tail(&range->list, &ctx->ranges);
+
+	return 0;
+}
+
+static int cxl_hdm_ranges_collect(struct cxl_hdm_range_context *ctx,
+				  struct pci_dev *pdev)
+{
+	struct cxl_hdm_info *info = READ_ONCE(pdev->hdm);
+	int rc;
+
+	if (!info) {
+		pci_err(pdev, "CXL HDM decoder state unavailable\n");
+		return -ENXIO;
+	}
+
+	for (int i = 0; i < info->decoder_count; i++) {
+		struct cxl_decoder_settings *settings = &info->settings[i];
+
+		if (!(settings->flags & CXL_DECODER_F_ENABLE))
+			continue;
+
+		if (settings->flags & CXL_DECODER_F_NORMALIZED_ADDRESSING) {
+			pci_err(pdev,
+				"CXL reset does not support normalized address decoders\n");
+			return -EOPNOTSUPP;
+		}
+
+		rc = cxl_hdm_range_add(ctx, pdev, &settings->hpa_range);
+		if (rc)
+			return rc;
+	}
+
+	return 0;
+}
+
+static int cxl_hdm_range_len(struct pci_dev *pdev,
+			     const struct range *hpa_range, u64 *len)
+{
+	if (sizeof(resource_size_t) < sizeof(hpa_range->start) &&
+	    (hpa_range->start > (resource_size_t)~0ULL ||
+	     hpa_range->end > (resource_size_t)~0ULL)) {
+		pci_err(pdev,
+			"CXL reset range [%#llx-%#llx] exceeds resource address size\n",
+			hpa_range->start, hpa_range->end);
+		return -EOVERFLOW;
+	}
+
+	if (hpa_range->end < hpa_range->start)
+		return -EINVAL;
+
+	if (!hpa_range->start && hpa_range->end == U64_MAX) {
+		pci_err(pdev,
+			"CXL reset range [%#llx-%#llx] exceeds resource size\n",
+			hpa_range->start, hpa_range->end);
+		return -EOVERFLOW;
+	}
+
+	*len = range_len(hpa_range);
+	if (sizeof(resource_size_t) < sizeof(*len) &&
+	    *len > (resource_size_t)~0ULL) {
+		pci_err(pdev,
+			"CXL reset range [%#llx-%#llx] exceeds resource size\n",
+			hpa_range->start, hpa_range->end);
+		return -EOVERFLOW;
+	}
+
+	if (sizeof(size_t) < sizeof(*len) && *len > SIZE_MAX) {
+		pci_err(pdev,
+			"CXL reset range [%#llx-%#llx] exceeds cache flush size\n",
+			hpa_range->start, hpa_range->end);
+		return -EOVERFLOW;
+	}
+
+	return 0;
+}
+
+static int cxl_hdm_range_request(struct cxl_hdm_range *range)
+{
+	struct pci_dev *pdev = range->pdev;
+	const struct range *hpa_range = &range->hpa_range;
+	u64 len;
+	int rc;
+
+	rc = cxl_hdm_range_len(pdev, hpa_range, &len);
+	if (rc)
+		return rc;
+
+	range->res = request_mem_region(hpa_range->start, len, "cxl_reset");
+	if (!range->res) {
+		pci_err(pdev,
+			"cannot reset while CXL memory range is busy [%#llx-%#llx]\n",
+			hpa_range->start, hpa_range->end);
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static int cxl_hdm_ranges_request(struct cxl_hdm_range_context *ctx)
+{
+	struct cxl_hdm_range *range;
+	int rc;
+
+	lockdep_assert_held_write(&cxl_rwsem.region);
+
+	list_for_each_entry(range, &ctx->ranges, list) {
+		rc = cxl_hdm_range_request(range);
+		if (rc)
+			return rc;
+	}
+
+	return 0;
+}
+
+static int cxl_hdm_range_flush_cache(struct cxl_hdm_range *range)
+{
+	struct pci_dev *pdev = range->pdev;
+	const struct range *hpa_range = &range->hpa_range;
+	u64 len;
+	int rc;
+
+	rc = cxl_hdm_range_len(pdev, hpa_range, &len);
+	if (rc)
+		return rc;
+
+	rc = cpu_cache_invalidate_memregion(hpa_range->start, len);
+	if (rc)
+		pci_err(pdev,
+			"failed to invalidate CPU cache [%#llx-%#llx]: %d\n",
+			hpa_range->start, hpa_range->end, rc);
+
+	return rc;
+}
+
+static int cxl_hdm_ranges_flush_cpu_caches(struct cxl_hdm_range_context *ctx,
+					   struct pci_dev *pdev)
+{
+	struct cxl_hdm_range *range;
+	int rc;
+
+	if (list_empty(&ctx->ranges))
+		return 0;
+
+	if (!cpu_cache_has_invalidate_memregion()) {
+		pci_err(pdev, "failed to synchronize CPU cache state\n");
+		return -ENXIO;
+	}
+
+	list_for_each_entry(range, &ctx->ranges, list) {
+		rc = cxl_hdm_range_flush_cache(range);
+		if (rc)
+			return rc;
+	}
+
+	return 0;
+}
+
+static int cxl_hdm_ranges_prepare(struct cxl_hdm_range_context *ctx,
+				  struct pci_dev *pdev)
+{
+	int rc;
+
+	lockdep_assert_held_write(&cxl_rwsem.region);
+
+	rc = cxl_hdm_ranges_collect(ctx, pdev);
+	if (rc)
+		return rc;
+
+	rc = cxl_hdm_ranges_request(ctx);
+	if (rc)
+		return rc;
+
+	return cxl_hdm_ranges_flush_cpu_caches(ctx, pdev);
+}
+
 static int cxl_reset_dvsec(struct pci_dev *pdev)
 {
 	int dvsec, rc;
@@ -528,7 +754,9 @@ static int cxl_reset_execute(struct pci_dev *pdev, int dvsec)
 
 int cxl_reset_function(struct pci_dev *pdev, bool probe)
 {
+	struct cxl_hdm_range_context range_ctx;
 	int dvsec;
+	int rc;
 
 	dvsec = cxl_reset_dvsec(pdev);
 	if (dvsec < 0)
@@ -537,5 +765,14 @@ int cxl_reset_function(struct pci_dev *pdev, bool probe)
 	if (probe)
 		return 0;
 
-	return cxl_reset_execute(pdev, dvsec);
+	cxl_hdm_range_context_init(&range_ctx);
+
+	scoped_guard(rwsem_write, &cxl_rwsem.region) {
+		rc = cxl_hdm_ranges_prepare(&range_ctx, pdev);
+		if (!rc)
+			rc = cxl_reset_execute(pdev, dvsec);
+	}
+
+	cxl_hdm_range_context_destroy(&range_ctx);
+	return rc;
 }
-- 
2.43.0



* [PATCH v7 07/11] PCI/cxl: Discover the CXL reset scope
  2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
                   ` (5 preceding siblings ...)
  2026-06-23  3:24 ` [PATCH v7 06/11] cxl: Validate HDM ranges before CXL reset Srirangan Madhavan
@ 2026-06-23  3:24 ` Srirangan Madhavan
  2026-06-23  3:24 ` [PATCH v7 08/11] cxl: Coordinate sibling functions for CXL reset Srirangan Madhavan
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC (permalink / raw)
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Add reset context support to discover same-scope CXL functions before
reset. Use the Non-CXL Function Map, ARI/devfn rules, and CXL.cache/mem
capability bits to identify participating siblings, then hold references
until the reset context is destroyed.

If the Function Map cannot be read, warn and treat all candidate siblings
as CXL functions.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
 drivers/cxl/core/reset.c      | 180 +++++++++++++++++++++++++++++++++-
 include/uapi/linux/pci_regs.h |   1 +
 2 files changed, 180 insertions(+), 1 deletion(-)
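
Reviewer note, not part of the commit: the Function Map indexing described
above can be sketched in plain userspace C. This is a minimal illustration
only; the SKETCH_* macros and sketch_* helpers are hypothetical stand-ins
for PCI_SLOT()/PCI_FUNC() and the kernel bitmap helpers, and it assumes a
set bit means "non-CXL function".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for PCI_DEVFN()/PCI_SLOT()/PCI_FUNC(). */
#define SKETCH_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define SKETCH_PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define SKETCH_PCI_FUNC(devfn)       ((devfn) & 0x07)

#define SKETCH_MAP_WORDS 8 /* 256 map bits / 32 bits per register */

/* Map a devfn to its bit index in the 256-bit Non-CXL Function Map. */
static unsigned int sketch_func_map_bit(unsigned int devfn, bool ari)
{
	if (ari)
		return devfn & 0xff; /* ARI: the whole devfn is the function number */

	/* No ARI: 32 device slots per conventional 3-bit function number. */
	return SKETCH_PCI_FUNC(devfn) * 32 + SKETCH_PCI_SLOT(devfn);
}

/* A set bit marks the function as non-CXL, i.e. outside the reset scope. */
static bool sketch_is_non_cxl(const uint32_t map[SKETCH_MAP_WORDS],
			      unsigned int devfn, bool ari)
{
	unsigned int bit = sketch_func_map_bit(devfn, ari);

	return (map[bit / 32] >> (bit % 32)) & 1u;
}
```

An all-zero map, which is what the patch falls back to when a config read
fails, makes every candidate look like a CXL function in this model.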

diff --git a/drivers/cxl/core/reset.c b/drivers/cxl/core/reset.c
index 786d1060e40d..1ae714a3595c 100644
--- a/drivers/cxl/core/reset.c
+++ b/drivers/cxl/core/reset.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2026 NVIDIA Corporation. All rights reserved. */
+#include <linux/bitmap.h>
 #include <linux/delay.h>
 #include <linux/bug.h>
 #include <linux/bitfield.h>
@@ -338,6 +339,25 @@ static const u32 cxl_reset_timeout_ms[] = {
 #define CXL_CACHE_WBI_TIMEOUT_US 100000
 #define CXL_CACHE_WBI_POLL_US 100
 
+/* CXL r4.0 sec 8.1.4 defines 256 bits of Non-CXL Function Map. */
+#define CXL_RESET_MAX_FUNCTIONS 256
+#define CXL_RESET_FUNCTION_MAP_REGS (CXL_RESET_MAX_FUNCTIONS / 32)
+#define CXL_RESET_SIBLINGS_INIT 8
+
+struct cxl_reset_context {
+	struct pci_dev *target;
+	struct pci_dev **siblings;
+	int nr_siblings;
+	int sibling_capacity;
+};
+
+struct cxl_reset_walk_context {
+	struct cxl_reset_context *ctx;
+	DECLARE_BITMAP(non_cxl_func_map, CXL_RESET_MAX_FUNCTIONS);
+	bool ari;
+	int rc;
+};
+
 struct cxl_hdm_range {
 	struct list_head list;
 	struct pci_dev *pdev;
@@ -349,6 +369,157 @@ struct cxl_hdm_range_context {
 	struct list_head ranges;
 };
 
+static void cxl_reset_context_init(struct cxl_reset_context *ctx,
+				   struct pci_dev *pdev)
+{
+	*ctx = (struct cxl_reset_context) {
+		.target = pdev,
+	};
+}
+
+static void cxl_reset_context_destroy(struct cxl_reset_context *ctx)
+{
+	for (int i = 0; i < ctx->nr_siblings; i++)
+		pci_dev_put(ctx->siblings[i]);
+	kfree(ctx->siblings);
+}
+
+static void cxl_reset_read_non_cxl_func_map(struct pci_dev *pdev,
+					    unsigned long *map)
+{
+	u32 words[CXL_RESET_FUNCTION_MAP_REGS];
+	int dvsec, reg;
+
+	bitmap_zero(map, CXL_RESET_MAX_FUNCTIONS);
+
+	dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
+					  PCI_DVSEC_CXL_FUNCTION_MAP);
+	if (!dvsec)
+		return;
+
+	for (reg = 0; reg < CXL_RESET_FUNCTION_MAP_REGS; reg++) {
+		int rc;
+
+		rc = pci_read_config_dword(pdev,
+					   dvsec + PCI_DVSEC_CXL_FUNCTION_MAP_REG +
+					   reg * sizeof(u32), &words[reg]);
+		if (rc) {
+			pci_warn(pdev,
+				 "failed to read Non-CXL Function Map; treating all siblings as CXL\n");
+			bitmap_zero(map, CXL_RESET_MAX_FUNCTIONS);
+			return;
+		}
+	}
+
+	bitmap_from_arr32(map, words, CXL_RESET_MAX_FUNCTIONS);
+}
+
+static int cxl_reset_func_map_bit(struct pci_dev *sibling, bool ari)
+{
+	if (ari)
+		return sibling->devfn;
+
+	/*
+	 * Without ARI, the Function Map is organized as 32 device slots per
+	 * conventional 3-bit function number.
+	 */
+	return PCI_FUNC(sibling->devfn) * 32 + PCI_SLOT(sibling->devfn);
+}
+
+static int cxl_reset_has_cache_or_mem(struct pci_dev *pdev)
+{
+	int dvsec, rc;
+	u16 cap;
+
+	dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
+					  PCI_DVSEC_CXL_DEVICE);
+	if (!dvsec)
+		return 0;
+
+	rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_CAP, &cap);
+	if (rc) {
+		rc = pcibios_err_to_errno(rc);
+		pci_warn(pdev,
+			 "failed to read CXL capability; cannot determine reset scope: %d\n",
+			 rc);
+		return rc;
+	}
+
+	return !!(cap & (PCI_DVSEC_CXL_CACHE_CAPABLE |
+			 PCI_DVSEC_CXL_MEM_CAPABLE));
+}
+
+static int cxl_reset_add_sibling(struct cxl_reset_context *ctx,
+				 struct pci_dev *sibling)
+{
+	if (ctx->nr_siblings >= ctx->sibling_capacity) {
+		int capacity = ctx->sibling_capacity ?: CXL_RESET_SIBLINGS_INIT;
+		struct pci_dev **siblings;
+
+		if (capacity > INT_MAX / 2)
+			return -ENOMEM;
+		if (ctx->sibling_capacity)
+			capacity *= 2;
+
+		siblings = krealloc_array(ctx->siblings, capacity,
+					  sizeof(*siblings), GFP_KERNEL);
+		if (!siblings)
+			return -ENOMEM;
+
+		ctx->siblings = siblings;
+		ctx->sibling_capacity = capacity;
+	}
+
+	ctx->siblings[ctx->nr_siblings++] = pci_dev_get(sibling);
+	return 0;
+}
+
+static int cxl_reset_collect_sibling(struct pci_dev *sibling, void *data)
+{
+	struct cxl_reset_walk_context *wctx = data;
+	struct cxl_reset_context *ctx = wctx->ctx;
+	struct pci_dev *pdev = ctx->target;
+	int fn, rc;
+
+	if (sibling == pdev)
+		return 0;
+
+	if (sibling->bus != pdev->bus)
+		return 0;
+
+	if (!wctx->ari && PCI_SLOT(sibling->devfn) != PCI_SLOT(pdev->devfn))
+		return 0;
+
+	fn = cxl_reset_func_map_bit(sibling, wctx->ari);
+	if (test_bit(fn, wctx->non_cxl_func_map))
+		return 0;
+
+	rc = cxl_reset_has_cache_or_mem(sibling);
+	if (rc < 0) {
+		wctx->rc = rc;
+		return rc;
+	}
+	if (!rc)
+		return 0;
+
+	wctx->rc = cxl_reset_add_sibling(ctx, sibling);
+	return wctx->rc;
+}
+
+static int cxl_reset_collect_siblings(struct cxl_reset_context *ctx)
+{
+	struct pci_dev *pdev = ctx->target;
+	struct cxl_reset_walk_context wctx = {
+		.ctx = ctx,
+		.ari = pci_ari_enabled(pdev->bus),
+	};
+
+	cxl_reset_read_non_cxl_func_map(pdev, wctx.non_cxl_func_map);
+	pci_walk_bus(pdev->bus, cxl_reset_collect_sibling, &wctx);
+
+	return wctx.rc;
+}
+
 static void cxl_hdm_range_context_init(struct cxl_hdm_range_context *ctx)
 {
 	INIT_LIST_HEAD(&ctx->ranges);
@@ -755,6 +926,7 @@ static int cxl_reset_execute(struct pci_dev *pdev, int dvsec)
 int cxl_reset_function(struct pci_dev *pdev, bool probe)
 {
 	struct cxl_hdm_range_context range_ctx;
+	struct cxl_reset_context ctx;
 	int dvsec;
 	int rc;
 
@@ -765,14 +937,20 @@ int cxl_reset_function(struct pci_dev *pdev, bool probe)
 	if (probe)
 		return 0;
 
+	cxl_reset_context_init(&ctx, pdev);
 	cxl_hdm_range_context_init(&range_ctx);
 
+	rc = cxl_reset_collect_siblings(&ctx);
+	if (rc)
+		goto out;
+
 	scoped_guard(rwsem_write, &cxl_rwsem.region) {
 		rc = cxl_hdm_ranges_prepare(&range_ctx, pdev);
 		if (!rc)
 			rc = cxl_reset_execute(pdev, dvsec);
 	}
-
+out:
 	cxl_hdm_range_context_destroy(&range_ctx);
+	cxl_reset_context_destroy(&ctx);
 	return rc;
 }
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 194ae56b4404..7fc1d34fcce7 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -1380,6 +1380,7 @@
 
 /* CXL r4.0, 8.1.4: Non-CXL Function Map DVSEC */
 #define PCI_DVSEC_CXL_FUNCTION_MAP			2
+#define  PCI_DVSEC_CXL_FUNCTION_MAP_REG			0x0C
 
 /* CXL r4.0, 8.1.5: Extensions DVSEC for Ports */
 #define PCI_DVSEC_CXL_PORT				3
-- 
2.43.0



* [PATCH v7 08/11] cxl: Coordinate sibling functions for CXL reset
  2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
                   ` (6 preceding siblings ...)
  2026-06-23  3:24 ` [PATCH v7 07/11] PCI/cxl: Discover the CXL reset scope Srirangan Madhavan
@ 2026-06-23  3:24 ` Srirangan Madhavan
  2026-06-23  3:24 ` [PATCH v7 09/11] cxl: Restore CXL HDM state after PCI reset Srirangan Madhavan
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC (permalink / raw)
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

CXL Device Reset affects all CXL.cache and CXL.mem functions in the reset
scope. Lock same-scope siblings with pci_dev_trylock(), save/disable them,
drain pending transactions, and hold IOMMU reset blocks until recovery.

Also include mem-capable siblings in HDM range validation and CPU cache
invalidation. Cache-only siblings are quiesced, but skipped for HDM range
handling.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
 drivers/cxl/core/reset.c | 146 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 130 insertions(+), 16 deletions(-)
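
Reviewer note, not part of the commit: the sibling locking in this patch is
an all-or-nothing trylock with rollback. A minimal userspace sketch of that
pattern follows; struct sketch_dev and the sketch_* helpers are hypothetical
and only model the control flow, not pci_dev_trylock() itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sibling function and its device lock. */
struct sketch_dev {
	bool locked;
};

struct sketch_ctx {
	struct sketch_dev **siblings;
	size_t nr_siblings;
	size_t nr_locked; /* siblings[0..nr_locked) are held by this context */
};

/* Non-blocking lock attempt: fails if someone else already holds it. */
static bool sketch_trylock(struct sketch_dev *dev)
{
	if (dev->locked)
		return false;
	dev->locked = true;
	return true;
}

/* Release in reverse order, using the counter as the rollback cursor. */
static void sketch_unlock_all(struct sketch_ctx *ctx)
{
	while (ctx->nr_locked)
		ctx->siblings[--ctx->nr_locked]->locked = false;
}

/* Take every sibling lock or none; never sleep waiting for a holder. */
static int sketch_lock_all(struct sketch_ctx *ctx)
{
	for (size_t i = 0; i < ctx->nr_siblings; i++) {
		if (!sketch_trylock(ctx->siblings[i])) {
			sketch_unlock_all(ctx);
			return -EAGAIN;
		}
		ctx->nr_locked++;
	}

	return 0;
}
```

Because the counter only covers locks this context took, calling the unlock
helper again on an error path is harmless, which appears to be why the
out_unlock label can be reached after a failed lock attempt.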

diff --git a/drivers/cxl/core/reset.c b/drivers/cxl/core/reset.c
index 1ae714a3595c..69bcfab89858 100644
--- a/drivers/cxl/core/reset.c
+++ b/drivers/cxl/core/reset.c
@@ -344,10 +344,17 @@ static const u32 cxl_reset_timeout_ms[] = {
 #define CXL_RESET_FUNCTION_MAP_REGS (CXL_RESET_MAX_FUNCTIONS / 32)
 #define CXL_RESET_SIBLINGS_INIT 8
 
+struct cxl_reset_sibling {
+	struct pci_dev *pdev;
+	bool has_mem;
+};
+
 struct cxl_reset_context {
 	struct pci_dev *target;
-	struct pci_dev **siblings;
+	struct cxl_reset_sibling *siblings;
 	int nr_siblings;
+	int nr_siblings_locked;
+	int nr_siblings_prepared;
 	int sibling_capacity;
 };
 
@@ -380,7 +387,7 @@ static void cxl_reset_context_init(struct cxl_reset_context *ctx,
 static void cxl_reset_context_destroy(struct cxl_reset_context *ctx)
 {
 	for (int i = 0; i < ctx->nr_siblings; i++)
-		pci_dev_put(ctx->siblings[i]);
+		pci_dev_put(ctx->siblings[i].pdev);
 	kfree(ctx->siblings);
 }
 
@@ -426,35 +433,49 @@ static int cxl_reset_func_map_bit(struct pci_dev *sibling, bool ari)
 	return PCI_FUNC(sibling->devfn) * 32 + PCI_SLOT(sibling->devfn);
 }
 
-static int cxl_reset_has_cache_or_mem(struct pci_dev *pdev)
+static int cxl_reset_read_cxl_cap(struct pci_dev *pdev, u16 *cap)
 {
 	int dvsec, rc;
-	u16 cap;
 
 	dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
 					  PCI_DVSEC_CXL_DEVICE);
 	if (!dvsec)
-		return 0;
+		return -ENODEV;
 
-	rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_CAP, &cap);
+	rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_CAP, cap);
 	if (rc) {
 		rc = pcibios_err_to_errno(rc);
-		pci_warn(pdev,
-			 "failed to read CXL capability; cannot determine reset scope: %d\n",
-			 rc);
+		pci_warn(pdev, "failed to read CXL capability: %d\n", rc);
 		return rc;
 	}
 
+	return 0;
+}
+
+static int cxl_reset_has_cache_or_mem(struct pci_dev *pdev, bool *has_mem)
+{
+	u16 cap;
+	int rc;
+
+	*has_mem = false;
+
+	rc = cxl_reset_read_cxl_cap(pdev, &cap);
+	if (rc == -ENODEV)
+		return 0;
+	if (rc)
+		return rc;
+
+	*has_mem = cap & PCI_DVSEC_CXL_MEM_CAPABLE;
 	return !!(cap & (PCI_DVSEC_CXL_CACHE_CAPABLE |
 			 PCI_DVSEC_CXL_MEM_CAPABLE));
 }
 
 static int cxl_reset_add_sibling(struct cxl_reset_context *ctx,
-				 struct pci_dev *sibling)
+				 struct pci_dev *sibling, bool has_mem)
 {
 	if (ctx->nr_siblings >= ctx->sibling_capacity) {
 		int capacity = ctx->sibling_capacity ?: CXL_RESET_SIBLINGS_INIT;
-		struct pci_dev **siblings;
+		struct cxl_reset_sibling *siblings;
 
 		if (capacity > INT_MAX / 2)
 			return -ENOMEM;
@@ -470,7 +491,11 @@ static int cxl_reset_add_sibling(struct cxl_reset_context *ctx,
 		ctx->sibling_capacity = capacity;
 	}
 
-	ctx->siblings[ctx->nr_siblings++] = pci_dev_get(sibling);
+	ctx->siblings[ctx->nr_siblings] = (struct cxl_reset_sibling) {
+		.pdev = pci_dev_get(sibling),
+		.has_mem = has_mem,
+	};
+	ctx->nr_siblings++;
 	return 0;
 }
 
@@ -479,6 +504,7 @@ static int cxl_reset_collect_sibling(struct pci_dev *sibling, void *data)
 	struct cxl_reset_walk_context *wctx = data;
 	struct cxl_reset_context *ctx = wctx->ctx;
 	struct pci_dev *pdev = ctx->target;
+	bool has_mem;
 	int fn, rc;
 
 	if (sibling == pdev)
@@ -494,7 +520,7 @@ static int cxl_reset_collect_sibling(struct pci_dev *sibling, void *data)
 	if (test_bit(fn, wctx->non_cxl_func_map))
 		return 0;
 
-	rc = cxl_reset_has_cache_or_mem(sibling);
+	rc = cxl_reset_has_cache_or_mem(sibling, &has_mem);
 	if (rc < 0) {
 		wctx->rc = rc;
 		return rc;
@@ -502,7 +528,7 @@ static int cxl_reset_collect_sibling(struct pci_dev *sibling, void *data)
 	if (!rc)
 		return 0;
 
-	wctx->rc = cxl_reset_add_sibling(ctx, sibling);
+	wctx->rc = cxl_reset_add_sibling(ctx, sibling, has_mem);
 	return wctx->rc;
 }
 
@@ -520,6 +546,69 @@ static int cxl_reset_collect_siblings(struct cxl_reset_context *ctx)
 	return wctx.rc;
 }
 
+static void cxl_pci_functions_unlock(struct cxl_reset_context *ctx)
+{
+	while (ctx->nr_siblings_locked) {
+		struct pci_dev *sibling;
+
+		sibling = ctx->siblings[--ctx->nr_siblings_locked].pdev;
+		pci_dev_unlock(sibling);
+	}
+}
+
+static int cxl_pci_functions_lock(struct cxl_reset_context *ctx)
+{
+	for (int i = 0; i < ctx->nr_siblings; i++) {
+		struct pci_dev *sibling = ctx->siblings[i].pdev;
+
+		if (!pci_dev_trylock(sibling)) {
+			cxl_pci_functions_unlock(ctx);
+			return -EAGAIN;
+		}
+
+		ctx->nr_siblings_locked++;
+	}
+
+	return 0;
+}
+
+static void cxl_pci_functions_reset_done(struct cxl_reset_context *ctx)
+{
+	while (ctx->nr_siblings_prepared) {
+		struct pci_dev *sibling;
+
+		sibling = ctx->siblings[--ctx->nr_siblings_prepared].pdev;
+		pci_dev_reset_iommu_done(sibling);
+		pci_dev_restore(sibling);
+	}
+}
+
+static int cxl_pci_functions_reset_prepare(struct cxl_reset_context *ctx)
+{
+	for (int i = 0; i < ctx->nr_siblings_locked; i++) {
+		struct pci_dev *sibling = ctx->siblings[i].pdev;
+		int rc;
+
+		pci_dev_save_and_disable(sibling);
+		if (!pci_wait_for_pending_transaction(sibling))
+			pci_err(sibling,
+				"timed out waiting for pending transactions\n");
+
+		rc = pci_dev_reset_iommu_prepare(sibling);
+		if (rc) {
+			pci_err(sibling,
+				"failed to stop IOMMU for CXL reset: %d\n",
+				rc);
+			pci_dev_restore(sibling);
+			return rc;
+		}
+
+		ctx->nr_siblings_prepared++;
+	}
+
+	return 0;
+}
+
 static void cxl_hdm_range_context_init(struct cxl_hdm_range_context *ctx)
 {
 	INIT_LIST_HEAD(&ctx->ranges);
@@ -716,8 +805,9 @@ static int cxl_hdm_ranges_flush_cpu_caches(struct cxl_hdm_range_context *ctx,
 }
 
 static int cxl_hdm_ranges_prepare(struct cxl_hdm_range_context *ctx,
-				  struct pci_dev *pdev)
+				  struct cxl_reset_context *reset_ctx)
 {
+	struct pci_dev *pdev = reset_ctx->target;
 	int rc;
 
 	lockdep_assert_held_write(&cxl_rwsem.region);
@@ -726,6 +816,17 @@ static int cxl_hdm_ranges_prepare(struct cxl_hdm_range_context *ctx,
 	if (rc)
 		return rc;
 
+	for (int i = 0; i < reset_ctx->nr_siblings; i++) {
+		struct cxl_reset_sibling *sibling = &reset_ctx->siblings[i];
+
+		if (!sibling->has_mem)
+			continue;
+
+		rc = cxl_hdm_ranges_collect(ctx, sibling->pdev);
+		if (rc)
+			return rc;
+	}
+
 	rc = cxl_hdm_ranges_request(ctx);
 	if (rc)
 		return rc;
@@ -944,11 +1045,24 @@ int cxl_reset_function(struct pci_dev *pdev, bool probe)
 	if (rc)
 		goto out;
 
+	rc = cxl_pci_functions_lock(&ctx);
+	if (rc)
+		goto out_unlock;
+
+	rc = cxl_pci_functions_reset_prepare(&ctx);
+	if (rc)
+		goto out_functions_done;
+
 	scoped_guard(rwsem_write, &cxl_rwsem.region) {
-		rc = cxl_hdm_ranges_prepare(&range_ctx, pdev);
+		rc = cxl_hdm_ranges_prepare(&range_ctx, &ctx);
 		if (!rc)
 			rc = cxl_reset_execute(pdev, dvsec);
 	}
+
+out_functions_done:
+	cxl_pci_functions_reset_done(&ctx);
+out_unlock:
+	cxl_pci_functions_unlock(&ctx);
 out:
 	cxl_hdm_range_context_destroy(&range_ctx);
 	cxl_reset_context_destroy(&ctx);
-- 
2.43.0



* [PATCH v7 09/11] cxl: Restore CXL HDM state after PCI reset
  2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
                   ` (7 preceding siblings ...)
  2026-06-23  3:24 ` [PATCH v7 08/11] cxl: Coordinate sibling functions for CXL reset Srirangan Madhavan
@ 2026-06-23  3:24 ` Srirangan Madhavan
  2026-06-23  3:24 ` [PATCH v7 10/11] PCI/cxl: Expose CXL Reset as a PCI reset method Srirangan Madhavan
  2026-06-23  3:24 ` [PATCH v7 11/11] Documentation/ABI: Document CXL Reset " Srirangan Madhavan
  10 siblings, 0 replies; 12+ messages in thread
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC (permalink / raw)
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

After CXL reset, restore enough PCI config state to reach HDM MMIO,
restore the cached global and per-decoder HDM state, and then run the
normal PCI restore callbacks.

Keep target and sibling IOMMU reset blocks active until HDM restore
completes so Bus Master Enable cannot reopen DMA before decoder state is
valid.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
 drivers/cxl/core/hdm.c   |   4 +
 drivers/cxl/core/reset.c | 195 ++++++++++++++++++++++++++++++++++++---
 include/cxl/cxl.h        |   2 +
 3 files changed, 190 insertions(+), 11 deletions(-)
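
Reviewer note, not part of the commit: the per-decoder restore decision in
this patch depends only on the LOCK and COMMITTED bits read back after
reset. The sketch below models that decision as a pure function. The
SKETCH_* names and bit positions are illustrative assumptions; the
authoritative definitions are the CXL_HDM_DECODER0_CTRL_* macros.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SKETCH_CTRL_LOCK      (1u << 8)
#define SKETCH_CTRL_COMMIT    (1u << 9)
#define SKETCH_CTRL_COMMITTED (1u << 10)

enum sketch_action {
	SKETCH_KEEP,     /* locked and committed: hardware retained its state */
	SKETCH_BUSY,     /* locked but not committed: cannot be reprogrammed */
	SKETCH_COMMIT,   /* unlocked and idle: program, then commit */
	SKETCH_UNCOMMIT, /* unlocked and committed: clear COMMIT, wait, commit */
};

/* Classify a decoder control value the way the restore path treats it. */
static enum sketch_action sketch_restore_action(uint32_t ctrl)
{
	if (ctrl & SKETCH_CTRL_LOCK)
		return (ctrl & SKETCH_CTRL_COMMITTED) ? SKETCH_KEEP
						      : SKETCH_BUSY;

	if (!(ctrl & SKETCH_CTRL_COMMITTED))
		return SKETCH_COMMIT;

	return SKETCH_UNCOMMIT;
}
```

Only decoders whose cached settings were enabled go through this decision
in the patch; disabled entries are skipped before any register is read.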

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 0230ebfada42..095cc13e5d00 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -152,6 +152,10 @@ static void cxl_hdm_info_set_decoder(struct cxl_hdm *cxlhdm,
 	if (!info || cxld->id >= info->decoder_count)
 		return;
 
+	if (cxlhdm->regs.hdm_decoder)
+		info->global_ctrl = readl(cxlhdm->regs.hdm_decoder +
+					  CXL_HDM_DECODER_CTRL_OFFSET);
+
 	if (cxld->flags & CXL_DECODER_F_ENABLE)
 		info->settings[cxld->id] = cxld->settings;
 	else
diff --git a/drivers/cxl/core/reset.c b/drivers/cxl/core/reset.c
index 69bcfab89858..d801c91a5cbf 100644
--- a/drivers/cxl/core/reset.c
+++ b/drivers/cxl/core/reset.c
@@ -83,6 +83,21 @@ static int cxld_await_commit(void __iomem *hdm, int id)
 	return -ETIMEDOUT;
 }
 
+static int cxld_await_uncommit(void __iomem *hdm, int id)
+{
+	u32 ctrl;
+	int i;
+
+	for (i = 0; i < COMMIT_TIMEOUT_MS; i++) {
+		ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+		if (!FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
+			return 0;
+		fsleep(1000);
+	}
+
+	return -ETIMEDOUT;
+}
+
 static void setup_hw_decoder(struct cxl_decoder_settings *settings,
 			     void __iomem *hdm)
 {
@@ -92,6 +107,8 @@ static void setup_hw_decoder(struct cxl_decoder_settings *settings,
 	u32 ctrl;
 
 	ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+	ctrl &= ~(CXL_HDM_DECODER0_CTRL_COMMIT |
+		  CXL_HDM_DECODER0_CTRL_COMMIT_ERROR);
 	cxld_set_interleave(settings, &ctrl);
 	cxld_set_type(settings, &ctrl);
 	base = settings->hpa_range.start;
@@ -300,6 +317,8 @@ int pci_cxl_hdm_init(struct pci_dev *pdev)
 
 	info->decoder_count = decoder_count;
 	info->regs = regs;
+	info->global_ctrl = readl(regs.hdm_decoder +
+				  CXL_HDM_DECODER_CTRL_OFFSET);
 
 	settings = info->settings;
 	for (int i = 0; i < info->decoder_count; i++) {
@@ -324,6 +343,100 @@ int pci_cxl_hdm_init(struct pci_dev *pdev)
 	return rc;
 }
 
+static int cxl_hdm_decoder_uncommit(struct pci_dev *pdev, void __iomem *hdm,
+				    int id, bool *locked_committed)
+{
+	u32 ctrl;
+	int rc;
+
+	*locked_committed = false;
+	ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+	if (ctrl & CXL_HDM_DECODER0_CTRL_LOCK) {
+		if (ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED) {
+			pci_dbg(pdev,
+				"CXL HDM decoder %d retained locked committed state\n",
+				id);
+			*locked_committed = true;
+			return 0;
+		}
+
+		pci_err(pdev, "CXL HDM decoder %d is locked\n", id);
+		return -EBUSY;
+	}
+
+	if (!(ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED))
+		return 0;
+
+	ctrl &= ~CXL_HDM_DECODER0_CTRL_COMMIT;
+	writel(ctrl, hdm + CXL_HDM_DECODER0_CTRL_OFFSET(id));
+
+	rc = cxld_await_uncommit(hdm, id);
+	if (rc)
+		pci_err(pdev, "CXL HDM decoder %d uncommit failed: %d\n",
+			id, rc);
+
+	return rc;
+}
+
+static int cxl_restore_hdm_decoder(struct pci_dev *pdev,
+				   struct cxl_decoder_settings *settings,
+				   void __iomem *hdm)
+{
+	bool locked_committed;
+	int rc;
+
+	if (!(settings->flags & CXL_DECODER_F_ENABLE))
+		return 0;
+
+	rc = cxl_hdm_decoder_uncommit(pdev, hdm, settings->id,
+				      &locked_committed);
+	if (rc)
+		return rc;
+	if (locked_committed)
+		return 0;
+
+	rc = cxl_commit(settings, hdm);
+	if (rc)
+		pci_err(pdev, "CXL HDM decoder %d restore failed: %d\n",
+			settings->id, rc);
+
+	return rc;
+}
+
+static int cxl_restore_hdm(struct pci_dev *pdev)
+{
+	struct cxl_hdm_info *info = READ_ONCE(pdev->hdm);
+	void __iomem *hdm;
+	int first_rc = 0;
+
+	if (!info)
+		return 0;
+
+	hdm = info->regs.hdm_decoder;
+	if (!hdm) {
+		pci_err(pdev, "CXL HDM decoder registers unavailable\n");
+		return -ENXIO;
+	}
+
+	/*
+	 * Restore global HDM control before per-decoder commit. PCI config
+	 * state has been restored for MMIO access, but IOMMU reset blocks
+	 * remain active until HDM restore completes.
+	 */
+	writel(info->global_ctrl, hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+
+	for (int i = 0; i < info->decoder_count; i++) {
+		struct cxl_decoder_settings *settings = &info->settings[i];
+		int rc;
+
+		rc = cxl_restore_hdm_decoder(pdev, settings, hdm);
+		if (rc && !first_rc)
+			first_rc = rc;
+	}
+
+	return first_rc;
+}
+
 /*
  * CXL r4.0 sec 9.7.2 defines the reset completion timeout encodings.
  * Sec 9.7.3 leaves config-space access behavior undefined for 100 ms after
@@ -355,6 +468,7 @@ struct cxl_reset_context {
 	int nr_siblings;
 	int nr_siblings_locked;
 	int nr_siblings_prepared;
+	bool target_prepared;
 	int sibling_capacity;
 };
 
@@ -609,6 +723,68 @@ static int cxl_pci_functions_reset_prepare(struct cxl_reset_context *ctx)
 	return 0;
 }
 
+static void cxl_pci_target_reset_done(struct cxl_reset_context *ctx)
+{
+	if (!ctx->target_prepared)
+		return;
+
+	pci_dev_reset_iommu_done(ctx->target);
+	ctx->target_prepared = false;
+}
+
+static int cxl_pci_target_reset_prepare(struct cxl_reset_context *ctx)
+{
+	struct pci_dev *pdev = ctx->target;
+	int rc;
+
+	if (!pci_wait_for_pending_transaction(pdev))
+		pci_err(pdev, "timed out waiting for pending transactions\n");
+
+	rc = pci_dev_reset_iommu_prepare(pdev);
+	if (rc) {
+		pci_err(pdev, "failed to stop IOMMU for CXL reset: %d\n", rc);
+		return rc;
+	}
+
+	ctx->target_prepared = true;
+	return 0;
+}
+
+static void cxl_pci_functions_restore_state(struct cxl_reset_context *ctx)
+{
+	/*
+	 * Restore PCI config state first so HDM MMIO is reachable. The final
+	 * pci_dev_restore() pass deliberately replays pci_restore_state()
+	 * before invoking driver reset_done() callbacks.
+	 */
+	pci_restore_state(ctx->target);
+
+	for (int i = 0; i < ctx->nr_siblings_prepared; i++)
+		pci_restore_state(ctx->siblings[i].pdev);
+}
+
+static int cxl_restore_hdm_decoders(struct cxl_reset_context *ctx)
+{
+	int first_rc = 0;
+	int rc;
+
+	cxl_pci_functions_restore_state(ctx);
+
+	rc = cxl_restore_hdm(ctx->target);
+	if (rc && !first_rc)
+		first_rc = rc;
+
+	for (int i = 0; i < ctx->nr_siblings_prepared; i++) {
+		struct pci_dev *sibling = ctx->siblings[i].pdev;
+
+		rc = cxl_restore_hdm(sibling);
+		if (rc && !first_rc)
+			first_rc = rc;
+	}
+
+	return first_rc;
+}
+
 static void cxl_hdm_range_context_init(struct cxl_hdm_range_context *ctx)
 {
 	INIT_LIST_HEAD(&ctx->ranges);
@@ -985,18 +1161,9 @@ static int cxl_reset_execute(struct pci_dev *pdev, int dvsec)
 	if (rc)
 		return pcibios_err_to_errno(rc);
 
-	if (!pci_wait_for_pending_transaction(pdev))
-		pci_err(pdev, "timed out waiting for pending transactions\n");
-
-	rc = pci_dev_reset_iommu_prepare(pdev);
-	if (rc) {
-		pci_err(pdev, "failed to stop IOMMU for CXL reset: %d\n", rc);
-		return rc;
-	}
-
 	rc = cxl_reset_disable_cache(pdev, dvsec, cap);
 	if (rc)
-		goto out;
+		return rc;
 	cache_disabled = true;
 
 	rc = cxl_reset_update_ctrl2(pdev, dvsec, PCI_DVSEC_CXL_INIT_CXL_RST,
@@ -1020,7 +1187,6 @@ static int cxl_reset_execute(struct pci_dev *pdev, int dvsec)
 			rc = rc2;
 	}
 
-	pci_dev_reset_iommu_done(pdev);
 	return rc;
 }
 
@@ -1053,12 +1219,19 @@ int cxl_reset_function(struct pci_dev *pdev, bool probe)
 	if (rc)
 		goto out_functions_done;
 
+	rc = cxl_pci_target_reset_prepare(&ctx);
+	if (rc)
+		goto out_functions_done;
+
 	scoped_guard(rwsem_write, &cxl_rwsem.region) {
 		rc = cxl_hdm_ranges_prepare(&range_ctx, &ctx);
 		if (!rc)
 			rc = cxl_reset_execute(pdev, dvsec);
+		if (!rc)
+			rc = cxl_restore_hdm_decoders(&ctx);
 	}
 
+	cxl_pci_target_reset_done(&ctx);
 out_functions_done:
 	cxl_pci_functions_reset_done(&ctx);
 out_unlock:
diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
index 1fe606f15733..eddc48f1fa49 100644
--- a/include/cxl/cxl.h
+++ b/include/cxl/cxl.h
@@ -127,11 +127,13 @@ struct cxl_regs {
  * struct cxl_hdm_info - PCI device HDM decoder programming cache
  * @decoder_count: number of decoder settings entries
  * @regs: mapped CXL component registers for this HDM decoder block
+ * @global_ctrl: cached HDM decoder global control register
  * @settings: cached per-decoder programming state
  */
 struct cxl_hdm_info {
 	int decoder_count;
 	struct cxl_component_regs regs;
+	u32 global_ctrl;
 	struct cxl_decoder_settings settings[] __counted_by(decoder_count);
 };
 
-- 
2.43.0



* [PATCH v7 10/11] PCI/cxl: Expose CXL Reset as a PCI reset method
  2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
                   ` (8 preceding siblings ...)
  2026-06-23  3:24 ` [PATCH v7 09/11] cxl: Restore CXL HDM state after PCI reset Srirangan Madhavan
@ 2026-06-23  3:24 ` Srirangan Madhavan
  2026-06-23  3:24 ` [PATCH v7 11/11] Documentation/ABI: Document CXL Reset " Srirangan Madhavan
  10 siblings, 0 replies; 12+ messages in thread
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC (permalink / raw)
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Add the CXL Reset helper to the PCI reset-method table so userspace can
select it through the existing reset_method ABI.

Advertise the method for Type 2 CXL devices that report CXL Reset support
in the CXL Device DVSEC. Reset execution still requires cached HDM decoder
state for the target and mem-capable siblings so that affected ranges can
be validated and HDM programming can be restored. If that state is
unavailable at reset time, return -ENOTTY so PCI can try the next reset
method.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
 drivers/cxl/core/reset.c | 33 +++++++++++++++++++++++++++++++++
 drivers/pci/pci.c        |  2 ++
 include/linux/pci.h      |  2 +-
 3 files changed, 36 insertions(+), 1 deletion(-)
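
Reviewer note, not part of the commit: from userspace, selecting and
triggering the method amounts to two attribute writes. The sketch below
assumes the caller passes the device's sysfs directory (the address is
deliberately not filled in here) and that writing "1" to the reset
attribute triggers the reset, as with the other PCI reset methods. The
sketch_* helpers are hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Write a short string to an attribute file. Returns 0 or a negative errno. */
static int sketch_write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int rc = 0;

	if (!f)
		return -errno;
	if (fputs(value, f) == EOF)
		rc = errno ? -errno : -EIO;
	/* Buffered writes may only fail at close time, so check that too. */
	if (fclose(f) == EOF && !rc)
		rc = errno ? -errno : -EIO;

	return rc;
}

/* Build "<dev_dir>/<attr>" and write value to it. */
static int sketch_write_dev_attr(const char *dev_dir, const char *attr,
				 const char *value)
{
	char path[512];
	int n = snprintf(path, sizeof(path), "%s/%s", dev_dir, attr);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	return sketch_write_attr(path, value);
}

/* Select "cxl_reset" as the reset method, then trigger the reset. */
static int sketch_cxl_reset(const char *dev_dir)
{
	int rc = sketch_write_dev_attr(dev_dir, "reset_method", "cxl_reset");

	if (rc)
		return rc;

	return sketch_write_dev_attr(dev_dir, "reset", "1");
}
```

A caller should expect the second write to fail when the kernel rejects the
reset, for example when affected CXL memory is busy or a sibling lock
cannot be taken.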

diff --git a/drivers/cxl/core/reset.c b/drivers/cxl/core/reset.c
index d801c91a5cbf..694d8a3789a4 100644
--- a/drivers/cxl/core/reset.c
+++ b/drivers/cxl/core/reset.c
@@ -1035,6 +1035,34 @@ static int cxl_reset_dvsec(struct pci_dev *pdev)
 	return dvsec;
 }
 
+static bool cxl_reset_hdm_available(struct pci_dev *pdev)
+{
+	struct cxl_hdm_info *info = READ_ONCE(pdev->hdm);
+
+	/*
+	 * pdev->hdm is allocated with PCI-device devres. Reset requests
+	 * operate on a live pci_dev, so the devres allocation remains valid
+	 * for this check.
+	 */
+	return info && info->regs.hdm_decoder;
+}
+
+static bool cxl_reset_scope_hdm_available(struct cxl_reset_context *ctx)
+{
+	if (!cxl_reset_hdm_available(ctx->target))
+		return false;
+
+	for (int i = 0; i < ctx->nr_siblings; i++) {
+		struct cxl_reset_sibling *sibling = &ctx->siblings[i];
+
+		if (sibling->has_mem &&
+		    !cxl_reset_hdm_available(sibling->pdev))
+			return false;
+	}
+
+	return true;
+}
+
 static int cxl_reset_update_ctrl2(struct pci_dev *pdev, int dvsec, u16 set,
 				  u16 clear)
 {
@@ -1211,6 +1239,11 @@ int cxl_reset_function(struct pci_dev *pdev, bool probe)
 	if (rc)
 		goto out;
 
+	if (!cxl_reset_scope_hdm_available(&ctx)) {
+		rc = -ENOTTY;
+		goto out;
+	}
+
 	rc = cxl_pci_functions_lock(&ctx);
 	if (rc)
 		goto out_unlock;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 360f2aaee10c..b1ec20126390 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -33,6 +33,7 @@
 #include <asm/dma.h>
 #include <linux/aer.h>
 #include <linux/bitfield.h>
+#include <cxl/cxl.h>
 #include "pci.h"
 
 DEFINE_MUTEX(pci_slot_mutex);
@@ -5081,6 +5082,7 @@ const struct pci_reset_fn_method pci_reset_fn_methods[] = {
 	{ pci_dev_acpi_reset, .name = "acpi" },
 	{ pcie_reset_flr, .name = "flr" },
 	{ pci_af_flr, .name = "af_flr" },
+	{ cxl_reset_function, .name = "cxl_reset" },
 	{ pci_pm_reset, .name = "pm" },
 	{ pci_reset_bus_function, .name = "bus" },
 	{ cxl_reset_bus_function, .name = "cxl_bus" },
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 4df030837a3a..05b5feac5a49 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -51,7 +51,7 @@
 			       PCI_STATUS_PARITY)
 
 /* Number of reset methods used in pci_reset_fn_methods array in pci.c */
-#define PCI_NUM_RESET_METHODS 8
+#define PCI_NUM_RESET_METHODS 9
 
 #define PCI_RESET_PROBE		true
 #define PCI_RESET_DO_RESET	false
-- 
2.43.0



* [PATCH v7 11/11] Documentation/ABI: Document CXL Reset PCI reset method
  2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
                   ` (9 preceding siblings ...)
  2026-06-23  3:24 ` [PATCH v7 10/11] PCI/cxl: Expose CXL Reset as a PCI reset method Srirangan Madhavan
@ 2026-06-23  3:24 ` Srirangan Madhavan
  10 siblings, 0 replies; 12+ messages in thread
From: Srirangan Madhavan @ 2026-06-23  3:24 UTC (permalink / raw)
  To: Alison Schofield, Bjorn Helgaas, Dan Williams, Dave Jiang,
	Davidlohr Bueso, Ira Weiny, Jonathan Cameron, Vishal Verma,
	linux-cxl, linux-pci, linux-kernel
  Cc: vsethi, alwilliamson, Dan Williams, Sai Yashwanth Reddy Kancherla,
	Vishal Aslot, Manish Honap, Jiandi An, Richard Cheng, linux-tegra,
	Srirangan Madhavan

Document the "cxl_reset" PCI reset_method value for CXL Type 2 devices.
CXL Reset is device-scoped. The kernel requires affected memory to be
idle, invalidates CPU caches before the reset, restores cached HDM
decoder state after the reset, and does not request Memory Clear.

Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index b767db2c52cb..dd8de5c7eb77 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -153,6 +153,20 @@ Description:
 		"default" enables all supported reset methods in the
 		default ordering.
 
+		If present, "cxl_reset" selects CXL Reset for CXL Type 2
+		devices that advertise CXL Reset support.  CXL Reset is device
+		scoped and affects all CXL.cache and CXL.mem functions.
+
+		Before issuing CXL Reset, the kernel quiesces affected PCI
+		functions, rejects the operation if affected CXL memory is busy,
+		invalidates CPU caches for enabled HDM ranges, disables
+		CXL.cache, and initiates cache write-back where supported.  After
+		reset, the kernel restores PCI config state to access HDM MMIO,
+		restores cached HDM decoder state, and then completes reset
+		recovery for the affected functions.  If cached HDM decoder
+		state is unavailable at reset time, the kernel skips this reset
+		method.  "cxl_reset" does not request CXL Reset Memory Clear.
+
 What:		/sys/bus/pci/devices/.../reset
 Date:		July 2009
 Contact:	Michael S. Tsirkin <mst@redhat.com>
-- 
2.43.0
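
The two-step userspace flow documented above (select "cxl_reset" in
reset_method, then write to reset) might look like the following
sketch. The helper names write_attr() and request_cxl_reset() are
hypothetical, and the caller supplies the device directory, since the
path is elided in the ABI text. The sketch assumes both attributes
already exist and that the caller has permission to write them.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a string to an existing attribute file under dir.
 * Returns 0 on success or a negative errno value.
 */
static int write_attr(const char *dir, const char *attr, const char *value)
{
	char path[512];
	size_t len = strlen(value);
	ssize_t written;
	int fd, n, rc = 0;

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	/* No O_CREAT: sysfs attributes are created by the kernel. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, value, len);
	if (written < 0)
		rc = -errno;
	else if ((size_t)written != len)
		rc = -EIO;

	close(fd);
	return rc;
}

/*
 * Select the "cxl_reset" method, then trigger the reset.
 * dev_dir is the device's sysfs directory, chosen by the caller.
 */
static int request_cxl_reset(const char *dev_dir)
{
	int rc = write_attr(dev_dir, "reset_method", "cxl_reset");

	if (rc)
		return rc;

	return write_attr(dev_dir, "reset", "1");
}
```

A caller would pass the sysfs directory of the Type 2 function and
treat any negative return as "reset not performed". The first write is
expected to fail when the device does not offer cxl_reset, which keeps
the second write from triggering a different reset method.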




Thread overview: 12+ messages
2026-06-23  3:24 [PATCH v7 00/11] PCI/CXL: Add CXL reset support for Type 2 devices Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 01/11] cxl: Split decoder programming into a reusable helper Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 02/11] cxl: Cache decoder settings on PCI devices Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 03/11] cxl: Cache endpoint decoder settings during PCI enumeration Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 04/11] PCI: Export pci_dev_save_and_disable() and pci_dev_restore() Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 05/11] cxl: Add CXL Device Reset helper Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 06/11] cxl: Validate HDM ranges before CXL reset Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 07/11] PCI/cxl: Discover the CXL reset scope Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 08/11] cxl: Coordinate sibling functions for CXL reset Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 09/11] cxl: Restore CXL HDM state after PCI reset Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 10/11] PCI/cxl: Expose CXL Reset as a PCI reset method Srirangan Madhavan
2026-06-23  3:24 ` [PATCH v7 11/11] Documentation/ABI: Document CXL Reset " Srirangan Madhavan
