Linux CXL
* [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
From: Huaisheng Ye @ 2025-04-06 11:27 UTC
  To: Jonathan.Cameron, dan.j.williams, dave.jiang
  Cc: pei.p.jia, linux-cxl, Huaisheng Ye

In some scenarios, probing an endpoint port fails because a range
register (RR) decodes outside the platform-defined CXL ranges.

  [kernel debug message]
  cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by
  platform
  cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL
  ranges.
  cxl_bus_probe:2073: cxl_port endpoint3: probe: -6
  call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port
  rejects match -6

This defect has been reproducible intermittently on the QEMU CXL branch
for a long while, including the latest branch, cxl-2025-03-20.

The root cause is that the CXL_DVSEC_MEM_ENABLE bit of the DVSEC control
register is already set when cxl_hdm_decode_init() runs, while
CXL_HDM_DECODER_ENABLE is NOT set and the endpoint's dvsec_range is not
covered by any root decoder's hpa_range.

This patch works around firmware problems of this kind. Instead of
returning an error when an RR decodes outside the platform CXL ranges,
the driver clears the Mem_Enable bit of the DVSEC control register and
then proceeds as if info->mem_enabled had not been set: it enables the
HDM decoders and sets Mem_Enable again.

Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com>
---
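Note (not part of the commit): the reordering described above can be
modelled in isolation. The following is only a minimal userspace sketch,
not driver code. struct hpa_range, enum rr_action and every function name
in it are hypothetical stand-ins introduced for illustration; only the
ordering of the checks is taken from the diff below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an address range; bounds are inclusive. */
struct hpa_range {
	uint64_t start;
	uint64_t end;
};

/* Next step taken by the modelled init path (illustrative names only). */
enum rr_action {
	RR_ENABLE_HDM,		/* Mem_Enable clear: enable HDM, then Mem_Enable */
	RR_CLEAR_MEM_ENABLE,	/* patch only: clear Mem_Enable, then as above */
	RR_REJECT,		/* fail the probe with -ENXIO */
	RR_CONTINUE,		/* carry on with the rest of the function */
};

/* True when 'inner' lies entirely within 'outer'. */
static bool hpa_range_contains(const struct hpa_range *outer,
			       const struct hpa_range *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/* Number of DVSEC ranges that fit inside at least one platform window. */
static int count_allowed(const struct hpa_range *win, size_t nr_win,
			 const struct hpa_range *dvsec, size_t nr_dvsec)
{
	int allowed = 0;

	for (size_t i = 0; i < nr_dvsec; i++) {
		for (size_t w = 0; w < nr_win; w++) {
			if (hpa_range_contains(&win[w], &dvsec[i])) {
				allowed++;
				break;
			}
		}
	}
	return allowed;
}

/* Ordering before the patch: the Mem_Enable test comes first. */
static enum rr_action decide_before(bool mem_enabled, int allowed)
{
	if (!mem_enabled)
		return RR_ENABLE_HDM;
	if (!allowed)
		return RR_REJECT;
	return RR_CONTINUE;
}

/* Ordering with the patch: ranges are counted first, then the fallback. */
static enum rr_action decide_after(bool mem_enabled, int allowed)
{
	if (mem_enabled && !allowed)
		return RR_CLEAR_MEM_ENABLE;
	if (!mem_enabled)
		return RR_ENABLE_HDM;
	if (!allowed)
		return RR_REJECT;	/* not reachable in this model */
	return RR_CONTINUE;
}
```

In this model the RR_REJECT branch of decide_after() can no longer be
taken, which suggests that the remaining -ENXIO check after the new block
would not trigger once the fallback is in place.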
 drivers/cxl/core/hdm.c |  2 +-
 drivers/cxl/core/pci.c | 35 +++++++++++++++++++++++++----------
 drivers/cxl/cxlmem.h   |  2 ++
 3 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 50e6a45b30ba..b776fb848f42 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -80,7 +80,7 @@ static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
 	u32 hdm_cap;
 
 	hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
-	cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
+	cxlhdm->decoder_count_cap = cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
 	cxlhdm->target_count =
 		FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
 	if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap))
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 013b869b66cb..5452bb285140 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -403,7 +403,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
 	struct cxl_port *port = cxlhdm->port;
 	struct device *dev = cxlds->dev;
 	struct cxl_port *root;
-	int i, rc, allowed;
+	int i, rc, allowed = 0;
 	u32 global_ctrl = 0;
 
 	if (hdm)
@@ -426,15 +426,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
 		return -ENODEV;
 	}
 
-	if (!info->mem_enabled) {
-		rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
-		if (rc)
-			return rc;
-
-		return devm_cxl_enable_mem(&port->dev, cxlds);
-	}
-
-	for (i = 0, allowed = 0; i < info->ranges; i++) {
+	for (i = 0; i < info->ranges; i++) {
 		struct device *cxld_dev;
 
 		cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
@@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
 		allowed++;
 	}
 
+	if (info->mem_enabled && !allowed) {
+		dev_warn(dev, "RR decodes outside platform ranges, retrying with Mem_Enable cleared\n");
+
+		/*
+		 * Instead of returning an error when RR decodes outside platform
+		 * ranges, clear Mem_Enable so that the HDM decoders get enabled below.
+		 */
+		rc = cxl_set_mem_enable(cxlds, 0);
+		if (rc)
+			return rc;
+
+		info->mem_enabled = 0;
+		cxlhdm->decoder_count = cxlhdm->decoder_count_cap;
+	}
+
+	if (!info->mem_enabled) {
+		rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
+		if (rc)
+			return rc;
+
+		return devm_cxl_enable_mem(&port->dev, cxlds);
+	}
+
 	if (!allowed) {
 		dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
 		return -ENXIO;
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 2a25d1957ddb..60b538f8b677 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
  * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities
  * @regs: mapped registers, see devm_cxl_setup_hdm()
  * @decoder_count: number of decoders for this port
+ * @decoder_count_cap: number of decoders from HDM Decoder Capability
  * @target_count: for switch decoders, max downstream port targets
  * @interleave_mask: interleave granularity capability, see check_interleave_cap()
  * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap()
@@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
 struct cxl_hdm {
 	struct cxl_component_regs regs;
 	unsigned int decoder_count;
+	unsigned int decoder_count_cap;
 	unsigned int target_count;
 	unsigned int interleave_mask;
 	unsigned long iw_cap_mask;
-- 
2.39.3



Thread overview: 11+ messages
2025-04-06 11:27 [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges Huaisheng Ye
2025-04-07  8:31 ` Zhijian Li (Fujitsu)
2025-04-09  3:51   ` Ye, Huaisheng
2025-04-09 14:13   ` Gregory Price
2025-04-09 15:13     ` Dave Jiang
2025-04-15 16:21       ` Jonathan Cameron
2025-04-08  3:22 ` Gregory Price
2025-04-09  3:48   ` Ye, Huaisheng
2025-04-09 14:01     ` Gregory Price
2025-04-10  7:12       ` Ye, Huaisheng
2025-04-15 16:30       ` Jonathan Cameron
