* [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges @ 2025-04-06 11:27 Huaisheng Ye 2025-04-07 8:31 ` Zhijian Li (Fujitsu) 2025-04-08 3:22 ` Gregory Price 0 siblings, 2 replies; 11+ messages in thread From: Huaisheng Ye @ 2025-04-06 11:27 UTC (permalink / raw) To: Jonathan.Cameron, dan.j.williams, dave.jiang Cc: pei.p.jia, linux-cxl, Huaisheng Ye In some scenarios, the probe of endpoint ports would fail because of range register (RR) decodes outside platform defined CXL ranges. [kernel debug message] cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by platform cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL ranges. cxl_bus_probe:2073: cxl_port endpoint3: probe: -6 call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port rejects match -6 This defect could be found with Qemu CXL branch for a long while with a specified probability, even with the latest branch cxl-2025-03-20. The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of DVSEC control has been set but in cxl_hdm_decode_init CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is not covered by root decoder's hpa_range. When encountering similar problems of the firmware, the patch could be effective for solving. Instead of Return error when RR decodes outside platform cxl ranges, driver disable Mem_Enable bit of DVSEC control then take it as the way of no setting info->mem_enabled. Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com> --- drivers/cxl/core/hdm.c | 2 +- drivers/cxl/core/pci.c | 35 +++++++++++++++++++++++++---------- drivers/cxl/cxlmem.h | 2 ++ 3 files changed, 28 insertions(+), 11 deletions(-) diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 50e6a45b30ba..b776fb848f42 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -80,7 +80,7 @@ static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm) u32 hdm_cap; hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET); - cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap); + cxlhdm->decoder_count_cap = cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap); cxlhdm->target_count = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap); if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap)) diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c index 013b869b66cb..5452bb285140 100644 --- a/drivers/cxl/core/pci.c +++ b/drivers/cxl/core/pci.c @@ -403,7 +403,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, struct cxl_port *port = cxlhdm->port; struct device *dev = cxlds->dev; struct cxl_port *root; - int i, rc, allowed; + int i, rc, allowed = 0; u32 global_ctrl = 0; if (hdm) @@ -426,15 +426,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, return -ENODEV; } - if (!info->mem_enabled) { - rc = devm_cxl_enable_hdm(&port->dev, cxlhdm); - if (rc) - return rc; - - return devm_cxl_enable_mem(&port->dev, cxlds); - } - - for (i = 0, allowed = 0; i < info->ranges; i++) { + for (i = 0; i < info->ranges; i++) { struct device *cxld_dev; cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i], @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, allowed++; } + if (info->mem_enabled && !allowed) { + dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n"); + + /* + * Instead of Return error when RR decodes outside platform ranges, reenable + * Mem_Enable bit of DVSEC control for a try. + */ + rc = cxl_set_mem_enable(cxlds, 0); + if (rc) + return rc; + + info->mem_enabled = 0; + cxlhdm->decoder_count = cxlhdm->decoder_count_cap; + } + + if (!info->mem_enabled) { + rc = devm_cxl_enable_hdm(&port->dev, cxlhdm); + if (rc) + return rc; + + return devm_cxl_enable_mem(&port->dev, cxlds); + } + if (!allowed) { dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n"); return -ENXIO; diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 2a25d1957ddb..60b538f8b677 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd); * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities * @regs: mapped registers, see devm_cxl_setup_hdm() * @decoder_count: number of decoders for this port + * @decoder_count_cap: number of decoders from HDM Decoder Capability * @target_count: for switch decoders, max downstream port targets * @interleave_mask: interleave granularity capability, see check_interleave_cap() * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap() @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd); struct cxl_hdm { struct cxl_component_regs regs; unsigned int decoder_count; + unsigned int decoder_count_cap; unsigned int target_count; unsigned int interleave_mask; unsigned long iw_cap_mask; -- 2.39.3 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-06 11:27 [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges Huaisheng Ye @ 2025-04-07 8:31 ` Zhijian Li (Fujitsu) 2025-04-09 3:51 ` Ye, Huaisheng 2025-04-09 14:13 ` Gregory Price 2025-04-08 3:22 ` Gregory Price 1 sibling, 2 replies; 11+ messages in thread From: Zhijian Li (Fujitsu) @ 2025-04-07 8:31 UTC (permalink / raw) To: Huaisheng Ye, Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, dave.jiang@intel.com Cc: pei.p.jia@intel.com, linux-cxl@vger.kernel.org On 06/04/2025 19:27, Huaisheng Ye wrote: > In some scenarios, the probe of endpoint ports would fail because of range > register (RR) decodes outside platform defined CXL ranges. > > [kernel debug message] > cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by > platform > cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL > ranges. > cxl_bus_probe:2073: cxl_port endpoint3: probe: -6 > call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port > rejects match -6 > > This defect could be found with Qemu CXL branch for a long while with a > specified probability, even with the latest branch cxl-2025-03-20. Yeah, IIRC, the memdev cannot be enabled again after the guest reboot. Previously, I have to apply this patch[1] to my local QEMU to workaround it. > > The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of > DVSEC control has been set but in cxl_hdm_decode_init > CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is > not covered by root decoder's hpa_range. > > When encountering similar problems of the firmware, the patch could be > effective for solving. > Instead of Return error when RR decodes outside platform cxl ranges, > driver disable Mem_Enable bit of DVSEC control then take it as the way of > no setting info->mem_enabled. > > Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com> Thanks for the fix. It works for me(after revert [1]) Tested-by: Li Zhijian <lizhijian@fujitsu.com> [1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/ > --- > drivers/cxl/core/hdm.c | 2 +- > drivers/cxl/core/pci.c | 35 +++++++++++++++++++++++++---------- > drivers/cxl/cxlmem.h | 2 ++ > 3 files changed, 28 insertions(+), 11 deletions(-) > > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c > index 50e6a45b30ba..b776fb848f42 100644 > --- a/drivers/cxl/core/hdm.c > +++ b/drivers/cxl/core/hdm.c > @@ -80,7 +80,7 @@ static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm) > u32 hdm_cap; > > hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET); > - cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap); > + cxlhdm->decoder_count_cap = cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap); > cxlhdm->target_count = > FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap); > if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap)) > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index 013b869b66cb..5452bb285140 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -403,7 +403,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, > struct cxl_port *port = cxlhdm->port; > struct device *dev = cxlds->dev; > struct cxl_port *root; > - int i, rc, allowed; > + int i, rc, allowed = 0; > u32 global_ctrl = 0; > > if (hdm) > @@ -426,15 +426,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, > return -ENODEV; > } > > - if (!info->mem_enabled) { > - rc = devm_cxl_enable_hdm(&port->dev, cxlhdm); > - if (rc) > - return rc; > - > - return devm_cxl_enable_mem(&port->dev, cxlds); > - } > - > - for (i = 0, allowed = 0; i < info->ranges; i++) { > + for (i = 0; i < info->ranges; i++) { > struct device *cxld_dev; > > cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i], > @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, > allowed++; > } > > + if (info->mem_enabled && !allowed) { > + dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n"); > + > + /* > + * Instead of Return error when RR decodes outside platform ranges, reenable > + * Mem_Enable bit of DVSEC control for a try. > + */ > + rc = cxl_set_mem_enable(cxlds, 0); > + if (rc) > + return rc; > + > + info->mem_enabled = 0; > + cxlhdm->decoder_count = cxlhdm->decoder_count_cap; > + } > + > + if (!info->mem_enabled) { > + rc = devm_cxl_enable_hdm(&port->dev, cxlhdm); > + if (rc) > + return rc; > + > + return devm_cxl_enable_mem(&port->dev, cxlds); > + } > + > if (!allowed) { > dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n"); > return -ENXIO; > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h > index 2a25d1957ddb..60b538f8b677 100644 > --- a/drivers/cxl/cxlmem.h > +++ b/drivers/cxl/cxlmem.h > @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd); > * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities > * @regs: mapped registers, see devm_cxl_setup_hdm() > * @decoder_count: number of decoders for this port > + * @decoder_count_cap: number of decoders from HDM Decoder Capability > * @target_count: for switch decoders, max downstream port targets > * @interleave_mask: interleave granularity capability, see check_interleave_cap() > * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap() > @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd); > struct cxl_hdm { > struct cxl_component_regs regs; > unsigned int decoder_count; > + unsigned int decoder_count_cap; > unsigned int target_count; > unsigned int interleave_mask; > unsigned long iw_cap_mask; ^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-07 8:31 ` Zhijian Li (Fujitsu) @ 2025-04-09 3:51 ` Ye, Huaisheng 2025-04-09 14:13 ` Gregory Price 1 sibling, 0 replies; 11+ messages in thread From: Ye, Huaisheng @ 2025-04-09 3:51 UTC (permalink / raw) To: Zhijian Li (Fujitsu) Cc: Jia, Pei P, linux-cxl@vger.kernel.org, Du, Fan, Jonathan.Cameron@huawei.com, Williams, Dan J, Jiang, Dave From: Zhijian Li (Fujitsu) <lizhijian@fujitsu.com> Sent: Monday, April 7, 2025 4:31 PM > > > On 06/04/2025 19:27, Huaisheng Ye wrote: > > In some scenarios, the probe of endpoint ports would fail because of range > > register (RR) decodes outside platform defined CXL ranges. > > > > [kernel debug message] > > cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by > > platform > > cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL > > ranges. > > cxl_bus_probe:2073: cxl_port endpoint3: probe: -6 > > call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port > > rejects match -6 > > > > This defect could be found with Qemu CXL branch for a long while with a > > specified probability, even with the latest branch cxl-2025-03-20. > > Yeah, IIRC, the memdev cannot be enabled again after the guest reboot. > Previously, I have to apply this patch[1] to my local QEMU to workaround it. > > > > > > > The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of > > DVSEC control has been set but in cxl_hdm_decode_init > > CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is > > not covered by root decoder's hpa_range. > > > > When encountering similar problems of the firmware, the patch could be > > effective for solving. > > Instead of Return error when RR decodes outside platform cxl ranges, > > driver disable Mem_Enable bit of DVSEC control then take it as the way of > > no setting info->mem_enabled. > > > > Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com> > > Thanks for the fix. It works for me(after revert [1]) > > Tested-by: Li Zhijian <lizhijian@fujitsu.com> > Thanks for the test. Best Regards, Huaisheng Ye ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-07 8:31 ` Zhijian Li (Fujitsu) 2025-04-09 3:51 ` Ye, Huaisheng @ 2025-04-09 14:13 ` Gregory Price 2025-04-09 15:13 ` Dave Jiang 1 sibling, 1 reply; 11+ messages in thread From: Gregory Price @ 2025-04-09 14:13 UTC (permalink / raw) To: Zhijian Li (Fujitsu) Cc: Huaisheng Ye, Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, dave.jiang@intel.com, pei.p.jia@intel.com, linux-cxl@vger.kernel.org On Mon, Apr 07, 2025 at 08:31:13AM +0000, Zhijian Li (Fujitsu) wrote: > [1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/ After looking at this, I see why this hasn't been fixed in QEMU. Basically QEMU doesn't implement the right reset mechanism. ct3_reset calls cxl_component_register_init_common() ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0) But it never resets MEM_ENABLE in the dvsecs. I'm not sure it's sane for Linux to be trying to handle hardware that doesn't itself reset correctly - and doing this fix just for QEMU seems a bit too far. The correct fix here is building an accessor for the existing CXL dvsecs and updating it during ct3_reset. ~Gregory ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-09 14:13 ` Gregory Price @ 2025-04-09 15:13 ` Dave Jiang 2025-04-15 16:21 ` Jonathan Cameron 0 siblings, 1 reply; 11+ messages in thread From: Dave Jiang @ 2025-04-09 15:13 UTC (permalink / raw) To: Gregory Price, Zhijian Li (Fujitsu) Cc: Huaisheng Ye, Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, pei.p.jia@intel.com, linux-cxl@vger.kernel.org On 4/9/25 7:13 AM, Gregory Price wrote: > On Mon, Apr 07, 2025 at 08:31:13AM +0000, Zhijian Li (Fujitsu) wrote: >> [1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/ > > After looking at this, I see why this hasn't been fixed in QEMU. > Basically QEMU doesn't implement the right reset mechanism. > > ct3_reset calls > cxl_component_register_init_common() > ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, > HDM_DECODER_ENABLE, 0) > > But it never resets MEM_ENABLE in the dvsecs. > > I'm not sure it's sane for Linux to be trying to handle hardware that > doesn't itself reset correctly - and doing this fix just for QEMU seems > a bit too far. I agree. No need to twist Linux in a bunch for something broken on QEMU. Until there's actual hardware that does this deployed in the field, we should just leave the Linux driver as is. > > The correct fix here is building an accessor for the existing CXL dvsecs > and updating it during ct3_reset. > > ~Gregory ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-09 15:13 ` Dave Jiang @ 2025-04-15 16:21 ` Jonathan Cameron 0 siblings, 0 replies; 11+ messages in thread From: Jonathan Cameron @ 2025-04-15 16:21 UTC (permalink / raw) To: Dave Jiang Cc: Gregory Price, Zhijian Li (Fujitsu), Huaisheng Ye, dan.j.williams@intel.com, pei.p.jia@intel.com, linux-cxl@vger.kernel.org On Wed, 9 Apr 2025 08:13:48 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > On 4/9/25 7:13 AM, Gregory Price wrote: > > On Mon, Apr 07, 2025 at 08:31:13AM +0000, Zhijian Li (Fujitsu) wrote: > >> [1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/ > > > > After looking at this, I see why this hasn't been fixed in QEMU. > > Basically QEMU doesn't implement the right reset mechanism. > > > > ct3_reset calls > > cxl_component_register_init_common() > > ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, > > HDM_DECODER_ENABLE, 0) > > > > But it never resets MEM_ENABLE in the dvsecs. > > > > I'm not sure it's sane for Linux to be trying to handle hardware that > > doesn't itself reset correctly - and doing this fix just for QEMU seems > > a bit too far. > > I agree. No need to twist Linux in a bunch for something broken on QEMU. Until there's actual hardware that does this deployed in the field, we should just leave the Linux driver as is. It's not yet fixed in QEMU because the proposed fix stops the CDAT DOE working. https://lore.kernel.org/linux-cxl/20240802174010.000025e7@Huawei.com/ We don't correctly handle reset today. I'm thoroughly in favour of fixing qemu, but we need to do it carefully and not break other. > > > > > The correct fix here is building an accessor for the existing CXL dvsecs > > and updating it during ct3_reset. Yes. Will be something along those lines but needs thorough testing. Jonathan > > > > ~Gregory > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-06 11:27 [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges Huaisheng Ye 2025-04-07 8:31 ` Zhijian Li (Fujitsu) @ 2025-04-08 3:22 ` Gregory Price 2025-04-09 3:48 ` Ye, Huaisheng 1 sibling, 1 reply; 11+ messages in thread From: Gregory Price @ 2025-04-08 3:22 UTC (permalink / raw) To: Huaisheng Ye Cc: Jonathan.Cameron, dan.j.williams, dave.jiang, pei.p.jia, linux-cxl On Sun, Apr 06, 2025 at 07:27:52PM +0800, Huaisheng Ye wrote: > In some scenarios, the probe of endpoint ports would fail because of range > register (RR) decodes outside platform defined CXL ranges. > > [kernel debug message] > cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by > platform > cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL > ranges. > cxl_bus_probe:2073: cxl_port endpoint3: probe: -6 > call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port > rejects match -6 > > This defect could be found with Qemu CXL branch for a long while with a > specified probability, even with the latest branch cxl-2025-03-20. > > The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of > DVSEC control has been set but in cxl_hdm_decode_init > CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is > not covered by root decoder's hpa_range. > The explanation here is a bit confusing. Please clarify if my understanding of the issue is incorrect. Observed problem: Some firmware/BIOS sets MEM_ENABLED, does not set HDM_DECODER_ENABLED, and does not program the Range registers. This is possibly the result of defaulting MEM_ENABLED to 1 mistakenly, rather than a programming error / failure. Suggested solution: Linux should detect this and reset the MEM_ENABLED bit and simply attempt to enable the hdm decoders accordingly. Question: Is this only observed with QEMU? If so, can we just fix QEMU? > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index 013b869b66cb..5452bb285140 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, > allowed++; > } > > + if (info->mem_enabled && !allowed) { > + dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n"); > + > + /* > + * Instead of Return error when RR decodes outside platform ranges, reenable > + * Mem_Enable bit of DVSEC control for a try. > + */ Your comment says to "reenable mem_enable bit", but you clear it. I think you mean to say "reset mem_enable bit, and try to enable hdm". > + rc = cxl_set_mem_enable(cxlds, 0); > + if (rc) > + return rc; > + > + info->mem_enabled = 0; > + cxlhdm->decoder_count = cxlhdm->decoder_count_cap; > + } > + > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h > index 2a25d1957ddb..60b538f8b677 100644 > --- a/drivers/cxl/cxlmem.h > +++ b/drivers/cxl/cxlmem.h > @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd); > * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities > * @regs: mapped registers, see devm_cxl_setup_hdm() > * @decoder_count: number of decoders for this port > + * @decoder_count_cap: number of decoders from HDM Decoder Capability > * @target_count: for switch decoders, max downstream port targets > * @interleave_mask: interleave granularity capability, see check_interleave_cap() > * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap() > @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd); > struct cxl_hdm { > struct cxl_component_regs regs; > unsigned int decoder_count; > + unsigned int decoder_count_cap; Why is this needed, as opposed to simply re-reading the count? ~Gregory ^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-08 3:22 ` Gregory Price @ 2025-04-09 3:48 ` Ye, Huaisheng 2025-04-09 14:01 ` Gregory Price 0 siblings, 1 reply; 11+ messages in thread From: Ye, Huaisheng @ 2025-04-09 3:48 UTC (permalink / raw) To: Gregory Price Cc: Jonathan.Cameron@huawei.com, Williams, Dan J, Jiang, Dave, Jia, Pei P, Du, Fan, linux-cxl@vger.kernel.org From: Gregory Price <gourry@gourry.net> Sent: Tuesday, April 8, 2025 11:23 AM > > On Sun, Apr 06, 2025 at 07:27:52PM +0800, Huaisheng Ye wrote: > > In some scenarios, the probe of endpoint ports would fail because of range > > register (RR) decodes outside platform defined CXL ranges. > > > > [kernel debug message] > > cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by > > platform > > cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL > > ranges. > > cxl_bus_probe:2073: cxl_port endpoint3: probe: -6 > > call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port > > rejects match -6 > > > > This defect could be found with Qemu CXL branch for a long while with a > > specified probability, even with the latest branch cxl-2025-03-20. > > > > The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of > > DVSEC control has been set but in cxl_hdm_decode_init > > CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is > > not covered by root decoder's hpa_range. > > > > The explanation here is a bit confusing. Please clarify if my > understanding of the issue is incorrect. > > Observed problem: > Some firmware/BIOS sets MEM_ENABLED, does not set HDM_DECODER_ENABLED, > and does not program the Range registers. This is possibly the > result of defaulting MEM_ENABLED to 1 mistakenly, rather than a > programming error / failure. The phenomenon I observed here is that the content of RR is incorrect, as it is not within the hpa_range of the root decoder. Here are the debug messages of dmesg from site scene. FYI. [ 7.683508] cxl_dvsec_rr_decode: cap = 0x1e [ 7.683535] cxl_dvsec_rr_decode: ctrl = 0x6 <-- which means CXL_DVSEC_MEM_ENABLE has been set by firmware (Qemu) [ 7.683557] cxl_dvsec_rr_decode: range0 base = 0, size = 2147483648 <- endpoint's dvsec_range [ 7.761781] dvsec_range_allowed: cxld->hpa_range.start = 11005853696 <- root decoder [ 7.761782] dvsec_range_allowed: cxld->hpa_range.end = 45365592063 > > Suggested solution: > Linux should detect this and reset the MEM_ENABLED bit and simply > attempt to enable the hdm decoders accordingly. > Exactly, after resetting MEM_ENABLED CXL driver runs as if MEM_ENABLED hasn't been set. > > Question: Is this only observed with QEMU? If so, can we just fix QEMU? Yes, it only happens with QEMU. Physical CXL card with Xeon doesn't have this issue. The firmware set CXL_DVSEC_MEM_ENABLE but also CXL_HDM_DECODER_ENABLE has been set, so it works well. I agree Qemu should fix this issue, it occurs probabilistically. And I think this patch would help CXL driver to dealing with various firmware situations. > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > > index 013b869b66cb..5452bb285140 100644 > > --- a/drivers/cxl/core/pci.c > > +++ b/drivers/cxl/core/pci.c > > @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm, > > allowed++; > > } > > > > + if (info->mem_enabled && !allowed) { > > + dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n"); > > + > > + /* > > + * Instead of Return error when RR decodes outside platform ranges, reenable > > + * Mem_Enable bit of DVSEC control for a try. > > + */ > > Your comment says to "reenable mem_enable bit", but you clear it. > I think you mean to say "reset mem_enable bit, and try to enable hdm". Yes, thanks for correcting. > > + rc = cxl_set_mem_enable(cxlds, 0); > > + if (rc) > > + return rc; > > + > > + info->mem_enabled = 0; > > + cxlhdm->decoder_count = cxlhdm->decoder_count_cap; > > + } > > + > > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h > > index 2a25d1957ddb..60b538f8b677 100644 > > --- a/drivers/cxl/cxlmem.h > > +++ b/drivers/cxl/cxlmem.h > > @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd); > > * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities > > * @regs: mapped registers, see devm_cxl_setup_hdm() > > * @decoder_count: number of decoders for this port > > + * @decoder_count_cap: number of decoders from HDM Decoder Capability > > * @target_count: for switch decoders, max downstream port targets > > * @interleave_mask: interleave granularity capability, see check_interleave_cap() > > * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap() > > @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd); > > struct cxl_hdm { > > struct cxl_component_regs regs; > > unsigned int decoder_count; > > + unsigned int decoder_count_cap; > > Why is this needed, as opposed to simply re-reading the count? > They can all do it. It's easier this way I think. Many thanks for the comments. Best Regards, Huaisheng Ye ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-09 3:48 ` Ye, Huaisheng @ 2025-04-09 14:01 ` Gregory Price 2025-04-10 7:12 ` Ye, Huaisheng 2025-04-15 16:30 ` Jonathan Cameron 0 siblings, 2 replies; 11+ messages in thread From: Gregory Price @ 2025-04-09 14:01 UTC (permalink / raw) To: Ye, Huaisheng Cc: Jonathan.Cameron@huawei.com, Williams, Dan J, Jiang, Dave, Jia, Pei P, Du, Fan, linux-cxl@vger.kernel.org On Wed, Apr 09, 2025 at 03:48:42AM +0000, Ye, Huaisheng wrote: > From: Gregory Price <gourry@gourry.net> > The phenomenon I observed here is that the content of RR is incorrect, as it is not within the hpa_range of the root decoder. > Here are the debug messages of dmesg from site scene. FYI. > > [ 7.683508] cxl_dvsec_rr_decode: cap = 0x1e > [ 7.683535] cxl_dvsec_rr_decode: ctrl = 0x6 <-- which means CXL_DVSEC_MEM_ENABLE has been set by firmware (Qemu) > [ 7.683557] cxl_dvsec_rr_decode: range0 base = 0, size = 2147483648 <- endpoint's dvsec_range > > [ 7.761781] dvsec_range_allowed: cxld->hpa_range.start = 11005853696 <- root decoder > [ 7.761782] dvsec_range_allowed: cxld->hpa_range.end = 45365592063 Are you interleaving like 16 2GB devices? Because the RR here has 2GB of capacity, while the root describes 32GB of capacity. Just trying to make sense of the configuration here. If you can share your qemu config, tha would be helpful. > I agree Qemu should fix this issue, it occurs probabilistically. Hm, Jonathan is it unreasonable to go off an clear the ctrl bits in ct3_reset? I'm not sure what the warm-reset procedure is, but it does seem like we shouldn't cache decoder / ctrl bit programmings across reset. Not sure if just re-calling build_dvsecs is sufficient. > And I think this patch would help CXL driver to dealing with various firmware situations. The question is whether this behavior is even feasible on real hardware. The issue seems to basically be that QEMU doesn't implement the right reset mechanism, because other ~Gregory ^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-09 14:01 ` Gregory Price @ 2025-04-10 7:12 ` Ye, Huaisheng 2025-04-15 16:30 ` Jonathan Cameron 1 sibling, 0 replies; 11+ messages in thread From: Ye, Huaisheng @ 2025-04-10 7:12 UTC (permalink / raw) To: Gregory Price Cc: Jonathan.Cameron@huawei.com, Williams, Dan J, Jiang, Dave, Jia, Pei P, Du, Fan, linux-cxl@vger.kernel.org From: Gregory Price <gourry@gourry.net> Sent: Wednesday, April 9, 2025 10:02 PM > > On Wed, Apr 09, 2025 at 03:48:42AM +0000, Ye, Huaisheng wrote: > > From: Gregory Price <gourry@gourry.net> > > The phenomenon I observed here is that the content of RR is incorrect, as it is not within the > hpa_range of the root decoder. > > Here are the debug messages of dmesg from site scene. FYI. > > > > [ 7.683508] cxl_dvsec_rr_decode: cap = 0x1e > > [ 7.683535] cxl_dvsec_rr_decode: ctrl = 0x6 <-- which means CXL_DVSEC_MEM_ENABLE has > been set by firmware (Qemu) > > [ 7.683557] cxl_dvsec_rr_decode: range0 base = 0, size = 2147483648 <- endpoint's dvsec_range > > > > [ 7.761781] dvsec_range_allowed: cxld->hpa_range.start = 11005853696 <- root decoder > > [ 7.761782] dvsec_range_allowed: cxld->hpa_range.end = 45365592063 > > Are you interleaving like 16 2GB devices? Because the RR here has 2GB > of capacity, while the root describes 32GB of capacity. Just trying to > make sense of the configuration here. If you can share your qemu > config, tha would be helpful. $ qemu-system-x86_64 -machine q35,cxl=on,accel=kvm \ -smp cpus=32 \ -m 8G \ -hda /home/work/vm-images/centos-stream8-02.qcow2 \ -object memory-backend-ram,size=4G,id=m0 \ -object memory-backend-ram,size=4G,id=m1 \ -object memory-backend-ram,size=2G,id=cxl-mem0 \ -object memory-backend-ram,size=2G,id=cxl-mem1 \ -object memory-backend-ram,size=2G,id=cxl-mem2 \ -object memory-backend-ram,size=2G,id=cxl-mem3 \ -numa node,memdev=m0,cpus=0-15,nodeid=0 \ -numa node,memdev=m1,cpus=16-31,nodeid=1 \ -netdev user,id=net0,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=net0 \ -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \ -device cxl-upstream,bus=root_port0,id=us0 \ -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-vmem0 \ -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-vmem1 \ -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ -device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-vmem2 \ -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ -device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-vmem3 \ -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=32G & Usually, I prefer to use 4 * 2GB cxl memory devices with x4 switch for debugging. This defect happens randomly but exists for a long while. Once it occurs, only a reboot can solve it. It's annoying, this is the motivation for sending patch to fix. Best Regards, Huaisheng Ye ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges 2025-04-09 14:01 ` Gregory Price 2025-04-10 7:12 ` Ye, Huaisheng @ 2025-04-15 16:30 ` Jonathan Cameron 1 sibling, 0 replies; 11+ messages in thread From: Jonathan Cameron @ 2025-04-15 16:30 UTC (permalink / raw) To: Gregory Price Cc: Ye, Huaisheng, Williams, Dan J, Jiang, Dave, Jia, Pei P, Du, Fan, linux-cxl@vger.kernel.org On Wed, 9 Apr 2025 09:01:45 -0500 Gregory Price <gourry@gourry.net> wrote: > On Wed, Apr 09, 2025 at 03:48:42AM +0000, Ye, Huaisheng wrote: > > From: Gregory Price <gourry@gourry.net> > > The phenomenon I observed here is that the content of RR is incorrect, as it is not within the hpa_range of the root decoder. > > Here are the debug messages of dmesg from site scene. FYI. > > > > [ 7.683508] cxl_dvsec_rr_decode: cap = 0x1e > > [ 7.683535] cxl_dvsec_rr_decode: ctrl = 0x6 <-- which means CXL_DVSEC_MEM_ENABLE has been set by firmware (Qemu) > > [ 7.683557] cxl_dvsec_rr_decode: range0 base = 0, size = 2147483648 <- endpoint's dvsec_range > > > > [ 7.761781] dvsec_range_allowed: cxld->hpa_range.start = 11005853696 <- root decoder > > [ 7.761782] dvsec_range_allowed: cxld->hpa_range.end = 45365592063 > > Are you interleaving like 16 2GB devices? Because the RR here has 2GB > of capacity, while the root describes 32GB of capacity. Just trying to > make sense of the configuration here. If you can share your qemu > config, tha would be helpful. > > > I agree Qemu should fix this issue, it occurs probabilistically. > > Hm, Jonathan is it unreasonable to go off an clear the ctrl bits in > ct3_reset? I'm not sure what the warm-reset procedure is, but it does > seem like we shouldn't cache decoder / ctrl bit programmings across > reset. Not sure if just re-calling build_dvsecs is sufficient. We need to handle the difference between a warm-reset and a cold one correctly I think + CXL reset ideally. This particular thing was never a priority for me but I'm more than happy to review patches from others. If not I'll get to it eventually, but I suspect there is a bunch more to do than just this and the build_dvsecs() in ct3d_reset was the thing that broken the DOE last time. Testing wise it needs someone to check the entire device configuration and ensure the correct things have been reset. Oh and on top of that move to the newer reset handling in qemu probably docs/devel/reset.rst Jonathan > > > And I think this patch would help CXL driver to dealing with various firmware situations. > > The question is whether this behavior is even feasible on real hardware. > The issue seems to basically be that QEMU doesn't implement the right > reset mechanism, because other > > ~Gregory ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-04-15 16:30 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-04-06 11:27 [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges Huaisheng Ye 2025-04-07 8:31 ` Zhijian Li (Fujitsu) 2025-04-09 3:51 ` Ye, Huaisheng 2025-04-09 14:13 ` Gregory Price 2025-04-09 15:13 ` Dave Jiang 2025-04-15 16:21 ` Jonathan Cameron 2025-04-08 3:22 ` Gregory Price 2025-04-09 3:48 ` Ye, Huaisheng 2025-04-09 14:01 ` Gregory Price 2025-04-10 7:12 ` Ye, Huaisheng 2025-04-15 16:30 ` Jonathan Cameron
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox