* [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
@ 2025-04-06 11:27 Huaisheng Ye
2025-04-07 8:31 ` Zhijian Li (Fujitsu)
2025-04-08 3:22 ` Gregory Price
0 siblings, 2 replies; 11+ messages in thread
From: Huaisheng Ye @ 2025-04-06 11:27 UTC (permalink / raw)
To: Jonathan.Cameron, dan.j.williams, dave.jiang
Cc: pei.p.jia, linux-cxl, Huaisheng Ye
In some scenarios, the probe of endpoint ports would fail because of range
register (RR) decodes outside platform defined CXL ranges.
[kernel debug message]
cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by
platform
cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL
ranges.
cxl_bus_probe:2073: cxl_port endpoint3: probe: -6
call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port
rejects match -6
This defect could be found with Qemu CXL branch for a long while with a
specified probability, even with the latest branch cxl-2025-03-20.
The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of
DVSEC control has been set but in cxl_hdm_decode_init
CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is
not covered by root decoder's hpa_range.
When encountering similar problems of the firmware, the patch could be
effective for solving.
Instead of Return error when RR decodes outside platform cxl ranges,
driver disable Mem_Enable bit of DVSEC control then take it as the way of
no setting info->mem_enabled.
Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com>
---
drivers/cxl/core/hdm.c | 2 +-
drivers/cxl/core/pci.c | 35 +++++++++++++++++++++++++----------
drivers/cxl/cxlmem.h | 2 ++
3 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 50e6a45b30ba..b776fb848f42 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -80,7 +80,7 @@ static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
u32 hdm_cap;
hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
- cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
+ cxlhdm->decoder_count_cap = cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
cxlhdm->target_count =
FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap))
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 013b869b66cb..5452bb285140 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -403,7 +403,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
struct cxl_port *port = cxlhdm->port;
struct device *dev = cxlds->dev;
struct cxl_port *root;
- int i, rc, allowed;
+ int i, rc, allowed = 0;
u32 global_ctrl = 0;
if (hdm)
@@ -426,15 +426,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
return -ENODEV;
}
- if (!info->mem_enabled) {
- rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
- if (rc)
- return rc;
-
- return devm_cxl_enable_mem(&port->dev, cxlds);
- }
-
- for (i = 0, allowed = 0; i < info->ranges; i++) {
+ for (i = 0; i < info->ranges; i++) {
struct device *cxld_dev;
cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
@@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
allowed++;
}
+ if (info->mem_enabled && !allowed) {
+ dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n");
+
+ /*
+ * Instead of Return error when RR decodes outside platform ranges, reenable
+ * Mem_Enable bit of DVSEC control for a try.
+ */
+ rc = cxl_set_mem_enable(cxlds, 0);
+ if (rc)
+ return rc;
+
+ info->mem_enabled = 0;
+ cxlhdm->decoder_count = cxlhdm->decoder_count_cap;
+ }
+
+ if (!info->mem_enabled) {
+ rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
+ if (rc)
+ return rc;
+
+ return devm_cxl_enable_mem(&port->dev, cxlds);
+ }
+
if (!allowed) {
dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
return -ENXIO;
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 2a25d1957ddb..60b538f8b677 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
* struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities
* @regs: mapped registers, see devm_cxl_setup_hdm()
* @decoder_count: number of decoders for this port
+ * @decoder_count_cap: number of decoders from HDM Decoder Capability
* @target_count: for switch decoders, max downstream port targets
* @interleave_mask: interleave granularity capability, see check_interleave_cap()
* @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap()
@@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
struct cxl_hdm {
struct cxl_component_regs regs;
unsigned int decoder_count;
+ unsigned int decoder_count_cap;
unsigned int target_count;
unsigned int interleave_mask;
unsigned long iw_cap_mask;
--
2.39.3
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-06 11:27 [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges Huaisheng Ye
@ 2025-04-07 8:31 ` Zhijian Li (Fujitsu)
2025-04-09 3:51 ` Ye, Huaisheng
2025-04-09 14:13 ` Gregory Price
2025-04-08 3:22 ` Gregory Price
1 sibling, 2 replies; 11+ messages in thread
From: Zhijian Li (Fujitsu) @ 2025-04-07 8:31 UTC (permalink / raw)
To: Huaisheng Ye, Jonathan.Cameron@huawei.com,
dan.j.williams@intel.com, dave.jiang@intel.com
Cc: pei.p.jia@intel.com, linux-cxl@vger.kernel.org
On 06/04/2025 19:27, Huaisheng Ye wrote:
> In some scenarios, the probe of endpoint ports would fail because of range
> register (RR) decodes outside platform defined CXL ranges.
>
> [kernel debug message]
> cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by
> platform
> cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL
> ranges.
> cxl_bus_probe:2073: cxl_port endpoint3: probe: -6
> call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port
> rejects match -6
>
> This defect could be found with Qemu CXL branch for a long while with a
> specified probability, even with the latest branch cxl-2025-03-20.
Yeah, IIRC, the memdev cannot be enabled again after the guest reboot.
Previously, I have to apply this patch[1] to my local QEMU to workaround it.
>
> The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of
> DVSEC control has been set but in cxl_hdm_decode_init
> CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is
> not covered by root decoder's hpa_range.
>
> When encountering similar problems of the firmware, the patch could be
> effective for solving.
> Instead of Return error when RR decodes outside platform cxl ranges,
> driver disable Mem_Enable bit of DVSEC control then take it as the way of
> no setting info->mem_enabled.
>
> Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com>
Thanks for the fix. It works for me(after revert [1])
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
[1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/
> ---
> drivers/cxl/core/hdm.c | 2 +-
> drivers/cxl/core/pci.c | 35 +++++++++++++++++++++++++----------
> drivers/cxl/cxlmem.h | 2 ++
> 3 files changed, 28 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 50e6a45b30ba..b776fb848f42 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -80,7 +80,7 @@ static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
> u32 hdm_cap;
>
> hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> - cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
> + cxlhdm->decoder_count_cap = cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
> cxlhdm->target_count =
> FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
> if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap))
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 013b869b66cb..5452bb285140 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -403,7 +403,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> struct cxl_port *port = cxlhdm->port;
> struct device *dev = cxlds->dev;
> struct cxl_port *root;
> - int i, rc, allowed;
> + int i, rc, allowed = 0;
> u32 global_ctrl = 0;
>
> if (hdm)
> @@ -426,15 +426,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> return -ENODEV;
> }
>
> - if (!info->mem_enabled) {
> - rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> - if (rc)
> - return rc;
> -
> - return devm_cxl_enable_mem(&port->dev, cxlds);
> - }
> -
> - for (i = 0, allowed = 0; i < info->ranges; i++) {
> + for (i = 0; i < info->ranges; i++) {
> struct device *cxld_dev;
>
> cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
> @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> allowed++;
> }
>
> + if (info->mem_enabled && !allowed) {
> + dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n");
> +
> + /*
> + * Instead of Return error when RR decodes outside platform ranges, reenable
> + * Mem_Enable bit of DVSEC control for a try.
> + */
> + rc = cxl_set_mem_enable(cxlds, 0);
> + if (rc)
> + return rc;
> +
> + info->mem_enabled = 0;
> + cxlhdm->decoder_count = cxlhdm->decoder_count_cap;
> + }
> +
> + if (!info->mem_enabled) {
> + rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> + if (rc)
> + return rc;
> +
> + return devm_cxl_enable_mem(&port->dev, cxlds);
> + }
> +
> if (!allowed) {
> dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
> return -ENXIO;
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 2a25d1957ddb..60b538f8b677 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities
> * @regs: mapped registers, see devm_cxl_setup_hdm()
> * @decoder_count: number of decoders for this port
> + * @decoder_count_cap: number of decoders from HDM Decoder Capability
> * @target_count: for switch decoders, max downstream port targets
> * @interleave_mask: interleave granularity capability, see check_interleave_cap()
> * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap()
> @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> struct cxl_hdm {
> struct cxl_component_regs regs;
> unsigned int decoder_count;
> + unsigned int decoder_count_cap;
> unsigned int target_count;
> unsigned int interleave_mask;
> unsigned long iw_cap_mask;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-06 11:27 [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges Huaisheng Ye
2025-04-07 8:31 ` Zhijian Li (Fujitsu)
@ 2025-04-08 3:22 ` Gregory Price
2025-04-09 3:48 ` Ye, Huaisheng
1 sibling, 1 reply; 11+ messages in thread
From: Gregory Price @ 2025-04-08 3:22 UTC (permalink / raw)
To: Huaisheng Ye
Cc: Jonathan.Cameron, dan.j.williams, dave.jiang, pei.p.jia,
linux-cxl
On Sun, Apr 06, 2025 at 07:27:52PM +0800, Huaisheng Ye wrote:
> In some scenarios, the probe of endpoint ports would fail because of range
> register (RR) decodes outside platform defined CXL ranges.
>
> [kernel debug message]
> cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by
> platform
> cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL
> ranges.
> cxl_bus_probe:2073: cxl_port endpoint3: probe: -6
> call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port
> rejects match -6
>
> This defect could be found with Qemu CXL branch for a long while with a
> specified probability, even with the latest branch cxl-2025-03-20.
>
> The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of
> DVSEC control has been set but in cxl_hdm_decode_init
> CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is
> not covered by root decoder's hpa_range.
>
The explanation here is a bit confusing. Please clarify if my
understanding of the issue is incorrect.
Observed problem:
Some firmware/BIOS sets MEM_ENABLED, does not set HDM_DECODER_ENABLED,
and does not program the Range registers. This is possibly the
result of defaulting MEM_ENABLED to 1 mistakenly, rather than a
programming error / failure.
Suggested solution:
Linux should detect this and reset the MEM_ENABLED bit and simply
attempt to enable the hdm decoders accordingly.
Question: Is this only observed with QEMU? If so, can we just fix QEMU?
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 013b869b66cb..5452bb285140 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> allowed++;
> }
>
> + if (info->mem_enabled && !allowed) {
> + dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n");
> +
> + /*
> + * Instead of Return error when RR decodes outside platform ranges, reenable
> + * Mem_Enable bit of DVSEC control for a try.
> + */
Your comment says to "reenable mem_enable bit", but you clear it.
I think you mean to say "reset mem_enable bit, and try to enable hdm".
> + rc = cxl_set_mem_enable(cxlds, 0);
> + if (rc)
> + return rc;
> +
> + info->mem_enabled = 0;
> + cxlhdm->decoder_count = cxlhdm->decoder_count_cap;
> + }
> +
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 2a25d1957ddb..60b538f8b677 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities
> * @regs: mapped registers, see devm_cxl_setup_hdm()
> * @decoder_count: number of decoders for this port
> + * @decoder_count_cap: number of decoders from HDM Decoder Capability
> * @target_count: for switch decoders, max downstream port targets
> * @interleave_mask: interleave granularity capability, see check_interleave_cap()
> * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap()
> @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> struct cxl_hdm {
> struct cxl_component_regs regs;
> unsigned int decoder_count;
> + unsigned int decoder_count_cap;
Why is this needed, as opposed to simply re-reading the count?
~Gregory
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-08 3:22 ` Gregory Price
@ 2025-04-09 3:48 ` Ye, Huaisheng
2025-04-09 14:01 ` Gregory Price
0 siblings, 1 reply; 11+ messages in thread
From: Ye, Huaisheng @ 2025-04-09 3:48 UTC (permalink / raw)
To: Gregory Price
Cc: Jonathan.Cameron@huawei.com, Williams, Dan J, Jiang, Dave,
Jia, Pei P, Du, Fan, linux-cxl@vger.kernel.org
From: Gregory Price <gourry@gourry.net>
Sent: Tuesday, April 8, 2025 11:23 AM
>
> On Sun, Apr 06, 2025 at 07:27:52PM +0800, Huaisheng Ye wrote:
> > In some scenarios, the probe of endpoint ports would fail because of range
> > register (RR) decodes outside platform defined CXL ranges.
> >
> > [kernel debug message]
> > cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by
> > platform
> > cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL
> > ranges.
> > cxl_bus_probe:2073: cxl_port endpoint3: probe: -6
> > call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port
> > rejects match -6
> >
> > This defect could be found with Qemu CXL branch for a long while with a
> > specified probability, even with the latest branch cxl-2025-03-20.
> >
> > The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of
> > DVSEC control has been set but in cxl_hdm_decode_init
> > CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is
> > not covered by root decoder's hpa_range.
> >
>
> The explanation here is a bit confusing. Please clarify if my
> understanding of the issue is incorrect.
>
> Observed problem:
> Some firmware/BIOS sets MEM_ENABLED, does not set HDM_DECODER_ENABLED,
> and does not program the Range registers. This is possibly the
> result of defaulting MEM_ENABLED to 1 mistakenly, rather than a
> programming error / failure.
The phenomenon I observed here is that the content of RR is incorrect, as it is not within the hpa_range of the root decoder.
Here are the debug messages of dmesg from site scene. FYI.
[ 7.683508] cxl_dvsec_rr_decode: cap = 0x1e
[ 7.683535] cxl_dvsec_rr_decode: ctrl = 0x6 <-- which means CXL_DVSEC_MEM_ENABLE has been set by firmware (Qemu)
[ 7.683557] cxl_dvsec_rr_decode: range0 base = 0, size = 2147483648 <- endpoint's dvsec_range
[ 7.761781] dvsec_range_allowed: cxld->hpa_range.start = 11005853696 <- root decoder
[ 7.761782] dvsec_range_allowed: cxld->hpa_range.end = 45365592063
>
> Suggested solution:
> Linux should detect this and reset the MEM_ENABLED bit and simply
> attempt to enable the hdm decoders accordingly.
>
Exactly, after resetting MEM_ENABLED CXL driver runs as if MEM_ENABLED hasn't been set.
>
> Question: Is this only observed with QEMU? If so, can we just fix QEMU?
Yes, it only happens with QEMU.
Physical CXL card with Xeon doesn't have this issue. The firmware set CXL_DVSEC_MEM_ENABLE but also CXL_HDM_DECODER_ENABLE has been set, so it works well.
I agree Qemu should fix this issue, it occurs probabilistically.
And I think this patch would help CXL driver to dealing with various firmware situations.
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index 013b869b66cb..5452bb285140 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> > allowed++;
> > }
> >
> > + if (info->mem_enabled && !allowed) {
> > + dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n");
> > +
> > + /*
> > + * Instead of Return error when RR decodes outside platform ranges, reenable
> > + * Mem_Enable bit of DVSEC control for a try.
> > + */
>
> Your comment says to "reenable mem_enable bit", but you clear it.
> I think you mean to say "reset mem_enable bit, and try to enable hdm".
Yes, thanks for correcting.
> > + rc = cxl_set_mem_enable(cxlds, 0);
> > + if (rc)
> > + return rc;
> > +
> > + info->mem_enabled = 0;
> > + cxlhdm->decoder_count = cxlhdm->decoder_count_cap;
> > + }
> > +
>
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 2a25d1957ddb..60b538f8b677 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> > * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities
> > * @regs: mapped registers, see devm_cxl_setup_hdm()
> > * @decoder_count: number of decoders for this port
> > + * @decoder_count_cap: number of decoders from HDM Decoder Capability
> > * @target_count: for switch decoders, max downstream port targets
> > * @interleave_mask: interleave granularity capability, see check_interleave_cap()
> > * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap()
> > @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> > struct cxl_hdm {
> > struct cxl_component_regs regs;
> > unsigned int decoder_count;
> > + unsigned int decoder_count_cap;
>
> Why is this needed, as opposed to simply re-reading the count?
>
They can all do it. It's easier this way I think.
Many thanks for the comments.
Best Regards,
Huaisheng Ye
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-07 8:31 ` Zhijian Li (Fujitsu)
@ 2025-04-09 3:51 ` Ye, Huaisheng
2025-04-09 14:13 ` Gregory Price
1 sibling, 0 replies; 11+ messages in thread
From: Ye, Huaisheng @ 2025-04-09 3:51 UTC (permalink / raw)
To: Zhijian Li (Fujitsu)
Cc: Jia, Pei P, linux-cxl@vger.kernel.org, Du, Fan,
Jonathan.Cameron@huawei.com, Williams, Dan J, Jiang, Dave
From: Zhijian Li (Fujitsu) <lizhijian@fujitsu.com>
Sent: Monday, April 7, 2025 4:31 PM
>
>
> On 06/04/2025 19:27, Huaisheng Ye wrote:
> > In some scenarios, the probe of endpoint ports would fail because of range
> > register (RR) decodes outside platform defined CXL ranges.
> >
> > [kernel debug message]
> > cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by
> > platform
> > cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL
> > ranges.
> > cxl_bus_probe:2073: cxl_port endpoint3: probe: -6
> > call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port
> > rejects match -6
> >
> > This defect could be found with Qemu CXL branch for a long while with a
> > specified probability, even with the latest branch cxl-2025-03-20.
>
> Yeah, IIRC, the memdev cannot be enabled again after the guest reboot.
> Previously, I have to apply this patch[1] to my local QEMU to workaround it.
>
>
>
> >
> > The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of
> > DVSEC control has been set but in cxl_hdm_decode_init
> > CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is
> > not covered by root decoder's hpa_range.
> >
> > When encountering similar problems of the firmware, the patch could be
> > effective for solving.
> > Instead of Return error when RR decodes outside platform cxl ranges,
> > driver disable Mem_Enable bit of DVSEC control then take it as the way of
> > no setting info->mem_enabled.
> >
> > Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com>
>
> Thanks for the fix. It works for me(after revert [1])
>
> Tested-by: Li Zhijian <lizhijian@fujitsu.com>
>
Thanks for the test.
Best Regards,
Huaisheng Ye
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-09 3:48 ` Ye, Huaisheng
@ 2025-04-09 14:01 ` Gregory Price
2025-04-10 7:12 ` Ye, Huaisheng
2025-04-15 16:30 ` Jonathan Cameron
0 siblings, 2 replies; 11+ messages in thread
From: Gregory Price @ 2025-04-09 14:01 UTC (permalink / raw)
To: Ye, Huaisheng
Cc: Jonathan.Cameron@huawei.com, Williams, Dan J, Jiang, Dave,
Jia, Pei P, Du, Fan, linux-cxl@vger.kernel.org
On Wed, Apr 09, 2025 at 03:48:42AM +0000, Ye, Huaisheng wrote:
> From: Gregory Price <gourry@gourry.net>
> The phenomenon I observed here is that the content of RR is incorrect, as it is not within the hpa_range of the root decoder.
> Here are the debug messages of dmesg from site scene. FYI.
>
> [ 7.683508] cxl_dvsec_rr_decode: cap = 0x1e
> [ 7.683535] cxl_dvsec_rr_decode: ctrl = 0x6 <-- which means CXL_DVSEC_MEM_ENABLE has been set by firmware (Qemu)
> [ 7.683557] cxl_dvsec_rr_decode: range0 base = 0, size = 2147483648 <- endpoint's dvsec_range
>
> [ 7.761781] dvsec_range_allowed: cxld->hpa_range.start = 11005853696 <- root decoder
> [ 7.761782] dvsec_range_allowed: cxld->hpa_range.end = 45365592063
Are you interleaving like 16 2GB devices? Because the RR here has 2GB
of capacity, while the root describes 32GB of capacity. Just trying to
make sense of the configuration here. If you can share your qemu
config, tha would be helpful.
> I agree Qemu should fix this issue, it occurs probabilistically.
Hm, Jonathan is it unreasonable to go off an clear the ctrl bits in
ct3_reset? I'm not sure what the warm-reset procedure is, but it does
seem like we shouldn't cache decoder / ctrl bit programmings across
reset. Not sure if just re-calling build_dvsecs is sufficient.
> And I think this patch would help CXL driver to dealing with various firmware situations.
The question is whether this behavior is even feasible on real hardware.
The issue seems to basically be that QEMU doesn't implement the right
reset mechanism, because other
~Gregory
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-07 8:31 ` Zhijian Li (Fujitsu)
2025-04-09 3:51 ` Ye, Huaisheng
@ 2025-04-09 14:13 ` Gregory Price
2025-04-09 15:13 ` Dave Jiang
1 sibling, 1 reply; 11+ messages in thread
From: Gregory Price @ 2025-04-09 14:13 UTC (permalink / raw)
To: Zhijian Li (Fujitsu)
Cc: Huaisheng Ye, Jonathan.Cameron@huawei.com,
dan.j.williams@intel.com, dave.jiang@intel.com,
pei.p.jia@intel.com, linux-cxl@vger.kernel.org
On Mon, Apr 07, 2025 at 08:31:13AM +0000, Zhijian Li (Fujitsu) wrote:
> [1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/
After looking at this, I see why this hasn't been fixed in QEMU.
Basically QEMU doesn't implement the right reset mechanism.
ct3_reset calls
cxl_component_register_init_common()
ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL,
HDM_DECODER_ENABLE, 0)
But it never resets MEM_ENABLE in the dvsecs.
I'm not sure it's sane for Linux to be trying to handle hardware that
doesn't itself reset correctly - and doing this fix just for QEMU seems
a bit too far.
The correct fix here is building an accessor for the existing CXL dvsecs
and updating it during ct3_reset.
~Gregory
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-09 14:13 ` Gregory Price
@ 2025-04-09 15:13 ` Dave Jiang
2025-04-15 16:21 ` Jonathan Cameron
0 siblings, 1 reply; 11+ messages in thread
From: Dave Jiang @ 2025-04-09 15:13 UTC (permalink / raw)
To: Gregory Price, Zhijian Li (Fujitsu)
Cc: Huaisheng Ye, Jonathan.Cameron@huawei.com,
dan.j.williams@intel.com, pei.p.jia@intel.com,
linux-cxl@vger.kernel.org
On 4/9/25 7:13 AM, Gregory Price wrote:
> On Mon, Apr 07, 2025 at 08:31:13AM +0000, Zhijian Li (Fujitsu) wrote:
>> [1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/
>
> After looking at this, I see why this hasn't been fixed in QEMU.
> Basically QEMU doesn't implement the right reset mechanism.
>
> ct3_reset calls
> cxl_component_register_init_common()
> ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL,
> HDM_DECODER_ENABLE, 0)
>
> But it never resets MEM_ENABLE in the dvsecs.
>
> I'm not sure it's sane for Linux to be trying to handle hardware that
> doesn't itself reset correctly - and doing this fix just for QEMU seems
> a bit too far.
I agree. No need to twist Linux in a bunch for something broken on QEMU. Until there's actual hardware that does this deployed in the field, we should just leave the Linux driver as is.
>
> The correct fix here is building an accessor for the existing CXL dvsecs
> and updating it during ct3_reset.
>
> ~Gregory
^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-09 14:01 ` Gregory Price
@ 2025-04-10 7:12 ` Ye, Huaisheng
2025-04-15 16:30 ` Jonathan Cameron
1 sibling, 0 replies; 11+ messages in thread
From: Ye, Huaisheng @ 2025-04-10 7:12 UTC (permalink / raw)
To: Gregory Price
Cc: Jonathan.Cameron@huawei.com, Williams, Dan J, Jiang, Dave,
Jia, Pei P, Du, Fan, linux-cxl@vger.kernel.org
From: Gregory Price <gourry@gourry.net>
Sent: Wednesday, April 9, 2025 10:02 PM
>
> On Wed, Apr 09, 2025 at 03:48:42AM +0000, Ye, Huaisheng wrote:
> > From: Gregory Price <gourry@gourry.net>
> > The phenomenon I observed here is that the content of RR is incorrect, as it is not within the
> hpa_range of the root decoder.
> > Here are the debug messages of dmesg from site scene. FYI.
> >
> > [ 7.683508] cxl_dvsec_rr_decode: cap = 0x1e
> > [ 7.683535] cxl_dvsec_rr_decode: ctrl = 0x6 <-- which means CXL_DVSEC_MEM_ENABLE has
> been set by firmware (Qemu)
> > [ 7.683557] cxl_dvsec_rr_decode: range0 base = 0, size = 2147483648 <- endpoint's dvsec_range
> >
> > [ 7.761781] dvsec_range_allowed: cxld->hpa_range.start = 11005853696 <- root decoder
> > [ 7.761782] dvsec_range_allowed: cxld->hpa_range.end = 45365592063
>
> Are you interleaving like 16 2GB devices? Because the RR here has 2GB
> of capacity, while the root describes 32GB of capacity. Just trying to
> make sense of the configuration here. If you can share your qemu
> config, tha would be helpful.
$ qemu-system-x86_64 -machine q35,cxl=on,accel=kvm \
-smp cpus=32 \
-m 8G \
-hda /home/work/vm-images/centos-stream8-02.qcow2 \
-object memory-backend-ram,size=4G,id=m0 \
-object memory-backend-ram,size=4G,id=m1 \
-object memory-backend-ram,size=2G,id=cxl-mem0 \
-object memory-backend-ram,size=2G,id=cxl-mem1 \
-object memory-backend-ram,size=2G,id=cxl-mem2 \
-object memory-backend-ram,size=2G,id=cxl-mem3 \
-numa node,memdev=m0,cpus=0-15,nodeid=0 \
-numa node,memdev=m1,cpus=16-31,nodeid=1 \
-netdev user,id=net0,hostfwd=tcp::2222-:22 \
-device virtio-net-pci,netdev=net0 \
-device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
-device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
-device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
-device cxl-upstream,bus=root_port0,id=us0 \
-device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
-device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-vmem0 \
-device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
-device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-vmem1 \
-device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \
-device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-vmem2 \
-device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \
-device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-vmem3 \
-M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=32G &
Usually, I prefer to use 4 * 2GB cxl memory devices with x4 switch for debugging.
This defect happens randomly but exists for a long while.
Once it occurs, only a reboot can solve it. It's annoying, this is the motivation for sending patch to fix.
Best Regards,
Huaisheng Ye
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-09 15:13 ` Dave Jiang
@ 2025-04-15 16:21 ` Jonathan Cameron
0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cameron @ 2025-04-15 16:21 UTC (permalink / raw)
To: Dave Jiang
Cc: Gregory Price, Zhijian Li (Fujitsu), Huaisheng Ye,
dan.j.williams@intel.com, pei.p.jia@intel.com,
linux-cxl@vger.kernel.org
On Wed, 9 Apr 2025 08:13:48 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> On 4/9/25 7:13 AM, Gregory Price wrote:
> > On Mon, Apr 07, 2025 at 08:31:13AM +0000, Zhijian Li (Fujitsu) wrote:
> >> [1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/
> >
> > After looking at this, I see why this hasn't been fixed in QEMU.
> > Basically QEMU doesn't implement the right reset mechanism.
> >
> > ct3_reset calls
> > cxl_component_register_init_common()
> > ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL,
> > HDM_DECODER_ENABLE, 0)
> >
> > But it never resets MEM_ENABLE in the dvsecs.
> >
> > I'm not sure it's sane for Linux to be trying to handle hardware that
> > doesn't itself reset correctly - and doing this fix just for QEMU seems
> > a bit too far.
>
> I agree. No need to twist Linux in a bunch for something broken on QEMU. Until there's actual hardware that does this deployed in the field, we should just leave the Linux driver as is.
It's not yet fixed in QEMU because the proposed fix stops the CDAT DOE working.
https://lore.kernel.org/linux-cxl/20240802174010.000025e7@Huawei.com/
We don't correctly handle reset today. I'm thoroughly in favour
of fixing qemu, but we need to do it carefully and not break other.
>
> >
> > The correct fix here is building an accessor for the existing CXL dvsecs
> > and updating it during ct3_reset.
Yes. Will be something along those lines but needs thorough testing.
Jonathan
> >
> > ~Gregory
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
2025-04-09 14:01 ` Gregory Price
2025-04-10 7:12 ` Ye, Huaisheng
@ 2025-04-15 16:30 ` Jonathan Cameron
1 sibling, 0 replies; 11+ messages in thread
From: Jonathan Cameron @ 2025-04-15 16:30 UTC (permalink / raw)
To: Gregory Price
Cc: Ye, Huaisheng, Williams, Dan J, Jiang, Dave, Jia, Pei P, Du, Fan,
linux-cxl@vger.kernel.org
On Wed, 9 Apr 2025 09:01:45 -0500
Gregory Price <gourry@gourry.net> wrote:
> On Wed, Apr 09, 2025 at 03:48:42AM +0000, Ye, Huaisheng wrote:
> > From: Gregory Price <gourry@gourry.net>
> > The phenomenon I observed here is that the content of RR is incorrect, as it is not within the hpa_range of the root decoder.
> > Here are the debug messages of dmesg from site scene. FYI.
> >
> > [ 7.683508] cxl_dvsec_rr_decode: cap = 0x1e
> > [ 7.683535] cxl_dvsec_rr_decode: ctrl = 0x6 <-- which means CXL_DVSEC_MEM_ENABLE has been set by firmware (Qemu)
> > [ 7.683557] cxl_dvsec_rr_decode: range0 base = 0, size = 2147483648 <- endpoint's dvsec_range
> >
> > [ 7.761781] dvsec_range_allowed: cxld->hpa_range.start = 11005853696 <- root decoder
> > [ 7.761782] dvsec_range_allowed: cxld->hpa_range.end = 45365592063
>
> Are you interleaving like 16 2GB devices? Because the RR here has 2GB
> of capacity, while the root describes 32GB of capacity. Just trying to
> make sense of the configuration here. If you can share your qemu
> config, tha would be helpful.
>
> > I agree Qemu should fix this issue, it occurs probabilistically.
>
> Hm, Jonathan is it unreasonable to go off an clear the ctrl bits in
> ct3_reset? I'm not sure what the warm-reset procedure is, but it does
> seem like we shouldn't cache decoder / ctrl bit programmings across
> reset. Not sure if just re-calling build_dvsecs is sufficient.
We need to handle the difference between a warm-reset and a cold one correctly
I think + CXL reset ideally. This particular thing was never a priority
for me but I'm more than happy to review patches from others.
If not I'll get to it eventually, but I suspect there is a bunch more to do than
just this and the build_dvsecs() in ct3d_reset was the thing that
broken the DOE last time.
Testing wise it needs someone to check the entire device configuration and
ensure the correct things have been reset.
Oh and on top of that move to the newer reset handling in qemu probably
docs/devel/reset.rst
Jonathan
>
> > And I think this patch would help CXL driver to dealing with various firmware situations.
>
> The question is whether this behavior is even feasible on real hardware.
> The issue seems to basically be that QEMU doesn't implement the right
> reset mechanism, because other
>
> ~Gregory
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-04-15 16:30 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-06 11:27 [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges Huaisheng Ye
2025-04-07 8:31 ` Zhijian Li (Fujitsu)
2025-04-09 3:51 ` Ye, Huaisheng
2025-04-09 14:13 ` Gregory Price
2025-04-09 15:13 ` Dave Jiang
2025-04-15 16:21 ` Jonathan Cameron
2025-04-08 3:22 ` Gregory Price
2025-04-09 3:48 ` Ye, Huaisheng
2025-04-09 14:01 ` Gregory Price
2025-04-10 7:12 ` Ye, Huaisheng
2025-04-15 16:30 ` Jonathan Cameron
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox