From: "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com>
To: Huaisheng Ye <huaisheng.ye@intel.com>,
"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
"dave.jiang@intel.com" <dave.jiang@intel.com>
Cc: "pei.p.jia@intel.com" <pei.p.jia@intel.com>,
"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Subject: Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
Date: Mon, 7 Apr 2025 08:31:13 +0000
Message-ID: <02fbb2a7-3973-4c7f-8b7d-cfabbb379467@fujitsu.com>
In-Reply-To: <20250406112752.1261855-1-huaisheng.ye@intel.com>
On 06/04/2025 19:27, Huaisheng Ye wrote:
> In some scenarios, the probe of endpoint ports fails because the range
> register (RR) decodes outside platform-defined CXL ranges.
>
> [kernel debug message]
> cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by
> platform
> cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL
> ranges.
> cxl_bus_probe:2073: cxl_port endpoint3: probe: -6
> call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port
> rejects match -6
>
> This defect has been reproducible on the QEMU CXL branch for a long while,
> with some probability, even on the latest branch cxl-2025-03-20.
Yeah, IIRC, the memdev cannot be enabled again after the guest reboots.
Previously, I had to apply this patch [1] to my local QEMU to work around it.
>
> The root cause of this defect is that the CXL_DVSEC_MEM_ENABLE bit of
> DVSEC control is already set, but in cxl_hdm_decode_init()
> CXL_HDM_DECODER_ENABLE has NOT been set and the endpoint's dvsec_range is
> not covered by the root decoder's hpa_range.
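To make the "not covered" condition concrete, here is a minimal userspace sketch of the coverage test, assuming it boils down to plain inclusive-range containment. The struct and both helpers (range_covers(), count_allowed()) are hypothetical stand-ins written for this reply, not the kernel's code; the in-kernel check walks the root decoders through device_find_child() and may look at more than containment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range; a userspace stand-in, not the kernel type. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* True when 'outer' fully covers 'inner' (both ends inclusive). */
static bool range_covers(const struct range *outer, const struct range *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/*
 * Hypothetical helper: count the DVSEC ranges that are covered by at
 * least one root decoder HPA window. This models the 'allowed' counter
 * in the quoted loop under the assumption stated above.
 */
static int count_allowed(const struct range *hpa, int nr_hpa,
			 const struct range *dvsec, int nr_dvsec)
{
	int allowed = 0;

	for (int i = 0; i < nr_dvsec; i++) {
		for (int j = 0; j < nr_hpa; j++) {
			if (range_covers(&hpa[j], &dvsec[i])) {
				allowed++;
				break; /* one covering window is enough */
			}
		}
	}
	return allowed;
}
```

With a single window of 0x100000000-0x1ffffffff, a DVSEC range that starts below 0x100000000 counts as not allowed, which is the situation the debug messages above report.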
>
> The patch can also help when firmware leaves a device in a similar state.
> Instead of returning an error when the RR decodes outside platform CXL
> ranges, the driver clears the Mem_Enable bit of DVSEC control and then
> proceeds as if info->mem_enabled had never been set.
>
> Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com>
Thanks for the fix. It works for me (after reverting [1]).
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
[1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/
> ---
> drivers/cxl/core/hdm.c | 2 +-
> drivers/cxl/core/pci.c | 35 +++++++++++++++++++++++++----------
> drivers/cxl/cxlmem.h | 2 ++
> 3 files changed, 28 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 50e6a45b30ba..b776fb848f42 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -80,7 +80,7 @@ static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
> u32 hdm_cap;
>
> hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> - cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
> + cxlhdm->decoder_count_cap = cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
> cxlhdm->target_count =
> FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
> if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap))
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 013b869b66cb..5452bb285140 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -403,7 +403,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> struct cxl_port *port = cxlhdm->port;
> struct device *dev = cxlds->dev;
> struct cxl_port *root;
> - int i, rc, allowed;
> + int i, rc, allowed = 0;
> u32 global_ctrl = 0;
>
> if (hdm)
> @@ -426,15 +426,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> return -ENODEV;
> }
>
> - if (!info->mem_enabled) {
> - rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> - if (rc)
> - return rc;
> -
> - return devm_cxl_enable_mem(&port->dev, cxlds);
> - }
> -
> - for (i = 0, allowed = 0; i < info->ranges; i++) {
> + for (i = 0; i < info->ranges; i++) {
> struct device *cxld_dev;
>
> cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
> @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> allowed++;
> }
>
> + if (info->mem_enabled && !allowed) {
> + dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n");
> +
> + /*
> + * Instead of Return error when RR decodes outside platform ranges, reenable
> + * Mem_Enable bit of DVSEC control for a try.
> + */
> + rc = cxl_set_mem_enable(cxlds, 0);
> + if (rc)
> + return rc;
> +
> + info->mem_enabled = 0;
> + cxlhdm->decoder_count = cxlhdm->decoder_count_cap;
> + }
> +
> + if (!info->mem_enabled) {
> + rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> + if (rc)
> + return rc;
> +
> + return devm_cxl_enable_mem(&port->dev, cxlds);
> + }
> +
> if (!allowed) {
> dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
> return -ENXIO;
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 2a25d1957ddb..60b538f8b677 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities
> * @regs: mapped registers, see devm_cxl_setup_hdm()
> * @decoder_count: number of decoders for this port
> + * @decoder_count_cap: number of decoders from HDM Decoder Capability
> * @target_count: for switch decoders, max downstream port targets
> * @interleave_mask: interleave granularity capability, see check_interleave_cap()
> * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap()
> @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> struct cxl_hdm {
> struct cxl_component_regs regs;
> unsigned int decoder_count;
> + unsigned int decoder_count_cap;
> unsigned int target_count;
> unsigned int interleave_mask;
> unsigned long iw_cap_mask;
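For anyone following the reordered branches in cxl_hdm_decode_init(), the decision logic of the pci.c hunk can be modelled in userspace roughly as below. This is only a sketch: decode_init_model(), enum decode_path and the clear_rc parameter are hypothetical names, clear_rc stands in for the return value of cxl_set_mem_enable(cxlds, 0), and the model assumes devm_cxl_enable_hdm() and devm_cxl_enable_mem() succeed.

```c
#include <errno.h>
#include <stdbool.h>

/* Which way the (modelled) function leaves the decision block. */
enum decode_path {
	PATH_ENABLE_HDM, /* enable HDM decoders, then set Mem_Enable */
	PATH_CONTINUE,   /* fall through to the code after the quoted hunk */
};

/*
 * Model of the post-patch branch order. Returns 0 and sets *path, or a
 * negative errno. 'allowed' is the number of DVSEC ranges accepted by
 * the platform; 'clear_rc' models the result of clearing Mem_Enable.
 */
static int decode_init_model(bool mem_enabled, int allowed, int clear_rc,
			     enum decode_path *path)
{
	if (mem_enabled && !allowed) {
		/* New in the patch: clear Mem_Enable instead of failing. */
		if (clear_rc)
			return clear_rc;
		mem_enabled = false;
	}

	if (!mem_enabled) {
		/* Assumed to succeed in this model. */
		*path = PATH_ENABLE_HDM;
		return 0;
	}

	if (!allowed)
		return -ENXIO;

	*path = PATH_CONTINUE;
	return 0;
}
```

In this simplified model, the final -ENXIO branch can no longer be reached: once Mem_Enable has been cleared successfully, the function always takes the HDM enable path. Whether that also holds for the real function depends on code outside the quoted hunk.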