From: Gregory Price <gourry@gourry.net>
To: Huaisheng Ye <huaisheng.ye@intel.com>
Cc: Jonathan.Cameron@huawei.com, dan.j.williams@intel.com,
dave.jiang@intel.com, pei.p.jia@intel.com,
linux-cxl@vger.kernel.org
Subject: Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
Date: Mon, 7 Apr 2025 23:22:53 -0400
Message-ID: <Z_SWjQA4qb3OE6Dk@gourry-fedora-PF4VCD3F>
In-Reply-To: <20250406112752.1261855-1-huaisheng.ye@intel.com>
On Sun, Apr 06, 2025 at 07:27:52PM +0800, Huaisheng Ye wrote:
> In some scenarios, the probe of endpoint ports would fail because of range
> register (RR) decodes outside platform defined CXL ranges.
>
> [kernel debug message]
> cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by
> platform
> cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL
> ranges.
> cxl_bus_probe:2073: cxl_port endpoint3: probe: -6
> call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port
> rejects match -6
>
> This defect could be found with Qemu CXL branch for a long while with a
> specified probability, even with the latest branch cxl-2025-03-20.
>
> The root cause of this defect comes from that, bit CXL_DVSEC_MEM_ENABLE of
> DVSEC control has been set but in cxl_hdm_decode_init
> CXL_HDM_DECODER_ENABLE has NOT been set and also endpoint's dvsec_range is
> not covered by root decoder's hpa_range.
>

The explanation here is a bit confusing. Please correct me if my
understanding of the issue is wrong.

Observed problem:
  Some firmware/BIOS sets MEM_ENABLED, does not set HDM_DECODER_ENABLED,
  and does not program the Range registers. This is possibly the
  result of mistakenly defaulting MEM_ENABLED to 1, rather than a
  programming error / failure.

Suggested solution:
  Linux should detect this, reset the MEM_ENABLED bit, and simply
  attempt to enable the HDM decoders accordingly.

Question: Is this only observed with QEMU? If so, can we just fix QEMU?
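
To make the case analysis above concrete, here is a minimal userspace
sketch of the decision as I understand it. The struct, enum, and function
names are hypothetical stand-ins and are not the kernel's types; the real
logic lives in cxl_hdm_decode_init() and differs in detail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are not the kernel's structs. */
struct hpa_range {
	uint64_t start;
	uint64_t end; /* inclusive */
};

struct endpoint_state {
	bool mem_enabled;             /* DVSEC control Mem_Enable bit */
	bool hdm_decoder_enabled;     /* HDM decoder global enable bit */
	struct hpa_range dvsec_range; /* what the DVSEC range register decodes */
};

enum decode_action {
	ACTION_KEEP_HDM,            /* HDM decoders already enabled */
	ACTION_ENABLE_HDM,          /* nothing enabled yet, program HDM decoders */
	ACTION_USE_RANGE_REGISTERS, /* range registers fall inside the platform window */
	ACTION_RESET_MEM_ENABLE,    /* the case discussed here: clear Mem_Enable, retry with HDM */
};

/* True when 'inner' lies entirely within 'outer'. */
static bool range_contains(const struct hpa_range *outer,
			   const struct hpa_range *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/* Assumed ordering of checks; a sketch, not the driver's actual flow. */
static enum decode_action classify_endpoint(const struct endpoint_state *ep,
					    const struct hpa_range *root)
{
	if (ep->hdm_decoder_enabled)
		return ACTION_KEEP_HDM;
	if (!ep->mem_enabled)
		return ACTION_ENABLE_HDM;
	if (range_contains(root, &ep->dvsec_range))
		return ACTION_USE_RANGE_REGISTERS;
	return ACTION_RESET_MEM_ENABLE;
}
```

The last branch is the only one the patch changes: today it ends in a
probe failure, and the proposal turns it into a reset of Mem_Enable
followed by a normal HDM decoder setup.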
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 013b869b66cb..5452bb285140 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> allowed++;
> }
>
> + if (info->mem_enabled && !allowed) {
> + dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n");
> +
> + /*
> + * Instead of Return error when RR decodes outside platform ranges, reenable
> + * Mem_Enable bit of DVSEC control for a try.
> + */

Your comment says to "reenable" the Mem_Enable bit, but the code clears it.
I think you mean to say "reset mem_enable bit, and try to enable hdm".

> + rc = cxl_set_mem_enable(cxlds, 0);
> + if (rc)
> + return rc;
> +
> + info->mem_enabled = 0;
> + cxlhdm->decoder_count = cxlhdm->decoder_count_cap;
> + }
> +
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 2a25d1957ddb..60b538f8b677 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities
> * @regs: mapped registers, see devm_cxl_setup_hdm()
> * @decoder_count: number of decoders for this port
> + * @decoder_count_cap: number of decoders from HDM Decoder Capability
> * @target_count: for switch decoders, max downstream port targets
> * @interleave_mask: interleave granularity capability, see check_interleave_cap()
> * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap()
> @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
> struct cxl_hdm {
> struct cxl_component_regs regs;
> unsigned int decoder_count;
> + unsigned int decoder_count_cap;
Why is this needed, as opposed to simply re-reading the count?
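
For reference, re-deriving the count from the capability register is
cheap. Below is a standalone sketch of the decode, assuming the Decoder
Count field sits in bits 3:0 of the HDM Decoder Capability register and
uses the encoding I remember from the spec (0 means 1 decoder, 1-8 mean
twice the value, 9-12 mean 20/24/28/32). The macro and function names
here are made up for the example; please check them against the spec and
cxl.h before relying on this.

```c
#include <stdint.h>

/* Assumed field position: bits 3:0 of the HDM Decoder Capability register. */
#define EXAMPLE_HDM_DECODER_COUNT_MASK 0xfu

/*
 * Translate the encoded Decoder Count field into a number of decoders.
 * Returns -1 for encodings this sketch does not recognize.
 */
static int example_hdm_decoder_count(uint32_t cap)
{
	unsigned int val = cap & EXAMPLE_HDM_DECODER_COUNT_MASK;

	if (val == 0)
		return 1;
	if (val <= 8)
		return (int)val * 2;
	if (val <= 12)
		return (int)(val - 4) * 4;
	return -1;
}
```

If the capability register can be read again at this point, the decoded
value would not need to be cached in a second struct field.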
~Gregory