Linux CXL
From: "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com>
To: Huaisheng Ye <huaisheng.ye@intel.com>,
	"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>
Cc: "pei.p.jia@intel.com" <pei.p.jia@intel.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Subject: Re: [RFC PATCH] cxl/core: reenable Mem_Enable bit of DVSEC control when RR decodes outside platform ranges
Date: Mon, 7 Apr 2025 08:31:13 +0000
Message-ID: <02fbb2a7-3973-4c7f-8b7d-cfabbb379467@fujitsu.com>
In-Reply-To: <20250406112752.1261855-1-huaisheng.ye@intel.com>



On 06/04/2025 19:27, Huaisheng Ye wrote:
> In some scenarios, the probe of endpoint ports fails because the range
> register (RR) decodes outside the platform-defined CXL ranges.
> 
>    [kernel debug message]
>    cxl_hdm_decode_init:447: cxl_pci 0000:10:00.0: DVSEC Range0 denied by
>    platform
>    cxl_pci 0000:10:00.0: Range register decodes outside platform defined CXL
>    ranges.
>    cxl_bus_probe:2073: cxl_port endpoint3: probe: -6
>    call_driver_probe:590: cxl_port endpoint3: probe with driver cxl_port
>    rejects match -6
> 
> This defect has been reproducible on the QEMU CXL branch for a long while
> with a certain probability, even with the latest branch cxl-2025-03-20.

Yeah, IIRC, the memdev cannot be enabled again after the guest reboots.
Previously, I had to apply this patch [1] to my local QEMU to work around it.



> 
> The root cause is that the CXL_DVSEC_MEM_ENABLE bit of the DVSEC control
> register is already set when cxl_hdm_decode_init() runs, while
> CXL_HDM_DECODER_ENABLE has NOT been set and the endpoint's dvsec_range is
> not covered by a root decoder's hpa_range.
> 
> This patch can help when firmware leaves a device in a similar state.
> Instead of returning an error when the RR decodes outside the platform CXL
> ranges, the driver clears the Mem_Enable bit of the DVSEC control register
> and then proceeds as if info->mem_enabled had not been set.
> 
> Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com>

Thanks for the fix. It works for me (after reverting [1]).

Tested-by: Li Zhijian <lizhijian@fujitsu.com>


[1] https://lore.kernel.org/linux-cxl/20240409075846.85370-1-lizhijian@fujitsu.com/
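
For readers following along, the "DVSEC Range0 denied by platform" message
corresponds to a coverage check: a DVSEC range counts as allowed only if some
root decoder window fully contains it. Below is a minimal userspace sketch of
that idea. It is not the kernel code: `struct range` here is a stand-in with
inclusive bounds, and `range_covers()` and `count_allowed()` are hypothetical
helper names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for an address range with inclusive bounds. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical helper: true when 'outer' fully covers 'inner'. */
bool range_covers(const struct range *outer, const struct range *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/*
 * Hypothetical helper: count how many DVSEC ranges are covered by at
 * least one platform (root decoder) window. A result of 0 models the
 * "decodes outside platform defined CXL ranges" case.
 */
int count_allowed(const struct range *windows, size_t nr_windows,
		  const struct range *dvsec, size_t nr_dvsec)
{
	int allowed = 0;

	assert(windows != NULL && dvsec != NULL);

	for (size_t i = 0; i < nr_dvsec; i++) {
		for (size_t w = 0; w < nr_windows; w++) {
			if (range_covers(&windows[w], &dvsec[i])) {
				allowed++;
				break; /* one covering window is enough */
			}
		}
	}
	return allowed;
}
```

With a single window and a DVSEC range that lies entirely below it,
`count_allowed()` returns 0, which is the situation the quoted debug log
describes.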

> ---
>   drivers/cxl/core/hdm.c |  2 +-
>   drivers/cxl/core/pci.c | 35 +++++++++++++++++++++++++----------
>   drivers/cxl/cxlmem.h   |  2 ++
>   3 files changed, 28 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 50e6a45b30ba..b776fb848f42 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -80,7 +80,7 @@ static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
>   	u32 hdm_cap;
>   
>   	hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> -	cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
> +	cxlhdm->decoder_count_cap = cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
>   	cxlhdm->target_count =
>   		FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
>   	if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap))
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 013b869b66cb..5452bb285140 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -403,7 +403,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
>   	struct cxl_port *port = cxlhdm->port;
>   	struct device *dev = cxlds->dev;
>   	struct cxl_port *root;
> -	int i, rc, allowed;
> +	int i, rc, allowed = 0;
>   	u32 global_ctrl = 0;
>   
>   	if (hdm)
> @@ -426,15 +426,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
>   		return -ENODEV;
>   	}
>   
> -	if (!info->mem_enabled) {
> -		rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> -		if (rc)
> -			return rc;
> -
> -		return devm_cxl_enable_mem(&port->dev, cxlds);
> -	}
> -
> -	for (i = 0, allowed = 0; i < info->ranges; i++) {
> +	for (i = 0; i < info->ranges; i++) {
>   		struct device *cxld_dev;
>   
>   		cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
> @@ -448,6 +440,29 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
>   		allowed++;
>   	}
>   
> +	if (info->mem_enabled && !allowed) {
> +		dev_warn(dev, "RR decodes outside ranges, have a try by disabling Mem_Enable bit.\n");
> +
> +		/*
> +		 * Instead of Return error when RR decodes outside platform ranges, reenable
> +		 * Mem_Enable bit of DVSEC control for a try.
> +		 */
> +		rc = cxl_set_mem_enable(cxlds, 0);
> +		if (rc)
> +			return rc;
> +
> +		info->mem_enabled = 0;
> +		cxlhdm->decoder_count = cxlhdm->decoder_count_cap;
> +	}
> +
> +	if (!info->mem_enabled) {
> +		rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
> +		if (rc)
> +			return rc;
> +
> +		return devm_cxl_enable_mem(&port->dev, cxlds);
> +	}
> +
>   	if (!allowed) {
>   		dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
>   		return -ENXIO;
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 2a25d1957ddb..60b538f8b677 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -855,6 +855,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
>    * struct cxl_hdm - HDM Decoder registers and cached / decoded capabilities
>    * @regs: mapped registers, see devm_cxl_setup_hdm()
>    * @decoder_count: number of decoders for this port
> + * @decoder_count_cap: number of decoders from HDM Decoder Capability
>    * @target_count: for switch decoders, max downstream port targets
>    * @interleave_mask: interleave granularity capability, see check_interleave_cap()
>    * @iw_cap_mask: bitmask of supported interleave ways, see check_interleave_cap()
> @@ -863,6 +864,7 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd);
>   struct cxl_hdm {
>   	struct cxl_component_regs regs;
>   	unsigned int decoder_count;
> +	unsigned int decoder_count_cap;
>   	unsigned int target_count;
>   	unsigned int interleave_mask;
>   	unsigned long iw_cap_mask;
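
The reordering in cxl_hdm_decode_init() quoted above can be modelled as a
small decision function. This is a sketch under stated assumptions, not the
driver: `struct model_state`, `enum decode_action`, and both function names
are hypothetical, `allowed` is treated as precomputed, and the failure path
of cxl_set_mem_enable() is left out. ACT_CONTINUE stands for "carry on with
the rest of the function", which the quoted hunks do not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of the modelled decision. */
enum decode_action {
	ACT_ENABLE_HDM,	/* enable HDM decoding, then set Mem_Enable */
	ACT_FAIL_ENXIO,	/* "decodes outside platform defined CXL ranges" */
	ACT_CONTINUE,	/* remainder of the function, not modelled here */
};

/* Hypothetical model of the inputs the quoted code branches on. */
struct model_state {
	bool mem_enabled;		/* mirrors info->mem_enabled */
	int allowed;			/* DVSEC ranges covered by the platform */
	bool mem_enable_cleared;	/* set when the model clears Mem_Enable */
};

/* Flow before the patch: an uncovered range with Mem_Enable set fails. */
enum decode_action decode_init_before(const struct model_state *s)
{
	assert(s != NULL);

	if (!s->mem_enabled)
		return ACT_ENABLE_HDM;
	if (!s->allowed)
		return ACT_FAIL_ENXIO;
	return ACT_CONTINUE;
}

/* Flow after the patch: clear Mem_Enable first, then retry via HDM. */
enum decode_action decode_init_after(struct model_state *s)
{
	assert(s != NULL);

	if (s->mem_enabled && !s->allowed) {
		s->mem_enable_cleared = true;	/* models cxl_set_mem_enable(cxlds, 0) */
		s->mem_enabled = false;
	}
	if (!s->mem_enabled)
		return ACT_ENABLE_HDM;
	if (!s->allowed)
		return ACT_FAIL_ENXIO;
	return ACT_CONTINUE;
}
```

For the state `{ .mem_enabled = true, .allowed = 0 }`, the "before" model
returns ACT_FAIL_ENXIO while the "after" model clears Mem_Enable and returns
ACT_ENABLE_HDM. In this simplified model the ACT_FAIL_ENXIO branch of the
"after" flow is only reachable if clearing Mem_Enable is skipped, which may
be worth checking against the real code.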

