Linux CXL
 help / color / mirror / Atom feed
From: Dave Jiang <dave.jiang@intel.com>
To: Dan Williams <dan.j.williams@intel.com>,
	"Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com>,
	Huaisheng Ye <huaisheng.ye@intel.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"pei.p.jia@intel.com" <pei.p.jia@intel.com>
Subject: Re: [PATCH] [RFC] cxl/region: Fix region creation for greater than x2 switches
Date: Thu, 7 Nov 2024 08:23:28 -0700	[thread overview]
Message-ID: <1f05ccef-5b45-4eac-b3ca-588b1e5ec6f5@intel.com> (raw)
In-Reply-To: <672c13b19a530_10bc6294bd@dwillia2-xfh.jf.intel.com.notmuch>



On 11/6/24 6:11 PM, Dan Williams wrote:
> [ add Dave so you 
> 
> Zhijian Li (Fujitsu) wrote:
>>
>>
>> On 27/10/2024 15:57, Huaisheng Ye wrote:
>>> The cxl_port_setup_targets() algorithm fails to identify valid target list
>>> ordering in the presence of 4-way and above switches resulting in
>>> 'cxl create-region' failures of the form:
>>>
>>>    # cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0
>>>    cxl region: create_region: region0: failed to set target7 to mem0
>>>    cxl region: cmd_create_region: created 0 regions
>>>
>>>    [kernel debug message]
>>>    check_last_peer:1213: cxl region0: pci0000:0c:port1: cannot host mem6:decoder7.0 at 2
>>>    bus_remove_device:574: bus: 'cxl': remove device region0
>>>
>>> QEMU can create this failing topology:
>>>
>>>                         ACPI0017:00 [root0]
>>>                             |
>>>                           HB_0 [port1]
>>>                          /             \
>>>                       RP_0             RP_1
>>>                        |                 |
>>>                  USP [port2]           USP [port3]
>>>              /    /    \    \        /   /    \    \
>>>            DSP   DSP   DSP   DSP   DSP  DSP   DSP  DSP
>>>             |     |     |     |     |    |     |    |
>>>            mem4  mem6  mem2  mem7  mem1 mem3  mem5  mem0
>>>   Pos:      0     2     4     6     1    3     5    7
>>
>> Yeah, I tried this topology long long ago, it didn't work. At that time, I thought it
>> might be just like that. Until recently that I saw this [1] in section
>> 2.13.15.1 Region Spanning 2 HB Root Ports Example Configuration Checks
>>
>> I once tried to understand why the code used "distance" to determine the order of the target,
>> but in the end, I still couldn't figure it out (and I still don't understand it now).
>> IIRC, neither the CXL spec nor this document mentioned the keyword "distance" at all.
> 
> Oh, that means this needs a comment or a better variable name.
> 
> In this patch discussion [1] Jim came up with the term "ancestral_ways"
> to describe the same concept of what is the offset ("distance") between
> consecutive indices in the target list.
> 
> [1]: http://lore.kernel.org/ZUHeTLZb+od8q4EE@ubuntu
> 
> Does "ancestral_ways" more clearly convey the math that is being
> performed at each level of the hierarchy?
> 
> Now, "ancestral_ways" also does not show up in the CXL specification,
> but that is because the CXL specification leaves at as an exercise for
> software to figure out an algorithm to validate that a proposed ordering
> of memory-device-decoders in a region can be supported by the given CXL
> topology.
> 
> In the meantime I have flagged this patch to Dave for consideration in
> the next cxl-fixes pull request, but I suspect it will need to be
> updated with a comment and/or rename to resovle the "distance"
> confusion.
> 
> https://patchwork.kernel.org/bundle/cxllinux/cxl-fixes/

If we can get it respin and tagged by next week, there's time to get it into the 6.13 merge window. Otherwise it can wait until 6.13-rc fixes.
> 


  reply	other threads:[~2024-11-07 15:23 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-27  7:57 [PATCH] [RFC] cxl/region: Fix region creation for greater than x2 switches Huaisheng Ye
2024-10-31  9:33 ` Zhijian Li (Fujitsu)
2024-11-06  9:34   ` Ye, Huaisheng
2024-11-07  1:11   ` Dan Williams
2024-11-07 15:23     ` Dave Jiang [this message]
2024-11-12  1:56       ` Ye, Huaisheng
2024-12-10  0:08         ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1f05ccef-5b45-4eac-b3ca-588b1e5ec6f5@intel.com \
    --to=dave.jiang@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=huaisheng.ye@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizhijian@fujitsu.com \
    --cc=pei.p.jia@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox