From: Li Ming <ming4.li@outlook.com>
To: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com>,
Li Ming <ming4.li@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>,
Jonathan Cameron <jonathan.cameron@huawei.com>,
Dave Jiang <dave.jiang@intel.com>,
Alison Schofield <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Huang Ying <ying.huang@intel.com>,
Yao Xingtao <yaoxt.fnst@fujitsu.com>,
linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org
Subject: Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
Date: Tue, 26 Nov 2024 10:13:28 +0800 [thread overview]
Message-ID: <VI1PR10MB2016CDFC3A816ADC4F8840EECE2F2@VI1PR10MB2016.EURPRD10.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <2791149.btlEUcBR6m@fdefranc-mobl3>
On 11/26/2024 1:22 AM, Fabio M. De Francesco wrote:
> On Monday, November 25, 2024 9:42:56 AM GMT+1 Li Ming wrote:
>>
>> On 11/22/2024 11:51 PM, Fabio M. De Francesco wrote:
>>> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
>>> Physical Address (HPA) windows that are associated with each CXL Host
>>> Bridge. Each window represents a contiguous HPA that may be interleaved
>>> with one or more targets (CXL v3.1 - 9.18.1.3).
>>>
>>> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
>>> memory to which systems cannot send transactions. In some cases the size
>>> of that hole is not compatible with the CXL hardware decoder constraint
>>> that the size is always aligned to 256M * Interleave Ways.
>>>
>>> On those systems, BIOS publishes CFMWS which communicate the active System
>>> Physical Address (SPA) ranges that map to a subset of the Host Physical
>>> Address (HPA) ranges. The SPA range trims out the hole, and capacity in
>>> the endpoint is lost with no SPA to map to CXL HPA in that hole.
>>>
>>> In the early stages of CXL Regions construction and attach on platforms
>>> with Low Memory Holes, cxl_add_to_region() fails and returns an error
>>> because it can't find any CXL Window that matches a given CXL Endpoint
>>> Decoder.
>>>
>>> Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
>>> ranges: both must start at physical address 0 and end below 4 GB, while
>>> the Root Decoder ranges end at lower addresses than the matching Endpoint
>>> Decoders which instead must always respect the above-mentioned CXL hardware
>>> decoders HPA alignment constraint.
>>
>> Hi Fabio,
>>
>> Here mentions that the end address must be below 4GB, but I don't find any checking about that in the implementation, is it not needed?
>>
> Hi Ming,
>
> Good question, thanks!
>
> While a first version of arch_match_spa() had a check of 'r2->end < SZ_4G',
> I dropped it for the final one. It ended up out of sync with the commit message.
>
> I think that we don't want that check. I'll rework the commit message for v2.
>
> If the hardware decoders HPA ranges extended beyond the end of Low
> Memory, the LMH would still need to be detected and the decoders capacity
> would still need to be trimmed to match their corresponding CFMWS range end.
>>
>> [snip]
>>
>>
>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>> index ac2c486c16e9..3cb5a69e9731 100644
>>> --- a/drivers/cxl/core/region.c
>>> +++ b/drivers/cxl/core/region.c
>>> @@ -836,8 +836,12 @@ static int match_auto_decoder(struct device *dev, void *data)
>>> cxld = to_cxl_decoder(dev);
>>> r = &cxld->hpa_range;
>>>
>>> - if (p->res && p->res->start == r->start && p->res->end == r->end)
>>> - return 1;
>>> + if (p->res) {
>>> + if (p->res->start == r->start && p->res->end == r->end)
>>> + return 1;
>>> + if (arch_match_region(p, cxld))
>>> + return 1;
>>> + }
>>
>> I think that it is better to check LMH cases before checking (p->res->start == r->start && p->res->end == r->end).
>> Per the changelog and the implementation of arch_match_region(), below case should fails but current checking will succeed:
>> the value of 'p->res->start' is MISALIGNED_CFMWS_RANGE_BASE and the the value of 'p->res->end' is equals to the value of 'r->end'.
>>
> I think that the expected "normal" case should always come first. In the expected
> scenarios the driver deals either with SPA range == HPA range
> (e.g, in match_auto_decoder()) or with SPA range that fully contains the HPA range
> (match_decoder_by_range()).
>
> If either one of those two cases holds, arch_match_*() helper doesn't need to be
> called and the match must succeed.
>
> Besides, other architectures are free to define holes with many constraints that
> the driver doesn't want to check first if the expected case is already met.
>>
>>>
>>> return 0;
>>> }
>>> @@ -1414,7 +1418,8 @@ static int cxl_port_setup_targets(struct cxl_port *port,
>>> if (cxld->interleave_ways != iw ||
>>> cxld->interleave_granularity != ig ||
>>> cxld->hpa_range.start != p->res->start ||
>>> - cxld->hpa_range.end != p->res->end ||
>>> + (cxld->hpa_range.end != p->res->end &&
>>> + !arch_match_region(p, cxld)) ||
>>> ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
>>> dev_err(&cxlr->dev,
>>> "%s:%s %s expected iw: %d ig: %d %pr\n",
>>> @@ -1726,6 +1731,7 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>>> {
>>> struct cxl_endpoint_decoder *cxled = data;
>>> struct cxl_switch_decoder *cxlsd;
>>> + struct cxl_root_decoder *cxlrd;
>>> struct range *r1, *r2;
>>>
>>> if (!is_switch_decoder(dev))
>>> @@ -1735,8 +1741,13 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
>>> r1 = &cxlsd->cxld.hpa_range;
>>> r2 = &cxled->cxld.hpa_range;
>>>
>>> - if (is_root_decoder(dev))
>>> - return range_contains(r1, r2);
>>> + if (is_root_decoder(dev)) {
>>> + if (range_contains(r1, r2))
>>> + return 1;
>>> + cxlrd = to_cxl_root_decoder(dev);
>>> + if (arch_match_spa(cxlrd, cxled))
>>> + return 1;
>>
>> Same as above, what will happen if the root decoder's address range still contains endpoint decoder's address range in LMH cases? should fails or succeed?
>>
> If r1 contains r2, there is no LMH and the driver is dealing with the regular,
> expected, case. It must succeed.
>
> Think of the arch_match_*() helpers like something that avoid unwanted
> failures if arch permits exceptions. Before returning errors when the "normal"
> tests fail, check if the arch allows any exceptions and make the driver
> ignore that "strange" SPA/HPA misalignment.
>>
>> [snip]
>>
> Thanks,
>
> Fabio
>
>
Understand it, thanks for your explanation.
Ming
>
>
>
next prev parent reply other threads:[~2024-11-26 2:13 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-22 15:51 [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Fabio M. De Francesco
2024-11-22 15:51 ` [PATCH 1/3] cxl/core: Change match_*_by_range() calling convention Fabio M. De Francesco
2024-11-22 17:28 ` Ira Weiny
2024-11-25 21:10 ` Alison Schofield
2024-11-22 15:51 ` [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole Fabio M. De Francesco
2024-11-22 17:25 ` Ira Weiny
2024-11-25 11:23 ` Li Ming
2024-11-25 8:41 ` kernel test robot
2024-11-25 8:42 ` Li Ming
2024-11-25 17:22 ` Fabio M. De Francesco
2024-11-26 2:13 ` Li Ming [this message]
2024-11-25 13:37 ` kernel test robot
2024-11-25 20:35 ` Alison Schofield
2024-11-25 22:44 ` kernel test robot
2024-12-16 21:30 ` Robert Richter
2025-01-08 14:48 ` Fabio M. De Francesco
2025-01-09 10:58 ` Robert Richter
2025-01-10 16:06 ` Fabio M. De Francesco
2024-11-22 15:51 ` [PATCH 3/3] cxl/test: Simulate an x86 Low Memory Hole for tests Fabio M. De Francesco
2024-11-22 17:26 ` Ira Weiny
2024-11-25 20:46 ` Alison Schofield
2024-11-26 15:00 ` Jonathan Cameron
2024-11-22 19:46 ` [PATCH 0/3] cxl/core: Enable Region creation on x86 with Low Mem Hole Gregory Price
2024-11-25 22:00 ` Alison Schofield
2024-12-03 18:23 ` Fabio M. De Francesco
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=VI1PR10MB2016CDFC3A816ADC4F8840EECE2F2@VI1PR10MB2016.EURPRD10.PROD.OUTLOOK.COM \
--to=ming4.li@outlook.com \
--cc=alison.schofield@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=fabio.m.de.francesco@linux.intel.com \
--cc=ira.weiny@intel.com \
--cc=jonathan.cameron@huawei.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ming4.li@intel.com \
--cc=vishal.l.verma@intel.com \
--cc=yaoxt.fnst@fujitsu.com \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox