From: Itaru Kitayama <itaru.kitayama@linux.dev>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
linux-cxl@vger.kernel.org
Subject: Re: Internal error: Oops: 0000000096000044 [#11] SMP
Date: Fri, 23 May 2025 06:46:53 +0900 [thread overview]
Message-ID: <FD4183E1-162E-4790-B865-E50F20249A74@linux.dev> (raw)
In-Reply-To: <20250522145622.00002633@huawei.com>
Hi Jonathan,
> On May 22, 2025, at 22:56, Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>
> On Wed, 21 May 2025 16:34:16 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
>> Itaru Kitayama wrote:
>>> Dave et al.,
>> [..]
>>> Rebuilt the rootfs image and tried today's cxl/next
>>> (6.15.0-rc4-00046-g6eed708a5693) again; now I don't see the
>>> splats, so something was wrong in my dev environment, sorry about
>>> that.
>>>
>>> CXL utility commands work reasonably now and I can execute meson test
>>> --suite cxl, though most of the tests still fail due to an HPA allocation
>>> error, which puzzles me since the resource requests are quite modest.
>>
>> So cxl_test_init() just "hopes" that the top of the system physical
>> address space is free to use to emulate CXL windows. That might be an
>> assumption that only works for x86_64, not ARM64. I would double check
>> that this code in cxl_test_init()
>>
>> rc = gen_pool_add(cxl_mock_pool, iomem_resource.end + 1 - SZ_64G,
>>                   SZ_64G, NUMA_NO_NODE);
>> if (rc)
>>         goto err_gen_pool_add;
>>
>> ...is not setting up CXL Windows that overlap with existing resources in
>> that range.
>>
>
> I think there are checks that block use of ranges up there.
>
> The print I'm seeing is
> Hotplug memory [0xfffffff010000000-0xfffffff030000000] exceeds maximum addressable range [0x40000000-0xf80003fffffff]
>
> I think the right answer is to use mhp_get_pluggable_range(true) to check
> for limits on the range we can use.
>
> On architectures that don't define arch_get_mappable_range()
> that ends up as (unsigned long)-1, which I think would work,
> though there may be other stuff up there. Maybe min(iomem_resource.end + 1 - SZ_64G,
> mappable_range.end + 1 - SZ_64G)
> or something like that, adapted to avoid wrap around.
>
> I haven't yet sanity checked this doesn't break x86 but I think it should
> end up making no difference to the locations on x86.
>
>
> With the below, all 11 tests in the ndctl cxl test suite pass for me.
>
> From b287ff2c5ee7fbe507ef8cb61df3e4e156a9773f Mon Sep 17 00:00:00 2001
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Date: Thu, 22 May 2025 14:20:42 +0100
> Subject: [PATCH] cxl_test: Limit location for fake CFMWS to mappable range
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> tools/testing/cxl/test/cxl.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
> index 8a5815ca870d..b4e6c7659ac4 100644
> --- a/tools/testing/cxl/test/cxl.c
> +++ b/tools/testing/cxl/test/cxl.c
> @@ -1328,6 +1328,7 @@ static int cxl_mem_init(void)
>  static __init int cxl_test_init(void)
>  {
>  	int rc, i;
> +	struct range mappable;
>  
>  	cxl_acpi_test();
>  	cxl_core_test();
> @@ -1342,8 +1343,11 @@ static __init int cxl_test_init(void)
>  		rc = -ENOMEM;
>  		goto err_gen_pool_create;
>  	}
> +	mappable = mhp_get_pluggable_range(true);
>  
> -	rc = gen_pool_add(cxl_mock_pool, iomem_resource.end + 1 - SZ_64G,
> +	rc = gen_pool_add(cxl_mock_pool,
> +			  min(iomem_resource.end + 1 - SZ_64G,
> +			      mappable.end + 1 - SZ_64G),
>  			  SZ_64G, NUMA_NO_NODE);
>  	if (rc)
>  		goto err_gen_pool_add;
> --
> 2.43.0
>
Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
# meson test --suite cxl
ninja: Entering directory `/root/ndctl/build'
[1/82] Generating version.h with a custom command
1/12 ndctl:cxl / cxl-topology.sh OK 33.96s
2/12 ndctl:cxl / cxl-region-sysfs.sh OK 18.00s
3/12 ndctl:cxl / cxl-labels.sh OK 23.78s
4/12 ndctl:cxl / cxl-create-region.sh OK 43.03s
5/12 ndctl:cxl / cxl-xor-region.sh OK 19.30s
6/12 ndctl:cxl / cxl-events.sh FAIL 6.40s exit status 1
>>> LD_LIBRARY_PATH=/root/ndctl/build/daxctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/ndctl/lib MALLOC_PERTURB_=45 TEST_PATH=/root/ndctl/build/test UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 MESON_TEST_ITERATION=1 DAXCTL=/root/ndctl/build/daxctl/daxctl NDCTL=/root/ndctl/build/ndctl/ndctl ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 DATA_PATH=/root/ndctl/test /bin/bash /root/ndctl/test/cxl-events.sh
7/12 ndctl:cxl / cxl-sanitize.sh OK 14.77s
8/12 ndctl:cxl / cxl-destroy-region.sh OK 13.69s
9/12 ndctl:cxl / cxl-qos-class.sh OK 14.31s
10/12 ndctl:cxl / cxl-poison.sh FAIL 3.46s exit status 1
>>> LD_LIBRARY_PATH=/root/ndctl/build/daxctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/ndctl/lib MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 MALLOC_PERTURB_=80 UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 TEST_PATH=/root/ndctl/build/test MESON_TEST_ITERATION=1 DAXCTL=/root/ndctl/build/daxctl/daxctl NDCTL=/root/ndctl/build/ndctl/ndctl ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 DATA_PATH=/root/ndctl/test /bin/bash /root/ndctl/test/cxl-poison.sh
11/12 ndctl:cxl / cxl-update-firmware.sh OK 66.23s
12/12 ndctl:cxl / cxl-security.sh SKIP 0.34s exit status 77
Ok: 9
Expected Fail: 0
Fail: 2
Unexpected Pass: 0
Skipped: 1
Timeout: 0
My understanding is that these CXL tests use mock CFMWSs, not actual physical memory regions at fixed locations. So I wonder whether it matters to run this set of tests on the "sane" CXL emulation setup (created by run_qemu.sh) that the Intel folks are using.
Itaru.
Thread overview: 15+ messages
2025-05-21 8:39 Internal error: Oops: 0000000096000044 [#11] SMP Itaru Kitayama
2025-05-21 15:31 ` Dave Jiang
2025-05-21 20:38 ` Itaru Kitayama
2025-05-21 20:46 ` Dave Jiang
2025-05-21 23:28 ` Itaru Kitayama
2025-05-21 23:34 ` Dan Williams
2025-05-22 13:56 ` Jonathan Cameron
2025-05-22 18:19 ` Dan Williams
2025-05-22 21:46 ` Itaru Kitayama [this message]
2025-05-23 3:28 ` Alison Schofield
2025-05-23 4:56 ` Itaru Kitayama
2025-05-23 5:52 ` Marc Herbert
2025-05-21 15:33 ` Alison Schofield
2025-05-21 15:36 ` Jonathan Cameron
2025-05-21 15:41 ` Alison Schofield