From: Itaru Kitayama <itaru.kitayama@linux.dev>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
linux-cxl@vger.kernel.org
Subject: Re: Internal error: Oops: 0000000096000044 [#11] SMP
Date: Fri, 23 May 2025 06:46:53 +0900
Message-ID: <FD4183E1-162E-4790-B865-E50F20249A74@linux.dev>
In-Reply-To: <20250522145622.00002633@huawei.com>
Hi Jonathan,
> On May 22, 2025, at 22:56, Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>
> On Wed, 21 May 2025 16:34:16 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
>> Itaru Kitayama wrote:
>>> Dave et al.,
>> [..]
>>> Rebuilt the rootfs image and tried today’s cx/next
>>> (6.15.0-rc4-00046-g6eed708a5693) again to boot now I don’t see the
>>> splats, so something I was messing my dev environment sorry about
>>> that.
>>>
>>> CXL utility commands work reasonably now and I can execute meson test
>>> --suite cxl, while most of them still fails due to the HPA allocation
>>> error which makes me wonder as the resource requests are quite modest.
>>
>> So cxl_test_init() just "hopes" that the top of the system physical
>> address space is free to use to emulate CXL windows. That might be an
>> assumption that only works for x86_64, not ARM64. I would double check
>> that this code in cxl_test_init()
>>
>> rc = gen_pool_add(cxl_mock_pool, iomem_resource.end + 1 - SZ_64G,
>> SZ_64G, NUMA_NO_NODE);
>> if (rc)
>> goto err_gen_pool_add;
>>
>> ...is not setting up CXL Windows that overlap with existing resources in
>> that range.
>>
>
> I think there are checks that block use of ranges up there.
>
> Print I'm seeing is
> Hotplug memory [0xfffffff010000000-0xfffffff030000000] exceeds maximum addressable range [0x40000000-0xf80003fffffff]
>
> I think right answer is to use mhp_get_pluggable_range(true); to check
> for limits on the range we can use.
>
> On architectures that don't define arch_get_mappable_range()
> that ends up the as (unsigned long)-1 which I think would work
> though there may be other stuff up there. Maybe min(iomem_resource.end + 1 - SZ_64G,
> mappable_range.end + 1 - SZ_64G)
> or something like that adapted to avoid wrap around.
>
> I haven't yet sanity checked this doesn't break x86 but I think it should
> end up making no difference to the locations on x86.
>
>
> With the below - all 11 tests in ndctl cxl test suite pass for me.
>
> From b287ff2c5ee7fbe507ef8cb61df3e4e156a9773f Mon Sep 17 00:00:00 2001
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Date: Thu, 22 May 2025 14:20:42 +0100
> Subject: [PATCH] cxl_test: Limit location for fake CFMWS to mappable range
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> tools/testing/cxl/test/cxl.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
> index 8a5815ca870d..b4e6c7659ac4 100644
> --- a/tools/testing/cxl/test/cxl.c
> +++ b/tools/testing/cxl/test/cxl.c
> @@ -1328,6 +1328,7 @@ static int cxl_mem_init(void)
> static __init int cxl_test_init(void)
> {
> int rc, i;
> + struct range mappable;
>
> cxl_acpi_test();
> cxl_core_test();
> @@ -1342,8 +1343,11 @@ static __init int cxl_test_init(void)
> rc = -ENOMEM;
> goto err_gen_pool_create;
> }
> + mappable = mhp_get_pluggable_range(true);
>
> - rc = gen_pool_add(cxl_mock_pool, iomem_resource.end + 1 - SZ_64G,
> + rc = gen_pool_add(cxl_mock_pool,
> + min(iomem_resource.end + 1 - SZ_64G,
> + mappable.end + 1 - SZ_64G),
> SZ_64G, NUMA_NO_NODE);
> if (rc)
> goto err_gen_pool_add;
> --
> 2.43.0
>
Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
# meson test --suite cxl
ninja: Entering directory `/root/ndctl/build'
[1/82] Generating version.h with a custom command
1/12 ndctl:cxl / cxl-topology.sh OK 33.96s
2/12 ndctl:cxl / cxl-region-sysfs.sh OK 18.00s
3/12 ndctl:cxl / cxl-labels.sh OK 23.78s
4/12 ndctl:cxl / cxl-create-region.sh OK 43.03s
5/12 ndctl:cxl / cxl-xor-region.sh OK 19.30s
6/12 ndctl:cxl / cxl-events.sh FAIL 6.40s exit status 1
>>> LD_LIBRARY_PATH=/root/ndctl/build/daxctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/ndctl/lib MALLOC_PERTURB_=45 TEST_PATH=/root/ndctl/build/test UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 MESON_TEST_ITERATION=1 DAXCTL=/root/ndctl/build/daxctl/daxctl NDCTL=/root/ndctl/build/ndctl/ndctl ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 DATA_PATH=/root/ndctl/test /bin/bash /root/ndctl/test/cxl-events.sh
7/12 ndctl:cxl / cxl-sanitize.sh OK 14.77s
8/12 ndctl:cxl / cxl-destroy-region.sh OK 13.69s
9/12 ndctl:cxl / cxl-qos-class.sh OK 14.31s
10/12 ndctl:cxl / cxl-poison.sh FAIL 3.46s exit status 1
>>> LD_LIBRARY_PATH=/root/ndctl/build/daxctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/ndctl/lib MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 MALLOC_PERTURB_=80 UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 TEST_PATH=/root/ndctl/build/test MESON_TEST_ITERATION=1 DAXCTL=/root/ndctl/build/daxctl/daxctl NDCTL=/root/ndctl/build/ndctl/ndctl ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 DATA_PATH=/root/ndctl/test /bin/bash /root/ndctl/test/cxl-poison.sh
11/12 ndctl:cxl / cxl-update-firmware.sh OK 66.23s
12/12 ndctl:cxl / cxl-security.sh SKIP 0.34s exit status 77
Ok: 9
Expected Fail: 0
Fail: 2
Unexpected Pass: 0
Skipped: 1
Timeout: 0
My understanding is that these CXL tests use mock CFMWS entries, not actual physical memory regions at their fixed locations. So I wonder whether it matters if this set of tests is run on a "sane" CXL emulation setup (the one run_qemu.sh creates) like the one the Intel folks are using.
Itaru.