From: Gregory Price <gregory.price@memverge.com>
To: "Verma, Vishal L" <vishal.l.verma@intel.com>
Cc: "Williams, Dan J" <dan.j.williams@intel.com>,
"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Subject: Re: [GIT preview] for-6.3/cxl-ram-region
Date: Tue, 31 Jan 2023 18:03:53 -0500
Message-ID: <Y9meWfDiCGbca4nP@memverge.com>
In-Reply-To: <73ef066b15c5551087da3667398f462d427d3204.camel@intel.com>
On Tue, Jan 31, 2023 at 08:24:19PM +0000, Verma, Vishal L wrote:
> On Tue, 2023-01-31 at 19:46 +0000, Verma, Vishal L wrote:
> > On Tue, 2023-01-31 at 14:03 -0500, Gregory Price wrote:
> > >
> > >
> > > Right now I believe this is failing due to the interleave and size not
> > > having default values
> > >
> > > ./cxl create-region -m -t ram -d decoder0.0 -w 1 -g 4096 mem0
> > > cxl region: create_region: create_region: unable to determine region size
> > > cxl region: cmd_create_region: created 0 regions
> > >
> > >
> > > appears to be due to this code
> > > static int create_region(struct cxl_ctx *ctx, int *count,
> > >                          struct parsed_params *p)
> > > {
> > >         // ... snip ...
> > >         rc = create_region_validate_config(ctx, p);
> > >         if (rc)
> > >                 return rc;
> > >
> > >         if (p->size) {
> > >                 size = p->size;
> > >                 default_size = false;
> > >         } else if (p->ep_min_size) {
> > >                 size = p->ep_min_size * p->ways;
> > > **      } else {
> > > **              log_err(&rl, "%s: unable to determine region size\n", __func__);
> > > **              return -ENXIO;
> > > **      }
> > >
> > > So both size and ep_min_size are 0 here
> > >
> > > echo region0 > /sys/bus/cxl/devices/decoder0.0/create_ram_region
> > > cat /sys/bus/cxl/devices/region0/interleave_ways
> > > 0
> > > cat /sys/bus/cxl/devices/region0/interleave_granularity
> > > 0
> > > cat /sys/bus/cxl/devices/region0/size
> > > 0
> >
> > Ah - this revealed an actual bug in these commits - the size and
> > ep_min_size don't refer to the region's size, it is the capacity of the
> > component memdevs. Right after create_ram_region, the region size is
> > expected to be zero.
> >
> > However the bug here was a pmem assumption I had missed. When
> > determining sizes, we only look at pmem capacity, which is wrong. It
> > happened to work in my testing because the memdevs I used had both pmem
> > and ram capacity. I'll update with a fix shortly. Thanks for trying it
> > out and reporting this!
>
> I've updated the branch now with a fix for this.
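
A minimal sketch of the size fallback discussed above, to make the pmem-only assumption concrete. The types and function names here (`memdev_caps`, `region_mode`, `ep_min_size()`, `region_size()`) are hypothetical stand-ins, not the real libcxl or cxl-cli code; the only part taken from the thread is the shape of the fallback (explicit size first, otherwise minimum endpoint capacity times interleave ways).

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the libcxl types. */
enum region_mode { MODE_PMEM, MODE_RAM };

struct memdev_caps {
	uint64_t pmem_size;	/* persistent capacity in bytes */
	uint64_t ram_size;	/* volatile capacity in bytes */
};

/* Capacity of one memdev that matters for the requested region type. */
uint64_t memdev_capacity(const struct memdev_caps *m, enum region_mode mode)
{
	return mode == MODE_RAM ? m->ram_size : m->pmem_size;
}

/* Smallest matching capacity across all targets; 0 if there are none. */
uint64_t ep_min_size(const struct memdev_caps *devs, int n,
		     enum region_mode mode)
{
	uint64_t min = 0;

	for (int i = 0; i < n; i++) {
		uint64_t cap = memdev_capacity(&devs[i], mode);

		if (i == 0 || cap < min)
			min = cap;
	}
	return min;
}

/*
 * Same order as the quoted fallback: an explicit size wins, otherwise
 * use min * ways. A result of 0 means the size cannot be determined.
 */
uint64_t region_size(uint64_t requested, uint64_t min, unsigned int ways)
{
	if (requested)
		return requested;
	return min * ways;
}
```

With a RAM-only memdev, looking up the pmem capacity yields 0, so the fallback has nothing to multiply and the caller ends up in the "unable to determine region size" branch. Selecting the capacity by region type avoids that.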
Progress! But now I've found a kernel oops (NULL pointer dereference) :D
(sorry about the jumble here, looks like multiple issues)
[root@fedora cxl]# ./cxl create-region -m -t ram -d decoder0.0 -w 1 -g 4096 mem0
[ 170.675334] cxl_region region0: Failed to synchronize CPU cache state
libcxl: cxl_region_enable: region0: failed to enable
cxl region: create_region: region0: failed to enable: No such device or address
[ 170.682496] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 170.691163] #PF: supervisor instruction fetch in kernel mode
[ 170.703916] #PF: error_code(0x0010) - not-present page
[ 170.719709] PGD 800000004d25d067 P4D 800000004d25d067 PUD 4cdf3067 PMD 0
[ 170.725436] Oops: 0010 [#1] PREEMPT SMP PTI
[ 170.734510] CPU: 0 PID: 717 Comm: cxl Not tainted 6.2.0-rc2+ #19
[ 170.739750] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
[ 170.747119] RIP: 0010:0x0
[ 170.751110] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 170.757699] RSP: 0018:ffffb9a3c0e97c60 EFLAGS: 00010296
[ 170.766091] RAX: 0000000000000000 RBX: ffff9c38e459de60 RCX: 0000000000000000
[ 170.772499] RDX: 0000000000000000 RSI: ffff9c38e42ecdb0 RDI: ffff9c390f11d400
[ 170.778300] RBP: ffff9c38eed38000 R08: 0000000000000001 R09: ffffb9a3c0e97b38
[ 170.783787] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c393d8c8c00
[ 170.788009] R13: ffff9c390f141c00 R14: ffff9c38eed38340 R15: ffff9c38c1a01400
[ 170.795938] FS: 00007ff89ca037c0(0000) GS:ffff9c393dc00000(0000) knlGS:0000000000000000
[ 170.802891] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 170.806705] CR2: ffffffffffffffd6 CR3: 0000000024c8e000 CR4: 00000000000006f0
[ 170.817025] Call Trace:
[ 170.818831] <TASK>
[ 170.820589] cxl_region_decode_reset+0xb8/0x110
[ 170.823893] cxl_region_detach+0xda/0x1e0
[ 170.829457] detach_target.part.0+0x29/0x80
[ 170.833503] unregister_region+0x42/0x90
[ 170.836813] devm_release_action+0x3d/0x70
[ 170.840128] ? __pfx_unregister_region+0x10/0x10
[ 170.843899] delete_region_store+0x69/0x80
[ 170.847680] kernfs_fop_write_iter+0x11e/0x200
[ 170.851217] vfs_write+0x222/0x3e0
[ 170.854141] ksys_write+0x5b/0xd0
[ 170.856695] do_syscall_64+0x5b/0x80
[ 170.859678] ? kmem_cache_free+0x15/0x3b0
[ 170.862234] ? do_sys_openat2+0x77/0x150
[ 170.865560] ? syscall_exit_to_user_mode+0x17/0x40
[ 170.870920] ? do_syscall_64+0x67/0x80
[ 170.874726] ? syscall_exit_to_user_mode+0x17/0x40
[ 170.879464] ? do_syscall_64+0x67/0x80
[ 170.881634] ? __irq_exit_rcu+0x3d/0x140
[ 170.884720] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 170.888810] RIP: 0033:0x7ff89c901c37
[ 170.891435] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 4
[ 170.905803] RSP: 002b:00007fff0e843a68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 170.913373] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007ff89c901c37
[ 170.920868] RDX: 0000000000000008 RSI: 0000000001290ee6 RDI: 0000000000000003
[ 170.931402] RBP: 00007fff0e843aa0 R08: 000000000000fee0 R09: 0000000000000073
[ 170.936639] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 170.942484] R13: 00007fff0e844000 R14: 000000000041fdc8 R15: 00007ff89cbdf000
[ 170.954794] </TASK>
[ 170.957649] Modules linked in: rfkill vfat fat snd_pcm iTCO_wdt snd_timer intel_pmc_bxt ppdev iTCO_vendor_support snd cxl_pmem soundcore bochg
[ 170.980623] CR2: 0000000000000000
[ 170.984137] ---[ end trace 0000000000000000 ]---
[ 170.989062] RIP: 0010:0x0
[ 170.991505] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 170.996401] RSP: 0018:ffffb9a3c0e97c60 EFLAGS: 00010296
[ 170.999716] RAX: 0000000000000000 RBX: ffff9c38e459de60 RCX: 0000000000000000
[ 171.006146] RDX: 0000000000000000 RSI: ffff9c38e42ecdb0 RDI: ffff9c390f11d400
[ 171.018226] RBP: ffff9c38eed38000 R08: 0000000000000001 R09: ffffb9a3c0e97b38
[ 171.024812] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c393d8c8c00
[ 171.036512] R13: ffff9c390f141c00 R14: ffff9c38eed38340 R15: ffff9c38c1a01400
[ 171.042400] FS: 00007ff89ca037c0(0000) GS:ffff9c393dc00000(0000) knlGS:0000000000000000
[ 171.050182] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 171.055740] CR2: ffffffffffffffd6 CR3: 0000000024c8e000 CR4: 00000000000006f0
Killed
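
For reference, "supervisor instruction fetch" together with RIP: 0010:0x0 means the CPU tried to execute code at address 0, which is most commonly a call through a NULL function pointer, here somewhere under cxl_region_decode_reset(). The trace alone does not say which pointer. The sketch below only illustrates the general pattern of guarding an optional callback; `struct decoder`, `decoder_reset()`, and `clear_commit()` are hypothetical names and not the kernel's actual structures or fix.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical decoder with an optional reset operation. */
struct decoder {
	int committed;
	int (*reset)(struct decoder *d);	/* may be NULL if unsupported */
};

/* Example reset op: clear the committed flag and report success. */
int clear_commit(struct decoder *d)
{
	d->committed = 0;
	return 0;
}

/*
 * Call the reset op only if one is installed. Without this check, a
 * decoder whose reset pointer is NULL would jump to address 0.
 * Returns -ENXIO when there is nothing to call.
 */
int decoder_reset(struct decoder *d)
{
	if (!d || !d->reset)
		return -ENXIO;
	return d->reset(d);
}
```

Whether a missing op should be an error or a silent no-op is a design choice; the sketch returns -ENXIO so the caller can tell the two cases apart.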