Re: [PATCH 0/8] hw/cxl: CXL emulation cleanups and minor fixes for upstream

From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Gregory Price <gregory.price@memverge.com>
Cc: Lukas Wunner <lukas@wunner.de>, <qemu-devel@nongnu.org>,
	Michael Tsirkin <mst@redhat.com>,
	Ben Widawsky <bwidawsk@kernel.org>, <linux-cxl@vger.kernel.org>,
	<linuxarm@huawei.com>, Ira Weiny <ira.weiny@intel.com>,
	Gregory Price <gourry.memverge@gmail.com>,
	"Dan Williams" <dan.j.williams@intel.com>
Subject: Re: [PATCH 0/8] hw/cxl: CXL emulation cleanups and minor fixes for upstream
Date: Thu, 19 Jan 2023 12:42:44 +0000
Message-ID: <20230119124244.000015b3@Huawei.com>
In-Reply-To: <Y8hJKcy1993SFLLJ@memverge.com>

On Wed, 18 Jan 2023 14:31:53 -0500
Gregory Price <gregory.price@memverge.com> wrote:

> I apparently forgot an intro lol
> 
> I tested the DOE linux branch with the 2023-1-11 QEMU branch in
> volatile, non-volatile, and "legacy" (pre-my-patch) non-volatile modes.
> 
> 1) *In volatile mode, there are no stack traces present (during boot)*
> 
> On Wed, Jan 18, 2023 at 02:22:08PM -0500, Gregory Price wrote:
> > 
> > 1) No stack traces present
> > 2) Device usage appears to work, but cxl-cli fails to create a region. I
> > haven't checked why yet (also tried ndctl-75, same results)
> > 3) There seems to be some other regression with the cxl_pmem_init
> > routine, because I get a stack trace in this setup regardless of whether
> > I apply the type-3 device commit.
> > 
> > 
> > All tests below with the previously posted DOE linux branch.
> > Base QEMU branch was Jonathan's 2023-1-11
> > 
> > 
> > DOE Branch - 2023-1-11 (HEAD) (all commits)
> > 
> > QEMU Config:
> > sudo /opt/qemu-cxl/bin/qemu-system-x86_64 \
> > -drive file=/var/lib/libvirt/images/cxl.qcow2,format=qcow2,index=0,media=disk,id=hd \
> > -m 3G,slots=4,maxmem=8G \
> > -smp 4 \
> > -machine type=q35,accel=kvm,cxl=on \
> > -enable-kvm \
> > -nographic \
> > -object memory-backend-ram,id=mem0,size=1G,share=on \
> > -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \
> > -device cxl-rp,id=rp0,bus=cxl.0,chassis=0,slot=0 \
> > -device cxl-type3,bus=rp0,volatile-memdev=mem0,id=cxl-mem0 \
> > -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=1G
> > 
> > Result:  This worked, but cxl-cli could not create a region (will look
> > into this further later).
> > 
> > 
> > 
> > 
> > When running with a persistent memory configuration, I'm seeing a
> > kernel stack trace on cxl_pmem_init
> > 
> > Config:
> > sudo /opt/qemu-cxl/bin/qemu-system-x86_64 \
> > -drive file=/var/lib/libvirt/images/cxl.qcow2,format=qcow2,index=0,media=disk,id=hd \
> > -m 3G,slots=4,maxmem=4G \
> > -smp 4 \
> > -machine type=q35,accel=kvm,cxl=on \
> > -enable-kvm \
> > -nographic \
> > -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \
> > -device cxl-rp,port=0,id=rp0,bus=cxl.0,chassis=0,slot=0 \
> > -object memory-backend-file,id=cxl-mem0,mem-path=/tmp/mem0,size=1G \
> > -object memory-backend-file,id=cxl-lsa0,mem-path=/tmp/lsa0,size=1G \
> > -device cxl-type3,bus=rp0,persistent-memdev=cxl-mem0,lsa=cxl-lsa0,id=cxl-pmem0 \
> > -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=1G
> > 
> > 
> > [   62.167518] BUG: kernel NULL pointer dereference, address: 00000000000004c0
> > [   62.185069] #PF: supervisor read access in kernel mode
> > [   62.198502] #PF: error_code(0x0000) - not-present page
> > [   62.211019] PGD 0 P4D 0
> > [   62.220521] Oops: 0000 [#1] PREEMPT SMP PTI
> > [   62.233457] CPU: 3 PID: 558 Comm: systemd-udevd Not tainted 6.2.0-rc1+ #1
> > [   62.252886] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
> > [   62.258432] Adding 2939900k swap on /dev/zram0.  Priority:100 extents:1 across:2939900k SSDscFS
> > [   62.285513] RIP: 0010:cxl_nvdimm_probe+0x8d/0x130 [cxl_pmem]
> > [   62.285529] Code: 85 c0 0f 85 90 00 00 00 f0 80 0c 24 40 f0 80 4c 24 08 10 f0 80 4c 24 08 20 f0 80 4c 24 08 40 49 8d 84 24 b8 04 00 00 4c 89 0
> > [   62.285531] RSP: 0018:ffffacff0141fc38 EFLAGS: 00010202
> > [   62.285534] RAX: ffff97a8a37b84b8 RBX: ffff97a8a37b8000 RCX: 0000000000000000
> > [   62.285536] RDX: 0000000000000001 RSI: ffff97a8a37b8000 RDI: 00000000ffffffff
> > [   62.285537] RBP: ffff97a8a37b8000 R08: 0000000000000001 R09: 0000000000000001
> > [   62.285538] R10: 0000000000000001 R11: 0000000000000000 R12: ffff97a8a37b8000
> > [   62.285539] R13: ffff97a982c3dc28 R14: 0000000000000000 R15: 0000000000000000
> > [   62.285541] FS:  00007f2619829580(0000) GS:ffff97a9bca00000(0000) knlGS:0000000000000000
> > [   62.285542] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   62.285544] CR2: 00000000000004c0 CR3: 00000001056a8000 CR4: 00000000000006e0
> > [   62.285653] Call Trace:
> > [   62.285656]  <TASK>
> > [   62.285660]  cxl_bus_probe+0x17/0x50
> > [   62.285691]  really_probe+0xde/0x380
> > [   62.285695]  ? pm_runtime_barrier+0x50/0x90
> > [   62.285700]  __driver_probe_device+0x78/0x170
> > [   62.285846]  driver_probe_device+0x1f/0x90
> > [   62.285850]  __driver_attach+0xd2/0x1c0
> > [   62.285853]  ? __pfx___driver_attach+0x10/0x10
> > [   62.285856]  bus_for_each_dev+0x76/0xa0
> > [   62.285860]  bus_add_driver+0x1b1/0x200
> > [   62.285863]  driver_register+0x89/0xe0
> > [   62.285868]  ? __pfx_init_module+0x10/0x10 [cxl_pmem]
> > [   62.285874]  cxl_pmem_init+0x50/0xff0 [cxl_pmem]
> > [   62.285880]  do_one_initcall+0x6e/0x330
> > [   62.285888]  do_init_module+0x4a/0x200
> > [   62.285892]  __do_sys_finit_module+0x93/0xf0
> > [   62.285899]  do_syscall_64+0x5b/0x80
> > [   62.285904]  ? do_syscall_64+0x67/0x80
> > [   62.285906]  ? asm_exc_page_fault+0x22/0x30
> > [   62.285910]  ? lockdep_hardirqs_on+0x7d/0x100
> > [   62.285914]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > [   62.285917] RIP: 0033:0x7f2619b0afbd
> > [   62.285920] Code: 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 8
> > [   62.285922] RSP: 002b:00007ffcc516bf58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> > [   62.285924] RAX: ffffffffffffffda RBX: 00005557c0dcaa60 RCX: 00007f2619b0afbd
> > [   62.285925] RDX: 0000000000000000 RSI: 00007f261a18743c RDI: 0000000000000006
> > [   62.285926] RBP: 00007f261a18743c R08: 0000000000000000 R09: 00007f261a17bb52
> > [   62.285927] R10: 0000000000000006 R11: 0000000000000246 R12: 0000000000020000
> > [   62.285928] R13: 00005557c0dbbce0 R14: 0000000000000000 R15: 00005557c0dc18a0
> > [   62.285932]  </TASK>
> > [   62.285933] Modules linked in: cxl_pmem(+) snd_pcm libnvdimm snd_timer snd joydev bochs cxl_mem drm_vram_helper parport_pc soundcore drm_ttm_g
> > [   62.285954] CR2: 00000000000004c0
> > [   62.288385] ---[ end trace 0000000000000000 ]---
> > [   63.203514] RIP: 0010:cxl_nvdimm_probe+0x8d/0x130 [cxl_pmem]
> > [   63.203562] Code: 85 c0 0f 85 90 00 00 00 f0 80 0c 24 40 f0 80 4c 24 08 10 f0 80 4c 24 08 20 f0 80 4c 24 08 40 49 8d 84 24 b8 04 00 00 4c 89 0
> > [   63.203565] RSP: 0018:ffffacff0141fc38 EFLAGS: 00010202
> > [   63.203570] RAX: ffff97a8a37b84b8 RBX: ffff97a8a37b8000 RCX: 0000000000000000
> > [   63.203572] RDX: 0000000000000001 RSI: ffff97a8a37b8000 RDI: 00000000ffffffff
> > [   63.203574] RBP: ffff97a8a37b8000 R08: 0000000000000001 R09: 0000000000000001
> > [   63.203576] R10: 0000000000000001 R11: 0000000000000000 R12: ffff97a8a37b8000
> > [   63.203577] R13: ffff97a982c3dc28 R14: 0000000000000000 R15: 0000000000000000
> > [   63.203580] FS:  00007f2619829580(0000) GS:ffff97a9bca00000(0000) knlGS:0000000000000000
> > [   63.203583] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   63.203585] CR2: 00000000000004c0 CR3: 00000001056a8000 CR4: 00000000000006e0

Possibly replicated.  What I did was stop cxl_pmem.ko from being loaded
automatically and then load it manually later. The resulting trace is
certainly similar to yours.

The MODULE_SOFTDEP() in drivers/cxl/acpi.c should stop that from happening,
assuming you are letting autoloading run.
I wonder if there is a path in which it doesn't?
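
For anyone wanting to check the soft-dependency side of this on their own
guest, here is a minimal shell sketch. It assumes the dependency shows up as
a "pre:" entry in the softdep field that modinfo reports for cxl_acpi. The
helper name softdep_has_pre is made up for illustration and is not part of
kmod or the kernel tree; only "modinfo -F" is a real interface here.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# softdep_has_pre TEXT MODULE
#   TEXT:   softdep text, e.g. the output of "modinfo -F softdep cxl_acpi"
#   MODULE: module name to look for in a "pre:" list
# Returns 0 if MODULE is listed as a pre-dependency, 1 otherwise.
softdep_has_pre() {
    in_pre=0
    # Intentionally unquoted: split the softdep text into words.
    for word in $1; do
        case "$word" in
            pre:)  in_pre=1 ;;
            post:) in_pre=0 ;;
            *)     [ "$in_pre" -eq 1 ] && [ "$word" = "$2" ] && return 0 ;;
        esac
    done
    return 1
}

# Example use inside the guest (needs the cxl modules installed):
#   softdep_has_pre "$(modinfo -F softdep cxl_acpi)" cxl_pmem \
#       && echo "cxl_acpi declares a pre: softdep on cxl_pmem"
```

This only confirms that the dependency is declared in the module. It says
nothing about whether the loader actually honoured it on a given boot, which
is the open question above.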

Dan, any thoughts?

There is another race that I can trigger by repeatedly injecting errors and
causing resets, but the trace for that is very different and
points at cxl_pmem_ctl() called via nvdimm_probe(). I was going to try
and pin that one down a little more before posting a report but might
as well muddy the waters :)

 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000358
 Mem abort info:
 ESR = 0x0000000096000004
 EC = 0x25: DABT (current EL), IL = 32 bits
 SET = 0, FnV = 0
 EA = 0, S1PTW = 0
 FSC = 0x04: level 0 translation fault
 Data abort info:
 ISV = 0, ISS = 0x00000004
 CM = 0, WnR = 0
 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000102e12000
 [0000000000000358] pgd=0000000000000000, p4d=0000000000000000
 Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
 Modules linked in: cxl_mem cxl_port cxl_acpi cxl_pmem cxl_pci cxl_core
 CPU: 0 PID: 236 Comm: kworker/u8:3 Not tainted 6.2.0-rc3+ #598
 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
 Workqueue: events_unbound async_run_entry_fn
 pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
 pc : cxl_pmem_ctl+0x74/0x244 [cxl_pmem]
 lr : cxl_pmem_ctl+0x60/0x244 [cxl_pmem]
 sp : ffff8000089239d0
 x29: ffff8000089239d0 x28: 0000000000000000 x27: 0000000000000000
 x26: ffffcd4f6b263000 x25: ffffcd4f6a25d9c8 x24: 0000000000000000
 kernel: Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
 x23: 0000000000000000 x22: 000000000000000c x21: ffff0000c5b95400
 x20: ffff0000c0d9ce0c x19: 0000000000000004 x18: 0000000000000000
 x17: 0000000000000000 x16: ffffcd4f698a5fa0 x15: 0000000000000000
 x14: 0000000000000002 x13: 0000000000000000 x12: 0000000000000000
 x11: 0000000000000001 x10: 409e5dd45a38ef72 x9 : ffffcd4f607531f0
 x8 : ffff0000c0d9ce80 x7 : 0000000000000000 x6 : 0000000000000000
 x5 : ffff800008923a84 x4 : 000000000000000c x3 : ffff800008923a10
 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000070
 Call trace:
  cxl_pmem_ctl+0x74/0x244 [cxl_pmem]
  nvdimm_init_nsarea+0xb8/0xdc
  nvdimm_probe+0xc0/0x1d0
  nvdimm_bus_probe+0x90/0x200
  really_probe+0xc8/0x3e0
  __driver_probe_device+0x84/0x190
  driver_probe_device+0x44/0x120
  __device_attach_driver+0xc4/0x160
  bus_for_each_drv+0x80/0xe0
  __device_attach+0xa4/0x1cc
  device_initial_probe+0x1c/0x2c
  bus_probe_device+0xa4/0xb0
  device_add+0x404/0x920
  nd_async_device_register+0x20/0x70
  async_run_entry_fn+0x3c/0x154
  process_one_work+0x200/0x474
  worker_thread+0x74/0x43c
  kthread+0x110/0x114
  ret_from_fork+0x10/0x20
  Code: 53067e61 f90023e0 f9417aa2 f8617860 (f941ac57)
  ---[ end trace 0000000000000000 ]---

Note this seems to have gotten harder to hit for some reason - took
about 50 resets.
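
When comparing saved logs from the two failures, a quick way to see which
function each oops faulted in is to pull the symbol out of the x86 "RIP:"
line or the arm64 "pc :" line. The following is a rough sketch that only
handles the two line formats seen in this thread; the function name
oops_fault_symbol is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Reads a kernel log on stdin and prints the first faulting kernel symbol,
# taken from either an x86 "RIP: 0010:sym+0x..." line or an arm64
# "pc : sym+0x..." line. Prints nothing if neither form is found.
oops_fault_symbol() {
    sed -n -E \
        -e 's/.*RIP: 0010:([A-Za-z0-9_.]+)\+0x.*/\1/p' \
        -e 's/.*pc : ([A-Za-z0-9_.]+)\+0x.*/\1/p' | head -n 1
}

# Example use on a saved log (the file name is a placeholder):
#   oops_fault_symbol < dmesg-pmem.txt
```

Run against the traces in this mail, that would separate the
cxl_nvdimm_probe() failure from the cxl_pmem_ctl() one without reading the
full register dumps.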

I'll keep digging

Jonathan


> > 
> > 
> > 
> > Next I reverted the QEMU branch to the commit just before the type-3
> > volatile commit and used the old method of launching with a type-3 pmem
> > device
> > 
> > Config:
> > sudo /opt/qemu-cxl/bin/qemu-system-x86_64 \
> > -drive file=/var/lib/libvirt/images/cxl.qcow2,format=qcow2,index=0,media=disk,id=hd \
> > -m 2G,slots=4,maxmem=4G \
> > -smp 4 \
> > -machine type=q35,accel=kvm,cxl=on \
> > -enable-kvm \
> > -nographic \
> > -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \
> > -device cxl-rp,id=rp0,bus=cxl.0,chassis=0,slot=0 \
> > -object memory-backend-file,pmem=true,id=cxl-mem0,mem-path=/tmp/cxl-mem0,size=1G \
> > -object memory-backend-file,pmem=true,id=lsa0,mem-path=/tmp/cxl-lsa0,size=1G \
> > -device cxl-type3,bus=rp0,memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 \
> > -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=1G
> > 
> > 
> > Result: Similar stack trace
> > [   29.850023] BUG: kernel NULL pointer dereference, address: 00000000000004c0
> > [   29.882400] RIP: 0010:cxl_nvdimm_probe+0x8d/0x130 [cxl_pmem]
> > [   29.957485] Call Trace:
> > [   29.959067]  <TASK>
> > [   29.962176]  cxl_bus_probe+0x17/0x50
> > [   29.964940]  really_probe+0xde/0x380
> > [   29.969065]  ? pm_runtime_barrier+0x50/0x90
> > [   29.973419]  __driver_probe_device+0x78/0x170
> > [   29.977183]  driver_probe_device+0x1f/0x90
> > [   29.984212]  __driver_attach+0xd2/0x1c0
> > [   29.988463]  ? __pfx___driver_attach+0x10/0x10
> > [   29.992379]  bus_for_each_dev+0x76/0xa0
> > [   29.997040]  bus_add_driver+0x1b1/0x200
> > [   30.000368]  driver_register+0x89/0xe0
> > [   30.004579]  ? __pfx_init_module+0x10/0x10 [cxl_pmem]
> > [   30.012403]  cxl_pmem_init+0x50/0xff0 [cxl_pmem]
> > [   30.019394]  do_one_initcall+0x6e/0x330
> > [   30.024028]  do_init_module+0x4a/0x200
> > [   30.029243]  __do_sys_finit_module+0x93/0xf0
> > [   30.034943]  do_syscall_64+0x5b/0x80
> > [   30.039844]  ? do_syscall_64+0x67/0x80
> > [   30.045163]  ? do_syscall_64+0x67/0x80
> > [   30.049729]  ? lock_release+0x14b/0x440
> > [   30.054055]  ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
> > [   30.061039]  ? lock_is_held_type+0xe8/0x140
> > [   30.067625]  ? do_syscall_64+0x67/0x80
> > [   30.071909]  ? lockdep_hardirqs_on+0x7d/0x100
> > [   30.079037]  ? do_syscall_64+0x67/0x80
> > [   30.084537]  ? do_syscall_64+0x67/0x80
> > [   30.089091]  ? do_syscall_64+0x67/0x80
> > [   30.094174]  ? lockdep_hardirqs_on+0x7d/0x100
> > [   30.099224]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > [   30.104446] RIP: 0033:0x7f000550afbd
