* CXL -next issue on arm64
@ 2025-03-12 3:56 Itaru Kitayama
2025-03-12 15:14 ` Dave Jiang
0 siblings, 1 reply; 3+ messages in thread
From: Itaru Kitayama @ 2025-03-12 3:56 UTC
To: Alison Schofield; +Cc: linux-cxl
Hi Alison,
I rebased onto the latest CXL kernel -next this morning, and `modprobe cxl_test` triggers a NULL pointer dereference (see below). I am building a kernel with ACPI_HMAT set to “y”, but the FW doesn’t provide the table on my QEMU virt machine.
Thanks,
Itaru.
[ 128.095189][ T552] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 128.095629][ T552] Mem abort info:
[ 128.095703][ T552] ESR = 0x0000000096000044
[ 128.095789][ T552] EC = 0x25: DABT (current EL), IL = 32 bits
[ 128.096320][ T552] SET = 0, FnV = 0
[ 128.096655][ T552] EA = 0, S1PTW = 0
[ 128.096733][ T552] FSC = 0x04: level 0 translation fault
[ 128.096862][ T552] Data abort info:
[ 128.096939][ T552] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 128.097042][ T552] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 128.097149][ T552] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 128.097325][ T552] user pgtable: 4k pages, 52-bit VAs, pgdp=0000000103981600
[ 128.098312][ T552] [0000000000000000] pgd=080000010f5a6403, p4d=0000000000000000
[ 134.299341][ T552] Internal error: Oops: 0000000096000044 [#3] PREEMPT SMP
[ 134.299844][ T552] Modules linked in: cxl_mock_mem(O) cxl_test(O) cxl_mem(O) cxl_pmem(O) cxl_acpi(O) cxl_port(O) cxl_mock(O) libnvdimm cxl_core(O) sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 button processor cfg80211 rfkill fuse drm backlight ip_tables x_tables ipv6
[ 134.302032][ T557] cxl_mock_mem cxl_rcd.10: CXL MCE unsupported
[ 134.302604][ T552] CPU: 1 UID: 0 PID: 552 Comm: kworker/u8:5 Tainted: G D O 6.14.0-rc1-00050-gb1eb9579d26a-dirty #103 09186677403f60ca5f8511de95b4969341ca485e
[ 134.303300][ T552] Tainted: [D]=DIE, [O]=OOT_MODULE
[ 134.304067][ T552] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[ 134.304209][ T552] Workqueue: async async_run_entry_fn
[ 134.304519][ T552] pstate: 61402005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 134.304856][ T552] pc : cxl_mock_mbox_send+0x514/0x11dc [cxl_mock_mem]
[ 134.305024][ T552] lr : cxl_internal_send_cmd+0x40/0x118 [cxl_core]
[ 134.305566][ T552] sp : ffff800082c5b9e0
[ 134.305671][ T552] x29: ffff800082c5b9e0 x28: fffeffb34bacd010 x27: fffeffb3434af390
[ 134.306215][ T552] x26: ffff800082c5bb57 x25: 0000000000000100 x24: 0000000000000001
[ 134.309335][ T552] x23: 0000000000000020 x22: fffeffb34bacd010 x21: fffeffb347f5e080
[ 134.311287][ T552] x20: fffeffb3434af080 x19: ffff800082c5bb58 x18: 00000000ffffffff
[ 134.312134][ T552] x17: 0000000000000000 x16: ffffa563f133a508 x15: fffeffb347e3ea1c
[ 134.313208][ T552] x14: ffffa563f31f4220 x13: 0000000000000040 x12: 0000000000000228
[ 134.313912][ T552] x11: 0000000000000000 x10: ffff5a4f60c1ec20 x9 : 0000000000000028
[ 134.314855][ T552] x8 : ffff800082c5bb98 x7 : 0000000000000003 x6 : 0000000000000003
[ 134.315033][ T552] x5 : fffeffb3437f3540 x4 : 0000000000000001 x3 : 0000000000001000
[ 134.318113][ T552] x2 : 0000000000000070 x1 : 0000000000000000 x0 : 0000000000000088
[ 134.318600][ T552] Call trace:
[ 134.318680][ T552] cxl_mock_mbox_send+0x514/0x11dc [cxl_mock_mem 0d13b81331ab9470a26e7387d930d28978595994] (P)
[ 134.319019][ T552] cxl_internal_send_cmd+0x40/0x118 [cxl_core 98e80007eca5dee8da38639ce9041c6e7bffd043]
[ 134.322029][ T552] cxl_mem_get_records_log+0xb8/0x184 [cxl_core 98e80007eca5dee8da38639ce9041c6e7bffd043]
[ 134.322795][ T552] cxl_mem_get_event_records+0xb0/0xb8 [cxl_core 98e80007eca5dee8da38639ce9041c6e7bffd043]
[ 134.323051][ T552] cxl_mock_mem_probe+0x41c/0x46c [cxl_mock_mem 0d13b81331ab9470a26e7387d930d28978595994]
[ 134.323233][ T552] platform_probe+0x68/0xdc
[ 134.323473][ T552] really_probe+0xc0/0x388
[ 134.323967][ T552] __driver_probe_device+0x7c/0x15c
[ 134.324085][ T552] driver_probe_device+0x40/0x114
[ 134.324599][ T552] __driver_attach_async_helper+0x50/0xec
[ 134.325234][ T552] async_run_entry_fn+0x34/0x14c
[ 134.326457][ T552] process_one_work+0x150/0x294
[ 134.326631][ T552] worker_thread+0x2dc/0x3dc
[ 134.326729][ T552] kthread+0x130/0x204
[ 134.326947][ T552] ret_from_fork+0x10/0x20
[ 134.327113][ T552] Code: 540010a8 f9400a61 52801100 d2800e02 (a9007c3f)
[ 134.327240][ T552] ---[ end trace 0000000000000000 ]---
[ 134.619427][ T557] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 134.633336][ T557] Mem abort info:
[ 134.644018][ T557] ESR = 0x0000000096000044
[ 134.668786][ T557] EC = 0x25: DABT (current EL), IL = 32 bits
[ 134.672231][ T557] SET = 0, FnV = 0
[ 134.701787][ T557] EA = 0, S1PTW = 0
[ 134.705094][ T557] FSC = 0x04: level 0 translation fault
[ 134.705227][ T557] Data abort info:
[ 134.705299][ T557] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 134.705562][ T557] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 134.723727][ T557] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 134.786343][ T557] user pgtable: 4k pages, 52-bit VAs, pgdp=0000000101734880
[ 134.791824][ T557] [0000000000000000] pgd=08000001019e9403, p4d=0000000000000000
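For readers following the log, the sketch below shows how the fields printed under "Mem abort info" and "Data abort info" fall out of the raw ESR value, and what the faulting instruction word in the "Code:" line encodes. The two constants are copied from the first oops above; the function names are made up for this illustration, and the bit positions are taken from the Arm architecture's ESR_ELx and STP (64-bit, signed offset) encodings as I understand them, so treat it as a reading aid rather than a diagnosis.

```python
ESR = 0x96000044   # from the "ESR =" line of the first oops
INSN = 0xA9007C3F  # the bracketed word in the "Code:" line


def decode_esr(esr: int) -> dict:
    """Split an ESR_ELx value into the fields the oops prints."""
    iss = esr & 0x1FFFFFF              # instruction specific syndrome, bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 means a 32-bit instruction
        "ISS": iss,
        "WnR": (iss >> 6) & 0x1,       # 1 means the access was a write
        "FSC": iss & 0x3F,             # fault status code
    }


def decode_stp64(insn: int) -> str:
    """Decode STP (64-bit, signed offset); reject any other encoding."""
    if (insn >> 22) != 0b1010100100:
        raise ValueError("not a 64-bit signed-offset STP")
    imm7 = (insn >> 15) & 0x7F
    if imm7 & 0x40:                    # sign-extend the 7-bit immediate
        imm7 -= 0x80
    rt2 = (insn >> 10) & 0x1F
    rn = (insn >> 5) & 0x1F
    rt = insn & 0x1F

    def data_reg(n: int) -> str:       # register 31 is XZR for Rt/Rt2
        return "xzr" if n == 31 else f"x{n}"

    base = "sp" if rn == 31 else f"x{rn}"  # register 31 is SP for Rn
    offset = imm7 * 8                  # immediate is scaled by 8 bytes
    addr = f"[{base}]" if offset == 0 else f"[{base}, #{offset}]"
    return f"stp {data_reg(rt)}, {data_reg(rt2)}, {addr}"


if __name__ == "__main__":
    for name, value in decode_esr(ESR).items():
        print(f"{name} = {value:#x}")
    print(decode_stp64(INSN))
```

Read this way, the oops describes a store of two zero registers through x1, and the register dump shows x1 as 0, which matches both WnR = 1 and the faulting address 0000000000000000. Which source line that store belongs to still needs the symbol lookup requested in the reply.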
* Re: CXL -next issue on arm64
From: Dave Jiang @ 2025-03-12 15:14 UTC
To: Itaru Kitayama, Alison Schofield; +Cc: linux-cxl
On 3/11/25 8:56 PM, Itaru Kitayama wrote:
> Hi Alison,
> I rebased onto the latest CXL kernel -next this morning and `modprobe cxl_test` triggers a NULL pointer dereference see below. I am building a kernel with ACPI_HMAT set to “y” but the FW doesn’t provide the table on my QEMU virt machine.
>
> Thanks,
> Itaru.
>
>
> [ 128.095189][ T552] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> [ 128.095629][ T552] Mem abort info:
> [ 128.095703][ T552] ESR = 0x0000000096000044
> [ 128.095789][ T552] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 128.096320][ T552] SET = 0, FnV = 0
> [ 128.096655][ T552] EA = 0, S1PTW = 0
> [ 128.096733][ T552] FSC = 0x04: level 0 translation fault
> [ 128.096862][ T552] Data abort info:
> [ 128.096939][ T552] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
> [ 128.097042][ T552] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> [ 128.097149][ T552] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 128.097325][ T552] user pgtable: 4k pages, 52-bit VAs, pgdp=0000000103981600
> [ 128.098312][ T552] [0000000000000000] pgd=080000010f5a6403, p4d=0000000000000000
> [ 134.299341][ T552] Internal error: Oops: 0000000096000044 [#3] PREEMPT SMP
> [ 134.299844][ T552] Modules linked in: cxl_mock_mem(O) cxl_test(O) cxl_mem(O) cxl_pmem(O) cxl_acpi(O) cxl_port(O) cxl_mock(O) libnvdimm cxl_core(O) sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 button processor cfg80211 rfkill fuse drm backlight ip_tables x_tables ipv6
> [ 134.302032][ T557] cxl_mock_mem cxl_rcd.10: CXL MCE unsupported
> [ 134.302604][ T552] CPU: 1 UID: 0 PID: 552 Comm: kworker/u8:5 Tainted: G D O 6.14.0-rc1-00050-gb1eb9579d26a-dirty #103 09186677403f60ca5f8511de95b4969341ca485e
> [ 134.303300][ T552] Tainted: [D]=DIE, [O]=OOT_MODULE
> [ 134.304067][ T552] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> [ 134.304209][ T552] Workqueue: async async_run_entry_fn
> [ 134.304519][ T552] pstate: 61402005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> [ 134.304856][ T552] pc : cxl_mock_mbox_send+0x514/0x11dc [cxl_mock_mem]
Can you run scripts/faddr2line on this and see which line of code triggered it? Thanks!
Also, a bisect against 6.14-rc1 would be great, if you can pinpoint which new commit is causing it.
DJ
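As a rough sketch of the two requests above: the snippet below turns the "pc :" line of the oops into a `scripts/faddr2line` invocation and lists the usual `git bisect` opening moves against v6.14-rc1. It only builds the command lines and does not run them. `MODULE_OBJECT` is an assumed location for the built cxl_mock_mem module and will differ per build tree; `faddr2line_argv` and `BISECT_STEPS` are names invented here.

```python
import re

# Copied from the oops; faddr2line accepts the same func+offset/size form.
PC_LINE = "pc : cxl_mock_mbox_send+0x514/0x11dc [cxl_mock_mem]"

# Assumption: path to the module object built with debug info.
MODULE_OBJECT = "tools/testing/cxl/test/cxl_mock_mem.ko"

_PC_RE = re.compile(
    r"pc : (?P<sym>\w+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<mod>\w+)\])?"
)


def faddr2line_argv(pc_line: str, obj: str) -> list:
    """Build the argv for scripts/faddr2line from an oops 'pc :' line."""
    m = _PC_RE.search(pc_line)
    if m is None:
        raise ValueError("no 'pc : func+off/size' found in line")
    target = f"{m['sym']}+{m['off']}/{m['size']}"
    return ["./scripts/faddr2line", obj, target]


# Opening steps of a bisect: current tree is bad, v6.14-rc1 is good.
BISECT_STEPS = [
    ["git", "bisect", "start"],
    ["git", "bisect", "bad"],
    ["git", "bisect", "good", "v6.14-rc1"],
]

if __name__ == "__main__":
    print(" ".join(faddr2line_argv(PC_LINE, MODULE_OBJECT)))
    for step in BISECT_STEPS:
        print(" ".join(step))
```

After the opening steps, each bisect round would mean rebuilding, booting, running `modprobe cxl_test`, and marking the result with `git bisect good` or `git bisect bad` until a single commit remains.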
* Re: CXL -next issue on arm64
From: Itaru Kitayama @ 2025-03-12 23:04 UTC
To: Dave Jiang; +Cc: Alison Schofield, linux-cxl
Dave
> On Mar 13, 2025, at 0:14, Dave Jiang <dave.jiang@intel.com> wrote:
>
>
>
>> On 3/11/25 8:56 PM, Itaru Kitayama wrote:
>> Hi Alison,
>> I rebased onto the latest CXL kernel -next this morning and `modprobe cxl_test` triggers a NULL pointer dereference see below. I am building a kernel with ACPI_HMAT set to “y” but the FW doesn’t provide the table on my QEMU virt machine.
>>
>> Thanks,
>> Itaru.
>>
>>
>> [ 128.095189][ T552] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>> [ 128.095629][ T552] Mem abort info:
>> [ 128.095703][ T552] ESR = 0x0000000096000044
>> [ 128.095789][ T552] EC = 0x25: DABT (current EL), IL = 32 bits
>> [ 128.096320][ T552] SET = 0, FnV = 0
>> [ 128.096655][ T552] EA = 0, S1PTW = 0
>> [ 128.096733][ T552] FSC = 0x04: level 0 translation fault
>> [ 128.096862][ T552] Data abort info:
>> [ 128.096939][ T552] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
>> [ 128.097042][ T552] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
>> [ 128.097149][ T552] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>> [ 128.097325][ T552] user pgtable: 4k pages, 52-bit VAs, pgdp=0000000103981600
>> [ 128.098312][ T552] [0000000000000000] pgd=080000010f5a6403, p4d=0000000000000000
>> [ 134.299341][ T552] Internal error: Oops: 0000000096000044 [#3] PREEMPT SMP
>> [ 134.299844][ T552] Modules linked in: cxl_mock_mem(O) cxl_test(O) cxl_mem(O) cxl_pmem(O) cxl_acpi(O) cxl_port(O) cxl_mock(O) libnvdimm cxl_core(O) sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 button processor cfg80211 rfkill fuse drm backlight ip_tables x_tables ipv6
>> [ 134.302032][ T557] cxl_mock_mem cxl_rcd.10: CXL MCE unsupported
>> [ 134.302604][ T552] CPU: 1 UID: 0 PID: 552 Comm: kworker/u8:5 Tainted: G D O 6.14.0-rc1-00050-gb1eb9579d26a-dirty #103 09186677403f60ca5f8511de95b4969341ca485e
>> [ 134.303300][ T552] Tainted: [D]=DIE, [O]=OOT_MODULE
>> [ 134.304067][ T552] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>> [ 134.304209][ T552] Workqueue: async async_run_entry_fn
>> [ 134.304519][ T552] pstate: 61402005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
>> [ 134.304856][ T552] pc : cxl_mock_mbox_send+0x514/0x11dc [cxl_mock_mem]
>
> Can you run scripts/faddr2line on this and see which line of code triggered it? Thanks!
>
> Also a bisect vs 6.14-rc1 would be great if you can pin point which new commit is causing it.
>
> DJ
Sorry, I can’t reproduce it now. Next time I see it, I’ll do what you suggested with a stack trace.
The latest CXL -next at least boots, and I can run `sudo modprobe cxl_test` without warnings directly related to the cxl test suite on arm64.
I’m using Jonathan’s QEMU for now.
Itaru.