* arm64: kernel panic on 4G RAM platform
From: Alexander Wilhelm @ 2024-05-23 8:16 UTC
To: Catalin Marinas; +Cc: linux-arm-kernel
Hello ARM64 developers,
I have a kernel panic on my ARM64 board, but I'm not sure whether the problem
is in the kernel or somewhere else. Maybe someone can help me.
My problem is the following: I'm using the NXP TQ board with ARM64 architecture
to run the OpenWrt operating system with Linux kernel v5.15. The current
revision of the board (TQMLS1046A-CB.0203) now has 4GiB of RAM instead of 2GiB.
Therefore I adapted U-Boot to use the entire memory. But now the kernel
crashes. Interestingly, the problem doesn't occur if I only use 2GiB.
The memory is split into two different banks.
While analyzing the problem I tried to narrow down its source. But with each
new print message that should help me trace the problem, I get a different
error. The error seems to occur unpredictably, as if it were caused by a race
condition or a bad memory access. Then I tried RAM sizes in between, such as
3GiB. I could boot successfully, but then I got errors from "swiotlb" when my
wireless driver tried to allocate memory from my CMA pool. I understand that
the error description is very vague, but I am doing my best to explain the
problem. Please refer to my log of the current kernel panic state:
Starting kernel ...
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd082]
[ 0.000000] Linux version 5.15.158 (########@##############) (aarch64-openwrt-linux-gnu-gcc (OpenWrt GCC 12.3.0 r23630-842932a63d) 12.3.0, GNU ld (GNU Binutils) 2.37) #0 SMP Wed May 22 13:15:09 2024
[ 0.000000] Machine model: #########
[ 0.000000] earlycon: uart8250 at MMIO 0x00000000021c0500 (options '')
[ 0.000000] printk: bootconsole [uart8250] enabled
[ 0.000000] Reserved memory: created DMA memory pool at 0x00000008ff800000, size 8 MiB
[ 0.000000] OF: reserved mem: initialized node qman-fqd, compatible id shared-dma-pool
[ 0.000000] Reserved memory: created DMA memory pool at 0x00000008fc000000, size 32 MiB
[ 0.000000] OF: reserved mem: initialized node qman-pfdr, compatible id shared-dma-pool
[ 0.000000] Reserved memory: created DMA memory pool at 0x00000008fe000000, size 16 MiB
[ 0.000000] OF: reserved mem: initialized node bman-fbpr, compatible id shared-dma-pool
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000080000000-0x00000000ffffffff]
[ 0.000000] DMA32 empty
[ 0.000000] Normal [mem 0x0000000100000000-0x00000008ffffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000080000000-0x00000000fbdfffff]
[ 0.000000] node 0: [mem 0x0000000880000000-0x00000008fbffffff]
[ 0.000000] node 0: [mem 0x00000008fc000000-0x00000008feffffff]
[ 0.000000] node 0: [mem 0x00000008ff000000-0x00000008ff7fffff]
[ 0.000000] node 0: [mem 0x00000008ff800000-0x00000008ffffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x00000008ffffffff]
[ 0.000000] On node 0, zone Normal: 16896 pages in unavailable ranges
[ 0.000000] cma: Reserved 192 MiB at 0x00000000ee800000
[ 0.000000] Failed to find device node for boot cpu
[ 0.000000] /cpus/cpu@0: missing reg property
[ 0.000000] /cpus/cpu@1: missing reg property
[ 0.000000] /cpus/cpu@2: missing reg property
[ 0.000000] /cpus/cpu@3: missing reg property
[ 0.000000] Number of cores (5) exceeds configured maximum of 2 - clipping
[ 0.000000] missing boot CPU MPIDR, not enabling secondaries
[ 0.000000] percpu: Embedded 19 pages/cpu s37976 r8192 d31656 u77824
[ 0.000000] Detected PIPT I-cache on CPU0
[ 0.000000] CPU features: detected: Spectre-v2
[ 0.000000] CPU features: detected: Spectre-v4
[ 0.000000] CPU features: detected: Spectre-BHB
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 1017344
[ 0.000000] Kernel command line: #######################
[ 0.000000] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] software IO TLB: mapped [mem 0x00000000ea800000-0x00000000ee800000] (64MB)
[ 0.000000] Memory: 3698904K/4126720K available (11776K kernel code, 968K rwdata, 3280K rodata, 576K init, 367K bss, 231208K reserved, 196608K cma-reserved)
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] rcu: RCU event tracing is enabled.
[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=2 to nr_cpu_ids=1.
[ 0.000000] Trampoline variant of Tasks RCU enabled.
[ 0.000000] Rude variant of Tasks RCU enabled.
[ 0.000000] Tracing variant of Tasks RCU enabled.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] timer_probe: no matching timers found
[ 0.000000] Kernel panic - not syncing: Unable to initialise architected timer.
[ 0.000000] ---[ end Kernel panic - not syncing: Unable to initialise architected timer. ]---
Best regards
Alexander Wilhelm
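As a sanity check on the log above, the "Early memory node ranges" can be summed and compared with the total in the "Memory:" line. The following is a minimal sketch, not part of the original report: the names `NODE_RANGES`, `range_size`, `total_kib` and `split_by_boundary` are made up for illustration, and the ranges are copied by hand from the log.

```python
# Early memory node ranges copied from the boot log (inclusive bounds).
NODE_RANGES = [
    (0x0000000080000000, 0x00000000FBDFFFFF),
    (0x0000000880000000, 0x00000008FBFFFFFF),
    (0x00000008FC000000, 0x00000008FEFFFFFF),
    (0x00000008FF000000, 0x00000008FF7FFFFF),
    (0x00000008FF800000, 0x00000008FFFFFFFF),
]

KIB = 1024
MIB = 1024 * KIB


def range_size(start, end):
    """Size in bytes of an inclusive [start, end] physical range."""
    return end - start + 1


def total_kib(ranges):
    """Total size of all ranges in KiB."""
    return sum(range_size(s, e) for s, e in ranges) // KIB


def split_by_boundary(ranges, boundary=1 << 32):
    """Return (bytes below boundary, bytes at or above boundary)."""
    low = high = 0
    for start, end in ranges:
        if end < boundary:
            low += range_size(start, end)
        elif start >= boundary:
            high += range_size(start, end)
        else:
            # Range straddles the boundary: split it in two.
            low += boundary - start
            high += end + 1 - boundary
    return low, high


if __name__ == "__main__":
    print(total_kib(NODE_RANGES), "KiB in total")
    low, high = split_by_boundary(NODE_RANGES)
    print(low // MIB, "MiB below 4 GiB,", high // MIB, "MiB at or above 4 GiB")
```

The total comes out as 4126720 KiB, which agrees with the "3698904K/4126720K" figure in the log, with 1982 MiB in the bank below 4 GiB and 2048 MiB in the bank starting at 0x880000000. So the memory layout the kernel sees is at least self-consistent.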
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: arm64: kernel panic on 4G RAM platform
From: Will Deacon @ 2024-05-29 15:47 UTC
To: Alexander Wilhelm; +Cc: Catalin Marinas, linux-arm-kernel
On Thu, May 23, 2024 at 10:16:56AM +0200, Alexander Wilhelm wrote:
> [...]
> [ 0.000000] Failed to find device node for boot cpu
> [ 0.000000] /cpus/cpu@0: missing reg property
> [ 0.000000] /cpus/cpu@1: missing reg property
> [ 0.000000] /cpus/cpu@2: missing reg property
> [ 0.000000] /cpus/cpu@3: missing reg property
> [ 0.000000] Number of cores (5) exceeds configured maximum of 2 - clipping
> [ 0.000000] missing boot CPU MPIDR, not enabling secondaries
I'd start by fixing this bit ^^^ If the secondary CPUs are spinning
somewhere in memory, maybe that gets allocated by Linux and you end up
with them executing random instructions?
Will
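To make the messages pointed at above easier to compare across boots with different RAM sizes, a small script could pull them out of a saved boot log. This is only a sketch under stated assumptions: the names `LOG`, `nodes_missing_reg` and `core_clipping` are hypothetical, and the excerpt is copied from the log quoted in this thread.

```python
import re

# Boot-log excerpt copied from the report in this thread.
LOG = """\
[ 0.000000] Failed to find device node for boot cpu
[ 0.000000] /cpus/cpu@0: missing reg property
[ 0.000000] /cpus/cpu@1: missing reg property
[ 0.000000] /cpus/cpu@2: missing reg property
[ 0.000000] /cpus/cpu@3: missing reg property
[ 0.000000] Number of cores (5) exceeds configured maximum of 2 - clipping
[ 0.000000] missing boot CPU MPIDR, not enabling secondaries
"""

MISSING_REG = re.compile(r"(\S+): missing reg property")
CORE_CLIP = re.compile(
    r"Number of cores \((\d+)\) exceeds configured maximum of (\d+)"
)


def nodes_missing_reg(log_text):
    """Return the device tree paths the kernel reported without a reg property."""
    return MISSING_REG.findall(log_text)


def core_clipping(log_text):
    """Return (cores counted, configured maximum), or None if not reported."""
    match = CORE_CLIP.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    for node in nodes_missing_reg(LOG):
        print("no reg property:", node)
    print("core clipping:", core_clipping(LOG))
```

For the excerpt above this lists /cpus/cpu@0 through /cpus/cpu@3 and reports 5 cores counted against a configured maximum of 2. The log alone does not say why the reg properties are missing, so the script only helps to track whether these messages change between configurations.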
* Re: arm64: kernel panic on 4G RAM platform
From: Alexander Wilhelm @ 2024-06-03 6:19 UTC
To: Will Deacon; +Cc: Catalin Marinas, linux-arm-kernel
On Wed, May 29, 2024 at 04:47:40PM +0100, Will Deacon wrote:
> On Thu, May 23, 2024 at 10:16:56AM +0200, Alexander Wilhelm wrote:
> > [...]
> > [ 0.000000] Failed to find device node for boot cpu
> > [ 0.000000] /cpus/cpu@0: missing reg property
> > [ 0.000000] /cpus/cpu@1: missing reg property
> > [ 0.000000] /cpus/cpu@2: missing reg property
> > [ 0.000000] /cpus/cpu@3: missing reg property
> > [ 0.000000] Number of cores (5) exceeds configured maximum of 2 - clipping
> > [ 0.000000] missing boot CPU MPIDR, not enabling secondaries
>
> I'd start by fixing this bit ^^^ If the secondary CPUs are spinning
> somewhere in memory, maybe that gets allocated by Linux and you end up
> with them executing random instructions?
>
> Will
It sounds very much like it. I already suspected that the CPUs were getting in
each other's way. Unfortunately I could not reduce the CPUs to 1 in my kernel
configuration. For some reason, 2 is the minimum number. If you could give me
some tips on how to narrow down the problem, I would appreciate it.
Best regards
Alexander Wilhelm
* Re: arm64: kernel panic on 4G RAM platform
From: Will Deacon @ 2024-06-03 13:34 UTC
To: Alexander Wilhelm, shawnguo, festevam, s.hauer
Cc: Catalin Marinas, linux-arm-kernel, imx
On Mon, Jun 03, 2024 at 08:19:55AM +0200, Alexander Wilhelm wrote:
> On Wed, May 29, 2024 at 04:47:40PM +0100, Will Deacon wrote:
> > On Thu, May 23, 2024 at 10:16:56AM +0200, Alexander Wilhelm wrote:
> > > [...]
> > > [ 0.000000] Failed to find device node for boot cpu
> > > [ 0.000000] /cpus/cpu@0: missing reg property
> > > [ 0.000000] /cpus/cpu@1: missing reg property
> > > [ 0.000000] /cpus/cpu@2: missing reg property
> > > [ 0.000000] /cpus/cpu@3: missing reg property
> > > [ 0.000000] Number of cores (5) exceeds configured maximum of 2 - clipping
> > > [ 0.000000] missing boot CPU MPIDR, not enabling secondaries
> >
> > I'd start by fixing this bit ^^^ If the secondary CPUs are spinning
> > somewhere in memory, maybe that gets allocated by Linux and you end up
> > with them executing random instructions?
> >
>
> It sounds very much like it. I already suspected that the CPUs were getting in
> each other's way. Unfortunately I could not reduce the CPUs to 1 in my kernel
> configuration. For some reason, 2 is the minimum number. If you could give me
> some tips on how to narrow down the problem, I would appreciate it.
I'm not really familiar with the FSL SoCs, so I'm adding a bunch of folks who
are, in case they have any ideas about what you could try next.
Will