* [meta-fsl-arm] Help debugging kernel lockup
@ 2015-06-17 16:39 Gary Thomas
  2015-06-17 19:32 ` Nikolay Dimitrov
  2015-06-18 18:48 ` Gary Thomas
  0 siblings, 2 replies; 4+ messages in thread
From: Gary Thomas @ 2015-06-17 16:39 UTC
  To: meta-freescale@yoctoproject.org

I know this is a bit off-topic, but I'm hoping someone might
have a clue how to help.

I have an LS1021 system that has become unstable - it will
lock up (reliably) 5 minutes after boot up, +/- a few seconds.
I enabled kernel debugging for this and got this info:

BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
Modules linked in:
irq event stamp: 542337
hardirqs last  enabled at (542336): [<80021b20>] __do_softirq+0xf4/0x29c
hardirqs last disabled at (542337): [<80012034>] __irq_svc+0x34/0x58
softirqs last  enabled at (542334): [<800216b0>] _local_bh_enable+0x14/0x18
softirqs last disabled at (542335): [<80021d94>] do_softirq+0x54/0x78

CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.19-rt30-00644-gc847f76-dirty #4
task: 808bc0a8 ti: 808b0000 task.ti: 808b0000
PC is at __do_softirq+0xf8/0x29c
LR is at trace_hardirqs_on_caller+0x170/0x23c
pc : [<80021b24>]    lr : [<8006a20c>]    psr: 200c0113
sp : 808b1e68  ip : 808b1e38  fp : 808b1eac
r10: 00200000  r9 : 410fc075  r8 : 808b0028
r7 : 808b1f54  r6 : 80021d94  r5 : 00000082  r4 : 808b0000
r3 : 808bc0a8  r2 : 00000000  r1 : 000e38ed  r0 : 00000001
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 30c53c7d  Table: be464cc0  DAC: fffffffd
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.19-rt30-00644-gc847f76-dirty #4
Backtrace:
[<800112c8>] (dump_backtrace+0x0/0x100) from [<80011588>] (show_stack+0x18/0x1c)
  r6:0000013c r5:00000000 r4:808b0028 r3:00200000
[<80011570>] (show_stack+0x0/0x1c) from [<8056abb0>] (dump_stack+0x90/0xbc)
[<8056ab20>] (dump_stack+0x0/0xbc) from [<8000f0f4>] (show_regs+0x24/0x2c)
  r5:808b8be0 r4:808b1e20
[<8000f0d0>] (show_regs+0x0/0x2c) from [<8007abac>] (watchdog_timer_fn+0x120/0x174)
  r4:808b0000 r3:808bc0a8
[<8007aa8c>] (watchdog_timer_fn+0x0/0x174) from [<8003dca4>] (__run_hrtimer.isra.25+0x58/0xd0)
[<8003dc4c>] (__run_hrtimer.isra.25+0x0/0xd0) from [<8003eabc>] (hrtimer_run_queues+0x214/0x234)
  r7:8168e2f0 r6:8168e2e0 r5:00000052 r4:0c14ad80
[<8003e8a8>] (hrtimer_run_queues+0x0/0x234) from [<800292b4>] (run_local_timers+0x1c/0x54)
[<80029298>] (run_local_timers+0x0/0x54) from [<80029324>] (update_process_times+0x38/0x6c)
  r4:808b0000 r3:00000000
[<800292ec>] (update_process_times+0x0/0x6c) from [<80062c18>] (tick_periodic+0x94/0xac)
  r7:00000000 r6:817f2c80 r5:808ce168 r4:bf807d00
[<80062b84>] (tick_periodic+0x0/0xac) from [<80062e14>] (tick_handle_periodic+0x2c/0x9c)
[<80062de8>] (tick_handle_periodic+0x0/0x9c) from [<803dd7c0>] (arch_timer_handler_phys+0x30/0x38)
  r9:410fc075 r8:817f2c80 r7:0000001d r6:bf81b240 r5:808ce168
r4:bf807d00
[<803dd790>] (arch_timer_handler_phys+0x0/0x38) from [<8005922c>] (handle_percpu_devid_irq+0x70/0x8c)
[<800591bc>] (handle_percpu_devid_irq+0x0/0x8c) from [<800557ac>] (generic_handle_irq+0x28/0x38)
  r8:808b0028 r7:808b1e54 r6:808b1f20 r5:808acd8c r4:0000001d
r3:800591bc
[<80055784>] (generic_handle_irq+0x0/0x38) from [<8000eac4>] (handle_IRQ+0x70/0x98)
  r4:0000001d r3:00000140
[<8000ea54>] (handle_IRQ+0x0/0x98) from [<80008560>] (gic_handle_irq+0x44/0x68)
  r6:808b90c4 r5:808b1e20 r4:c0802000 r3:00000100
[<8000851c>] (gic_handle_irq+0x0/0x68) from [<80012044>] (__irq_svc+0x44/0x58)
Exception stack(0x808b1e20 to 0x808b1e68)
1e20: 00000001 000e38ed 00000000 808bc0a8 808b0000 00000082 80021d94 808b1f54
1e40: 808b0028 410fc075 00200000 808b1eac 808b1e38 808b1e68 8006a20c 80021b24
1e60: 200c0113 ffffffff
  r6:ffffffff r5:200c0113 r4:80021b24 r3:8006a20c
[<80021a2c>] (__do_softirq+0x0/0x29c) from [<80021d94>] (do_softirq+0x54/0x78)
[<80021d40>] (do_softirq+0x0/0x78) from [<8002272c>] (irq_exit+0xa8/0x118)
  r5:808acd8c r4:808b0008
[<80022684>] (irq_exit+0x0/0x118) from [<8000eac8>] (handle_IRQ+0x74/0x98)
  r5:808acd8c r4:0000001d
[<8000ea54>] (handle_IRQ+0x0/0x98) from [<80008560>] (gic_handle_irq+0x44/0x68)
  r6:808b90c4 r5:808b1f20 r4:c0802000 r3:00000100
[<8000851c>] (gic_handle_irq+0x0/0x68) from [<80012044>] (__irq_svc+0x44/0x58)
Exception stack(0x808b1f20 to 0x808b1f68)
1f20: 00000001 000e38ec 00000000 808bc0a8 808b0038 808c0360 808b0000 8168b880
1f40: 808b8b00 410fc075 00000000 808b1f74 808b1f38 808b1f68 8006a250 8000ee08
1f60: 200c0013 ffffffff
  r6:ffffffff r5:200c0013 r4:8000ee08 r3:8006a250
[<8000edcc>] (arch_cpu_idle+0x0/0x44) from [<800556fc>] (cpu_startup_entry+0xd0/0x138)
[<8005562c>] (cpu_startup_entry+0x0/0x138) from [<805643b4>] (rest_init+0xe0/0x108)
[<805642d4>] (rest_init+0x0/0x108) from [<80715b28>] (start_kernel+0x2fc/0x354)
  r6:808b0000 r5:8073f450 r4:808b8c08
[<8071582c>] (start_kernel+0x0/0x354) from [<80008084>] (0x80008084)

I have disabled all drivers except for network, serial and MMC.
My kernel is built using meta-fsl-arm/recipes-kernel/linux/linux-ls1_3.12.bb

Any ideas what might cause this or how I can debug it further?

-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------
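
One small aid when reading a dump like the one above is to decode the
`psr` word by hand and check it against the "Flags:" line the kernel
printed. The sketch below is only an illustration, not part of the
original report. It assumes the standard AArch32 CPSR layout (N/Z/C/V in
bits 31..28, I in bit 7, F in bit 6, T in bit 5, mode in bits 4..0); the
function name `decode_psr` is made up for this example.

```python
# Sketch: decode an AArch32 program status register value, such as the
# "psr: 200c0113" field in the trace above. decode_psr is a hypothetical
# helper name, not a kernel or library API.

# Mode field encodings (bits 4:0), labelled the way the kernel prints them.
MODES = {
    0x10: "USER_32",
    0x11: "FIQ_32",
    0x12: "IRQ_32",
    0x13: "SVC_32",
    0x17: "ABT_32",
    0x1B: "UND_32",
    0x1F: "SYS_32",
}


def decode_psr(psr: int) -> dict:
    """Return the condition flags, interrupt masks and mode held in psr."""
    flags = ""
    # An upper-case letter means the flag is set, lower-case means clear.
    for bit, letter in ((31, "n"), (30, "z"), (29, "c"), (28, "v")):
        flags += letter.upper() if psr & (1 << bit) else letter
    return {
        "flags": flags,
        "irqs_on": not psr & (1 << 7),   # I bit set means IRQs are masked
        "fiqs_on": not psr & (1 << 6),   # F bit set means FIQs are masked
        "thumb": bool(psr & (1 << 5)),   # T bit: Thumb instead of ARM state
        "mode": MODES.get(psr & 0x1F, "unknown"),
    }


if __name__ == "__main__":
    # The two psr values that appear in the dump above.
    print(decode_psr(0x200C0113))
    print(decode_psr(0x200C0013))
```

For 0x200c0113 this gives flags "nzCv", IRQs and FIQs enabled, ARM state
and mode SVC_32, which agrees with the "Flags:" line in the dump.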



* Re: [meta-fsl-arm] Help debugging kernel lockup
  2015-06-17 16:39 [meta-fsl-arm] Help debugging kernel lockup Gary Thomas
@ 2015-06-17 19:32 ` Nikolay Dimitrov
  2015-06-18 18:48 ` Gary Thomas
  1 sibling, 0 replies; 4+ messages in thread
From: Nikolay Dimitrov @ 2015-06-17 19:32 UTC
  To: Gary Thomas, meta-freescale@yoctoproject.org

Hi Gary,

On 06/17/2015 07:39 PM, Gary Thomas wrote:
> I know this is a bit off-topic, but I'm hoping someone might
> have a clue how to help.
>
> I have an LS1021 system that has become unstable - it will
> lock up (reliably) 5 minutes after boot up, +/- a few seconds.
> I enabled kernel debugging for this and got this info:
[snip]

It would help to separate hardware problems from software ones: do you
have a previous SW version where everything worked as expected, even
under full load? Or has this issue been there all along?

I'm asking because 1-2 months ago I had kernel panic issues under load,
which turned out to be a problem with the DDR power supplies.

Regards,
Nikolay



* Re: [meta-fsl-arm] Help debugging kernel lockup
  2015-06-17 16:39 [meta-fsl-arm] Help debugging kernel lockup Gary Thomas
  2015-06-17 19:32 ` Nikolay Dimitrov
@ 2015-06-18 18:48 ` Gary Thomas
  2015-06-18 19:40   ` Nikolay Dimitrov
  1 sibling, 1 reply; 4+ messages in thread
From: Gary Thomas @ 2015-06-18 18:48 UTC
  To: meta-freescale

On 2015-06-17 10:39, Gary Thomas wrote:
> I know this is a bit off-topic, but I'm hoping someone might
> have a clue how to help.
>
> I have an LS1021 system that has become unstable - it will
> lock up (reliably) 5 minutes after boot up, +/- a few seconds.
> I enabled kernel debugging for this and got this info:
[snip]

I don't fully understand why, but this was caused by U-Boot on
the board.  If I build U-Boot with GCC 4.9.2, I get the above
error.  If I build U-Boot with GCC 4.8.4, the error does not
occur and the system is 100% solid.  That's it - nothing else
changes, only U-Boot (with identical U-Boot sources).

I'm not sure what U-Boot could be doing to cause the above problems,
perhaps it's because it is experimental/pre-production silicon...
For now, I'll just use the older GCC for this target.

Weird :-(

-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------
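
When the only variable is the compiler, one possible first step is to
find which functions were generated differently in the two U-Boot
builds. The sketch below is an illustration only and was not used in
this thread. It assumes you have saved the text output of `nm -S` on
each build's ELF file, where every sized symbol appears as "value size
type name". The helper names and the symbol names in the sample data are
invented for the example.

```python
# Sketch: compare symbol sizes from two saved `nm -S` listings to see
# which functions changed size between two builds. All helper names and
# the sample symbols below are hypothetical.


def parse_nm_sizes(text: str) -> dict:
    """Map symbol name to size (in bytes) from `nm -S` text output."""
    sizes = {}
    for line in text.splitlines():
        parts = line.split()
        # Sized symbols have four fields: value, size, type, name.
        # Lines with fewer fields (no size recorded) are skipped.
        if len(parts) != 4:
            continue
        _value, size, _type, name = parts
        try:
            sizes[name] = int(size, 16)
        except ValueError:
            continue
    return sizes


def changed_symbols(old: dict, new: dict) -> list:
    """Return (name, old_size, new_size) for every symbol that differs."""
    result = []
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        if a != b:
            result.append((name, a, b))
    return result


if __name__ == "__main__":
    # Invented sample listings, standing in for the two toolchain builds.
    OLD = """\
11000000 00000040 T board_init_example
11000040 00000120 T ddr_setup_example
"""
    NEW = """\
11000000 00000040 T board_init_example
11000040 000000f8 T ddr_setup_example
11000138 00000010 T extra_helper_example
"""
    for name, a, b in changed_symbols(parse_nm_sizes(OLD), parse_nm_sizes(NEW)):
        print(name, a, b)
```

A size change does not prove a function is miscompiled; it only narrows
down where to look in the disassembly.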



* Re: [meta-fsl-arm] Help debugging kernel lockup
  2015-06-18 18:48 ` Gary Thomas
@ 2015-06-18 19:40   ` Nikolay Dimitrov
  0 siblings, 0 replies; 4+ messages in thread
From: Nikolay Dimitrov @ 2015-06-18 19:40 UTC
  To: Gary Thomas, meta-freescale

Hi Gary,

On 06/18/2015 09:48 PM, Gary Thomas wrote:
[snip]
> I'm not sure what U-Boot could be doing to cause the above problems,
> perhaps it's because it is experimental/pre-production silicon...

U-Boot is almost always responsible for configuring the DRAM
controller(s), and sometimes it also programs the PMIC voltages. These
alone are enough to cause all kinds of trouble to the system.

Still, using early silicon tapeouts is quite fun...

Regards,
Nikolay
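
Since U-Boot programs the DRAM controller, one way to check this theory
would be to dump the controller registers from the U-Boot prompt under
each build (for example with `md.l`) and compare the two dumps. The
sketch below is an illustration only. It assumes the usual `md.l` line
layout of "address: word word word word    ascii". The helper names are
made up, and the address and register values in the sample are
placeholders, not real LS1021 settings.

```python
# Sketch: diff two U-Boot `md.l` register dumps word by word.
# parse_md_dump and diff_dumps are hypothetical helper names; the sample
# address and values are placeholders.
import re

# An 8-digit hex address, a colon, then words separated by single spaces.
# The ASCII column follows several spaces, so it is never matched.
_MD_LINE = re.compile(r"^([0-9a-fA-F]{8}):((?: [0-9a-fA-F]{8})+)")


def parse_md_dump(text: str) -> dict:
    """Map each 32-bit word's address to its value."""
    words = {}
    for line in text.splitlines():
        m = _MD_LINE.match(line.strip())
        if not m:
            continue  # prompt lines, blank lines, other console output
        base = int(m.group(1), 16)
        for i, word in enumerate(m.group(2).split()):
            words[base + 4 * i] = int(word, 16)
    return words


def diff_dumps(a: dict, b: dict) -> list:
    """Return (address, value_a, value_b) for every word that differs."""
    return [
        (addr, a.get(addr), b.get(addr))
        for addr in sorted(set(a) | set(b))
        if a.get(addr) != b.get(addr)
    ]


if __name__ == "__main__":
    # Placeholder dumps standing in for the two U-Boot builds.
    DUMP_A = "01080000: 80010322 00000000 0000007f 00000000    ....\n"
    DUMP_B = "01080000: 80010322 00000000 0000003f 00000000    ....\n"
    for addr, va, vb in diff_dumps(parse_md_dump(DUMP_A), parse_md_dump(DUMP_B)):
        print(f"{addr:08x}: {va:08x} -> {vb:08x}")
```

Any word that differs between the two dumps is a candidate for a closer
look against the reference manual.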

