* Problem about "rcu_sched self-detected stall on CPU" on arm64 platform
@ 2015-09-21 3:31 majun (F)
From: majun (F) @ 2015-09-21 3:31 UTC (permalink / raw)
To: LKML
Hi all,
I have a CPU stall problem and need your help.
On my arm64 board, when I
[1] set maxcpus=17 or any other value greater than 16 and less than 32 (the SoC has 32 CPUs in total on 2 CPU dies; each die has 16 CPUs)
[2] enable CONFIG_NUMA or CONFIG_SCHED_MC, or both,
the system stalls on a CPU (log listed below).
When I set maxcpus=32, the problem is gone and the system boots fine.
If you have ever met or know about this problem, please give me some suggestions.
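To illustrate the failing range, here is a rough sketch of how the online CPUs might be spread over the two dies for a given maxcpus value. It is not taken from the kernel or from the board's firmware. It assumes that CPUs are numbered contiguously per die (0-15 on die 0, 16-31 on die 1) and that maxcpus=N brings up CPUs 0 to N-1; the helper name online_cpus_per_die is hypothetical.

```python
# Sketch only. Assumptions (not confirmed by the report):
#   - CPUs are numbered contiguously per die: 0-15 on die 0, 16-31 on die 1
#   - maxcpus=N brings up CPUs 0..N-1
CPUS_PER_DIE = 16  # from the report: each die has 16 CPUs
NUM_DIES = 2       # from the report: 2 CPU dies in the SoC


def online_cpus_per_die(maxcpus):
    """Return a list with the number of online CPUs on each die."""
    total = CPUS_PER_DIE * NUM_DIES
    # Clamp the requested value to the number of CPUs that exist.
    online = max(0, min(maxcpus, total))
    counts = []
    for die in range(NUM_DIES):
        first_cpu = die * CPUS_PER_DIE
        # CPUs of this die that fall below the maxcpus cutoff.
        counts.append(max(0, min(online - first_cpu, CPUS_PER_DIE)))
    return counts


if __name__ == "__main__":
    for n in (16, 17, 24, 32):
        print(f"maxcpus={n}: online CPUs per die = {online_cpus_per_die(n)}")
```

Under these assumptions, every value in the failing range leaves the second die only partly populated. With maxcpus=17, CPU 16 would be the only online CPU on the second die, which is consistent with the CPU number reported in the stall log.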
Thanks
Ma Jun
//-------log-----------------
[ OK ] Reached target Swap.
[ OK ] Mounted Debug File System.
[ OK ] Mounted Huge Pages File System.
[ OK ] Mounted POSIX Message Queue File System.
[ OK ] Started Create static device nodes in /dev.
[ OK ] Started udev Coldplug all Devices.
INFO: rcu_sched self-detected stall on CPU
16: (5250 ticks this GP) idle=407/140000000000001/0 softirq=527/527 fqs=5242
INFO: rcu_sched detected stalls on CPUs/tasks:
16: (5250 ticks this GP) idle=407/140000000000001/0 softirq=527/527 fqs=5242
(detected by 0, t=5252 jiffies, g=229, c=228, q=3574)
Task dump for CPU 16:
systemd-journal R running task 0 978 1 0x00000002
Call trace:
[<ffffffc000086c5c>] __switch_to+0x74/0x8c
(t=5260 jiffies g=229 c=228 q=3574)
Task dump for CPU 16:
systemd-journal R running task 0 978 1 0x00000002
Call trace:
[<ffffffc000089904>] dump_backtrace+0x0/0x124
[<ffffffc000089a38>] show_stack+0x10/0x1c
[<ffffffc0000d65f4>] sched_show_task+0x94/0xdc
[<ffffffc0000d99b0>] dump_cpu_task+0x3c/0x4c
[<ffffffc0000f947c>] rcu_dump_cpu_stacks+0x98/0xe8
[<ffffffc0000fca34>] rcu_check_callbacks+0x47c/0x788
[<ffffffc0000ffddc>] update_process_times+0x38/0x6c
[<ffffffc00010ec80>] tick_sched_handle.isra.16+0x1c/0x68
[<ffffffc00010ed0c>] tick_sched_timer+0x40/0x88
[<ffffffc00010088c>] __run_hrtimer.isra.34+0x4c/0x10c
[<ffffffc000100b88>] hrtimer_interrupt+0xd0/0x258
[<ffffffc0004f0acc>] arch_timer_handler_phys+0x28/0x38
[<ffffffc0000f3760>] handle_percpu_devid_irq+0x74/0x9c
[<ffffffc0000ef524>] generic_handle_irq+0x30/0x4c
[<ffffffc0000ef83c>] __handle_domain_irq+0x5c/0xac
[<ffffffc000082524>] gic_handle_irq+0xb8/0x1c8
Exception stack(0xffffffef5f343af0 to 0xffffffef5f343c10)
3ae0: 7fb069c0 ffffffef 7fb069c8 ffffffef
3b00: 5f343c70 ffffffef 00113990 ffffffc0 80000145 00000000 00000001 00000000
3b20: 001130c4 ffffffc0 00000000 00000000 00856718 ffffffc0 00000040 00000000
3b40: 00000210 00000000 00856000 ffffffc0 7fa19ff8 ffffffef 7fa19fe0 ffffffef
3b60: 00000001 00000000 008568f0 ffffffc0 00000001 00000000 fffffffe ffffffff
3b80: 00000000 00000000 00000000 00000000 00000900 00000000 65747379 6a2f646d
3ba0: 6c616e72 636f732f 6e72756f 732f6c61 65747379 6a2f646d ffffffff ffffffff
3bc0: 95716a94 0000007f 00005749 00000000 00511f54 ffffffc0 957f09d0 0000007f
3be0: e42b0110 0000007f 7fb069c0 ffffffef 7fb069c8 ffffffef 00856000 ffffffc0
3c00: 00856718 ffffffc0 00846980 ffffffc0
[<ffffffc0000855a4>] el1_irq+0x64/0xc0
[<ffffffc000113ab0>] kick_all_cpus_sync+0x24/0x30
[<ffffffc00008c4ac>] aarch64_insn_patch_text+0x84/0x90
[<ffffffc000093860>] arch_jump_label_transform+0x58/0x64
[<ffffffc00013970c>] __jump_label_update+0x68/0x84
[<ffffffc0001397ac>] jump_label_update+0x84/0xa8
[<ffffffc0001398c4>] static_key_slow_inc+0xf4/0xfc
[<ffffffc000524bd8>] net_enable_timestamp+0x6c/0x7c
[<ffffffc000516050>] sock_enable_timestamp+0x70/0x7c
[<ffffffc000516290>] sock_setsockopt+0x234/0x838
[<ffffffc000511fe8>] SyS_setsockopt+0x94/0xa8
NMI watchdog: BUG: soft lockup - CPU#16 stuck for 22s! [systemd-journal:978]
Modules linked in:
CPU: 16 PID: 978 Comm: systemd-journal Not tainted 4.1.6+ #9
Hardware name: Hisilicon PhosphorV660 2P1S Development Board (DT)
task: ffffffef5f38b700 ti: ffffffef5f340000 task.ti: ffffffef5f340000
PC is at smp_call_function_many+0x284/0x2f0
LR is at smp_call_function_many+0x250/0x2f0
pc : [<ffffffc000113990>] lr : [<ffffffc00011395c>] pstate: 80000145
sp : ffffffef5f343c70
x29: ffffffef5f343c70 x28: 0000000000000040
x27: ffffffc000856718 x26: 0000000000000000
x25: ffffffc0001130c4 x24: 0000000000000001
x23: ffffffc000846980 x22: ffffffc000856718
x21: ffffffc000856000 x20: ffffffef7fb069c8
x19: ffffffef7fb069c0 x18: 0000007fe42b0110
x17: 0000007f957f09d0 x16: ffffffc000511f54
x15: 0000000000005749 x14: 0000007f95716a94
x13: ffffffffffffffff x12: 6a2f646d65747379
x11: 732f6c616e72756f x10: 636f732f6c616e72
x9 : 6a2f646d65747379 x8 : 0000000000000900
x7 : 0000000000000000 x6 : 0000000000000000
x5 : fffffffffffffffe x4 : 0000000000000001
x3 : ffffffc0008568f0 x2 : 0000000000000001
x1 : ffffffef7fa19fe0 x0 : ffffffef7fa19ff8