From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756043AbbIUDbS (ORCPT ); Sun, 20 Sep 2015 23:31:18 -0400
Received: from szxga02-in.huawei.com ([119.145.14.65]:35431 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755914AbbIUDbR (ORCPT ); Sun, 20 Sep 2015 23:31:17 -0400
To: LKML
From: "majun (F)"
Subject: Problem about " rcu_sched self-detected stall on CPU "on arm64 platform
Message-ID: <55FF79F8.9000207@huawei.com>
Date: Mon, 21 Sep 2015 11:31:04 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0
MIME-Version: 1.0
Content-Type: text/plain; charset="gbk"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.177.235.245]
X-CFilter-Loop: Reflected
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi all,

I have a CPU stall problem and need your help.

On my arm64 board, when I
[1] set maxcpus=17, or any other value greater than 16 and less than 32
    (the SoC has 32 CPUs in total, on 2 CPU dies with 16 CPUs each), and
[2] enable CONFIG_NUMA or CONFIG_SCHED_MC, or both,
the system stalls on a CPU (the log is listed below).

When I set maxcpus=32, the problem goes away and the system boots fine.

If you have ever run into this problem or know anything about it, please give me some suggestions.

Thanks
Ma Jun

//-------log-----------------
[ OK ] Reached target Swap.
[ OK ] Mounted Debug File System.
[ OK ] Mounted Huge Pages File System.
[ OK ] Mounted POSIX Message Queue File System.
[ OK ] Started Create static device nodes in /dev.
[ OK ] Started udev Coldplug all Devices.
INFO: rcu_sched self-detected stall on CPU
	16: (5250 ticks this GP) idle=407/140000000000001/0 softirq=527/527 fqs=5242
INFO: rcu_sched detected stalls on CPUs/tasks:
	16: (5250 ticks this GP) idle=407/140000000000001/0 softirq=527/527 fqs=5242
	(detected by 0, t=5252 jiffies, g=229, c=228, q=3574)
Task dump for CPU 16:
systemd-journal R running task 0 978 1 0x00000002
Call trace:
[] __switch_to+0x74/0x8c
	 (t=5260 jiffies g=229 c=228 q=3574)
Task dump for CPU 16:
systemd-journal R running task 0 978 1 0x00000002
Call trace:
[] dump_backtrace+0x0/0x124
[] show_stack+0x10/0x1c
[] sched_show_task+0x94/0xdc
[] dump_cpu_task+0x3c/0x4c
[] rcu_dump_cpu_stacks+0x98/0xe8
[] rcu_check_callbacks+0x47c/0x788
[] update_process_times+0x38/0x6c
[] tick_sched_handle.isra.16+0x1c/0x68
[] tick_sched_timer+0x40/0x88
[] __run_hrtimer.isra.34+0x4c/0x10c
[] hrtimer_interrupt+0xd0/0x258
[] arch_timer_handler_phys+0x28/0x38
[] handle_percpu_devid_irq+0x74/0x9c
[] generic_handle_irq+0x30/0x4c
[] __handle_domain_irq+0x5c/0xac
[] gic_handle_irq+0xb8/0x1c8
Exception stack(0xffffffef5f343af0 to 0xffffffef5f343c10)
3ae0: 7fb069c0 ffffffef 7fb069c8 ffffffef
3b00: 5f343c70 ffffffef 00113990 ffffffc0 80000145 00000000 00000001 00000000
3b20: 001130c4 ffffffc0 00000000 00000000 00856718 ffffffc0 00000040 00000000
3b40: 00000210 00000000 00856000 ffffffc0 7fa19ff8 ffffffef 7fa19fe0 ffffffef
3b60: 00000001 00000000 008568f0 ffffffc0 00000001 00000000 fffffffe ffffffff
3b80: 00000000 00000000 00000000 00000000 00000900 00000000 65747379 6a2f646d
3ba0: 6c616e72 636f732f 6e72756f 732f6c61 65747379 6a2f646d ffffffff ffffffff
3bc0: 95716a94 0000007f 00005749 00000000 00511f54 ffffffc0 957f09d0 0000007f
3be0: e42b0110 0000007f 7fb069c0 ffffffef 7fb069c8 ffffffef 00856000 ffffffc0
3c00: 00856718 ffffffc0 00846980 ffffffc0
[] el1_irq+0x64/0xc0
[] kick_all_cpus_sync+0x24/0x30
[] aarch64_insn_patch_text+0x84/0x90
[] arch_jump_label_transform+0x58/0x64
[] __jump_label_update+0x68/0x84
[] jump_label_update+0x84/0xa8
[] static_key_slow_inc+0xf4/0xfc
[] net_enable_timestamp+0x6c/0x7c
[] sock_enable_timestamp+0x70/0x7c
[] sock_setsockopt+0x234/0x838
[] SyS_setsockopt+0x94/0xa8
NMI watchdog: BUG: soft lockup - CPU#16 stuck for 22s! [systemd-journal:978]
Modules linked in:
CPU: 16 PID: 978 Comm: systemd-journal Not tainted 4.1.6+ #9
Hardware name: Hisilicon PhosphorV660 2P1S Development Board (DT)
task: ffffffef5f38b700 ti: ffffffef5f340000 task.ti: ffffffef5f340000
PC is at smp_call_function_many+0x284/0x2f0
LR is at smp_call_function_many+0x250/0x2f0
pc : [] lr : [] pstate: 80000145
sp : ffffffef5f343c70
x29: ffffffef5f343c70 x28: 0000000000000040
x27: ffffffc000856718 x26: 0000000000000000
x25: ffffffc0001130c4 x24: 0000000000000001
x23: ffffffc000846980 x22: ffffffc000856718
x21: ffffffc000856000 x20: ffffffef7fb069c8
x19: ffffffef7fb069c0 x18: 0000007fe42b0110
x17: 0000007f957f09d0 x16: ffffffc000511f54
x15: 0000000000005749 x14: 0000007f95716a94
x13: ffffffffffffffff x12: 6a2f646d65747379
x11: 732f6c616e72756f x10: 636f732f6c616e72
x9 : 6a2f646d65747379 x8 : 0000000000000900
x7 : 0000000000000000 x6 : 0000000000000000
x5 : fffffffffffffffe x4 : 0000000000000001
x3 : ffffffc0008568f0 x2 : 0000000000000001
x1 : ffffffef7fa19fe0 x0 : ffffffef7fa19ff8
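
The CPU-count condition described in this report can be illustrated with a short Python sketch. It rests on assumptions the report does not state: that logical CPUs are numbered contiguously die by die (CPUs 0-15 on the first die, CPUs 16-31 on the second) and that maxcpus=N brings up CPUs 0 to N-1. The function name online_cpus_per_die is hypothetical and exists only for this illustration.

```python
def online_cpus_per_die(maxcpus, cpus_per_die=16, dies=2):
    """Return the number of online CPUs on each die.

    Assumption (not confirmed by the report): CPUs are numbered
    contiguously die by die, and maxcpus=N brings up CPUs 0..N-1.
    """
    total = cpus_per_die * dies
    # Clamp the requested count to what the SoC actually has.
    online = max(0, min(maxcpus, total))
    counts = []
    for die in range(dies):
        first_cpu = die * cpus_per_die
        # CPUs of this die that fall below the online limit.
        counts.append(max(0, min(online - first_cpu, cpus_per_die)))
    return counts


if __name__ == "__main__":
    for n in (16, 17, 24, 32):
        print(f"maxcpus={n}: {online_cpus_per_die(n)}")
```

Under these assumptions, maxcpus=17 would leave the second die with a single online CPU, CPU 16, which is the CPU named in the stall messages. Whether that asymmetry is related to the stall is not established by the log.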