From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965421Ab3GLUeW (ORCPT ); Fri, 12 Jul 2013 16:34:22 -0400 Received: from aserp1040.oracle.com ([141.146.126.69]:34195 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965377Ab3GLUeU (ORCPT ); Fri, 12 Jul 2013 16:34:20 -0400 Message-ID: <51E06841.50701@oracle.com> Date: Fri, 12 Jul 2013 16:34:09 -0400 From: Sasha Levin User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130623 Thunderbird/17.0.7 MIME-Version: 1.0 To: dipankar@in.ibm.com, paulmck@linux.vnet.ibm.com CC: LKML Subject: rcu: qs related lockup on boot Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Source-IP: ucsinet21.oracle.com [156.151.31.93] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi all, I've stumbled on the following rcu related spew at boot. It happened right when the kernel finished up it's boot sequence and passed it to userspace. [ 116.549044] BUG: soft lockup - CPU#0 stuck for 30s! [modprobe:12510] [ 116.549884] Modules linked in: [ 116.551796] hardirqs last enabled at (3258): [] restore_args+0x0/0x30 [ 116.553684] hardirqs last disabled at (3259): [] apic_timer_interrupt+0x6d/0x80 [ 116.554753] softirqs last enabled at (3212): [] __do_softirq+0x447/0x4d0 [ 116.555857] softirqs last disabled at (3223): [] irq_exit+0x86/0x120 [ 116.556760] CPU: 0 PID: 12510 Comm: modprobe Tainted: G W 3.10.0-next-20130712-sasha #3956 [ 116.557812] task: ffff8807c8173000 ti: ffff8807c72a6000 task.ti: ffff8807c72a6000 [ 116.558599] RIP: 0010:[] [] _raw_spin_unlock_irqrestore+0x8d/0xc0 [ 116.558958] RSP: 0018:ffff8807eba03ce8 EFLAGS: 00000282 [ 116.558958] RAX: ffff8807c8173000 RBX: ffffffff84199eb7 RCX: 0000000000000000 [ 116.558958] RDX: ffff8807c8173000 RSI: 0000000000000000 RDI: 0000000000000282 [ 116.558958] RBP: ffff8807eba03cf8 R08: 0000000000000000 R09: 0000000000000000 [ 116.558958] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8807eba03c58 [ 116.558958] R13: ffffffff841a3cf2 R14: ffff8807eba03cf8 R15: ffffffff86098020 [ 116.558958] FS: 0000000000000000(0000) GS:ffff8807eba00000(0000) knlGS:0000000000000000 [ 116.558958] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 116.558958] CR2: 00007f99fc17f850 CR3: 00000007c8438000 CR4: 00000000000006f0 [ 116.558958] Stack: [ 116.558958] ffffffff86098020 0000000000000282 ffff8807eba03d38 ffffffff81169173 [ 116.558958] ffff8807eba03d18 ffffffff85e8d000 0000000000000282 0000000000000000 [ 116.558958] 0000000000000000 0000000000000000 ffff8807eba03d58 ffffffff811e7743 [ 116.558958] Call Trace: [ 116.558958] [ 116.558958] [] __wake_up+0x53/0x70 [ 116.558958] [] rcu_report_qs_rsp+0x73/0x80 [ 116.558958] [] rcu_report_qs_rnp+0x26d/0x2c0 [ 116.558958] [] ? rcu_check_quiescent_state+0x4a/0xf0 [ 116.558958] [] rcu_check_quiescent_state+0xcc/0xf0 [ 116.558958] [] __rcu_process_callbacks+0x5b/0x180 [ 116.558958] [] rcu_process_callbacks+0x88/0xc0 [ 116.558958] [] __do_softirq+0x261/0x4d0 [ 116.558958] [] irq_exit+0x86/0x120 [ 116.558958] [] smp_apic_timer_interrupt+0x4a/0x60 [ 116.558958] [] apic_timer_interrupt+0x72/0x80 [ 116.558958] [ 116.558958] [] ? retint_restore_args+0x13/0x13 [ 116.558958] [] ? _raw_spin_unlock_irq+0x4c/0x80 [ 116.558958] [] ? _raw_spin_unlock_irq+0x30/0x80 [ 116.558958] [] finish_task_switch+0x96/0x120 [ 116.558958] [] ? finish_task_switch+0x58/0x120 [ 116.558958] [] __schedule+0x81b/0x8e0 [ 116.558958] [] ? rcu_irq_exit+0x1b7/0x200 [ 116.558958] [] ? retint_restore_args+0x13/0x13 [ 116.558958] [] preempt_schedule_irq+0xa4/0xf0 [ 116.558958] [] retint_kernel+0x26/0x30 [ 116.558958] [] ? user_enter+0x135/0x150 [ 116.558958] [] syscall_trace_leave+0x12d/0x160 [ 116.558958] [] int_check_syscall_exit_work+0x34/0x3d [ 116.558958] Code: 1f 80 00 00 00 00 e8 73 e2 00 fd 48 83 3d e3 d6 8a 01 00 75 11 0f 0b 0f 1f 80 00 00 00 00 eb fe 66 0f 1f 44 00 00 4c 89 e7 57 9d <66> 66 90 66 90 bf 01 00 00 00 e8 94 4f 00 00 65 48 8b 04 25 88 The other cpus with anything in their callstack are: [ 118.366894] CPU: 140 PID: 0 Comm: swapper/140 Tainted: G W 3.10.0-next-20130712-sasha #3956 [ 118.366894] task: ffff8807de5bb000 ti: ffff8807de5c0000 task.ti: ffff8807de5c0000 [ 118.366894] RIP: 0010:[] [] delay_tsc+0xb2/0x140 [ 118.366894] RSP: 0018:ffff8807fd203eb8 EFLAGS: 00000046 [ 118.366894] RAX: 0000000000080000 RBX: 000000005af9b24b RCX: 000000005af9c2c7 [ 118.366894] RDX: 0000000000000043 RSI: 0000000000000000 RDI: 0000000000000001 [ 118.366894] RBP: ffff8807fd203ee8 R08: 00000000e26ec8c5 R09: 0000000000000001 [ 118.366894] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8807de5c0010 [ 118.366894] R13: 0000000000005854 R14: 000000000000008c R15: 000000005af9c2c7 [ 118.366894] FS: 0000000000000000(0000) GS:ffff8807fd200000(0000) knlGS:0000000000000000 [ 118.366894] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 118.366894] CR2: 00007f55dc6b6000 CR3: 0000000005a24000 CR4: 00000000000006e0 [ 118.366894] Stack: [ 118.366894] ffff8807de5bb000 ffff8807de5bb000 0000000000000000 00000000ffffffff [ 118.366894] 0000000000000000 0000000000000002 ffff8807fd203ef8 ffffffff81a734f9 [ 118.366894] ffff8807fd203f18 ffffffff81156af4 0000000000000000 0000000000000000 [ 118.366894] Call Trace: [ 118.366894] [ 118.366894] [] __const_udelay+0x29/0x30 [ 118.366894] [] __rcu_read_unlock+0x44/0xb0 [ 118.366894] [] __atomic_notifier_call_chain+0x10d/0x130 [ 118.366894] [] ? notifier_call_chain+0x130/0x130 [ 118.366894] [] atomic_notifier_call_chain+0x16/0x20 [ 118.366894] [] exit_idle+0x3f/0x50 [ 118.366894] [] smp_apic_timer_interrupt+0x3e/0x60 [ 118.366894] [] apic_timer_interrupt+0x72/0x80 [ 118.366894] [ 118.366894] [] ? native_safe_halt+0x6/0x10 [ 118.366894] [] ? trace_hardirqs_on+0xd/0x10 [ 118.366894] [] default_idle+0x13d/0x2a0 [ 118.366894] [] arch_cpu_idle+0x18/0x50 [ 118.366894] [] cpu_idle_loop+0x32f/0x410 [ 118.366894] [] cpu_startup_entry+0x70/0x80 [ 118.366894] [] start_secondary+0xf3/0x100 [ 118.366894] Code: 41 89 c7 29 d8 44 39 e8 73 6f bf 01 00 00 00 e8 15 b0 72 02 49 8b 04 24 a8 08 74 0d e8 58 3d 72 02 0f 1f 84 00 00 00 00 00 f3 90 01 00 00 00 e8 e4 b0 72 02 e8 af d6 01 00 41 39 c6 74 9a e8 and: [ 116.900485] CPU: 230 PID: 2154 Comm: watchdog/230 Tainted: G W 3.10.0-next-20130712-sasha #3956 [ 116.900485] task: ffff8807da8c8000 ti: ffff8807da8bc000 task.ti: ffff8807da8bc000 [ 116.900485] RIP: 0010:[] [] delay_tsc+0xb2/0x140 [ 116.900485] RSP: 0018:ffff8807da8bdc58 EFLAGS: 00000046 [ 116.900485] RAX: 0000000000080000 RBX: 000000007a14c1cd RCX: 000000007a1518e5 [ 116.900485] RDX: 0000000000000043 RSI: 0000000000000000 RDI: 0000000000000001 [ 116.900485] RBP: ffff8807da8bdc88 R08: 00000000e26ec8c5 R09: 0000000000000001 [ 116.900485] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8807da8bc010 [ 116.900485] R13: 0000000000005854 R14: 00000000000000e6 R15: 000000007a1518e5 [ 116.900485] FS: 0000000000000000(0000) GS:ffff880808600000(0000) knlGS:0000000000000000 [ 116.900485] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 116.900485] CR2: 0000000000625370 CR3: 0000000005a24000 CR4: 00000000000006e0 [ 116.900485] Stack: [ 116.900485] ffff8807da8c8000 ffff8807da8c8000 00000000ffffb87a ffff8808087d78c0 [ 116.900485] 00000000000000e6 ffff8807da8bdcf4 ffff8807da8bdc98 ffffffff81a734f9 [ 116.900485] ffff8807da8bdcb8 ffffffff81156af4 00000000000000e6 0000000000000000 [ 116.900485] Call Trace: [ 116.900485] [] __const_udelay+0x29/0x30 [ 116.900485] [] __rcu_read_unlock+0x44/0xb0 [ 116.900485] [] idle_balance+0x205/0x240 [ 116.900485] [] ? idle_balance+0x44/0x240 [ 116.900485] [] ? dequeue_task_rt+0x39/0x50 [ 116.900485] [] __schedule+0x275/0x8e0 [ 116.900485] [] schedule+0x5d/0x70 [ 116.900485] [] smpboot_thread_fn+0x295/0x2f0 [ 116.900485] [] ? smpboot_register_percpu_thread+0x100/0x100 [ 116.900485] [] kthread+0xe7/0xf0 [ 116.900485] [] ? wait_for_completion+0xca/0x120 [ 116.900485] [] ? __init_kthread_worker+0x70/0x70 [ 116.900485] [] ret_from_fork+0x7c/0xb0 [ 116.900485] [] ? __init_kthread_worker+0x70/0x70 [ 116.900485] Code: 41 89 c7 29 d8 44 39 e8 73 6f bf 01 00 00 00 e8 15 b0 72 02 49 8b 04 24 a8 08 74 0d e8 58 3d 72 02 0f 1f 84 00 00 00 00 00 f3 90 01 00 00 00 e8 e4 b0 72 02 e8 af d6 01 00 41 39 c6 74 9a e8 All other cpus are idle. Thanks, Sasha