* WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
@ 2012-02-09 1:31 Sasha Levin
  2012-02-09 0:59 ` Josh Boyer
  0 siblings, 1 reply; 15+ messages in thread
From: Sasha Levin @ 2012-02-09 1:31 UTC (permalink / raw)
  To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity
  Cc: kvm, linux-kernel, x86

Hi all,

I got the following warning when shutting down a KVM guest with a whole bunch of cores (254 in this case).

It's actually pretty easy to reproduce it, it happens every once in 2-3 shutdowns.

[ 32.448626] ------------[ cut here ]------------
[ 32.449160] WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
[ 32.449621] Pid: 1, comm: init_stage2 Not tainted 3.2.0+ #14
[ 32.449621] Call Trace:
[ 32.449621] <IRQ> [<ffffffff81041a44>] ? native_smp_send_reschedule+0x25/0x43
[ 32.449621] [<ffffffff810735b2>] warn_slowpath_common+0x7b/0x93
[ 32.449621] [<ffffffff810962cc>] ? tick_nohz_handler+0xc9/0xc9
[ 32.449621] [<ffffffff81073675>] warn_slowpath_null+0x15/0x18
[ 32.449621] [<ffffffff81041a44>] native_smp_send_reschedule+0x25/0x43
[ 32.449621] [<ffffffff81067a00>] smp_send_reschedule+0xa/0xc
[ 32.449621] [<ffffffff8106f25e>] scheduler_tick+0x21a/0x242
[ 32.449621] [<ffffffff8107da10>] update_process_times+0x62/0x73
[ 32.449621] [<ffffffff81096336>] tick_sched_timer+0x6a/0x8a
[ 32.449621] [<ffffffff8108c5eb>] __run_hrtimer.clone.26+0x55/0xcb
[ 32.449621] [<ffffffff8108cd77>] hrtimer_interrupt+0xcb/0x19b
[ 32.449621] [<ffffffff810428a8>] smp_apic_timer_interrupt+0x72/0x85
[ 32.449621] [<ffffffff8165a8de>] apic_timer_interrupt+0x6e/0x80
[ 32.449621] <EOI> [<ffffffff8165928e>] ? _raw_spin_unlock_irqrestore+0x3a/0x3e
[ 32.449621] [<ffffffff81042f4e>] ? arch_local_irq_restore+0x6/0xd
[ 32.449621] [<ffffffff810430c4>] default_send_IPI_mask_allbutself_phys+0x78/0x88
[ 32.449621] [<ffffffff8106c3c4>] ? __migrate_task+0xf1/0xf1
[ 32.449621] [<ffffffff81045445>] physflat_send_IPI_allbutself+0x12/0x14
[ 32.449621] [<ffffffff81041aaf>] native_stop_other_cpus+0x4d/0xa8
[ 32.449621] [<ffffffff810411c6>] native_machine_shutdown+0x56/0x6d
[ 32.449621] [<ffffffff81048499>] kvm_shutdown+0x1a/0x1c
[ 32.449621] [<ffffffff810411f9>] machine_shutdown+0xa/0xc
[ 32.449621] [<ffffffff81041265>] native_machine_restart+0x20/0x32
[ 32.449621] [<ffffffff81041297>] machine_restart+0xa/0xc
[ 32.449621] [<ffffffff81081d53>] kernel_restart+0x49/0x4d
[ 32.449621] [<ffffffff81081f26>] sys_reboot+0x14b/0x18a
[ 32.449621] [<ffffffff81089937>] ? remove_wait_queue+0x4c/0x51
[ 32.449621] [<ffffffff8107637f>] ? do_wait+0x1a4/0x1e7
[ 32.449621] [<ffffffff8107735a>] ? sys_wait4+0xa8/0xbc
[ 32.449621] [<ffffffff8107522b>] ? clear_tsk_thread_flag+0xf/0xf
[ 32.449621] [<ffffffff81659a25>] ? async_page_fault+0x25/0x30
[ 32.449621] [<ffffffff81659e92>] system_call_fastpath+0x16/0x1b
[ 32.449621] ---[ end trace d0f03651493fd3d6 ]--

--
Sasha.

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-09 1:31 WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43() Sasha Levin
@ 2012-02-09 0:59 ` Josh Boyer
  2012-02-09 19:46 ` Sasha Levin
  0 siblings, 1 reply; 15+ messages in thread
From: Josh Boyer @ 2012-02-09 0:59 UTC (permalink / raw)
  To: Sasha Levin
  Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86

On Wed, Feb 8, 2012 at 8:31 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> Hi all,
>
> I got the following warning when shutting down a KVM guest with a whole bunch of cores (254 in this case).
>
> It's actually pretty easy to reproduce it, it happens every once in 2-3 shutdowns.
>
> [ 32.449160] WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
> [ 32.449621] Pid: 1, comm: init_stage2 Not tainted 3.2.0+ #14
> [...]

You don't really point out exactly which kernel this is, but we saw this in
3.3 git and it was fixed by commit 71325960d16cd68ea0e22a8da15b2495b0f363f7.
Or at least something very like it was.

josh

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-09 0:59 ` Josh Boyer
@ 2012-02-09 19:46 ` Sasha Levin
  2012-02-10 10:06 ` Srivatsa S. Bhat
  0 siblings, 1 reply; 15+ messages in thread
From: Sasha Levin @ 2012-02-09 19:46 UTC (permalink / raw)
  To: Josh Boyer
  Cc: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86

On Wed, Feb 8, 2012 at 7:59 PM, Josh Boyer <jwboyer@gmail.com> wrote:
> On Wed, Feb 8, 2012 at 8:31 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
>> Hi all,
>>
>> I got the following warning when shutting down a KVM guest with a whole bunch of cores (254 in this case).
>>
>> It's actually pretty easy to reproduce it, it happens every once in 2-3 shutdowns.
>>
>> [ 32.449160] WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
>> [ 32.449621] Pid: 1, comm: init_stage2 Not tainted 3.2.0+ #14
>> [...]
>
> You don't really point out exactly which kernel this is, but we saw this in
> 3.3 git and it was fixed by commit 71325960d16cd68ea0e22a8da15b2495b0f363f7.
> Or at least something very like it was.

The kernel there was vanilla 3.2 (as stated in the warning header).

I've tried it again with linux-next from today which includes the
commit you mentioned, and still get the same error.

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-09 19:46 ` Sasha Levin
@ 2012-02-10 10:06 ` Srivatsa S. Bhat
  2012-02-10 18:58 ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Srivatsa S. Bhat @ 2012-02-10 10:06 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Peter Zijlstra, Sergey Senozhatsky

Adding Suresh and Peter to Cc.

On 02/10/2012 01:16 AM, Sasha Levin wrote:
> On Wed, Feb 8, 2012 at 7:59 PM, Josh Boyer <jwboyer@gmail.com> wrote:
>> On Wed, Feb 8, 2012 at 8:31 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
>>> Hi all,
>>>
>>> I got the following warning when shutting down a KVM guest with a whole bunch of cores (254 in this case).
>>>
>>> It's actually pretty easy to reproduce it, it happens every once in 2-3 shutdowns.
>>>
>>> [ 32.449160] WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
>>> [ 32.449621] Pid: 1, comm: init_stage2 Not tainted 3.2.0+ #14
>>> [...]
>>
>> You don't really point out exactly which kernel this is, but we saw this in
>> 3.3 git and it was fixed by commit 71325960d16cd68ea0e22a8da15b2495b0f363f7.
>> Or at least something very like it was.
>
> The kernel there was vanilla 3.2 (as stated in the warning header).
>
> I've tried it again with linux-next from today which includes the
> commit you mentioned, and still get the same error.

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-10 10:06 ` Srivatsa S. Bhat
@ 2012-02-10 18:58 ` Peter Zijlstra
  2012-02-10 19:03 ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2012-02-10 18:58 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Sasha Levin, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky, Don Zickus

On Fri, 2012-02-10 at 15:36 +0530, Srivatsa S. Bhat wrote:
> >>> [ 32.449160] WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
> >>> [ 32.449621] Pid: 1, comm: init_stage2 Not tainted 3.2.0+ #14
> >>> [...]

OK, so a 'modern' kernel does it slightly different and I've no idea
what exactly goes wrong in your vintage version. But I can see the
current stuff going at it all wrong.

What seems to happen is that native_nmi_stop_other_cpus() NMI broadcasts
for smp_stop_nmi_callback()->stop_this_cpu(). Which without any
serialization what so ever marks all remote CPUs offline and calls halt
with IRQs disabled -> dead.

While we're waiting for this all to complete, the scheduler tries to
no_hz load-balance and kick a cpu it thinks is still around and we get
the above splat because the NMI just marked it offline without telling
anybody about it.

Now, arguably you don't want to go through the whole hotplug crap to
shut down your machine, esp not on panic, but clearing the online state
without telling anybody about it is bound to lead to these things.

No immediate solution comes to mind...

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-10 18:58 ` Peter Zijlstra
@ 2012-02-10 19:03 ` Peter Zijlstra
  2012-02-10 20:02 ` Don Zickus
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2012-02-10 19:03 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Sasha Levin, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky, Don Zickus

On Fri, 2012-02-10 at 19:58 +0100, Peter Zijlstra wrote:
> OK, so a 'modern' kernel does it slightly different and I've no idea
> what exactly goes wrong in your vintage version. But I can see the
> current stuff going at it all wrong.
>
> What seems to happen is that native_nmi_stop_other_cpus() NMI broadcasts
> for smp_stop_nmi_callback()->stop_this_cpu(). Which without any
> serialization what so ever marks all remote CPUs offline and calls halt
> with IRQs disabled -> dead.
>
> While we're waiting for this all to complete, the scheduler tries to
> no_hz load-balance and kick a cpu it thinks is still around and we get
> the above splat because the NMI just marked it offline without telling
> anybody about it.
>
> Now, arguably you don't want to go through the whole hotplug crap to
> shut down your machine, esp not on panic, but clearing the online state
> without telling anybody about it is bound to lead to these things.
>
> No immediate solution comes to mind...

Don, any reason you wait for the NMI broadcast to complete with IRQs
enabled? If you disable IRQs before the broadcast the interrupt can't
happen and should side-step this particular problem.

Its not like we have 'latency' issues on this path :-)

^ permalink raw reply [flat|nested] 15+ messages in thread
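The race described in the message above, and the suggested side-step of masking IRQs before the broadcast, can be modelled in a few lines of user-space Python. This is a toy sketch under stated assumptions, not kernel code: `stop_other_cpus`, `send_reschedule`, `simulate_shutdown` and the `irqs_enabled_while_waiting` flag are hypothetical names, and the "offline check" only mimics what the warning at smp.c:119 appears to guard against.

```python
# Toy user-space model of the shutdown race. Every name below is a
# hypothetical stand-in for the kernel mechanism it mimics.


def stop_other_cpus(online, self_cpu):
    """Mimic the stop broadcast: each remote CPU silently leaves the
    online set, and nobody (e.g. the scheduler) is told about it."""
    for cpu in list(online):
        if cpu != self_cpu:
            online.discard(cpu)


def send_reschedule(online, cpu, warnings):
    """Mimic the offline check: warn instead of kicking an offline CPU."""
    if cpu not in online:
        warnings.append(f"WARNING: reschedule IPI to offline CPU {cpu}")
        return False
    return True


def simulate_shutdown(nr_cpus, irqs_enabled_while_waiting):
    online = set(range(nr_cpus))
    warnings = []
    # Assumption: the load balancer chose CPU 1 as a target while it was
    # still online, so the choice is stale once the broadcast has run.
    stale_target = 1
    stop_other_cpus(online, self_cpu=0)
    if irqs_enabled_while_waiting:
        # A timer tick fires on CPU 0 while it waits for the others,
        # and the tick tries to kick the stale target.
        send_reschedule(online, stale_target, warnings)
    return warnings
```

With `irqs_enabled_while_waiting=True` the model records one warning; with it set to `False` the tick never runs during the wait and no warning is recorded, which is the effect the suggestion aims for.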
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-10 19:03 ` Peter Zijlstra
@ 2012-02-10 20:02 ` Don Zickus
  2012-02-10 20:18 ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Don Zickus @ 2012-02-10 20:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Srivatsa S. Bhat, Sasha Levin, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky

On Fri, Feb 10, 2012 at 08:03:53PM +0100, Peter Zijlstra wrote:
> On Fri, 2012-02-10 at 19:58 +0100, Peter Zijlstra wrote:
> > OK, so a 'modern' kernel does it slightly different and I've no idea
> > what exactly goes wrong in your vintage version. But I can see the
> > current stuff going at it all wrong.
> >
> > What seems to happen is that native_nmi_stop_other_cpus() NMI broadcasts
> > for smp_stop_nmi_callback()->stop_this_cpu(). Which without any
> > serialization what so ever marks all remote CPUs offline and calls halt
> > with IRQs disabled -> dead.
> >
> > While we're waiting for this all to complete, the scheduler tries to
> > no_hz load-balance and kick a cpu it thinks is still around and we get
> > the above splat because the NMI just marked it offline without telling
> > anybody about it.
> >
> > Now, arguably you don't want to go through the whole hotplug crap to
> > shut down your machine, esp not on panic, but clearing the online state
> > without telling anybody about it is bound to lead to these things.
> >
> > No immediate solution comes to mind...
>
> Don, any reason you wait for the NMI broadcast to complete with IRQs
> enabled? If you disable IRQs before the broadcast the interrupt can't
> happen and should side-step this particular problem.

Well I believe the old way had the same problem using the REBOOT_IRQ as
opposed to NMI. I also don't know how to shutdown interrupts system wide
without just broadcasting an IRQ to locally disable interrupts.

>
> Its not like we have 'latency' issues on this path :-)

Heh. Oddly I was writing the changelog for a patch that kinda changes
this path to sorta revert back to the old way of using a REBOOT_IRQ with
an NMI follow-on when the IRQ fails. Originally, I wanted to make sure
the cpus were shutdown immediately so we can serialize the panic path
hence the original change.

I also ran into the same problem you did and hacked up another patch that
checked a global atomic variable that let the system know we were shutting
down and not to do the WARN_ON (the global is already created for the NMI
case now).

I'll try to post that soon once I finish my long winded changelog. Though
it kinda addresses your issue, I'm not sure it does it in a way that will
satisfy you. But I look forward to the discussion. :-)

Cheers,
Don

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-10 20:02 ` Don Zickus
@ 2012-02-10 20:18 ` Peter Zijlstra
  2012-02-10 20:31 ` Don Zickus
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2012-02-10 20:18 UTC (permalink / raw)
  To: Don Zickus
  Cc: Srivatsa S. Bhat, Sasha Levin, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky

On Fri, 2012-02-10 at 15:02 -0500, Don Zickus wrote:
> I also ran into the same problem you did and hacked up another patch that
> checked a global atomic variable that let the system know we were shutting
> down and not to do the WARN_ON (the global is already created for the NMI
> case now).

system_state seems like that thing..

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-10 20:18 ` Peter Zijlstra
@ 2012-02-10 20:31 ` Don Zickus
  2012-02-10 20:36 ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Don Zickus @ 2012-02-10 20:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Srivatsa S. Bhat, Sasha Levin, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky

On Fri, Feb 10, 2012 at 09:18:41PM +0100, Peter Zijlstra wrote:
> On Fri, 2012-02-10 at 15:02 -0500, Don Zickus wrote:
> > I also ran into the same problem you did and hacked up another patch that
> > checked a global atomic variable that let the system know we were shutting
> > down and not to do the WARN_ON (the global is already created for the NMI
> > case now).
>
> system_state seems like that thing..

except it doesn't seem to have a PANIC state, though we could add one I
suppose.

The thing is even if you reverted my changes:

e58d429 x86, reboot: Fix typo in nmi reboot path
bda6263 x86, NMI: Add knob to disable using NMI IPIs to stop cpus
3603a25 x86, reboot: Use NMI instead of REBOOT_VECTOR to stop cpus

I think you still run into the same problem because the reschedule code
changed.

So my second patch which I will eventually post will just skip the WARN_ON
if the system is going down. Not sure if that is the proper way to address
this problem or change all of the stop_this_cpu code to use a different
bitmask than the cpu_online bitmask (but then you run the risk of a stuck
IPI I guess if the cpu is halted without notifying anyone).

Cheers,
Don

^ permalink raw reply [flat|nested] 15+ messages in thread
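The idea discussed above, skipping the warning once the system is known to be going down, can be sketched as a toy model. This is an illustration only, not the patch mentioned in the thread (which was not posted here): `SystemState`, `going_down` and `send_reschedule` are hypothetical names, and the `PANIC` member is included purely as an assumption, since the thread notes that no such state existed.

```python
from enum import Enum, auto


class SystemState(Enum):
    # Hypothetical stand-in for a global "are we going down?" indicator.
    RUNNING = auto()
    RESTART = auto()
    PANIC = auto()  # assumption: the thread says this state did not exist


def going_down(state):
    """Anything other than normal running counts as 'going down' here."""
    return state is not SystemState.RUNNING


def send_reschedule(online, cpu, state, warnings):
    """Never kick an offline CPU; only warn about it while the system
    is still running normally."""
    if cpu in online:
        return True
    if not going_down(state):
        warnings.append(f"WARNING: reschedule IPI to offline CPU {cpu}")
    return False
```

In this model the kick to an offline CPU is dropped in every state, but the warning is recorded only in `RUNNING`, so an unexpected offline target during normal operation would still be reported.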
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-10 20:31 ` Don Zickus
@ 2012-02-10 20:36 ` Peter Zijlstra
  2012-02-10 21:04 ` Don Zickus
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2012-02-10 20:36 UTC (permalink / raw)
  To: Don Zickus
  Cc: Srivatsa S. Bhat, Sasha Levin, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky

On Fri, 2012-02-10 at 15:31 -0500, Don Zickus wrote:
> So my second patch which I will eventually post will just skip the WARN_ON
> if the system is going down. Not sure if that is the proper way to address
> this problem or change all of the stop_this_cpu code to use a different
> bitmask than the cpu_online bitmask (but then you run the risk of a stuck
> IPI I guess if the cpu is halted without notifying anyone).

Yeah, the async hard kill of all cpus is bound to make problems.. what
I'm wondering is, why is this in the normal shutdown path and not
specific to a hard panic?

Trying to make this work is just not going to be pretty, and in the
panic case we really don't care much.

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-10 20:36 ` Peter Zijlstra
@ 2012-02-10 21:04 ` Don Zickus
  2012-03-23 10:47 ` Sasha Levin
  0 siblings, 1 reply; 15+ messages in thread
From: Don Zickus @ 2012-02-10 21:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Srivatsa S. Bhat, Sasha Levin, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky

On Fri, Feb 10, 2012 at 09:36:03PM +0100, Peter Zijlstra wrote:
> On Fri, 2012-02-10 at 15:31 -0500, Don Zickus wrote:
> > So my second patch which I will eventually post will just skip the WARN_ON
> > if the system is going down. Not sure if that is the proper way to address
> > this problem or change all of the stop_this_cpu code to use a different
> > bitmask than the cpu_online bitmask (but then you run the risk of a stuck
> > IPI I guess if the cpu is halted without notifying anyone).
>
> Yeah, the async hard kill of all cpus is bound to make problems.. what
> I'm wondering is, why is this in the normal shutdown path and not
> specific to a hard panic?

I didn't write the original code, I just changed it from REBOOT_IRQ to
NMI and left all the stop_this_cpu stuff alone.

>
> Trying to make this work is just not going to be pretty, and in the
> panic case we really don't care much.

Sure.

Cheers,
Don

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-02-10 21:04 ` Don Zickus
@ 2012-03-23 10:47 ` Sasha Levin
  2012-03-23 13:26 ` Don Zickus
  0 siblings, 1 reply; 15+ messages in thread
From: Sasha Levin @ 2012-03-23 10:47 UTC (permalink / raw)
  To: Don Zickus
  Cc: Peter Zijlstra, Srivatsa S. Bhat, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky

I'm just wondering about the status of the patches to fix this issue,
this is still happening on linux-next.

On Fri, Feb 10, 2012 at 11:04 PM, Don Zickus <dzickus@redhat.com> wrote:
> On Fri, Feb 10, 2012 at 09:36:03PM +0100, Peter Zijlstra wrote:
>> On Fri, 2012-02-10 at 15:31 -0500, Don Zickus wrote:
>> > So my second patch which I will eventually post will just skip the WARN_ON
>> > if the system is going down. Not sure if that is the proper way to address
>> > this problem or change all of the stop_this_cpu code to use a different
>> > bitmask than the cpu_online bitmask (but then you run the risk of a stuck
>> > IPI I guess if the cpu is halted without notifying anyone).
>>
>> Yeah, the async hard kill of all cpus is bound to make problems.. what
>> I'm wondering is, why is this in the normal shutdown path and not
>> specific to a hard panic?
>
> I didn't write the original code, I just changed it from REBOOT_IRQ to
> NMI and left all the stop_this_cpu stuff alone.
>
>>
>> Trying to make this work is just not going to be pretty, and in the
>> panic case we really don't care much.
>
> Sure.
>
> Cheers,
> Don

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-03-23 10:47 ` Sasha Levin
@ 2012-03-23 13:26 ` Don Zickus
  2012-04-05 20:38 ` Tony Luck
  0 siblings, 1 reply; 15+ messages in thread
From: Don Zickus @ 2012-03-23 13:26 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Peter Zijlstra, Srivatsa S. Bhat, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky

On Fri, Mar 23, 2012 at 12:47:38PM +0200, Sasha Levin wrote:
> I'm just wondering about the status of the patches to fix this issue,
> this is still happening on linux-next.

I got distracted with other stuff. I have been running code that does the
following in the shutdown path:

foreach_online_cpu
	cpu_down

but I get occasional hangs on reboot that I haven't gotten around to
debugging. I assumed this is the approach Peter was suggesting though I
don't think he was sure if it was going to be reliable.

Cheers,
Don

>
> On Fri, Feb 10, 2012 at 11:04 PM, Don Zickus <dzickus@redhat.com> wrote:
> > On Fri, Feb 10, 2012 at 09:36:03PM +0100, Peter Zijlstra wrote:
> >> On Fri, 2012-02-10 at 15:31 -0500, Don Zickus wrote:
> >> > So my second patch which I will eventually post will just skip the WARN_ON
> >> > if the system is going down. Not sure if that is the proper way to address
> >> > this problem or change all of the stop_this_cpu code to use a different
> >> > bitmask than the cpu_online bitmask (but then you run the risk of a stuck
> >> > IPI I guess if the cpu is halted without notifying anyone).
> >>
> >> Yeah, the async hard kill of all cpus is bound to make problems.. what
> >> I'm wondering is, why is this in the normal shutdown path and not
> >> specific to a hard panic?
> >
> > I didn't write the original code, I just changed it from REBOOT_IRQ to
> > NMI and left all the stop_this_cpu stuff alone.
> >
> >>
> >> Trying to make this work is just not going to be pretty, and in the
> >> panic case we really don't care much.
> >
> > Sure.
> >
> > Cheers,
> > Don

^ permalink raw reply [flat|nested] 15+ messages in thread
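The "foreach online cpu, cpu_down" approach mentioned above differs from the hard stop in that each CPU is taken down in an orderly way, so other subsystems can hear about it first. A toy user-space sketch of that ordering follows. All names (`Scheduler`, `cpu_going_down`, `cpu_down`, `shutdown_via_hotplug`) are hypothetical, skipping the CPU that runs the loop is an assumption, and the sketch says nothing about the reboot hangs reported in the message.

```python
class Scheduler:
    """Hypothetical stand-in for whatever tracks load-balance targets."""

    def __init__(self, online):
        self.balance_targets = set(online)

    def cpu_going_down(self, cpu):
        # Drop the CPU as a target before it actually goes offline.
        self.balance_targets.discard(cpu)


def cpu_down(online, cpu, sched):
    """Orderly offlining: notify first, then clear the online bit."""
    sched.cpu_going_down(cpu)
    online.discard(cpu)


def shutdown_via_hotplug(nr_cpus, boot_cpu=0):
    online = set(range(nr_cpus))
    sched = Scheduler(online)
    # sorted() returns a new list, so the set can be modified in the loop.
    for cpu in sorted(online):
        if cpu != boot_cpu:  # assumption: the CPU running the loop stays up
            cpu_down(online, cpu, sched)
    return online, sched.balance_targets
```

Because the notification precedes the state change, the set of balance targets never contains a CPU that is already offline, which is the property the hard-stop path lacks.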
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-03-23 13:26 ` Don Zickus
@ 2012-04-05 20:38 ` Tony Luck
  2012-06-01 13:36 ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Tony Luck @ 2012-04-05 20:38 UTC (permalink / raw)
  To: Don Zickus
  Cc: Sasha Levin, Peter Zijlstra, Srivatsa S. Bhat, Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha, Sergey Senozhatsky

A plain v3.3 kernel hits this when I just type "reboot" on a 32 cpu (2
socket * 8 core * 2 HT) system:

sd 0:0:0:0: [sda] Synchronizing SCSI cache
Restarting system.
machine restart
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:120 native_smp_send_reschedule+0x5c/0x60()
Hardware name: S2600CP
Modules linked in:
Pid: 10068, comm: reboot Not tainted 3.3.0- #1
Call Trace:
 <IRQ> [<ffffffff8104c37f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff8104c3da>] warn_slowpath_null+0x1a/0x20
 [<ffffffff810314bc>] native_smp_send_reschedule+0x5c/0x60
 [<ffffffff81084824>] trigger_load_balance+0x244/0x2f0
 [<ffffffff8107bf71>] scheduler_tick+0x101/0x160
 [<ffffffff8105b1de>] update_process_times+0x6e/0x90
 [<ffffffff8109e6a6>] tick_sched_timer+0x66/0xc0
 [<ffffffff81072d63>] __run_hrtimer+0x83/0x1d0
 [<ffffffff8109e640>] ? tick_nohz_handler+0xf0/0xf0
 [<ffffffff81073146>] hrtimer_interrupt+0x106/0x240
 [<ffffffff8157bf69>] smp_apic_timer_interrupt+0x69/0x99
 [<ffffffff8157ac1e>] apic_timer_interrupt+0x6e/0x80
 <EOI> [<ffffffff81033276>] ? default_send_IPI_mask_allbutself_phys+0xd6/0x110
 [<ffffffff81036397>] physflat_send_IPI_allbutself+0x17/0x20
 [<ffffffff81031849>] native_nmi_stop_other_cpus+0xa9/0x110
 [<ffffffff81030e24>] native_machine_shutdown+0x64/0x90
 [<ffffffff81030a97>] native_machine_restart+0x27/0x40
 [<ffffffff810309cf>] machine_restart+0xf/0x20
 [<ffffffff810637de>] kernel_restart+0x3e/0x60
 [<ffffffff810639e0>] sys_reboot+0x1c0/0x240
 [<ffffffff811734af>] ? __d_free+0x4f/0x70
 [<ffffffff8117353c>] ? d_free+0x6c/0x80
 [<ffffffff81174dbd>] ? d_kill+0xad/0x110
 [<ffffffff8117b603>] ? mntput+0x23/0x40
 [<ffffffff8115f9b7>] ? fput+0x197/0x260
 [<ffffffff8115ba63>] ? filp_close+0x63/0x90
 [<ffffffff8157a169>] system_call_fastpath+0x16/0x1b
---[ end trace 5e0dddabdbb21c7e ]---

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43()
  2012-04-05 20:38 ` Tony Luck
@ 2012-06-01 13:36 ` Borislav Petkov
  0 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2012-06-01 13:36 UTC
To: Tony Luck
Cc: Don Zickus, Sasha Levin, Peter Zijlstra, Srivatsa S. Bhat,
    Josh Boyer, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
    Avi Kivity, kvm, linux-kernel, x86, Suresh B Siddha,
    Sergey Senozhatsky

On Thu, Apr 05, 2012 at 01:38:41PM -0700, Tony Luck wrote:
> A plain v3.3 kernel hits this when I just type "reboot" on a 32 cpu (2
> socket * 8 core * 2 HT) system:

Same here on latest linus on a 24 CPU box right before the box reboots:

[ 6851.207504] ------------[ cut here ]------------
[ 6851.212340] WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule+0x2a/0x56()
[ 6851.220727] Hardware name: Dinar
[ 6851.224096] Modules linked in: kvm_amd kvm radeon microcode ttm drm_kms_helper hwmon amd64_edac_mod e1000e ohci_hcd backlight cfbcopyarea cfbimgblt cfbfillrect ehci_hcd edac_core
[ 6851.240585] Pid: 23976, comm: reboot Tainted: G W 3.4.0+ #15
[ 6851.247351] Call Trace:
[ 6851.249895]  <IRQ>  [<ffffffff8102f114>] warn_slowpath_common+0x85/0x9d
[ 6851.256753]  [<ffffffff8102f146>] warn_slowpath_null+0x1a/0x1c
[ 6851.262801]  [<ffffffff81019bbd>] native_smp_send_reschedule+0x2a/0x56
[ 6851.269573]  [<ffffffff8105e625>] trigger_load_balance+0x1ed/0x21a
[ 6851.275983]  [<ffffffff810582ac>] scheduler_tick+0xe9/0xf2
[ 6851.281671]  [<ffffffff8103cb91>] update_process_times+0x67/0x77
[ 6851.287900]  [<ffffffff81071323>] tick_sched_timer+0x72/0x91
[ 6851.293769]  [<ffffffff8104e5e5>] __run_hrtimer+0xc3/0x17f
[ 6851.299456]  [<ffffffff810712b1>] ? tick_nohz_handler+0xd1/0xd1
[ 6851.305593]  [<ffffffff8104eee1>] hrtimer_interrupt+0xd4/0x197
[ 6851.311642]  [<ffffffff8145e36a>] smp_apic_timer_interrupt+0x86/0x99
[ 6851.325636]  [<ffffffff8145d35c>] apic_timer_interrupt+0x6c/0x80
[ 6851.339424]  <EOI>  [<ffffffff811d68ea>] ? delay_tsc+0x23/0x50
[ 6851.353034]  [<ffffffff811d6849>] __delay+0xf/0x11
[ 6851.365561]  [<ffffffff811d6874>] __const_udelay+0x29/0x2b
[ 6851.378817]  [<ffffffff81019c8a>] native_stop_other_cpus+0x78/0x13d
[ 6851.392924]  [<ffffffff81019305>] native_machine_shutdown+0x53/0x6a
[ 6851.406992]  [<ffffffff81019347>] machine_shutdown+0xf/0x11
[ 6851.420438]  [<ffffffff810193ae>] native_machine_restart+0x25/0x37
[ 6851.434580]  [<ffffffff810193ea>] machine_restart+0xf/0x11
[ 6851.448039]  [<ffffffff81041b62>] kernel_restart+0x4e/0x52
[ 6851.461517]  [<ffffffff81041cc9>] sys_reboot+0x151/0x187
[ 6851.474818]  [<ffffffff81114ba6>] ? mntput_no_expire+0x31/0x105
[ 6851.488748]  [<ffffffff81114ca4>] ? mntput+0x2a/0x2c
[ 6851.501659]  [<ffffffff810fe66e>] ? fput+0x1e0/0x1ef
[ 6851.514551]  [<ffffffff811d7a0e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[ 6851.529021]  [<ffffffff8145c952>] system_call_fastpath+0x16/0x1b
[ 6851.543148] ---[ end trace 4eaa2a86a8e2da24 ]---

-- 
Regards/Gruss,
Boris.
Thread overview: 15+ messages
2012-02-09  1:31 WARNING: at arch/x86/kernel/smp.c:119 native_smp_send_reschedule+0x25/0x43() Sasha Levin
2012-02-09  0:59 ` Josh Boyer
2012-02-09 19:46 ` Sasha Levin
2012-02-10 10:06 ` Srivatsa S. Bhat
2012-02-10 18:58 ` Peter Zijlstra
2012-02-10 19:03 ` Peter Zijlstra
2012-02-10 20:02 ` Don Zickus
2012-02-10 20:18 ` Peter Zijlstra
2012-02-10 20:31 ` Don Zickus
2012-02-10 20:36 ` Peter Zijlstra
2012-02-10 21:04 ` Don Zickus
2012-03-23 10:47 ` Sasha Levin
2012-03-23 13:26 ` Don Zickus
2012-04-05 20:38 ` Tony Luck
2012-06-01 13:36 ` Borislav Petkov