From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753828Ab0K3W2J (ORCPT ); Tue, 30 Nov 2010 17:28:09 -0500 Received: from mx1.redhat.com ([209.132.183.28]:5872 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754848Ab0K3W2G (ORCPT ); Tue, 30 Nov 2010 17:28:06 -0500 From: Don Zickus To: Ingo Molnar Cc: Peter Zijlstra , Robert Richter , ying.huang@intel.com, Andi Kleen , gorcunov@gmail.com, LKML , Dongdong Deng , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org, Don Zickus Subject: [PATCH 7/9] x86: Avoid calling arch_trigger_all_cpu_backtrace() at the same time Date: Tue, 30 Nov 2010 17:27:28 -0500 Message-Id: <1291156050-4482-8-git-send-email-dzickus@redhat.com> In-Reply-To: <1291156050-4482-1-git-send-email-dzickus@redhat.com> References: <1291156050-4482-1-git-send-email-dzickus@redhat.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Dongdong Deng The spin_lock_debug/rcu_cpu_stall detector uses trigger_all_cpu_backtrace() to dump cpu backtrace. Therefore it is possible that trigger_all_cpu_backtrace() could be called at the same time on different CPUs, which triggers and 'unknown reason NMI' warning. The following case illustrates the problem: CPU1 CPU2 ... CPU N trigger_all_cpu_backtrace() set "backtrace_mask" to cpu mask | generate NMI interrupts generate NMI interrupts ... \ | / \ | / The "backtrace_mask" will be cleaned by the first NMI interrupt at nmi_watchdog_tick(), then the following NMI interrupts generated by other cpus's arch_trigger_all_cpu_backtrace() will be taken as unknown reason NMI interrupts. This patch uses a test_and_set to avoid the problem, and stop the arch_trigger_all_cpu_backtrace() from calling to avoid dumping a double cpu backtrace info when there is already a trigger_all_cpu_backtrace() in progress. Signed-off-by: Dongdong Deng Reviewed-by: Bruce Ashfield CC: Thomas Gleixner CC: Ingo Molnar CC: "H. Peter Anvin" CC: x86@kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Don Zickus --- arch/x86/kernel/apic/hw_nmi.c | 13 +++++++++++++ 1 files changed, 13 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c index ddb29bf..b36dd01 100644 --- a/arch/x86/kernel/apic/hw_nmi.c +++ b/arch/x86/kernel/apic/hw_nmi.c @@ -36,10 +36,20 @@ EXPORT_SYMBOL(touch_nmi_watchdog); /* For reliability, we're prepared to waste bits here. */ static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly; +/* "in progress" flag of arch_trigger_all_cpu_backtrace */ +static unsigned long backtrace_flag; + void arch_trigger_all_cpu_backtrace(void) { int i; + if (test_and_set_bit(0, &backtrace_flag)) + /* + * If there is already a trigger_all_cpu_backtrace() in progress + * (backtrace_flag == 1), don't output double cpu dump infos. + */ + return; + cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask); printk(KERN_INFO "sending NMI to all CPUs:\n"); @@ -51,6 +61,9 @@ void arch_trigger_all_cpu_backtrace(void) break; mdelay(1); } + + clear_bit(0, &backtrace_flag); + smp_mb__after_clear_bit(); } static int __kprobes -- 1.7.3.2