public inbox for linux-kernel@vger.kernel.org
* [PATCH] fix i386 NMI watchdog checking
@ 2006-10-13  3:22 Corey Minyard
  2006-10-17  1:48 ` Steven Rostedt
From: Corey Minyard @ 2006-10-13  3:22 UTC
  To: Linux Kernel


I was seeing the NMI watchdog test hang in check_nmi_watchdog() on an
SMP system when booting with nmi_watchdog=2.  It doesn't seem to
happen on a stock kernel; something else I was working on triggered
it.

This patch solves the problem.  I'm not sure it is quite the right
solution, but the local_irq_enable() is kind of pointless here, and
leaving scheduling enabled while the other CPUs are spinning with
interrupts off seems like a bad idea.  Disabling interrupts instead
is not an option, because you can't call smp_call_function() with
interrupts off.  Disabling preemption around the whole operation
seems to solve the problem.

Signed-off-by: Corey Minyard <minyard@acm.org>

Index: linux-2.6.18/arch/i386/kernel/nmi.c
===================================================================
--- linux-2.6.18.orig/arch/i386/kernel/nmi.c
+++ linux-2.6.18/arch/i386/kernel/nmi.c
@@ -134,12 +134,18 @@ static int __init check_nmi_watchdog(voi
 
 	printk(KERN_INFO "Testing NMI watchdog ... ");
 
+	/*
+	 * We must have preempt off while testing the local APIC
+	 * watchdog.  If we have an interrupt on this CPU while the
+	 * other CPUs are wedged, and that interrupt tries to schedule
+	 * (and possibly do an IPI), we would be hung.
+	 */
+	preempt_disable();
 	if (nmi_watchdog == NMI_LOCAL_APIC)
 		smp_call_function(nmi_cpu_busy, (void *)&endflag, 0, 0);
 
 	for_each_possible_cpu(cpu)
 		prev_nmi_count[cpu] = per_cpu(irq_stat, cpu).__nmi_count;
-	local_irq_enable();
 	mdelay((10*1000)/nmi_hz); // wait 10 ticks
 
 	for_each_possible_cpu(cpu) {
@@ -151,6 +157,7 @@ static int __init check_nmi_watchdog(voi
 #endif
 		if (nmi_count(cpu) - prev_nmi_count[cpu] <= 5) {
 			endflag = 1;
+			preempt_enable();
 			printk("CPU#%d: NMI appears to be stuck (%d->%d)!\n",
 				cpu,
 				prev_nmi_count[cpu],
@@ -162,6 +169,7 @@ static int __init check_nmi_watchdog(voi
 		}
 	}
 	endflag = 1;
+	preempt_enable();
 	printk("OK.\n");
 
 	/* now that we know it works we can reduce NMI frequency to
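
For anyone following the test logic visible in the hunk context, here is
a rough userspace sketch of the two calculations involved: the length of
the test window from mdelay((10*1000)/nmi_hz), and the "delta <= 5"
stuck check.  It is not kernel code.  The helper names
nmi_test_delay_ms() and first_stuck_cpu() and the NMI_STUCK_THRESHOLD
macro are made up for illustration, and plain arrays stand in for the
per-CPU counters.

```c
#include <stddef.h>

/* Threshold taken from the patch context: a CPU whose NMI count
 * advanced by 5 or fewer during the test window is treated as stuck.
 * The macro name is hypothetical. */
#define NMI_STUCK_THRESHOLD 5

/* Hypothetical helper: milliseconds needed to cover `ticks` watchdog
 * ticks at `nmi_hz` NMIs per second.  With ticks == 10 this is the
 * same integer expression as mdelay((10*1000)/nmi_hz). */
static unsigned int nmi_test_delay_ms(unsigned int ticks, unsigned int nmi_hz)
{
	return (ticks * 1000) / nmi_hz;
}

/* Hypothetical helper: return the index of the first CPU whose NMI
 * count did not advance by more than the threshold, or -1 if every
 * CPU looks healthy.  prev[] and now[] model the counts sampled
 * before and after the delay. */
static int first_stuck_cpu(const unsigned int *prev,
			   const unsigned int *now, size_t ncpus)
{
	size_t cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		/* Unsigned subtraction stays correct across wraparound. */
		if (now[cpu] - prev[cpu] <= NMI_STUCK_THRESHOLD)
			return (int)cpu;
	}
	return -1;
}
```

In the real function the early-return path and the normal path each
need their own preempt_enable(), which is why the patch adds it in two
places.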
