From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1422946AbWJQBsY (ORCPT ); Mon, 16 Oct 2006 21:48:24 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1422950AbWJQBsY (ORCPT ); Mon, 16 Oct 2006 21:48:24 -0400
Received: from ms-smtp-02.nyroc.rr.com ([24.24.2.56]:23748
	"EHLO ms-smtp-02.nyroc.rr.com") by vger.kernel.org with ESMTP
	id S1422946AbWJQBsX (ORCPT ); Mon, 16 Oct 2006 21:48:23 -0400
Subject: Re: [PATCH] fix i386 NMI watchdog checking
From: Steven Rostedt
To: minyard@acm.org
Cc: Linux Kernel, Andrew Morton, Ingo Molnar
In-Reply-To: <20061013032221.GA10816@localdomain>
References: <20061013032221.GA10816@localdomain>
Content-Type: text/plain
Date: Mon, 16 Oct 2006 21:48:13 -0400
Message-Id: <1161049693.24364.15.camel@localhost.localdomain>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2006-10-12 at 22:22 -0500, Corey Minyard wrote:
> I was having a problem with the NMI testing hanging on an SMP system
> in check_nmi_watchdog() when using nmi_watchdog=2. It doesn't seem to
> happen on a stock kernel, but I was working on something else and it
> triggered this problem.
>
> This patch solves the problem. I'm not sure this is quite the right
> solution, but I know that the local_irq_enable() is kind of pointless
> here and it seems that having scheduling on while the other CPUs are
> locked up with interrupts off is a bad idea. And you can't call
> smp_call_function() with interrupts off. But adding the preempt
> disable around this operation seems to solve the problem.

The patch makes sense. And you're right about the local_irq_enable.
Either it should be matched with a local_irq_disable (knowing that this
function is called with interrupts disabled), or it isn't needed. The
latter seems to be correct.
But this also shows something that bothers me.

>
> Signed-off-by: Corey Minyard
>
> Index: linux-2.6.18/arch/i386/kernel/nmi.c
> ===================================================================
> --- linux-2.6.18.orig/arch/i386/kernel/nmi.c
> +++ linux-2.6.18/arch/i386/kernel/nmi.c
> @@ -134,12 +134,18 @@ static int __init check_nmi_watchdog(voi
>
>  	printk(KERN_INFO "Testing NMI watchdog ... ");
>
> +	/*
> +	 * We must have preempt off while testing the local APIC
> +	 * watchdog. If we have an interrupt on this CPU while the
> +	 * other CPUs are wedged, and that interrupt tries to schedule
> +	 * (and possibly do an IPC), we would be hung.
> +	 */
> +	preempt_disable();
>  	if (nmi_watchdog == NMI_LOCAL_APIC)
>  		smp_call_function(nmi_cpu_busy, (void *)&endflag, 0, 0);
>
>  	for_each_possible_cpu(cpu)
>  		prev_nmi_count[cpu] = per_cpu(irq_stat, cpu).__nmi_count;
> -	local_irq_enable();
>  	mdelay((10*1000)/nmi_hz); // wait 10 ticks
>
>  	for_each_possible_cpu(cpu) {
> @@ -151,6 +157,7 @@ static int __init check_nmi_watchdog(voi
>  #endif
>  		if (nmi_count(cpu) - prev_nmi_count[cpu] <= 5) {
>  			endflag = 1;
> +			preempt_enable();
>  			printk("CPU#%d: NMI appears to be stuck (%d->%d)!\n",
>  				cpu,
>  				prev_nmi_count[cpu],
> @@ -162,6 +169,7 @@ static int __init check_nmi_watchdog(voi
>  		}
>  	}
>  	endflag = 1;

endflag is a local variable!!!

Yes, there's a printk and a kfree after it, but say we have printk
turned off (which is a possibility) and kfree happens to be quick. At
that same time, we have the other CPU (for a two-CPU system) running.
If the other CPU takes an interrupt (which is possible since interrupts
are turned on), it can miss the endflag set to one. That is, this
function could return, and a function called afterwards could reset the
location on the stack where endflag was to zero.

This race is very unlikely, since the system will continue with one
CPU, and eventually that location will probably be set.
I don't know if this can cause a deadlock or not, but it just seems
like bad practice to use a local variable to share across CPUs. Why not
just convert this to a global static? This function is only called
once.

Ah what the heck, patch below (on top of this patch). Only compile
tested (trivial enough).

-- Steve

> +	preempt_enable();
>  	printk("OK.\n");
>
>  	/* now that we know it works we can reduce NMI frequency to

Description:
Don't use a local stack variable to share across CPUs, even if it most
likely won't break anything. It's just wrong.

Signed-off-by: Steven Rostedt

Index: linux-2.6.18/arch/i386/kernel/nmi.c
===================================================================
--- linux-2.6.18.orig/arch/i386/kernel/nmi.c	2006-10-16 21:34:08.000000000 -0400
+++ linux-2.6.18/arch/i386/kernel/nmi.c	2006-10-16 21:45:13.000000000 -0400
@@ -119,9 +119,10 @@ static __init void nmi_cpu_busy(void *da
 }
 #endif
 
+static volatile int end_flag __initdata = 0;
+
 static int __init check_nmi_watchdog(void)
 {
-	volatile int endflag = 0;
 	unsigned int *prev_nmi_count;
 	int cpu;
 
@@ -142,7 +143,7 @@ static int __init check_nmi_watchdog(voi
 	 */
 	preempt_disable();
 	if (nmi_watchdog == NMI_LOCAL_APIC)
-		smp_call_function(nmi_cpu_busy, (void *)&endflag, 0, 0);
+		smp_call_function(nmi_cpu_busy, (void *)&end_flag, 0, 0);
 
 	for_each_possible_cpu(cpu)
 		prev_nmi_count[cpu] = per_cpu(irq_stat, cpu).__nmi_count;
@@ -156,7 +157,7 @@ static int __init check_nmi_watchdog(voi
 			continue;
 #endif
 		if (nmi_count(cpu) - prev_nmi_count[cpu] <= 5) {
-			endflag = 1;
+			end_flag = 1;
 			preempt_enable();
 			printk("CPU#%d: NMI appears to be stuck (%d->%d)!\n",
 				cpu,
@@ -168,7 +169,7 @@ static int __init check_nmi_watchdog(voi
 			return -1;
 		}
 	}
-	endflag = 1;
+	end_flag = 1;
 	preempt_enable();
 	printk("OK.\n");
 
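
For anyone who wants to see the lifetime issue outside the kernel, here
is a rough userspace sketch. It is not kernel code and it is
single-threaded on purpose: remote_view, remote_cpu_released(),
run_check() and unrelated_work() are made-up stand-ins for the pointer
handed to smp_call_function(), the other CPU peeking at the flag,
check_nmi_watchdog() and whatever gets called next. The point is only
that a flag with static storage is still a valid object after the
function that set it has returned and its stack frame has been reused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the pointer the other CPUs were handed. */
static volatile int *remote_view;

/* Static storage duration: the object outlives every call to run_check(). */
static volatile int end_flag;

/* Stand-in for the other CPU looking at the flag; returns 1 once released. */
static int remote_cpu_released(void)
{
	return remote_view != NULL && *remote_view != 0;
}

/* Stand-in for check_nmi_watchdog(): publish the flag, set it, return. */
static int run_check(void)
{
	end_flag = 0;
	remote_view = &end_flag;	/* the "other CPU" keeps this pointer */
	/* ... the watchdog test itself would run here ... */
	end_flag = 1;			/* release the "other CPU" */
	return 0;
}

/* Stand-in for the next function called, which reuses the stack and zeroes it. */
static int unrelated_work(void)
{
	volatile int scratch[16];
	int i, sum = 0;

	for (i = 0; i < 16; i++) {
		scratch[i] = 0;
		sum += scratch[i];
	}
	return sum;
}
```

Calling run_check(), then unrelated_work(), then remote_cpu_released()
still sees the flag set, because unrelated_work() cannot touch
end_flag. Had the flag been a local inside run_check(), remote_view
would be dangling by then, which is the situation described above.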