From: Steven Rostedt <rostedt@goodmis.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>,
    john stultz <johnstul@us.ibm.com>,
    LKML <linux-kernel@vger.kernel.org>
Subject: Re: 2.6.14-rc5-rt6 -- False NMI lockup detects
Date: Tue, 01 Nov 2005 12:41:51 -0500
Message-ID: <1130866911.29788.15.camel@localhost.localdomain>
In-Reply-To: <20051101113304.GB2871@elte.hu>
On Tue, 2005-11-01 at 12:33 +0100, Ingo Molnar wrote:
> * Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > Hi Ingo and Thomas,
> >
> > On some of my machines, I've been experiencing false NMI lockups.
> > This usually happens on slower machines, and taking a look into this,
> > it seems to be due to a short time where no processes are using
> > timers, and the ktimer interrupts aren't needed. So the APIC timer,
> > which now is used only for the ktimers, has a five second pause, and
> > causes the NMI to go off. The NMI uses the apic timer to determine
> > lockups.
> >
> > So, I added a more generic method. This only works for x86 for now,
> > but it has a #ifdef to keep other archs working until it implements
> > this as well. I added a nmi_irq_incr which is called by __do_IRQ in
> > the generic code. This is what is used in the NMI code to determine
> > if the CPU has locked up. This way we don't have to worry about what
> > resource we are using for timers.
>
> but e.g. the APIC timer doesnt go through do_IRQ(), it has its own
> special IRQ entry code. The simple solution would be to also include the
> IRQ#0 count in the NMI watchdog detection condition - i.e. something
> like the patch below. Hm?
>
> Ingo
>
> Index: linux/arch/i386/kernel/nmi.c
> ===================================================================
> --- linux.orig/arch/i386/kernel/nmi.c
> +++ linux/arch/i386/kernel/nmi.c
> @@ -521,7 +521,7 @@ void notrace nmi_watchdog_tick (struct p
>  	 */
>  	int sum, cpu = smp_processor_id();
>
> -	sum = per_cpu(irq_stat, cpu).apic_timer_irqs;
> +	sum = per_cpu(irq_stat, cpu).apic_timer_irqs + kstat_irqs(0);
>
>  	profile_tick(CPU_PROFILING, regs);
>  	if (nmi_show_regs[cpu]) {
:) I thought about doing that too, but I wanted a more generic solution.
I think I would have just put nmi_irq_incr in the APIC interrupt handler
as well. That way we might some day be able to pull the nmi_watchdog
detect code out of the arch-specific code altogether.
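
To make the idea concrete, here is a rough user-space sketch, not the
actual patch. Only the name nmi_irq_incr comes from this thread; the
struct, the nmi_watchdog_check() helper, NR_CPUS and the LOCKUP_TICKS
threshold are made up for illustration. Every interrupt entry path bumps
one per-CPU counter, and the watchdog only asks whether that counter has
moved since the last NMI, so it no longer cares which timer source is in
use.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS       4   /* hypothetical CPU count for the sketch */
#define LOCKUP_TICKS  5   /* hypothetical: watchdog ticks allowed with no progress */

/* Per-CPU bookkeeping; this layout is an assumption, not kernel code. */
struct nmi_cpu_state {
	unsigned int irq_count;   /* bumped on every interrupt, whatever its source */
	unsigned int last_count;  /* value seen at the previous watchdog tick */
	unsigned int stalled;     /* consecutive ticks without any change */
};

static struct nmi_cpu_state nmi_state[NR_CPUS];

/* Would be called from each interrupt entry path (generic IRQs and APIC timer). */
static void nmi_irq_incr(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	nmi_state[cpu].irq_count++;
}

/* Would be called on each watchdog NMI; returns true if the CPU looks stuck. */
static bool nmi_watchdog_check(int cpu)
{
	struct nmi_cpu_state *s;

	assert(cpu >= 0 && cpu < NR_CPUS);
	s = &nmi_state[cpu];

	if (s->irq_count != s->last_count) {
		/* Some interrupt was serviced since the last tick: the CPU is alive. */
		s->last_count = s->irq_count;
		s->stalled = 0;
		return false;
	}

	/* No interrupts at all since the last tick; count it toward a lockup. */
	return ++s->stalled >= LOCKUP_TICKS;
}
```

With this shape, a CPU that takes no interrupts for LOCKUP_TICKS
consecutive watchdog ticks is flagged, and a single call to
nmi_irq_incr() from any handler clears the suspicion.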
-- Steve