* [patch] fix NMI watchdog, 2.5.34
@ 2002-09-09 19:45 Ingo Molnar
2002-09-10 5:19 ` Rusty Russell
0 siblings, 1 reply; 4+ messages in thread
From: Ingo Molnar @ 2002-09-09 19:45 UTC
To: Linus Torvalds; +Cc: linux-kernel, Rusty Russell
the attached patch fixes the NMI watchdog to trigger on all CPUs - the
cpu_up() code broke it a long time ago. With this patch NMI interrupts get
generated on all CPUs, not just the boot CPU.
Ingo
--- linux/arch/i386/kernel/io_apic.c.orig Mon Sep 9 21:34:53 2002
+++ linux/arch/i386/kernel/io_apic.c Mon Sep 9 21:37:27 2002
@@ -1490,7 +1490,7 @@
 	end_lapic_irq
 };
-static void enable_NMI_through_LVT0 (void * dummy)
+void enable_NMI_through_LVT0 (void * dummy)
 {
 	unsigned int v, ver;
--- linux/arch/i386/kernel/smpboot.c.orig Mon Sep 9 21:35:48 2002
+++ linux/arch/i386/kernel/smpboot.c Mon Sep 9 21:43:00 2002
@@ -452,6 +452,11 @@
 	while (!test_bit(smp_processor_id(), &smp_commenced_mask))
 		rep_nop();
 	setup_secondary_APIC_clock();
+	if (nmi_watchdog == NMI_IO_APIC) {
+		disable_8259A_irq(0);
+		enable_NMI_through_LVT0(NULL);
+		enable_8259A_irq(0);
+	}
 	enable_APIC_timer();
 	/*
 	 * low-memory mappings have been cleared, flush them from
--- linux/include/asm-i386/apic.h.orig Mon Sep 9 21:37:43 2002
+++ linux/include/asm-i386/apic.h Mon Sep 9 21:37:51 2002
@@ -89,6 +89,7 @@
 extern unsigned int apic_timer_irqs [NR_CPUS];
 extern int check_nmi_watchdog (void);
+extern void enable_NMI_through_LVT0 (void * dummy);
 extern unsigned int nmi_watchdog;
 #define NMI_NONE 0
* Re: [patch] fix NMI watchdog, 2.5.34
2002-09-09 19:45 [patch] fix NMI watchdog, 2.5.34 Ingo Molnar
@ 2002-09-10 5:19 ` Rusty Russell
2002-09-10 18:11 ` Zwane Mwaikambo
0 siblings, 1 reply; 4+ messages in thread
From: Rusty Russell @ 2002-09-10 5:19 UTC
To: Ingo Molnar; +Cc: linux-kernel, torvalds
In message <Pine.LNX.4.44.0209092144140.10544-100000@localhost.localdomain> you
write:
>
> the attached patch fixes the NMI watchdog to trigger on all CPUs - the
> cpu_up() code broke it a long time ago. With this patch NMI interrupts get
> generated on all CPUs, not just the boot CPU.
Well spotted. You might want to test the following patch which
catches calls to smp_call_function() before the cpus are actually
online. I ran a variant on my (crappy, old, SMP) box before I sent
the patch to Linus, and all I saw was the (harmless) tlb_flush.
diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.34/arch/i386/kernel/smp.c working-2.5.34-smp_call_cpus/arch/i386/kernel/smp.c
--- linux-2.5.34/arch/i386/kernel/smp.c Wed Aug 28 09:29:40 2002
+++ working-2.5.34-smp_call_cpus/arch/i386/kernel/smp.c Tue Sep 10 14:50:15 2002
@@ -561,9 +561,15 @@ int smp_call_function (void (*func) (voi
  * hardware interrupt handler or from a bottom half handler.
  */
 {
+	extern int smp_done;
 	struct call_data_struct data;
 	int cpus = num_online_cpus()-1;
+	if (!smp_done) {
+		printk(KERN_ERR "smp_call_function %p called before SMP!\n",
+		       func);
+		show_stack(NULL);
+	}
 	if (!cpus)
 		return 0;
diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.34/arch/i386/kernel/smpboot.c working-2.5.34-smp_call_cpus/arch/i386/kernel/smpboot.c
--- linux-2.5.34/arch/i386/kernel/smpboot.c Sun Sep 1 12:22:57 2002
+++ working-2.5.34-smp_call_cpus/arch/i386/kernel/smpboot.c Tue Sep 10 14:35:07 2002
@@ -1218,7 +1218,10 @@ int __devinit __cpu_up(unsigned int cpu)
 	return 0;
 }
+unsigned int smp_done = 0;
+
 void __init smp_cpus_done(unsigned int max_cpus)
 {
 	zap_low_mappings();
+	smp_done = 1;
 }
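
Stripped of the kernel specifics, the check above is a guard flag: set once
when bring-up finishes, tested on every call, and only reporting (not
refusing) a call that arrives too early. A rough userspace sketch of that
idea follows. Everything in it is made up for illustration and none of it is
a kernel symbol: smp_done_sim, early_calls, cpus_done_sim(),
call_function_sim() and bump(). fprintf() stands in for printk() and
show_stack(), and the callback runs on the calling thread, whereas the real
smp_call_function() runs func on the other CPUs.

```c
#include <stdio.h>

/* Hypothetical stand-in for the smp_done flag added by the patch. */
static int smp_done_sim;

/* Number of calls seen before bring-up finished (illustration only). */
static int early_calls;

/* Stand-in for smp_cpus_done(): marks bring-up as finished. */
static void cpus_done_sim(void)
{
	smp_done_sim = 1;
}

/*
 * Stand-in for smp_call_function(). Like the patch, it only reports an
 * early call and then carries on. Simplification: func runs locally.
 */
static int call_function_sim(void (*func)(void *), void *info)
{
	if (!smp_done_sim) {
		fprintf(stderr, "call_function_sim called before bring-up finished!\n");
		early_calls++;
	}
	func(info);
	return 0;
}

/* Sample callback for trying the sketch: increments the int behind info. */
static void bump(void *info)
{
	++*(int *)info;
}
```

Calling call_function_sim(bump, &n) before cpus_done_sim() prints the warning
and increments early_calls; the same call afterwards is silent.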
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
* Re: [patch] fix NMI watchdog, 2.5.34
2002-09-10 5:19 ` Rusty Russell
@ 2002-09-10 18:11 ` Zwane Mwaikambo
2002-09-12 7:48 ` Rusty Russell
0 siblings, 1 reply; 4+ messages in thread
From: Zwane Mwaikambo @ 2002-09-10 18:11 UTC
To: Rusty Russell; +Cc: Ingo Molnar, linux-kernel, torvalds
On Tue, 10 Sep 2002, Rusty Russell wrote:
> Well spotted. You might want to test the following patch which
> catches calls to smp_call_function() before the cpus are actually
> online. I ran a variant on my (crappy, old, SMP) box before I sent
> the patch to Linus, and all I saw was the (harmless) tlb_flush.
hmm...
> diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.34/arch/i386/kernel/smpboot.c working-2.5.34-smp_call_cpus/arch/i386/kernel/smpboot.c
> --- linux-2.5.34/arch/i386/kernel/smpboot.c Sun Sep 1 12:22:57 2002
> +++ working-2.5.34-smp_call_cpus/arch/i386/kernel/smpboot.c Tue Sep 10 14:35:07 2002
> @@ -1218,7 +1218,10 @@ int __devinit __cpu_up(unsigned int cpu)
>  	return 0;
>  }
>
> +unsigned int smp_done = 0;
> +
>  void __init smp_cpus_done(unsigned int max_cpus)
>  {
>  	zap_low_mappings();
> +	smp_done = 1;
I've got an SMP box which dies reliably at zap_low_mappings, I wonder if
this could be the same problem. My BSP sits spinning on the completion
check.
Zwane
--
function.linuxpower.ca
* Re: [patch] fix NMI watchdog, 2.5.34
2002-09-10 18:11 ` Zwane Mwaikambo
@ 2002-09-12 7:48 ` Rusty Russell
0 siblings, 0 replies; 4+ messages in thread
From: Rusty Russell @ 2002-09-12 7:48 UTC
To: Zwane Mwaikambo; +Cc: mingo, linux-kernel, torvalds
On Tue, 10 Sep 2002 20:11:49 +0200 (SAST)
Zwane Mwaikambo <zwane@mwaikambo.name> wrote:
> On Tue, 10 Sep 2002, Rusty Russell wrote:
>
> > Well spotted. You might want to test the following patch which
> > catches calls to smp_call_function() before the cpus are actually
> > online. I ran a variant on my (crappy, old, SMP) box before I sent
> > the patch to Linus, and all I saw was the (harmless) tlb_flush.
>
> hmm...
>
> > diff -urNp --exclude TAGS -X /home/rusty/current-dontdiff --minimal linux-2.5.34/arch/i386/kernel/smpboot.c working-2.5.34-smp_call_cpus/arch/i386/kernel/smpboot.c
> > --- linux-2.5.34/arch/i386/kernel/smpboot.c Sun Sep 1 12:22:57 2002
> > +++ working-2.5.34-smp_call_cpus/arch/i386/kernel/smpboot.c Tue Sep 10 14:35:07 2002
> > @@ -1218,7 +1218,10 @@ int __devinit __cpu_up(unsigned int cpu)
> >  	return 0;
> >  }
> >
> > +unsigned int smp_done = 0;
> > +
> >  void __init smp_cpus_done(unsigned int max_cpus)
> >  {
> >  	zap_low_mappings();
> > +	smp_done = 1;
>
> I've got an SMP box which dies reliably at zap_low_mappings, I wonder if
> this could be the same problem. My BSP sits spinning on the completion
> check.
Hmmm, I can't see how: you mean it hangs in flush_tlb_all() (waiting for the
ack in smp_call_function())? If so, that seems really weird. You could add
a printk("here: %u\n", smp_processor_id()) in flush_tlb_all_ipi() to see which
CPU isn't getting it...
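
To make the suggestion concrete, here is a userspace sketch of the
bookkeeping, assuming one bit per CPU. flush_ipi_sim(), missing_cpus() and
acked_mask are hypothetical names for this illustration only; in the kernel
the printk would simply go into flush_tlb_all_ipi() as described above, and
printf() stands in for it here.

```c
#include <stdio.h>

/* Bitmask of simulated CPUs whose handler has run (hypothetical name). */
static unsigned long acked_mask;

/*
 * Stand-in for the IPI handler with the suggested printk added.
 * cpu must be smaller than the number of bits in unsigned long.
 */
static void flush_ipi_sim(unsigned int cpu)
{
	printf("here: %u\n", cpu);
	acked_mask |= 1UL << cpu;
}

/* Returns the CPUs in online_mask that never ran the handler. */
static unsigned long missing_cpus(unsigned long online_mask)
{
	return online_mask & ~acked_mask;
}
```

Any bit still set in missing_cpus() after the flush was requested names a
CPU that never printed its "here" line, which is the one to look at.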
Strange,
Rusty.
--
there are those who do and those who hang on and you don't see too
many doers quoting their contemporaries. -- Larry McVoy