linux-kernel.vger.kernel.org archive mirror
* [BUG] NULL pointer crash in early NMI handler
From: Steven Rostedt @ 2009-04-21  1:35 UTC (permalink / raw)
  To: LKML; +Cc: Ingo Molnar, Rusty Russell, H. Peter Anvin, Thomas Gleixner


I'm hitting this bug in the latest Linus tree:

[    0.161089] Setting APIC routing to flat
[    0.171346] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=0 pin2=0
[    0.180001] BUG: unable to handle kernel NULL pointer dereference at (null)
[    0.180001] IP: [<ffffffff8063f8a6>] nmi_watchdog_tick+0xd0/0x27d

[...]

[    0.180001] Call Trace:
[    0.180001]  <NMI> <0> [<ffffffff8063e9b7>] do_nmi+0x12e/0x3af
[    0.180001]  [<ffffffff8063e59a>] nmi+0x1a/0x2c
[    0.180001]  [<ffffffff806415c2>] ? add_preempt_count+0xdc/0x18b
[    0.180001]  <<EOE>> <0> [<ffffffff8040944c>] delay_tsc+0xa7/0x13b
[    0.180001]  [<ffffffff804092df>] __delay+0xf/0x11
[    0.180001]  [<ffffffff80409322>] __const_udelay+0x41/0x43
[    0.180001]  [<ffffffff80f18539>] timer_irq_works+0x4e/0xb0
[    0.180001]  [<ffffffff80f18ad4>] setup_IO_APIC+0x539/0xb26
[    0.180001]  [<ffffffff8041b840>] ? debug_smp_processor_id+0x38/0x170
[    0.180001]  [<ffffffff80226152>] ? setup_apic_nmi_watchdog+0xb8/0xdb
[    0.180001]  [<ffffffff80f14231>] native_smp_prepare_cpus+0x606/0x6be
[    0.180001]  [<ffffffff80f05a30>] kernel_init+0x56/0x1fc
[    0.180001]  [<ffffffff8020d7fa>] child_rip+0xa/0x20
[    0.180001]  [<ffffffff8020d1c0>] ? restore_args+0x0/0x30
[    0.180001]  [<ffffffff80f059da>] ? kernel_init+0x0/0x1fc
[    0.180001]  [<ffffffff8020d7f0>] ? child_rip+0x0/0x20


Looking at exactly where it crashed, the fault appears to be on the access
to the CPU mask variable backtrace_mask.

When the APIC routing is set to flat, something starts triggering the NMI
watchdog. This happens before we run check_nmi_watchdog(), which is what
allocates the backtrace_mask cpumask.

Yes, I have CONFIG_CPUMASK_OFFSTACK=y.

When I disable it, the box boots up fine.

-- Steve


* [PATCH] x86: fix boot crash in NMI watchdog with CONFIG_CPUMASK_OFFSTACK=y and flat APIC
From: Rusty Russell @ 2009-04-21  6:30 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, Ingo Molnar, H. Peter Anvin, Thomas Gleixner

fcef8576d8a64fc603e719c97d423f9f6d4e0e8b converted backtrace_mask to a
cpumask_var_t, and assumed check_nmi_watchdog was called before
nmi_watchdog_tick was ever called.  Steven's oops shows I was wrong.

This is something of a bandaid: I'm not sure we *should* be calling
nmi_watchdog_tick before check_nmi_watchdog.  Note that gcc eliminates
this test for the CONFIG_CPUMASK_OFFSTACK=n case.
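
To illustrate the ordering problem outside the kernel, here is a minimal
userspace sketch. Every sketch_* name is hypothetical and exists only for
this example; it models the CONFIG_CPUMASK_OFFSTACK=y case, where
cpumask_var_t is a pointer that stays NULL until it is allocated, and it
assumes a 64-CPU mask held in a single word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS 64

/* Hypothetical stand-in for struct cpumask: one bit per CPU. */
struct sketch_cpumask {
	unsigned long long bits;
};

/* Models cpumask_var_t with CONFIG_CPUMASK_OFFSTACK=y: a plain pointer. */
typedef struct sketch_cpumask *sketch_cpumask_var_t;

/* Static storage, so this is NULL until the allocation below runs. */
static sketch_cpumask_var_t sketch_backtrace_mask;

static bool sketch_test_cpu(unsigned int cpu, const struct sketch_cpumask *mask)
{
	assert(cpu < SKETCH_NR_CPUS);
	return (mask->bits >> cpu) & 1ULL;	/* dereferences mask */
}

/* Models the patched test: the NULL check short-circuits the dereference. */
static bool sketch_tick_wants_backtrace(unsigned int cpu)
{
	return sketch_backtrace_mask != NULL &&
	       sketch_test_cpu(cpu, sketch_backtrace_mask);
}

/* Models the allocation that happens later in boot; true on success. */
static bool sketch_check_watchdog(void)
{
	sketch_backtrace_mask = calloc(1, sizeof(*sketch_backtrace_mask));
	return sketch_backtrace_mask != NULL;
}

/* Marks a CPU as needing a backtrace; only valid after allocation. */
static void sketch_request_backtrace(unsigned int cpu)
{
	assert(sketch_backtrace_mask != NULL && cpu < SKETCH_NR_CPUS);
	sketch_backtrace_mask->bits |= 1ULL << cpu;
}
```

Calling sketch_tick_wants_backtrace() before sketch_check_watchdog() simply
returns false here, where the unpatched form would dereference NULL. With
CONFIG_CPUMASK_OFFSTACK=n, cpumask_var_t is a one-element array rather than
a pointer, so its address can never be NULL and the compiler can drop the
comparison.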

LKML Message-ID: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 arch/x86/kernel/apic/nmi.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/nmi.c b/arch/x86/kernel/apic/nmi.c
--- a/arch/x86/kernel/apic/nmi.c
+++ b/arch/x86/kernel/apic/nmi.c
@@ -414,7 +414,8 @@ nmi_watchdog_tick(struct pt_regs *regs, 
 		touched = 1;
 	}
 
-	if (cpumask_test_cpu(cpu, backtrace_mask)) {
+	/* We can be called before check_nmi_watchdog, hence NULL check. */
+	if (backtrace_mask != NULL && cpumask_test_cpu(cpu, backtrace_mask)) {
 		static DEFINE_SPINLOCK(lock);	/* Serialise the printks */
 
 		spin_lock(&lock);

* [PATCH] x86: avoid theoretical spurious NMI backtraces with CONFIG_CPUMASK_OFFSTACK=y
From: Rusty Russell @ 2009-04-21  6:33 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, Ingo Molnar, H. Peter Anvin, Thomas Gleixner

In theory (though not shown in practice) alloc_cpumask_var() doesn't zero
memory, so CPUs might print an "NMI backtrace for cpu %d" once on boot.

(Bug introduced in fcef8576d8a64fc603e719c97d423f9f6d4e0e8b).
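
As a rough userspace analogy for why the zeroing matters, here is a small
sketch. The sketch_* names and the SKETCH_ZERO flag are hypothetical
stand-ins (SKETCH_ZERO plays the role of __GFP_ZERO), and the 0xa5 fill is
only an assumption used to model stale memory deterministically; a real
non-zeroing allocation may return any contents, including all zeroes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ZERO 0x1u	/* hypothetical flag standing in for __GFP_ZERO */

/* Hypothetical stand-in for struct cpumask: one bit per CPU, 64 CPUs. */
struct sketch_cpumask {
	unsigned long long bits;
};

/*
 * Allocate a mask. Without SKETCH_ZERO the memory is filled with a fixed
 * poison pattern to model whatever stale data the allocator hands back.
 */
static struct sketch_cpumask *sketch_alloc_mask(unsigned int flags)
{
	struct sketch_cpumask *mask = malloc(sizeof(*mask));

	if (mask == NULL)
		return NULL;
	memset(mask, (flags & SKETCH_ZERO) ? 0x00 : 0xa5, sizeof(*mask));
	return mask;
}

/* Count the CPUs that would print a backtrace on their next tick. */
static unsigned int sketch_count_set(const struct sketch_cpumask *mask)
{
	unsigned long long bits;
	unsigned int count = 0;

	assert(mask != NULL);
	for (bits = mask->bits; bits != 0; bits >>= 1)
		count += (unsigned int)(bits & 1ULL);
	return count;
}
```

A mask from sketch_alloc_mask(0) reports some CPUs as set even though
nobody requested a backtrace, which corresponds to the spurious "NMI
backtrace for cpu %d" lines; a mask from sketch_alloc_mask(SKETCH_ZERO)
starts empty.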

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 arch/x86/kernel/apic/nmi.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/nmi.c b/arch/x86/kernel/apic/nmi.c
--- a/arch/x86/kernel/apic/nmi.c
+++ b/arch/x86/kernel/apic/nmi.c
@@ -138,7 +138,7 @@ int __init check_nmi_watchdog(void)
 	if (!prev_nmi_count)
 		goto error;
 
-	alloc_cpumask_var(&backtrace_mask, GFP_KERNEL);
+	alloc_cpumask_var(&backtrace_mask, GFP_KERNEL|__GFP_ZERO);
 	printk(KERN_INFO "Testing NMI watchdog ... ");
 
 #ifdef CONFIG_SMP

* [tip:x86/urgent] x86: fix boot crash in NMI watchdog with CONFIG_CPUMASK_OFFSTACK=y and flat APIC
From: tip-bot for Rusty Russell @ 2009-04-21  8:12 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, rusty, tglx, mingo

Commit-ID:  2f537a9f8e82f55c241b002c8cfbf34303b45ada
Gitweb:     http://git.kernel.org/tip/2f537a9f8e82f55c241b002c8cfbf34303b45ada
Author:     Rusty Russell <rusty@rustcorp.com.au>
AuthorDate: Tue, 21 Apr 2009 16:00:15 +0930
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 21 Apr 2009 10:09:49 +0200

x86: fix boot crash in NMI watchdog with CONFIG_CPUMASK_OFFSTACK=y and flat APIC

fcef8576d8a64fc603e719c97d423f9f6d4e0e8b converted backtrace_mask to a
cpumask_var_t, and assumed check_nmi_watchdog was called before
nmi_watchdog_tick was ever called.  Steven's oops shows I was wrong.

This is something of a bandaid: I'm not sure we *should* be calling
nmi_watchdog_tick before check_nmi_watchdog.  Note that gcc eliminates
this test for the CONFIG_CPUMASK_OFFSTACK=n case.

[ Impact: fix boot crash in rare configs ]

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 arch/x86/kernel/apic/nmi.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/apic/nmi.c b/arch/x86/kernel/apic/nmi.c
index d6bd624..2ba52f3 100644
--- a/arch/x86/kernel/apic/nmi.c
+++ b/arch/x86/kernel/apic/nmi.c
@@ -414,7 +414,8 @@ nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
 		touched = 1;
 	}
 
-	if (cpumask_test_cpu(cpu, backtrace_mask)) {
+	/* We can be called before check_nmi_watchdog, hence NULL check. */
+	if (backtrace_mask != NULL && cpumask_test_cpu(cpu, backtrace_mask)) {
 		static DEFINE_SPINLOCK(lock);	/* Serialise the printks */
 
 		spin_lock(&lock);

* [tip:x86/urgent] x86: avoid theoretical spurious NMI backtraces with CONFIG_CPUMASK_OFFSTACK=y
From: tip-bot for Rusty Russell @ 2009-04-21  8:12 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, rusty, tglx, mingo

Commit-ID:  fcc5c4a2feea3886dc058498b28508b2731720d5
Gitweb:     http://git.kernel.org/tip/fcc5c4a2feea3886dc058498b28508b2731720d5
Author:     Rusty Russell <rusty@rustcorp.com.au>
AuthorDate: Tue, 21 Apr 2009 16:03:41 +0930
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 21 Apr 2009 10:09:50 +0200

x86: avoid theoretical spurious NMI backtraces with CONFIG_CPUMASK_OFFSTACK=y

In theory (though not shown in practice) alloc_cpumask_var() doesn't zero
memory, so CPUs might print an "NMI backtrace for cpu %d" once on boot.

(Bug introduced in fcef8576d8a64fc603e719c97d423f9f6d4e0e8b).

[ Impact: avoid theoretical syslog noise in rare configs ]

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 arch/x86/kernel/apic/nmi.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/apic/nmi.c b/arch/x86/kernel/apic/nmi.c
index 2ba52f3..ce4fbfa 100644
--- a/arch/x86/kernel/apic/nmi.c
+++ b/arch/x86/kernel/apic/nmi.c
@@ -138,7 +138,7 @@ int __init check_nmi_watchdog(void)
 	if (!prev_nmi_count)
 		goto error;
 
-	alloc_cpumask_var(&backtrace_mask, GFP_KERNEL);
+	alloc_cpumask_var(&backtrace_mask, GFP_KERNEL|__GFP_ZERO);
 	printk(KERN_INFO "Testing NMI watchdog ... ");
 
 #ifdef CONFIG_SMP
