public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Len Brown <lenb@kernel.org>
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>,
	ak@suse.de, Linus Torvalds <torvalds@osdl.org>
Subject: [patch] x86_64: do not enable the NMI watchdog by default
Date: Thu, 7 Dec 2006 13:11:35 +0100	[thread overview]
Message-ID: <20061207121135.GA15529@elte.hu> (raw)
In-Reply-To: <200612061857.30248.len.brown@intel.com>


* Len Brown <len.brown@intel.com> wrote:

> Personally I have never been a big fan of having the NMI watchdog 
> running by default on all systems -- but Andi insists that it helps 
> him debug failures, so tick it does...

enabling it by default was IMO a really bad decision and it needs to be 
undone via the patch attached further below.

If Andi wants to debug stuff via the NMI wachdog, he should use the 
nmi_watchdog=2 boot option: next to the tty=ttyS0 serial console 
options, initcall_debug, apic=debug, earlyprintk and myriads of other 
kernel-hackers-only boot options that we all use to 'help debug 
failures' ...

also, lock debugging facilities catch lockup possibilities (and actual 
lockups) alot more efficiently, i cannot remember the last time the NMI 
watchdog caught anything on my boxes without some other facility not 
triggering first. (and i have it enabled everywhere)

I have run a quick analysis over all locking related crashes i triggered 
on one particular testbox over the past 2 weeks. Out of 26 separate lock 
related incidents spread out equally over that timeframe (out of 497 
bootups on this box), this was the distribution of debugging facilities 
that caught the bugs:

      1  BUG: spinlock lockup on
      1  [ INFO: inconsistent lock state ]
      2  BUG: scheduling while atomic
      2  [ BUG: bad unlock balance detected! ]
      2  BUG: sleeping function called from invalid context
      6  BUG: scheduling with irqs disabled
      5  [ INFO: hard-safe -> hard-unsafe lock order detected ]
      7  BUG: using smp_processor_id() in preemptible [] code

8 were caught by lockdep, 8 by atomicity checks in the scheduler, 7 by 
DEBUG_PREEMPT and 1 by DEBUG_SPINLOCK.

Note: zero were caught by the NMI watchdog, and i run the NMI watchdog 
enabled by default on all architectures, and i have serial logging of 
everything.

but even for the typical distro kernel and for the typical user, the NMI 
watchdog is normally useless, because NMI lockups rarely make it into 
the syslog and X just locks up without showing anything on the screen.

	Ingo

----------------------->
Subject: [patch] x86_64: do not enable the NMI watchdog by default
From: Ingo Molnar <mingo@elte.hu>

do not enable the NMI watchdog by default. Now that we have lockdep i 
cannot remember the last time it caught a real bug, but the NMI watchdog 
can /cause/ problems. Furthermore, to the typical user, an NMI watchdog 
assert results in a total lockup anyway (if under X). In that sense, all 
that the NMI watchdog does is that it makes the system /less/ stable and 
/less/ debuggable.

people can still enable it either after bootup via:

   echo 1 > /proc/sys/kernel/nmi

or via the nmi_watchdog=1 or nmi_watchdog=2 boot options.

build and boot tested on an Athlon64 box.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86_64/kernel/apic.c    |    1 -
 arch/x86_64/kernel/io_apic.c |    5 +----
 arch/x86_64/kernel/nmi.c     |    2 +-
 arch/x86_64/kernel/smpboot.c |    1 -
 include/asm-x86_64/nmi.h     |    1 -
 5 files changed, 2 insertions(+), 8 deletions(-)

Index: linux/arch/x86_64/kernel/apic.c
===================================================================
--- linux.orig/arch/x86_64/kernel/apic.c
+++ linux/arch/x86_64/kernel/apic.c
@@ -443,7 +443,6 @@ void __cpuinit setup_local_APIC (void)
 			oldvalue, value);
 	}
 
-	nmi_watchdog_default();
 	setup_apic_nmi_watchdog(NULL);
 	apic_pm_activate();
 }
Index: linux/arch/x86_64/kernel/io_apic.c
===================================================================
--- linux.orig/arch/x86_64/kernel/io_apic.c
+++ linux/arch/x86_64/kernel/io_apic.c
@@ -1604,7 +1604,6 @@ static inline void check_timer(void)
 		 */
 		unmask_IO_APIC_irq(0);
 		if (!no_timer_check && timer_irq_works()) {
-			nmi_watchdog_default();
 			if (nmi_watchdog == NMI_IO_APIC) {
 				disable_8259A_irq(0);
 				setup_nmi();
@@ -1630,10 +1629,8 @@ static inline void check_timer(void)
 		setup_ExtINT_IRQ0_pin(apic2, pin2, vector);
 		if (timer_irq_works()) {
 			apic_printk(APIC_VERBOSE," works.\n");
-			nmi_watchdog_default();
-			if (nmi_watchdog == NMI_IO_APIC) {
+			if (nmi_watchdog == NMI_IO_APIC)
 				setup_nmi();
-			}
 			return;
 		}
 		/*
Index: linux/arch/x86_64/kernel/nmi.c
===================================================================
--- linux.orig/arch/x86_64/kernel/nmi.c
+++ linux/arch/x86_64/kernel/nmi.c
@@ -181,7 +181,7 @@ static __cpuinit inline int nmi_known_cp
 }
 
 /* Run after command line and cpu_init init, but before all other checks */
-void nmi_watchdog_default(void)
+static inline void nmi_watchdog_default(void)
 {
 	if (nmi_watchdog != NMI_DEFAULT)
 		return;
Index: linux/arch/x86_64/kernel/smpboot.c
===================================================================
--- linux.orig/arch/x86_64/kernel/smpboot.c
+++ linux/arch/x86_64/kernel/smpboot.c
@@ -866,7 +866,6 @@ static int __init smp_sanity_check(unsig
  */
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
-	nmi_watchdog_default();
 	current_cpu_data = boot_cpu_data;
 	current_thread_info()->cpu = 0;  /* needed? */
 	set_cpu_sibling_map(0);
Index: linux/include/asm-x86_64/nmi.h
===================================================================
--- linux.orig/include/asm-x86_64/nmi.h
+++ linux/include/asm-x86_64/nmi.h
@@ -59,7 +59,6 @@ extern void disable_timer_nmi_watchdog(v
 extern void enable_timer_nmi_watchdog(void);
 extern int nmi_watchdog_tick (struct pt_regs * regs, unsigned reason);
 
-extern void nmi_watchdog_default(void);
 extern int setup_nmi_watchdog(char *);
 
 extern atomic_t nmi_active;

  parent reply	other threads:[~2006-12-07 12:13 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-06 22:30 [patch] ACPI, i686, x86_64: fix laptop bootup hang in init_acpi() Ingo Molnar
2006-12-06 23:57 ` Len Brown
2006-12-07 11:08   ` Ingo Molnar
2006-12-07 12:11   ` Ingo Molnar [this message]
2006-12-07 12:30     ` [patch] x86_64: do not enable the NMI watchdog by default Alan
2006-12-07 20:38       ` Andrew Morton
2006-12-07 20:47         ` Ingo Molnar
2006-12-07 20:49           ` Ingo Molnar
2006-12-07 20:55             ` [patch] net: dev_watchdog() locking fix Ingo Molnar
2006-12-07 21:06               ` Herbert Xu
2006-12-08 23:19                 ` Andrew Morton
2006-12-08 23:59                   ` Herbert Xu
2006-12-09 22:02                     ` David Miller
2006-12-11  7:45                       ` Andrew Morton
2006-12-11  7:51                         ` Herbert Xu
2006-12-11  7:56                           ` Ingo Molnar
2006-12-11 20:09                           ` Andrew Morton
2006-12-07 17:24     ` [patch] x86_64: do not enable the NMI watchdog by default Andi Kleen
2006-12-07  2:28 ` [patch] ACPI, i686, x86_64: fix laptop bootup hang in init_acpi() Sergio Monteiro Basto
2006-12-07  4:47   ` Karsten Wiese
2006-12-07 11:09     ` Ingo Molnar
2006-12-07 11:24       ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20061207121135.GA15529@elte.hu \
    --to=mingo@elte.hu \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox