public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] softdog - panic instead of reboot, kernel 2.6.22.9
@ 2007-10-10  6:42 sugaken.r3
  2007-10-10 12:15 ` Alan Cox
  0 siblings, 1 reply; 3+ messages in thread
From: sugaken.r3 @ 2007-10-10  6:42 UTC (permalink / raw)
  To: wim; +Cc: linux-kernel

>From Ken Sugawara <sugaken.r3@gmail.com>

Description: Add an option to softdog to panic upon timer expiration 
instead of reboot.

Signed-off-by: Ken Sugawara <sugaken.r3@gmail.com>
---
This patch is intended to help increase the chance of postmortem 
analysis in the event of silent system hang where it is impossible to 
initiate crash dump manually (e.g. no console or network response).  
We've had occasions where some of our customers had silent system hang 
problems in which they were unable to obtain vmcore for root cause 
analysis.  Provided that timer interrupts are not blocked, this patch 
might enable them to get a crash dump in such a situation by activating 
softdog.

--- linux-2.6.22.9/drivers/char/watchdog/softdog.c.orig	2007-09-27 03:03:01.000000000 +0900
+++ linux-2.6.22.9/drivers/char/watchdog/softdog.c	2007-10-04 12:06:28.000000000 +0900
@@ -49,6 +49,7 @@
 #include <linux/jiffies.h>
 
 #include <asm/uaccess.h>
+#include <linux/kernel.h>
 
 #define PFX "SoftDog: "
 
@@ -70,6 +71,11 @@ static int soft_noboot = 0;
 module_param(soft_noboot, int, 0);
 MODULE_PARM_DESC(soft_noboot, "Softdog action, set to 1 to ignore reboots, 0 to reboot (default depends on ONLY_TESTING)");
 
+static int panic_instead = 0;
+
+module_param(panic_instead, int, 0);
+MODULE_PARM_DESC(panic_instead, "Softdog action, set to 1 to panic instead of reboot, 0 to reboot (default 0)");
+
 /*
  *	Our timer
  */
@@ -93,8 +99,11 @@ static void watchdog_fire(unsigned long 
 
 	if (soft_noboot)
 		printk(KERN_CRIT PFX "Triggered - Reboot ignored.\n");
-	else
-	{
+	else if (panic_instead) {
+		printk(KERN_CRIT PFX "Initiating panic.\n");
+		panic("Watchdog timer expired.");
+		printk(KERN_CRIT PFX "Panic didn't ?????\n");
+	} else {
 		printk(KERN_CRIT PFX "Initiating system reboot.\n");
 		emergency_restart();
 		printk(KERN_CRIT PFX "Reboot didn't ?????\n");
@@ -262,7 +271,7 @@ static struct notifier_block softdog_not
 	.notifier_call	= softdog_notify_sys,
 };
 
-static char banner[] __initdata = KERN_INFO "Software Watchdog Timer: 0.07 initialized. soft_noboot=%d soft_margin=%d sec (nowayout= %d)\n";
+static char banner[] __initdata = KERN_INFO "Software Watchdog Timer: 0.07 initialized. soft_noboot=%d soft_margin=%d sec panic_instead=%d (nowayout= %d)\n";
 
 static int __init watchdog_init(void)
 {
@@ -290,7 +299,7 @@ static int __init watchdog_init(void)
 		return ret;
 	}
 
-	printk(banner, soft_noboot, soft_margin, nowayout);
+	printk(banner, soft_noboot, soft_margin, panic_instead, nowayout);
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] softdog - panic instead of reboot, kernel 2.6.22.9
  2007-10-10  6:42 [PATCH] softdog - panic instead of reboot, kernel 2.6.22.9 sugaken.r3
@ 2007-10-10 12:15 ` Alan Cox
  2007-10-11 10:25   ` Ken Sugawara
  0 siblings, 1 reply; 3+ messages in thread
From: Alan Cox @ 2007-10-10 12:15 UTC (permalink / raw)
  To: sugaken.r3@gmail.com; +Cc: wim, linux-kernel

> analysis.  Provided that timer interrupts are not blocked, this patch 
> might enable them to get a crash dump in such a situation by activating 
> softdog.

You want the NMI based anti lockup stuff for that. If the box locks up
then generally softdog will fail. The NMI based debug
(CONFIG_DETECT_SOFTLOCKUP) is the option you want.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] softdog - panic instead of reboot, kernel 2.6.22.9
  2007-10-10 12:15 ` Alan Cox
@ 2007-10-11 10:25   ` Ken Sugawara
  0 siblings, 0 replies; 3+ messages in thread
From: Ken Sugawara @ 2007-10-11 10:25 UTC (permalink / raw)
  To: Alan Cox; +Cc: wim, linux-kernel

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
>> analysis.  Provided that timer interrupts are not blocked, this patch 
>> might enable them to get a crash dump in such a situation by activating 
>> softdog.
>
>You want the NMI based anti lockup stuff for that. If the box locks up
>then generally softdog will fail. The NMI based debug
>(CONFIG_DETECT_SOFTLOCKUP) is the option you want.

Thanks for your comment.  I'll look into it, but unfortunately, most of 
our customers are running much older kernels (e.g. 2.6.9-based from 
RHEL4.x) which don't have the option, and some non-negligible number of 
them are running on non-IA platforms, whereas right now the option seems 
to be implemented only in i386 and x86_64 (correct me if I'm wrong).

Anyways, I'll try to look for a way to manually trigger NMI on their non
-IA boxes, preferably without any additional hardware.

Regards,

Ken

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2007-10-11 10:25 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-10-10  6:42 [PATCH] softdog - panic instead of reboot, kernel 2.6.22.9 sugaken.r3
2007-10-10 12:15 ` Alan Cox
2007-10-11 10:25   ` Ken Sugawara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox