From: George Anzinger <george@mvista.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Keith Owens <kaos@sgi.com>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] NMI watch dog notify patch
Date: Fri, 29 Jul 2005 13:55:23 -0700 [thread overview]
Message-ID: <42EA97BB.8020001@mvista.com> (raw)
In-Reply-To: <20050728221440.5a16d401.akpm@osdl.org>
[-- Attachment #1: Type: text/plain, Size: 1670 bytes --]
Andrew Morton wrote:
> Keith Owens <kaos@sgi.com> wrote:
>
>>>I had though that too, but it does not allow recovery (i.e. lets reset
>>
>> >the watchdog and try again).
>>
>> die_nmi() returns to nmi_watchdog_tick(), nmi_watchdog_tick does the
>> reset and continues. Patch below.
>>
>> >Hmm.. just looked at traps.c. Seems die_nmi is NOT called from the nmi
>> >trap, only from the watchdog. Also, there is a notify in the path to
>> >the other nmi stuff.
>>
>> I was looking at unknown_nmi_panic_callback(), which also calls
>> die_nmi().
>>
>> traps.c already has several notify_die() calls, nmi.c has none. It is
>> cleaner to keep all the notification in traps.c, with this small change
>> to nmi.c to cope with die_nmi() returning.
>>
>> Index: linux/arch/i386/kernel/nmi.c
>> ===================================================================
>> --- linux.orig/arch/i386/kernel/nmi.c 2005-07-28 17:22:06.735038510 +1000
>> +++ linux/arch/i386/kernel/nmi.c 2005-07-29 15:19:00.371196596 +1000
>> @@ -494,8 +494,10 @@ void nmi_watchdog_tick (struct pt_regs *
>> * wait a few IRQs (5 seconds) before doing the oops ...
>> */
>> alert_counter[cpu]++;
>> - if (alert_counter[cpu] == 5*nmi_hz)
>> + if (alert_counter[cpu] == 5*nmi_hz) {
>> die_nmi(regs, "NMI Watchdog detected LOCKUP");
>> + alert_counter[cpu] = 0;
>> + }
>> } else {
>> last_irq_sums[cpu] = sum;
>> alert_counter[cpu] = 0;
>
>
> That all makes sense - let's go that way?
Looks good to me. Trimed a bit more fat too. Here is the complete patch.
> -
-
George Anzinger george@mvista.com
HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/
[-- Attachment #2: nmi-notify.patch --]
[-- Type: text/plain, Size: 1804 bytes --]
Source: MontaVista Software, Inc. George Anzinger <george@mvista.com>
Type: Enhancement
Description:
This patch adds a notify to the die_nmi notify that the system
is about to be taken down. If the notify is handled with a
NOTIFY_STOP return, the system is given a new lease on life.
We also change the nmi watchdog to carry on if die_nmi returns.
This give debug code a chance to a) catch watchdog timeouts and
b) possibly allow the system to continue, realizing that
the time out may be due to debugger activities such as single
stepping which is usually done with "other" cpus held.
Signed-off-by: George Anzinger<george@mvista.com>
nmi.c | 5 ++++-
traps.c | 4 ++++
2 files changed, 8 insertions(+), 1 deletion(-)
Index: linux-2.6.13-rc/arch/i386/kernel/nmi.c
===================================================================
--- linux-2.6.13-rc.orig/arch/i386/kernel/nmi.c
+++ linux-2.6.13-rc/arch/i386/kernel/nmi.c
@@ -495,8 +495,11 @@ void nmi_watchdog_tick (struct pt_regs *
*/
alert_counter[cpu]++;
if (alert_counter[cpu] == 5*nmi_hz)
+ /*
+ * die_nmi will return ONLY if NOTIFY_STOP happens..
+ */
die_nmi(regs, "NMI Watchdog detected LOCKUP");
- } else {
+
last_irq_sums[cpu] = sum;
alert_counter[cpu] = 0;
}
Index: linux-2.6.13-rc/arch/i386/kernel/traps.c
===================================================================
--- linux-2.6.13-rc.orig/arch/i386/kernel/traps.c
+++ linux-2.6.13-rc/arch/i386/kernel/traps.c
@@ -555,6 +555,10 @@ static DEFINE_SPINLOCK(nmi_print_lock);
void die_nmi (struct pt_regs *regs, const char *msg)
{
+ if (notify_die(DIE_NMIWATCHDOG, "nmi_watchdog", regs,
+ 0, 0, SIGINT) == NOTIFY_STOP)
+ return;
+
spin_lock(&nmi_print_lock);
/*
* We are in trouble anyway, lets at least try
next prev parent reply other threads:[~2005-07-29 20:59 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-07-28 20:31 [PATCH] NMI watch dog notify patch George Anzinger
2005-07-28 23:13 ` Andrew Morton
2005-07-28 23:58 ` George Anzinger
2005-07-29 0:12 ` Andrew Morton
2005-07-29 1:50 ` Keith Owens
2005-07-29 4:16 ` George Anzinger
2005-07-29 5:09 ` Keith Owens
2005-07-29 5:14 ` Andrew Morton
2005-07-29 20:55 ` George Anzinger [this message]
2005-07-30 3:24 ` Keith Owens
2005-08-01 22:58 ` George Anzinger
2005-08-02 20:26 ` [PATCH] " George Anzinger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=42EA97BB.8020001@mvista.com \
--to=george@mvista.com \
--cc=akpm@osdl.org \
--cc=kaos@sgi.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox