From: Ingo Molnar <mingo@elte.hu>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Don Zickus <dzickus@redhat.com>,
"Mingarelli, Thomas" <Thomas.Mingarelli@hp.com>,
Andi Kleen <andi@firstfloor.org>,
Prarit Bhargava <prarit@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"arozansk@redhat.com" <arozansk@redhat.com>,
"ak@linux.intel.com" <ak@linux.intel.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
"H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
"Maciej W. Rozycki" <macro@linux-mips.org>,
"Eric W. Biederman" <ebiederm@xmission.com>
Subject: Re: [PATCH RFC] NMI Re-introduce un[set]_nmi_callback
Date: Fri, 5 Sep 2008 10:57:36 +0200 [thread overview]
Message-ID: <20080905085736.GD18485@elte.hu> (raw)
In-Reply-To: <20080904214628.GE4349@redhat.com>
* Vivek Goyal <vgoyal@redhat.com> wrote:
> > Prarit's patch disabled the timer upon registering a callback to
> > prevent this case. The thought was if you have your own handler you
> > could provide your own watchdog.
>
> In theory, you could do the same while registering the handler on die
> chain?
>
> I don't get the whole point. So we are looking for a system where no
> body else uses an NMI for any purpose and the moment NMI happens, this
> driver will go and panic() the system. I don't get, what do we achive
> by that?
>
> Looks like you got some device in platform which raises an NMI and
> which indicates that something is wrong and log the message and do a
> panic(). But you don't have a way to find out if that device has
> raised the interrupt or something else has raised the NMI, hence you
> want to stop the watchdog timer and assume any NMI henceforth is from
> device?
>
> So any functionality which is dependent on NMI, will not work as long
> as this driver is loaded.
exactly. The proposed patch brings the NMI notifier arms race to its
next level: it is in essence a "super high priority" notifier that wants
exclusivity. That is unnecessary: we already have sufficient
infrastructure in place to recognize the priority of notifiers, and die
notifiers make active use of it. (see the list below)
If the goal is to log relevant NMI events to NVRAM for later inspection
then the solution is to register a low-priority notifier which will
execute if no other notifier shows interest in an NMI event.
If the goal is to reboot the system if there's a "bad" NMI, then the
solution is to register a low-priority notifier which will execute and
panic the system if no other notifier shows interest in that NMI event.
This means all the 'good' NMI sources need to register at higher
priority and all the standard platform fallback notifiers need to
register at lowest priority. Debuggers go last. If any of the existing
notifiers in the kernel dont follow the rules and cause problems then we
can fix up those.
Here's a list of all current die notifiers we use in the kernel:
prio
----
# core kernel:
kernel/kprobes.c # instruction probing +INT_MAX
kernel/trace/trace.c # dump trace 150
# arch/x86:
arch/x86/mm/kmmio.c # mmiotrace, debug feature 0
arch/x86/kernel/kgdb.c # kgdb, kernel debugger -INT_MAX
arch/x86/kernel/crash.c # kdump 0
arch/x86/oprofile/nmi_int.c # oprofile 0
arch/x86/oprofile/nmi_timer_int.c # oprofile timer fallback 0
# drivers:
drivers/watchdog/hpwdt.c # HP watchdog driver +INT_MAX
drivers/char/ipmi/ipmi_watchdog.c # IPMI watchdog 150
drivers/misc/sgi-xp/xpc_main.c # SGI SN2 0
# upcoming:
arch/x86/mm/kmemcheck/smp.c # in tip/master 0
[ this should change to +INT_MAX ]
One change we could do is to move arch/x86/kernel/crash.c's priority to
-100. That would achieve the following effect: it would execute after
all known sources of NMIs, but before the lowest prio interactive
notifiers such as kgdb.
That way a platform driver could insert at priority -50 and execute
after the known self-generated NMI sources, but before any of the debug
and crash-tool notifiers. And the constants should be made symbolic as
well, so that we encode the purpose of a notifier, not the
implementation - and can map the purpose to priority and shape the
ordering flexibly.
furthermore, as i indicated it in my first reply, if the port 61H read
is causing problems on this platform (and i'd not be surprised if it did
so) we can turn it into a lowest-prio fallback notifier which only
executes if no known sources of NMIs show interest in an event.
Ingo
next prev parent reply other threads:[~2008-09-05 8:58 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-04 13:07 [PATCH RFC] NMI Re-introduce un[set]_nmi_callback Prarit Bhargava
2008-09-04 13:37 ` Peter Zijlstra
2008-09-04 14:29 ` Prarit Bhargava
2008-09-04 14:49 ` aris
2008-09-04 14:56 ` Ingo Molnar
2008-09-04 15:12 ` H. Peter Anvin
2008-09-04 15:18 ` Ingo Molnar
2008-09-04 15:52 ` Andi Kleen
2008-09-04 17:20 ` Don Zickus
2008-09-04 17:52 ` Andi Kleen
2008-09-04 18:26 ` Don Zickus
2008-09-04 18:47 ` Andi Kleen
2008-09-04 19:08 ` Vivek Goyal
2008-09-04 20:00 ` Andi Kleen
2008-09-04 20:01 ` Mingarelli, Thomas
2008-09-04 20:19 ` Andi Kleen
2008-09-04 20:21 ` Mingarelli, Thomas
2008-09-04 20:53 ` Andi Kleen
2008-09-04 21:22 ` Don Zickus
2008-09-04 20:57 ` Vivek Goyal
2008-09-04 21:05 ` Mingarelli, Thomas
2008-09-04 21:21 ` Vivek Goyal
2008-09-04 21:24 ` Don Zickus
2008-09-04 21:46 ` Vivek Goyal
2008-09-05 8:57 ` Ingo Molnar [this message]
2008-09-05 10:24 ` Ingo Molnar
2008-09-05 9:33 ` Ingo Molnar
2008-09-05 14:16 ` Vivek Goyal
2008-09-05 14:18 ` Andi Kleen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080905085736.GD18485@elte.hu \
--to=mingo@elte.hu \
--cc=Thomas.Mingarelli@hp.com \
--cc=ak@linux.intel.com \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=andi@firstfloor.org \
--cc=arozansk@redhat.com \
--cc=dzickus@redhat.com \
--cc=ebiederm@xmission.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=macro@linux-mips.org \
--cc=peterz@infradead.org \
--cc=prarit@redhat.com \
--cc=tglx@linutronix.de \
--cc=vgoyal@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox