public inbox for linux-kernel@vger.kernel.org
From: Robert Richter <robert.richter@amd.com>
To: huang ying <huang.ying.caritas@gmail.com>
Cc: Huang Ying <ying.huang@intel.com>,
	Don Zickus <dzickus@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH -v2 6/7] x86, NMI, Add support to notify hardware error with unknown NMI
Date: Mon, 27 Sep 2010 15:38:16 +0200
Message-ID: <20100927133816.GP13563@erda.amd.com>
In-Reply-To: <AANLkTimY5hr3DgN9=v83Z289MkzjgX8k0WzdTWZbGorj@mail.gmail.com>

On 27.09.10 08:47:53, huang ying wrote:

> >>  arch/x86/kernel/hwerr.c    |   55 +++++++++++++++++++++++++++++++++++++++++++++
> >
> > Instead of creating this file the code should be implemented in
> >
> >  arch/x86/kernel/cpu/intel.c
> >
> > Similar AMD NB code is implemented in amd.c and k8.c.
> 
> Why? This file is not vendor specific.

No, it only implements support for an Intel-specific PCI device,
nothing else.

> >> +late_initcall(check_unknown_nmi_for_hwerr);
> >
> > Maybe you can use early pci functions like read_pci_config() to avoid
> > late init.
> 
> I don't think late init is a big issue. Hardware errors are rare after all.

I just wanted to mention this as an option.
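
For illustration, here is a small user-space sketch of the address that
a type 1 PCI configuration access writes to port 0xCF8 before the data
is read from port 0xCFC, which is roughly what the early PCI helpers
do. The function name pci_conf1_address() is made up for this sketch
and is not a kernel API; the masking of slot, function and offset is an
assumption added here for safety.

```c
#include <stdint.h>

/*
 * Build the value written to the 0xCF8 address port for a type 1
 * configuration access: enable bit (bit 31), bus number, device
 * (slot) number, function number and the dword-aligned register
 * offset. Hypothetical helper, user-space illustration only.
 */
static uint32_t pci_conf1_address(uint8_t bus, uint8_t slot,
				  uint8_t func, uint8_t offset)
{
	return 0x80000000u |
	       ((uint32_t)bus << 16) |
	       ((uint32_t)(slot & 0x1f) << 11) |   /* 5-bit device number */
	       ((uint32_t)(func & 0x07) << 8) |    /* 3-bit function number */
	       (uint32_t)(offset & 0xfc);          /* dword-aligned register */
}
```

Because this access needs only port I/O and no PCI core data
structures, a check based on it does not have to wait for a late
initcall.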

> >> --- a/arch/x86/kernel/traps.c
> >> +++ b/arch/x86/kernel/traps.c
> >> @@ -83,6 +83,8 @@ EXPORT_SYMBOL_GPL(used_vectors);
> >>
> >>  static int ignore_nmis;
> >>
> >> +int unknown_nmi_for_hwerr;
> >
> > If it is an NMI for hwerr, it is no longer an unknown NMI. So we
> > should drop 'unknown' from the name.
> 
> I think an unknown NMI is one whose source we cannot identify.
> Something like anonymous.
> 
> >> +
> >>  /*
> >>   * Prevent NMI reason port (0x61) being accessed simultaneously, can
> >>   * only be used in NMI handler.
> >> @@ -360,6 +362,14 @@ io_check_error(unsigned char reason, str
> >>  static notrace __kprobes void
> >>  unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
> >>  {
> >> +     /*
> >> +      * On some platforms, hardware errors may be notified via
> >> +      * unknown NMI
> >> +      */
> >> +     if (unknown_nmi_for_hwerr)
> >> +             panic("NMI for hardware error without error record: "
> >> +                   "Not continuing");
> >> +
> >
> > Instead of checking this flag you should implement and register an nmi
> > handler for this case.
> 
> I think explicit function calls have better readability than notifier chains.

How is that different from unknown_nmi() then?

So no. In your case you want to catch unknown NMIs for a certain
piece of hardware and then panic. This should be implemented cleanly
in a separate handler for that hardware.

We want to clean up this code and throw out all hardware-specific
snippets, not introduce new special cases here.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center


