From: Robert Richter <robert.richter@amd.com>
To: huang ying <huang.ying.caritas@gmail.com>
Cc: Huang Ying <ying.huang@intel.com>,
	Don Zickus <dzickus@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH -v2 6/7] x86, NMI, Add support to notify hardware error with unknown NMI
Date: Mon, 27 Sep 2010 15:38:16 +0200
Message-ID: <20100927133816.GP13563@erda.amd.com>
In-Reply-To: <AANLkTimY5hr3DgN9=v83Z289MkzjgX8k0WzdTWZbGorj@mail.gmail.com>

On 27.09.10 08:47:53, huang ying wrote:

> >>  arch/x86/kernel/hwerr.c    |   55 +++++++++++++++++++++++++++++++++++++++++++++
> >
> > Instead of creating this file the code should be implemented in
> >
> >  arch/x86/kernel/cpu/intel.c
> >
> > Similar AMD NB code is implemented in amd.c and k8.c.
> 
> Why? This file is not vendor specific.

No, it only implements support for an Intel-specific PCI device,
nothing else.

> >> +late_initcall(check_unknown_nmi_for_hwerr);
> >
> > Maybe you can use early pci functions like read_pci_config() to avoid
> > late init.
> 
> I don't think late init is a big issue. Hardware error is rare after all.

I just wanted to point this out as an option.
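
For readers unfamiliar with early PCI access: helpers such as
read_pci_config() do not depend on the PCI subsystem being initialized,
because they use the legacy configuration mechanism #1 (an address
written to I/O port 0xCF8, data read from 0xCFC). The following is a
minimal userspace sketch of how that address is built, not the kernel's
code. The function name pci_conf1_address() and the masking of the low
offset bits are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build the value written to the CONFIG_ADDRESS port (0xCF8) for a
 * PCI configuration mechanism #1 access.
 *
 * Hypothetical helper for illustration; it performs no port I/O.
 */
static uint32_t pci_conf1_address(uint8_t bus, uint8_t slot,
                                  uint8_t func, uint8_t offset)
{
    /* A bus has at most 32 slots, a device at most 8 functions. */
    assert(slot < 32 && func < 8);

    return 0x80000000u |                 /* enable bit */
           ((uint32_t)bus  << 16) |      /* bus number */
           ((uint32_t)slot << 11) |      /* device (slot) number */
           ((uint32_t)func << 8)  |      /* function number */
           (offset & 0xFCu);             /* dword-aligned register */
}
```

An early reader would write this value to port 0xCF8 and then read the
32-bit register contents from port 0xCFC, which is why no late initcall
is needed for a simple device probe.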

> >> --- a/arch/x86/kernel/traps.c
> >> +++ b/arch/x86/kernel/traps.c
> >> @@ -83,6 +83,8 @@ EXPORT_SYMBOL_GPL(used_vectors);
> >>
> >>  static int ignore_nmis;
> >>
> >> +int unknown_nmi_for_hwerr;
> >
> > If it is an nmi for hwerr, it is no longer an unknown nmi. So we
> > should drop 'unknown' in the naming.
> 
> I think an unknown NMI is one whose source we cannot identify.
> Something like anonymous.
> 
> >> +
> >>  /*
> >>   * Prevent NMI reason port (0x61) being accessed simultaneously, can
> >>   * only be used in NMI handler.
> >> @@ -360,6 +362,14 @@ io_check_error(unsigned char reason, str
> >>  static notrace __kprobes void
> >>  unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
> >>  {
> >> +     /*
> >> +      * On some platforms, hardware errors may be notified via
> >> +      * unknown NMI
> >> +      */
> >> +     if (unknown_nmi_for_hwerr)
> >> +             panic("NMI for hardware error without error record: "
> >> +                   "Not continuing");
> >> +
> >
> > Instead of checking this flag you should implement and register an nmi
> > handler for this case.
> 
> I think explicit function calls have better readability than notifier chains.

How is that different from unknown_nmi() then?

So no, in your case you want to catch unknown nmis on certain
hardware and then throw a panic. This should be clearly implemented in
a separate handler for this piece of hardware.

We want to clean up this code and throw out all hardware-specific
snippets, not introduce new special cases here.
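
To illustrate the handler-based design argued for above, here is a
minimal userspace model of a priority-ordered handler chain. It is a
sketch under stated assumptions, not kernel code: struct nmi_handler,
register_handler(), dispatch_nmi() and hwerr_nmi_handler() are
hypothetical names, and the handler only counts the event where real
code would panic. The 0xc0 mask covers the two reason bits of port 0x61
(bit 7 and bit 6) that the generic path already decodes.

```c
#include <assert.h>
#include <stddef.h>

/* Return values modeled on the idea of "pass on" vs. "handled, stop". */
enum { HANDLER_PASS = 0, HANDLER_STOP = 1 };

struct nmi_handler {
    int (*fn)(unsigned char reason);
    int priority;                 /* higher priority runs first */
    struct nmi_handler *next;
};

static struct nmi_handler *chain; /* head of the sorted handler list */

/* Insert a handler, keeping the list sorted by descending priority. */
static void register_handler(struct nmi_handler *h)
{
    struct nmi_handler **pp = &chain;

    assert(h != NULL && h->fn != NULL);
    while (*pp && (*pp)->priority >= h->priority)
        pp = &(*pp)->next;
    h->next = *pp;
    *pp = h;
}

/* Call handlers in order until one of them claims the NMI. */
static int dispatch_nmi(unsigned char reason)
{
    for (struct nmi_handler *h = chain; h; h = h->next)
        if (h->fn(reason) == HANDLER_STOP)
            return HANDLER_STOP;
    return HANDLER_PASS;
}

static int hwerr_seen;            /* stands in for panic() in this model */

/*
 * Hypothetical hardware-specific handler: claims only NMIs whose
 * reason byte has neither bit 7 nor bit 6 set, i.e. "unknown" ones.
 */
static int hwerr_nmi_handler(unsigned char reason)
{
    if (reason & 0xc0)
        return HANDLER_PASS;      /* a known source, not ours */
    hwerr_seen++;
    return HANDLER_STOP;
}

/* Low priority so that handlers for known sources run first. */
static struct nmi_handler hwerr_nb = {
    .fn = hwerr_nmi_handler,
    .priority = -1,
};
```

In this model the platform code calls register_handler(&hwerr_nb) once
it has detected the device, and the generic path only calls
dispatch_nmi(). The generic code then needs no global flag and no
hardware-specific branch.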

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center

