From: Huang Ying <ying.huang@intel.com>
To: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andi Kleen <andi@firstfloor.org>,
	Robert Richter <robert.richter@amd.com>
Subject: Re: [PATCH -v3 5/6] x86, NMI, treat unknown NMI as hardware error
Date: Tue, 12 Oct 2010 09:10:21 +0800
Message-ID: <1286845821.7768.150.camel@yhuang-dev>
In-Reply-To: <20101011212006.GB23882@redhat.com>

On Tue, 2010-10-12 at 05:20 +0800, Don Zickus wrote:
> On Sat, Oct 09, 2010 at 02:49:46PM +0800, Huang Ying wrote:
> > In general, unknown NMIs are used by hardware and firmware to notify
> > the OS of fatal hardware errors. So Linux should treat an unknown NMI
> > as a hardware error and panic on it for better error containment.
> > 
> > But some broken hardware generates unknown NMIs that do not indicate
> > a hardware error. To support these machines, a white list mechanism
> > is provided so that unknown NMIs are treated as hardware errors only
> > on known working systems.
> > 
> > These systems are identified via the presence of APEI HEST or
> > certain PCI IDs of the host bridge. The PCI ID of the host bridge is
> > used instead of the DMI ID so that the check can be done based on the
> > platform type instead of the motherboard. This should be simpler and
> > sufficient.
> > 
> > The method to identify the platforms was designed by Andi Kleen.
> 
> I don't have any major problems with the other patches in the patch
> series.  In fact I would like to get them committed somewhere, so we can
> continue building on them.

Thanks.

> > @@ -366,6 +368,15 @@ unknown_nmi_error(unsigned char reason,
> >  	if (notify_die(DIE_NMIUNKNOWN, "nmi", regs, reason, 2, SIGINT) ==
> >  			NOTIFY_STOP)
> >  		return;
> > +	/*
> > +	 * On some platforms, hardware errors may be notified via
> > +	 * unknown NMI
> > +	 */
> > +	if (unknown_nmi_as_hwerr)
> > +		panic(
> > +		"NMI for hardware error without error record: Not continuing\n"
> > +		"Please check BIOS/BMC log for further information.");
> > +
> >  #ifdef CONFIG_MCA
> >  	/*
> >  	 * Might actually be able to figure out what the guilty party
> 
> The only quirk I have left is the above piece, which is basically a
> philosophical difference: Robert and I believe it should be on the
> die_chain, while Andi and you would like to see it explicitly called
> out.
> 
> If we move to a new notifier chain, like we discussed in another thread,
> would you guys be willing to move this into that new notifier chain or is
> your argument still going to stand?

I will probably not move this into the new notifier chain myself. If you
want to do that, feel free to pick up the patch and change it as you see
fit.
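
To make the two options concrete, here is a minimal user-space sketch in
C. It is not kernel code and not the actual patch: chain_register(),
chain_call(), hwerr_notifier() and the panicked flag are hypothetical
names made up for illustration, the NOTIFY_* values are simplified, and
panic() is only simulated by setting a flag. It assumes a chain that
runs handlers in descending priority order and stops at the first
NOTIFY_STOP.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified return codes; the real kernel values differ. */
#define NOTIFY_DONE 0
#define NOTIFY_STOP 1

struct notifier {
	int (*call)(unsigned char reason);
	int priority;			/* higher priority runs first */
	struct notifier *next;
};

static struct notifier *chain;		/* stand-in for die_chain */
static int unknown_nmi_as_hwerr;	/* white-list flag from the patch */
static int panicked;			/* set instead of calling panic() */

/* Insert in descending priority order; equal priorities keep
 * registration order. */
static void chain_register(struct notifier *n)
{
	struct notifier **p = &chain;

	assert(n && n->call);
	while (*p && (*p)->priority >= n->priority)
		p = &(*p)->next;
	n->next = *p;
	*p = n;
}

/* Walk the chain until a handler claims the event. */
static int chain_call(unsigned char reason)
{
	for (struct notifier *n = chain; n; n = n->next)
		if (n->call(reason) == NOTIFY_STOP)
			return NOTIFY_STOP;
	return NOTIFY_DONE;
}

/* Option A: the hardware-error check is itself a chain member. */
static int hwerr_notifier(unsigned char reason)
{
	(void)reason;
	if (!unknown_nmi_as_hwerr)
		return NOTIFY_DONE;
	panicked = 1;			/* simulated panic() */
	return NOTIFY_STOP;
}

/* Option B: the check is called out explicitly after the chain walk,
 * which is the shape of the hunk quoted above. */
static void unknown_nmi_explicit(unsigned char reason)
{
	if (chain_call(reason) == NOTIFY_STOP)
		return;
	if (unknown_nmi_as_hwerr)
		panicked = 1;		/* simulated panic() */
}
```

In option A the position of the check depends on the priority it is
registered with, so other handlers can be ordered around it. In option B
the check always runs after every chain handler has declined the NMI,
and the ordering is visible directly in unknown_nmi_error().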

Best Regards,
Huang Ying


