From: Cyrill Gorcunov <gorcunov@gmail.com>
To: huang ying <huang.ying.caritas@gmail.com>
Cc: Huang Ying <ying.huang@intel.com>, Ingo Molnar <mingo@elte.hu>,
	Don Zickus <dzickus@redhat.com>,
	linux-kernel@vger.kernel.org, Andi Kleen <andi@firstfloor.org>,
	Robert Richter <robert.richter@amd.com>,
	Andi Kleen <ak@linux.intel.com>
Subject: Re: [RFC] x86, NMI, Treat unknown NMI as hardware error
Date: Sat, 14 May 2011 11:51:47 +0400
Message-ID: <4DCE3493.4090404@gmail.com>
In-Reply-To: <BANLkTin7LNo0dDLs-Rz6WqCQ9RvS=MLXcw@mail.gmail.com>

On 05/14/2011 04:26 AM, huang ying wrote:
> On Fri, May 13, 2011 at 11:17 PM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
>> On 05/13/2011 12:23 PM, Huang Ying wrote:
>>> In general, unknown NMIs are used by hardware and firmware to notify
>>> the OS of fatal hardware errors. So Linux should treat an unknown NMI
>>> as a hardware error and panic on it for better error containment.
>>>
>>> But there are some legacy machines which randomly send unknown
>>> NMIs for no good reason.  To support these machines, a whitelist
>>> mechanism is provided so that an unknown NMI is treated as a hardware
>>> error only on systems known to work.
>>>
>>> These systems are identified via the presence of APEI HEST or
>>> the PCI ID of the host bridge. The PCI ID of the host bridge is used
>>> instead of the DMI ID, so that the check can be done based on the
>>> platform type instead of the motherboard. This should be simpler and
>>> sufficient.
>>>
>>> The method to identify the platforms is designed by Andi Kleen.
>>>
>>> Signed-off-by: Huang Ying <ying.huang@intel.com>
>>> Cc: Andi Kleen <ak@linux.intel.com>
>>> Cc: Don Zickus <dzickus@redhat.com>
>>> ---
>> ...
>>
>> Hi Ying,
>>
>> just curious (regardless of the concerns Don and Ingo have) -- if there is still a need
>> for such semi-unknown NMI handling, maybe it's worth registering a *notifier* for it,
>> so that we panic only when the user *explicitly* specifies how to treat this class of
>> NMIs (via, say, a "hest-nmi-panic" boot option or something like that). Maybe such a
>> partially modular scheme would be better? Unless I'm missing something.
> 
> Hi, Cyrill,
> 
> IMHO, pushing all policy to the user is not good either.  How many users
> understand unknown NMIs and hardware errors clearly?  It is better if we
> can determine what the right behavior is.
> 
> Best Regards,
> Huang Ying

  Hi Ying,

yes, that is not good either. But I think we *must* at least provide a way to turn this
new feature off via the command line. One reason for me is perf unknown NMIs (at the moment
we seem to have caught and cured all parasitic NMI sources, but there is no guarantee we
won't meet them in the future due to some code change or whatever). And bloating trap.c
with new if()'s is not that good I guess, which is why I asked if there is a way to do all
the work via notifiers ;)
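
To make the notifier idea a bit more concrete, here is a minimal user-space sketch in
plain C. It is not kernel code and not the patch under discussion: the sketch_* helpers,
struct nmi_notifier, hwerr_unknown_nmi() and the "panicked" flag are all hypothetical
names made up for illustration, and "hest-nmi-panic" is only the option name floated
earlier in this thread. The point is just that the hardware-error policy can live in a
separately registered handler that does nothing unless the option was given explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return codes, loosely modelled on the kernel's NOTIFY_* idea (hypothetical). */
enum { SKETCH_DONE = 0, SKETCH_STOP = 1 };

/* One entry in a priority-ordered handler chain (hypothetical type). */
struct nmi_notifier {
	int (*handler)(struct nmi_notifier *nb, unsigned char reason);
	int priority;
	struct nmi_notifier *next;
};

static struct nmi_notifier *unknown_nmi_chain;
static bool hest_nmi_panic;	/* off unless requested explicitly */
static bool panicked;		/* stand-in for a real panic() call */

/* Insert a handler, keeping the chain sorted by descending priority. */
static void sketch_register(struct nmi_notifier *nb)
{
	struct nmi_notifier **p = &unknown_nmi_chain;

	assert(nb && nb->handler);
	while (*p && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
}

/* Walk the chain until some handler claims the event. */
static int sketch_call_chain(unsigned char reason)
{
	struct nmi_notifier *nb;

	for (nb = unknown_nmi_chain; nb; nb = nb->next)
		if (nb->handler(nb, reason) == SKETCH_STOP)
			return SKETCH_STOP;
	return SKETCH_DONE;
}

/* Set the flag only if the option appears as a whole space-separated word. */
static void sketch_parse_cmdline(const char *cmdline)
{
	static const char opt[] = "hest-nmi-panic";
	const size_t len = sizeof(opt) - 1;
	const char *p = cmdline;

	hest_nmi_panic = false;
	while ((p = strstr(p, opt)) != NULL) {
		bool starts = (p == cmdline) || p[-1] == ' ';
		bool ends = p[len] == '\0' || p[len] == ' ';

		if (starts && ends) {
			hest_nmi_panic = true;
			return;
		}
		p += len;
	}
}

/* The policy handler: a no-op unless the user opted in. */
static int hwerr_unknown_nmi(struct nmi_notifier *nb, unsigned char reason)
{
	(void)nb;
	(void)reason;
	if (!hest_nmi_panic)
		return SKETCH_DONE;	/* leave it to the existing unknown-NMI path */
	panicked = true;		/* real code would panic here */
	return SKETCH_STOP;
}

static struct nmi_notifier hwerr_nb = {
	.handler = hwerr_unknown_nmi,
	.priority = 0,
};
```

With something like this, the caller registers hwerr_nb once, parses the command line,
and invokes sketch_call_chain() from the unknown-NMI path; without the option the chain
returns SKETCH_DONE and nothing changes for existing users.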

-- 
            Cyrill

Thread overview: 39+ messages
2011-05-13  8:23 [RFC] x86, NMI, Treat unknown NMI as hardware error Huang Ying
2011-05-13 12:45 ` Don Zickus
2011-05-13 13:00   ` Ingo Molnar
2011-05-13 13:24     ` huang ying
2011-05-13 15:20       ` Ingo Molnar
2011-05-13 16:00         ` Don Zickus
2011-05-16 11:29           ` Ingo Molnar
2011-05-16 19:19             ` Don Zickus
2011-05-17  8:50               ` Ingo Molnar
2011-05-17  7:41             ` Huang Ying
2011-05-17  8:53               ` Ingo Molnar
2011-05-19  6:44                 ` Huang Ying
2011-05-20 11:58                   ` Ingo Molnar
2011-05-14  0:56         ` huang ying
2011-05-13 13:17   ` huang ying
2011-05-13 13:51     ` Don Zickus
2011-05-14  0:20       ` huang ying
2011-05-14  4:11         ` Andi Kleen
2011-05-13 15:17 ` Cyrill Gorcunov
2011-05-14  0:26   ` huang ying
2011-05-14  7:51     ` Cyrill Gorcunov [this message]
2011-05-15  0:06       ` huang ying
2011-05-15  6:34         ` Cyrill Gorcunov
2011-05-16  1:09           ` Huang Ying
2011-05-16 19:03             ` Don Zickus
2011-05-16 19:53               ` Cyrill Gorcunov
2011-05-17  5:39               ` Huang Ying
2011-05-17 14:24                 ` Don Zickus
2011-05-17 16:38                   ` Andi Kleen
2011-05-17 17:57                     ` Don Zickus
2011-05-17 18:18                       ` Andi Kleen
2011-05-17 19:07                         ` Don Zickus
2011-05-20  8:13                           ` Huang Ying
2011-06-09 12:09                             ` Don Zickus
2011-06-09 15:22                               ` Cyrill Gorcunov
2011-06-13  1:34                               ` Huang Ying
2011-05-16 19:44             ` Cyrill Gorcunov
2011-05-17  7:32               ` Huang Ying
2011-05-14  0:47   ` huang ying
