public inbox for linux-kernel@vger.kernel.org
From: Andi Kleen <ak@linux.intel.com>
To: Russ Anderson <rja@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>
Subject: Re: x86/mce merge, integration hickup + crash, design thoughts
Date: Wed, 31 Dec 2008 14:32:07 +0100
Message-ID: <495B7457.1040605@linux.intel.com>
In-Reply-To: <20081230211310.GA19653@sgi.com>

Russ Anderson wrote:

> Summary ASCII information is useful, especially if the error
> is clearly a hardware error.  Andi is right that decoding the
> information to print the specific failing hardware (i.e. which
> DIMM) may be too difficult on the way down.  It would
> be great to identify the failing hardware component on the
> way down, when possible.

This will hopefully happen in the future. In fact mcelog has support
to decode using DMI tables, but it turns out this doesn't work
very well in practice (both because of BIOS problems and because
the DMI standard was not really designed for this). That is why
I wouldn't advocate moving this code from mcelog into the kernel
right now. This might change later.


>>> It turns out that users don't really find this more enlightening (most 
>>> users have no clue what a Northbridge is).  They think it's some kind of 
>>> kernel bug even with the HARDWARE ERROR header.
>> You should not assume that administrators/users reading kernel crash 
>> messages are dumb. (an ordinary user won't see it most of the time anyway)
>>
>> The usage pattern I see is that admins who get an MCE crash often fail to
>> write down the whole MCE message (not realizing that it is important) and 
>> have to go back and reproduce the MCE crash once again before they can get 
>> any meaningful information.
> 
> This is why saving the error records to NVRAM is so useful.
> After reboot the records can be read, formatted, and logged.

I've been looking at using EFI runtime services for this, but it's
also somewhat problematic (e.g. getting the messages out again in
a useful way without risking duplicate events).
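
To illustrate the duplication concern: if every persisted record carried
a sequence number and the reader kept a persistent "last seen" marker,
re-reading the store after a reboot could skip what was already logged.
The sketch below is only an illustration in plain C under those
assumptions. struct err_record, its seq field and collect_new_records()
are hypothetical names, not an existing kernel, EFI or mcelog interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical persisted error record; not a real kernel or EFI layout. */
struct err_record {
    uint64_t seq;     /* assumed: unique, increasing id set when the record is saved */
    uint64_t status;  /* raw error status, treated as opaque here */
};

/*
 * Copy every stored record newer than *last_seen into out[] and advance
 * *last_seen to the newest sequence number found.
 * out[] must have room for n records. Returns the number of records copied.
 */
size_t collect_new_records(const struct err_record *stored, size_t n,
                           uint64_t *last_seen, struct err_record *out)
{
    assert(stored != NULL && last_seen != NULL && out != NULL);

    uint64_t newest = *last_seen;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (stored[i].seq <= *last_seen)
            continue;               /* already reported on an earlier pass */
        out[count++] = stored[i];
        if (stored[i].seq > newest)
            newest = stored[i].seq;
    }

    *last_seen = newest;
    return count;
}
```

A second pass over the same store with the updated marker returns zero
records. The hard part is not shown: the marker itself has to be stored
persistently, and only after the records were really logged, otherwise
a crash in between can still lose or duplicate events.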

There are also a few other candidates like the Management Engine
interface on many Intel platforms, but it's also not available everywhere.

Short term, just changing the MCE panic to time out by default
is the best option. I'll probably submit that for .30.

-Andi
