linux-kernel.vger.kernel.org archive mirror
From: Mauro Carvalho Chehab <mchehab@redhat.com>
To: Borislav Petkov <bp@amd64.org>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
	Tony Luck <tony.luck@intel.com>, Ingo Molnar <mingo@elte.hu>,
	EDAC devel <linux-edac@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/3] mce: Add a msg string to the MCE tracepoint
Date: Wed, 29 Feb 2012 09:04:46 -0300
Message-ID: <4F4E145E.4040901@redhat.com>
In-Reply-To: <20120229101047.GA21224@aftab>

On 29-02-2012 07:10, Borislav Petkov wrote:
> On Wed, Feb 29, 2012 at 10:14:33AM +0900, Hidetoshi Seto wrote:
>> (2012/02/29 1:11), Borislav Petkov wrote:
>>> From: Borislav Petkov <borislav.petkov@amd.com>
>>>
>>> The idea here is to pass an additional decoded MCE message through
>>> the tracepoint and into the ring buffer for userspace to consume. The
>>> designated consumers are RAS daemons and other tools collecting RAS
>>> information.
>>
>> I could not catch the point... Why you need this msg field?
>>
>> I think that all of information about the error is already packed in
>> the record and that we can make a string from the bits in the record
>> soon afterward.  From my point of view it seems that what you are
>> doing here is just consuming the ring buffer by repeating same
>> contents in another format with dynamic length which might be short
>> but otherwise could be too long.

Not all information is packed in the record. The record packs only what is
inside the MCE registers. However, for certain errors, other hardware
registers must be parsed to decode the error (for example, on Sandy Bridge,
the MCE registers don't contain the affected DIMMs).
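
To make the limitation concrete, here is a minimal userspace sketch of what
can be recovered from a raw MCi_STATUS value alone. The flag bit positions
follow the architectural MCi_STATUS layout; the helper name status_to_str()
and its output format are assumptions made for illustration, not kernel or
mcelog code. Note that nothing in this value identifies a DIMM.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural MCi_STATUS flag bits. */
#define MCI_STATUS_VAL   (1ULL << 63)  /* register contents are valid */
#define MCI_STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR holds a valid address */
#define MCI_STATUS_PCC   (1ULL << 57)  /* processor context corrupt */

/*
 * Hypothetical helper: format the flags and the 16-bit MCA error code
 * (bits 15:0) into buf. Returns what snprintf() returns.
 */
int status_to_str(uint64_t status, char *buf, size_t len)
{
	return snprintf(buf, len,
			"valid=%d uc=%d addrv=%d pcc=%d mca_code=0x%04x",
			!!(status & MCI_STATUS_VAL),
			!!(status & MCI_STATUS_UC),
			!!(status & MCI_STATUS_ADDRV),
			!!(status & MCI_STATUS_PCC),
			(unsigned int)(status & 0xffff));
}
```

Mapping the reported address to a DIMM needs extra, platform-specific
registers that are not part of this value, which is the gap described above.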

> Right, to answer your immediate question: we've already decoded the MCE
> so we carry that decoded info to userspace.
> 
> To address your indirect question: why aren't we using the MCE fields
> to decode the MCE in userspace? Well, this has been a long discussion
> already and one of the strong arguments for decoding hardware errors in
> the kernel is that the kernel simply knows its hardware better. Imagine
> a big server farm with heterogeneous hw configurations - if you get an
> MCE there you have to also have collected the hardware platform details
> so that you are able to decode it. If the kernel can do that for ya, you
> don't have to do anything!
> 
> Or the case where you get an uncorrectable error and the machine panics
> - it is much more convenient to see the decoded error on the screen
> before the machine dies instead of some MCA register dumps which you
> have to jot down and go and decode them by hand.
> 
>> And one more unacceptable point is that filling this msg field is
>> expected to be done in machine check context where have many
>> limitations in kernel's subsystems such as use of memory allocators.
> 
> Doh, I should've seen that, thanks to you and Tony for pointing that
> out.
> 
>> Suggestion; How about having a kind of translator function for
>> userland, e.g. an exported function named mce_record_to_msg()?
>> Tool obtains raw data from the record in the tracepoint's ring buffer,
>> and if it likes, optionally it can pass the record to the translator
>> function to get some accomplished string.
> 
> Either that or I could simply allocate a large enough buffer from the
> get-go, as Tony suggests. I'll experiment with my MCE generation script
> and see how large a buffer can become.

Just allocate one page. 4096 bytes should be enough even for the hungriest
needs.
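
A minimal userspace sketch of that idea, assuming a single statically
reserved 4096-byte buffer and a bounded append helper, so that nothing is
allocated while an error is being handled. The names dec_buf_reset(),
dec_buf_printf() and dec_buf_get() are hypothetical; this is not the code
from the patch series, and it ignores locking and concurrent errors.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define DEC_BUF_SIZE 4096		/* one page */

/* Reserved up front: no allocator is called when an error is decoded. */
static char dec_buf[DEC_BUF_SIZE];
static size_t dec_len;

/* Start a new decoded message. */
void dec_buf_reset(void)
{
	dec_len = 0;
	dec_buf[0] = '\0';
}

/* Append formatted text; returns 0 on success, -1 if it was truncated. */
int dec_buf_printf(const char *fmt, ...)
{
	size_t room = DEC_BUF_SIZE - dec_len;	/* always at least 1 */
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(dec_buf + dec_len, room, fmt, ap);
	va_end(ap);

	if (n < 0)
		return -1;
	if ((size_t)n >= room) {
		/* Buffer is full; vsnprintf() kept it NUL-terminated. */
		dec_len = DEC_BUF_SIZE - 1;
		return -1;
	}
	dec_len += (size_t)n;
	return 0;
}

/* Read back the message accumulated so far. */
const char *dec_buf_get(void)
{
	return dec_buf;
}
```

A decoder would call dec_buf_reset() once per error, append pieces with
dec_buf_printf(), and hand dec_buf_get() to the tracepoint. An overlong
message gets truncated rather than overrunning the page.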

>>> Drop unneeded fields while at it, thus saving some room in the ring
>>> buffer.
>>
>> Really unneeded and should be killed?
> 
> Right, so this is me suggesting to remove those because I don't see
> why we'd need them, I'm expecting other people to come and say either
> "Boris, no no, this is needed in... " or "Yeah, go ahead and remove
> them, no one uses those." So feel free to argue either way.

IMHO, before removing those fields, it would be better to first port what
the mcelog userspace parser does for Intel machines into kernelspace (or at
least to look at its source code), and then check which registers are used
by neither the AMD64 MCE decoder nor the Intel MCE decoder.

Tony,

Is there anyone at Intel working on porting it to kernelspace?

Regards,
Mauro
