From: Borislav Petkov <bp@amd64.org>
To: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Borislav Petkov <bp@amd64.org>, Tony Luck <tony.luck@intel.com>,
Ingo Molnar <mingo@elte.hu>,
EDAC devel <linux-edac@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/3] mce: Add a msg string to the MCE tracepoint
Date: Wed, 29 Feb 2012 11:10:47 +0100
Message-ID: <20120229101047.GA21224@aftab>
In-Reply-To: <4F4D7BF9.9070104@jp.fujitsu.com>

On Wed, Feb 29, 2012 at 10:14:33AM +0900, Hidetoshi Seto wrote:
> (2012/02/29 1:11), Borislav Petkov wrote:
> > From: Borislav Petkov <borislav.petkov@amd.com>
> >
> > The idea here is to pass an additional decoded MCE message through
> > the tracepoint and into the ring buffer for userspace to consume. The
> > designated consumers are RAS daemons and other tools collecting RAS
> > information.
>
> I could not catch the point... Why you need this msg field?
>
> I think that all of information about the error is already packed in
> the record and that we can make a string from the bits in the record
> soon afterward. From my point of view it seems that what you are
> doing here is just consuming the ring buffer by repeating same
> contents in another format with dynamic length which might be short
> but otherwise could be too long.

Right, to answer your immediate question: we've already decoded the MCE
in the kernel, so we carry that decoded info to userspace.

To address your indirect question: why aren't we using the MCE fields
to decode the MCE in userspace? Well, this has been a long discussion
already and one of the strong arguments for decoding hardware errors in
the kernel is that the kernel simply knows its hardware better. Imagine
a big server farm with heterogeneous hw configurations - if you get an
MCE there, you also have to have collected the hardware platform details
in order to decode it. If the kernel can do that for ya, you
don't have to do anything!

Or take the case where you get an uncorrectable error and the machine
panics - it is much more convenient to see the decoded error on the
screen before the machine dies than some MCA register dumps which you
have to jot down and then decode by hand.

> And one more unacceptable point is that filling this msg field is
> expected to be done in machine check context where have many
> limitations in kernel's subsystems such as use of memory allocators.

Doh, I should've seen that. Thanks to you and Tony for pointing it
out.

> Suggestion; How about having a kind of translator function for
> userland, e.g. an exported function named mce_record_to_msg()?
> Tool obtains raw data from the record in the tracepoint's ring buffer,
> and if it likes, optionally it can pass the record to the translator
> function to get some accomplished string.

Either that or I could simply allocate a large enough buffer from the
get-go, as Tony suggests. I'll experiment with my MCE generation script
and see how large the decoded message can become.

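A minimal userspace sketch of the preallocated-buffer idea, assuming a
fixed-size buffer set up once so that building the message never calls an
allocator. The names dec_buf, dec_buf_reset and dec_buf_printf and the
256-byte size are hypothetical and not taken from the patch set; whether
this kind of formatting is acceptable in machine check context is a
separate question that the sketch does not answer.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed size; the real one would come from measuring decoded messages. */
#define DEC_BUF_SIZE 256

/* Hypothetical preallocated message buffer: no allocation after setup. */
struct dec_buf {
	char   data[DEC_BUF_SIZE];
	size_t len;        /* bytes used, excluding the terminating NUL */
	int    truncated;  /* set once an append did not fit */
};

/* Empty the buffer before decoding a new error. */
static void dec_buf_reset(struct dec_buf *b)
{
	b->len = 0;
	b->truncated = 0;
	b->data[0] = '\0';
}

/*
 * Append formatted text. On overflow the text is cut at the end of the
 * buffer and the truncated flag is set, so a consumer can tell that the
 * message is incomplete instead of silently losing the tail.
 */
static void dec_buf_printf(struct dec_buf *b, const char *fmt, ...)
{
	va_list ap;
	size_t room = sizeof(b->data) - b->len; /* always >= 1 (the NUL) */
	int n;

	va_start(ap, fmt);
	n = vsnprintf(b->data + b->len, room, fmt, ap);
	va_end(ap);

	if (n < 0)
		return;                          /* formatting error, keep len */

	if ((size_t)n >= room) {
		b->len = sizeof(b->data) - 1;    /* buffer is now full */
		b->truncated = 1;
	} else {
		b->len += (size_t)n;
	}
}
```

A caller would reset the buffer once per error, append the decoded pieces
with dec_buf_printf(), and then hand data and len to whatever copies the
string into the trace record. Running an MCE generation script against
such a buffer and checking the truncated flag is one way to find out
whether the chosen size is large enough.
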
> > Drop unneeded fields while at it, thus saving some room in the ring
> > buffer.
>
> Really unneeded and should be killed?

Right, so this is me suggesting that we remove those because I don't see
why we'd need them. I'm expecting other people to come and say either
"Boris, no no, this is needed in... " or "Yeah, go ahead and remove
them, no one uses those." So feel free to argue either way.

Thanks.

--
Regards/Gruss,
Boris.
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551