From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Borislav Petkov <bp@amd64.org>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>,
Tony Luck <tony.luck@intel.com>, Ingo Molnar <mingo@elte.hu>,
EDAC devel <linux-edac@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/3] mce: Add a msg string to the MCE tracepoint
Date: Thu, 01 Mar 2012 11:23:22 +0900
Message-ID: <4F4EDD9A.4050900@jp.fujitsu.com>
In-Reply-To: <20120229133741.GF21224@aftab>
(2012/02/29 22:37), Borislav Petkov wrote:
> On Wed, Feb 29, 2012 at 10:05:53AM -0300, Mauro Carvalho Chehab wrote:
>> On 29-02-2012 09:19, Borislav Petkov wrote:
>> - on SB, the MCE status register only has the error message. In order to get
>> the DIMM location, the driver needs to parse the registers that describe
>> how the DIMMs are organized (this is spread over dozens of PCI devices and
>> 200+ registers), and how they're interlaced, in order to convert the error
>> address reported by the MCA into a DIMM location.
>
> As I already said, amd64_edac already does a similar thing, so I
> don't see any difference in the solutions there: decode to the DIMM and
> pass the info through 'msg'.
My concern is: on Sandy Bridge, is it safe to gather information about the
DIMM location in machine check context within a reasonable time span?
I know that for corrected errors, which are handled in normal context, it is
safe to access the vast PCI configuration space...

Or is it really possible to determine the erroneous DIMM location from the OS?
It looks like how to get the location depends heavily on the hardware: the
processor's vendor/family/model, the firmware configuration, etc.
Even if the OS tells me "please replace the memory seated in slot#3 at node#5"
or so, I'm not sure whether these numbers are consistent across reboots if
there are hot-plugged nodes and/or memory. The order of numbering can
change depending on how the firmware enumerates the ACPI namespace or so...

Actually, these days we usually use the firmware's system event log to
determine which module should be replaced, assuming that the firmware knows
the hardware better than the OSes running on that machine.

Getting back to "msg": I think it is not necessary if it does not
contain any new data that is not available in the mce_record today.
If you just want to add a field for the physical memory location, I think
a string "msg" is not the only way to do so.
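
As a rough illustration of the alternative to a free-form string, here is a
minimal userspace sketch of a structured location. The struct name and its
fields (node, channel, slot) are hypothetical and are not part of the
mce_record or of the posted patches; the point is only that consumers could
filter or count on integer fields and format a string when a human needs one.

```c
#include <stdio.h>

/*
 * Hypothetical structured location (assumption for illustration only;
 * none of these fields exist in the mce_record today).
 * A negative value means "not known".
 */
struct mem_location {
	int node;
	int channel;
	int slot;
};

/*
 * Format the location for a human reader. Tools that want to filter or
 * count errors per DIMM can use the integer fields directly instead of
 * parsing a string. Returns the snprintf() result.
 */
static int mem_location_format(const struct mem_location *loc,
			       char *buf, size_t len)
{
	if (loc->node < 0 || loc->channel < 0 || loc->slot < 0)
		return snprintf(buf, len, "location unknown");

	return snprintf(buf, len, "node#%d channel#%d slot#%d",
			loc->node, loc->channel, loc->slot);
}
```

Whether such numbers stay stable across reboots is a separate question; the
sketch only concerns how the data would be carried.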
Thanks,
H.Seto