From: Borislav Petkov <bp@amd64.org>
To: Nils Carlson <nils.carlson@ludd.ltu.se>
Cc: Andi Kleen <andi@firstfloor.org>,
Doug Thompson <norsk5@yahoo.com>, Tony Luck <tony.luck@intel.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Ingo Molnar <mingo@elte.hu>,
Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
Mauro Carvalho Chehab <mchehab@redhat.com>,
BrentYoung <brent.young@intel.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"bluesmoke-devel@lists.sourceforge.net"
<bluesmoke-devel@lists.sourceforge.net>,
Doug Thompson <dougthompson@xmission.com>,
Joe Perches <joe@perches.com>,
Thomas Gleixner <tglx@linutronix.de>,
Linux Edac Mailing List <linux-edac@vger.kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Matt Domsch <Matt_Domsch@dell.com>,
Nils Carlson <nils.carlson@ericsson.com>
Subject: Re: Hardware Error Kernel Mini-Summit
Date: Tue, 15 Jun 2010 12:01:17 +0200
Message-ID: <20100615100117.GA17953@aftab>
In-Reply-To: <Pine.GSO.4.58.1006150951150.17579@dexter.ludd.ltu.se>
From: Nils Carlson <nils.carlson@ludd.ltu.se>
Date: Tue, Jun 15, 2010 at 04:06:33AM -0400
> On Tue, 15 Jun 2010, Andi Kleen wrote:
>
> > On Mon, Jun 14, 2010 at 04:46:40PM -0700, Doug Thompson wrote:
> >
> > Hi Doug,
> >
> > >
> > > Maybe I didn't see it covered (or I missed it), but EDAC is used on more than just x86 based machines, though they are the majority by volume. We should have an abstraction that covers all the archs, like we do with other subsystems of Linux.
> >
> > The way I envision it to working is that a abstracted dimm interface
> > (or edac2 or whatever you want to call it) can be fed from any reasonable
> > DIMM layout driver. This could be either DMI on x86 or some other
> > driver. There would be nothing really x86 specific about that.
>
> Could you maybe provide some references on how DIMM layout
> could be read from DMI? I can't find anything nearly this specific,
> or is it something we're expecting to happen in future BIOS's?
>
> Also, there would probably need to be some standard describing
> different DIMM layouts in general, though maybe such a thing exists.
>
> In other words, there would be have to be some way of ascertaining
> that the info you read from DMI is sufficient to decode MCEs so that
> a faulting DIMM can be identified. In an ideal world, this could
> be tested by some simple tool that could be run by the BIOS writers
> to test that they're providing the OS with sufficient info.
You cannot decode an ECC error to a DIMM using DMI info alone - at least
on AMD you cannot. The MCE contains the physical address where the ECC
error happened, and you need EDAC to convert this to a chip select row.
Additionally, depending on the addressing mode the DRAM controller uses,
you need the error syndrome.
Now, once you have the chip select row, you need to map it to a DIMM
rank, and in order to do that you need the DIMM info which is in the
SPD ROM (one of the fields in the SPD is the DIMM rank, which is needed
to unambiguously pinpoint which DIMM is generating those errors). Then
you can use the DMI info - assuming it contains the correct silk screen
labels on the motherboard - to map to a DIMM.
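To make the chain above concrete, here is a rough sketch of the three
lookups. It is only an illustration under invented assumptions: the
address ranges, the per-slot rank counts, the silk screen labels and all
function names are made up, csrows are assumed to be handed out to slots
in order, and the real first step depends on the DRAM controller's
addressing mode (and possibly the syndrome), which is not modelled here.

```python
# Hypothetical sketch: MCE physical address -> csrow -> DIMM slot -> label.
# Every table below is invented example data, not real hardware values.
from typing import NamedTuple, Optional


class CsRow(NamedTuple):
    index: int
    base: int   # first physical address covered (inclusive)
    limit: int  # last physical address covered (inclusive)


# Step 1 input: what an EDAC driver would derive from the memory controller.
CSROWS = [
    CsRow(0, 0x0000_0000, 0x3FFF_FFFF),
    CsRow(1, 0x4000_0000, 0x7FFF_FFFF),
    CsRow(2, 0x8000_0000, 0xBFFF_FFFF),
    CsRow(3, 0xC000_0000, 0xFFFF_FFFF),
]

# Step 2 input: ranks per DIMM slot, as would be read from each SPD ROM.
SPD_RANKS = {0: 2, 1: 2}  # slot -> number of ranks

# Step 3 input: silk screen labels, as DMI would ideally report them.
DMI_LABELS = {0: "DIMM_A1", 1: "DIMM_B1"}


def addr_to_csrow(addr: int) -> Optional[int]:
    """Find the chip select row whose address range contains addr."""
    for row in CSROWS:
        if row.base <= addr <= row.limit:
            return row.index
    return None


def csrow_to_slot(csrow: int) -> Optional[int]:
    """Assumes one csrow per rank, assigned to slots in ascending order."""
    first = 0
    for slot in sorted(SPD_RANKS):
        ranks = SPD_RANKS[slot]
        if first <= csrow < first + ranks:
            return slot
        first += ranks
    return None


def decode(addr: int) -> Optional[str]:
    """Return the silk screen label for addr, or None if it cannot be mapped."""
    csrow = addr_to_csrow(addr)
    if csrow is None:
        return None
    slot = csrow_to_slot(csrow)
    if slot is None:
        return None
    return DMI_LABELS.get(slot)
```

Note that in this sketch csrow 2 only resolves to the second slot because
the first DIMM is known to have two ranks; with a different rank count it
would resolve to a different slot, which is why the SPD data is needed
before the DMI labels are of any use.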
What EDAC currently does is decode the ECC error to a chip select - what
we still need is some I2C/SMBus code which can read the SPD ROM. I
haven't had the time to look into it yet, though.
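As a sketch of what would be done with the SPD contents once they can be
read, the following pulls the rank count out of a raw SPD image. It
assumes a DDR3 module (device type 0x0B in byte 2, rank count encoded in
bits 5:3 of byte 7) and that the bytes have already been obtained
somehow; the I2C/SMBus access itself is not shown, and the function name
is made up for illustration.

```python
# Sketch: extract the rank count from a raw DDR3 SPD image.
# How the bytes are read over I2C/SMBus is deliberately left out.
DDR3_DEVICE_TYPE = 0x0B  # SPD byte 2 value identifying DDR3 SDRAM

# Byte 7, bits 5:3 -> number of ranks. Other encodings are treated as
# reserved or unsupported by this sketch rather than guessed at.
_RANK_ENCODING = {0b000: 1, 0b001: 2, 0b010: 3, 0b011: 4}


def ddr3_ranks(spd: bytes) -> int:
    """Return the number of ranks described by a DDR3 SPD image."""
    if len(spd) < 8:
        raise ValueError("SPD image too short")
    if spd[2] != DDR3_DEVICE_TYPE:
        raise ValueError("not a DDR3 SPD image")
    code = (spd[7] >> 3) & 0x7
    if code not in _RANK_ENCODING:
        raise ValueError("rank encoding not handled: %#x" % code)
    return _RANK_ENCODING[code]
```

Other memory types keep the rank information in different bytes, so a
real implementation would have to dispatch on the device type first.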
--
Regards/Gruss,
Boris.
Operating Systems Research Center
Advanced Micro Devices, Inc.