public inbox for linux-kernel@vger.kernel.org
* [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64
@ 2009-04-29 16:54 Borislav Petkov
  2009-04-29 16:54 ` [PATCH 01/21] x86: add methods for writing of an MSR on several CPUs Borislav Petkov
                   ` (21 more replies)
  0 siblings, 22 replies; 70+ messages in thread
From: Borislav Petkov @ 2009-04-29 16:54 UTC (permalink / raw)
  To: akpm, greg
  Cc: mingo, tglx, hpa, dougthompson, linux-kernel, Borislav Petkov,
	Doug Thompson

Hi,

thanks to all reviewers of the previous submission, here is the second
version of this series.

Highlights are the addition of two helpers to read/write MSRs on several
CPUs, denoted by a cpumask and using an array of MSR values per-CPU, as
Peter suggested. Since IMHO they look generic enough I've added them to
arch/x86/lib/msr-on-cpu.c (now renamed to msr.c).

Moreover, I've addressed all the issues raised from the previous series.
Please let me know should there be anything else remaining.

Thanks,
Boris.

 arch/x86/include/asm/msr.h |   11 +
 arch/x86/lib/Makefile      |    2 +-
 arch/x86/lib/msr-on-cpu.c  |   97 -
 arch/x86/lib/msr.c         |  151 ++
 drivers/edac/Kconfig       |   26 +
 drivers/edac/Makefile      |    1 +
 drivers/edac/amd64_edac.c  | 5385 ++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 5575 insertions(+), 98 deletions(-)


* Re: [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64
@ 2009-04-30 14:23 Doug Thompson
  0 siblings, 0 replies; 70+ messages in thread
From: Doug Thompson @ 2009-04-30 14:23 UTC (permalink / raw)
  To: Andi Kleen, Borislav Petkov
  Cc: akpm, greg, mingo, tglx, hpa, dougthompson, linux-kernel



W1DUG


--- On Thu, 4/30/09, Borislav Petkov <borislav.petkov@amd.com> wrote:

> From: Borislav Petkov <borislav.petkov@amd.com>
> Subject: Re: [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64
> To: "Andi Kleen" <andi@firstfloor.org>
> Cc: akpm@linux-foundation.org, greg@kroah.com, mingo@elte.hu, tglx@linutronix.de, hpa@zytor.com, dougthompson@xmission.com, linux-kernel@vger.kernel.org
> Date: Thursday, April 30, 2009, 5:57 AM
> Hi,
> 
> On Wed, Apr 29, 2009 at 09:30:31PM +0200, Andi Kleen wrote:
> > Borislav Petkov <borislav.petkov@amd.com> writes:
> > 
> > > Hi,
> > >
> > > thanks to all reviewers of the previous submission, here is the second
> > > version of this series.
> > 
> > The classic problem of the previous versions of these patches was that
> > they consume the same error registers (even if using pci config versus
> > msrs as access methods) as the kernel machine check poll/threshold
> > interrupt code.

Even AMD's recommendation of a polling thread for CORRECTABLE errors races on the same error registers, because an MCE is an exception and cannot be deferred or blocked. An MCE could fire in the middle of any poll cycle and touch the same registers. The window is small, but it is present.

> > And with two logging agents racing on the same
> > registers you will always get junk results. Typically with threshold
> > enabled the mce code wins the race. I suspect this patchkit has
> > exactly the same fundamental design problem. EDAC really is not
> > particularly fitting for integrated memory controllers that report
> > their errors using standard machine check events.
> 
> ok, how about we remove the MSR/PCI cfg space reading bits and leave
> that task solely to the mce core. Then, iff you have edac turned on in
> Kconfig, mce code delivers needed error info to edac which, in turn,
> goes and decodes the error/does the mapping to DIMM blocks/supplies DRAM
> error injection facility for testing purposes and similar things. That
> way you have both and they don't overlap in functionality.

Adding the synchronization between the two is very doable. It is not yet in the current patch set, but a work in progress.

That is the solution we are pursuing: a mechanism for communication between MCE and EDAC, with EDAC providing the mapping to a DIMM label. When the MCA exception fires, the handler retrieves the info and calls the EDAC module for address mapping.

The MCE polling handler likewise calls the EDAC module for address mapping.

EDAC's basic model is a polling operation on the error registers at a 1 second (tunable) rate. 

AMD's manual describes the UNCORRECTABLE MEMORY error handling via the MCE handler.  It further recommends a polling thread to harvest CORRECTABLE MEMORY errors. Last time I checked the MCE poller was running on a 5 minute poll cycle.

That is where the problem lies: 2 different threads polling the same error registers without synchronization. A "Listener" pattern can be created there to provide callbacks for both, or the two can be merged into a single poller operation.
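As a rough illustration of the "Listener" idea, here is a minimal user-space sketch in C. It is not code from the patch set: `struct err_record`, `err_listener_register()`, `err_listener_unregister()` and `err_dispatch()` are hypothetical names chosen for this example. The assumption is that exactly one poller reads the error registers and then hands each harvested record to every registered listener, so MCE logging and EDAC decoding never touch the hardware independently. Locking is left out on purpose.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of one harvested error; not the kernel's struct mce. */
struct err_record {
	int      cpu;
	int      bank;
	uint64_t status;
	uint64_t addr;
};

typedef void (*err_listener_fn)(const struct err_record *rec, void *ctx);

#define MAX_LISTENERS 4

static struct {
	err_listener_fn fn;
	void *ctx;
} listeners[MAX_LISTENERS];

/* Add a listener. Returns 0 on success, -1 if fn is NULL or the table is full. */
int err_listener_register(err_listener_fn fn, void *ctx)
{
	if (!fn)
		return -1;
	for (size_t i = 0; i < MAX_LISTENERS; i++) {
		if (!listeners[i].fn) {
			listeners[i].fn = fn;
			listeners[i].ctx = ctx;
			return 0;
		}
	}
	return -1;
}

/* Remove the listener that was registered with this fn/ctx pair, if any. */
void err_listener_unregister(err_listener_fn fn, void *ctx)
{
	for (size_t i = 0; i < MAX_LISTENERS; i++) {
		if (listeners[i].fn == fn && listeners[i].ctx == ctx) {
			listeners[i].fn = NULL;
			listeners[i].ctx = NULL;
			return;
		}
	}
}

/*
 * Called by the single poller after it has read the registers once.
 * Every listener sees the same record. Returns the number of listeners called.
 */
int err_dispatch(const struct err_record *rec)
{
	int delivered = 0;

	for (size_t i = 0; i < MAX_LISTENERS; i++) {
		if (listeners[i].fn) {
			listeners[i].fn(rec, listeners[i].ctx);
			delivered++;
		}
	}
	return delivered;
}

/* Example listener: counts records in the unsigned int that ctx points to. */
void count_errors(const struct err_record *rec, void *ctx)
{
	(void)rec;
	(*(unsigned *)ctx)++;
}
```

An in-kernel version would also need protection against a listener unregistering while a dispatch is in flight, which this sketch does not attempt.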

> 
> By the way, I think there's a similar attempt/proposal of letting mce
> and edac talk to each other from Red Hat so I think this could be a
> viable thing to try.

Exactly

> 
> > -Andi (who thinks all of this decoding should be in user space anyways)
> 
> Think of a big data center with thousands of 2,4,8 socket blades
> and the admin collecting mce output and running around decoding the
> errors on his workstation. Even worse, the blades have different DIMM
> configurations due to hw upgrades/newer machines. I'd much rather have
> the complete decoding done in kernel, where all the information needed
> for proper decoding is present and with the error landing in syslog or
> some other monitored buffer instead of reconstructing it in userspace.
> 
> Thanks.
> 
> -- 
> Regards/Gruss,
> Boris.

This model of clusters with thousands of multi-core nodes (5,000 in one case I can think of) is in wide use. The system console is tied to a serial port via a BIOS switch. The serial port is then attached to "conman", and all the consoles are funneled to a cluster controller which parses for a "bad memory" event.

In sites with EDAC deployed now, the parser finds the node number and the CPU number on the node, extracts the EDAC DIMM label provided, and generates a Repair Ticket. The technician proceeds to find the proper rack, blade and DIMM and takes that node out of service (for MCEs that are intermittent the node is rebooted earlier). Then the bad DIMM, the one identified by EDAC, is replaced and the node is quickly brought back online.

Without the DIMM label provided by EDAC, or with just mce 'bad address' information, ALL the DIMMs are swapped out for off-line testing or all are returned for warranty replacement. The priorities are getting the node back online and reducing the time the technician spends on the rack floor.

Bare MCE information is logged on the cluster controller, and no time is spent trying to retrieve the log and run a user space program. It is cheaper (in man hours) and faster to swap out all the DIMMs. But that is frowned on, and EDAC solves the problem for them now.

The requested feature from the customers is to provide the DIMM label WITH the MCE error information as well.

doug t


* Re: [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64
@ 2009-04-30 14:39 Doug Thompson
  0 siblings, 0 replies; 70+ messages in thread
From: Doug Thompson @ 2009-04-30 14:39 UTC (permalink / raw)
  To: Andi Kleen; +Cc: akpm, greg, mingo, tglx, hpa, dougthompson, linux-kernel


--- On Thu, 4/30/09, Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <andi@firstfloor.org>
> Subject: Re: [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64
> To: "Doug Thompson" <norsk5@yahoo.com>
> Cc: "Andi Kleen" <andi@firstfloor.org>
> Date: Thursday, April 30, 2009, 1:05 AM
> > The problem we have had is once
> > an Uncorrected Error fires and dumps the address, mapping it
> > to the DIMM silk screen label is difficult, especially in
> > user space, in gaining access to the registers of the
> > controller.  
> 
> You can just do it either after reboot or in the crash kernel. I don't
> think it's required to put it all in kernel. Also you don't really
> need access to the registers;

Actually, according to AMD, their reference code for mapping from an error address to a memory slot does require access to the controller's registers. Starting on page 67 of the BKDG for family F10 from their website are 2 and 1/2 pages of code to perform that mapping. It takes into consideration interleaving of all kinds, etc. It is gnarly, to say the least.

> SMBIOS provides this information and
> mcelog knows how to convert it.

As I understand SMBIOS, it provides a linear assignment of basic memory starts and lengths but does not provide the memory controller context that AMD's reference code takes into consideration.

> 
> Trying to add other consumers to mce.c will be likely very messy;
> there's really no generic way to do it. I hope you're not planning
> turning the nicely CPU independent code in mce.c into a mess
> of twisty CPU specific passages like the old 32bit code was.
> 
> -Andi

No, not at all. Keeping the code "clean" is paramount, but we are seeking an interface that accepts the MCE error register structure and maps that information to at least a DIMM label field, if not more.

The EDAC module would register for that interface upon loading and unregister upon module unload.

The MCE code would call through a hook that is either a stub routine returning "no mapping occurred" or the EDAC mapper. MCE could then determine from the return code whether a mapping occurred. If it did, MCE displays the desired information; otherwise it proceeds as normal.
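To make the proposal concrete, here is a minimal user-space sketch in C of such a hook. It is only an illustration under stated assumptions, not an existing kernel interface: `struct mce_info`, `mce_register_dimm_mapper()`, `mce_unregister_dimm_mapper()`, `mce_format_report()` and `toy_edac_mapper()` are hypothetical names, and the toy mapper's labeling rule is a placeholder that has nothing to do with the BKDG algorithm.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the MCE error register structure. */
struct mce_info {
	int      cpu;
	int      bank;
	uint64_t status;
	uint64_t addr;
};

#define DIMM_LABEL_LEN 32

/* A mapper returns 1 and fills in label if it mapped the error, else 0. */
typedef int (*dimm_mapper_fn)(const struct mce_info *m, char *label, size_t len);

/* Default stub, used while no EDAC module is loaded: never maps anything. */
static int no_mapper(const struct mce_info *m, char *label, size_t len)
{
	(void)m; (void)label; (void)len;
	return 0;
}

static dimm_mapper_fn dimm_mapper = no_mapper;

/* Called on EDAC module load. Returns -1 if a mapper is already installed. */
int mce_register_dimm_mapper(dimm_mapper_fn fn)
{
	if (!fn || dimm_mapper != no_mapper)
		return -1;
	dimm_mapper = fn;
	return 0;
}

/* Called on EDAC module unload: fall back to the stub. */
void mce_unregister_dimm_mapper(void)
{
	dimm_mapper = no_mapper;
}

/*
 * MCE side: stays CPU independent and only inspects the return code.
 * Returns 1 if a DIMM label was appended to the report, 0 otherwise.
 */
int mce_format_report(const struct mce_info *m, char *out, size_t outlen)
{
	char label[DIMM_LABEL_LEN] = "";
	int mapped = dimm_mapper(m, label, sizeof(label));

	if (mapped)
		snprintf(out, outlen, "MCE cpu %d bank %d addr 0x%llx DIMM %s",
			 m->cpu, m->bank, (unsigned long long)m->addr, label);
	else
		snprintf(out, outlen, "MCE cpu %d bank %d addr 0x%llx",
			 m->cpu, m->bank, (unsigned long long)m->addr);
	return mapped;
}

/* Toy mapper for illustration only; the label rule is a made-up placeholder. */
int toy_edac_mapper(const struct mce_info *m, char *label, size_t len)
{
	snprintf(label, len, "CPU%d_DIMM_%c", m->cpu, (m->addr & 0x40) ? 'B' : 'A');
	return 1;
}
```

With this shape, the MCE path needs no knowledge of the memory controller; all family-specific decoding stays behind the function pointer. A real implementation would additionally have to make the pointer swap safe against a concurrent machine check.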

doug t



end of thread, other threads:[~2009-05-05 10:26 UTC | newest]

Thread overview: 70+ messages
2009-04-29 16:54 [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64 Borislav Petkov
2009-04-29 16:54 ` [PATCH 01/21] x86: add methods for writing of an MSR on several CPUs Borislav Petkov
2009-04-29 17:39   ` H. Peter Anvin
2009-05-04 16:46     ` Borislav Petkov
2009-05-04 17:25       ` H. Peter Anvin
2009-05-04 17:53         ` Borislav Petkov
2009-05-04 20:51           ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 02/21] amd64_edac: add PCI config register defines Borislav Petkov
2009-05-04 20:54   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 03/21] amd64_edac: add driver structs Borislav Petkov
2009-05-04 20:38   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 04/21] amd64_edac: add memory scrubber interface Borislav Petkov
2009-05-04 21:02   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 05/21] amd64_edac: add sys addr to memory controller mapping helpers Borislav Petkov
2009-05-04 21:08   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 06/21] amd64_edac: add functionality to compute the DRAM hole Borislav Petkov
2009-05-04 21:22   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 07/21] amd64_edac: add DRAM address type conversion facilities Borislav Petkov
2009-05-04 21:39   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 08/21] amd64_edac: add helper to dump relevant registers Borislav Petkov
2009-05-04 21:43   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 09/21] amd64_edac: assign DRAM chip select base and mask in a family-specific way Borislav Petkov
2009-05-04 21:59   ` Mauro Carvalho Chehab
2009-05-05 10:25     ` Borislav Petkov
2009-04-29 16:54 ` [PATCH 10/21] amd64_edac: add k8-specific methods Borislav Petkov
2009-05-04 22:06   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 11/21] amd64_edac: add f10-and-later methods-p1 Borislav Petkov
2009-05-04 22:10   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 12/21] amd64_edac: add f10-and-later methods-p2 Borislav Petkov
2009-05-04 23:25   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 13/21] amd64_edac: add f10-and-later methods-p3 Borislav Petkov
2009-04-29 18:22   ` Ingo Molnar
2009-04-29 18:24     ` Ingo Molnar
2009-04-29 19:05     ` Andrew Morton
2009-04-29 19:23       ` Ingo Molnar
2009-04-29 19:42         ` Andrew Morton
2009-04-29 19:53           ` Ingo Molnar
2009-04-29 20:47             ` Ingo Molnar
2009-04-30 10:01               ` Borislav Petkov
2009-04-30 10:42                 ` Ingo Molnar
2009-05-04 23:36   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 14/21] amd64_edac: add per-family descriptors Borislav Petkov
2009-05-04 23:39   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 15/21] amd64_edac: add ECC chipkill syndrome mapping table Borislav Petkov
2009-05-04 23:42   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 16/21] amd64_edac: add error decoding logic Borislav Petkov
2009-04-29 18:19   ` Ingo Molnar
2009-05-04 23:48   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 17/21] amd64_edac: add EDAC core-related initializers Borislav Petkov
2009-05-04 23:53   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 18/21] amd64_edac: add ECC reporting initializers Borislav Petkov
2009-05-04 23:59   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 19/21] amd64_edac: add debugging/testing code Borislav Petkov
2009-04-29 18:18   ` Ingo Molnar
2009-04-29 16:55 ` [PATCH 20/21] amd64_edac: add DRAM error injection logic using sysfs Borislav Petkov
2009-04-29 18:17   ` Ingo Molnar
2009-05-05  0:06   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 21/21] amd64_edac: add module registration routines Borislav Petkov
2009-05-05  0:10   ` Mauro Carvalho Chehab
2009-04-29 19:30 ` [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64 Andi Kleen
2009-04-30 11:57   ` Borislav Petkov
2009-04-30 12:21     ` Ingo Molnar
2009-04-30 12:47     ` Andi Kleen
2009-04-30 14:48       ` Aristeu Rozanski
2009-05-01  7:53         ` Borislav Petkov
2009-05-03  0:32           ` Aristeu Rozanski
2009-04-30 18:37       ` Mauro Carvalho Chehab
2009-05-01 12:39       ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2009-04-30 14:23 Doug Thompson
2009-04-30 14:39 Doug Thompson
