public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Borislav Petkov <borislav.petkov@amd.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: akpm@linux-foundation.org, greg@kroah.com, mingo@elte.hu,
	tglx@linutronix.de, hpa@zytor.com, dougthompson@xmission.com,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64
Date: Thu, 30 Apr 2009 13:57:41 +0200	[thread overview]
Message-ID: <20090430115741.GA23634@aftab> (raw)
In-Reply-To: <87iqknp8a0.fsf@basil.nowhere.org>

Hi,

On Wed, Apr 29, 2009 at 09:30:31PM +0200, Andi Kleen wrote:
> Borislav Petkov <borislav.petkov@amd.com> writes:
> 
> > Hi,
> >
> > thanks to all reviewers of the previous submission, here is the second
> > version of this series.
> 
> The classic problem of the previous versions of these patches was that
> they consume the same error registers (even if using pci config versus
> msrs as access methods) as the kernel machine check poll/threshold
> interrupt code.  And with two logging agents racing on the same
> registers you will always get junk results. Typically with threshold 
> enabled the mce code wins the race. I suspect this patchkit has
> exactly the same fundamental design problem. EDAC really is not
> particularly fitting for integrated memory controllers that report
> their errors using standard machine check events.

ok, how about we remove tha MSR/PCI cfg space reading bits and leave
that task solely to the mce core. Then, iff you have edac turned on in
Kconfig, mce code delivers needed error info to edac which, in turn,
goes and decodes the error/does the mapping to DIMM blocks/supplies DRAM
error injection facility for testing purposes and similar things. That
way you have both and they don't overlap in functionality.

By the way, I think there's a similar attempt/proposal of letting mce
and edac talk to each other from Red Hat so I think this could be a
viable thing to try.

> -Andi (who thinks all of this decoding should be in user space anyways)

Think of a big data center with a thousands of 2,4,8 socket blades
and the admin collecting mce output and running around decoding the
errors on his workstation. Even worse, the blades have different DIMM
configurations due to hw upgrades/newer machines. I'd much rather have
the complete decoding done in kernel, where all the information needed
for proper decoding is present and with the error landing in syslog or
some other monitored buffer instead of reconstructing it in userspace.

Thanks.

-- 
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
  System  | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
 Research | Geschäftsführer: Jochen Polster, Thomas M. McCoy, Giuliano Meroni
  Center  | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
  (OSRC)  | Registergericht München, HRB Nr. 43632


  reply	other threads:[~2009-04-30 11:58 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-04-29 16:54 [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64 Borislav Petkov
2009-04-29 16:54 ` [PATCH 01/21] x86: add methods for writing of an MSR on several CPUs Borislav Petkov
2009-04-29 17:39   ` H. Peter Anvin
2009-05-04 16:46     ` Borislav Petkov
2009-05-04 17:25       ` H. Peter Anvin
2009-05-04 17:53         ` Borislav Petkov
2009-05-04 20:51           ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 02/21] amd64_edac: add PCI config register defines Borislav Petkov
2009-05-04 20:54   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 03/21] amd64_edac: add driver structs Borislav Petkov
2009-05-04 20:38   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 04/21] amd64_edac: add memory scrubber interface Borislav Petkov
2009-05-04 21:02   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 05/21] amd64_edac: add sys addr to memory controller mapping helpers Borislav Petkov
2009-05-04 21:08   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 06/21] amd64_edac: add functionality to compute the DRAM hole Borislav Petkov
2009-05-04 21:22   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 07/21] amd64_edac: add DRAM address type conversion facilities Borislav Petkov
2009-05-04 21:39   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 08/21] amd64_edac: add helper to dump relevant registers Borislav Petkov
2009-05-04 21:43   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 09/21] amd64_edac: assign DRAM chip select base and mask in a family-specific way Borislav Petkov
2009-05-04 21:59   ` Mauro Carvalho Chehab
2009-05-05 10:25     ` Borislav Petkov
2009-04-29 16:54 ` [PATCH 10/21] amd64_edac: add k8-specific methods Borislav Petkov
2009-05-04 22:06   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 11/21] amd64_edac: add f10-and-later methods-p1 Borislav Petkov
2009-05-04 22:10   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 12/21] amd64_edac: add f10-and-later methods-p2 Borislav Petkov
2009-05-04 23:25   ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 13/21] amd64_edac: add f10-and-later methods-p3 Borislav Petkov
2009-04-29 18:22   ` Ingo Molnar
2009-04-29 18:24     ` Ingo Molnar
2009-04-29 19:05     ` Andrew Morton
2009-04-29 19:23       ` Ingo Molnar
2009-04-29 19:42         ` Andrew Morton
2009-04-29 19:53           ` Ingo Molnar
2009-04-29 20:47             ` Ingo Molnar
2009-04-30 10:01               ` Borislav Petkov
2009-04-30 10:42                 ` Ingo Molnar
2009-05-04 23:36   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 14/21] amd64_edac: add per-family descriptors Borislav Petkov
2009-05-04 23:39   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 15/21] amd64_edac: add ECC chipkill syndrome mapping table Borislav Petkov
2009-05-04 23:42   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 16/21] amd64_edac: add error decoding logic Borislav Petkov
2009-04-29 18:19   ` Ingo Molnar
2009-05-04 23:48   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 17/21] amd64_edac: add EDAC core-related initializers Borislav Petkov
2009-05-04 23:53   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 18/21] amd64_edac: add ECC reporting initializers Borislav Petkov
2009-05-04 23:59   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 19/21] amd64_edac: add debugging/testing code Borislav Petkov
2009-04-29 18:18   ` Ingo Molnar
2009-04-29 16:55 ` [PATCH 20/21] amd64_edac: add DRAM error injection logic using sysfs Borislav Petkov
2009-04-29 18:17   ` Ingo Molnar
2009-05-05  0:06   ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 21/21] amd64_edac: add module registration routines Borislav Petkov
2009-05-05  0:10   ` Mauro Carvalho Chehab
2009-04-29 19:30 ` [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64 Andi Kleen
2009-04-30 11:57   ` Borislav Petkov [this message]
2009-04-30 12:21     ` Ingo Molnar
2009-04-30 12:47     ` Andi Kleen
2009-04-30 14:48       ` Aristeu Rozanski
2009-05-01  7:53         ` Borislav Petkov
2009-05-03  0:32           ` Aristeu Rozanski
2009-04-30 18:37       ` Mauro Carvalho Chehab
2009-05-01 12:39       ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2009-04-30 14:23 Doug Thompson
2009-04-30 14:39 Doug Thompson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090430115741.GA23634@aftab \
    --to=borislav.petkov@amd.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=dougthompson@xmission.com \
    --cc=greg@kroah.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox