From: Ingo Molnar <mingo@elte.hu>
To: Andi Kleen <andi@firstfloor.org>
Cc: Borislav Petkov <borislav.petkov@amd.com>,
akpm@linux-foundation.org, greg@kroah.com, tglx@linutronix.de,
hpa@zytor.com, dougthompson@xmission.com,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64
Date: Fri, 1 May 2009 14:39:56 +0200 [thread overview]
Message-ID: <20090501123956.GA4225@elte.hu> (raw)
In-Reply-To: <20090430124716.GW23223@one.firstfloor.org>
* Andi Kleen <andi@firstfloor.org> wrote:
> > Kconfig, mce code delivers needed error info to edac which, in
> > turn, goes and decodes the error/does the mapping to DIMM
> > blocks/supplies DRAM error injection facility for testing
> > purposes and similar things. That way you have both and they
> > don't overlap in functionality.
>
> You can do that, but it's redundant because mcelog can do this
> this already. [...]
The thing is, when we took up x86 maintenance i had a good look at
the MCE situation, i checked both the kernel and the user-space
side.
The kernel side MCE code was in pretty bad shape to begin with, but
mcelog (the user-space tool) is a big stinking pile of poo on every
level.
It's one of the worst piece of kernel related code i ever saw. I
think you wrote all of it, and you should be ashamed of that code,
and you should be ashamed of the design and you should be ashamed of
the concept.
It even came with its own 'database' code: mcelog*/db.[ch] is 600+
lines of needless code instead of obvious library use. It's NIH and
self-serving complexity all over.
And the thing is, mcelog/mcedecode never really _did_ anything real
an useful, other than to:
1) Confuse kernel users who see a fatal MCE panic, with cryptic,
quirky codes, who write that down on paper, then run it through
the user-space tool - just to see a piece of information the
kernel could have provided already. (if they didnt make any
mistakes while writing down the codes)
2) Decode a quirky, binary MCE record and combine it with DMI data.
(which the kernel can and should do just fine.)
Yes, i know about tolerant=3 and certain people/companies opting to
ignore MCE fatality levels and live dangerously (and i also know
about non-fatal reporting and correction extensions in hw) - but for
99.999% of the Linux users the whole thing is just needless
complexity today, that does not offer anything valuable.
And that is really what happens when code is misdesigned and the
wrong pieces of code are pushed to user-space: a crappy, limited ABI
and an under-maintained, big pile of junk user-space kit.
The obvious truth is that hardware faults have to be caught, decoded
and optionally handled by the kernel.
The EDAC code at least has a sane design: it realizes that hardware
faults _must_ be fully known, decoded and potentially handled in the
kernel.
Piggyback-ing to user-space is plain idiotic and not defensible. So
if a piece of hardware capability is handled by the EDAC code, the
x86 MCE code will step aside and will stay the heck out of that
business. At least until the two concepts are merged into some sane
kernel hardware fault logging and handling framework.
And Andi, until you dont grasp such _basic_ design concepts, you
have no business writing such code really. You should stay the heck
away from it and you should stop 'advising' people who made the
right calls while you messed up. It is mcelog that is crap, not the
EDAC code.
Ingo
next prev parent reply other threads:[~2009-05-01 12:40 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-04-29 16:54 [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64 Borislav Petkov
2009-04-29 16:54 ` [PATCH 01/21] x86: add methods for writing of an MSR on several CPUs Borislav Petkov
2009-04-29 17:39 ` H. Peter Anvin
2009-05-04 16:46 ` Borislav Petkov
2009-05-04 17:25 ` H. Peter Anvin
2009-05-04 17:53 ` Borislav Petkov
2009-05-04 20:51 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 02/21] amd64_edac: add PCI config register defines Borislav Petkov
2009-05-04 20:54 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 03/21] amd64_edac: add driver structs Borislav Petkov
2009-05-04 20:38 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 04/21] amd64_edac: add memory scrubber interface Borislav Petkov
2009-05-04 21:02 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 05/21] amd64_edac: add sys addr to memory controller mapping helpers Borislav Petkov
2009-05-04 21:08 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 06/21] amd64_edac: add functionality to compute the DRAM hole Borislav Petkov
2009-05-04 21:22 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 07/21] amd64_edac: add DRAM address type conversion facilities Borislav Petkov
2009-05-04 21:39 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 08/21] amd64_edac: add helper to dump relevant registers Borislav Petkov
2009-05-04 21:43 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 09/21] amd64_edac: assign DRAM chip select base and mask in a family-specific way Borislav Petkov
2009-05-04 21:59 ` Mauro Carvalho Chehab
2009-05-05 10:25 ` Borislav Petkov
2009-04-29 16:54 ` [PATCH 10/21] amd64_edac: add k8-specific methods Borislav Petkov
2009-05-04 22:06 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 11/21] amd64_edac: add f10-and-later methods-p1 Borislav Petkov
2009-05-04 22:10 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 12/21] amd64_edac: add f10-and-later methods-p2 Borislav Petkov
2009-05-04 23:25 ` Mauro Carvalho Chehab
2009-04-29 16:54 ` [PATCH 13/21] amd64_edac: add f10-and-later methods-p3 Borislav Petkov
2009-04-29 18:22 ` Ingo Molnar
2009-04-29 18:24 ` Ingo Molnar
2009-04-29 19:05 ` Andrew Morton
2009-04-29 19:23 ` Ingo Molnar
2009-04-29 19:42 ` Andrew Morton
2009-04-29 19:53 ` Ingo Molnar
2009-04-29 20:47 ` Ingo Molnar
2009-04-30 10:01 ` Borislav Petkov
2009-04-30 10:42 ` Ingo Molnar
2009-05-04 23:36 ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 14/21] amd64_edac: add per-family descriptors Borislav Petkov
2009-05-04 23:39 ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 15/21] amd64_edac: add ECC chipkill syndrome mapping table Borislav Petkov
2009-05-04 23:42 ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 16/21] amd64_edac: add error decoding logic Borislav Petkov
2009-04-29 18:19 ` Ingo Molnar
2009-05-04 23:48 ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 17/21] amd64_edac: add EDAC core-related initializers Borislav Petkov
2009-05-04 23:53 ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 18/21] amd64_edac: add ECC reporting initializers Borislav Petkov
2009-05-04 23:59 ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 19/21] amd64_edac: add debugging/testing code Borislav Petkov
2009-04-29 18:18 ` Ingo Molnar
2009-04-29 16:55 ` [PATCH 20/21] amd64_edac: add DRAM error injection logic using sysfs Borislav Petkov
2009-04-29 18:17 ` Ingo Molnar
2009-05-05 0:06 ` Mauro Carvalho Chehab
2009-04-29 16:55 ` [PATCH 21/21] amd64_edac: add module registration routines Borislav Petkov
2009-05-05 0:10 ` Mauro Carvalho Chehab
2009-04-29 19:30 ` [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64 Andi Kleen
2009-04-30 11:57 ` Borislav Petkov
2009-04-30 12:21 ` Ingo Molnar
2009-04-30 12:47 ` Andi Kleen
2009-04-30 14:48 ` Aristeu Rozanski
2009-05-01 7:53 ` Borislav Petkov
2009-05-03 0:32 ` Aristeu Rozanski
2009-04-30 18:37 ` Mauro Carvalho Chehab
2009-05-01 12:39 ` Ingo Molnar [this message]
-- strict thread matches above, loose matches on Subject: below --
2009-04-30 14:23 Doug Thompson
2009-04-30 14:39 Doug Thompson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090501123956.GA4225@elte.hu \
--to=mingo@elte.hu \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=borislav.petkov@amd.com \
--cc=dougthompson@xmission.com \
--cc=greg@kroah.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox