From: Andi Kleen <andi@firstfloor.org>
To: Nils Carlson <nils.carlson@ludd.ltu.se>
Cc: Ingo Molnar <mingo@elte.hu>, Borislav Petkov <bp@amd64.org>,
Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
"Luck, Tony" <tony.luck@intel.com>,
Mauro Carvalho Chehab <mchehab@redhat.com>,
"Young, Brent" <brent.young@intel.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"bluesmoke-devel@lists.sourceforge.net"
<bluesmoke-devel@lists.sourceforge.net>,
Andi Kleen <andi@firstfloor.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Doug Thompson <dougthompson@xmission.com>,
Joe Perches <joe@perches.com>,
Thomas Gleixner <tglx@linutronix.de>,
Linux Edac Mailing List <linux-edac@vger.kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Matt Domsch <Matt_Domsch@dell.com>
Subject: Re: Hardware Error Kernel Mini-Summit
Date: Mon, 14 Jun 2010 13:49:06 +0200
Message-ID: <20100614114906.GG17092@basil.fritz.box>
In-Reply-To: <50689ECC-A371-4923-BEBE-1A5A7E5B9D3B@ludd.ltu.se>
> Just left the above for reference. How would this affect other
> aspects of EDAC such as the error injection, the sysfs
> entries that (in most cases) reflect the layout of dimm's, and
Some of this can probably be retained, but the way EDAC
represents layout, for example, is quite unsuitable too. It includes
a lot of internal implementation details that in some cases
you can't even get anymore on modern designs. Something
with a proper abstract interface is better. EDAC never had this.
Also the biggest problem is still that EDAC doesn't
give you any silk screen labels, so unless you
have motherboard schematics the layout it presents
is fairly useless -- you still don't know which DIMM
to exchange. So in theory EDAC looks great, but in practice ...
On a lot of modern systems I checked, DMI
seems reasonably accurate in terms of layout, so I suspect they can
be handled with this. Others will probably
still need some special driver, but one
with a proper interface.
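To illustrate what DMI can offer here: the SMBIOS type 17 (Memory Device) records that dmidecode prints carry a "Locator" string, which on many boards matches the silk screen label. The following is only a minimal sketch in Python, assuming the usual `dmidecode -t 17` text layout; the function names and the SAMPLE text are made up for illustration and are not output from a real machine.

```python
# Hypothetical sample in the usual `dmidecode -t 17` layout (not real output).
SAMPLE = """\
Handle 0x0011, DMI type 17, 34 bytes
Memory Device
\tSize: 4096 MB
\tLocator: DIMM_A1
\tBank Locator: BANK 0

Handle 0x0012, DMI type 17, 34 bytes
Memory Device
\tSize: No Module Installed
\tLocator: DIMM_A2
\tBank Locator: BANK 1
"""


def parse_memory_devices(dmidecode_text):
    """Parse dmidecode text into one dict per "Memory Device" record."""
    devices = []
    current = None
    for line in dmidecode_text.splitlines():
        stripped = line.strip()
        if stripped == "Memory Device":
            # Start of a new type 17 record.
            current = {}
            devices.append(current)
        elif not stripped:
            # A blank line ends the current record.
            current = None
        elif current is not None and ":" in stripped:
            key, _, value = stripped.partition(":")
            current[key.strip()] = value.strip()
    return devices


def dimm_labels(dmidecode_text):
    """Return (Locator, Bank Locator, Size) for each populated slot."""
    return [
        (d.get("Locator"), d.get("Bank Locator"), d.get("Size"))
        for d in parse_memory_devices(dmidecode_text)
        if d.get("Size") not in (None, "No Module Installed")
    ]
```

On a real system the text would come from running dmidecode as root. How well the Locator strings match the board labels depends entirely on the BIOS vendor.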
For error injection: some modern systems support this
through ACPI EINJ, which has a separate non-EDAC
interface. For others I've simply been using some scripts
that twiddle the bits from user space. You can do that
with a shell script. If it were to stay in the kernel
it could probably be moved into a proper error injection
framework that is not arbitrarily tied to memory.
Lots of different devices have error injection
support and exposing some of that in a general
framework would likely make sense.
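As a rough illustration of the EINJ path: the Linux APEI EINJ driver exposes files in debugfs, documented as available_error_type, error_type and error_inject under /sys/kernel/debug/apei/einj. The Python sketch below assumes that layout; the directory is a parameter because the exact path and file set depend on the kernel version and firmware, and the helper names are invented for this example.

```python
import os

# Assumed default location of the EINJ debugfs files; verify on the target kernel.
EINJ_DIR = "/sys/kernel/debug/apei/einj"


def available_error_types(einj_dir=EINJ_DIR):
    """Map error type value -> description, as listed by the firmware."""
    types = {}
    with open(os.path.join(einj_dir, "available_error_type")) as f:
        for line in f:
            # Each line is expected to look like "0x00000008<TAB>Memory Correctable".
            fields = line.split(None, 1)
            if len(fields) == 2:
                types[int(fields[0], 16)] = fields[1].strip()
    return types


def inject_error(error_type, einj_dir=EINJ_DIR):
    """Select an error type and trigger one injection."""
    if error_type not in available_error_types(einj_dir):
        raise ValueError("error type 0x%x not offered by firmware" % error_type)
    # First choose what to inject ...
    with open(os.path.join(einj_dir, "error_type"), "w") as f:
        f.write("0x%x\n" % error_type)
    # ... then writing any value to error_inject triggers the injection.
    with open(os.path.join(einj_dir, "error_inject"), "w") as f:
        f.write("1\n")
```

For example, inject_error(0x8) would request a correctable memory error on a system whose firmware lists that type. This needs root, a mounted debugfs and EINJ support in the BIOS, and should only be tried on a test machine.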
Anyways the old EDAC drivers for this are not going
away, you can still use them. The interesting
question though is how to properly define the interface
for new hardware.
> allow the setting of scrub rate? If we're just talking about
I never quite saw the point of that one, but yes
there's no replacement for this anywhere else.
Normally scrub rate can be simply set in the BIOS,
is that not good enough? Is there a use case for
changing it dynamically?
Note that modern hardware typically has demand scrubbing
anyways, that is, when there is an error it automatically
scrubs.
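For reference, the EDAC knob under discussion is the per-controller sdram_scrub_rate sysfs attribute, which takes a bandwidth in bytes per second. The Python sketch below shows reading and setting it; the root path is the conventional EDAC sysfs location but is kept as a parameter, and the function names are invented for this example.

```python
import os

# Conventional EDAC sysfs root; kept overridable since layouts may differ.
EDAC_MC_ROOT = "/sys/devices/system/edac/mc"


def scrub_rate_path(mc_index, root=EDAC_MC_ROOT):
    """Path of the scrub rate attribute for memory controller mc<N>."""
    return os.path.join(root, "mc%d" % mc_index, "sdram_scrub_rate")


def get_scrub_rate(mc_index, root=EDAC_MC_ROOT):
    """Read the current scrub rate in bytes per second."""
    with open(scrub_rate_path(mc_index, root)) as f:
        return int(f.read().strip())


def set_scrub_rate(mc_index, bytes_per_sec, root=EDAC_MC_ROOT):
    """Request a scrub rate and return the value read back afterwards."""
    with open(scrub_rate_path(mc_index, root), "w") as f:
        f.write("%d\n" % bytes_per_sec)
    # The driver may pick a different supported rate, so report what it kept.
    return get_scrub_rate(mc_index, root)
```

Not every EDAC driver implements this attribute, and the value read back may differ from the one requested.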
> replacing all instances of printk (when logging single bit
> errors) with perf events, I don't really see that as a problem.
I don't think perf is the right tool for this, the semantics
are mostly unsuitable (it hasn't been designed as an error reporting
tool, but as a performance tool, and performance events are quite
different from errors) and it doesn't provide most of the infrastructure
needed for it anyways.
> But EDAC is much more than that today...
Well, it's a hodgepodge of quite a lot of odd bits.
I'm not sure "more" is the right word.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.