From: Valdis.Kletnieks@vt.edu
To: Folkert van Heusden <folkert@vanheusden.com>
Cc: roland <devzero@web.de>, linux-kernel@vger.kernel.org
Subject: Re: Software based ECC ?
Date: Sun, 12 Aug 2007 23:09:22 -0400
Message-ID: <1599.1186974562@turing-police.cc.vt.edu>
In-Reply-To: Your message of "Sun, 12 Aug 2007 18:51:31 +0200." <20070812165131.GG7973@vanheusden.com>
On Sun, 12 Aug 2007 18:51:31 +0200, Folkert van Heusden said:
> a question and an idea: Q: is ecc guaranteed to detect all bitflips?
It depends on the exact ECC function the hardware implements. The usual
guarantee (SECDED) is something like:
"Correct all 1-bit errors. Detect, but not correct, all 2-bit errors and
most errors of 3 or more bits".
(Of course, "correct all 1 or 2 bit and detect all 3 bit" can be done, it
just takes more bits of ECC.)
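
To make the "correct 1-bit, detect 2-bit" behaviour concrete, here is a toy
sketch: an extended Hamming (8,4) code protecting a single nibble. This is
only an illustration, not what any particular memory controller implements;
real ECC DIMMs typically use much wider words, and the function names
encode() and decode() are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (a list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0 = overall parity, 1..7 = Hamming positions
    code[3], code[5], code[6], code[7] = d
    # The parity bit at position p covers every position whose index has bit p set.
    for p in (1, 2, 4):
        for i in range(1, 8):
            if i & p and i != p:
                code[p] ^= code[i]
    code[0] = sum(code[1:]) % 2  # make the parity of the whole word even
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of the indices of all set bits
    odd_parity = sum(code) % 2
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity:
        # Odd number of flips: assume exactly one. The syndrome is its position
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome but even parity: two bits flipped, cannot be fixed.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

Note that in a code this small, a 3-bit error always looks like a 1-bit error
and gets silently "corrected" to the wrong value. Wider codes leave some
syndromes unused, which is where the "most 3 and higher" detection comes from.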
> Idea: what about a multicore system (3 or more) that runs the same
> processes on 2 cores and a third core verifying that they both do the
> same? As I think it is not only ram that can become faulty.
This is actually done for high-reliability systems (Google for "tell me twice"
and "tell me three times"). The problem is that it takes a lot of extra
hardware. The G5 and later IBM Z-series mainframe chipsets (not to be confused with
the PowerPC G5) implemented dual computation units and a comparator that
signals a 'Machine Check' condition if the two CPUs don't end up in the
exact same state. As an added bonus, at the end of each instruction where
both *do* compare good, it latches the *entire* state of the CPU out. On a
miscompare, it then does the following:
1) Retry the instruction on the same CPU - if it compares correctly, keep
going and flag a "soft" error.
2) If it still fails, read out the last "known good" status latch, load
it into a spare CPU, fire that up, and flag the failing one as bad.
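
In software terms, the compare / retry / fail-over sequence above looks
roughly like the sketch below. It is a loose model only: the real thing is
done in hardware at instruction granularity, and every name here
(run_lockstep, unit_a, unit_b, spare) is hypothetical. Each "unit" is assumed
to be a callable taking (instruction, state) and returning the new state.

```python
def run_lockstep(instructions, state, unit_a, unit_b, spare):
    """Run each instruction on two units, compare, retry once, then fail over.

    Returns (final_state, soft_error_count, failed_over).
    """
    checkpoint = state  # last state on which both units agreed
    soft_errors = 0
    failed_over = False
    for insn in instructions:
        for attempt in range(2):  # first try plus one retry
            a = unit_a(insn, checkpoint)
            b = unit_b(insn, checkpoint)
            if a == b:
                if attempt == 1:
                    soft_errors += 1  # the retry compared good: "soft" error
                checkpoint = a  # latch the new known-good state
                break
        else:
            # The retry miscompared too: restart from the checkpoint on the spare.
            # Simplification: after this point the spare runs unchecked.
            unit_a = unit_b = spare
            failed_over = True
            checkpoint = spare(insn, checkpoint)
    return checkpoint, soft_errors, failed_over
```

The point of the checkpoint is that a failing unit never gets to commit
state: the spare always starts from something both units agreed on.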
http://www.research.ibm.com/journal/rd/435/spainhower.pdf
http://www.research.ibm.com/journal/rd/435/mueller.pdf
These guys have forgotten more about designing highly reliable systems than
most of us will ever know. ;)
Needless to say, not everybody is willing to pay the costs of the hardware
overhead of this approach.