From: Gavin Maltby <Gavin.Maltby@Sun.COM>
To: xen-devel@lists.xensource.com
Subject: Re: Re: RFC: MCA/MCE concept
Date: Wed, 30 May 2007 15:00:26 +0100
Message-ID: <465D837A.1070801@sun.com>
In-Reply-To: <200705301310.18574.Christoph.Egger@amd.com>
Hi,
Apologies for the screwy quoting below - I did not receive the first half of this
thread so it's been forwarded to me.
>>>
>>> - Dom0 got enough CEs so that UEs are very likely to happen in order
>>> to "circumvent" UEs.
The greatest rewards here are in syndrome/row/column/bank analysis of the
error stream. Where something like a bad pin produces tonnes of CEs,
they are always on the same bit, and your chance of a UE is that of a random
radiation-type CE colliding within the set of ECC checkwords being undermined
by that pin - not very high. On the other hand, if we're seeing repeated
distinct syndromes from the same chip-select (or chip-select in a pair)
then there is a good chance they could collide "soon" - our data is that
this combination predicts a UE within hours to a few days. If you have
row/column/bank decoding you can also perform further analysis of the
error source and assess the chances of a collision that would produce a UE.
That example has DIMM memory in mind, but similar approaches apply to
cache memory where it is ECC protected and so on.
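To make the distinction above concrete, here is a rough Python sketch of the
syndrome part of that analysis. It is an illustration only, not any real
diagnosis engine: the record layout (CorrectableError), the function name,
the labels and the distinct_threshold default are all assumptions, and it
ignores row/column/bank decoding entirely. It groups CEs by chip-select and
flags a chip-select as elevated risk once it has reported several distinct
syndromes, while many CEs with one repeated syndrome are treated as the
"bad pin" case.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class CorrectableError(NamedTuple):
    """One CE report (hypothetical layout, for illustration only)."""
    chip_select: int  # chip-select (or chip-select pair) that reported the CE
    syndrome: int     # ECC syndrome value delivered with the CE


def classify_chip_selects(
    errors: Iterable[CorrectableError],
    distinct_threshold: int = 2,  # assumed cut-off, not a measured value
) -> Dict[int, str]:
    """Label each chip-select by the shape of its CE stream."""
    # chip_select -> syndrome -> number of CEs seen with that syndrome
    counts: Dict[int, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for err in errors:
        counts[err.chip_select][err.syndrome] += 1

    verdicts: Dict[int, str] = {}
    for chip_select, per_syndrome in counts.items():
        distinct = len(per_syndrome)
        total = sum(per_syndrome.values())
        if distinct >= distinct_threshold:
            # Several distinct syndromes from one chip-select: these could
            # collide within a checkword, so treat as elevated UE risk.
            verdicts[chip_select] = "elevated"
        elif total > 1:
            # Many CEs, always the same syndrome: consistent with a bad pin.
            verdicts[chip_select] = "same-bit"
        else:
            # A single CE, e.g. a one-off radiation-type event.
            verdicts[chip_select] = "isolated"
    return verdicts
```

A real predictor would also want a time window and the row/column/bank
information, so treat the threshold here as a placeholder to be tuned.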
>>> - Possible operations on a DomU
>>> - save/restore DomU
>>> - (live-)migrate DomU to a different physical machine
>>> - etc.
>> Very heavy-weight operations, which I think are unlikely to succeed if
>> you already suspect the system's going to suffer a UE soon.
As above, some predictors can give you hours to a few days warning of a UE.
Gavin