From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gavin Maltby
Subject: Re: Re: RFC: MCA/MCE concept
Date: Wed, 30 May 2007 15:00:26 +0100
Message-ID: <465D837A.1070801@sun.com>
References: <200705301310.18574.Christoph.Egger@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; format=flowed; charset=ISO-8859-1
Content-Transfer-Encoding: 7BIT
In-reply-to: <200705301310.18574.Christoph.Egger@amd.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Hi,

Apologies for the screwy quoting below - I did not receive the
first half of this thread so it's been forwarded to me.

>>>
>>> - Dom0 got enough CEs so that UEs are very likely to happen in order
>>> to "circumvent" UEs.

The greatest rewards here are in syndrome/row/column/bank analysis
of the error stream. Where something like a bad pin produces tonnes
of CEs they are always on the same bit and your chance of a UE is that
of a random radiation type CE colliding within the set of ECC
checkwords being undermined by that pin - not very high. On the
other hand if we're seeing repeated distinct syndromes from the same
chip-select (or chip-select in a pair) then there is a good chance they
could collide "soon" - our data is that this combination predicts a UE
within hours to a few days. If you have row/column/bank decoding you
can also perform further analysis of the error source and assess the
chances of a collision that would produce a UE.

That example has DIMM memory in mind, but similar approaches apply
to cache memory where it is ECC protected and so on.

>>> - Possible operations on a DomU
>>>   - save/restore DomU
>>>   - (live-)migrate DomU to a different physical machine
>>>   - etc.
>> Very heavy-weight operations, which I think are unlikely to succeed if
>> you already suspect the system's going to suffer a UE soon.
As above, some predictors can give you hours to a few days' warning
of a UE.

Gavin
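
The syndrome analysis described above (one repeated syndrome from a bad pin versus several distinct syndromes from the same chip-select) could be sketched roughly as follows. This is only an illustration: the record layout, the function name and the `min_errors` threshold are all assumptions made for the example, not anything taken from Xen or from an existing fault-management implementation. It also ignores chip-select pairs, time windows and row/column/bank decoding, which a real diagnosis engine would want to consider.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrectableError:
    """One CE report (hypothetical layout, for illustration only)."""
    chip_select: int  # chip-select the error was reported against
    syndrome: int     # ECC syndrome, identifying which bit was corrected


def classify_chip_selects(errors, min_errors=3):
    """Group CEs by chip-select and label each group.

    min_errors is an assumed threshold: below it we make no call.
    """
    syndromes = defaultdict(set)
    counts = defaultdict(int)
    for err in errors:
        syndromes[err.chip_select].add(err.syndrome)
        counts[err.chip_select] += 1

    verdicts = {}
    for cs, count in counts.items():
        if count < min_errors:
            verdicts[cs] = "too-few-errors"
        elif len(syndromes[cs]) == 1:
            # Always the same bit, e.g. a bad pin: many CEs, low UE risk.
            verdicts[cs] = "single-syndrome"
        else:
            # Distinct syndromes from one chip-select: these could collide
            # in the same ECC checkword and produce a UE.
            verdicts[cs] = "distinct-syndromes"
    return verdicts


if __name__ == "__main__":
    stream = [CorrectableError(0, 0x1C)] * 50
    stream += [CorrectableError(1, s) for s in (0x1C, 0x2A, 0x31)]
    for cs, verdict in sorted(classify_chip_selects(stream).items()):
        print(f"chip-select {cs}: {verdict}")
```

Only the "distinct-syndromes" case would be a candidate for the heavier actions discussed in the thread, such as migrating a DomU away while there is still time.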