From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Christoph Egger" <Christoph.Egger@amd.com>
Subject: Re: RFC: MCA/MCE concept
Date: Fri, 1 Jun 2007 10:11:35 +0200
Message-ID: <200706011011.35336.Christoph.Egger@amd.com>
References: <907625E08839C4409CE5768403633E0B02561D81@sefsexmb1.amd.com>
Mime-Version: 1.0
Content-Type: text/plain;
 charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <907625E08839C4409CE5768403633E0B02561D81@sefsexmb1.amd.com>
Content-Disposition: inline
List-Unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
Cc: Gavin Maltby <Gavin.Maltby@sun.com>
List-Id: xen-devel@lists.xenproject.org

On Wednesday 30 May 2007 17:03:55 Petersson, Mats wrote:
> [snip]
>
> > My feeling is that the hypervisor and dom0 own the hardware
> > and as such
> > all hardware fault management should reside there.  So we should never
> > deliver any form of #MC to a domU, nor should a poll of MCA state from
> > a domU ever observe valid state (e.g, make the RDMSR return 0).
> > So all handling, logging and diagnosis as well as hardware
> > response actions
> > (such as to deploy an online spare chip-select) are controlled
> > in the hypervisor/dom0 combination.  That seems a consistent
> > model - e.g.,
> > if a domU is migrated to another system it should not carry the
> > diagnosis state of the original system across etc, since that
> > belongs with
> > the one domain that cannot migrate.
>
> I agree entirely with this.
>
> > But that is not to say that (I think at a future phase) domU
> > should not
> > participate in a higher-level fault management function, at
> > the direction
> > of the hypervisor/dom0 combo.  For example if/when we can isolate an
> > uncorrectable error to a single domU we could forward such an event to
> > the affected domU if it has registered its ability/interest in such
> > events.  These won't be in the form of a faked #MC or anything,
> > instead they'd be some form of synchronous trap experienced when next
> > the affected domU context resumes on CPU.  The intelligent
> > domU handler
> > can then decide whether the domU must panic, whether it could simply
> > kill the affected process etc.  Those details are clearly
> > sketchy, but the
> > idea is to up-level the communication to a domU to be more like
> > "you're broken" rather than "here's a machine-level hardware error for
> > you to interpret and decide what to do with".
>
> Yes, this makes much more sense than forwarding #MC, as the guest would
> have a hard time to actually do anything really useful with this. As far
> as I know, most uncorrectable errors are near enough entirely fatal in
> most commercial non-Enterprise OS's anyways - e.g. in Windows XP or
> Server 2K3, it always ends in a blue-screen - which is hardly any better
> than the guest being "humanely euthenazed" by Dom0.
>
> I take it this would be some sort of hypercall (available through the
> regular PV-driver interface for HVM guests) to say "Let me know if I'm
> broken - trap on vector X".

In short: guests with a PV MCA driver will see a dedicated event
(assuming the event mechanism is used for the notification),
and guests without a PV MCA driver will see a "General Protection Fault".
Is that right?
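
To make the question concrete, here is a rough sketch in C of the
decision as I understand it. Every name in it (mca_notify, struct guest,
has_pv_mca_driver, mca_pick_notification) is made up for illustration;
none of this is an existing Xen interface, and the #GP fallback is only
my reading of the proposal, not something that has been agreed on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not an existing Xen interface. */
enum mca_notify {
    MCA_NOTIFY_PV_EVENT, /* guest registered a PV MCA handler */
    MCA_NOTIFY_GPF       /* no handler: guest sees a general protection fault */
};

struct guest {
    int  domid;
    bool has_pv_mca_driver; /* assumed to be set when the guest registers interest */
};

/*
 * Decide how a guest is told that an uncorrectable error has been
 * isolated to it. The raw #MC is never forwarded in either case.
 */
enum mca_notify mca_pick_notification(const struct guest *g)
{
    assert(g != NULL);
    return g->has_pv_mca_driver ? MCA_NOTIFY_PV_EVENT : MCA_NOTIFY_GPF;
}
```

The point of the sketch is only that the choice depends on whether the
guest registered, not on the details of the hardware error.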

> --
> Mats
>
> > Gavin
> >

-- 
AMD Saxony, Dresden, Germany
Operating System Research Center

Legal Information:
AMD Saxony Limited Liability Company & Co. KG
Sitz (Geschäftsanschrift):
   Wilschdorfer Landstr. 101, 01109 Dresden, Deutschland
Registergericht Dresden: HRA 4896
vertretungsberechtigter Komplementär:
   AMD Saxony LLC (Sitz Wilmington, Delaware, USA)
Geschäftsführer der AMD Saxony LLC:
   Dr. Hans-R. Deppe, Thomas McCoy