From: Gavin Maltby <Gavin.Maltby@Sun.COM>
To: "Petersson, Mats" <Mats.Petersson@amd.com>
Cc: "Egger, Christoph" <Christoph.Egger@amd.com>,
xen-devel@lists.xensource.com,
Keir Fraser <Keir.Fraser@xensource.com>
Subject: Re: RFC: MCA/MCE concept
Date: Fri, 01 Jun 2007 11:57:23 +0100
Message-ID: <465FFB93.5090300@sun.com>
In-Reply-To: <907625E08839C4409CE5768403633E0B02561D9C@sefsexmb1.amd.com>
Hi
On 06/01/07 10:48, Petersson, Mats wrote:
[cut]
>>> Note that Windows kernel drivers are allowed to use the kernel exception
>>> handling, and ARE allowed to "allow" GP faults if they wish to do so.
>>> [Don't ask me why MS allows this, but that's the case, so we have to live
>>> with it].
>> In that case, it will die sooner or later *after* consuming the data in error.
>> That means, the guest continues to live for an unknown time...
>
> Yes. What I'm worried about is that if you have a "transient" or "few-bit"
> error in a rarely used location, the guest may well live a LONG time with incorrect
> data and potentially not get it detected for quite some time again (say
> two bits have stuck at 0, and the data is then written back with the zeros there
> - next time we read it, no error, since the data has zeros in that location).
I don't believe GP faults and uncorrectable errors really overlap that much.
In a GP fault the extent of the damage is known - you tried to read from
an address not in your address space, you lacked permissions for an operation,
etc. In an uncorrected error situation it is difficult to understand the
bounds of the problem in that way. Unless the hardware assists with
data poisoning etc., such errors may well be unconstrained and affect
a wider area than just the bracket of code that caught a GP fault.
You can often ring-fence critical code sequences by inserting error
barrier instructions before and after them. Those operations are
usually very expensive (drain the pipeline or similar) and are
suitable only in special places.
When running natively it is usually the "owner" of the affected data that
sees it bad in memory, e.g. from a read it made. In those cases we
have the owner on CPU and can kill/signal it synchronously.
There are times when the kernel may be shifting some data
on behalf of the application owner (e.g. copyin/copyout, shifting
network data etc.), in which case we still have a handle on the
real owner. If the access is from a scrub then we should not
panic - just wait and see if the owner does indeed use the bad data,
at which time we take appropriate action.
With the virtualisation layer there is the additional case of the HV or
dom0 performing operations on behalf of a guest, i.e. the HV may make the
access that traps but its own state is not affected.
CPU errors are trickier still. For example, what do we do when we're told that
while running guest A we displaced modified data from the L2 cache that had
uncorrectable ECC? We have a physical address only, and no idea of who the
data belongs to (guest A, a recently scheduled guest, or the HV?). Where
cachelines are tagged with some form of context or guest ID you have
a chance, provided that is reported in the error state.
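The cases above can be summarised as a small decision table. What follows is only a sketch: every name in it (the enums, struct error_report, classify_error) is hypothetical and invented for illustration, and the response chosen for the "HV acting for a guest" case is my reading of the paragraph rather than anything an existing hypervisor implements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Who made the access that reported the error (hypothetical names). */
enum accessor {
    ACC_OWNER,            /* the owner read its own bad data */
    ACC_KERNEL_FOR_OWNER, /* kernel moving data for the owner, e.g. copyin/copyout */
    ACC_SCRUBBER,         /* memory scrubber tripped over the error */
    ACC_HV_FOR_GUEST,     /* HV or dom0 operating on behalf of a guest */
    ACC_CACHE_WRITEBACK   /* displaced modified cacheline; physical address only */
};

/* What we can sensibly do about it (hypothetical names). */
enum response {
    RESP_SIGNAL_OWNER_NOW,   /* owner is on CPU: kill/signal synchronously */
    RESP_SIGNAL_KNOWN_OWNER, /* owner not on CPU, but we have a handle on it */
    RESP_DEFER_UNTIL_USED,   /* do not panic; act only if the owner uses the data */
    RESP_ATTRIBUTE_TO_GUEST, /* the accessor's own state is fine; the guest is affected */
    RESP_OWNER_UNKNOWN       /* cannot tell who the data belongs to */
};

struct error_report {
    enum accessor who;
    bool has_guest_tag; /* hardware reported a context or guest ID for the line */
};

/* Map an error report to a response, following the cases in the text. */
enum response classify_error(const struct error_report *r)
{
    assert(r != NULL);
    switch (r->who) {
    case ACC_OWNER:
        return RESP_SIGNAL_OWNER_NOW;
    case ACC_KERNEL_FOR_OWNER:
        return RESP_SIGNAL_KNOWN_OWNER;
    case ACC_SCRUBBER:
        return RESP_DEFER_UNTIL_USED;
    case ACC_HV_FOR_GUEST:
        return RESP_ATTRIBUTE_TO_GUEST;
    case ACC_CACHE_WRITEBACK:
        /* Only a tag in the error state lets us name an owner. */
        return r->has_guest_tag ? RESP_ATTRIBUTE_TO_GUEST : RESP_OWNER_UNKNOWN;
    }
    return RESP_OWNER_UNKNOWN;
}
```

The writeback case is the only one where the outcome depends on what the hardware reports, which is the point of the last sentence above.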
> Also consider the case where one cell (or small block of cells) has gone bad,
> but it's only used by one single piece of code that is using this try/catch code?
> I know, this is probably relatively rare, but I'm still worried that it will "break" things...
>>> I'm not sure if Linux, Solaris, *BSD, OS/2 or other OS's will allow
>>> "catching" a Kernel GP fault in a non-precise fashion (I know Linux has
>>> exception handling for EXACT positions in the code). But since at least one
>>> kernel DOES allow this, we can't be sure that a GPF will destroy the guest.
>>
>> When Linux and *BSD see a GPF while they are in userspace, then they kill
>> the process with a SIGSEGV. If they are in kernelspace, then they panic.
Solaris has some wrappers that can be applied, maybe at some expense to
performance, to make protected accesses that will catch and
survive various types of error including hardware errors,
wild pointers etc.
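To illustrate the general idea of a protected access, here is a user-space analogue in POSIX C. It is not the Solaris kernel mechanism; it is a sketch that catches SIGSEGV/SIGBUS around a single read using sigsetjmp/siglongjmp. The helper names protected_read_int and make_inaccessible_page are made up for this example, and the code is not thread-safe.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Jump target for the fault handler; only valid while a protected read runs. */
static sigjmp_buf protected_env;
static volatile sig_atomic_t protected_active = 0;

static void fault_handler(int sig)
{
    if (protected_active)
        siglongjmp(protected_env, sig); /* unwind back into protected_read_int */
    /* Fault outside a protected region: fall back to the default action. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/*
 * Hypothetical helper: read one int from addr.
 * Returns 0 and stores the value in *out on success,
 * or -1 if the access raised SIGSEGV or SIGBUS.
 */
int protected_read_int(const volatile int *addr, int *out)
{
    struct sigaction sa, old_segv, old_bus;
    int rc = 0;

    assert(out != NULL);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = fault_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* savemask = 1 so the signal mask is restored after siglongjmp. */
    if (sigsetjmp(protected_env, 1) == 0) {
        protected_active = 1;
        *out = *addr; /* the one access we are prepared to survive */
    } else {
        rc = -1;      /* we arrived here from the fault handler */
    }
    protected_active = 0;

    /* The setup and teardown on every call is the performance cost. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return rc;
}

/* Hypothetical helper: map one page with no access rights, to play the wild pointer. */
void *make_inaccessible_page(void)
{
    return mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

Calling protected_read_int() on a valid address returns 0 with the value stored; calling it on the page from make_inaccessible_page() returns -1 instead of killing the process.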
>>> Second point to note is of course that if the guest is in user-mode when
>>> the GPF happens, then almost all OS's will just kill the application - and
>>> there's absolutely no reason to believe that the application running is
>>> necessarily where the actual memory problem is - it may be caused by memory
>>> scrubbing for example.
Yes, these are the myriad permutations I was alluding to above.
>>> Whatever we do to the guest, it should be a "certain death", unless the
>>> kernel has told us "I can handle MCE's".
Yes, certain and instant death unless it is a PV guest that has registered
the ability to deal with these more elegantly.
>> It is obvious that there is no absolute generic way to handle all sort of
>> buggy guests. I vote for:
>>
>> If DomU has a PV MCA driver use this or inject a GPF.
>> Multiplexing all the MSR's related to emulate MCA/MCE for the guests is much
>> more complex than just injecting a GPF - and slower.
Do we need to send the non-PV guest a signal of any kind to kill it?
After all, we can stop it running any further instructions (and perhaps
avoid the use of bad data) by deciding within the HV or dom0 simply
to abort that guest. There is no loss to diagnosability since the
HV/dom0 combination is doing that, anyway.
> Emulating MCE to the guest wasn't my intended alternative suggestion. Instead,
> my idea was that if the guest hasn't registered a "PV MCE handler", we just
> immediately kill the domain as such - e.g similar to "domain_crash_synchronous()".
> Don't let the guest have any chance to "do something wrong" in the process - it's
> already broken, and letting it run any further will almost certainly not help matters.
> This may not be the prettiest solution, but then on the other hand, a "Windows blue-screen"
> or Linux "oops" saying GP fault happened at some random place in the guest isn't exactly
> helping the SysAdmin understand the problem either.
Agreed - don't let the affected guest run one more instruction if we can help it. Sysadmins
will learn to consult dom0 diagnostics to see if they explain any sudden guest deaths -
no need, as you say, to splurge any raw error data to them.
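The policy we seem to be converging on fits in a few lines. This is a sketch only: struct guest, its fields and handle_guest_mce are hypothetical names, and clearing the running flag stands in for whatever the HV really does to stop a domain (something along the lines of the domain_crash_synchronous() Mats mentions).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for per-domain state (hypothetical fields). */
struct guest {
    int      domid;
    bool     pv_mce_registered; /* guest told us "I can handle MCE's" */
    bool     running;
    unsigned pending_mce;       /* events queued for the PV handler */
};

enum mce_outcome {
    MCE_DELIVERED_TO_PV_HANDLER,
    MCE_DOMAIN_KILLED
};

/*
 * Apply the policy to a guest affected by an uncorrectable error:
 * a registered PV handler gets the event; any other guest is stopped
 * immediately, with no GPF or emulated MCE injected into it.
 */
enum mce_outcome handle_guest_mce(struct guest *g)
{
    assert(g != NULL);
    if (g->pv_mce_registered) {
        g->pending_mce++;
        return MCE_DELIVERED_TO_PV_HANDLER;
    }
    g->running = false; /* not one more instruction */
    return MCE_DOMAIN_KILLED;
}
```

The diagnosis stays with the HV/dom0 in both branches; the only difference is whether the guest gets a chance to react.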
Gavin
>
> --
> Mats
>> Keir, what are your opinions on this thread?
>>
>>
>> Christoph
--
Gavin Maltby, Solaris Kernel Development.