From: Frank van der Linden <Frank.Vanderlinden@Sun.COM>
To: "Jiang, Yunhong" <yunhong.jiang@intel.com>
Cc: Christoph Egger <Christoph.Egger@amd.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"Ke, Liping" <liping.ke@intel.com>,
	Gavin Maltby <Gavin.Maltby@Sun.COM>,
	Keir Fraser <keir.fraser@eu.citrix.com>,
	"Kleen, Andi" <andi.kleen@intel.com>
Subject: Re: Re: [RFC] RAS(Part II)--MCA enalbing in XEN
Date: Tue, 24 Feb 2009 11:53:51 -0700
Message-ID: <49A4423F.7070702@Sun.COM>
In-Reply-To: <E2263E4A5B2284449EEBD0AAB751098401C7B24F23@PDSMSX501.ccr.corp.intel.com>

Thanks for your reply. Let me explain my comments a little:

Jiang, Yunhong wrote:
> 
> One thing to note: we deliver a vMCE to dom0/domU only when it is impacted. The idea behind this is that the MCE is handled entirely by the Xen hypervisor, while the guest's vMCE handler only works for the guest itself. For example, when a page is broken, Xen first marks the page offline on the Xen side (i.e. takes the recovery action), then injects a vMCE into the corresponding guest (dom0 or domU); the guest then kills the application using the page, frees the page, or takes further action.
> 
> And we always pass the vIRQ to dom0 for logging and telemetry; user-space tools can take more proactive action based on this if needed.

I understand this part, and have no problems with the mechanism itself. 
I think it has advantages over the original concept, where dom0 informs 
domUs. My question is: what useful action can a domU take without fully 
knowing the physical system? I'll go into that in more detail below.

>> What would be needed for the Solaris framework, however, is to provide
>> information on what action was taken, along with the telemetry. As
> 
> Agreed that this modification is needed. Sorry, we didn't realize Dom0 had this requirement after a reboot.
> 
> Either we can pass the action in the telemetry, or Dom0 can use an action-specific method, like retrieving the offlined pages from Xen before reboot. If we take the former approach, we may need an interface definition.

Passing the action along with the telemetry seems the best way to go to 
me. Since the telemetry is used to determine which action to take, any 
information on actions already taken should come at the same time.
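
To make the idea concrete, here is a minimal sketch of a telemetry 
record that carries the action already taken. This is only an 
illustration: every name in it (mc_action, mc_telemetry, the helper 
functions) is hypothetical and is not taken from Xen or from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical recovery actions the hypervisor may already have taken. */
enum mc_action {
    MC_ACTION_NONE = 0,       /* nothing done yet */
    MC_ACTION_PAGE_OFFLINED   /* a page was taken offline */
};

/* Hypothetical telemetry record: raw bank data plus the action taken. */
struct mc_telemetry {
    uint32_t bank;            /* MCA bank number */
    uint64_t status;          /* MCi_STATUS value */
    uint64_t addr;            /* MCi_ADDR value (machine address) */
    enum mc_action action;    /* what was already done about it */
    uint64_t action_arg;      /* e.g. the offlined machine frame number */
};

/* Build a record for an error on which no action has been taken yet. */
static struct mc_telemetry mc_make_record(uint32_t bank, uint64_t status,
                                          uint64_t addr)
{
    struct mc_telemetry t = { bank, status, addr, MC_ACTION_NONE, 0 };
    return t;
}

/* Note in the record that the given frame was offlined. */
static void mc_note_page_offlined(struct mc_telemetry *t, uint64_t mfn)
{
    assert(t != NULL);
    t->action = MC_ACTION_PAGE_OFFLINED;
    t->action_arg = mfn;
}

/* Consumer side: does the reader still have to decide on an action? */
static int mc_action_pending(const struct mc_telemetry *t)
{
    assert(t != NULL);
    return t->action == MC_ACTION_NONE;
}
```

With a layout along these lines, whoever reads the telemetry sees the 
error data and the action in one record, so nothing has to be queried 
separately later.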

> 
> What do you mean by the effect of the wrmsr instruction? We need to consider injecting #GP on an invalid wrmsr, or removing the event when the guest clears MCi_STATUS_MCA, if needed. We sent this RFC early to get feedback on the design idea first.
> Or do you mean more than this for the wrmsr?
> 
>> To take further action, the MCA code in dom0 (or a domU) needs to know
>> that it is running under Xen, and it needs to have detailed physical
> 
> Our purpose is that the guest has no idea it is running under Xen, as described above. And what information do you think a normal guest's MCA handler needs to know, and how would it use the detailed physical information? After all, a guest cares only about itself. Also, we may not be able to provide a PV handler for every guest (like Windows).
> 
> Dom0 is a special case: its vIRQ handler knows it is running under Xen, but that is for logging/telemetry and for proactive action.
> 
>> information on the system. In other words, the existing code
>> that can be
> 
> What do you mean by "existing": our patch or the current Xen implementation?
> 
>> used is only the code that gathers some information. So, the
>> only thing
>> that vMCE is good for, is that you can run unmodified error logging
>> code. But you can't interpret any of the error information further
>> without knowing more. Especially for a domU, which might not know
>> anything, this doesn't seem useful. What would the user of a domU do with
>> that information? 
>> To recap, I think the part where Xen itself takes action is fine, with
>> some modifications. But I don't see any advantages in vMCE delivery,
>> unless I'm missing something of course..
> 
> I think the main advantages are:
> a) We don't need to maintain a PV MCA handler for guests, especially for HVM guests.
> b) We can benefit from improvements/enhancements to the guest's MCA handling.
> c) Applying this to dom0, we don't need different mechanisms for dom0 and HVM.

Ok, my main issue here is: if you want to enable a guest to run 
unmodified MCA code (which you state as a goal, and as an advantage of 
the vMCE approach), then what can the guest actually do? Or the dom0, 
for that matter?

MCA information is highly specific to the hardware. Without additional 
information on the hardware, it is hard, or even impossible, for the 
unmodified MCA handler in dom0 or a domU to do anything useful. It will 
interpret the information to fit the virtualized environment it is in, 
which doesn't match the reality of the hardware at all. So what can it 
do? It can just read the MSRs and log the information, but even that 
information wouldn't be useful; it is already available to dom0, where 
the code and/or person who can make sense of the data will see it. The 
unmodified MCA handler also can't take any corrective action; it might 
think that it is taking action, but in fact, its wrmsr instructions have 
no effect (and they shouldn't, guests should definitely not be able to 
do MSR writes).
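
As a rough sketch of what "no effect" means here: a virtualized wrmsr 
on the MCA bank registers would only ever touch per-domain shadow state, 
never the hardware. The code below assumes the usual bank layout (CTL, 
STATUS, ADDR and MISC at 0x400 + 4 * bank + 0..3). Everything else is 
made up for illustration: the vmce_* names, the bank count, and the 
policy of accepting only clearing writes and asking for a #GP otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_MC0_CTL    0x400u  /* first MCA bank register */
#define VMCE_NR_BANKS  4u      /* hypothetical number of virtual banks */

/* Per-domain shadow copy of one bank; the hardware is never written. */
struct vmce_bank  { uint64_t ctl, status, addr, misc; };
struct vmce_state { struct vmce_bank bank[VMCE_NR_BANKS]; };

enum vmce_result { VMCE_OK = 0, VMCE_INJECT_GP = -1 };

/*
 * Emulate a guest wrmsr to an MCA bank register.
 * Returns VMCE_INJECT_GP when the caller should inject #GP into the guest.
 */
static enum vmce_result vmce_wrmsr(struct vmce_state *s, uint32_t msr,
                                   uint64_t val)
{
    uint32_t off;
    struct vmce_bank *b;

    assert(s != NULL);
    if (msr < MSR_MC0_CTL || msr >= MSR_MC0_CTL + 4u * VMCE_NR_BANKS)
        return VMCE_INJECT_GP;       /* not a virtual bank register */

    off = msr - MSR_MC0_CTL;
    b = &s->bank[off / 4u];

    switch (off % 4u) {
    case 0:                          /* MCi_CTL: remember what the guest set */
        b->ctl = val;
        return VMCE_OK;
    case 1:                          /* MCi_STATUS: only clearing is accepted */
        if (val != 0)
            return VMCE_INJECT_GP;
        b->status = 0;
        return VMCE_OK;
    case 2:                          /* MCi_ADDR: same policy in this sketch */
        if (val != 0)
            return VMCE_INJECT_GP;
        b->addr = 0;
        return VMCE_OK;
    default:                         /* MCi_MISC: same policy in this sketch */
        if (val != 0)
            return VMCE_INJECT_GP;
        b->misc = 0;
        return VMCE_OK;
    }
}
```

The guest's handler sees its write succeed and its STATUS register read 
back as zero, but the physical bank is untouched, which is exactly why 
its "corrective action" is not real.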

I only see one possible exception to this: if you translate the ADDR MSR 
of a bank to a guest address in the vmca info before delivering the 
vMCE, then the guest could do something useful, because its virtualized 
MSR reads would then produce a guest address, and it could do something 
useful with it. But currently, your code doesn't seem to do this; the 
virtualized MSR will produce the machine address, which the guest can't 
do anything with, unless it knows it's running under Xen.
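
To show the kind of translation I mean, here is a toy sketch. It 
assumes a flat per-domain table mapping guest frame numbers to machine 
frame numbers and does a simple reverse search. The toy_domain type, the 
function name and the INVALID_GADDR marker are all invented for this 
example and do not reflect how Xen actually stores its mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT     12
#define PAGE_OFFSET(a) ((a) & ((1ULL << PAGE_SHIFT) - 1))
#define INVALID_GADDR  UINT64_MAX  /* hypothetical "not in this guest" marker */

/* Toy domain: p2m[gfn] holds the machine frame backing guest frame gfn. */
struct toy_domain {
    const uint64_t *p2m;
    size_t nr_pages;
};

/*
 * Translate a machine address (as found in a physical MCi_ADDR) into a
 * guest address for domain d, or INVALID_GADDR if d does not own the page.
 * The linear search is only for illustration.
 */
static uint64_t maddr_to_gaddr(const struct toy_domain *d, uint64_t maddr)
{
    uint64_t mfn = maddr >> PAGE_SHIFT;
    size_t gfn;

    assert(d != NULL && d->p2m != NULL);
    for (gfn = 0; gfn < d->nr_pages; gfn++) {
        if (d->p2m[gfn] == mfn)
            return ((uint64_t)gfn << PAGE_SHIFT) | PAGE_OFFSET(maddr);
    }
    return INVALID_GADDR;
}
```

The virtual ADDR MSR would then be filled with the translated value 
before the vMCE is delivered, and a guest that does not own the page 
would not get a usable address at all.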

So that's my main problem here: there is a contradiction. The vMCE 
mechanism as you implement it enables guests to run an unmodified MCA 
handler, but there isn't actually much that the guest can do with that, 
without knowing it runs under Xen. I see only one specific use for this: 
if you translate the ADDR info to a guest address, it could potentially 
try to do a "local" page retire.

- Frank
