From: George Dunlap <george.dunlap@eu.citrix.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: "Liu, Jinsong" <jinsong.liu@intel.com>,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
"JBeulich@suse.com" <JBeulich@suse.com>
Subject: Re: [PATCH V3] X86/vMCE: handle broken page with regard to migration
Date: Tue, 20 Nov 2012 15:08:46 +0000
Message-ID: <50AB9CFE.7000003@eu.citrix.com>
In-Reply-To: <1353344256.18229.129.camel@zakaz.uk.xensource.com>
On 19/11/12 16:57, Ian Campbell wrote:
> On Mon, 2012-11-19 at 15:29 +0000, George Dunlap wrote:
>> On 19/11/12 09:55, Ian Campbell wrote:
>>> If we get to this stage then haven't we either already sent something
>>> over the wire for this page or marked it as dirty when we tried and
>>> failed to send it?
>>>
>>> In the former case we don't care that the page is now broken on the
>>> source since the target has got a good pre-breakage copy.
>>>
>>> In the latter case could we not set a flag at the same time as we mark
>>> the page dirty which means "go round at least one more time"?
>> Yeah -- on the last iteration, the VM itself has to be paused; if any
>> pages get broken after that, it doesn't really matter, does it? The real
>> thing is to have a consistent "snapshot" of behavior.
>>
>> I guess the one potentially tricky case to worry about is whether to
>> deliver an MCE to the guest on restore. Consider the following scenario:
>>
>> - Page A is modified (and marked dirty)
>> - VM paused for last iteration
>> - Page breaks, is marked broken in the p2m
>> - Save code sends page A
>>
>> In that case, the save code would send a "broken" page, and the restore
>> code would mark a page as broken, and we *would* want to deliver an MCE
>> on the far side. But suppose the last two steps were reversed:
>>
>> - Page A modified
>> - VM paused for last iteration
>> - Save code sends page A
>> - Page breaks, marked broken in the p2m
>>
>> In that case, when the save code sends page A, it will send a good page;
>> there's no need to mark it broken, or to send the guest an MCE.
> I guess you'd want to err on the side of stopping using a good page, as
> opposed to continuing to use a bad page? i.e. its better to take a
> spurious vMCE than to not take an actual one.
While that's true, taking a spurious MCE means at the very least one less
page available to the guest to use (for HVM guests that haven't
ballooned down, at least), and the unnecessary loss of the data in that
page.
The problem I guess is that the save code at the moment has no way of
distinguishing the following cases:
1. Marked broken after the last time I sent it, but before the VM was
paused; the page hadn't been written to
2. Marked broken after the VM was paused; page hadn't been written to
3. Marked broken after the VM was paused, but the page had been written to
In case 1, we definitely need to send a broken page; but the VM may have
already received a vMCE. In case 2, we don't need to send a broken page
or a vMCE, while in case 3 we need to do both.
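To make the three cases concrete, here is a rough model of them (Python rather than the actual C save code; every name below is made up for illustration and nothing here exists in Xen or libxc). It assumes, as described above, that all the save code can observe is that the page is currently marked broken:

```python
from enum import Enum

# Illustrative model only; none of these names exist in Xen or libxc.


class Case(Enum):
    BROKE_BEFORE_PAUSE_CLEAN = 1  # broken after last send, before pause, not written to
    BROKE_AFTER_PAUSE_CLEAN = 2   # broken after pause, not written to
    BROKE_AFTER_PAUSE_DIRTY = 3   # broken after pause, written to since last send


class Vmce(Enum):
    NEEDED = "needed"
    NOT_NEEDED = "not needed"
    MAYBE_ALREADY_SEEN = "guest may already have received one"


# What we would ideally do: (send the page as broken?, vMCE on the far side)
IDEAL = {
    Case.BROKE_BEFORE_PAUSE_CLEAN: (True, Vmce.MAYBE_ALREADY_SEEN),
    Case.BROKE_AFTER_PAUSE_CLEAN: (False, Vmce.NOT_NEEDED),
    Case.BROKE_AFTER_PAUSE_DIRTY: (True, Vmce.NEEDED),
}


def save_code_view(case):
    """What the save code is assumed to observe in the last iteration."""
    # Assumption taken from the discussion: only the current "broken"
    # marking is visible, so the answer is the same for every case.
    return "broken"


if __name__ == "__main__":
    for case in Case:
        send_broken, vmce = IDEAL[case]
        print(case.name, save_code_view(case), send_broken, vmce.value)
```

The ideal actions differ per case, but the observation is identical for all three, which is exactly why the save code cannot pick the right one.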
On the other hand, the whole situation is hopefully rare enough that
maybe we can just do the simple correct thing, even if it's a tiny bit
sub-optimal. In that case, assuming that spurious vMCEs aren't a
problem (e.g., in case 1), I think we basically just need to see if the last
iteration contains a broken page, and if so, send the guest a vMCE on
resume.
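Roughly, the simple rule would look something like this (again a Python sketch, not the real save/restore code; the function name and the "broken"/"normal" page-type strings are hypothetical stand-ins for whatever the stream actually carries):

```python
def vmce_needed_on_resume(last_iteration_types):
    """Decide whether to inject a vMCE when the guest resumes.

    last_iteration_types: iterable of page-type labels for the pages sent
    in the final iteration (the one done with the VM paused). The labels
    are hypothetical; only "broken" is treated specially.
    """
    # Simple, conservative rule: any broken page in the last iteration
    # triggers a vMCE, even if the guest already saw one on the source.
    return any(t == "broken" for t in last_iteration_types)


if __name__ == "__main__":
    print(vmce_needed_on_resume(["normal", "broken", "normal"]))
    print(vmce_needed_on_resume(["normal", "normal"]))
```

This deliberately accepts a spurious vMCE in exchange for never missing a real one.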
Thoughts?
> I'm not actually sure what a guest does with a vMCE, I guess it does
> some sort of memory exchange to give the bad page back to the h/v and
> get a good page in return? If the hypervisor thinks the old page is ok
> rather than bad I guess it'll just put it in the free list instead of
> the bad list?
Yes, I'm pretty sure the hypervisor's accounting of broken pages is
separate from guest p2m entries; I think if you mark a p2m entry broken,
the hypervisor will just free the RAM page that was mapped there before.
I think the guest just tries to recover gracefully when it gets a vMCE
(e.g., by re-reading the page from disk or killing the process). I
don't think it asks the hypervisor for another page to replace it at
this point.
-George