From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Wen Congyang <wency@cn.fujitsu.com>
Cc: xen devel <xen-devel@lists.xen.org>
Subject: Re: question about migration
Date: Tue, 29 Dec 2015 10:57:12 +0000
Message-ID: <56826708.4040005@citrix.com>
In-Reply-To: <567C9415.4070403@cn.fujitsu.com>
On 25/12/2015 00:55, Wen Congyang wrote:
> On 12/24/2015 08:36 PM, Andrew Cooper wrote:
>> On 24/12/15 02:29, Wen Congyang wrote:
>>> Hi Andrew Cooper:
>>>
>>> I rebased the COLO code onto the newest upstream Xen and tested it. I found
>>> a problem during testing, and I can reproduce it with plain migration.
>>>
>>> How to reproduce:
>>> 1. xl cr -p hvm_nopv
>>> 2. xl migrate hvm_nopv 192.168.3.1
>> You are the very first person to try a usecase like this.
>>
>> It works as much as it does because of your changes to the uncooperative HVM domain logic. As I have said repeatedly during review, this is not necessarily a safe change to make without an in-depth analysis of the knock-on effects; it looks as if you have found the first knock-on effect.
>>
>>> The migration succeeds, but the VM doesn't run on the target machine.
>>> You can get the reason from 'xl dmesg':
>>> (XEN) HVM2 restore: VMCE_VCPU 1
>>> (XEN) HVM2 restore: TSC_ADJUST 0
>>> (XEN) HVM2 restore: TSC_ADJUST 1
>>> (d2) HVM Loader
>>> (d2) Detected Xen v4.7-unstable
>>> (d2) Get guest memory maps[128] failed. (-38)
>>> (d2) *** HVMLoader bug at e820.c:39
>>> (d2) *** HVMLoader crashed.
>>>
>>> The reason is that:
>>> We don't call xc_domain_set_memory_map() on the target machine.
>>> When we create an HVM domain:
>>> libxl__domain_build()
>>>     libxl__build_hvm()
>>>         libxl__arch_domain_construct_memmap()
>>>             xc_domain_set_memory_map()
>>>
>>> Should we migrate the guest memory from source machine to target machine?
>> This bug specifically is because HVMLoader is expected to have run and turned the hypercall information into an E820 table in the guest before a migration occurs.
>>
>> Unfortunately, the current codebase is riddled with such assumptions and expectations (e.g. the HVM save code assumes that FPU context is valid when it is saving register state), which is a direct side effect of how it was developed.
> Does the FPU context have a similar problem?
Yes, although it is far harder to spot, and no software will likely
crash as a result.
> IIRC, I tested COLO before 4.6 was released. It worked. In my test, I always
> use the option '-p' to start the HVM guest.
If the FPU wasn't initialised, the save code memset()s the x87 register
block to 0. On the restore side, this is taken and loaded back.
The problem is that a block of zeroes is valid for the x87, but it is not
the default state which the vcpu would expect to observe if it has not
reset the FPU itself. However, the first thing any real software will do
is reset the values properly.
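
As a rough illustration of the point above, the sketch below contrasts a
zeroed control word with the value software gets after executing FNINIT
(0x037F, as documented in the Intel SDM). The struct and the toy_* names
are hypothetical and much simpler than Xen's real save record; this is an
assumption-laden model, not the actual save/restore code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified x87 control state (not Xen's layout). */
struct toy_fpu_ctrl {
    uint16_t fcw;   /* x87 control word */
    uint16_t fsw;   /* x87 status word */
};

#define TOY_FCW_AFTER_FNINIT 0x037Fu  /* value FNINIT loads into FCW */
#define TOY_FCW_EXC_MASKS    0x003Fu  /* bits 0-5: exception mask bits */

/* Models the save side for a vcpu whose FPU was never initialised:
 * the block is simply zeroed. */
static void toy_save_uninitialised(struct toy_fpu_ctrl *out)
{
    assert(out != NULL);
    memset(out, 0, sizeof(*out));
}

/* Models what real software does first: reset the x87 with FNINIT. */
static void toy_fninit(struct toy_fpu_ctrl *f)
{
    assert(f != NULL);
    f->fcw = TOY_FCW_AFTER_FNINIT;
    f->fsw = 0;
}

/* Returns 1 if all six x87 exceptions are masked, 0 otherwise. */
static int toy_all_exceptions_masked(const struct toy_fpu_ctrl *f)
{
    return (f->fcw & TOY_FCW_EXC_MASKS) == TOY_FCW_EXC_MASKS;
}
```

A zeroed control word is architecturally valid, but it leaves every x87
exception unmasked, which differs from what software sees after FNINIT.
Once the guest calls the equivalent of toy_fninit(), the difference
disappears, which is why nothing is likely to crash.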
>
>>
>> Having said all of the above, I agree that your example is a usecase which should work. It is the ultimate test of whether the migration stream contains enough information to faithfully reproduce the domain on the far side. Clearly at the moment, this is not the case.
> I think it should work. But the user doesn't use the migration like this.
> So it is not a serious problem.
It is still worth identifying as an issue.
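
For the e820 failure quoted earlier, a toy model of the sequence may help.
Everything named toy_* below is hypothetical and only mimics the shape of
the problem: the builder sets a memory map on the source, the restore path
does not carry it across, and the guest's query on the destination fails
with -ENOSYS (38 on Linux, matching the -38 in the log). It is a sketch
under those assumptions, not the real libxc or hypervisor code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_E820 8

struct toy_e820_entry {
    unsigned long long addr;
    unsigned long long size;
    unsigned int type;
};

/* Hypothetical per-domain state; nr_entries == 0 means "never set". */
struct toy_domain {
    struct toy_e820_entry map[TOY_MAX_E820];
    size_t nr_entries;
};

/* Stands in for the toolstack setting the map at domain build time. */
static int toy_set_memory_map(struct toy_domain *d,
                              const struct toy_e820_entry *e, size_t n)
{
    assert(d != NULL && e != NULL);
    if (n == 0 || n > TOY_MAX_E820)
        return -EINVAL;
    for (size_t i = 0; i < n; i++)
        d->map[i] = e[i];
    d->nr_entries = n;
    return 0;
}

/* Stands in for the guest firmware asking for its memory map.
 * Returns the number of entries copied, or a negative errno value. */
static int toy_get_memory_map(const struct toy_domain *d,
                              struct toy_e820_entry *out, size_t max)
{
    assert(d != NULL && out != NULL);
    if (d->nr_entries == 0)
        return -ENOSYS;            /* no map was ever set */
    if (max < d->nr_entries)
        return -EINVAL;
    for (size_t i = 0; i < d->nr_entries; i++)
        out[i] = d->map[i];
    return (int)d->nr_entries;
}

/* Models a restore whose stream carries no memory map: the destination
 * domain ends up with none, whatever the source had. */
static void toy_restore_without_map(const struct toy_domain *src,
                                    struct toy_domain *dst)
{
    assert(src != NULL && dst != NULL);
    (void)src;                     /* the map is deliberately not copied */
    dst->nr_entries = 0;
}
```

In this model a guest that has already run its firmware never calls
toy_get_memory_map() again after migration, so the missing map goes
unnoticed; a guest created paused and migrated before its first
instruction hits the failure immediately.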
~Andrew