Re: [Qemu-devel] [Migration Bug? ] Occasionally, the content of VM's memory is inconsistent between Source and Destination of migration

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: zhanghailiang <zhang.zhanghailiang@huawei.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: hangaohuai@huawei.com, Li Zhijian <lizhijian@cn.fujitsu.com>,
	Juan Quintela <quintela@redhat.com>,
	peter.huangpeng@huawei.com, qemu-devel@nongnu.org,
	"Gonglei (Arei)" <arei.gonglei@huawei.com>,
	Amit Shah <amit.shah@redhat.com>,
	david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [Migration Bug? ] Occasionally, the content of VM's memory is inconsistent between Source and Destination of migration
Date: Wed, 25 Mar 2015 19:48:44 +0800	[thread overview]
Message-ID: <5512A09C.2000702@huawei.com> (raw)
In-Reply-To: <20150325113615.GE2313@work-vm>

On 2015/3/25 19:36, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2015/3/25 17:46, Dr. David Alan Gilbert wrote:
>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>> Hi all,
>>>>
>>>> We found that, sometimes, the content of VM's memory is inconsistent between Source side and Destination side
>>>> when we check it just after finishing migration but before VM continue to Run.
>>>>
>>>> We use a patch like bellow to find this issue, you can find it from affix,
>>>> and Steps to reproduce:
>>>>
>>>> (1) Compile QEMU:
>>>>   ./configure --target-list=x86_64-softmmu  --extra-ldflags="-lssl" && make
>>>>
>>>> (2) Command and output:
>>>> SRC: # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cpu qemu64,-kvmclock -netdev tap,id=hn0-device virtio-net-pci,id=net-pci0,netdev=hn0 -boot c -drive file=/mnt/sdb/pure_IMG/sles/sles11_sp3.img,if=none,id=drive-virtio-disk0,cache=unsafe -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio
>>>> (qemu) migrate tcp:192.168.3.8:3004
>>>> before saving ram complete
>>>> ff703f6889ab8701e4e040872d079a28
>>>> md_host : after saving ram complete
>>>> ff703f6889ab8701e4e040872d079a28
>>>>
>>>> DST: # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cpu qemu64,-kvmclock -netdev tap,id=hn0,vhost=on -device virtio-net-pci,id=net-pci0,netdev=hn0 -boot c -drive file=/mnt/sdb/pure_IMG/sles/sles11_sp3.img,if=none,id=drive-virtio-disk0,cache=unsafe -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -incoming tcp:0:3004
>>>> (qemu) QEMU_VM_SECTION_END, after loading ram
>>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>> md_host : after loading all vmstate
>>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>> md_host : after cpu_synchronize_all_post_init
>>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>>
>>>> This happens occasionally, and it is more easy to reproduce when issue migration command during VM's startup time.
>>>>
>>>> We have done further test and found that some pages has been dirtied but its corresponding migration_bitmap is not set.
>>>> We can't figure out which modules of QEMU has missed setting bitmap when dirty page of VM,
>>>> it is very difficult for us to trace all the actions of dirtying VM's pages.
>>>>
>>>> Actually, the first time we found this problem was in the COLO FT development, and it triggered some strange issues in
>>>> VM which all pointed to the issue of inconsistent of VM's memory. (We have try to save all memory of VM to slave side every time
>>>> when do checkpoint in COLO FT, and everything will be OK.)
>>>>
>>>> Is it OK for some pages that not transferred to destination when do migration ? Or is it a bug?
>>>
>>> That does sound like a bug.
>>> The only other explanation I have is that memory is being changed by a device emulation
>>> that happens after the end of a saving the vm, or after loading the memory.  That's
>>> certainly possible - especially if a device (say networking) hasn't been properly
>>> stopped.
>>>
>>>> This issue has blocked our COLO development... :(
>>>>
>>>> Any help will be greatly appreciated!
>>>
>>
>> Hi Dave???
>>
>>> I suggest:
>>>     1) Does it happen with devices other than virtio?
>>>     2) Strip the devices down - e.g. just run with serial and no video/usb
>>
>> I will try this ...
>>
>>>     3) Try doing the md5 comparison at the end of ram_save_complete
>>
>> Yes, we have done this test at the end of ram_save_complete.
>>
>>>     4) mprotect RAM after the ram_save_complete and see if anything faults.
>>
>> I have tried, and there will be kvm error reports. like bellow:
>>
>> KVM: entry failed, hardware error 0x7
>
> If the RAM is mprotect'd with PROT_READ|PROT_EXEC I'm not sure why this
> would happen; but then the question is why is it trying to do a KVM entry
> after the end of ram_save_complete - that suggests the CPUs are trying to
> run again, and that shouldn't happen in a normal migration.
> If you can trigger this KVM entry on a non-colo migration, then I suggest
> get the backtrace from where the KVM entry failed, because then you
> should be able to see why the CPU was being restarted.
>

Er, sorry, the follow report is reproduced with COLO, i will test this with mprotect
in normal migration.

> Dave
>
>> EAX=00000000 EBX=0000e000 ECX=00009578 EDX=0000434f
>> ESI=0000fc10 EDI=0000434f EBP=00000000 ESP=00001fca
>> EIP=00009594 EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0040 00000400 0000ffff 00009300
>> CS =f000 000f0000 0000ffff 00009b00
>> SS =434f 000434f0 0000ffff 00009300
>> DS =434f 000434f0 0000ffff 00009300
>> FS =0000 00000000 0000ffff 00009300
>> GS =0000 00000000 0000ffff 00009300
>> LDT=0000 00000000 0000ffff 00008200
>> TR =0000 00000000 0000ffff 00008b00
>> GDT=     0002dcc8 00000047
>> IDT=     00000000 0000ffff
>> CR0=00000010 CR2=ffffffff CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=c0 74 0f 66 b9 78 95 00 00 66 31 d2 66 31 c0 e9 47 e0 fb 90 <f3> 90 fa fc 66 c3 66 53 66 89 c3 66 e8 9d e8 ff ff 66 01 c3 66 89 d8 66 e8 40 e9 ff ff 66
>> ERROR: invalid runstate transition: 'internal-error' -> 'colo'
>>
>>>     5) Can you trigger this with normal migration or just COLO?
>>
>> Yes, we have reproduced it without COLO, just in normal migration
>>
>>>        I'm wondering if something is doing something on a running/paused/etc state
>>>        change and isn't expecting the new COLO states.
>>>
>>
>> Actually, we can also reproduce this with COLO, only in master side, we store all RAM to ram-cache, and also duplicate migration_bitmap,
>> and after a checkpoint, before VM continue to run, we compare cache memory and current memory, if they are different but whose
>> dirty bitmap is not set, we thought this page is missing tracing by migration bitmap. The result is  result will be like:
>> (Certainly, this happens also occasionally, sometimes, there will be none page un-traced.)
>> pc.ram[0xc7000](mis_dirty)
>> pc.ram[0xec000](mis_dirty)
>> pc.ram[0xf4000](mis_dirty)
>> pc.ram[0xfa000](mis_dirty)
>> pc.ram[0xbf93000](mis_dirty)
>> pc.ram[0xbf94000](mis_dirty)
>> pc.ram[0xbf95000](mis_dirty)
>> pc.ram[0xbf96000](mis_dirty)
>> pc.ram[0xbf97000](mis_dirty)
>> pc.ram[0xbf98000](mis_dirty)
>> pc.ram[0xbf99000](mis_dirty)
>> pc.ram[0xbf9a000](mis_dirty)
>> pc.ram[0xbf9b000](mis_dirty)
>> pc.ram[0xbf9c000](mis_dirty)
>> ...
>>
>> Thanks,
>> zhang
>>
>>>> --- a/savevm.c
>>>> +++ b/savevm.c
>>>> @@ -51,6 +51,26 @@
>>>>   #define ARP_PTYPE_IP 0x0800
>>>>   #define ARP_OP_REQUEST_REV 0x3
>>>>
>>>> +#include "qemu/rcu_queue.h"
>>>> +#include <openssl/md5.h>
>>>> +
>>>> +static void check_host_md5(void)
>>>> +{
>>>> +    int i;
>>>> +    unsigned char md[MD5_DIGEST_LENGTH];
>>>> +    MD5_CTX ctx;
>>>> +    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check 'pc.ram' block */
>>>> +
>>>> +    MD5_Init(&ctx);
>>>> +    MD5_Update(&ctx, (void *)block->host, block->used_length);
>>>> +    MD5_Final(md, &ctx);
>>>> +    printf("md_host : ");
>>>> +    for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
>>>> +        fprintf(stderr, "%02x", md[i]);
>>>> +    }
>>>> +    fprintf(stderr, "\n");
>>>> +}
>>>> +
>>>>   static int announce_self_create(uint8_t *buf,
>>>>                                   uint8_t *mac_addr)
>>>>   {
>>>> @@ -741,7 +761,13 @@ void qemu_savevm_state_complete(QEMUFile *f)
>>>>           qemu_put_byte(f, QEMU_VM_SECTION_END);
>>>>           qemu_put_be32(f, se->section_id);
>>>>
>>>> +        printf("before saving %s complete\n", se->idstr);
>>>> +        check_host_md5();
>>>> +
>>>>           ret = se->ops->save_live_complete(f, se->opaque);
>>>> +        printf("after saving %s complete\n", se->idstr);
>>>> +        check_host_md5();
>>>> +
>>>>           trace_savevm_section_end(se->idstr, se->section_id, ret);
>>>>           if (ret < 0) {
>>>>               qemu_file_set_error(f, ret);
>>>> @@ -1030,6 +1063,11 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>               }
>>>>
>>>>               ret = vmstate_load(f, le->se, le->version_id);
>>>> +            if (section_type == QEMU_VM_SECTION_END) {
>>>> +                printf("QEMU_VM_SECTION_END, after loading %s\n", le->se->idstr);
>>>> +                check_host_md5();
>>>> +            }
>>>> +
>>>>               if (ret < 0) {
>>>>                   error_report("error while loading state section id %d(%s)",
>>>>                                section_id, le->se->idstr);
>>>> @@ -1061,7 +1099,11 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>           g_free(buf);
>>>>       }
>>>>
>>>> +    printf("after loading all vmstate\n");
>>>> +    check_host_md5();
>>>>       cpu_synchronize_all_post_init();
>>>> +    printf("after cpu_synchronize_all_post_init\n");
>>>> +    check_host_md5();
>>>>
>>>>       ret = 0;
>>>>
>>>> --
>>>
>>>>  From ecb789cf7f383b112da3cce33eb9822a94b9497a Mon Sep 17 00:00:00 2001
>>>> From: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>> Date: Tue, 24 Mar 2015 21:53:26 -0400
>>>> Subject: [PATCH] check pc.ram block md5sum between migration Source and
>>>>   Destination
>>>>
>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>> ---
>>>>   savevm.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>>>   1 file changed, 42 insertions(+)
>>>>   mode change 100644 => 100755 savevm.c
>>>>
>>>> diff --git a/savevm.c b/savevm.c
>>>> old mode 100644
>>>> new mode 100755
>>>> index 3b0e222..3d431dc
>>>> --- a/savevm.c
>>>> +++ b/savevm.c
>>>> @@ -51,6 +51,26 @@
>>>>   #define ARP_PTYPE_IP 0x0800
>>>>   #define ARP_OP_REQUEST_REV 0x3
>>>>
>>>> +#include "qemu/rcu_queue.h"
>>>> +#include <openssl/md5.h>
>>>> +
>>>> +static void check_host_md5(void)
>>>> +{
>>>> +    int i;
>>>> +    unsigned char md[MD5_DIGEST_LENGTH];
>>>> +    MD5_CTX ctx;
>>>> +    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks);/* Only check 'pc.ram' block */
>>>> +
>>>> +    MD5_Init(&ctx);
>>>> +    MD5_Update(&ctx, (void *)block->host, block->used_length);
>>>> +    MD5_Final(md, &ctx);
>>>> +    printf("md_host : ");
>>>> +    for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
>>>> +        fprintf(stderr, "%02x", md[i]);
>>>> +    }
>>>> +    fprintf(stderr, "\n");
>>>> +}
>>>> +
>>>>   static int announce_self_create(uint8_t *buf,
>>>>                                   uint8_t *mac_addr)
>>>>   {
>>>> @@ -741,7 +761,13 @@ void qemu_savevm_state_complete(QEMUFile *f)
>>>>           qemu_put_byte(f, QEMU_VM_SECTION_END);
>>>>           qemu_put_be32(f, se->section_id);
>>>>
>>>> +        printf("before saving %s complete\n", se->idstr);
>>>> +        check_host_md5();
>>>> +
>>>>           ret = se->ops->save_live_complete(f, se->opaque);
>>>> +        printf("after saving %s complete\n", se->idstr);
>>>> +        check_host_md5();
>>>> +
>>>>           trace_savevm_section_end(se->idstr, se->section_id, ret);
>>>>           if (ret < 0) {
>>>>               qemu_file_set_error(f, ret);
>>>> @@ -1007,6 +1033,13 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>               QLIST_INSERT_HEAD(&loadvm_handlers, le, entry);
>>>>
>>>>               ret = vmstate_load(f, le->se, le->version_id);
>>>> +#if 0
>>>> +            if (section_type == QEMU_VM_SECTION_FULL) {
>>>> +                printf("QEMU_VM_SECTION_FULL, after loading %s\n", le->se->idstr);
>>>> +                check_host_md5();
>>>> +            }
>>>> +#endif
>>>> +
>>>>               if (ret < 0) {
>>>>                   error_report("error while loading state for instance 0x%x of"
>>>>                                " device '%s'", instance_id, idstr);
>>>> @@ -1030,6 +1063,11 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>               }
>>>>
>>>>               ret = vmstate_load(f, le->se, le->version_id);
>>>> +            if (section_type == QEMU_VM_SECTION_END) {
>>>> +                printf("QEMU_VM_SECTION_END, after loading %s\n", le->se->idstr);
>>>> +                check_host_md5();
>>>> +            }
>>>> +
>>>>               if (ret < 0) {
>>>>                   error_report("error while loading state section id %d(%s)",
>>>>                                section_id, le->se->idstr);
>>>> @@ -1061,7 +1099,11 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>           g_free(buf);
>>>>       }
>>>>
>>>> +    printf("after loading all vmstate\n");
>>>> +    check_host_md5();
>>>>       cpu_synchronize_all_post_init();
>>>> +    printf("after cpu_synchronize_all_post_init\n");
>>>> +    check_host_md5();
>>>>
>>>>       ret = 0;
>>>>
>>>> --
>>>> 1.7.12.4
>>>>
>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>> .
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

next prev parent reply	other threads:[~2015-03-25 11:51 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-25  9:31 [Qemu-devel] [Migration Bug? ] Occasionally, the content of VM's memory is inconsistent between Source and Destination of migration zhanghailiang
2015-03-25  9:46 ` Dr. David Alan Gilbert
2015-03-25 11:28   ` zhanghailiang
2015-03-25 11:36     ` Dr. David Alan Gilbert
2015-03-25 11:48       ` zhanghailiang [this message]
2015-03-25  9:50 ` Juan Quintela
2015-03-25 10:21   ` Wen Congyang
2015-03-25 13:12     ` Paolo Bonzini
2015-03-26  1:43       ` Wen Congyang
2015-03-25 11:32   ` zhanghailiang
2015-03-26  3:12   ` Wen Congyang
2015-03-26  3:52     ` Li Zhijian
2015-03-27 10:13       ` zhanghailiang
2015-03-27 10:18         ` Dr. David Alan Gilbert
2015-03-28  9:54           ` zhanghailiang
2015-03-30  7:59             ` Dr. David Alan Gilbert
2015-03-31 11:48               ` zhanghailiang
2015-03-31 19:06                 ` Dr. David Alan Gilbert
2015-04-02 11:52                   ` zhanghailiang
2015-04-02 13:00                     ` Paolo Bonzini
2015-04-03  8:51                     ` Jason Wang
2015-04-03  9:08                       ` Wen Congyang
2015-04-03  9:20                       ` zhanghailiang
2015-04-08  8:08                         ` Jason Wang
2015-03-27 10:51         ` Juan Quintela
2015-03-28  1:08           ` zhanghailiang
2015-03-26 10:29     ` Juan Quintela
2015-03-26 11:57       ` Michael S. Tsirkin
2015-03-27  8:56       ` Stefan Hajnoczi
2015-03-27  9:14         ` Wen Congyang
2015-03-27  9:57           ` Stefan Hajnoczi
2015-03-27 10:05             ` Wen Congyang
2015-03-27 10:11               ` Stefan Hajnoczi
2015-03-27 10:36               ` Juan Quintela
2015-03-27 10:34           ` Juan Quintela
2015-03-31  7:54         ` Wen Congyang
2015-03-31 14:16           ` Stefan Hajnoczi
2015-04-02  9:14       ` Wen Congyang
2015-04-02 13:17         ` Paolo Bonzini
2015-04-03  1:29           ` Wen Congyang
2015-04-03 10:56             ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5512A09C.2000702@huawei.com \
    --to=zhang.zhanghailiang@huawei.com \
    --cc=amit.shah@redhat.com \
    --cc=arei.gonglei@huawei.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=dgilbert@redhat.com \
    --cc=hangaohuai@huawei.com \
    --cc=lizhijian@cn.fujitsu.com \
    --cc=peter.huangpeng@huawei.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).