Message-ID: <5512A09C.2000702@huawei.com>
Date: Wed, 25 Mar 2015 19:48:44 +0800
From: zhanghailiang
References: <55128084.2040304@huawei.com> <20150325094639.GB2313@work-vm> <55129BC3.4060408@huawei.com> <20150325113615.GE2313@work-vm>
In-Reply-To: <20150325113615.GE2313@work-vm>
Subject: Re: [Qemu-devel] [Migration Bug?] Occasionally, the content of VM's memory is inconsistent between Source and Destination of migration
To: "Dr. David Alan Gilbert"
Cc: hangaohuai@huawei.com, Li Zhijian, Juan Quintela, peter.huangpeng@huawei.com, qemu-devel@nongnu.org, "Gonglei (Arei)", Amit Shah, david@gibson.dropbear.id.au

On 2015/3/25 19:36, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2015/3/25 17:46, Dr. David Alan Gilbert wrote:
>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>> Hi all,
>>>>
>>>> We found that, sometimes, the content of the VM's memory is inconsistent between the source side and the destination side
>>>> when we check it just after finishing migration but before the VM continues to run.
>>>>
>>>> We used a patch like the one below to find this issue (you can find it in the attachment).
>>>> Steps to reproduce:
>>>>
>>>> (1) Compile QEMU:
>>>> ./configure --target-list=x86_64-softmmu --extra-ldflags="-lssl" && make
>>>>
>>>> (2) Command and output:
>>>> SRC: # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cpu qemu64,-kvmclock -netdev tap,id=hn0 -device virtio-net-pci,id=net-pci0,netdev=hn0 -boot c -drive file=/mnt/sdb/pure_IMG/sles/sles11_sp3.img,if=none,id=drive-virtio-disk0,cache=unsafe -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio
>>>> (qemu) migrate tcp:192.168.3.8:3004
>>>> before saving ram complete
>>>> ff703f6889ab8701e4e040872d079a28
>>>> md_host : after saving ram complete
>>>> ff703f6889ab8701e4e040872d079a28
>>>>
>>>> DST: # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cpu qemu64,-kvmclock -netdev tap,id=hn0,vhost=on -device virtio-net-pci,id=net-pci0,netdev=hn0 -boot c -drive file=/mnt/sdb/pure_IMG/sles/sles11_sp3.img,if=none,id=drive-virtio-disk0,cache=unsafe -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -incoming tcp:0:3004
>>>> (qemu) QEMU_VM_SECTION_END, after loading ram
>>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>> md_host : after loading all vmstate
>>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>> md_host : after cpu_synchronize_all_post_init
>>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>>
>>>> This happens occasionally, and it is easier to reproduce when the migration command is issued during the VM's startup time.
>>>>
>>>> Further testing showed that some pages have been dirtied but their corresponding migration_bitmap bits are not set.
>>>> We can't figure out which QEMU module misses setting the bitmap when dirtying a VM page;
>>>> it is very difficult for us to trace all the actions that dirty the VM's pages.
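[Editor's note: the "dirtied but not marked dirty" check described above can be sketched in plain C. This is a standalone illustration under assumed names (a `cache` snapshot, a `current` view of guest RAM, and a per-page `bitmap`), not QEMU's actual data structures:]

```c
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Report pages whose contents differ from the cached snapshot but whose
 * dirty bit is not set -- the "mis_dirty" pages from the log below.
 * Returns the number of such pages found. */
static int find_mis_dirty(const uint8_t *cache, const uint8_t *current,
                          const unsigned long *bitmap, size_t npages)
{
    int mis = 0;
    for (size_t i = 0; i < npages; i++) {
        size_t bits_per_word = 8 * sizeof(unsigned long);
        int dirty = (bitmap[i / bits_per_word] >> (i % bits_per_word)) & 1;
        if (!dirty && memcmp(cache + i * PAGE_SIZE,
                             current + i * PAGE_SIZE, PAGE_SIZE) != 0) {
            printf("pc.ram[0x%zx](mis_dirty)\n", i * PAGE_SIZE);
            mis++;
        }
    }
    return mis;
}
```

A page reported here changed after its snapshot was taken, yet nothing marked it dirty, so migration would never resend it; that matches the symptom described in this thread.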
>>>>
>>>> Actually, the first time we found this problem was during COLO FT development, where it triggered some strange issues in the
>>>> VM which all pointed to inconsistency of the VM's memory. (If we save all of the VM's memory to the slave side every time
>>>> we do a checkpoint in COLO FT, everything is OK.)
>>>>
>>>> Is it acceptable for some pages not to be transferred to the destination during migration, or is it a bug?
>>>
>>> That does sound like a bug.
>>> The only other explanation I have is that memory is being changed by a device emulation
>>> that happens after the end of saving the vm, or after loading the memory. That's
>>> certainly possible - especially if a device (say networking) hasn't been properly
>>> stopped.
>>>
>>>> This issue has blocked our COLO development... :(
>>>>
>>>> Any help will be greatly appreciated!
>>>
>>
>> Hi Dave,
>>
>>> I suggest:
>>> 1) Does it happen with devices other than virtio?
>>> 2) Strip the devices down - e.g. just run with serial and no video/usb
>>
>> I will try this ...
>>
>>> 3) Try doing the md5 comparison at the end of ram_save_complete
>>
>> Yes, we have done this test at the end of ram_save_complete.
>>
>>> 4) mprotect RAM after the ram_save_complete and see if anything faults.
>>
>> I have tried, and there are KVM error reports, like below:
>>
>> KVM: entry failed, hardware error 0x7
>
> If the RAM is mprotect'd with PROT_READ|PROT_EXEC I'm not sure why this
> would happen; but then the question is why is it trying to do a KVM entry
> after the end of ram_save_complete - that suggests the CPUs are trying to
> run again, and that shouldn't happen in a normal migration.
> If you can trigger this KVM entry on a non-colo migration, then I suggest
> get the backtrace from where the KVM entry failed, because then you
> should be able to see why the CPU was being restarted.

Er, sorry, the report above was reproduced with COLO; I will test this with mprotect in a normal migration.
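[Editor's note: the mprotect experiment Dave suggests can be prototyped in isolation. The sketch below is plain POSIX, not QEMU code: it write-protects a buffer and installs a SIGSEGV handler that records the faulting address of any stray write, then restores write access so the program continues instead of dying:]

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

static void *protected_base;
static size_t protected_len;
static void * volatile fault_addr;   /* address of the stray write, if any */

static void segv_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    fault_addr = si->si_addr;
    /* Re-enable writes so the faulting instruction retries and succeeds. */
    mprotect(protected_base, protected_len, PROT_READ | PROT_WRITE);
}

/* Write-protect [base, base+len) and arrange for the first stray write
 * to be recorded rather than killing the process. base/len must be
 * page-aligned (e.g. from mmap). */
static void arm_write_trap(void *base, size_t len)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = segv_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);

    protected_base = base;
    protected_len = len;
    fault_addr = NULL;
    mprotect(base, len, PROT_READ);
}
```

Applied after the equivalent of ram_save_complete, any device emulation still writing guest RAM would fault here, and `fault_addr` (plus a backtrace taken in the handler) pinpoints the writer.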
> Dave
>
>> EAX=00000000 EBX=0000e000 ECX=00009578 EDX=0000434f
>> ESI=0000fc10 EDI=0000434f EBP=00000000 ESP=00001fca
>> EIP=00009594 EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0040 00000400 0000ffff 00009300
>> CS =f000 000f0000 0000ffff 00009b00
>> SS =434f 000434f0 0000ffff 00009300
>> DS =434f 000434f0 0000ffff 00009300
>> FS =0000 00000000 0000ffff 00009300
>> GS =0000 00000000 0000ffff 00009300
>> LDT=0000 00000000 0000ffff 00008200
>> TR =0000 00000000 0000ffff 00008b00
>> GDT=     0002dcc8 00000047
>> IDT=     00000000 0000ffff
>> CR0=00000010 CR2=ffffffff CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=c0 74 0f 66 b9 78 95 00 00 66 31 d2 66 31 c0 e9 47 e0 fb 90 90 fa fc 66 c3 66 53 66 89 c3 66 e8 9d e8 ff ff 66 01 c3 66 89 d8 66 e8 40 e9 ff ff 66
>> ERROR: invalid runstate transition: 'internal-error' -> 'colo'
>>
>>> 5) Can you trigger this with normal migration or just COLO?
>>
>> Yes, we have reproduced it without COLO, in a normal migration.
>>
>>> I'm wondering if something is doing something on a running/paused/etc state
>>> change and isn't expecting the new COLO states.
>>>
>>
>> Actually, we can also reproduce this with COLO. On the master side only, we store all RAM into a ram cache and also duplicate migration_bitmap.
>> After a checkpoint, before the VM continues to run, we compare the cached memory with the current memory; if a page differs but its
>> dirty bitmap bit is not set, we consider that page to have escaped tracking by the migration bitmap. The result will look like this:
>> (Certainly, this also happens only occasionally; sometimes there is no untracked page at all.)
>> pc.ram[0xc7000](mis_dirty)
>> pc.ram[0xec000](mis_dirty)
>> pc.ram[0xf4000](mis_dirty)
>> pc.ram[0xfa000](mis_dirty)
>> pc.ram[0xbf93000](mis_dirty)
>> pc.ram[0xbf94000](mis_dirty)
>> pc.ram[0xbf95000](mis_dirty)
>> pc.ram[0xbf96000](mis_dirty)
>> pc.ram[0xbf97000](mis_dirty)
>> pc.ram[0xbf98000](mis_dirty)
>> pc.ram[0xbf99000](mis_dirty)
>> pc.ram[0xbf9a000](mis_dirty)
>> pc.ram[0xbf9b000](mis_dirty)
>> pc.ram[0xbf9c000](mis_dirty)
>> ...
>>
>> Thanks,
>> zhang
>>
>>>> --- a/savevm.c
>>>> +++ b/savevm.c
>>>> @@ -51,6 +51,26 @@
>>>>  #define ARP_PTYPE_IP 0x0800
>>>>  #define ARP_OP_REQUEST_REV 0x3
>>>>
>>>> +#include "qemu/rcu_queue.h"
>>>> +#include <openssl/md5.h>
>>>> +
>>>> +static void check_host_md5(void)
>>>> +{
>>>> +    int i;
>>>> +    unsigned char md[MD5_DIGEST_LENGTH];
>>>> +    MD5_CTX ctx;
>>>> +    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks); /* Only check 'pc.ram' block */
>>>> +
>>>> +    MD5_Init(&ctx);
>>>> +    MD5_Update(&ctx, (void *)block->host, block->used_length);
>>>> +    MD5_Final(md, &ctx);
>>>> +    printf("md_host : ");
>>>> +    for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
>>>> +        fprintf(stderr, "%02x", md[i]);
>>>> +    }
>>>> +    fprintf(stderr, "\n");
>>>> +}
>>>> +
>>>>  static int announce_self_create(uint8_t *buf,
>>>>                                  uint8_t *mac_addr)
>>>>  {
>>>> @@ -741,7 +761,13 @@ void qemu_savevm_state_complete(QEMUFile *f)
>>>>          qemu_put_byte(f, QEMU_VM_SECTION_END);
>>>>          qemu_put_be32(f, se->section_id);
>>>>
>>>> +        printf("before saving %s complete\n", se->idstr);
>>>> +        check_host_md5();
>>>> +
>>>>          ret = se->ops->save_live_complete(f, se->opaque);
>>>> +        printf("after saving %s complete\n", se->idstr);
>>>> +        check_host_md5();
>>>> +
>>>>          trace_savevm_section_end(se->idstr, se->section_id, ret);
>>>>          if (ret < 0) {
>>>>              qemu_file_set_error(f, ret);
>>>> @@ -1030,6 +1063,11 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>          }
>>>>
>>>>          ret = vmstate_load(f, le->se, le->version_id);
>>>> +        if (section_type == QEMU_VM_SECTION_END) {
>>>> +            printf("QEMU_VM_SECTION_END, after loading %s\n", le->se->idstr);
>>>> +            check_host_md5();
>>>> +        }
>>>> +
>>>>          if (ret < 0) {
>>>>              error_report("error while loading state section id %d(%s)",
>>>>                           section_id, le->se->idstr);
>>>> @@ -1061,7 +1099,11 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>          g_free(buf);
>>>>      }
>>>>
>>>> +    printf("after loading all vmstate\n");
>>>> +    check_host_md5();
>>>>      cpu_synchronize_all_post_init();
>>>> +    printf("after cpu_synchronize_all_post_init\n");
>>>> +    check_host_md5();
>>>>
>>>>      ret = 0;
>>>>
>>>> --
>>>
>>>> From ecb789cf7f383b112da3cce33eb9822a94b9497a Mon Sep 17 00:00:00 2001
>>>> From: Li Zhijian
>>>> Date: Tue, 24 Mar 2015 21:53:26 -0400
>>>> Subject: [PATCH] check pc.ram block md5sum between migration Source and
>>>>  Destination
>>>>
>>>> Signed-off-by: Li Zhijian
>>>> ---
>>>>  savevm.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 42 insertions(+)
>>>>  mode change 100644 => 100755 savevm.c
>>>>
>>>> diff --git a/savevm.c b/savevm.c
>>>> old mode 100644
>>>> new mode 100755
>>>> index 3b0e222..3d431dc
>>>> --- a/savevm.c
>>>> +++ b/savevm.c
>>>> @@ -51,6 +51,26 @@
>>>>  #define ARP_PTYPE_IP 0x0800
>>>>  #define ARP_OP_REQUEST_REV 0x3
>>>>
>>>> +#include "qemu/rcu_queue.h"
>>>> +#include <openssl/md5.h>
>>>> +
>>>> +static void check_host_md5(void)
>>>> +{
>>>> +    int i;
>>>> +    unsigned char md[MD5_DIGEST_LENGTH];
>>>> +    MD5_CTX ctx;
>>>> +    RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks); /* Only check 'pc.ram' block */
>>>> +
>>>> +    MD5_Init(&ctx);
>>>> +    MD5_Update(&ctx, (void *)block->host, block->used_length);
>>>> +    MD5_Final(md, &ctx);
>>>> +    printf("md_host : ");
>>>> +    for (i = 0; i < MD5_DIGEST_LENGTH; i++) {
>>>> +        fprintf(stderr, "%02x", md[i]);
>>>> +    }
>>>> +    fprintf(stderr, "\n");
>>>> +}
>>>> +
>>>>  static int announce_self_create(uint8_t *buf,
>>>>                                  uint8_t *mac_addr)
>>>>  {
>>>> @@ -741,7 +761,13 @@ void qemu_savevm_state_complete(QEMUFile *f)
>>>>          qemu_put_byte(f, QEMU_VM_SECTION_END);
>>>>          qemu_put_be32(f, se->section_id);
>>>>
>>>> +        printf("before saving %s complete\n", se->idstr);
>>>> +        check_host_md5();
>>>> +
>>>>          ret = se->ops->save_live_complete(f, se->opaque);
>>>> +        printf("after saving %s complete\n", se->idstr);
>>>> +        check_host_md5();
>>>> +
>>>>          trace_savevm_section_end(se->idstr, se->section_id, ret);
>>>>          if (ret < 0) {
>>>>              qemu_file_set_error(f, ret);
>>>> @@ -1007,6 +1033,13 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>          QLIST_INSERT_HEAD(&loadvm_handlers, le, entry);
>>>>
>>>>          ret = vmstate_load(f, le->se, le->version_id);
>>>> +#if 0
>>>> +        if (section_type == QEMU_VM_SECTION_FULL) {
>>>> +            printf("QEMU_VM_SECTION_FULL, after loading %s\n", le->se->idstr);
>>>> +            check_host_md5();
>>>> +        }
>>>> +#endif
>>>> +
>>>>          if (ret < 0) {
>>>>              error_report("error while loading state for instance 0x%x of"
>>>>                           " device '%s'", instance_id, idstr);
>>>> @@ -1030,6 +1063,11 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>          }
>>>>
>>>>          ret = vmstate_load(f, le->se, le->version_id);
>>>> +        if (section_type == QEMU_VM_SECTION_END) {
>>>> +            printf("QEMU_VM_SECTION_END, after loading %s\n", le->se->idstr);
>>>> +            check_host_md5();
>>>> +        }
>>>> +
>>>>          if (ret < 0) {
>>>>              error_report("error while loading state section id %d(%s)",
>>>>                           section_id, le->se->idstr);
>>>> @@ -1061,7 +1099,11 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>          g_free(buf);
>>>>      }
>>>>
>>>> +    printf("after loading all vmstate\n");
>>>> +    check_host_md5();
>>>>      cpu_synchronize_all_post_init();
>>>> +    printf("after cpu_synchronize_all_post_init\n");
>>>> +    check_host_md5();
>>>>
>>>>      ret = 0;
>>>>
>>>> --
>>>> 1.7.12.4
>>>>
>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>> .
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .