From: Li Zhijian <lizhijian@cn.fujitsu.com>
To: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
Amit Shah <amit.shah@redhat.com>,
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>,
david@gibson.dropbear.id.au
Cc: Li Zhijian <lizhijian@cn.fujitsu.com>,
zhanghailiang <zhang.zhanghailiang@huawei.com>
Subject: [Qemu-devel] [TCG only][Migration Bug? ] Occasionally, the content of VM's memory is inconsistent between Source and Destination of migration
Date: Thu, 3 Dec 2015 15:32:40 +0800
Message-ID: <565FF018.9040206@cn.fujitsu.com>
Hi all,
Does anybody remember the similar issue posted by hailiang months ago?
http://patchwork.ozlabs.org/patch/454322/
At least two migration bugs have been fixed since then.
Now we have found the same issue with a TCG VM (KVM is fine): after
migration, the content of the VM's memory is inconsistent.
We added a patch to check the memory content; you can find it appended below.
Steps to reproduce:
1) apply the patch and re-build qemu
2) prepare the ubuntu guest and run memtest in grub.
source side:
x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
-vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
pc-i440fx-2.3,accel=tcg,usb=off
destination side:
x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hn0 -device
e1000,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:65 -boot c -drive
if=none,file=/home/lizj/ubuntu.raw,id=drive-virtio-disk0 -device
virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
-vnc :7 -m 128 -smp 1 -device piix3-usb-uhci -device usb-tablet -qmp
tcp::4444,server,nowait -monitor stdio -cpu qemu64 -machine
pc-i440fx-2.3,accel=tcg,usb=off -incoming tcp:0:8881
3) start migration
With a 1000M NIC, migration finishes within 3 minutes.
at source:
(qemu) migrate tcp:192.168.2.66:8881
after saving ram complete
e9e725df678d392b1a83b3a917f332bb
qemu-system-x86_64: end ram md5
(qemu)
at destination:
...skip...
Completed load of VM with exit code 0 seq iteration 1264
Completed load of VM with exit code 0 seq iteration 1265
Completed load of VM with exit code 0 seq iteration 1266
qemu-system-x86_64: after loading state section id 2(ram)
49c2dac7bde0e5e22db7280dcb3824f9
qemu-system-x86_64: end ram md5
qemu-system-x86_64: qemu_loadvm_state: after cpu_synchronize_all_post_init
49c2dac7bde0e5e22db7280dcb3824f9
qemu-system-x86_64: end ram md5
This occurs occasionally, and only on a TCG machine. It seems that
some pages dirtied on the source side are not transferred to the destination.
The problem can be reproduced even if we disable virtio.
Is it OK for some pages not to be transferred to the destination during
migration? Or is it a bug?
Any ideas?
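A whole-RAM MD5 only shows that the two sides differ, not where. As a
minimal sketch of one way to narrow it down, assuming the guest RAM of both
sides is available as two buffers of equal length (for example read back
from files written by the monitor's pmemsave command), the helper below
lists the page indices that differ. find_differing_pages is a hypothetical
name used for illustration; it is not part of QEMU or of the patch below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not a QEMU API.
 *
 * Compare two RAM images of equal length page by page. The indices of up
 * to max_out differing pages are stored in out[]; the return value is the
 * total number of differing pages, which may exceed max_out.
 */
static size_t find_differing_pages(const uint8_t *src, const uint8_t *dst,
                                   size_t len, size_t page_size,
                                   size_t *out, size_t max_out)
{
    size_t ndiff = 0;

    assert(page_size > 0);
    for (size_t off = 0; off < len; off += page_size) {
        /* The last page may be shorter than page_size. */
        size_t n = (len - off < page_size) ? len - off : page_size;

        if (memcmp(src + off, dst + off, n) != 0) {
            if (ndiff < max_out) {
                out[ndiff] = off / page_size;
            }
            ndiff++;
        }
    }
    return ndiff;
}
```

Called with a page_size of 4096 (an assumption matching the x86 target page
size) on the two images, it would give the page indices within the compared
images that are worth inspecting further.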
=================md5 check patch=============================
diff --git a/Makefile.target b/Makefile.target
index 962d004..e2cb8e9 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -139,7 +139,7 @@ obj-y += memory.o cputlb.o
obj-y += memory_mapping.o
obj-y += dump.o
obj-y += migration/ram.o migration/savevm.o
-LIBS := $(libs_softmmu) $(LIBS)
+LIBS := $(libs_softmmu) $(LIBS) -lplumb
# xen support
obj-$(CONFIG_XEN) += xen-common.o
diff --git a/migration/ram.c b/migration/ram.c
index 1eb155a..3b7a09d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2513,7 +2513,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
}
rcu_read_unlock();
- DPRINTF("Completed load of VM with exit code %d seq iteration "
+ fprintf(stderr, "Completed load of VM with exit code %d seq iteration "
"%" PRIu64 "\n", ret, seq_iter);
return ret;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index 0ad1b93..3feaa61 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -891,6 +891,29 @@ void qemu_savevm_state_header(QEMUFile *f)
}
+#include "exec/ram_addr.h"
+#include "qemu/rcu_queue.h"
+#include <clplumbing/md5.h>
+#ifndef MD5_DIGEST_LENGTH
+#define MD5_DIGEST_LENGTH 16
+#endif
+
+static void check_host_md5(void)
+{
+ int i;
+ unsigned char md[MD5_DIGEST_LENGTH];
+ rcu_read_lock();
+ RAMBlock *block = QLIST_FIRST_RCU(&ram_list.blocks); /* Only check 'pc.ram' block */
+ rcu_read_unlock();
+
+ MD5(block->host, block->used_length, md);
+ for(i = 0; i < MD5_DIGEST_LENGTH; i++) {
+ fprintf(stderr, "%02x", md[i]);
+ }
+ fprintf(stderr, "\n");
+ error_report("end ram md5");
+}
+
void qemu_savevm_state_begin(QEMUFile *f,
const MigrationParams *params)
{
@@ -1056,6 +1079,10 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only)
save_section_header(f, se, QEMU_VM_SECTION_END);
ret = se->ops->save_live_complete_precopy(f, se->opaque);
+
+ fprintf(stderr, "after saving %s complete\n", se->idstr);
+ check_host_md5();
+
trace_savevm_section_end(se->idstr, se->section_id, ret);
save_section_footer(f, se);
if (ret < 0) {
@@ -1791,6 +1818,11 @@ static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
section_id, le->se->idstr);
return ret;
}
+ if (section_type == QEMU_VM_SECTION_END) {
+ error_report("after loading state section id %d(%s)",
+ section_id, le->se->idstr);
+ check_host_md5();
+ }
if (!check_section_footer(f, le)) {
return -EINVAL;
}
@@ -1901,6 +1933,8 @@ int qemu_loadvm_state(QEMUFile *f)
}
cpu_synchronize_all_post_init();
+ error_report("%s: after cpu_synchronize_all_post_init", __func__);
+ check_host_md5();
return ret;
}
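Because source and destination run on different hosts, comparing full RAM
images may be inconvenient. A sketch of an alternative to the single MD5 in
check_host_md5() is to compute one small checksum per page, so that the
values logged on the two sides can be diffed directly. The code below uses
64-bit FNV-1a purely as a stand-in hash (it is not what the patch uses and
it is not cryptographic); fnv1a64 and hash_pages are hypothetical helpers,
not QEMU APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a over one buffer. Hypothetical helper, not a QEMU API. */
static uint64_t fnv1a64(const uint8_t *buf, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 0x100000001b3ULL;            /* FNV 64-bit prime */
    }
    return h;
}

/*
 * Store the hash of page i in sums[i] for at most max_pages pages.
 * Returns the number of pages hashed.
 */
static size_t hash_pages(const uint8_t *ram, size_t len, size_t page_size,
                         uint64_t *sums, size_t max_pages)
{
    size_t n = 0;

    assert(page_size > 0);
    for (size_t off = 0; off < len && n < max_pages; off += page_size, n++) {
        /* The last page may be shorter than page_size. */
        size_t chunk = (len - off < page_size) ? len - off : page_size;

        sums[n] = fnv1a64(ram + off, chunk);
    }
    return n;
}
```

Printing the per-page values on both sides and diffing the two logs would
point at the pages that differ, at the cost of a much longer log than the
single digest.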