From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53133) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZZWCW-00007Q-J9 for qemu-devel@nongnu.org; Tue, 08 Sep 2015 23:37:38 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZZWCO-0004Nq-Tl for qemu-devel@nongnu.org; Tue, 08 Sep 2015 23:37:32 -0400
Received: from szxga02-in.huawei.com ([119.145.14.65]:6724) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZZWCO-0004Ky-1f for qemu-devel@nongnu.org; Tue, 08 Sep 2015 23:37:28 -0400
References: <1441182199-8328-1-git-send-email-zhang.zhanghailiang@huawei.com>
From: zhanghailiang
Message-ID: <55EFA949.3080207@huawei.com>
Date: Wed, 9 Sep 2015 11:36:41 +0800
MIME-Version: 1.0
In-Reply-To: <1441182199-8328-1-git-send-email-zhang.zhanghailiang@huawei.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v9 00/32] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
To: quintela@redhat.com, amit.shah@redhat.com
Cc: yanghy@cn.fujitsu.com, qemu-devel@nongnu.org, peter.huangpeng@huawei.com

Ping...

Hi Juan & Amit,

Could you please help review this series? Since it has already reached v9,
I really hope to get your feedback on it :)

Thanks,
zhanghailiang

On 2015/9/2 16:22, zhanghailiang wrote:
> This is the 9th version of COLO.
>
> Please note that this version is very different from the previous versions,
> since we have decided to implement the proxy in qemu, based on qemu's slirp.
> We dropped all of the original colo proxy related parts.
>
> It will be a long time before the proxy is ready for merging, so here we extract
> the basic periodic checkpoint part that does not depend on the proxy into this series.
> Actually, the 'periodic' mode is also something we want to support in COLO. It is
> based on Yang Hongyang's netfilter series, and this mode is very similar to
> MicroCheckpointing and Remus.
>
> You can find the discussion about why & how to implement the colo proxy in qemu
> at the following link:
> http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04069.html
>
> As usual, this series contains only the COLO frame part; you can get the whole code from github:
> https://github.com/coloft/qemu/commits/colo-v2.0-periodic-mode
>
> Compared with previous versions, this version is easier to test.
>
> Test procedure:
> 1. Startup qemu
> Primary side:
> # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -netdev tap,id=bn0 -netfilter buffer,id=f0,netdev=bn0,chain=in -device virtio-net-pci,id=net-pci0,netdev=bn0 -boot c -drive if=virtio,id=disk1,driver=quorum,read-pattern=fifo,cache=none,aio=native,children.0.file.filename=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,children.0.driver=raw -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -S
>
> Secondary side:
> # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -netdev tap,id=bn0 -device virtio-net-pci,id=net-pci0,netdev=bn0 -drive if=none,driver=raw,file=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,id=colo1,cache=none,aio=native -drive if=virtio,driver=replication,mode=secondary,throttling.bps-total=70000000,file.file.filename=/mnt/ramfs/active_disk.img,file.driver=qcow2,file.backing.file.filename=/mnt/ramfs/hidden_disk.img,file.backing.driver=qcow2,file.backing.backing.backing_reference=colo1,file.backing.allow-write-backing-file=on -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -incoming tcp:0:8888
>
> 2. On Secondary VM's QEMU monitor, issue command:
> (qemu) nbd_server_start 192.168.2.88:8889
> (qemu) nbd_server_add -w colo1
>
> 3. On Primary VM's QEMU monitor, issue command:
> (qemu) child_add disk1 child.driver=replication,child.mode=primary,child.file.host=192.168.2.88,child.file.port=8889,child.file.export=colo1,child.file.driver=nbd,child.ignore-errors=on
> (qemu) migrate_set_capability colo on
> (qemu) migrate tcp:192.168.2.88:8888
>
> 4. After the above steps, you will see that whenever you make changes to PVM, SVM will be synced.
> You can issue the command "migrate_set_parameter checkpoint-delay 2000"
> to change the checkpoint period.
>
> 5. Failover test
> You can kill PVM and run 'colo_lost_heartbeat' in SVM's
> monitor at the same time; SVM will then fail over and the client will not notice the change.
>
> COLO is a totally new feature which is still at an early stage;
> your comments and feedback are warmly welcomed.
>
> TODO:
> 1. checkpoint based on proxy in qemu
> 2. The capability of continuous FT
>
> v9:
> - Drop colo proxy related part (colo-nic.c file)
> - Convert COLO protocol name definition to QAPI
> - Squash failover related patches (patch 19/20/23)
> - Fix colo exit event according to Eric's comments.
> - Fix some typos from Eric's comments
> - Fix bug "invalid runstate transition: 'colo' -> 'prelaunch'" reported
>   by Dave (patch 27)
> - Use migrate_set_parameter instead of colo-set-checkpoint-period to set
>   checkpoint delay time (patch 25)
> - Add new patches (patch 29/30) to separate the process of saving/loading
>   ram and device state during checkpoint, which will reduce the data size
>   for sending and also reduce the qsb size used in checkpoint.
>
> Wen Congyang (1):
>   COLO: Add block replication into colo process
>
> zhanghailiang (31):
>   configure: Add parameter for configure to enable/disable COLO support
>   migration: Introduce capability 'colo' to migration
>   COLO: migrate colo related info to slave
>   migration: Add state records for migration incoming
>   migration: Integrate COLO checkpoint process into migration
>   migration: Integrate COLO checkpoint process into loadvm
>   migration: Rename the 'file' member of MigrationState and
>     MigrationIncomingState
>   COLO/migration: establish a new communication path from destination to
>     source
>   COLO: Implement colo checkpoint protocol
>   COLO: Add a new RunState RUN_STATE_COLO
>   QEMUSizedBuffer: Introduce two help functions for qsb
>   COLO: Save PVM state to secondary side when do checkpoint
>   COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
>   COLO: Load VMState into qsb before restore it
>   COLO: Flush PVM's cached RAM into SVM's memory
>   COLO: synchronize PVM's state to SVM periodically
>   COLO failover: Introduce a new command to trigger a failover
>   COLO failover: Introduce state to record failover process
>   COLO: Implement failover work for Primary VM
>   COLO: Implement failover work for Secondary VM
>   COLO: implement default failover treatment
>   qmp event: Add event notification for COLO error
>   COLO failover: Shutdown related socket fd when do failover
>   COLO failover: Don't do failover during loading VM's state
>   COLO: Control the checkpoint delay time by migrate-set-parameters
>     command
>   COLO: Implement shutdown checkpoint
>   COLO: Update the global runstate after going into colo state
>   savevm: Split load vm state function qemu_loadvm_state
>   COLO: Separate the process of saving/loading ram and device state
>   COLO: Split qemu_savevm_state_begin out of checkpoint process
>   COLO: Add net packets treatment into COLO
>
>  configure                     |  11 +
>  docs/qmp/qmp-events.txt       |  17 +
>  hmp-commands.hx               |  15 +
>  hmp.c                         |  16 +
>  hmp.h                         |   1 +
>  include/exec/cpu-all.h        |   1 +
>  include/migration/colo.h      |  44 +++
>  include/migration/failover.h  |  33 ++
>  include/migration/migration.h |  16 +-
>  include/migration/qemu-file.h |   3 +-
>  include/sysemu/sysemu.h       |   8 +
>  migration/Makefile.objs       |   2 +
>  migration/colo-comm.c         |  75 ++++
>  migration/colo-failover.c     |  83 +++++
>  migration/colo.c              | 782 ++++++++++++++++++++++++++++++++++++++++++
>  migration/exec.c              |   4 +-
>  migration/fd.c                |   4 +-
>  migration/migration.c         | 184 +++++++---
>  migration/qemu-file-buf.c     |  58 ++++
>  migration/ram.c               | 185 +++++++++-
>  migration/savevm.c            | 309 +++++++++++++----
>  migration/tcp.c               |   4 +-
>  migration/unix.c              |   4 +-
>  qapi-schema.json              | 101 +++++-
>  qapi/event.json               |  17 +
>  qmp-commands.hx               |  20 ++
>  stubs/Makefile.objs           |   1 +
>  stubs/migration-colo.c        |  45 +++
>  trace-events                  |   8 +
>  vl.c                          |  37 +-
>  30 files changed, 1930 insertions(+), 158 deletions(-)
>  create mode 100644 include/migration/colo.h
>  create mode 100644 include/migration/failover.h
>  create mode 100644 migration/colo-comm.c
>  create mode 100644 migration/colo-failover.c
>  create mode 100644 migration/colo.c
>  create mode 100644 stubs/migration-colo.c
>
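
For anyone repeating the test procedure quoted above on different hosts, the monitor commands from steps 2 to 4 can be generated rather than retyped. The following is only a rough sketch: the function names and parameters are hypothetical helpers, not part of QEMU or of this series, and the code only builds the HMP command strings shown in the cover letter. It does not connect to a monitor or check that the commands succeed.

```python
def secondary_monitor_commands(host, nbd_port, export="colo1"):
    """Build the HMP commands for the Secondary VM's monitor (step 2).

    Hypothetical helper: it returns strings only and never talks to QEMU.
    """
    return [
        f"nbd_server_start {host}:{nbd_port}",
        f"nbd_server_add -w {export}",
    ]


def primary_monitor_commands(host, nbd_port, migrate_port,
                             disk="disk1", export="colo1",
                             checkpoint_delay=None):
    """Build the HMP commands for the Primary VM's monitor (steps 3 and 4).

    checkpoint_delay is passed through verbatim to
    'migrate_set_parameter checkpoint-delay'; it is omitted when None.
    """
    # Options for the replication child, in the order used in the cover letter.
    child_opts = ",".join([
        "child.driver=replication",
        "child.mode=primary",
        f"child.file.host={host}",
        f"child.file.port={nbd_port}",
        f"child.file.export={export}",
        "child.file.driver=nbd",
        "child.ignore-errors=on",
    ])
    commands = [
        f"child_add {disk} {child_opts}",
        "migrate_set_capability colo on",
        f"migrate tcp:{host}:{migrate_port}",
    ]
    if checkpoint_delay is not None:
        # Step 4: optional change of the checkpoint period after migration starts.
        commands.append(
            f"migrate_set_parameter checkpoint-delay {checkpoint_delay}")
    return commands


if __name__ == "__main__":
    # Values taken from the test procedure in the cover letter.
    for cmd in secondary_monitor_commands("192.168.2.88", 8889):
        print("(SVM qemu)", cmd)
    for cmd in primary_monitor_commands("192.168.2.88", 8889, 8888,
                                        checkpoint_delay=2000):
        print("(PVM qemu)", cmd)
```

The secondary-side commands have to be issued before the primary-side ones, since child_add connects to the NBD server started on the secondary.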