From: zhanghailiang <zhang.zhanghailiang@huawei.com>
To: qemu-devel@nongnu.org
Cc: lizhijian@cn.fujitsu.com, quintela@redhat.com,
yunhong.jiang@intel.com, eddie.dong@intel.com,
peter.huangpeng@huawei.com, dgilbert@redhat.com,
arei.gonglei@huawei.com, stefanha@redhat.com,
amit.shah@redhat.com, yanghy@cn.fujitsu.com,
zhanghailiang <zhang.zhanghailiang@huawei.com>
Subject: [Qemu-devel] [PATCH COLO-Frame v9 00/32] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
Date: Wed, 2 Sep 2015 16:22:47 +0800
Message-ID: <1441182199-8328-1-git-send-email-zhang.zhanghailiang@huawei.com>
This is the 9th version of COLO.
Please note that this version is very different from the previous versions:
since we have decided to implement the proxy in QEMU, based on QEMU's slirp,
we have dropped all of the original COLO proxy related parts.
It will take a long time for the proxy to be ready for merging, so here we
extract the basic periodic checkpoint part, which does not depend on the
proxy, into this series.
The 'periodic' mode is also something we want to support in COLO. It is
based on Yang Hongyang's netfilter series, and it is very similar to
MicroCheckpointing and Remus.
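
To make the periodic mode concrete, here is a toy model of Remus-style
periodic checkpointing. It is only a sketch and not the code in this series:
it assumes that outgoing packets are held back until the next checkpoint has
been committed, and every name in it (ToyPrimary, ToySecondary, checkpoint,
simulate) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPrimary:
    """Toy primary VM: an integer 'state' plus a buffer of outgoing packets."""
    state: int = 0
    buffered: list = field(default_factory=list)

    def run(self, work: int) -> None:
        # The guest makes progress and emits one packet. The packet is held
        # back (assumption: Remus-style output buffering) until a checkpoint.
        self.state += work
        self.buffered.append(f"pkt@{self.state}")


@dataclass
class ToySecondary:
    """Toy secondary VM: only ever holds the last committed checkpoint."""
    state: int = 0


def checkpoint(primary: ToyPrimary, secondary: ToySecondary) -> list:
    """Copy the primary state to the secondary, then release buffered packets."""
    secondary.state = primary.state
    released, primary.buffered = primary.buffered, []
    return released


def simulate(work_per_period: int, periods: int):
    """Run `periods` rounds of 'run, then checkpoint' and collect the output."""
    primary, secondary = ToyPrimary(), ToySecondary()
    released = []
    for _ in range(periods):
        primary.run(work_per_period)
        released += checkpoint(primary, secondary)
    return primary, secondary, released


if __name__ == "__main__":
    p, s, out = simulate(work_per_period=3, periods=4)
    print(s.state, out)
```

In this model the outside world only ever sees packets that belong to a state
the secondary already holds, which is why a takeover from the last checkpoint
stays consistent from the client's point of view.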
You can find the discussion about why and how to implement the COLO proxy in
QEMU at the following link:
http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04069.html
As usual, this series contains only the COLO frame part; you can get the complete code from GitHub:
https://github.com/coloft/qemu/commits/colo-v2.0-periodic-mode
Compared with previous versions, this version is easier to test.
Test procedure:
1. Start up QEMU
Primary side:
# x86_64-softmmu/qemu-system-x86_64 -enable-kvm -netdev tap,id=bn0 -netfilter buffer,id=f0,netdev=bn0,chain=in -device virtio-net-pci,id=net-pci0,netdev=bn0 -boot c -drive if=virtio,id=disk1,driver=quorum,read-pattern=fifo,cache=none,aio=native,children.0.file.filename=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,children.0.driver=raw -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -S
Secondary side:
# x86_64-softmmu/qemu-system-x86_64 -enable-kvm -netdev tap,id=bn0 -device virtio-net-pci,id=net-pci0,netdev=bn0 -drive if=none,driver=raw,file=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,id=colo1,cache=none,aio=native -drive if=virtio,driver=replication,mode=secondary,throttling.bps-total=70000000,file.file.filename=/mnt/ramfs/active_disk.img,file.driver=qcow2,file.backing.file.filename=/mnt/ramfs/hidden_disk.img,file.backing.driver=qcow2,file.backing.backing.backing_reference=colo1,file.backing.allow-write-backing-file=on -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -incoming tcp:0:8888
2. On the Secondary VM's QEMU monitor, issue the commands:
(qemu) nbd_server_start 192.168.2.88:8889
(qemu) nbd_server_add -w colo1
3. On the Primary VM's QEMU monitor, issue the commands:
(qemu) child_add disk1 child.driver=replication,child.mode=primary,child.file.host=192.168.2.88,child.file.port=8889,child.file.export=colo1,child.file.driver=nbd,child.ignore-errors=on
(qemu) migrate_set_capability colo on
(qemu) migrate tcp:192.168.2.88:8888
4. After the above steps, whenever you make changes to the PVM, the SVM will be synced.
You can issue the command "migrate_set_parameter checkpoint-delay 2000"
to change the checkpoint period.
5. Failover test
You can kill the PVM and run 'colo_lost_heartbeat' in the SVM's
monitor at the same time; the SVM will then fail over, and the client will not notice the change.
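
When testing on other hosts, the monitor commands from steps 2 to 4 differ
only in the address, ports and names. The helper below is a small sketch that
only builds the command strings quoted above; it does not talk to QEMU, and
the function and parameter names are made up for illustration.

```python
def secondary_monitor_commands(host, nbd_port, export="colo1"):
    """Commands for the Secondary VM's monitor (step 2)."""
    return [
        f"nbd_server_start {host}:{nbd_port}",
        f"nbd_server_add -w {export}",
    ]


def primary_monitor_commands(host, nbd_port, migrate_port,
                             disk="disk1", export="colo1",
                             checkpoint_delay=None):
    """Commands for the Primary VM's monitor (step 3, optionally step 4)."""
    # Options for the replication child, in the same order as in step 3.
    child_opts = ",".join([
        "child.driver=replication",
        "child.mode=primary",
        f"child.file.host={host}",
        f"child.file.port={nbd_port}",
        f"child.file.export={export}",
        "child.file.driver=nbd",
        "child.ignore-errors=on",
    ])
    cmds = [
        f"child_add {disk} {child_opts}",
        "migrate_set_capability colo on",
        f"migrate tcp:{host}:{migrate_port}",
    ]
    # Listed after 'migrate' to follow the order of step 4 in the text.
    if checkpoint_delay is not None:
        cmds.append(f"migrate_set_parameter checkpoint-delay {checkpoint_delay}")
    return cmds


if __name__ == "__main__":
    for cmd in secondary_monitor_commands("192.168.2.88", 8889):
        print("(qemu)", cmd)
    for cmd in primary_monitor_commands("192.168.2.88", 8889, 8888,
                                        checkpoint_delay=2000):
        print("(qemu)", cmd)
```

With the values used in this mail (192.168.2.88, NBD port 8889, migration port
8888) the helper yields the same command lines as shown in the steps above.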
COLO is a totally new feature which is still at an early stage;
your comments and feedback are warmly welcomed.
TODO:
1. checkpoint based on proxy in qemu
2. The capability of continuous FT
v9:
- Drop colo proxy related part (colo-nic.c file)
- Convert COLO protocol name definition to QAPI
- Squash failover related patches (patch 19/20/23)
- Fix colo exit event according to Eric's comments
- Fix some typos from Eric's comments
- Fix bug "invalid runstate transition: 'colo' -> 'prelaunch'" reported
by Dave (patch 27)
- Use migrate_set_parameter instead of ecolo-set-checkpoint-period to set
checkpoint delay time (patch 25)
- Add new patches (patch 29/30) to separate the process of saving/loading
RAM and device state during checkpoint, which reduces the data size
for sending and also reduces the qsb size used in checkpoint.
Wen Congyang (1):
COLO: Add block replication into colo process
zhanghailiang (31):
configure: Add parameter for configure to enable/disable COLO support
migration: Introduce capability 'colo' to migration
COLO: migrate colo related info to slave
migration: Add state records for migration incoming
migration: Integrate COLO checkpoint process into migration
migration: Integrate COLO checkpoint process into loadvm
migration: Rename the'file' member of MigrationState and
MigrationIncomingState
COLO/migration: establish a new communication path from destination to
source
COLO: Implement colo checkpoint protocol
COLO: Add a new RunState RUN_STATE_COLO
QEMUSizedBuffer: Introduce two help functions for qsb
COLO: Save PVM state to secondary side when do checkpoint
COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
COLO: Load VMState into qsb before restore it
COLO: Flush PVM's cached RAM into SVM's memory
COLO: synchronize PVM's state to SVM periodically
COLO failover: Introduce a new command to trigger a failover
COLO failover: Introduce state to record failover process
COLO: Implement failover work for Primary VM
COLO: Implement failover work for Secondary VM
COLO: implement default failover treatment
qmp event: Add event notification for COLO error
COLO failover: Shutdown related socket fd when do failover
COLO failover: Don't do failover during loading VM's state
COLO: Control the checkpoint delay time by migrate-set-parameters
command
COLO: Implement shutdown checkpoint
COLO: Update the global runstate after going into colo state
savevm: Split load vm state function qemu_loadvm_state
COLO: Separate the process of saving/loading ram and device state
COLO: Split qemu_savevm_state_begin out of checkpoint process
COLO: Add net packets treatment into COLO
configure | 11 +
docs/qmp/qmp-events.txt | 17 +
hmp-commands.hx | 15 +
hmp.c | 16 +
hmp.h | 1 +
include/exec/cpu-all.h | 1 +
include/migration/colo.h | 44 +++
include/migration/failover.h | 33 ++
include/migration/migration.h | 16 +-
include/migration/qemu-file.h | 3 +-
include/sysemu/sysemu.h | 8 +
migration/Makefile.objs | 2 +
migration/colo-comm.c | 75 ++++
migration/colo-failover.c | 83 +++++
migration/colo.c | 782 ++++++++++++++++++++++++++++++++++++++++++
migration/exec.c | 4 +-
migration/fd.c | 4 +-
migration/migration.c | 184 +++++++---
migration/qemu-file-buf.c | 58 ++++
migration/ram.c | 185 +++++++++-
migration/savevm.c | 309 +++++++++++++----
migration/tcp.c | 4 +-
migration/unix.c | 4 +-
qapi-schema.json | 101 +++++-
qapi/event.json | 17 +
qmp-commands.hx | 20 ++
stubs/Makefile.objs | 1 +
stubs/migration-colo.c | 45 +++
trace-events | 8 +
vl.c | 37 +-
30 files changed, 1930 insertions(+), 158 deletions(-)
create mode 100644 include/migration/colo.h
create mode 100644 include/migration/failover.h
create mode 100644 migration/colo-comm.c
create mode 100644 migration/colo-failover.c
create mode 100644 migration/colo.c
create mode 100644 stubs/migration-colo.c
--
1.8.3.1
Thread overview: 59+ messages
2015-09-02 8:22 zhanghailiang [this message]
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 01/32] configure: Add parameter for configure to enable/disable COLO support zhanghailiang
2015-10-02 15:10 ` Dr. David Alan Gilbert
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 02/32] migration: Introduce capability 'colo' to migration zhanghailiang
2015-10-02 16:02 ` Eric Blake
2015-10-08 6:34 ` zhanghailiang
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 03/32] COLO: migrate colo related info to slave zhanghailiang
2015-10-02 18:45 ` Dr. David Alan Gilbert
2015-10-08 6:48 ` zhanghailiang
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 04/32] migration: Add state records for migration incoming zhanghailiang
2015-10-09 16:18 ` Dr. David Alan Gilbert
2015-10-10 7:07 ` zhanghailiang
2015-10-16 11:14 ` Dr. David Alan Gilbert
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 05/32] migration: Integrate COLO checkpoint process into migration zhanghailiang
2015-10-09 16:53 ` Dr. David Alan Gilbert
2015-10-10 6:25 ` zhanghailiang
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 06/32] migration: Integrate COLO checkpoint process into loadvm zhanghailiang
2015-10-19 9:17 ` Dr. David Alan Gilbert
2015-10-20 8:04 ` zhanghailiang
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 07/32] migration: Rename the'file' member of MigrationState and MigrationIncomingState zhanghailiang
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 08/32] COLO/migration: establish a new communication path from destination to source zhanghailiang
2015-10-19 9:54 ` Dr. David Alan Gilbert
2015-10-20 8:30 ` zhanghailiang
2015-10-20 19:32 ` Dr. David Alan Gilbert
2015-10-21 8:33 ` zhanghailiang
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 09/32] COLO: Implement colo checkpoint protocol zhanghailiang
2015-10-21 12:17 ` Eric Blake
2015-10-22 7:13 ` zhanghailiang
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 10/32] COLO: Add a new RunState RUN_STATE_COLO zhanghailiang
2015-10-21 12:18 ` Eric Blake
2015-10-22 6:58 ` zhanghailiang
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 11/32] QEMUSizedBuffer: Introduce two help functions for qsb zhanghailiang
2015-09-02 8:22 ` [Qemu-devel] [PATCH COLO-Frame v9 12/32] COLO: Save PVM state to secondary side when do checkpoint zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 13/32] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 14/32] COLO: Load VMState into qsb before restore it zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 15/32] COLO: Flush PVM's cached RAM into SVM's memory zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 16/32] COLO: synchronize PVM's state to SVM periodically zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 17/32] COLO failover: Introduce a new command to trigger a failover zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 18/32] COLO failover: Introduce state to record failover process zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 19/32] COLO: Implement failover work for Primary VM zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 20/32] COLO: Implement failover work for Secondary VM zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 21/32] COLO: implement default failover treatment zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 22/32] qmp event: Add event notification for COLO error zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 23/32] COLO failover: Shutdown related socket fd when do failover zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 24/32] COLO failover: Don't do failover during loading VM's state zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 25/32] COLO: Control the checkpoint delay time by migrate-set-parameters command zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 26/32] COLO: Implement shutdown checkpoint zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 27/32] COLO: Update the global runstate after going into colo state zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 28/32] savevm: Split load vm state function qemu_loadvm_state zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 29/32] COLO: Separate the process of saving/loading ram and device state zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 30/32] COLO: Split qemu_savevm_state_begin out of checkpoint process zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 31/32] COLO: Add block replication into colo process zhanghailiang
2015-09-02 8:23 ` [Qemu-devel] [PATCH COLO-Frame v9 32/32] COLO: Add net packets treatment into COLO zhanghailiang
2015-09-02 9:03 ` [Qemu-devel] [PATCH COLO-Frame v9 00/32] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) Yang Hongyang
2015-09-02 9:17 ` zhanghailiang
2015-09-09 3:36 ` zhanghailiang
2015-09-15 10:40 ` zhanghailiang
2015-10-21 14:10 ` Dr. David Alan Gilbert
2015-10-22 9:01 ` zhanghailiang