From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: lizhijian@cn.fujitsu.com, quintela@redhat.com,
yunhong.jiang@intel.com, eddie.dong@intel.com,
peter.huangpeng@huawei.com, qemu-devel@nongnu.org,
arei.gonglei@huawei.com, amit.shah@redhat.com
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v8 00/34] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
Date: Wed, 5 Aug 2015 12:24:56 +0100
Message-ID: <20150805112456.GF2331@work-vm>
In-Reply-To: <1438159544-6224-1-git-send-email-zhang.zhanghailiang@huawei.com>
* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> This is the 8th version of COLO.
>
> Here is only COLO frame part, include: VM checkpoint,
> failover, proxy API, block replication API, not include block replication.
> The block part is treated as a separate series.
>
> As usual, we provide 'basic' and 'developing' branches in github:
> https://github.com/coloft/qemu/commits/colo-v1.5-basic
> https://github.com/coloft/qemu/commits/colo-v1.5-developing (more features)
>
> The 'basic' branch is exactly the same with this patch series,
> We will keep this series simple as possible, just for easy review.
>
> The extra features in colo-v1.5-developing branch:
> 1) Separate ram and device save/load process to reduce size of extra memory
> used during checkpoint
> 2) Live migrate part of dirty pages to slave during sleep time.
> 3) You get the statistic info about checkpoint by command 'info migrate'
I'm hitting a problem that I think is due to the new global_state section
that Juan recently added; if I cause a failover I hit:
ERROR: invalid runstate transition: 'colo' -> 'prelaunch'
(on the secondary).
I think the problem is that the global_state is only sent for 'unusual' states,
so in the first migration that gets done at startup, 'prelaunch' is included in the stream
in the global state, but then for later checkpoints the global_state probably isn't
sent.
I hacked around it by making global_state_needed return false; I guess
we need to find a better fix!
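
To make the suspected failure mode concrete, here is a minimal toy model in C.
It is a sketch under stated assumptions, not QEMU code: every toy_* name, the
small transition table, and the rule "the section is only sent when the sender
is not running" are hypothetical stand-ins for the behaviour guessed at above.
The first stream carries 'prelaunch', later checkpoints carry nothing, and the
stale value is applied at failover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy run states; the names mirror the error message, not QEMU's RunState. */
typedef enum { RS_PRELAUNCH, RS_INMIGRATE, RS_COLO, RS_RUNNING, RS_MAX } ToyRunState;

static const char *const toy_names[RS_MAX] = {
    "prelaunch", "inmigrate", "colo", "running",
};

/* Allowed transitions in this toy model (assumed, deliberately tiny). */
static const bool toy_allowed[RS_MAX][RS_MAX] = {
    [RS_PRELAUNCH][RS_INMIGRATE] = true,
    [RS_INMIGRATE][RS_COLO]      = true,
    [RS_COLO][RS_RUNNING]        = true,
};

typedef struct {
    ToyRunState current;       /* run state of the secondary */
    bool have_global_state;    /* a global-state section was received earlier */
    ToyRunState global_state;  /* run state carried by that section */
} ToySecondary;

/* Hypothetical predicate: only 'unusual' (non-running) states are sent.
 * Passing enabled == false mimics the hack of always returning false. */
static bool toy_global_state_needed(bool enabled, ToyRunState sender_state)
{
    return enabled && sender_state != RS_RUNNING;
}

/* Load one stream; the stored value changes only if the section is present. */
static void toy_load_stream(ToySecondary *s, bool enabled, ToyRunState sender_state)
{
    if (toy_global_state_needed(enabled, sender_state)) {
        s->have_global_state = true;
        s->global_state = sender_state;
    }
    /* Otherwise nothing overwrites the previously received value. */
}

static bool toy_set_runstate(ToySecondary *s, ToyRunState next)
{
    if (!toy_allowed[s->current][next]) {
        fprintf(stderr, "ERROR: invalid runstate transition: '%s' -> '%s'\n",
                toy_names[s->current], toy_names[next]);
        return false;
    }
    s->current = next;
    return true;
}

/* Failover applies the last received global state, else just runs. */
static bool toy_failover(ToySecondary *s)
{
    ToyRunState target = s->have_global_state ? s->global_state : RS_RUNNING;
    return toy_set_runstate(s, target);
}

/* Returns true if failover on the toy secondary succeeds. */
bool toy_scenario(bool global_state_enabled)
{
    ToySecondary s = { .current = RS_PRELAUNCH };
    bool ok = toy_set_runstate(&s, RS_INMIGRATE);
    assert(ok);
    /* Initial migration: the stream carries 'prelaunch'. */
    toy_load_stream(&s, global_state_enabled, RS_PRELAUNCH);
    ok = toy_set_runstate(&s, RS_COLO);
    assert(ok);
    /* Later checkpoint: sender is running, so no section is sent. */
    toy_load_stream(&s, global_state_enabled, RS_RUNNING);
    return toy_failover(&s);
}
```

In this model toy_scenario(true) fails at failover with the 'colo' ->
'prelaunch' message, while toy_scenario(false), the equivalent of the hack,
succeeds. It only illustrates why a stale value would matter; it does not
show what the real fix should be.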
Dave
> Please reference to the follow link to test COLO.
> http://wiki.qemu.org/Features/COLO.
>
> COLO is a totally new feature which is still in early stage,
> your comments and feedback are warmly welcomed.
>
> NOTE:
> We have decided to re-implement the colo proxy in userspace (In qemu exactly).
> you can find the discussion about why & how to realize the colo proxy in qemu from the follow link:
> http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04069.html
>
> TODO:
> 1. COLO function switch on/off
> 2. The capability of continuous FT
> 3. Optimize the performance.
>
> v8:
> - Move some global variables into MigrationIncomingState and MigrationState
> - Move some cleanup work form colo thread and colo incoming thread into failover
> BH function and also fix the code logic for the cleanup work.
> - fix the bug that colo thread and colo incoming thread possibly block in the
> socket 'recv' call when do failover work.
> - Optimize colo_flush_ram_cache()
> - Add migration state for incoming side, we use the state to verify if migration
> incoming side is in COLO state or not (Patch 5).
> - Drop the patch 'COLO: Disable qdev hotplug when VM is in COLO mode', since it is not correct.
>
> zhanghailiang (34):
> configure: Add parameter for configure to enable/disable COLO support
> migration: Introduce capability 'colo' to migration
> COLO: migrate colo related info to slave
> colo-comm/migration: skip colo info section for special cases
> migration: Add state records for migration incoming
> migration: Integrate COLO checkpoint process into migration
> migration: Integrate COLO checkpoint process into loadvm
> COLO: Implement colo checkpoint protocol
> COLO: Add a new RunState RUN_STATE_COLO
> QEMUSizedBuffer: Introduce two help functions for qsb
> COLO: Save VM state to slave when do checkpoint
> COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
> COLO VMstate: Load VM state into qsb before restore it
> arch_init: Start to trace dirty pages of SVM
> COLO RAM: Flush cached RAM into SVM's memory
> COLO failover: Introduce a new command to trigger a failover
> COLO failover: Introduce state to record failover process
> COLO failover: Implement COLO primary/secondary vm failover work
> qmp event: Add event notification for COLO error
> COLO failover: Don't do failover during loading VM's state
> COLO: Add new command parameter 'forward_nic' 'colo_script' for net
> COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
> tap: Make launch_script() public
> COLO NIC: Implement colo nic device interface configure()
> colo-nic: Handle secondary VM's original net device configure
> COLO NIC: Implement colo nic init/destroy function
> COLO NIC: Some init work related with proxy module
> COLO: Handle nfnetlink message from proxy module
> COLO: Do checkpoint according to the result of packets comparation
> COLO: Improve checkpoint efficiency by do additional periodic checkpoint
> COLO: Add colo-set-checkpoint-period command
> COLO NIC: Implement NIC checkpoint and failover
> COLO: Implement shutdown checkpoint
> COLO: Add block replication into colo process
>
> configure | 33 +-
> docs/qmp/qmp-events.txt | 16 +
> hmp-commands.hx | 30 ++
> hmp.c | 15 +
> hmp.h | 2 +
> include/exec/cpu-all.h | 1 +
> include/migration/colo.h | 45 +++
> include/migration/failover.h | 33 ++
> include/migration/migration.h | 19 +
> include/migration/qemu-file.h | 3 +-
> include/net/colo-nic.h | 37 ++
> include/net/net.h | 2 +
> include/net/tap.h | 19 +
> include/sysemu/sysemu.h | 3 +
> migration/Makefile.objs | 2 +
> migration/colo-comm.c | 75 ++++
> migration/colo-failover.c | 83 +++++
> migration/colo.c | 805 ++++++++++++++++++++++++++++++++++++++++++
> migration/migration.c | 116 ++++--
> migration/qemu-file-buf.c | 58 +++
> migration/ram.c | 242 ++++++++++++-
> migration/savevm.c | 2 +-
> net/Makefile.objs | 1 +
> net/colo-nic.c | 457 ++++++++++++++++++++++++
> net/net.c | 2 +
> net/tap.c | 90 +++--
> qapi-schema.json | 58 ++-
> qapi/event.json | 15 +
> qemu-options.hx | 7 +
> qmp-commands.hx | 42 +++
> scripts/colo-proxy-script.sh | 145 ++++++++
> stubs/Makefile.objs | 1 +
> stubs/migration-colo.c | 58 +++
> trace-events | 10 +
> vl.c | 37 +-
> 35 files changed, 2474 insertions(+), 90 deletions(-)
> create mode 100644 include/migration/colo.h
> create mode 100644 include/migration/failover.h
> create mode 100644 include/net/colo-nic.h
> create mode 100644 migration/colo-comm.c
> create mode 100644 migration/colo-failover.c
> create mode 100644 migration/colo.c
> create mode 100644 net/colo-nic.c
> create mode 100755 scripts/colo-proxy-script.sh
> create mode 100644 stubs/migration-colo.c
>
> --
> 1.8.3.1
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK