From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>,
Jiang Yunhong <yunhong.jiang@intel.com>,
Dong Eddie <eddie.dong@intel.com>, Ye Wei <wei.ye1987@gmail.com>,
xen-devl <xen-devel@lists.xen.org>,
Hong Tao <bobby.hong@huawei.com>, Xu Yao <xuyao.xu@huawei.com>,
Shriram Rajagopalan <rshriram@cs.ubc.ca>
Subject: Re: [RFC Patch v2 00/16] COarse-grain LOck-stepping Virtual Machines for Non-stop Service
Date: Thu, 11 Jul 2013 10:37:55 +0100
Message-ID: <51DE7CF3.7050609@citrix.com>
In-Reply-To: <1373531748-12547-1-git-send-email-wency@cn.fujitsu.com>
On 11/07/13 09:35, Wen Congyang wrote:
> Virtual machine (VM) replication is a well known technique for providing
> application-agnostic software-implemented hardware fault tolerance -
> "non-stop service". Currently, remus provides this function, but it buffers
> all output packets, and the latency is unacceptable.
>
> At Xen Summit 2012, we introduced a new VM replication solution: colo
> (COarse-grain LOck-stepping virtual machine). The presentation is at
> the following URL:
> http://www.slideshare.net/xen_com_mgr/colo-coarsegrain-lockstepping-virtual-machines-for-nonstop-service
>
> Here is the summary of the solution:
> From the client's point of view, as long as the client observes identical
> responses from the primary and secondary VMs, according to the service
> semantics, then the secondary VM(SVM) is a valid replica of the primary
> VM(PVM), and can successfully take over when a hardware failure of the
> PVM is detected.
How set in stone are you about the terms PVM and SVM?
SVM already has a specific meaning in Xen, being AMD's Secure Virtual
Machine extensions, which allow for HVM guests.
As a lesser problem, PVM is sometimes used to mean PV, as a mirror of HVM.
~Andrew
>
> This patchset is an RFC and implements the framework of colo:
> 1. Both PVM and SVM are running
> 2. checkpoint only when the output packets from PVM and SVM differ
> 3. cache write requests from SVM
>
> ChangeLog from v1 to v2:
> 1. update block-remus to support colo
> 2. split the large patch into smaller ones
> 3. fix some bugs
> 4. add a new hypercall for colo
>
> Changelog:
> Patch 1: optimize the dirty pages transfer speed.
> Patch 2-3: allow the SVM to run after a checkpoint
> Patch 4-5: modifications for colo on the master side (wait for a new checkpoint,
> communicate with the slave when doing a checkpoint)
> Patch 6-7: implement colo's user interface
>
>
> Wen Congyang (16):
> xen: introduce new hypercall to reset vcpu
> block-remus: introduce colo mode
> block-remus: introduce a interface to allow the user specify which
> mode the backup end uses
> dominfo.completeRestore() will be called more than once in colo mode
> xc_domain_restore: introduce restore_callbacks for colo
> colo: implement restore_callbacks init()/free()
> colo: implement restore_callbacks get_page()
> colo: implement restore_callbacks flush_memory
> colo: implement restore_callbacks update_p2m()
> colo: implement restore_callbacks finish_restore()
> xc_restore: implement for colo
> XendCheckpoint: implement colo
> xc_domain_save: flush cache before calling callbacks->postcopy()
> add callback to configure network for colo
> xc_domain_save: implement save_callbacks for colo
> remus: implement colo mode
>
> tools/blktap2/drivers/block-remus.c | 188 ++++-
> tools/libxc/Makefile | 8 +-
> tools/libxc/xc_domain_restore.c | 264 ++++--
> tools/libxc/xc_domain_restore_colo.c | 939 +++++++++++++++++++++
> tools/libxc/xc_domain_save.c | 23 +-
> tools/libxc/xc_save_restore_colo.h | 14 +
> tools/libxc/xenguest.h | 51 ++
> tools/libxl/Makefile | 2 +-
> tools/python/xen/lowlevel/checkpoint/checkpoint.c | 322 +++++++-
> tools/python/xen/lowlevel/checkpoint/checkpoint.h | 1 +
> tools/python/xen/remus/device.py | 8 +
> tools/python/xen/remus/image.py | 8 +-
> tools/python/xen/remus/save.py | 13 +-
> tools/python/xen/xend/XendCheckpoint.py | 127 ++-
> tools/python/xen/xend/XendDomainInfo.py | 13 +-
> tools/remus/remus | 28 +-
> tools/xcutils/Makefile | 4 +-
> tools/xcutils/xc_restore.c | 36 +-
> xen/arch/x86/domain.c | 57 ++
> xen/arch/x86/x86_64/entry.S | 4 +
> xen/include/public/xen.h | 1 +
> 21 files changed, 1947 insertions(+), 164 deletions(-)
> create mode 100644 tools/libxc/xc_domain_restore_colo.c
> create mode 100644 tools/libxc/xc_save_restore_colo.h
>