From: Wen Congyang <wency@cn.fujitsu.com>
To: Yang Hongyang <yanghy@cn.fujitsu.com>, qemu-devel@nongnu.org
Cc: Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
	Jiang Yunhong <yunhong.jiang@intel.com>,
	Dong Eddie <eddie.dong@intel.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Michael R. Hines" <mrhines@linux.vnet.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Walid Nouri <walid.nouri@gmail.com>
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/23] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
Date: Wed, 29 Oct 2014 14:53:41 +0800
Message-ID: <54508EF5.8000908@cn.fujitsu.com>
In-Reply-To: <1411464235-5653-1-git-send-email-yanghy@cn.fujitsu.com>

On 09/23/2014 05:23 PM, Yang Hongyang wrote:
> Virtual machine (VM) replication is a well-known technique for
> providing application-agnostic, software-implemented hardware fault
> tolerance ("non-stop service"). COLO is a high-availability solution.
> Both the primary VM (PVM) and the secondary VM (SVM) run in parallel.
> They receive the same requests from the client and generate responses
> in parallel too. If the response packets from the PVM and SVM are
> identical, they are released immediately. Otherwise, a VM checkpoint
> (on demand) is conducted. The idea was presented at Xen Summit 2012
> and 2013, and in an academic paper at SOCC 2013. It was also presented
> at KVM Forum 2013:
> http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
> Please refer to the above document for detailed information.
> Please also refer to the previously posted RFC proposal:
> http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html
> 
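
The release-or-checkpoint rule described above can be illustrated with a
small toy model. This is only a sketch, not code from the patchset: the
class name ColoComparator and its fields are hypothetical, packets are
modelled as plain byte strings, and what happens to the held packet after a
checkpoint is deliberately not modelled because the cover letter does not
spell it out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColoComparator:
    """Toy model of COLO output comparison (hypothetical, not QEMU code)."""

    released: List[bytes] = field(default_factory=list)  # packets sent to the client
    checkpoints: int = 0  # number of on-demand checkpoints requested

    def on_responses(self, pvm_packet: bytes, svm_packet: bytes) -> str:
        """Compare one response from each VM and decide what to do."""
        if pvm_packet == svm_packet:
            # Identical output: release the packet immediately.
            self.released.append(pvm_packet)
            return "release"
        # Divergent output: request an on-demand VM checkpoint.
        self.checkpoints += 1
        return "checkpoint"


if __name__ == "__main__":
    cmp_ = ColoComparator()
    print(cmp_.on_responses(b"HTTP/1.1 200 OK", b"HTTP/1.1 200 OK"))
    print(cmp_.on_responses(b"seq=1", b"seq=2"))
```

The point of the model is that a checkpoint is only paid for when the two
VMs visibly diverge, rather than on a fixed schedule.
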
> The patchset is also hosted on github:
> https://github.com/macrosheep/qemu/tree/colo_v0.5
> 
> v2:
> use QEMUSizedBuffer/QEMUFile as COLO buffer
> colo support is enabled by default
> add nic replication support
> addressed comments from Eric Blake and Dr. David Alan Gilbert
> 
> v1:
> implement the COLO framework
> 
> This patchset is an RFC, but it is ready to demo the COLO idea
> with QEMU-KVM.
> Steps to get an overview of COLO using this patchset:
> 1. configure
> 2. compile
> 3. just as for QEMU's normal migration, run two QEMU VMs:
>    - Primary VM
>    - Secondary VM with the -incoming tcp:[IP]:[PORT] option
> 4. in the Primary VM's QEMU monitor, run the following commands:
>    migrate_set_capability colo on
>    migrate tcp:[IP]:[PORT]
> 5. done
> You will see two running VMs; whenever you make changes to the PVM, the
> SVM is synced to the PVM's state.
> 
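
As a convenience, the [IP] and [PORT] placeholders in steps 3 and 4 can be
filled in with a small helper. This is a sketch under assumptions: the
function names are hypothetical, the address 192.0.2.10 and port 4444 are
example values only, and the 'colo' capability exists only with this
patchset applied. The helper only builds the strings quoted in the steps; it
does not start QEMU or talk to the monitor.

```python
from typing import List


def secondary_incoming_option(ip: str, port: int) -> str:
    """Command-line option for the Secondary VM (step 3)."""
    return f"-incoming tcp:{ip}:{port}"


def primary_monitor_commands(ip: str, port: int) -> List[str]:
    """Monitor commands to type on the Primary VM, in order (step 4)."""
    return [
        "migrate_set_capability colo on",  # capability added by this patchset
        f"migrate tcp:{ip}:{port}",
    ]


if __name__ == "__main__":
    ip, port = "192.0.2.10", 4444  # example values, not from the original mail
    print(secondary_incoming_option(ip, port))
    for command in primary_monitor_commands(ip, port):
        print(command)
```

The same address and port must be used on both sides, which is why both
helpers take them as parameters.
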
> TODO list:
> 1. failover (will require heartbeat module: http://www.linux-ha.org/wiki/Downloads)
> 2. disk replication[COLO Disk manager]

Hi all:

I will start to implement disk replication. Before doing this, I think we should decide
how to implement it.

I have two ideas:
1. Implement it in qemu
   Advantage: very easy, and it does not take much time.
   Disadvantage: a virtio disk with vhost is not supported, because its disk I/O
       operations are not handled in qemu.

2. Update drbd to support COLO
   Advantage: we can use it for both KVM and Xen.
   Disadvantage: the implementation may be complex and take a long time.
        (I have not read the drbd code, so I cannot estimate the cost.)

I think we can start with option 1.
If you have other ideas, please let me know.

Thanks
Wen Congyang

> 
> Any comments or feedback are warmly welcome.
> 
> Thanks,
> Yang
> 
> 
> Dr. David Alan Gilbert (1):
>   QEMUSizedBuffer/QEMUFile
> 
> Yang Hongyang (22):
>   configure: add CONFIG_COLO to switch COLO support
>   COLO: introduce an api colo_supported() to indicate COLO support
>   COLO migration: add a migration capability 'colo'
>   COLO info: use colo info to tell migration target colo is enabled
>   COLO save: integrate COLO checkpointed save into qemu migration
>   COLO restore: integrate COLO checkpointed restore into qemu restore
>   COLO: disable qdev hotplug
>   COLO ctl: implement API's that communicate with colo agent
>   COLO ctl: introduce is_slave() and is_master()
>   COLO ctl: implement colo checkpoint protocol
>   COLO ctl: add a RunState RUN_STATE_COLO
>   COLO ctl: implement colo save
>   COLO ctl: implement colo restore
>   COLO save: reuse migration bitmap under colo checkpoint
>   COLO ram cache: implement colo ram cache on slave
>   HACK: trigger checkpoint every 500ms
>   COLO nic: add command line switch
>   COLO nic: init/remove colo nic devices when add/cleanup tap devices
>   COLO nic: implement colo nic device interface support_colo()
>   COLO nic: implement colo nic device interface configure()
>   COLO nic: export colo nic APIs
>   COLO nic: setup/teardown colo nic devices
> 
>  Makefile.objs                      |   2 +
>  arch_init.c                        | 174 +++++++++++-
>  configure                          |  14 +
>  include/exec/cpu-all.h             |   1 +
>  include/migration/migration-colo.h |  36 +++
>  include/migration/migration.h      |  13 +
>  include/migration/qemu-file.h      |  28 ++
>  include/net/colo-nic.h             |  20 ++
>  include/net/net.h                  |   4 +
>  include/qemu/typedefs.h            |   1 +
>  migration-colo-comm.c              |  78 ++++++
>  migration-colo.c                   | 540 +++++++++++++++++++++++++++++++++++++
>  migration.c                        |  47 ++--
>  net/Makefile.objs                  |   1 +
>  net/colo-nic.c                     | 227 ++++++++++++++++
>  net/tap.c                          |  45 +++-
>  network-colo                       | 194 +++++++++++++
>  qapi-schema.json                   |  18 +-
>  qemu-file.c                        | 410 ++++++++++++++++++++++++++++
>  qemu-options.hx                    |  10 +-
>  stubs/Makefile.objs                |   1 +
>  stubs/migration-colo.c             |  34 +++
>  vl.c                               |  12 +
>  23 files changed, 1879 insertions(+), 31 deletions(-)
>  create mode 100644 include/migration/migration-colo.h
>  create mode 100644 include/net/colo-nic.h
>  create mode 100644 migration-colo-comm.c
>  create mode 100644 migration-colo.c
>  create mode 100644 net/colo-nic.c
>  create mode 100755 network-colo
>  create mode 100644 stubs/migration-colo.c
> 

