From: Wen Congyang <wency@cn.fujitsu.com>
To: Yang Hongyang <yanghy@cn.fujitsu.com>, qemu-devel@nongnu.org
Cc: Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
	Jiang Yunhong <yunhong.jiang@intel.com>,
	Dong Eddie <eddie.dong@intel.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Michael R. Hines" <mrhines@linux.vnet.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Walid Nouri <walid.nouri@gmail.com>
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/23] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
Date: Wed, 29 Oct 2014 14:53:41 +0800
Message-ID: <54508EF5.8000908@cn.fujitsu.com>
In-Reply-To: <1411464235-5653-1-git-send-email-yanghy@cn.fujitsu.com>

On 09/23/2014 05:23 PM, Yang Hongyang wrote:
> Virtual machine (VM) replication is a well-known technique for
> providing application-agnostic, software-implemented hardware fault
> tolerance ("non-stop service"). COLO is a high-availability solution.
> Both the primary VM (PVM) and the secondary VM (SVM) run in parallel. They
> receive the same requests from the client and generate responses in parallel
> too. If the response packets from the PVM and SVM are identical, they are
> released immediately. Otherwise, a VM checkpoint (on demand) is
> conducted. The idea was presented at Xen Summit 2012 and 2013,
> and in an academic paper at SoCC 2013. It was also presented at KVM Forum
> 2013:
> http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
> Please refer to the above document for detailed information.
> Please also refer to the previously posted RFC proposal:
> http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html
> 
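
To make the comparison rule quoted above concrete, here is a minimal Python
sketch. It is not taken from the patchset: the names colo_decide and run and
the "release"/"checkpoint" result strings are hypothetical, and the sketch
only illustrates the release-or-checkpoint decision, not how packets are
captured or how a checkpoint is carried out.

```python
from typing import Iterable, List, Tuple


def colo_decide(pvm_packet: bytes, svm_packet: bytes) -> str:
    """Return "release" if both VMs produced the same response packet,
    otherwise "checkpoint" (an on-demand VM checkpoint is needed)."""
    return "release" if pvm_packet == svm_packet else "checkpoint"


def run(pairs: Iterable[Tuple[bytes, bytes]]) -> List[str]:
    """Apply the decision to a stream of (PVM, SVM) response pairs."""
    return [colo_decide(p, s) for p, s in pairs]


if __name__ == "__main__":
    # Hypothetical traffic: the second pair diverges, so it forces a checkpoint.
    print(run([(b"HTTP 200", b"HTTP 200"), (b"seq=1", b"seq=2")]))
```

The byte-for-byte equality test is an assumption made for brevity; the
description above only says the packets must be "identical".
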
> The patchset is also hosted on github:
> https://github.com/macrosheep/qemu/tree/colo_v0.5
> 
> v2:
> use QEMUSizedBuffer/QEMUFile as COLO buffer
> colo support is enabled by default
> add nic replication support
> addressed comments from Eric Blake and Dr. David Alan Gilbert
> 
> v1:
> implement the frame of colo
> 
> This patchset is an RFC, but it is ready to demo the COLO idea
> with QEMU-KVM.
> Steps to get an overview of COLO using this patchset:
> 1. configure
> 2. compile
> 3. just like QEMU's normal migration, run two QEMU VMs:
>    - Primary VM
>    - Secondary VM with -incoming tcp:[IP]:[PORT] option
> 4. on the Primary VM's QEMU monitor, run the following commands:
>    migrate_set_capability colo on
>    migrate tcp:[IP]:[PORT]
> 5. done
> You will see two running VMs; whenever you make changes to the PVM, the SVM
> will be synced to the PVM's state.
> 
> TODO list:
> 1. failover (will require heartbeat module: http://www.linux-ha.org/wiki/Downloads)
> 2. disk replication [COLO Disk manager]

Hi all:

I am going to start implementing disk replication. Before doing so, I think we should
decide how to implement it.

I have two ideas:
1. Implement it in QEMU
   Advantage: it is easy and does not take much time.
   Disadvantage: a virtio disk with vhost is not supported, because its disk I/O
       operations are not handled in QEMU.

2. Update DRBD to support COLO
   Advantage: we can use it for both KVM and Xen.
   Disadvantage: the implementation may be complex and take a long time.
        (I have not read the DRBD code, so I cannot estimate the cost.)

I think we can start with option 1.
If you have other ideas, please let me know.

Thanks
Wen Congyang

> 
> Any comments and feedback are warmly welcome.
> 
> Thanks,
> Yang
> 
> 
> Dr. David Alan Gilbert (1):
>   QEMUSizedBuffer/QEMUFile
> 
> Yang Hongyang (22):
>   configure: add CONFIG_COLO to switch COLO support
>   COLO: introduce an api colo_supported() to indicate COLO support
>   COLO migration: add a migration capability 'colo'
>   COLO info: use colo info to tell migration target colo is enabled
>   COLO save: integrate COLO checkpointed save into qemu migration
>   COLO restore: integrate COLO checkpointed restore into qemu restore
>   COLO: disable qdev hotplug
>   COLO ctl: implement API's that communicate with colo agent
>   COLO ctl: introduce is_slave() and is_master()
>   COLO ctl: implement colo checkpoint protocol
>   COLO ctl: add a RunState RUN_STATE_COLO
>   COLO ctl: implement colo save
>   COLO ctl: implement colo restore
>   COLO save: reuse migration bitmap under colo checkpoint
>   COLO ram cache: implement colo ram cache on slave
>   HACK: trigger checkpoint every 500ms
>   COLO nic: add command line switch
>   COLO nic: init/remove colo nic devices when add/cleanup tap devices
>   COLO nic: implement colo nic device interface support_colo()
>   COLO nic: implement colo nic device interface configure()
>   COLO nic: export colo nic APIs
>   COLO nic: setup/teardown colo nic devices
> 
>  Makefile.objs                      |   2 +
>  arch_init.c                        | 174 +++++++++++-
>  configure                          |  14 +
>  include/exec/cpu-all.h             |   1 +
>  include/migration/migration-colo.h |  36 +++
>  include/migration/migration.h      |  13 +
>  include/migration/qemu-file.h      |  28 ++
>  include/net/colo-nic.h             |  20 ++
>  include/net/net.h                  |   4 +
>  include/qemu/typedefs.h            |   1 +
>  migration-colo-comm.c              |  78 ++++++
>  migration-colo.c                   | 540 +++++++++++++++++++++++++++++++++++++
>  migration.c                        |  47 ++--
>  net/Makefile.objs                  |   1 +
>  net/colo-nic.c                     | 227 ++++++++++++++++
>  net/tap.c                          |  45 +++-
>  network-colo                       | 194 +++++++++++++
>  qapi-schema.json                   |  18 +-
>  qemu-file.c                        | 410 ++++++++++++++++++++++++++++
>  qemu-options.hx                    |  10 +-
>  stubs/Makefile.objs                |   1 +
>  stubs/migration-colo.c             |  34 +++
>  vl.c                               |  12 +
>  23 files changed, 1879 insertions(+), 31 deletions(-)
>  create mode 100644 include/migration/migration-colo.h
>  create mode 100644 include/net/colo-nic.h
>  create mode 100644 migration-colo-comm.c
>  create mode 100644 migration-colo.c
>  create mode 100644 net/colo-nic.c
>  create mode 100755 network-colo
>  create mode 100644 stubs/migration-colo.c
> 

