Re: [Qemu-devel] [RFC] COLO HA Project proposal

From: Hongyang Yang <yanghy@cn.fujitsu.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: FNST-Gui Jianfeng <GuiJianfeng@cn.fujitsu.com>,
	Dong Eddie <eddie.dong@intel.com>,
	qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: Re: [Qemu-devel] [RFC] COLO HA Project proposal
Date: Thu, 3 Jul 2014 11:42:43 +0800
Message-ID: <53B4D133.4060903@cn.fujitsu.com>
In-Reply-To: <20140701121248.GH2394@work-vm>

Hi David,

On 07/01/2014 08:12 PM, Dr. David Alan Gilbert wrote:
> * Hongyang Yang (yanghy@cn.fujitsu.com) wrote:
>
> Hi Yang,
>
>> Background:
>>    COLO HA project is a high availability solution. Both primary
>> VM (PVM) and secondary VM (SVM) run in parallel. They receive the
>> same request from client, and generate response in parallel too.
>> If the response packets from PVM and SVM are identical, they are
>> released immediately. Otherwise, a VM checkpoint (on demand) is
>> conducted. The idea was presented at Xen Summit 2012 and 2013,
>> and in an academic paper at SoCC 2013. It was also presented at KVM
>> Forum 2013:
>> http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
>> Please refer to above document for detailed information.
>
> Yes, I remember that talk - very interesting.
>
> I didn't quite understand a couple of things though, perhaps you
> can explain:
>    1) If we ignore the TCP sequence number problem, in an SMP machine
> don't we get other randomnesses - e.g. which core completes something
> first, or who wins a lock contention, so the output stream might not
> be identical - so do those normal bits of randomness cause the machines
> to flag as out-of-sync?

That question is about the COLO agent. I'm CCing Congyang, who can give
a detailed explanation.

>
>    2) If the PVM has decided that the SVM is out of sync (due to 1) and
> the PVM fails at about the same point - can we switch over to the SVM?

Yes, we can switch over. We have two mechanisms to ensure the SVM's state
is consistent:
- Memory cache
   The memory cache is initially the same as the PVM's memory. At a
checkpoint, we cache the PVM's dirty memory while it is being
transferred, and write the cached memory to the SVM only after we have
received all of the PVM memory (we only need to write memory that was
dirty on PVM and SVM since the last checkpoint). This solves problem 2)
you mentioned above: if the PVM fails while checkpointing, the SVM
discards the cached memory and continues to run and provide service
just as it is.

- COLO Disk Manager
   Like the memory cache, the COLO Disk Manager caches the PVM's disk
modifications and writes them to the SVM's disk when checkpointing. If the
PVM fails while checkpointing, the SVM discards the cached disk modifications.
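
To make the commit-or-discard behaviour concrete, here is a minimal Python
sketch of the idea. It is only an illustration, not the actual QEMU code:
the class and method names (SecondaryState, receive, commit, discard) are
made up, and a dict stands in for SVM memory pages or disk sectors.

```python
class SecondaryState:
    """Hypothetical stand-in for the SVM's memory pages or disk sectors."""

    def __init__(self, pages):
        self.pages = dict(pages)   # live state the SVM is running on
        self._staged = {}          # PVM updates cached during a checkpoint

    def receive(self, page_no, data):
        # Cache a dirty page sent by the PVM; the live state is untouched.
        self._staged[page_no] = data

    def commit(self):
        # Apply the cache only once the whole checkpoint has arrived.
        self.pages.update(self._staged)
        self._staged.clear()

    def discard(self):
        # PVM failed mid-checkpoint: drop the partial update and keep running.
        self._staged.clear()


svm = SecondaryState({0: b"a", 1: b"b"})

svm.receive(0, b"A")
svm.discard()                      # PVM died before the checkpoint completed
assert svm.pages == {0: b"a", 1: b"b"}

svm.receive(1, b"B")
svm.commit()                       # checkpoint fully received
assert svm.pages == {0: b"a", 1: b"B"}
```

The point of the staging area is that a half-received checkpoint never
reaches the state the SVM is actually running on.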

>
> I'm worried that due to (1) there are periods where the system
> is out-of-sync and a failure of the PVM is not protected.  Does that happen?
> If so how often?
>
>> The attached was the architecture of kvm-COLO we proposed.
>>    - COLO Manager: Requires modifications of qemu
>>      - COLO Controller
>>          COLO Controller includes modifications of save/restore
>>        flow just like MC (microcheckpointing), a memory cache on
>>        secondary VM which caches the dirty pages of primary VM
>>        and a failover module which provides APIs to communicate
>>        with external heartbeat module.
>>      - COLO Disk Manager
>>          When pvm writes data into image, the colo disk manager
>>        captures this data and sends it to the colo disk manager
>>        which makes sure the content of svm's image is consistent
>>        with the content of pvm's image.
>
> I wonder if there is any way to coordinate this between COLO, Michael
> Hines microcheckpointing and the two separate reverse-execution
> projects that also need to do some similar things.
> Are there any standard APIs for the heartbeat thing we can already
> tie into?

Unfortunately, we have checked MC, and it does not have heartbeat support for now.

>
>>    - COLO Agent("Proxy module" in the arch picture)
>>        We need an agent to compare the packets returned by
>>      Primary VM and Secondary VM, and decide whether to start a
>>      checkpoint according to some rules. It is a linux kernel
>>      module for host.
>
> Why is that a kernel module, and how does it communicate the state
> to the QEMU instance?

We made this a kernel module to get better performance, and because
packets are easy to hook from within a kernel module.
The QEMU instance uses ioctl() to communicate with the COLO Agent.
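
As a rough illustration of the rule the agent applies (identical responses
are released, a mismatch triggers a checkpoint), here is a user-space
Python sketch. It is not the kernel module: compare_outputs is a
hypothetical name, packets are plain byte strings, and the ioctl()
interface to QEMU is not modelled.

```python
def compare_outputs(pvm_packets, svm_packets):
    """Compare response packets pairwise.

    Returns (released, need_checkpoint): the packets that matched and can
    be sent to the client, and whether a checkpoint should be requested.
    """
    released = []
    for pvm_pkt, svm_pkt in zip(pvm_packets, svm_packets):
        if pvm_pkt != svm_pkt:
            # Outputs diverged: stop releasing and ask for a checkpoint.
            return released, True
        released.append(pvm_pkt)
    return released, False


# Identical responses: everything is released, no checkpoint needed.
assert compare_outputs([b"a", b"b"], [b"a", b"b"]) == ([b"a", b"b"], False)

# Second response differs: only the first is released, checkpoint requested.
assert compare_outputs([b"a", b"b"], [b"a", b"c"]) == ([b"a"], True)
```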

>
>>    - Other minor modifications
>>        We may need other modifications for better performance.
>
> Dave
> P.S. I'm starting to look at fault-tolerance stuff, but haven't
> got very far yet, so starting to try and understand the details
> of COLO, microcheckpointing, etc
>
>> --
>> Thanks,
>> Yang.
>
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>

-- 
Thanks,
Yang.

Thread overview: 11+ messages
2014-06-24  2:08 [Qemu-devel] [RFC] COLO HA Project proposal Hongyang Yang
2014-07-01 12:12 ` Dr. David Alan Gilbert
2014-07-03  3:42   ` Hongyang Yang [this message]
2014-07-04  8:31     ` Dong, Eddie
2014-07-04  8:35       ` Dr. David Alan Gilbert
2014-07-04  8:54         ` Dong, Eddie
2014-07-04 12:22           ` Dr. David Alan Gilbert
2014-07-04 15:55             ` Dong, Eddie
2014-07-08  6:06     ` Michael R. Hines
2014-07-08  6:26       ` Hongyang Yang
2014-07-04 11:22   ` Andreas Färber