qemu-devel.nongnu.org archive mirror
From: Orit Wasserman <owasserm@redhat.com>
To: Jules Wang <junqing.wang@cs2c.com.cn>
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com,
	quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance
Date: Tue, 10 Sep 2013 15:27:49 +0300
Message-ID: <522F1045.2000705@redhat.com>
In-Reply-To: <1378784607-7398-1-git-send-email-junqing.wang@cs2c.com.cn>

On 09/10/2013 06:43 AM, Jules Wang wrote:
> The goal of Curling(sports) is to provide a fault tolerant mechanism for KVM,
> so that in the event of a hardware failure, the virtual machine fails over to
> the backup in a way that is completely transparent to the guest operating system.
> 
> Our goal is exactly the same as the goal of Kemari, by which Curling is
> inspired. However, Curling is simpler than Kemari (too simple, I'm afraid):
> 
> * By leveraging the live migration feature, we do endless live migrations
> between the sender and receiver, so the two virtual machines stay synchronized.
> 

Hi,
I see two issues with your solution.
The first is that if the VM fails in the middle of a live migration,
the backup VM state will be inconsistent, which means you can't fail over to it.
Solving this is not simple: you need some transaction mechanism that
changes the backup VM state only when the transaction (the live migration) completes.
Kemari has something like that.
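
As a minimal sketch of what such a transaction could look like (illustrative
Python, not QEMU or Kemari code; the class and method names are hypothetical):
incoming data is only staged, and the committed state is replaced in one step
once the whole migration round has arrived.

```python
class BackupState:
    """Backup VM state that changes only when a whole checkpoint has arrived."""

    def __init__(self, initial: bytes = b"") -> None:
        self.committed = initial     # last complete, consistent checkpoint
        self._staging = bytearray()  # data of the migration round in flight

    def receive(self, chunk: bytes) -> None:
        # Incoming migration data is buffered, never applied directly.
        self._staging.extend(chunk)

    def commit(self) -> None:
        # Called once the sender signals that the round completed.
        self.committed = bytes(self._staging)
        self._staging.clear()

    def abort(self) -> None:
        # Sender failed mid-round: drop partial data, keep the old state.
        self._staging.clear()
```

If the sender dies between receive() and commit(), the backup calls abort()
and fails over from the previous committed checkpoint.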

The second is that, sadly, live migration doesn't always converge, which means
that the backup VM won't have a consistent state to fail over to.
You need to detect such a case and throttle down the guest to force convergence.
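
A rough sketch of the detect-and-throttle idea, under a deliberately simple
model that is an assumption of this example and not how QEMU measures things:
while one round of dirty pages is being sent at `bandwidth` pages/s, the guest
dirties `dirty_rate` pages/s. If a round leaves at least as many dirty pages
as it started with, the guest is slowed down (here: its dirty rate is halved).

```python
def migrate_with_throttle(dirty_pages, dirty_rate, bandwidth,
                          threshold=1.0, max_rounds=100):
    """Return (converged, throttle_steps) for a toy pre-copy loop."""
    throttle_steps = 0
    for _ in range(max_rounds):
        if dirty_pages <= threshold:
            return True, throttle_steps
        # Pages dirtied by the guest while this round was being sent.
        remaining = dirty_pages * dirty_rate / bandwidth
        if remaining >= dirty_pages:
            # No progress: the migration is not converging, so throttle.
            dirty_rate /= 2
            throttle_steps += 1
        dirty_pages = remaining
    return False, throttle_steps
```

With dirty_rate below bandwidth the loop converges without throttling; with
dirty_rate above bandwidth it only converges after the guest has been slowed.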

Regards,
Orit

> * The receiver does not load vm state once the migration begins. Instead, it
> prefetches the data of one whole migration into a buffer, then loads vm state
> from that buffer afterwards. This "all or nothing" approach prevents the
> broken-in-the-middle problem Kemari has.
> 
> * The sender sleeps a little while after each migration, to ease the performance
> penalty entailed by vm_stop and iothread locks. This is a tradeoff between
> performance and accuracy.
> 
> Usage:
> The steps of curling are the same as the steps of live migration except the
> following:
> 1. Start the receiver vm with -incoming curling:tcp:<address>:<port>
> 2. Start FT in the qemu monitor of the sender vm with the following commands:
>    > migrate_set_speed  <full bandwidth>
>    > migrate curling:tcp:<address>:<port>
> 3. Connect to the receiver vm by vnc or spice. The screen of the vm is displayed
> when curling is ready.
> 4. Now the sender vm is protected by FT. When it encounters a failure,
> the failover kicks in.
> 
> Problems to be discussed:
> 1. When the receiver is prefetching data, how does it know where the EOF of
> one migration is?
> 
> Currently, we use a magic number 0xfeedcafe to indicate the EOF.
> Any better solutions?
> 
> 2. How to reduce the overhead entailed by vm_stop and iothread locks?
> 
> Any solutions other than sleeping?
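
On question 1, one common alternative to an in-band magic number is to prefix
each migration round with its length, so that payload bytes that happen to
equal 0xfeedcafe cannot be mistaken for the end marker. The sketch below is
illustrative Python, not the Curling wire format; the 8-byte header and the
function names are assumptions. It also assumes the sender knows the size of
a round before sending it, which may require buffering or chunking on the
sender side.

```python
import struct

HEADER = struct.Struct("!Q")  # 8-byte big-endian payload length (assumed)


def frame(payload: bytes) -> bytes:
    """Prefix one migration round with its length."""
    return HEADER.pack(len(payload)) + payload


def read_frames(stream: bytes):
    """Split a byte stream into complete rounds plus any trailing partial data."""
    frames = []
    offset = 0
    while offset + HEADER.size <= len(stream):
        (length,) = HEADER.unpack_from(stream, offset)
        end = offset + HEADER.size + length
        if end > len(stream):
            break  # the last round is incomplete; do not load it
        frames.append(stream[offset + HEADER.size:end])
        offset = end
    return frames, stream[offset:]
```

Only the rounds returned in the first element are complete; the leftover bytes
belong to a round that was still in flight and should be discarded on failover.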
> 
> --
> 
> Jules Wang (4):
>   Curling: add doc
>   Curling: cmdline interface
>   Curling: the sender
>   Curling: the receiver
> 
>  arch_init.c                   |  18 +++--
>  docs/curling.txt              |  52 ++++++++++++++
>  include/migration/migration.h |   2 +
>  include/migration/qemu-file.h |   1 +
>  include/sysemu/sysemu.h       |   1 +
>  migration.c                   |  61 ++++++++++++++--
>  savevm.c                      | 158 ++++++++++++++++++++++++++++++++++++++++--
>  7 files changed, 277 insertions(+), 16 deletions(-)
>  create mode 100644 docs/curling.txt
> 

Thread overview: 20+ messages
2013-09-10  3:43 [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance Jules Wang
2013-09-10  3:43 ` [Qemu-devel] [PATCH RFC 1/4] Curling: add doc Jules Wang
2013-09-10  3:43 ` [Qemu-devel] [PATCH RFC 2/4] Curling: cmdline interface Jules Wang
2013-09-10 13:57   ` Juan Quintela
2013-09-10 13:03     ` Paolo Bonzini
2013-09-10 16:37       ` Juan Quintela
2013-09-10 14:38         ` Paolo Bonzini
2013-09-10 15:21           ` Juan Quintela
2013-09-10 15:22           ` Juan Quintela
2013-09-11  2:51     ` junqing.wang
2013-09-10  3:43 ` [Qemu-devel] [PATCH RFC 3/4] Curling: the sender Jules Wang
2013-09-10 14:05   ` Juan Quintela
2013-09-11  7:31     ` junqing.wang
2013-09-10  3:43 ` [Qemu-devel] [PATCH RFC 4/4] Curling: the receiver Jules Wang
2013-09-10 14:19   ` Juan Quintela
2013-09-11  8:25     ` junqing.wang
2013-09-10 12:27 ` Orit Wasserman [this message]
2013-09-11  1:54   ` [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance junqing.wang
2013-09-12  7:37     ` Orit Wasserman
2013-09-12  8:17       ` junqing.wang
