* [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance
@ 2013-09-10  3:43 Jules Wang
  2013-09-10  3:43 ` [Qemu-devel] [PATCH RFC 1/4] Curling: add doc Jules Wang
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Jules Wang @ 2013-09-10  3:43 UTC
  To: qemu-devel; +Cc: quintela, owasserm, Jules Wang, stefanha, pbonzini

The goal of Curling (named after the sport) is to provide a fault-tolerance
mechanism for KVM, so that in the event of a hardware failure, the virtual
machine fails over to the backup in a way that is completely transparent to the
guest operating system.

Our goal is exactly the same as that of Kemari, by which Curling is
inspired. However, Curling is simpler than Kemari (too simple, I'm afraid):

* By leveraging the live migration feature, we run live migrations back to back,
indefinitely, between the sender and the receiver, so the two virtual machines
stay synchronized.

* The receiver does not load vm state as soon as a migration begins. Instead, it
prefetches the data of one whole migration into a buffer, then loads vm state
from that buffer afterwards. This "all or nothing" approach prevents the
broken-in-the-middle problem Kemari has.

* The sender sleeps a little while after each migration, to ease the performance
penalty entailed by vm_stop and iothread locks. This is a tradeoff between
performance and accuracy.
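
To illustrate the "all or nothing" point above, here is a minimal sketch, not
the code from the patches. It assumes a receiver that stages incoming chunks in
a side buffer and swaps them into the live state only once a checkpoint is
complete. VmState, Staging and the staging_* functions are hypothetical names
made up for this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the state the receiver vm would resume from. */
typedef struct {
    unsigned char *data;
    size_t len;
} VmState;

/* Hypothetical staging buffer holding one checkpoint while it arrives. */
typedef struct {
    unsigned char *buf;
    size_t len;
    size_t cap;
} Staging;

/* Append one received chunk; the live state is not touched. Returns 0 on success. */
static int staging_append(Staging *s, const unsigned char *chunk, size_t n)
{
    if (n == 0) {
        return 0;
    }
    if (s->len + n > s->cap) {
        size_t cap = s->cap ? s->cap : 64;
        while (cap < s->len + n) {
            cap *= 2;
        }
        unsigned char *p = realloc(s->buf, cap);
        if (p == NULL) {
            return -1;
        }
        s->buf = p;
        s->cap = cap;
    }
    memcpy(s->buf + s->len, chunk, n);
    s->len += n;
    return 0;
}

/* Checkpoint fully received: the staged data becomes the live state. */
static void staging_commit(Staging *s, VmState *vm)
{
    free(vm->data);
    vm->data = s->buf;
    vm->len = s->len;
    s->buf = NULL;
    s->len = 0;
    s->cap = 0;
}

/* Sender died mid-checkpoint: drop the partial data, keep the last good state. */
static void staging_discard(Staging *s)
{
    s->len = 0;
}
```

If the sender fails halfway through a checkpoint, the receiver in this sketch
calls staging_discard() and still holds the previous complete state.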

Usage:
The steps of curling are the same as the steps of live migration, except for the
following:
1. Start the receiver vm with -incoming curling:tcp:<address>:<port>
2. Start ft from the qemu monitor of the sender vm with the following commands:
   > migrate_set_speed  <full bandwidth>
   > migrate curling:tcp:<address>:<port>
3. Connect to the receiver vm by vnc or spice. The screen of the vm is displayed
when curling is ready.
4. Now the sender vm is protected by ft. When it encounters a failure,
the failover kicks in.

Problems to be discussed:
1. When the receiver is prefetching data, how does it know where the EOF of
one migration is?

Currently, we use a magic number 0xfeedcafe to indicate the EOF.
Any better solutions?
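
For discussion, a minimal sketch of two ways to find the end of one migration,
not the code from the patches. The first function searches a buffer for the
magic value and assumes it is sent big-endian, which is a guess made for this
example. If a receiver searched raw stream bytes like this, a payload that
happens to contain the same four bytes would produce a false EOF. The second
function shows one possible alternative, a 4-byte big-endian length prefix in
front of each migration. That framing is an assumption for illustration only.
All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Value from this cover letter; the macro name is hypothetical. */
#define CURLING_EOF_MAGIC 0xfeedcafeU

/* Read 4 bytes as a big-endian 32-bit value (byte order assumed). */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Return the offset of the first occurrence of the magic, or -1 if absent. */
static long find_eof_magic(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++) {
        if (read_be32(buf + i) == CURLING_EOF_MAGIC) {
            return (long)i;
        }
    }
    return -1;
}

/*
 * Alternative framing: a 4-byte length followed by that many payload bytes.
 * Returns 1 once the whole frame is present in buf, 0 otherwise.
 */
static int frame_complete(const unsigned char *buf, size_t len)
{
    if (len < 4) {
        return 0;
    }
    return len - 4 >= read_be32(buf);
}
```

With a length prefix the receiver never inspects the payload, so its contents
cannot be mistaken for the terminator. The cost is that the sender must know
the size of a migration before sending it, or send it in length-prefixed
chunks.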

2. How to reduce the overhead entailed by vm_stop and iothread locks?

Any solutions other than sleeping?
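
To make the tradeoff concrete, here is a simplified model, not a measurement.
Assume each migration pauses the guest for pause_ms and the sender then sleeps
for sleep_ms. The guest is then paused for pause_ms / (pause_ms + sleep_ms) of
wall-clock time, and the backup can lag the sender by roughly one full cycle.
The function name and the example numbers are hypothetical.

```c
/*
 * Fraction of wall-clock time the guest spends paused, under the simplified
 * model of one pause of pause_ms followed by one sleep of sleep_ms per cycle.
 */
static double paused_fraction(double pause_ms, double sleep_ms)
{
    return pause_ms / (pause_ms + sleep_ms);
}
```

For example, with a hypothetical 10 ms pause and a 90 ms sleep the guest is
paused about 10% of the time. Shrinking the sleep to 10 ms keeps the backup
fresher but raises that to about 50%.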

--

Jules Wang (4):
  Curling: add doc
  Curling: cmdline interface
  Curling: the sender
  Curling: the receiver

 arch_init.c                   |  18 +++--
 docs/curling.txt              |  52 ++++++++++++++
 include/migration/migration.h |   2 +
 include/migration/qemu-file.h |   1 +
 include/sysemu/sysemu.h       |   1 +
 migration.c                   |  61 ++++++++++++++--
 savevm.c                      | 158 ++++++++++++++++++++++++++++++++++++++++--
 7 files changed, 277 insertions(+), 16 deletions(-)
 create mode 100644 docs/curling.txt

-- 
1.8.0.1


end of thread, other threads:[~2013-09-12  8:33 UTC | newest]

Thread overview: 20+ messages
2013-09-10  3:43 [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance Jules Wang
2013-09-10  3:43 ` [Qemu-devel] [PATCH RFC 1/4] Curling: add doc Jules Wang
2013-09-10  3:43 ` [Qemu-devel] [PATCH RFC 2/4] Curling: cmdline interface Jules Wang
2013-09-10 13:57   ` Juan Quintela
2013-09-10 13:03     ` Paolo Bonzini
2013-09-10 16:37       ` Juan Quintela
2013-09-10 14:38         ` Paolo Bonzini
2013-09-10 15:21           ` Juan Quintela
2013-09-10 15:22           ` Juan Quintela
2013-09-11  2:51     ` junqing.wang
2013-09-10  3:43 ` [Qemu-devel] [PATCH RFC 3/4] Curling: the sender Jules Wang
2013-09-10 14:05   ` Juan Quintela
2013-09-11  7:31     ` junqing.wang
2013-09-10  3:43 ` [Qemu-devel] [PATCH RFC 4/4] Curling: the receiver Jules Wang
2013-09-10 14:19   ` Juan Quintela
2013-09-11  8:25     ` junqing.wang
2013-09-10 12:27 ` [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance Orit Wasserman
2013-09-11  1:54   ` junqing.wang
2013-09-12  7:37     ` Orit Wasserman
2013-09-12  8:17       ` junqing.wang
