qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jules <junqing.wang@cs2c.com.cn>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: pbonzini@redhat.com, quintela@redhat.com, qemu-devel@nongnu.org,
	owasserm@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance
Date: Wed, 23 Oct 2013 08:08:55 +0800	[thread overview]
Message-ID: <1382486935.1780.6.camel@localhost> (raw)
In-Reply-To: <20131017115059.GF10774@stefanha-thinkpad.redhat.com>


> On Tue, Oct 15, 2013 at 03:26:19PM +0800, Jules Wang wrote:
> > v2 -> v3:
> > * add documentation of new option in qapi-schema.
> > 
> > * long option name: ft -> fault-tolerant
> > 
> > v1 -> v2:
> > * cmdline: migrate curling:tcp:<address>:<port> 
> >        ->  migrate -f tcp:<address>:<port>
> > 
> > * sender: use QEMU_VM_FILE_MAGIC_FT as the header of the migration
> >           to indicate this is a ft migration.
> > 
> > * receiver: look for the signature: 
> >             QEMU_VM_EOF_MAGIC + QEMU_VM_FILE_MAGIC_FT(64bit total)
> >             which indicates the end of one migration.
> > --
> > Jules Wang (4):
> >   Curling: add doc
> >   Curling: cmdline interface.
> >   Curling: the sender
> >   Curling: the receiver
> 

First of all, thanks for your superb and spot-on comments.

> It would be helpful to clarify the status of Curling in the cover letter
> email so reviewers know what to expect.

OK, but I'm not quite clear about how to clarify the status, would you
pls give me an example? 
> 
> This series does not address I/O or failover.  I guess you are aware of
> the missing topics that I mentioned, here are my thoughts on them:
> 
> I/O needs to be held back until the destination host has acknowledged
> receiving the last full migration state.  The outside world cannot
> witness state changes in the guest until the migration state has been
> successfully transferred to the destination host.  Otherwise the guest
> may appear to act incorrectly when resuming execution from the last
> snapshot.
> 
> The time period used by the FT sender thread determines how much latency
> is added to I/O requests.

Yes, there is the latency. That is inevitable.

I guess you mean the following situation:
If a msg 'hello' is sent to the chat room server just a few seconds
before the failover happens, there is a possibility that the msg will be
sent to the others twice or be lost.

Am I right?

> 
> Failover functionality is missing from these patches.  We cannot simply
> start executing on the destination host when the migration connection
> ends.  If the guest disk image is located on shared storage then
> split-brain occurs when a network error terminates the migration
> connection - 

> will both hosts begin accessing the shared disk? 
YES
> 

I have a simple way to handle that. In one word, the third point
--gateway.

Both the sender and the receiver check the connectivity to the gateway
every X seconds. Let's use A and B stand for whether the sender and the
receiver are connected to the gateway respectively.

When the connection between the sender and the receiver is down.
A && B is false.

If A is false, the vm instance at the sender will be stopped.
If B is false, the vm instance at the receiver will not be started.

a.A false  B false: 0 vm run
b.A false  B true: 1 vm run 
c.A true   B false: 1 vm run
d.A true   B true : 1 vm run (normal case)

It becomes complicated when we consider the state transitions in
these four states.
  
I suggest adding this feature to libvirt instead of qemu.


> What is your plan to address these issues?
> 
> Stefan
> 

  reply	other threads:[~2013-10-22  8:16 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-15  7:26 [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance Jules Wang
2013-10-15  7:26 ` [Qemu-devel] [PATCH v3 1/4] Curling: add doc Jules Wang
2013-10-17 11:25   ` Stefan Hajnoczi
2013-10-15  7:26 ` [Qemu-devel] [PATCH v3 2/4] Curling: cmdline interface Jules Wang
2013-10-15  7:26 ` [Qemu-devel] [PATCH v3 3/4] Curling: the sender Jules Wang
2013-10-15  7:26 ` [Qemu-devel] [PATCH v3 4/4] Curling: the receiver Jules Wang
2013-10-17 11:50 ` [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance Stefan Hajnoczi
2013-10-23  0:08   ` Jules [this message]
2013-10-24 12:10     ` Stefan Hajnoczi
2013-10-22 21:00 ` Michael R. Hines
2013-10-23  5:23   ` Jules
2013-11-06 18:38     ` Michael R. Hines
2013-10-22 21:08 ` Michael R. Hines

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1382486935.1780.6.camel@localhost \
    --to=junqing.wang@cs2c.com.cn \
    --cc=owasserm@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=stefanha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).