From: Jules <junqing.wang@cs2c.com.cn>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: pbonzini@redhat.com, quintela@redhat.com, qemu-devel@nongnu.org,
	owasserm@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance
Date: Wed, 23 Oct 2013 08:08:55 +0800
Message-ID: <1382486935.1780.6.camel@localhost>
In-Reply-To: <20131017115059.GF10774@stefanha-thinkpad.redhat.com>


> On Tue, Oct 15, 2013 at 03:26:19PM +0800, Jules Wang wrote:
> > v2 -> v3:
> > * add documentation of new option in qapi-schema.
> > 
> > * long option name: ft -> fault-tolerant
> > 
> > v1 -> v2:
> > * cmdline: migrate curling:tcp:<address>:<port> 
> >        ->  migrate -f tcp:<address>:<port>
> > 
> > * sender: use QEMU_VM_FILE_MAGIC_FT as the header of the migration
> >           to indicate this is a ft migration.
> > 
> > * receiver: look for the signature: 
> >             QEMU_VM_EOF_MAGIC + QEMU_VM_FILE_MAGIC_FT(64bit total)
> >             which indicates the end of one migration.
> > --
> > Jules Wang (4):
> >   Curling: add doc
> >   Curling: cmdline interface.
> >   Curling: the sender
> >   Curling: the receiver
> 

First of all, thanks for your superb and spot-on comments.

> It would be helpful to clarify the status of Curling in the cover letter
> email so reviewers know what to expect.

OK, but I'm not sure how to clarify the status. Could you please give
me an example?
> 
> This series does not address I/O or failover.  I guess you are aware of
> the missing topics that I mentioned, here are my thoughts on them:
> 
> I/O needs to be held back until the destination host has acknowledged
> receiving the last full migration state.  The outside world cannot
> witness state changes in the guest until the migration state has been
> successfully transferred to the destination host.  Otherwise the guest
> may appear to act incorrectly when resuming execution from the last
> snapshot.
> 
> The time period used by the FT sender thread determines how much latency
> is added to I/O requests.

Yes, there is added latency. That is inevitable.

I guess you mean the following situation:
If a message 'hello' is sent to a chat room server just a few seconds
before the failover happens, the message may be delivered to the other
participants twice, or be lost.

Am I right?
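
To make sure I understand the hold-back rule you describe, here is a
minimal sketch of it in Python. It is not code from this series: the
class OutputBuffer and its method names are hypothetical, and it assumes
each checkpoint closes one "epoch" of guest output that may be released
only after the destination acknowledges that checkpoint.

```python
from collections import deque


class OutputBuffer:
    """Hold guest output until the checkpoint covering it is acknowledged.

    Hypothetical illustration only; none of these names exist in QEMU.
    """

    def __init__(self):
        self._epoch = 0          # epoch currently being executed by the guest
        self._pending = deque()  # (epoch, packet) pairs not yet released
        self.released = []       # packets the outside world has seen

    def queue(self, packet):
        # Output produced now is only safe once the checkpoint that
        # ends the current epoch has reached the destination.
        self._pending.append((self._epoch, packet))

    def checkpoint_sent(self):
        # Close the current epoch and return its id; later output
        # belongs to the next epoch.
        closed = self._epoch
        self._epoch += 1
        return closed

    def checkpoint_acked(self, epoch):
        # The destination holds the state for `epoch`, so all output
        # from that epoch and earlier can be released, in order.
        while self._pending and self._pending[0][0] <= epoch:
            self.released.append(self._pending.popleft()[1])


buf = OutputBuffer()
buf.queue("hello")            # produced before the checkpoint
epoch = buf.checkpoint_sent()
buf.queue("world")            # produced after the checkpoint
buf.checkpoint_acked(epoch)   # only "hello" may leave the host now
```

With this rule the 'hello' message is either never seen by the outside
world (failover before the ack) or is covered by the state the receiver
resumes from, at the cost of delaying it by up to one checkpoint period.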

> 
> Failover functionality is missing from these patches.  We cannot simply
> start executing on the destination host when the migration connection
> ends.  If the guest disk image is located on shared storage then
> split-brain occurs when a network error terminates the migration
> connection - will both hosts begin accessing the shared disk?

Yes.

I have a simple way to handle that. In short: use a third point, the
gateway.

Both the sender and the receiver check their connectivity to the gateway
every X seconds. Let A and B denote whether the sender and the receiver,
respectively, can reach the gateway.

When the connection between the sender and the receiver is down,
A && B is false.

If A is false, the VM instance on the sender will be stopped.
If B is false, the VM instance on the receiver will not be started.

   A      B      VMs running
a. false  false  0
b. false  true   1
c. true   false  1
d. true   true   1 (normal case)
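
The rule above can be sketched as follows. This is only an illustration
under my own assumptions, not part of the patches: the function name
decide() and the link_up flag are hypothetical, and I assume that while
the sender-receiver link is up only the sender runs. The sketch does not
cover the case where the link is down but both hosts still reach the
gateway, since the text above assumes A && B is false in that situation.

```python
def decide(a: bool, b: bool, link_up: bool) -> tuple[bool, bool]:
    """Return (sender_runs, receiver_runs).

    a       -- sender can reach the gateway
    b       -- receiver can reach the gateway
    link_up -- sender <-> receiver migration connection is alive

    Hypothetical helper for illustration; not a QEMU or libvirt API.
    """
    if link_up:
        # Normal fault-tolerant operation (assumed): the sender runs
        # the VM and the receiver only collects state.
        return (True, False)
    # Link lost: the sender keeps running only if it reaches the
    # gateway; the receiver starts only if it reaches the gateway.
    return (a, b)


def running_vms(a: bool, b: bool, link_up: bool) -> int:
    """Count how many VM instances run for the given connectivity."""
    sender_runs, receiver_runs = decide(a, b, link_up)
    return int(sender_runs) + int(receiver_runs)
```

Cases a to c correspond to link_up=False, and case d to link_up=True.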

It becomes more complicated once we consider the transitions between
these four states.

I suggest adding this feature to libvirt instead of QEMU.


> What is your plan to address these issues?
> 
> Stefan
> 
