From: Jules <junqing.wang@cs2c.com.cn>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: pbonzini@redhat.com, quintela@redhat.com, qemu-devel@nongnu.org,
owasserm@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Curling: KVM Fault Tolerance
Date: Wed, 23 Oct 2013 08:08:55 +0800
Message-ID: <1382486935.1780.6.camel@localhost>
In-Reply-To: <20131017115059.GF10774@stefanha-thinkpad.redhat.com>
> On Tue, Oct 15, 2013 at 03:26:19PM +0800, Jules Wang wrote:
> > v2 -> v3:
> > * add documentation of new option in qapi-schema.
> >
> > * long option name: ft -> fault-tolerant
> >
> > v1 -> v2:
> > * cmdline: migrate curling:tcp:<address>:<port>
> > -> migrate -f tcp:<address>:<port>
> >
> > * sender: use QEMU_VM_FILE_MAGIC_FT as the header of the migration
> > to indicate this is a ft migration.
> >
> > * receiver: look for the signature:
> > QEMU_VM_EOF_MAGIC + QEMU_VM_FILE_MAGIC_FT(64bit total)
> > which indicates the end of one migration.
> > --
> > Jules Wang (4):
> > Curling: add doc
> > Curling: cmdline interface.
> > Curling: the sender
> > Curling: the receiver
>
First of all, thanks for your superb and spot-on comments.
> It would be helpful to clarify the status of Curling in the cover letter
> email so reviewers know what to expect.
OK, but I'm not sure how best to describe the status. Could you
please give me an example?
>
> This series does not address I/O or failover. I guess you are aware of
> the missing topics that I mentioned, here are my thoughts on them:
>
> I/O needs to be held back until the destination host has acknowledged
> receiving the last full migration state. The outside world cannot
> witness state changes in the guest until the migration state has been
> successfully transferred to the destination host. Otherwise the guest
> may appear to act incorrectly when resuming execution from the last
> snapshot.
>
> The time period used by the FT sender thread determines how much latency
> is added to I/O requests.
Yes, there is added latency. That is inevitable.
I guess you mean the following situation:
If a message 'hello' is sent to a chat room server just a few seconds
before the failover happens, the message may be delivered to the other
participants twice, or it may be lost.
Am I right?
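
A minimal sketch of the hold-back rule quoted above: guest output is
queued and only becomes visible once the checkpoint that covers it has
been acknowledged by the destination. This is an illustration, not code
from the patches; OutputBuffer and all of its method names are
hypothetical.

```python
from collections import deque


class OutputBuffer:
    """Hold guest output until the checkpoint covering it is acknowledged.

    Hypothetical illustration; none of these names exist in QEMU.
    """

    def __init__(self):
        self.epoch = 0          # number of the checkpoint currently being built
        self.pending = deque()  # (epoch, packet) pairs not yet visible outside
        self.released = []      # packets the outside world has seen

    def guest_send(self, packet):
        # Output produced during epoch N is covered by checkpoint N.
        self.pending.append((self.epoch, packet))

    def checkpoint_sent(self):
        # The sender finished transferring the current checkpoint;
        # later output belongs to the next one.
        sent = self.epoch
        self.epoch += 1
        return sent

    def checkpoint_acked(self, acked_epoch):
        # Release, in order, everything covered by the acknowledged checkpoint.
        while self.pending and self.pending[0][0] <= acked_epoch:
            self.released.append(self.pending.popleft()[1])


buf = OutputBuffer()
buf.guest_send("hello")        # queued, not yet visible
epoch = buf.checkpoint_sent()  # checkpoint 0 transferred
buf.guest_send("world")        # belongs to checkpoint 1
buf.checkpoint_acked(epoch)    # only "hello" is released
```

In this model, output that is still pending when a failover happens was
never seen by the outside world, so the resumed guest can produce it
again without anyone observing it twice.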
>
> Failover functionality is missing from these patches. We cannot simply
> start executing on the destination host when the migration connection
> ends. If the guest disk image is located on shared storage then
> split-brain occurs when a network error terminates the migration
> connection -
> will both hosts begin accessing the shared disk?
YES
>
I have a simple way to handle that. In short: a third point, the gateway.
Both the sender and the receiver check their connectivity to the gateway
every X seconds. Let A and B denote whether the sender and the receiver,
respectively, are connected to the gateway.
When the connection between the sender and the receiver is down,
A && B is false.
If A is false, the VM instance at the sender will be stopped.
If B is false, the VM instance at the receiver will not be started.
a. A false, B false: 0 VMs run
b. A false, B true:  1 VM runs
c. A true,  B false: 1 VM runs
d. A true,  B true:  1 VM runs (normal case)
It becomes complicated when we consider the transitions between
these four states.
I suggest adding this feature to libvirt instead of QEMU.
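
The four cases above can be sketched as a small decision function. This
is only an illustration under stated assumptions: running_vms and the
link_up flag are hypothetical names, and the receiver is assumed to take
over only when the migration connection is down and it can still reach
the gateway.

```python
def running_vms(a: bool, b: bool, link_up: bool) -> int:
    """Count the VM instances that run under the gateway rule.

    a       -- sender can reach the gateway
    b       -- receiver can reach the gateway
    link_up -- migration connection between sender and receiver is alive
    All names are hypothetical; this is not code from the patches.
    """
    sender_runs = a                    # A false: the sender's VM is stopped
    receiver_runs = b and not link_up  # B false: the receiver never starts
    return int(sender_runs) + int(receiver_runs)


# Cases a-c: the migration connection is down.
print(running_vms(False, False, link_up=False))  # case a
print(running_vms(False, True, link_up=False))   # case b
print(running_vms(True, False, link_up=False))   # case c
# Case d: normal operation, the connection is up.
print(running_vms(True, True, link_up=True))     # case d
```

The one combination the table does not cover is A true and B true while
the connection is down. The function returns 2 there, which is the
split-brain case that the rule "A && B is false" assumes cannot occur.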
> What is your plan to address these issues?
>
> Stefan
>