From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1MPPIW-00047J-MR
	for qemu-devel@nongnu.org; Fri, 10 Jul 2009 19:14:32 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1MPPIS-00046v-6G
	for qemu-devel@nongnu.org; Fri, 10 Jul 2009 19:14:32 -0400
Received: from [199.232.76.173] (port=39830 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1MPPIS-00046s-0Q
	for qemu-devel@nongnu.org; Fri, 10 Jul 2009 19:14:28 -0400
Received: from mail2.shareable.org ([80.68.89.115]:41937)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60) (envelope-from <jamie@shareable.org>) id 1MPPIR-0001EB-KS
	for qemu-devel@nongnu.org; Fri, 10 Jul 2009 19:14:27 -0400
Date: Sat, 11 Jul 2009 00:14:24 +0100
From: Jamie Lokier <jamie@shareable.org>
Subject: Re: [Qemu-devel] [PATCH 2/3] move vm stop/start to migrate_set_state
Message-ID: <20090710231424.GD30322@shareable.org>
References: <1247140059-5034-1-git-send-email-pbonzini@redhat.com>
	<1247140059-5034-3-git-send-email-pbonzini@redhat.com>
	<4A55F46F.6060705@codemonkey.ws> <4A55F510.5090801@redhat.com>
	<4A55F641.6000701@codemonkey.ws>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4A55F641.6000701@codemonkey.ws>
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org

Anthony Liguori wrote:
> Paolo Bonzini wrote:
> >On 07/09/2009 03:45 PM, Anthony Liguori wrote:
> >>How does the disk become full during the final stage?  The guest isn't
> >>running.
> >
> >The host disk can become full and cause a "migrate exec" to fail.  Or 
> >for network migration, you could have the connection drop 
> >exactly during the final stage.  In this case, the VM would be 
> >unconditionally restarted.
> 
> Because migration failed.  Is that not the desired behavior?  It seems 
> like it is to me.
> 
> If I try to do a live migration, it should either succeed and my guest 
> experiences minimal downtime or it should fail and my guest should 
> experience minimal downtime.

What happens if the destination host sends "migration completed", and
then the connection drops before that message is delivered reliably to
the sending host?

The destination host will run the VM,
and the sending host will restart and run the VM too.

Two copies of the same VM running together doesn't sound healthy.

This is a classic handshaking problem and I'm not aware of any perfect
solution, only ways to ensure eventual recovery and to make sure that
temporary uncertainty errs on the side of caution.  In this case,
caution would mean neither VM running, a notification to the system
manager of this rare condition, and the possibility of recovering once
the two hosts are able to resume communication.  I don't know how to
do better than that.
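
To make the cautious policy above concrete, here is a minimal sketch of
the decision as seen from the sending host.  It is not QEMU code: the
names Outcome, SourceAction, decide_source_action and
resolve_after_reconnect are hypothetical, and it assumes the sending
host can tell apart "destination confirmed", "destination refused" and
"link lost with no answer".

```python
from enum import Enum


class Outcome(Enum):
    """What the sending host knows after the final migration stage (assumed states)."""
    ACKED = "acked"          # destination confirmed it took over the VM
    REFUSED = "refused"      # destination confirmed it did NOT take over
    LINK_LOST = "link_lost"  # connection dropped; destination state unknown


class SourceAction(Enum):
    """What the sending host does with its own, currently stopped, copy of the VM."""
    DISCARD_VM = "discard_vm"            # destination owns the VM now
    RESUME_VM = "resume_vm"              # migration failed; restart locally
    HOLD_AND_NOTIFY = "hold_and_notify"  # stay stopped, alert the system manager


def decide_source_action(outcome: Outcome) -> SourceAction:
    """Only restart the local VM when the destination is known not to run it."""
    if outcome is Outcome.ACKED:
        return SourceAction.DISCARD_VM
    if outcome is Outcome.REFUSED:
        return SourceAction.RESUME_VM
    # Unknown destination state: restarting here risks two running copies,
    # so err on the side of caution and keep the VM stopped.
    return SourceAction.HOLD_AND_NOTIFY


def resolve_after_reconnect(destination_is_running: bool) -> SourceAction:
    """Recovery step once the two hosts can talk again."""
    if destination_is_running:
        return SourceAction.DISCARD_VM
    return SourceAction.RESUME_VM
```

The point of the third branch is that the sending host never restarts
on silence alone; the cost is extra downtime in the rare case where the
link drops and the destination had not in fact taken over.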

-- Jamie