From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1MPRbT-0007Sm-HS
	for qemu-devel@nongnu.org; Fri, 10 Jul 2009 21:42:15 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1MPRbO-0007QK-Ea
	for qemu-devel@nongnu.org; Fri, 10 Jul 2009 21:42:14 -0400
Received: from [199.232.76.173] (port=47317 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1MPRbO-0007QH-9i
	for qemu-devel@nongnu.org; Fri, 10 Jul 2009 21:42:10 -0400
Received: from mail2.shareable.org ([80.68.89.115]:39677)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60) (envelope-from <jamie@shareable.org>) id 1MPRbN-00033i-T0
	for qemu-devel@nongnu.org; Fri, 10 Jul 2009 21:42:10 -0400
Date: Sat, 11 Jul 2009 02:42:07 +0100
From: Jamie Lokier <jamie@shareable.org>
Subject: Re: [Qemu-devel] [PATCH 2/3] move vm stop/start to migrate_set_state
Message-ID: <20090711014207.GM30322@shareable.org>
References: <1247140059-5034-1-git-send-email-pbonzini@redhat.com>
	<1247140059-5034-3-git-send-email-pbonzini@redhat.com>
	<4A55F46F.6060705@codemonkey.ws> <4A55F510.5090801@redhat.com>
	<4A55F641.6000701@codemonkey.ws>
	<20090710231424.GD30322@shareable.org>
	<4A57E3AA.5020305@codemonkey.ws>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4A57E3AA.5020305@codemonkey.ws>
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org

Anthony Liguori wrote:
> Jamie Lokier wrote:
> >Anthony Liguori wrote:
> >  
> >>Paolo Bonzini wrote:
> >>    
> >>>On 07/09/2009 03:45 PM, Anthony Liguori wrote:
> >>>      
> >>>>How does the disk become full during the final stage?  The guest isn't
> >>>>running.
> >>>>        
> >>>The host disk can become full and cause a "migrate exec" to fail.  Or 
> >>>for network migration, you could have the connection drop 
> >>>exactly during the final stage.  In this case, the VM would be 
> >>>unconditionally restarted.
> >>>      
> >>Because migration failed.  Is that not the desired behavior?  It seems 
> >>like it is to me.
> >>
> >>If I try to do a live migration, it should either succeed and my guest 
> >>experiences minimal downtime or it should fail and my guest should 
> >>experience minimal downtime.
> >>    
> >
> >What happens if the destination host sends "migration completed", and
> >then the connection drops before that message is delivered reliably to
> >the sending host?
> >  
> 
> We don't check the return value of close

Linux doesn't return I/O or network errors from close() anyway, except
for a few network filesystems, and not even those in older kernels.  It
generally returns zero.

(If you are saving to disk and want to detect write I/O errors, which
by the way include disk full when writing to a network filesystem,
you need to call fsync() before close() and check its return value.
I'm not sure if this is relevant.)
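
To make the fsync() point concrete, here is a minimal sketch in C,
assuming a plain file on Linux.  The names write_all() and
save_and_sync() are made up for illustration; they are not anything
in QEMU.  The only point is that fsync() is where deferred write
errors can show up, so its result is checked before close().

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes, retrying on short writes
 * and EINTR.  Returns 0 on success, -1 with errno set on failure. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: save buf to path and report errors that the
 * kernel may only notice when the data is flushed.  Returns 0 on
 * success, -1 with errno set on failure. */
int save_and_sync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* fsync() is the call that can report a deferred write error;
     * close() alone generally returns zero. */
    if (write_all(fd, buf, len) < 0 || fsync(fd) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

A caller would treat any -1 from save_and_sync() as "the saved image
cannot be trusted", whichever of the three steps reported it.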

> so the last possible place failure can occur is the last write.  By
> definition, if the write failed, the migration session could not
> have been completed successfully.  Migration is unidirectional.
> There is no "migration completed" message from the destination.
> We're very conservative wrt restarting the source.

Yes, I agree: as long as it's conservative and only restarts the
source when the last byte needed to start the destination has
definitely not been written, that's safe.  That's a good design.

If you get an error during the last write(), I wouldn't trust that to
mean the recipient will definitely not see the data you wrote.  (Enjoy
the double negative).  It's another variation of the handshake
uncertainty, this time reflected in what write() should report when
it's uncertain about a network transmission.  If it reports an error
when it's uncertain, then you can't trust that a write() error means
the data was not written, only that a problem was detected.

If you save the final "commit" byte for its own 1-byte write(), then
an error from any earlier write tells you the last byte has not been
sent, so it's safe to resume the source.  Reading SO_ERROR just before
the 1-byte write() would maximise the chance of catching an earlier
failure first, but it's probably so rare as to be pointless.
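
Roughly what I have in mind, as a sketch only.  It assumes a connected
stream socket on Linux and uses send() with MSG_NOSIGNAL in place of
write() so that a dropped peer gives EPIPE rather than SIGPIPE.  The
enum, send_with_commit() and write_fully() are hypothetical names, not
existing QEMU code.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum send_result {
    SEND_FAILED_BEFORE_COMMIT, /* commit byte never sent: safe to resume */
    SEND_COMMIT_UNCERTAIN,     /* error on the commit byte: can't tell */
    SEND_COMMIT_WRITTEN        /* commit byte accepted by the kernel */
};

/* Send all len bytes, retrying on short sends and EINTR. */
static int write_fully(int sock, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(sock, p, len, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

enum send_result send_with_commit(int sock, const void *payload,
                                  size_t len, unsigned char commit)
{
    int pending = 0;
    socklen_t optlen = sizeof pending;
    ssize_t n;

    /* Any failure here happens before the commit byte exists on the
     * wire, so the destination cannot have started. */
    if (write_fully(sock, payload, len) < 0)
        return SEND_FAILED_BEFORE_COMMIT;

    /* Optional: pick up an asynchronous error already recorded on
     * the socket before committing. */
    if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &pending, &optlen) < 0
        || pending != 0)
        return SEND_FAILED_BEFORE_COMMIT;

    /* The commit byte gets its own 1-byte send. */
    do {
        n = send(sock, &commit, 1, MSG_NOSIGNAL);
    } while (n < 0 && errno == EINTR);

    /* An error here does not prove the byte was not delivered. */
    return n == 1 ? SEND_COMMIT_WRITTEN : SEND_COMMIT_UNCERTAIN;
}
```

Only SEND_FAILED_BEFORE_COMMIT would justify restarting the source
automatically; SEND_COMMIT_UNCERTAIN is the handshake-uncertainty case
and would need a more conservative policy.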

-- Jamie