Date: Sat, 11 Jul 2009 00:14:24 +0100
From: Jamie Lokier
Subject: Re: [Qemu-devel] [PATCH 2/3] move vm stop/start to migrate_set_state
Message-ID: <20090710231424.GD30322@shareable.org>
References: <1247140059-5034-1-git-send-email-pbonzini@redhat.com> <1247140059-5034-3-git-send-email-pbonzini@redhat.com> <4A55F46F.6060705@codemonkey.ws> <4A55F510.5090801@redhat.com> <4A55F641.6000701@codemonkey.ws>
In-Reply-To: <4A55F641.6000701@codemonkey.ws>
List-Id: qemu-devel.nongnu.org
To: Anthony Liguori
Cc: Paolo Bonzini, qemu-devel@nongnu.org

Anthony Liguori wrote:
> Paolo Bonzini wrote:
> >On 07/09/2009 03:45 PM, Anthony Liguori wrote:
> >>How does the disk become full during the final stage? The guest isn't
> >>running.
> >
> >The host disk can become full and cause a "migrate exec" to fail. Or
> >for network migration, you could have the connection drop
> >exactly during the final stage. In this case, the VM would be
> >unconditionally restarted.
>
> Because migration failed. Is that not the desired behavior? It seems
> like it is to me.
>
> If I try to do a live migration, it should either succeed and my guest
> experiences minimal downtime or it should fail and my guest should
> experience minimal downtime.

What happens if the destination host sends "migration completed", and
then the connection drops before that message is delivered reliably to
the sending host?

The destination host will run the VM, and the sending host will restart
and run the VM too. Two copies of the same VM running at the same time
doesn't sound healthy.

This is a classic handshaking problem, and I'm not aware of any perfect
solution, only ways to ensure eventual recovery and to make sure that
any temporary uncertainty errs on the side of caution.

In this case, caution would mean neither VM running, a notification to
the system manager about this rare condition, and the possibility of
recovering once the two hosts are able to communicate again. I don't
know how to do better than that.

-- Jamie
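
The cautious behaviour described above can be sketched as a small state machine on the sending host. This is only an illustration in Python, not QEMU code: the class, the state names and the `notify` callback are all hypothetical. It also assumes that, on reconnect, the destination can report whether it started the guest, and that it will not start the guest later if it reports that it did not.

```python
from enum import Enum, auto


class SourceState(Enum):
    RUNNING = auto()      # guest executing on the source, migration in progress
    FINAL_STAGE = auto()  # guest stopped, last chunk of state being sent
    COMPLETED = auto()    # destination owns the guest; source stays stopped
    FAILED = auto()       # migration failed; source is the only valid copy
    UNCERTAIN = auto()    # link lost in final stage; destination may be running


class SourceMigration:
    """Hypothetical source-side view of a single migration attempt."""

    def __init__(self, notify):
        self.state = SourceState.RUNNING
        self.vm_running = True
        self.notify = notify  # callback used to alert the system manager

    def enter_final_stage(self):
        # The guest is stopped before the last of its state is sent.
        self.vm_running = False
        self.state = SourceState.FINAL_STAGE

    def on_completed_ack(self):
        # "migration completed" arrived: the destination runs the guest.
        if self.state in (SourceState.FINAL_STAGE, SourceState.UNCERTAIN):
            self.state = SourceState.COMPLETED

    def on_connection_lost(self):
        if self.state is SourceState.RUNNING:
            # The guest never stopped here, so the destination cannot have
            # a complete copy. Failing and carrying on is safe.
            self.state = SourceState.FAILED
        elif self.state is SourceState.FINAL_STAGE:
            # The ack may have been sent but lost. Restarting now could
            # produce two running copies, so stay stopped and raise an alert.
            self.state = SourceState.UNCERTAIN
            self.notify("migration outcome unknown; guest held stopped")

    def on_reconnect(self, destination_running):
        # Resolve the uncertainty once the two hosts can talk again.
        if self.state is not SourceState.UNCERTAIN:
            return
        if destination_running:
            self.state = SourceState.COMPLETED
        else:
            self.state = SourceState.FAILED
            self.vm_running = True  # now safe to resume on the source
```

The point of the sketch is the `UNCERTAIN` state: a connection loss during the final stage neither restarts the guest unconditionally nor declares success. The cost is extra downtime in a rare case, in exchange for never running two copies of the guest.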