From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
    id 1NCbZI-0003HN-T5
    for qemu-devel@nongnu.org; Mon, 23 Nov 2009 11:15:12 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
    id 1NCbZD-0003Ec-WB
    for qemu-devel@nongnu.org; Mon, 23 Nov 2009 11:15:12 -0500
Received: from [199.232.76.173] (port=55556 helo=monty-python.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43)
    id 1NCbZD-0003EW-QO
    for qemu-devel@nongnu.org; Mon, 23 Nov 2009 11:15:07 -0500
Received: from mx1.redhat.com ([209.132.183.28]:24991)
    by monty-python.gnu.org with esmtp (Exim 4.60)
    (envelope-from )
    id 1NCbZD-00043g-Bm
    for qemu-devel@nongnu.org; Mon, 23 Nov 2009 11:15:07 -0500
Date: Mon, 23 Nov 2009 18:15:04 +0200
From: Gleb Natapov
Message-ID: <20091123161504.GY2999@redhat.com>
References: <20091123123640.GL2999@redhat.com>
    <20091123143242.GO2999@redhat.com>
    <4B0AA165.60900@codemonkey.ws>
    <20091123145356.GQ2999@redhat.com>
    <4B0AA4D6.9060607@codemonkey.ws>
    <20091123152252.GR2999@redhat.com>
    <4B0AAB20.8060109@codemonkey.ws>
    <20091123154951.GT2999@redhat.com>
    <4B0AB3AB.5080204@codemonkey.ws>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4B0AB3AB.5080204@codemonkey.ws>
Subject: [Qemu-devel] Re: Live migration protocol, device features, ABIs and other beasts
List-Id: qemu-devel.nongnu.org
To: Anthony Liguori
Cc: Paolo Bonzini, Juan Quintela, qemu-devel@nongnu.org

On Mon, Nov 23, 2009 at 10:09:15AM -0600, Anthony Liguori wrote:
> Gleb Natapov wrote:
> >On Mon, Nov 23, 2009 at 09:32:48AM -0600, Anthony Liguori wrote:
> >>Gleb Natapov wrote:
> >>>On Mon, Nov 23, 2009 at 09:05:58AM -0600, Anthony Liguori wrote:
> >>>>Gleb Natapov wrote:
> >>>>>Then I don't see why Juan claims what he claims.
> >>>>Live migration is unidirectional.
> >>>>As long as qemu can send out all
> >>>>of the data without the stream closing, it will "succeed" on the
> >>>>source. While this may sound like a bug, it's an impossible problem
> >>>>to solve as it's dealing with reliable communication between two
> >>>>unreliable nodes (i.e. the two general's problem). This is why the
> >>>>source qemu does not exit after a successful live migration. It
> >>>As far as I remember the two general's problem talks about unreliable
> >>>channel, not unreliable nodes.
> >>That's just semantics. The problem is that one general does not
> >>know if the other general received the message. Even if there was a
> >>reliable channel between the two generals, if one of the generals
> >>can die with no indication, then you still have the same problem,
> >>i.e. the first general doesn't know for sure if the second general
> >>received the message.
> >>
> >>>Why not having destination send ACK/NACK
> >>>to the source when it knows that migration succeeded/failed.
> >>1) Source sends migration traffic
> >>2) Destination receives it, sends Ack
> >>3) Destination needs to wait to receive Ack from Source before
> >>starting guest to ensure that guest does not start twice
> >>4) Source receives Ack from Destination, sends Ack
> >>5) Source kills guest
> >>6) Destination receives Ack from Source, starts guest
> >>
> >>If Destination dies in between 5 and 6, the VM disappears.
> >>
> >1) Source sends migration traffic
> >2) Destination receives it, sends Ack
> >3) Destination start running
> >4) Source receives Ack from Destination
> >5) Source kills guest
> >
> >If Source does not receive Ack it stays paused and wait for management to
> >sort things out.
>
> Is it really useful to kill the source guest in this case? I'm wary
> of how useful an unreliable ack is namely because it introduces
> rather complex semantics from a management tool perspective. If
> folks think it would be really useful, I'm not fundamentally opposed
> to it.
>

I am OK with management being responsible for sorting things out. Juan said
that the destination can't abort a migration in the middle, so I pointed out
an easy solution that will work in 99.999% of cases.

--
			Gleb.
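
For illustration only, the single-Ack sequence quoted above (destination
starts after sending its Ack, source kills its copy only after receiving
that Ack, otherwise it stays paused) can be sketched as a toy simulation.
This is not QEMU code: `GuestState`, `migrate` and both boolean flags are
hypothetical names invented for the sketch, and it assumes the source guest
is already paused while the final migration traffic is sent.

```python
from enum import Enum


class GuestState(Enum):
    """Hypothetical guest states used only by this sketch."""
    NOT_STARTED = "not started"
    RUNNING = "running"
    PAUSED = "paused"
    KILLED = "killed"


def migrate(dest_receives_all: bool, ack_delivered: bool):
    """Simulate one migration attempt under the single-Ack scheme.

    dest_receives_all -- assumption: destination got the whole stream
    ack_delivered     -- assumption: destination's Ack reached the source
    Returns (source_state, destination_state, needs_management).
    """
    # 1) Source sends migration traffic; its guest is paused meanwhile.
    source = GuestState.PAUSED
    dest = GuestState.NOT_STARTED

    # 2) + 3) Destination sends Ack and starts running, but only if it
    # actually received everything.
    if dest_receives_all:
        dest = GuestState.RUNNING

    # 4) + 5) Source kills its guest only after it has seen the Ack.
    if dest_receives_all and ack_delivered:
        return GuestState.KILLED, dest, False

    # No Ack seen: source stays paused and waits for management.
    return source, dest, True


if __name__ == "__main__":
    for received in (True, False):
        for acked in (True, False):
            print(received, acked, migrate(received, acked))
```

In this model the guest never runs on both sides and never disappears from
both sides; the lost-Ack case leaves a paused source next to a running
destination, which is the situation left for management to resolve.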