From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4A5B4A3C.1070903@codemonkey.ws>
Date: Mon, 13 Jul 2009 09:52:44 -0500
From: Anthony Liguori
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH 2/3] move vm stop/start to migrate_set_state
References: <1247140059-5034-3-git-send-email-pbonzini@redhat.com>
 <4A55F46F.6060705@codemonkey.ws> <4A55F510.5090801@redhat.com>
 <4A55F641.6000701@codemonkey.ws> <20090710231424.GD30322@shareable.org>
 <4A57E3AA.5020305@codemonkey.ws> <20090711014207.GM30322@shareable.org>
 <4A5958F8.3090306@codemonkey.ws> <4A59F1AC.9070401@redhat.com>
 <4A5A3533.7040107@codemonkey.ws> <20090713053139.GD28046@redhat.com>
In-Reply-To: <20090713053139.GD28046@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Gleb Natapov
Cc: Paolo Bonzini, Avi Kivity, qemu-devel@nongnu.org

Gleb Natapov wrote:
> With unreliable socket it doesn't matter what write() returns data may
> or may not reach the destination regardless, with reliable sockets
> write() succeeds only after data was acked by the receiver, but it still
> doesn't mean that data will be read from destination socket.
>

You are correct, and we handle both of these cases appropriately. If we
think a migration completed successfully when it really didn't because
of a lost network connection, the result is that both the source and
the destination are stopped. A third party can resume the source and
continue along happily.

The case being debated is whether write() can ever actually complete and
yet still return an error. In this case, since we automatically resume
the source on error, the result would be two copies of the VM running.

I haven't seen any evidence that this case could actually happen, other
than theoretical speculation. I just looked at the migration code, and
it's not a simple change to try and be conservative wrt this case
because of the way we do buffering.

Regards,

Anthony Liguori

> --
> Gleb.
>
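
The cases discussed in this message can be laid out as a small truth table. The following is only an illustrative sketch in Python, not QEMU code: the names `Outcome`, `migration_outcome`, and the `resume_on_error` switch are hypothetical, and it assumes the destination starts running exactly when the full state arrived.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    """Which side ends up running after the final migration write."""
    source_running: bool
    destination_running: bool


def migration_outcome(write_reported_error: bool,
                      data_arrived: bool,
                      resume_on_error: bool = True) -> Outcome:
    """Model the end state of a migration attempt.

    Assumptions (not taken from the QEMU source):
    - the source is stopped before the final write and is resumed only
      if write() reports an error and resume_on_error is set;
    - the destination runs if and only if the data actually arrived.
    """
    source_running = write_reported_error and resume_on_error
    destination_running = data_arrived
    return Outcome(source_running, destination_running)


if __name__ == "__main__":
    for err in (False, True):
        for arrived in (False, True):
            print(f"error={err!s:5} arrived={arrived!s:5} ->",
                  migration_outcome(err, arrived))
```

Under these assumptions, a write that succeeds while the data is lost leaves both sides stopped, and only the combination of a reported error with data that did arrive produces two running copies. Passing `resume_on_error=False` models the conservative behaviour, which the message notes is not a simple change in the real code because of buffering.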