Message-ID: <5412DBA4.1060408@cn.fujitsu.com>
Date: Fri, 12 Sep 2014 19:40:20 +0800
From: Hongyang Yang
References: <1406125538-27992-1-git-send-email-yanghy@cn.fujitsu.com> <1406125538-27992-12-git-send-email-yanghy@cn.fujitsu.com> <20140801150347.GE2430@work-vm> <541290C5.4010905@cn.fujitsu.com> <20140912111722.GD2413@work-vm>
In-Reply-To: <20140912111722.GD2413@work-vm>
Subject: Re: [Qemu-devel] [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol
To: "Dr. David Alan Gilbert"
Cc: kvm@vger.kernel.org, GuiJianfeng@cn.fujitsu.com, eddie.dong@intel.com, qemu-devel@nongnu.org, mrhines@linux.vnet.ibm.com

On 09/12/2014 07:17 PM, Dr. David Alan Gilbert wrote:
> * Hongyang Yang (yanghy@cn.fujitsu.com) wrote:
>>
>>
>> On 08/01/2014 11:03 PM, Dr. David Alan Gilbert wrote:
>>> * Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
>
>>>> +static int do_colo_transaction(MigrationState *s, QEMUFile *control,
>>>> +                               QEMUFile *trans)
>>>> +{
>>>> +    int ret;
>>>> +
>>>> +    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
>>>> +    if (ret) {
>>>> +        goto out;
>>>> +    }
>>>> +
>>>> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);
>>>
>>> What happens at this point if the slave just doesn't respond?
>>> (i.e. the socket doesn't drop - you just don't get the byte).
>>
>> If the socket returns bytes that were not expected, exit. If the
>> socket returns an error, do some cleanup and quit the COLO process.
>> Refer to: colo_ctl_get() and colo_ctl_get_value()
>
> But what happens if the slave just doesn't respond at all; e.g.
> if the slave host loses power, it'll take a while (many seconds)
> before the socket will timeout.

It will wait until the call returns a timeout error, then do some
cleanup and quit the COLO process. Is there perhaps a better way to
handle this?

>
> Dave
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>

--
Thanks,
Yang.