From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:38824)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1bzI0E-0005VE-1c
	for qemu-devel@nongnu.org; Wed, 26 Oct 2016 02:47:58 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1bzI09-0002w5-2P
	for qemu-devel@nongnu.org; Wed, 26 Oct 2016 02:47:58 -0400
Received: from szxga03-in.huawei.com ([119.145.14.66]:13703)
	by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71)
	(envelope-from )
	id 1bzI08-0002uM-2w
	for qemu-devel@nongnu.org; Wed, 26 Oct 2016 02:47:52 -0400
References: <1476792613-11712-1-git-send-email-zhang.zhanghailiang@huawei.com>
	<20161026060931.GR1679@amit-lp.rh>
From: Hailiang Zhang
Message-ID: <58105092.2050102@huawei.com>
Date: Wed, 26 Oct 2016 14:43:30 +0800
MIME-Version: 1.0
In-Reply-To: <20161026060931.GR1679@amit-lp.rh>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH COLO-Frame (Base) v21 00/17] COarse-grain
	LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Amit Shah
Cc: quintela@redhat.com, qemu-devel@nongnu.org, dgilbert@redhat.com,
	wency@cn.fujitsu.com, lizhijian@cn.fujitsu.com,
	xiecl.fnst@cn.fujitsu.com, Hai Huang , Weidong Han , Dong eddie ,
	Stefan Hajnoczi , Jason Wang

Hi Amit,

On 2016/10/26 14:09, Amit Shah wrote:
> Hello,
>
> On (Tue) 18 Oct 2016 [20:09:56], zhanghailiang wrote:
>> This is the 21th version of COLO frame series.
>>
>> Rebase to the latest master.
>
> I've reviewed the patchset, have some minor comments, but overall it
> looks good. The changes are contained, and common code / existing
> code paths are not affected much. We can still target to merge this
> for 2.8.
>

I really appreciate your help ;) I will fix all the issues and send
v22. I hope we can still make the 2.8 deadline.

> Do you have any tests on how much the VM slows down / downtime
> incurred during checkpoints?
>

Yes, we tested that a long time ago; it all depends. The downtime is
determined by the time needed to transfer the dirty pages plus the time
needed to flush RAM from the RAM buffer. But we do have methods to
reduce the downtime.

One method is to reduce the amount of data (mainly dirty pages) sent
during a checkpoint by transferring dirty pages asynchronously while
the PVM and SVM are running (not during the checkpoint itself).
Besides, we can reuse migration capabilities such as compression.

Another method is to reduce the time spent flushing RAM by using the
userfaultfd API to turn the RAM copy into marking a bitmap. We can also
flush the RAM buffer with multiple threads, as advised by Dave ...

> Also, can you tell how did you arrive at the default checkpoint
> interval?
>

Er, for this value, we referred to Remus on the Xen platform. ;)
But after we implement COLO with the COLO proxy, this interval will be
changed to a larger one (10s), and we will make it configurable too.
Besides, we will add another configurable value to control the minimum
interval between checkpoints.

Thanks,
Hailiang

> Thanks,
>
> 		Amit
>
> .
>
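
P.S. To make the downtime reasoning above concrete, here is a rough
back-of-the-envelope sketch. It is not taken from the COLO code: the
function name, its parameters, and the linear cost model (downtime =
time to send the remaining dirty pages + time to flush the RAM buffer)
are all assumptions for illustration. `pre_sent_fraction` stands for the
share of dirty pages already transferred asynchronously while the PVM
and SVM were running.

```python
def estimate_downtime_ms(dirty_pages, page_size,
                         link_bytes_per_s, flush_bytes_per_s,
                         pre_sent_fraction=0.0):
    """Hypothetical model of checkpoint downtime, in milliseconds.

    Assumes downtime is the sum of two linear terms:
      - transfer of the dirty pages not yet sent before the checkpoint
      - flushing all dirty pages from the RAM buffer into guest RAM
    """
    if not 0.0 <= pre_sent_fraction <= 1.0:
        raise ValueError("pre_sent_fraction must be between 0 and 1")
    if link_bytes_per_s <= 0 or flush_bytes_per_s <= 0:
        raise ValueError("throughput values must be positive")

    dirty_bytes = dirty_pages * page_size
    # Only pages not sent in the background add to transfer time.
    transfer_s = dirty_bytes * (1.0 - pre_sent_fraction) / link_bytes_per_s
    # Every buffered dirty page still has to be flushed at checkpoint time.
    flush_s = dirty_bytes / flush_bytes_per_s
    return (transfer_s + flush_s) * 1000.0


if __name__ == "__main__":
    # Made-up example numbers: 1000 dirty 4 KiB pages.
    base = estimate_downtime_ms(1000, 4096, 4096000, 40960000)
    half = estimate_downtime_ms(1000, 4096, 4096000, 40960000,
                                pre_sent_fraction=0.5)
    print(f"no background transfer: {base:.0f} ms")
    print(f"half sent in background: {half:.0f} ms")
```

In this model, sending pages in the background only shrinks the
transfer term, which is why the flush side needs its own optimisation
(userfaultfd or multiple threads).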