From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:48235)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1aaNVG-00013C-T7 for qemu-devel@nongnu.org;
	Mon, 29 Feb 2016 08:04:50 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1aaNVD-00077f-Lb for qemu-devel@nongnu.org;
	Mon, 29 Feb 2016 08:04:46 -0500
Received: from mx1.redhat.com ([209.132.183.28]:45032)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1aaNVD-00077a-GE for qemu-devel@nongnu.org;
	Mon, 29 Feb 2016 08:04:43 -0500
Date: Mon, 29 Feb 2016 13:04:36 +0000
From: "Dr. David Alan Gilbert"
Message-ID: <20160229130436.GA23461@work-vm>
References: <1456108832-24212-1-git-send-email-zhang.zhanghailiang@huawei.com>
	<20160225195232.GB18374@work-vm> <20160226163602.GM2161@work-vm>
	<56D15653.90406@huawei.com> <20160229094715.GA2125@work-vm>
	<56D43693.5050401@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <56D43693.5050401@huawei.com>
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v15 00/38] COarse-grain
	LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
To: Hailiang Zhang
Cc: xiecl.fnst@cn.fujitsu.com, lizhijian@cn.fujitsu.com,
	quintela@redhat.com, armbru@redhat.com, yunhong.jiang@intel.com,
	eddie.dong@intel.com, peter.huangpeng@huawei.com,
	qemu-devel@nongnu.org, arei.gonglei@huawei.com, stefanha@redhat.com,
	amit.shah@redhat.com, zhangchen.fnst@cn.fujitsu.com,
	hongyang.yang@easystack.cn

* Hailiang Zhang (zhang.zhanghailiang@huawei.com) wrote:
> On 2016/2/29 17:47, Dr. David Alan Gilbert wrote:
> >* Hailiang Zhang (zhang.zhanghailiang@huawei.com) wrote:
> >>On 2016/2/27 0:36, Dr. David Alan Gilbert wrote:
> >>>* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> >I've got a patch where I've tried to multithread the flush - it's made it a little
> >faster, but not as much as I hoped (~20ms down to ~16ms using 4 cores)
> >
>
> Hmm, that seems to be a good idea. After switching to COLO (hybrid) mode, in most
> cases we will get many more dirtied pages than in the periodic mode, because the
> delay between two checkpoints is usually longer.
> Multi-threaded flushing may gain much more in that case, but I suspect that in some
> bad cases users still won't be able to bear the pause time.
>
> Actually, we have thought about this problem for a long time.
> In our early tests based on the kernel COLO-proxy, we could easily get more than
> one second of flushing time. IMHO, users can't bear such a long VM pause if they
> choose to use COLO.

Yes, that's just too long; although only solving the 'flushing' time isn't enough
in those cases, because the same cases will probably need to transfer lots of RAM
over the wire as well.

> We have designed another scenario which is based on userfault's page-miss capability.
> The basic idea is to convert the flushing action into a marking action; the flush
> will then be processed during the SVM's running time. For now it is only an idea,
> and we'd like to verify it first. (I'm not quite sure whether userfault's page-miss
> feature is designed for good performance when we use it to mark one page as MISS
> at a time.)

Yes, it's a different trade-off: slower execution, but no flush time.

Dave

>
>
> Thanks,
> Hailiang
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
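
For illustration only, the multithreaded flush discussed in this message could be sketched roughly as below. This is not the patch referred to above and not QEMU's COLO code: the names flush_job, flush_worker, flush_cache_parallel, FLUSH_PAGE_SIZE and MAX_FLUSH_THREADS are all hypothetical, and the dirty log is assumed to be one byte per page rather than a real bitmap. Each worker copies the dirty pages of one contiguous slice from the staged RAM cache into the secondary VM's RAM.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for the sketch; a real implementation would query these. */
#define FLUSH_PAGE_SIZE   4096u
#define MAX_FLUSH_THREADS 16u

/* Hypothetical per-worker job: one contiguous slice of guest pages. */
struct flush_job {
    const uint8_t *cache;   /* staged RAM cache (source) */
    uint8_t *ram;           /* secondary VM RAM (destination) */
    const uint8_t *dirty;   /* one byte per page, non-zero means dirty */
    size_t first, last;     /* page range [first, last) */
    size_t copied;          /* pages copied by this worker */
};

/* Copy every dirty page in the job's slice from the cache into RAM. */
static void *flush_worker(void *opaque)
{
    struct flush_job *job = opaque;

    for (size_t p = job->first; p < job->last; p++) {
        if (job->dirty[p]) {
            memcpy(job->ram + p * FLUSH_PAGE_SIZE,
                   job->cache + p * FLUSH_PAGE_SIZE, FLUSH_PAGE_SIZE);
            job->copied++;
        }
    }
    return NULL;
}

/*
 * Split npages into nthreads contiguous slices and flush them in parallel.
 * Slices never overlap, so the workers need no locking.
 * Returns the total number of pages copied.
 */
static size_t flush_cache_parallel(const uint8_t *cache, uint8_t *ram,
                                   const uint8_t *dirty, size_t npages,
                                   unsigned nthreads)
{
    pthread_t tids[MAX_FLUSH_THREADS];
    bool joinable[MAX_FLUSH_THREADS];
    struct flush_job jobs[MAX_FLUSH_THREADS];
    size_t chunk, total = 0;

    assert(nthreads >= 1 && nthreads <= MAX_FLUSH_THREADS);
    chunk = (npages + nthreads - 1) / nthreads;   /* ceiling division */

    for (unsigned i = 0; i < nthreads; i++) {
        size_t first = (size_t)i * chunk;
        size_t last;

        if (first > npages) {
            first = npages;
        }
        last = first + chunk;
        if (last > npages) {
            last = npages;
        }
        jobs[i] = (struct flush_job){ cache, ram, dirty, first, last, 0 };

        joinable[i] = pthread_create(&tids[i], NULL, flush_worker,
                                     &jobs[i]) == 0;
        if (!joinable[i]) {
            /* Could not start a thread: flush this slice inline instead. */
            flush_worker(&jobs[i]);
        }
    }

    for (unsigned i = 0; i < nthreads; i++) {
        if (joinable[i]) {
            pthread_join(tids[i], NULL);
        }
        total += jobs[i].copied;
    }
    return total;
}
```

A caller would invoke something like flush_cache_parallel(cache, ram, dirty, npages, 4) while the SVM is paused at a checkpoint. Since the work is essentially memcpy, memory bandwidth is likely to limit the gain from adding threads, which would be consistent with the modest improvement reported above; that explanation is an assumption, not something measured here.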