From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:48235)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1aaNVG-00013C-T7
	for qemu-devel@nongnu.org; Mon, 29 Feb 2016 08:04:50 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1aaNVD-00077f-Lb
	for qemu-devel@nongnu.org; Mon, 29 Feb 2016 08:04:46 -0500
Received: from mx1.redhat.com ([209.132.183.28]:45032)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1aaNVD-00077a-GE
	for qemu-devel@nongnu.org; Mon, 29 Feb 2016 08:04:43 -0500
Date: Mon, 29 Feb 2016 13:04:36 +0000
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Message-ID: <20160229130436.GA23461@work-vm>
References: <1456108832-24212-1-git-send-email-zhang.zhanghailiang@huawei.com>
	<20160225195232.GB18374@work-vm> <20160226163602.GM2161@work-vm>
	<56D15653.90406@huawei.com> <20160229094715.GA2125@work-vm>
	<56D43693.5050401@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <56D43693.5050401@huawei.com>
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v15 00/38] COarse-grain
 LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
Cc: xiecl.fnst@cn.fujitsu.com, lizhijian@cn.fujitsu.com, quintela@redhat.com, armbru@redhat.com, yunhong.jiang@intel.com, eddie.dong@intel.com, peter.huangpeng@huawei.com, qemu-devel@nongnu.org, arei.gonglei@huawei.com, stefanha@redhat.com, amit.shah@redhat.com, zhangchen.fnst@cn.fujitsu.com, hongyang.yang@easystack.cn

* Hailiang Zhang (zhang.zhanghailiang@huawei.com) wrote:
> On 2016/2/29 17:47, Dr. David Alan Gilbert wrote:
> >* Hailiang Zhang (zhang.zhanghailiang@huawei.com) wrote:
> >>On 2016/2/27 0:36, Dr. David Alan Gilbert wrote:
> >>>* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:

> >I've got a patch where I've tried to multithread the flush - it's made it a little
> >faster, but not as much as I hoped (~20ms down to ~16ms using 4 cores)
> >
> 
> Hmm, that seems to be a good idea. After switching to COLO (hybrid) mode, in most cases
> we will get many more dirtied pages than in periodic mode, because the delay
> between two checkpoints is usually longer.
> Multi-threaded flushing may gain much more in that case, but I suspect that, in some
> bad cases, users still won't be able to bear the pause time.
> 
> Actually, we have thought about this problem for a long time.
> In our early tests based on the kernel COLO-proxy, we could easily get more than
> one second of flushing time. IMHO, users can't bear such a long VM pause if they
> choose to use COLO.

Yes, that's just too long; although solving only the 'flushing' time isn't enough in those
cases, because they will probably need to transfer lots of RAM over the wire as well.

> We have designed another scheme, based on userfault's page-miss capability.
> The basic idea is to convert the flushing action into a marking action; the flush
> will then be processed while the SVM is running. For now it is only an idea;
> we'd like to verify it first. (I'm not quite sure whether userfault's page-miss
> feature is designed for good performance when we use it to mark one page as MISS at a time.)

Yes, it's a different trade-off: slower execution, but no flush time.
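
As a rough illustration of that trade-off, here is a toy model only. It is not
QEMU code and makes no userfaultfd calls; plain Python dicts stand in for SVM
RAM and the checkpoint cache, and the class and method names (EagerFlush,
LazyFlush, checkpoint, read) are hypothetical. It only shows where the copying
cost lands in each approach: during the pause, or on first access afterwards.

```python
# Toy model (assumption): dicts map page number -> page contents.

class EagerFlush:
    """Copy every cached page into RAM while the VM is paused."""

    def __init__(self, ram):
        self.ram = ram

    def checkpoint(self, cache):
        # The pause lasts for the whole loop, so its cost grows with
        # the number of dirtied pages.
        copied = 0
        for page, data in cache.items():
            self.ram[page] = data
            copied += 1
        return copied  # pages copied during the pause

    def read(self, page):
        return self.ram[page]


class LazyFlush:
    """Only mark pages at checkpoint; copy each one on first access."""

    def __init__(self, ram):
        self.ram = ram
        self.missing = {}  # pages marked as not yet flushed

    def checkpoint(self, cache):
        # Marking only: no page data is copied into RAM during the pause.
        self.missing.update(cache)
        return 0  # pages copied during the pause

    def read(self, page):
        # The first touch of a marked page plays the role of the fault:
        # the page is pulled in now, which slows this access down.
        if page in self.missing:
            self.ram[page] = self.missing.pop(page)
        return self.ram[page]
```

In this model the eager variant pays for every dirtied page inside the pause,
while the lazy variant pays per page at run time and only for pages that are
actually touched. Guest writes and background flushing are left out.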

Dave

> 
> 
> Thanks,
> Hailiang
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK