Re: [Qemu-devel] [PATCH COLO-Frame (Base) v21 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)

From: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
To: Amit Shah <amit.shah@redhat.com>
Cc: quintela@redhat.com, qemu-devel@nongnu.org, dgilbert@redhat.com,
	wency@cn.fujitsu.com, lizhijian@cn.fujitsu.com,
	xiecl.fnst@cn.fujitsu.com, Hai Huang <hhuang@redhat.com>,
	Weidong Han <hanweidong@huawei.com>,
	Dong eddie <eddie.dong@intel.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Jason Wang <jasowang@redhat.com>
Subject: Re: [Qemu-devel] [PATCH COLO-Frame (Base) v21 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
Date: Wed, 26 Oct 2016 23:52:48 +0800
Message-ID: <5810D150.5070709@huawei.com>
In-Reply-To: <20161026082609.GT1679@amit-lp.rh>

Hi Amit,

On 2016/10/26 16:26, Amit Shah wrote:
> On (Wed) 26 Oct 2016 [14:43:30], Hailiang Zhang wrote:
>> Hi Amit,
>>
>> On 2016/10/26 14:09, Amit Shah wrote:
>>> Hello,
>>>
>>> On (Tue) 18 Oct 2016 [20:09:56], zhanghailiang wrote:
>>>> This is the 21st version of the COLO frame series.
>>>>
>>>> Rebase to the latest master.
>>>
>>> I've reviewed the patchset, have some minor comments, but overall it
>>> looks good.  The changes are contained, and common code / existing
>>> code paths are not affected much.  We can still target to merge this
>>> for 2.8.
>>>
>>
>> I really appreciate your help ;), I will fix all the issues later
>> and send v22. Hope we can still catch the deadline of V2.8.
>>
>>> Do you have any tests on how much the VM slows down / downtime
>>> incurred during checkpoints?
>>>
>>
>> Yes, we tested that a long time ago; it all depends on the workload.
>> The downtime is determined by the time spent transferring the dirty pages
>> and the time spent flushing RAM from the RAM buffer.
>> But we do have methods to reduce the downtime.
>>
>> One method is to reduce the amount of data (mainly dirty pages) sent during a checkpoint
>> by transferring dirty pages asynchronously while the PVM and SVM are running (not
>> during the checkpoint itself). Besides, we can reuse migration capabilities, such
>> as compression.
>> Another method is to reduce the time of flushing RAM by using the userfaultfd API
>> to convert copying RAM into marking a bitmap. We can also flush the RAM buffer
>> with multiple threads, as advised by Dave ...
>
> Yes, I understand that as with any migration numbers, this too depends
> on what the guest is doing.  However, can you just pick some standard
> workload - kernel compile or something like that - and post a few
> observations?
>

Li Zhijian has sent some test results, which are based on the kernel COLO proxy.
After switching to the userspace COLO proxy, there may be some degradation.
But in the old scenario, some optimizations were not implemented.
For the new userspace COLO proxy scenario, we have not tested it as a whole,
because it is still WIP; we will start that work after this frame is merged.
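
As a rough illustration of the downtime breakdown quoted above (transfer of
dirty pages plus flushing the RAM buffer), here is a minimal sketch of a
back-of-the-envelope model. It is not taken from the COLO patches: the
function name, the parameters, the 4 KiB page size and the idea of a
"presend_fraction" for pages sent asynchronously before the checkpoint are
all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumed guest page size, not taken from the thread


def checkpoint_downtime(dirty_pages, link_bytes_per_s, flush_bytes_per_s,
                        presend_fraction=0.0):
    """Estimate the guest pause (in seconds) for one checkpoint.

    Hypothetical model: pause = time to transfer the dirty pages that are
    still unsent at checkpoint time + time to flush all buffered pages
    into the secondary's RAM.
    """
    if not 0.0 <= presend_fraction <= 1.0:
        raise ValueError("presend_fraction must be between 0 and 1")
    total_bytes = dirty_pages * PAGE_SIZE
    # Pages already sent asynchronously do not add to the pause.
    unsent_bytes = total_bytes * (1.0 - presend_fraction)
    transfer_s = unsent_bytes / link_bytes_per_s
    # Every buffered page still has to be flushed on the secondary side.
    flush_s = total_bytes / flush_bytes_per_s
    return transfer_s + flush_s
```

In this model, sending dirty pages while both VMs are running shrinks only
the transfer term, while userfaultfd or multi-threaded flushing would show up
as a higher effective flush rate. Real numbers would have to come from
measurements.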

>>> Also, can you tell how did you arrive at the default checkpoint
>>> interval?
>>>
>>
>> Er, for this value, we referred to Remus on the Xen platform. ;)
>> But after we implement COLO with the colo proxy, this interval will be changed
>> to a bigger one (10s), and we will make it configurable too. Besides, we will
>> add another configurable value to control the minimum checkpoint interval.
>
> OK - any typical value that is a good mix between COLO keeping the
> network too busy / guest paused vs guest making progress?  Again this
> is something that's workload-dependent, but I guess you have typical
> numbers from a network-bound workload?
>

Yes, you can refer to Zhijian's email for details.
I think it is necessary to add some test/performance results to COLO's wiki.
We will do that later.
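
To make the interplay of the two interval settings discussed above easier to
picture, here is a minimal sketch of one possible decision rule. It assumes
that, with the colo proxy, a checkpoint can be requested early when the
outputs of the PVM and SVM diverge, that the periodic interval (10s in the
quoted text) acts as an upper bound, and that the minimum interval acts as a
lower bound. The function and parameter names are hypothetical and are not
QEMU code.

```python
def should_checkpoint(elapsed_ms, periodic_ms, min_ms, outputs_diverged):
    """Decide whether to start a checkpoint now.

    elapsed_ms:       time since the previous checkpoint finished
    periodic_ms:      maximum time between checkpoints (e.g. 10000)
    min_ms:           minimum time between checkpoints (hypothetical knob)
    outputs_diverged: True if the proxy saw PVM and SVM outputs differ
    """
    # Never checkpoint more often than the minimum interval allows,
    # so the guest always gets some time to make progress.
    if elapsed_ms < min_ms:
        return False
    # A divergence forces an early checkpoint.
    if outputs_diverged:
        return True
    # Otherwise fall back to the periodic checkpoint.
    return elapsed_ms >= periodic_ms
```

For example, with periodic_ms=10000 and min_ms=100, a divergence seen 50 ms
after the last checkpoint is deferred, one seen at 500 ms triggers a
checkpoint immediately, and a quiet guest is checkpointed once 10 s have
passed.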

Thanks,
hailiang

> Thanks,
>
> 		Amit
>
> .
>
