From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56175)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <wency@cn.fujitsu.com>) id 1YUnWx-0004t2-Ma
	for qemu-devel@nongnu.org; Sun, 08 Mar 2015 22:34:57 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <wency@cn.fujitsu.com>) id 1YUnWu-0003f4-Fi
	for qemu-devel@nongnu.org; Sun, 08 Mar 2015 22:34:55 -0400
Received: from [59.151.112.132] (port=12901 helo=heian.cn.fujitsu.com)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <wency@cn.fujitsu.com>) id 1YUnWc-0003bw-V6
	for qemu-devel@nongnu.org; Sun, 08 Mar 2015 22:34:52 -0400
Message-ID: <54FD074C.5060009@cn.fujitsu.com>
Date: Mon, 9 Mar 2015 10:37:00 +0800
From: Wen Congyang <wency@cn.fujitsu.com>
MIME-Version: 1.0
References: <1425562294-1616-1-git-send-email-dgilbert@redhat.com>
	<54F9077B.3020803@huawei.com> <20150306183021.GE2507@work-vm>
In-Reply-To: <20150306183021.GE2507@work-vm>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC 0/1] Rolling stats on colo
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: hangaohuai@huawei.com, yunhong.jiang@intel.com, eddie.dong@intel.com, qemu-devel@nongnu.org, peter.huangpeng@huawei.com, luis@cs.umu.se

On 03/07/2015 02:30 AM, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2015/3/5 21:31, Dr. David Alan Gilbert (git) wrote:
>>> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>>
>> Hi Dave,
>>
>>>
>>> Hi,
>>>   I'm getting COLO running on a couple of our machines here
>>> and wanted to see what was actually going on, so I merged
>>> in my recent rolling-stats code:
>>>
>>> http://lists.gnu.org/archive/html/qemu-devel/2015-03/msg00648.html
>>>
>>> with the following patch, and now I get on the primary side,
>>> info migrate shows me:
>>>
>>> capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off colo: on
>>> Migration status: colo
>>> total time: 0 milliseconds
>>> colo checkpoint (ms): Min/Max: 0, 10000 Mean: -1.1415868e-13 (Weighted: 4.3136025e-158) Count: 4020 Values: 0@1425561742237, 0@1425561742300, 0@1425561742363, 0@1425561742426, 0@1425561742489, 0@1425561742555, 0@1425561742618, 0@1425561742681, 0@1425561742743, 0@1425561742824
>>> colo paused time (ms): Min/Max: 55, 2789 Mean: 63.9 (Weighted: 76.243584) Count: 4019 Values: 62@1425561742237, 62@1425561742300, 62@1425561742363, 62@1425561742426, 61@1425561742489, 65@1425561742555, 62@1425561742618, 62@1425561742681, 61@1425561742743, 80@1425561742824
>>> colo checkpoint size: Min/Max: 18351, 2.1731606e+08 Mean: 150096.4 (Weighted: 127195.56) Count: 4020 Values: 211246@1425561742238, 186622@1425561742301, 227662@1425561742364, 219454@1425561742428, 268702@1425561742490, 96334@1425561742556, 47086@1425561742619, 42982@1425561742682, 55294@1425561742744, 145582@1425561742825
>>>
>>> which suggests I've got a problem with the packet comparison; but that's
>>> a separate issue I'll look at.
>>>
>>
>> There is an obvious mistake we made in the proxy: the macro 'IPS_UNTRACKED_BIT' in colo-patch-for-kernel.patch should be 14,
>> so please fix it before doing the following test. Sorry for this careless mistake; we should have done a full test before releasing it. ;)
> 
> No, that's OK; we all make them.
> 
> However, that didn't cure my problem; but after a bit of experimentation I now have
> COLO working pretty well; thanks for the help!
> 
>    1) I had to disable IPv6 in the guest; it doesn't look like the
>    conntrack is coping with IPv6 ICMPv6, and on our test network
>    we're getting a few 10s of those each second, so it's constant
>    miscompares (they seem to be neighbour broadcasts and multicast
>    stuff).
> 
>    2) It looks like virtio-net is sending ARPs - possibly every time
>    that a snapshot is loaded;  it's not the 'qemu' announce-self code,
>    (I added some debug there and it's not being called); and ARPs
>    cause a miscompare - so you get a continuous stream of miscompares
>    because a miscompare triggers a new snapshot, that sends more ARPs.
>    I solved this by switching to e1000.
> 
>    3) The other problem with virtio is it's occasionally triggering a
>    'virtio: error trying to map MMIO memory' from qemu;  I'm not sure
>    why, the state COLO sends over should always be consistent.

I haven't hit this problem. Can you provide your command line?
Is it the primary or the secondary qemu that reports this error message?

> 
>    4) With the e1000 setup; connections are generally fairly responsive,
>    but sshing into the guest takes *ages* (10s of seconds).  I'm not sure
>    why, because a curl to a web server seems OK (less than a second)
>    and once the ssh is open it's pretty responsive.
> 
>    5) I've seen one instance of; 
>       'qemu-system-x86_64: block/raw-posix.c:836: handle_aiocb_rw: Assertion `p - buf == aiocb->aio_nbytes' failed.'
>       on the primary side.

It is a known bug in quorum. You can try this patch:
http://lists.nongnu.org/archive/html/qemu-devel/2015-01/msg04507.html

Thanks
Wen Congyang

> 
> Stats for a mostly idle guest are now showing:
> 
> colo checkpoint (ms): Min/Max: 0, 10004 Mean: 1592.1 (Weighted: 1806.214) Count: 227 Values: 1650@1425666160229, 1661@1425666161998, 1662@1425666163736, 1687@1425666165524, 811@1425666166438, 788@1425666167298, 1619@1425666168992, 1699@1425666170793, 2711@1425666173602, 1633@1425666175315
> colo paused time (ms): Min/Max: 58, 2975 Mean: 90.3 (Weighted: 94.109752) Count: 227 Values: 107@1425666160337, 75@1425666162074, 100@1425666163837, 102@1425666165627, 71@1425666166510, 74@1425666167373, 101@1425666169094, 97@1425666170891, 79@1425666173682, 97@1425666175413
> colo checkpoint size: Min/Max: 212252, 1.9241972e+08 Mean: 5569622.6 (Weighted: 4826386.5) Count: 227 Values: 5998892@1425666160230, 4660988@1425666161999, 6002996@1425666163737, 5945540@1425666165525, 4833356@1425666166439, 5510606@1425666167299, 5793692@1425666168993, 5584388@1425666170794, 7016684@1425666173603, 4349084@1425666175316
> 
> So, one checkpoint every ~1.5 seconds; that's just with an
> ssh connected and a script doing a 'curl' to its http
> repeatedly.   Running 'top' on the ssh with a fast refresh
> brings the checkpoints much faster; I guess that's because
> the output of top is quite random.
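
The "Mean" printed above appears to be taken over the ten listed values
only, not over all 227 samples. A minimal Python sketch to check that, and
to get the checkpoint spacing from the timestamps, is below. It assumes each
entry is 'value@timestamp' with the timestamp in milliseconds since the
epoch; parse_values() and mean_gap_ms() are hypothetical helper names used
for illustration and are not part of the rolling-stats patch.

```python
# Sketch only: assumes entries look like "value@timestamp_ms", as printed
# by 'info migrate' in the quoted "colo checkpoint (ms)" line.
from statistics import mean

LINE = ("1650@1425666160229, 1661@1425666161998, 1662@1425666163736, "
        "1687@1425666165524, 811@1425666166438, 788@1425666167298, "
        "1619@1425666168992, 1699@1425666170793, 2711@1425666173602, "
        "1633@1425666175315")


def parse_values(text):
    """Return a list of (value, timestamp_ms) pairs (hypothetical helper)."""
    pairs = []
    for item in text.split(","):
        value, stamp = item.strip().split("@")
        pairs.append((float(value), int(stamp)))
    return pairs


def mean_gap_ms(pairs):
    """Mean distance between consecutive timestamps, in milliseconds."""
    stamps = [t for _, t in pairs]
    return (stamps[-1] - stamps[0]) / (len(stamps) - 1)


pairs = parse_values(LINE)
print(round(mean(v for v, _ in pairs), 1))  # 1592.1, same as the reported Mean
print(round(mean_gap_ms(pairs), 1))         # 1676.2
```

Both figures are roughly in line with "one checkpoint every ~1.5 seconds";
the timestamp gaps come out slightly larger than the recorded values.
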
> 
>> To be honest, the proxy part on github is not integrated; we cut it down just to make it easier to review and understand, so there may be some mistakes.
> 
> Yes, that's OK; and I've had a few kernel crashes; normally 
> when the qemu crashes, the kernel doesn't really like it;
> but that's OK, I'm sure it will get better.
> 
> I added the following to make my debug easier; which is how
> I found the IPv6 problem.
> 
> diff --git a/xt_PMYCOLO.c b/xt_PMYCOLO.c
> index 9e50b62..13c0b48 100644
> --- a/xt_PMYCOLO.c
> +++ b/xt_PMYCOLO.c
> @@ -1072,7 +1072,7 @@ resolve_master_ct(struct sk_buff *skb, unsigned int dataoff,
>         h = nf_conntrack_find_get(&init_net, NF_CT_DEFAULT_ZONE, &tuple);
>  
>         if (h == NULL) {
> -               pr_dbg("can't find master's ct for slaver packet\n");
> +               pr_dbg("can't find master's ct for slaver packet (pf/l3num=%d protonum=%d)\n", l3num, protonum);
>                 return NULL;
>         }
>  
> @@ -1092,7 +1092,7 @@ nf_conntrack_slaver_in(u_int8_t pf, unsigned int hooknum,
>         /* rcu_read_lock()ed by nf_hook_slow */
>         l3proto = __nf_ct_l3proto_find(pf);
>         if (l3proto->get_l4proto(skb, skb_network_offset(skb), &dataoff, &protonum) <= 0) {
> -               pr_dbg("slaver: l3proto not prepared to track yet or error occurred\n");
> +               pr_dbg("slaver: l3proto not prepared to track yet or error occurred (pf=%d)\n", pf);
>                 NF_CT_STAT_INC_ATOMIC(&init_net, error);
>                 NF_CT_STAT_INC_ATOMIC(&init_net, invalid);
>                 goto out;
> 
>>
>> Thanks,
>> zhanghailiang
> 
> Thanks,
> 
> Dave
>>
>>
>>> Dave
>>>
>>> Dr. David Alan Gilbert (1):
>>>   COLO: Add primary side rolling statistics
>>>
>>>  hmp.c                         | 12 ++++++++++++
>>>  include/migration/migration.h |  3 +++
>>>  migration/colo.c              | 15 +++++++++++++++
>>>  migration/migration.c         | 30 ++++++++++++++++++++++++++++++
>>>  qapi-schema.json              | 11 ++++++++++-
>>>  5 files changed, 70 insertions(+), 1 deletion(-)
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>