Re: [Qemu-devel] [RFC 0/1] Rolling stats on colo

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Li Zhijian <lizhijian@cn.fujitsu.com>,
	yunhong.jiang@intel.com, eddie.dong@intel.com,
	peter.huangpeng@huawei.com, qemu-devel@nongnu.org,
	"Gonglei (Arei)" <arei.gonglei@huawei.com>,
	luis@cs.umu.se
Subject: Re: [Qemu-devel] [RFC 0/1] Rolling stats on colo
Date: Wed, 11 Mar 2015 09:06:42 +0000
Message-ID: <20150311090642.GB2334@work-vm>
In-Reply-To: <54FFB25B.9010603@huawei.com>

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> Hi Dave,
> 
> Sorry for the late reply :)

No problem.

> On 2015/3/7 2:30, Dr. David Alan Gilbert wrote:
> >* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >>On 2015/3/5 21:31, Dr. David Alan Gilbert (git) wrote:
> >>>From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >>
> >>Hi Dave,
> >>
> >>>
> >>>Hi,
> >>>   I'm getting COLO running on a couple of our machines here
> >>>and wanted to see what was actually going on, so I merged
> >>>in my recent rolling-stats code:
> >>>
> >>>http://lists.gnu.org/archive/html/qemu-devel/2015-03/msg00648.html
> >>>
> >>>with the following patch, and now I get on the primary side,
> >>>info migrate shows me:
> >>>
> >>>capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off colo: on
> >>>Migration status: colo
> >>>total time: 0 milliseconds
> >>>colo checkpoint (ms): Min/Max: 0, 10000 Mean: -1.1415868e-13 (Weighted: 4.3136025e-158) Count: 4020 Values: 0@1425561742237, 0@1425561742300, 0@1425561742363, 0@1425561742426, 0@1425561742489, 0@1425561742555, 0@1425561742618, 0@1425561742681, 0@1425561742743, 0@1425561742824
> >>>colo paused time (ms): Min/Max: 55, 2789 Mean: 63.9 (Weighted: 76.243584) Count: 4019 Values: 62@1425561742237, 62@1425561742300, 62@1425561742363, 62@1425561742426, 61@1425561742489, 65@1425561742555, 62@1425561742618, 62@1425561742681, 61@1425561742743, 80@1425561742824
> >>>colo checkpoint size: Min/Max: 18351, 2.1731606e+08 Mean: 150096.4 (Weighted: 127195.56) Count: 4020 Values: 211246@1425561742238, 186622@1425561742301, 227662@1425561742364, 219454@1425561742428, 268702@1425561742490, 96334@1425561742556, 47086@1425561742619, 42982@1425561742682, 55294@1425561742744, 145582@1425561742825
> >>>
> >>>which suggests I've got a problem with the packet comparison; but that's
> >>>a separate issue I'll look at.
> >>>
> >>
> >>There is an obvious mistake we have made in the proxy: the macro 'IPS_UNTRACKED_BIT' in colo-patch-for-kernel.patch should be 14,
> >>so please fix it before doing the following test. Sorry for this low-grade mistake; we should have done a full test before issuing it. ;)
> >
> >No, that's OK; we all make them.
> >
> >However, that didn't cure my problem; but after a bit of experimentation I now have
> >COLO working pretty well; thanks for the help!
> >
> >    1) I had to disable IPv6 in the guest; it doesn't look like the
> >    conntrack is coping with IPv6 ICMPV6, and on our test network
> >    we're getting a few 10s of those each second, so it's constant
> >    miscompares (they seem to be neighbour broadcasts and multicast
> >    stuff).
> >
> 
> Hmm, yes, the proxy code on github does not support comparing ICMPv6 packets.
> We will add this in the future.
> 
> >    2) It looks like virtio-net is sending ARPs - possibly every time
> >    that a snapshot is loaded;  it's not the 'qemu' announce-self code,
> >    (I added some debug there and it's not being called); and ARPs
> >    cause a miscompare - so you get a continuous stream of miscompares
> >    because a miscompare triggers a new snapshot, which sends more ARPs.
> >    I solved this by switching to e1000.
> >
> 
> I didn't hit this problem; I used tcpdump to capture the network packets and
> did not find any ARPs after the VM load on the slave.

Interesting.

> Maybe I missed something. Are there any network-related servers/commands running in the VM?

I don't think so, and even if there were, I don't think they would go away
by switching to an e1000; I see there is a 'VIRTIO_NET_S_ANNOUNCE' feature
in virtio-net, and I suspect it's that which is doing it, but maybe it
depends on the guest/host kernels having it enabled?

> And what's your tcpdump command line?

just tcpdump -i em4 -n -w outputfile

> >    3) The other problem with virtio is it's occasionally triggering a
> >    'virtio: error trying to map MMIO memory' from qemu;  I'm not sure
> >    why, the state COLO sends over should always be consistent.
> >
> >    4) With the e1000 setup; connections are generally fairly responsive,
> >    but sshing into the guest takes *ages* (10s of seconds).  I'm not sure
> >    why, because a curl to a web server seems OK (less than a second)
> >    and once the ssh is open it's pretty responsive.
> >
> 
> Er, have you tried to ssh into the guest when not in COLO mode? Does it also take a long time?

Not yet; I'm going to add some logging to find out why.

> I have encountered a similar situation where the slave VM appears dead: 'info status' reports 'running',
> but the VM does not respond to keyboard input from VNC. Maybe there is something wrong with the device status; I
> will look into it.
> 
> >    5) I've seen one instance of:
> >       'qemu-system-x86_64: block/raw-posix.c:836: handle_aiocb_rw: Assertion `p - buf == aiocb->aio_nbytes' failed.'
> >       on the primary side.
> >
> >Stats for a mostly idle guest are now showing:
> >
> >colo checkpoint (ms): Min/Max: 0, 10004 Mean: 1592.1 (Weighted: 1806.214) Count: 227 Values: 1650@1425666160229, 1661@1425666161998, 1662@1425666163736, 1687@1425666165524, 811@1425666166438, 788@1425666167298, 1619@1425666168992, 1699@1425666170793, 2711@1425666173602, 1633@1425666175315
> >colo paused time (ms): Min/Max: 58, 2975 Mean: 90.3 (Weighted: 94.109752) Count: 227 Values: 107@1425666160337, 75@1425666162074, 100@1425666163837, 102@1425666165627, 71@1425666166510, 74@1425666167373, 101@1425666169094, 97@1425666170891, 79@1425666173682, 97@1425666175413
> >colo checkpoint size: Min/Max: 212252, 1.9241972e+08 Mean: 5569622.6 (Weighted: 4826386.5) Count: 227 Values: 5998892@1425666160230, 4660988@1425666161999, 6002996@1425666163737, 5945540@1425666165525, 4833356@1425666166439, 5510606@1425666167299, 5793692@1425666168993, 5584388@1425666170794, 7016684@1425666173603, 4349084@1425666175316
> >
> >So, one checkpoint every ~1.5 seconds; that's just with an
> >ssh connected and a script doing a 'curl' to its http
> >repeatedly.   Running 'top' on the ssh with a fast refresh
> >brings the checkpoints much faster; I guess that's because
> >the output of top is quite random.
> >
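
The "~1.5 seconds" figure can be checked directly from the numbers quoted above: averaging the ten listed 'colo checkpoint (ms)' samples gives 1592.1, and averaging the ten 'colo paused time (ms)' samples gives 90.3, which are exactly the reported Mean values. The sketch below redoes that arithmetic and also derives a paused fraction. The helper names are hypothetical, and treating mean paused time divided by mean checkpoint interval as "fraction of time the guest is paused" is an assumption about what the two statistics measure, not something the patch states.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples; n must be non-zero. */
static double mean_ms(const double *v, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        sum += v[i];
    }
    return sum / (double)n;
}

/* The ten samples listed in the 'info migrate' output above (ms). */
static const double checkpoint_ms[] = {
    1650, 1661, 1662, 1687, 811, 788, 1619, 1699, 2711, 1633
};
static const double paused_ms[] = {
    107, 75, 100, 102, 71, 74, 101, 97, 79, 97
};

/*
 * Hypothetical helper: mean paused time over mean checkpoint interval.
 * Assumes 'colo checkpoint (ms)' is the interval between checkpoints.
 */
static double paused_fraction(void)
{
    return mean_ms(paused_ms, 10) / mean_ms(checkpoint_ms, 10);
}
```

Under that assumption, the mostly idle guest is paused for a little under 6% of wall-clock time.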
> 
> Yes, it is a known problem; actually, not only 'top' but any command with
> random output may result in continuous miscompares.
> Besides, data transferred through SSH is encrypted, which makes things worse.
> 
> One way to solve this problem may be:
> if we detect a continuous stream of miscompares, we fall back to Microcheckpointing mode (periodic checkpoints).

Yes, I was going to try and implement that fallback - I've got some ideas
to try for it.
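
As a rough illustration of the fallback idea discussed above, here is a minimal sketch of one possible policy: count miscompare-triggered checkpoints that arrive too soon after the previous one, and switch to periodic checkpointing for a hold period once the streak reaches a limit. Every name and threshold in it (ckpt_policy, a streak limit of 3, a 200 ms minimum interval, a 5000 ms hold) is an assumption made for the example; none of it is taken from the COLO patches or from any implementation in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and thresholds, for illustration only. */
enum ckpt_mode { MODE_COMPARE, MODE_PERIODIC };

struct ckpt_policy {
    enum ckpt_mode mode;
    unsigned streak;            /* consecutive "too soon" miscompares */
    unsigned streak_limit;      /* streak that triggers the fallback */
    int64_t min_interval_ms;    /* intervals shorter than this count */
    int64_t periodic_hold_ms;   /* how long to stay in periodic mode */
    int64_t periodic_until_ms;  /* end of the current hold period */
    int64_t last_ckpt_ms;       /* time of the previous checkpoint */
};

static void policy_init(struct ckpt_policy *p, int64_t start_ms)
{
    p->mode = MODE_COMPARE;
    p->streak = 0;
    p->streak_limit = 3;
    p->min_interval_ms = 200;
    p->periodic_hold_ms = 5000;
    p->periodic_until_ms = 0;
    p->last_ckpt_ms = start_ms;
}

/* Called for each miscompare at now_ms; returns the mode to use next. */
static enum ckpt_mode policy_on_miscompare(struct ckpt_policy *p,
                                           int64_t now_ms)
{
    if (p->mode == MODE_PERIODIC) {
        if (now_ms < p->periodic_until_ms) {
            return MODE_PERIODIC;   /* miscompares ignored during hold */
        }
        p->mode = MODE_COMPARE;     /* hold expired, try comparing again */
        p->streak = 0;
    }

    if (now_ms - p->last_ckpt_ms < p->min_interval_ms) {
        p->streak++;
    } else {
        p->streak = 0;
    }
    p->last_ckpt_ms = now_ms;

    if (p->streak >= p->streak_limit) {
        p->mode = MODE_PERIODIC;
        p->periodic_until_ms = now_ms + p->periodic_hold_ms;
    }
    return p->mode;
}

/* Feed a list of miscompare timestamps and report the final mode. */
static enum ckpt_mode simulate(const int64_t *t, size_t n)
{
    struct ckpt_policy p;
    enum ckpt_mode m = MODE_COMPARE;

    policy_init(&p, 0);
    for (size_t i = 0; i < n; i++) {
        m = policy_on_miscompare(&p, t[i]);
    }
    return m;
}
```

With these example thresholds, miscompares spaced about 63 ms apart (as in the first set of stats) would trip the fallback after three in a row, while miscompares spaced about 1.6 s apart would leave compare mode active.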

> >>To be honest, the proxy part on github is not integrated; we have cut it down just for ease of review and understanding, so there may be some mistakes.
> >
> >Yes, that's OK; and I've had a few kernel crashes; normally
> >when the qemu crashes, the kernel doesn't really like it;
> >but that's OK, I'm sure it will get better.
> >
> 
> Hmm, thanks very much for your feedback; we are working to improve it... ;)

Thanks,

Dave

> 
> >I added the following to make my debug easier; which is how
> >I found the IPv6 problem.
> >
> >diff --git a/xt_PMYCOLO.c b/xt_PMYCOLO.c
> >index 9e50b62..13c0b48 100644
> >--- a/xt_PMYCOLO.c
> >+++ b/xt_PMYCOLO.c
> >@@ -1072,7 +1072,7 @@ resolve_master_ct(struct sk_buff *skb, unsigned int dataoff,
> >         h = nf_conntrack_find_get(&init_net, NF_CT_DEFAULT_ZONE, &tuple);
> >
> >         if (h == NULL) {
> >-               pr_dbg("can't find master's ct for slaver packet\n");
> >+               pr_dbg("can't find master's ct for slaver packet (pf/l3num=%d protonum=%d)\n", l3num, protonum);
> >                 return NULL;
> >         }
> >
> >@@ -1092,7 +1092,7 @@ nf_conntrack_slaver_in(u_int8_t pf, unsigned int hooknum,
> >         /* rcu_read_lock()ed by nf_hook_slow */
> >         l3proto = __nf_ct_l3proto_find(pf);
> >         if (l3proto->get_l4proto(skb, skb_network_offset(skb), &dataoff, &protonum) <= 0) {
> >-               pr_dbg("slaver: l3proto not prepared to track yet or error occurred\n");
> >+               pr_dbg("slaver: l3proto not prepared to track yet or error occurred (pf=%d)\n", pf);
> >                 NF_CT_STAT_INC_ATOMIC(&init_net, error);
> >                 NF_CT_STAT_INC_ATOMIC(&init_net, invalid);
> >                 goto out;
> >
> >>
> >>Thanks,
> >>zhanghailiang
> >
> >Thanks,
> >
> >Dave
> >>
> >>
> >>>Dave
> >>>
> >>>Dr. David Alan Gilbert (1):
> >>>   COLO: Add primary side rolling statistics
> >>>
> >>>  hmp.c                         | 12 ++++++++++++
> >>>  include/migration/migration.h |  3 +++
> >>>  migration/colo.c              | 15 +++++++++++++++
> >>>  migration/migration.c         | 30 ++++++++++++++++++++++++++++++
> >>>  qapi-schema.json              | 11 ++++++++++-
> >>>  5 files changed, 70 insertions(+), 1 deletion(-)
> >>>
> >>
> >>
> >--
> >Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
> >.
> >
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 12+ messages
2015-03-05 13:31 [Qemu-devel] [RFC 0/1] Rolling stats on colo Dr. David Alan Gilbert (git)
2015-03-05 13:31 ` [Qemu-devel] [RFC 1/1] COLO: Add primary side rolling statistics Dr. David Alan Gilbert (git)
2015-03-06  1:48 ` [Qemu-devel] [RFC 0/1] Rolling stats on colo zhanghailiang
2015-03-06  1:52   ` zhanghailiang
2015-03-06 18:30   ` Dr. David Alan Gilbert
2015-03-09  2:37     ` Wen Congyang
2015-03-09  8:55       ` Dr. David Alan Gilbert
2015-03-09  9:01         ` Wen Congyang
2015-03-11  3:11     ` zhanghailiang
2015-03-11  9:06       ` Dr. David Alan Gilbert [this message]
2015-03-11  9:31         ` zhanghailiang
2015-03-11 10:07           ` Dr. David Alan Gilbert
