From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
Cc: lizhijian@cn.fujitsu.com, quintela@redhat.com,
yunhong.jiang@intel.com, eddie.dong@intel.com,
peter.huangpeng@huawei.com, qemu-devel@nongnu.org,
arei.gonglei@huawei.com, luis@cs.umu.se, amit.shah@redhat.com,
hongyang.yang@easystack.cn
Subject: [Qemu-devel] [WIP] RDMA transport for COLO
Date: Thu, 17 Dec 2015 20:31:12 +0000
Message-ID: <20151217203111.GG2484@work-vm>
Hi,
I've been playing with getting an RDMA setup for COLO and
have something that mostly works. It is very new and quite
hacky, but I thought I'd share my work so far.
You can find it at:
https://github.com/orbitfp7/qemu/commits/orbit-wp4-colo-dec
What I've done is:
a) Wire up a partner TCP connection by the side of the RDMA
connection.
b) Use the TCP connection just for the responses from secondary->primary
c) Make the RDMA connection write to the colo-cache after
the first migrate
d) Make the RDMA connection notify the secondary when it
sends writes, so that the secondary can know that it needs
to flush those pages in the colo-cache.
e) Add a shutdown function and fix some other bugs
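To illustrate why (d) is needed, here is a minimal sketch, not code from the tree. It assumes that a remote RDMA write lands in the colo-cache without the secondary's CPU seeing it, so the secondary only learns which pages to flush from an explicit notification. Every name in it (colo_cache_sketch, sketch_remote_write, sketch_notify_written, sketch_flush) is hypothetical, the page count is tiny, and a memset stands in for the RDMA write.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_NUM_PAGES 8u   /* tiny, for illustration only */

/* Hypothetical secondary-side state; none of these names come from QEMU. */
struct colo_cache_sketch {
    uint8_t guest_ram[SKETCH_NUM_PAGES][SKETCH_PAGE_SIZE]; /* RAM the secondary runs on */
    uint8_t cache[SKETCH_NUM_PAGES][SKETCH_PAGE_SIZE];     /* staging copy written by the primary */
    uint8_t dirty[SKETCH_NUM_PAGES];                       /* 1 = notified since the last flush */
};

/*
 * Stand-in for a remote RDMA write landing in the cache. It deliberately
 * leaves dirty[] alone: the assumption is that the secondary's CPU is not
 * involved in the write and so cannot record it by itself.
 */
static void sketch_remote_write(struct colo_cache_sketch *s,
                                unsigned page, uint8_t fill)
{
    assert(page < SKETCH_NUM_PAGES);
    memset(s->cache[page], fill, SKETCH_PAGE_SIZE);
}

/* Stand-in for handling the primary's "I wrote this page" notification. */
static void sketch_notify_written(struct colo_cache_sketch *s, unsigned page)
{
    assert(page < SKETCH_NUM_PAGES);
    s->dirty[page] = 1;
}

/*
 * At a checkpoint, copy only the notified pages from the cache into guest
 * RAM and clear their dirty marks. Returns the number of pages flushed.
 */
static unsigned sketch_flush(struct colo_cache_sketch *s)
{
    unsigned flushed = 0;

    for (unsigned p = 0; p < SKETCH_NUM_PAGES; p++) {
        if (s->dirty[p]) {
            memcpy(s->guest_ram[p], s->cache[p], SKETCH_PAGE_SIZE);
            s->dirty[p] = 0;
            flushed++;
        }
    }
    return flushed;
}
```

In this sketch a page that was written but never notified stays in the cache and is not copied at the flush, which is the situation the notification in (d) is there to avoid.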
I've had that working on both your current world (which is
what that tree is based off) and your older COLO world
from July (with a bit more hacking to make it take the newer
patches).
Looking at the speed:
a) The CPU load on the incoming thread is much lower - maybe
only 10-11% instead of 30-40%.
b) The performance of guest code is a little slower (~10% slower?)
on RDMA rather than TCP (on both 10Gbps and 40Gbps links).
I've not worked out why yet; my guess is that it could be to do with
RDMA dynamic registration.
Things I know I need to do:
1) Tidy it up - it's very messy!
2) Try and get rid of the TCP connection and use an RDMA
channel for the backwards connection
3) Make sure the shutdown really can cope with the other host
being dead.
4) It only deals with the dynamic registration mode of RDMA;
setting pin-all will probably break it.
5) Figure out why it's slower!
6) Test failover more.
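For (3), one general technique for the TCP side channel (not necessarily what the tree does) is to call shutdown() on the local socket so that a reader does not wait forever on a peer that will never answer. The sketch below is only an illustration under assumptions: it uses a local AF_UNIX socketpair as a stand-in for the TCP connection, the helper name sketch_recv_after_shutdown is hypothetical, and it says nothing about the RDMA channel itself.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a connected socket pair, leave the peer end
 * open but silent (as a hung host would be), shut down our end, then try
 * to read from it. Returns what recv() returns; 0 means end-of-file.
 */
static ssize_t sketch_recv_after_shutdown(void)
{
    int fds[2];
    char byte;
    ssize_t n;
    int rc;

    rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
    assert(rc == 0);          /* sketch only: no real error handling */

    /* The peer (fds[1]) never writes and never closes. */
    rc = shutdown(fds[0], SHUT_RDWR);
    assert(rc == 0);

    /* With the receive side shut down this returns instead of blocking. */
    n = recv(fds[0], &byte, 1, 0);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

On Linux the recv() here returns 0 straight away even though the peer has sent nothing; whether the same approach is enough for a genuinely dead remote host is exactly what (3) still has to establish.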
My work on this is part of the EU Orbit project
( http://www.orbitproject.eu/ )
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK