From: Yang Hongyang <yanghy@cn.fujitsu.com>
To: Jason Wang <jasowang@redhat.com>,
"Dong, Eddie" <eddie.dong@intel.com>,
Stefan Hajnoczi <stefanha@gmail.com>,
Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: zhanghailiang <zhang.zhanghailiang@huawei.com>,
"jan.kiszka@siemens.com" <jan.kiszka@siemens.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"peter.huangpeng" <peter.huangpeng@huawei.com>,
"Gonglei (Arei)" <arei.gonglei@huawei.com>,
"stefanha@redhat.com" <stefanha@redhat.com>,
"dgilbert@redhat.com" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [POC] colo-proxy in qemu
Date: Mon, 27 Jul 2015 13:51:37 +0800
Message-ID: <55B5C6E9.6090707@cn.fujitsu.com>
In-Reply-To: <55B5B85B.1010009@redhat.com>
On 07/27/2015 12:49 PM, Jason Wang wrote:
>
>
> On 07/27/2015 11:54 AM, Yang Hongyang wrote:
>>
>>
>> On 07/27/2015 11:24 AM, Jason Wang wrote:
>>>
>>>
>>> On 07/24/2015 04:04 PM, Yang Hongyang wrote:
>>>> Hi Jason,
>>>>
>>>> On 07/24/2015 10:12 AM, Jason Wang wrote:
>>>>>
>>>>>
>>>>> On 07/24/2015 10:04 AM, Dong, Eddie wrote:
>>>>>> Hi Stefan:
>>>>>> Thanks for your comments!
>>>>>>
>>>>>>> On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
>>>>>>>> We are planning to implement colo-proxy in qemu to cache and
>>>>>>>> compare packets.
>>>>>>>
>>>>>>> I thought there is a kernel module to do that?
>>>>>> Yes, that is the solution the COLO sub-community previously chose,
>>>>>> but we realized it might not be the best choice, and thus we
>>>>>> want to bring the discussion back here :) More comments are welcome.
>>>>>>
>>>>>
>>>>> Hi:
>>>>>
>>>>> Could you please give more details on this decision? What made
>>>>> you realize it was not the best choice?
>>>>
>>>> Below is my opinion:
>>>>
>>>> We realized that there are disadvantages to doing it in kernel space:
>>>> 1. We need to recompile the kernel: the colo-proxy kernel module is
>>>> implemented as an nf conntrack extension. Adding an extension requires
>>>> modifying the extension struct in the kernel, so a kernel recompile
>>>> is needed.
>>>
>>> There's no need to do it all in the kernel; you can use a separate
>>> process to do the comparing and trigger the state sync through the
>>> monitor.
>>
>> I don't get it. The colo-proxy kernel module uses a kthread to do the
>> comparing and trigger the state sync. We implemented it as an nf
>> conntrack extension module, so we need to extend the extension struct
>> in the kernel. Although that needs only a few lines of kernel changes,
>> a kernel recompile is still needed. Are you talking about not
>> implementing it as an nf conntrack extension?
>
> Yes, I mean implement the comparing in userspace but not in qemu.
Yes, that is an alternative, but it requires other components such as
netfilter userspace tools, which I think adds complexity. We wanted to
implement a simple solution in QEMU. Another reason is that using other
userspace tools will affect performance: the context switches between
kernel and userspace may add overhead.
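
To make the compare step discussed above a bit more concrete, here is a
minimal sketch of caching output packets from both sides and comparing
them in order. It is only an illustration and not the colo-proxy code:
the class name, the on_mismatch hook (standing in for "trigger the state
sync") and the choice to discard cached packets after a mismatch are all
assumptions made for the example.

```python
from collections import deque


class PacketComparer:
    """Toy model: cache output packets from the primary (PVM) and the
    secondary (SVM) and compare them pairwise, in arrival order."""

    def __init__(self, on_mismatch):
        self.primary = deque()
        self.secondary = deque()
        # Hypothetical hook standing in for "trigger the state sync".
        self.on_mismatch = on_mismatch
        # Packets whose primary and secondary copies were identical.
        self.released = []

    def add_primary(self, pkt: bytes) -> None:
        self.primary.append(pkt)
        self._compare()

    def add_secondary(self, pkt: bytes) -> None:
        self.secondary.append(pkt)
        self._compare()

    def _compare(self) -> None:
        # Compare only while both sides have a packet waiting.
        while self.primary and self.secondary:
            p = self.primary.popleft()
            s = self.secondary.popleft()
            if p == s:
                self.released.append(p)
            else:
                # Outputs diverged. Discarding the cached packets here is
                # an assumption of this sketch, not a documented behavior.
                self.primary.clear()
                self.secondary.clear()
                self.on_mismatch(p, s)
```

A real implementation would have to compare per connection and deal with
timeouts, which this sketch leaves out.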
>
>>
>>>
>>>> 2. We need to recompile iptables/nftables to use them together with
>>>> the colo-proxy kernel module.
>>>> 3. We need to configure the primary host to forward input packets to
>>>> the secondary, and configure the secondary to forward output packets
>>>> to the primary host. The network topology and configuration are too
>>>> complex for a regular user.
>>>>
>>>
>>> You can use current kernel primitives to mirror the traffic of both PVM
>>> and SVM to another process without any modification of the kernel. And
>>> qemu can offload all network configuration to management in this case.
>>> And what's more important, this works for vhost. Filtering in qemu
>>> won't work for vhost.
>>
>> We are using tc to mirror/forward packets now. Implementing it in QEMU
>> does have some limits, but there are also limits in the kernel if the
>> packets do not pass through the host kernel TCP/IP stack, as with
>> vhost-user.
>
> But the limits are far fewer than in userspace, no? For vhost-user, maybe
> we could extend the backend to mirror the traffic as well.
IMO the limits are more or less the same. Besides, to mirror/forward
packets, using tc requires a separate physical NIC or a VLAN, and that
NIC should not be used for any other purpose. If we implement it in QEMU
and use a socket connection to forward packets, we no longer need a
separate NIC, which reduces the complexity of the network topology.
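
As a rough illustration of forwarding packets over a socket connection
instead of a dedicated NIC, the sketch below sends each packet over a
stream socket with a 4-byte length prefix. The framing format and the
function names are assumptions made for this example; nothing here is
taken from an existing QEMU interface.

```python
import socket
import struct


def send_packet(sock: socket.socket, pkt: bytes) -> None:
    """Send one packet, prefixed with its length (network byte order)."""
    sock.sendall(struct.pack("!I", len(pkt)) + pkt)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a stream socket may return short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def recv_packet(sock: socket.socket) -> bytes:
    """Receive one length-prefixed packet sent by send_packet()."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)
```

The length prefix is needed because a stream socket does not preserve
packet boundaries, so the receiver has to know where each forwarded
packet ends.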
>
>>
>>>
>>>
>>>> You can refer to http://wiki.qemu.org/Features/COLO
>>>> to see the network topology and the steps to set up an env.
>>>
>>> The figure "COLO Framework" shows there's a proxy kernel module in
>>> primary node but in secondary node this is done through a process? This
>>> will complicate the environment a bit more.
>>
>> The proxy kernel module also works on the secondary node.
>>
>>>
>>>>
>>>> Setting up a test env is too complex. Usability is very important for
>>>> a feature like COLO, which provides a VM FT solution: if few people
>>>> can or are willing to set up the env, the feature is useless. So we
>>>> decided to develop a userspace colo-proxy.
>>>
>>> If the setup is too complex, you need to consider simplifying it or
>>> reusing existing code and designs. Otherwise you will probably
>>> introduce something new that itself needs fault tolerance.
>>>
>>>>
>>>> The advantages are obvious:
>>>> 1. We do not need to recompile the kernel.
>>>> 2. No need to recompile iptables/nftables.
>>>
>>> As I described above, it looks like there's no need to modify the kernel.
>>>
>>>> 3. We do not need to deal with the network configuration; we just use
>>>> a socket connection between the 2 QEMUs to forward packets.
>>>
>>> All network configurations should be offloaded to management. And you
>>> still need a dedicated topology according to the wiki.
>>>
>>>> 4. A complete VM FT solution in one go: we have already developed
>>>> block replication in QEMU, so with network replication in QEMU, all
>>>> the components we need are within QEMU. This is very important; it
>>>> greatly improves the usability of the COLO feature! We hope it will
>>>> gain more testers, users and developers.
>>>
>>> Does your block solution work for vhost?
>>
>> No, it can't work for vhost and dataplane; migration also won't work
>> for dataplane IIRC.
>>
>>>
>>>> 5. QEMU will gain a complete VM FT solution, and the most advanced FT
>>>> solution so far!
>>>>
>>>> Overall, usability is the most important factor that influenced our choice.
>>>>
>>>>
>>>
>>> Usability will be improved if you can use existing primitives and
>>> decouple unnecessary code from qemu.
>>>
>>> Thanks
>>>
>>>>>
>>>>> Thanks
>>>>> .
>>>>>
>>>>
>>>
>>>
>>> .
>>>
>>
>
> .
>
--
Thanks,
Yang.