From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49101)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1ZJdAs-0001Rk-J2 for qemu-devel@nongnu.org;
    Mon, 27 Jul 2015 03:50:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1ZJdAp-0001UA-SS for qemu-devel@nongnu.org;
    Mon, 27 Jul 2015 03:50:14 -0400
Received: from [59.151.112.132] (port=28221 helo=heian.cn.fujitsu.com)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1ZJdAn-0001Ky-Sp for qemu-devel@nongnu.org;
    Mon, 27 Jul 2015 03:50:11 -0400
Message-ID: <55B5E2A3.40600@cn.fujitsu.com>
Date: Mon, 27 Jul 2015 15:49:55 +0800
From: Yang Hongyang
MIME-Version: 1.0
References: <55AC9859.3050100@cn.fujitsu.com>
    <20150720103208.GA12675@stefanha-thinkpad.redhat.com>
    <55B19F25.10905@redhat.com> <55B1F196.2000008@cn.fujitsu.com>
    <55B5A465.6030004@redhat.com> <55B5AB70.3030704@cn.fujitsu.com>
    <55B5B85B.1010009@redhat.com> <55B5C6E9.6090707@cn.fujitsu.com>
    <55B5DF9E.6020908@redhat.com>
In-Reply-To: <55B5DF9E.6020908@redhat.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [POC] colo-proxy in qemu
To: Jason Wang, "Dong, Eddie", Stefan Hajnoczi, Li Zhijian
Cc: zhanghailiang, "jan.kiszka@siemens.com", "qemu-devel@nongnu.org",
    "peter.huangpeng", "Gonglei (Arei)", "stefanha@redhat.com",
    "dgilbert@redhat.com"

On 07/27/2015 03:37 PM, Jason Wang wrote:
>
>
> On 07/27/2015 01:51 PM, Yang Hongyang wrote:
>> On 07/27/2015 12:49 PM, Jason Wang wrote:
>>>
>>>
>>> On 07/27/2015 11:54 AM, Yang Hongyang wrote:
>>>>
>>>>
>>>> On 07/27/2015 11:24 AM, Jason Wang wrote:
>>>>>
>>>>>
>>>>> On 07/24/2015 04:04 PM, Yang Hongyang wrote:
>>>>>> Hi Jason,
>>>>>>
>>>>>> On 07/24/2015 10:12 AM, Jason Wang wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 07/24/2015 10:04 AM, Dong, Eddie wrote:
>>>>>>>> Hi Stefan:
>>>>>>>> Thanks for your comments!
>>>>>>>>
>>>>>>>>> On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
>>>>>>>>>> We are planning to implement colo-proxy in qemu to cache and
>>>>>>>>>> compare packets.
>>>>>>>>>
>>>>>>>>> I thought there is a kernel module to do that?
>>>>>>>> Yes, that is the previous solution the COLO sub-community chose
>>>>>>>> to go with, but we realized it might not be the best choice, and
>>>>>>>> thus we want to bring the discussion back here :) More comments
>>>>>>>> are welcome.
>>>>>>>>
>>>>>>>
>>>>>>> Hi:
>>>>>>>
>>>>>>> Could you please describe this decision in more detail? What's the
>>>>>>> reason that you realized it was not the best choice?
>>>>>>
>>>>>> Below is my opinion:
>>>>>>
>>>>>> We realized that there are disadvantages to doing it in kernel space:
>>>>>> 1. We need to recompile the kernel: the colo-proxy kernel module is
>>>>>>    implemented as an nf conntrack extension. Adding an extension
>>>>>>    requires modifying the extension struct in-kernel, so a kernel
>>>>>>    recompile is needed.
>>>>>
>>>>> There's no need to do it all in the kernel; you can use a separate
>>>>> process to do the comparing and trigger the state sync through the
>>>>> monitor.
>>>>
>>>> I don't get it. The colo-proxy kernel module uses a kthread to do the
>>>> comparing and trigger the state sync. We implemented it as an nf
>>>> conntrack extension module, so we need to extend the extension struct
>>>> in-kernel. Although it needs only a few lines of changes to the
>>>> kernel, a recompile of the kernel is needed. Are you talking about
>>>> not implementing it as an nf conntrack extension?
>>>
>>> Yes, I mean implementing the comparing in userspace but not in qemu.
>>
>> Yes, it is an alternative, but it requires other components such as
>> netfilter userspace tools, which I think adds complexity; we
>> wanted to implement a simple solution in QEMU.
>> Another reason is that using other userspace tools will affect
>> performance: the context switches between kernel and userspace may
>> add overhead.
>>
>>>
>>>>
>>>>>
>>>>>> 2. We need to recompile iptables/nftables to use them together with
>>>>>>    the colo-proxy kernel module.
>>>>>> 3. We need to configure the primary host to forward input packets to
>>>>>>    the secondary, as well as configure the secondary to forward
>>>>>>    output packets to the primary host; the network topology and
>>>>>>    configuration are too complex for a regular user.
>>>>>>
>>>>>
>>>>> You can use current kernel primitives to mirror the traffic of both
>>>>> PVM and SVM to another process without any modification of the
>>>>> kernel. And qemu can offload all network configuration to management
>>>>> in this case. And what's more important, this works for vhost.
>>>>> Filtering in qemu won't work for vhost.
>>>>
>>>> We are using tc to mirror/forward packets now. Implementing it in QEMU
>>>> does have some limits, but there are also limits in the kernel if the
>>>> packets do not pass through the host kernel TCP/IP stack, as with
>>>> vhost-user.
>>>
>>> But the limits are much fewer than in userspace, no? For vhost-user,
>>> maybe we could extend the backend to mirror the traffic as well.
>>
>> IMO the limits are more or less. Besides, for mirroring/forwarding
>> packets, using tc requires a separate physical nic or a vlan, and the
>> nic should not be used for other purposes. If we implement it in QEMU,
>> using a socket connection to forward packets, we no longer need a
>> separate nic, which reduces the network topology complexity.
>
> It depends on how you design your user space. If you want to use
> userspace to forward the packets, you can 1) use a packet socket to
> capture all traffic on the tap that is used by the VM, 2) mirror the
> traffic to a new tap device; the user space can then read all traffic
> from this new tap device.

Yes, but we can also do it in QEMU space, right?
This will make life easier because we do everything in one solution
within QEMU.

>
> .
>

--
Thanks,
Yang.