From: Yang Hongyang
Date: Mon, 27 Jul 2015 11:54:24 +0800
Subject: Re: [Qemu-devel] [POC] colo-proxy in qemu
Message-ID: <55B5AB70.3030704@cn.fujitsu.com>
In-Reply-To: <55B5A465.6030004@redhat.com>
To: Jason Wang, "Dong, Eddie", Stefan Hajnoczi, Li Zhijian
Cc: zhanghailiang, "jan.kiszka@siemens.com", "qemu-devel@nongnu.org", "peter.huangpeng", "Gonglei (Arei)", "stefanha@redhat.com", "dgilbert@redhat.com"

On 07/27/2015 11:24 AM, Jason Wang wrote:
>
>
> On 07/24/2015 04:04 PM, Yang Hongyang wrote:
>> Hi Jason,
>>
>> On 07/24/2015 10:12 AM, Jason Wang wrote:
>>>
>>>
>>> On 07/24/2015 10:04 AM, Dong, Eddie wrote:
>>>> Hi Stefan:
>>>> Thanks for your comments!
>>>>
>>>>> On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
>>>>>> We are planning to implement colo-proxy in qemu to cache and compare
>>>>>> packets.
>>>>>
>>>>> I thought there is a kernel module to do that?
>>>> Yes, that is the previous solution the COLO sub-community chose
>>>> to go with, but we realized it might not be the best choice, and thus we
>>>> want to bring the discussion back here :) More comments are welcome.
>>>>
>>>
>>> Hi:
>>>
>>> Could you pls describe more details on this decision? What's the reason
>>> that you realized it was not the best choice?
>>
>> Below is my opinion:
>>
>> We realized that there are disadvantages to doing it in kernel space:
>> 1. We need to recompile the kernel: the colo-proxy kernel module is
>> implemented as an nf conntrack extension. Adding an extension requires
>> modifying the extension struct in-kernel, so a kernel recompile is needed.
>
> There's no need to do it all in the kernel; you can use a separate process to
> do the comparing and trigger the state sync through the monitor.

I don't get it. The colo-proxy kernel module uses a kthread to do the
comparing and trigger the state sync. We implemented it as an nf conntrack
extension module, so we need to extend the extension struct in-kernel.
Although that only needs a few lines of changes to the kernel, a recompile
of the kernel is still needed. Are you talking about not implementing it as
an nf conntrack extension?

>
>> 2. We need to recompile iptables/nftables to use them together with the
>> colo-proxy kernel module.
>> 3. We need to configure the primary host to forward input packets to the
>> secondary, and the secondary to forward output packets to the primary
>> host. The network topology and configuration are too complex for a
>> regular user.
>>
>
> You can use current kernel primitives to mirror the traffic of both PVM
> and SVM to another process without any modification of the kernel. And qemu
> can offload all network configuration to management in this case. And
> what's more important, this works for vhost. Filtering in qemu won't work
> for vhost.

We are using tc to mirror/forward packets now.
Implementing it in QEMU does have some limits, but there are also limits in
the kernel when packets do not pass through the host kernel TCP/IP stack,
as with vhost-user.

>
>
>> You can refer to http://wiki.qemu.org/Features/COLO
>> to see the network topology and the steps to set up an env.
>
> The figure "COLO Framework" shows there's a proxy kernel module in the
> primary node, but in the secondary node this is done through a process? This
> will complicate the environment a bit more.

The proxy kernel module also works on the secondary node.

>
>>
>> Setting up a test env is too complex. Usability is very important to a
>> feature like COLO, which provides a VM FT solution: if few people can or
>> are willing to set up the env, the feature is useless. So we decided to
>> develop a user-space colo-proxy.
>
> If the setup is too complex, you need to consider simplifying it or reusing
> code and designs. Otherwise you will probably introduce something new that
> needs fault tolerance.
>
>>
>> The advantages are obvious:
>> 1. We do not need to recompile the kernel.
>> 2. No need to recompile iptables/nftables.
>
> As I described above, it looks like there's no need to modify the kernel.
>
>> 3. We do not need to deal with the network configuration; we just use a
>> socket connection between the 2 QEMUs to forward packets.
>
> All network configuration should be offloaded to management. And you
> still need a dedicated topology according to the wiki.
>
>> 4. A complete VM FT solution in one go. We have already developed the
>> block replication in QEMU, so with the network replication in QEMU, all
>> the components we need are within QEMU. This is very important: it
>> greatly improves the usability of the COLO feature! We hope it will gain
>> more testers, users and developers.
>
> Does your block solution work for vhost?

No, it can't work for vhost and dataplane; migration also won't work for
dataplane, IIRC.

>
>> 5. QEMU will gain a complete VM FT solution and the most advanced FT
>> solution so far!
>>
>> Overall, usability is the most important factor that impacts our choice.
>>
>>
>
> Usability will be improved if you can use existing primitives and decouple
> unnecessary code from qemu.
>
> Thanks
>
>>>
>>> Thanks
>>> .
>>>
>>
>
>
> .
>

--
Thanks,
Yang.