From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60462) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZJidf-0002Rt-5a for qemu-devel@nongnu.org; Mon, 27 Jul 2015 09:40:23 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZJidb-0000nz-54 for qemu-devel@nongnu.org; Mon, 27 Jul 2015 09:40:19 -0400
Received: from [59.151.112.132] (port=14798 helo=heian.cn.fujitsu.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZJidZ-0000lh-4S for qemu-devel@nongnu.org; Mon, 27 Jul 2015 09:40:15 -0400
Message-ID: <55B634AE.1080502@cn.fujitsu.com>
Date: Mon, 27 Jul 2015 21:39:58 +0800
From: Yang Hongyang
MIME-Version: 1.0
References: <55AC9859.3050100@cn.fujitsu.com> <20150720103208.GA12675@stefanha-thinkpad.redhat.com> <55B19F25.10905@redhat.com> <55B1F196.2000008@cn.fujitsu.com> <20150727104016.GF2374@work-vm>
In-Reply-To: <20150727104016.GF2374@work-vm>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [POC] colo-proxy in qemu
To: "Dr. David Alan Gilbert"
Cc: zhanghailiang , Li Zhijian , Stefan Hajnoczi , Jason Wang , "Dong, Eddie" , "peter.huangpeng" , "qemu-devel@nongnu.org" , "Gonglei (Arei)" , "stefanha@redhat.com" , "jan.kiszka@siemens.com"

Hi Dave,

Thanks for the comments!

On 07/27/2015 06:40 PM, Dr. David Alan Gilbert wrote:
> * Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
>> Hi Jason,
>>
>> On 07/24/2015 10:12 AM, Jason Wang wrote:
>>>
>>>
>>> On 07/24/2015 10:04 AM, Dong, Eddie wrote:
>>>> Hi Stefan:
>>>> Thanks for your comments!
>>>>
>>>>> On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
>>>>>> We are planning to implement colo-proxy in qemu to cache and compare
>>>>> packets.
>>>>>
>>>>> I thought there is a kernel module to do that?
>>>> Yes, that is the previous solution the COLO sub-community choose to go, but we realized it might be not the best choices, and thus we want to bring discussion back here :) More comments are welcome.
>>>>
>>>
>>> Hi:
>>>
>>> Could you pls describe more details on this decision? What's the reason
>>> that you realize it was not the best choice?
>>
>> Below is my opinion:
>>
>> We realized that there're disadvantages do it in kernel spaces:
>> 1. We need to recompile kernel: the colo-proxy kernel module is
>>    implemented as a nf conntrack extension. Adding a extension need to
>>    modify the extension struct in-kernel, so recompile kernel is needed.
>
> That change is tiny though, so I don't think the change to the kernel
> is a big issue (but I'm not a netfilter guy).
>
> (For those following, the patch is:
> https://github.com/coloft/colo-proxy/blob/master/patch4kernel/0001-colo-patch-for-kernel.patch
> )
> The comparison modules are bigger though, but still not massive.
>
>> 2. We need to recompile iptables/nftables to use together with the colo-proxy
>>    kernel module.
>
> Again, the changes to iptables are small; so I don't think this should
> influence it too much.

Yes, these changes are small, but even a small change means the component
has to be recompiled and reinstalled, which is not friendly for the user...

>
> The bigger problem shown by 1&2 is that these changes are single-use - just for
> COLO, which does make it a little harder to justify.

That's true.

>
>> 3. Need to configure primary host to forward input packets to secondary as
>>    well as configure secondary to forward output packets to primary host, the
>>    network topology and configuration is too complex for a regular user.
>
> Yes, and that bit is HARD - it took me quite a while to get it right; however,
> we'll still need to forward packets between primary and secondary,

If we forward in qemu using a socket connection, a separate forwarding NIC
will not be needed, and none of the tc setup will be needed either, which
will make configuration easier, I think.

> and all that
> hard setup should get rolled into something like libvirt, so perhaps it's not really
> that bad for the user in the end.
>
>> You can refer to http://wiki.qemu.org/Features/COLO
>> to see the network topology and the steps to setup an env.
>>
>> Setup a test env is too complex. The usability is so important to a feature
>> like COLO which provide VM FT solution, if fewer people can/willing to
>> setup the env, the feature is useless. So we decide to develop user space
>> colo-proxy.
>>
>> The advantage is obvious,
>> 1. we do not need to recompile kernel.
>> 2. No need to recompile iptables/nftables.
>> 3. we do not need to deal with the network configuration, we just using a
>>    socket connection between 2 QEMUs to forward packets.
>> 4. A complete VM FT solution in one go, we have already developed the block
>>    replication in QEMU, so with the network replication in QEMU, all
>>    components we needed are within QEMU, this is very important, it greatly
>>    improves the usability of COLO feature! We hope it will gain more testers,
>>    users and developers.
>> 5. QEMU will gain a complete VM FT solution and the most advantage FT solution
>>    so far!
>>
>> Overall, usability is the most important factor that impact our choice.
>
> My biggest worry is your reliance on SLIRP for the TCP/IP stack; it
> doesn't get much work done on it and I worry about it's reliability for
> using it for the level of complexity you need.
>
> Your current kernel implementation gets all the nf_conntrack stuff for free
> which is very powerful.
>
> However, I can see some advantages from doing it in user space; it would
> be easier to debug, and possibly easier to configure, and might also be easier
> to handle continuous FT (i.e. transferring the state of the proxy to a new COLO
> connection).
>
> I think at the moment I'd still prefer kernel space (especially since your kernel
> code now works pretty reliably!)
>
> Another thought; if you're main worry is to do with the complexity of kernel
> changes, had you considered looking at the bpf-jit - I'm not sure if it can
> do what you need, but perhaps it's worth a look?

Will have a look, thank you!

>
> Dave
> P.S. I think 'proxy' is still the right word to describe it rather than 'agency'.
>
>>
>>
>>>
>>> Thanks
>>> .
>>>
>>
>> --
>> Thanks,
>> Yang.
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>

-- 
Thanks,
Yang.
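
For readers following the thread, here is a rough sketch of what "forward packets over a socket connection between 2 QEMUs" could look like at the framing level. It is not taken from the colo-proxy code: the 4-byte length prefix, the function names (send_packet, recv_exact, recv_packet, forward_demo) and the use of a local socketpair in place of a real primary/secondary connection are all assumptions made for illustration. The only point is that a stream socket needs some framing so the receiving side can recover packet boundaries before comparing.

```python
import socket
import struct

# Assumed wire format (not from colo-proxy): each packet is sent as a
# 4-byte big-endian length followed by the raw packet bytes.
HEADER = struct.Struct("!I")


def send_packet(sock, payload):
    """Send one packet with a length prefix so boundaries survive the stream."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-packet")
        buf.extend(chunk)
    return bytes(buf)


def recv_packet(sock):
    """Receive one length-prefixed packet and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def forward_demo():
    """Forward a few packets across a local socket pair and return them.

    The socket pair stands in for the connection between the two hosts.
    """
    primary, secondary = socket.socketpair()
    try:
        packets = [b"\x00" * 60, b"hello", b""]
        for pkt in packets:
            send_packet(primary, pkt)
        return [recv_packet(secondary) for _ in packets]
    finally:
        primary.close()
        secondary.close()


if __name__ == "__main__":
    for pkt in forward_demo():
        print(len(pkt), pkt[:8])
```

In a real setup the two ends would be on different hosts and the receiving side would queue packets for comparison rather than just returning them; none of that is shown here.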