From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43325)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1aLmfX-0003ru-Bt for qemu-devel@nongnu.org;
 Wed, 20 Jan 2016 01:55:04 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1aLmfU-0005Aw-3w for qemu-devel@nongnu.org;
 Wed, 20 Jan 2016 01:55:03 -0500
Received: from mx1.redhat.com ([209.132.183.28]:58672)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1aLmfT-0005Aa-QU for qemu-devel@nongnu.org;
 Wed, 20 Jan 2016 01:55:00 -0500
References: <1450780978-19123-1-git-send-email-zhangchen.fnst@cn.fujitsu.com>
 <568494B8.4080105@redhat.com> <5684E9EB.3070002@cn.fujitsu.com>
 <568A0527.9040001@redhat.com> <568A2A5F.3090608@cn.fujitsu.com>
 <568A3F80.8000806@redhat.com> <568A54C2.8050300@cn.fujitsu.com>
 <568CA327.4020103@redhat.com> <569C8EB7.3060507@cn.fujitsu.com>
 <569CB08F.4030607@redhat.com> <569EFF25.2020804@cn.fujitsu.com>
From: Jason Wang
Message-ID: <569F2F27.9000806@redhat.com>
Date: Wed, 20 Jan 2016 14:54:31 +0800
MIME-Version: 1.0
In-Reply-To: <569EFF25.2020804@cn.fujitsu.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter
To: Zhang Chen, qemu devel
Cc: zhanghailiang, Li Zhijian, Gui jianfeng, "eddie.dong", Huang peng,
 "Dr. David Alan Gilbert", Gong lei, Stefan Hajnoczi,
 jan.kiszka@siemens.com, Yang Hongyang

On 01/20/2016 11:29 AM, Zhang Chen wrote:
>
>> Sure.
>>
>> Two main comments/suggestions:
>>
>> - TCP analysis is missing in the current version; maybe you could point
>>   me to a git tree (or another version of the RFC) for a better
>>   understanding of the design. (Just a skeleton for TCP should be
>>   sufficient to discuss.)
>> - I prefer to make the code as reusable as possible. So it's better to
>>   split/decouple the reusable parts from the code. So a vague idea is:
>>
>> 1) Decouple the packet comparing from the netfilter. You've achieved
>>    99% of this since the work is done in a thread. Just let the thread
>>    poll sockets directly; then the comparing can be reused by other
>>    kinds of dataplane.
>> 2) Implement the traffic mirror/redirector as a filter.
>> 3) Implement TCP seq rewriting as a filter.
>>
>> Then, in the primary node, you need just a traffic mirror, which:
>> - mirrors ingress traffic to the secondary node
>> - mirrors egress traffic to the packet comparing thread
>>
>> And in the secondary node, you need two filters:
>> - A TCP seq rewriter which adjusts the TCP sequence number.
>> - A traffic redirector which redirects packets from a socket as ingress
>>   traffic, and redirects egress traffic to the socket which can be
>>   polled by the remote packet comparing thread.
>> Thoughts?
>>
>> Thanks
>>
>>> Thanks
>>> zhangchen
>>
>
>
> Hi, Jason.
> We have considered your suggestion to split/decouple
> the reusable parts from the code.
> Because filter plugins are traversed one by one in order,
> we will split colo-proxy into three filters on each side.
>
> But in this plan, primary and secondary both have a socket
> server, so startup is a problem.

I believe this issue could be solved by reusing socket chardev.
>
>
> [Diagram: primary qemu and secondary qemu side by side, each with a
> guest on top of a netfilter layer. The original ASCII art was mangled
> by line wrapping; the recoverable content is listed here.]
>
> Primary qemu, filter execute order:
>   mirror client (tx) -> redirect server (rx) -> compare (rx)
>
> Secondary qemu, filter execute order:
>   mirror server (tx) -> TCP rewriter, adjust ack / adjust seq (all)
>   -> redirect client (rx)
>
> A tap device sits below, connected by "guest receive" and
> "guest send" arrows.
>
> NOTE: filter direction is rx/tx/all
>   rx: receive packets sent to the netdev
>   tx: receive packets sent by the netdev
>
>

I'd still like to decouple the comparer from netfilter. It has two obvious
advantages:

- it can be reused by other dataplanes (e.g. vhost)
- the secondary redirector could redirect rx to the comparer on the
  primary node directly, which simplifies the design.

>
>
> guest recv packet route
>
> primary
> tap --> mirror client filter
>     mirror client will send the packet to the guest and, at the
>     same time, copy and forward the packet to the secondary
>     mirror server.
>
> secondary
> mirror server filter --> TCP rewriter
>     if the received packet is a TCP packet, we will adjust ack
>     and update the TCP checksum, then send it to the secondary
>     guest. Otherwise send it directly to the guest.
>
>
> guest send packet route
>
> primary
> guest --> redirect server filter
>     redirect server filter receives the primary guest packet
>     but does nothing, just passes it to the next filter.
>
> redirect server filter --> compare filter
>     compare filter receives the primary guest packet, then
>     waits for the secondary redirect packet to compare with it.
>     If the packets are the same, send the primary packet and clear
>     the secondary packet; otherwise send the primary packet and do
>     a checkpoint.
>
> secondary
> guest --> TCP rewriter filter
>     if the packet is a TCP packet, we will adjust seq
>     and update the TCP checksum, then send it to the
>     redirect client filter. Otherwise send it directly to the
>     redirect client filter.
>
> redirect client filter --> redirect server filter
>     forward the packet to the primary
>
>
> In the failover scenario (primary is down), the TCP rewriter will keep
> servicing the TCP connections that were established after the last
> checkpoint.
>
>
>
> How about this plan?

Sounds good. And there's indeed no need to differentiate client/server when
reusing the socket chardev. E.g.:

In primary node:

...
-chardev socket,id=comparer0,host=ip_primary,port=X,server,nowait
-chardev socket,id=comparer1,host=ip_primary,port=Y,server,nowait
-chardev socket,id=mirrorer0,host=ip_primary,port=Z,server,nowait
-netdev tap,id=hn0
-traffic-mirrorer netdev=hn0,id=t0,indev=comparer0,outdev=mirrorer0
-colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1
...

The packet comparer compares the packets from two chardevs: comparer0 and
comparer1. traffic-mirrorer mirrors tx to the secondary node through chardev
mirrorer0, and mirrors rx to the packet comparer through chardev comparer0.

In secondary node:

...
-chardev socket,id=redirector0,host=ip_primary,port=Y
-chardev socket,id=redirector1,host=ip_primary,port=Z
-netdev tap,id=hn0
-traffic-redirector netdev=hn0,id=r0,indev=redirector0,outdev=redirector1
-colo-rewriter netdev=hn0,id=c0
...

traffic-redirector redirects the rx traffic from the primary node through
redirector0 and redirects the tx traffic to the primary node through
redirector1. colo-rewriter rewrites the seq number as a normal netfilter.
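
For what it's worth, the comparer that sits behind comparer0 and comparer1 can be sketched independently of netfilter, which is the point of decoupling it. The sketch below follows the rule from the quoted plan: hold a primary packet until the matching secondary packet arrives, always release the primary packet, and ask for a checkpoint when the two differ. It is Python for brevity and not QEMU code; `PacketComparer`, `send` and `checkpoint` are hypothetical names, and pairing packets strictly in arrival order is a simplifying assumption (a real comparer would need to match per connection and resynchronise after a checkpoint).

```python
from collections import deque
from typing import Callable, Deque


class PacketComparer:
    """Pair packets from the primary and secondary streams in arrival order."""

    def __init__(self, send: Callable[[bytes], None],
                 checkpoint: Callable[[], None]) -> None:
        self.primary: Deque[bytes] = deque()
        self.secondary: Deque[bytes] = deque()
        self.send = send              # hypothetical: release a packet to the network
        self.checkpoint = checkpoint  # hypothetical: request a COLO checkpoint

    def on_primary(self, pkt: bytes) -> None:
        """Called for each packet read from the primary stream."""
        self.primary.append(pkt)
        self._drain()

    def on_secondary(self, pkt: bytes) -> None:
        """Called for each packet read from the secondary stream."""
        self.secondary.append(pkt)
        self._drain()

    def _drain(self) -> None:
        # Compare only when both sides have a packet waiting.
        while self.primary and self.secondary:
            p = self.primary.popleft()
            s = self.secondary.popleft()  # the secondary copy is always dropped
            self.send(p)                  # the primary packet is always released
            if p != s:
                self.checkpoint()         # outputs diverged: resync the secondary
```

Since the class only sees two byte streams, whatever feeds it (a netfilter, vhost, or anything else that can write to a socket) does not matter to the comparison itself.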