From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46507) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aMWMm-0005zk-MB for qemu-devel@nongnu.org; Fri, 22 Jan 2016 02:42:46 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aMWMj-00078x-Dh for qemu-devel@nongnu.org; Fri, 22 Jan 2016 02:42:44 -0500
Received: from mx1.redhat.com ([209.132.183.28]:58635) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aMWMj-00078N-44 for qemu-devel@nongnu.org; Fri, 22 Jan 2016 02:42:41 -0500
References: <1450780978-19123-1-git-send-email-zhangchen.fnst@cn.fujitsu.com> <568494B8.4080105@redhat.com> <5684E9EB.3070002@cn.fujitsu.com> <568A0527.9040001@redhat.com> <568A2A5F.3090608@cn.fujitsu.com> <568A3F80.8000806@redhat.com> <568A54C2.8050300@cn.fujitsu.com> <568CA327.4020103@redhat.com> <569C8EB7.3060507@cn.fujitsu.com> <569CB08F.4030607@redhat.com> <569EFF25.2020804@cn.fujitsu.com> <569F2F27.9000806@redhat.com> <569F5AFF.2050302@cn.fujitsu.com> <569F5F43.5030807@redhat.com> <569F61D7.3060502@cn.fujitsu.com> <56A19EEA.4000700@redhat.com> <56A1A1CA.8020008@cn.fujitsu.com> <56A1C112.3060402@redhat.com> <56A1C4A9.3020203@cn.fujitsu.com> <56A1CA7E.3090306@redhat.com> <56A1D09E.2040004@cn.fujitsu.com>
From: Jason Wang
Message-ID: <56A1DD5F.5030504@redhat.com>
Date: Fri, 22 Jan 2016 15:42:23 +0800
MIME-Version: 1.0
In-Reply-To: <56A1D09E.2040004@cn.fujitsu.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter
To: Wen Congyang, Zhang Chen, qemu devel
Cc: zhanghailiang, Li Zhijian, Gui jianfeng, "eddie.dong", "Dr. David Alan Gilbert", Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka@siemens.com, Yang Hongyang

On 01/22/2016 02:47 PM, Wen Congyang wrote:
> On 01/22/2016 02:21 PM, Jason Wang wrote:
>>
>> On 01/22/2016 01:56 PM, Wen Congyang wrote:
>>> On 01/22/2016 01:41 PM, Jason Wang wrote:
>>>>>
>>>>> On 01/22/2016 11:28 AM, Wen Congyang wrote:
>>>>>>> On 01/22/2016 11:15 AM, Jason Wang wrote:
>>>>>>>>> On 01/20/2016 06:30 PM, Wen Congyang wrote:
>>>>>>>>>>> On 01/20/2016 06:19 PM, Jason Wang wrote:
>>>>>>>>>>>>>>> On 01/20/2016 06:01 PM, Wen Congyang wrote:
>>>>>>>>>>>>>>>>>>> On 01/20/2016 02:54 PM, Jason Wang wrote:
>>>>>>>>>>>>>>>>>>>>>>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Two main comments/suggestions:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - TCP analysis is missed in current version, maybe you point a git tree
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (or another version of RFC) to me for a better understanding of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> design. (Just a skeleton for TCP should be sufficient to discuss).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - I prefer to make the code as reusable as possible. So it's better to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> split/decouple the reusable parts from the codes. So a vague idea is:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1) Decouple the packet comparing from the netfilter. You've achieved
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this 99% since the work has been done in a thread. Just let the thread
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> poll sockets directly, then the comparing have the possibility to be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reused by other kinds of dataplane.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2) Implement traffic mirror/redirector as filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3) Implement TCP seq rewriting as a filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Then, in primary node, you need just a traffic mirror, which does:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - mirror ingress traffic to secondary node
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - mirror egress traffic to packet comparing thread
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> And in secondary node, you need two filters:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - A TCP seq rewriter which adjusts the TCP sequence number.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - A traffic redirector which redirects packets from a socket as ingress
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traffic, and redirects egress traffic to the socket which could be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> polled by the remote packet comparing thread.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thoughts?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> zhangchen
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, Jason.
>>>>>>>>>>>>>>>>>>>>>>>>>>> We consider your suggestion to split/decouple
>>>>>>>>>>>>>>>>>>>>>>>>>>> the reusable parts from the codes.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Because filter plugins are traversed one by one in order,
>>>>>>>>>>>>>>>>>>>>>>>>>>> we will split colo-proxy into three filters on each side.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> But in this plan, primary and secondary both have a socket
>>>>>>>>>>>>>>>>>>>>>>>>>>> server, so startup is a problem.
>>>>>>>>>>>>>>>>>>>>>>> I believe this issue could be solved by reusing socket chardev.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> [Figure: ASCII diagram of "Primary qemu" and "Secondary qemu", each with
>>>>>>>>>>>>>>>>>>>>>>>>>>> a guest box above a netfilter box, and a tap below carrying
>>>>>>>>>>>>>>>>>>>>>>>>>>> "guest receive" and "guest send".
>>>>>>>>>>>>>>>>>>>>>>>>>>> Primary filters, in filter execute order: mirror client (tx),
>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect server (rx), compare (rx).
>>>>>>>>>>>>>>>>>>>>>>>>>>> Secondary filters, in filter execute order: mirror server (tx),
>>>>>>>>>>>>>>>>>>>>>>>>>>> TCP rewriter (adjust ack / adjust seq, all), redirect client (rx).]
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> NOTE: filter direction is rx/tx/all
>>>>>>>>>>>>>>>>>>>>>>>>>>> rx: receive packets sent to the netdev
>>>>>>>>>>>>>>>>>>>>>>>>>>> tx: receive packets sent by the netdev
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I still like to decouple comparer from netfilter. It have two obvious
>>>>>>>>>>>>>>>>>>>>>>> advantages:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - make it can be reused by other dataplane (e.g vhost)
>>>>>>>>>>>>>>>>>>>>>>> - secondary redirector could redirect rx to comparer on primary node
>>>>>>>>>>>>>>>>>>>>>>> directly which simplify the design.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> guest recv packet route
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> primary
>>>>>>>>>>>>>>>>>>>>>>>>>>> tap --> mirror client filter
>>>>>>>>>>>>>>>>>>>>>>>>>>> mirror client will send packet to guest,at the
>>>>>>>>>>>>>>>>>>>>>>>>>>> same time, copy and forward packet to secondary
>>>>>>>>>>>>>>>>>>>>>>>>>>> mirror server.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> secondary
>>>>>>>>>>>>>>>>>>>>>>>>>>> mirror server filter --> TCP rewriter
>>>>>>>>>>>>>>>>>>>>>>>>>>> if recv packet is TCP packet,we will adjust ack
>>>>>>>>>>>>>>>>>>>>>>>>>>> and update TCP checksum, then send to secondary
>>>>>>>>>>>>>>>>>>>>>>>>>>> guest. else directly send to guest.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> guest send packet route
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> primary
>>>>>>>>>>>>>>>>>>>>>>>>>>> guest --> redirect server filter
>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect server filter recv primary guest packet
>>>>>>>>>>>>>>>>>>>>>>>>>>> but do nothing, just pass to next filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect server filter --> compare filter
>>>>>>>>>>>>>>>>>>>>>>>>>>> compare filter receives the primary guest packet, then
>>>>>>>>>>>>>>>>>>>>>>>>>>> waits for the secondary redirect packet to compare with it.
>>>>>>>>>>>>>>>>>>>>>>>>>>> if the packets are the same, send primary packet and clear secondary
>>>>>>>>>>>>>>>>>>>>>>>>>>> packet, else send primary packet and do
>>>>>>>>>>>>>>>>>>>>>>>>>>> checkpoint.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> secondary
>>>>>>>>>>>>>>>>>>>>>>>>>>> guest --> TCP rewriter filter
>>>>>>>>>>>>>>>>>>>>>>>>>>> if the packet is a TCP packet, we will adjust seq
>>>>>>>>>>>>>>>>>>>>>>>>>>> and update TCP checksum, then send it to
>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect client filter. else directly send to
>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect client filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> redirect client filter --> redirect server filter
>>>>>>>>>>>>>>>>>>>>>>>>>>> forward packet to primary
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> In the failover scene (primary is down), the TCP rewriter will keep
>>>>>>>>>>>>>>>>>>>>>>>>>>> servicing
>>>>>>>>>>>>>>>>>>>>>>>>>>> the TCP connections which were established after the last checkpoint.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> How about this plan?
>>>>>>>>>>>>>>>>>>>>>>> Sounds good.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> And there's indeed no need to differ client/server by reusing the socket
>>>>>>>>>>>>>>>>>>>>>>> chardev. E.g:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> In primary node:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>>>>>>> -chardev socket,id=comparer0,host=ip_primary,port=X,server,nowait
>>>>>>>>>>>>>>>>>>>>>>> -chardev socket,id=comparer1,host=ip_primary,port=Y,server,nowait
>>>>>>>>>>>>>>>>>>>>>>> -chardev socket,id=mirrorer0,host=ip_primary,port=Z,server,nowait
>>>>>>>>>>>>>>>>>>>>>>> -netdev tap,id=hn0
>>>>>>>>>>>>>>>>>>>>>>> -traffic-mirrorer netdev=hn0,id=t0,indev=comparer0,outdev=mirrorer0
>>>>>>>>>>>>>>>>>>>>>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1
>>>>>>>>>>>>>>>>>>> Why mirrorer has indev?
>>>>>>>>>>>>>>> As I said in the previous mails. I would like to decouple packet
>>>>>>>>>>>>>>> comparing from netfilter. You've already done most of this since the
>>>>>>>>>>>>>>> comparing is done in an independent thread. So the indev here is to
>>>>>>>>>>>>>>> mirror the packet sent by guest to the packet comparing thread.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I think we can use traffic-redirector to do it.
>>>>>>>>>>>>>>>>>>> The command line is:
>>>>>>>>>>>>>>>>>>> -netdev tap,id=hn0
>>>>>>>>>>>>>>>>>>> -object traffic-mirrorer,id=f0,netdev=hn0,queue=tx,outdev=mirrorer0
>>>>>>>>>>>>>>>>>>> -object traffic-redirector,id=f1,netdev=hn0,queue=rx,outdev=comparer0
>>>>>>>>>>>>>>>>>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1,netdev=hn0
>>>>>>>>>>>>>>>>>>> In the comparer thread, we can use qemu_net_queue_send_iov() to send
>>>>>>>>>>>>>>>>>>> out the packet.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Also, we can merge the socketdev comparer1 and mirrorer0.
>>>>>>>>>>>>>>> It depends on whether or not packet comparing was done in a net filter
>>>>>>>>>>>>>>> (which I prefer not).
>>>>>>>>>>> I mean that: packet comparing is done in a thread, not a net filter.
>>>>>>>>>>> The flow of the packet sent from guest:
>>>>>>>>>>> 1. traffic-redirector: we will redirect the packet to comparer0, the next
>>>>>>>>>>> filter will never see it.
>>>>>>>>>>> 2. comparing thread: read it from socket chardev comparer0
>>>>>>>>>>> 3. call qemu_net_queue_send_iov() to send it back to the netdev.
>>>>>>>>> Ok, looks like I miss something.
>>>>>>>>>
>>>>>>>>> My suggestion tries best to let the packet comparing not tie to filter
>>>>>>>>> or netdev. But your suggestion still need it to be coupled with a
>>>>>>>>> netdev. Any advantages of doing this (or is there a reason that packet
>>>>>>>>> must be sent to netdev after doing comparing?). If not, why not just
>>>>>>> Yes, the packet must be sent to netdev after doing comparing. If both
>>>>>>> the primary packet and secondary packet are the same (contain the same
>>>>>>> application level data), we will drop the secondary packet, and send the
>>>>>>> primary packet to the netdev. Otherwise, we will sync the state.
>>>>> And drop primary packet also here?
>>> No, the primary packet must be sent back to the netdev, so the client can receive
>>> the response.
>>>
>>> For example:
>>> 1. guest has a ftp server
>>> 2. we connect to the ftp server via the network
>>> 3. both primary guest and secondary guest receive this request
>>> 4. both primary guest and secondary guest ack it
>>> 5. we compare these two ack packets in the comparing thread
>>> 6. it is the same (the seqno is different, but it is not important, we can modify it in
>>> colo-rewriter). So we drop the secondary packets, and send back the primary packet
>>> to netdev
>>> 7. The primary ack packet is sent to the ftp client via netdev.
>>>
>>> The ftp client only cares about the received packet. So if the packets from primary
>>> and secondary guest contain the same data, we can say they are in the "same" state.
>>>
>>> Thanks
>>> Wen Congyang
>>>
>> Thanks for the example. But still don't get why it must be done before
>> comparing consider it will always be sent regardless the result of
>> comparing?
> Our goal is that: the connection is OK after failover, and the user doesn't know one of
> the hosts crashed.
>
> If it is sent out regardless of the result of comparing, and the primary host crashes, the connection
> may be corrupted after failover. For example: the packet from primary and secondary host
> contains different host, and we send the primary packet before comparing. The primary host
> crashes before comparing these two packets. After failover, the connection may be reset or
> the client doesn't receive the correct data, or some unexpected problem occurs.
>
> Another example (tcp):
> 1. primary guest acks 100, and secondary guest only acks 95 (some packet is lost in the guest)
> 2. client doesn't resend the lost packet
> 3. the connection will be recovered after the next checkpoint
> If we do failover before the next checkpoint, there is no way to recover this connection.
>
> If we send out the packet after comparing, we can assume that the client always receives the
> same data.

Thanks. I think I get the point. So if there's a difference, the primary
packet will only be sent after a checkpoint, and we cannot assume the
checkpoint itself is reliable.

Back to the filter design. We'd better still decouple packet comparing
from the netdev. Maybe a little more tweaking of what you've suggested:

-netdev tap,id=hn0
-object traffic-mirrorer,id=f0,netdev=hn0,queue=tx,outdev=mirrorer0
-object traffic-redirector,id=f1,netdev=hn0,queue=rx,outdev=comparer0,indev=comparer2
-colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1,outdev=comparer2

Just add one more socket for the comparer to send the primary packet on,
and let f1 redirect its output to the netdev?

>
> Thanks
> Wen Congyang
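
To make the release rule discussed above concrete (the primary packet is held until it has been compared, and a mismatch forces a checkpoint before it goes out), here is a minimal Python sketch. It is an illustration only, not QEMU code: PacketComparer, send_primary and checkpoint are hypothetical names, and plain byte equality stands in for the real comparison, which would ignore differences such as the TCP seqno.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PacketComparer:
    """Hypothetical stand-in for the comparing thread; not QEMU code."""
    send_primary: Callable[[bytes], None]  # assumed: writes to the comparer2 socket
    checkpoint: Callable[[], None]         # assumed: triggers a checkpoint

    def on_pair(self, primary: bytes, secondary: bytes) -> str:
        # The primary packet is held until this point; it is never
        # released before the comparison has happened.
        if primary == secondary:
            self.send_primary(primary)  # the secondary copy is simply dropped
            return "released"
        # Mismatch: sync state first, only then let the primary packet out.
        self.checkpoint()
        self.send_primary(primary)
        return "released-after-checkpoint"


# Record what the comparer does for a matching and a mismatching pair.
events = []
comparer = PacketComparer(
    send_primary=lambda pkt: events.append(("send", pkt)),
    checkpoint=lambda: events.append(("checkpoint",)),
)
comparer.on_pair(b"ack 100", b"ack 100")
comparer.on_pair(b"ack 100", b"ack 95")
```

With the extra outdev socket, send_primary would be the only path by which a guest packet reaches the netdev, so the comparer itself needs no netdev reference.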