From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55404) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bGKIQ-0008Kr-5D for qemu-devel@nongnu.org; Fri, 24 Jun 2016 02:08:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bGKIN-0000oA-QG for qemu-devel@nongnu.org; Fri, 24 Jun 2016 02:08:52 -0400 Received: from mx1.redhat.com ([209.132.183.28]:53249) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bGKIN-0000o6-I4 for qemu-devel@nongnu.org; Fri, 24 Jun 2016 02:08:51 -0400 References: <1465902912-27527-1-git-send-email-zhangchen.fnst@cn.fujitsu.com> <1465902912-27527-4-git-send-email-zhangchen.fnst@cn.fujitsu.com> <57678CDA.7020700@redhat.com> <20160620121402.GC2891@work-vm> <576A020E.8040804@cn.fujitsu.com> <576A3174.2010905@redhat.com> <576BBE8D.1030009@cn.fujitsu.com> From: Jason Wang Message-ID: <576CCE68.90309@redhat.com> Date: Fri, 24 Jun 2016 14:08:40 +0800 MIME-Version: 1.0 In-Reply-To: <576BBE8D.1030009@cn.fujitsu.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [RFC PATCH 3/3] filter-rewriter: rewrite tcp packet to keep secondary connection List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Zhang Chen , "Dr. David Alan Gilbert" Cc: Yang Hongyang , "eddie . dong" , qemu devel , Li Zhijian , zhanghailiang On 2016=E5=B9=B406=E6=9C=8823=E6=97=A5 18:48, Zhang Chen wrote: > > > On 06/22/2016 02:34 PM, Jason Wang wrote: >> >> >> On 2016=E5=B9=B406=E6=9C=8822=E6=97=A5 11:12, Zhang Chen wrote: >>> >>> >>> On 06/20/2016 08:14 PM, Dr. David Alan Gilbert wrote: >>>> * Jason Wang (jasowang@redhat.com) wrote: >>>>> >>>>> On 2016=E5=B9=B406=E6=9C=8814=E6=97=A5 19:15, Zhang Chen wrote: >>>>>> We will rewrite tcp packet secondary received and sent. >>>>> More verbose please. E.g which fields were rewrote and why. >>> >>> OK. >>> >>>>>> Signed-off-by: Zhang Chen >>>>>> Signed-off-by: Li Zhijian >>>>>> Signed-off-by: Wen Congyang >>>>>> --- >>>>>> net/filter-rewriter.c | 94=20 >>>>>> +++++++++++++++++++++++++++++++++++++++++++++++++-- >>>>>> trace-events | 3 ++ >>>>>> 2 files changed, 95 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c >>>>>> index 12f88c5..86a2f53 100644 >>>>>> --- a/net/filter-rewriter.c >>>>>> +++ b/net/filter-rewriter.c >>>>>> @@ -21,6 +21,7 @@ >>>>>> #include "qemu/main-loop.h" >>>>>> #include "qemu/iov.h" >>>>>> #include "net/checksum.h" >>>>>> +#include "trace.h" >>>>>> #define FILTER_COLO_REWRITER(obj) \ >>>>>> OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER) >>>>>> @@ -64,6 +65,75 @@ static int is_tcp_packet(Packet *pkt) >>>>>> } >>>>>> } >>>>>> +static int handle_primary_tcp_pkt(NetFilterState *nf, >>>>>> + Connection *conn, >>>>>> + Packet *pkt) >>>>>> +{ >>>>>> + struct tcphdr *tcp_pkt; >>>>>> + >>>>>> + tcp_pkt =3D (struct tcphdr *)pkt->transport_layer; >>>>>> + >>>>>> + if (trace_event_get_state(TRACE_COLO_FILTER_REWRITER_DEBUG)) = { >>>>> Why not use tracepoints directly? >>>> Because trace can't cope with you having to do an allocation/free. >>>> >>>>>> + char *sdebug, *ddebug; >>>>>> + sdebug =3D strdup(inet_ntoa(pkt->ip->ip_src)); >>>>>> + ddebug =3D strdup(inet_ntoa(pkt->ip->ip_dst)); >>>>>> + fprintf(stderr, "%s: src/dst: %s/%s p: seq/ack=3D%u/%u" >>>>>> + " flags=3D%x\n", __func__, sdebug, ddebug, >>>>>> + ntohl(tcp_pkt->th_seq), ntohl(tcp_pkt->th_ack), >>>>>> + tcp_pkt->th_flags); >>>> However, this should use the trace_ call to write the result even=20 >>>> if it's >>>> using trace_event_get_state to switch the whole block on/off. >>> >>> I will fix it in next version. >>> >>>> >>>>>> + g_free(sdebug); >>>>>> + g_free(ddebug); >>>>>> + } >>>>>> + >>>>>> + if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) =3D=3D TH_ACK)) = { >>>>>> + /* save primary colo tcp packet seq */ >>>>>> + conn->primary_seq =3D ntohl(tcp_pkt->th_ack) - 1; >>>>> Looks like primary_seq will only be updated during handshake, I=20 >>>>> wonder how >>>>> this works. >>> >>> OK. >>> We assume that colo guest is a tcp server. >>> >>> Firstly, client start a tcp handshake. the packet's seq=3Dclient_seq, >>> ack=3D0,flag=3DSYN. COLO primary guest get this pkt and=20 >>> mirror(filter-mirror) >>> to secondary guest, secondary get it use filter-redirector. >>> Then,primary guest response=20 >>> pkt(seq=3Dprimary_seq,ack=3Dclient_seq+1,flag=3DACK|SYN). >>> secondary guest response=20 >>> pkt(seq=3Dsecondary_seq,ack=3Dclient_seq+1,flag=3DACK|SYN). >>> In here,we use filter-rewriter save the secondary_seq to it's tcp=20 >>> connection. >>> Finally handshake,client send=20 >>> pkt(seq=3Dclient_seq+1,ack=3Dprimary_seq+1,flag=3DACK). >>> Here,filter-rewriter can get primary_seq, and rewrite ack from=20 >>> primary_seq+1 >>> to secondary_seq+1, recalculate checksum. So the secondary tcp=20 >>> connection >>> kept good. >>> >>> When we send/recv packet. >>> client send=20 >>> pkt(seq=3Dclient_seq+1+data_len,ack=3Dprimary_seq+1,flag=3DACK|PSH). >>> filter-rewriter rewrite ack and send to secondary guest. >> >> If I read your code correctly, secondary_seq will only be updated=20 >> during handshake. So the ack seq will always be same for each packet=20 >> received by secondary? > > Yes. I don't know why kernel do this. But=E3=80=80I dump the packet hex= found=20 > that, > the ack packet flag=3DACK means only ack enabled.and the seq will affec= t=20 > tcp checksum > make connection failed. > Not sure I get your meaning, but basically the code here should not have=20 any assumptions on guest behaviors. > >> >>> primary guest response=20 >>> pkt(seq=3Dprimary_seq+1,ack=3Dclient_seq+1+data_len,flag=3DACK) >>> secondary guest response=20 >>> pkt(seq=3Dsecondary_seq+1,ack=3Dclient_seq+1+data_len,flag=3DACK) >> >> Is ACK a must here? > > Yes. > Looks not, e.g what happens if guest does not use piggybacking acks?