From: Jason Wang <jasowang@redhat.com>
To: Wen Congyang <wency@cn.fujitsu.com>,
Zhang Chen <zhangchen.fnst@cn.fujitsu.com>,
qemu devel <qemu-devel@nongnu.org>
Cc: zhanghailiang <zhang.zhanghailiang@huawei.com>,
Li Zhijian <lizhijian@cn.fujitsu.com>,
Gui jianfeng <guijianfeng@cn.fujitsu.com>,
"eddie.dong" <eddie.dong@intel.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
Huang peng <peter.huangpeng@huawei.com>,
Gong lei <arei.gonglei@huawei.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
jan.kiszka@siemens.com,
Yang Hongyang <hongyang.yang@easystack.cn>
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter
Date: Fri, 22 Jan 2016 13:41:38 +0800
Message-ID: <56A1C112.3060402@redhat.com>
In-Reply-To: <56A1A1CA.8020008@cn.fujitsu.com>
On 01/22/2016 11:28 AM, Wen Congyang wrote:
> On 01/22/2016 11:15 AM, Jason Wang wrote:
>>
>> On 01/20/2016 06:30 PM, Wen Congyang wrote:
>>> On 01/20/2016 06:19 PM, Jason Wang wrote:
>>>>>
>>>>> On 01/20/2016 06:01 PM, Wen Congyang wrote:
>>>>>>> On 01/20/2016 02:54 PM, Jason Wang wrote:
>>>>>>>>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Two main comments/suggestions:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - TCP analysis is missing from the current version. Could you point me
>>>>>>>>>>>>> to a git tree (or another version of the RFC) for a better understanding
>>>>>>>>>>>>> of the design? (Just a skeleton for TCP should be sufficient to discuss.)
>>>>>>>>>>>>> - I prefer to make the code as reusable as possible, so it's better to
>>>>>>>>>>>>> split/decouple the reusable parts from the rest of the code. A rough idea is:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) Decouple the packet comparing from the netfilter. You've achieved
>>>>>>>>>>>>> this 99% since the work is already done in a thread. Just let the thread
>>>>>>>>>>>>> poll sockets directly; then the comparing can be reused by other kinds
>>>>>>>>>>>>> of dataplane.
>>>>>>>>>>>>> 2) Implement the traffic mirror/redirector as a filter.
>>>>>>>>>>>>> 3) Implement TCP seq rewriting as a filter.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then, in the primary node, you need just a traffic mirror, which does:
>>>>>>>>>>>>> - mirror ingress traffic to the secondary node
>>>>>>>>>>>>> - mirror egress traffic to the packet comparing thread
>>>>>>>>>>>>>
>>>>>>>>>>>>> And in the secondary node, you need two filters:
>>>>>>>>>>>>> - A TCP seq rewriter which adjusts the TCP sequence number.
>>>>>>>>>>>>> - A traffic redirector which redirects packets from a socket as ingress
>>>>>>>>>>>>> traffic, and redirects egress traffic to the socket which could be
>>>>>>>>>>>>> polled by the remote packet comparing thread.
>>>>>>>>>>>>> Thoughts?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> zhangchen
>>>>>>>>>>> Hi, Jason.
>>>>>>>>>>> We have considered your suggestion to split/decouple
>>>>>>>>>>> the reusable parts from the code.
>>>>>>>>>>> Because filter plugins are traversed one by one in order,
>>>>>>>>>>> we will split colo-proxy into three filters on each side.
>>>>>>>>>>>
>>>>>>>>>>> But in this plan, primary and secondary both have a socket
>>>>>>>>>>> server, so startup is a problem.
>>>>>>>>> I believe this issue could be solved by reusing socket chardev.
>>>>>>>>>
>>>>>>>>>>> [ASCII diagram garbled by line wrapping; the recoverable content follows.]
>>>>>>>>>>>
>>>>>>>>>>> Primary qemu (guest on top, tap below), netfilter in filter execute order:
>>>>>>>>>>> mirror client (tx) --> redirect server (rx) --> compare (rx)
>>>>>>>>>>>
>>>>>>>>>>> Secondary qemu (guest on top), netfilter in filter execute order:
>>>>>>>>>>> mirror server (tx) --> TCP rewriter: adjust ack / adjust seq (all) --> redirect client (rx)
>>>>>>>>>>>
>>>>>>>>>>> Links between the two nodes:
>>>>>>>>>>> primary mirror client --> secondary mirror server
>>>>>>>>>>> secondary redirect client --> primary redirect server
>>>>>>>>>>>
>>>>>>>>>>> tap: "guest receive" goes from tap up to the guest, "guest send" goes from the guest down to tap
>>>>>>>>>>>
>>>>>>>>>>> NOTE: filter direction is rx/tx/all
>>>>>>>>>>> rx: receive packets sent to the netdev
>>>>>>>>>>> tx: receive packets sent by the netdev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>> I would still like to decouple the comparer from netfilter. It has two obvious
>>>>>>>>> advantages:
>>>>>>>>>
>>>>>>>>> - it can be reused by other dataplanes (e.g. vhost)
>>>>>>>>> - the secondary redirector could redirect rx to the comparer on the primary node
>>>>>>>>> directly, which simplifies the design.
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> guest recv packet route
>>>>>>>>>>>
>>>>>>>>>>> primary
>>>>>>>>>>> tap --> mirror client filter
>>>>>>>>>>> mirror client will send the packet to the guest and, at the
>>>>>>>>>>> same time, copy and forward the packet to the secondary
>>>>>>>>>>> mirror server.
>>>>>>>>>>>
>>>>>>>>>>> secondary
>>>>>>>>>>> mirror server filter --> TCP rewriter
>>>>>>>>>>> if the received packet is a TCP packet, we will adjust ack
>>>>>>>>>>> and update the TCP checksum, then send it to the secondary
>>>>>>>>>>> guest. Otherwise we send it directly to the guest.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> guest send packet route
>>>>>>>>>>>
>>>>>>>>>>> primary
>>>>>>>>>>> guest --> redirect server filter
>>>>>>>>>>> redirect server filter receives the primary guest packet
>>>>>>>>>>> but does nothing, just passes it to the next filter.
>>>>>>>>>>>
>>>>>>>>>>> redirect server filter --> compare filter
>>>>>>>>>>> compare filter receives the primary guest packet, then
>>>>>>>>>>> waits for the secondary's redirected packet to compare with it.
>>>>>>>>>>> If the packets are the same, send the primary packet and clear the secondary
>>>>>>>>>>> packet; otherwise send the primary packet and do a
>>>>>>>>>>> checkpoint.
>>>>>>>>>>>
>>>>>>>>>>> secondary
>>>>>>>>>>> guest --> TCP rewriter filter
>>>>>>>>>>> if the packet is a TCP packet, we will adjust seq
>>>>>>>>>>> and update the TCP checksum, then send it to the
>>>>>>>>>>> redirect client filter. Otherwise we send it directly to the
>>>>>>>>>>> redirect client filter.
>>>>>>>>>>>
>>>>>>>>>>> redirect client filter --> redirect server filter
>>>>>>>>>>> forward the packet to the primary
>>>>>>>>>>>
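The "adjust seq/ack and update TCP checksum" step in the routes above can be illustrated with a minimal Python sketch. It is not taken from the patch series. The function names and the `seq_delta`/`ack_delta` parameters are hypothetical, and how the rewriter actually derives the offset between the primary and secondary sequence spaces is not described in this thread. The sketch assumes IPv4 and recomputes the checksum over the whole segment rather than updating it incrementally.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum, as used by the Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """TCP checksum over the IPv4 pseudo-header plus the TCP segment."""
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF


def rewrite_tcp(src_ip: bytes, dst_ip: bytes, segment: bytes,
                seq_delta: int = 0, ack_delta: int = 0) -> bytes:
    """Shift seq/ack by the given (hypothetical) deltas and fix the checksum."""
    seg = bytearray(segment)
    seq, ack = struct.unpack_from("!II", seg, 4)  # seq at offset 4, ack at 8
    struct.pack_into("!II", seg, 4,
                     (seq + seq_delta) & 0xFFFFFFFF,  # wrap modulo 2**32
                     (ack + ack_delta) & 0xFFFFFFFF)
    struct.pack_into("!H", seg, 16, 0)  # zero the checksum field first
    struct.pack_into("!H", seg, 16, tcp_checksum(src_ip, dst_ip, bytes(seg)))
    return bytes(seg)
```

In this reading, the secondary's guest-send path would call `rewrite_tcp` with a `seq_delta`, and its guest-receive path with the opposite `ack_delta`; that pairing is an assumption made for the sketch.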
>>>>>>>>>>> In the failover scenario (primary is down), the TCP rewriter will keep
>>>>>>>>>>> servicing the TCP connections which were established after the last checkpoint.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> How about this plan?
>>>>>>>>> Sounds good.
>>>>>>>>>
>>>>>>>>> And there's indeed no need to distinguish client from server when reusing the socket
>>>>>>>>> chardev. E.g.:
>>>>>>>>>
>>>>>>>>> In primary node:
>>>>>>>>>
>>>>>>>>> ...
>>>>>>>>> -chardev socket,id=comparer0,host=ip_primary,port=X,server,nowait
>>>>>>>>> -chardev socket,id=comparer1,host=ip_primary,port=Y,server,nowait
>>>>>>>>> -chardev socket,id=mirrorer0,host=ip_primary,port=Z,server,nowait
>>>>>>>>> -netdev tap,id=hn0
>>>>>>>>> -traffic-mirrorer netdev=hn0,id=t0,indev=comparer0,outdev=mirrorer0
>>>>>>>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1
>>>>>>> Why does the mirrorer have an indev?
>>>>>
>>>>> As I said in the previous mails, I would like to decouple packet
>>>>> comparing from netfilter. You've already done most of this since the
>>>>> comparing is done in an independent thread. So the indev here is to
>>>>> mirror the packet sent by guest to the packet comparing thread.
>>>>>
>>>>>>> I think we can use traffic-redirector to do it.
>>>>>>> The command line is:
>>>>>>> -netdev tap,id=hn0
>>>>>>> -object traffic-mirrorer,id=f0,netdev=hn0,queue=tx,outdev=mirrorer0
>>>>>>> -object traffic-redirector,id=f1,netdev=hn0,queue=rx,outdev=comparer0
>>>>>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1,netdev=hn0
>>>>>>> In the comparer thread, we can use qemu_net_queue_send_iov() to send
>>>>>>> out the packet.
>>>>>>>
>>>>>>> Also, we can merge the socketdev comparer1 and mirrorer0.
>>>>> It depends on whether or not packet comparing is done in a net filter
>>>>> (which I would prefer it not be).
>>> I mean that packet comparing is done in a thread, not a net filter.
>>> The flow of a packet sent from the guest:
>>> 1. traffic-redirector: we redirect the packet to comparer0, so the next
>>> filter will never see it.
>>> 2. comparing thread: read it from socket chardev comparer0
>>> 3. call qemu_net_queue_send_iov() to send it back to the netdev.
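The comparing step in this flow can be sketched in Python as follows. It follows the rule stated earlier in the thread (same: send the primary packet and drop the secondary copy; different: send the primary packet and do a checkpoint). `Comparer`, `Action` and `feed` are hypothetical names. The sketch compares raw payloads in arrival order and ignores per-connection matching, which a real comparer would need.

```python
from collections import deque
from enum import Enum


class Action(Enum):
    RELEASE = "send primary packet, drop secondary copy"
    CHECKPOINT = "send primary packet, request a checkpoint"


class Comparer:
    """Pairs primary and secondary packets in arrival order (a simplification)."""

    def __init__(self):
        self.primary = deque()
        self.secondary = deque()

    def feed(self, side: str, payload: bytes):
        """Queue one packet from 'primary' or 'secondary'; return new decisions."""
        if side == "primary":
            self.primary.append(payload)
        else:
            self.secondary.append(payload)
        decisions = []
        # A decision is only possible once both sides have a packet waiting.
        while self.primary and self.secondary:
            pri = self.primary.popleft()
            sec = self.secondary.popleft()
            action = Action.RELEASE if pri == sec else Action.CHECKPOINT
            decisions.append((action, pri))
        return decisions
```

Each returned pair carries the primary packet that step 3 would hand back to the netdev. How the mismatch case should treat the primary packet is still being discussed in this thread, so the `CHECKPOINT` branch only mirrors the earlier description.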
>> OK, it looks like I missed something.
>>
>> My suggestion tries its best to keep packet comparing from being tied to a filter
>> or netdev. But your suggestion still needs it to be coupled with a
>> netdev. Are there any advantages to doing this (or is there a reason that the packet
>> must be sent to the netdev after comparing)? If not, why not just
> Yes, the packet must be sent to the netdev after comparing. If
> the primary packet and secondary packet are the same (contain the same
> application-level data), we will drop the secondary packet and send the
> primary packet to the netdev. Otherwise, we will sync the state.
And is the primary packet dropped as well in that case?
>
>> mirror (duplicate the packet and forward it to a chardev, and pass the
>> original packet to the next filter or netdev)? And doing
> We cannot send the packet to the netdev before comparing. We need to keep
> the connection after failover.
>
> Thanks
> Wen Congyang
>
>> qemu_net_queue_send_iov() to a netdev in another thread may need some
>> synchronization with iothread.
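The synchronization concern above can be illustrated with a generic handoff pattern: the comparing thread never touches the netdev itself, it only posts packets to a thread-safe queue that the owning loop drains. This Python sketch uses the standard `queue` and `threading` modules and does not model QEMU's own main-loop primitives. `run_handoff` is a hypothetical name, and the `sent` list stands in for the actual send to the netdev.

```python
import queue
import threading


def run_handoff(packets):
    """Pass packets from a worker thread to the owning loop via a queue."""
    handoff = queue.Queue()
    sent = []  # stands in for "sent to the netdev" in this sketch

    def comparer_thread():
        for pkt in packets:
            handoff.put(pkt)  # the worker only enqueues, never sends
        handoff.put(None)  # sentinel: no more packets

    worker = threading.Thread(target=comparer_thread)
    worker.start()
    # The owning loop is the only place where packets are "sent".
    while True:
        pkt = handoff.get()
        if pkt is None:
            break
        sent.append(pkt)
    worker.join()
    return sent
```

The design choice shown here is single ownership: only one thread ever performs the send, so the send path itself needs no extra locking.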
>>
>>> Thanks
>>> Wen Congyang
>>>
>>
>>
>> .
>>
>
>