qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter
@ 2015-11-27 12:27 Zhang Chen
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 1/9] Init colo-proxy object " Zhang Chen
                   ` (9 more replies)
  0 siblings, 10 replies; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang


Hi,all

This patch add an colo-proxy object, COLO-Proxy is a part of COLO,
based on qemu netfilter and it's a plugin for qemu netfilter. the function
keep Secondary VM connect normal to Primary VM and compare packets 
sent by PVM to sent by SVM.if the packet difference,notify COLO do
checkpoint and send all primary packet has queued.

You can also get the series from: 

https://github.com/zhangckid/qemu/tree/colo-proxy-V1

Usage:

primary:
-netdev tap,id=bn0 -device e1000,netdev=bn0
-object colo-proxy,id=f0,netdev=bn0,queue=all,mode=primary,addr=ip:port

secondary:
-netdev tap,id=bn0 -device e1000,netdev=bn0
-object colo-proxy,id=f0,netdev=bn0,queue=all,mode=secondary,addr=ip:port

NOTE:
queue must set "all". See enum NetFilterDirection for detail.
colo-proxy need queue all packets
colo-proxy V1 just a demo of colo proxy,not pass test with colo upstream


## Background

COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service)
project is a high availability solution. Both Primary VM (PVM) and Secondary VM
(SVM) run in parallel. They receive the same request from client, and generate
responses in parallel too. If the response packets from PVM and SVM are
identical, they are released immediately. Otherwise, a VM checkpoint (on 
demand)is conducted.

Paper:
http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0

COLO on Xen:
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

COLO on Qemu/KVM:
http://wiki.qemu.org/Features/COLO

By the needs of capturing response packets from PVM and SVM and finding out
whether they are identical, we introduce a new module to qemu networking 
called colo-proxy.


v1:
 initial patch.



zhangchen (9):
  Init colo-proxy object based on netfilter
  jhash: add linux kernel jhashtable in qemu
  colo-proxy: add colo-proxy framework
  colo-proxy: add colo-proxy setup work
  net/colo-proxy: add colo packet handler
  net/colo-proxy: add packet forward function
  net/colo-proxy: add packet enqueue and handle function
  net/colo-proxy: enqueue primary and secondary packet
  net/colo-proxy: add packet compare and notify checkpoint

 include/qemu/jhash.h |  52 ++++
 net/Makefile.objs    |   1 +
 net/colo-proxy.c     | 745 +++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo-proxy.h     | 124 +++++++++
 qemu-options.hx      |   4 +
 vl.c                 |   3 +-
 6 files changed, 928 insertions(+), 1 deletion(-)
 create mode 100644 include/qemu/jhash.h
 create mode 100644 net/colo-proxy.c
 create mode 100644 net/colo-proxy.h

-- 
1.9.1

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC PATCH 1/9] Init colo-proxy object based on netfilter
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
@ 2015-11-27 12:27 ` Zhang Chen
  2015-11-30  2:50   ` Wen Congyang
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 2/9] jhash: add linux kernel jhashtable in qemu Zhang Chen
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang

From: zhangchen <zhangchen.fnst@cn.fujitsu.com>

add colo-proxy in vl.c and qemu-options.hx

Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
---
 qemu-options.hx | 4 ++++
 vl.c            | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 949db7f..5e6f1e3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3666,6 +3666,10 @@ queue @var{all|rx|tx} is an option that can be applied to any netfilter.
 @option{tx}: the filter is attached to the transmit queue of the netdev,
              where it will receive packets sent by the netdev.
 
+@item -object colo-proxy,id=@var{id},netdev=@var{netdevid},port=@var{t},addr=@var{ip:port},mode=@var{primary|secondary}[,queue=@var{all|rx|tx}]
+
+colo-proxy
+
 @item -object filter-dump,id=@var{id},netdev=@var{dev},file=@var{filename}][,maxlen=@var{len}]
 
 Dump the network traffic on netdev @var{dev} to the file specified by
diff --git a/vl.c b/vl.c
index f5f7c3f..9037743 100644
--- a/vl.c
+++ b/vl.c
@@ -2774,7 +2774,8 @@ static bool object_create_initial(const char *type)
      * they depend on netdevs already existing
      */
     if (g_str_equal(type, "filter-buffer") ||
-        g_str_equal(type, "filter-dump")) {
+        g_str_equal(type, "filter-dump") ||
+        g_str_equal(type, "colo-proxy")) {
         return false;
     }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC PATCH 2/9] jhash: add linux kernel jhashtable in qemu
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 1/9] Init colo-proxy object " Zhang Chen
@ 2015-11-27 12:27 ` Zhang Chen
  2015-12-01 11:23   ` Dr. David Alan Gilbert
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework Zhang Chen
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang

From: zhangchen <zhangchen.fnst@cn.fujitsu.com>

This used by colo-proxy to save and lookup
connection info

Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
---
 include/qemu/jhash.h | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)
 create mode 100644 include/qemu/jhash.h

diff --git a/include/qemu/jhash.h b/include/qemu/jhash.h
new file mode 100644
index 0000000..f6cc7b3
--- /dev/null
+++ b/include/qemu/jhash.h
@@ -0,0 +1,52 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_JHASH_H__
+#define QEMU_JHASH_H__
+
+/*
+ * hashtable relation copy from linux kernel jhash
+ */
+static inline uint32_t rol32(uint32_t word, unsigned int shift)
+{
+    return (word << shift) | (word >> (32 - shift));
+}
+
+/* __jhash_mix -- mix 3 32-bit values reversibly. */
+#define __jhash_mix(a, b, c)                \
+{                                           \
+    a -= c;  a ^= rol32(c, 4);  c += b;     \
+    b -= a;  b ^= rol32(a, 6);  a += c;     \
+    c -= b;  c ^= rol32(b, 8);  b += a;     \
+    a -= c;  a ^= rol32(c, 16); c += b;     \
+    b -= a;  b ^= rol32(a, 19); a += c;     \
+    c -= b;  c ^= rol32(b, 4);  b += a;     \
+}
+
+/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
+#define __jhash_final(a, b, c)  \
+{                               \
+    c ^= b; c -= rol32(b, 14);  \
+    a ^= c; a -= rol32(c, 11);  \
+    b ^= a; b -= rol32(a, 25);  \
+    c ^= b; c -= rol32(b, 16);  \
+    a ^= c; a -= rol32(c, 4);   \
+    b ^= a; b -= rol32(a, 14);  \
+    c ^= b; c -= rol32(b, 24);  \
+}
+
+/* An arbitrary initial parameter */
+#define JHASH_INITVAL           0xdeadbeef
+
+#endif /* QEMU_JHASH_H__ */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 1/9] Init colo-proxy object " Zhang Chen
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 2/9] jhash: add linux kernel jhashtable in qemu Zhang Chen
@ 2015-11-27 12:27 ` Zhang Chen
  2015-11-28  2:46   ` Hailiang Zhang
  2015-11-30  3:10   ` Wen Congyang
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work Zhang Chen
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang

From: zhangchen <zhangchen.fnst@cn.fujitsu.com>

Colo-proxy is a plugin of qemu netfilter
like filter-buffer and dump

Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
---
 net/Makefile.objs |   1 +
 net/colo-proxy.c  | 139 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo-proxy.h  |  63 +++++++++++++++++++++++++
 3 files changed, 203 insertions(+)
 create mode 100644 net/colo-proxy.c
 create mode 100644 net/colo-proxy.h

diff --git a/net/Makefile.objs b/net/Makefile.objs
index 5fa2f97..95670f2 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -15,3 +15,4 @@ common-obj-$(CONFIG_VDE) += vde.o
 common-obj-$(CONFIG_NETMAP) += netmap.o
 common-obj-y += filter.o
 common-obj-y += filter-buffer.o
+common-obj-y += colo-proxy.o
diff --git a/net/colo-proxy.c b/net/colo-proxy.c
new file mode 100644
index 0000000..98c2699
--- /dev/null
+++ b/net/colo-proxy.c
@@ -0,0 +1,139 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "colo-proxy.h"
+
+#define __DEBUG__
+
+#ifdef __DEBUG__
+#define DEBUG(format, ...) printf(format, ##__VA_ARGS__)
+#else
+#define DEBUG(format, ...)
+#endif
+
+
+static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
+                                         NetClientState *sender,
+                                         unsigned flags,
+                                         const struct iovec *iov,
+                                         int iovcnt,
+                                         NetPacketSent *sent_cb)
+{
+    /*
+     * We return size when buffer a packet, the sender will take it as
+     * a already sent packet, so sent_cb should not be called later.
+     *
+     */
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    if (s->colo_mode == COLO_PRIMARY_MODE) {
+         /* colo_proxy_primary_handler */
+    } else {
+         /* colo_proxy_primary_handler */
+    }
+    return iov_size(iov, iovcnt);
+}
+
+static void colo_proxy_cleanup(NetFilterState *nf)
+{
+     /* cleanup */
+}
+
+
+static void colo_proxy_setup(NetFilterState *nf, Error **errp)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    if (!s->addr) {
+        error_setg(errp, "filter colo_proxy needs 'addr' \
+                     property set!");
+        return;
+    }
+
+    if (nf->direction != NET_FILTER_DIRECTION_ALL) {
+        printf("colo need queue all packet,\
+                    please startup colo-proxy with queue=all\n");
+        return;
+    }
+
+    s->sockfd = -1;
+    s->has_failover = false;
+    colo_do_checkpoint = false;
+    g_queue_init(&s->unprocessed_connections);
+
+    if (!strcmp(mode, PRIMARY_MODE)) {
+        s->colo_mode = COLO_PRIMARY_MODE;
+    } else if (!strcmp(mode, SECONDARY_MODE)) {
+        s->colo_mode = COLO_SECONDARY_MODE;
+    } else {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
+                    "primary or secondary");
+        return;
+    }
+}
+
+static void colo_proxy_class_init(ObjectClass *oc, void *data)
+{
+    NetFilterClass *nfc = NETFILTER_CLASS(oc);
+
+    nfc->setup = colo_proxy_setup;
+    nfc->cleanup = colo_proxy_cleanup;
+    nfc->receive_iov = colo_proxy_receive_iov;
+}
+
+static char *colo_proxy_get_mode(Object *obj, Error **errp)
+{
+    return g_strdup(mode);
+}
+
+static void colo_proxy_set_mode(Object *obj, const char *value, Error **errp)
+{
+    g_free(mode);
+    mode = g_strdup(value);
+}
+
+static char *colo_proxy_get_addr(Object *obj, Error **errp)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(obj);
+
+    return g_strdup(s->addr);
+}
+
+static void colo_proxy_set_addr(Object *obj, const char *value, Error **errp)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(obj);
+    g_free(s->addr);
+    s->addr = g_strdup(value);
+}
+
+static void colo_proxy_init(Object *obj)
+{
+    object_property_add_str(obj, "mode", colo_proxy_get_mode,
+                            colo_proxy_set_mode, NULL);
+    object_property_add_str(obj, "addr", colo_proxy_get_addr,
+                            colo_proxy_set_addr, NULL);
+}
+
+static const TypeInfo colo_proxy_info = {
+    .name = TYPE_FILTER_COLO_PROXY,
+    .parent = TYPE_NETFILTER,
+    .class_init = colo_proxy_class_init,
+    .instance_init = colo_proxy_init,
+    .instance_size = sizeof(ColoProxyState),
+};
+
+static void register_types(void)
+{
+    type_register_static(&colo_proxy_info);
+}
+
+type_init(register_types);
diff --git a/net/colo-proxy.h b/net/colo-proxy.h
new file mode 100644
index 0000000..94afbc7
--- /dev/null
+++ b/net/colo-proxy.h
@@ -0,0 +1,63 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+
+#ifndef QEMU_COLO_PROXY_H
+#define QEMU_COLO_PROXY_H
+
+#include "net/filter.h"
+#include "net/queue.h"
+#include "qemu-common.h"
+#include "qemu/iov.h"
+#include "qapi/qmp/qerror.h"
+#include "qapi-visit.h"
+#include "qom/object.h"
+#include "qemu/sockets.h"
+#include "qemu/main-loop.h"
+#include <netinet/if_ether.h>
+#include "qemu/jhash.h"
+#include "qemu/coroutine.h"
+
+#define FILTER_COLO_PROXY(obj) \
+    OBJECT_CHECK(ColoProxyState, (obj), TYPE_FILTER_COLO_PROXY)
+
+#define TYPE_FILTER_COLO_PROXY "colo-proxy"
+#define PRIMARY_MODE "primary"
+#define SECONDARY_MODE "secondary"
+
+typedef enum {
+    COLO_PRIMARY_MODE,               /* primary mode  */
+    COLO_SECONDARY_MODE,             /* secondary mode */
+} mode_type;
+
+typedef struct ColoProxyState {
+    NetFilterState parent_obj;
+    NetQueue *incoming_queue;        /* guest normal net queue */
+    NetFilterDirection direction;    /* packet direction */
+    mode_type colo_mode;             /* colo mode (primary or
+                                      * secondary)
+                                      */
+    char *addr;                       /* primary colo connect addr
+                                      * or secondary server addr
+                                      */
+    int sockfd;                      /* primary client socket fd or
+                                      * secondary server socket fd
+                                      */
+    bool has_failover;               /* colo failover flag */
+    GHashTable *unprocessed_packets; /* hashtable to save connection */
+    GQueue unprocessed_connections;  /* to save unprocessed_connections */
+    Coroutine *co;
+} ColoProxyState;
+
+#endif /* QEMU_COLO_PROXY_H */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
                   ` (2 preceding siblings ...)
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework Zhang Chen
@ 2015-11-27 12:27 ` Zhang Chen
  2015-11-28  3:02   ` Hailiang Zhang
  2015-12-01 15:35   ` Dr. David Alan Gilbert
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 5/9] net/colo-proxy: add colo packet handler Zhang Chen
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang

From: zhangchen <zhangchen.fnst@cn.fujitsu.com>

Secondary setup socket server for colo-forward
primary setup connect to secondary for colo-forward
add data structure will be uesed

Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-proxy.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 net/colo-proxy.h | 61 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 147 insertions(+), 1 deletion(-)

diff --git a/net/colo-proxy.c b/net/colo-proxy.c
index 98c2699..89d9616 100644
--- a/net/colo-proxy.c
+++ b/net/colo-proxy.c
@@ -22,6 +22,8 @@
 #define DEBUG(format, ...)
 #endif
 
+static char *mode;
+static bool colo_do_checkpoint;
 
 static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
                                          NetClientState *sender,
@@ -46,13 +48,84 @@ static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
 
 static void colo_proxy_cleanup(NetFilterState *nf)
 {
-     /* cleanup */
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    close(s->sockfd);
+    s->sockfd = -1;
+    g_free(mode);
+    g_free(s->addr);
 }
 
+static void colo_accept_incoming(ColoProxyState *s)
+{
+    DEBUG("into colo_accept_incoming\n");
+    struct sockaddr_in addr;
+    socklen_t addrlen = sizeof(addr);
+    int acceptsock, err;
+
+    do {
+        acceptsock = qemu_accept(s->sockfd, (struct sockaddr *)&addr, &addrlen);
+        err = socket_error();
+    } while (acceptsock < 0 && err == EINTR);
+    qemu_set_fd_handler(s->sockfd, NULL, NULL, NULL);
+    closesocket(s->sockfd);
+
+    DEBUG("accept colo proxy\n");
+
+    if (acceptsock < 0) {
+        printf("could not accept colo connection (%s)\n",
+                     strerror(err));
+        return;
+    }
+    s->sockfd = acceptsock;
+    /* TODO: handle the packets that primary forward */
+    return;
+}
+
+/* Return 1 on success, or return -1 if failed */
+static ssize_t colo_start_incoming(ColoProxyState *s)
+{
+    int serversock;
+    serversock = inet_listen(s->addr, NULL, 256, SOCK_STREAM, 0, NULL);
+    if (serversock < 0) {
+        g_free(s->addr);
+        return -1;
+    }
+    s->sockfd = serversock;
+    qemu_set_fd_handler(serversock, (IOHandler *)colo_accept_incoming, NULL,
+                        (void *)s);
+    g_free(s->addr);
+    return 1;
+}
+
+/* Return 1 on success, or return -1 if setup failed */
+static ssize_t colo_proxy_primary_setup(NetFilterState *nf)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    int sock;
+    sock = inet_connect(s->addr, NULL);
+    if (sock < 0) {
+        printf("colo proxy connect failed\n");
+        g_free(s->addr);
+        return -1;
+    }
+    DEBUG("colo proxy connect success\n");
+    s->sockfd = sock;
+   /* TODO: handle the packets that secondary forward */
+    g_free(s->addr);
+    return 1;
+}
+
+/* Return 1 on success, or return -1 if setup failed */
+static ssize_t colo_proxy_secondary_setup(NetFilterState *nf)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    return colo_start_incoming(s);
+}
 
 static void colo_proxy_setup(NetFilterState *nf, Error **errp)
 {
     ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    ssize_t ret = 0;
     if (!s->addr) {
         error_setg(errp, "filter colo_proxy needs 'addr' \
                      property set!");
@@ -68,17 +141,29 @@ static void colo_proxy_setup(NetFilterState *nf, Error **errp)
     s->sockfd = -1;
     s->has_failover = false;
     colo_do_checkpoint = false;
+    s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
+    s->unprocessed_packets = g_hash_table_new_full(connection_key_hash,
+                                                       connection_key_equal,
+                                                       g_free,
+                                                       connection_destroy);
     g_queue_init(&s->unprocessed_connections);
 
     if (!strcmp(mode, PRIMARY_MODE)) {
         s->colo_mode = COLO_PRIMARY_MODE;
+        ret = colo_proxy_primary_setup(nf);
     } else if (!strcmp(mode, SECONDARY_MODE)) {
         s->colo_mode = COLO_SECONDARY_MODE;
+        ret = colo_proxy_secondary_setup(nf);
     } else {
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
                     "primary or secondary");
         return;
     }
+    if (ret) {
+        DEBUG("colo_proxy_setup success\n");
+    } else {
+        DEBUG("colo_proxy_setup failed\n");
+    }
 }
 
 static void colo_proxy_class_init(ObjectClass *oc, void *data)
diff --git a/net/colo-proxy.h b/net/colo-proxy.h
index 94afbc7..f77db2f 100644
--- a/net/colo-proxy.h
+++ b/net/colo-proxy.h
@@ -60,4 +60,65 @@ typedef struct ColoProxyState {
     Coroutine *co;
 } ColoProxyState;
 
+struct ip {
+#ifdef HOST_WORDS_BIGENDIAN
+    uint8_t  ip_v:4,                 /* version */
+             ip_hl:4;                /* header length */
+#else
+    uint8_t  ip_hl:4,                /* header length */
+             ip_v:4;                 /* version */
+#endif
+    uint8_t  ip_tos;                 /* type of service */
+    uint16_t ip_len;                 /* total length */
+    uint16_t ip_id;                  /* identification */
+    uint16_t ip_off;                 /* fragment offset field */
+#define    IP_DF 0x4000              /* don't fragment flag */
+#define    IP_MF 0x2000              /* more fragments flag */
+#define    IP_OFFMASK 0x1fff
+/* mask for fragmenting bits */
+    uint8_t  ip_ttl;                 /* time to live */
+    uint8_t  ip_p;                   /* protocol */
+    uint16_t ip_sum;                 /* checksum */
+    uint32_t ip_src, ip_dst;         /* source and dest address */
+} QEMU_PACKED;
+
+typedef struct Packet {
+    void *data;
+    union {
+        uint8_t *network_layer;
+        struct ip *ip;
+    };
+    uint8_t *transport_layer;
+    int size;
+    ColoProxyState *s;
+    bool should_be_sent;
+    NetClientState *sender;
+} Packet;
+
+typedef struct Connection_key {
+    /* (src, dst) must be grouped, in the same way than in IP header */
+    uint32_t src;
+    uint32_t dst;
+    union {
+        uint32_t ports;
+        uint16_t port16[2];
+    };
+    uint8_t ip_proto;
+} QEMU_PACKED Connection_key;
+
+typedef struct Connection {
+    /* connection primary send queue */
+    GQueue primary_list;
+    /* connection secondary send queue */
+    GQueue secondary_list;
+     /* flag to enqueue unprocessed_connections */
+    bool processing;
+} Connection;
+
+typedef enum {
+    PRIMARY_OUTPUT,           /* primary output packet queue */
+    PRIMARY_INPUT,            /* primary input packet queue */
+    SECONDARY_OUTPUT,         /* secondary output packet queue */
+} packet_type;
+
 #endif /* QEMU_COLO_PROXY_H */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC PATCH 5/9] net/colo-proxy: add colo packet handler
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
                   ` (3 preceding siblings ...)
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work Zhang Chen
@ 2015-11-27 12:27 ` Zhang Chen
  2015-11-28  3:17   ` Hailiang Zhang
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 6/9] net/colo-proxy: add packet forward function Zhang Chen
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang

From: zhangchen <zhangchen.fnst@cn.fujitsu.com>

add primary and secondary handler

Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-proxy.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 103 insertions(+), 2 deletions(-)

diff --git a/net/colo-proxy.c b/net/colo-proxy.c
index 89d9616..ece5661 100644
--- a/net/colo-proxy.c
+++ b/net/colo-proxy.c
@@ -25,6 +25,101 @@
 static char *mode;
 static bool colo_do_checkpoint;
 
+/*
+ * colo primary handle host's normal send and
+ * recv packets to primary guest
+ * return:          >= 0      success
+ *                  < 0       failed
+ */
+static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
+                                         NetClientState *sender,
+                                         unsigned flags,
+                                         const struct iovec *iov,
+                                         int iovcnt,
+                                         NetPacketSent *sent_cb)
+{
+    ssize_t ret = 0;
+    int direction;
+
+    if (sender == nf->netdev) {
+        /* This packet is sent by netdev itself */
+        direction = NET_FILTER_DIRECTION_TX;
+    } else {
+        direction = NET_FILTER_DIRECTION_RX;
+    }
+    /*
+     * if packet's direction=rx
+     * enqueue packets to primary queue
+     * and wait secondary queue to compare
+     * if packet's direction=tx
+     * enqueue packets then send packets to
+     * secondary and flush  queued packets
+    */
+
+    if (colo_do_checkpoint) {
+        colo_proxy_do_checkpoint(nf);
+    }
+
+    if (direction == NET_FILTER_DIRECTION_RX) {
+        /* TODO: enqueue_primary_packet */
+    } else {
+        /* TODO: forward packets to another */
+    }
+
+    return ret;
+}
+
+/*
+ * colo secondary handle host's normal send and
+ * recv packets to secondary guest
+ * return:          >= 0      success
+ *                  < 0       failed
+ */
+static ssize_t colo_proxy_secondary_handler(NetFilterState *nf,
+                                         NetClientState *sender,
+                                         unsigned flags,
+                                         const struct iovec *iov,
+                                         int iovcnt,
+                                         NetPacketSent *sent_cb)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    int direction;
+    ssize_t ret = 0;
+
+    if (sender == nf->netdev) {
+        /* This packet is sent by netdev itself */
+        direction = NET_FILTER_DIRECTION_TX;
+    } else {
+        direction = NET_FILTER_DIRECTION_RX;
+    }
+    /*
+     * if packet's direction=rx
+     * enqueue packets and send to
+     * primary QEMU
+     * if packet's direction=tx
+     * record PVM's packet inital seq & adjust
+     * client's ack,send adjusted packets to SVM(next version will be do)
+     */
+
+    if (direction == NET_FILTER_DIRECTION_RX) {
+        if (colo_has_failover(nf)) {
+            qemu_net_queue_send_iov(s->incoming_queue, sender, flags, iov,
+                            iovcnt, NULL);
+            return 1;
+        } else {
+        /* TODO: forward packets to another */
+        }
+
+    } else {
+        if (colo_has_failover(nf)) {
+            qemu_net_queue_send_iov(s->incoming_queue, sender, flags, iov,
+                            iovcnt, NULL);
+        }
+        return 1;
+    }
+    return ret;
+}
+
 static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
                                          NetClientState *sender,
                                          unsigned flags,
@@ -38,10 +133,16 @@ static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
      *
      */
     ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    ssize_t ret = 0;
     if (s->colo_mode == COLO_PRIMARY_MODE) {
-         /* colo_proxy_primary_handler */
+        ret = colo_proxy_primary_handler(nf, sender, flags,
+                    iov, iovcnt, sent_cb);
     } else {
-         /* colo_proxy_primary_handler */
+        ret = colo_proxy_secondary_handler(nf, sender, flags,
+                    iov, iovcnt, sent_cb);
+    }
+    if (ret < 0) {
+        DEBUG("colo_proxy_receive_iov running failed\n");
     }
     return iov_size(iov, iovcnt);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC PATCH 6/9] net/colo-proxy: add packet forward function
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
                   ` (4 preceding siblings ...)
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 5/9] net/colo-proxy: add colo packet handler Zhang Chen
@ 2015-11-27 12:27 ` Zhang Chen
  2015-12-01 15:50   ` Dr. David Alan Gilbert
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function Zhang Chen
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang

From: zhangchen <zhangchen.fnst@cn.fujitsu.com>

The packet recv by primary forward to secondary
The packet send by secondary forward to primary

Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-proxy.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 114 insertions(+), 4 deletions(-)

diff --git a/net/colo-proxy.c b/net/colo-proxy.c
index ece5661..08a852f 100644
--- a/net/colo-proxy.c
+++ b/net/colo-proxy.c
@@ -26,6 +26,110 @@ static char *mode;
 static bool colo_do_checkpoint;
 
 /*
+ * Packets to be sent by colo forward to
+ * another colo
+ * return:          >= 0        success
+ *                  < 0        failed
+ */
+static ssize_t colo_forward2another(NetFilterState *nf,
+                                         NetClientState *sender,
+                                         unsigned flags,
+                                         const struct iovec *iov,
+                                         int iovcnt,
+                                         NetPacketSent *sent_cb,
+                                         mode_type mode)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    ssize_t ret = 0;
+    ssize_t size = 0;
+    struct iovec sizeiov = {
+        .iov_base = &size,
+        .iov_len = 8
+    };
+    size = iov_size(iov, iovcnt);
+    if (!size) {
+        return 0;
+    }
+
+    if (mode == COLO_PRIMARY_MODE) {
+        qemu_net_queue_send_iov(s->incoming_queue, sender, flags,
+                           iov, iovcnt, NULL);
+    }
+    ret = iov_send(s->sockfd, &sizeiov, 8, 0, 8);
+    if (ret < 0) {
+        return ret;
+    }
+    ret = iov_send(s->sockfd, iov, iovcnt, 0, size);
+    return ret;
+}
+
+/*
+ * recv and handle colo secondary
+ * forward packets in colo primary
+ */
+static void colo_proxy_primary_forward_handler(NetFilterState *nf)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    ssize_t len = 0;
+    ssize_t ret = 0;
+    struct iovec sizeiov = {
+        .iov_base = &len,
+        .iov_len = 8
+    };
+    if (s->sockfd < 0) {
+        printf("secondary forward disconnected\n");
+        return;
+    }
+    iov_recv(s->sockfd, &sizeiov, 8, 0, 8);
+    DEBUG("primary_forward_handler recv lensbuf lens=%zu\n", len);
+
+    if (len > 0) {
+        char *recvbuf;
+        recvbuf = g_malloc0(len);
+        struct iovec iov = {
+            .iov_base = recvbuf,
+            .iov_len = len
+        };
+        iov_recv(s->sockfd, &iov, len, 0, len);
+        DEBUG("primary_forward_handler primary recvbuf=%s\n", recvbuf);
+        ret = colo_enqueue_secondary_packet(nf, recvbuf, len);
+        if (ret) {
+            DEBUG("colo_enqueue_secondary_packet succese\n");
+        } else {
+            DEBUG("colo_enqueue_secondary_packet failed\n");
+        }
+        g_free(recvbuf);
+    }
+}
+
+/*
+ * recv and handle colo primary
+ * forward packets in colo secondary
+ */
+static void colo_proxy_secondary_forward_handler(NetFilterState *nf)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    ssize_t len = 0;
+    struct iovec sizeiov = {
+        .iov_base = &len,
+        .iov_len = 8
+    };
+    iov_recv(s->sockfd, &sizeiov, 8, 0, 8);
+    if (len > 0) {
+        char *buf;
+        buf = g_malloc0(len);
+        struct iovec iov = {
+            .iov_base = buf,
+            .iov_len = len
+        };
+        iov_recv(s->sockfd, &iov, len, 0, len);
+        qemu_net_queue_send(s->incoming_queue, nf->netdev,
+                    0, (const uint8_t *)buf, len, NULL);
+        g_free(buf);
+    }
+}
+
+/*
  * colo primary handle host's normal send and
  * recv packets to primary guest
  * return:          >= 0      success
@@ -63,7 +167,8 @@ static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
     if (direction == NET_FILTER_DIRECTION_RX) {
         /* TODO: enqueue_primary_packet */
     } else {
-        /* TODO: forward packets to another */
+        ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
+                    sent_cb, COLO_PRIMARY_MODE);
     }
 
     return ret;
@@ -107,7 +212,8 @@ static ssize_t colo_proxy_secondary_handler(NetFilterState *nf,
                             iovcnt, NULL);
             return 1;
         } else {
-        /* TODO: forward packets to another */
+            ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
+                        sent_cb, COLO_SECONDARY_MODE);
         }
 
     } else {
@@ -178,7 +284,9 @@ static void colo_accept_incoming(ColoProxyState *s)
         return;
     }
     s->sockfd = acceptsock;
-    /* TODO: handle the packets that primary forward */
+    qemu_set_fd_handler(s->sockfd,
+                (IOHandler *)colo_proxy_secondary_forward_handler, NULL,
+                (void *)s);
     return;
 }
 
@@ -211,7 +319,9 @@ static ssize_t colo_proxy_primary_setup(NetFilterState *nf)
     }
     DEBUG("colo proxy connect success\n");
     s->sockfd = sock;
-   /* TODO: handle the packets that secondary forward */
+    qemu_set_fd_handler(s->sockfd,
+                (IOHandler *)colo_proxy_primary_forward_handler,
+                NULL, (void *)s);
     g_free(s->addr);
     return 1;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
                   ` (5 preceding siblings ...)
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 6/9] net/colo-proxy: add packet forward function Zhang Chen
@ 2015-11-27 12:27 ` Zhang Chen
  2015-12-01 16:12   ` Dr. David Alan Gilbert
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 8/9] net/colo-proxy: enqueue primary and secondary packet Zhang Chen
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang

From: zhangchen <zhangchen.fnst@cn.fujitsu.com>

Add common packet handle function and enqueue
packet distinguished connection,then we can
lookup one connection packet to compare

Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-proxy.c | 167 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 166 insertions(+), 1 deletion(-)

diff --git a/net/colo-proxy.c b/net/colo-proxy.c
index 08a852f..a664e6d 100644
--- a/net/colo-proxy.c
+++ b/net/colo-proxy.c
@@ -24,6 +24,170 @@
 
 static char *mode;
 static bool colo_do_checkpoint;
+static void packet_destroy(void *opaque, void *user_data);
+
+static uint32_t connection_key_hash(const void *opaque)
+{
+    const Connection_key *key = opaque;
+    uint32_t a, b, c;
+
+    /* Jenkins hash */
+    a = b = c = JHASH_INITVAL + sizeof(*key);
+    a += key->src;
+    b += key->dst;
+    c += key->ports;
+    __jhash_mix(a, b, c);
+
+    a += key->ip_proto;
+    __jhash_final(a, b, c);
+
+    return c;
+}
+
+static int connection_key_equal(const void *opaque1, const void *opaque2)
+{
+    return memcmp(opaque1, opaque2, sizeof(Connection_key)) == 0;
+}
+
+static void connection_destroy(void *opaque)
+{
+    Connection *connection = opaque;
+    g_queue_foreach(&connection->primary_list, packet_destroy, NULL);
+    g_queue_free(&connection->primary_list);
+    g_queue_foreach(&connection->secondary_list, packet_destroy, NULL);
+    g_queue_free(&connection->secondary_list);
+    g_slice_free(Connection, connection);
+}
+
+static Connection *connection_new(void)
+{
+    Connection *connection = g_slice_new(Connection);
+
+    g_queue_init(&connection->primary_list);
+    g_queue_init(&connection->secondary_list);
+    connection->processing = false;
+
+    return connection;
+}
+
+/* Return 0 on success, or return -1 if the pkt is corrpted */
+static int parse_packet_early(Packet *pkt, Connection_key *key)
+{
+    int network_length;
+    uint8_t *data = pkt->data;
+
+    pkt->network_layer = data + ETH_HLEN;
+    if (ntohs(*(uint16_t *)(data + 12)) != ETH_P_IP) {
+        if (ntohs(*(uint16_t *)(data + 12)) == ETH_P_ARP) {
+            return -1;
+        }
+        return 0;
+    }
+
+    network_length = pkt->ip->ip_hl * 4;
+    pkt->transport_layer = pkt->network_layer + network_length;
+    key->ip_proto = pkt->ip->ip_p;
+    key->src = pkt->ip->ip_src;
+    key->dst = pkt->ip->ip_dst;
+
+    switch (key->ip_proto) {
+    case IPPROTO_TCP:
+    case IPPROTO_UDP:
+    case IPPROTO_DCCP:
+    case IPPROTO_ESP:
+    case IPPROTO_SCTP:
+    case IPPROTO_UDPLITE:
+        key->ports = *(uint32_t *)(pkt->transport_layer);
+        break;
+    case IPPROTO_AH:
+        key->ports = *(uint32_t *)(pkt->transport_layer + 4);
+        break;
+    default:
+        break;
+    }
+
+    return 0;
+}
+
+static Packet *packet_new(ColoProxyState *s, const void *data,
+                          int size, Connection_key *key, NetClientState *sender)
+{
+    Packet *pkt = g_slice_new(Packet);
+
+    pkt->data = g_malloc(size);
+    memcpy(pkt->data, data, size);
+    pkt->size = size;
+    pkt->s = s;
+    pkt->sender = sender;
+    pkt->should_be_sent = false;
+
+    if (parse_packet_early(pkt, key)) {
+        packet_destroy(pkt, NULL);
+        pkt = NULL;
+    }
+
+    return pkt;
+}
+
+static void packet_destroy(void *opaque, void *user_data)
+{
+    Packet *pkt = opaque;
+    g_free(pkt->data);
+    g_slice_free(Packet, pkt);
+}
+
+static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
+                                          Connection_key *key,
+                                          Packet *pkt, packet_type type)
+{
+    Connection *connection;
+    Packet *tmppkt;
+    connection = g_hash_table_lookup(unprocessed_packets, key);
+    if (connection == NULL) {
+        Connection_key *new_key = g_malloc(sizeof(*key));
+
+        connection = connection_new();
+        memcpy(new_key, key, sizeof(*key));
+        key = new_key;
+
+        g_hash_table_insert(unprocessed_packets, key, connection);
+    }
+    switch (type) {
+    case PRIMARY_OUTPUT:
+        if (g_queue_get_length(&connection->secondary_list) > 0) {
+            tmppkt = g_queue_pop_head(&connection->secondary_list);
+            DEBUG("g_queue_get_length(&connection->primary_list)=%d\n",
+                        g_queue_get_length(&connection->primary_list));
+            DEBUG("g_queue_get_length(&connection->secondary_list)=%d\n",
+                        g_queue_get_length(&connection->secondary_list));
+            if (colo_packet_compare(pkt, tmppkt)) {
+                DEBUG("packet same and release packet\n");
+                pkt->should_be_sent = true;
+                break;
+            } else {
+                DEBUG("packet different\n");
+                colo_proxy_notify_checkpoint();
+                pkt->should_be_sent = false;
+                break;
+            }
+        } else {
+            g_queue_push_tail(&connection->primary_list, pkt);
+            pkt->should_be_sent = false;
+        }
+
+        break;
+    case SECONDARY_OUTPUT:
+        g_queue_push_tail(&connection->secondary_list, pkt);
+        DEBUG("secondary pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
+                    (char *)pkt->data, pkt->ip->ip_src, pkt->ip->ip_dst);
+        break;
+    default:
+        abort();
+    }
+
+    return connection;
+}
+
 
 /*
  * Packets to be sent by colo forward to
@@ -165,7 +329,8 @@ static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
     }
 
     if (direction == NET_FILTER_DIRECTION_RX) {
-        /* TODO: enqueue_primary_packet */
+        ret = colo_enqueue_primary_packet(nf, sender, flags, iov,
+                    iovcnt, sent_cb);
     } else {
         ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
                     sent_cb, COLO_PRIMARY_MODE);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC PATCH 8/9] net/colo-proxy: enqueue primary and secondary packet
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
                   ` (6 preceding siblings ...)
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function Zhang Chen
@ 2015-11-27 12:27 ` Zhang Chen
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 9/9] net/colo-proxy: add packet compare and notify checkpoint Zhang Chen
  2015-12-01 16:44 ` [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Dr. David Alan Gilbert
  9 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang

From: zhangchen <zhangchen.fnst@cn.fujitsu.com>

Enqueue packets sent by primary and packets sent by secondary

Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-proxy.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/net/colo-proxy.c b/net/colo-proxy.c
index a664e6d..5f1852a 100644
--- a/net/colo-proxy.c
+++ b/net/colo-proxy.c
@@ -188,6 +188,77 @@ static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
     return connection;
 }
 
+static ssize_t colo_enqueue_primary_packet(NetFilterState *nf,
+                                         NetClientState *sender,
+                                         unsigned flags,
+                                         const struct iovec *iov,
+                                         int iovcnt,
+                                         NetPacketSent *sent_cb)
+{
+    /*
+     * 1. parse packet, try to get connection factor
+     * (src_ip, src_port, dest_ip, dest_port)
+     * 2. enqueue the packet to primary_packet_list by connection
+     */
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    char *buf;
+    ssize_t size = iov_size(iov, iovcnt);
+    buf = g_malloc0(size);
+    iov_to_buf(iov, iovcnt, 0, buf, size);
+
+    Connection_key key = { 0 };
+    Packet *pkt = packet_new(s, buf, size, &key, sender);
+    Connection *connection;
+
+    if (!pkt) {
+        qemu_net_queue_send(s->incoming_queue, sender, flags,
+                    (const uint8_t *)buf, size, NULL);
+        g_free(buf);
+        return 0;
+    }
+
+    connection = colo_proxy_enqueue_packet(s->unprocessed_packets, &key,
+                                        pkt, PRIMARY_OUTPUT);
+
+    if (!connection->processing) {
+        g_queue_push_tail(&s->unprocessed_connections, connection);
+        connection->processing = true;
+    }
+
+    if (pkt->should_be_sent) {
+        qemu_net_queue_send(s->incoming_queue, sender, flags,
+                    (const uint8_t *)buf, size, NULL);
+    }
+
+    g_free(buf);
+    return 1;
+}
+
+static ssize_t colo_enqueue_secondary_packet(NetFilterState *nf,
+            char *buf, int len)
+{
+    /*
+     * 1, parse packet, try to get connection factor
+     * (src_ip, src_port, dest_ip, dest_port)
+     * 2. enqueue the packet to secondary_packet_list by connection
+    */
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    Connection_key key = { 0 };
+    Packet *pkt = packet_new(s, buf, len, &key, NULL);
+    Connection *connection;
+    if (!pkt) {
+        return -1;
+    }
+
+    connection = colo_proxy_enqueue_packet(s->unprocessed_packets, &key,
+                                        pkt, SECONDARY_OUTPUT);
+
+    if (!connection->processing) {
+        g_queue_push_tail(&s->unprocessed_connections, connection);
+        connection->processing = true;
+    }
+    return 1;
+}
 
 /*
  * Packets to be sent by colo forward to
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [Qemu-devel] [RFC PATCH 9/9] net/colo-proxy: add packet compare and notify checkpoint
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
                   ` (7 preceding siblings ...)
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 8/9] net/colo-proxy: enqueue primary and secondary packet Zhang Chen
@ 2015-11-27 12:27 ` Zhang Chen
  2015-12-01 16:37   ` Dr. David Alan Gilbert
  2015-12-01 16:44 ` [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Dr. David Alan Gilbert
  9 siblings, 1 reply; 35+ messages in thread
From: Zhang Chen @ 2015-11-27 12:27 UTC (permalink / raw)
  To: qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, Dr. David Alan Gilbert,
	Huang peng, Gong lei, jan.kiszka, Zhang Chen, zhanghailiang

From: zhangchen <zhangchen.fnst@cn.fujitsu.com>

Lookup same connection's primary and secondary packet
to compare,if same we will send primary packet and
drop secondary packet,else send all of primary
packets be queued,drop secondary queue and notify
colo to do checkpoint

Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-proxy.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/net/colo-proxy.c b/net/colo-proxy.c
index 5f1852a..847f7f2 100644
--- a/net/colo-proxy.c
+++ b/net/colo-proxy.c
@@ -70,6 +70,41 @@ static Connection *connection_new(void)
     return connection;
 }
 
+static void colo_send_primary_packet(void *opaque, void *user_data)
+{
+    Packet *pkt = opaque;
+    qemu_net_queue_send(pkt->s->incoming_queue, pkt->sender, 0,
+                    (const uint8_t *)pkt->data, pkt->size, NULL);
+}
+
+static void colo_flush_connection(void *opaque, void *user_data)
+{
+    Connection *connection = opaque;
+    g_queue_foreach(&connection->primary_list, colo_send_primary_packet, NULL);
+    g_queue_foreach(&connection->secondary_list, packet_destroy, NULL);
+}
+
+static void colo_proxy_notify_checkpoint(void)
+{
+    DEBUG("colo_proxy_notify_checkpoint\n");
+}
+
+static void colo_proxy_do_checkpoint(NetFilterState *nf)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+
+    g_queue_foreach(&s->unprocessed_connections, colo_flush_connection, NULL);
+}
+
+/*
+ * colo failover flag
+ */
+static ssize_t colo_has_failover(NetFilterState *nf)
+{
+    ColoProxyState *s = FILTER_COLO_PROXY(nf);
+    return s->has_failover;
+}
+
 /* Return 0 on success, or return -1 if the pkt is corrpted */
 static int parse_packet_early(Packet *pkt, Connection_key *key)
 {
@@ -136,6 +171,45 @@ static void packet_destroy(void *opaque, void *user_data)
     g_slice_free(Packet, pkt);
 }
 
+/*
+ * The sent IP packets comparison between primary
+ * and secondary
+ * TODO: support ip fragment
+ * return:    true  means packet same
+ *            false means packet different
+ */
+static bool colo_packet_compare(Packet *ppkt, Packet *spkt)
+{
+    int i;
+    DEBUG("colo_packet_compare lens ppkt %d,spkt %d\n", ppkt->size,
+                spkt->size);
+    DEBUG("primary pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
+                (char *)ppkt->data, ppkt->ip->ip_src, ppkt->ip->ip_dst);
+    DEBUG("seconda pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
+                (char *)spkt->data, spkt->ip->ip_src, spkt->ip->ip_dst);
+    if (ppkt->size == spkt->size) {
+        DEBUG("colo_packet_compare data   ppkt\n");
+        for (i = 0; i < spkt->size; i++) {
+            DEBUG("%x", ((char *)ppkt->data)[i]);
+            DEBUG("|");
+        }
+        DEBUG("\ncolo_packet_compare data   spkt\n");
+        for (i = 0; i < spkt->size; i++) {
+            DEBUG("%x", ((char *)spkt->data)[i]);
+            DEBUG("|");
+        }
+        DEBUG("\ncolo_packet_compare data   ppkt %s\n", (char *)ppkt->data);
+        DEBUG("colo_packet_compare data   spkt %s\n", (char *)spkt->data);
+        if (!memcmp(ppkt->data, spkt->data, spkt->size)) {
+            return true;
+        } else {
+            return false;
+        }
+    } else {
+        return false;
+    }
+}
+
 static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
                                           Connection_key *key,
                                           Packet *pkt, packet_type type)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework Zhang Chen
@ 2015-11-28  2:46   ` Hailiang Zhang
  2015-11-30  2:25     ` Zhang Chen
  2015-11-30  3:10   ` Wen Congyang
  1 sibling, 1 reply; 35+ messages in thread
From: Hailiang Zhang @ 2015-11-28  2:46 UTC (permalink / raw)
  To: Zhang Chen, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, peter.huangpeng,
	Dr. David Alan Gilbert, Gong lei, jan.kiszka

On 2015/11/27 20:27, Zhang Chen wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>
> Colo-proxy is a plugin of qemu netfilter
> like filter-buffer and dump
>
> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   net/Makefile.objs |   1 +
>   net/colo-proxy.c  | 139 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>   net/colo-proxy.h  |  63 +++++++++++++++++++++++++
>   3 files changed, 203 insertions(+)
>   create mode 100644 net/colo-proxy.c
>   create mode 100644 net/colo-proxy.h
>
> diff --git a/net/Makefile.objs b/net/Makefile.objs
> index 5fa2f97..95670f2 100644
> --- a/net/Makefile.objs
> +++ b/net/Makefile.objs
> @@ -15,3 +15,4 @@ common-obj-$(CONFIG_VDE) += vde.o
>   common-obj-$(CONFIG_NETMAP) += netmap.o
>   common-obj-y += filter.o
>   common-obj-y += filter-buffer.o
> +common-obj-y += colo-proxy.o
> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> new file mode 100644
> index 0000000..98c2699
> --- /dev/null
> +++ b/net/colo-proxy.c
> @@ -0,0 +1,139 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2015 FUJITSU LIMITED
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#include "colo-proxy.h"
> +
> +#define __DEBUG__
> +
> +#ifdef __DEBUG__
> +#define DEBUG(format, ...) printf(format, ##__VA_ARGS__)
> +#else
> +#define DEBUG(format, ...)
> +#endif
> +
> +
> +static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
> +                                         NetClientState *sender,
> +                                         unsigned flags,
> +                                         const struct iovec *iov,
> +                                         int iovcnt,
> +                                         NetPacketSent *sent_cb)
> +{
> +    /*
> +     * We return size when buffer a packet, the sender will take it as
> +     * a already sent packet, so sent_cb should not be called later.
> +     *
> +     */
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);

  blank space ~

> +    if (s->colo_mode == COLO_PRIMARY_MODE) {
> +         /* colo_proxy_primary_handler */
> +    } else {
> +         /* colo_proxy_primary_handler */
> +    }
> +    return iov_size(iov, iovcnt);
> +}
> +
> +static void colo_proxy_cleanup(NetFilterState *nf)
> +{
> +     /* cleanup */
> +}
> +
> +
> +static void colo_proxy_setup(NetFilterState *nf, Error **errp)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);

blank space ~

> +    if (!s->addr) {
> +        error_setg(errp, "filter colo_proxy needs 'addr' \
> +                     property set!");
> +        return;
> +    }
> +
> +    if (nf->direction != NET_FILTER_DIRECTION_ALL) {
> +        printf("colo need queue all packet,\
> +                    please startup colo-proxy with queue=all\n");

printf/error_setg/g

> +        return;
> +    }
> +
> +    s->sockfd = -1;
> +    s->has_failover = false;
> +    colo_do_checkpoint = false;
> +    g_queue_init(&s->unprocessed_connections);
> +
> +    if (!strcmp(mode, PRIMARY_MODE)) {
> +        s->colo_mode = COLO_PRIMARY_MODE;
> +    } else if (!strcmp(mode, SECONDARY_MODE)) {
> +        s->colo_mode = COLO_SECONDARY_MODE;
> +    } else {
> +        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
> +                    "primary or secondary");
> +        return;
> +    }
> +}
> +
> +static void colo_proxy_class_init(ObjectClass *oc, void *data)
> +{
> +    NetFilterClass *nfc = NETFILTER_CLASS(oc);
> +
> +    nfc->setup = colo_proxy_setup;
> +    nfc->cleanup = colo_proxy_cleanup;
> +    nfc->receive_iov = colo_proxy_receive_iov;
> +}
> +
> +static char *colo_proxy_get_mode(Object *obj, Error **errp)
> +{
> +    return g_strdup(mode);
> +}

Wrong patch ? Where did you define 'mode' ?

> +
> +static void colo_proxy_set_mode(Object *obj, const char *value, Error **errp)
> +{
> +    g_free(mode);
> +    mode = g_strdup(value);
> +}
> +
> +static char *colo_proxy_get_addr(Object *obj, Error **errp)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(obj);
> +
> +    return g_strdup(s->addr);
> +}
> +
> +static void colo_proxy_set_addr(Object *obj, const char *value, Error **errp)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(obj);
> +    g_free(s->addr);
> +    s->addr = g_strdup(value);
> +}
> +
> +static void colo_proxy_init(Object *obj)
> +{
> +    object_property_add_str(obj, "mode", colo_proxy_get_mode,
> +                            colo_proxy_set_mode, NULL);
> +    object_property_add_str(obj, "addr", colo_proxy_get_addr,
> +                            colo_proxy_set_addr, NULL);
> +}
> +
> +static const TypeInfo colo_proxy_info = {
> +    .name = TYPE_FILTER_COLO_PROXY,
> +    .parent = TYPE_NETFILTER,
> +    .class_init = colo_proxy_class_init,
> +    .instance_init = colo_proxy_init,
> +    .instance_size = sizeof(ColoProxyState),
> +};
> +
> +static void register_types(void)
> +{
> +    type_register_static(&colo_proxy_info);
> +}
> +
> +type_init(register_types);
> diff --git a/net/colo-proxy.h b/net/colo-proxy.h
> new file mode 100644
> index 0000000..94afbc7
> --- /dev/null
> +++ b/net/colo-proxy.h
> @@ -0,0 +1,63 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2015 FUJITSU LIMITED
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +
> +#ifndef QEMU_COLO_PROXY_H
> +#define QEMU_COLO_PROXY_H
> +
> +#include "net/filter.h"
> +#include "net/queue.h"
> +#include "qemu-common.h"
> +#include "qemu/iov.h"
> +#include "qapi/qmp/qerror.h"
> +#include "qapi-visit.h"
> +#include "qom/object.h"
> +#include "qemu/sockets.h"
> +#include "qemu/main-loop.h"
> +#include <netinet/if_ether.h>
> +#include "qemu/jhash.h"
> +#include "qemu/coroutine.h"
> +
> +#define FILTER_COLO_PROXY(obj) \
> +    OBJECT_CHECK(ColoProxyState, (obj), TYPE_FILTER_COLO_PROXY)
> +
> +#define TYPE_FILTER_COLO_PROXY "colo-proxy"
> +#define PRIMARY_MODE "primary"
> +#define SECONDARY_MODE "secondary"
> +
> +typedef enum {
> +    COLO_PRIMARY_MODE,               /* primary mode  */
> +    COLO_SECONDARY_MODE,             /* secondary mode */
> +} mode_type;
> +

We already have the similar enum defining in COLO-Frame and this can be dropped after
Frame is merged. But for now, it is OK.

> +typedef struct ColoProxyState {
> +    NetFilterState parent_obj;
> +    NetQueue *incoming_queue;        /* guest normal net queue */
> +    NetFilterDirection direction;    /* packet direction */
> +    mode_type colo_mode;             /* colo mode (primary or
> +                                      * secondary)
> +                                      */
> +    char *addr;                       /* primary colo connect addr
> +                                      * or secondary server addr
> +                                      */
> +    int sockfd;                      /* primary client socket fd or
> +                                      * secondary server socket fd
> +                                      */
> +    bool has_failover;               /* colo failover flag */
> +    GHashTable *unprocessed_packets; /* hashtable to save connection */
> +    GQueue unprocessed_connections;  /* to save unprocessed_connections */
> +    Coroutine *co;
> +} ColoProxyState;
> +
> +#endif /* QEMU_COLO_PROXY_H */
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work Zhang Chen
@ 2015-11-28  3:02   ` Hailiang Zhang
  2015-11-30  2:35     ` Zhang Chen
  2015-12-01 15:35   ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 35+ messages in thread
From: Hailiang Zhang @ 2015-11-28  3:02 UTC (permalink / raw)
  To: Zhang Chen, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, peter.huangpeng,
	Dr. David Alan Gilbert, Gong lei, jan.kiszka

On 2015/11/27 20:27, Zhang Chen wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>
> Secondary setup socket server for colo-forward
> primary setup connect to secondary for colo-forward
> add data structure will be uesed
>
> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   net/colo-proxy.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>   net/colo-proxy.h | 61 +++++++++++++++++++++++++++++++++++++++
>   2 files changed, 147 insertions(+), 1 deletion(-)
>
> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> index 98c2699..89d9616 100644
> --- a/net/colo-proxy.c
> +++ b/net/colo-proxy.c
> @@ -22,6 +22,8 @@
>   #define DEBUG(format, ...)
>   #endif
>

> +static char *mode;
> +static bool colo_do_checkpoint;
>

>   static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>                                            NetClientState *sender,
> @@ -46,13 +48,84 @@ static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>
>   static void colo_proxy_cleanup(NetFilterState *nf)
>   {
> -     /* cleanup */
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    close(s->sockfd);
> +    s->sockfd = -1;
> +    g_free(mode);
> +    g_free(s->addr);
>   }
>

Please move the above codes to the previous patch~

> +static void colo_accept_incoming(ColoProxyState *s)
> +{
> +    DEBUG("into colo_accept_incoming\n");
> +    struct sockaddr_in addr;
> +    socklen_t addrlen = sizeof(addr);
> +    int acceptsock, err;
> +
> +    do {
> +        acceptsock = qemu_accept(s->sockfd, (struct sockaddr *)&addr, &addrlen);
> +        err = socket_error();
> +    } while (acceptsock < 0 && err == EINTR);
> +    qemu_set_fd_handler(s->sockfd, NULL, NULL, NULL);
> +    closesocket(s->sockfd);
> +
> +    DEBUG("accept colo proxy\n");
> +

It's better to use trace instead of DEBUG~

> +    if (acceptsock < 0) {
> +        printf("could not accept colo connection (%s)\n",
> +                     strerror(err));

/printf/error_report/g

> +        return;
> +    }
> +    s->sockfd = acceptsock;
> +    /* TODO: handle the packets that primary forward */
> +    return;
> +}
> +
> +/* Return 1 on success, or return -1 if failed */
> +static ssize_t colo_start_incoming(ColoProxyState *s)
> +{
> +    int serversock;

Space~

> +    serversock = inet_listen(s->addr, NULL, 256, SOCK_STREAM, 0, NULL);
> +    if (serversock < 0) {
> +        g_free(s->addr);
> +        return -1;
> +    }
> +    s->sockfd = serversock;
> +    qemu_set_fd_handler(serversock, (IOHandler *)colo_accept_incoming, NULL,
> +                        (void *)s);
> +    g_free(s->addr);

Double free ? I noticed you also free this in colo_proxy_cleanup(), we'd better do this in
the cleanup function.

> +    return 1;

Odd, it's better to return 0 to indicate success.

> +}
> +
> +/* Return 1 on success, or return -1 if setup failed */
> +static ssize_t colo_proxy_primary_setup(NetFilterState *nf)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    int sock;

> +    sock = inet_connect(s->addr, NULL);
> +    if (sock < 0) {
> +        printf("colo proxy connect failed\n");
> +        g_free(s->addr);

> +        return -1;
> +    }
> +    DEBUG("colo proxy connect success\n");
> +    s->sockfd = sock;
> +   /* TODO: handle the packets that secondary forward */
> +    g_free(s->addr);

As above comment.

> +    return 1;

> +}
> +
> +/* Return 1 on success, or return -1 if setup failed */
> +static ssize_t colo_proxy_secondary_setup(NetFilterState *nf)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);

Space~

> +    return colo_start_incoming(s);
> +}
>
>   static void colo_proxy_setup(NetFilterState *nf, Error **errp)
>   {
>       ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    ssize_t ret = 0;

Space~

>       if (!s->addr) {
>           error_setg(errp, "filter colo_proxy needs 'addr' \
>                        property set!");
> @@ -68,17 +141,29 @@ static void colo_proxy_setup(NetFilterState *nf, Error **errp)
>       s->sockfd = -1;
>       s->has_failover = false;
>       colo_do_checkpoint = false;
> +    s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
> +    s->unprocessed_packets = g_hash_table_new_full(connection_key_hash,
> +                                                       connection_key_equal,
> +                                                       g_free,
> +                                                       connection_destroy);
>       g_queue_init(&s->unprocessed_connections);
>
>       if (!strcmp(mode, PRIMARY_MODE)) {
>           s->colo_mode = COLO_PRIMARY_MODE;
> +        ret = colo_proxy_primary_setup(nf);
>       } else if (!strcmp(mode, SECONDARY_MODE)) {
>           s->colo_mode = COLO_SECONDARY_MODE;
> +        ret = colo_proxy_secondary_setup(nf);
>       } else {
>           error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
>                       "primary or secondary");
>           return;
>       }
> +    if (ret) {
> +        DEBUG("colo_proxy_setup success\n");
> +    } else {
> +        DEBUG("colo_proxy_setup failed\n");
> +    }
>   }
>
>   static void colo_proxy_class_init(ObjectClass *oc, void *data)
> diff --git a/net/colo-proxy.h b/net/colo-proxy.h
> index 94afbc7..f77db2f 100644
> --- a/net/colo-proxy.h
> +++ b/net/colo-proxy.h
> @@ -60,4 +60,65 @@ typedef struct ColoProxyState {
>       Coroutine *co;
>   } ColoProxyState;
>
> +struct ip {
> +#ifdef HOST_WORDS_BIGENDIAN
> +    uint8_t  ip_v:4,                 /* version */
> +             ip_hl:4;                /* header length */
> +#else
> +    uint8_t  ip_hl:4,                /* header length */
> +             ip_v:4;                 /* version */
> +#endif
> +    uint8_t  ip_tos;                 /* type of service */
> +    uint16_t ip_len;                 /* total length */
> +    uint16_t ip_id;                  /* identification */
> +    uint16_t ip_off;                 /* fragment offset field */
> +#define    IP_DF 0x4000              /* don't fragment flag */
> +#define    IP_MF 0x2000              /* more fragments flag */
> +#define    IP_OFFMASK 0x1fff
> +/* mask for fragmenting bits */
> +    uint8_t  ip_ttl;                 /* time to live */
> +    uint8_t  ip_p;                   /* protocol */
> +    uint16_t ip_sum;                 /* checksum */
> +    uint32_t ip_src, ip_dst;         /* source and dest address */
> +} QEMU_PACKED;
> +
> +typedef struct Packet {
> +    void *data;
> +    union {
> +        uint8_t *network_layer;
> +        struct ip *ip;
> +    };
> +    uint8_t *transport_layer;
> +    int size;
> +    ColoProxyState *s;
> +    bool should_be_sent;
> +    NetClientState *sender;
> +} Packet;
> +
> +typedef struct Connection_key {
> +    /* (src, dst) must be grouped, in the same way than in IP header */
> +    uint32_t src;
> +    uint32_t dst;
> +    union {
> +        uint32_t ports;
> +        uint16_t port16[2];
> +    };
> +    uint8_t ip_proto;
> +} QEMU_PACKED Connection_key;
> +
> +typedef struct Connection {
> +    /* connection primary send queue */
> +    GQueue primary_list;
> +    /* connection secondary send queue */
> +    GQueue secondary_list;
> +     /* flag to enqueue unprocessed_connections */
> +    bool processing;
> +} Connection;
> +
> +typedef enum {
> +    PRIMARY_OUTPUT,           /* primary output packet queue */
> +    PRIMARY_INPUT,            /* primary input packet queue */
> +    SECONDARY_OUTPUT,         /* secondary output packet queue */
> +} packet_type;
> +
>   #endif /* QEMU_COLO_PROXY_H */
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 5/9] net/colo-proxy: add colo packet handler
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 5/9] net/colo-proxy: add colo packet handler Zhang Chen
@ 2015-11-28  3:17   ` Hailiang Zhang
  2015-11-30  5:37     ` Zhang Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Hailiang Zhang @ 2015-11-28  3:17 UTC (permalink / raw)
  To: Zhang Chen, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, peter.huangpeng,
	Dr. David Alan Gilbert, Gong lei, jan.kiszka

On 2015/11/27 20:27, Zhang Chen wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>
> add primary and secondary handler
>
> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   net/colo-proxy.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>   1 file changed, 103 insertions(+), 2 deletions(-)
>
> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> index 89d9616..ece5661 100644
> --- a/net/colo-proxy.c
> +++ b/net/colo-proxy.c
> @@ -25,6 +25,101 @@
>   static char *mode;
>   static bool colo_do_checkpoint;
>
> +/*
> + * colo primary handle host's normal send and
> + * recv packets to primary guest
> + * return:          >= 0      success
> + *                  < 0       failed
> + */
> +static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
> +                                         NetClientState *sender,
> +                                         unsigned flags,
> +                                         const struct iovec *iov,
> +                                         int iovcnt,
> +                                         NetPacketSent *sent_cb)
> +{
> +    ssize_t ret = 0;
> +    int direction;
> +
> +    if (sender == nf->netdev) {
> +        /* This packet is sent by netdev itself */
> +        direction = NET_FILTER_DIRECTION_TX;
> +    } else {
> +        direction = NET_FILTER_DIRECTION_RX;
> +    }
> +    /*
> +     * if packet's direction=rx
> +     * enqueue packets to primary queue
> +     * and wait secondary queue to compare
> +     * if packet's direction=tx
> +     * enqueue packets then send packets to
> +     * secondary and flush  queued packets
> +    */
> +
> +    if (colo_do_checkpoint) {
> +        colo_proxy_do_checkpoint(nf);
> +    }
> +

Wrong patch ? Where is the definition of colo_proxy_do_checkpoint() ?
Besides, why did we need to call colo_proxy_do_checkpoint() here ?

> +    if (direction == NET_FILTER_DIRECTION_RX) {
> +        /* TODO: enqueue_primary_packet */
> +    } else {
> +        /* TODO: forward packets to another */
> +    }
> +
> +    return ret;
> +}
> +
> +/*
> + * colo secondary handle host's normal send and
> + * recv packets to secondary guest
> + * return:          >= 0      success
> + *                  < 0       failed
> + */
> +static ssize_t colo_proxy_secondary_handler(NetFilterState *nf,
> +                                         NetClientState *sender,
> +                                         unsigned flags,
> +                                         const struct iovec *iov,
> +                                         int iovcnt,
> +                                         NetPacketSent *sent_cb)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    int direction;
> +    ssize_t ret = 0;
> +
> +    if (sender == nf->netdev) {
> +        /* This packet is sent by netdev itself */
> +        direction = NET_FILTER_DIRECTION_TX;
> +    } else {
> +        direction = NET_FILTER_DIRECTION_RX;
> +    }
> +    /*
> +     * if packet's direction=rx
> +     * enqueue packets and send to
> +     * primary QEMU
> +     * if packet's direction=tx
> +     * record PVM's packet inital seq & adjust
> +     * client's ack,send adjusted packets to SVM(next version will be do)
> +     */
> +
> +    if (direction == NET_FILTER_DIRECTION_RX) {

> +        if (colo_has_failover(nf)) {
> +            qemu_net_queue_send_iov(s->incoming_queue, sender, flags, iov,
> +                            iovcnt, NULL);
> +            return 1;

> +        } else {
> +        /* TODO: forward packets to another */
> +        }
> +
> +    } else {

> +        if (colo_has_failover(nf)) {
> +            qemu_net_queue_send_iov(s->incoming_queue, sender, flags, iov,
> +                            iovcnt, NULL);
> +        }
> +        return 1;

These codes can be placed outside of the outer if/else.

> +    }
> +    return ret;
> +}
> +
>   static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>                                            NetClientState *sender,
>                                            unsigned flags,
> @@ -38,10 +133,16 @@ static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>        *
>        */
>       ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    ssize_t ret = 0;

Space ~

>       if (s->colo_mode == COLO_PRIMARY_MODE) {
> -         /* colo_proxy_primary_handler */
> +        ret = colo_proxy_primary_handler(nf, sender, flags,
> +                    iov, iovcnt, sent_cb);
>       } else {
> -         /* colo_proxy_primary_handler */
> +        ret = colo_proxy_secondary_handler(nf, sender, flags,
> +                    iov, iovcnt, sent_cb);
> +    }
> +    if (ret < 0) {
> +        DEBUG("colo_proxy_receive_iov running failed\n");
>       }
>       return iov_size(iov, iovcnt);
>   }
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework
  2015-11-28  2:46   ` Hailiang Zhang
@ 2015-11-30  2:25     ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-11-30  2:25 UTC (permalink / raw)
  To: Hailiang Zhang, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, peter.huangpeng,
	Dr. David Alan Gilbert, Gong lei, jan.kiszka, hongyang.yang



On 11/28/2015 10:46 AM, Hailiang Zhang wrote:
> On 2015/11/27 20:27, Zhang Chen wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> Colo-proxy is a plugin of qemu netfilter
>> like filter-buffer and dump
>>
>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/Makefile.objs |   1 +
>>   net/colo-proxy.c  | 139 
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/colo-proxy.h  |  63 +++++++++++++++++++++++++
>>   3 files changed, 203 insertions(+)
>>   create mode 100644 net/colo-proxy.c
>>   create mode 100644 net/colo-proxy.h
>>
>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>> index 5fa2f97..95670f2 100644
>> --- a/net/Makefile.objs
>> +++ b/net/Makefile.objs
>> @@ -15,3 +15,4 @@ common-obj-$(CONFIG_VDE) += vde.o
>>   common-obj-$(CONFIG_NETMAP) += netmap.o
>>   common-obj-y += filter.o
>>   common-obj-y += filter-buffer.o
>> +common-obj-y += colo-proxy.o
>> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
>> new file mode 100644
>> index 0000000..98c2699
>> --- /dev/null
>> +++ b/net/colo-proxy.c
>> @@ -0,0 +1,139 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service 
>> (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2015 FUJITSU LIMITED
>> + * Copyright (c) 2015 Intel Corporation
>> + *
>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "colo-proxy.h"
>> +
>> +#define __DEBUG__
>> +
>> +#ifdef __DEBUG__
>> +#define DEBUG(format, ...) printf(format, ##__VA_ARGS__)
>> +#else
>> +#define DEBUG(format, ...)
>> +#endif
>> +
>> +
>> +static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>> +                                         NetClientState *sender,
>> +                                         unsigned flags,
>> +                                         const struct iovec *iov,
>> +                                         int iovcnt,
>> +                                         NetPacketSent *sent_cb)
>> +{
>> +    /*
>> +     * We return size when buffer a packet, the sender will take it as
>> +     * a already sent packet, so sent_cb should not be called later.
>> +     *
>> +     */
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>
>  blank space ~
>

fix

thanks for review

zhang chen
>> +    if (s->colo_mode == COLO_PRIMARY_MODE) {
>> +         /* colo_proxy_primary_handler */
>> +    } else {
>> +         /* colo_proxy_primary_handler */
>> +    }
>> +    return iov_size(iov, iovcnt);
>> +}
>> +
>> +static void colo_proxy_cleanup(NetFilterState *nf)
>> +{
>> +     /* cleanup */
>> +}
>> +
>> +
>> +static void colo_proxy_setup(NetFilterState *nf, Error **errp)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>
> blank space ~
>

fix

>> +    if (!s->addr) {
>> +        error_setg(errp, "filter colo_proxy needs 'addr' \
>> +                     property set!");
>> +        return;
>> +    }
>> +
>> +    if (nf->direction != NET_FILTER_DIRECTION_ALL) {
>> +        printf("colo need queue all packet,\
>> +                    please startup colo-proxy with queue=all\n");
>
> printf/error_setg/g

fix

>
>> +        return;
>> +    }
>> +
>> +    s->sockfd = -1;
>> +    s->has_failover = false;
>> +    colo_do_checkpoint = false;
>> +    g_queue_init(&s->unprocessed_connections);
>> +
>> +    if (!strcmp(mode, PRIMARY_MODE)) {
>> +        s->colo_mode = COLO_PRIMARY_MODE;
>> +    } else if (!strcmp(mode, SECONDARY_MODE)) {
>> +        s->colo_mode = COLO_SECONDARY_MODE;
>> +    } else {
>> +        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
>> +                    "primary or secondary");
>> +        return;
>> +    }
>> +}
>> +
>> +static void colo_proxy_class_init(ObjectClass *oc, void *data)
>> +{
>> +    NetFilterClass *nfc = NETFILTER_CLASS(oc);
>> +
>> +    nfc->setup = colo_proxy_setup;
>> +    nfc->cleanup = colo_proxy_cleanup;
>> +    nfc->receive_iov = colo_proxy_receive_iov;
>> +}
>> +
>> +static char *colo_proxy_get_mode(Object *obj, Error **errp)
>> +{
>> +    return g_strdup(mode);
>> +}
>
> Wrong patch ? Where did you define 'mode' ?
>

sorry,in next patch,i will move define to this patch in next version.


>> +
>> +static void colo_proxy_set_mode(Object *obj, const char *value, 
>> Error **errp)
>> +{
>> +    g_free(mode);
>> +    mode = g_strdup(value);
>> +}
>> +
>> +static char *colo_proxy_get_addr(Object *obj, Error **errp)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(obj);
>> +
>> +    return g_strdup(s->addr);
>> +}
>> +
>> +static void colo_proxy_set_addr(Object *obj, const char *value, 
>> Error **errp)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(obj);
>> +    g_free(s->addr);
>> +    s->addr = g_strdup(value);
>> +}
>> +
>> +static void colo_proxy_init(Object *obj)
>> +{
>> +    object_property_add_str(obj, "mode", colo_proxy_get_mode,
>> +                            colo_proxy_set_mode, NULL);
>> +    object_property_add_str(obj, "addr", colo_proxy_get_addr,
>> +                            colo_proxy_set_addr, NULL);
>> +}
>> +
>> +static const TypeInfo colo_proxy_info = {
>> +    .name = TYPE_FILTER_COLO_PROXY,
>> +    .parent = TYPE_NETFILTER,
>> +    .class_init = colo_proxy_class_init,
>> +    .instance_init = colo_proxy_init,
>> +    .instance_size = sizeof(ColoProxyState),
>> +};
>> +
>> +static void register_types(void)
>> +{
>> +    type_register_static(&colo_proxy_info);
>> +}
>> +
>> +type_init(register_types);
>> diff --git a/net/colo-proxy.h b/net/colo-proxy.h
>> new file mode 100644
>> index 0000000..94afbc7
>> --- /dev/null
>> +++ b/net/colo-proxy.h
>> @@ -0,0 +1,63 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service 
>> (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2015 FUJITSU LIMITED
>> + * Copyright (c) 2015 Intel Corporation
>> + *
>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +
>> +#ifndef QEMU_COLO_PROXY_H
>> +#define QEMU_COLO_PROXY_H
>> +
>> +#include "net/filter.h"
>> +#include "net/queue.h"
>> +#include "qemu-common.h"
>> +#include "qemu/iov.h"
>> +#include "qapi/qmp/qerror.h"
>> +#include "qapi-visit.h"
>> +#include "qom/object.h"
>> +#include "qemu/sockets.h"
>> +#include "qemu/main-loop.h"
>> +#include <netinet/if_ether.h>
>> +#include "qemu/jhash.h"
>> +#include "qemu/coroutine.h"
>> +
>> +#define FILTER_COLO_PROXY(obj) \
>> +    OBJECT_CHECK(ColoProxyState, (obj), TYPE_FILTER_COLO_PROXY)
>> +
>> +#define TYPE_FILTER_COLO_PROXY "colo-proxy"
>> +#define PRIMARY_MODE "primary"
>> +#define SECONDARY_MODE "secondary"
>> +
>> +typedef enum {
>> +    COLO_PRIMARY_MODE,               /* primary mode  */
>> +    COLO_SECONDARY_MODE,             /* secondary mode */
>> +} mode_type;
>> +
>
> We already have the similar enum defining in COLO-Frame and this can 
> be dropped after
> Frame is merged. But for now, it is OK.
>
>> +typedef struct ColoProxyState {
>> +    NetFilterState parent_obj;
>> +    NetQueue *incoming_queue;        /* guest normal net queue */
>> +    NetFilterDirection direction;    /* packet direction */
>> +    mode_type colo_mode;             /* colo mode (primary or
>> +                                      * secondary)
>> +                                      */
>> +    char *addr;                       /* primary colo connect addr
>> +                                      * or secondary server addr
>> +                                      */
>> +    int sockfd;                      /* primary client socket fd or
>> +                                      * secondary server socket fd
>> +                                      */
>> +    bool has_failover;               /* colo failover flag */
>> +    GHashTable *unprocessed_packets; /* hashtable to save connection */
>> +    GQueue unprocessed_connections;  /* to save 
>> unprocessed_connections */
>> +    Coroutine *co;
>> +} ColoProxyState;
>> +
>> +#endif /* QEMU_COLO_PROXY_H */
>>
>
>
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work
  2015-11-28  3:02   ` Hailiang Zhang
@ 2015-11-30  2:35     ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-11-30  2:35 UTC (permalink / raw)
  To: Hailiang Zhang, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, peter.huangpeng,
	Dr. David Alan Gilbert, Gong lei, jan.kiszka, hongyang.yang



On 11/28/2015 11:02 AM, Hailiang Zhang wrote:
> On 2015/11/27 20:27, Zhang Chen wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> Secondary setup socket server for colo-forward
>> primary setup connect to secondary for colo-forward
>> add data structure will be uesed
>>
>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-proxy.c | 87 
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   net/colo-proxy.h | 61 +++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 147 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
>> index 98c2699..89d9616 100644
>> --- a/net/colo-proxy.c
>> +++ b/net/colo-proxy.c
>> @@ -22,6 +22,8 @@
>>   #define DEBUG(format, ...)
>>   #endif
>>
>
>> +static char *mode;
>> +static bool colo_do_checkpoint;
>>
>
>>   static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>>                                            NetClientState *sender,
>> @@ -46,13 +48,84 @@ static ssize_t 
>> colo_proxy_receive_iov(NetFilterState *nf,
>>
>>   static void colo_proxy_cleanup(NetFilterState *nf)
>>   {
>> -     /* cleanup */
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    close(s->sockfd);
>> +    s->sockfd = -1;
>> +    g_free(mode);
>> +    g_free(s->addr);
>>   }
>>
>
> Please move the above codes to the previous patch~
>

ok

thanks for review
zhangchen

>> +static void colo_accept_incoming(ColoProxyState *s)
>> +{
>> +    DEBUG("into colo_accept_incoming\n");
>> +    struct sockaddr_in addr;
>> +    socklen_t addrlen = sizeof(addr);
>> +    int acceptsock, err;
>> +
>> +    do {
>> +        acceptsock = qemu_accept(s->sockfd, (struct sockaddr 
>> *)&addr, &addrlen);
>> +        err = socket_error();
>> +    } while (acceptsock < 0 && err == EINTR);
>> +    qemu_set_fd_handler(s->sockfd, NULL, NULL, NULL);
>> +    closesocket(s->sockfd);
>> +
>> +    DEBUG("accept colo proxy\n");
>> +
>
> It's better to use trace instead of DEBUG~

ok,i will fix it in next version

>
>> +    if (acceptsock < 0) {
>> +        printf("could not accept colo connection (%s)\n",
>> +                     strerror(err));
>
> /printf/error_report/g

fix

>
>> +        return;
>> +    }
>> +    s->sockfd = acceptsock;
>> +    /* TODO: handle the packets that primary forward */
>> +    return;
>> +}
>> +
>> +/* Return 1 on success, or return -1 if failed */
>> +static ssize_t colo_start_incoming(ColoProxyState *s)
>> +{
>> +    int serversock;
>
> Space~

fix

>
>> +    serversock = inet_listen(s->addr, NULL, 256, SOCK_STREAM, 0, NULL);
>> +    if (serversock < 0) {
>> +        g_free(s->addr);
>> +        return -1;
>> +    }
>> +    s->sockfd = serversock;
>> +    qemu_set_fd_handler(serversock, (IOHandler 
>> *)colo_accept_incoming, NULL,
>> +                        (void *)s);
>> +    g_free(s->addr);
>
> Double free ? I noticed you also free this in colo_proxy_cleanup(), 
> we'd better do this in
> the cleanup function.

fix,thanks

>
>> +    return 1;
>
> Odd, it's better to return 0 to indicate success.
>

will fix in next version

>> +}
>> +
>> +/* Return 1 on success, or return -1 if setup failed */
>> +static ssize_t colo_proxy_primary_setup(NetFilterState *nf)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    int sock;
>
>> +    sock = inet_connect(s->addr, NULL);
>> +    if (sock < 0) {
>> +        printf("colo proxy connect failed\n");
>> +        g_free(s->addr);
>
>> +        return -1;
>> +    }
>> +    DEBUG("colo proxy connect success\n");
>> +    s->sockfd = sock;
>> +   /* TODO: handle the packets that secondary forward */
>> +    g_free(s->addr);
>
> As above comment.
>
>> +    return 1;
>
>> +}
>> +
>> +/* Return 1 on success, or return -1 if setup failed */
>> +static ssize_t colo_proxy_secondary_setup(NetFilterState *nf)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>
> Space~
>

fix

>> +    return colo_start_incoming(s);
>> +}
>>
>>   static void colo_proxy_setup(NetFilterState *nf, Error **errp)
>>   {
>>       ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    ssize_t ret = 0;
>
> Space~
>

fix

>>       if (!s->addr) {
>>           error_setg(errp, "filter colo_proxy needs 'addr' \
>>                        property set!");
>> @@ -68,17 +141,29 @@ static void colo_proxy_setup(NetFilterState *nf, 
>> Error **errp)
>>       s->sockfd = -1;
>>       s->has_failover = false;
>>       colo_do_checkpoint = false;
>> +    s->incoming_queue = 
>> qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
>> +    s->unprocessed_packets = g_hash_table_new_full(connection_key_hash,
>> + connection_key_equal,
>> +                                                       g_free,
>> + connection_destroy);
>>       g_queue_init(&s->unprocessed_connections);
>>
>>       if (!strcmp(mode, PRIMARY_MODE)) {
>>           s->colo_mode = COLO_PRIMARY_MODE;
>> +        ret = colo_proxy_primary_setup(nf);
>>       } else if (!strcmp(mode, SECONDARY_MODE)) {
>>           s->colo_mode = COLO_SECONDARY_MODE;
>> +        ret = colo_proxy_secondary_setup(nf);
>>       } else {
>>           error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
>>                       "primary or secondary");
>>           return;
>>       }
>> +    if (ret) {
>> +        DEBUG("colo_proxy_setup success\n");
>> +    } else {
>> +        DEBUG("colo_proxy_setup failed\n");
>> +    }
>>   }
>>
>>   static void colo_proxy_class_init(ObjectClass *oc, void *data)
>> diff --git a/net/colo-proxy.h b/net/colo-proxy.h
>> index 94afbc7..f77db2f 100644
>> --- a/net/colo-proxy.h
>> +++ b/net/colo-proxy.h
>> @@ -60,4 +60,65 @@ typedef struct ColoProxyState {
>>       Coroutine *co;
>>   } ColoProxyState;
>>
>> +struct ip {
>> +#ifdef HOST_WORDS_BIGENDIAN
>> +    uint8_t  ip_v:4,                 /* version */
>> +             ip_hl:4;                /* header length */
>> +#else
>> +    uint8_t  ip_hl:4,                /* header length */
>> +             ip_v:4;                 /* version */
>> +#endif
>> +    uint8_t  ip_tos;                 /* type of service */
>> +    uint16_t ip_len;                 /* total length */
>> +    uint16_t ip_id;                  /* identification */
>> +    uint16_t ip_off;                 /* fragment offset field */
>> +#define    IP_DF 0x4000              /* don't fragment flag */
>> +#define    IP_MF 0x2000              /* more fragments flag */
>> +#define    IP_OFFMASK 0x1fff
>> +/* mask for fragmenting bits */
>> +    uint8_t  ip_ttl;                 /* time to live */
>> +    uint8_t  ip_p;                   /* protocol */
>> +    uint16_t ip_sum;                 /* checksum */
>> +    uint32_t ip_src, ip_dst;         /* source and dest address */
>> +} QEMU_PACKED;
>> +
>> +typedef struct Packet {
>> +    void *data;
>> +    union {
>> +        uint8_t *network_layer;
>> +        struct ip *ip;
>> +    };
>> +    uint8_t *transport_layer;
>> +    int size;
>> +    ColoProxyState *s;
>> +    bool should_be_sent;
>> +    NetClientState *sender;
>> +} Packet;
>> +
>> +typedef struct Connection_key {
>> +    /* (src, dst) must be grouped, in the same way than in IP header */
>> +    uint32_t src;
>> +    uint32_t dst;
>> +    union {
>> +        uint32_t ports;
>> +        uint16_t port16[2];
>> +    };
>> +    uint8_t ip_proto;
>> +} QEMU_PACKED Connection_key;
>> +
>> +typedef struct Connection {
>> +    /* connection primary send queue */
>> +    GQueue primary_list;
>> +    /* connection secondary send queue */
>> +    GQueue secondary_list;
>> +     /* flag to enqueue unprocessed_connections */
>> +    bool processing;
>> +} Connection;
>> +
>> +typedef enum {
>> +    PRIMARY_OUTPUT,           /* primary output packet queue */
>> +    PRIMARY_INPUT,            /* primary input packet queue */
>> +    SECONDARY_OUTPUT,         /* secondary output packet queue */
>> +} packet_type;
>> +
>>   #endif /* QEMU_COLO_PROXY_H */
>>
>
>
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 1/9] Init colo-proxy object based on netfilter
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 1/9] Init colo-proxy object " Zhang Chen
@ 2015-11-30  2:50   ` Wen Congyang
  2015-11-30  5:38     ` Zhang Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Wen Congyang @ 2015-11-30  2:50 UTC (permalink / raw)
  To: Zhang Chen, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: zhanghailiang, Li Zhijian, Gui jianfeng, eddie.dong,
	Dr. David Alan Gilbert, Huang peng, Gong lei, jan.kiszka

On 11/27/2015 08:27 PM, Zhang Chen wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> 
> add colo-proxy in vl.c and qemu-options.hx
> 
> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  qemu-options.hx | 4 ++++
>  vl.c            | 3 ++-
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 949db7f..5e6f1e3 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -3666,6 +3666,10 @@ queue @var{all|rx|tx} is an option that can be applied to any netfilter.
>  @option{tx}: the filter is attached to the transmit queue of the netdev,
>               where it will receive packets sent by the netdev.
>  
> +@item -object colo-proxy,id=@var{id},netdev=@var{netdevid},port=@var{t},addr=@var{ip:port},mode=@var{primary|secondary}[,queue=@var{all|rx|tx}]

1. queue *MUST* be all for the filter colo-proxy.
2. The option port should be removed
3. The option addr is socket address. The format can be host:port, or fd.

> +
> +colo-proxy

Add more description here.

Thanks
Wen Congyang

> +
>  @item -object filter-dump,id=@var{id},netdev=@var{dev},file=@var{filename}][,maxlen=@var{len}]
>  
>  Dump the network traffic on netdev @var{dev} to the file specified by
> diff --git a/vl.c b/vl.c
> index f5f7c3f..9037743 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2774,7 +2774,8 @@ static bool object_create_initial(const char *type)
>       * they depend on netdevs already existing
>       */
>      if (g_str_equal(type, "filter-buffer") ||
> -        g_str_equal(type, "filter-dump")) {
> +        g_str_equal(type, "filter-dump") ||
> +        g_str_equal(type, "colo-proxy")) {
>          return false;
>      }
>  
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework Zhang Chen
  2015-11-28  2:46   ` Hailiang Zhang
@ 2015-11-30  3:10   ` Wen Congyang
  2015-11-30  5:44     ` Zhang Chen
  1 sibling, 1 reply; 35+ messages in thread
From: Wen Congyang @ 2015-11-30  3:10 UTC (permalink / raw)
  To: Zhang Chen, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: zhanghailiang, Li Zhijian, Gui jianfeng, eddie.dong,
	Dr. David Alan Gilbert, Huang peng, Gong lei, jan.kiszka

On 11/27/2015 08:27 PM, Zhang Chen wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> 
> Colo-proxy is a plugin of qemu netfilter
> like filter-buffer and dump
> 
> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  net/Makefile.objs |   1 +
>  net/colo-proxy.c  | 139 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  net/colo-proxy.h  |  63 +++++++++++++++++++++++++
>  3 files changed, 203 insertions(+)
>  create mode 100644 net/colo-proxy.c
>  create mode 100644 net/colo-proxy.h
> 
> diff --git a/net/Makefile.objs b/net/Makefile.objs
> index 5fa2f97..95670f2 100644
> --- a/net/Makefile.objs
> +++ b/net/Makefile.objs
> @@ -15,3 +15,4 @@ common-obj-$(CONFIG_VDE) += vde.o
>  common-obj-$(CONFIG_NETMAP) += netmap.o
>  common-obj-y += filter.o
>  common-obj-y += filter-buffer.o
> +common-obj-y += colo-proxy.o
> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> new file mode 100644
> index 0000000..98c2699
> --- /dev/null
> +++ b/net/colo-proxy.c
> @@ -0,0 +1,139 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2015 FUJITSU LIMITED
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#include "colo-proxy.h"
> +
> +#define __DEBUG__
> +
> +#ifdef __DEBUG__
> +#define DEBUG(format, ...) printf(format, ##__VA_ARGS__)
> +#else
> +#define DEBUG(format, ...)
> +#endif
> +
> +
> +static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
> +                                         NetClientState *sender,
> +                                         unsigned flags,
> +                                         const struct iovec *iov,
> +                                         int iovcnt,
> +                                         NetPacketSent *sent_cb)
> +{
> +    /*
> +     * We return size when buffer a packet, the sender will take it as
> +     * a already sent packet, so sent_cb should not be called later.
> +     *
> +     */
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    if (s->colo_mode == COLO_PRIMARY_MODE) {
> +         /* colo_proxy_primary_handler */
> +    } else {
> +         /* colo_proxy_primary_handler */
> +    }
> +    return iov_size(iov, iovcnt);
> +}
> +
> +static void colo_proxy_cleanup(NetFilterState *nf)
> +{
> +     /* cleanup */
> +}
> +
> +
> +static void colo_proxy_setup(NetFilterState *nf, Error **errp)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    if (!s->addr) {
> +        error_setg(errp, "filter colo_proxy needs 'addr' \
> +                     property set!");
> +        return;
> +    }
> +
> +    if (nf->direction != NET_FILTER_DIRECTION_ALL) {
> +        printf("colo need queue all packet,\

s/need/needs/

> +                    please startup colo-proxy with queue=all\n");
> +        return;
> +    }
> +
> +    s->sockfd = -1;
> +    s->has_failover = false;
> +    colo_do_checkpoint = false;
> +    g_queue_init(&s->unprocessed_connections);
> +
> +    if (!strcmp(mode, PRIMARY_MODE)) {
> +        s->colo_mode = COLO_PRIMARY_MODE;
> +    } else if (!strcmp(mode, SECONDARY_MODE)) {
> +        s->colo_mode = COLO_SECONDARY_MODE;
> +    } else {
> +        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
> +                    "primary or secondary");
> +        return;
> +    }
> +}
> +
> +static void colo_proxy_class_init(ObjectClass *oc, void *data)
> +{
> +    NetFilterClass *nfc = NETFILTER_CLASS(oc);
> +
> +    nfc->setup = colo_proxy_setup;
> +    nfc->cleanup = colo_proxy_cleanup;
> +    nfc->receive_iov = colo_proxy_receive_iov;
> +}
> +
> +static char *colo_proxy_get_mode(Object *obj, Error **errp)
> +{
> +    return g_strdup(mode);
> +}
> +
> +static void colo_proxy_set_mode(Object *obj, const char *value, Error **errp)
> +{
> +    g_free(mode);
> +    mode = g_strdup(value);
> +}
> +
> +static char *colo_proxy_get_addr(Object *obj, Error **errp)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(obj);
> +
> +    return g_strdup(s->addr);
> +}
> +
> +static void colo_proxy_set_addr(Object *obj, const char *value, Error **errp)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(obj);
> +    g_free(s->addr);
> +    s->addr = g_strdup(value);

You can parse the address here, and can find the format error as early as possible.

> +}
> +
> +static void colo_proxy_init(Object *obj)
> +{
> +    object_property_add_str(obj, "mode", colo_proxy_get_mode,
> +                            colo_proxy_set_mode, NULL);
> +    object_property_add_str(obj, "addr", colo_proxy_get_addr,
> +                            colo_proxy_set_addr, NULL);
> +}
> +
> +static const TypeInfo colo_proxy_info = {
> +    .name = TYPE_FILTER_COLO_PROXY,
> +    .parent = TYPE_NETFILTER,
> +    .class_init = colo_proxy_class_init,
> +    .instance_init = colo_proxy_init,
> +    .instance_size = sizeof(ColoProxyState),
> +};
> +
> +static void register_types(void)
> +{
> +    type_register_static(&colo_proxy_info);
> +}
> +
> +type_init(register_types);
> diff --git a/net/colo-proxy.h b/net/colo-proxy.h
> new file mode 100644
> index 0000000..94afbc7
> --- /dev/null
> +++ b/net/colo-proxy.h
> @@ -0,0 +1,63 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2015 FUJITSU LIMITED
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +
> +#ifndef QEMU_COLO_PROXY_H
> +#define QEMU_COLO_PROXY_H
> +
> +#include "net/filter.h"
> +#include "net/queue.h"
> +#include "qemu-common.h"
> +#include "qemu/iov.h"
> +#include "qapi/qmp/qerror.h"
> +#include "qapi-visit.h"
> +#include "qom/object.h"
> +#include "qemu/sockets.h"
> +#include "qemu/main-loop.h"
> +#include <netinet/if_ether.h>
> +#include "qemu/jhash.h"
> +#include "qemu/coroutine.h"

Don't include too many header files here. You should only include the header files
which will be used in this header file.

> +
> +#define FILTER_COLO_PROXY(obj) \
> +    OBJECT_CHECK(ColoProxyState, (obj), TYPE_FILTER_COLO_PROXY)
> +
> +#define TYPE_FILTER_COLO_PROXY "colo-proxy"
> +#define PRIMARY_MODE "primary"
> +#define SECONDARY_MODE "secondary"
> +
> +typedef enum {
> +    COLO_PRIMARY_MODE,               /* primary mode  */
> +    COLO_SECONDARY_MODE,             /* secondary mode */
> +} mode_type;
> +
> +typedef struct ColoProxyState {
> +    NetFilterState parent_obj;
> +    NetQueue *incoming_queue;        /* guest normal net queue */
> +    NetFilterDirection direction;    /* packet direction */
> +    mode_type colo_mode;             /* colo mode (primary or
> +                                      * secondary)
> +                                      */
> +    char *addr;                       /* primary colo connect addr
> +                                      * or secondary server addr
> +                                      */
> +    int sockfd;                      /* primary client socket fd or
> +                                      * secondary server socket fd
> +                                      */
> +    bool has_failover;               /* colo failover flag */
> +    GHashTable *unprocessed_packets; /* hashtable to save connection */
> +    GQueue unprocessed_connections;  /* to save unprocessed_connections */
> +    Coroutine *co;
> +} ColoProxyState;
> +
> +#endif /* QEMU_COLO_PROXY_H */
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 5/9] net/colo-proxy: add colo packet handler
  2015-11-28  3:17   ` Hailiang Zhang
@ 2015-11-30  5:37     ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-11-30  5:37 UTC (permalink / raw)
  To: Hailiang Zhang, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: Li Zhijian, Gui jianfeng, eddie.dong, peter.huangpeng,
	Dr. David Alan Gilbert, Gong lei, jan.kiszka, hongyang.yang



On 11/28/2015 11:17 AM, Hailiang Zhang wrote:
> On 2015/11/27 20:27, Zhang Chen wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> add primary and secondary handler
>>
>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-proxy.c | 105 
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 103 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
>> index 89d9616..ece5661 100644
>> --- a/net/colo-proxy.c
>> +++ b/net/colo-proxy.c
>> @@ -25,6 +25,101 @@
>>   static char *mode;
>>   static bool colo_do_checkpoint;
>>
>> +/*
>> + * colo primary handle host's normal send and
>> + * recv packets to primary guest
>> + * return:          >= 0      success
>> + *                  < 0       failed
>> + */
>> +static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
>> +                                         NetClientState *sender,
>> +                                         unsigned flags,
>> +                                         const struct iovec *iov,
>> +                                         int iovcnt,
>> +                                         NetPacketSent *sent_cb)
>> +{
>> +    ssize_t ret = 0;
>> +    int direction;
>> +
>> +    if (sender == nf->netdev) {
>> +        /* This packet is sent by netdev itself */
>> +        direction = NET_FILTER_DIRECTION_TX;
>> +    } else {
>> +        direction = NET_FILTER_DIRECTION_RX;
>> +    }
>> +    /*
>> +     * if packet's direction=rx
>> +     * enqueue packets to primary queue
>> +     * and wait secondary queue to compare
>> +     * if packet's direction=tx
>> +     * enqueue packets then send packets to
>> +     * secondary and flush  queued packets
>> +    */
>> +
>> +    if (colo_do_checkpoint) {
>> +        colo_proxy_do_checkpoint(nf);
>> +    }
>> +
>
> Wrong patch ? Where is the definition of colo_proxy_do_checkpoint() ?

sorry,the definition in patch 9/9,in next version I will replace it with
/* colo_proxy_do_checkpoint */

thanks for review
zhangchen

> Besides, why did we need to call colo_proxy_do_checkpoint() here ?
>

if proxy compare modles find packet different,it will nofity colo to do 
checkpoint
(use colo_proxy_notify_checkpoint).then proxy wait colo to respond and
change colo_do_checkpoint = true,in that time proxy flush queued primary 
packet.
the location we call colo_proxy_do_checkpoint() will fix in next version.

>> +    if (direction == NET_FILTER_DIRECTION_RX) {
>> +        /* TODO: enqueue_primary_packet */
>> +    } else {
>> +        /* TODO: forward packets to another */
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +/*
>> + * colo secondary handle host's normal send and
>> + * recv packets to secondary guest
>> + * return:          >= 0      success
>> + *                  < 0       failed
>> + */
>> +static ssize_t colo_proxy_secondary_handler(NetFilterState *nf,
>> +                                         NetClientState *sender,
>> +                                         unsigned flags,
>> +                                         const struct iovec *iov,
>> +                                         int iovcnt,
>> +                                         NetPacketSent *sent_cb)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    int direction;
>> +    ssize_t ret = 0;
>> +
>> +    if (sender == nf->netdev) {
>> +        /* This packet is sent by netdev itself */
>> +        direction = NET_FILTER_DIRECTION_TX;
>> +    } else {
>> +        direction = NET_FILTER_DIRECTION_RX;
>> +    }
>> +    /*
>> +     * if packet's direction=rx
>> +     * enqueue packets and send to
>> +     * primary QEMU
>> +     * if packet's direction=tx
>> +     * record PVM's packet inital seq & adjust
>> +     * client's ack,send adjusted packets to SVM(next version will 
>> be do)
>> +     */
>> +
>> +    if (direction == NET_FILTER_DIRECTION_RX) {
>
>> +        if (colo_has_failover(nf)) {
>> +            qemu_net_queue_send_iov(s->incoming_queue, sender, 
>> flags, iov,
>> +                            iovcnt, NULL);
>> +            return 1;
>
>> +        } else {
>> +        /* TODO: forward packets to another */
>> +        }
>> +
>> +    } else {
>
>> +        if (colo_has_failover(nf)) {
>> +            qemu_net_queue_send_iov(s->incoming_queue, sender, 
>> flags, iov,
>> +                            iovcnt, NULL);
>> +        }
>> +        return 1;
>
> These codes can be placed outside of the outer if/else.
>

fix

>> +    }
>> +    return ret;
>> +}
>> +
>>   static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>>                                            NetClientState *sender,
>>                                            unsigned flags,
>> @@ -38,10 +133,16 @@ static ssize_t 
>> colo_proxy_receive_iov(NetFilterState *nf,
>>        *
>>        */
>>       ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    ssize_t ret = 0;
>
> Space ~
>

fix

>>       if (s->colo_mode == COLO_PRIMARY_MODE) {
>> -         /* colo_proxy_primary_handler */
>> +        ret = colo_proxy_primary_handler(nf, sender, flags,
>> +                    iov, iovcnt, sent_cb);
>>       } else {
>> -         /* colo_proxy_primary_handler */
>> +        ret = colo_proxy_secondary_handler(nf, sender, flags,
>> +                    iov, iovcnt, sent_cb);
>> +    }
>> +    if (ret < 0) {
>> +        DEBUG("colo_proxy_receive_iov running failed\n");
>>       }
>>       return iov_size(iov, iovcnt);
>>   }
>>
>
>
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 1/9] Init colo-proxy object based on netfilter
  2015-11-30  2:50   ` Wen Congyang
@ 2015-11-30  5:38     ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-11-30  5:38 UTC (permalink / raw)
  To: Wen Congyang, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: zhanghailiang, Li Zhijian, Gui jianfeng, eddie.dong,
	Dr. David Alan Gilbert, Huang peng, Gong lei, jan.kiszka,
	hongyang.yang



On 11/30/2015 10:50 AM, Wen Congyang wrote:
> On 11/27/2015 08:27 PM, Zhang Chen wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> add colo-proxy in vl.c and qemu-options.hx
>>
>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   qemu-options.hx | 4 ++++
>>   vl.c            | 3 ++-
>>   2 files changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 949db7f..5e6f1e3 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -3666,6 +3666,10 @@ queue @var{all|rx|tx} is an option that can be applied to any netfilter.
>>   @option{tx}: the filter is attached to the transmit queue of the netdev,
>>                where it will receive packets sent by the netdev.
>>   
>> +@item -object colo-proxy,id=@var{id},netdev=@var{netdevid},port=@var{t},addr=@var{ip:port},mode=@var{primary|secondary}[,queue=@var{all|rx|tx}]
> 1. queue *MUST* be all for the filter colo-proxy.
> 2. The option port should be removed
> 3. The option addr is socket address. The format can be host:port, or fd.

will fix in next version

thanks for review
zhangchen

>> +
>> +colo-proxy
> Add more description here.
>
> Thanks
> Wen Congyang
>

will fix in next version

>> +
>>   @item -object filter-dump,id=@var{id},netdev=@var{dev},file=@var{filename}][,maxlen=@var{len}]
>>   
>>   Dump the network traffic on netdev @var{dev} to the file specified by
>> diff --git a/vl.c b/vl.c
>> index f5f7c3f..9037743 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -2774,7 +2774,8 @@ static bool object_create_initial(const char *type)
>>        * they depend on netdevs already existing
>>        */
>>       if (g_str_equal(type, "filter-buffer") ||
>> -        g_str_equal(type, "filter-dump")) {
>> +        g_str_equal(type, "filter-dump") ||
>> +        g_str_equal(type, "colo-proxy")) {
>>           return false;
>>       }
>>   
>>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework
  2015-11-30  3:10   ` Wen Congyang
@ 2015-11-30  5:44     ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-11-30  5:44 UTC (permalink / raw)
  To: Wen Congyang, qemu devel, Jason Wang, Stefan Hajnoczi
  Cc: zhanghailiang, Li Zhijian, Gui jianfeng, eddie.dong,
	Dr. David Alan Gilbert, Huang peng, Gong lei, jan.kiszka,
	hongyang.yang



On 11/30/2015 11:10 AM, Wen Congyang wrote:
> On 11/27/2015 08:27 PM, Zhang Chen wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> Colo-proxy is a plugin of qemu netfilter
>> like filter-buffer and dump
>>
>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/Makefile.objs |   1 +
>>   net/colo-proxy.c  | 139 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/colo-proxy.h  |  63 +++++++++++++++++++++++++
>>   3 files changed, 203 insertions(+)
>>   create mode 100644 net/colo-proxy.c
>>   create mode 100644 net/colo-proxy.h
>>
>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>> index 5fa2f97..95670f2 100644
>> --- a/net/Makefile.objs
>> +++ b/net/Makefile.objs
>> @@ -15,3 +15,4 @@ common-obj-$(CONFIG_VDE) += vde.o
>>   common-obj-$(CONFIG_NETMAP) += netmap.o
>>   common-obj-y += filter.o
>>   common-obj-y += filter-buffer.o
>> +common-obj-y += colo-proxy.o
>> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
>> new file mode 100644
>> index 0000000..98c2699
>> --- /dev/null
>> +++ b/net/colo-proxy.c
>> @@ -0,0 +1,139 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2015 FUJITSU LIMITED
>> + * Copyright (c) 2015 Intel Corporation
>> + *
>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "colo-proxy.h"
>> +
>> +#define __DEBUG__
>> +
>> +#ifdef __DEBUG__
>> +#define DEBUG(format, ...) printf(format, ##__VA_ARGS__)
>> +#else
>> +#define DEBUG(format, ...)
>> +#endif
>> +
>> +
>> +static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>> +                                         NetClientState *sender,
>> +                                         unsigned flags,
>> +                                         const struct iovec *iov,
>> +                                         int iovcnt,
>> +                                         NetPacketSent *sent_cb)
>> +{
>> +    /*
>> +     * We return size when buffer a packet, the sender will take it as
>> +     * a already sent packet, so sent_cb should not be called later.
>> +     *
>> +     */
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    if (s->colo_mode == COLO_PRIMARY_MODE) {
>> +         /* colo_proxy_primary_handler */
>> +    } else {
>> +         /* colo_proxy_primary_handler */
>> +    }
>> +    return iov_size(iov, iovcnt);
>> +}
>> +
>> +static void colo_proxy_cleanup(NetFilterState *nf)
>> +{
>> +     /* cleanup */
>> +}
>> +
>> +
>> +static void colo_proxy_setup(NetFilterState *nf, Error **errp)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    if (!s->addr) {
>> +        error_setg(errp, "filter colo_proxy needs 'addr' \
>> +                     property set!");
>> +        return;
>> +    }
>> +
>> +    if (nf->direction != NET_FILTER_DIRECTION_ALL) {
>> +        printf("colo need queue all packet,\
> s/need/needs/

fix

thanks for review
zhangchen

>> +                    please startup colo-proxy with queue=all\n");
>> +        return;
>> +    }
>> +
>> +    s->sockfd = -1;
>> +    s->has_failover = false;
>> +    colo_do_checkpoint = false;
>> +    g_queue_init(&s->unprocessed_connections);
>> +
>> +    if (!strcmp(mode, PRIMARY_MODE)) {
>> +        s->colo_mode = COLO_PRIMARY_MODE;
>> +    } else if (!strcmp(mode, SECONDARY_MODE)) {
>> +        s->colo_mode = COLO_SECONDARY_MODE;
>> +    } else {
>> +        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
>> +                    "primary or secondary");
>> +        return;
>> +    }
>> +}
>> +
>> +static void colo_proxy_class_init(ObjectClass *oc, void *data)
>> +{
>> +    NetFilterClass *nfc = NETFILTER_CLASS(oc);
>> +
>> +    nfc->setup = colo_proxy_setup;
>> +    nfc->cleanup = colo_proxy_cleanup;
>> +    nfc->receive_iov = colo_proxy_receive_iov;
>> +}
>> +
>> +static char *colo_proxy_get_mode(Object *obj, Error **errp)
>> +{
>> +    return g_strdup(mode);
>> +}
>> +
>> +static void colo_proxy_set_mode(Object *obj, const char *value, Error **errp)
>> +{
>> +    g_free(mode);
>> +    mode = g_strdup(value);
>> +}
>> +
>> +static char *colo_proxy_get_addr(Object *obj, Error **errp)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(obj);
>> +
>> +    return g_strdup(s->addr);
>> +}
>> +
>> +static void colo_proxy_set_addr(Object *obj, const char *value, Error **errp)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(obj);
>> +    g_free(s->addr);
>> +    s->addr = g_strdup(value);
> You can parse the address here, and can find the format error as early as possible.

ok,it will fix in next version

>> +}
>> +
>> +static void colo_proxy_init(Object *obj)
>> +{
>> +    object_property_add_str(obj, "mode", colo_proxy_get_mode,
>> +                            colo_proxy_set_mode, NULL);
>> +    object_property_add_str(obj, "addr", colo_proxy_get_addr,
>> +                            colo_proxy_set_addr, NULL);
>> +}
>> +
>> +static const TypeInfo colo_proxy_info = {
>> +    .name = TYPE_FILTER_COLO_PROXY,
>> +    .parent = TYPE_NETFILTER,
>> +    .class_init = colo_proxy_class_init,
>> +    .instance_init = colo_proxy_init,
>> +    .instance_size = sizeof(ColoProxyState),
>> +};
>> +
>> +static void register_types(void)
>> +{
>> +    type_register_static(&colo_proxy_info);
>> +}
>> +
>> +type_init(register_types);
>> diff --git a/net/colo-proxy.h b/net/colo-proxy.h
>> new file mode 100644
>> index 0000000..94afbc7
>> --- /dev/null
>> +++ b/net/colo-proxy.h
>> @@ -0,0 +1,63 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2015 FUJITSU LIMITED
>> + * Copyright (c) 2015 Intel Corporation
>> + *
>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +
>> +#ifndef QEMU_COLO_PROXY_H
>> +#define QEMU_COLO_PROXY_H
>> +
>> +#include "net/filter.h"
>> +#include "net/queue.h"
>> +#include "qemu-common.h"
>> +#include "qemu/iov.h"
>> +#include "qapi/qmp/qerror.h"
>> +#include "qapi-visit.h"
>> +#include "qom/object.h"
>> +#include "qemu/sockets.h"
>> +#include "qemu/main-loop.h"
>> +#include <netinet/if_ether.h>
>> +#include "qemu/jhash.h"
>> +#include "qemu/coroutine.h"
> Don't include too many header files here. You should only include the header files
> which will be used in this header file.

ok,I will move some to colo-proxy.c

>> +
>> +#define FILTER_COLO_PROXY(obj) \
>> +    OBJECT_CHECK(ColoProxyState, (obj), TYPE_FILTER_COLO_PROXY)
>> +
>> +#define TYPE_FILTER_COLO_PROXY "colo-proxy"
>> +#define PRIMARY_MODE "primary"
>> +#define SECONDARY_MODE "secondary"
>> +
>> +typedef enum {
>> +    COLO_PRIMARY_MODE,               /* primary mode  */
>> +    COLO_SECONDARY_MODE,             /* secondary mode */
>> +} mode_type;
>> +
>> +typedef struct ColoProxyState {
>> +    NetFilterState parent_obj;
>> +    NetQueue *incoming_queue;        /* guest normal net queue */
>> +    NetFilterDirection direction;    /* packet direction */
>> +    mode_type colo_mode;             /* colo mode (primary or
>> +                                      * secondary)
>> +                                      */
>> +    char *addr;                       /* primary colo connect addr
>> +                                      * or secondary server addr
>> +                                      */
>> +    int sockfd;                      /* primary client socket fd or
>> +                                      * secondary server socket fd
>> +                                      */
>> +    bool has_failover;               /* colo failover flag */
>> +    GHashTable *unprocessed_packets; /* hashtable to save connection */
>> +    GQueue unprocessed_connections;  /* to save unprocessed_connections */
>> +    Coroutine *co;
>> +} ColoProxyState;
>> +
>> +#endif /* QEMU_COLO_PROXY_H */
>>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 2/9] jhash: add linux kernel jhashtable in qemu
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 2/9] jhash: add linux kernel jhashtable in qemu Zhang Chen
@ 2015-12-01 11:23   ` Dr. David Alan Gilbert
  2015-12-03  3:40     ` Zhang Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-01 11:23 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> 
> This used by colo-proxy to save and lookup
> connection info
> 
> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  include/qemu/jhash.h | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
>  create mode 100644 include/qemu/jhash.h
> 
> diff --git a/include/qemu/jhash.h b/include/qemu/jhash.h
> new file mode 100644
> index 0000000..f6cc7b3
> --- /dev/null
> +++ b/include/qemu/jhash.h
> @@ -0,0 +1,52 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2015 FUJITSU LIMITED
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */

Please be careful with the copyright.
This code is mostly a copy of the kernel code, so I think it should take the copyright
from the kernel code (see kernel's include/linux/jhash.h); and that copyright
states it's based on Bob Jenkins' Public Domain code.
Normally we have a problem as well QEMU doesn't like taking GPL2 code from the kernel,
but my reading of the kernel's header file is that it's still Public domain; but I don't
know what the right thing is to be sure.

Dave

> +
> +#ifndef QEMU_JHASH_H__
> +#define QEMU_JHASH_H__
> +
> +/*
> + * hashtable relation copy from linux kernel jhash
> + */
> +static inline uint32_t rol32(uint32_t word, unsigned int shift)
> +{
> +    return (word << shift) | (word >> (32 - shift));
> +}
> +
> +/* __jhash_mix -- mix 3 32-bit values reversibly. */
> +#define __jhash_mix(a, b, c)                \
> +{                                           \
> +    a -= c;  a ^= rol32(c, 4);  c += b;     \
> +    b -= a;  b ^= rol32(a, 6);  a += c;     \
> +    c -= b;  c ^= rol32(b, 8);  b += a;     \
> +    a -= c;  a ^= rol32(c, 16); c += b;     \
> +    b -= a;  b ^= rol32(a, 19); a += c;     \
> +    c -= b;  c ^= rol32(b, 4);  b += a;     \
> +}
> +
> +/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
> +#define __jhash_final(a, b, c)  \
> +{                               \
> +    c ^= b; c -= rol32(b, 14);  \
> +    a ^= c; a -= rol32(c, 11);  \
> +    b ^= a; b -= rol32(a, 25);  \
> +    c ^= b; c -= rol32(b, 16);  \
> +    a ^= c; a -= rol32(c, 4);   \
> +    b ^= a; b -= rol32(a, 14);  \
> +    c ^= b; c -= rol32(b, 24);  \
> +}
> +
> +/* An arbitrary initial parameter */
> +#define JHASH_INITVAL           0xdeadbeef
> +
> +#endif /* QEMU_JHASH_H__ */
> -- 
> 1.9.1
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work Zhang Chen
  2015-11-28  3:02   ` Hailiang Zhang
@ 2015-12-01 15:35   ` Dr. David Alan Gilbert
  2015-12-03  3:49     ` Zhang Chen
  1 sibling, 1 reply; 35+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-01 15:35 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> 
> Secondary setup socket server for colo-forward
> primary setup connect to secondary for colo-forward
> add data structure will be uesed

I wodner if it's possible to reuse the '-netdev socket,' stuff rather
than handling the socket connection yourself (I don't know much about it,
but since it's there it might be worth checking, see net/socket)

> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  net/colo-proxy.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  net/colo-proxy.h | 61 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 147 insertions(+), 1 deletion(-)
> 
> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> index 98c2699..89d9616 100644
> --- a/net/colo-proxy.c
> +++ b/net/colo-proxy.c
> @@ -22,6 +22,8 @@
>  #define DEBUG(format, ...)
>  #endif
>  
> +static char *mode;

Do we really need this as a global - you parse it and put it into the
ColoProxyState anyway? If it really does need to be a global you need to
use a better name.

> +static bool colo_do_checkpoint;
>  
>  static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>                                           NetClientState *sender,
> @@ -46,13 +48,84 @@ static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>  
>  static void colo_proxy_cleanup(NetFilterState *nf)
>  {
> -     /* cleanup */
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    close(s->sockfd);
> +    s->sockfd = -1;
> +    g_free(mode);
> +    g_free(s->addr);
>  }
>  
> +static void colo_accept_incoming(ColoProxyState *s)
> +{
> +    DEBUG("into colo_accept_incoming\n");
> +    struct sockaddr_in addr;
> +    socklen_t addrlen = sizeof(addr);
> +    int acceptsock, err;
> +
> +    do {
> +        acceptsock = qemu_accept(s->sockfd, (struct sockaddr *)&addr, &addrlen);
> +        err = socket_error();
> +    } while (acceptsock < 0 && err == EINTR);
> +    qemu_set_fd_handler(s->sockfd, NULL, NULL, NULL);
> +    closesocket(s->sockfd);
> +
> +    DEBUG("accept colo proxy\n");
> +
> +    if (acceptsock < 0) {
> +        printf("could not accept colo connection (%s)\n",
> +                     strerror(err));

use error_report instead of printf please; also instead of 'colo connection'
make sure to say 'colo proxy connection' to make it easy to know which
failure it is.

> +        return;
> +    }
> +    s->sockfd = acceptsock;
> +    /* TODO: handle the packets that primary forward */
> +    return;
> +}
> +
> +/* Return 1 on success, or return -1 if failed */
> +static ssize_t colo_start_incoming(ColoProxyState *s)
> +{
> +    int serversock;
> +    serversock = inet_listen(s->addr, NULL, 256, SOCK_STREAM, 0, NULL);
> +    if (serversock < 0) {
> +        g_free(s->addr);
> +        return -1;
> +    }
> +    s->sockfd = serversock;
> +    qemu_set_fd_handler(serversock, (IOHandler *)colo_accept_incoming, NULL,
> +                        (void *)s);
> +    g_free(s->addr);
> +    return 1;
> +}
> +
> +/* Return 1 on success, or return -1 if setup failed */
> +static ssize_t colo_proxy_primary_setup(NetFilterState *nf)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    int sock;
> +    sock = inet_connect(s->addr, NULL);
> +    if (sock < 0) {
> +        printf("colo proxy connect failed\n");
> +        g_free(s->addr);
> +        return -1;
> +    }
> +    DEBUG("colo proxy connect success\n");
> +    s->sockfd = sock;
> +   /* TODO: handle the packets that secondary forward */
> +    g_free(s->addr);
> +    return 1;
> +}
> +
> +/* Return 1 on success, or return -1 if setup failed */
> +static ssize_t colo_proxy_secondary_setup(NetFilterState *nf)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    return colo_start_incoming(s);
> +}
>  
>  static void colo_proxy_setup(NetFilterState *nf, Error **errp)
>  {
>      ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    ssize_t ret = 0;
>      if (!s->addr) {
>          error_setg(errp, "filter colo_proxy needs 'addr' \
>                       property set!");
> @@ -68,17 +141,29 @@ static void colo_proxy_setup(NetFilterState *nf, Error **errp)
>      s->sockfd = -1;
>      s->has_failover = false;
>      colo_do_checkpoint = false;
> +    s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
> +    s->unprocessed_packets = g_hash_table_new_full(connection_key_hash,
> +                                                       connection_key_equal,
> +                                                       g_free,
> +                                                       connection_destroy);
>      g_queue_init(&s->unprocessed_connections);
>  
>      if (!strcmp(mode, PRIMARY_MODE)) {
>          s->colo_mode = COLO_PRIMARY_MODE;
> +        ret = colo_proxy_primary_setup(nf);
>      } else if (!strcmp(mode, SECONDARY_MODE)) {
>          s->colo_mode = COLO_SECONDARY_MODE;
> +        ret = colo_proxy_secondary_setup(nf);
>      } else {
>          error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
>                      "primary or secondary");
>          return;
>      }
> +    if (ret) {
> +        DEBUG("colo_proxy_setup success\n");
> +    } else {
> +        DEBUG("colo_proxy_setup failed\n");
> +    }

It's easier to use trace_ (especially with the stderr backend, it's very easy).
Do you not want to return the 'ret' value?

>  }
>  
>  static void colo_proxy_class_init(ObjectClass *oc, void *data)
> diff --git a/net/colo-proxy.h b/net/colo-proxy.h
> index 94afbc7..f77db2f 100644
> --- a/net/colo-proxy.h
> +++ b/net/colo-proxy.h
> @@ -60,4 +60,65 @@ typedef struct ColoProxyState {
>      Coroutine *co;
>  } ColoProxyState;
>  
> +struct ip {
> +#ifdef HOST_WORDS_BIGENDIAN
> +    uint8_t  ip_v:4,                 /* version */
> +             ip_hl:4;                /* header length */
> +#else
> +    uint8_t  ip_hl:4,                /* header length */
> +             ip_v:4;                 /* version */
> +#endif
> +    uint8_t  ip_tos;                 /* type of service */
> +    uint16_t ip_len;                 /* total length */
> +    uint16_t ip_id;                  /* identification */
> +    uint16_t ip_off;                 /* fragment offset field */
> +#define    IP_DF 0x4000              /* don't fragment flag */
> +#define    IP_MF 0x2000              /* more fragments flag */
> +#define    IP_OFFMASK 0x1fff
> +/* mask for fragmenting bits */
> +    uint8_t  ip_ttl;                 /* time to live */
> +    uint8_t  ip_p;                   /* protocol */
> +    uint16_t ip_sum;                 /* checksum */
> +    uint32_t ip_src, ip_dst;         /* source and dest address */
> +} QEMU_PACKED;
> +

Why not just #include slirp/ip.h ?

> +typedef struct Packet {
> +    void *data;
> +    union {
> +        uint8_t *network_layer;
> +        struct ip *ip;
> +    };
> +    uint8_t *transport_layer;
> +    int size;
> +    ColoProxyState *s;
> +    bool should_be_sent;
> +    NetClientState *sender;
> +} Packet;
> +
> +typedef struct Connection_key {
> +    /* (src, dst) must be grouped, in the same way than in IP header */
> +    uint32_t src;
> +    uint32_t dst;
> +    union {
> +        uint32_t ports;
> +        uint16_t port16[2];
> +    };

Why the union?

Dave

> +    uint8_t ip_proto;
> +} QEMU_PACKED Connection_key;
> +
> +typedef struct Connection {
> +    /* connection primary send queue */
> +    GQueue primary_list;
> +    /* connection secondary send queue */
> +    GQueue secondary_list;
> +     /* flag to enqueue unprocessed_connections */
> +    bool processing;
> +} Connection;
> +
> +typedef enum {
> +    PRIMARY_OUTPUT,           /* primary output packet queue */
> +    PRIMARY_INPUT,            /* primary input packet queue */
> +    SECONDARY_OUTPUT,         /* secondary output packet queue */
> +} packet_type;
> +
>  #endif /* QEMU_COLO_PROXY_H */
> -- 
> 1.9.1
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 6/9] net/colo-proxy: add packet forward function
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 6/9] net/colo-proxy: add packet forward function Zhang Chen
@ 2015-12-01 15:50   ` Dr. David Alan Gilbert
  2015-12-03  6:17     ` Zhang Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-01 15:50 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> 
> The packet recv by primary forward to secondary
> The packet send by secondary forward to primary
> 
> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  net/colo-proxy.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 114 insertions(+), 4 deletions(-)
> 
> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> index ece5661..08a852f 100644
> --- a/net/colo-proxy.c
> +++ b/net/colo-proxy.c
> @@ -26,6 +26,110 @@ static char *mode;
>  static bool colo_do_checkpoint;
>  
>  /*
> + * Packets to be sent by colo forward to
> + * another colo
> + * return:          >= 0        success
> + *                  < 0        failed
> + */
> +static ssize_t colo_forward2another(NetFilterState *nf,
> +                                         NetClientState *sender,
> +                                         unsigned flags,
> +                                         const struct iovec *iov,
> +                                         int iovcnt,
> +                                         NetPacketSent *sent_cb,
> +                                         mode_type mode)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    ssize_t ret = 0;
> +    ssize_t size = 0;
> +    struct iovec sizeiov = {
> +        .iov_base = &size,
> +        .iov_len = 8
> +    };

If you always want 'size' to be 8 bytes then use an int64_t
(or uint64_t probably if you are sending a length).

> +    size = iov_size(iov, iovcnt);
> +    if (!size) {
> +        return 0;
> +    }
> +
> +    if (mode == COLO_PRIMARY_MODE) {
> +        qemu_net_queue_send_iov(s->incoming_queue, sender, flags,
> +                           iov, iovcnt, NULL);
> +    }
> +    ret = iov_send(s->sockfd, &sizeiov, 8, 0, 8);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +    ret = iov_send(s->sockfd, iov, iovcnt, 0, size);
> +    return ret;
> +}
> +
> +/*
> + * recv and handle colo secondary
> + * forward packets in colo primary
> + */
> +static void colo_proxy_primary_forward_handler(NetFilterState *nf)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    ssize_t len = 0;
> +    ssize_t ret = 0;
> +    struct iovec sizeiov = {
> +        .iov_base = &len,
> +        .iov_len = 8
> +    };
> +    if (s->sockfd < 0) {
> +        printf("secondary forward disconnected\n");
> +        return;
> +    }
> +    iov_recv(s->sockfd, &sizeiov, 8, 0, 8);

Check the return value of iov_recv.

> +    DEBUG("primary_forward_handler recv lensbuf lens=%zu\n", len);
> +
> +    if (len > 0) {
> +        char *recvbuf;
> +        recvbuf = g_malloc0(len);

You should check the value of 'len' received off the wire - if things
go wrong the value could be massive and try and allocate a huge amount
of memory - checking it would also be a good check for something going
wrong.

> +        struct iovec iov = {
> +            .iov_base = recvbuf,
> +            .iov_len = len
> +        };
> +        iov_recv(s->sockfd, &iov, len, 0, len);
> +        DEBUG("primary_forward_handler primary recvbuf=%s\n", recvbuf);
> +        ret = colo_enqueue_secondary_packet(nf, recvbuf, len);
> +        if (ret) {
> +            DEBUG("colo_enqueue_secondary_packet succese\n");
> +        } else {
> +            DEBUG("colo_enqueue_secondary_packet failed\n");
> +        }
> +        g_free(recvbuf);
> +    }
> +}
> +
> +/*
> + * recv and handle colo primary
> + * forward packets in colo secondary
> + */
> +static void colo_proxy_secondary_forward_handler(NetFilterState *nf)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    ssize_t len = 0;
> +    struct iovec sizeiov = {
> +        .iov_base = &len,
> +        .iov_len = 8
> +    };
> +    iov_recv(s->sockfd, &sizeiov, 8, 0, 8);
> +    if (len > 0) {
> +        char *buf;
> +        buf = g_malloc0(len);
> +        struct iovec iov = {
> +            .iov_base = buf,
> +            .iov_len = len
> +        };
> +        iov_recv(s->sockfd, &iov, len, 0, len);

This code is very similar - factor it out and share it with the primary?

Dave

> +        qemu_net_queue_send(s->incoming_queue, nf->netdev,
> +                    0, (const uint8_t *)buf, len, NULL);
> +        g_free(buf);
> +    }
> +}
> +
> +/*
>   * colo primary handle host's normal send and
>   * recv packets to primary guest
>   * return:          >= 0      success
> @@ -63,7 +167,8 @@ static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
>      if (direction == NET_FILTER_DIRECTION_RX) {
>          /* TODO: enqueue_primary_packet */
>      } else {
> -        /* TODO: forward packets to another */
> +        ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
> +                    sent_cb, COLO_PRIMARY_MODE);
>      }
>  
>      return ret;
> @@ -107,7 +212,8 @@ static ssize_t colo_proxy_secondary_handler(NetFilterState *nf,
>                              iovcnt, NULL);
>              return 1;
>          } else {
> -        /* TODO: forward packets to another */
> +            ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
> +                        sent_cb, COLO_SECONDARY_MODE);
>          }
>  
>      } else {
> @@ -178,7 +284,9 @@ static void colo_accept_incoming(ColoProxyState *s)
>          return;
>      }
>      s->sockfd = acceptsock;
> -    /* TODO: handle the packets that primary forward */
> +    qemu_set_fd_handler(s->sockfd,
> +                (IOHandler *)colo_proxy_secondary_forward_handler, NULL,
> +                (void *)s);
>      return;
>  }
>  
> @@ -211,7 +319,9 @@ static ssize_t colo_proxy_primary_setup(NetFilterState *nf)
>      }
>      DEBUG("colo proxy connect success\n");
>      s->sockfd = sock;
> -   /* TODO: handle the packets that secondary forward */
> +    qemu_set_fd_handler(s->sockfd,
> +                (IOHandler *)colo_proxy_primary_forward_handler,
> +                NULL, (void *)s);
>      g_free(s->addr);
>      return 1;
>  }
> -- 
> 1.9.1
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function Zhang Chen
@ 2015-12-01 16:12   ` Dr. David Alan Gilbert
  2015-12-03  6:35     ` Zhang Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-01 16:12 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> 
> Add common packet handle function and enqueue
> packet distinguished connection,then we can
> lookup one connection packet to compare
> 
> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  net/colo-proxy.c | 167 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 166 insertions(+), 1 deletion(-)
> 
> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> index 08a852f..a664e6d 100644
> --- a/net/colo-proxy.c
> +++ b/net/colo-proxy.c
> @@ -24,6 +24,170 @@
>  
>  static char *mode;
>  static bool colo_do_checkpoint;
> +static void packet_destroy(void *opaque, void *user_data);
> +
> +static uint32_t connection_key_hash(const void *opaque)
> +{
> +    const Connection_key *key = opaque;
> +    uint32_t a, b, c;
> +
> +    /* Jenkins hash */
> +    a = b = c = JHASH_INITVAL + sizeof(*key);
> +    a += key->src;
> +    b += key->dst;
> +    c += key->ports;
> +    __jhash_mix(a, b, c);
> +
> +    a += key->ip_proto;
> +    __jhash_final(a, b, c);
> +
> +    return c;
> +}
> +
> +static int connection_key_equal(const void *opaque1, const void *opaque2)
> +{
> +    return memcmp(opaque1, opaque2, sizeof(Connection_key)) == 0;
> +}
> +
> +static void connection_destroy(void *opaque)
> +{
> +    Connection *connection = opaque;
> +    g_queue_foreach(&connection->primary_list, packet_destroy, NULL);
> +    g_queue_free(&connection->primary_list);
> +    g_queue_foreach(&connection->secondary_list, packet_destroy, NULL);
> +    g_queue_free(&connection->secondary_list);
> +    g_slice_free(Connection, connection);
> +}
> +
> +static Connection *connection_new(void)
> +{
> +    Connection *connection = g_slice_new(Connection);
> +
> +    g_queue_init(&connection->primary_list);
> +    g_queue_init(&connection->secondary_list);
> +    connection->processing = false;
> +
> +    return connection;
> +}
> +
> +/* Return 0 on success, or return -1 if the pkt is corrpted */
> +static int parse_packet_early(Packet *pkt, Connection_key *key)
> +{
> +    int network_length;
> +    uint8_t *data = pkt->data;
> +
> +    pkt->network_layer = data + ETH_HLEN;
> +    if (ntohs(*(uint16_t *)(data + 12)) != ETH_P_IP) {
> +        if (ntohs(*(uint16_t *)(data + 12)) == ETH_P_ARP) {
> +            return -1;
> +        }
> +        return 0;
> +    }

Can you use some of the functions/macros in include/net/eth.h to
make this easier? Maybe eth_get_l3_proto ?
Do you plan to do IPv6 at some point?

> +    network_length = pkt->ip->ip_hl * 4;
> +    pkt->transport_layer = pkt->network_layer + network_length;
> +    key->ip_proto = pkt->ip->ip_p;
> +    key->src = pkt->ip->ip_src;
> +    key->dst = pkt->ip->ip_dst;
> +
> +    switch (key->ip_proto) {
> +    case IPPROTO_TCP:
> +    case IPPROTO_UDP:
> +    case IPPROTO_DCCP:
> +    case IPPROTO_ESP:
> +    case IPPROTO_SCTP:
> +    case IPPROTO_UDPLITE:
> +        key->ports = *(uint32_t *)(pkt->transport_layer);
> +        break;
> +    case IPPROTO_AH:
> +        key->ports = *(uint32_t *)(pkt->transport_layer + 4);

Interesting; I don't see any other code in QEMU to handle AH,
and I don't know much about it.

> +        break;
> +    default:
> +        break;
> +    }
> +
> +    return 0;
> +}
> +
> +static Packet *packet_new(ColoProxyState *s, const void *data,
> +                          int size, Connection_key *key, NetClientState *sender)
> +{
> +    Packet *pkt = g_slice_new(Packet);
> +
> +    pkt->data = g_malloc(size);
> +    memcpy(pkt->data, data, size);

g_memdup might be useful for these:
https://developer.gnome.org/glib/stable/glib-Memory-Allocation.html#g-memdup

> +    pkt->size = size;
> +    pkt->s = s;
> +    pkt->sender = sender;
> +    pkt->should_be_sent = false;
> +
> +    if (parse_packet_early(pkt, key)) {
> +        packet_destroy(pkt, NULL);
> +        pkt = NULL;
> +    }
> +
> +    return pkt;
> +}
> +
> +static void packet_destroy(void *opaque, void *user_data)
> +{
> +    Packet *pkt = opaque;
> +    g_free(pkt->data);
> +    g_slice_free(Packet, pkt);
> +}
> +
> +static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
> +                                          Connection_key *key,
> +                                          Packet *pkt, packet_type type)
> +{
> +    Connection *connection;
> +    Packet *tmppkt;
> +    connection = g_hash_table_lookup(unprocessed_packets, key);
> +    if (connection == NULL) {
> +        Connection_key *new_key = g_malloc(sizeof(*key));
> +
> +        connection = connection_new();
> +        memcpy(new_key, key, sizeof(*key));
> +        key = new_key;
> +
> +        g_hash_table_insert(unprocessed_packets, key, connection);

Is 'unprocessed_packets' a good name for this hashtable? I'm not quite
sure I understand, but it looks to me like it's your connection-tracking equivalent,
which then has a queue for each connection with unprocessed packets?

Also, do we do anything to stop this hash growing really huge? If there
are lots-and-lots of connections can we limit it somehow? (what does Linux do?)

> +    }
> +    switch (type) {
> +    case PRIMARY_OUTPUT:
> +        if (g_queue_get_length(&connection->secondary_list) > 0) {

Please add some more comments; I think this is when a packet comes in
on the primary, and then we find we've already got a packet from the secondary
waiting?

> +            tmppkt = g_queue_pop_head(&connection->secondary_list);
> +            DEBUG("g_queue_get_length(&connection->primary_list)=%d\n",
> +                        g_queue_get_length(&connection->primary_list));
> +            DEBUG("g_queue_get_length(&connection->secondary_list)=%d\n",
> +                        g_queue_get_length(&connection->secondary_list));

> +            if (colo_packet_compare(pkt, tmppkt)) {
> +                DEBUG("packet same and release packet\n");
> +                pkt->should_be_sent = true;
> +                break;
> +            } else {
> +                DEBUG("packet different\n");
> +                colo_proxy_notify_checkpoint();
> +                pkt->should_be_sent = false;
> +                break;
> +            }
> +        } else {
> +            g_queue_push_tail(&connection->primary_list, pkt);
> +            pkt->should_be_sent = false;
> +        }
> +
> +        break;
> +    case SECONDARY_OUTPUT:
> +        g_queue_push_tail(&connection->secondary_list, pkt);
> +        DEBUG("secondary pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
> +                    (char *)pkt->data, pkt->ip->ip_src, pkt->ip->ip_dst);
> +        break;
> +    default:
> +        abort();
> +    }
> +
> +    return connection;
> +}
> +
>  
>  /*
>   * Packets to be sent by colo forward to
> @@ -165,7 +329,8 @@ static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
>      }
>  
>      if (direction == NET_FILTER_DIRECTION_RX) {
> -        /* TODO: enqueue_primary_packet */
> +        ret = colo_enqueue_primary_packet(nf, sender, flags, iov,
> +                    iovcnt, sent_cb);

The routine above is 'colo_enqueue_packet' rather than colo_enqueue_primary_packet?

>      } else {
>          ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
>                      sent_cb, COLO_PRIMARY_MODE);
> -- 
> 1.9.1

Dave

> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 9/9] net/colo-proxy: add packet compare and notify checkpoint
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 9/9] net/colo-proxy: add packet compare and notify checkpoint Zhang Chen
@ 2015-12-01 16:37   ` Dr. David Alan Gilbert
  2015-12-03  7:10     ` Zhang Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-01 16:37 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> 
> Lookup same connection's primary and secondary packet
> to compare,if same we will send primary packet and
> drop secondary packet,else send all of primary
> packets be queued,drop secondary queue and notify
> colo to do checkpoint
> 
> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  net/colo-proxy.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 74 insertions(+)
> 
> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> index 5f1852a..847f7f2 100644
> --- a/net/colo-proxy.c
> +++ b/net/colo-proxy.c
> @@ -70,6 +70,41 @@ static Connection *connection_new(void)
>      return connection;
>  }
>  
> +static void colo_send_primary_packet(void *opaque, void *user_data)
> +{
> +    Packet *pkt = opaque;
> +    qemu_net_queue_send(pkt->s->incoming_queue, pkt->sender, 0,
> +                    (const uint8_t *)pkt->data, pkt->size, NULL);
> +}
> +
> +static void colo_flush_connection(void *opaque, void *user_data)
> +{
> +    Connection *connection = opaque;
> +    g_queue_foreach(&connection->primary_list, colo_send_primary_packet, NULL);
> +    g_queue_foreach(&connection->secondary_list, packet_destroy, NULL);
> +}
> +
> +static void colo_proxy_notify_checkpoint(void)
> +{
> +    DEBUG("colo_proxy_notify_checkpoint\n");
> +}
> +
> +static void colo_proxy_do_checkpoint(NetFilterState *nf)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +
> +    g_queue_foreach(&s->unprocessed_connections, colo_flush_connection, NULL);
> +}
> +
> +/*
> + * colo failover flag
> + */
> +static ssize_t colo_has_failover(NetFilterState *nf)
> +{
> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
> +    return s->has_failover;
> +}
> +
>  /* Return 0 on success, or return -1 if the pkt is corrpted */
>  static int parse_packet_early(Packet *pkt, Connection_key *key)
>  {
> @@ -136,6 +171,45 @@ static void packet_destroy(void *opaque, void *user_data)
>      g_slice_free(Packet, pkt);
>  }
>  
> +/*
> + * The sent IP packets comparison between primary
> + * and secondary
> + * TODO: support ip fragment
> + * return:    true  means packet same
> + *            false means packet different
> + */
> +static bool colo_packet_compare(Packet *ppkt, Packet *spkt)
> +{
> +    int i;
> +    DEBUG("colo_packet_compare lens ppkt %d,spkt %d\n", ppkt->size,
> +                spkt->size);
> +    DEBUG("primary pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
> +                (char *)ppkt->data, ppkt->ip->ip_src, ppkt->ip->ip_dst);
> +    DEBUG("seconda pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
> +                (char *)spkt->data, spkt->ip->ip_src, spkt->ip->ip_dst);

Is it always IP packets we're comparing at this point?

> +    if (ppkt->size == spkt->size) {
> +        DEBUG("colo_packet_compare data   ppkt\n");
> +        for (i = 0; i < spkt->size; i++) {
> +            DEBUG("%x", ((char *)ppkt->data)[i]);
> +            DEBUG("|");
> +        }
> +        DEBUG("\ncolo_packet_compare data   spkt\n");
> +        for (i = 0; i < spkt->size; i++) {
> +            DEBUG("%x", ((char *)spkt->data)[i]);
> +            DEBUG("|");
> +        }
> +        DEBUG("\ncolo_packet_compare data   ppkt %s\n", (char *)ppkt->data);
> +        DEBUG("colo_packet_compare data   spkt %s\n", (char *)spkt->data);

It's probably better to make these a helper debug routine to dump
a packet;  I bet you'll wnat to sometimes do it when the sizes
are different.

> +        if (!memcmp(ppkt->data, spkt->data, spkt->size)) {
> +            return true;
> +        } else {
> +            return false;
> +        }
> +    } else {
> +        return false;
> +    }
> +}
> +
>  static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
>                                            Connection_key *key,
>                                            Packet *pkt, packet_type type)
> -- 
> 1.9.1
> 
> 
> 

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter
  2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
                   ` (8 preceding siblings ...)
  2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 9/9] net/colo-proxy: add packet compare and notify checkpoint Zhang Chen
@ 2015-12-01 16:44 ` Dr. David Alan Gilbert
  2015-12-03  7:33   ` Zhang Chen
  9 siblings, 1 reply; 35+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-01 16:44 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> 
> Hi,all
> 
> This patch add an colo-proxy object, COLO-Proxy is a part of COLO,
> based on qemu netfilter and it's a plugin for qemu netfilter. the function
> keep Secondary VM connect normal to Primary VM and compare packets 
> sent by PVM to sent by SVM.if the packet difference,notify COLO do
> checkpoint and send all primary packet has queued.
> 
> You can also get the series from: 
> 
> https://github.com/zhangckid/qemu/tree/colo-proxy-V1
> 
> Usage:
> 
> primary:
> -netdev tap,id=bn0 -device e1000,netdev=bn0
> -object colo-proxy,id=f0,netdev=bn0,queue=all,mode=primary,addr=ip:port
> 
> secondary:
> -netdev tap,id=bn0 -device e1000,netdev=bn0
> -object colo-proxy,id=f0,netdev=bn0,queue=all,mode=secondary,addr=ip:port

If we have more than one NIC on the guest, do you intend to allow
multiple colo-proxy's ?

Having read through the series, it looks like the main missing piece
is the work to synchronise sequence numbers.

I think also you'll need to clean out the connection hash when either
you see both sides have closed the connection or (maybe after some
time of idleness as well? Otherwise we'd just accumulate dead connections
overtime).

I'm guessing the buffer filter also has to be created on the command line?
How does the order of buffers work?

Dave

> 
> NOTE:
> queue must set "all". See enum NetFilterDirection for detail.
> colo-proxy need queue all packets
> colo-proxy V1 just a demo of colo proxy,not pass test with colo upstream
> 
> 
> ## Background
> 
> COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service)
> project is a high availability solution. Both Primary VM (PVM) and Secondary VM
> (SVM) run in parallel. They receive the same request from client, and generate
> responses in parallel too. If the response packets from PVM and SVM are
> identical, they are released immediately. Otherwise, a VM checkpoint (on 
> demand)is conducted.
> 
> Paper:
> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
> 
> COLO on Xen:
> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> 
> COLO on Qemu/KVM:
> http://wiki.qemu.org/Features/COLO
> 
> By the needs of capturing response packets from PVM and SVM and finding out
> whether they are identical, we introduce a new module to qemu networking 
> called colo-proxy.
> 
> 
> v1:
>  initial patch.
> 
> 
> 
> zhangchen (9):
>   Init colo-proxy object based on netfilter
>   jhash: add linux kernel jhashtable in qemu
>   colo-proxy: add colo-proxy framework
>   colo-proxy: add colo-proxy setup work
>   net/colo-proxy: add colo packet handler
>   net/colo-proxy: add packet forward function
>   net/colo-proxy: add packet enqueue and handle function
>   net/colo-proxy: enqueue primary and secondary packet
>   net/colo-proxy: add packet compare and notify checkpoint
> 
>  include/qemu/jhash.h |  52 ++++
>  net/Makefile.objs    |   1 +
>  net/colo-proxy.c     | 745 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  net/colo-proxy.h     | 124 +++++++++
>  qemu-options.hx      |   4 +
>  vl.c                 |   3 +-
>  6 files changed, 928 insertions(+), 1 deletion(-)
>  create mode 100644 include/qemu/jhash.h
>  create mode 100644 net/colo-proxy.c
>  create mode 100644 net/colo-proxy.h
> 
> -- 
> 1.9.1
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 2/9] jhash: add linux kernel jhashtable in qemu
  2015-12-01 11:23   ` Dr. David Alan Gilbert
@ 2015-12-03  3:40     ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-12-03  3:40 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang

Hi,Dave


On 12/01/2015 07:23 PM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> This used by colo-proxy to save and lookup
>> connection info
>>
>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   include/qemu/jhash.h | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 52 insertions(+)
>>   create mode 100644 include/qemu/jhash.h
>>
>> diff --git a/include/qemu/jhash.h b/include/qemu/jhash.h
>> new file mode 100644
>> index 0000000..f6cc7b3
>> --- /dev/null
>> +++ b/include/qemu/jhash.h
>> @@ -0,0 +1,52 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2015 FUJITSU LIMITED
>> + * Copyright (c) 2015 Intel Corporation
>> + *
>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
> Please be careful with the copyright.
> This code is mostly a copy of the kernel code, so I think it should take the copyright
> from the kernel code (see kernel's include/linux/jhash.h); and that copyright
> states it's based on Bob Jenkins' Public Domain code.
> Normally we have a problem as well QEMU doesn't like taking GPL2 code from the kernel,
> but my reading of the kernel's header file is that it's still Public domain; but I don't
> know what the right thing is to be sure.
>
> Dave

I will remove it and change it to kernel jhash.h's copyright in next version

Thanks for review
zhangchen

>> +
>> +#ifndef QEMU_JHASH_H__
>> +#define QEMU_JHASH_H__
>> +
>> +/*
>> + * hashtable relation copy from linux kernel jhash
>> + */
>> +static inline uint32_t rol32(uint32_t word, unsigned int shift)
>> +{
>> +    return (word << shift) | (word >> (32 - shift));
>> +}
>> +
>> +/* __jhash_mix -- mix 3 32-bit values reversibly. */
>> +#define __jhash_mix(a, b, c)                \
>> +{                                           \
>> +    a -= c;  a ^= rol32(c, 4);  c += b;     \
>> +    b -= a;  b ^= rol32(a, 6);  a += c;     \
>> +    c -= b;  c ^= rol32(b, 8);  b += a;     \
>> +    a -= c;  a ^= rol32(c, 16); c += b;     \
>> +    b -= a;  b ^= rol32(a, 19); a += c;     \
>> +    c -= b;  c ^= rol32(b, 4);  b += a;     \
>> +}
>> +
>> +/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
>> +#define __jhash_final(a, b, c)  \
>> +{                               \
>> +    c ^= b; c -= rol32(b, 14);  \
>> +    a ^= c; a -= rol32(c, 11);  \
>> +    b ^= a; b -= rol32(a, 25);  \
>> +    c ^= b; c -= rol32(b, 16);  \
>> +    a ^= c; a -= rol32(c, 4);   \
>> +    b ^= a; b -= rol32(a, 14);  \
>> +    c ^= b; c -= rol32(b, 24);  \
>> +}
>> +
>> +/* An arbitrary initial parameter */
>> +#define JHASH_INITVAL           0xdeadbeef
>> +
>> +#endif /* QEMU_JHASH_H__ */
>> -- 
>> 1.9.1
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work
  2015-12-01 15:35   ` Dr. David Alan Gilbert
@ 2015-12-03  3:49     ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-12-03  3:49 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang


Hi,Dave


On 12/01/2015 11:35 PM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> Secondary setup socket server for colo-forward
>> primary setup connect to secondary for colo-forward
>> add data structure will be uesed
> I wodner if it's possible to reuse the '-netdev socket,' stuff rather
> than handling the socket connection yourself (I don't know much about it,
> but since it's there it might be worth checking, see net/socket)

I will check the possible to reuse the '-netdev socket,' but the way can
increase complexity of code,logic and user config obviously.

>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-proxy.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   net/colo-proxy.h | 61 +++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 147 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
>> index 98c2699..89d9616 100644
>> --- a/net/colo-proxy.c
>> +++ b/net/colo-proxy.c
>> @@ -22,6 +22,8 @@
>>   #define DEBUG(format, ...)
>>   #endif
>>   
>> +static char *mode;
> Do we really need this as a global - you parse it and put it into the
> ColoProxyState anyway? If it really does need to be a global you need to
> use a better name.

I will fix it in next version

>> +static bool colo_do_checkpoint;
>>   
>>   static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>>                                            NetClientState *sender,
>> @@ -46,13 +48,84 @@ static ssize_t colo_proxy_receive_iov(NetFilterState *nf,
>>   
>>   static void colo_proxy_cleanup(NetFilterState *nf)
>>   {
>> -     /* cleanup */
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    close(s->sockfd);
>> +    s->sockfd = -1;
>> +    g_free(mode);
>> +    g_free(s->addr);
>>   }
>>   
>> +static void colo_accept_incoming(ColoProxyState *s)
>> +{
>> +    DEBUG("into colo_accept_incoming\n");
>> +    struct sockaddr_in addr;
>> +    socklen_t addrlen = sizeof(addr);
>> +    int acceptsock, err;
>> +
>> +    do {
>> +        acceptsock = qemu_accept(s->sockfd, (struct sockaddr *)&addr, &addrlen);
>> +        err = socket_error();
>> +    } while (acceptsock < 0 && err == EINTR);
>> +    qemu_set_fd_handler(s->sockfd, NULL, NULL, NULL);
>> +    closesocket(s->sockfd);
>> +
>> +    DEBUG("accept colo proxy\n");
>> +
>> +    if (acceptsock < 0) {
>> +        printf("could not accept colo connection (%s)\n",
>> +                     strerror(err));
> use error_report instead of printf please; also instead of 'colo connection'
> make sure to say 'colo proxy connection' to make it easy to know which
> failure it is.
>

I will fix it in next version

>> +        return;
>> +    }
>> +    s->sockfd = acceptsock;
>> +    /* TODO: handle the packets that primary forward */
>> +    return;
>> +}
>> +
>> +/* Return 1 on success, or return -1 if failed */
>> +static ssize_t colo_start_incoming(ColoProxyState *s)
>> +{
>> +    int serversock;
>> +    serversock = inet_listen(s->addr, NULL, 256, SOCK_STREAM, 0, NULL);
>> +    if (serversock < 0) {
>> +        g_free(s->addr);
>> +        return -1;
>> +    }
>> +    s->sockfd = serversock;
>> +    qemu_set_fd_handler(serversock, (IOHandler *)colo_accept_incoming, NULL,
>> +                        (void *)s);
>> +    g_free(s->addr);
>> +    return 1;
>> +}
>> +
>> +/* Return 1 on success, or return -1 if setup failed */
>> +static ssize_t colo_proxy_primary_setup(NetFilterState *nf)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    int sock;
>> +    sock = inet_connect(s->addr, NULL);
>> +    if (sock < 0) {
>> +        printf("colo proxy connect failed\n");
>> +        g_free(s->addr);
>> +        return -1;
>> +    }
>> +    DEBUG("colo proxy connect success\n");
>> +    s->sockfd = sock;
>> +   /* TODO: handle the packets that secondary forward */
>> +    g_free(s->addr);
>> +    return 1;
>> +}
>> +
>> +/* Return 1 on success, or return -1 if setup failed */
>> +static ssize_t colo_proxy_secondary_setup(NetFilterState *nf)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    return colo_start_incoming(s);
>> +}
>>   
>>   static void colo_proxy_setup(NetFilterState *nf, Error **errp)
>>   {
>>       ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    ssize_t ret = 0;
>>       if (!s->addr) {
>>           error_setg(errp, "filter colo_proxy needs 'addr' \
>>                        property set!");
>> @@ -68,17 +141,29 @@ static void colo_proxy_setup(NetFilterState *nf, Error **errp)
>>       s->sockfd = -1;
>>       s->has_failover = false;
>>       colo_do_checkpoint = false;
>> +    s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
>> +    s->unprocessed_packets = g_hash_table_new_full(connection_key_hash,
>> +                                                       connection_key_equal,
>> +                                                       g_free,
>> +                                                       connection_destroy);
>>       g_queue_init(&s->unprocessed_connections);
>>   
>>       if (!strcmp(mode, PRIMARY_MODE)) {
>>           s->colo_mode = COLO_PRIMARY_MODE;
>> +        ret = colo_proxy_primary_setup(nf);
>>       } else if (!strcmp(mode, SECONDARY_MODE)) {
>>           s->colo_mode = COLO_SECONDARY_MODE;
>> +        ret = colo_proxy_secondary_setup(nf);
>>       } else {
>>           error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "mode",
>>                       "primary or secondary");
>>           return;
>>       }
>> +    if (ret) {
>> +        DEBUG("colo_proxy_setup success\n");
>> +    } else {
>> +        DEBUG("colo_proxy_setup failed\n");
>> +    }
> It's easier to use trace_ (especially with the stderr backend, it's very easy).
> Do you not want to return the 'ret' value?

I will fix it in next version

>
>>   }
>>   
>>   static void colo_proxy_class_init(ObjectClass *oc, void *data)
>> diff --git a/net/colo-proxy.h b/net/colo-proxy.h
>> index 94afbc7..f77db2f 100644
>> --- a/net/colo-proxy.h
>> +++ b/net/colo-proxy.h
>> @@ -60,4 +60,65 @@ typedef struct ColoProxyState {
>>       Coroutine *co;
>>   } ColoProxyState;
>>   
>> +struct ip {
>> +#ifdef HOST_WORDS_BIGENDIAN
>> +    uint8_t  ip_v:4,                 /* version */
>> +             ip_hl:4;                /* header length */
>> +#else
>> +    uint8_t  ip_hl:4,                /* header length */
>> +             ip_v:4;                 /* version */
>> +#endif
>> +    uint8_t  ip_tos;                 /* type of service */
>> +    uint16_t ip_len;                 /* total length */
>> +    uint16_t ip_id;                  /* identification */
>> +    uint16_t ip_off;                 /* fragment offset field */
>> +#define    IP_DF 0x4000              /* don't fragment flag */
>> +#define    IP_MF 0x2000              /* more fragments flag */
>> +#define    IP_OFFMASK 0x1fff
>> +/* mask for fragmenting bits */
>> +    uint8_t  ip_ttl;                 /* time to live */
>> +    uint8_t  ip_p;                   /* protocol */
>> +    uint16_t ip_sum;                 /* checksum */
>> +    uint32_t ip_src, ip_dst;         /* source and dest address */
>> +} QEMU_PACKED;
>> +
> Why not just #include slirp/ip.h ?

Thanks,in next version I will add slirp/ip.h

>> +typedef struct Packet {
>> +    void *data;
>> +    union {
>> +        uint8_t *network_layer;
>> +        struct ip *ip;
>> +    };
>> +    uint8_t *transport_layer;
>> +    int size;
>> +    ColoProxyState *s;
>> +    bool should_be_sent;
>> +    NetClientState *sender;
>> +} Packet;
>> +
>> +typedef struct Connection_key {
>> +    /* (src, dst) must be grouped, in the same way than in IP header */
>> +    uint32_t src;
>> +    uint32_t dst;
>> +    union {
>> +        uint32_t ports;
>> +        uint16_t port16[2];
>> +    };
> Why the union?
>
> Dave

It will easy to look client port and server port,
in next version I will change it to src_port, dest_port


thanks for review
zhangchen

>> +    uint8_t ip_proto;
>> +} QEMU_PACKED Connection_key;
>> +
>> +typedef struct Connection {
>> +    /* connection primary send queue */
>> +    GQueue primary_list;
>> +    /* connection secondary send queue */
>> +    GQueue secondary_list;
>> +     /* flag to enqueue unprocessed_connections */
>> +    bool processing;
>> +} Connection;
>> +
>> +typedef enum {
>> +    PRIMARY_OUTPUT,           /* primary output packet queue */
>> +    PRIMARY_INPUT,            /* primary input packet queue */
>> +    SECONDARY_OUTPUT,         /* secondary output packet queue */
>> +} packet_type;
>> +
>>   #endif /* QEMU_COLO_PROXY_H */
>> -- 
>> 1.9.1
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 6/9] net/colo-proxy: add packet forward function
  2015-12-01 15:50   ` Dr. David Alan Gilbert
@ 2015-12-03  6:17     ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-12-03  6:17 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang


Hi,Dave

On 12/01/2015 11:50 PM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> The packet recv by primary forward to secondary
>> The packet send by secondary forward to primary
>>
>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-proxy.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 114 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
>> index ece5661..08a852f 100644
>> --- a/net/colo-proxy.c
>> +++ b/net/colo-proxy.c
>> @@ -26,6 +26,110 @@ static char *mode;
>>   static bool colo_do_checkpoint;
>>   
>>   /*
>> + * Packets to be sent by colo forward to
>> + * another colo
>> + * return:          >= 0        success
>> + *                  < 0        failed
>> + */
>> +static ssize_t colo_forward2another(NetFilterState *nf,
>> +                                         NetClientState *sender,
>> +                                         unsigned flags,
>> +                                         const struct iovec *iov,
>> +                                         int iovcnt,
>> +                                         NetPacketSent *sent_cb,
>> +                                         mode_type mode)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    ssize_t ret = 0;
>> +    ssize_t size = 0;
>> +    struct iovec sizeiov = {
>> +        .iov_base = &size,
>> +        .iov_len = 8
>> +    };
> If you always want 'size' to be 8 bytes then use an int64_t
> (or uint64_t probably if you are sending a length).

In next version I will change it to sizeof(size)

>> +    size = iov_size(iov, iovcnt);
>> +    if (!size) {
>> +        return 0;
>> +    }
>> +
>> +    if (mode == COLO_PRIMARY_MODE) {
>> +        qemu_net_queue_send_iov(s->incoming_queue, sender, flags,
>> +                           iov, iovcnt, NULL);
>> +    }
>> +    ret = iov_send(s->sockfd, &sizeiov, 8, 0, 8);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +    ret = iov_send(s->sockfd, iov, iovcnt, 0, size);
>> +    return ret;
>> +}
>> +
>> +/*
>> + * recv and handle colo secondary
>> + * forward packets in colo primary
>> + */
>> +static void colo_proxy_primary_forward_handler(NetFilterState *nf)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    ssize_t len = 0;
>> +    ssize_t ret = 0;
>> +    struct iovec sizeiov = {
>> +        .iov_base = &len,
>> +        .iov_len = 8
>> +    };
>> +    if (s->sockfd < 0) {
>> +        printf("secondary forward disconnected\n");
>> +        return;
>> +    }
>> +    iov_recv(s->sockfd, &sizeiov, 8, 0, 8);
> Check the return value of iov_recv.

I will fix it in next version

>
>> +    DEBUG("primary_forward_handler recv lensbuf lens=%zu\n", len);
>> +
>> +    if (len > 0) {
>> +        char *recvbuf;
>> +        recvbuf = g_malloc0(len);
> You should check the value of 'len' received off the wire - if things
> go wrong the value could be massive and try and allocate a huge amount
> of memory - checking it would also be a good check for something going
> wrong.

Thanks
I consider that when we send packet size so big,we can split it to send
I will try to fix it in next version

>> +        struct iovec iov = {
>> +            .iov_base = recvbuf,
>> +            .iov_len = len
>> +        };
>> +        iov_recv(s->sockfd, &iov, len, 0, len);
>> +        DEBUG("primary_forward_handler primary recvbuf=%s\n", recvbuf);
>> +        ret = colo_enqueue_secondary_packet(nf, recvbuf, len);
>> +        if (ret) {
>> +            DEBUG("colo_enqueue_secondary_packet succese\n");
>> +        } else {
>> +            DEBUG("colo_enqueue_secondary_packet failed\n");
>> +        }
>> +        g_free(recvbuf);
>> +    }
>> +}
>> +
>> +/*
>> + * recv and handle colo primary
>> + * forward packets in colo secondary
>> + */
>> +static void colo_proxy_secondary_forward_handler(NetFilterState *nf)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    ssize_t len = 0;
>> +    struct iovec sizeiov = {
>> +        .iov_base = &len,
>> +        .iov_len = 8
>> +    };
>> +    iov_recv(s->sockfd, &sizeiov, 8, 0, 8);
>> +    if (len > 0) {
>> +        char *buf;
>> +        buf = g_malloc0(len);
>> +        struct iovec iov = {
>> +            .iov_base = buf,
>> +            .iov_len = len
>> +        };
>> +        iov_recv(s->sockfd, &iov, len, 0, len);
> This code is very similar - factor it out and share it with the primary?
>
> Dave
>

I will fix it in next version

Thanks for review
zhangchen

>> +        qemu_net_queue_send(s->incoming_queue, nf->netdev,
>> +                    0, (const uint8_t *)buf, len, NULL);
>> +        g_free(buf);
>> +    }
>> +}
>> +
>> +/*
>>    * colo primary handle host's normal send and
>>    * recv packets to primary guest
>>    * return:          >= 0      success
>> @@ -63,7 +167,8 @@ static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
>>       if (direction == NET_FILTER_DIRECTION_RX) {
>>           /* TODO: enqueue_primary_packet */
>>       } else {
>> -        /* TODO: forward packets to another */
>> +        ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
>> +                    sent_cb, COLO_PRIMARY_MODE);
>>       }
>>   
>>       return ret;
>> @@ -107,7 +212,8 @@ static ssize_t colo_proxy_secondary_handler(NetFilterState *nf,
>>                               iovcnt, NULL);
>>               return 1;
>>           } else {
>> -        /* TODO: forward packets to another */
>> +            ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
>> +                        sent_cb, COLO_SECONDARY_MODE);
>>           }
>>   
>>       } else {
>> @@ -178,7 +284,9 @@ static void colo_accept_incoming(ColoProxyState *s)
>>           return;
>>       }
>>       s->sockfd = acceptsock;
>> -    /* TODO: handle the packets that primary forward */
>> +    qemu_set_fd_handler(s->sockfd,
>> +                (IOHandler *)colo_proxy_secondary_forward_handler, NULL,
>> +                (void *)s);
>>       return;
>>   }
>>   
>> @@ -211,7 +319,9 @@ static ssize_t colo_proxy_primary_setup(NetFilterState *nf)
>>       }
>>       DEBUG("colo proxy connect success\n");
>>       s->sockfd = sock;
>> -   /* TODO: handle the packets that secondary forward */
>> +    qemu_set_fd_handler(s->sockfd,
>> +                (IOHandler *)colo_proxy_primary_forward_handler,
>> +                NULL, (void *)s);
>>       g_free(s->addr);
>>       return 1;
>>   }
>> -- 
>> 1.9.1
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function
  2015-12-01 16:12   ` Dr. David Alan Gilbert
@ 2015-12-03  6:35     ` Zhang Chen
  2015-12-03  9:09       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 35+ messages in thread
From: Zhang Chen @ 2015-12-03  6:35 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, hongyang.yang,
	zhanghailiang


Hi,Dave

On 12/02/2015 12:12 AM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> Add common packet handle function and enqueue
>> packet distinguished connection,then we can
>> lookup one connection packet to compare
>>
>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-proxy.c | 167 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 166 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
>> index 08a852f..a664e6d 100644
>> --- a/net/colo-proxy.c
>> +++ b/net/colo-proxy.c
>> @@ -24,6 +24,170 @@
>>   
>>   static char *mode;
>>   static bool colo_do_checkpoint;
>> +static void packet_destroy(void *opaque, void *user_data);
>> +
>> +static uint32_t connection_key_hash(const void *opaque)
>> +{
>> +    const Connection_key *key = opaque;
>> +    uint32_t a, b, c;
>> +
>> +    /* Jenkins hash */
>> +    a = b = c = JHASH_INITVAL + sizeof(*key);
>> +    a += key->src;
>> +    b += key->dst;
>> +    c += key->ports;
>> +    __jhash_mix(a, b, c);
>> +
>> +    a += key->ip_proto;
>> +    __jhash_final(a, b, c);
>> +
>> +    return c;
>> +}
>> +
>> +static int connection_key_equal(const void *opaque1, const void *opaque2)
>> +{
>> +    return memcmp(opaque1, opaque2, sizeof(Connection_key)) == 0;
>> +}
>> +
>> +static void connection_destroy(void *opaque)
>> +{
>> +    Connection *connection = opaque;
>> +    g_queue_foreach(&connection->primary_list, packet_destroy, NULL);
>> +    g_queue_free(&connection->primary_list);
>> +    g_queue_foreach(&connection->secondary_list, packet_destroy, NULL);
>> +    g_queue_free(&connection->secondary_list);
>> +    g_slice_free(Connection, connection);
>> +}
>> +
>> +static Connection *connection_new(void)
>> +{
>> +    Connection *connection = g_slice_new(Connection);
>> +
>> +    g_queue_init(&connection->primary_list);
>> +    g_queue_init(&connection->secondary_list);
>> +    connection->processing = false;
>> +
>> +    return connection;
>> +}
>> +
>> +/* Return 0 on success, or return -1 if the pkt is corrpted */
>> +static int parse_packet_early(Packet *pkt, Connection_key *key)
>> +{
>> +    int network_length;
>> +    uint8_t *data = pkt->data;
>> +
>> +    pkt->network_layer = data + ETH_HLEN;
>> +    if (ntohs(*(uint16_t *)(data + 12)) != ETH_P_IP) {
>> +        if (ntohs(*(uint16_t *)(data + 12)) == ETH_P_ARP) {
>> +            return -1;
>> +        }
>> +        return 0;
>> +    }
> Can you use some of the functions/macros in include/net/eth.h to
> make this easier? Maybe eth_get_l3_proto ?
> Do you plan to do IPv6 at some point?

I will use include/net/eth.h in next version

IPv6 currently not support, still colo framework be merged

>> +    network_length = pkt->ip->ip_hl * 4;
>> +    pkt->transport_layer = pkt->network_layer + network_length;
>> +    key->ip_proto = pkt->ip->ip_p;
>> +    key->src = pkt->ip->ip_src;
>> +    key->dst = pkt->ip->ip_dst;
>> +
>> +    switch (key->ip_proto) {
>> +    case IPPROTO_TCP:
>> +    case IPPROTO_UDP:
>> +    case IPPROTO_DCCP:
>> +    case IPPROTO_ESP:
>> +    case IPPROTO_SCTP:
>> +    case IPPROTO_UDPLITE:
>> +        key->ports = *(uint32_t *)(pkt->transport_layer);
>> +        break;
>> +    case IPPROTO_AH:
>> +        key->ports = *(uint32_t *)(pkt->transport_layer + 4);
> Interesting; I don't see any other code in QEMU to handle AH,
> and I don't know much about it.
>
>> +        break;
>> +    default:
>> +        break;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static Packet *packet_new(ColoProxyState *s, const void *data,
>> +                          int size, Connection_key *key, NetClientState *sender)
>> +{
>> +    Packet *pkt = g_slice_new(Packet);
>> +
>> +    pkt->data = g_malloc(size);
>> +    memcpy(pkt->data, data, size);
> g_memdup might be useful for these:
> https://developer.gnome.org/glib/stable/glib-Memory-Allocation.html#g-memdup

I will fix it in next version

>> +    pkt->size = size;
>> +    pkt->s = s;
>> +    pkt->sender = sender;
>> +    pkt->should_be_sent = false;
>> +
>> +    if (parse_packet_early(pkt, key)) {
>> +        packet_destroy(pkt, NULL);
>> +        pkt = NULL;
>> +    }
>> +
>> +    return pkt;
>> +}
>> +
>> +static void packet_destroy(void *opaque, void *user_data)
>> +{
>> +    Packet *pkt = opaque;
>> +    g_free(pkt->data);
>> +    g_slice_free(Packet, pkt);
>> +}
>> +
>> +static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
>> +                                          Connection_key *key,
>> +                                          Packet *pkt, packet_type type)
>> +{
>> +    Connection *connection;
>> +    Packet *tmppkt;
>> +    connection = g_hash_table_lookup(unprocessed_packets, key);
>> +    if (connection == NULL) {
>> +        Connection_key *new_key = g_malloc(sizeof(*key));
>> +
>> +        connection = connection_new();
>> +        memcpy(new_key, key, sizeof(*key));
>> +        key = new_key;
>> +
>> +        g_hash_table_insert(unprocessed_packets, key, connection);
> Is 'unprocessed_packets' a good name for this hashtable? I'm not quite
> sure I understand, but it looks to me like it's your connection-tracking equivalent,
> which then has a queue for each connection with unprocessed packets?

i will change hashtable name to connection_track_table,is it ok?

> Also, do we do anything to stop this hash growing really huge? If there
> are lots-and-lots of connections can we limit it somehow? (what does Linux do?)

when we find PVM's packet different to SVM's packet,colo will do 
checkpoint.
that's means we will flush all connection's packets,even though all 
packets are
same,colo will alse do checkpoint periodically. so hashtable can't 
growing really huge.

>> +    }
>> +    switch (type) {
>> +    case PRIMARY_OUTPUT:
>> +        if (g_queue_get_length(&connection->secondary_list) > 0) {
> Please add some more comments; I think this is when a packet comes in
> on the primary, and then we find we've already got a packet from the secondary
> waiting?

yes,you are right

I will add more comments in next version

>> +            tmppkt = g_queue_pop_head(&connection->secondary_list);
>> +            DEBUG("g_queue_get_length(&connection->primary_list)=%d\n",
>> +                        g_queue_get_length(&connection->primary_list));
>> +            DEBUG("g_queue_get_length(&connection->secondary_list)=%d\n",
>> +                        g_queue_get_length(&connection->secondary_list));
>> +            if (colo_packet_compare(pkt, tmppkt)) {
>> +                DEBUG("packet same and release packet\n");
>> +                pkt->should_be_sent = true;
>> +                break;
>> +            } else {
>> +                DEBUG("packet different\n");
>> +                colo_proxy_notify_checkpoint();
>> +                pkt->should_be_sent = false;
>> +                break;
>> +            }
>> +        } else {
>> +            g_queue_push_tail(&connection->primary_list, pkt);
>> +            pkt->should_be_sent = false;
>> +        }
>> +
>> +        break;
>> +    case SECONDARY_OUTPUT:
>> +        g_queue_push_tail(&connection->secondary_list, pkt);
>> +        DEBUG("secondary pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
>> +                    (char *)pkt->data, pkt->ip->ip_src, pkt->ip->ip_dst);
>> +        break;
>> +    default:
>> +        abort();
>> +    }
>> +
>> +    return connection;
>> +}
>> +
>>   
>>   /*
>>    * Packets to be sent by colo forward to
>> @@ -165,7 +329,8 @@ static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
>>       }
>>   
>>       if (direction == NET_FILTER_DIRECTION_RX) {
>> -        /* TODO: enqueue_primary_packet */
>> +        ret = colo_enqueue_primary_packet(nf, sender, flags, iov,
>> +                    iovcnt, sent_cb);
> The routine above is 'colo_enqueue_packet' rather than colo_enqueue_primary_packet?

yes,colo_enqueue_packet is enqueue packet common

Thanks for review
zhangchen

>>       } else {
>>           ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
>>                       sent_cb, COLO_PRIMARY_MODE);
>> -- 
>> 1.9.1
> Dave
>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 9/9] net/colo-proxy: add packet compare and notify checkpoint
  2015-12-01 16:37   ` Dr. David Alan Gilbert
@ 2015-12-03  7:10     ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-12-03  7:10 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, zhanghailiang


Hi,Dave

On 12/02/2015 12:37 AM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>
>> Lookup same connection's primary and secondary packet
>> to compare,if same we will send primary packet and
>> drop secondary packet,else send all of primary
>> packets be queued,drop secondary queue and notify
>> colo to do checkpoint
>>
>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-proxy.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 74 insertions(+)
>>
>> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
>> index 5f1852a..847f7f2 100644
>> --- a/net/colo-proxy.c
>> +++ b/net/colo-proxy.c
>> @@ -70,6 +70,41 @@ static Connection *connection_new(void)
>>       return connection;
>>   }
>>   
>> +static void colo_send_primary_packet(void *opaque, void *user_data)
>> +{
>> +    Packet *pkt = opaque;
>> +    qemu_net_queue_send(pkt->s->incoming_queue, pkt->sender, 0,
>> +                    (const uint8_t *)pkt->data, pkt->size, NULL);
>> +}
>> +
>> +static void colo_flush_connection(void *opaque, void *user_data)
>> +{
>> +    Connection *connection = opaque;
>> +    g_queue_foreach(&connection->primary_list, colo_send_primary_packet, NULL);
>> +    g_queue_foreach(&connection->secondary_list, packet_destroy, NULL);
>> +}
>> +
>> +static void colo_proxy_notify_checkpoint(void)
>> +{
>> +    DEBUG("colo_proxy_notify_checkpoint\n");
>> +}
>> +
>> +static void colo_proxy_do_checkpoint(NetFilterState *nf)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +
>> +    g_queue_foreach(&s->unprocessed_connections, colo_flush_connection, NULL);
>> +}
>> +
>> +/*
>> + * colo failover flag
>> + */
>> +static ssize_t colo_has_failover(NetFilterState *nf)
>> +{
>> +    ColoProxyState *s = FILTER_COLO_PROXY(nf);
>> +    return s->has_failover;
>> +}
>> +
>>   /* Return 0 on success, or return -1 if the pkt is corrpted */
>>   static int parse_packet_early(Packet *pkt, Connection_key *key)
>>   {
>> @@ -136,6 +171,45 @@ static void packet_destroy(void *opaque, void *user_data)
>>       g_slice_free(Packet, pkt);
>>   }
>>   
>> +/*
>> + * The sent IP packets comparison between primary
>> + * and secondary
>> + * TODO: support ip fragment
>> + * return:    true  means packet same
>> + *            false means packet different
>> + */
>> +static bool colo_packet_compare(Packet *ppkt, Packet *spkt)
>> +{
>> +    int i;
>> +    DEBUG("colo_packet_compare lens ppkt %d,spkt %d\n", ppkt->size,
>> +                spkt->size);
>> +    DEBUG("primary pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
>> +                (char *)ppkt->data, ppkt->ip->ip_src, ppkt->ip->ip_dst);
>> +    DEBUG("seconda pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
>> +                (char *)spkt->data, spkt->ip->ip_src, spkt->ip->ip_dst);
> Is it always IP packets we're comparing at this point?

yes,you are right

>> +    if (ppkt->size == spkt->size) {
>> +        DEBUG("colo_packet_compare data   ppkt\n");
>> +        for (i = 0; i < spkt->size; i++) {
>> +            DEBUG("%x", ((char *)ppkt->data)[i]);
>> +            DEBUG("|");
>> +        }
>> +        DEBUG("\ncolo_packet_compare data   spkt\n");
>> +        for (i = 0; i < spkt->size; i++) {
>> +            DEBUG("%x", ((char *)spkt->data)[i]);
>> +            DEBUG("|");
>> +        }
>> +        DEBUG("\ncolo_packet_compare data   ppkt %s\n", (char *)ppkt->data);
>> +        DEBUG("colo_packet_compare data   spkt %s\n", (char *)spkt->data);
> It's probably better to make these a helper debug routine to dump
> a packet;  I bet you'll wnat to sometimes do it when the sizes
> are different.

I will dump packet before diff size in next version

Thanks for review
zhangchen

>> +        if (!memcmp(ppkt->data, spkt->data, spkt->size)) {
>> +            return true;
>> +        } else {
>> +            return false;
>> +        }
>> +    } else {
>> +        return false;
>> +    }
>> +}
>> +
>>   static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
>>                                             Connection_key *key,
>>                                             Packet *pkt, packet_type type)
>> -- 
>> 1.9.1
>>
>>
>>
> Dave
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter
  2015-12-01 16:44 ` [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Dr. David Alan Gilbert
@ 2015-12-03  7:33   ` Zhang Chen
  0 siblings, 0 replies; 35+ messages in thread
From: Zhang Chen @ 2015-12-03  7:33 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, hongyang.yang,
	zhanghailiang


Hi,Dave

On 12/02/2015 12:44 AM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>> Hi,all
>>
>> This patch add an colo-proxy object, COLO-Proxy is a part of COLO,
>> based on qemu netfilter and it's a plugin for qemu netfilter. the function
>> keep Secondary VM connect normal to Primary VM and compare packets
>> sent by PVM to sent by SVM.if the packet difference,notify COLO do
>> checkpoint and send all primary packet has queued.
>>
>> You can also get the series from:
>>
>> https://github.com/zhangckid/qemu/tree/colo-proxy-V1
>>
>> Usage:
>>
>> primary:
>> -netdev tap,id=bn0 -device e1000,netdev=bn0
>> -object colo-proxy,id=f0,netdev=bn0,queue=all,mode=primary,addr=ip:port
>>
>> secondary:
>> -netdev tap,id=bn0 -device e1000,netdev=bn0
>> -object colo-proxy,id=f0,netdev=bn0,queue=all,mode=secondary,addr=ip:port
> If we have more than one NIC on the guest, do you intend to allow
> multiple colo-proxy's ?

Yes,we support.Colo-proxy based on netfilter, same to filter-buffer, 
proxy attach your netdev.
but we haven't test it

> Having read through the series, it looks like the main missing piece
> is the work to synchronise sequence numbers.

Yes, we will support it in the futrue

> I think also you'll need to clean out the connection hash when either
> you see both sides have closed the connection or (maybe after some
> time of idleness as well? Otherwise we'd just accumulate dead connections
> overtime).
>
> I'm guessing the buffer filter also has to be created on the command line?
> How does the order of buffers work?
>
> Dave

In colo-proxy we have done buffer filter's work,so we just need startup
colo-proxy.

Thanks for review
zhangchen

>> NOTE:
>> queue must set "all". See enum NetFilterDirection for detail.
>> colo-proxy need queue all packets
>> colo-proxy V1 just a demo of colo proxy,not pass test with colo upstream
>>
>>
>> ## Background
>>
>> COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service)
>> project is a high availability solution. Both Primary VM (PVM) and Secondary VM
>> (SVM) run in parallel. They receive the same request from client, and generate
>> responses in parallel too. If the response packets from PVM and SVM are
>> identical, they are released immediately. Otherwise, a VM checkpoint (on
>> demand)is conducted.
>>
>> Paper:
>> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
>>
>> COLO on Xen:
>> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>
>> COLO on Qemu/KVM:
>> http://wiki.qemu.org/Features/COLO
>>
>> By the needs of capturing response packets from PVM and SVM and finding out
>> whether they are identical, we introduce a new module to qemu networking
>> called colo-proxy.
>>
>>
>> v1:
>>   initial patch.
>>
>>
>>
>> zhangchen (9):
>>    Init colo-proxy object based on netfilter
>>    jhash: add linux kernel jhashtable in qemu
>>    colo-proxy: add colo-proxy framework
>>    colo-proxy: add colo-proxy setup work
>>    net/colo-proxy: add colo packet handler
>>    net/colo-proxy: add packet forward function
>>    net/colo-proxy: add packet enqueue and handle function
>>    net/colo-proxy: enqueue primary and secondary packet
>>    net/colo-proxy: add packet compare and notify checkpoint
>>
>>   include/qemu/jhash.h |  52 ++++
>>   net/Makefile.objs    |   1 +
>>   net/colo-proxy.c     | 745 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/colo-proxy.h     | 124 +++++++++
>>   qemu-options.hx      |   4 +
>>   vl.c                 |   3 +-
>>   6 files changed, 928 insertions(+), 1 deletion(-)
>>   create mode 100644 include/qemu/jhash.h
>>   create mode 100644 net/colo-proxy.c
>>   create mode 100644 net/colo-proxy.h
>>
>> -- 
>> 1.9.1
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function
  2015-12-03  6:35     ` Zhang Chen
@ 2015-12-03  9:09       ` Dr. David Alan Gilbert
  2015-12-04  3:21         ` Zhang Chen
  0 siblings, 1 reply; 35+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-03  9:09 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, hongyang.yang,
	zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> 
> Hi,Dave
> 
> On 12/02/2015 12:12 AM, Dr. David Alan Gilbert wrote:
> >* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> >>From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> >>
> >>Add common packet handle function and enqueue
> >>packet distinguished connection,then we can
> >>lookup one connection packet to compare
> >>
> >>Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> >>---
> >>  net/colo-proxy.c | 167 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> >>  1 file changed, 166 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> >>index 08a852f..a664e6d 100644
> >>--- a/net/colo-proxy.c
> >>+++ b/net/colo-proxy.c
> >>@@ -24,6 +24,170 @@
> >>  static char *mode;
> >>  static bool colo_do_checkpoint;
> >>+static void packet_destroy(void *opaque, void *user_data);
> >>+
> >>+static uint32_t connection_key_hash(const void *opaque)
> >>+{
> >>+    const Connection_key *key = opaque;
> >>+    uint32_t a, b, c;
> >>+
> >>+    /* Jenkins hash */
> >>+    a = b = c = JHASH_INITVAL + sizeof(*key);
> >>+    a += key->src;
> >>+    b += key->dst;
> >>+    c += key->ports;
> >>+    __jhash_mix(a, b, c);
> >>+
> >>+    a += key->ip_proto;
> >>+    __jhash_final(a, b, c);
> >>+
> >>+    return c;
> >>+}
> >>+
> >>+static int connection_key_equal(const void *opaque1, const void *opaque2)
> >>+{
> >>+    return memcmp(opaque1, opaque2, sizeof(Connection_key)) == 0;
> >>+}
> >>+
> >>+static void connection_destroy(void *opaque)
> >>+{
> >>+    Connection *connection = opaque;
> >>+    g_queue_foreach(&connection->primary_list, packet_destroy, NULL);
> >>+    g_queue_free(&connection->primary_list);
> >>+    g_queue_foreach(&connection->secondary_list, packet_destroy, NULL);
> >>+    g_queue_free(&connection->secondary_list);
> >>+    g_slice_free(Connection, connection);
> >>+}
> >>+
> >>+static Connection *connection_new(void)
> >>+{
> >>+    Connection *connection = g_slice_new(Connection);
> >>+
> >>+    g_queue_init(&connection->primary_list);
> >>+    g_queue_init(&connection->secondary_list);
> >>+    connection->processing = false;
> >>+
> >>+    return connection;
> >>+}
> >>+
> >>+/* Return 0 on success, or return -1 if the pkt is corrpted */
> >>+static int parse_packet_early(Packet *pkt, Connection_key *key)
> >>+{
> >>+    int network_length;
> >>+    uint8_t *data = pkt->data;
> >>+
> >>+    pkt->network_layer = data + ETH_HLEN;
> >>+    if (ntohs(*(uint16_t *)(data + 12)) != ETH_P_IP) {
> >>+        if (ntohs(*(uint16_t *)(data + 12)) == ETH_P_ARP) {
> >>+            return -1;
> >>+        }
> >>+        return 0;
> >>+    }
> >Can you use some of the functions/macros in include/net/eth.h to
> >make this easier? Maybe eth_get_l3_proto ?
> >Do you plan to do IPv6 at some point?
> 
> I will use include/net/eth.h in next version
> 
> IPv6 currently not support, still colo framework be merged
> 
> >>+    network_length = pkt->ip->ip_hl * 4;
> >>+    pkt->transport_layer = pkt->network_layer + network_length;
> >>+    key->ip_proto = pkt->ip->ip_p;
> >>+    key->src = pkt->ip->ip_src;
> >>+    key->dst = pkt->ip->ip_dst;
> >>+
> >>+    switch (key->ip_proto) {
> >>+    case IPPROTO_TCP:
> >>+    case IPPROTO_UDP:
> >>+    case IPPROTO_DCCP:
> >>+    case IPPROTO_ESP:
> >>+    case IPPROTO_SCTP:
> >>+    case IPPROTO_UDPLITE:
> >>+        key->ports = *(uint32_t *)(pkt->transport_layer);
> >>+        break;
> >>+    case IPPROTO_AH:
> >>+        key->ports = *(uint32_t *)(pkt->transport_layer + 4);
> >Interesting; I don't see any other code in QEMU to handle AH,
> >and I don't know much about it.
> >
> >>+        break;
> >>+    default:
> >>+        break;
> >>+    }
> >>+
> >>+    return 0;
> >>+}
> >>+
> >>+static Packet *packet_new(ColoProxyState *s, const void *data,
> >>+                          int size, Connection_key *key, NetClientState *sender)
> >>+{
> >>+    Packet *pkt = g_slice_new(Packet);
> >>+
> >>+    pkt->data = g_malloc(size);
> >>+    memcpy(pkt->data, data, size);
> >g_memdup might be useful for these:
> >https://developer.gnome.org/glib/stable/glib-Memory-Allocation.html#g-memdup
> 
> I will fix it in next version
> 
> >>+    pkt->size = size;
> >>+    pkt->s = s;
> >>+    pkt->sender = sender;
> >>+    pkt->should_be_sent = false;
> >>+
> >>+    if (parse_packet_early(pkt, key)) {
> >>+        packet_destroy(pkt, NULL);
> >>+        pkt = NULL;
> >>+    }
> >>+
> >>+    return pkt;
> >>+}
> >>+
> >>+static void packet_destroy(void *opaque, void *user_data)
> >>+{
> >>+    Packet *pkt = opaque;
> >>+    g_free(pkt->data);
> >>+    g_slice_free(Packet, pkt);
> >>+}
> >>+
> >>+static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
> >>+                                          Connection_key *key,
> >>+                                          Packet *pkt, packet_type type)
> >>+{
> >>+    Connection *connection;
> >>+    Packet *tmppkt;
> >>+    connection = g_hash_table_lookup(unprocessed_packets, key);
> >>+    if (connection == NULL) {
> >>+        Connection_key *new_key = g_malloc(sizeof(*key));
> >>+
> >>+        connection = connection_new();
> >>+        memcpy(new_key, key, sizeof(*key));
> >>+        key = new_key;
> >>+
> >>+        g_hash_table_insert(unprocessed_packets, key, connection);
> >Is 'unprocessed_packets' a good name for this hashtable? I'm not quite
> >sure I understand, but it looks to me like it's your connection-tracking equivalent,
> >which then has a queue for each connection with unprocessed packets?
> 
> i will change hashtable name to connection_track_table,is it ok?

Yes, thank you.

> >Also, do we do anything to stop this hash growing really huge? If there
> >are lots-and-lots of connections can we limit it somehow? (what does Linux do?)
> 
> when we find PVM's packet different to SVM's packet,colo will do checkpoint.
> that's means we will flush all connection's packets,even though all packets
> are
> same,colo will alse do checkpoint periodically. so hashtable can't growing
> really huge.

I see the flush clears all the packets, but does it also clear the hash?

> >>+    }
> >>+    switch (type) {
> >>+    case PRIMARY_OUTPUT:
> >>+        if (g_queue_get_length(&connection->secondary_list) > 0) {
> >Please add some more comments; I think this is when a packet comes in
> >on the primary, and then we find we've already got a packet from the secondary
> >waiting?
> 
> yes,you are right
> 
> I will add more comments in next version

Thank you.

Dave

> >>+            tmppkt = g_queue_pop_head(&connection->secondary_list);
> >>+            DEBUG("g_queue_get_length(&connection->primary_list)=%d\n",
> >>+                        g_queue_get_length(&connection->primary_list));
> >>+            DEBUG("g_queue_get_length(&connection->secondary_list)=%d\n",
> >>+                        g_queue_get_length(&connection->secondary_list));
> >>+            if (colo_packet_compare(pkt, tmppkt)) {
> >>+                DEBUG("packet same and release packet\n");
> >>+                pkt->should_be_sent = true;
> >>+                break;
> >>+            } else {
> >>+                DEBUG("packet different\n");
> >>+                colo_proxy_notify_checkpoint();
> >>+                pkt->should_be_sent = false;
> >>+                break;
> >>+            }
> >>+        } else {
> >>+            g_queue_push_tail(&connection->primary_list, pkt);
> >>+            pkt->should_be_sent = false;
> >>+        }
> >>+
> >>+        break;
> >>+    case SECONDARY_OUTPUT:
> >>+        g_queue_push_tail(&connection->secondary_list, pkt);
> >>+        DEBUG("secondary pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
> >>+                    (char *)pkt->data, pkt->ip->ip_src, pkt->ip->ip_dst);
> >>+        break;
> >>+    default:
> >>+        abort();
> >>+    }
> >>+
> >>+    return connection;
> >>+}
> >>+
> >>  /*
> >>   * Packets to be sent by colo forward to
> >>@@ -165,7 +329,8 @@ static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
> >>      }
> >>      if (direction == NET_FILTER_DIRECTION_RX) {
> >>-        /* TODO: enqueue_primary_packet */
> >>+        ret = colo_enqueue_primary_packet(nf, sender, flags, iov,
> >>+                    iovcnt, sent_cb);
> >The routine above is 'colo_enqueue_packet' rather than colo_enqueue_primary_packet?
> 
> yes,colo_enqueue_packet is enqueue packet common
> 
> Thanks for review
> zhangchen
> 
> >>      } else {
> >>          ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
> >>                      sent_cb, COLO_PRIMARY_MODE);
> >>-- 
> >>1.9.1
> >Dave
> >
> >>
> >>
> >--
> >Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
> >
> >.
> >
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function
  2015-12-03  9:09       ` Dr. David Alan Gilbert
@ 2015-12-04  3:21         ` Zhang Chen
  2015-12-04  9:14           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 35+ messages in thread
From: Zhang Chen @ 2015-12-04  3:21 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, hongyang.yang,
	zhanghailiang

Hi,Dave


On 12/03/2015 05:09 PM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>> Hi,Dave
>>
>> On 12/02/2015 12:12 AM, Dr. David Alan Gilbert wrote:
>>> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>>>> From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>>>
>>>> Add common packet handle function and enqueue
>>>> packet distinguished connection,then we can
>>>> lookup one connection packet to compare
>>>>
>>>> Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
>>>> ---
>>>>   net/colo-proxy.c | 167 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>   1 file changed, 166 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/colo-proxy.c b/net/colo-proxy.c
>>>> index 08a852f..a664e6d 100644
>>>> --- a/net/colo-proxy.c
>>>> +++ b/net/colo-proxy.c
>>>> @@ -24,6 +24,170 @@
>>>>   static char *mode;
>>>>   static bool colo_do_checkpoint;
>>>> +static void packet_destroy(void *opaque, void *user_data);
>>>> +
>>>> +static uint32_t connection_key_hash(const void *opaque)
>>>> +{
>>>> +    const Connection_key *key = opaque;
>>>> +    uint32_t a, b, c;
>>>> +
>>>> +    /* Jenkins hash */
>>>> +    a = b = c = JHASH_INITVAL + sizeof(*key);
>>>> +    a += key->src;
>>>> +    b += key->dst;
>>>> +    c += key->ports;
>>>> +    __jhash_mix(a, b, c);
>>>> +
>>>> +    a += key->ip_proto;
>>>> +    __jhash_final(a, b, c);
>>>> +
>>>> +    return c;
>>>> +}
>>>> +
>>>> +static int connection_key_equal(const void *opaque1, const void *opaque2)
>>>> +{
>>>> +    return memcmp(opaque1, opaque2, sizeof(Connection_key)) == 0;
>>>> +}
>>>> +
>>>> +static void connection_destroy(void *opaque)
>>>> +{
>>>> +    Connection *connection = opaque;
>>>> +    g_queue_foreach(&connection->primary_list, packet_destroy, NULL);
>>>> +    g_queue_free(&connection->primary_list);
>>>> +    g_queue_foreach(&connection->secondary_list, packet_destroy, NULL);
>>>> +    g_queue_free(&connection->secondary_list);
>>>> +    g_slice_free(Connection, connection);
>>>> +}
>>>> +
>>>> +static Connection *connection_new(void)
>>>> +{
>>>> +    Connection *connection = g_slice_new(Connection);
>>>> +
>>>> +    g_queue_init(&connection->primary_list);
>>>> +    g_queue_init(&connection->secondary_list);
>>>> +    connection->processing = false;
>>>> +
>>>> +    return connection;
>>>> +}
>>>> +
>>>> +/* Return 0 on success, or return -1 if the pkt is corrpted */
>>>> +static int parse_packet_early(Packet *pkt, Connection_key *key)
>>>> +{
>>>> +    int network_length;
>>>> +    uint8_t *data = pkt->data;
>>>> +
>>>> +    pkt->network_layer = data + ETH_HLEN;
>>>> +    if (ntohs(*(uint16_t *)(data + 12)) != ETH_P_IP) {
>>>> +        if (ntohs(*(uint16_t *)(data + 12)) == ETH_P_ARP) {
>>>> +            return -1;
>>>> +        }
>>>> +        return 0;
>>>> +    }
>>> Can you use some of the functions/macros in include/net/eth.h to
>>> make this easier? Maybe eth_get_l3_proto ?
>>> Do you plan to do IPv6 at some point?
>> I will use include/net/eth.h in next version
>>
>> IPv6 currently not support, still colo framework be merged
>>
>>>> +    network_length = pkt->ip->ip_hl * 4;
>>>> +    pkt->transport_layer = pkt->network_layer + network_length;
>>>> +    key->ip_proto = pkt->ip->ip_p;
>>>> +    key->src = pkt->ip->ip_src;
>>>> +    key->dst = pkt->ip->ip_dst;
>>>> +
>>>> +    switch (key->ip_proto) {
>>>> +    case IPPROTO_TCP:
>>>> +    case IPPROTO_UDP:
>>>> +    case IPPROTO_DCCP:
>>>> +    case IPPROTO_ESP:
>>>> +    case IPPROTO_SCTP:
>>>> +    case IPPROTO_UDPLITE:
>>>> +        key->ports = *(uint32_t *)(pkt->transport_layer);
>>>> +        break;
>>>> +    case IPPROTO_AH:
>>>> +        key->ports = *(uint32_t *)(pkt->transport_layer + 4);
>>> Interesting; I don't see any other code in QEMU to handle AH,
>>> and I don't know much about it.
>>>
>>>> +        break;
>>>> +    default:
>>>> +        break;
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static Packet *packet_new(ColoProxyState *s, const void *data,
>>>> +                          int size, Connection_key *key, NetClientState *sender)
>>>> +{
>>>> +    Packet *pkt = g_slice_new(Packet);
>>>> +
>>>> +    pkt->data = g_malloc(size);
>>>> +    memcpy(pkt->data, data, size);
>>> g_memdup might be useful for these:
>>> https://developer.gnome.org/glib/stable/glib-Memory-Allocation.html#g-memdup
>> I will fix it in next version
>>
>>>> +    pkt->size = size;
>>>> +    pkt->s = s;
>>>> +    pkt->sender = sender;
>>>> +    pkt->should_be_sent = false;
>>>> +
>>>> +    if (parse_packet_early(pkt, key)) {
>>>> +        packet_destroy(pkt, NULL);
>>>> +        pkt = NULL;
>>>> +    }
>>>> +
>>>> +    return pkt;
>>>> +}
>>>> +
>>>> +static void packet_destroy(void *opaque, void *user_data)
>>>> +{
>>>> +    Packet *pkt = opaque;
>>>> +    g_free(pkt->data);
>>>> +    g_slice_free(Packet, pkt);
>>>> +}
>>>> +
>>>> +static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
>>>> +                                          Connection_key *key,
>>>> +                                          Packet *pkt, packet_type type)
>>>> +{
>>>> +    Connection *connection;
>>>> +    Packet *tmppkt;
>>>> +    connection = g_hash_table_lookup(unprocessed_packets, key);
>>>> +    if (connection == NULL) {
>>>> +        Connection_key *new_key = g_malloc(sizeof(*key));
>>>> +
>>>> +        connection = connection_new();
>>>> +        memcpy(new_key, key, sizeof(*key));
>>>> +        key = new_key;
>>>> +
>>>> +        g_hash_table_insert(unprocessed_packets, key, connection);
>>> Is 'unprocessed_packets' a good name for this hashtable? I'm not quite
>>> sure I understand, but it looks to me like it's your connection-tracking equivalent,
>>> which then has a queue for each connection with unprocessed packets?
>> i will change hashtable name to connection_track_table,is it ok?
> Yes, thank you.
>
>>> Also, do we do anything to stop this hash growing really huge? If there
>>> are lots-and-lots of connections can we limit it somehow? (what does Linux do?)
>> when we find PVM's packet different to SVM's packet,colo will do checkpoint.
>> that's means we will flush all connection's packets,even though all packets
>> are
>> same,colo will alse do checkpoint periodically. so hashtable can't growing
>> really huge.
> I see the flush clears all the packets, but does it also clear the hash?
>

I read the kernel code,TCP conntrack will clear hash one time every five 
days periodicity.
and the hashtable size
     /* Idea from tcp.c: use 1/16384 of memory.  On i386: 32MB
      * machine has 512 buckets. >= 1GB machines have 16384 buckets. */
I will follow kernel's done to fix colo-proxy in next version.

Thanks for review
zhangchen

>>>> +    }
>>>> +    switch (type) {
>>>> +    case PRIMARY_OUTPUT:
>>>> +        if (g_queue_get_length(&connection->secondary_list) > 0) {
>>> Please add some more comments; I think this is when a packet comes in
>>> on the primary, and then we find we've already got a packet from the secondary
>>> waiting?
>> yes,you are right
>>
>> I will add more comments in next version
> Thank you.
>
> Dave
>
>>>> +            tmppkt = g_queue_pop_head(&connection->secondary_list);
>>>> +            DEBUG("g_queue_get_length(&connection->primary_list)=%d\n",
>>>> +                        g_queue_get_length(&connection->primary_list));
>>>> +            DEBUG("g_queue_get_length(&connection->secondary_list)=%d\n",
>>>> +                        g_queue_get_length(&connection->secondary_list));
>>>> +            if (colo_packet_compare(pkt, tmppkt)) {
>>>> +                DEBUG("packet same and release packet\n");
>>>> +                pkt->should_be_sent = true;
>>>> +                break;
>>>> +            } else {
>>>> +                DEBUG("packet different\n");
>>>> +                colo_proxy_notify_checkpoint();
>>>> +                pkt->should_be_sent = false;
>>>> +                break;
>>>> +            }
>>>> +        } else {
>>>> +            g_queue_push_tail(&connection->primary_list, pkt);
>>>> +            pkt->should_be_sent = false;
>>>> +        }
>>>> +
>>>> +        break;
>>>> +    case SECONDARY_OUTPUT:
>>>> +        g_queue_push_tail(&connection->secondary_list, pkt);
>>>> +        DEBUG("secondary pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
>>>> +                    (char *)pkt->data, pkt->ip->ip_src, pkt->ip->ip_dst);
>>>> +        break;
>>>> +    default:
>>>> +        abort();
>>>> +    }
>>>> +
>>>> +    return connection;
>>>> +}
>>>> +
>>>>   /*
>>>>    * Packets to be sent by colo forward to
>>>> @@ -165,7 +329,8 @@ static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
>>>>       }
>>>>       if (direction == NET_FILTER_DIRECTION_RX) {
>>>> -        /* TODO: enqueue_primary_packet */
>>>> +        ret = colo_enqueue_primary_packet(nf, sender, flags, iov,
>>>> +                    iovcnt, sent_cb);
>>> The routine above is 'colo_enqueue_packet' rather than colo_enqueue_primary_packet?
>> yes,colo_enqueue_packet is enqueue packet common
>>
>> Thanks for review
>> zhangchen
>>
>>>>       } else {
>>>>           ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
>>>>                       sent_cb, COLO_PRIMARY_MODE);
>>>> -- 
>>>> 1.9.1
>>> Dave
>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>>
>>> .
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function
  2015-12-04  3:21         ` Zhang Chen
@ 2015-12-04  9:14           ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 35+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-04  9:14 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Li Zhijian, Gui jianfeng, Jason Wang, eddie.dong, qemu devel,
	Huang peng, Gong lei, Stefan Hajnoczi, jan.kiszka, hongyang.yang,
	zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> Hi,Dave
> 
> 
> On 12/03/2015 05:09 PM, Dr. David Alan Gilbert wrote:
> >* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> >>Hi,Dave
> >>
> >>On 12/02/2015 12:12 AM, Dr. David Alan Gilbert wrote:
> >>>* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> >>>>From: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> >>>>
> >>>>Add common packet handle function and enqueue
> >>>>packet distinguished connection,then we can
> >>>>lookup one connection packet to compare
> >>>>
> >>>>Signed-off-by: zhangchen <zhangchen.fnst@cn.fujitsu.com>
> >>>>---
> >>>>  net/colo-proxy.c | 167 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> >>>>  1 file changed, 166 insertions(+), 1 deletion(-)
> >>>>
> >>>>diff --git a/net/colo-proxy.c b/net/colo-proxy.c
> >>>>index 08a852f..a664e6d 100644
> >>>>--- a/net/colo-proxy.c
> >>>>+++ b/net/colo-proxy.c
> >>>>@@ -24,6 +24,170 @@
> >>>>  static char *mode;
> >>>>  static bool colo_do_checkpoint;
> >>>>+static void packet_destroy(void *opaque, void *user_data);
> >>>>+
> >>>>+static uint32_t connection_key_hash(const void *opaque)
> >>>>+{
> >>>>+    const Connection_key *key = opaque;
> >>>>+    uint32_t a, b, c;
> >>>>+
> >>>>+    /* Jenkins hash */
> >>>>+    a = b = c = JHASH_INITVAL + sizeof(*key);
> >>>>+    a += key->src;
> >>>>+    b += key->dst;
> >>>>+    c += key->ports;
> >>>>+    __jhash_mix(a, b, c);
> >>>>+
> >>>>+    a += key->ip_proto;
> >>>>+    __jhash_final(a, b, c);
> >>>>+
> >>>>+    return c;
> >>>>+}
> >>>>+
> >>>>+static int connection_key_equal(const void *opaque1, const void *opaque2)
> >>>>+{
> >>>>+    return memcmp(opaque1, opaque2, sizeof(Connection_key)) == 0;
> >>>>+}
> >>>>+
> >>>>+static void connection_destroy(void *opaque)
> >>>>+{
> >>>>+    Connection *connection = opaque;
> >>>>+    g_queue_foreach(&connection->primary_list, packet_destroy, NULL);
> >>>>+    g_queue_free(&connection->primary_list);
> >>>>+    g_queue_foreach(&connection->secondary_list, packet_destroy, NULL);
> >>>>+    g_queue_free(&connection->secondary_list);
> >>>>+    g_slice_free(Connection, connection);
> >>>>+}
> >>>>+
> >>>>+static Connection *connection_new(void)
> >>>>+{
> >>>>+    Connection *connection = g_slice_new(Connection);
> >>>>+
> >>>>+    g_queue_init(&connection->primary_list);
> >>>>+    g_queue_init(&connection->secondary_list);
> >>>>+    connection->processing = false;
> >>>>+
> >>>>+    return connection;
> >>>>+}
> >>>>+
> >>>>+/* Return 0 on success, or return -1 if the pkt is corrpted */
> >>>>+static int parse_packet_early(Packet *pkt, Connection_key *key)
> >>>>+{
> >>>>+    int network_length;
> >>>>+    uint8_t *data = pkt->data;
> >>>>+
> >>>>+    pkt->network_layer = data + ETH_HLEN;
> >>>>+    if (ntohs(*(uint16_t *)(data + 12)) != ETH_P_IP) {
> >>>>+        if (ntohs(*(uint16_t *)(data + 12)) == ETH_P_ARP) {
> >>>>+            return -1;
> >>>>+        }
> >>>>+        return 0;
> >>>>+    }
> >>>Can you use some of the functions/macros in include/net/eth.h to
> >>>make this easier? Maybe eth_get_l3_proto ?
> >>>Do you plan to do IPv6 at some point?
> >>I will use include/net/eth.h in next version
> >>
> >>IPv6 currently not support, still colo framework be merged
> >>
> >>>>+    network_length = pkt->ip->ip_hl * 4;
> >>>>+    pkt->transport_layer = pkt->network_layer + network_length;
> >>>>+    key->ip_proto = pkt->ip->ip_p;
> >>>>+    key->src = pkt->ip->ip_src;
> >>>>+    key->dst = pkt->ip->ip_dst;
> >>>>+
> >>>>+    switch (key->ip_proto) {
> >>>>+    case IPPROTO_TCP:
> >>>>+    case IPPROTO_UDP:
> >>>>+    case IPPROTO_DCCP:
> >>>>+    case IPPROTO_ESP:
> >>>>+    case IPPROTO_SCTP:
> >>>>+    case IPPROTO_UDPLITE:
> >>>>+        key->ports = *(uint32_t *)(pkt->transport_layer);
> >>>>+        break;
> >>>>+    case IPPROTO_AH:
> >>>>+        key->ports = *(uint32_t *)(pkt->transport_layer + 4);
> >>>Interesting; I don't see any other code in QEMU to handle AH,
> >>>and I don't know much about it.
> >>>
> >>>>+        break;
> >>>>+    default:
> >>>>+        break;
> >>>>+    }
> >>>>+
> >>>>+    return 0;
> >>>>+}
> >>>>+
> >>>>+static Packet *packet_new(ColoProxyState *s, const void *data,
> >>>>+                          int size, Connection_key *key, NetClientState *sender)
> >>>>+{
> >>>>+    Packet *pkt = g_slice_new(Packet);
> >>>>+
> >>>>+    pkt->data = g_malloc(size);
> >>>>+    memcpy(pkt->data, data, size);
> >>>g_memdup might be useful for these:
> >>>https://developer.gnome.org/glib/stable/glib-Memory-Allocation.html#g-memdup
> >>I will fix it in next version
> >>
> >>>>+    pkt->size = size;
> >>>>+    pkt->s = s;
> >>>>+    pkt->sender = sender;
> >>>>+    pkt->should_be_sent = false;
> >>>>+
> >>>>+    if (parse_packet_early(pkt, key)) {
> >>>>+        packet_destroy(pkt, NULL);
> >>>>+        pkt = NULL;
> >>>>+    }
> >>>>+
> >>>>+    return pkt;
> >>>>+}
> >>>>+
> >>>>+static void packet_destroy(void *opaque, void *user_data)
> >>>>+{
> >>>>+    Packet *pkt = opaque;
> >>>>+    g_free(pkt->data);
> >>>>+    g_slice_free(Packet, pkt);
> >>>>+}
> >>>>+
> >>>>+static Connection *colo_proxy_enqueue_packet(GHashTable *unprocessed_packets,
> >>>>+                                          Connection_key *key,
> >>>>+                                          Packet *pkt, packet_type type)
> >>>>+{
> >>>>+    Connection *connection;
> >>>>+    Packet *tmppkt;
> >>>>+    connection = g_hash_table_lookup(unprocessed_packets, key);
> >>>>+    if (connection == NULL) {
> >>>>+        Connection_key *new_key = g_malloc(sizeof(*key));
> >>>>+
> >>>>+        connection = connection_new();
> >>>>+        memcpy(new_key, key, sizeof(*key));
> >>>>+        key = new_key;
> >>>>+
> >>>>+        g_hash_table_insert(unprocessed_packets, key, connection);
> >>>Is 'unprocessed_packets' a good name for this hashtable? I'm not quite
> >>>sure I understand, but it looks to me like it's your connection-tracking equivalent,
> >>>which then has a queue for each connection with unprocessed packets?
> >>i will change hashtable name to connection_track_table,is it ok?
> >Yes, thank you.
> >
> >>>Also, do we do anything to stop this hash growing really huge? If there
> >>>are lots-and-lots of connections can we limit it somehow? (what does Linux do?)
> >>when we find PVM's packet different to SVM's packet,colo will do checkpoint.
> >>that's means we will flush all connection's packets,even though all packets
> >>are
> >>same,colo will alse do checkpoint periodically. so hashtable can't growing
> >>really huge.
> >I see the flush clears all the packets, but does it also clear the hash?
> >
> 
> I read the kernel code,TCP conntrack will clear hash one time every five
> days periodicity.
> and the hashtable size
>     /* Idea from tcp.c: use 1/16384 of memory.  On i386: 32MB
>      * machine has 512 buckets. >= 1GB machines have 16384 buckets. */
> I will follow kernel's done to fix colo-proxy in next version.

I think it's OK if you just set a size and make it limit to that size;
lets keep it simple for now.   I think you'll also have to free entries
when you see the TCP connection closed.


Dave

> 
> Thanks for review
> zhangchen
> 
> >>>>+    }
> >>>>+    switch (type) {
> >>>>+    case PRIMARY_OUTPUT:
> >>>>+        if (g_queue_get_length(&connection->secondary_list) > 0) {
> >>>Please add some more comments; I think this is when a packet comes in
> >>>on the primary, and then we find we've already got a packet from the secondary
> >>>waiting?
> >>yes,you are right
> >>
> >>I will add more comments in next version
> >Thank you.
> >
> >Dave
> >
> >>>>+            tmppkt = g_queue_pop_head(&connection->secondary_list);
> >>>>+            DEBUG("g_queue_get_length(&connection->primary_list)=%d\n",
> >>>>+                        g_queue_get_length(&connection->primary_list));
> >>>>+            DEBUG("g_queue_get_length(&connection->secondary_list)=%d\n",
> >>>>+                        g_queue_get_length(&connection->secondary_list));
> >>>>+            if (colo_packet_compare(pkt, tmppkt)) {
> >>>>+                DEBUG("packet same and release packet\n");
> >>>>+                pkt->should_be_sent = true;
> >>>>+                break;
> >>>>+            } else {
> >>>>+                DEBUG("packet different\n");
> >>>>+                colo_proxy_notify_checkpoint();
> >>>>+                pkt->should_be_sent = false;
> >>>>+                break;
> >>>>+            }
> >>>>+        } else {
> >>>>+            g_queue_push_tail(&connection->primary_list, pkt);
> >>>>+            pkt->should_be_sent = false;
> >>>>+        }
> >>>>+
> >>>>+        break;
> >>>>+    case SECONDARY_OUTPUT:
> >>>>+        g_queue_push_tail(&connection->secondary_list, pkt);
> >>>>+        DEBUG("secondary pkt data=%s,  pkt->ip->ipsrc=%x,pkt->ip->ipdst=%x\n",
> >>>>+                    (char *)pkt->data, pkt->ip->ip_src, pkt->ip->ip_dst);
> >>>>+        break;
> >>>>+    default:
> >>>>+        abort();
> >>>>+    }
> >>>>+
> >>>>+    return connection;
> >>>>+}
> >>>>+
> >>>>  /*
> >>>>   * Packets to be sent by colo forward to
> >>>>@@ -165,7 +329,8 @@ static ssize_t colo_proxy_primary_handler(NetFilterState *nf,
> >>>>      }
> >>>>      if (direction == NET_FILTER_DIRECTION_RX) {
> >>>>-        /* TODO: enqueue_primary_packet */
> >>>>+        ret = colo_enqueue_primary_packet(nf, sender, flags, iov,
> >>>>+                    iovcnt, sent_cb);
> >>>The routine above is 'colo_enqueue_packet' rather than colo_enqueue_primary_packet?
> >>yes,colo_enqueue_packet is enqueue packet common
> >>
> >>Thanks for review
> >>zhangchen
> >>
> >>>>      } else {
> >>>>          ret = colo_forward2another(nf, sender, flags, iov, iovcnt,
> >>>>                      sent_cb, COLO_PRIMARY_MODE);
> >>>>-- 
> >>>>1.9.1
> >>>Dave
> >>>
> >>>>
> >>>--
> >>>Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >>>
> >>>
> >>>.
> >>>
> >>
> >>
> >--
> >Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
> >
> >.
> >
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2015-12-04  9:14 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-27 12:27 [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Zhang Chen
2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 1/9] Init colo-proxy object " Zhang Chen
2015-11-30  2:50   ` Wen Congyang
2015-11-30  5:38     ` Zhang Chen
2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 2/9] jhash: add linux kernel jhashtable in qemu Zhang Chen
2015-12-01 11:23   ` Dr. David Alan Gilbert
2015-12-03  3:40     ` Zhang Chen
2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 3/9] colo-proxy: add colo-proxy framework Zhang Chen
2015-11-28  2:46   ` Hailiang Zhang
2015-11-30  2:25     ` Zhang Chen
2015-11-30  3:10   ` Wen Congyang
2015-11-30  5:44     ` Zhang Chen
2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 4/9] colo-proxy: add colo-proxy setup work Zhang Chen
2015-11-28  3:02   ` Hailiang Zhang
2015-11-30  2:35     ` Zhang Chen
2015-12-01 15:35   ` Dr. David Alan Gilbert
2015-12-03  3:49     ` Zhang Chen
2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 5/9] net/colo-proxy: add colo packet handler Zhang Chen
2015-11-28  3:17   ` Hailiang Zhang
2015-11-30  5:37     ` Zhang Chen
2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 6/9] net/colo-proxy: add packet forward function Zhang Chen
2015-12-01 15:50   ` Dr. David Alan Gilbert
2015-12-03  6:17     ` Zhang Chen
2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 7/9] net/colo-proxy: add packet enqueue and handle function Zhang Chen
2015-12-01 16:12   ` Dr. David Alan Gilbert
2015-12-03  6:35     ` Zhang Chen
2015-12-03  9:09       ` Dr. David Alan Gilbert
2015-12-04  3:21         ` Zhang Chen
2015-12-04  9:14           ` Dr. David Alan Gilbert
2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 8/9] net/colo-proxy: enqueue primary and secondary packet Zhang Chen
2015-11-27 12:27 ` [Qemu-devel] [RFC PATCH 9/9] net/colo-proxy: add packet compare and notify checkpoint Zhang Chen
2015-12-01 16:37   ` Dr. David Alan Gilbert
2015-12-03  7:10     ` Zhang Chen
2015-12-01 16:44 ` [Qemu-devel] [RFC PATCH 0/9] Add colo-proxy based on netfilter Dr. David Alan Gilbert
2015-12-03  7:33   ` Zhang Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).