* [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare
@ 2017-04-22  8:25 zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 01/18] net/colo: Add notifier/callback related helpers for filter zhanghailiang
                   ` (17 more replies)
  0 siblings, 18 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Dong eddie, Jiang yunhong, Xu Quan, Jason Wang

Hi,

COLO Frame, block replication and COLO net compare have existed in QEMU for a
long time; it's time to integrate these three parts to make COLO really work.

In this series, we have some optimizations for the COLO frame, including
separating the process of saving RAM and device state, and using a COLO_EXIT
event to notify users that the VM has exited COLO. Most of these parts were
reviewed long ago in an older version, but since this series has just been
rebased on upstream, which merged a new migration series, some of the patches
in this series deserve review again.

We use a notifier/callback method for COLO compare to notify the COLO frame
about net packet inconsistency events, and add a handle_event method to
NetFilterClass to help the COLO frame notify filters and colo-compare about
checkpoint/failover events; it is flexible.
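
As a rough sketch of how the notifier mechanism fits together (names taken
from patches 01-04 of this series; the diagram itself is only illustrative):

  /*
   * COLO frame (checkpoint thread)          compare thread (GMainLoop)
   *
   * colo_notify_compares_event()
   *   filter_notifier_set(eventfd) ------>  filter_notify_dispatch()
   *   event_unhandled_count++                 -> colo_compare_handle_event()
   *   wait on event_complete_cond                  flush packets, then
   *     <---- qemu_cond_broadcast() -------        event_unhandled_count--
   * (returns once event_unhandled_count == 0)
   */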

Besides, this series is on top of '[PATCH 0/3] colo-compare: fix three bugs' series.

For the newest version, please refer to:
https://github.com/coloft/qemu/tree/colo-for-qemu-2.10-2017-4-22

Please review, thanks.

Cc: Dong eddie <eddie.dong@intel.com>
Cc: Jiang yunhong <yunhong.jiang@intel.com>
Cc: Xu Quan <xuquan8@huawei.com>
Cc: Jason Wang <jasowang@redhat.com> 

zhanghailiang (18):
  net/colo: Add notifier/callback related helpers for filter
  colo-compare: implement the process of checkpoint
  colo-compare: use notifier to notify packets comparing result
  COLO: integrate colo compare with colo frame
  COLO: Handle shutdown command for VM in COLO state
  COLO: Add block replication into colo process
  COLO: Load dirty pages into SVM's RAM cache firstly
  ram/COLO: Record the dirty pages that SVM received
  COLO: Flush memory data from ram cache
  qmp event: Add COLO_EXIT event to notify users while exited COLO
  savevm: split save/find loadvm_handlers entry into two helper
    functions
  savevm: split the process of different stages for loadvm/savevm
  COLO: Separate the process of saving/loading ram and device state
  COLO: Split qemu_savevm_state_begin out of checkpoint process
  COLO: flush host dirty ram from cache
  filter: Add handle_event method for NetFilterClass
  filter-rewriter: handle checkpoint and failover event
  COLO: notify net filters about checkpoint/failover event

 include/exec/ram_addr.h       |   1 +
 include/migration/colo.h      |   1 +
 include/migration/migration.h |   5 +
 include/net/filter.h          |   5 +
 include/sysemu/sysemu.h       |   9 ++
 migration/colo.c              | 242 +++++++++++++++++++++++++++++++++++++++---
 migration/migration.c         |  24 ++++-
 migration/ram.c               | 147 ++++++++++++++++++++++++-
 migration/savevm.c            | 113 ++++++++++++++++----
 migration/trace-events        |   2 +
 net/colo-compare.c            | 110 ++++++++++++++++++-
 net/colo-compare.h            |   8 ++
 net/colo.c                    | 105 ++++++++++++++++++
 net/colo.h                    |  19 ++++
 net/filter-rewriter.c         |  39 +++++++
 net/filter.c                  |  16 +++
 net/net.c                     |  28 +++++
 qapi-schema.json              |  18 +++-
 qapi/event.json               |  21 ++++
 vl.c                          |  19 +++-
 20 files changed, 886 insertions(+), 46 deletions(-)
 create mode 100644 net/colo-compare.h

-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 01/18] net/colo: Add notifier/callback related helpers for filter
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 02/18] colo-compare: implement the process of checkpoint zhanghailiang
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Jason Wang

We will use this notifier to help COLO notify a filter object to do
something, like performing a checkpoint or processing a failover event.
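
For reference, a minimal usage sketch of the helpers added below (the
callback, the MyState type and the worker context are illustrative, not
part of this patch):

  static void my_handle_event(void *opaque, int event)
  {
      /* filter_notify_dispatch() passes the FilterNotifier itself;
       * its ->opaque field carries the pointer given to
       * filter_notifier_new(). */
      FilterNotifier *notify = opaque;
      MyState *s = notify->opaque;
      /* react to COLO_CHECKPOINT / COLO_FAILOVER here */
  }

  /* In the owning thread, attach the notifier to a worker context: */
  FilterNotifier *notify = filter_notifier_new(my_handle_event, s, errp);
  g_source_attach(&notify->source, worker_context);

  /* Any other thread can then wake that context's main loop: */
  filter_notifier_set(notify, COLO_CHECKPOINT);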

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 net/colo.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo.h |  19 +++++++++++
 2 files changed, 124 insertions(+)

diff --git a/net/colo.c b/net/colo.c
index 8cc166b..8aef670 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -15,6 +15,7 @@
 #include "qemu/osdep.h"
 #include "trace.h"
 #include "net/colo.h"
+#include "qapi/error.h"
 
 uint32_t connection_key_hash(const void *opaque)
 {
@@ -209,3 +210,107 @@ Connection *connection_get(GHashTable *connection_track_table,
 
     return conn;
 }
+
+static gboolean
+filter_notify_prepare(GSource *source, gint *timeout)
+{
+    *timeout = -1;
+
+    return FALSE;
+}
+
+static gboolean
+filter_notify_check(GSource *source)
+{
+    FilterNotifier *notify = (FilterNotifier *)source;
+
+    return notify->pfd.revents & (G_IO_IN | G_IO_HUP | G_IO_ERR);
+}
+
+static gboolean
+filter_notify_dispatch(GSource *source,
+                       GSourceFunc callback,
+                       gpointer user_data)
+{
+    FilterNotifier *notify = (FilterNotifier *)source;
+    int revents;
+    uint64_t value;
+    int ret;
+
+    revents = notify->pfd.revents & notify->pfd.events;
+    if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) {
+        ret = filter_notifier_get(notify, &value);
+        if (notify->cb && !ret) {
+            notify->cb(notify, value);
+        }
+    }
+    return TRUE;
+}
+
+static void
+filter_notify_finalize(GSource *source)
+{
+    FilterNotifier *notify = (FilterNotifier *)source;
+
+    event_notifier_cleanup(&notify->event);
+}
+
+static GSourceFuncs notifier_source_funcs = {
+    filter_notify_prepare,
+    filter_notify_check,
+    filter_notify_dispatch,
+    filter_notify_finalize,
+};
+
+FilterNotifier *filter_notifier_new(FilterNotifierCallback *cb,
+                    void *opaque, Error **errp)
+{
+    FilterNotifier *notify;
+    int ret;
+
+    notify = (FilterNotifier *)g_source_new(&notifier_source_funcs,
+                sizeof(FilterNotifier));
+    ret = event_notifier_init(&notify->event, false);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Failed to initialize event notifier");
+        goto fail;
+    }
+    notify->pfd.fd = event_notifier_get_fd(&notify->event);
+    notify->pfd.events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+    notify->cb = cb;
+    notify->opaque = opaque;
+    g_source_add_poll(&notify->source, &notify->pfd);
+
+    return notify;
+
+fail:
+    g_source_destroy(&notify->source);
+    return NULL;
+}
+
+int filter_notifier_set(FilterNotifier *notify, uint64_t value)
+{
+    ssize_t ret;
+
+    do {
+        ret = write(notify->event.wfd, &value, sizeof(value));
+    } while (ret < 0 && errno == EINTR);
+
+    /* EAGAIN is fine, a read must be pending.  */
+    if (ret < 0 && errno != EAGAIN) {
+        return -errno;
+    }
+    return 0;
+}
+
+int filter_notifier_get(FilterNotifier *notify, uint64_t *value)
+{
+    ssize_t len;
+
+    /* Drain the notify pipe.  For eventfd, only 8 bytes will be read.  */
+    do {
+        len = read(notify->event.rfd, value, sizeof(*value));
+    } while ((len == -1 && errno == EINTR));
+
+    return len != sizeof(*value) ? -1 : 0;
+}
diff --git a/net/colo.h b/net/colo.h
index cd9027f..b586db3 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -19,6 +19,7 @@
 #include "qemu/jhash.h"
 #include "qemu/timer.h"
 #include "slirp/tcp.h"
+#include "qemu/event_notifier.h"
 
 #define HASHTABLE_MAX_SIZE 16384
 
@@ -89,4 +90,22 @@ void connection_hashtable_reset(GHashTable *connection_track_table);
 Packet *packet_new(const void *data, int size);
 void packet_destroy(void *opaque, void *user_data);
 
+typedef void FilterNotifierCallback(void *opaque, int value);
+typedef struct FilterNotifier {
+    GSource source;
+    EventNotifier event;
+    GPollFD pfd;
+    FilterNotifierCallback *cb;
+    void *opaque;
+} FilterNotifier;
+
+FilterNotifier *filter_notifier_new(FilterNotifierCallback *cb,
+                    void *opaque, Error **errp);
+int filter_notifier_set(FilterNotifier *notify, uint64_t value);
+int filter_notifier_get(FilterNotifier *notify, uint64_t *value);
+
+enum {
+    COLO_CHECKPOINT = 2,
+    COLO_FAILOVER,
+};
 #endif /* QEMU_COLO_PROXY_H */
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 02/18] colo-compare: implement the process of checkpoint
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 01/18] net/colo: Add notifier/callback related helpers for filter zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 03/18] colo-compare: use notifier to notify packets comparing result zhanghailiang
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Jason Wang

While doing a checkpoint, we need to flush all the unhandled packets.
By using the filter notifier mechanism, we can easily notify every
compare object to do this process, which runs inside the compare
thread's main loop.
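
The frame-side caller added later in this series (patch 04) then boils
down to:

      Error *local_err = NULL;

      colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
      if (local_err) {
          goto out;
      }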

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-compare.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo-compare.h |  6 +++++
 2 files changed, 84 insertions(+)
 create mode 100644 net/colo-compare.h

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 97bf0e5..3adccfb 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -29,17 +29,24 @@
 #include "qemu/sockets.h"
 #include "qapi-visit.h"
 #include "net/colo.h"
+#include "net/colo-compare.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
     OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
+static QTAILQ_HEAD(, CompareState) net_compares =
+       QTAILQ_HEAD_INITIALIZER(net_compares);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
 /* TODO: Should be configurable */
 #define REGULAR_PACKET_CHECK_MS 3000
 
+static QemuMutex event_mtx = { .lock = PTHREAD_MUTEX_INITIALIZER };
+static QemuCond event_complete_cond = { .cond = PTHREAD_COND_INITIALIZER };
+static int event_unhandled_count;
 /*
   + CompareState ++
   |               |
@@ -87,6 +94,10 @@ typedef struct CompareState {
 
     GMainContext *worker_context;
     GMainLoop *compare_loop;
+    /* Used for COLO to notify compare to do something */
+    FilterNotifier *notifier;
+
+    QTAILQ_ENTRY(CompareState) next;
 } CompareState;
 
 typedef struct CompareClass {
@@ -417,6 +428,11 @@ static void colo_compare_connection(void *opaque, void *user_data)
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
         pkt = g_queue_pop_tail(&conn->primary_list);
+        if (!pkt) {
+            error_report("colo-compare pop pkt failed");
+            return;
+        }
+
         switch (conn->ip_proto) {
         case IPPROTO_TCP:
             result = g_queue_find_custom(&conn->secondary_list,
@@ -538,6 +554,53 @@ static gboolean check_old_packet_regular(void *opaque)
     return TRUE;
 }
 
+/* Public API, used by the COLO frame to notify compare objects of an event */
+void colo_notify_compares_event(void *opaque, int event, Error **errp)
+{
+    CompareState *s;
+    int ret;
+
+    qemu_mutex_lock(&event_mtx);
+    QTAILQ_FOREACH(s, &net_compares, next) {
+        ret = filter_notifier_set(s->notifier, event);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to write value to eventfd");
+            goto fail;
+        }
+        event_unhandled_count++;
+    }
+    /* Wait for all compare threads to finish handling this event */
+    while (event_unhandled_count > 0) {
+        qemu_cond_wait(&event_complete_cond, &event_mtx);
+    }
+
+fail:
+    qemu_mutex_unlock(&event_mtx);
+}
+
+static void colo_flush_packets(void *opaque, void *user_data);
+
+static void colo_compare_handle_event(void *opaque, int event)
+{
+    FilterNotifier *notify = opaque;
+    CompareState *s = notify->opaque;
+
+    switch (event) {
+    case COLO_CHECKPOINT:
+        g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+        break;
+    case COLO_FAILOVER:
+        break;
+    default:
+        break;
+    }
+    qemu_mutex_lock(&event_mtx);
+    assert(event_unhandled_count > 0);
+    event_unhandled_count--;
+    qemu_cond_broadcast(&event_complete_cond);
+    qemu_mutex_unlock(&event_mtx);
+}
+
 static void *colo_compare_thread(void *opaque)
 {
     CompareState *s = opaque;
@@ -558,10 +621,15 @@ static void *colo_compare_thread(void *opaque)
                           (GSourceFunc)check_old_packet_regular, s, NULL);
     g_source_attach(timeout_source, s->worker_context);
 
+    s->notifier = filter_notifier_new(colo_compare_handle_event, s, NULL);
+    g_source_attach(&s->notifier->source, s->worker_context);
+
     qemu_sem_post(&s->thread_ready);
 
     g_main_loop_run(s->compare_loop);
 
+    g_source_destroy(&s->notifier->source);
+    g_source_unref(&s->notifier->source);
     g_source_destroy(timeout_source);
     g_source_unref(timeout_source);
 
@@ -706,6 +774,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
     net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
     net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
 
+    QTAILQ_INSERT_TAIL(&net_compares, s, next);
+
     g_queue_init(&s->conn_list);
 
     s->connection_track_table = g_hash_table_new_full(connection_key_hash,
@@ -765,6 +835,7 @@ static void colo_compare_init(Object *obj)
 static void colo_compare_finalize(Object *obj)
 {
     CompareState *s = COLO_COMPARE(obj);
+    CompareState *tmp = NULL;
 
     qemu_chr_fe_set_handlers(&s->chr_pri_in, NULL, NULL, NULL, NULL,
                              s->worker_context, true);
@@ -777,6 +848,13 @@ static void colo_compare_finalize(Object *obj)
     }
     qemu_thread_join(&s->thread);
 
+    QTAILQ_FOREACH(tmp, &net_compares, next) {
+        if (!strcmp(tmp->outdev, s->outdev)) {
+            QTAILQ_REMOVE(&net_compares, s, next);
+            break;
+        }
+    }
+
     /* Release all unhandled packets after compare thread exited */
     g_queue_foreach(&s->conn_list, colo_flush_packets, s);
 
diff --git a/net/colo-compare.h b/net/colo-compare.h
new file mode 100644
index 0000000..c9c62f5
--- /dev/null
+++ b/net/colo-compare.h
@@ -0,0 +1,6 @@
+#ifndef QEMU_COLO_COMPARE_H
+#define QEMU_COLO_COMPARE_H
+
+void colo_notify_compares_event(void *opaque, int event, Error **errp);
+
+#endif /* QEMU_COLO_COMPARE_H */
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 03/18] colo-compare: use notifier to notify packets comparing result
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 01/18] net/colo: Add notifier/callback related helpers for filter zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 02/18] colo-compare: implement the process of checkpoint zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 04/18] COLO: integrate colo compare with colo frame zhanghailiang
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Jason Wang

It's a good idea to use a notifier to notify the COLO frame of
inconsistent packet comparison results.
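
Patch 04 of this series consumes this from the COLO frame; roughly:

  static Notifier packets_compare_notifier;

  static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
  {
      colo_checkpoint_notify(data);   /* data is the current MigrationState */
  }

      packets_compare_notifier.notify = colo_compare_notify_checkpoint;
      colo_compare_register_notifier(&packets_compare_notifier);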

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo-compare.c | 32 ++++++++++++++++++++++++++++----
 net/colo-compare.h |  2 ++
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 3adccfb..bb234dd 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -30,6 +30,7 @@
 #include "qapi-visit.h"
 #include "net/colo.h"
 #include "net/colo-compare.h"
+#include "migration/migration.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
@@ -38,6 +39,9 @@
 static QTAILQ_HEAD(, CompareState) net_compares =
        QTAILQ_HEAD_INITIALIZER(net_compares);
 
+static NotifierList colo_compare_notifiers =
+    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
@@ -384,6 +388,22 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
     }
 }
 
+static void colo_compare_inconsistent_notify(void)
+{
+    notifier_list_notify(&colo_compare_notifiers,
+                migrate_get_current());
+}
+
+void colo_compare_register_notifier(Notifier *notify)
+{
+    notifier_list_add(&colo_compare_notifiers, notify);
+}
+
+void colo_compare_unregister_notifier(Notifier *notify)
+{
+    notifier_remove(notify);
+}
+
 static void colo_old_packet_check_one_conn(void *opaque,
                                            void *user_data)
 {
@@ -397,7 +417,7 @@ static void colo_old_packet_check_one_conn(void *opaque,
 
     if (result) {
         /* do checkpoint will flush old packet */
-        /* TODO: colo_notify_checkpoint();*/
+        colo_compare_inconsistent_notify();
     }
 }
 
@@ -415,7 +435,10 @@ static void colo_old_packet_check(void *opaque)
 
 /*
  * Called from the compare thread on the primary
- * for compare connection
+ * for compare connection.
+ * TODO: Reconstruct this function; we should hold the max handled sequence
+ * number of the connection, and not trigger a checkpoint request if we only
+ * get packets from one side (primary or secondary).
  */
 static void colo_compare_connection(void *opaque, void *user_data)
 {
@@ -464,11 +487,12 @@ static void colo_compare_connection(void *opaque, void *user_data)
             /*
              * If one packet arrive late, the secondary_list or
              * primary_list will be empty, so we can't compare it
-             * until next comparison.
+             * until the next comparison. If the packets in the list
+             * time out, a checkpoint request will be triggered.
              */
             trace_colo_compare_main("packet different");
             g_queue_push_tail(&conn->primary_list, pkt);
-            /* TODO: colo_notify_checkpoint();*/
+            colo_compare_inconsistent_notify();
             break;
         }
     }
diff --git a/net/colo-compare.h b/net/colo-compare.h
index c9c62f5..a0b573e 100644
--- a/net/colo-compare.h
+++ b/net/colo-compare.h
@@ -2,5 +2,7 @@
 #define QEMU_COLO_COMPARE_H
 
 void colo_notify_compares_event(void *opaque, int event, Error **errp);
+void colo_compare_register_notifier(Notifier *notify);
+void colo_compare_unregister_notifier(Notifier *notify);
 
 #endif /* QEMU_COLO_COMPARE_H */
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 04/18] COLO: integrate colo compare with colo frame
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (2 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 03/18] colo-compare: use notifier to notify packets comparing result zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 05/18] COLO: Handle shutdown command for VM in COLO state zhanghailiang
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Jason Wang

For COLO FT, both the PVM and SVM run at the same time, and we only
synchronize state when needed.

So here, let the SVM run while not doing a checkpoint, and change
DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100 (i.e. 20000 ms, 20 s
between checkpoints).

Besides, we forgot to release colo_checkpoint_sem and
colo_delay_timer; fix them here.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c      | 42 ++++++++++++++++++++++++++++++++++++++++--
 migration/migration.c |  2 +-
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index c19eb3f..a3344ce 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -21,8 +21,11 @@
 #include "migration/failover.h"
 #include "replication.h"
 #include "qmp-commands.h"
+#include "net/colo-compare.h"
+#include "net/colo.h"
 
 static bool vmstate_loading;
+static Notifier packets_compare_notifier;
 
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
@@ -332,6 +335,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     /* Disable block migration */
     s->params.blk = 0;
     s->params.shared = 0;
@@ -390,6 +398,11 @@ out:
     return ret;
 }
 
+static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
+{
+    colo_checkpoint_notify(data);
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -406,6 +419,9 @@ static void colo_process_checkpoint(MigrationState *s)
         goto out;
     }
 
+    packets_compare_notifier.notify = colo_compare_notify_checkpoint;
+    colo_compare_register_notifier(&packets_compare_notifier);
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -451,11 +467,21 @@ out:
         qemu_fclose(fb);
     }
 
-    timer_del(s->colo_delay_timer);
-
     /* Hope this not to be too long to wait here */
     qemu_sem_wait(&s->colo_exit_sem);
     qemu_sem_destroy(&s->colo_exit_sem);
+
+    /*
+     * It is safe to unregister the notifier after failover has finished.
+     * Besides, colo_delay_timer and colo_checkpoint_sem can't be
+     * released before unregistering the notifier, or there will be a
+     * use-after-free error.
+     */
+    colo_compare_unregister_notifier(&packets_compare_notifier);
+    timer_del(s->colo_delay_timer);
+    timer_free(s->colo_delay_timer);
+    qemu_sem_destroy(&s->colo_checkpoint_sem);
+
     /*
      * Must be called after failover BH is completed,
      * Or the failover BH may shutdown the wrong fd that
@@ -548,6 +574,11 @@ void *colo_process_incoming_thread(void *opaque)
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));
 
+    qemu_mutex_lock_iothread();
+    vm_start();
+    trace_colo_vm_state_change("stop", "run");
+    qemu_mutex_unlock_iothread();
+
     colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
                       &local_err);
     if (local_err) {
@@ -567,6 +598,11 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        trace_colo_vm_state_change("run", "stop");
+        qemu_mutex_unlock_iothread();
+
         /* FIXME: This is unnecessary for periodic checkpoint mode */
         colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
                      &local_err);
@@ -620,6 +656,8 @@ void *colo_process_incoming_thread(void *opaque)
         }
 
         vmstate_loading = false;
+        vm_start();
+        trace_colo_vm_state_change("stop", "run");
         qemu_mutex_unlock_iothread();
 
         if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
diff --git a/migration/migration.c b/migration/migration.c
index 353f272..2ade2aa 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -70,7 +70,7 @@
 /* The delay time (in ms) between two COLO checkpoints
  * Note: Please change this default value to 10000 when we support hybrid mode.
  */
-#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
+#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
 
 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 05/18] COLO: Handle shutdown command for VM in COLO state
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (3 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 04/18] COLO: integrate colo compare with colo frame zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-24 14:51   ` Eric Blake
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 06/18] COLO: Add block replication into colo process zhanghailiang
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Paolo Bonzini

If the VM is in COLO FT state, we need to do some extra work before
starting the normal shutdown process.

The Secondary VM will ignore the shutdown command if users issue it
directly to the Secondary VM. COLO captures the shutdown command on the
primary side and relays it to the Secondary VM after receiving the
shutdown request from the user.
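
A rough sketch of the resulting flow, as I read this patch:

  /*
   * qemu_system_shutdown_request()
   *   -> colo_handle_shutdown()
   *        SVM: return true                 (request is simply ignored)
   *        PVM: colo_shutdown_requested = 1 (request is deferred)
   * The PVM's checkpoint thread then sends COLO_MESSAGE_GUEST_SHUTDOWN
   * to the SVM, and each side finally calls
   * qemu_system_shutdown_request_core() to run the normal shutdown path.
   */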

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/migration/colo.h |  1 +
 include/sysemu/sysemu.h  |  3 +++
 migration/colo.c         | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 qapi-schema.json         |  4 +++-
 vl.c                     | 19 ++++++++++++++++---
 5 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 2bbff9e..aadd040 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -37,4 +37,5 @@ COLOMode get_colo_mode(void);
 void colo_do_failover(MigrationState *s);
 
 void colo_checkpoint_notify(void *opaque);
+bool colo_handle_shutdown(void);
 #endif
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 16175f7..8054f53 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -49,6 +49,8 @@ typedef enum WakeupReason {
     QEMU_WAKEUP_REASON_OTHER,
 } WakeupReason;
 
+extern int colo_shutdown_requested;
+
 void qemu_system_reset_request(void);
 void qemu_system_suspend_request(void);
 void qemu_register_suspend_notifier(Notifier *notifier);
@@ -56,6 +58,7 @@ void qemu_system_wakeup_request(WakeupReason reason);
 void qemu_system_wakeup_enable(WakeupReason reason, bool enabled);
 void qemu_register_wakeup_notifier(Notifier *notifier);
 void qemu_system_shutdown_request(void);
+void qemu_system_shutdown_request_core(void);
 void qemu_system_powerdown_request(void);
 void qemu_register_powerdown_notifier(Notifier *notifier);
 void qemu_system_debug_request(void);
diff --git a/migration/colo.c b/migration/colo.c
index a3344ce..c4fc865 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -384,6 +384,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    if (colo_shutdown_requested) {
+        colo_send_message(s->to_dst_file, COLO_MESSAGE_GUEST_SHUTDOWN,
+                          &local_err);
+        if (local_err) {
+            error_free(local_err);
+            /* Go on the shutdown process and throw the error message */
+            error_report("Failed to send shutdown message to SVM");
+        }
+        qemu_fflush(s->to_dst_file);
+        colo_shutdown_requested = 0;
+        qemu_system_shutdown_request_core();
+        /* FIXME: just let the colo thread exit? */
+        qemu_thread_exit(0);
+    }
+
     ret = 0;
 
     qemu_mutex_lock_iothread();
@@ -449,7 +464,9 @@ static void colo_process_checkpoint(MigrationState *s)
             goto out;
         }
 
-        qemu_sem_wait(&s->colo_checkpoint_sem);
+        if (!colo_shutdown_requested) {
+            qemu_sem_wait(&s->colo_checkpoint_sem);
+        }
 
         ret = colo_do_checkpoint_transaction(s, bioc, fb);
         if (ret < 0) {
@@ -534,6 +551,16 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     case COLO_MESSAGE_CHECKPOINT_REQUEST:
         *checkpoint_request = 1;
         break;
+    case COLO_MESSAGE_GUEST_SHUTDOWN:
+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        qemu_system_shutdown_request_core();
+        qemu_mutex_unlock_iothread();
+        /*
+         * The main thread will exit and terminate the whole
+         * process; do we need some cleanup?
+         */
+        qemu_thread_exit(0);
     default:
         *checkpoint_request = 0;
         error_setg(errp, "Got unknown COLO message: %d", msg);
@@ -696,3 +723,20 @@ out:
 
     return NULL;
 }
+
+bool colo_handle_shutdown(void)
+{
+    /*
+     * If the VM is in COLO-FT mode, we need to do some significant work
+     * before responding to the shutdown request. Besides, the Secondary VM
+     * will ignore the shutdown request from users.
+     */
+    if (migration_incoming_in_colo_state()) {
+        return true;
+    }
+    if (migration_in_colo_state()) {
+        colo_shutdown_requested = 1;
+        return true;
+    }
+    return false;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index 01b087f..4b3e1b7 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1187,12 +1187,14 @@
 #
 # @vmstate-loaded: VM's state has been loaded by SVM.
 #
+# @guest-shutdown: shutdown requested from PVM to SVM. (since 2.10)
+#
 # Since: 2.8
 ##
 { 'enum': 'COLOMessage',
   'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
             'vmstate-send', 'vmstate-size', 'vmstate-received',
-            'vmstate-loaded' ] }
+            'vmstate-loaded', 'guest-shutdown' ] }
 
 ##
 # @COLOMode:
diff --git a/vl.c b/vl.c
index 0b4ed52..72638c9 100644
--- a/vl.c
+++ b/vl.c
@@ -1611,6 +1611,8 @@ static NotifierList wakeup_notifiers =
     NOTIFIER_LIST_INITIALIZER(wakeup_notifiers);
 static uint32_t wakeup_reason_mask = ~(1 << QEMU_WAKEUP_REASON_NONE);
 
+int colo_shutdown_requested;
+
 int qemu_shutdown_requested_get(void)
 {
     return shutdown_requested;
@@ -1737,7 +1739,10 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
 void qemu_system_reset_request(void)
 {
     if (no_reboot) {
-        shutdown_requested = 1;
+        qemu_system_shutdown_request();
+        if (!shutdown_requested) { /* did colo handle it? */
+            return;
+        }
     } else {
         reset_requested = 1;
     }
@@ -1810,14 +1815,22 @@ void qemu_system_killed(int signal, pid_t pid)
     qemu_notify_event();
 }
 
-void qemu_system_shutdown_request(void)
+void qemu_system_shutdown_request_core(void)
 {
-    trace_qemu_system_shutdown_request();
     replay_shutdown_request();
     shutdown_requested = 1;
     qemu_notify_event();
 }
 
+void qemu_system_shutdown_request(void)
+{
+    trace_qemu_system_shutdown_request();
+    if (colo_handle_shutdown()) {
+        return;
+    }
+    qemu_system_shutdown_request_core();
+}
+
 static void qemu_system_powerdown(void)
 {
     qapi_event_send_powerdown(&error_abort);
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 06/18] COLO: Add block replication into colo process
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (4 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 05/18] COLO: Handle shutdown command for VM in COLO state zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 07/18] COLO: Load dirty pages into SVM's RAM cache firstly zhanghailiang
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Stefan Hajnoczi, Kevin Wolf, Max Reitz, Xie Changlong

Make sure the master starts block replication only after the slave's
block replication has started.

Besides, we need to activate the VM's blocks before it goes into
COLO state.
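
The startup ordering this relies on, as I understand the series (sketch
only):

  /*
   * Secondary (incoming thread)           Primary (checkpoint thread)
   *
   * replication_start_all(SECONDARY)
   * vm_start()
   * send CHECKPOINT_READY  ------------>  (wait for the ready message)
   *                                       replication_start_all(PRIMARY)
   *                                       vm_start()
   */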

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Xie Changlong <xiechanglong.d@gmail.com>
---
 migration/colo.c      | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 migration/migration.c | 16 ++++++++++++++++
 2 files changed, 66 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index c4fc865..9949293 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -23,6 +23,9 @@
 #include "qmp-commands.h"
 #include "net/colo-compare.h"
 #include "net/colo.h"
+#include "qapi-event.h"
+#include "block/block.h"
+#include "replication.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -57,6 +60,7 @@ static void secondary_vm_do_failover(void)
 {
     int old_state;
     MigrationIncomingState *mis = migration_incoming_get_current();
+    Error *local_err = NULL;
 
     /* Can not do failover during the process of VM's loading VMstate, Or
      * it will break the secondary VM.
@@ -74,6 +78,11 @@ static void secondary_vm_do_failover(void)
     migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
 
+    replication_stop_all(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
+
     if (!autostart) {
         error_report("\"-S\" qemu option will be ignored in secondary side");
         /* recover runstate to normal migration finish state */
@@ -111,6 +120,7 @@ static void primary_vm_do_failover(void)
 {
     MigrationState *s = migrate_get_current();
     int old_state;
+    Error *local_err = NULL;
 
     migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
@@ -134,6 +144,13 @@ static void primary_vm_do_failover(void)
                      FailoverStatus_lookup[old_state]);
         return;
     }
+
+    replication_stop_all(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+        local_err = NULL;
+    }
+
     /* Notify COLO thread that failover work is finished */
     qemu_sem_post(&s->colo_exit_sem);
 }
@@ -345,6 +362,15 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     s->params.shared = 0;
     qemu_savevm_state_header(fb);
     qemu_savevm_state_begin(fb, &s->params);
+
+    /* We call this API although this may do nothing on primary side. */
+    qemu_mutex_lock_iothread();
+    replication_do_checkpoint_all(&local_err);
+    qemu_mutex_unlock_iothread();
+    if (local_err) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     qemu_savevm_state_complete_precopy(fb, false);
     qemu_mutex_unlock_iothread();
@@ -451,6 +477,12 @@ static void colo_process_checkpoint(MigrationState *s)
     object_unref(OBJECT(bioc));
 
     qemu_mutex_lock_iothread();
+    replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
+    if (local_err) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
+
     vm_start();
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
@@ -554,6 +586,7 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     case COLO_MESSAGE_GUEST_SHUTDOWN:
         qemu_mutex_lock_iothread();
         vm_stop_force_state(RUN_STATE_COLO);
+        replication_stop_all(false, NULL);
         qemu_system_shutdown_request_core();
         qemu_mutex_unlock_iothread();
         /*
@@ -602,6 +635,11 @@ void *colo_process_incoming_thread(void *opaque)
     object_unref(OBJECT(bioc));
 
     qemu_mutex_lock_iothread();
+    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
+    if (local_err) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
     vm_start();
     trace_colo_vm_state_change("stop", "run");
     qemu_mutex_unlock_iothread();
@@ -682,6 +720,18 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        replication_get_error_all(&local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+        /* discard colo disk buffer */
+        replication_do_checkpoint_all(&local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+
         vmstate_loading = false;
         vm_start();
         trace_colo_vm_state_change("stop", "run");
diff --git a/migration/migration.c b/migration/migration.c
index 2ade2aa..755ea54 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -394,6 +394,7 @@ static void process_incoming_migration_co(void *opaque)
     MigrationIncomingState *mis = migration_incoming_get_current();
     PostcopyState ps;
     int ret;
+    Error *local_err = NULL;
 
     mis->from_src_file = f;
     mis->largest_page_size = qemu_ram_pagesize_largest();
@@ -425,6 +426,21 @@ static void process_incoming_migration_co(void *opaque)
 
     /* we get COLO info, and know if we are in COLO mode */
     if (!ret && migration_incoming_enable_colo()) {
+        /* Make sure all file formats flush their mutable metadata */
+        bdrv_invalidate_cache_all(&local_err);
+        if (local_err) {
+            migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                    MIGRATION_STATUS_FAILED);
+            error_report_err(local_err);
+            migrate_decompress_threads_join();
+            exit(EXIT_FAILURE);
+        }
+        /* If we get an error here, just exit qemu. */
+        blk_resume_after_migration(&local_err);
+        if (local_err) {
+            error_report_err(local_err);
+            exit(EXIT_FAILURE);
+        }
         mis->migration_incoming_co = qemu_coroutine_self();
         qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
              colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 07/18] COLO: Load dirty pages into SVM's RAM cache firstly
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (5 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 06/18] COLO: Add block replication into colo process zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 08/18] ram/COLO: Record the dirty pages that SVM received zhanghailiang
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Dr . David Alan Gilbert

We should not load the PVM's state directly into the SVM, because errors
may happen while the SVM is receiving data, which would break the SVM.

We need to ensure that all data has been received before loading the state
into the SVM. We use extra memory to cache this data (the PVM's RAM). The
RAM cache on the secondary side is initially the same as the SVM/PVM's
memory. In the process of a checkpoint, we first cache the PVM's dirty pages
into this RAM cache, so the cache is always the same as the PVM's memory at
every checkpoint; we then flush this cached RAM into the SVM after we have
received all of the PVM's state.
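
The intended lifecycle of the cache, pieced together from this patch and
patch 09 (sketch only):

  /*
   * colo_init_ram_cache();       // enter COLO: snapshot every RAM block
   * for each checkpoint:
   *     ram_load();              // dirty pages land in block->colo_cache
   *                              // via colo_cache_from_block_offset()
   *     colo_flush_ram_cache();  // patch 09: cache -> SVM's real RAM
   * colo_release_ram_cache();    // leave COLO: free the cache
   */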

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
v2:
- Move colo_init_ram_cache() and colo_release_ram_cache() out of
  incoming thread since both of them need the global lock, if we keep
  colo_release_ram_cache() in incoming thread, there are potential
  dead-lock.
- Remove bool ram_cache_enable flag, use migration_incoming_in_state() instead.
- Remove the Reviewd-by tag because of the above changes.
---
 include/exec/ram_addr.h       |  1 +
 include/migration/migration.h |  4 +++
 migration/migration.c         |  6 ++++
 migration/ram.c               | 71 ++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index c9ddcd0..0b3d77c 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -27,6 +27,7 @@ struct RAMBlock {
     struct rcu_head rcu;
     struct MemoryRegion *mr;
     uint8_t *host;
+    uint8_t *colo_cache; /* For colo, VM's ram cache */
     ram_addr_t offset;
     ram_addr_t used_length;
     ram_addr_t max_length;
diff --git a/include/migration/migration.h b/include/migration/migration.h
index ba1a16c..ba765eb 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -360,4 +360,8 @@ uint64_t ram_pagesize_summary(void);
 PostcopyState postcopy_state_get(void);
 /* Set the state and return the old state */
 PostcopyState postcopy_state_set(PostcopyState new_state);
+
+/* ram cache */
+int colo_init_ram_cache(void);
+void colo_release_ram_cache(void);
 #endif
diff --git a/migration/migration.c b/migration/migration.c
index 755ea54..7419404 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -441,6 +441,10 @@ static void process_incoming_migration_co(void *opaque)
             error_report_err(local_err);
             exit(EXIT_FAILURE);
         }
+        if (colo_init_ram_cache() < 0) {
+            error_report("Init ram cache failed");
+            exit(EXIT_FAILURE);
+        }
         mis->migration_incoming_co = qemu_coroutine_self();
         qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
              colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
@@ -449,6 +453,8 @@ static void process_incoming_migration_co(void *opaque)
 
         /* Wait checkpoint incoming thread exit before free resource */
         qemu_thread_join(&mis->colo_incoming_thread);
+        /* We hold the global iothread lock, so it is safe here */
+        colo_release_ram_cache();
     }
 
     if (ret < 0) {
diff --git a/migration/ram.c b/migration/ram.c
index f48664e..05d1b06 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2265,6 +2265,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
     return block->host + offset;
 }
 
+static inline void *colo_cache_from_block_offset(RAMBlock *block,
+                                                 ram_addr_t offset)
+{
+    if (!offset_in_ramblock(block, offset)) {
+        return NULL;
+    }
+    if (!block->colo_cache) {
+        error_report("%s: colo_cache is NULL in block :%s",
+                     __func__, block->idstr);
+        return NULL;
+    }
+    return block->colo_cache + offset;
+}
+
 /**
  * ram_handle_compressed: handle the zero page case
  *
@@ -2605,7 +2619,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                      RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
             RAMBlock *block = ram_block_from_stream(f, flags);
 
-            host = host_from_ram_block_offset(block, addr);
+            /* After going into COLO, we should load the Page into colo_cache */
+            if (migration_incoming_in_colo_state()) {
+                host = colo_cache_from_block_offset(block, addr);
+            } else {
+                host = host_from_ram_block_offset(block, addr);
+            }
             if (!host) {
                 error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
                 ret = -EINVAL;
@@ -2712,6 +2731,56 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     return ret;
 }
 
+/*
+ * colo cache: this is for the secondary VM. We cache the whole
+ * memory of the secondary VM; it is necessary to hold the global
+ * lock to call this helper.
+ */
+int colo_init_ram_cache(void)
+{
+    RAMBlock *block;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        block->colo_cache = qemu_anon_ram_alloc(block->used_length, NULL);
+        if (!block->colo_cache) {
+            error_report("%s: Can't alloc memory for COLO cache of block %s, "
+                         "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
+                         block->used_length);
+            goto out_locked;
+        }
+        memcpy(block->colo_cache, block->host, block->used_length);
+    }
+    rcu_read_unlock();
+    return 0;
+
+out_locked:
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->colo_cache) {
+            qemu_anon_ram_free(block->colo_cache, block->used_length);
+            block->colo_cache = NULL;
+        }
+    }
+
+    rcu_read_unlock();
+    return -errno;
+}
+
+/* It is necessary to hold the global lock to call this helper */
+void colo_release_ram_cache(void)
+{
+    RAMBlock *block;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->colo_cache) {
+            qemu_anon_ram_free(block->colo_cache, block->used_length);
+            block->colo_cache = NULL;
+        }
+    }
+    rcu_read_unlock();
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 08/18] ram/COLO: Record the dirty pages that SVM received
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (6 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 07/18] COLO: Load dirty pages into SVM's RAM cache firstly zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 09/18] COLO: Flush memory data from ram cache zhanghailiang
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang

We record the addresses of the dirty pages that are received;
this will help with flushing the pages cached for the SVM.

The trick here is that we record dirty pages by re-using the migration
dirty bitmap. In a later patch, we will start dirty logging for the SVM,
just like migration; this way we can record the dirty pages caused by
both the PVM and the SVM, and we only flush those dirty pages from the
RAM cache while doing a checkpoint.
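
For example, assuming 4 KiB target pages (TARGET_PAGE_BITS = 12), a page
received at RAM address 0x12000 sets bit k = 0x12000 >> 12 = 18 in the
bitmap, and migration_dirty_pages is incremented only the first time that
bit is set.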

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 05d1b06..0653a24 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2268,6 +2268,9 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
 static inline void *colo_cache_from_block_offset(RAMBlock *block,
                                                  ram_addr_t offset)
 {
+    unsigned long *bitmap;
+    long k;
+
     if (!offset_in_ramblock(block, offset)) {
         return NULL;
     }
@@ -2276,6 +2279,17 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block,
                      __func__, block->idstr);
         return NULL;
     }
+
+    k = (memory_region_get_ram_addr(block->mr) + offset) >> TARGET_PAGE_BITS;
+    bitmap = atomic_rcu_read(&ram_state.ram_bitmap)->bmap;
+    /*
+    * During a colo checkpoint, we need a bitmap of these migrated pages.
+    * It helps us decide which pages in the ram cache should be flushed
+    * into the VM's RAM later.
+    */
+    if (!test_and_set_bit(k, bitmap)) {
+        ram_state.migration_dirty_pages++;
+    }
     return block->colo_cache + offset;
 }
 
@@ -2752,6 +2766,15 @@ int colo_init_ram_cache(void)
         memcpy(block->colo_cache, block->host, block->used_length);
     }
     rcu_read_unlock();
+    /*
+    * Record the dirty pages sent by the PVM; we use this dirty bitmap to
+    * decide which pages in the cache should be flushed into the SVM's RAM.
+    * Here we use the same name 'ram_bitmap' as for migration.
+    */
+    ram_state.ram_bitmap = g_new0(RAMBitmap, 1);
+    ram_state.ram_bitmap->bmap = bitmap_new(last_ram_page());
+    ram_state.migration_dirty_pages = 0;
+
     return 0;
 
 out_locked:
@@ -2770,6 +2793,12 @@ out_locked:
 void colo_release_ram_cache(void)
 {
     RAMBlock *block;
+    RAMBitmap *bitmap = ram_state.ram_bitmap;
+
+    atomic_rcu_set(&ram_state.ram_bitmap, NULL);
+    if (bitmap) {
+        call_rcu(bitmap, migration_bitmap_free, rcu);
+    }
 
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 09/18] COLO: Flush memory data from ram cache
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (7 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 08/18] ram/COLO: Record the dirty pages that SVM received zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 10/18] qmp event: Add COLO_EXIT event to notify users while exited COLO zhanghailiang
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang

While the VM is running, the PVM may dirty some pages; we will transfer the
PVM's dirty pages to the SVM and store them in the SVM's RAM cache at the
next checkpoint. So the content of the SVM's RAM cache will always be the
same as the PVM's memory after a checkpoint.

Instead of flushing the entire content of the PVM's RAM cache into the SVM's
memory, we do this in a more efficient way:
only flush the pages dirtied by the PVM since the last checkpoint.
In this way, we can ensure the SVM's memory is the same as the PVM's.

Besides, we must ensure the RAM cache is flushed before loading device state.
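
For example, if only two bits are set for a block (say pages 3 and 7, with
4 KiB target pages), the flush loop below copies exactly those two 4 KiB
pages from block->colo_cache to block->host and leaves everything else
untouched.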

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/migration/migration.h |  1 +
 migration/ram.c               | 40 ++++++++++++++++++++++++++++++++++++++++
 migration/trace-events        |  2 ++
 3 files changed, 43 insertions(+)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index ba765eb..2aa7654 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -364,4 +364,5 @@ PostcopyState postcopy_state_set(PostcopyState new_state);
 /* ram cache */
 int colo_init_ram_cache(void);
 void colo_release_ram_cache(void);
+void colo_flush_ram_cache(void);
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index 0653a24..df10d4b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2602,6 +2602,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING;
     /* ADVISE is earlier, it shows the source has the postcopy capability on */
     bool postcopy_advised = postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE;
+    bool need_flush = false;
 
     seq_iter++;
 
@@ -2636,6 +2637,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             /* After going into COLO, we should load the Page into colo_cache */
             if (migration_incoming_in_colo_state()) {
                 host = colo_cache_from_block_offset(block, addr);
+                need_flush = true;
             } else {
                 host = host_from_ram_block_offset(block, addr);
             }
@@ -2742,6 +2744,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     wait_for_decompress_done();
     rcu_read_unlock();
     trace_ram_load_complete(ret, seq_iter);
+
+    if (!ret && migration_incoming_in_colo_state() && need_flush) {
+        colo_flush_ram_cache();
+    }
     return ret;
 }
 
@@ -2810,6 +2816,40 @@ void colo_release_ram_cache(void)
     rcu_read_unlock();
 }
 
+/*
+ * Flush content of RAM cache into SVM's memory.
+ * Only flush the pages that have been dirtied by the PVM, the SVM, or both.
+ */
+void colo_flush_ram_cache(void)
+{
+    RAMBlock *block = NULL;
+    void *dst_host;
+    void *src_host;
+    unsigned long offset = 0;
+
+    trace_colo_flush_ram_cache_begin(ram_state.migration_dirty_pages);
+    rcu_read_lock();
+    block = QLIST_FIRST_RCU(&ram_list.blocks);
+
+    while (block) {
+        offset = migration_bitmap_find_dirty(&ram_state, block, offset);
+        migration_bitmap_clear_dirty(&ram_state, block, offset);
+
+        if (offset << TARGET_PAGE_BITS >= block->used_length) {
+            offset = 0;
+            block = QLIST_NEXT_RCU(block, next);
+        } else {
+            dst_host = block->host + (offset << TARGET_PAGE_BITS);
+            src_host = block->colo_cache + (offset << TARGET_PAGE_BITS);
+            memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+        }
+    }
+
+    rcu_read_unlock();
+    trace_colo_flush_ram_cache_end();
+    assert(ram_state.migration_dirty_pages == 0);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/migration/trace-events b/migration/trace-events
index b8f01a2..93f4337 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -72,6 +72,8 @@ ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %"
 ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
 ram_postcopy_send_discard_bitmap(void) ""
 ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: %zx len: %zx"
+colo_flush_ram_cache_begin(uint64_t dirty_pages) "dirty_pages %" PRIu64
+colo_flush_ram_cache_end(void) ""
 
 # migration/migration.c
 await_return_path_close_on_source_close(void) ""
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 10/18] qmp event: Add COLO_EXIT event to notify users while exited COLO
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (8 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 09/18] COLO: Flush memory data from ram cache zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 11/18] savevm: split save/find loadvm_handlers entry into two helper functions zhanghailiang
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Markus Armbruster, Michael Roth

If errors happen during the VM's COLO FT stage, it's important to
notify the users of this event. Together with 'x-colo-lost-heartbeat',
users can intervene in COLO's failover work immediately.
Even if users don't want to get involved in COLO's failover verdict,
it is still necessary to notify them that we have exited COLO mode.
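
For example (sketch only), a management client watching the QMP monitor
could react to the event by requesting failover with the existing command:

  <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
       "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "error" } }
  -> { "execute": "x-colo-lost-heartbeat" }
  <- { "return": {} }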

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 migration/colo.c | 19 +++++++++++++++++++
 qapi-schema.json | 14 ++++++++++++++
 qapi/event.json  | 21 +++++++++++++++++++++
 3 files changed, 54 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 9949293..e62da93 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -516,6 +516,18 @@ out:
         qemu_fclose(fb);
     }
 
+    /*
+     * There are only two reasons we can get here: some error happened,
+     * or the user triggered failover.
+     */
+    if (failover_get_state() == FAILOVER_STATUS_NONE) {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }
+
     /* Hope this not to be too long to wait here */
     qemu_sem_wait(&s->colo_exit_sem);
     qemu_sem_destroy(&s->colo_exit_sem);
@@ -757,6 +769,13 @@ out:
     if (local_err) {
         error_report_err(local_err);
     }
+    if (failover_get_state() == FAILOVER_STATUS_NONE) {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }
 
     if (fb) {
         qemu_fclose(fb);
diff --git a/qapi-schema.json b/qapi-schema.json
index 4b3e1b7..460ca53 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1233,6 +1233,20 @@
   'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
 
 ##
+# @COLOExitReason:
+#
+# The reason for a COLO exit
+#
+# @request: COLO exit is due to an external request
+#
+# @error: COLO exit is due to an internal error
+#
+# Since: 2.10
+##
+{ 'enum': 'COLOExitReason',
+  'data': [ 'request', 'error' ] }
+
+##
 # @x-colo-lost-heartbeat:
 #
 # Tell qemu that heartbeat is lost, request it to do takeover procedures.
diff --git a/qapi/event.json b/qapi/event.json
index e80f3f4..924bc6f 100644
--- a/qapi/event.json
+++ b/qapi/event.json
@@ -441,6 +441,27 @@
   'data': { 'pass': 'int' } }
 
 ##
+# @COLO_EXIT:
+#
+# Emitted when the VM exits COLO mode, due to an error or at the
+# request of the user.
+#
+# @mode: which COLO mode the VM was in when it exited.
+#
+# @reason: describes the reason for the COLO exit.
+#
+# Since: 2.10
+#
+# Example:
+#
+# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
+#      "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
+#
+##
+{ 'event': 'COLO_EXIT',
+  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
+
+##
 # @ACPI_DEVICE_OST:
 #
 # Emitted when guest executes ACPI _OST method.
-- 
1.8.3.1

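Both hunks above repeat the same failover-state check. A minimal sketch of how it could be factored into one helper; colo_exit_reason() is a hypothetical addition, and the enums are local stand-ins for the QAPI-generated definitions:

    /* Local stand-ins for the QAPI-generated enums from this patch. */
    typedef enum { FAILOVER_STATUS_NONE, FAILOVER_STATUS_REQUIRE } FailoverStatus;
    typedef enum { COLO_EXIT_REASON_REQUEST, COLO_EXIT_REASON_ERROR } COLOExitReason;

    /* Hypothetical helper: map the failover state to an exit reason once,
     * so the primary and secondary paths share a single call site. */
    static COLOExitReason colo_exit_reason(FailoverStatus state)
    {
        return state == FAILOVER_STATUS_NONE ? COLO_EXIT_REASON_ERROR
                                             : COLO_EXIT_REASON_REQUEST;
    }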

* [Qemu-devel] [PATCH v2 11/18] savevm: split save/find loadvm_handlers entry into two helper functions
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (9 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 10/18] qmp event: Add COLO_EXIT event to notify users while exited COLO zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 12/18] savevm: split the process of different stages for loadvm/savevm zhanghailiang
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang

COLO's checkpoint process is based on the migration process;
every time we do a checkpoint we repeat the savevm and loadvm process.

So qemu_loadvm_section_start_full() is called repeatedly, and it adds
every migration section's information into the loadvm_handlers list each
time, which leads to a memory leak.

To fix it, we split the process of saving and finding a section entry into
two helper functions, and check whether the section info already exists in
the loadvm_handlers list before saving it.

These modifications have no side effect on normal migration.

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/savevm.c | 55 +++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 03ae1bd..f87cd8d 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1836,6 +1836,37 @@ void loadvm_free_handlers(MigrationIncomingState *mis)
     }
 }
 
+static LoadStateEntry *loadvm_add_section_entry(MigrationIncomingState *mis,
+                                                 SaveStateEntry *se,
+                                                 uint32_t section_id,
+                                                 uint32_t version_id)
+{
+    LoadStateEntry *le;
+
+    /* Add entry */
+    le = g_malloc0(sizeof(*le));
+
+    le->se = se;
+    le->section_id = section_id;
+    le->version_id = version_id;
+    QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
+    return le;
+}
+
+static LoadStateEntry *loadvm_find_section_entry(MigrationIncomingState *mis,
+                                                 uint32_t section_id)
+{
+    LoadStateEntry *le;
+
+    QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
+        if (le->section_id == section_id) {
+            break;
+        }
+    }
+
+    return le;
+}
+
 static int
 qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
 {
@@ -1878,15 +1909,12 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
         return -EINVAL;
     }
 
-    /* Add entry */
-    le = g_malloc0(sizeof(*le));
-
-    le->se = se;
-    le->section_id = section_id;
-    le->version_id = version_id;
-    QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
-
-    ret = vmstate_load(f, le->se, le->version_id);
+    /* Check if we have saved this section info before; if not, save it */
+    le = loadvm_find_section_entry(mis, section_id);
+    if (!le) {
+        le = loadvm_add_section_entry(mis, se, section_id, version_id);
+    }
+    ret = vmstate_load(f, se, version_id);
     if (ret < 0) {
         error_report("error while loading state for instance 0x%x of"
                      " device '%s'", instance_id, idstr);
@@ -1909,12 +1937,9 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
     section_id = qemu_get_be32(f);
 
     trace_qemu_loadvm_state_section_partend(section_id);
-    QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
-        if (le->section_id == section_id) {
-            break;
-        }
-    }
-    if (le == NULL) {
+
+    le = loadvm_find_section_entry(mis, section_id);
+    if (!le) {
         error_report("Unknown savevm section %d", section_id);
         return -EINVAL;
     }
-- 
1.8.3.1

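The fix boils down to a find-before-add idiom on the handler list. A self-contained sketch of the same idiom, using a plain singly-linked list in place of QEMU's QLIST macros (Entry and the helper names are illustrative only):

    #include <stdint.h>
    #include <stdlib.h>

    typedef struct Entry {
        uint32_t section_id;
        struct Entry *next;
    } Entry;

    static Entry *find_entry(Entry *head, uint32_t section_id)
    {
        for (Entry *e = head; e; e = e->next) {
            if (e->section_id == section_id) {
                return e;
            }
        }
        return NULL;
    }

    /* Insert at the head, like QLIST_INSERT_HEAD. */
    static Entry *add_entry(Entry **head, uint32_t section_id)
    {
        Entry *e = calloc(1, sizeof(*e));

        if (!e) {
            abort();    /* out of memory; QEMU's g_malloc0 aborts too */
        }
        e->section_id = section_id;
        e->next = *head;
        *head = e;
        return e;
    }

    /* Reuse the existing entry on repeated checkpoints instead of
     * leaking a duplicate each time. */
    static Entry *get_entry(Entry **head, uint32_t section_id)
    {
        Entry *e = find_entry(*head, section_id);

        return e ? e : add_entry(head, section_id);
    }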

* [Qemu-devel] [PATCH v2 12/18] savevm: split the process of different stages for loadvm/savevm
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (10 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 11/18] savevm: split save/find loadvm_handlers entry into two helper functions zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 13/18] COLO: Separate the process of saving/loading ram and device state zhanghailiang
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang

There are several stages in the loadvm/savevm process. At different
stages, incoming migration processes different types of sections.
We want to control these stages more precisely; it benefits COLO
performance, since we don't have to save QEMU_VM_SECTION_START type
sections at every checkpoint. Besides, we want to separate the
process of saving/loading memory and device state.

So we add three new helper functions: qemu_loadvm_state_begin(),
qemu_load_device_state() and qemu_savevm_live_state() to handle the
different stages during migration.

Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
public, and simplify the code of qemu_save_device_state() by calling the
wrapper qemu_savevm_state_header().

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
v2:
 - Use the wrapper qemu_savevm_state_header() to simplify the code
  of qemu_save_device_state() (Dave's suggestion)
---
 include/sysemu/sysemu.h |  6 ++++++
 migration/savevm.c      | 54 ++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 8054f53..0255c4e 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -132,7 +132,13 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
                                            uint64_t *start_list,
                                            uint64_t *length_list);
 
+void qemu_savevm_live_state(QEMUFile *f);
+int qemu_save_device_state(QEMUFile *f);
+
 int qemu_loadvm_state(QEMUFile *f);
+int qemu_loadvm_state_begin(QEMUFile *f);
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
+int qemu_load_device_state(QEMUFile *f);
 
 extern int autostart;
 
diff --git a/migration/savevm.c b/migration/savevm.c
index f87cd8d..8c2ce0b 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -54,6 +54,7 @@
 #include "qemu/cutils.h"
 #include "io/channel-buffer.h"
 #include "io/channel-file.h"
+#include "migration/colo.h"
 
 #ifndef ETH_P_RARP
 #define ETH_P_RARP 0x8035
@@ -1285,13 +1286,20 @@ done:
     return ret;
 }
 
-static int qemu_save_device_state(QEMUFile *f)
+void qemu_savevm_live_state(QEMUFile *f)
 {
-    SaveStateEntry *se;
+    /* save QEMU_VM_SECTION_END section */
+    qemu_savevm_state_complete_precopy(f, true);
+    qemu_put_byte(f, QEMU_VM_EOF);
+}
 
-    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
-    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+int qemu_save_device_state(QEMUFile *f)
+{
+    SaveStateEntry *se;
 
+    if (!migration_in_colo_state()) {
+        qemu_savevm_state_header(f);
+    }
     cpu_synchronize_all_states();
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1342,8 +1350,6 @@ enum LoadVMExitCodes {
     LOADVM_QUIT     =  1,
 };
 
-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
-
 /* ------ incoming postcopy messages ------ */
 /* 'advise' arrives before any transfers just to tell us that a postcopy
  * *might* happen - it might be skipped if precopy transferred everything
@@ -1957,7 +1963,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
     return 0;
 }
 
-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
 {
     uint8_t section_type;
     int ret = 0;
@@ -2095,6 +2101,40 @@ int qemu_loadvm_state(QEMUFile *f)
     return ret;
 }
 
+int qemu_loadvm_state_begin(QEMUFile *f)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    Error *local_err = NULL;
+    int ret;
+
+    if (qemu_savevm_state_blocked(&local_err)) {
+        error_report_err(local_err);
+        return -EINVAL;
+    }
+    /* Load QEMU_VM_SECTION_START section */
+    ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_report("Failed to loadvm begin work: %d", ret);
+    }
+    return ret;
+}
+
+int qemu_load_device_state(QEMUFile *f)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    int ret;
+
+    /* Load QEMU_VM_SECTION_FULL section */
+    ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_report("Failed to load device state: %d", ret);
+        return ret;
+    }
+
+    cpu_synchronize_all_post_init();
+    return 0;
+}
+
 int save_vmstate(Monitor *mon, const char *name)
 {
     BlockDriverState *bs, *bs1;
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 13/18] COLO: Separate the process of saving/loading ram and device state
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (11 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 12/18] savevm: split the process of different stages for loadvm/savevm zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 14/18] COLO: Split qemu_savevm_state_begin out of checkpoint process zhanghailiang
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang

We separate the process of saving/loading RAM and device state when doing
a checkpoint, and add new helpers to save/load RAM and device state. With
this change, we can transfer RAM directly from the primary side to the
secondary side without using the channel buffer as an intermediary, which
also reduces the amount of extra memory used during checkpoints.

Besides, we move colo_flush_ram_cache() to the proper position after the
above change.

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c   | 49 +++++++++++++++++++++++++++++++++++++++----------
 migration/ram.c    |  5 -----
 migration/savevm.c |  4 ++++
 3 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index e62da93..8e27a4c 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -357,11 +357,20 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     /* Disable block migration */
     s->params.blk = 0;
     s->params.shared = 0;
-    qemu_savevm_state_header(fb);
-    qemu_savevm_state_begin(fb, &s->params);
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("Save VM state begin error");
+        goto out;
+    }
 
     /* We call this API although this may do nothing on primary side. */
     qemu_mutex_lock_iothread();
@@ -372,15 +381,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     }
 
     qemu_mutex_lock_iothread();
-    qemu_savevm_state_complete_precopy(fb, false);
+    /*
+     * Only save the VM's live state, which does not include device state.
+     * TODO: We may need a timeout mechanism to prevent the COLO process
+     * from being blocked here.
+     */
+    qemu_savevm_live_state(s->to_dst_file);
+    /* Note: device state is saved into buffer */
+    ret = qemu_save_device_state(fb);
     qemu_mutex_unlock_iothread();
-
-    qemu_fflush(fb);
-
-    colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
-    if (local_err) {
+    if (ret < 0) {
+        error_report("Save device state error");
         goto out;
     }
+    qemu_fflush(fb);
+
     /*
      * We need the size of the VMstate data in Secondary side,
      * With which we can decide how much data should be read.
@@ -621,6 +636,7 @@ void *colo_process_incoming_thread(void *opaque)
     uint64_t total_size;
     uint64_t value;
     Error *local_err = NULL;
+    int ret;
 
     qemu_sem_init(&mis->colo_incoming_sem, 0);
 
@@ -693,6 +709,17 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        ret = qemu_loadvm_state_begin(mis->from_src_file);
+        if (ret < 0) {
+            error_report("Load vm state begin error, ret=%d", ret);
+            goto out;
+        }
+        ret = qemu_loadvm_state_main(mis->from_src_file, mis);
+        if (ret < 0) {
+            error_report("Load VM's live state (ram) error");
+            goto out;
+        }
+
         value = colo_receive_message_value(mis->from_src_file,
                                  COLO_MESSAGE_VMSTATE_SIZE, &local_err);
         if (local_err) {
@@ -726,8 +753,10 @@ void *colo_process_incoming_thread(void *opaque)
         qemu_mutex_lock_iothread();
         qemu_system_reset(VMRESET_SILENT);
         vmstate_loading = true;
-        if (qemu_loadvm_state(fb) < 0) {
-            error_report("COLO: loadvm failed");
+        colo_flush_ram_cache();
+        ret = qemu_load_device_state(fb);
+        if (ret < 0) {
+            error_report("COLO: load device state failed");
             qemu_mutex_unlock_iothread();
             goto out;
         }
diff --git a/migration/ram.c b/migration/ram.c
index df10d4b..f171a82 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2602,7 +2602,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING;
     /* ADVISE is earlier, it shows the source has the postcopy capability on */
     bool postcopy_advised = postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE;
-    bool need_flush = false;
 
     seq_iter++;
 
@@ -2637,7 +2636,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             /* After going into COLO, we should load the Page into colo_cache */
             if (migration_incoming_in_colo_state()) {
                 host = colo_cache_from_block_offset(block, addr);
-                need_flush = true;
             } else {
                 host = host_from_ram_block_offset(block, addr);
             }
@@ -2745,9 +2743,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     rcu_read_unlock();
     trace_ram_load_complete(ret, seq_iter);
 
-    if (!ret && ram_cache_enable && need_flush) {
-        colo_flush_ram_cache();
-    }
     return ret;
 }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index 8c2ce0b..60d346c 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1003,6 +1003,10 @@ void qemu_savevm_state_begin(QEMUFile *f,
             break;
         }
     }
+    if (migration_in_colo_state()) {
+        qemu_put_byte(f, QEMU_VM_EOF);
+        qemu_fflush(f);
+    }
 }
 
 /*
-- 
1.8.3.1

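Taken together with the previous patch, a checkpoint now moves through three cleanly paired stages. The sketch below reconstructs the pairing from the diffs in patches 12 and 13 (a reading aid, not part of the posted series):

    Primary (save)                        Secondary (load)
    ----------------------------------    ----------------------------------
    qemu_savevm_state_begin()       --->  qemu_loadvm_state_begin()
      SECTION_START sections + EOF          setup pass, done once per session
    qemu_savevm_live_state()        --->  qemu_loadvm_state_main()
      SECTION_END sections + EOF            RAM, loaded into the colo cache
    qemu_save_device_state() to fb  --->  qemu_load_device_state() from fb
      SECTION_FULL sections                 device state, via the fb buffer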

* [Qemu-devel] [PATCH v2 14/18] COLO: Split qemu_savevm_state_begin out of checkpoint process
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (12 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 13/18] COLO: Separate the process of saving/loading ram and device state zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 15/18] COLO: flush host dirty ram from cache zhanghailiang
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang

It is unnecessary to call qemu_savevm_state_begin() in every checkpoint.
It mainly sets up devices and does the first device state pass, and this
data will not change during later checkpoints. So we split it out of
colo_do_checkpoint_transaction(); this way, we avoid transferring this
data in every subsequent checkpoint.

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c | 51 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 8e27a4c..66bb5b2 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -362,16 +362,6 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
-    /* Disable block migration */
-    s->params.blk = 0;
-    s->params.shared = 0;
-    qemu_savevm_state_begin(s->to_dst_file, &s->params);
-    ret = qemu_file_get_error(s->to_dst_file);
-    if (ret < 0) {
-        error_report("Save VM state begin error");
-        goto out;
-    }
-
     /* We call this API although this may do nothing on primary side. */
     qemu_mutex_lock_iothread();
     replication_do_checkpoint_all(&local_err);
@@ -459,6 +449,21 @@ static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
     colo_checkpoint_notify(data);
 }
 
+static int colo_prepare_before_save(MigrationState *s)
+{
+    int ret;
+
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("Save VM state begin error");
+    }
+    return ret;
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -478,6 +483,11 @@ static void colo_process_checkpoint(MigrationState *s)
     packets_compare_notifier.notify = colo_compare_notify_checkpoint;
     colo_compare_register_notifier(&packets_compare_notifier);
 
+    ret = colo_prepare_before_save(s);
+    if (ret < 0) {
+        goto out;
+    }
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -628,6 +638,17 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     }
 }
 
+static int colo_prepare_before_load(QEMUFile *f)
+{
+    int ret;
+
+    ret = qemu_loadvm_state_begin(f);
+    if (ret < 0) {
+        error_report("Load VM state begin error, ret = %d", ret);
+    }
+    return ret;
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
@@ -662,6 +683,11 @@ void *colo_process_incoming_thread(void *opaque)
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));
 
+    ret = colo_prepare_before_load(mis->from_src_file);
+    if (ret < 0) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
     if (local_err) {
@@ -709,11 +735,6 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
-        ret = qemu_loadvm_state_begin(mis->from_src_file);
-        if (ret < 0) {
-            error_report("Load vm state begin error, ret=%d", ret);
-            goto out;
-        }
         ret = qemu_loadvm_state_main(mis->from_src_file, mis);
         if (ret < 0) {
             error_report("Load VM's live state (ram) error");
-- 
1.8.3.1


* [Qemu-devel] [PATCH v2 15/18] COLO: flush host dirty ram from cache
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (13 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 14/18] COLO: Split qemu_savevm_state_begin out of checkpoint process zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 16/18] filter: Add handle_event method for NetFilterClass zhanghailiang
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang

We don't need to flush all of the VM's RAM from the cache; only
flush the pages dirtied since the last checkpoint.

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
v2:
 - stop dirty log after exit from COLO state. (Dave)
---
 migration/ram.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index f171a82..7bf3515 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2775,6 +2775,7 @@ int colo_init_ram_cache(void)
     ram_state.ram_bitmap = g_new0(RAMBitmap, 1);
     ram_state.ram_bitmap->bmap = bitmap_new(last_ram_page());
     ram_state.migration_dirty_pages = 0;
+    memory_global_dirty_log_start();
 
     return 0;
 
@@ -2798,6 +2799,7 @@ void colo_release_ram_cache(void)
 
     atomic_rcu_set(&ram_state.ram_bitmap, NULL);
     if (bitmap) {
+        memory_global_dirty_log_stop();
         call_rcu(bitmap, migration_bitmap_free, rcu);
     }
 
@@ -2822,6 +2824,16 @@ void colo_flush_ram_cache(void)
     void *src_host;
     unsigned long offset = 0;
 
+    memory_global_dirty_log_sync();
+    qemu_mutex_lock(&ram_state.bitmap_mutex);
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        migration_bitmap_sync_range(&ram_state, block, block->offset,
+                                    block->used_length);
+    }
+    rcu_read_unlock();
+    qemu_mutex_unlock(&ram_state.bitmap_mutex);
+
     trace_colo_flush_ram_cache_begin(ram_state.migration_dirty_pages);
     rcu_read_lock();
     block = QLIST_FIRST_RCU(&ram_list.blocks);
-- 
1.8.3.1

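Conceptually, the sync added above ORs the pages the host dirtied since the last sync into the bitmap that colo_flush_ram_cache() walks. A minimal sketch of that merge step; sync_range() is a stand-in for migration_bitmap_sync_range(), not the QEMU implementation, and __builtin_popcountl is a GCC/Clang builtin:

    /* Merge newly dirtied pages (src) into the migration bitmap (dest),
     * counting only the bits that were not already set, the way
     * migration_dirty_pages is kept up to date. */
    static void sync_range(unsigned long *dest, const unsigned long *src,
                           unsigned long nlongs, unsigned long *dirty_pages)
    {
        for (unsigned long i = 0; i < nlongs; i++) {
            unsigned long newbits = src[i] & ~dest[i];

            dest[i] |= src[i];
            *dirty_pages += (unsigned long)__builtin_popcountl(newbits);
        }
    }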

* [Qemu-devel] [PATCH v2 16/18] filter: Add handle_event method for NetFilterClass
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (14 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 15/18] COLO: flush host dirty ram from cache zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 17/18] filter-rewriter: handle checkpoint and failover event zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 18/18] COLO: notify net filters about checkpoint/failover event zhanghailiang
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Jason Wang

Filters need to process checkpoint/failover events, and other
events passed by the COLO frame.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 include/net/filter.h |  5 +++++
 net/filter.c         | 16 ++++++++++++++++
 net/net.c            | 28 ++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 0c4a2ea..df4510d 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -37,6 +37,8 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
 
 typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp);
 
+typedef void (FilterHandleEvent) (NetFilterState *nf, int event, Error **errp);
+
 typedef struct NetFilterClass {
     ObjectClass parent_class;
 
@@ -44,6 +46,7 @@ typedef struct NetFilterClass {
     FilterSetup *setup;
     FilterCleanup *cleanup;
     FilterStatusChanged *status_changed;
+    FilterHandleEvent *handle_event;
     /* mandatory */
     FilterReceiveIOV *receive_iov;
 } NetFilterClass;
@@ -76,4 +79,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
                                     int iovcnt,
                                     void *opaque);
 
+void colo_notify_filters_event(int event, Error **errp);
+
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter.c b/net/filter.c
index 1dfd2ca..993b35e 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -17,6 +17,7 @@
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
 #include "qemu/iov.h"
+#include "net/colo.h"
 
 static inline bool qemu_can_skip_netfilter(NetFilterState *nf)
 {
@@ -245,11 +246,26 @@ static void netfilter_finalize(Object *obj)
     g_free(nf->netdev_id);
 }
 
+static void dummy_handle_event(NetFilterState *nf, int event, Error **errp)
+{
+    switch (event) {
+    case COLO_CHECKPOINT:
+        break;
+    case COLO_FAILOVER:
+        object_property_set_str(OBJECT(nf), "off", "status", errp);
+        break;
+    default:
+        break;
+    }
+}
+
 static void netfilter_class_init(ObjectClass *oc, void *data)
 {
     UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+    NetFilterClass *nfc = NETFILTER_CLASS(oc);
 
     ucc->complete = netfilter_complete;
+    nfc->handle_event = dummy_handle_event;
 }
 
 static const TypeInfo netfilter_info = {
diff --git a/net/net.c b/net/net.c
index 0ac3b9e..1373f63 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1373,6 +1373,34 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
     }
 }
 
+void colo_notify_filters_event(int event, Error **errp)
+{
+    NetClientState *nc, *peer;
+    NetClientDriver type;
+    NetFilterState *nf;
+    NetFilterClass *nfc = NULL;
+    Error *local_err = NULL;
+
+    QTAILQ_FOREACH(nc, &net_clients, next) {
+        peer = nc->peer;
+        type = nc->info->type;
+        if (!peer || type != NET_CLIENT_DRIVER_NIC) {
+            continue;
+        }
+        QTAILQ_FOREACH(nf, &nc->filters, next) {
+            nfc = NETFILTER_GET_CLASS(OBJECT(nf));
+            if (!nfc->handle_event) {
+                continue;
+            }
+            nfc->handle_event(nf, event, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                return;
+            }
+        }
+    }
+}
+
 void qmp_set_link(const char *name, bool up, Error **errp)
 {
     NetClientState *ncs[MAX_QUEUE_NUM];
-- 
1.8.3.1

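dummy_handle_event() above acts as a default implementation that concrete filters may override in their own class_init. A tiny self-contained sketch of that default-method pattern in plain C (FooClass and the base_*/rewriter_* names are illustrative, not QEMU's QOM machinery):

    #include <stdio.h>

    typedef struct FooClass {
        void (*handle_event)(int event);      /* optional hook */
    } FooClass;

    /* The base class installs a do-nothing default... */
    static void default_handle_event(int event)
    {
        printf("default: ignoring event %d\n", event);
    }

    static void base_class_init(FooClass *c)
    {
        c->handle_event = default_handle_event;
    }

    /* ...and a concrete filter's class_init may override it. */
    static void rewriter_handle_event(int event)
    {
        printf("rewriter: handling event %d\n", event);
    }

    static void rewriter_class_init(FooClass *c)
    {
        base_class_init(c);                   /* parent init runs first */
        c->handle_event = rewriter_handle_event;
    }

    int main(void)
    {
        FooClass c;

        rewriter_class_init(&c);
        c.handle_event(0);                    /* dispatches to the override */
        return 0;
    }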

* [Qemu-devel] [PATCH v2 17/18] filter-rewriter: handle checkpoint and failover event
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (15 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 16/18] filter: Add handle_event method for NetFilterClass zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 18/18] COLO: notify net filters about checkpoint/failover event zhanghailiang
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Jason Wang

After one round of checkpointing, the states of PVM and SVM
become consistent, so it is unnecessary to adjust the sequence
numbers of net packets for old connections. Besides, when failover
happens, filter-rewriter needs to check whether it still needs to
adjust the sequence numbers of net packets.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/filter-rewriter.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index c9a6d43..0a90b11 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -22,6 +22,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/iov.h"
 #include "net/checksum.h"
+#include "net/colo.h"
 
 #define FILTER_COLO_REWRITER(obj) \
     OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
@@ -270,6 +271,43 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
     return 0;
 }
 
+static void reset_seq_offset(gpointer key, gpointer value, gpointer user_data)
+{
+    Connection *conn = (Connection *)value;
+
+    conn->offset = 0;
+}
+
+static gboolean offset_is_nonzero(gpointer key,
+                                  gpointer value,
+                                  gpointer user_data)
+{
+    Connection *conn = (Connection *)value;
+
+    return conn->offset ? true : false;
+}
+
+static void colo_rewriter_handle_event(NetFilterState *nf, int event,
+                                       Error **errp)
+{
+    RewriterState *rs = FILTER_COLO_REWRITER(nf);
+
+    switch (event) {
+    case COLO_CHECKPOINT:
+        g_hash_table_foreach(rs->connection_track_table,
+                            reset_seq_offset, NULL);
+        break;
+    case COLO_FAILOVER:
+        if (!g_hash_table_find(rs->connection_track_table,
+                              offset_is_nonzero, NULL)) {
+            object_property_set_str(OBJECT(nf), "off", "status", errp);
+        }
+        break;
+    default:
+        break;
+    }
+}
+
 static void colo_rewriter_cleanup(NetFilterState *nf)
 {
     RewriterState *s = FILTER_COLO_REWRITER(nf);
@@ -299,6 +337,7 @@ static void colo_rewriter_class_init(ObjectClass *oc, void *data)
     nfc->setup = colo_rewriter_setup;
     nfc->cleanup = colo_rewriter_cleanup;
     nfc->receive_iov = colo_rewriter_receive_iov;
+    nfc->handle_event = colo_rewriter_handle_event;
 }
 
 static const TypeInfo colo_rewriter_info = {
-- 
1.8.3.1

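The two GLib traversal idioms used above, g_hash_table_foreach() to reset every connection and g_hash_table_find() to test whether any nonzero offset remains, can be exercised standalone; a minimal sketch (Conn is an illustrative stand-in for the rewriter's Connection):

    #include <glib.h>

    typedef struct { guint32 offset; } Conn;

    static void reset_offset(gpointer key, gpointer value, gpointer user_data)
    {
        ((Conn *)value)->offset = 0;           /* like reset_seq_offset() */
    }

    static gboolean offset_nonzero(gpointer key, gpointer value,
                                   gpointer user_data)
    {
        return ((Conn *)value)->offset != 0;   /* like offset_is_nonzero() */
    }

    int main(void)
    {
        GHashTable *t = g_hash_table_new_full(g_direct_hash, g_direct_equal,
                                              NULL, g_free);
        Conn *c = g_new0(Conn, 1);

        c->offset = 42;
        g_hash_table_insert(t, GINT_TO_POINTER(1), c);

        /* Checkpoint: reset every connection's offset... */
        g_hash_table_foreach(t, reset_offset, NULL);
        /* ...after which no nonzero offset remains. */
        g_assert(g_hash_table_find(t, offset_nonzero, NULL) == NULL);

        g_hash_table_destroy(t);
        return 0;
    }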

* [Qemu-devel] [PATCH v2 18/18] COLO: notify net filters about checkpoint/failover event
  2017-04-22  8:25 [Qemu-devel] [PATCH v2 00/18] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (16 preceding siblings ...)
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 17/18] filter-rewriter: handle checkpoint and failover event zhanghailiang
@ 2017-04-22  8:25 ` zhanghailiang
  17 siblings, 0 replies; 21+ messages in thread
From: zhanghailiang @ 2017-04-22  8:25 UTC (permalink / raw)
  To: qemu-devel, gilbert
  Cc: quintela, lizhijian, zhangchen.fnst, xiecl.fnst, zhanghailiang,
	Jason Wang

Notify all net filters about checkpoint and failover events.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 migration/colo.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 66bb5b2..62f58c6 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -26,6 +26,7 @@
 #include "qapi-event.h"
 #include "block/block.h"
 #include "replication.h"
+#include "net/filter.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -82,6 +83,11 @@ static void secondary_vm_do_failover(void)
     if (local_err) {
         error_report_err(local_err);
     }
+    /* Notify all filters of all NICs to do failover */
+    colo_notify_filters_event(COLO_FAILOVER, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
 
     if (!autostart) {
         error_report("\"-S\" qemu option will be ignored in secondary side");
@@ -794,6 +800,13 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        /* Notify all filters of all NICs to do checkpoint */
+        colo_notify_filters_event(COLO_CHECKPOINT, &local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+
         vmstate_loading = false;
         vm_start();
         trace_colo_vm_state_change("stop", "run");
-- 
1.8.3.1


* Re: [Qemu-devel] [PATCH v2 05/18] COLO: Handle shutdown command for VM in COLO state
  2017-04-22  8:25 ` [Qemu-devel] [PATCH v2 05/18] COLO: Handle shutdown command for VM in COLO state zhanghailiang
@ 2017-04-24 14:51   ` Eric Blake
  2017-04-26  7:38     ` Hailiang Zhang
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Blake @ 2017-04-24 14:51 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, xiecl.fnst, zhangchen.fnst, quintela, Paolo Bonzini

On 04/22/2017 03:25 AM, zhanghailiang wrote:
> If VM is in COLO FT state, we need to do some extra work before
> starting the normal shutdown process.
> 
> Secondary VM will ignore the shutdown command if users issue it directly
> to Secondary VM. COLO will capture the shutdown command and act on the
> shutdown request from the user.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---

> +++ b/qapi-schema.json
> @@ -1187,12 +1187,14 @@
>  #
>  # @vmstate-loaded: VM's state has been loaded by SVM.
>  #
> +# @guest-shutdown: shutdown requested from PVM to SVM. (Since 2.9)

You missed 2.9. Please fix this to state 2.10.

> +#
>  # Since: 2.8
>  ##
>  { 'enum': 'COLOMessage',
>    'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
>              'vmstate-send', 'vmstate-size', 'vmstate-received',
> -            'vmstate-loaded' ] }
> +            'vmstate-loaded', 'guest-shutdown' ] }
>  


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] [PATCH v2 05/18] COLO: Handle shutdown command for VM in COLO state
  2017-04-24 14:51   ` Eric Blake
@ 2017-04-26  7:38     ` Hailiang Zhang
  0 siblings, 0 replies; 21+ messages in thread
From: Hailiang Zhang @ 2017-04-26  7:38 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: lizhijian, xiecl.fnst, zhangchen.fnst, quintela, Paolo Bonzini

On 2017/4/24 22:51, Eric Blake wrote:
> On 04/22/2017 03:25 AM, zhanghailiang wrote:
>> If VM is in COLO FT state, we need to do some extra work before
>> starting the normal shutdown process.
>>
>> Secondary VM will ignore the shutdown command if users issue it directly
>> to Secondary VM. COLO will capture the shutdown command and act on the
>> shutdown request from the user.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> ---
>> +++ b/qapi-schema.json
>> @@ -1187,12 +1187,14 @@
>>   #
>>   # @vmstate-loaded: VM's state has been loaded by SVM.
>>   #
>> +# @guest-shutdown: shutdown requested from PVM to SVM. (Since 2.9)
> You missed 2.9. Please fix this to state 2.10.

OK, will fix in next version, thanks.

>> +#
>>   # Since: 2.8
>>   ##
>>   { 'enum': 'COLOMessage',
>>     'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
>>               'vmstate-send', 'vmstate-size', 'vmstate-received',
>> -            'vmstate-loaded' ] }
>> +            'vmstate-loaded', 'guest-shutdown' ] }
>>   
>

