* [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration
@ 2018-05-05 14:35 Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

The RDMA QIOChannel does not support bi-directional communication, so when an
RDMA live migration runs with postcopy enabled, the return path on the source
qemu gets a qemu file error.

These patches implement bi-directional communication for the RDMA QIOChannel
and disable RDMA WRITE during the postcopy phase.

This patch series just makes postcopy work; performance will be improved later.

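In outline, after patch 5 the QIOChannelRDMA holds one RDMAContext per
direction (an annotated sketch of the struct from patch 5):

    struct QIOChannelRDMA {
        QIOChannel parent;
        RDMAContext *rdmain;   /* used by qio_channel_rdma_readv */
        RDMAContext *rdmaout;  /* used by qio_channel_rdma_writev */
        QEMUFile *file;
        bool blocking;         /* XXX we don't actually honour this yet */
    };
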
[v3]
 - add a mutex in QEMUFile struct to avoid concurrent channel close (Daniel)
 - destroy the mutex before free QEMUFile (David)
 - use rdmain and rdmaout instead of rdma->return_path (Daniel)

[v2]
 - does not update bytes_xfer when disable RDMA WRITE (David)
 - implement bi-directional communication for RDMA QIOChannel (Daniel)


Lidong Chen (6):
  migration: disable RDMA WRITE after postcopy started
  migration: create a dedicated connection for rdma return path
  migration: remove unnecessary variables len in QIOChannelRDMA
  migration: avoid concurrent invoke channel_close by different threads
  migration: implement bi-directional RDMA QIOChannel
  migration: Stop rdma yielding during incoming postcopy

 migration/colo.c         |   2 +
 migration/migration.c    |   2 +
 migration/postcopy-ram.c |   2 +
 migration/qemu-file.c    |  13 +-
 migration/ram.c          |   4 +
 migration/rdma.c         | 321 +++++++++++++++++++++++++++++++++++++++++------
 migration/savevm.c       |   3 +
 7 files changed, 307 insertions(+), 40 deletions(-)

-- 
1.8.3.1

* [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path Lidong Chen
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

RDMA WRITE operations are performed with no notification to the destination
qemu, so the destination qemu cannot be woken up by them. This patch disables
RDMA WRITE after postcopy has started.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/qemu-file.c |  8 ++++++--
 migration/rdma.c      | 12 ++++++++++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 0463f4c..977b9ae 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -253,8 +253,12 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
     if (f->hooks && f->hooks->save_page) {
         int ret = f->hooks->save_page(f, f->opaque, block_offset,
                                       offset, size, bytes_sent);
-        f->bytes_xfer += size;
-        if (ret != RAM_SAVE_CONTROL_DELAYED) {
+        if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+            f->bytes_xfer += size;
+        }
+
+        if (ret != RAM_SAVE_CONTROL_DELAYED &&
+            ret != RAM_SAVE_CONTROL_NOT_SUPP) {
             if (bytes_sent && *bytes_sent > 0) {
                 qemu_update_position(f, *bytes_sent);
             } else if (ret < 0) {
diff --git a/migration/rdma.c b/migration/rdma.c
index da474fc..a22be43 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2927,6 +2927,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return RAM_SAVE_CONTROL_NOT_SUPP;
+    }
+
     qemu_fflush(f);
 
     if (size > 0) {
@@ -3482,6 +3486,10 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return 0;
+    }
+
     trace_qemu_rdma_registration_start(flags);
     qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
     qemu_fflush(f);
@@ -3504,6 +3512,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return 0;
+    }
+
     qemu_fflush(f);
     ret = qemu_rdma_drain_cq(f, rdma);
 
-- 
1.8.3.1

* [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

If an RDMA migration is started with postcopy enabled, the source qemu
establishes a dedicated connection for the return path.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/rdma.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 91 insertions(+), 3 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index a22be43..c745427 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -387,6 +387,10 @@ typedef struct RDMAContext {
     uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
 
     GHashTable *blockmap;
+
+    /* the RDMAContext for return path */
+    struct RDMAContext *return_path;
+    bool is_return_path;
 } RDMAContext;
 
 #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
@@ -2329,10 +2333,22 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
         rdma_destroy_id(rdma->cm_id);
         rdma->cm_id = NULL;
     }
+
+    /* on the destination side, listen_id and channel are shared */
     if (rdma->listen_id) {
-        rdma_destroy_id(rdma->listen_id);
+        if (!rdma->is_return_path) {
+            rdma_destroy_id(rdma->listen_id);
+        }
         rdma->listen_id = NULL;
+
+        if (rdma->channel) {
+            if (!rdma->is_return_path) {
+                rdma_destroy_event_channel(rdma->channel);
+            }
+            rdma->channel = NULL;
+        }
     }
+
     if (rdma->channel) {
         rdma_destroy_event_channel(rdma->channel);
         rdma->channel = NULL;
@@ -2561,6 +2577,25 @@ err_dest_init_create_listen_id:
 
 }
 
+static void qemu_rdma_return_path_dest_init(RDMAContext *rdma_return_path,
+                                            RDMAContext *rdma)
+{
+    int idx;
+
+    for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
+        rdma_return_path->wr_data[idx].control_len = 0;
+        rdma_return_path->wr_data[idx].control_curr = NULL;
+    }
+
+    /* the CM channel and CM id are shared */
+    rdma_return_path->channel = rdma->channel;
+    rdma_return_path->listen_id = rdma->listen_id;
+
+    rdma->return_path = rdma_return_path;
+    rdma_return_path->return_path = rdma;
+    rdma_return_path->is_return_path = true;
+}
+
 static void *qemu_rdma_data_init(const char *host_port, Error **errp)
 {
     RDMAContext *rdma = NULL;
@@ -3018,6 +3053,8 @@ err:
     return ret;
 }
 
+static void rdma_accept_incoming_migration(void *opaque);
+
 static int qemu_rdma_accept(RDMAContext *rdma)
 {
     RDMACapabilities cap;
@@ -3112,7 +3149,14 @@ static int qemu_rdma_accept(RDMAContext *rdma)
         }
     }
 
-    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+    /* Accept the second connection request for return path */
+    if (migrate_postcopy() && !rdma->is_return_path) {
+        qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
+                            NULL,
+                            (void *)(intptr_t)rdma->return_path);
+    } else {
+        qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+    }
 
     ret = rdma_accept(rdma->cm_id, &conn_param);
     if (ret) {
@@ -3693,6 +3737,10 @@ static void rdma_accept_incoming_migration(void *opaque)
 
     trace_qemu_rdma_accept_incoming_migration_accepted();
 
+    if (rdma->is_return_path) {
+        return;
+    }
+
     f = qemu_fopen_rdma(rdma, "rb");
     if (f == NULL) {
         ERROR(errp, "could not qemu_fopen_rdma!");
@@ -3707,7 +3755,7 @@ static void rdma_accept_incoming_migration(void *opaque)
 void rdma_start_incoming_migration(const char *host_port, Error **errp)
 {
     int ret;
-    RDMAContext *rdma;
+    RDMAContext *rdma, *rdma_return_path;
     Error *local_err = NULL;
 
     trace_rdma_start_incoming_migration();
@@ -3734,12 +3782,24 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp)
 
     trace_rdma_start_incoming_migration_after_rdma_listen();
 
+    /* initialize the RDMAContext for return path */
+    if (migrate_postcopy()) {
+        rdma_return_path = qemu_rdma_data_init(host_port, &local_err);
+
+        if (rdma_return_path == NULL) {
+            goto err;
+        }
+
+        qemu_rdma_return_path_dest_init(rdma_return_path, rdma);
+    }
+
     qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
                         NULL, (void *)(intptr_t)rdma);
     return;
 err:
     error_propagate(errp, local_err);
     g_free(rdma);
+    g_free(rdma_return_path);
 }
 
 void rdma_start_outgoing_migration(void *opaque,
@@ -3747,6 +3807,7 @@ void rdma_start_outgoing_migration(void *opaque,
 {
     MigrationState *s = opaque;
     RDMAContext *rdma = qemu_rdma_data_init(host_port, errp);
+    RDMAContext *rdma_return_path = NULL;
     int ret = 0;
 
     if (rdma == NULL) {
@@ -3767,6 +3828,32 @@ void rdma_start_outgoing_migration(void *opaque,
         goto err;
     }
 
+    /* RDMA postcopy needs a separate queue pair for the return path */
+    if (migrate_postcopy()) {
+        rdma_return_path = qemu_rdma_data_init(host_port, errp);
+
+        if (rdma_return_path == NULL) {
+            goto err;
+        }
+
+        ret = qemu_rdma_source_init(rdma_return_path,
+            s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL], errp);
+
+        if (ret) {
+            goto err;
+        }
+
+        ret = qemu_rdma_connect(rdma_return_path, errp);
+
+        if (ret) {
+            goto err;
+        }
+
+        rdma->return_path = rdma_return_path;
+        rdma_return_path->return_path = rdma;
+        rdma_return_path->is_return_path = true;
+    }
+
     trace_rdma_start_outgoing_migration_after_rdma_connect();
 
     s->to_dst_file = qemu_fopen_rdma(rdma, "wb");
@@ -3774,4 +3861,5 @@ void rdma_start_outgoing_migration(void *opaque,
     return;
 err:
     g_free(rdma);
+    g_free(rdma_return_path);
 }
-- 
1.8.3.1

* [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-08 14:19   ` Dr. David Alan Gilbert
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads Lidong Chen
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

Because qio_channel_rdma_writev and qio_channel_rdma_readv may be invoked
by different threads concurrently, this patch removes the unnecessary variable
len in QIOChannelRDMA and uses a local variable instead.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangéberrange@redhat.com>
---
 migration/rdma.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index c745427..f5c1d02 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -404,7 +404,6 @@ struct QIOChannelRDMA {
     QIOChannel parent;
     RDMAContext *rdma;
     QEMUFile *file;
-    size_t len;
     bool blocking; /* XXX we don't actually honour this yet */
 };
 
@@ -2640,6 +2639,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
     int ret;
     ssize_t done = 0;
     size_t i;
+    size_t len = 0;
 
     CHECK_ERROR_STATE();
 
@@ -2659,10 +2659,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
         while (remaining) {
             RDMAControlHeader head;
 
-            rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
-            remaining -= rioc->len;
+            len = MIN(remaining, RDMA_SEND_INCREMENT);
+            remaining -= len;
 
-            head.len = rioc->len;
+            head.len = len;
             head.type = RDMA_CONTROL_QEMU_FILE;
 
             ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
@@ -2672,8 +2672,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
                 return ret;
             }
 
-            data += rioc->len;
-            done += rioc->len;
+            data += len;
+            done += len;
         }
     }
 
@@ -2768,8 +2768,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
             }
         }
     }
-    rioc->len = done;
-    return rioc->len;
+    return done;
 }
 
 /*
-- 
1.8.3.1

* [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
                   ` (2 preceding siblings ...)
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy Lidong Chen
  5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

channel_close may be invoked by different threads. For example, the source
qemu invokes qemu_fclose in the main thread, the migration thread and the
return path thread. The destination qemu invokes qemu_fclose in the main
thread, the listen thread and the COLO incoming thread.

Add a mutex to the QEMUFile struct to avoid invoking channel_close
concurrently.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
 migration/qemu-file.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 977b9ae..87d0f05 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -52,6 +52,7 @@ struct QEMUFile {
     unsigned int iovcnt;
 
     int last_error;
+    QemuMutex lock;
 };
 
 /*
@@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
 
     f = g_new0(QEMUFile, 1);
 
+    qemu_mutex_init(&f->lock);
     f->opaque = opaque;
     f->ops = ops;
     return f;
@@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
     ret = qemu_file_get_error(f);
 
     if (f->ops->close) {
+        qemu_mutex_lock(&f->lock);
         int ret2 = f->ops->close(f->opaque);
+        qemu_mutex_unlock(&f->lock);
         if (ret >= 0) {
             ret = ret2;
         }
@@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
     if (f->last_error) {
         ret = f->last_error;
     }
+    qemu_mutex_destroy(&f->lock);
     g_free(f);
     trace_qemu_file_fclose();
     return ret;
-- 
1.8.3.1

* [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
                   ` (3 preceding siblings ...)
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  2018-05-15 14:54   ` Paolo Bonzini
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy Lidong Chen
  5 siblings, 1 reply; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

This patch implements a bi-directional RDMA QIOChannel. Because different
threads may access the QIOChannelRDMA concurrently, this patch uses RCU to
protect it.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
 migration/colo.c         |   2 +
 migration/migration.c    |   2 +
 migration/postcopy-ram.c |   2 +
 migration/ram.c          |   4 +
 migration/rdma.c         | 196 ++++++++++++++++++++++++++++++++++++++++-------
 migration/savevm.c       |   3 +
 6 files changed, 183 insertions(+), 26 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 4381067..88936f5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -534,6 +534,7 @@ void *colo_process_incoming_thread(void *opaque)
     uint64_t value;
     Error *local_err = NULL;
 
+    rcu_register_thread();
     qemu_sem_init(&mis->colo_incoming_sem, 0);
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
@@ -666,5 +667,6 @@ out:
     }
     migration_incoming_exit_colo();
 
+    rcu_unregister_thread();
     return NULL;
 }
diff --git a/migration/migration.c b/migration/migration.c
index 0bdb28e..584666b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1787,6 +1787,7 @@ static void *source_return_path_thread(void *opaque)
     int res;
 
     trace_source_return_path_thread_entry();
+    rcu_register_thread();
     while (!ms->rp_state.error && !qemu_file_get_error(rp) &&
            migration_is_setup_or_active(ms->state)) {
         trace_source_return_path_thread_loop_top();
@@ -1887,6 +1888,7 @@ static void *source_return_path_thread(void *opaque)
 out:
     ms->rp_state.from_dst_file = NULL;
     qemu_fclose(rp);
+    rcu_unregister_thread();
     return NULL;
 }
 
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 8ceeaa2..4e05966 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -842,6 +842,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
     RAMBlock *rb = NULL;
 
     trace_postcopy_ram_fault_thread_entry();
+    rcu_register_thread();
     mis->last_rb = NULL; /* last RAMBlock we sent part of */
     qemu_sem_post(&mis->fault_thread_sem);
 
@@ -1013,6 +1014,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
             }
         }
     }
+    rcu_unregister_thread();
     trace_postcopy_ram_fault_thread_exit();
     g_free(pfd);
     return NULL;
diff --git a/migration/ram.c b/migration/ram.c
index 912810c..9bc92fc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -491,6 +491,7 @@ static void *multifd_send_thread(void *opaque)
 {
     MultiFDSendParams *p = opaque;
 
+    rcu_register_thread();
     while (true) {
         qemu_mutex_lock(&p->mutex);
         if (p->quit) {
@@ -500,6 +501,7 @@ static void *multifd_send_thread(void *opaque)
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_wait(&p->sem);
     }
+    rcu_unregister_thread();
 
     return NULL;
 }
@@ -592,6 +594,7 @@ static void *multifd_recv_thread(void *opaque)
 {
     MultiFDRecvParams *p = opaque;
 
+    rcu_register_thread();
     while (true) {
         qemu_mutex_lock(&p->mutex);
         if (p->quit) {
@@ -601,6 +604,7 @@ static void *multifd_recv_thread(void *opaque)
         qemu_mutex_unlock(&p->mutex);
         qemu_sem_wait(&p->sem);
     }
+    rcu_unregister_thread();
 
     return NULL;
 }
diff --git a/migration/rdma.c b/migration/rdma.c
index f5c1d02..854f355 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -86,6 +86,7 @@ static uint32_t known_capabilities = RDMA_CAPABILITY_PIN_ALL;
                                 " to abort!"); \
                 rdma->error_reported = 1; \
             } \
+            rcu_read_unlock(); \
             return rdma->error_state; \
         } \
     } while (0)
@@ -402,7 +403,8 @@ typedef struct QIOChannelRDMA QIOChannelRDMA;
 
 struct QIOChannelRDMA {
     QIOChannel parent;
-    RDMAContext *rdma;
+    RDMAContext *rdmain;
+    RDMAContext *rdmaout;
     QEMUFile *file;
     bool blocking; /* XXX we don't actually honour this yet */
 };
@@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
     QEMUFile *f = rioc->file;
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     int ret;
     ssize_t done = 0;
     size_t i;
     size_t len = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmaout);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     /*
@@ -2650,6 +2660,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
     ret = qemu_rdma_write_flush(f, rdma);
     if (ret < 0) {
         rdma->error_state = ret;
+        rcu_read_unlock();
         return ret;
     }
 
@@ -2669,6 +2680,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 
             if (ret < 0) {
                 rdma->error_state = ret;
+                rcu_read_unlock();
                 return ret;
             }
 
@@ -2677,6 +2689,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
         }
     }
 
+    rcu_read_unlock();
     return done;
 }
 
@@ -2710,12 +2723,20 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
                                       Error **errp)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     RDMAControlHeader head;
     int ret = 0;
     ssize_t i;
     size_t done = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmain);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     for (i = 0; i < niov; i++) {
@@ -2727,7 +2748,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
          * were given and dish out the bytes until we run
          * out of bytes.
          */
-        ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
+        ret = qemu_rdma_fill(rdma, data, want, 0);
         done += ret;
         want -= ret;
         /* Got what we needed, so go to next iovec */
@@ -2749,25 +2770,28 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
 
         if (ret < 0) {
             rdma->error_state = ret;
+            rcu_read_unlock();
             return ret;
         }
 
         /*
          * SEND was received with new bytes, now try again.
          */
-        ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
+        ret = qemu_rdma_fill(rdma, data, want, 0);
         done += ret;
         want -= ret;
 
         /* Still didn't get enough, so lets just return */
         if (want) {
             if (done == 0) {
+                rcu_read_unlock();
                 return QIO_CHANNEL_ERR_BLOCK;
             } else {
                 break;
             }
         }
     }
+    rcu_read_unlock();
     return done;
 }
 
@@ -2819,15 +2843,29 @@ qio_channel_rdma_source_prepare(GSource *source,
                                 gint *timeout)
 {
     QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source;
-    RDMAContext *rdma = rsource->rioc->rdma;
+    RDMAContext *rdma;
     GIOCondition cond = 0;
     *timeout = -1;
 
+    rcu_read_lock();
+    if (rsource->condition == G_IO_IN) {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmain);
+    } else {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmaout);
+    }
+
+    if (!rdma) {
+        error_report("RDMAContext is NULL when prepare Gsource");
+        rcu_read_unlock();
+        return FALSE;
+    }
+
     if (rdma->wr_data[0].control_len) {
         cond |= G_IO_IN;
     }
     cond |= G_IO_OUT;
 
+    rcu_read_unlock();
     return cond & rsource->condition;
 }
 
@@ -2835,14 +2873,28 @@ static gboolean
 qio_channel_rdma_source_check(GSource *source)
 {
     QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source;
-    RDMAContext *rdma = rsource->rioc->rdma;
+    RDMAContext *rdma;
     GIOCondition cond = 0;
 
+    rcu_read_lock();
+    if (rsource->condition == G_IO_IN) {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmain);
+    } else {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmaout);
+    }
+
+    if (!rdma) {
+        error_report("RDMAContext is NULL when check Gsource");
+        rcu_read_unlock();
+        return FALSE;
+    }
+
     if (rdma->wr_data[0].control_len) {
         cond |= G_IO_IN;
     }
     cond |= G_IO_OUT;
 
+    rcu_read_unlock();
     return cond & rsource->condition;
 }
 
@@ -2853,14 +2905,28 @@ qio_channel_rdma_source_dispatch(GSource *source,
 {
     QIOChannelFunc func = (QIOChannelFunc)callback;
     QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source;
-    RDMAContext *rdma = rsource->rioc->rdma;
+    RDMAContext *rdma;
     GIOCondition cond = 0;
 
+    rcu_read_lock();
+    if (rsource->condition == G_IO_IN) {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmain);
+    } else {
+        rdma = atomic_rcu_read(&rsource->rioc->rdmaout);
+    }
+
+    if (!rdma) {
+        error_report("RDMAContext is NULL when dispatch Gsource");
+        rcu_read_unlock();
+        return FALSE;
+    }
+
     if (rdma->wr_data[0].control_len) {
         cond |= G_IO_IN;
     }
     cond |= G_IO_OUT;
 
+    rcu_read_unlock();
     return (*func)(QIO_CHANNEL(rsource->rioc),
                    (cond & rsource->condition),
                    user_data);
@@ -2905,15 +2971,32 @@ static int qio_channel_rdma_close(QIOChannel *ioc,
                                   Error **errp)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+    RDMAContext *rdmain, *rdmaout;
     trace_qemu_rdma_close();
-    if (rioc->rdma) {
-        if (!rioc->rdma->error_state) {
-            rioc->rdma->error_state = qemu_file_get_error(rioc->file);
-        }
-        qemu_rdma_cleanup(rioc->rdma);
-        g_free(rioc->rdma);
-        rioc->rdma = NULL;
+
+    rdmain = rioc->rdmain;
+    if (rdmain) {
+        atomic_rcu_set(&rioc->rdmain, NULL);
+    }
+
+    rdmaout = rioc->rdmaout;
+    if (rdmaout) {
+        atomic_rcu_set(&rioc->rdmaout, NULL);
     }
+
+    synchronize_rcu();
+
+    if (rdmain) {
+        qemu_rdma_cleanup(rdmain);
+    }
+
+    if (rdmaout) {
+        qemu_rdma_cleanup(rdmaout);
+    }
+
+    g_free(rdmain);
+    g_free(rdmaout);
+
     return 0;
 }
 
@@ -2956,12 +3039,21 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
                                   size_t size, uint64_t *bytes_sent)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     int ret;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmaout);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        rcu_read_unlock();
         return RAM_SAVE_CONTROL_NOT_SUPP;
     }
 
@@ -3046,9 +3138,11 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
         }
     }
 
+    rcu_read_unlock();
     return RAM_SAVE_CONTROL_DELAYED;
 err:
     rdma->error_state = ret;
+    rcu_read_unlock();
     return ret;
 }
 
@@ -3224,8 +3318,8 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
     RDMAControlHeader blocks = { .type = RDMA_CONTROL_RAM_BLOCKS_RESULT,
                                  .repeat = 1 };
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
-    RDMAContext *rdma = rioc->rdma;
-    RDMALocalBlocks *local = &rdma->local_ram_blocks;
+    RDMAContext *rdma;
+    RDMALocalBlocks *local;
     RDMAControlHeader head;
     RDMARegister *reg, *registers;
     RDMACompress *comp;
@@ -3238,8 +3332,17 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
     int count = 0;
     int i = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmain);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
+    local = &rdma->local_ram_blocks;
     do {
         trace_qemu_rdma_registration_handle_wait();
 
@@ -3469,6 +3572,7 @@ out:
     if (ret < 0) {
         rdma->error_state = ret;
     }
+    rcu_read_unlock();
     return ret;
 }
 
@@ -3482,10 +3586,18 @@ out:
 static int
 rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name)
 {
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     int curr;
     int found = -1;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmain);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     /* Find the matching RAMBlock in our local list */
     for (curr = 0; curr < rdma->local_ram_blocks.nb_blocks; curr++) {
         if (!strcmp(rdma->local_ram_blocks.block[curr].block_name, name)) {
@@ -3496,6 +3608,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name)
 
     if (found == -1) {
         error_report("RAMBlock '%s' not found on destination", name);
+        rcu_read_unlock();
         return -ENOENT;
     }
 
@@ -3503,6 +3616,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name)
     trace_rdma_block_notification_handle(name, rdma->next_src_index);
     rdma->next_src_index++;
 
+    rcu_read_unlock();
     return 0;
 }
 
@@ -3525,11 +3639,19 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
                                         uint64_t flags, void *data)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
+
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmaout);
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
 
     CHECK_ERROR_STATE();
 
     if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        rcu_read_unlock();
         return 0;
     }
 
@@ -3537,6 +3659,7 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
     qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
     qemu_fflush(f);
 
+    rcu_read_unlock();
     return 0;
 }
 
@@ -3549,13 +3672,21 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
 {
     Error *local_err = NULL, **errp = &local_err;
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     RDMAControlHeader head = { .len = 0, .repeat = 1 };
     int ret = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmaout);
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        rcu_read_unlock();
         return 0;
     }
 
@@ -3587,6 +3718,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
                     qemu_rdma_reg_whole_ram_blocks : NULL);
         if (ret < 0) {
             ERROR(errp, "receiving remote info!");
+            rcu_read_unlock();
             return ret;
         }
 
@@ -3610,6 +3742,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
                         "not identical on both the source and destination.",
                         local->nb_blocks, nb_dest_blocks);
             rdma->error_state = -EINVAL;
+            rcu_read_unlock();
             return -EINVAL;
         }
 
@@ -3626,6 +3759,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
                             local->block[i].length,
                             rdma->dest_blocks[i].length);
                 rdma->error_state = -EINVAL;
+                rcu_read_unlock();
                 return -EINVAL;
             }
             local->block[i].remote_host_addr =
@@ -3643,9 +3777,11 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
         goto err;
     }
 
+    rcu_read_unlock();
     return 0;
 err:
     rdma->error_state = ret;
+    rcu_read_unlock();
     return ret;
 }
 
@@ -3663,10 +3799,15 @@ static const QEMUFileHooks rdma_write_hooks = {
 static void qio_channel_rdma_finalize(Object *obj)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(obj);
-    if (rioc->rdma) {
-        qemu_rdma_cleanup(rioc->rdma);
-        g_free(rioc->rdma);
-        rioc->rdma = NULL;
+    if (rioc->rdmain) {
+        qemu_rdma_cleanup(rioc->rdmain);
+        g_free(rioc->rdmain);
+        rioc->rdmain = NULL;
+    }
+    if (rioc->rdmaout) {
+        qemu_rdma_cleanup(rioc->rdmaout);
+        g_free(rioc->rdmaout);
+        rioc->rdmaout = NULL;
     }
 }
 
@@ -3706,13 +3847,16 @@ static QEMUFile *qemu_fopen_rdma(RDMAContext *rdma, const char *mode)
     }
 
     rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA));
-    rioc->rdma = rdma;
 
     if (mode[0] == 'w') {
         rioc->file = qemu_fopen_channel_output(QIO_CHANNEL(rioc));
+        rioc->rdmaout = rdma;
+        rioc->rdmain = rdma->return_path;
         qemu_file_set_hooks(rioc->file, &rdma_write_hooks);
     } else {
         rioc->file = qemu_fopen_channel_input(QIO_CHANNEL(rioc));
+        rioc->rdmain = rdma;
+        rioc->rdmaout = rdma->return_path;
         qemu_file_set_hooks(rioc->file, &rdma_read_hooks);
     }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index e2be02a..45ec809 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1573,6 +1573,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
     qemu_sem_post(&mis->listen_thread_sem);
     trace_postcopy_ram_listen_thread_start();
 
+    rcu_register_thread();
     /*
      * Because we're a thread and not a coroutine we can't yield
      * in qemu_file, and thus we must be blocking now.
@@ -1605,6 +1606,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
          * to leave the guest running and fire MCEs for pages that never
          * arrived as a desperate recovery step.
          */
+        rcu_unregister_thread();
         exit(EXIT_FAILURE);
     }
 
@@ -1619,6 +1621,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
     migration_incoming_state_destroy();
     qemu_loadvm_state_cleanup();
 
+    rcu_unregister_thread();
     return NULL;
 }
 
-- 
1.8.3.1

* [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy
  2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
                   ` (4 preceding siblings ...)
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
  5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
  To: quintela, dgilbert, berrange
  Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen

During incoming postcopy, the destination qemu invokes
qemu_rdma_wait_comp_channel in a separate thread, so it cannot use the rdma
yield path and polls the completion channel fd instead.

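In outline, the non-yielding wait is a poll loop like the one the source side
already uses (a simplified sketch of the existing qemu_rdma_wait_comp_channel
code path, not new code; the real loop also checks for errors and 'cancel'):

    while (!rdma->error_state) {
        GPollFD pfds[1];
        pfds[0].fd = rdma->comp_channel->fd;
        pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
        /* poll with a 0.1s timeout so cancel or an error can break the loop */
        if (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000) > 0) {
            break;  /* completion channel became readable */
        }
        /* timed out: re-check the error state, then poll again */
    }
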
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 854f355..ed9cfb1 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1490,11 +1490,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
      * Coroutine doesn't start until migration_fd_process_incoming()
      * so don't yield unless we know we're running inside of a coroutine.
      */
-    if (rdma->migration_started_on_destination) {
+    if (rdma->migration_started_on_destination &&
+        migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {
         yield_until_fd_readable(rdma->comp_channel->fd);
     } else {
         /* This is the source side, we're in a separate thread
          * or destination prior to migration_fd_process_incoming()
+         * after postcopy has started, the destination is also in a separate thread.
          * we can't yield; so we have to poll the fd.
          * But we need to be able to handle 'cancel' or an error
          * without hanging forever.
-- 
1.8.3.1

* Re: [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
@ 2018-05-08 14:19   ` Dr. David Alan Gilbert
  2018-05-09  1:28     ` 858585 jemmy
  0 siblings, 1 reply; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2018-05-08 14:19 UTC (permalink / raw)
  To: Lidong Chen
  Cc: quintela, berrange, qemu-devel, galsha, aviadye, adido,
	Lidong Chen

* Lidong Chen (jemmy858585@gmail.com) wrote:
> Because qio_channel_rdma_writev and qio_channel_rdma_readv may be invoked
> by different threads concurrently, this patch removes the unnecessary variable
> len in QIOChannelRDMA and uses a local variable instead.
> 
> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Reviewed-by: Daniel P. Berrangéberrange@redhat.com>

Note there's a ' <' missing somehow; minor fix up during commit
hopefully.

Dave

> ---
>  migration/rdma.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index c745427..f5c1d02 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -404,7 +404,6 @@ struct QIOChannelRDMA {
>      QIOChannel parent;
>      RDMAContext *rdma;
>      QEMUFile *file;
> -    size_t len;
>      bool blocking; /* XXX we don't actually honour this yet */
>  };
>  
> @@ -2640,6 +2639,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>      int ret;
>      ssize_t done = 0;
>      size_t i;
> +    size_t len = 0;
>  
>      CHECK_ERROR_STATE();
>  
> @@ -2659,10 +2659,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>          while (remaining) {
>              RDMAControlHeader head;
>  
> -            rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
> -            remaining -= rioc->len;
> +            len = MIN(remaining, RDMA_SEND_INCREMENT);
> +            remaining -= len;
>  
> -            head.len = rioc->len;
> +            head.len = len;
>              head.type = RDMA_CONTROL_QEMU_FILE;
>  
>              ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
> @@ -2672,8 +2672,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>                  return ret;
>              }
>  
> -            data += rioc->len;
> -            done += rioc->len;
> +            data += len;
> +            done += len;
>          }
>      }
>  
> @@ -2768,8 +2768,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
>              }
>          }
>      }
> -    rioc->len = done;
> -    return rioc->len;
> +    return done;
>  }
>  
>  /*
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA
  2018-05-08 14:19   ` Dr. David Alan Gilbert
@ 2018-05-09  1:28     ` 858585 jemmy
  0 siblings, 0 replies; 13+ messages in thread
From: 858585 jemmy @ 2018-05-09  1:28 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Juan Quintela, Daniel P. Berrange, qemu-devel, Gal Shachaf,
	Aviad Yehezkel, adido, Lidong Chen

On Tue, May 8, 2018 at 10:19 PM, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * Lidong Chen (jemmy858585@gmail.com) wrote:
>> Because qio_channel_rdma_writev and qio_channel_rdma_readv may be invoked
>> by different threads concurrently, this patch removes the unnecessary variable
>> len in QIOChannelRDMA and uses a local variable instead.
>>
>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Reviewed-by: Daniel P. Berrangéberrange@redhat.com>
>
> Note there's a ' <' missing somehow; minor fix up during commit
> hopefully.
>
> Dave

Sorry for this mistake, I will check more carefully.

>
>> ---
>>  migration/rdma.c | 15 +++++++--------
>>  1 file changed, 7 insertions(+), 8 deletions(-)
>>
>> diff --git a/migration/rdma.c b/migration/rdma.c
>> index c745427..f5c1d02 100644
>> --- a/migration/rdma.c
>> +++ b/migration/rdma.c
>> @@ -404,7 +404,6 @@ struct QIOChannelRDMA {
>>      QIOChannel parent;
>>      RDMAContext *rdma;
>>      QEMUFile *file;
>> -    size_t len;
>>      bool blocking; /* XXX we don't actually honour this yet */
>>  };
>>
>> @@ -2640,6 +2639,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>      int ret;
>>      ssize_t done = 0;
>>      size_t i;
>> +    size_t len = 0;
>>
>>      CHECK_ERROR_STATE();
>>
>> @@ -2659,10 +2659,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>          while (remaining) {
>>              RDMAControlHeader head;
>>
>> -            rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
>> -            remaining -= rioc->len;
>> +            len = MIN(remaining, RDMA_SEND_INCREMENT);
>> +            remaining -= len;
>>
>> -            head.len = rioc->len;
>> +            head.len = len;
>>              head.type = RDMA_CONTROL_QEMU_FILE;
>>
>>              ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
>> @@ -2672,8 +2672,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>                  return ret;
>>              }
>>
>> -            data += rioc->len;
>> -            done += rioc->len;
>> +            data += len;
>> +            done += len;
>>          }
>>      }
>>
>> @@ -2768,8 +2768,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
>>              }
>>          }
>>      }
>> -    rioc->len = done;
>> -    return rioc->len;
>> +    return done;
>>  }
>>
>>  /*
>> --
>> 1.8.3.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
@ 2018-05-15 14:54   ` Paolo Bonzini
  2018-05-16  9:36     ` 858585 jemmy
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2018-05-15 14:54 UTC (permalink / raw)
  To: Lidong Chen, quintela, dgilbert, berrange
  Cc: adido, galsha, aviadye, qemu-devel, Lidong Chen

On 05/05/2018 16:35, Lidong Chen wrote:
> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>  {
>      QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>      QEMUFile *f = rioc->file;
> -    RDMAContext *rdma = rioc->rdma;
> +    RDMAContext *rdma;
>      int ret;
>      ssize_t done = 0;
>      size_t i;
>      size_t len = 0;
>  
> +    rcu_read_lock();
> +    rdma = atomic_rcu_read(&rioc->rdmaout);
> +
> +    if (!rdma) {
> +        rcu_read_unlock();
> +        return -EIO;
> +    }
> +
>      CHECK_ERROR_STATE();
>  
>      /*

I am not sure I understand this.  It would probably be wrong to use the
output side from two threads at the same time, so why not use two mutexes?

Also, who is calling qio_channel_rdma_close in such a way that another
thread is still using it?  Would it be possible to synchronize with the
other thread *before*, for example with qemu_thread_join?

Thanks,

Paolo

* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-15 14:54   ` Paolo Bonzini
@ 2018-05-16  9:36     ` 858585 jemmy
  2018-05-21 11:49       ` 858585 jemmy
  0 siblings, 1 reply; 13+ messages in thread
From: 858585 jemmy @ 2018-05-16  9:36 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Juan Quintela, Dave Gilbert, Daniel P. Berrange, adido,
	Gal Shachaf, Aviad Yehezkel, qemu-devel, Lidong Chen

On Tue, May 15, 2018 at 10:54 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 05/05/2018 16:35, Lidong Chen wrote:
>> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>  {
>>      QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>      QEMUFile *f = rioc->file;
>> -    RDMAContext *rdma = rioc->rdma;
>> +    RDMAContext *rdma;
>>      int ret;
>>      ssize_t done = 0;
>>      size_t i;
>>      size_t len = 0;
>>
>> +    rcu_read_lock();
>> +    rdma = atomic_rcu_read(&rioc->rdmaout);
>> +
>> +    if (!rdma) {
>> +        rcu_read_unlock();
>> +        return -EIO;
>> +    }
>> +
>>      CHECK_ERROR_STATE();
>>
>>      /*
>
> I am not sure I understand this.  It would probably be wrong to use the
> output side from two threads at the same time, so why not use two mutexes?

Two threads will not invoke qio_channel_rdma_writev at the same time.
On the source qemu, the migration thread only uses writev, and the return
path thread only uses readv.
The destination qemu already has a mutex, mis->rp_mutex, to make sure
writev is not used at the same time.

The rcu_read_lock is used to make sure the RDMAContext is not used while
another thread closes it.

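In outline, the pattern is (a simplified sketch of what patch 5 does, with
error handling trimmed):

    /* reader side, e.g. qio_channel_rdma_writev */
    rcu_read_lock();
    rdma = atomic_rcu_read(&rioc->rdmaout);
    if (!rdma) {
        rcu_read_unlock();
        return -EIO;
    }
    /* ... use rdma ... */
    rcu_read_unlock();

    /* closer side, qio_channel_rdma_close */
    rdmaout = rioc->rdmaout;
    atomic_rcu_set(&rioc->rdmaout, NULL);
    synchronize_rcu();     /* wait until no reader still holds the pointer */
    qemu_rdma_cleanup(rdmaout);
    g_free(rdmaout);
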
>
> Also, who is calling qio_channel_rdma_close in such a way that another
> thread is still using it?  Would it be possible to synchronize with the
> other thread *before*, for example with qemu_thread_join?

The MigrationState structure includes the to_dst_file and from_dst_file
QEMUFiles, and the two QEMUFiles use the same QIOChannel.
For example, if the return path thread calls
qemu_fclose(ms->rp_state.from_dst_file),
it will also close the RDMAContext for ms->to_dst_file.

For live migration, the source qemu invokes qemu_fclose in different
threads, including the main thread, the migration thread and the return
path thread.

The destination qemu invokes qemu_fclose in the main thread, the listen
thread and the COLO incoming thread.

I have not found an effective way to synchronize these threads.

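Roughly, on the source side it looks like this (simplified;
open_return_path_on_source() is in migration.c):

    s->to_dst_file = qemu_fopen_rdma(rdma, "wb");
    /* later, open_return_path_on_source() does: */
    ms->rp_state.from_dst_file = qemu_file_get_return_path(ms->to_dst_file);
    /* both QEMUFiles wrap the same QIOChannelRDMA underneath */
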
Thanks.

>
> Thanks,
>
> Paolo

* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-16  9:36     ` 858585 jemmy
@ 2018-05-21 11:49       ` 858585 jemmy
  2018-05-23  2:36         ` 858585 jemmy
  0 siblings, 1 reply; 13+ messages in thread
From: 858585 jemmy @ 2018-05-21 11:49 UTC (permalink / raw)
  To: Paolo Bonzini, Daniel P. Berrange
  Cc: Juan Quintela, Dave Gilbert, Adi Dotan, Gal Shachaf,
	Aviad Yehezkel, qemu-devel, Lidong Chen

On Wed, May 16, 2018 at 5:36 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
> On Tue, May 15, 2018 at 10:54 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 05/05/2018 16:35, Lidong Chen wrote:
>>> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>>  {
>>>      QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>>      QEMUFile *f = rioc->file;
>>> -    RDMAContext *rdma = rioc->rdma;
>>> +    RDMAContext *rdma;
>>>      int ret;
>>>      ssize_t done = 0;
>>>      size_t i;
>>>      size_t len = 0;
>>>
>>> +    rcu_read_lock();
>>> +    rdma = atomic_rcu_read(&rioc->rdmaout);
>>> +
>>> +    if (!rdma) {
>>> +        rcu_read_unlock();
>>> +        return -EIO;
>>> +    }
>>> +
>>>      CHECK_ERROR_STATE();
>>>
>>>      /*
>>
>> I am not sure I understand this.  It would probably be wrong to use the
>> output side from two threads at the same time, so why not use two mutexes?
>
> Two threads will not invoke qio_channel_rdma_writev at the same time.
> On the source qemu, the migration thread only uses writev, and the return
> path thread only uses readv.
> The destination qemu already has a mutex, mis->rp_mutex, to make sure
> writev is not used at the same time.
>
> The rcu_read_lock is used to make sure the RDMAContext is not used while
> another thread closes it.

Any suggestions?

>
>>
>> Also, who is calling qio_channel_rdma_close in such a way that another
>> thread is still using it?  Would it be possible to synchronize with the
>> other thread *before*, for example with qemu_thread_join?
>
> The MigrationState structure includes the to_dst_file and from_dst_file
> QEMUFiles, and the two QEMUFiles use the same QIOChannel.
> For example, if the return path thread calls
> qemu_fclose(ms->rp_state.from_dst_file),
> it will also close the RDMAContext for ms->to_dst_file.
>
> For live migration, the source qemu invokes qemu_fclose in different
> threads, including the main thread, the migration thread and the return
> path thread.
>
> The destination qemu invokes qemu_fclose in the main thread, the listen
> thread and the COLO incoming thread.
>
> I have not found an effective way to synchronize these threads.
>
> Thanks.
>
>>
>> Thanks,
>>
>> Paolo

* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
  2018-05-21 11:49       ` 858585 jemmy
@ 2018-05-23  2:36         ` 858585 jemmy
  0 siblings, 0 replies; 13+ messages in thread
From: 858585 jemmy @ 2018-05-23  2:36 UTC (permalink / raw)
  To: Paolo Bonzini, Daniel P. Berrange
  Cc: Juan Quintela, Dave Gilbert, Adi Dotan, Gal Shachaf,
	Aviad Yehezkel, qemu-devel, Lidong Chen

ping.

On Mon, May 21, 2018 at 7:49 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
> On Wed, May 16, 2018 at 5:36 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
>> On Tue, May 15, 2018 at 10:54 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 05/05/2018 16:35, Lidong Chen wrote:
>>>> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>>>  {
>>>>      QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>>>      QEMUFile *f = rioc->file;
>>>> -    RDMAContext *rdma = rioc->rdma;
>>>> +    RDMAContext *rdma;
>>>>      int ret;
>>>>      ssize_t done = 0;
>>>>      size_t i;
>>>>      size_t len = 0;
>>>>
>>>> +    rcu_read_lock();
>>>> +    rdma = atomic_rcu_read(&rioc->rdmaout);
>>>> +
>>>> +    if (!rdma) {
>>>> +        rcu_read_unlock();
>>>> +        return -EIO;
>>>> +    }
>>>> +
>>>>      CHECK_ERROR_STATE();
>>>>
>>>>      /*
>>>
>>> I am not sure I understand this.  It would probably be wrong to use the
>>> output side from two threads at the same time, so why not use two mutexes?
>>
>> Two threads will not invoke qio_channel_rdma_writev at the same time.
>> On the source qemu, the migration thread only uses writev, and the return
>> path thread only uses readv.
>> The destination qemu already has a mutex, mis->rp_mutex, to make sure
>> writev is not used at the same time.
>>
>> The rcu_read_lock is used to make sure the RDMAContext is not used while
>> another thread closes it.
>
> Any suggestions?
>
>>
>>>
>>> Also, who is calling qio_channel_rdma_close in such a way that another
>>> thread is still using it?  Would it be possible to synchronize with the
>>> other thread *before*, for example with qemu_thread_join?
>>
>> The MigrationState structure includes the to_dst_file and from_dst_file
>> QEMUFiles, and the two QEMUFiles use the same QIOChannel.
>> For example, if the return path thread calls
>> qemu_fclose(ms->rp_state.from_dst_file),
>> it will also close the RDMAContext for ms->to_dst_file.
>>
>> For live migration, the source qemu invokes qemu_fclose in different
>> threads, including the main thread, the migration thread and the return
>> path thread.
>>
>> The destination qemu invokes qemu_fclose in the main thread, the listen
>> thread and the COLO incoming thread.
>>
>> I have not found an effective way to synchronize these threads.
>>
>> Thanks.
>>
>>>
>>> Thanks,
>>>
>>> Paolo
