* [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration
@ 2018-05-05 14:35 Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
` (5 more replies)
0 siblings, 6 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
To: quintela, dgilbert, berrange
Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen
The RDMA QIOChannel does not support bi-directional communication, so when an
RDMA live migration is started with postcopy enabled, the return path on the
source qemu gets a QEMUFile error.
These patches implement bi-directional communication for the RDMA QIOChannel
and disable RDMA WRITE during the postcopy phase.
This series only makes postcopy work; performance will be improved later.
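For testing, a typical setup looks like the following (a sketch only; the
postcopy-ram capability also has to be enabled on the destination side
before the migration starts):

    # destination
    qemu-system-x86_64 ... -incoming rdma:0.0.0.0:4444

    # source HMP monitor
    migrate_set_capability postcopy-ram on
    migrate -d rdma:<dest-host>:4444
    # once the precopy phase is running
    migrate_start_postcopy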
[v3]
- add a mutex in the QEMUFile struct to avoid concurrent channel close (Daniel)
- destroy the mutex before freeing the QEMUFile (David)
- use rdmain and rdmaout instead of rdma->return_path (Daniel)
[v2]
- do not update bytes_xfer when RDMA WRITE is disabled (David)
- implement bi-directional communication for the RDMA QIOChannel (Daniel)
Lidong Chen (6):
migration: disable RDMA WRITE after postcopy started
migration: create a dedicated connection for rdma return path
migration: remove unnecessary variables len in QIOChannelRDMA
migration: avoid concurrent invoke channel_close by different threads
migration: implement bi-directional RDMA QIOChannel
migration: Stop rdma yielding during incoming postcopy
migration/colo.c | 2 +
migration/migration.c | 2 +
migration/postcopy-ram.c | 2 +
migration/qemu-file.c | 13 +-
migration/ram.c | 4 +
migration/rdma.c | 321 +++++++++++++++++++++++++++++++++++++++++------
migration/savevm.c | 3 +
7 files changed, 307 insertions(+), 40 deletions(-)
--
1.8.3.1
* [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started
2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path Lidong Chen
` (4 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
To: quintela, dgilbert, berrange
Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen
RDMA WRITE operations are performed with no notification to the destination
qemu, so the destination qemu cannot be woken up for the incoming pages. This
patch disables RDMA WRITE after postcopy has started.
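The fix relies on the generic RAM save path treating RAM_SAVE_CONTROL_NOT_SUPP
as "the hook did not take this page", so the page falls back to the normal
migration stream; roughly (a simplified sketch of the caller side, with
send_normal_page() as a hypothetical stand-in for the regular ram.c path):

    int ret = ram_control_save_page(f, block_offset, offset, size, &bytes_xmit);

    if (ret == RAM_SAVE_CONTROL_NOT_SUPP) {
        /* Hook declined (e.g. RDMA while postcopy is active): send the page
         * inline through the migration stream, which does notify the
         * destination and lets a faulting vCPU be woken up. */
        ret = send_normal_page(f, block_offset, offset, size);
    } else if (ret == RAM_SAVE_CONTROL_DELAYED) {
        /* RDMA WRITE was queued; bytes are accounted for on completion. */
    }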
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/qemu-file.c | 8 ++++++--
migration/rdma.c | 12 ++++++++++++
2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 0463f4c..977b9ae 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -253,8 +253,12 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
if (f->hooks && f->hooks->save_page) {
int ret = f->hooks->save_page(f, f->opaque, block_offset,
offset, size, bytes_sent);
- f->bytes_xfer += size;
- if (ret != RAM_SAVE_CONTROL_DELAYED) {
+ if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+ f->bytes_xfer += size;
+ }
+
+ if (ret != RAM_SAVE_CONTROL_DELAYED &&
+ ret != RAM_SAVE_CONTROL_NOT_SUPP) {
if (bytes_sent && *bytes_sent > 0) {
qemu_update_position(f, *bytes_sent);
} else if (ret < 0) {
diff --git a/migration/rdma.c b/migration/rdma.c
index da474fc..a22be43 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2927,6 +2927,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
CHECK_ERROR_STATE();
+ if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+ return RAM_SAVE_CONTROL_NOT_SUPP;
+ }
+
qemu_fflush(f);
if (size > 0) {
@@ -3482,6 +3486,10 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
CHECK_ERROR_STATE();
+ if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+ return 0;
+ }
+
trace_qemu_rdma_registration_start(flags);
qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
qemu_fflush(f);
@@ -3504,6 +3512,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
CHECK_ERROR_STATE();
+ if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+ return 0;
+ }
+
qemu_fflush(f);
ret = qemu_rdma_drain_cq(f, rdma);
--
1.8.3.1
* [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path
2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
` (3 subsequent siblings)
5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
To: quintela, dgilbert, berrange
Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen
If an RDMA migration is started with postcopy enabled, the source qemu
establishes a dedicated connection for the return path.
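With postcopy enabled the end result is two RDMA connections between source
and destination (a schematic sketch, not literal code):

    /*
     *   source                                destination
     *   ------                                -----------
     *   rdma              == connection 1 ==  rdma               (RAM + control)
     *   rdma->return_path == connection 2 ==  rdma->return_path  (return path)
     *
     * Both connections are initiated by the source; on the destination they
     * share the same listen_id and event channel, which is why the cleanup
     * path needs the is_return_path flag.
     */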
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/rdma.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 91 insertions(+), 3 deletions(-)
diff --git a/migration/rdma.c b/migration/rdma.c
index a22be43..c745427 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -387,6 +387,10 @@ typedef struct RDMAContext {
uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
GHashTable *blockmap;
+
+ /* the RDMAContext for return path */
+ struct RDMAContext *return_path;
+ bool is_return_path;
} RDMAContext;
#define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
@@ -2329,10 +2333,22 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
rdma_destroy_id(rdma->cm_id);
rdma->cm_id = NULL;
}
+
+ /* the destination side, listen_id and channel is shared */
if (rdma->listen_id) {
- rdma_destroy_id(rdma->listen_id);
+ if (!rdma->is_return_path) {
+ rdma_destroy_id(rdma->listen_id);
+ }
rdma->listen_id = NULL;
+
+ if (rdma->channel) {
+ if (!rdma->is_return_path) {
+ rdma_destroy_event_channel(rdma->channel);
+ }
+ rdma->channel = NULL;
+ }
}
+
if (rdma->channel) {
rdma_destroy_event_channel(rdma->channel);
rdma->channel = NULL;
@@ -2561,6 +2577,25 @@ err_dest_init_create_listen_id:
}
+static void qemu_rdma_return_path_dest_init(RDMAContext *rdma_return_path,
+ RDMAContext *rdma)
+{
+ int idx;
+
+ for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
+ rdma_return_path->wr_data[idx].control_len = 0;
+ rdma_return_path->wr_data[idx].control_curr = NULL;
+ }
+
+ /*the CM channel and CM id is shared*/
+ rdma_return_path->channel = rdma->channel;
+ rdma_return_path->listen_id = rdma->listen_id;
+
+ rdma->return_path = rdma_return_path;
+ rdma_return_path->return_path = rdma;
+ rdma_return_path->is_return_path = true;
+}
+
static void *qemu_rdma_data_init(const char *host_port, Error **errp)
{
RDMAContext *rdma = NULL;
@@ -3018,6 +3053,8 @@ err:
return ret;
}
+static void rdma_accept_incoming_migration(void *opaque);
+
static int qemu_rdma_accept(RDMAContext *rdma)
{
RDMACapabilities cap;
@@ -3112,7 +3149,14 @@ static int qemu_rdma_accept(RDMAContext *rdma)
}
}
- qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+ /* Accept the second connection request for return path */
+ if (migrate_postcopy() && !rdma->is_return_path) {
+ qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
+ NULL,
+ (void *)(intptr_t)rdma->return_path);
+ } else {
+ qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+ }
ret = rdma_accept(rdma->cm_id, &conn_param);
if (ret) {
@@ -3693,6 +3737,10 @@ static void rdma_accept_incoming_migration(void *opaque)
trace_qemu_rdma_accept_incoming_migration_accepted();
+ if (rdma->is_return_path) {
+ return;
+ }
+
f = qemu_fopen_rdma(rdma, "rb");
if (f == NULL) {
ERROR(errp, "could not qemu_fopen_rdma!");
@@ -3707,7 +3755,7 @@ static void rdma_accept_incoming_migration(void *opaque)
void rdma_start_incoming_migration(const char *host_port, Error **errp)
{
int ret;
- RDMAContext *rdma;
+ RDMAContext *rdma, *rdma_return_path;
Error *local_err = NULL;
trace_rdma_start_incoming_migration();
@@ -3734,12 +3782,24 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp)
trace_rdma_start_incoming_migration_after_rdma_listen();
+ /* initialize the RDMAContext for return path */
+ if (migrate_postcopy()) {
+ rdma_return_path = qemu_rdma_data_init(host_port, &local_err);
+
+ if (rdma_return_path == NULL) {
+ goto err;
+ }
+
+ qemu_rdma_return_path_dest_init(rdma_return_path, rdma);
+ }
+
qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
NULL, (void *)(intptr_t)rdma);
return;
err:
error_propagate(errp, local_err);
g_free(rdma);
+ g_free(rdma_return_path);
}
void rdma_start_outgoing_migration(void *opaque,
@@ -3747,6 +3807,7 @@ void rdma_start_outgoing_migration(void *opaque,
{
MigrationState *s = opaque;
RDMAContext *rdma = qemu_rdma_data_init(host_port, errp);
+ RDMAContext *rdma_return_path = NULL;
int ret = 0;
if (rdma == NULL) {
@@ -3767,6 +3828,32 @@ void rdma_start_outgoing_migration(void *opaque,
goto err;
}
+ /* RDMA postcopy need a seprate queue pair for return path */
+ if (migrate_postcopy()) {
+ rdma_return_path = qemu_rdma_data_init(host_port, errp);
+
+ if (rdma_return_path == NULL) {
+ goto err;
+ }
+
+ ret = qemu_rdma_source_init(rdma_return_path,
+ s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL], errp);
+
+ if (ret) {
+ goto err;
+ }
+
+ ret = qemu_rdma_connect(rdma_return_path, errp);
+
+ if (ret) {
+ goto err;
+ }
+
+ rdma->return_path = rdma_return_path;
+ rdma_return_path->return_path = rdma;
+ rdma_return_path->is_return_path = true;
+ }
+
trace_rdma_start_outgoing_migration_after_rdma_connect();
s->to_dst_file = qemu_fopen_rdma(rdma, "wb");
@@ -3774,4 +3861,5 @@ void rdma_start_outgoing_migration(void *opaque,
return;
err:
g_free(rdma);
+ g_free(rdma_return_path);
}
--
1.8.3.1
* [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA
2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 1/6] migration: disable RDMA WRITE after postcopy started Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 2/6] migration: create a dedicated connection for rdma return path Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
2018-05-08 14:19 ` Dr. David Alan Gilbert
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads Lidong Chen
` (2 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
To: quintela, dgilbert, berrange
Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen
Because qio_channel_rdma_writev and qio_channel_rdma_readv may be invoked
by different threads concurrently, this patch removes the unnecessary len
variable from QIOChannelRDMA and uses a local variable instead.
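To illustrate the problem (a hypothetical interleaving, not an observed trace):

    /* migration thread (writev)            return path thread (readv)
     *
     * rioc->len = MIN(remaining, ...);
     *                                       rioc->len = done;   // clobbers
     * data += rioc->len;                    // wrong increment
     *
     * With a local variable in each function, the bookkeeping stays private
     * to the calling thread. */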
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangéberrange@redhat.com>
---
migration/rdma.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/migration/rdma.c b/migration/rdma.c
index c745427..f5c1d02 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -404,7 +404,6 @@ struct QIOChannelRDMA {
QIOChannel parent;
RDMAContext *rdma;
QEMUFile *file;
- size_t len;
bool blocking; /* XXX we don't actually honour this yet */
};
@@ -2640,6 +2639,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
int ret;
ssize_t done = 0;
size_t i;
+ size_t len = 0;
CHECK_ERROR_STATE();
@@ -2659,10 +2659,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
while (remaining) {
RDMAControlHeader head;
- rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
- remaining -= rioc->len;
+ len = MIN(remaining, RDMA_SEND_INCREMENT);
+ remaining -= len;
- head.len = rioc->len;
+ head.len = len;
head.type = RDMA_CONTROL_QEMU_FILE;
ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
@@ -2672,8 +2672,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
return ret;
}
- data += rioc->len;
- done += rioc->len;
+ data += len;
+ done += len;
}
}
@@ -2768,8 +2768,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
}
}
}
- rioc->len = done;
- return rioc->len;
+ return done;
}
/*
--
1.8.3.1
* [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads
2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
` (2 preceding siblings ...)
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy Lidong Chen
5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
To: quintela, dgilbert, berrange
Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen
The channel_close callback may be invoked by different threads. For example,
the source qemu invokes qemu_fclose in the main thread, the migration thread
and the return path thread. The destination qemu invokes qemu_fclose in the
main thread, the listen thread and the COLO incoming thread.
Add a mutex to the QEMUFile struct to avoid concurrent invocations of
channel_close.
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
migration/qemu-file.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 977b9ae..87d0f05 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -52,6 +52,7 @@ struct QEMUFile {
unsigned int iovcnt;
int last_error;
+ QemuMutex lock;
};
/*
@@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
f = g_new0(QEMUFile, 1);
+ qemu_mutex_init(&f->lock);
f->opaque = opaque;
f->ops = ops;
return f;
@@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
ret = qemu_file_get_error(f);
if (f->ops->close) {
+ qemu_mutex_lock(&f->lock);
int ret2 = f->ops->close(f->opaque);
+ qemu_mutex_unlock(&f->lock);
if (ret >= 0) {
ret = ret2;
}
@@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
if (f->last_error) {
ret = f->last_error;
}
+ qemu_mutex_destroy(&f->lock);
g_free(f);
trace_qemu_file_fclose();
return ret;
--
1.8.3.1
* [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
` (3 preceding siblings ...)
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 4/6] migration: avoid concurrent invoke channel_close by different threads Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
2018-05-15 14:54 ` Paolo Bonzini
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy Lidong Chen
5 siblings, 1 reply; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
To: quintela, dgilbert, berrange
Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen
This patch implements a bi-directional RDMA QIOChannel. Because different
threads may access the QIOChannelRDMA concurrently, this patch uses RCU to
protect it.
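The protection boils down to the following pattern (a condensed sketch of the
reader and the closing side; the full patch applies it to every access path):

    /* reader side, e.g. qio_channel_rdma_writev() */
    rcu_read_lock();
    rdma = atomic_rcu_read(&rioc->rdmaout);
    if (!rdma) {                      /* channel already closed */
        rcu_read_unlock();
        return -EIO;
    }
    /* ... use rdma; it cannot be freed while the read lock is held ... */
    rcu_read_unlock();

    /* closing side, qio_channel_rdma_close() */
    rdmain  = rioc->rdmain;
    rdmaout = rioc->rdmaout;
    atomic_rcu_set(&rioc->rdmain, NULL);
    atomic_rcu_set(&rioc->rdmaout, NULL);
    synchronize_rcu();                /* wait for in-flight readers */
    qemu_rdma_cleanup(rdmain);
    qemu_rdma_cleanup(rdmaout);
    g_free(rdmain);
    g_free(rdmaout);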
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
migration/colo.c | 2 +
migration/migration.c | 2 +
migration/postcopy-ram.c | 2 +
migration/ram.c | 4 +
migration/rdma.c | 196 ++++++++++++++++++++++++++++++++++++++++-------
migration/savevm.c | 3 +
6 files changed, 183 insertions(+), 26 deletions(-)
diff --git a/migration/colo.c b/migration/colo.c
index 4381067..88936f5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -534,6 +534,7 @@ void *colo_process_incoming_thread(void *opaque)
uint64_t value;
Error *local_err = NULL;
+ rcu_register_thread();
qemu_sem_init(&mis->colo_incoming_sem, 0);
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
@@ -666,5 +667,6 @@ out:
}
migration_incoming_exit_colo();
+ rcu_unregister_thread();
return NULL;
}
diff --git a/migration/migration.c b/migration/migration.c
index 0bdb28e..584666b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1787,6 +1787,7 @@ static void *source_return_path_thread(void *opaque)
int res;
trace_source_return_path_thread_entry();
+ rcu_register_thread();
while (!ms->rp_state.error && !qemu_file_get_error(rp) &&
migration_is_setup_or_active(ms->state)) {
trace_source_return_path_thread_loop_top();
@@ -1887,6 +1888,7 @@ static void *source_return_path_thread(void *opaque)
out:
ms->rp_state.from_dst_file = NULL;
qemu_fclose(rp);
+ rcu_unregister_thread();
return NULL;
}
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 8ceeaa2..4e05966 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -842,6 +842,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
RAMBlock *rb = NULL;
trace_postcopy_ram_fault_thread_entry();
+ rcu_register_thread();
mis->last_rb = NULL; /* last RAMBlock we sent part of */
qemu_sem_post(&mis->fault_thread_sem);
@@ -1013,6 +1014,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
}
}
}
+ rcu_unregister_thread();
trace_postcopy_ram_fault_thread_exit();
g_free(pfd);
return NULL;
diff --git a/migration/ram.c b/migration/ram.c
index 912810c..9bc92fc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -491,6 +491,7 @@ static void *multifd_send_thread(void *opaque)
{
MultiFDSendParams *p = opaque;
+ rcu_register_thread();
while (true) {
qemu_mutex_lock(&p->mutex);
if (p->quit) {
@@ -500,6 +501,7 @@ static void *multifd_send_thread(void *opaque)
qemu_mutex_unlock(&p->mutex);
qemu_sem_wait(&p->sem);
}
+ rcu_unregister_thread();
return NULL;
}
@@ -592,6 +594,7 @@ static void *multifd_recv_thread(void *opaque)
{
MultiFDRecvParams *p = opaque;
+ rcu_register_thread();
while (true) {
qemu_mutex_lock(&p->mutex);
if (p->quit) {
@@ -601,6 +604,7 @@ static void *multifd_recv_thread(void *opaque)
qemu_mutex_unlock(&p->mutex);
qemu_sem_wait(&p->sem);
}
+ rcu_unregister_thread();
return NULL;
}
diff --git a/migration/rdma.c b/migration/rdma.c
index f5c1d02..854f355 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -86,6 +86,7 @@ static uint32_t known_capabilities = RDMA_CAPABILITY_PIN_ALL;
" to abort!"); \
rdma->error_reported = 1; \
} \
+ rcu_read_unlock(); \
return rdma->error_state; \
} \
} while (0)
@@ -402,7 +403,8 @@ typedef struct QIOChannelRDMA QIOChannelRDMA;
struct QIOChannelRDMA {
QIOChannel parent;
- RDMAContext *rdma;
+ RDMAContext *rdmain;
+ RDMAContext *rdmaout;
QEMUFile *file;
bool blocking; /* XXX we don't actually honour this yet */
};
@@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
{
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
QEMUFile *f = rioc->file;
- RDMAContext *rdma = rioc->rdma;
+ RDMAContext *rdma;
int ret;
ssize_t done = 0;
size_t i;
size_t len = 0;
+ rcu_read_lock();
+ rdma = atomic_rcu_read(&rioc->rdmaout);
+
+ if (!rdma) {
+ rcu_read_unlock();
+ return -EIO;
+ }
+
CHECK_ERROR_STATE();
/*
@@ -2650,6 +2660,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
ret = qemu_rdma_write_flush(f, rdma);
if (ret < 0) {
rdma->error_state = ret;
+ rcu_read_unlock();
return ret;
}
@@ -2669,6 +2680,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
if (ret < 0) {
rdma->error_state = ret;
+ rcu_read_unlock();
return ret;
}
@@ -2677,6 +2689,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
}
}
+ rcu_read_unlock();
return done;
}
@@ -2710,12 +2723,20 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
Error **errp)
{
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
- RDMAContext *rdma = rioc->rdma;
+ RDMAContext *rdma;
RDMAControlHeader head;
int ret = 0;
ssize_t i;
size_t done = 0;
+ rcu_read_lock();
+ rdma = atomic_rcu_read(&rioc->rdmain);
+
+ if (!rdma) {
+ rcu_read_unlock();
+ return -EIO;
+ }
+
CHECK_ERROR_STATE();
for (i = 0; i < niov; i++) {
@@ -2727,7 +2748,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
* were given and dish out the bytes until we run
* out of bytes.
*/
- ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
+ ret = qemu_rdma_fill(rdma, data, want, 0);
done += ret;
want -= ret;
/* Got what we needed, so go to next iovec */
@@ -2749,25 +2770,28 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
if (ret < 0) {
rdma->error_state = ret;
+ rcu_read_unlock();
return ret;
}
/*
* SEND was received with new bytes, now try again.
*/
- ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
+ ret = qemu_rdma_fill(rdma, data, want, 0);
done += ret;
want -= ret;
/* Still didn't get enough, so lets just return */
if (want) {
if (done == 0) {
+ rcu_read_unlock();
return QIO_CHANNEL_ERR_BLOCK;
} else {
break;
}
}
}
+ rcu_read_unlock();
return done;
}
@@ -2819,15 +2843,29 @@ qio_channel_rdma_source_prepare(GSource *source,
gint *timeout)
{
QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source;
- RDMAContext *rdma = rsource->rioc->rdma;
+ RDMAContext *rdma;
GIOCondition cond = 0;
*timeout = -1;
+ rcu_read_lock();
+ if (rsource->condition == G_IO_IN) {
+ rdma = atomic_rcu_read(&rsource->rioc->rdmain);
+ } else {
+ rdma = atomic_rcu_read(&rsource->rioc->rdmaout);
+ }
+
+ if (!rdma) {
+ error_report("RDMAContext is NULL when prepare Gsource");
+ rcu_read_unlock();
+ return FALSE;
+ }
+
if (rdma->wr_data[0].control_len) {
cond |= G_IO_IN;
}
cond |= G_IO_OUT;
+ rcu_read_unlock();
return cond & rsource->condition;
}
@@ -2835,14 +2873,28 @@ static gboolean
qio_channel_rdma_source_check(GSource *source)
{
QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source;
- RDMAContext *rdma = rsource->rioc->rdma;
+ RDMAContext *rdma;
GIOCondition cond = 0;
+ rcu_read_lock();
+ if (rsource->condition == G_IO_IN) {
+ rdma = atomic_rcu_read(&rsource->rioc->rdmain);
+ } else {
+ rdma = atomic_rcu_read(&rsource->rioc->rdmaout);
+ }
+
+ if (!rdma) {
+ error_report("RDMAContext is NULL when check Gsource");
+ rcu_read_unlock();
+ return FALSE;
+ }
+
if (rdma->wr_data[0].control_len) {
cond |= G_IO_IN;
}
cond |= G_IO_OUT;
+ rcu_read_unlock();
return cond & rsource->condition;
}
@@ -2853,14 +2905,28 @@ qio_channel_rdma_source_dispatch(GSource *source,
{
QIOChannelFunc func = (QIOChannelFunc)callback;
QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source;
- RDMAContext *rdma = rsource->rioc->rdma;
+ RDMAContext *rdma;
GIOCondition cond = 0;
+ rcu_read_lock();
+ if (rsource->condition == G_IO_IN) {
+ rdma = atomic_rcu_read(&rsource->rioc->rdmain);
+ } else {
+ rdma = atomic_rcu_read(&rsource->rioc->rdmaout);
+ }
+
+ if (!rdma) {
+ error_report("RDMAContext is NULL when dispatch Gsource");
+ rcu_read_unlock();
+ return FALSE;
+ }
+
if (rdma->wr_data[0].control_len) {
cond |= G_IO_IN;
}
cond |= G_IO_OUT;
+ rcu_read_unlock();
return (*func)(QIO_CHANNEL(rsource->rioc),
(cond & rsource->condition),
user_data);
@@ -2905,15 +2971,32 @@ static int qio_channel_rdma_close(QIOChannel *ioc,
Error **errp)
{
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+ RDMAContext *rdmain, *rdmaout;
trace_qemu_rdma_close();
- if (rioc->rdma) {
- if (!rioc->rdma->error_state) {
- rioc->rdma->error_state = qemu_file_get_error(rioc->file);
- }
- qemu_rdma_cleanup(rioc->rdma);
- g_free(rioc->rdma);
- rioc->rdma = NULL;
+
+ rdmain = rioc->rdmain;
+ if (rdmain) {
+ atomic_rcu_set(&rioc->rdmain, NULL);
+ }
+
+ rdmaout = rioc->rdmaout;
+ if (rdmaout) {
+ atomic_rcu_set(&rioc->rdmaout, NULL);
}
+
+ synchronize_rcu();
+
+ if (rdmain) {
+ qemu_rdma_cleanup(rdmain);
+ }
+
+ if (rdmaout) {
+ qemu_rdma_cleanup(rdmaout);
+ }
+
+ g_free(rdmain);
+ g_free(rdmaout);
+
return 0;
}
@@ -2956,12 +3039,21 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
size_t size, uint64_t *bytes_sent)
{
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
- RDMAContext *rdma = rioc->rdma;
+ RDMAContext *rdma;
int ret;
+ rcu_read_lock();
+ rdma = atomic_rcu_read(&rioc->rdmaout);
+
+ if (!rdma) {
+ rcu_read_unlock();
+ return -EIO;
+ }
+
CHECK_ERROR_STATE();
if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+ rcu_read_unlock();
return RAM_SAVE_CONTROL_NOT_SUPP;
}
@@ -3046,9 +3138,11 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
}
}
+ rcu_read_unlock();
return RAM_SAVE_CONTROL_DELAYED;
err:
rdma->error_state = ret;
+ rcu_read_unlock();
return ret;
}
@@ -3224,8 +3318,8 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
RDMAControlHeader blocks = { .type = RDMA_CONTROL_RAM_BLOCKS_RESULT,
.repeat = 1 };
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
- RDMAContext *rdma = rioc->rdma;
- RDMALocalBlocks *local = &rdma->local_ram_blocks;
+ RDMAContext *rdma;
+ RDMALocalBlocks *local;
RDMAControlHeader head;
RDMARegister *reg, *registers;
RDMACompress *comp;
@@ -3238,8 +3332,17 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
int count = 0;
int i = 0;
+ rcu_read_lock();
+ rdma = atomic_rcu_read(&rioc->rdmain);
+
+ if (!rdma) {
+ rcu_read_unlock();
+ return -EIO;
+ }
+
CHECK_ERROR_STATE();
+ local = &rdma->local_ram_blocks;
do {
trace_qemu_rdma_registration_handle_wait();
@@ -3469,6 +3572,7 @@ out:
if (ret < 0) {
rdma->error_state = ret;
}
+ rcu_read_unlock();
return ret;
}
@@ -3482,10 +3586,18 @@ out:
static int
rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name)
{
- RDMAContext *rdma = rioc->rdma;
+ RDMAContext *rdma;
int curr;
int found = -1;
+ rcu_read_lock();
+ rdma = atomic_rcu_read(&rioc->rdmain);
+
+ if (!rdma) {
+ rcu_read_unlock();
+ return -EIO;
+ }
+
/* Find the matching RAMBlock in our local list */
for (curr = 0; curr < rdma->local_ram_blocks.nb_blocks; curr++) {
if (!strcmp(rdma->local_ram_blocks.block[curr].block_name, name)) {
@@ -3496,6 +3608,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name)
if (found == -1) {
error_report("RAMBlock '%s' not found on destination", name);
+ rcu_read_unlock();
return -ENOENT;
}
@@ -3503,6 +3616,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name)
trace_rdma_block_notification_handle(name, rdma->next_src_index);
rdma->next_src_index++;
+ rcu_read_unlock();
return 0;
}
@@ -3525,11 +3639,19 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
uint64_t flags, void *data)
{
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
- RDMAContext *rdma = rioc->rdma;
+ RDMAContext *rdma;
+
+ rcu_read_lock();
+ rdma = atomic_rcu_read(&rioc->rdmaout);
+ if (!rdma) {
+ rcu_read_unlock();
+ return -EIO;
+ }
CHECK_ERROR_STATE();
if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+ rcu_read_unlock();
return 0;
}
@@ -3537,6 +3659,7 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
qemu_fflush(f);
+ rcu_read_unlock();
return 0;
}
@@ -3549,13 +3672,21 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
{
Error *local_err = NULL, **errp = &local_err;
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque);
- RDMAContext *rdma = rioc->rdma;
+ RDMAContext *rdma;
RDMAControlHeader head = { .len = 0, .repeat = 1 };
int ret = 0;
+ rcu_read_lock();
+ rdma = atomic_rcu_read(&rioc->rdmaout);
+ if (!rdma) {
+ rcu_read_unlock();
+ return -EIO;
+ }
+
CHECK_ERROR_STATE();
if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+ rcu_read_unlock();
return 0;
}
@@ -3587,6 +3718,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
qemu_rdma_reg_whole_ram_blocks : NULL);
if (ret < 0) {
ERROR(errp, "receiving remote info!");
+ rcu_read_unlock();
return ret;
}
@@ -3610,6 +3742,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
"not identical on both the source and destination.",
local->nb_blocks, nb_dest_blocks);
rdma->error_state = -EINVAL;
+ rcu_read_unlock();
return -EINVAL;
}
@@ -3626,6 +3759,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
local->block[i].length,
rdma->dest_blocks[i].length);
rdma->error_state = -EINVAL;
+ rcu_read_unlock();
return -EINVAL;
}
local->block[i].remote_host_addr =
@@ -3643,9 +3777,11 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
goto err;
}
+ rcu_read_unlock();
return 0;
err:
rdma->error_state = ret;
+ rcu_read_unlock();
return ret;
}
@@ -3663,10 +3799,15 @@ static const QEMUFileHooks rdma_write_hooks = {
static void qio_channel_rdma_finalize(Object *obj)
{
QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(obj);
- if (rioc->rdma) {
- qemu_rdma_cleanup(rioc->rdma);
- g_free(rioc->rdma);
- rioc->rdma = NULL;
+ if (rioc->rdmain) {
+ qemu_rdma_cleanup(rioc->rdmain);
+ g_free(rioc->rdmain);
+ rioc->rdmain = NULL;
+ }
+ if (rioc->rdmaout) {
+ qemu_rdma_cleanup(rioc->rdmaout);
+ g_free(rioc->rdmaout);
+ rioc->rdmaout = NULL;
}
}
@@ -3706,13 +3847,16 @@ static QEMUFile *qemu_fopen_rdma(RDMAContext *rdma, const char *mode)
}
rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA));
- rioc->rdma = rdma;
if (mode[0] == 'w') {
rioc->file = qemu_fopen_channel_output(QIO_CHANNEL(rioc));
+ rioc->rdmaout = rdma;
+ rioc->rdmain = rdma->return_path;
qemu_file_set_hooks(rioc->file, &rdma_write_hooks);
} else {
rioc->file = qemu_fopen_channel_input(QIO_CHANNEL(rioc));
+ rioc->rdmain = rdma;
+ rioc->rdmaout = rdma->return_path;
qemu_file_set_hooks(rioc->file, &rdma_read_hooks);
}
diff --git a/migration/savevm.c b/migration/savevm.c
index e2be02a..45ec809 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1573,6 +1573,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
qemu_sem_post(&mis->listen_thread_sem);
trace_postcopy_ram_listen_thread_start();
+ rcu_register_thread();
/*
* Because we're a thread and not a coroutine we can't yield
* in qemu_file, and thus we must be blocking now.
@@ -1605,6 +1606,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
* to leave the guest running and fire MCEs for pages that never
* arrived as a desperate recovery step.
*/
+ rcu_unregister_thread();
exit(EXIT_FAILURE);
}
@@ -1619,6 +1621,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
migration_incoming_state_destroy();
qemu_loadvm_state_cleanup();
+ rcu_unregister_thread();
return NULL;
}
--
1.8.3.1
* [Qemu-devel] [PATCH v3 6/6] migration: Stop rdma yielding during incoming postcopy
2018-05-05 14:35 [Qemu-devel] [PATCH v3 0/6] Enable postcopy RDMA live migration Lidong Chen
` (4 preceding siblings ...)
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
@ 2018-05-05 14:35 ` Lidong Chen
5 siblings, 0 replies; 13+ messages in thread
From: Lidong Chen @ 2018-05-05 14:35 UTC (permalink / raw)
To: quintela, dgilbert, berrange
Cc: qemu-devel, galsha, aviadye, adido, Lidong Chen
During incoming postcopy, the destination qemu invokes
qemu_rdma_wait_comp_channel in a separate thread, so it cannot use the rdma
yield mechanism and has to poll the completion channel fd instead.
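The non-yielding branch amounts to polling the completion channel fd with a
short timeout so that a cancel or an error state is still noticed (a
simplified sketch, not the exact loop in qemu_rdma_wait_comp_channel):

    while (!rdma->error_state) {
        GPollFD pfd = {
            .fd     = rdma->comp_channel->fd,
            .events = G_IO_IN | G_IO_HUP | G_IO_ERR,
        };

        /* 100 ms timeout keeps cancellation responsive */
        if (qemu_poll_ns(&pfd, 1, 100 * 1000 * 1000) == 1) {
            return 0;   /* a completion event is available */
        }

        if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
            return -EPIPE;   /* give up instead of hanging forever */
        }
    }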
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/rdma.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/migration/rdma.c b/migration/rdma.c
index 854f355..ed9cfb1 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1490,11 +1490,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
* Coroutine doesn't start until migration_fd_process_incoming()
* so don't yield unless we know we're running inside of a coroutine.
*/
- if (rdma->migration_started_on_destination) {
+ if (rdma->migration_started_on_destination &&
+ migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {
yield_until_fd_readable(rdma->comp_channel->fd);
} else {
/* This is the source side, we're in a separate thread
* or destination prior to migration_fd_process_incoming()
+ * after postcopy, the destination also in a seprate thread.
* we can't yield; so we have to poll the fd.
* But we need to be able to handle 'cancel' or an error
* without hanging forever.
--
1.8.3.1
* Re: [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
@ 2018-05-08 14:19 ` Dr. David Alan Gilbert
2018-05-09 1:28 ` 858585 jemmy
0 siblings, 1 reply; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2018-05-08 14:19 UTC (permalink / raw)
To: Lidong Chen
Cc: quintela, berrange, qemu-devel, galsha, aviadye, adido,
Lidong Chen
* Lidong Chen (jemmy858585@gmail.com) wrote:
> Because qio_channel_rdma_writev and qio_channel_rdma_readv maybe invoked
> by different threads concurrently, this patch removes unnecessary variables
> len in QIOChannelRDMA and use local variable instead.
>
> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Reviewed-by: Daniel P. Berrangéberrange@redhat.com>
Note there's a ' <' missing somehow; minor fix up during commit
hopefully.
Dave
> ---
> migration/rdma.c | 15 +++++++--------
> 1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/migration/rdma.c b/migration/rdma.c
> index c745427..f5c1d02 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -404,7 +404,6 @@ struct QIOChannelRDMA {
> QIOChannel parent;
> RDMAContext *rdma;
> QEMUFile *file;
> - size_t len;
> bool blocking; /* XXX we don't actually honour this yet */
> };
>
> @@ -2640,6 +2639,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
> int ret;
> ssize_t done = 0;
> size_t i;
> + size_t len = 0;
>
> CHECK_ERROR_STATE();
>
> @@ -2659,10 +2659,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
> while (remaining) {
> RDMAControlHeader head;
>
> - rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
> - remaining -= rioc->len;
> + len = MIN(remaining, RDMA_SEND_INCREMENT);
> + remaining -= len;
>
> - head.len = rioc->len;
> + head.len = len;
> head.type = RDMA_CONTROL_QEMU_FILE;
>
> ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
> @@ -2672,8 +2672,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
> return ret;
> }
>
> - data += rioc->len;
> - done += rioc->len;
> + data += len;
> + done += len;
> }
> }
>
> @@ -2768,8 +2768,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
> }
> }
> }
> - rioc->len = done;
> - return rioc->len;
> + return done;
> }
>
> /*
> --
> 1.8.3.1
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [PATCH v3 3/6] migration: remove unnecessary variables len in QIOChannelRDMA
2018-05-08 14:19 ` Dr. David Alan Gilbert
@ 2018-05-09 1:28 ` 858585 jemmy
0 siblings, 0 replies; 13+ messages in thread
From: 858585 jemmy @ 2018-05-09 1:28 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Juan Quintela, Daniel P. Berrange, qemu-devel, Gal Shachaf,
Aviad Yehezkel, adido, Lidong Chen
On Tue, May 8, 2018 at 10:19 PM, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * Lidong Chen (jemmy858585@gmail.com) wrote:
>> Because qio_channel_rdma_writev and qio_channel_rdma_readv maybe invoked
>> by different threads concurrently, this patch removes unnecessary variables
>> len in QIOChannelRDMA and use local variable instead.
>>
>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Reviewed-by: Daniel P. Berrangéberrange@redhat.com>
>
> Note there's a ' <' missing somehow; minor fix up during commit
> hopefully.
>
> Dave
Sorry for this mistake, I will check more carefully.
>
>> ---
>> migration/rdma.c | 15 +++++++--------
>> 1 file changed, 7 insertions(+), 8 deletions(-)
>>
>> diff --git a/migration/rdma.c b/migration/rdma.c
>> index c745427..f5c1d02 100644
>> --- a/migration/rdma.c
>> +++ b/migration/rdma.c
>> @@ -404,7 +404,6 @@ struct QIOChannelRDMA {
>> QIOChannel parent;
>> RDMAContext *rdma;
>> QEMUFile *file;
>> - size_t len;
>> bool blocking; /* XXX we don't actually honour this yet */
>> };
>>
>> @@ -2640,6 +2639,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>> int ret;
>> ssize_t done = 0;
>> size_t i;
>> + size_t len = 0;
>>
>> CHECK_ERROR_STATE();
>>
>> @@ -2659,10 +2659,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>> while (remaining) {
>> RDMAControlHeader head;
>>
>> - rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
>> - remaining -= rioc->len;
>> + len = MIN(remaining, RDMA_SEND_INCREMENT);
>> + remaining -= len;
>>
>> - head.len = rioc->len;
>> + head.len = len;
>> head.type = RDMA_CONTROL_QEMU_FILE;
>>
>> ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
>> @@ -2672,8 +2672,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>> return ret;
>> }
>>
>> - data += rioc->len;
>> - done += rioc->len;
>> + data += len;
>> + done += len;
>> }
>> }
>>
>> @@ -2768,8 +2768,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
>> }
>> }
>> }
>> - rioc->len = done;
>> - return rioc->len;
>> + return done;
>> }
>>
>> /*
>> --
>> 1.8.3.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
2018-05-05 14:35 ` [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel Lidong Chen
@ 2018-05-15 14:54 ` Paolo Bonzini
2018-05-16 9:36 ` 858585 jemmy
0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2018-05-15 14:54 UTC (permalink / raw)
To: Lidong Chen, quintela, dgilbert, berrange
Cc: adido, galsha, aviadye, qemu-devel, Lidong Chen
On 05/05/2018 16:35, Lidong Chen wrote:
> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
> {
> QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
> QEMUFile *f = rioc->file;
> - RDMAContext *rdma = rioc->rdma;
> + RDMAContext *rdma;
> int ret;
> ssize_t done = 0;
> size_t i;
> size_t len = 0;
>
> + rcu_read_lock();
> + rdma = atomic_rcu_read(&rioc->rdmaout);
> +
> + if (!rdma) {
> + rcu_read_unlock();
> + return -EIO;
> + }
> +
> CHECK_ERROR_STATE();
>
> /*
I am not sure I understand this. It would probably be wrong to use the
output side from two threads at the same time, so why not use two mutexes?
Also, who is calling qio_channel_rdma_close in such a way that another
thread is still using it? Would it be possible to synchronize with the
other thread *before*, for example with qemu_thread_join?
Thanks,
Paolo
* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
2018-05-15 14:54 ` Paolo Bonzini
@ 2018-05-16 9:36 ` 858585 jemmy
2018-05-21 11:49 ` 858585 jemmy
0 siblings, 1 reply; 13+ messages in thread
From: 858585 jemmy @ 2018-05-16 9:36 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Juan Quintela, Dave Gilbert, Daniel P. Berrange, adido,
Gal Shachaf, Aviad Yehezkel, qemu-devel, Lidong Chen
On Tue, May 15, 2018 at 10:54 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 05/05/2018 16:35, Lidong Chen wrote:
>> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>> {
>> QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>> QEMUFile *f = rioc->file;
>> - RDMAContext *rdma = rioc->rdma;
>> + RDMAContext *rdma;
>> int ret;
>> ssize_t done = 0;
>> size_t i;
>> size_t len = 0;
>>
>> + rcu_read_lock();
>> + rdma = atomic_rcu_read(&rioc->rdmaout);
>> +
>> + if (!rdma) {
>> + rcu_read_unlock();
>> + return -EIO;
>> + }
>> +
>> CHECK_ERROR_STATE();
>>
>> /*
>
> I am not sure I understand this. It would probably be wrong to use the
> output side from two threads at the same time, so why not use two mutexes?
Two threads will not invoke qio_channel_rdma_writev at the same time.
On the source qemu, the migration thread only uses writev and the return path
thread only uses readv.
The destination qemu already has a mutex, mis->rp_mutex, to make sure writev
is not used concurrently.
The rcu_read_lock is used to protect against using the RDMAContext while
another thread closes it.
>
> Also, who is calling qio_channel_rdma_close in such a way that another
> thread is still using it? Would it be possible to synchronize with the
> other thread *before*, for example with qemu_thread_join?
The MigrationState structure includes the to_dst_file and from_dst_file
QEMUFiles, and the two QEMUFiles use the same QIOChannel.
For example, if the return path thread calls
qemu_fclose(ms->rp_state.from_dst_file),
it will also close the RDMAContext for ms->to_dst_file.
For live migration, the source qemu invokes qemu_fclose from different
threads, including the main thread, the migration thread and the return path
thread.
The destination qemu invokes qemu_fclose from the main thread, the listen
thread and the COLO incoming thread.
I have not found an effective way to synchronize these threads.
Thanks.
>
> Thanks,
>
> Paolo
* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
2018-05-16 9:36 ` 858585 jemmy
@ 2018-05-21 11:49 ` 858585 jemmy
2018-05-23 2:36 ` 858585 jemmy
0 siblings, 1 reply; 13+ messages in thread
From: 858585 jemmy @ 2018-05-21 11:49 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P. Berrange
Cc: Juan Quintela, Dave Gilbert, Adi Dotan, Gal Shachaf,
Aviad Yehezkel, qemu-devel, Lidong Chen
On Wed, May 16, 2018 at 5:36 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
> On Tue, May 15, 2018 at 10:54 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 05/05/2018 16:35, Lidong Chen wrote:
>>> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>> {
>>> QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>> QEMUFile *f = rioc->file;
>>> - RDMAContext *rdma = rioc->rdma;
>>> + RDMAContext *rdma;
>>> int ret;
>>> ssize_t done = 0;
>>> size_t i;
>>> size_t len = 0;
>>>
>>> + rcu_read_lock();
>>> + rdma = atomic_rcu_read(&rioc->rdmaout);
>>> +
>>> + if (!rdma) {
>>> + rcu_read_unlock();
>>> + return -EIO;
>>> + }
>>> +
>>> CHECK_ERROR_STATE();
>>>
>>> /*
>>
>> I am not sure I understand this. It would probably be wrong to use the
>> output side from two threads at the same time, so why not use two mutexes?
>
> Two thread will not invoke qio_channel_rdma_writev at the same time.
> The source qemu, migration thread only use writev, and the return path
> thread only
> use readv.
> The destination qemu already have a mutex mis->rp_mutex to make sure
> not use writev
> at the same time.
>
> The rcu_read_lock is used to protect not use RDMAContext when another
> thread closes it.
Any suggestion?
>
>>
>> Also, who is calling qio_channel_rdma_close in such a way that another
>> thread is still using it? Would it be possible to synchronize with the
>> other thread *before*, for example with qemu_thread_join?
>
> The MigrationState structure includes to_dst_file and from_dst_file
> QEMUFile, the two QEMUFile use the same QIOChannel.
> For example, if the return path thread call
> qemu_fclose(ms->rp_state.from_dst_file),
> It will also close the RDMAContext for ms->to_dst_file.
>
> For live migration, the source qemu invokes qemu_fclose in different
> threads include main thread, migration thread, return path thread.
>
> The destination qemu invokes qemu_fclose in main thread, listen thread and
> COLO incoming thread.
>
> I do not find an effective way to synchronize these threads.
>
> Thanks.
>
>>
>> Thanks,
>>
>> Paolo
* Re: [Qemu-devel] [PATCH v3 5/6] migration: implement bi-directional RDMA QIOChannel
2018-05-21 11:49 ` 858585 jemmy
@ 2018-05-23 2:36 ` 858585 jemmy
0 siblings, 0 replies; 13+ messages in thread
From: 858585 jemmy @ 2018-05-23 2:36 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P. Berrange
Cc: Juan Quintela, Dave Gilbert, Adi Dotan, Gal Shachaf,
Aviad Yehezkel, qemu-devel, Lidong Chen
ping.
On Mon, May 21, 2018 at 7:49 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
> On Wed, May 16, 2018 at 5:36 PM, 858585 jemmy <jemmy858585@gmail.com> wrote:
>> On Tue, May 15, 2018 at 10:54 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 05/05/2018 16:35, Lidong Chen wrote:
>>>> @@ -2635,12 +2637,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>>> {
>>>> QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>>> QEMUFile *f = rioc->file;
>>>> - RDMAContext *rdma = rioc->rdma;
>>>> + RDMAContext *rdma;
>>>> int ret;
>>>> ssize_t done = 0;
>>>> size_t i;
>>>> size_t len = 0;
>>>>
>>>> + rcu_read_lock();
>>>> + rdma = atomic_rcu_read(&rioc->rdmaout);
>>>> +
>>>> + if (!rdma) {
>>>> + rcu_read_unlock();
>>>> + return -EIO;
>>>> + }
>>>> +
>>>> CHECK_ERROR_STATE();
>>>>
>>>> /*
>>>
>>> I am not sure I understand this. It would probably be wrong to use the
>>> output side from two threads at the same time, so why not use two mutexes?
>>
>> Two thread will not invoke qio_channel_rdma_writev at the same time.
>> The source qemu, migration thread only use writev, and the return path
>> thread only
>> use readv.
>> The destination qemu already have a mutex mis->rp_mutex to make sure
>> not use writev
>> at the same time.
>>
>> The rcu_read_lock is used to protect not use RDMAContext when another
>> thread closes it.
>
> Any suggestion?
>
>>
>>>
>>> Also, who is calling qio_channel_rdma_close in such a way that another
>>> thread is still using it? Would it be possible to synchronize with the
>>> other thread *before*, for example with qemu_thread_join?
>>
>> The MigrationState structure includes to_dst_file and from_dst_file
>> QEMUFile, the two QEMUFile use the same QIOChannel.
>> For example, if the return path thread call
>> qemu_fclose(ms->rp_state.from_dst_file),
>> It will also close the RDMAContext for ms->to_dst_file.
>>
>> For live migration, the source qemu invokes qemu_fclose in different
>> threads include main thread, migration thread, return path thread.
>>
>> The destination qemu invokes qemu_fclose in main thread, listen thread and
>> COLO incoming thread.
>>
>> I do not find an effective way to synchronize these threads.
>>
>> Thanks.
>>
>>>
>>> Thanks,
>>>
>>> Paolo