* [Qemu-devel] [RFC 00/13] Multiple fd migration support
@ 2016-04-20 14:44 Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 01/13] migration: create Migration Incoming State at init time Juan Quintela
` (14 more replies)
0 siblings, 15 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
Hi
This patch series is "an" initial implementation of multiple fd migration.
This is to get something out for others to comment on; it is not at all finished.
So far:
- we create threads for each new fd
- only for TCP, of course; the rest of the transports are out of luck.
I need to integrate this with Daniel's channel changes.
- I *think* the locking is right; at least I don't get random
lockups any more (and yes, it was not trivial). And yes, I think the
compression code locking is not completely correct. I think it
would be much, much better to do the compression code on top of this
(it will avoid a lot of copies), but I need to finish this first.
- In the last patch, I add a BIG hack to try to find out what the real
bandwidth is.
Preliminary testing so far:
- quite good: the latency is much better. I think I found the cause
of the random high latencies, but more testing is needed.
- under load, I think our bandwidth calculations are *not* completely
correct (that is how to spell it for a family audience).
ToDo list:
- bandwidth calculation: I am going to send another mail
with my ToDo list for migration, see there.
- stats: we need better stats, per thread, etc.
- synchronize less often with the worker threads.
right now we synchronize for each page; there are two obvious optimizations
(see the sketch after this list):
* send a list of pages each time we wake up an fd
* if we have to send a HUGE page, don't split it: just send the whole page
in one send() and read it with a single recv() on the destination.
My understanding is that this would make Transparent Huge Pages trivial.
- measure things under bigger loads
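A rough sketch of the first optimization, to make it concrete: the per-thread
state grows a page list, and the migration thread wakes a worker once per batch
instead of once per page. MULTIFD_BATCH_SIZE, MultiFDSendParamsBatched,
pages[] and num_pages are invented names for illustration, not part of this
series; overflow and flush handling are elided.

    #define MULTIFD_BATCH_SIZE 64

    struct MultiFDSendParamsBatched {
        QemuThread thread;
        QemuCond cond;
        QemuMutex mutex;
        bool quit;
        int s;
        /* protected by mutex: pages queued by the migration thread */
        uint8_t *pages[MULTIFD_BATCH_SIZE];
        int num_pages;
    };

    /* migration thread side: queue a page, wake the worker once per batch */
    static void multifd_queue_page(struct MultiFDSendParamsBatched *p,
                                   uint8_t *address)
    {
        qemu_mutex_lock(&p->mutex);
        p->pages[p->num_pages++] = address;
        if (p->num_pages == MULTIFD_BATCH_SIZE) {
            qemu_cond_signal(&p->cond); /* one wakeup per 64 pages */
        }
        qemu_mutex_unlock(&p->mutex);
    }

The huge page case is the same idea taken to the limit: a single
(address, length) pair covering the whole huge page, one send() and one
recv() for all of it.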
Comments, please?
Later, Juan.
Juan Quintela (13):
migration: create Migration Incoming State at init time
migration: Pass TCP args in a struct
migration: [HACK] Don't create decompression threads if not enabled
migration: Add multifd capability
migration: Create x-multifd-threads parameter
migration: create multifd migration threads
migration: Start of multiple fd work
migration: create ram_multifd_page
migration: Create thread infrastructure for multifd send side
migration: Send the fd number which we are going to use for this page
migration: Create thread infrastructure for multifd recv side
migration: Test new fd infrastructure
migration: [HACK] Transfer pages over new channels
hmp.c | 10 ++
include/migration/migration.h | 13 ++
migration/migration.c | 100 ++++++++----
migration/ram.c | 350 +++++++++++++++++++++++++++++++++++++++++-
migration/savevm.c | 3 +-
migration/tcp.c | 76 ++++++++-
qapi-schema.json | 29 +++-
7 files changed, 540 insertions(+), 41 deletions(-)
--
2.5.5
* [Qemu-devel] [PATCH 01/13] migration: create Migration Incoming State at init time
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-22 11:27 ` Dr. David Alan Gilbert
2016-04-20 14:44 ` [Qemu-devel] [PATCH 02/13] migration: Pass TCP args in a struct Juan Quintela
` (13 subsequent siblings)
14 siblings, 1 reply; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
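The visible API change, condensed from the hunks below: callers stop creating
(and freeing) the incoming state and just fetch a statically allocated
singleton.

    /* before: every entry point created it, and somebody had to free it */
    mis = migration_incoming_state_new(f);

    /* after: fetch the singleton and fill in what you need */
    MigrationIncomingState *mis = migration_incoming_get_current();
    mis->from_src_file = f;

Note that inside the accessor the memset has to run before any field
assignment, or it silently wipes them; the hunk below does it in that order.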
migration/migration.c | 38 +++++++++++++++++---------------------
migration/savevm.c | 3 ++-
2 files changed, 19 insertions(+), 22 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 991313a..314c5c0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -100,32 +100,28 @@ MigrationState *migrate_get_current(void)
return &current_migration;
}
-/* For incoming */
-static MigrationIncomingState *mis_current;
-
MigrationIncomingState *migration_incoming_get_current(void)
{
- return mis_current;
-}
+ static bool once;
+ static MigrationIncomingState mis_current;
-MigrationIncomingState *migration_incoming_state_new(QEMUFile* f)
-{
- mis_current = g_new0(MigrationIncomingState, 1);
- mis_current->from_src_file = f;
- mis_current->state = MIGRATION_STATUS_NONE;
- QLIST_INIT(&mis_current->loadvm_handlers);
- qemu_mutex_init(&mis_current->rp_mutex);
- qemu_event_init(&mis_current->main_thread_load_event, false);
-
- return mis_current;
+ if (!once) {
+ memset(&mis_current, 0, sizeof(MigrationIncomingState));
+ mis_current.state = MIGRATION_STATUS_NONE;
+ QLIST_INIT(&mis_current.loadvm_handlers);
+ qemu_mutex_init(&mis_current.rp_mutex);
+ qemu_event_init(&mis_current.main_thread_load_event, false);
+ once = true;
+ }
+ return &mis_current;
}
void migration_incoming_state_destroy(void)
{
- qemu_event_destroy(&mis_current->main_thread_load_event);
- loadvm_free_handlers(mis_current);
- g_free(mis_current);
- mis_current = NULL;
+ struct MigrationIncomingState *mis = migration_incoming_get_current();
+
+ qemu_event_destroy(&mis->main_thread_load_event);
+ loadvm_free_handlers(mis);
}
@@ -373,11 +369,11 @@ static void process_incoming_migration_bh(void *opaque)
static void process_incoming_migration_co(void *opaque)
{
QEMUFile *f = opaque;
- MigrationIncomingState *mis;
+ MigrationIncomingState *mis = migration_incoming_get_current();
PostcopyState ps;
int ret;
- mis = migration_incoming_state_new(f);
+ mis->from_src_file = f;
postcopy_state_set(POSTCOPY_INCOMING_NONE);
migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
MIGRATION_STATUS_ACTIVE);
diff --git a/migration/savevm.c b/migration/savevm.c
index 16ba443..49137a1 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2091,6 +2091,7 @@ int load_vmstate(const char *name)
QEMUFile *f;
int ret;
AioContext *aio_context;
+ MigrationIncomingState *mis = migration_incoming_get_current();
if (!bdrv_all_can_snapshot(&bs)) {
error_report("Device '%s' is writable but does not support snapshots.",
@@ -2141,7 +2142,7 @@ int load_vmstate(const char *name)
}
qemu_system_reset(VMRESET_SILENT);
- migration_incoming_state_new(f);
+ mis->from_src_file = f;
aio_context_acquire(aio_context);
ret = qemu_loadvm_state(f);
--
2.5.5
* [Qemu-devel] [PATCH 02/13] migration: Pass TCP args in a struct
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 01/13] migration: create Migration Incoming State at init time Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 03/13] migration: [HACK] Don't create decompression threads if not enabled Juan Quintela
` (12 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
Both for incoming and outgoing migration.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
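The shape of the change, reduced to its essentials: the caller heap-allocates
the argument block, and the one-shot callback owns and frees it. Today the
struct only boxes one pointer, but later patches in this series grow it.

    struct OutgoingArgs *args = g_new0(struct OutgoingArgs, 1);
    args->s = s;
    inet_nonblocking_connect(host_port, tcp_wait_for_connect, args, errp);

    static void tcp_wait_for_connect(int fd, Error *err, void *opaque)
    {
        struct OutgoingArgs *args = opaque;
        MigrationState *s = args->s;

        g_free(args); /* the callback fires exactly once, so it frees */
        /* ... */
    }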
migration/tcp.c | 29 ++++++++++++++++++++++-------
1 file changed, 22 insertions(+), 7 deletions(-)
diff --git a/migration/tcp.c b/migration/tcp.c
index e1fa7f8..5d42c96 100644
--- a/migration/tcp.c
+++ b/migration/tcp.c
@@ -33,10 +33,16 @@
do { } while (0)
#endif
+struct OutgoingArgs {
+ MigrationState *s;
+};
+
static void tcp_wait_for_connect(int fd, Error *err, void *opaque)
{
- MigrationState *s = opaque;
+ struct OutgoingArgs *args = opaque;
+ MigrationState *s = args->s;
+ g_free(args);
if (fd < 0) {
DPRINTF("migrate connect error: %s\n", error_get_pretty(err));
s->to_dst_file = NULL;
@@ -50,17 +56,26 @@ static void tcp_wait_for_connect(int fd, Error *err, void *opaque)
void tcp_start_outgoing_migration(MigrationState *s, const char *host_port, Error **errp)
{
- inet_nonblocking_connect(host_port, tcp_wait_for_connect, s, errp);
+ struct OutgoingArgs *args = g_new0(struct OutgoingArgs, 1);
+
+ args->s = s;
+ inet_nonblocking_connect(host_port, tcp_wait_for_connect, args, errp);
}
+struct IncomingArgs {
+ int s;
+};
+
static void tcp_accept_incoming_migration(void *opaque)
{
+ struct IncomingArgs *args = opaque;
struct sockaddr_in addr;
socklen_t addrlen = sizeof(addr);
- int s = (intptr_t)opaque;
+ int s = args->s;
QEMUFile *f;
int c;
+ g_free(args);
do {
c = qemu_accept(s, (struct sockaddr *)&addr, &addrlen);
} while (c < 0 && errno == EINTR);
@@ -80,7 +95,6 @@ static void tcp_accept_incoming_migration(void *opaque)
error_report("could not qemu_fopen socket");
goto out;
}
-
process_incoming_migration(f);
return;
@@ -90,13 +104,14 @@ out:
void tcp_start_incoming_migration(const char *host_port, Error **errp)
{
+ struct IncomingArgs *args = g_new0(struct IncomingArgs, 1);
int s;
s = inet_listen(host_port, NULL, 256, SOCK_STREAM, 0, errp);
if (s < 0) {
+ g_free(args);
return;
}
-
- qemu_set_fd_handler(s, tcp_accept_incoming_migration, NULL,
- (void *)(intptr_t)s);
+ args->s = s;
+ qemu_set_fd_handler(s, tcp_accept_incoming_migration, NULL, args);
}
--
2.5.5
* [Qemu-devel] [PATCH 03/13] migration: [HACK] Don't create decompression threads if not enabled
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 01/13] migration: create Migration Incoming State at init time Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 02/13] migration: Pass TCP args in a struct Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 04/13] migration: Add multifd capability Juan Quintela
` (11 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
This is a partial fix; we also need to reject reception of compressed
pages when compression is not enabled.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/ram.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/migration/ram.c b/migration/ram.c
index 3f05738..648362c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2215,6 +2215,9 @@ void migrate_decompress_threads_create(void)
{
int i, thread_count;
+ if (!migrate_use_compression()) {
+ return;
+ }
thread_count = migrate_decompress_threads();
decompress_threads = g_new0(QemuThread, thread_count);
decomp_param = g_new0(DecompressParam, thread_count);
@@ -2233,6 +2236,9 @@ void migrate_decompress_threads_join(void)
{
int i, thread_count;
+ if (!migrate_use_compression()) {
+ return;
+ }
quit_decomp_thread = true;
thread_count = migrate_decompress_threads();
for (i = 0; i < thread_count; i++) {
--
2.5.5
* [Qemu-devel] [PATCH 04/13] migration: Add multifd capability
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (2 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 03/13] migration: [HACK] Don't create decompression threads if not enabled Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 05/13] migration: Create x-multifd-threads parameter Juan Quintela
` (10 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
include/migration/migration.h | 1 +
migration/migration.c | 9 +++++++++
qapi-schema.json | 5 +++--
3 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/include/migration/migration.h b/include/migration/migration.h
index ac2c12c..a626b7d 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -269,6 +269,7 @@ bool migrate_postcopy_ram(void);
bool migrate_zero_blocks(void);
bool migrate_auto_converge(void);
+bool migrate_multifd(void);
int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
uint8_t *dst, int dlen);
diff --git a/migration/migration.c b/migration/migration.c
index 314c5c0..92e6dc4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1187,6 +1187,15 @@ bool migrate_use_events(void)
return s->enabled_capabilities[MIGRATION_CAPABILITY_EVENTS];
}
+bool migrate_multifd(void)
+{
+ MigrationState *s;
+
+ s = migrate_get_current();
+
+ return s->enabled_capabilities[MIGRATION_CAPABILITY_X_MULTIFD];
+}
+
int migrate_use_xbzrle(void)
{
MigrationState *s;
diff --git a/qapi-schema.json b/qapi-schema.json
index 54634c4..9fdf902 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -544,12 +544,13 @@
# been migrated, pulling the remaining pages along as needed. NOTE: If
# the migration fails during postcopy the VM will fail. (since 2.6)
#
+# @x-multifd: Use more than one fd for migration
+#
# Since: 1.2
##
{ 'enum': 'MigrationCapability',
'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
- 'compress', 'events', 'postcopy-ram'] }
-
+ 'compress', 'events', 'postcopy-ram', 'x-multifd'] }
##
# @MigrationCapabilityStatus
#
--
2.5.5
* [Qemu-devel] [PATCH 05/13] migration: Create x-multifd-threads parameter
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (3 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 04/13] migration: Add multifd capability Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-22 11:37 ` Dr. David Alan Gilbert
2016-04-20 14:44 ` [Qemu-devel] [PATCH 06/13] migration: create multifd migration threads Juan Quintela
` (9 subsequent siblings)
14 siblings, 1 reply; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
Indicates the number of threads that we will create. By default we
create 2 threads.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
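For manual testing, the new knobs would be driven from the monitor roughly
like this (a sketch: the destination host and port are invented, and the
x-multifd capability comes from the previous patch):

    (qemu) migrate_set_capability x-multifd on
    (qemu) migrate_set_parameter x-multifd-threads 4
    (qemu) migrate -d tcp:destination.example.com:4444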
hmp.c | 8 ++++++++
include/migration/migration.h | 2 ++
migration/migration.c | 30 +++++++++++++++++++++++++++++-
qapi-schema.json | 19 ++++++++++++++++---
4 files changed, 55 insertions(+), 4 deletions(-)
diff --git a/hmp.c b/hmp.c
index d510236..2a40f1f 100644
--- a/hmp.c
+++ b/hmp.c
@@ -286,6 +286,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
monitor_printf(mon, " %s: %" PRId64,
MigrationParameter_lookup[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT],
params->x_cpu_throttle_increment);
+ monitor_printf(mon, " %s: %" PRId64,
+ MigrationParameter_lookup[MIGRATION_PARAMETER_X_MULTIFD_THREADS],
+ params->x_multifd_threads);
monitor_printf(mon, "\n");
}
@@ -1242,6 +1245,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
bool has_decompress_threads = false;
bool has_x_cpu_throttle_initial = false;
bool has_x_cpu_throttle_increment = false;
+ bool has_x_multifd_threads = false;
int i;
for (i = 0; i < MIGRATION_PARAMETER__MAX; i++) {
@@ -1262,12 +1266,16 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
case MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT:
has_x_cpu_throttle_increment = true;
break;
+ case MIGRATION_PARAMETER_X_MULTIFD_THREADS:
+ has_x_multifd_threads = true;
+ break;
}
qmp_migrate_set_parameters(has_compress_level, value,
has_compress_threads, value,
has_decompress_threads, value,
has_x_cpu_throttle_initial, value,
has_x_cpu_throttle_increment, value,
+ has_x_multifd_threads, value,
&err);
break;
}
diff --git a/include/migration/migration.h b/include/migration/migration.h
index a626b7d..19d535d 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -219,6 +219,8 @@ bool migration_in_postcopy(MigrationState *);
bool migration_in_postcopy_after_devices(MigrationState *);
MigrationState *migrate_get_current(void);
+int migrate_multifd_threads(void);
+
void migrate_compress_threads_create(void);
void migrate_compress_threads_join(void);
void migrate_decompress_threads_create(void);
diff --git a/migration/migration.c b/migration/migration.c
index 92e6dc4..29e43ff 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -56,6 +56,8 @@
/* Migration XBZRLE default cache size */
#define DEFAULT_MIGRATE_CACHE_SIZE (64 * 1024 * 1024)
+#define DEFAULT_MIGRATE_MULTIFD_THREADS 2
+
static NotifierList migration_state_notifiers =
NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
@@ -91,6 +93,8 @@ MigrationState *migrate_get_current(void)
DEFAULT_MIGRATE_X_CPU_THROTTLE_INITIAL,
.parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT] =
DEFAULT_MIGRATE_X_CPU_THROTTLE_INCREMENT,
+ .parameters[MIGRATION_PARAMETER_X_MULTIFD_THREADS] =
+ DEFAULT_MIGRATE_MULTIFD_THREADS,
};
if (!once) {
@@ -521,6 +525,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INITIAL];
params->x_cpu_throttle_increment =
s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT];
+ params->x_multifd_threads =
+ s->parameters[MIGRATION_PARAMETER_X_MULTIFD_THREADS];
return params;
}
@@ -717,7 +723,10 @@ void qmp_migrate_set_parameters(bool has_compress_level,
bool has_x_cpu_throttle_initial,
int64_t x_cpu_throttle_initial,
bool has_x_cpu_throttle_increment,
- int64_t x_cpu_throttle_increment, Error **errp)
+ int64_t x_cpu_throttle_increment,
+ bool has_multifd_threads,
+ int64_t multifd_threads,
+ Error **errp)
{
MigrationState *s = migrate_get_current();
@@ -753,6 +762,13 @@ void qmp_migrate_set_parameters(bool has_compress_level,
"an integer in the range of 1 to 99");
}
+ if (has_multifd_threads &&
+ (multifd_threads < 1 || multifd_threads > 255)) {
+ error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+ "multifd_threads",
+ "is invalid, it should be in the range of 1 to 255");
+ return;
+ }
if (has_compress_level) {
s->parameters[MIGRATION_PARAMETER_COMPRESS_LEVEL] = compress_level;
}
@@ -772,6 +788,9 @@ void qmp_migrate_set_parameters(bool has_compress_level,
s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT] =
x_cpu_throttle_increment;
}
+ if (has_multifd_threads) {
+ s->parameters[MIGRATION_PARAMETER_X_MULTIFD_THREADS] = multifd_threads;
+ }
}
void qmp_migrate_start_postcopy(Error **errp)
@@ -1196,6 +1215,15 @@ bool migrate_multifd(void)
return s->enabled_capabilities[MIGRATION_CAPABILITY_X_MULTIFD];
}
+int migrate_multifd_threads(void)
+{
+ MigrationState *s;
+
+ s = migrate_get_current();
+
+ return s->parameters[MIGRATION_PARAMETER_X_MULTIFD_THREADS];
+}
+
int migrate_use_xbzrle(void)
{
MigrationState *s;
diff --git a/qapi-schema.json b/qapi-schema.json
index 9fdf902..6ff9ac6 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -613,11 +613,16 @@
# @x-cpu-throttle-increment: throttle percentage increase each time
# auto-converge detects that migration is not making
# progress. The default value is 10. (Since 2.5)
+#
+# @x-multifd-threads: Number of threads used to migrate data in parallel.
+# The default value is 2. (Since 2.6)
+#
# Since: 2.4
##
{ 'enum': 'MigrationParameter',
'data': ['compress-level', 'compress-threads', 'decompress-threads',
- 'x-cpu-throttle-initial', 'x-cpu-throttle-increment'] }
+ 'x-cpu-throttle-initial', 'x-cpu-throttle-increment',
+ 'x-multifd-threads'] }
#
# @migrate-set-parameters
@@ -637,6 +642,10 @@
# @x-cpu-throttle-increment: throttle percentage increase each time
# auto-converge detects that migration is not making
# progress. The default value is 10. (Since 2.5)
+#
+# @x-multifd-threads: Number of threads used to migrate data in parallel.
+# The default value is 2. (Since 2.6)
+#
# Since: 2.4
##
{ 'command': 'migrate-set-parameters',
@@ -644,7 +653,8 @@
'*compress-threads': 'int',
'*decompress-threads': 'int',
'*x-cpu-throttle-initial': 'int',
- '*x-cpu-throttle-increment': 'int'} }
+ '*x-cpu-throttle-increment': 'int',
+ '*x-multifd-threads': 'int'} }
#
# @MigrationParameters
@@ -663,6 +673,8 @@
# auto-converge detects that migration is not making
# progress. The default value is 10. (Since 2.5)
#
+# @x-multifd-threads: Number of threads used to migrate data in parallel.
+# The default value is 2. (Since 2.6)
# Since: 2.4
##
{ 'struct': 'MigrationParameters',
@@ -670,7 +682,8 @@
'compress-threads': 'int',
'decompress-threads': 'int',
'x-cpu-throttle-initial': 'int',
- 'x-cpu-throttle-increment': 'int'} }
+ 'x-cpu-throttle-increment': 'int',
+ 'x-multifd-threads': 'int'} }
##
# @query-migrate-parameters
#
--
2.5.5
* [Qemu-devel] [PATCH 06/13] migration: create multifd migration threads
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (4 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 05/13] migration: Create x-multifd-threads parameter Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 07/13] migration: Start of multiple fd work Juan Quintela
` (8 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
Creation of the threads, nothing inside yet.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
include/migration/migration.h | 4 ++
migration/migration.c | 6 ++
migration/ram.c | 148 ++++++++++++++++++++++++++++++++++++++++++
3 files changed, 158 insertions(+)
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 19d535d..9f94c75 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -220,6 +220,10 @@ bool migration_in_postcopy_after_devices(MigrationState *);
MigrationState *migrate_get_current(void);
int migrate_multifd_threads(void);
+void migrate_multifd_send_threads_create(void);
+void migrate_multifd_send_threads_join(void);
+void migrate_multifd_recv_threads_create(void);
+void migrate_multifd_recv_threads_join(void);
void migrate_compress_threads_create(void);
void migrate_compress_threads_join(void);
diff --git a/migration/migration.c b/migration/migration.c
index 29e43ff..b3ad36b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -335,6 +335,7 @@ static void process_incoming_migration_bh(void *opaque)
MIGRATION_STATUS_FAILED);
error_report_err(local_err);
migrate_decompress_threads_join();
+ migrate_multifd_recv_threads_join();
exit(EXIT_FAILURE);
}
@@ -359,6 +360,7 @@ static void process_incoming_migration_bh(void *opaque)
runstate_set(global_state_get_runstate());
}
migrate_decompress_threads_join();
+ migrate_multifd_recv_threads_join();
/*
* This must happen after any state changes since as soon as an external
* observer sees this event they might start to prod at the VM assuming
@@ -412,6 +414,7 @@ static void process_incoming_migration_co(void *opaque)
MIGRATION_STATUS_FAILED);
error_report("load of migration failed: %s", strerror(-ret));
migrate_decompress_threads_join();
+ migrate_multifd_recv_threads_join();
exit(EXIT_FAILURE);
}
@@ -426,6 +429,7 @@ void process_incoming_migration(QEMUFile *f)
assert(fd != -1);
migrate_decompress_threads_create();
+ migrate_multifd_recv_threads_create();
qemu_set_nonblock(fd);
qemu_coroutine_enter(co, f);
}
@@ -844,6 +848,7 @@ static void migrate_fd_cleanup(void *opaque)
qemu_mutex_lock_iothread();
migrate_compress_threads_join();
+ migrate_multifd_send_threads_join();
qemu_fclose(s->to_dst_file);
s->to_dst_file = NULL;
}
@@ -1825,6 +1830,7 @@ void migrate_fd_connect(MigrationState *s)
}
migrate_compress_threads_create();
+ migrate_multifd_send_threads_create();
qemu_thread_create(&s->thread, "migration", migration_thread, s,
QEMU_THREAD_JOINABLE);
s->migration_thread_running = true;
diff --git a/migration/ram.c b/migration/ram.c
index 648362c..6139f7c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -390,6 +390,154 @@ void migrate_compress_threads_create(void)
}
}
+/* Multiple fd's */
+
+struct MultiFDSendParams {
+ QemuThread thread;
+ QemuCond cond;
+ QemuMutex mutex;
+ bool quit;
+};
+typedef struct MultiFDSendParams MultiFDSendParams;
+
+static MultiFDSendParams *multifd_send;
+
+static void *multifd_send_thread(void *opaque)
+{
+ MultiFDSendParams *params = opaque;
+
+ qemu_mutex_lock(¶ms->mutex);
+ while (!params->quit){
+ qemu_cond_wait(¶ms->cond, ¶ms->mutex);
+ }
+ qemu_mutex_unlock(¶ms->mutex);
+
+ return NULL;
+}
+
+static void terminate_multifd_send_threads(void)
+{
+ int i, thread_count;
+
+ thread_count = migrate_multifd_threads();
+ for (i = 0; i < thread_count; i++) {
+ qemu_mutex_lock(&multifd_send[i].mutex);
+ multifd_send[i].quit = true;
+ qemu_cond_signal(&multifd_send[i].cond);
+ qemu_mutex_unlock(&multifd_send[i].mutex);
+ }
+}
+
+void migrate_multifd_send_threads_join(void)
+{
+ int i, thread_count;
+
+ if (!migrate_multifd()){
+ return;
+ }
+ terminate_multifd_send_threads();
+ thread_count = migrate_multifd_threads();
+ for (i = 0; i < thread_count; i++) {
+ qemu_thread_join(&multifd_send[i].thread);
+ qemu_mutex_destroy(&multifd_send[i].mutex);
+ qemu_cond_destroy(&multifd_send[i].cond);
+ }
+ g_free(multifd_send);
+ multifd_send = NULL;
+}
+
+void migrate_multifd_send_threads_create(void)
+{
+ int i, thread_count;
+
+ if (!migrate_multifd()){
+ return;
+ }
+ thread_count = migrate_multifd_threads();
+ multifd_send = g_new0(MultiFDSendParams, thread_count);
+ for (i = 0; i < thread_count; i++) {
+ qemu_mutex_init(&multifd_send[i].mutex);
+ qemu_cond_init(&multifd_send[i].cond);
+ multifd_send[i].quit = false;
+ qemu_thread_create(&multifd_send[i].thread, "multifd_send",
+ multifd_send_thread, &multifd_send[i],
+ QEMU_THREAD_JOINABLE);
+ }
+}
+
+struct MultiFDRecvParams {
+ QemuThread thread;
+ QemuCond cond;
+ QemuMutex mutex;
+ bool quit;
+};
+typedef struct MultiFDRecvParams MultiFDRecvParams;
+
+static MultiFDRecvParams *multifd_recv;
+
+static void *multifd_recv_thread(void *opaque)
+{
+ MultiFDSendParams *params = opaque;
+
+ qemu_mutex_lock(¶ms->mutex);
+ while (!params->quit){
+ qemu_cond_wait(¶ms->cond, ¶ms->mutex);
+ }
+ qemu_mutex_unlock(¶ms->mutex);
+
+ return NULL;
+}
+
+static void terminate_multifd_recv_threads(void)
+{
+ int i, thread_count;
+
+ thread_count = migrate_multifd_threads();
+ for (i = 0; i < thread_count; i++) {
+ qemu_mutex_lock(&multifd_recv[i].mutex);
+ multifd_recv[i].quit = true;
+ qemu_cond_signal(&multifd_recv[i].cond);
+ qemu_mutex_unlock(&multifd_recv[i].mutex);
+ }
+}
+
+void migrate_multifd_recv_threads_join(void)
+{
+ int i, thread_count;
+
+ if (!migrate_multifd()){
+ return;
+ }
+ terminate_multifd_recv_threads();
+ thread_count = migrate_multifd_threads();
+ for (i = 0; i < thread_count; i++) {
+ qemu_thread_join(&multifd_recv[i].thread);
+ qemu_mutex_destroy(&multifd_recv[i].mutex);
+ qemu_cond_destroy(&multifd_recv[i].cond);
+ }
+ g_free(multifd_recv);
+ multifd_recv = NULL;
+}
+
+void migrate_multifd_recv_threads_create(void)
+{
+ int i, thread_count;
+
+ if (!migrate_multifd()){
+ return;
+ }
+ thread_count = migrate_multifd_threads();
+ multifd_recv = g_new0(MultiFDRecvParams, thread_count);
+ for (i = 0; i < thread_count; i++) {
+ qemu_mutex_init(&multifd_recv[i].mutex);
+ qemu_cond_init(&multifd_recv[i].cond);
+ multifd_recv[i].quit = false;
+ qemu_thread_create(&multifd_recv[i].thread, "multifd_recv",
+ multifd_recv_thread, &multifd_recv[i],
+ QEMU_THREAD_JOINABLE);
+ }
+}
+
/**
* save_page_header: Write page header to wire
*
--
2.5.5
* [Qemu-devel] [PATCH 07/13] migration: Start of multiple fd work
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (5 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 06/13] migration: create multifd migration threads Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 08/13] migration: create ram_multifd_page Juan Quintela
` (7 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
We create a new channel for each new thread created; nothing is sent
through them yet.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
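The channel setup ordering this RFC relies on, spelled out as a sketch (this
is what the usleep() in tcp_send_channel_create() papers over, until it is
sequenced properly, e.g. on top of Daniel's channel work):

    /*
     * destination                            source
     * -----------                            ------
     * listen + accept main channel    <----  connect main channel
     *   (the main listener is closed after the first accept)
     * per worker thread i:                   per worker thread i:
     *   re-listen on the same port,          usleep(), then connect
     *   accept one connection         <----
     *
     * Each tcp_send_channel_create() has to pair up with exactly one
     * tcp_recv_channel_create() on the other side, in thread order.
     */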
include/migration/migration.h | 5 ++++
migration/ram.c | 15 ++++++++++
migration/tcp.c | 69 ++++++++++++++++++++++++++++++++++++-------
3 files changed, 78 insertions(+), 11 deletions(-)
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 9f94c75..343cd90 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -189,6 +189,11 @@ void tcp_start_incoming_migration(const char *host_port, Error **errp);
void tcp_start_outgoing_migration(MigrationState *s, const char *host_port, Error **errp);
+int tcp_send_channel_create(void);
+int tcp_send_channel_destroy(int s);
+int tcp_recv_channel_create(void);
+int tcp_recv_channel_destroy(int s);
+
void unix_start_incoming_migration(const char *path, Error **errp);
void unix_start_outgoing_migration(MigrationState *s, const char *path, Error **errp);
diff --git a/migration/ram.c b/migration/ram.c
index 6139f7c..d321e6b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -397,6 +397,7 @@ struct MultiFDSendParams {
QemuCond cond;
QemuMutex mutex;
bool quit;
+ int s;
};
typedef struct MultiFDSendParams MultiFDSendParams;
@@ -441,6 +442,7 @@ void migrate_multifd_send_threads_join(void)
qemu_thread_join(&multifd_send[i].thread);
qemu_mutex_destroy(&multifd_send[i].mutex);
qemu_cond_destroy(&multifd_send[i].cond);
+ tcp_send_channel_destroy(multifd_send[i].s);
}
g_free(multifd_send);
multifd_send = NULL;
@@ -459,6 +461,11 @@ void migrate_multifd_send_threads_create(void)
qemu_mutex_init(&multifd_send[i].mutex);
qemu_cond_init(&multifd_send[i].cond);
multifd_send[i].quit = false;
+ multifd_send[i].s = tcp_send_channel_create();
+ if(multifd_send[i].s < 0) {
+ printf("Error creating a send channel");
+ exit(0);
+ }
qemu_thread_create(&multifd_send[i].thread, "multifd_send",
multifd_send_thread, &multifd_send[i],
QEMU_THREAD_JOINABLE);
@@ -470,6 +477,7 @@ struct MultiFDRecvParams {
QemuCond cond;
QemuMutex mutex;
bool quit;
+ int s;
};
typedef struct MultiFDRecvParams MultiFDRecvParams;
@@ -514,6 +522,7 @@ void migrate_multifd_recv_threads_join(void)
qemu_thread_join(&multifd_recv[i].thread);
qemu_mutex_destroy(&multifd_recv[i].mutex);
qemu_cond_destroy(&multifd_recv[i].cond);
+ tcp_recv_channel_destroy(multifd_recv[i].s);
}
g_free(multifd_recv);
multifd_recv = NULL;
@@ -532,6 +541,12 @@ void migrate_multifd_recv_threads_create(void)
qemu_mutex_init(&multifd_recv[i].mutex);
qemu_cond_init(&multifd_recv[i].cond);
multifd_recv[i].quit = false;
+ multifd_recv[i].s = tcp_recv_channel_create();
+
+ if(multifd_recv[i].s < 0) {
+ printf("Error creating a recv channel");
+ exit(0);
+ }
qemu_thread_create(&multifd_recv[i].thread, "multifd_recv",
multifd_recv_thread, &multifd_recv[i],
QEMU_THREAD_JOINABLE);
diff --git a/migration/tcp.c b/migration/tcp.c
index 5d42c96..5e693a7 100644
--- a/migration/tcp.c
+++ b/migration/tcp.c
@@ -35,14 +35,15 @@
struct OutgoingArgs {
MigrationState *s;
-};
+ char *host_port;
+ Error **err;
+} out_args;
static void tcp_wait_for_connect(int fd, Error *err, void *opaque)
{
struct OutgoingArgs *args = opaque;
MigrationState *s = args->s;
- g_free(args);
if (fd < 0) {
DPRINTF("migrate connect error: %s\n", error_get_pretty(err));
s->to_dst_file = NULL;
@@ -54,17 +55,38 @@ static void tcp_wait_for_connect(int fd, Error *err, void *opaque)
}
}
+int tcp_send_channel_create(void)
+{
+ int s;
+
+ usleep(100000);
+ s = inet_connect(out_args.host_port, out_args.err);
+ if (s < 0) {
+ DPRINTF("migrate_connect multilpe fd error: %s\n",
+ error_get_pretty_(err));
+ }
+ return s;
+}
+
+int tcp_send_channel_destroy(int s)
+{
+ return closesocket(s);
+}
+
void tcp_start_outgoing_migration(MigrationState *s, const char *host_port, Error **errp)
{
- struct OutgoingArgs *args = g_new0(struct OutgoingArgs, 1);
- args->s = s;
- inet_nonblocking_connect(host_port, tcp_wait_for_connect, args, errp);
+ out_args.s = s;
+ out_args.host_port = g_strdup(host_port);
+ out_args.err = errp;
+ inet_nonblocking_connect(host_port, tcp_wait_for_connect, &out_args, errp);
}
struct IncomingArgs {
int s;
-};
+ char *host_port;
+ Error **errp;
+} in_args;
static void tcp_accept_incoming_migration(void *opaque)
{
@@ -75,7 +97,6 @@ static void tcp_accept_incoming_migration(void *opaque)
QEMUFile *f;
int c;
- g_free(args);
do {
c = qemu_accept(s, (struct sockaddr *)&addr, &addrlen);
} while (c < 0 && errno == EINTR);
@@ -102,16 +123,42 @@ out:
closesocket(c);
}
+int tcp_recv_channel_create(void)
+{
+ int c;
+ struct sockaddr_in addr;
+ socklen_t addrlen = sizeof(addr);
+ int s = inet_listen(in_args.host_port, NULL, 256, SOCK_STREAM, 0,
+ in_args.errp);
+ do {
+ c = qemu_accept(s, (struct sockaddr *)&addr, &addrlen);
+ } while (c < 0 && errno == EINTR);
+ closesocket(s);
+
+ DPRINTF("accepted multiple fd migration\n");
+
+ if (c < 0) {
+ error_report("could not accept migration connection (%s)",
+ strerror(errno));
+ }
+ return c;
+}
+
+int tcp_recv_channel_destroy(int s)
+{
+ return closesocket(s);
+}
+
void tcp_start_incoming_migration(const char *host_port, Error **errp)
{
- struct IncomingArgs *args = g_new0(struct IncomingArgs, 1);
int s;
s = inet_listen(host_port, NULL, 256, SOCK_STREAM, 0, errp);
if (s < 0) {
- g_free(args);
return;
}
- args->s = s;
- qemu_set_fd_handler(s, tcp_accept_incoming_migration, NULL, args);
+ in_args.s = s;
+ in_args.host_port = g_strdup(host_port);
+ in_args.errp = errp;
+ qemu_set_fd_handler(s, tcp_accept_incoming_migration, NULL, &in_args);
}
--
2.5.5
* [Qemu-devel] [PATCH 08/13] migration: create ram_multifd_page
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (6 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 07/13] migration: Start of multiple fd work Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 09/13] migration: Create thread infrastructure for multifd send side Juan Quintela
` (6 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
The function still doesn't use multifd, but we have simplified
ram_save_page: the xbzrle and RDMA stuff is gone. We have added a new
counter and a new flag for this type of page.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
hmp.c | 2 ++
include/migration/migration.h | 1 +
migration/migration.c | 2 ++
migration/ram.c | 44 ++++++++++++++++++++++++++++++++++++++++++-
qapi-schema.json | 5 ++++-
5 files changed, 52 insertions(+), 2 deletions(-)
diff --git a/hmp.c b/hmp.c
index 2a40f1f..f496918 100644
--- a/hmp.c
+++ b/hmp.c
@@ -209,6 +209,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "dirty pages rate: %" PRIu64 " pages\n",
info->ram->dirty_pages_rate);
}
+ monitor_printf(mon, "multifd: %" PRIu64 " pages\n",
+ info->ram->multifd);
}
if (info->has_disk) {
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 343cd90..cd02a55 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -252,6 +252,7 @@ uint64_t xbzrle_mig_pages_transferred(void);
uint64_t xbzrle_mig_pages_overflow(void);
uint64_t xbzrle_mig_pages_cache_miss(void);
double xbzrle_mig_cache_miss_rate(void);
+uint64_t multifd_mig_pages_transferred(void);
void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
void ram_debug_dump_bitmap(unsigned long *todump, bool expected);
diff --git a/migration/migration.c b/migration/migration.c
index b3ad36b..efdd981 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -603,6 +603,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
info->ram->dirty_pages_rate = s->dirty_pages_rate;
info->ram->mbps = s->mbps;
info->ram->dirty_sync_count = s->dirty_sync_count;
+ info->ram->multifd = multifd_mig_pages_transferred();
if (blk_mig_active()) {
info->has_disk = true;
@@ -675,6 +676,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
info->ram->normal_bytes = norm_mig_bytes_transferred();
info->ram->mbps = s->mbps;
info->ram->dirty_sync_count = s->dirty_sync_count;
+ info->ram->multifd = multifd_mig_pages_transferred();
break;
case MIGRATION_STATUS_FAILED:
info->has_status = true;
diff --git a/migration/ram.c b/migration/ram.c
index d321e6b..ef15bff 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -66,6 +66,7 @@ static uint64_t bitmap_sync_count;
#define RAM_SAVE_FLAG_XBZRLE 0x40
/* 0x80 is reserved in migration.h start with 0x100 next */
#define RAM_SAVE_FLAG_COMPRESS_PAGE 0x100
+#define RAM_SAVE_FLAG_MULTIFD_PAGE 0x200
static const uint8_t ZERO_TARGET_PAGE[TARGET_PAGE_SIZE];
@@ -146,6 +147,7 @@ typedef struct AccountingInfo {
uint64_t dup_pages;
uint64_t skipped_pages;
uint64_t norm_pages;
+ uint64_t multifd_pages;
uint64_t iterations;
uint64_t xbzrle_bytes;
uint64_t xbzrle_pages;
@@ -216,6 +218,11 @@ uint64_t xbzrle_mig_pages_overflow(void)
return acct_info.xbzrle_overflows;
}
+uint64_t multifd_mig_pages_transferred(void)
+{
+ return acct_info.multifd_pages;
+}
+
/* This is the last block that we have visited searching for dirty pages
*/
static RAMBlock *last_seen_block;
@@ -968,6 +975,33 @@ static int ram_save_page(QEMUFile *f, PageSearchStatus *pss,
return pages;
}
+static int ram_multifd_page(QEMUFile *f, PageSearchStatus *pss,
+ bool last_stage, uint64_t *bytes_transferred)
+{
+ int pages;
+ uint8_t *p;
+ RAMBlock *block = pss->block;
+ ram_addr_t offset = pss->offset;
+
+ p = block->host + offset;
+
+ if (block == last_sent_block) {
+ offset |= RAM_SAVE_FLAG_CONTINUE;
+ }
+ pages = save_zero_page(f, block, offset, p, bytes_transferred);
+ if (pages == -1) {
+ *bytes_transferred +=
+ save_page_header(f, block, offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
+ qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+ *bytes_transferred += TARGET_PAGE_SIZE;
+ pages = 1;
+ acct_info.norm_pages++;
+ acct_info.multifd_pages++;
+ }
+
+ return pages;
+}
+
static int do_compress_ram_page(CompressParam *param)
{
int bytes_sent, blen;
@@ -1410,6 +1444,8 @@ static int ram_save_target_page(MigrationState *ms, QEMUFile *f,
res = ram_save_compressed_page(f, pss,
last_stage,
bytes_transferred);
+ } else if (migrate_multifd()) {
+ res = ram_multifd_page(f, pss, last_stage, bytes_transferred);
} else {
res = ram_save_page(f, pss, last_stage,
bytes_transferred);
@@ -2615,7 +2651,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
addr &= TARGET_PAGE_MASK;
if (flags & (RAM_SAVE_FLAG_COMPRESS | RAM_SAVE_FLAG_PAGE |
- RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
+ RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE |
+ RAM_SAVE_FLAG_MULTIFD_PAGE)) {
RAMBlock *block = ram_block_from_stream(f, flags);
host = host_from_ram_block_offset(block, addr);
@@ -2690,6 +2727,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
break;
}
break;
+
+ case RAM_SAVE_FLAG_MULTIFD_PAGE:
+ qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
+ break;
+
case RAM_SAVE_FLAG_EOS:
/* normal exit */
break;
diff --git a/qapi-schema.json b/qapi-schema.json
index 6ff9ac6..d9c900f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -382,13 +382,16 @@
#
# @dirty-sync-count: number of times that dirty ram was synchronized (since 2.1)
#
+# @multifd: number of pages sent with multifd (since 2.6)
+#
# Since: 0.14.0
##
{ 'struct': 'MigrationStats',
'data': {'transferred': 'int', 'remaining': 'int', 'total': 'int' ,
'duplicate': 'int', 'skipped': 'int', 'normal': 'int',
'normal-bytes': 'int', 'dirty-pages-rate' : 'int',
- 'mbps' : 'number', 'dirty-sync-count' : 'int' } }
+ 'mbps' : 'number', 'dirty-sync-count' : 'int',
+ 'multifd' : 'int'} }
##
# @XBZRLECacheStats
--
2.5.5
* [Qemu-devel] [PATCH 09/13] migration: Create thread infrastructure for multifd send side
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (7 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 08/13] migration: create ram_multifd_page Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 10/13] migration: Send the fd number which we are going to use for this page Juan Quintela
` (5 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
We make the locking and the transfer of information specific, even if we
are still transmitting things through the main thread.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
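The handoff protocol this patch implements, condensed (names as in the diff
below):

    /*
     * Two lock domains per page handoff:
     *
     * - multifd_send[i].mutex + cond: migration thread -> worker i.
     *   A non-zero 'address' means one page of work is pending.
     *
     * - multifd_send_mutex + multifd_send_cond (global): worker ->
     *   migration thread. 'done == true' means worker i is idle.
     *
     * multifd_send_page() scans for a worker with done == true, claims
     * it, publishes the address under the worker's own mutex and signals
     * its cond; the worker clears address, (in later patches) sends the
     * page, then sets done and signals the global cond.
     */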
migration/ram.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 51 insertions(+), 2 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index ef15bff..4b73100 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -400,23 +400,41 @@ void migrate_compress_threads_create(void)
/* Multiple fd's */
struct MultiFDSendParams {
+ /* not changed */
QemuThread thread;
QemuCond cond;
QemuMutex mutex;
- bool quit;
int s;
+ /* protected by param mutex */
+ bool quit;
+ uint8_t *address;
+ /* protected by multifd mutex */
+ bool done;
};
typedef struct MultiFDSendParams MultiFDSendParams;
static MultiFDSendParams *multifd_send;
+QemuMutex multifd_send_mutex;
+QemuCond multifd_send_cond;
+
static void *multifd_send_thread(void *opaque)
{
MultiFDSendParams *params = opaque;
qemu_mutex_lock(¶ms->mutex);
while (!params->quit){
- qemu_cond_wait(¶ms->cond, ¶ms->mutex);
+ if (params->address) {
+ params->address = 0;
+ qemu_mutex_unlock(¶ms->mutex);
+ qemu_mutex_lock(&multifd_send_mutex);
+ params->done = true;
+ qemu_cond_signal(&multifd_send_cond);
+ qemu_mutex_unlock(&multifd_send_mutex);
+ qemu_mutex_lock(¶ms->mutex);
+ }
}
qemu_mutex_unlock(¶ms->mutex);
@@ -424,6 +440,8 @@ static void *multifd_send_thread(void *opaque)
}
static void terminate_multifd_send_threads(void)
+ } else {
+ qemu_cond_wait(¶ms->cond, ¶ms->mutex);
{
int i, thread_count;
@@ -464,11 +482,15 @@ void migrate_multifd_send_threads_create(void)
}
thread_count = migrate_multifd_threads();
multifd_send = g_new0(MultiFDSendParams, thread_count);
+ qemu_mutex_init(&multifd_send_mutex);
+ qemu_cond_init(&multifd_send_cond);
for (i = 0; i < thread_count; i++) {
qemu_mutex_init(&multifd_send[i].mutex);
qemu_cond_init(&multifd_send[i].cond);
multifd_send[i].quit = false;
+ multifd_send[i].done = true;
multifd_send[i].s = tcp_send_channel_create();
+ multifd_send[i].address = 0;
if(multifd_send[i].s < 0) {
printf("Error creating a send channel");
exit(0);
@@ -479,6 +501,32 @@ void migrate_multifd_send_threads_create(void)
}
}
+static void multifd_send_page(uint8_t *address)
+{
+ int i, thread_count;
+ bool found = false;
+
+ thread_count = migrate_multifd_threads();
+ qemu_mutex_lock(&multifd_send_mutex);
+ while (!found) {
+ for (i = 0; i < thread_count; i++) {
+ if (multifd_send[i].done) {
+ multifd_send[i].done = false;
+ found = true;
+ break;
+ }
+ }
+ if (!found) {
+ qemu_cond_wait(&multifd_send_cond, &multifd_send_mutex);
+ }
+ }
+ qemu_mutex_unlock(&multifd_send_mutex);
+ qemu_mutex_lock(&multifd_send[i].mutex);
+ multifd_send[i].address = address;
+ qemu_cond_signal(&multifd_send[i].cond);
+ qemu_mutex_unlock(&multifd_send[i].mutex);
+}
+
struct MultiFDRecvParams {
QemuThread thread;
QemuCond cond;
@@ -993,6 +1041,7 @@ static int ram_multifd_page(QEMUFile *f, PageSearchStatus *pss,
*bytes_transferred +=
save_page_header(f, block, offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+ multifd_send_page(p);
*bytes_transferred += TARGET_PAGE_SIZE;
pages = 1;
acct_info.norm_pages++;
--
2.5.5
* [Qemu-devel] [PATCH 10/13] migration: Send the fd number which we are going to use for this page
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (8 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 09/13] migration: Create thread infrastructure for multifd send side Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 11/13] migration: Create thread infrastructure for multifd recv side Juan Quintela
` (4 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
We are still sending the page through the main channel; that will
change later in the series.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
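On the wire, a multifd page record on the main channel now looks like this
(layout as per the hunks below):

    /*
     * save_page_header(offset | RAM_SAVE_FLAG_MULTIFD_PAGE)
     * be16  fd_num            which channel will carry the data later
     * page  TARGET_PAGE_SIZE  raw page data, still on the main channel
     *
     * The destination must consume fd_num even while it is unused, or
     * the stream desynchronizes.
     */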
migration/ram.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 4b73100..47e208b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -501,7 +501,7 @@ void migrate_multifd_send_threads_create(void)
}
}
-static void multifd_send_page(uint8_t *address)
+static int multifd_send_page(uint8_t *address)
{
int i, thread_count;
bool found = false;
@@ -525,6 +525,8 @@ static void multifd_send_page(uint8_t *address)
multifd_send[i].address = address;
qemu_cond_signal(&multifd_send[i].cond);
qemu_mutex_unlock(&multifd_send[i].mutex);
+
+ return i;
}
struct MultiFDRecvParams {
@@ -1027,6 +1029,7 @@ static int ram_multifd_page(QEMUFile *f, PageSearchStatus *pss,
bool last_stage, uint64_t *bytes_transferred)
{
int pages;
+ uint16_t fd_num;
uint8_t *p;
RAMBlock *block = pss->block;
ram_addr_t offset = pss->offset;
@@ -1040,8 +1043,10 @@ static int ram_multifd_page(QEMUFile *f, PageSearchStatus *pss,
if (pages == -1) {
*bytes_transferred +=
save_page_header(f, block, offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
+ fd_num = multifd_send_page(p);
+ qemu_put_be16(f, fd_num);
+ *bytes_transferred += 2; /* size of fd_num */
qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
- multifd_send_page(p);
*bytes_transferred += TARGET_PAGE_SIZE;
pages = 1;
acct_info.norm_pages++;
@@ -2693,6 +2698,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
while (!postcopy_running && !ret && !(flags & RAM_SAVE_FLAG_EOS)) {
ram_addr_t addr, total_ram_bytes;
void *host = NULL;
+ uint16_t fd_num;
uint8_t ch;
addr = qemu_get_be64(f);
@@ -2778,6 +2784,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
break;
case RAM_SAVE_FLAG_MULTIFD_PAGE:
+ fd_num = qemu_get_be16(f);
+ if (fd_num == fd_num) {
+ /* fd_num is still unused; this will change later in the series */
+ fd_num = 0;
+ }
qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
break;
--
2.5.5
* [Qemu-devel] [PATCH 11/13] migration: Create thread infrastructure for multifd recv side
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (9 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 10/13] migration: Send the fd number which we are going to use for this page Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 12/13] migration: Test new fd infrastructure Juan Quintela
` (3 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
We make the locking and the transfer of information specific, even if we
are still receiving things through the main thread.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/ram.c | 51 ++++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 44 insertions(+), 7 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 47e208b..5507b1f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -533,20 +533,37 @@ struct MultiFDRecvParams {
QemuThread thread;
QemuCond cond;
QemuMutex mutex;
- bool quit;
int s;
+ /* protected by param mutex */
+ bool quit;
+ uint8_t *address;
+ /* protected by multifd mutex */
+ bool done;
};
typedef struct MultiFDRecvParams MultiFDRecvParams;
static MultiFDRecvParams *multifd_recv;
+QemuMutex multifd_recv_mutex;
+QemuCond multifd_recv_cond;
+
static void *multifd_recv_thread(void *opaque)
{
- MultiFDSendParams *params = opaque;
+ MultiFDRecvParams *params = opaque;
qemu_mutex_lock(¶ms->mutex);
while (!params->quit){
- qemu_cond_wait(¶ms->cond, ¶ms->mutex);
+ if (params->address) {
+ params->address = 0;
+ qemu_mutex_unlock(¶ms->mutex);
+ qemu_mutex_lock(&multifd_recv_mutex);
+ params->done = true;
+ qemu_cond_signal(&multifd_recv_cond);
+ qemu_mutex_unlock(&multifd_recv_mutex);
+ qemu_mutex_lock(¶ms->mutex);
+ } else {
+ qemu_cond_wait(¶ms->cond, ¶ms->mutex);
+ }
}
qemu_mutex_unlock(¶ms->mutex);
@@ -598,7 +615,9 @@ void migrate_multifd_recv_threads_create(void)
qemu_mutex_init(&multifd_recv[i].mutex);
qemu_cond_init(&multifd_recv[i].cond);
multifd_recv[i].quit = false;
+ multifd_recv[i].done = true;
multifd_recv[i].s = tcp_recv_channel_create();
+ multifd_recv[i].address = 0;
if(multifd_recv[i].s < 0) {
printf("Error creating a recv channel");
@@ -610,6 +629,27 @@ void migrate_multifd_recv_threads_create(void)
}
}
+static void multifd_recv_page(uint8_t *address, int fd_num)
+{
+ int thread_count;
+ MultiFDRecvParams *params;
+
+ thread_count = migrate_multifd_threads();
+ assert(fd_num < thread_count);
+ params = &multifd_recv[fd_num];
+
+ qemu_mutex_lock(&multifd_recv_mutex);
+ while (!params->done) {
+ qemu_cond_wait(&multifd_recv_cond, &multifd_recv_mutex);
+ }
+ params->done = false;
+ qemu_mutex_unlock(&multifd_recv_mutex);
+ qemu_mutex_lock(¶ms->mutex);
+ params->address = address;
+ qemu_cond_signal(¶ms->cond);
+ qemu_mutex_unlock(¶ms->mutex);
+}
+
/**
* save_page_header: Write page header to wire
*
@@ -2785,10 +2825,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
case RAM_SAVE_FLAG_MULTIFD_PAGE:
fd_num = qemu_get_be16(f);
- if (fd_num == fd_num) {
- /* fd_num is still unused; this will change later in the series */
- fd_num = 0;
- }
+ multifd_recv_page(host, fd_num);
qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
break;
--
2.5.5
* [Qemu-devel] [PATCH 12/13] migration: Test new fd infrastructure
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (10 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 11/13] migration: Create thread infrastructure for multifd recv side Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 13/13] migration: [HACK] Transfer pages over new channels Juan Quintela
` (2 subsequent siblings)
14 siblings, 0 replies; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
We just send the page address through the alternate channels and check
on the destination that it matches what was expected.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/ram.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/migration/ram.c b/migration/ram.c
index 5507b1f..b1b69cb 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -421,19 +421,27 @@ QemuCond multifd_send_cond;
static void *multifd_send_thread(void *opaque)
{
MultiFDSendParams *params = opaque;
+ uint8_t *address;
qemu_mutex_lock(&params->mutex);
while (!params->quit){
if (params->address) {
+ address = params->address;
params->address = 0;
qemu_mutex_unlock(&params->mutex);
+
+ if (write(params->s, &address, sizeof(uint8_t *))
+ != sizeof(uint8_t *)) {
+ /* Shouldn't ever happen */
+ exit(-1);
+ }
qemu_mutex_lock(&multifd_send_mutex);
params->done = true;
qemu_cond_signal(&multifd_send_cond);
qemu_mutex_unlock(&multifd_send_mutex);
qemu_mutex_lock(&params->mutex);
} else {
qemu_cond_wait(&params->cond, &params->mutex);
}
}
qemu_mutex_unlock(&params->mutex);
@@ -550,12 +558,28 @@ QemuCond multifd_recv_cond;
static void *multifd_recv_thread(void *opaque)
{
MultiFDRecvParams *params = opaque;
+ uint8_t *address;
+ uint8_t *recv_address;
qemu_mutex_lock(¶ms->mutex);
while (!params->quit){
if (params->address) {
+ address = params->address;
params->address = 0;
qemu_mutex_unlock(¶ms->mutex);
+
+ if (read(params->s, &recv_address, sizeof(uint8_t*))
+ != sizeof(uint8_t *)) {
+ /* shouldn't ever happen */
+ exit(-1);
+ }
+
+ if (address != recv_address) {
+ printf("We received %p what we were expecting %p\n",
+ recv_address, address);
+ exit(-1);
+ }
+
qemu_mutex_lock(&multifd_recv_mutex);
params->done = true;
qemu_cond_signal(&multifd_recv_cond);
--
2.5.5
* [Qemu-devel] [PATCH 13/13] migration: [HACK] Transfer pages over new channels
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (11 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 12/13] migration: Test new fd infrastructure Juan Quintela
@ 2016-04-20 14:44 ` Juan Quintela
2016-04-22 12:09 ` Dr. David Alan Gilbert
2016-04-20 15:46 ` [Qemu-devel] [RFC 00/13] Multiple fd migration support Michael S. Tsirkin
2016-04-22 12:26 ` Dr. David Alan Gilbert
14 siblings, 1 reply; 21+ messages in thread
From: Juan Quintela @ 2016-04-20 14:44 UTC
To: qemu-devel; +Cc: amit.shah, dgilbert
We switch from sending the page address to sending the real pages.
[HACK]
How we calculate the bandwidth is beyond repair; there is a hack there
that works for x86 and other archs that have 4KB pages.
If you are having a nice day, just go to migration/ram.c and look at
acct_update_position(). Now you are depressed, right?
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
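A worked example of the new calculation, with invented numbers, over one
100 ms BUFFER_DELAY interval:

    /*
     * qemu_file delta:    2,000,000 bytes (headers, fd_nums, ...)
     * multifd page delta: 25,600 pages
     *
     * transferred_bytes = 2000000 + 25600 * 4096 = 106,857,600 bytes
     * bandwidth         = 106857600 bytes / 100 ms = ~1.07 GB/s
     *
     * Counting only the qemu_file delta, as before this patch, would
     * under-report the bandwidth here by a factor of about 50.
     */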
migration/migration.c | 15 +++++++++++----
migration/ram.c | 43 +++++++++++++++++++++++++++-------------
2 files changed, 41 insertions(+), 17 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index efdd981..1db6e52 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1665,7 +1665,8 @@ static void *migration_thread(void *opaque)
/* Used by the bandwidth calcs, updated later */
int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
- int64_t initial_bytes = 0;
+ int64_t qemu_file_bytes = 0;
+ int64_t multifd_pages = 0;
int64_t max_size = 0;
int64_t start_time = initial_time;
int64_t end_time;
@@ -1748,9 +1749,14 @@ static void *migration_thread(void *opaque)
}
current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
if (current_time >= initial_time + BUFFER_DELAY) {
- uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
- initial_bytes;
uint64_t time_spent = current_time - initial_time;
+ uint64_t qemu_file_bytes_now = qemu_ftell(s->to_dst_file);
+ uint64_t multifd_pages_now = multifd_mig_pages_transferred();
+ /* Hack ahead. Why the hell don't we have a function to get the
+ target page size? Hard-coding it to 4096 */
+ uint64_t transferred_bytes =
+ (qemu_file_bytes_now - qemu_file_bytes) +
+ (multifd_pages_now - multifd_pages) * 4096;
double bandwidth = (double)transferred_bytes / time_spent;
max_size = bandwidth * migrate_max_downtime() / 1000000;
@@ -1767,7 +1773,8 @@ static void *migration_thread(void *opaque)
qemu_file_reset_rate_limit(s->to_dst_file);
initial_time = current_time;
- initial_bytes = qemu_ftell(s->to_dst_file);
+ qemu_file_bytes = qemu_file_bytes_now;
+ multifd_pages = multifd_pages_now;
}
if (qemu_file_rate_limit(s->to_dst_file)) {
/* usleep expects microseconds */
diff --git a/migration/ram.c b/migration/ram.c
index b1b69cb..1d9ecb9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -430,8 +430,8 @@ static void *multifd_send_thread(void *opaque)
params->address = 0;
qemu_mutex_unlock(¶ms->mutex);
- if (write(params->s, &address, sizeof(uint8_t *))
- != sizeof(uint8_t*)) {
+ if (write(params->s, address, TARGET_PAGE_SIZE)
+ != TARGET_PAGE_SIZE) {
/* Shouldn't ever happen */
exit(-1);
}
@@ -537,6 +537,24 @@ static int multifd_send_page(uint8_t *address)
return i;
}
+static void flush_multifd_send_data(QEMUFile *f)
+{
+ int i, thread_count;
+
+ if (!migrate_multifd()) {
+ return;
+ }
+ qemu_fflush(f);
+ thread_count = migrate_multifd_threads();
+ qemu_mutex_lock(&multifd_send_mutex);
+ for (i = 0; i < thread_count; i++) {
+ while(!multifd_send[i].done) {
+ qemu_cond_wait(&multifd_send_cond, &multifd_send_mutex);
+ }
+ }
+ qemu_mutex_unlock(&multifd_send_mutex);
+}
+
struct MultiFDRecvParams {
QemuThread thread;
QemuCond cond;
@@ -559,7 +576,6 @@ static void *multifd_recv_thread(void *opaque)
{
MultiFDRecvParams *params = opaque;
uint8_t *address;
- uint8_t *recv_address;
qemu_mutex_lock(¶ms->mutex);
while (!params->quit){
@@ -568,18 +584,12 @@ static void *multifd_recv_thread(void *opaque)
params->address = 0;
qemu_mutex_unlock(¶ms->mutex);
- if (read(params->s, &recv_address, sizeof(uint8_t*))
- != sizeof(uint8_t *)) {
+ if (read(params->s, address, TARGET_PAGE_SIZE)
+ != TARGET_PAGE_SIZE) {
/* shouldn't ever happen */
exit(-1);
}
- if (address != recv_address) {
- printf("We received %p what we were expecting %p\n",
- recv_address, address);
- exit(-1);
- }
-
qemu_mutex_lock(&multifd_recv_mutex);
params->done = true;
qemu_cond_signal(&multifd_recv_cond);
@@ -1097,6 +1107,7 @@ static int ram_multifd_page(QEMUFile *f, PageSearchStatus *pss,
uint8_t *p;
RAMBlock *block = pss->block;
ram_addr_t offset = pss->offset;
+ static int count = 32;
p = block->host + offset;
@@ -1108,9 +1119,14 @@ static int ram_multifd_page(QEMUFile *f, PageSearchStatus *pss,
*bytes_transferred +=
save_page_header(f, block, offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
fd_num = multifd_send_page(p);
+ count--;
+ if (!count) {
+ qemu_fflush(f);
+ count = 32;
+ }
+
qemu_put_be16(f, fd_num);
*bytes_transferred += 2; /* size of fd_num */
- qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
*bytes_transferred += TARGET_PAGE_SIZE;
pages = 1;
acct_info.norm_pages++;
@@ -2375,6 +2391,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
}
flush_compressed_data(f);
+ flush_multifd_send_data(f);
ram_control_after_iterate(f, RAM_CONTROL_FINISH);
rcu_read_unlock();
@@ -2850,7 +2867,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
case RAM_SAVE_FLAG_MULTIFD_PAGE:
fd_num = qemu_get_be16(f);
multifd_recv_page(host, fd_num);
- qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
break;
case RAM_SAVE_FLAG_EOS:
--
2.5.5
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [RFC 00/13] Multiple fd migration support
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (12 preceding siblings ...)
2016-04-20 14:44 ` [Qemu-devel] [PATCH 13/13] migration: [HACK]Transfer pages over new channels Juan Quintela
@ 2016-04-20 15:46 ` Michael S. Tsirkin
2016-04-22 12:26 ` Dr. David Alan Gilbert
14 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2016-04-20 15:46 UTC (permalink / raw)
To: Juan Quintela; +Cc: qemu-devel, amit.shah, dgilbert
On Wed, Apr 20, 2016 at 04:44:28PM +0200, Juan Quintela wrote:
> Hi
>
> This patch series is "an" initial implementation of multiple fd migration.
> This is to get something out for others to comment, it is not finished at all.
>
> So far:
>
> - we create threads for each new fd
>
> - only for tcp of course, rest of transports are out of luck
> I need to integrate this with daniel channel changes
>
> - I *think* the locking is right, at least I don't get more random
> lookups (and yes, it was not trivial). And yes, I think that the
> compression code locking is not completely correct. I think it
> would be much, much better to do the compression code on top of this
> (will avoid a lot of copies), but I need to finish this first.
>
> - Last patch, I add a BIG hack to try to know what the real bandwidth
> is.
>
>
> Preleminar testing so far:
>
> - quite good, the latency is much better, but was change so far, I
> think I found the problem for the random high latencies, but more
> testing is needed.
>
> - under load, I think our bandwidth calculations are *not* completely
> correct (This is the way to spell it to be allowed for a family audience).
>
>
> ToDo list:
> - bandwidth calculation: I am going to send another mail
> with my ToDo list for migration, see there.
>
> - stats: We need better stats, by thread, etc
>
> - sincronize less times with the worker threads.
> right now we syncronize for each page, there are two obvious optimizations
> * send a list of pages each time we wakeup an fd
> * if we have to sent a HUGE page, dont' do a single split, just sent the whole page
> in one send() and read things with a single recv() on destination.
> My understanding is that this would make Transparent Huge pages trivial.
> - measure things under bigger loads
>
> Comments, please?
Nice to see this take shape.
There's something that looks suspicious from a quick look at
the patches:
- imagine that the same page gets transmitted on two sockets,
first on one, then on the second one
- it's possible that the second update is received and handled on
the destination before the first one
Note: you do make sure a single thread sends data for
a page at a time, but that does not seem to affect the order
in which it's received.
In that case, I suspect the first one will overwrite the
page with stale data.
A simple fix would be to change
static int multifd_send_page(uint8_t *address)
to calculate the fd based on address. E.g.
(long)address/PAGE_SIZE % thread_count.
Or split memory between threads in some other way.
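Something along these lines (untested sketch; it reuses the series'
multifd_send[] array and locking, so treat the field names as
approximate):

    /* Pin each page to one fd: successive versions of the same page
     * always travel down the same socket, so they can't overtake
     * each other. */
    static int multifd_send_page(uint8_t *address)
    {
        int i = ((uintptr_t)address >> TARGET_PAGE_BITS)
                % migrate_multifd_threads();

        /* wait for thread i to finish its previous page */
        qemu_mutex_lock(&multifd_send_mutex);
        while (!multifd_send[i].done) {
            qemu_cond_wait(&multifd_send_cond, &multifd_send_mutex);
        }
        multifd_send[i].done = false;
        qemu_mutex_unlock(&multifd_send_mutex);

        /* hand the page over and wake the thread up */
        qemu_mutex_lock(&multifd_send[i].mutex);
        multifd_send[i].address = address;
        qemu_cond_signal(&multifd_send[i].cond);
        qemu_mutex_unlock(&multifd_send[i].mutex);

        return i;
    }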
HTH
> Later, Juan.
>
> Juan Quintela (13):
> migration: create Migration Incoming State at init time
> migration: Pass TCP args in an struct
> migration: [HACK] Don't create decompression threads if not enabled
> migration: Add multifd capability
> migration: Create x-multifd-threads parameter
> migration: create multifd migration threads
> migration: Start of multiple fd work
> migration: create ram_multifd_page
> migration: Create thread infrastructure for multifd send side
> migration: Send the fd number which we are going to use for this page
> migration: Create thread infrastructure for multifd recv side
> migration: Test new fd infrastructure
> migration: [HACK]Transfer pages over new channels
>
> hmp.c | 10 ++
> include/migration/migration.h | 13 ++
> migration/migration.c | 100 ++++++++----
> migration/ram.c | 350 +++++++++++++++++++++++++++++++++++++++++-
> migration/savevm.c | 3 +-
> migration/tcp.c | 76 ++++++++-
> qapi-schema.json | 29 +++-
> 7 files changed, 540 insertions(+), 41 deletions(-)
>
> --
> 2.5.5
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH 01/13] migration: create Migration Incoming State at init time
2016-04-20 14:44 ` [Qemu-devel] [PATCH 01/13] migration: create Migration Incoming State at init time Juan Quintela
@ 2016-04-22 11:27 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-04-22 11:27 UTC (permalink / raw)
To: Juan Quintela; +Cc: qemu-devel, amit.shah
* Juan Quintela (quintela@redhat.com) wrote:
> Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> migration/migration.c | 38 +++++++++++++++++---------------------
> migration/savevm.c | 3 ++-
> 2 files changed, 19 insertions(+), 22 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 991313a..314c5c0 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -100,32 +100,28 @@ MigrationState *migrate_get_current(void)
> return &current_migration;
> }
>
> -/* For incoming */
> -static MigrationIncomingState *mis_current;
> -
> MigrationIncomingState *migration_incoming_get_current(void)
> {
> - return mis_current;
> -}
> + static bool once;
> + static MigrationIncomingState mis_current;
>
> -MigrationIncomingState *migration_incoming_state_new(QEMUFile* f)
> -{
> - mis_current = g_new0(MigrationIncomingState, 1);
> - mis_current->from_src_file = f;
> - mis_current->state = MIGRATION_STATUS_NONE;
> - QLIST_INIT(&mis_current->loadvm_handlers);
> - qemu_mutex_init(&mis_current->rp_mutex);
> - qemu_event_init(&mis_current->main_thread_load_event, false);
> -
> - return mis_current;
> + if (!once) {
> + mis_current.state = MIGRATION_STATUS_NONE;
> + memset(&mis_current, 0, sizeof(MigrationIncomingState));
> + QLIST_INIT(&mis_current.loadvm_handlers);
> + qemu_mutex_init(&mis_current.rp_mutex);
> + qemu_event_init(&mis_current.main_thread_load_event, false);
> + once = true;
> + }
> + return &mis_current;
> }
>
> void migration_incoming_state_destroy(void)
> {
> - qemu_event_destroy(&mis_current->main_thread_load_event);
> - loadvm_free_handlers(mis_current);
> - g_free(mis_current);
> - mis_current = NULL;
> + struct MigrationIncomingState *mis = migration_incoming_get_current();
> +
> + qemu_event_destroy(&mis->main_thread_load_event);
> + loadvm_free_handlers(mis);
> }
>
>
> @@ -373,11 +369,11 @@ static void process_incoming_migration_bh(void *opaque)
> static void process_incoming_migration_co(void *opaque)
> {
> QEMUFile *f = opaque;
> - MigrationIncomingState *mis;
> + MigrationIncomingState *mis = migration_incoming_get_current();
> PostcopyState ps;
> int ret;
>
> - mis = migration_incoming_state_new(f);
> + mis->from_src_file = f;
> postcopy_state_set(POSTCOPY_INCOMING_NONE);
> migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
> MIGRATION_STATUS_ACTIVE);
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 16ba443..49137a1 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2091,6 +2091,7 @@ int load_vmstate(const char *name)
> QEMUFile *f;
> int ret;
> AioContext *aio_context;
> + MigrationIncomingState *mis = migration_incoming_get_current();
>
> if (!bdrv_all_can_snapshot(&bs)) {
> error_report("Device '%s' is writable but does not support snapshots.",
> @@ -2141,7 +2142,7 @@ int load_vmstate(const char *name)
> }
>
> qemu_system_reset(VMRESET_SILENT);
> - migration_incoming_state_new(f);
> + mis->from_src_file = f;
>
> aio_context_acquire(aio_context);
> ret = qemu_loadvm_state(f);
> --
> 2.5.5
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH 05/13] migration: Create x-multifd-threads parameter
2016-04-20 14:44 ` [Qemu-devel] [PATCH 05/13] migration: Create x-multifd-threads parameter Juan Quintela
@ 2016-04-22 11:37 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-04-22 11:37 UTC (permalink / raw)
To: Juan Quintela; +Cc: qemu-devel, amit.shah
* Juan Quintela (quintela@redhat.com) wrote:
> Indicates the number of threads that we would create. By default we
> create 2 threads.
Is a migration with the multifd capability set and x_multifd_threads=1
the same as a normal migration, or am I right in thinking it's still
different in that it has a separate data fd?
Dave
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
> hmp.c | 8 ++++++++
> include/migration/migration.h | 2 ++
> migration/migration.c | 30 +++++++++++++++++++++++++++++-
> qapi-schema.json | 19 ++++++++++++++++---
> 4 files changed, 55 insertions(+), 4 deletions(-)
>
> diff --git a/hmp.c b/hmp.c
> index d510236..2a40f1f 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -286,6 +286,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
> monitor_printf(mon, " %s: %" PRId64,
> MigrationParameter_lookup[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT],
> params->x_cpu_throttle_increment);
> + monitor_printf(mon, " %s: %" PRId64,
> + MigrationParameter_lookup[MIGRATION_PARAMETER_X_MULTIFD_THREADS],
> + params->x_multifd_threads);
> monitor_printf(mon, "\n");
> }
>
> @@ -1242,6 +1245,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
> bool has_decompress_threads = false;
> bool has_x_cpu_throttle_initial = false;
> bool has_x_cpu_throttle_increment = false;
> + bool has_x_multifd_threads = false;
> int i;
>
> for (i = 0; i < MIGRATION_PARAMETER__MAX; i++) {
> @@ -1262,12 +1266,16 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
> case MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT:
> has_x_cpu_throttle_increment = true;
> break;
> + case MIGRATION_PARAMETER_X_MULTIFD_THREADS:
> + has_x_multifd_threads = true;
> + break;
> }
> qmp_migrate_set_parameters(has_compress_level, value,
> has_compress_threads, value,
> has_decompress_threads, value,
> has_x_cpu_throttle_initial, value,
> has_x_cpu_throttle_increment, value,
> + has_x_multifd_threads, value,
> &err);
> break;
> }
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index a626b7d..19d535d 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -219,6 +219,8 @@ bool migration_in_postcopy(MigrationState *);
> bool migration_in_postcopy_after_devices(MigrationState *);
> MigrationState *migrate_get_current(void);
>
> +int migrate_multifd_threads(void);
> +
> void migrate_compress_threads_create(void);
> void migrate_compress_threads_join(void);
> void migrate_decompress_threads_create(void);
> diff --git a/migration/migration.c b/migration/migration.c
> index 92e6dc4..29e43ff 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -56,6 +56,8 @@
> /* Migration XBZRLE default cache size */
> #define DEFAULT_MIGRATE_CACHE_SIZE (64 * 1024 * 1024)
>
> +#define DEFAULT_MIGRATE_MULTIFD_THREADS 2
> +
> static NotifierList migration_state_notifiers =
> NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
>
> @@ -91,6 +93,8 @@ MigrationState *migrate_get_current(void)
> DEFAULT_MIGRATE_X_CPU_THROTTLE_INITIAL,
> .parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT] =
> DEFAULT_MIGRATE_X_CPU_THROTTLE_INCREMENT,
> + .parameters[MIGRATION_PARAMETER_X_MULTIFD_THREADS] =
> + DEFAULT_MIGRATE_MULTIFD_THREADS,
> };
>
> if (!once) {
> @@ -521,6 +525,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
> s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INITIAL];
> params->x_cpu_throttle_increment =
> s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT];
> + params->x_multifd_threads =
> + s->parameters[MIGRATION_PARAMETER_X_MULTIFD_THREADS];
>
> return params;
> }
> @@ -717,7 +723,10 @@ void qmp_migrate_set_parameters(bool has_compress_level,
> bool has_x_cpu_throttle_initial,
> int64_t x_cpu_throttle_initial,
> bool has_x_cpu_throttle_increment,
> - int64_t x_cpu_throttle_increment, Error **errp)
> + int64_t x_cpu_throttle_increment,
> + bool has_multifd_threads,
> + int64_t multifd_threads,
> + Error **errp)
> {
> MigrationState *s = migrate_get_current();
>
> @@ -753,6 +762,13 @@ void qmp_migrate_set_parameters(bool has_compress_level,
> "an integer in the range of 1 to 99");
> }
>
> + if (has_multifd_threads &&
> + (multifd_threads < 1 || multifd_threads > 255)) {
> + error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
> + "multifd_threads",
> + "is invalid, it should be in the range of 1 to 255");
> + return;
> + }
> if (has_compress_level) {
> s->parameters[MIGRATION_PARAMETER_COMPRESS_LEVEL] = compress_level;
> }
> @@ -772,6 +788,9 @@ void qmp_migrate_set_parameters(bool has_compress_level,
> s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT] =
> x_cpu_throttle_increment;
> }
> + if (has_multifd_threads) {
> + s->parameters[MIGRATION_PARAMETER_X_MULTIFD_THREADS] = multifd_threads;
> + }
> }
>
> void qmp_migrate_start_postcopy(Error **errp)
> @@ -1196,6 +1215,15 @@ bool migrate_multifd(void)
> return s->enabled_capabilities[MIGRATION_CAPABILITY_X_MULTIFD];
> }
>
> +int migrate_multifd_threads(void)
> +{
> + MigrationState *s;
> +
> + s = migrate_get_current();
> +
> + return s->parameters[MIGRATION_PARAMETER_X_MULTIFD_THREADS];
> +}
> +
> int migrate_use_xbzrle(void)
> {
> MigrationState *s;
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 9fdf902..6ff9ac6 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -613,11 +613,16 @@
> # @x-cpu-throttle-increment: throttle percentage increase each time
> # auto-converge detects that migration is not making
> # progress. The default value is 10. (Since 2.5)
> +#
> +# @x-multifd-threads: Number of threads used to migrate data in parallel
> +# The default value is 2. (Since 2.6)
> +#
> # Since: 2.4
> ##
> { 'enum': 'MigrationParameter',
> 'data': ['compress-level', 'compress-threads', 'decompress-threads',
> - 'x-cpu-throttle-initial', 'x-cpu-throttle-increment'] }
> + 'x-cpu-throttle-initial', 'x-cpu-throttle-increment',
> + 'x-multifd-threads'] }
>
> #
> # @migrate-set-parameters
> @@ -637,6 +642,10 @@
> # @x-cpu-throttle-increment: throttle percentage increase each time
> # auto-converge detects that migration is not making
> # progress. The default value is 10. (Since 2.5)
> +#
> +# @x-multifd-threads: Number of threads used to migrate data in parallel
> +# The default value is 2. (Since 2.6)
> +#
> # Since: 2.4
> ##
> { 'command': 'migrate-set-parameters',
> @@ -644,7 +653,8 @@
> '*compress-threads': 'int',
> '*decompress-threads': 'int',
> '*x-cpu-throttle-initial': 'int',
> - '*x-cpu-throttle-increment': 'int'} }
> + '*x-cpu-throttle-increment': 'int',
> + '*x-multifd-threads': 'int'} }
>
> #
> # @MigrationParameters
> @@ -663,6 +673,8 @@
> # auto-converge detects that migration is not making
> # progress. The default value is 10. (Since 2.5)
> #
> +# @x-multifd-threads: Number of threads used to migrate data in parallel
> +# The default value is 2. (Since 2.6)
> # Since: 2.4
> ##
> { 'struct': 'MigrationParameters',
> @@ -670,7 +682,8 @@
> 'compress-threads': 'int',
> 'decompress-threads': 'int',
> 'x-cpu-throttle-initial': 'int',
> - 'x-cpu-throttle-increment': 'int'} }
> + 'x-cpu-throttle-increment': 'int',
> + 'x-multifd-threads': 'int'} }
> ##
> # @query-migrate-parameters
> #
> --
> 2.5.5
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH 13/13] migration: [HACK]Transfer pages over new channels
2016-04-20 14:44 ` [Qemu-devel] [PATCH 13/13] migration: [HACK]Transfer pages over new channels Juan Quintela
@ 2016-04-22 12:09 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-04-22 12:09 UTC (permalink / raw)
To: Juan Quintela; +Cc: qemu-devel, amit.shah
* Juan Quintela (quintela@redhat.com) wrote:
> We switch from sending the page number to sending real pages.
>
> [HACK]
> How we calculate the bandwidth is beyond repair; there is a hack there
> that works for x86 and archs that have 4KB pages.
>
> If you are having a nice day just go to migration/ram.c and look at
> acct_update_position(). Now you are depressed, right?
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
> migration/migration.c | 15 +++++++++++----
> migration/ram.c | 42 +++++++++++++++++++++++++++++-------------
> 2 files changed, 40 insertions(+), 17 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index efdd981..1db6e52 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1665,7 +1665,8 @@ static void *migration_thread(void *opaque)
> /* Used by the bandwidth calcs, updated later */
> int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> - int64_t initial_bytes = 0;
> + int64_t qemu_file_bytes = 0;
> + int64_t multifd_pages = 0;
> int64_t max_size = 0;
> int64_t start_time = initial_time;
> int64_t end_time;
> @@ -1748,9 +1749,14 @@ static void *migration_thread(void *opaque)
> }
> current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> if (current_time >= initial_time + BUFFER_DELAY) {
> - uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
> - initial_bytes;
> uint64_t time_spent = current_time - initial_time;
> + uint64_t qemu_file_bytes_now = qemu_ftell(s->to_dst_file);
> + uint64_t multifd_pages_now = multifd_mig_pages_transferred();
> + /* Hack ahead. Why the hell don't we have a function to know the
> + target_page_size? Hard-coding it to 4096. */
> + uint64_t transferred_bytes =
> + (qemu_file_bytes_now - qemu_file_bytes) +
> + (multifd_pages_now - multifd_pages) * 4096;
We do; I added qemu_target_page_bits in the postcopy series; so add
1ul << qemu_target_page_bits()
(I added bits so that you can get the page size easily; adding just
the page size can't go the other way.)
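i.e. the hunk above would then read something like:

    uint64_t transferred_bytes =
        (qemu_file_bytes_now - qemu_file_bytes) +
        (multifd_pages_now - multifd_pages) *
            (1ul << qemu_target_page_bits());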
> double bandwidth = (double)transferred_bytes / time_spent;
> max_size = bandwidth * migrate_max_downtime() / 1000000;
>
> @@ -1767,7 +1773,8 @@ static void *migration_thread(void *opaque)
>
> qemu_file_reset_rate_limit(s->to_dst_file);
> initial_time = current_time;
> - initial_bytes = qemu_ftell(s->to_dst_file);
> + qemu_file_bytes = qemu_file_bytes_now;
> + multifd_pages = multifd_pages_now;
> }
> if (qemu_file_rate_limit(s->to_dst_file)) {
> /* usleep expects microseconds */
> diff --git a/migration/ram.c b/migration/ram.c
> index b1b69cb..1d9ecb9 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -430,8 +430,8 @@ static void *multifd_send_thread(void *opaque)
> params->address = 0;
> qemu_mutex_unlock(&params->mutex);
>
> - if (write(params->s, &address, sizeof(uint8_t *))
> - != sizeof(uint8_t*)) {
> + if (write(params->s, address, TARGET_PAGE_SIZE)
> + != TARGET_PAGE_SIZE) {
> /* Shouldn't ever happen */
> exit(-1);
> }
> @@ -537,6 +537,23 @@ static int multifd_send_page(uint8_t *address)
> return i;
> }
>
> +static void flush_multifd_send_data(QEMUFile *f)
> +{
> + int i, thread_count;
> +
> + if (!migrate_multifd()) {
> + return;
> + }
> + qemu_fflush(f);
> + thread_count = migrate_multifd_threads();
> + qemu_mutex_lock(&multifd_send_mutex);
> + for (i = 0; i < thread_count; i++) {
> + while (!multifd_send[i].done) {
> + qemu_cond_wait(&multifd_send_cond, &multifd_send_mutex);
> + }
> + }
> + qemu_mutex_unlock(&multifd_send_mutex);
> +}
> +
> struct MultiFDRecvParams {
> QemuThread thread;
> QemuCond cond;
> @@ -559,7 +576,6 @@ static void *multifd_recv_thread(void *opaque)
> {
> MultiFDRecvParams *params = opaque;
> uint8_t *address;
> - uint8_t *recv_address;
>
> qemu_mutex_lock(&params->mutex);
> while (!params->quit){
> @@ -568,18 +584,12 @@ static void *multifd_recv_thread(void *opaque)
> params->address = 0;
> qemu_mutex_unlock(&params->mutex);
>
> - if (read(params->s, &recv_address, sizeof(uint8_t*))
> - != sizeof(uint8_t *)) {
> + if (read(params->s, address, TARGET_PAGE_SIZE)
> + != TARGET_PAGE_SIZE) {
> /* shouldn't ever happen */
> exit(-1);
> }
>
> - if (address != recv_address) {
> - printf("We received %p what we were expecting %p\n",
> - recv_address, address);
> - exit(-1);
> - }
> -
> qemu_mutex_lock(&multifd_recv_mutex);
> params->done = true;
> qemu_cond_signal(&multifd_recv_cond);
> @@ -1097,6 +1107,7 @@ static int ram_multifd_page(QEMUFile *f, PageSearchStatus *pss,
> uint8_t *p;
> RAMBlock *block = pss->block;
> ram_addr_t offset = pss->offset;
> + static int count = 32;
>
> p = block->host + offset;
>
> @@ -1108,9 +1119,14 @@ static int ram_multifd_page(QEMUFile *f, PageSearchStatus *pss,
> *bytes_transferred +=
> save_page_header(f, block, offset | RAM_SAVE_FLAG_MULTIFD_PAGE);
> fd_num = multifd_send_page(p);
> + count--;
> + if (!count) {
> + qemu_fflush(f);
> + count = 32;
> + }
> +
> qemu_put_be16(f, fd_num);
> *bytes_transferred += 2; /* size of fd_num */
> - qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
> *bytes_transferred += TARGET_PAGE_SIZE;
> pages = 1;
> acct_info.norm_pages++;
> @@ -2375,6 +2391,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> }
>
> flush_compressed_data(f);
> + flush_multifd_send_data(f);
> ram_control_after_iterate(f, RAM_CONTROL_FINISH);
>
> rcu_read_unlock();
> @@ -2850,7 +2867,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> case RAM_SAVE_FLAG_MULTIFD_PAGE:
> fd_num = qemu_get_be16(f);
> multifd_recv_page(host, fd_num);
> - qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
I think this breaks postcopy, because 'host' is often the same between multiple
calls around this loop.
Dave
> break;
>
> case RAM_SAVE_FLAG_EOS:
> --
> 2.5.5
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [RFC 00/13] Multiple fd migration support
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
` (13 preceding siblings ...)
2016-04-20 15:46 ` [Qemu-devel] [RFC 00/13] Multiple fd migration support Michael S. Tsirkin
@ 2016-04-22 12:26 ` Dr. David Alan Gilbert
2016-04-25 16:53 ` Juan Quintela
14 siblings, 1 reply; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-04-22 12:26 UTC (permalink / raw)
To: Juan Quintela; +Cc: qemu-devel, amit.shah
* Juan Quintela (quintela@redhat.com) wrote:
> Hi
>
> This patch series is "an" initial implementation of multiple fd migration.
> This is to get something out for others to comment, it is not finished at all.
I've had a quick skim:
a) I think mst is right about the risk of getting stale pages out of order.
b) Since you don't change the URI at all, it's a bit restricted; for example,
it means I can't run separate sessions over different NICs unless I've
done something clever with the routing, or bonded them.
One thing I liked the sound of multi-fd for is NUMA; get a BIG box
and give each numa node a separate NIC and run a separate thread on each
node.
c) Hmm we do still have a single thread doing all the bitmap syncing and scanning,
we'll have to watch out if that is the bottleneck at all.
d) All the zero testing is still done in the main thread which we know is
expensive.
e) Do we need to do something for security with having multiple ports? How
do we check that nothing snuck in on one of our extra ports? Have we
got sanity checks to make sure it's actually the right stream?
f) You're handing out pages to the sending threads on the basis of which one
is free (in the same way as the multi threaded compression); but I think
it needs some sanity adding to only hand out whole host pages - it feels
like receiving all the chunks of one host page down separate FDs would
be horrible.
g) I think you might be able to combine the compression into the same threads;
so that if multi-fd + multi-threaded-compresison is set you don't end
up with 2 sets of threads and it might be the simplest way to make them
work together.
h) You've used the last free RAM_SAVE_FLAG! And the person who takes the last
slice^Wbit has to get some more.
Since arm, ppc, and 68k have variants that have TARGET_PAGE_BITS 10 that
means we're full; I suggest what you do is use that flag to mean that we
send another 64bit word; and in that word you use the bottom 7 bits for
the fd index and bit 7 is set to indicate it's fd. The other bits are sent
as zero and available for the next use.
Either that or start combining with some other flags.
(I may have a use for some more bits in mind!) A sketch of this
encoding follows after this list.
i) Is this safe for xbzrle - what happens to the cache (or is it all
still the main thread?)
j) For postcopy I could do with a separate fd for the requested pages
(but again that comes back to needing an easy solution to the ordering)
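Here's the sketch of the encoding I mean in (h) (untested; the
RAM_SAVE_FLAG_EXT name and the masks are invented):

    /* When RAM_SAVE_FLAG_EXT is set in the page header, one extra
     * be64 word follows.  Bits 0-6 carry the fd index, bit 7 says
     * the fd index is valid; everything else stays zero and is
     * free for the next use. */
    #define EXT_FD_MASK   0x7fULL   /* bits 0-6: fd index       */
    #define EXT_FD_VALID  0x80ULL   /* bit 7: fd index is valid */

    /* sender */
    qemu_put_be64(f, EXT_FD_VALID | (fd_num & EXT_FD_MASK));

    /* receiver */
    uint64_t ext = qemu_get_be64(f);
    if (ext & EXT_FD_VALID) {
        fd_num = ext & EXT_FD_MASK;
    }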
Dave
>
> So far:
>
> - we create threads for each new fd
>
> - only for tcp of course, rest of transports are out of luck
> I need to integrate this with daniel channel changes
>
> - I *think* the locking is right, at least I don't get more random
> lookups (and yes, it was not trivial). And yes, I think that the
> compression code locking is not completely correct. I think it
> would be much, much better to do the compression code on top of this
> (will avoid a lot of copies), but I need to finish this first.
>
> - Last patch, I add a BIG hack to try to know what the real bandwidth
> is.
>
>
> Preleminar testing so far:
>
> - quite good, the latency is much better, but was change so far, I
> think I found the problem for the random high latencies, but more
> testing is needed.
>
> - under load, I think our bandwidth calculations are *not* completely
> correct (This is the way to spell it to be allowed for a family audience).
>
>
> ToDo list:
> - bandwidth calculation: I am going to send another mail
> with my ToDo list for migration, see there.
>
> - stats: We need better stats, by thread, etc
>
> - sincronize less times with the worker threads.
> right now we syncronize for each page, there are two obvious optimizations
> * send a list of pages each time we wakeup an fd
> * if we have to sent a HUGE page, dont' do a single split, just sent the whole page
> in one send() and read things with a single recv() on destination.
> My understanding is that this would make Transparent Huge pages trivial.
> - measure things under bigger loads
>
> Comments, please?
>
> Later, Juan.
>
> Juan Quintela (13):
> migration: create Migration Incoming State at init time
> migration: Pass TCP args in an struct
> migration: [HACK] Don't create decompression threads if not enabled
> migration: Add multifd capability
> migration: Create x-multifd-threads parameter
> migration: create multifd migration threads
> migration: Start of multiple fd work
> migration: create ram_multifd_page
> migration: Create thread infrastructure for multifd send side
> migration: Send the fd number which we are going to use for this page
> migration: Create thread infrastructure for multifd recv side
> migration: Test new fd infrastructure
> migration: [HACK]Transfer pages over new channels
>
> hmp.c | 10 ++
> include/migration/migration.h | 13 ++
> migration/migration.c | 100 ++++++++----
> migration/ram.c | 350 +++++++++++++++++++++++++++++++++++++++++-
> migration/savevm.c | 3 +-
> migration/tcp.c | 76 ++++++++-
> qapi-schema.json | 29 +++-
> 7 files changed, 540 insertions(+), 41 deletions(-)
>
> --
> 2.5.5
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [RFC 00/13] Multiple fd migration support
2016-04-22 12:26 ` Dr. David Alan Gilbert
@ 2016-04-25 16:53 ` Juan Quintela
2016-04-26 12:38 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 21+ messages in thread
From: Juan Quintela @ 2016-04-25 16:53 UTC (permalink / raw)
To: Dr. David Alan Gilbert; +Cc: qemu-devel, amit.shah
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> Hi
>>
>> This patch series is "an" initial implementation of multiple fd migration.
>> This is to get something out for others to comment, it is not finished at all.
>
> I've had a quick skim:
> a) I think mst is right about the risk of getting stale pages out of order.
I have been thinking about this. We just need to send a "we have
finished this round" packet, and reception has to wait for all threads
to finish before continuing. It is easy and not expensive. We never
resend the same page during the same round.
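Roughly like this (invented names; neither the flag nor the recv
helper exists in the series yet):

    /* Source, once per bitmap-sync round */
    flush_multifd_send_data(f);            /* all workers are idle */
    qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_SYNC);

    /* Destination, when the flag arrives: drain every recv thread
     * before applying any page from the next round. */
    multifd_recv_sync_all();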
> b) Since you don't change the URI at all, it's a bit restricted; for example,
> it means I can't run separate sessions over different NICs unless I've
> done something clever at the routing/or bonded them.
> One thing I liked the sound of multi-fd for is NUMA; get a BIG box
> and give each numa node a separate NIC and run a separate thread on each
> node.
If we want this, the question is _how_ we want to configure it. This
was part of the reason to post the patch. It works only for tcp; I
didn't even try the others, just to see what people want.
> c) Hmm we do still have a single thread doing all the bitmap syncing and scanning,
> we'll have to watch out if that is the bottleneck at all.
Yeap. My idea here was to still maintain the bitmap scanning on the
main thread, but send work to the "worker threads" in batches, not in
single pages. But I haven't really profiled how long we spend there.
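Something of this shape is what I have in mind (invented names, just
to make the idea concrete):

    #define MULTIFD_PAGES_PER_BATCH 64     /* picked out of thin air */

    typedef struct {
        uint8_t *address[MULTIFD_PAGES_PER_BATCH];
        int num;                           /* pages in this batch */
    } MultiFDPageBatch;

    /* The main thread fills a batch and wakes the worker once; the
     * worker can then push all of it with a single writev(). */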
> d) All the zero testing is still done in the main thread which we know is
> expensive.
Not trivial if we don't want to send control information over the
"other" channels. One solution would be to split the main memory
between different "main" threads. No performance profiles yet.
> e) Do we need to do something for security with having multiple ports? How
> do we check that nothing snuck in on one of our extra ports, have we got
> sanity checks to make sure it's actually the right stream.
We only have a single port. We opened it several times. It shouldn't
require changes in either libvirt/firewall. (Famous last words)
> f) You're handing out pages to the sending threads on the basis of which one
> is free (in the same way as the multi threaded compression); but I think
> it needs some sanity adding to only hand out whole host pages - it feels
> like receiving all the chunks of one host page down separate FDs would
> be horrible.
A trivial optimization would be to send _whole_ huge pages in one go. I
wanted comments about what people wanted here. My idea was really to
add a multipage command, i.e. several pages in one go. That would
reduce synchronization a lot. I send to the 1st thread that becomes
free because ...... I don't know how long a specific transmission is
going to take. TCP for you :-(
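And whatever we end up batching, the worker's socket writes need a
full-write loop, because a plain write() can transfer less than asked
over TCP. Untested sketch (needs <unistd.h> and <errno.h>):

    static ssize_t write_all(int fd, const uint8_t *buf, size_t len)
    {
        size_t done = 0;

        while (done < len) {
            ssize_t r = write(fd, buf + done, len - done);
            if (r < 0 && errno == EINTR) {
                continue;        /* interrupted, just retry */
            }
            if (r <= 0) {
                return -1;       /* real error or closed socket */
            }
            done += r;
        }
        return done;
    }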
> g) I think you might be able to combine the compression into the same threads;
> so that if multi-fd + multi-threaded-compresison is set you don't end
> up with 2 sets of threads and it might be the simplest way to make them
> work together.
Yeap, I thought that. But I didn't want to merge them in a first
stage. It makes much more sense to _not_ send the compressed data
through the main channel. But that would be v2 (or 3, or 4 ...)
> h) You've used the last free RAM_SAVE_FLAG! And the person who takes the last
> slice^Wbit has to get some more.
> Since arm, ppc, and 68k have variants that have TARGET_PAGE_BITS 10 that
> means we're full; I suggest what you do is use that flag to mean that we
> send another 64bit word; and in that word you use the bottom 7 bits for
> the fd index and bit 7 is set to indicate it's fd. The other bits are sent
> as zero and available for the next use.
> Either that or start combining with some other flags.
> (I may have a use for some more bits in mind!)
OK. I can look at that.
> i) Is this safe for xbzrle - what happens to the cache (or is it all
> still the main thread?)
Nope. The only way to use xbzrle is:

if (zero(page)) {
    ...
} else if (xbzrle(page)) {
    ...
} else {
    multifd(page);
}
Otherwise we would have to make xbzrle multithreaded, or split memory
between fd's. The problem with splitting memory between fd's is that
we need to know where the hot spots are.
> j) For postcopy I could do with a separate fd for the requested pages
> (but again that comes back to needing an easy solution to the ordering)
The ordering is easy, as said. You can just use that command with each
postcopy requested page. Or something similar, no?
I think that just forgetting about those pages and, each time we
receive a requested page, first waiting for the main thread to finish
its pages should be enough, no?
> Dave
Thanks very much, Juan.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [RFC 00/13] Multiple fd migration support
2016-04-25 16:53 ` Juan Quintela
@ 2016-04-26 12:38 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 21+ messages in thread
From: Dr. David Alan Gilbert @ 2016-04-26 12:38 UTC (permalink / raw)
To: Juan Quintela; +Cc: qemu-devel, amit.shah
* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> Hi
> >>
> >> This patch series is "an" initial implementation of multiple fd migration.
> >> This is to get something out for others to comment, it is not finished at all.
> >
> > I've had a quick skim:
> > a) I think mst is right about the risk of getting stale pages out of order.
>
> I have been thinking about this. We just need to send a "we have
> finished this round" packet, and reception has to wait for all threads
> to finish before continuing. It is easy and not expensive. We never
> resend the same page during the same round.
Yes.
> > b) Since you don't change the URI at all, it's a bit restricted; for example,
> > it means I can't run separate sessions over different NICs unless I've
> > done something clever at the routing/or bonded them.
> > One thing I liked the sound of multi-fd for is NUMA; get a BIG box
> > and give each numa node a separate NIC and run a separate thread on each
> > node.
>
> If we want this, the question is _how_ we want to configure it. This
> was part of the reason to post the patch. It works only for tcp; I
> didn't even try the others, just to see what people want.
I was thinking this would work even for TCP; you'd just need a way to pass
different URIs (with address/port) for each connection.
> > c) Hmm we do still have a single thread doing all the bitmap syncing and scanning,
> > we'll have to watch out if that is the bottleneck at all.
>
> Yeap. My idea here was to still maintain the bitmap scanning on the
> main thread, but send work to the "worker threads" in batches, not in
> single pages. But I haven't really profiled how long we spend there.
Yeh, it would be interesting to see what this profile looked like; if we
suddenly found that main thread had spare cycles perhaps we could do some
more interesting types of scanning.
> > d) All the zero testing is still done in the main thread which we know is
> > expensive.
>
> Not trivial if we don't want to send control information over the
> "other" channels. One solution would be to split the main memory
> between different "main" threads. No performance profiles yet.
Yes, and it's tricky because the order is:
1) Send control information
2) Farm it out to an individual thread
It's too late for '2' to say 'it's zero'.
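One way out would be a tiny per-page header on each data fd, so the
worker itself can test and flag a zero page; a sketch with invented
names (not what the series does today):

    /* Prepended to every page sent on a data fd. */
    typedef struct {
        uint64_t offset;    /* page offset within the RAMBlock */
        uint32_t flags;     /* e.g. MULTIFD_PAGE_ZERO          */
    } MultiFDPageHdr;

    /* Worker thread, after being handed the page: */
    if (buffer_is_zero(address, TARGET_PAGE_SIZE)) {
        hdr.flags |= MULTIFD_PAGE_ZERO;  /* send the header only */
    }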
> > e) Do we need to do something for security with having multiple ports? How
> > do we check that nothing snuck in on one of our extra ports, have we got
> > sanity checks to make sure it's actually the right stream.
>
>
> We only have a single port. We opened it several times. It shouldn't
> require changes in either libvirt/firewall. (Famous last words)
True I guess.
>
> > f) You're handing out pages to the sending threads on the basis of which one
> > is free (in the same way as the multi threaded compression); but I think
> > it needs some sanity adding to only hand out whole host pages - it feels
> > like receiving all the chunks of one host page down separate FDs would
> > be horrible.
>
> A trivial optimization would be to send _whole_ huge pages in one go. I
> wanted comments about what people wanted here. My idea was really to
> add a multipage command, i.e. several pages in one go. That would
> reduce synchronization a lot. I send to the 1st thread that becomes
> free because ...... I don't know how long a specific transmission is
> going to take. TCP for you :-(
Sending huge pages would be very nice; the tricky thing is you don't want to send
a huge page unless it's all marked dirty.
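Something like this check would do it (a sketch; the helper is made
up and the real dirty bitmap lives in migration/ram.c):

    /* Only hand a whole host page to a worker if every target page
     * inside it is still dirty in the migration bitmap. */
    static bool host_page_all_dirty(unsigned long *bitmap,
                                    unsigned long first_tp,
                                    size_t tp_per_host_page)
    {
        size_t i;

        for (i = 0; i < tp_per_host_page; i++) {
            if (!test_bit(first_tp + i, bitmap)) {
                return false;
            }
        }
        return true;
    }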
> > g) I think you might be able to combine the compression into the same threads;
> > so that if multi-fd + multi-threaded-compresison is set you don't end
> > up with 2 sets of threads and it might be the simplest way to make them
> > work together.
>
> Yeap, I thought that. But I didn't want to merge them in a first
> stage. It makes much more sense to _not_ send the compressed data
> through the main channel. But that would be v2 (or 3, or 4 ...)
Right.
> > h) You've used the last free RAM_SAVE_FLAG! And the person who takes the last
> > slice^Wbit has to get some more.
> > Since arm, ppc, and 68k have variants that have TARGET_PAGE_BITS 10 that
> > means we're full; I suggest what you do is use that flag to mean that we
> > send another 64bit word; and in that word you use the bottom 7 bits for
> > the fd index and bit 7 is set to indicate it's fd. The other bits are sent
> > as zero and available for the next use.
> > Either that or start combining with some other flags.
> > (I may have a use for some more bits in mind!)
>
> OK. I can look at that.
>
> > i) Is this safe for xbzrle - what happens to the cache (or is it all
> > still the main thread?)
>
> Nope. The only way to use xbzrle is:
>
> if (zero(page)) {
>     ...
> } else if (xbzrle(page)) {
>     ...
> } else {
>     multifd(page);
> }
>
> Otherwise we would have to make xbzrle multithreaded, or split memory
> between fd's. The problem with splitting memory between fd's is that
> we need to know where the hot spots are.
OK, that makes sense. So does that mean that some pages can still get
sent via xbzrle?
> > j) For postcopy I could do with a separate fd for the requested pages
> > (but again that comes back to needing an easy solution to the ordering)
>
> The ordering is easy, as said. You can just use that command with each
> postcopy requested page. Or something similar, no?
>
> I think that just forgetting about those pages and, each time we
> receive a requested page, first waiting for the main thread to finish
> its pages should be enough, no?
Actually, I realised it's simpler; once we're in postcopy mode we never
send the same page again; so we never have any ordering problems as long
as we perform a sync across the fd's at postcopy entry.
Dave
>
> > Dave
>
> Thanks very much, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2016-04-26 12:38 UTC | newest]
Thread overview: 21+ messages
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 01/13] migration: create Migration Incoming State at init time Juan Quintela
2016-04-22 11:27 ` Dr. David Alan Gilbert
2016-04-20 14:44 ` [Qemu-devel] [PATCH 02/13] migration: Pass TCP args in an struct Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 03/13] migration: [HACK] Don't create decompression threads if not enabled Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 04/13] migration: Add multifd capability Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 05/13] migration: Create x-multifd-threads parameter Juan Quintela
2016-04-22 11:37 ` Dr. David Alan Gilbert
2016-04-20 14:44 ` [Qemu-devel] [PATCH 06/13] migration: create multifd migration threads Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 07/13] migration: Start of multiple fd work Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 08/13] migration: create ram_multifd_page Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 09/13] migration: Create thread infrastructure for multifd send side Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 10/13] migration: Send the fd number which we are going to use for this page Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 11/13] migration: Create thread infrastructure for multifd recv side Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 12/13] migration: Test new fd infrastructure Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 13/13] migration: [HACK]Transfer pages over new channels Juan Quintela
2016-04-22 12:09 ` Dr. David Alan Gilbert
2016-04-20 15:46 ` [Qemu-devel] [RFC 00/13] Multiple fd migration support Michael S. Tsirkin
2016-04-22 12:26 ` Dr. David Alan Gilbert
2016-04-25 16:53 ` Juan Quintela
2016-04-26 12:38 ` Dr. David Alan Gilbert