* [Qemu-devel] [PATCH v2 0/6] postcopy block time calculation + ppc32 build fix
[not found] <CGME20180322181739eucas1p2b5f31dc21663881bc244deaedec22712@eucas1p2.samsung.com>
@ 2018-03-22 18:17 ` Alexey Perevalov
[not found] ` <CGME20180322181740eucas1p2a7bde534f74bd8376fc521a4ad1bcfdb@eucas1p2.samsung.com>
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: Alexey Perevalov @ 2018-03-22 18:17 UTC (permalink / raw)
To: qemu-devel, dgilbert
Cc: Alexey Perevalov, v.kuramshin, ash.billore, f4bug, quintela,
peterx, lvivier
V1 -> V2:
 - dropped the __nocheck that accidentally appeared after the rebase
 - rebased the patch set on top of the latest pull request
This patch set includes the patches that were reverted by commit
ee86981bd because of a build problem on 32-bit powerpc/arm architectures.
It also includes the build fix
([PATCH v4] migration: change blocktime type to uint32_t), but that
patch was squashed into:
migration: add postcopy blocktime ctx into MigrationIncomingState
migration: calculate vCPU blocktime on dst side
migration: add postcopy total blocktime into query-migrate
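A minimal sketch of the 32-bit timestamp idea behind the uint32_t change,
modelled on get_low_time_offset() from patch 3. The name low_time_offset and
the explicit now_ms/start_ms parameters are assumptions made for illustration;
the code in the patch reads QEMU_CLOCK_REALTIME itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: millisecond offset from start_ms, narrowed to 32 bits.
 * Offsets below 1 are clamped to 1, so that 0 can keep meaning
 * "no fault recorded" in the per-vCPU arrays.
 */
static uint32_t low_time_offset(int64_t now_ms, int64_t start_ms)
{
    int64_t offset = now_ms - start_ms;
    return offset < 1 ? 1 : (uint32_t)(offset & UINT32_MAX);
}

/* Usage example: ordinary, zero, and negative offsets. */
static void low_time_offset_example(void)
{
    assert(low_time_offset(1250, 1000) == 250);
    assert(low_time_offset(1000, 1000) == 1);
    assert(low_time_offset(900, 1000) == 1);
}
```

A 32-bit millisecond counter wraps after roughly 49.7 days, which is
presumably acceptable for the duration of a single migration.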
based on
commit c6740fc88ecd8f5cf3cf3185ee112c3eea41caa2
"hw/rdma: Implementation of PVRDMA device"
Alexey Perevalov (6):
migration: introduce postcopy-blocktime capability
migration: add postcopy blocktime ctx into MigrationIncomingState
migration: calculate vCPU blocktime on dst side
migration: postcopy_blocktime documentation
migration: add blocktime calculation into migration-test
migration: add postcopy total blocktime into query-migrate
docs/devel/migration.rst | 14 +++
hmp.c | 15 +++
migration/migration.c | 51 ++++++++-
migration/migration.h | 13 +++
migration/postcopy-ram.c | 268 ++++++++++++++++++++++++++++++++++++++++++++++-
migration/trace-events | 6 +-
qapi/migration.json | 17 ++-
tests/migration-test.c | 16 +++
8 files changed, 392 insertions(+), 8 deletions(-)
--
2.7.4
* [Qemu-devel] [PATCH v2 1/6] migration: introduce postcopy-blocktime capability
[not found] ` <CGME20180322181740eucas1p2a7bde534f74bd8376fc521a4ad1bcfdb@eucas1p2.samsung.com>
@ 2018-03-22 18:17 ` Alexey Perevalov
0 siblings, 0 replies; 9+ messages in thread
From: Alexey Perevalov @ 2018-03-22 18:17 UTC (permalink / raw)
To: qemu-devel, dgilbert
Cc: Alexey Perevalov, v.kuramshin, ash.billore, f4bug, quintela,
peterx, lvivier
For now it can be used on the destination side to
enable vCPU blocktime calculation for postcopy live migration.
vCPU blocktime is the time from when a vCPU thread is put into
interruptible sleep until the memory page has been copied and the
thread is woken up.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/migration.c | 9 +++++++++
migration/migration.h | 1 +
qapi/migration.json | 6 +++++-
3 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index fc629e5..f95a7f3 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1540,6 +1540,15 @@ bool migrate_zero_blocks(void)
return s->enabled_capabilities[MIGRATION_CAPABILITY_ZERO_BLOCKS];
}
+bool migrate_postcopy_blocktime(void)
+{
+ MigrationState *s;
+
+ s = migrate_get_current();
+
+ return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_BLOCKTIME];
+}
+
bool migrate_use_compression(void)
{
MigrationState *s;
diff --git a/migration/migration.h b/migration/migration.h
index 8d2f320..46a50bc 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -230,6 +230,7 @@ int migrate_compress_level(void);
int migrate_compress_threads(void);
int migrate_decompress_threads(void);
bool migrate_use_events(void);
+bool migrate_postcopy_blocktime(void);
/* Sending on the return path - generic and then for each message type */
void migrate_send_rp_shut(MigrationIncomingState *mis,
diff --git a/qapi/migration.json b/qapi/migration.json
index 9d0bf82..24bfc19 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -354,16 +354,20 @@
#
# @x-multifd: Use more than one fd for migration (since 2.11)
#
+#
# @dirty-bitmaps: If enabled, QEMU will migrate named dirty bitmaps.
# (since 2.12)
#
+# @postcopy-blocktime: Calculate downtime for postcopy live migration
+# (since 2.13)
+#
# Since: 1.2
##
{ 'enum': 'MigrationCapability',
'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
'block', 'return-path', 'pause-before-switchover', 'x-multifd',
- 'dirty-bitmaps' ] }
+ 'dirty-bitmaps', 'postcopy-blocktime' ] }
##
# @MigrationCapabilityStatus:
--
2.7.4
* [Qemu-devel] [PATCH v2 2/6] migration: add postcopy blocktime ctx into MigrationIncomingState
[not found] ` <CGME20180322181740eucas1p13028d8418e837695a91ded89385f8978@eucas1p1.samsung.com>
@ 2018-03-22 18:17 ` Alexey Perevalov
0 siblings, 0 replies; 9+ messages in thread
From: Alexey Perevalov @ 2018-03-22 18:17 UTC (permalink / raw)
To: qemu-devel, dgilbert
Cc: Alexey Perevalov, v.kuramshin, ash.billore, f4bug, quintela,
peterx, lvivier
This patch adds a request to the kernel for UFFD_FEATURE_THREAD_ID, if
the kernel provides this feature.
PostcopyBlocktimeContext is encapsulated inside postcopy-ram.c,
because it is a postcopy-only feature.
The patch also defines the lifetime of the PostcopyBlocktimeContext
instance. Information from the instance may be requested long after
postcopy migration has ended, so the instance lives until QEMU exits,
but the parts of it used only during the calculation (vcpu_addr,
page_fault_vcpu_time) will be released when postcopy ends or fails.
To enable postcopy blocktime calculation on the destination, the
corresponding capability must be requested (the documentation patch is
at the tail of the patch set).
For example, the following command enables that capability, assuming
QEMU was started with the
-chardev socket,id=charmonitor,path=/var/lib/migrate-vm-monitor.sock
option to control it:
[root@host]#printf "{\"execute\" : \"qmp_capabilities\"}\r\n \
{\"execute\": \"migrate-set-capabilities\" , \"arguments\": {
\"capabilities\": [ { \"capability\": \"postcopy-blocktime\", \"state\":
true } ] } }" | nc -U /var/lib/migrate-vm-monitor.sock
Or simply with HMP:
(qemu) migrate_set_capability postcopy-blocktime on
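The QMP request shown above can also be assembled programmatically. A minimal
sketch, assuming a hypothetical helper format_set_capability() that is not
part of QEMU; it only builds the JSON text and leaves sending it to the
monitor socket (after the qmp_capabilities handshake) to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write a migrate-set-capabilities request for one
 * capability into buf. Returns the snprintf() result, i.e. the length the
 * full message needs; a value >= len means the output was truncated.
 */
static int format_set_capability(char *buf, size_t len,
                                 const char *capability, int enabled)
{
    return snprintf(buf, len,
                    "{\"execute\": \"migrate-set-capabilities\", "
                    "\"arguments\": {\"capabilities\": "
                    "[{\"capability\": \"%s\", \"state\": %s}]}}",
                    capability, enabled ? "true" : "false");
}

/* Usage example: build the request that enables postcopy-blocktime. */
static void format_set_capability_example(void)
{
    char msg[256];
    int n = format_set_capability(msg, sizeof(msg), "postcopy-blocktime", 1);

    assert(n > 0 && (size_t)n < sizeof(msg));
    assert(strstr(msg, "\"capability\": \"postcopy-blocktime\"") != NULL);
    assert(strstr(msg, "\"state\": true") != NULL);
}
```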
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/migration.h | 8 +++++++
migration/postcopy-ram.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 69 insertions(+)
diff --git a/migration/migration.h b/migration/migration.h
index 46a50bc..6d9aaeb 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -22,6 +22,8 @@
#include "hw/qdev.h"
#include "io/channel.h"
+struct PostcopyBlocktimeContext;
+
/* State for the incoming migration */
struct MigrationIncomingState {
QEMUFile *from_src_file;
@@ -65,6 +67,12 @@ struct MigrationIncomingState {
/* The coroutine we should enter (back) after failover */
Coroutine *migration_incoming_co;
QemuSemaphore colo_incoming_sem;
+
+ /*
+ * PostcopyBlocktimeContext to keep information for postcopy
+ * live migration, to calculate vCPU block time
+ * */
+ struct PostcopyBlocktimeContext *blocktime_ctx;
};
MigrationIncomingState *migration_incoming_get_current(void);
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index efd7793..66f1df9 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -90,6 +90,54 @@ int postcopy_notify(enum PostcopyNotifyReason reason, Error **errp)
#include <sys/eventfd.h>
#include <linux/userfaultfd.h>
+typedef struct PostcopyBlocktimeContext {
+ /* time when page fault initiated per vCPU */
+ uint32_t *page_fault_vcpu_time;
+ /* page address per vCPU */
+ uintptr_t *vcpu_addr;
+ uint32_t total_blocktime;
+ /* blocktime per vCPU */
+ uint32_t *vcpu_blocktime;
+ /* point in time when last page fault was initiated */
+ uint32_t last_begin;
+ /* number of vCPU are suspended */
+ int smp_cpus_down;
+ uint64_t start_time;
+
+ /*
+ * Handler for exit event, necessary for
+ * releasing whole blocktime_ctx
+ */
+ Notifier exit_notifier;
+} PostcopyBlocktimeContext;
+
+static void destroy_blocktime_context(struct PostcopyBlocktimeContext *ctx)
+{
+ g_free(ctx->page_fault_vcpu_time);
+ g_free(ctx->vcpu_addr);
+ g_free(ctx->vcpu_blocktime);
+ g_free(ctx);
+}
+
+static void migration_exit_cb(Notifier *n, void *data)
+{
+ PostcopyBlocktimeContext *ctx = container_of(n, PostcopyBlocktimeContext,
+ exit_notifier);
+ destroy_blocktime_context(ctx);
+}
+
+static struct PostcopyBlocktimeContext *blocktime_context_new(void)
+{
+ PostcopyBlocktimeContext *ctx = g_new0(PostcopyBlocktimeContext, 1);
+ ctx->page_fault_vcpu_time = g_new0(uint32_t, smp_cpus);
+ ctx->vcpu_addr = g_new0(uintptr_t, smp_cpus);
+ ctx->vcpu_blocktime = g_new0(uint32_t, smp_cpus);
+
+ ctx->exit_notifier.notify = migration_exit_cb;
+ ctx->start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+ qemu_add_exit_notifier(&ctx->exit_notifier);
+ return ctx;
+}
/**
* receive_ufd_features: check userfault fd features, to request only supported
@@ -182,6 +230,19 @@ static bool ufd_check_and_apply(int ufd, MigrationIncomingState *mis)
}
}
+#ifdef UFFD_FEATURE_THREAD_ID
+ if (migrate_postcopy_blocktime() && mis &&
+ UFFD_FEATURE_THREAD_ID & supported_features) {
+ /* kernel supports that feature */
+ /* don't create blocktime_context if it exists */
+ if (!mis->blocktime_ctx) {
+ mis->blocktime_ctx = blocktime_context_new();
+ }
+
+ asked_features |= UFFD_FEATURE_THREAD_ID;
+ }
+#endif
+
/*
* request features, even if asked_features is 0, due to
* kernel expects UFFD_API before UFFDIO_REGISTER, per
--
2.7.4
* [Qemu-devel] [PATCH v2 3/6] migration: calculate vCPU blocktime on dst side
[not found] ` <CGME20180322181741eucas1p1d80d609a6d0aca10e459563b715b37d6@eucas1p1.samsung.com>
@ 2018-03-22 18:17 ` Alexey Perevalov
0 siblings, 0 replies; 9+ messages in thread
From: Alexey Perevalov @ 2018-03-22 18:17 UTC (permalink / raw)
To: qemu-devel, dgilbert
Cc: Alexey Perevalov, v.kuramshin, ash.billore, f4bug, quintela,
peterx, lvivier
This patch provides blocktime calculation per vCPU,
as a summary and as an overlapped value for all vCPUs.
This approach was suggested by Peter Xu as an improvement on the
previous approach, where QEMU kept a tree with the faulted page address
and a bitmask of CPUs in it. Now QEMU keeps an array with the faulted
page address as the value and the vCPU as the index. This helps to find
the proper vCPU at UFFDIO_COPY time. It also keeps a list of blocktime
per vCPU (which can be traced with page_fault_addr).
Blocktime will not be calculated if the blocktime_ctx field of
MigrationIncomingState wasn't initialized.
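The bookkeeping described above can be sketched as a small single-threaded
model. This is an illustration under stated assumptions, not the code of the
patch: the names BlocktimeModel, model_begin and model_end are hypothetical,
timestamps are passed in explicitly, the vCPU count is fixed at 3, and the
atomics and the "page already received" check of the real functions are
left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_CPUS 3 /* hypothetical vCPU count for the sketch */

typedef struct {
    uint32_t fault_time[MODEL_CPUS];     /* when each vCPU faulted */
    uintptr_t fault_addr[MODEL_CPUS];    /* page it waits for, 0 = none */
    uint32_t vcpu_blocktime[MODEL_CPUS]; /* accumulated time per vCPU */
    uint32_t total_blocktime;            /* time when all vCPUs were blocked */
    uint32_t last_begin;                 /* time of the most recent fault */
    int cpus_down;                       /* number of blocked vCPUs */
} BlocktimeModel;

/* A vCPU faults on addr at time now: record it in the per-vCPU arrays. */
static void model_begin(BlocktimeModel *m, int cpu, uintptr_t addr,
                        uint32_t now)
{
    assert(cpu >= 0 && cpu < MODEL_CPUS && addr != 0 && now != 0);
    if (m->fault_addr[cpu] == 0) {
        m->cpus_down++;
    }
    m->last_begin = now;
    m->fault_time[cpu] = now;
    m->fault_addr[cpu] = addr;
}

/* The page at addr arrives at time now: release every vCPU waiting on it. */
static void model_end(BlocktimeModel *m, uintptr_t addr, uint32_t now)
{
    int affected = 0;
    int all_blocked = 0;

    for (int i = 0; i < MODEL_CPUS; i++) {
        if (m->fault_addr[i] != addr || m->fault_time[i] == 0) {
            continue;
        }
        m->fault_addr[i] = 0;
        m->vcpu_blocktime[i] += now - m->fault_time[i];
        affected++;
        if (m->cpus_down == MODEL_CPUS) {
            all_blocked = 1; /* every vCPU was blocked until this moment */
        }
    }
    m->cpus_down -= affected;
    if (all_blocked) {
        m->total_blocktime += now - m->last_begin;
    }
}
```

With faults at t=10, 20 and 30 on three different pages and the pages
arriving at t=40, 35 and 50 respectively, the per-vCPU values are 30, 15 and
20, while the overlapped value is only 5: the interval from the last fault
(t=30) to the first page arrival (t=35).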
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
migration/postcopy-ram.c | 151 ++++++++++++++++++++++++++++++++++++++++++++++-
migration/trace-events | 5 +-
2 files changed, 154 insertions(+), 2 deletions(-)
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 66f1df9..6b01884 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -636,6 +636,148 @@ int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb,
return 0;
}
+static int get_mem_fault_cpu_index(uint32_t pid)
+{
+ CPUState *cpu_iter;
+
+ CPU_FOREACH(cpu_iter) {
+ if (cpu_iter->thread_id == pid) {
+ trace_get_mem_fault_cpu_index(cpu_iter->cpu_index, pid);
+ return cpu_iter->cpu_index;
+ }
+ }
+ trace_get_mem_fault_cpu_index(-1, pid);
+ return -1;
+}
+
+static uint32_t get_low_time_offset(PostcopyBlocktimeContext *dc)
+{
+ int64_t start_time_offset = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) -
+ dc->start_time;
+ return start_time_offset < 1 ? 1 : start_time_offset & UINT32_MAX;
+}
+
+/*
+ * This function is being called when pagefault occurs. It
+ * tracks down vCPU blocking time.
+ *
+ * @addr: faulted host virtual address
+ * @ptid: faulted process thread id
+ * @rb: ramblock appropriate to addr
+ */
+static void mark_postcopy_blocktime_begin(uintptr_t addr, uint32_t ptid,
+ RAMBlock *rb)
+{
+ int cpu, already_received;
+ MigrationIncomingState *mis = migration_incoming_get_current();
+ PostcopyBlocktimeContext *dc = mis->blocktime_ctx;
+ uint32_t low_time_offset;
+
+ if (!dc || ptid == 0) {
+ return;
+ }
+ cpu = get_mem_fault_cpu_index(ptid);
+ if (cpu < 0) {
+ return;
+ }
+
+ low_time_offset = get_low_time_offset(dc);
+ if (dc->vcpu_addr[cpu] == 0) {
+ atomic_inc(&dc->smp_cpus_down);
+ }
+
+ atomic_xchg(&dc->last_begin, low_time_offset);
+ atomic_xchg(&dc->page_fault_vcpu_time[cpu], low_time_offset);
+ atomic_xchg(&dc->vcpu_addr[cpu], addr);
+
+ /* check it here, not at the beginning of the function,
+ * because the check could occur earlier than bitmap_set in
+ * qemu_ufd_copy_ioctl */
+ already_received = ramblock_recv_bitmap_test(rb, (void *)addr);
+ if (already_received) {
+ atomic_xchg(&dc->vcpu_addr[cpu], 0);
+ atomic_xchg(&dc->page_fault_vcpu_time[cpu], 0);
+ atomic_dec(&dc->smp_cpus_down);
+ }
+ trace_mark_postcopy_blocktime_begin(addr, dc, dc->page_fault_vcpu_time[cpu],
+ cpu, already_received);
+}
+
+/*
+ * This function just provide calculated blocktime per cpu and trace it.
+ * Total blocktime is calculated in mark_postcopy_blocktime_end.
+ *
+ *
+ * Assume we have 3 CPU
+ *
+ * S1 E1 S1 E1
+ * -----***********------------xxx***************------------------------> CPU1
+ *
+ * S2 E2
+ * ------------****************xxx---------------------------------------> CPU2
+ *
+ * S3 E3
+ * ------------------------****xxx********-------------------------------> CPU3
+ *
+ * We have sequence S1,S2,E1,S3,S1,E2,E3,E1
+ * S2,E1 - doesn't match condition due to sequence S1,S2,E1 doesn't include CPU3
+ * S3,S1,E2 - sequence includes all CPUs, in this case overlap will be S1,E2 -
+ * it's a part of total blocktime.
+ * S1 - here is last_begin
+ * Legend of the picture is following:
+ * * - means blocktime per vCPU
+ * x - means overlapped blocktime (total blocktime)
+ *
+ * @addr: host virtual address
+ */
+static void mark_postcopy_blocktime_end(uintptr_t addr)
+{
+ MigrationIncomingState *mis = migration_incoming_get_current();
+ PostcopyBlocktimeContext *dc = mis->blocktime_ctx;
+ int i, affected_cpu = 0;
+ bool vcpu_total_blocktime = false;
+ uint32_t read_vcpu_time, low_time_offset;
+
+ if (!dc) {
+ return;
+ }
+
+ low_time_offset = get_low_time_offset(dc);
+ /* lookup cpu, to clear it,
+ * that algorithm looks straightforward, but it's not
+ * optimal, more optimal algorithm is keeping tree or hash
+ * where key is address value is a list of */
+ for (i = 0; i < smp_cpus; i++) {
+ uint32_t vcpu_blocktime = 0;
+
+ read_vcpu_time = atomic_fetch_add(&dc->page_fault_vcpu_time[i], 0);
+ if (atomic_fetch_add(&dc->vcpu_addr[i], 0) != addr ||
+ read_vcpu_time == 0) {
+ continue;
+ }
+ atomic_xchg(&dc->vcpu_addr[i], 0);
+ vcpu_blocktime = low_time_offset - read_vcpu_time;
+ affected_cpu += 1;
+ /* we need to know whether mark_postcopy_end was due to a
+ * faulted page; the other possible case is a prefetched
+ * page, and in that case we shouldn't be here */
+ if (!vcpu_total_blocktime &&
+ atomic_fetch_add(&dc->smp_cpus_down, 0) == smp_cpus) {
+ vcpu_total_blocktime = true;
+ }
+ /* continue cycle, due to one page could affect several vCPUs */
+ dc->vcpu_blocktime[i] += vcpu_blocktime;
+ }
+
+ atomic_sub(&dc->smp_cpus_down, affected_cpu);
+ if (vcpu_total_blocktime) {
+ dc->total_blocktime += low_time_offset - atomic_fetch_add(
+ &dc->last_begin, 0);
+ }
+ trace_mark_postcopy_blocktime_end(addr, dc, dc->total_blocktime,
+ affected_cpu);
+}
+
/*
* Handle faults detected by the USERFAULT markings
*/
@@ -742,7 +884,12 @@ static void *postcopy_ram_fault_thread(void *opaque)
rb_offset &= ~(qemu_ram_pagesize(rb) - 1);
trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address,
qemu_ram_get_idstr(rb),
- rb_offset);
+ rb_offset,
+ msg.arg.pagefault.feat.ptid);
+ mark_postcopy_blocktime_begin(
+ (uintptr_t)(msg.arg.pagefault.address),
+ msg.arg.pagefault.feat.ptid, rb);
+
/*
* Send the request to the source - we want to request one
* of our host page sizes (which is >= TPS)
@@ -889,6 +1036,8 @@ static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr,
if (!ret) {
ramblock_recv_bitmap_set_range(rb, host_addr,
pagesize / qemu_target_page_size());
+ mark_postcopy_blocktime_end((uintptr_t)host_addr);
+
}
return ret;
}
diff --git a/migration/trace-events b/migration/trace-events
index a180d7b..368bc4b 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -115,6 +115,8 @@ process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
process_incoming_migration_co_postcopy_end_main(void) ""
migration_set_incoming_channel(void *ioc, const char *ioctype) "ioc=%p ioctype=%s"
migration_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname, void *err) "ioc=%p ioctype=%s hostname=%s err=%p"
+mark_postcopy_blocktime_begin(uint64_t addr, void *dd, uint32_t time, int cpu, int received) "addr: 0x%" PRIx64 ", dd: %p, time: %u, cpu: %d, already_received: %d"
+mark_postcopy_blocktime_end(uint64_t addr, void *dd, uint32_t time, int affected_cpu) "addr: 0x%" PRIx64 ", dd: %p, time: %u, affected_cpu: %d"
# migration/rdma.c
qemu_rdma_accept_incoming_migration(void) ""
@@ -193,7 +195,7 @@ postcopy_ram_fault_thread_exit(void) ""
postcopy_ram_fault_thread_fds_core(int baseufd, int quitfd) "ufd: %d quitfd: %d"
postcopy_ram_fault_thread_fds_extra(size_t index, const char *name, int fd) "%zd/%s: %d"
postcopy_ram_fault_thread_quit(void) ""
-postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset) "Request for HVA=0x%" PRIx64 " rb=%s offset=0x%zx"
+postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset, uint32_t pid) "Request for HVA=0x%" PRIx64 " rb=%s offset=0x%zx pid=%u"
postcopy_ram_incoming_cleanup_closeuf(void) ""
postcopy_ram_incoming_cleanup_entry(void) ""
postcopy_ram_incoming_cleanup_exit(void) ""
@@ -206,6 +208,7 @@ save_xbzrle_page_skipping(void) ""
save_xbzrle_page_overflow(void) ""
ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRIu64 " milliseconds, %d iterations"
ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRIu64
+get_mem_fault_cpu_index(int cpu, uint32_t pid) "cpu: %d, pid: %u"
# migration/exec.c
migration_exec_outgoing(const char *cmd) "cmd=%s"
--
2.7.4
* [Qemu-devel] [PATCH v2 4/6] migration: postcopy_blocktime documentation
[not found] ` <CGME20180322181742eucas1p13b9727b6985dab1ac280f6e07cae9ba6@eucas1p1.samsung.com>
@ 2018-03-22 18:17 ` Alexey Perevalov
0 siblings, 0 replies; 9+ messages in thread
From: Alexey Perevalov @ 2018-03-22 18:17 UTC (permalink / raw)
To: qemu-devel, dgilbert
Cc: Alexey Perevalov, v.kuramshin, ash.billore, f4bug, quintela,
peterx, lvivier
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
docs/devel/migration.rst | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
index e32b087..9342a8a 100644
--- a/docs/devel/migration.rst
+++ b/docs/devel/migration.rst
@@ -401,6 +401,20 @@ will now cause the transition from precopy to postcopy.
It can be issued immediately after migration is started or any
time later on. Issuing it after the end of a migration is harmless.
+Blocktime is a postcopy live migration metric, intended to show how
+long the vCPU was in a state of interruptible sleep due to a page fault.
+That metric is calculated both for all vCPUs as overlapped value, and
+separately for each vCPU. These values are calculated on destination
+side. To enable postcopy blocktime calculation, enter following
+command on destination monitor:
+
+``migrate_set_capability postcopy-blocktime on``
+
+Postcopy blocktime can be retrieved by query-migrate qmp command.
+postcopy-blocktime value of qmp command will show overlapped blocking
+time for all vCPU, postcopy-vcpu-blocktime will show list of blocking
+time per vCPU.
+
.. note::
During the postcopy phase, the bandwidth limits set using
``migrate_set_speed`` is ignored (to avoid delaying requested pages that
--
2.7.4
* [Qemu-devel] [PATCH v2 5/6] migration: add blocktime calculation into migration-test
[not found] ` <CGME20180322181743eucas1p21654973717f1bfb660fd226f75a206dc@eucas1p2.samsung.com>
@ 2018-03-22 18:17 ` Alexey Perevalov
0 siblings, 0 replies; 9+ messages in thread
From: Alexey Perevalov @ 2018-03-22 18:17 UTC (permalink / raw)
To: qemu-devel, dgilbert
Cc: Alexey Perevalov, v.kuramshin, ash.billore, f4bug, quintela,
peterx, lvivier
This patch just requests blocktime calculation
and checks it when the UFFD_FEATURE_THREAD_ID feature is available
on the host.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
tests/migration-test.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/tests/migration-test.c b/tests/migration-test.c
index 422bf1a..dde7c46 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -26,6 +26,7 @@
const unsigned start_address = 1024 * 1024;
const unsigned end_address = 100 * 1024 * 1024;
bool got_stop;
+static bool uffd_feature_thread_id;
#if defined(__linux__)
#include <sys/syscall.h>
@@ -55,6 +56,7 @@ static bool ufd_version_check(void)
g_test_message("Skipping test: UFFDIO_API failed");
return false;
}
+ uffd_feature_thread_id = api_struct.features & UFFD_FEATURE_THREAD_ID;
ioctl_mask = (__u64)1 << _UFFDIO_REGISTER |
(__u64)1 << _UFFDIO_UNREGISTER;
@@ -223,6 +225,16 @@ static uint64_t get_migration_pass(QTestState *who)
return result;
}
+static void read_blocktime(QTestState *who)
+{
+ QDict *rsp, *rsp_return;
+
+ rsp = wait_command(who, "{ 'execute': 'query-migrate' }");
+ rsp_return = qdict_get_qdict(rsp, "return");
+ g_assert(qdict_haskey(rsp_return, "postcopy-blocktime"));
+ QDECREF(rsp);
+}
+
static void wait_for_migration_complete(QTestState *who)
{
while (true) {
@@ -533,6 +545,7 @@ static void test_migrate(void)
migrate_set_capability(from, "postcopy-ram", "true");
migrate_set_capability(to, "postcopy-ram", "true");
+ migrate_set_capability(to, "postcopy-blocktime", "true");
/* We want to pick a speed slow enough that the test completes
* quickly, but that it doesn't complete precopy even on a slow
@@ -559,6 +572,9 @@ static void test_migrate(void)
wait_for_serial("dest_serial");
wait_for_migration_complete(from);
+ if (uffd_feature_thread_id) {
+ read_blocktime(to);
+ }
g_free(uri);
test_migrate_end(from, to, true);
--
2.7.4
* [Qemu-devel] [PATCH v2 6/6] migration: add postcopy total blocktime into query-migrate
[not found] ` <CGME20180322181744eucas1p1a738955967cd8a6cc0330980753b2bad@eucas1p1.samsung.com>
@ 2018-03-22 18:17 ` Alexey Perevalov
0 siblings, 0 replies; 9+ messages in thread
From: Alexey Perevalov @ 2018-03-22 18:17 UTC (permalink / raw)
To: qemu-devel, dgilbert
Cc: Alexey Perevalov, v.kuramshin, ash.billore, f4bug, quintela,
peterx, lvivier
Postcopy total blocktime is available on the destination side only,
but query-migrate was possible only on the source. This patch
adds the ability to call query-migrate on the destination.
To see postcopy blocktime, the postcopy-blocktime capability must be
requested.
The query-migrate command will show the following sample result:
{"return": {
"postcopy-vcpu-blocktime": [115, 100],
"status": "completed",
"postcopy-blocktime": 100
}}
postcopy-vcpu-blocktime contains a list, where the first item corresponds
to the first vCPU in QEMU.
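The list order comes from how get_vcpu_blocktime_list() in this patch builds
it: the array is walked backwards and each entry is prepended, so the head
ends up being vCPU 0. A standalone sketch of that technique, where U32Node,
list_from_array and list_free are hypothetical stand-ins for QEMU's
uint32List and its allocation helpers:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct U32Node {
    uint32_t value;
    struct U32Node *next;
} U32Node;

/* Free every node of the list. */
static void list_free(U32Node *head)
{
    while (head) {
        U32Node *next = head->next;
        free(head);
        head = next;
    }
}

/*
 * Build a list whose first node holds values[0]: iterate from the last
 * element to the first and prepend each one. Returns NULL for n == 0 or
 * on allocation failure.
 */
static U32Node *list_from_array(const uint32_t *values, int n)
{
    U32Node *head = NULL;

    for (int i = n - 1; i >= 0; i--) {
        U32Node *entry = calloc(1, sizeof(*entry));
        if (!entry) {
            list_free(head);
            return NULL;
        }
        entry->value = values[i];
        entry->next = head;
        head = entry;
    }
    return head;
}

/* Usage example with the per-vCPU values from the sample result. */
static void list_from_array_example(void)
{
    const uint32_t blocktime[] = { 115, 100 };
    U32Node *head = list_from_array(blocktime, 2);

    assert(head && head->value == 115);       /* vCPU 0 comes first */
    assert(head->next && head->next->value == 100);
    assert(head->next->next == NULL);
    list_free(head);
}
```

Prepending while iterating backwards avoids keeping a tail pointer and still
yields the entries in index order.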
This patch has a drawback: it combines the states of incoming and
outgoing migration. The outgoing migration state will overwrite the
incoming state. It looks better to separate query-migrate for incoming
and outgoing migration, or to add a parameter to indicate the type of
migration.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
hmp.c | 15 +++++++++++++
migration/migration.c | 42 ++++++++++++++++++++++++++++++++----
migration/migration.h | 4 ++++
migration/postcopy-ram.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++
migration/trace-events | 1 +
qapi/migration.json | 11 +++++++++-
6 files changed, 124 insertions(+), 5 deletions(-)
diff --git a/hmp.c b/hmp.c
index 679467d..6c51df5 100644
--- a/hmp.c
+++ b/hmp.c
@@ -274,6 +274,21 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
info->cpu_throttle_percentage);
}
+ if (info->has_postcopy_blocktime) {
+ monitor_printf(mon, "postcopy blocktime: %u\n",
+ info->postcopy_blocktime);
+ }
+
+ if (info->has_postcopy_vcpu_blocktime) {
+ Visitor *v;
+ char *str;
+ v = string_output_visitor_new(false, &str);
+ visit_type_uint32List(v, NULL, &info->postcopy_vcpu_blocktime, NULL);
+ visit_complete(v, &str);
+ monitor_printf(mon, "postcopy vcpu blocktime: %s\n", str);
+ g_free(str);
+ visit_free(v);
+ }
qapi_free_MigrationInfo(info);
qapi_free_MigrationCapabilityStatusList(caps);
}
diff --git a/migration/migration.c b/migration/migration.c
index f95a7f3..71b0f19 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -630,14 +630,15 @@ static void populate_disk_info(MigrationInfo *info)
}
}
-MigrationInfo *qmp_query_migrate(Error **errp)
+static void fill_source_migration_info(MigrationInfo *info)
{
- MigrationInfo *info = g_malloc0(sizeof(*info));
MigrationState *s = migrate_get_current();
switch (s->state) {
case MIGRATION_STATUS_NONE:
/* no migration has happened ever */
+ /* do not overwrite destination migration status */
+ return;
break;
case MIGRATION_STATUS_SETUP:
info->has_status = true;
@@ -688,8 +689,6 @@ MigrationInfo *qmp_query_migrate(Error **errp)
break;
}
info->status = s->state;
-
- return info;
}
/**
@@ -753,6 +752,41 @@ static bool migrate_caps_check(bool *cap_list,
return true;
}
+static void fill_destination_migration_info(MigrationInfo *info)
+{
+ MigrationIncomingState *mis = migration_incoming_get_current();
+
+ switch (mis->state) {
+ case MIGRATION_STATUS_NONE:
+ return;
+ break;
+ case MIGRATION_STATUS_SETUP:
+ case MIGRATION_STATUS_CANCELLING:
+ case MIGRATION_STATUS_CANCELLED:
+ case MIGRATION_STATUS_ACTIVE:
+ case MIGRATION_STATUS_POSTCOPY_ACTIVE:
+ case MIGRATION_STATUS_FAILED:
+ case MIGRATION_STATUS_COLO:
+ info->has_status = true;
+ break;
+ case MIGRATION_STATUS_COMPLETED:
+ info->has_status = true;
+ fill_destination_postcopy_migration_info(info);
+ break;
+ }
+ info->status = mis->state;
+}
+
+MigrationInfo *qmp_query_migrate(Error **errp)
+{
+ MigrationInfo *info = g_malloc0(sizeof(*info));
+
+ fill_destination_migration_info(info);
+ fill_source_migration_info(info);
+
+ return info;
+}
+
void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
Error **errp)
{
diff --git a/migration/migration.h b/migration/migration.h
index 6d9aaeb..7c69598 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -77,6 +77,10 @@ struct MigrationIncomingState {
MigrationIncomingState *migration_incoming_get_current(void);
void migration_incoming_state_destroy(void);
+/*
+ * Functions to work with blocktime context
+ */
+void fill_destination_postcopy_migration_info(MigrationInfo *info);
#define TYPE_MIGRATION "migration"
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 6b01884..bbc1a95 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -139,6 +139,55 @@ static struct PostcopyBlocktimeContext *blocktime_context_new(void)
return ctx;
}
+static uint32List *get_vcpu_blocktime_list(PostcopyBlocktimeContext *ctx)
+{
+ uint32List *list = NULL, *entry = NULL;
+ int i;
+
+ for (i = smp_cpus - 1; i >= 0; i--) {
+ entry = g_new0(uint32List, 1);
+ entry->value = ctx->vcpu_blocktime[i];
+ entry->next = list;
+ list = entry;
+ }
+
+ return list;
+}
+
+/*
+ * This function just populates MigrationInfo from postcopy's
+ * blocktime context. It will not populate MigrationInfo,
+ * unless postcopy-blocktime capability was set.
+ *
+ * @info: pointer to MigrationInfo to populate
+ */
+void fill_destination_postcopy_migration_info(MigrationInfo *info)
+{
+ MigrationIncomingState *mis = migration_incoming_get_current();
+ PostcopyBlocktimeContext *bc = mis->blocktime_ctx;
+
+ if (!bc) {
+ return;
+ }
+
+ info->has_postcopy_blocktime = true;
+ info->postcopy_blocktime = bc->total_blocktime;
+ info->has_postcopy_vcpu_blocktime = true;
+ info->postcopy_vcpu_blocktime = get_vcpu_blocktime_list(bc);
+}
+
+static uint32_t get_postcopy_total_blocktime(void)
+{
+ MigrationIncomingState *mis = migration_incoming_get_current();
+ PostcopyBlocktimeContext *bc = mis->blocktime_ctx;
+
+ if (!bc) {
+ return 0;
+ }
+
+ return bc->total_blocktime;
+}
+
/**
* receive_ufd_features: check userfault fd features, to request only supported
* features in the future.
@@ -512,6 +561,9 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
munmap(mis->postcopy_tmp_zero_page, mis->largest_page_size);
mis->postcopy_tmp_zero_page = NULL;
}
+ trace_postcopy_ram_incoming_cleanup_blocktime(
+ get_postcopy_total_blocktime());
+
trace_postcopy_ram_incoming_cleanup_exit();
return 0;
}
@@ -1156,6 +1208,10 @@ void *postcopy_get_tmp_page(MigrationIncomingState *mis)
#else
/* No target OS support, stubs just fail */
+void fill_destination_postcopy_migration_info(MigrationInfo *info)
+{
+}
+
bool postcopy_ram_supported_by_host(MigrationIncomingState *mis)
{
error_report("%s: No OS support", __func__);
diff --git a/migration/trace-events b/migration/trace-events
index 368bc4b..d6be74b 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -200,6 +200,7 @@ postcopy_ram_incoming_cleanup_closeuf(void) ""
postcopy_ram_incoming_cleanup_entry(void) ""
postcopy_ram_incoming_cleanup_exit(void) ""
postcopy_ram_incoming_cleanup_join(void) ""
+postcopy_ram_incoming_cleanup_blocktime(uint64_t total) "total blocktime %" PRIu64
postcopy_request_shared_page(const char *sharer, const char *rb, uint64_t rb_offset) "for %s in %s offset 0x%"PRIx64
postcopy_request_shared_page_present(const char *sharer, const char *rb, uint64_t rb_offset) "%s already %s offset 0x%"PRIx64
postcopy_wake_shared(uint64_t client_addr, const char *rb) "at 0x%"PRIx64" in %s"
diff --git a/qapi/migration.json b/qapi/migration.json
index 24bfc19..f3974c6 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -156,6 +156,13 @@
# @status is 'failed'. Clients should not attempt to parse the
# error strings. (Since 2.7)
#
+# @postcopy-blocktime: total time when all vCPU were blocked during postcopy
+# live migration (Since 2.13)
+#
+# @postcopy-vcpu-blocktime: list of the postcopy blocktime per vCPU (Since 2.13)
+#
+
+#
# Since: 0.14.0
##
{ 'struct': 'MigrationInfo',
@@ -167,7 +174,9 @@
'*downtime': 'int',
'*setup-time': 'int',
'*cpu-throttle-percentage': 'int',
- '*error-desc': 'str'} }
+ '*error-desc': 'str',
+ '*postcopy-blocktime' : 'uint32',
+ '*postcopy-vcpu-blocktime': ['uint32']} }
##
# @query-migrate:
--
2.7.4
* Re: [Qemu-devel] [PATCH v2 0/6] postcopy block time calculation + ppc32 build fix
2018-03-22 18:17 ` [Qemu-devel] [PATCH v2 0/6] postcopy block time calculation + ppc32 build fix Alexey Perevalov
` (5 preceding siblings ...)
[not found] ` <CGME20180322181744eucas1p1a738955967cd8a6cc0330980753b2bad@eucas1p1.samsung.com>
@ 2018-03-29 19:18 ` Dr. David Alan Gilbert
2018-04-25 16:58 ` Dr. David Alan Gilbert
6 siblings, 1 reply; 9+ messages in thread
From: Dr. David Alan Gilbert @ 2018-03-29 19:18 UTC (permalink / raw)
To: Alexey Perevalov
Cc: qemu-devel, v.kuramshin, ash.billore, f4bug, quintela, peterx,
lvivier
* Alexey Perevalov (a.perevalov@samsung.com) wrote:
> V1-V2
> accidentally appeared __nocheck after rebase
> this patch set also rebased after latest pull request
>
> This patch set includes patches which were reverted by commit
> ee86981bd, due to build problem on 32 powerpc/arm architecture.
> Also it includes patch to fix build
> ([PATCH v4] migration: change blocktime type to uint32_t), but that
> patch was merged into:
> migration: add postcopy blocktime ctx into MigrationIncomingState
> migration: calculate vCPU blocktime on dst side
> migration: add postcopy total blocktime into query-migrate
OK, let's get this in when 2.13 opens.
Thanks!
Dave
>
> based on
> commit c6740fc88ecd8f5cf3cf3185ee112c3eea41caa2
> "hw/rdma: Implementation of PVRDMA device"
>
> Alexey Perevalov (6):
> migration: introduce postcopy-blocktime capability
> migration: add postcopy blocktime ctx into MigrationIncomingState
> migration: calculate vCPU blocktime on dst side
> migration: postcopy_blocktime documentation
> migration: add blocktime calculation into migration-test
> migration: add postcopy total blocktime into query-migrate
>
> docs/devel/migration.rst | 14 +++
> hmp.c | 15 +++
> migration/migration.c | 51 ++++++++-
> migration/migration.h | 13 +++
> migration/postcopy-ram.c | 268 ++++++++++++++++++++++++++++++++++++++++++++++-
> migration/trace-events | 6 +-
> qapi/migration.json | 17 ++-
> tests/migration-test.c | 16 +++
> 8 files changed, 392 insertions(+), 8 deletions(-)
>
> --
> 2.7.4
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [PATCH v2 0/6] postcopy block time calculation + ppc32 build fix
2018-03-29 19:18 ` [Qemu-devel] [PATCH v2 0/6] postcopy block time calculation + ppc32 build fix Dr. David Alan Gilbert
@ 2018-04-25 16:58 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 9+ messages in thread
From: Dr. David Alan Gilbert @ 2018-04-25 16:58 UTC (permalink / raw)
To: Alexey Perevalov
Cc: qemu-devel, v.kuramshin, ash.billore, f4bug, quintela, peterx,
lvivier
* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> * Alexey Perevalov (a.perevalov@samsung.com) wrote:
> > V1-V2
> > accidentally appeared __nocheck after rebase
> > this patch set also rebased after latest pull request
> >
> > This patch set includes patches which were reverted by commit
> > ee86981bd, due to build problem on 32 powerpc/arm architecture.
> > Also it includes patch to fix build
> > ([PATCH v4] migration: change blocktime type to uint32_t), but that
> > patch was merged into:
> > migration: add postcopy blocktime ctx into MigrationIncomingState
> > migration: calculate vCPU blocktime on dst side
> > migration: add postcopy total blocktime into query-migrate
>
> OK, let's get this in when 2.13 opens.
>
> Thanks!
Queued.
> Dave
>
> >
> > based on
> > commit c6740fc88ecd8f5cf3cf3185ee112c3eea41caa2
> > "hw/rdma: Implementation of PVRDMA device"
> >
> > Alexey Perevalov (6):
> > migration: introduce postcopy-blocktime capability
> > migration: add postcopy blocktime ctx into MigrationIncomingState
> > migration: calculate vCPU blocktime on dst side
> > migration: postcopy_blocktime documentation
> > migration: add blocktime calculation into migration-test
> > migration: add postcopy total blocktime into query-migrate
> >
> > docs/devel/migration.rst | 14 +++
> > hmp.c | 15 +++
> > migration/migration.c | 51 ++++++++-
> > migration/migration.h | 13 +++
> > migration/postcopy-ram.c | 268 ++++++++++++++++++++++++++++++++++++++++++++++-
> > migration/trace-events | 6 +-
> > qapi/migration.json | 17 ++-
> > tests/migration-test.c | 16 +++
> > 8 files changed, 392 insertions(+), 8 deletions(-)
> >
> > --
> > 2.7.4
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK