From: Anthony Liguori <anthony@codemonkey.ws>
To: Chegu Vinod <chegu_vinod@hp.com>,
eblake@redhat.com, quintela@redhat.com, owasserm@redhat.com,
pbonzini@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC PATCH v5 3/3] Force auto-convegence of live migration
Date: Fri, 10 May 2013 08:07:51 -0500
Message-ID: <87y5bnc7a0.fsf@codemonkey.ws>
In-Reply-To: <1368128600-30721-4-git-send-email-chegu_vinod@hp.com>
Chegu Vinod <chegu_vinod@hp.com> writes:
> If a user chooses to turn on the auto-converge migration capability,
> these changes detect the lack of convergence and throttle down the
> guest, i.e. force the VCPUs out of the guest for some duration
> and let the migration thread catch up and help converge.
>
> Verified the convergence using the following :
> - SpecJbb2005 workload running on a 20VCPU/256G guest(~80% busy)
> - OLTP like workload running on a 80VCPU/512G guest (~80% busy)
>
> Sample results with SpecJbb2005 workload : (migrate speed set to 20Gb and
> migrate downtime set to 4seconds).
Would it make sense to separate out the "slow the VCPU down" part of
this?
That would give a management tool more flexibility to create policies
around slowing the VCPU down to encourage migration.
In fact, I wonder if we need anything in the migration path if we just
expose the "slow the VCPU down" bit as a feature.
Slowing the VCPU down is not quite the same as lowering the priority of
the VCPU thread, largely because of the BQL, so I recognize the need to
have something for this in QEMU.
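
As a rough illustration of what a standalone "slow the VCPU down" knob
could look like, here is a minimal sketch in C. Everything in it is
hypothetical: throttle_sleep_us() and its percentage parameter are not
existing QEMU interfaces. It only shows one way a management tool's
policy could map onto the run/sleep cycle that the quoted patch
hard-codes as a 50 ms g_usleep().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration, not a QEMU API).
 *
 * run_us: how long the VCPU is allowed to run in the guest per cycle.
 * pct:    share of wall-clock time the VCPU should be held out of the
 *         guest, in percent.
 *
 * Returns how long to hold the VCPU out of the guest after each run
 * slice so that sleep / (run + sleep) is approximately pct / 100.
 */
int64_t throttle_sleep_us(int pct, int64_t run_us)
{
    assert(run_us >= 0);

    if (pct <= 0) {
        return 0;               /* throttling disabled */
    }
    if (pct > 99) {
        pct = 99;               /* never stop the guest completely */
    }
    return run_us * pct / (100 - pct);
}
```

With run_us = 10000, a 50% setting yields a sleep equal to the run
time, and an 80% setting sleeps four times as long as it runs. A knob
of this shape would leave the decision of when to throttle, and by how
much, to whoever sets pct.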
Regards,
Anthony Liguori
>
> (qemu) info migrate
> capabilities: xbzrle: off auto-converge: off <----
> Migration status: active
> total time: 1487503 milliseconds
> expected downtime: 519 milliseconds
> transferred ram: 383749347 kbytes
> remaining ram: 2753372 kbytes
> total ram: 268444224 kbytes
> duplicate: 65461532 pages
> skipped: 64901568 pages
> normal: 95750218 pages
> normal bytes: 383000872 kbytes
> dirty pages rate: 67551 pages
>
> ---
>
> (qemu) info migrate
> capabilities: xbzrle: off auto-converge: on <----
> Migration status: completed
> total time: 241161 milliseconds
> downtime: 6373 milliseconds
> transferred ram: 28235307 kbytes
> remaining ram: 0 kbytes
> total ram: 268444224 kbytes
> duplicate: 64946416 pages
> skipped: 64903523 pages
> normal: 7044971 pages
> normal bytes: 28179884 kbytes
>
> Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
> ---
> arch_init.c | 68 +++++++++++++++++++++++++++++++++++++++++
> include/migration/migration.h | 4 ++
> migration.c | 1 +
> 3 files changed, 73 insertions(+), 0 deletions(-)
>
> diff --git a/arch_init.c b/arch_init.c
> index 49c5dc2..29788d6 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -49,6 +49,7 @@
> #include "trace.h"
> #include "exec/cpu-all.h"
> #include "hw/acpi/acpi.h"
> +#include "sysemu/cpus.h"
>
> #ifdef DEBUG_ARCH_INIT
> #define DPRINTF(fmt, ...) \
> @@ -104,6 +105,8 @@ int graphic_depth = 15;
> #endif
>
> const uint32_t arch_type = QEMU_ARCH;
> +static bool mig_throttle_on;
> +
>
> /***********************************************************/
> /* ram save/restore */
> @@ -378,8 +381,15 @@ static void migration_bitmap_sync(void)
> uint64_t num_dirty_pages_init = migration_dirty_pages;
> MigrationState *s = migrate_get_current();
> static int64_t start_time;
> + static int64_t bytes_xfer_prev;
> static int64_t num_dirty_pages_period;
> int64_t end_time;
> + int64_t bytes_xfer_now;
> + static int dirty_rate_high_cnt;
> +
> + if (!bytes_xfer_prev) {
> + bytes_xfer_prev = ram_bytes_transferred();
> + }
>
> if (!start_time) {
> start_time = qemu_get_clock_ms(rt_clock);
> @@ -404,6 +414,23 @@ static void migration_bitmap_sync(void)
>
> /* more than 1 second = 1000 millisecons */
> if (end_time > start_time + 1000) {
> + if (migrate_auto_converge()) {
> + /* The following detection logic can be refined later. For now:
> + Check to see if the dirtied bytes is 50% more than the approx.
> + amount of bytes that just got transferred since the last time we
> + were in this routine. If that happens N times (for now N==5)
> + we turn on the throttle down logic */
> + bytes_xfer_now = ram_bytes_transferred();
> + if (s->dirty_pages_rate &&
> + ((num_dirty_pages_period*TARGET_PAGE_SIZE) >
> + ((bytes_xfer_now - bytes_xfer_prev)/2))) {
> + if (dirty_rate_high_cnt++ > 5) {
> + DPRINTF("Unable to converge. Throttling down guest\n");
> + mig_throttle_on = true;
> + }
> + }
> + bytes_xfer_prev = bytes_xfer_now;
> + }
> s->dirty_pages_rate = num_dirty_pages_period * 1000
> / (end_time - start_time);
> s->dirty_bytes_rate = s->dirty_pages_rate * TARGET_PAGE_SIZE;
> @@ -496,6 +523,15 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
> return bytes_sent;
> }
>
> +bool throttling_needed(void)
> +{
> + if (!migrate_auto_converge()) {
> + return false;
> + }
> +
> + return mig_throttle_on;
> +}
> +
> static uint64_t bytes_transferred;
>
> static ram_addr_t ram_save_remaining(void)
> @@ -1098,3 +1134,35 @@ TargetInfo *qmp_query_target(Error **errp)
>
> return info;
> }
> +
> +static void mig_delay_vcpu(void)
> +{
> + qemu_mutex_unlock_iothread();
> + g_usleep(50*1000);
> + qemu_mutex_lock_iothread();
> +}
> +
> +/* Stub used for getting the vcpu out of VM and into qemu via
> + run_on_cpu()*/
> +static void mig_kick_cpu(void *opq)
> +{
> + mig_delay_vcpu();
> + return;
> +}
> +
> +/* To reduce the dirty rate explicitly disallow the VCPUs from spending
> + much time in the VM. The migration thread will try to catchup.
> + Workload will experience a performance drop.
> +*/
> +void migration_throttle_down(void)
> +{
> + if (throttling_needed()) {
> + CPUArchState *penv = first_cpu;
> + while (penv) {
> + qemu_mutex_lock_iothread();
> + async_run_on_cpu(ENV_GET_CPU(penv), mig_kick_cpu, NULL);
> + qemu_mutex_unlock_iothread();
> + penv = penv->next_cpu;
> + }
> + }
> +}
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index ace91b0..68b65c6 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -129,4 +129,8 @@ int64_t migrate_xbzrle_cache_size(void);
> int64_t xbzrle_cache_resize(int64_t new_size);
>
> bool migrate_auto_converge(void);
> +bool throttling_needed(void);
> +void stop_throttling(void);
> +void migration_throttle_down(void);
> +
> #endif
> diff --git a/migration.c b/migration.c
> index 570cee5..d3673a6 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -526,6 +526,7 @@ static void *migration_thread(void *opaque)
> DPRINTF("pending size %lu max %lu\n", pending_size, max_size);
> if (pending_size && pending_size >= max_size) {
> qemu_savevm_state_iterate(s->file);
> + migration_throttle_down();
> } else {
> DPRINTF("done iterating\n");
> qemu_mutex_lock_iothread();
> --
> 1.7.1
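
For reference, the detection heuristic in the quoted
migration_bitmap_sync() hunk can be pulled out into a standalone sketch
like the one below. The struct, the function name and the fixed 4096-byte
page size are assumptions made for illustration. The comparison and the
post-increment counter are copied from the patch, and the additional
s->dirty_pages_rate check is left out. Two things become visible when it
is isolated: the test fires when the bytes dirtied exceed half of the
bytes transferred in the period, and because of the post-increment
comparison against 5 the throttle is switched on at the seventh such
period, with the counter never reset in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096   /* assumption: stands in for TARGET_PAGE_SIZE */

/* Hypothetical container for the statics used in the quoted patch. */
struct converge_state {
    int64_t bytes_xfer_prev;    /* bytes transferred as of the last check */
    int dirty_rate_high_cnt;    /* periods in which dirtying outpaced transfer */
    bool throttle_on;           /* mirrors mig_throttle_on */
};

/*
 * Call once per sync period with the pages dirtied in that period and
 * the running total of bytes transferred. Returns the throttle state.
 */
bool converge_check(struct converge_state *st,
                    int64_t num_dirty_pages_period,
                    int64_t bytes_xfer_now)
{
    assert(st != NULL);

    int64_t dirtied = num_dirty_pages_period * SKETCH_PAGE_SIZE;
    int64_t sent = bytes_xfer_now - st->bytes_xfer_prev;

    /* Same comparison as the patch: dirtied bytes vs. half of sent bytes. */
    if (dirtied > sent / 2) {
        /* Post-increment, as in the patch: true once the old count is 6. */
        if (st->dirty_rate_high_cnt++ > 5) {
            st->throttle_on = true;
        }
    }
    st->bytes_xfer_prev = bytes_xfer_now;
    return st->throttle_on;
}
```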