qemu-devel.nongnu.org archive mirror
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Jason J. Herne" <jjherne@linux.vnet.ibm.com>
Cc: amit.shah@redhat.com, borntraeger@de.ibm.com,
	qemu-devel@nongnu.org, afaerber@suse.de, quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCH 2/2] migration: Dynamic cpu throttling for auto-converge
Date: Mon, 1 Jun 2015 16:32:59 +0100	[thread overview]
Message-ID: <20150601153259.GK2314@work-vm> (raw)
In-Reply-To: <1433171851-18507-3-git-send-email-jjherne@linux.vnet.ibm.com>

* Jason J. Herne (jjherne@linux.vnet.ibm.com) wrote:
> Remove traditional auto-converge static 30ms throttling code and replace it
> with a dynamic throttling algorithm.
> 
> Additionally, be more aggressive when deciding when to start throttling.
> Previously we waited until four unproductive memory passes. Now we begin
> throttling after only two unproductive memory passes. Four seemed quite
> arbitrary and only waiting for two passes allows us to complete the migration
> faster.
> 
> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
> Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
> ---
>  arch_init.c           | 95 +++++++++++++++++----------------------------------
>  migration/migration.c |  9 +++++
>  2 files changed, 41 insertions(+), 63 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index 23d3feb..73ae494 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -111,9 +111,7 @@ int graphic_depth = 32;
>  #endif
>  
>  const uint32_t arch_type = QEMU_ARCH;
> -static bool mig_throttle_on;
>  static int dirty_rate_high_cnt;
> -static void check_guest_throttling(void);
>  
>  static uint64_t bitmap_sync_count;
>  
> @@ -487,6 +485,31 @@ static size_t save_page_header(QEMUFile *f, RAMBlock *block, ram_addr_t offset)
>      return size;
>  }
>  
> +/* Reduce amount of guest cpu execution to hopefully slow down memory writes.
> + * If guest dirty memory rate is reduced below the rate at which we can
> + * transfer pages to the destination then we should be able to complete
> + * migration. Some workloads dirty memory way too fast and will not effectively
> + * converge, even with auto-converge. For these workloads we will continue to
> + * increase throttling until the guest is paused long enough to complete the
> + * migration. This essentially becomes a non-live migration.
> + */
> +static void mig_throttle_guest_down(void)
> +{
> +    CPUState *cpu;
> +
> +    CPU_FOREACH(cpu) {
> +        /* We have not started throttling yet. Let's start it. */
> +        if (!cpu_throttle_active(cpu)) {
> +            cpu_throttle_start(cpu, 0.2);
> +        }
> +
> +        /* Throttling is already in place. Just increase the throttling rate */
> +        else {
> +            cpu_throttle_start(cpu, cpu_throttle_get_ratio(cpu) * 2);
> +        }

Now that migration has migrate_parameters, it would be best to replace
the magic numbers (the 0.2, the *2 - anything else?) with parameters that
can tune the initial throttling ratio and the rate of increase.  It would
probably also be good to make the current throttling rate visible in info
somewhere; maybe info migrate?

> +    }
> +}
> +
>  /* Update the xbzrle cache to reflect a page that's been sent as all 0.
>   * The important thing is that a stale (not-yet-0'd) page be replaced
>   * by the new data.
> @@ -714,21 +737,21 @@ static void migration_bitmap_sync(void)
>              /* The following detection logic can be refined later. For now:
>                 Check to see if the dirtied bytes is 50% more than the approx.
>                 amount of bytes that just got transferred since the last time we
> -               were in this routine. If that happens >N times (for now N==4)
> -               we turn on the throttle down logic */
> +               were in this routine. If that happens twice, start or increase
> +               throttling */
>              bytes_xfer_now = ram_bytes_transferred();
> +
>              if (s->dirty_pages_rate &&
>                 (num_dirty_pages_period * TARGET_PAGE_SIZE >
>                     (bytes_xfer_now - bytes_xfer_prev)/2) &&
> -               (dirty_rate_high_cnt++ > 4)) {
> +               (dirty_rate_high_cnt++ >= 2)) {
>                      trace_migration_throttle();
> -                    mig_throttle_on = true;
>                      dirty_rate_high_cnt = 0;
> +                    mig_throttle_guest_down();
>               }
>               bytes_xfer_prev = bytes_xfer_now;
> -        } else {
> -             mig_throttle_on = false;
>          }
> +
>          if (migrate_use_xbzrle()) {
>              if (iterations_prev != acct_info.iterations) {
>                  acct_info.xbzrle_cache_miss_rate =
> @@ -1197,7 +1220,6 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      RAMBlock *block;
>      int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
>  
> -    mig_throttle_on = false;
>      dirty_rate_high_cnt = 0;
>      bitmap_sync_count = 0;
>      migration_bitmap_sync_init();
> @@ -1301,12 +1323,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>          }
>          pages_sent += pages;
>          acct_info.iterations++;
> -        check_guest_throttling();
> -        /* we want to check in the 1st loop, just in case it was the 1st time
> -           and we had to sync the dirty bitmap.
> -           qemu_get_clock_ns() is a bit expensive, so we only check each some
> -           iterations
> -        */
> +
>          if ((i & 63) == 0) {
>              uint64_t t1 = (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - t0) / 1000000;
>              if (t1 > MAX_WAIT) {
> @@ -1913,51 +1930,3 @@ TargetInfo *qmp_query_target(Error **errp)
>      return info;
>  }
>  
> -/* Stub function that's gets run on the vcpu when its brought out of the
> -   VM to run inside qemu via async_run_on_cpu()*/
> -static void mig_sleep_cpu(void *opq)
> -{
> -    qemu_mutex_unlock_iothread();
> -    g_usleep(30*1000);
> -    qemu_mutex_lock_iothread();
> -}
> -
> -/* To reduce the dirty rate explicitly disallow the VCPUs from spending
> -   much time in the VM. The migration thread will try to catchup.
> -   Workload will experience a performance drop.
> -*/
> -static void mig_throttle_guest_down(void)
> -{
> -    CPUState *cpu;
> -
> -    qemu_mutex_lock_iothread();
> -    CPU_FOREACH(cpu) {
> -        async_run_on_cpu(cpu, mig_sleep_cpu, NULL);
> -    }
> -    qemu_mutex_unlock_iothread();
> -}
> -
> -static void check_guest_throttling(void)
> -{
> -    static int64_t t0;
> -    int64_t        t1;
> -
> -    if (!mig_throttle_on) {
> -        return;
> -    }
> -
> -    if (!t0)  {
> -        t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> -        return;
> -    }
> -
> -    t1 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> -
> -    /* If it has been more than 40 ms since the last time the guest
> -     * was throttled then do it again.
> -     */
> -    if (40 < (t1-t0)/1000000) {
> -        mig_throttle_guest_down();
> -        t0 = t1;
> -    }
> -}

Lots of deleted code; that's got to be good.

> diff --git a/migration/migration.c b/migration/migration.c
> index 732d229..c9545df 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -25,6 +25,7 @@
>  #include "qemu/thread.h"
>  #include "qmp-commands.h"
>  #include "trace.h"
> +#include "qom/cpu.h"
>  
>  #define MAX_THROTTLE  (32 << 20)      /* Migration speed throttling */
>  
> @@ -731,6 +732,7 @@ int64_t migrate_xbzrle_cache_size(void)
>  static void *migration_thread(void *opaque)
>  {
>      MigrationState *s = opaque;
> +    CPUState *cpu;
>      int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>      int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      int64_t initial_bytes = 0;
> @@ -814,6 +816,13 @@ static void *migration_thread(void *opaque)
>          }
>      }
>  
> +    /* If we enabled cpu throttling for auto-converge, turn it off. */
> +    CPU_FOREACH(cpu) {
> +        if (cpu_throttle_active(cpu)) {
> +            cpu_throttle_stop(cpu);
> +        }
> +    }
> +
>      qemu_mutex_lock_iothread();
>      if (s->state == MIGRATION_STATUS_COMPLETED) {
>          int64_t end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> -- 
> 1.9.1
> 

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 15+ messages
2015-06-01 15:17 [Qemu-devel] [PATCH 0/2] migration: Dynamic cpu throttling for auto-converge Jason J. Herne
2015-06-01 15:17 ` [Qemu-devel] [PATCH 1/2] cpu: Provide vcpu throttling interface Jason J. Herne
2015-06-01 15:23   ` Andrey Korolyov
2015-06-01 17:04     ` Jason J. Herne
2015-06-03  7:12   ` Juan Quintela
2015-06-03 14:35     ` Jason J. Herne
2015-06-01 15:17 ` [Qemu-devel] [PATCH 2/2] migration: Dynamic cpu throttling for auto-converge Jason J. Herne
2015-06-01 15:32   ` Dr. David Alan Gilbert [this message]
2015-06-01 17:16     ` Jason J. Herne
2015-06-02 13:58       ` Dr. David Alan Gilbert
2015-06-02 14:37         ` Jason J. Herne
2015-06-02 14:57           ` Dr. David Alan Gilbert
2015-06-02 16:45           ` Eric Blake
2015-06-03  7:24           ` Juan Quintela
2015-06-03  7:21   ` Juan Quintela
