Re: [PATCH v5 2/4] migration: fix calculating xbzrle_counters.cache_miss_rate

From: Juan Quintela <quintela@redhat.com>
To: guangrong.xiao@gmail.com
Cc: kvm@vger.kernel.org, mst@redhat.com, mtosatti@redhat.com,
	Xiao Guangrong <xiaoguangrong@tencent.com>,
	dgilbert@redhat.com, peterx@redhat.com, qemu-devel@nongnu.org,
	wei.w.wang@intel.com, jiang.biao2@zte.com.cn,
	pbonzini@redhat.com
Subject: Re: [PATCH v5 2/4] migration: fix calculating xbzrle_counters.cache_miss_rate
Date: Mon, 03 Sep 2018 19:19:25 +0200
Message-ID: <87a7oy33du.fsf@trasno.org>
In-Reply-To: <20180903092644.25812-3-xiaoguangrong@tencent.com> (guangrong xiao's message of "Mon, 3 Sep 2018 17:26:42 +0800")

guangrong.xiao@gmail.com wrote:
> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>
> As Peter pointed out:
> | - xbzrle_counters.cache_miss is done in save_xbzrle_page(), so it's
> |   per-guest-page granularity
> |
> | - RAMState.iterations is done for each ram_find_and_save_block(), so
> |   it's per-host-page granularity
> |
> | An example is that when we migrate a 2M huge page in the guest, we
> | will only increase the RAMState.iterations by 1 (since
> | ram_find_and_save_block() will be called once), but we might increase
> | xbzrle_counters.cache_miss for 2M/4K=512 times (we'll call
> | save_xbzrle_page() that many times) if all the pages got cache miss.
> | Then IMHO the cache miss rate will be 512/1=51200% (while it should
> | actually be just 100% cache miss).
>
> And he also suggested as xbzrle_counters.cache_miss_rate is the only
> user of rs->iterations we can adapt it to count target guest page
> numbers
>
> After that, rename 'iterations' to 'target_page_count' to better reflect
> its meaning
>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

Because I don't mind the name change.  But the example is wrong: this is
not about huge pages, it is about architectures where the host and guest
page sizes differ (arm is king here, with ppc in second place as far as I
know).  You can have a guest with 4KB pages on a host with 64KB pages.
Then we do "magic": when we synchronize the dirty bitmap, we update it
at guest page granularity.

From cpu_physical_memory_set_dirty_lebitmap():

    unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;

    [...]
        /*
         * bitmap-traveling is faster than memory-traveling (for addr...)
         * especially when most of the memory is not dirty.
         */
        for (i = 0; i < len; i++) {
            if (bitmap[i] != 0) {
                c = leul_to_cpu(bitmap[i]);
                do {
                    j = ctzl(c);
                    c &= ~(1ul << j);
                    page_number = (i * HOST_LONG_BITS + j) * hpratio;
                    addr = page_number * TARGET_PAGE_SIZE;
                    ram_addr = start + addr;
                    cpu_physical_memory_set_dirty_range(ram_addr,
                                       TARGET_PAGE_SIZE * hpratio, clients);
                } while (c != 0);
            }
        }

This is where hpratio is used.  As you can see, if getpagesize() is
smaller than TARGET_PAGE_SIZE, the integer division makes hpratio 0 and
things start to get really, really funny.
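
A minimal sketch of the arithmetic in the loop quoted above, assuming a
64KB host page and a 4KB target page as in the example.  The helper
names (example_hpratio, example_dirty_offset, example_dirty_length) are
hypothetical and exist only for illustration; they are not QEMU
functions.

```c
#include <assert.h>

/*
 * Illustrative helpers only: hypothetical names, not part of QEMU.
 */

/* Target pages covered by one host page; integer division truncates. */
static unsigned long example_hpratio(unsigned long host_page_size,
                                     unsigned long target_page_size)
{
    assert(target_page_size != 0);
    return host_page_size / target_page_size;
}

/*
 * Byte offset (relative to 'start' in the quoted loop) of the range that
 * one set bit of the host-granularity bitmap expands to.  'bit' plays
 * the role of i * HOST_LONG_BITS + j.
 */
static unsigned long example_dirty_offset(unsigned long bit,
                                          unsigned long hpratio,
                                          unsigned long target_page_size)
{
    unsigned long page_number = bit * hpratio;

    return page_number * target_page_size;
}

/* Length in bytes of the range marked dirty for one host bitmap bit. */
static unsigned long example_dirty_length(unsigned long hpratio,
                                          unsigned long target_page_size)
{
    return target_page_size * hpratio;
}
```

With a 64KB host page and a 4KB target page, example_hpratio() gives 16,
so each set bit in the host bitmap marks a 64KB range, i.e. 16 target
pages, as dirty.  With the sizes swapped it gives 0 and the dirty range
has length 0.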

Later, Juan.

> ---
>  migration/ram.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 2ad07b5e15..25af797c0a 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -301,10 +301,10 @@ struct RAMState {
>      uint64_t num_dirty_pages_period;
>      /* xbzrle misses since the beginning of the period */
>      uint64_t xbzrle_cache_miss_prev;
> -    /* number of iterations at the beginning of period */
> -    uint64_t iterations_prev;
> -    /* Iterations since start */
> -    uint64_t iterations;
> +    /* total handled target pages at the beginning of period */
> +    uint64_t target_page_count_prev;
> +    /* total handled target pages since start */
> +    uint64_t target_page_count;
>      /* number of dirty bits in the bitmap */
>      uint64_t migration_dirty_pages;
>      /* last dirty_sync_count we have seen */
> @@ -1594,19 +1594,19 @@ uint64_t ram_pagesize_summary(void)
>  
>  static void migration_update_rates(RAMState *rs, int64_t end_time)
>  {
> -    uint64_t iter_count = rs->iterations - rs->iterations_prev;
> +    uint64_t page_count = rs->target_page_count - rs->target_page_count_prev;
>  
>      /* calculate period counters */
>      ram_counters.dirty_pages_rate = rs->num_dirty_pages_period * 1000
>                  / (end_time - rs->time_last_bitmap_sync);
>  
> -    if (!iter_count) {
> +    if (!page_count) {
>          return;
>      }
>  
>      if (migrate_use_xbzrle()) {
>          xbzrle_counters.cache_miss_rate = (double)(xbzrle_counters.cache_miss -
> -            rs->xbzrle_cache_miss_prev) / iter_count;
> +            rs->xbzrle_cache_miss_prev) / page_count;
>          rs->xbzrle_cache_miss_prev = xbzrle_counters.cache_miss;
>      }
>  }
> @@ -1664,7 +1664,7 @@ static void migration_bitmap_sync(RAMState *rs)
>  
>          migration_update_rates(rs, end_time);
>  
> -        rs->iterations_prev = rs->iterations;
> +        rs->target_page_count_prev = rs->target_page_count;
>  
>          /* reset period counters */
>          rs->time_last_bitmap_sync = end_time;
> @@ -3209,7 +3209,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>              done = 1;
>              break;
>          }
> -        rs->iterations++;
> +        rs->target_page_count += pages;
>  
>          /* we want to check in the 1st loop, just in case it was the 1st time
>             and we had to sync the dirty bitmap.
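
To make the effect of the patch concrete, here is a small sketch of the
per-period rate calculation, using the numbers from the commit message.
example_cache_miss_rate() is a hypothetical helper, not QEMU code; unlike
migration_update_rates(), it returns 0.0 when no pages were handled in
the period rather than leaving the previous rate untouched.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.  Computes the cache miss
 * rate over one period from the running counters and their values at
 * the start of the period.
 */
static double example_cache_miss_rate(uint64_t cache_miss,
                                      uint64_t cache_miss_prev,
                                      uint64_t count,
                                      uint64_t count_prev)
{
    uint64_t delta = count - count_prev;

    if (!delta) {
        /* Simplification: the patch returns early without updating. */
        return 0.0;
    }
    return (double)(cache_miss - cache_miss_prev) / delta;
}
```

Dividing 512 misses by a count of 1 (the old 'iterations' behaviour in
the commit message's example) gives 512.0, i.e. 51200%.  Dividing the
same 512 misses by 512 handled target pages gives 1.0, i.e. 100%.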

