From: Paolo Bonzini
Date: Tue, 04 Feb 2014 11:46:59 +0100
Message-ID: <52F0C523.30102@redhat.com>
In-Reply-To: <52F0938F.2040102@ozlabs.ru>
Subject: Re: [Qemu-devel] migration: broken ram_save_pending
To: Alexey Kardashevskiy, "qemu-devel@nongnu.org"
Cc: Alex Graf

On 04/02/2014 08:15, Alexey Kardashevskiy wrote:
> So, migration_thread() gets the number of dirty pages and tries to send
> them in a loop, but every iteration resets the number of pages to 96 and
> we start again. After several tries we cross the BUFFER_DELAY timeout and
> calculate a new @max_size, and if the host machine is fast enough it is
> bigger than 393216 and the next loop will finally finish the migration.

This should have happened pretty much immediately, because the condition is
not

    while (pending())

but rather

    while (pending_size && pending_size >= max_size)

(it is an "if" in the code, but the idea is the same).  And max_size is
computed as

    max_size = bandwidth * migrate_max_downtime() / 1000000;

With the default throttling of 32 MiB/s, bandwidth must be something like
33000 (expressed in bytes/ms) with the default settings, and then max_size
should be 33000 * 3*10^9 / 10^6 = 6000000.  Where is my computation wrong?

Also, did you profile it to find the hotspot?  Perhaps the bitmap
operations are taking a lot of time.  How big is the guest?  Juan's patches
were optimizing the bitmaps, but not all of them apply to your case because
of hpratio.
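To make this concrete, here is roughly how I read the iteration decision in
migration_thread().  This is a minimal sketch from memory, not the exact
QEMU code; the two helper names below are mine, only pending_size, max_size,
bandwidth and migrate_max_downtime() come from the code we are discussing:

    #include <stdbool.h>
    #include <stdint.h>

    /* Sketch only: iteration continues while the outstanding dirty data is
     * both nonzero and at least max_size; once it drops below max_size the
     * migration moves on to the completion stage. */
    bool keep_iterating(uint64_t pending_size, uint64_t max_size)
    {
        return pending_size && pending_size >= max_size;
    }

    /* max_size is refreshed once per BUFFER_DELAY interval from the measured
     * bandwidth (in bytes/ms) and the maximum tolerated downtime. */
    uint64_t compute_max_size(uint64_t bandwidth, uint64_t max_downtime)
    {
        return bandwidth * max_downtime / 1000000;
    }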
> I can only think of something simple like below, and I am not sure it
> does not break other things. I would expect ram_save_pending() to return
> the correct number of bytes QEMU is going to send rather than the number
> of pages multiplied by 4096, but checking whether all these pages are
> really empty is not too cheap.

If you use qemu_update_position you will use very little bandwidth in the
case where a lot of pages are zero.  What you mention in ram_save_pending()
is problematic not just because of finding out whether the pages are empty,
but also because you have to find the nonzero spots in the bitmap!  (See
also the sketch below the quoted hunk.)

Paolo

> Thanks!
>
> diff --git a/arch_init.c b/arch_init.c
> index 2ba297e..90949b0 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -537,16 +537,17 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>                          acct_info.dup_pages++;
>                      }
>                  }
>              } else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
>                  acct_info.dup_pages++;
>                  bytes_sent = save_block_hdr(f, block, offset, cont,
>                                              RAM_SAVE_FLAG_COMPRESS);
>                  qemu_put_byte(f, 0);
> +                qemu_update_position(f, TARGET_PAGE_SIZE);
>                  bytes_sent++;
>              } else if (!ram_bulk_stage && migrate_use_xbzrle()) {
>                  current_addr = block->offset + offset;
>                  bytes_sent = save_xbzrle_page(f, p, current_addr, block,
>                                                offset, cont, last_stage);
>                  if (!last_stage) {
>                      p = get_cached_data(XBZRLE.cache, current_addr);
>                  }
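For reference, my understanding of where the "pages multiplied by 4096"
figure comes from: ram_save_pending() only counts dirty bits, it never looks
at the page contents.  A minimal sketch of that accounting, from memory; the
function name here is illustrative, not the actual QEMU one:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define TARGET_PAGE_SIZE 4096

    /* Sketch only: the pending estimate is the dirty-page count times the
     * page size, with no look at the page contents.  Turning it into "bytes
     * that will actually hit the wire" would mean walking the dirty bitmap
     * and reading every page to see whether it is zero, which is exactly
     * the cost discussed above. */
    uint64_t pending_bytes_estimate(uint64_t dirty_pages)
    {
        return dirty_pages * TARGET_PAGE_SIZE;
    }

    int main(void)
    {
        /* The 96 pages you keep seeing match the 393216 threshold you
         * mention: 96 * 4096 = 393216. */
        printf("%" PRIu64 "\n", pending_bytes_estimate(96));
        return 0;
    }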