From: Paolo Bonzini
Date: Tue, 04 Feb 2014 11:46:59 +0100
Message-ID: <52F0C523.30102@redhat.com>
In-Reply-To: <52F0938F.2040102@ozlabs.ru>
Subject: Re: [Qemu-devel] migration: broken ram_save_pending
To: Alexey Kardashevskiy, "qemu-devel@nongnu.org"
Cc: Alex Graf

On 04/02/2014 08:15, Alexey Kardashevskiy wrote:
> So, migration_thread() gets the number of dirty pages and tries to send
> them in a loop, but every iteration resets the number of pages to 96 and
> we start again. After several tries we cross the BUFFER_DELAY timeout and
> calculate a new @max_size, and if the host machine is fast enough it is
> bigger than 393216 and the next loop will finally finish the migration.

This should have happened pretty much immediately, because the condition is
not

    while (pending())

but rather

    while (pending_size && pending_size >= max_size)

(it is an "if" in the code, but the idea is the same).  And max_size is
computed as

    max_size = bandwidth * migrate_max_downtime() / 1000000;

With the default throttling of 32 MiB/s, bandwidth must be something like
33000 (expressed in bytes/ms) with the default settings, and then max_size
should be 33000 * 3*10^9 / 10^6 = 6000000.  Where is my computation wrong?

Also, did you profile it to find the hotspot?  Perhaps the bitmap
operations are taking a lot of time.  How big is the guest?  Juan's patches
were optimizing the bitmaps, but not all of them apply to your case because
of hpratio.
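To make this concrete, here is roughly how I read the iteration decision in
migration_thread().  This is a minimal sketch from memory, not the exact
QEMU code; the two helper names below are mine, only pending_size, max_size,
bandwidth and migrate_max_downtime() come from the code we are discussing:

    #include <stdbool.h>
    #include <stdint.h>

    /* Sketch only: iteration continues while the outstanding dirty data is
     * both nonzero and at least max_size; once it drops below max_size the
     * migration moves on to the completion stage. */
    bool keep_iterating(uint64_t pending_size, uint64_t max_size)
    {
        return pending_size && pending_size >= max_size;
    }

    /* max_size is refreshed once per BUFFER_DELAY interval from the measured
     * bandwidth (in bytes/ms) and the maximum tolerated downtime. */
    uint64_t compute_max_size(uint64_t bandwidth, uint64_t max_downtime)
    {
        return bandwidth * max_downtime / 1000000;
    }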
> I can only think of something simple like below, and I am not sure it
> does not break other things. I would expect ram_save_pending() to return
> the correct number of bytes QEMU is going to send rather than the number
> of pages multiplied by 4096, but checking whether all these pages are
> really empty is not too cheap.

If you use qemu_update_position you will use very little bandwidth in the
case where a lot of pages are zero.  What you mention in ram_save_pending()
is problematic not just because of finding out whether the pages are empty,
but also because you have to find the nonzero spots in the bitmap!  (See
also the sketch below the quoted hunk.)

Paolo

> Thanks!
>
> diff --git a/arch_init.c b/arch_init.c
> index 2ba297e..90949b0 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -537,16 +537,17 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>                          acct_info.dup_pages++;
>                      }
>                  }
>              } else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
>                  acct_info.dup_pages++;
>                  bytes_sent = save_block_hdr(f, block, offset, cont,
>                                              RAM_SAVE_FLAG_COMPRESS);
>                  qemu_put_byte(f, 0);
> +                qemu_update_position(f, TARGET_PAGE_SIZE);
>                  bytes_sent++;
>              } else if (!ram_bulk_stage && migrate_use_xbzrle()) {
>                  current_addr = block->offset + offset;
>                  bytes_sent = save_xbzrle_page(f, p, current_addr, block,
>                                                offset, cont, last_stage);
>                  if (!last_stage) {
>                      p = get_cached_data(XBZRLE.cache, current_addr);
>                  }
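For reference, my understanding of where the "pages multiplied by 4096"
figure comes from: ram_save_pending() only counts dirty bits, it never looks
at the page contents.  A minimal sketch of that accounting, from memory; the
function name here is illustrative, not the actual QEMU one:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define TARGET_PAGE_SIZE 4096

    /* Sketch only: the pending estimate is the dirty-page count times the
     * page size, with no look at the page contents.  Turning it into "bytes
     * that will actually hit the wire" would mean walking the dirty bitmap
     * and reading every page to see whether it is zero, which is exactly
     * the cost discussed above. */
    uint64_t pending_bytes_estimate(uint64_t dirty_pages)
    {
        return dirty_pages * TARGET_PAGE_SIZE;
    }

    int main(void)
    {
        /* The 96 pages you keep seeing match the 393216 threshold you
         * mention: 96 * 4096 = 393216. */
        printf("%" PRIu64 "\n", pending_bytes_estimate(96));
        return 0;
    }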