Message-ID: <52F0D611.7070105@ozlabs.ru>
Date: Tue, 04 Feb 2014 22:59:13 +1100
From: Alexey Kardashevskiy
References: <52F0938F.2040102@ozlabs.ru> <52F0C523.30102@redhat.com>
In-Reply-To: <52F0C523.30102@redhat.com>
Subject: Re: [Qemu-devel] migration: broken ram_save_pending
To: Paolo Bonzini, "qemu-devel@nongnu.org"
Cc: Alex Graf

On 02/04/2014 09:46 PM, Paolo Bonzini wrote:
> On 04/02/2014 08:15, Alexey Kardashevskiy wrote:
>> So. migration_thread() gets dirty pages number, tries to send them in a
>> loop but every iteration resets the number of pages to 96 and we start
>> again. After several tries we cross BUFFER_DELAY timeout and calculate new
>> @max_size and if the host machine is fast enough it is bigger than 393216
>> and next loop will finally finish the migration.
>
> This should have happened pretty much immediately, because it's not while
> (pending()) but rather
>
>     while (pending_size && pending_size >= max_size)
>
> (it's an "if" in the code, but the idea is the same).
> And max_size is the following:
>
>     max_size = bandwidth * migrate_max_downtime() / 1000000;
>
> With the default throttling of 32 MiB/s, bandwidth must be something like
> 33000 (expressed in bytes/ms) with the default settings, and then max_size
> should be 33000*3*10^9 / 10^6 = 6000000. Where is my computation wrong?

migrate_max_downtime() returns 30000000 = 3*10^7, not 3*10^9. While the
migration is in the iterating stage, bandwidth is the speed measured over
the last 100ms, which is usually 5 blocks of 250KB each, so it is
1250000/100 = 12500 bytes/ms and
max_size = 12500*30000000/10^6 = 375000, which is less than the size of
the last chunk (393216).

> Also, did you profile it to find the hotspot? Perhaps the bitmap
> operations are taking a lot of time. How big is the guest?

1024MB.

> Juan's patches
> were optimizing the bitmaps but not all of them apply to your case because
> of hpratio.

This I had to disable :)

>> I can only think of something simple like below and not sure it does not
>> break other things. I would expect ram_save_pending() to return correct
>> number of bytes QEMU is going to send rather than number of pages
>> multiplied by 4096 but checking if all these pages are really empty is not
>> too cheap.
>
> If you use qemu_update_position you will use very little bandwidth in the
> case where a lot of pages are zero.

My guest migrates in a second or so. I guess in this case
qemu_file_rate_limit() limits the speed and it does not look at
QEMUFile::pos.

> What you mention in ram_save_pending() is not problematic just because of
> finding if the pages are empty, but also because you have to find the
> nonzero spots in the bitmap!

Sure.

>
> Paolo
>
>> Thanks!
>>
>>
>> diff --git a/arch_init.c b/arch_init.c
>> index 2ba297e..90949b0 100644
>> --- a/arch_init.c
>> +++ b/arch_init.c
>> @@ -537,16 +537,17 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>>                      acct_info.dup_pages++;
>>                  }
>>              }
>>          } else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
>>              acct_info.dup_pages++;
>>              bytes_sent = save_block_hdr(f, block, offset, cont,
>>                                          RAM_SAVE_FLAG_COMPRESS);
>>              qemu_put_byte(f, 0);
>> +            qemu_update_position(f, TARGET_PAGE_SIZE);
>>              bytes_sent++;
>>          } else if (!ram_bulk_stage && migrate_use_xbzrle()) {
>>              current_addr = block->offset + offset;
>>              bytes_sent = save_xbzrle_page(f, p, current_addr, block,
>>                                            offset, cont, last_stage);
>>              if (!last_stage) {
>>                  p = get_cached_data(XBZRLE.cache, current_addr);
>>              }
>>
>>
>
>

-- 
Alexey
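
For anyone who wants to replay the numbers from this thread, here is a minimal standalone sketch of the max_size arithmetic and the loop condition quoted above. It is not QEMU code: the helper names (bandwidth_bytes_per_ms, max_size_bytes, keeps_iterating) are hypothetical and exist only for illustration, and the constants are the ones mentioned in the discussion (1250000 bytes per 100ms window, a migrate_max_downtime() value of 30000000, and a pending size of 96 pages of 4096 bytes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bandwidth in bytes per millisecond, measured
 * over one window (the thread mentions a 100ms window). */
double bandwidth_bytes_per_ms(uint64_t transferred_bytes, uint64_t window_ms)
{
    assert(window_ms > 0); /* avoid dividing by zero */
    return (double)transferred_bytes / (double)window_ms;
}

/* Hypothetical helper mirroring the formula quoted in the thread:
 *     max_size = bandwidth * migrate_max_downtime() / 1000000;
 * max_downtime is passed in as the raw value that function returns. */
uint64_t max_size_bytes(double bandwidth, uint64_t max_downtime)
{
    return (uint64_t)(bandwidth * (double)max_downtime / 1000000.0);
}

/* Hypothetical helper for the condition quoted in the thread:
 *     pending_size && pending_size >= max_size
 * true means the migration stays in the iterating stage. */
bool keeps_iterating(uint64_t pending_size, uint64_t max_size)
{
    return pending_size != 0 && pending_size >= max_size;
}
```

With the measured values, bandwidth_bytes_per_ms(1250000, 100) gives 12500 and max_size_bytes(12500, 30000000) gives 375000, so a pending size of 96 * 4096 = 393216 keeps the loop iterating. With the 33000 bytes/ms estimate, the same downtime value gives 990000, and the same pending size would let the migration complete.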