Date: Fri, 28 Feb 2014 10:16:17 +0000
From: "Dr. David Alan Gilbert"
Subject: Re: [Qemu-devel] [PATCH 5/7] migration: Fix the migrate auto converge process
Message-ID: <20140228101616.GF2695@work-vm>
In-Reply-To: <33183CC9F5247A488A2544077AF19020815D2295@SZXEMA503-MBS.china.huawei.com>
To: "Gonglei (Arei)"
Cc: "chenliang (T)", Peter Maydell, Juan Quintela, "pl@kamp.de", "qemu-devel@nongnu.org", "aliguori@amazon.com", "pbonzini@redhat.com"

* Gonglei (Arei) (arei.gonglei@huawei.com) wrote:
> Using the transfer speed of the migration thread to determine whether
> the migration is converging is inaccurate and complex, because dirty
> pages may be compressed by XBZRLE or ZERO_PAGE. The dirty bitmap sync
> counter keeps increasing if the migration cannot converge.
>
> Signed-off-by: ChenLiang
> Signed-off-by: Gonglei
> ---
>  arch_init.c | 26 +++-----------------------
>  1 file changed, 3 insertions(+), 23 deletions(-)
>
> diff --git a/arch_init.c b/arch_init.c
> index fc71331..2211e0b 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -107,7 +107,6 @@ int graphic_depth = 32;
>
>  const uint32_t arch_type = QEMU_ARCH;
>  static bool mig_throttle_on;
> -static int dirty_rate_high_cnt;
>  static void check_guest_throttling(void);
>
>  static uint64_t bitmap_sync_cnt;
> @@ -464,17 +463,11 @@ static void migration_bitmap_sync(void)
>      uint64_t num_dirty_pages_init = migration_dirty_pages;
>      MigrationState *s = migrate_get_current();
>      static int64_t start_time;
> -    static int64_t bytes_xfer_prev;
>      static int64_t num_dirty_pages_period;
>      int64_t end_time;
> -    int64_t bytes_xfer_now;
>
>      increase_bitmap_sync_cnt();
>
> -    if (!bytes_xfer_prev) {
> -        bytes_xfer_prev = ram_bytes_transferred();
> -    }
> -
>      if (!start_time) {
>          start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>      }
> @@ -493,21 +486,9 @@ static void migration_bitmap_sync(void)
>      /* more than 1 second = 1000 millisecons */
>      if (end_time > start_time + 1000) {
>          if (migrate_auto_converge()) {
> -            /* The following detection logic can be refined later. For now:
> -               Check to see if the dirtied bytes is 50% more than the approx.
> -               amount of bytes that just got transferred since the last time we
> -               were in this routine.
> -               If that happens >N times (for now N==4)
> -               we turn on the throttle down logic */
> -            bytes_xfer_now = ram_bytes_transferred();
> -            if (s->dirty_pages_rate &&
> -               (num_dirty_pages_period * TARGET_PAGE_SIZE >
> -                   (bytes_xfer_now - bytes_xfer_prev)/2) &&
> -               (dirty_rate_high_cnt++ > 4)) {
> -                trace_migration_throttle();
> -                mig_throttle_on = true;
> -                dirty_rate_high_cnt = 0;
> -            }
> -            bytes_xfer_prev = bytes_xfer_now;
> +            if (get_bitmap_sync_cnt() > 15) {
> +                mig_throttle_on = true;
> +            }

That is a lot simpler, and I suspect just as good - again, I'd move that
magic '15' to a constant somewhere.

What have you tested this on - have you tested with really big RAM VMs?
What's its behaviour like with rate-limiting?
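For illustration only - an untested, standalone sketch, not the real
arch_init.c code, and the constant name is just an example - the shape I
have in mind is a named threshold plus the "too many bitmap syncs means
we are not converging" check:

/* Toy sketch: throttle the guest once the dirty bitmap has been synced
 * more than a named threshold of times without the migration finishing. */
#include <stdbool.h>
#include <stdio.h>

#define SYNC_CNT_THROTTLE_THRESHOLD 15   /* instead of a bare '15' */

static unsigned int bitmap_sync_cnt;
static bool mig_throttle_on;

/* Called once per dirty-bitmap sync. */
static void bitmap_synced(void)
{
    bitmap_sync_cnt++;
    if (bitmap_sync_cnt > SYNC_CNT_THROTTLE_THRESHOLD) {
        mig_throttle_on = true;
    }
}

int main(void)
{
    for (int i = 0; i < 20; i++) {
        bitmap_synced();
    }
    printf("after %u syncs, throttle is %s\n",
           bitmap_sync_cnt, mig_throttle_on ? "on" : "off");
    return 0;
}

The only point is that a reader of migration_bitmap_sync() then sees what
the 15 means and where to tune it.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK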