Date: Wed, 5 Feb 2014 16:42:20 +0000
From: "Dr. David Alan Gilbert"
To: Paolo Bonzini
Cc: Alexey Kardashevskiy, "qemu-devel@nongnu.org", Alex Graf
Subject: Re: [Qemu-devel] migration: broken ram_save_pending
Message-ID: <20140205164219.GJ2398@work-vm>
In-Reply-To: <52F2685D.2050405@redhat.com>
References: <52F0938F.2040102@ozlabs.ru> <52F0C523.30102@redhat.com>
 <52F0D611.7070105@ozlabs.ru> <52F0D810.4070806@redhat.com>
 <52F0DA04.9040003@ozlabs.ru> <52F0F26A.5020304@redhat.com>
 <52F16708.8060902@ozlabs.ru> <52F1E5BA.60902@redhat.com>
 <20140205090912.GA2398@work-vm> <52F2685D.2050405@redhat.com>

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> On 05/02/2014 10:09, Dr. David Alan Gilbert wrote:
> >I think the case Alexey is hitting is:
> >   1 A few dirtied pages
> >   2 but because of the hpratio most of the data is actually zero
> >     - indeed most of the target-page sized chunks are zero
> >   3 Thus the data compresses very heavily
> >   4 When the bandwidth/delay calculation happens it's spent a reasonable
> >     amount of time transferring a reasonable amount of pages but not
> >     actually many bytes on the wire, so the estimate of the available
> >     bandwidth is lower than reality.
> >   5 The max-downtime calculation is a comparison of pending-dirty uncompressed
> >     bytes with compressed bandwidth
> >
> >(5) is bound to fail if the compression ratio is particularly high, which
> >because of the hpratio it is if we're just dirtying one word in an entire
> >host page.
>
> So far so good, but why isn't pending-dirty (aka
> migration_dirty_pages in the code) zero?

Because:
    * the code is still running and keeps redirtying a small handful of pages
    * but because we've underestimated our available bandwidth we never stop
      it and just throw those pages across immediately

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
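
To put rough numbers on the comparison in (5), here is a minimal sketch of the arithmetic, not the actual QEMU code. The function names, the 4096-byte target page, the 9 bytes on the wire per zero page, the 0.1 s iteration and the 30 ms downtime limit are all assumptions picked for illustration. It assumes the estimate is "bytes on the wire / elapsed time" and that pending data is counted as uncompressed pages.

```python
PAGE_SIZE = 4096  # assumed target page size in bytes (illustrative)


def estimated_bandwidth(wire_bytes, elapsed_s):
    """Bandwidth estimate in bytes/s, based only on bytes actually sent."""
    return wire_bytes / elapsed_s


def can_complete(dirty_pages, wire_bytes, elapsed_s, max_downtime_s):
    """Return True if the pending data appears to fit in the downtime budget.

    pending is counted in uncompressed bytes, while the budget is derived
    from the (possibly compressed) bytes seen on the wire.
    """
    max_size = estimated_bandwidth(wire_bytes, elapsed_s) * max_downtime_s
    pending = dirty_pages * PAGE_SIZE
    return pending <= max_size


# Hypothetical iteration: 1000 pages handled in 0.1 s.
pages_sent = 1000
elapsed_s = 0.1
max_downtime_s = 0.03  # assumed 30 ms limit

# Case A: pages are nearly all zero, ~9 bytes each on the wire (assumed).
compressed_wire_bytes = pages_sent * 9
# Case B: the same pages sent uncompressed.
raw_wire_bytes = pages_sent * PAGE_SIZE

# 16 dirty target pages, e.g. one 64K host page with one word touched.
dirty_pages = 16

stuck = not can_complete(dirty_pages, compressed_wire_bytes,
                         elapsed_s, max_downtime_s)
finishes = can_complete(dirty_pages, raw_wire_bytes,
                        elapsed_s, max_downtime_s)
print("compressed estimate converges:", not stuck)
print("uncompressed estimate converges:", finishes)
```

With these made-up figures the compressed case gives a budget of a few kilobytes against 64K of pending data, so the check never passes, while the same pages sent uncompressed would pass easily.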