From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <4A1553AF.2000707@us.ibm.com>
Date: Thu, 21 May 2009 08:14:23 -0500
From: Anthony Liguori
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH] augment info migrate with page status
References: <1242861605-12844-1-git-send-email-glommer@redhat.com> <4A152B51.2080600@redhat.com>
In-Reply-To: <4A152B51.2080600@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: dlaor@redhat.com
Cc: Glauber Costa, qemu-devel@nongnu.org

Dor Laor wrote:
>>  static ram_addr_t ram_save_threshold = 10;
>> +static ram_addr_t pages_transferred = 0;
>>
>
> It would be nice to zero pages_transferred on each migration operation.
> ram_save_threshold is really too small. From Uri's past measurements,
> a value of 50 is a better fit. Alternatively, it can be parametrized
> by the monitor command.
>
> In general there is a small drawback in the current approach:
> The way bandwidth is capped, iirc, in every second you start
> consuming migration bandwidth. If the bandwidth allocation was
> consumed after 100msec, you'll wait 900msec.
> In this period, a mgmt app reading ram_save_remaining will notice
> that migration does not progress and might either increase bandwidth
> or stop the guest.
> That's why the # of no-progress iterations has an advantage.

If I were implementing this in libvirt, here's what I would do:

B = MB/sec bandwidth limit
S = guest size in MB
C = some constant factor, perhaps 4-5
T = S / B * C

1) Wait for T seconds or until migration completes.
2) If timeout occurred:
   a) M = actual transfer rate for migration in MB/sec
   b) If M < B, T1 = S / M * C
   c) T = T1 - T
   d) If T <= 0, migration failed
   e) else goto 1

Basically, this institutes a policy that a migration must complete
after transferring C * guest_size amount of data. It adjusts for the
observed bandwidth rate vs. the capped rate. It makes sense from an
administrative perspective because you are probably only willing to
waste so much network bandwidth on attempting to migrate.

Obviously, C and B are tunables that heavily depend on the relative
priority of the migration. Whether you force a non-live migration after
failure of a live migration is an administrative decision.

--
Regards,

Anthony Liguori
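
The timeout policy in steps 1) and 2) above can be written out as a minimal Python sketch. Everything named here is hypothetical and for illustration only: `Budget`, `initial_timeout`, `next_timeout`, `run_policy`, and the two callbacks are not libvirt or QEMU APIs. Two readings of the steps are assumptions: when M >= B, T1 is taken to be the original budget S / B * C (the email leaves that case undefined), and the time subtracted from T1 is the total time waited so far, which matches `T = T1 - T` on the first timeout.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Budget:
    """Tunables from the email: B, S and C (names are hypothetical)."""
    limit_mb_s: float    # B: bandwidth limit in MB/sec
    guest_mb: float      # S: guest size in MB
    factor: float = 4.0  # C: constant factor, "perhaps 4-5"


def initial_timeout(b: Budget) -> float:
    """T = S / B * C, in seconds."""
    return b.guest_mb / b.limit_mb_s * b.factor


def next_timeout(b: Budget, waited_s: float,
                 observed_mb_s: float) -> Optional[float]:
    """Return how many more seconds to wait, or None if migration failed.

    Assumption: the observed rate M is clamped to B, so a migration that
    already runs at the full limit gets no extension.
    """
    if observed_mb_s <= 0:
        return None  # no progress at all: treat as failure
    rate = min(observed_mb_s, b.limit_mb_s)
    t1 = b.guest_mb / rate * b.factor  # T1 = S / M * C
    remaining = t1 - waited_s          # T = T1 - (time already waited)
    return remaining if remaining > 0 else None


def run_policy(b: Budget,
               wait_for_completion: Callable[[float], bool],
               observed_rate: Callable[[], float]) -> bool:
    """Drive the loop. Both callbacks are supplied by the caller.

    wait_for_completion(t) blocks for up to t seconds and returns True
    if the migration finished; observed_rate() returns M in MB/sec.
    """
    timeout = initial_timeout(b)
    waited = 0.0
    while True:
        if wait_for_completion(timeout):
            return True
        waited += timeout
        extra = next_timeout(b, waited, observed_rate())
        if extra is None:
            return False
        timeout = extra
```

With B = 100 MB/sec, S = 1000 MB and C = 4, the first wait is 40 seconds. If the observed rate turns out to be 50 MB/sec, the budget stretches to 80 seconds in total, so the loop waits another 40 seconds before declaring failure.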