Message-ID: <4E400624.1060504@codemonkey.ws>
Date: Mon, 08 Aug 2011 10:52:04 -0500
From: Anthony Liguori
References: <20110808032438.GC24764@valinux.co.jp>
 <4E3FAA53.4030602@redhat.com>
 <20110808105910.GA25964@fermat.math.technion.ac.il>
 <4E3FCCBB.4060205@redhat.com>
 <4E401449.7050404@redhat.com>
In-Reply-To: <4E401449.7050404@redhat.com>
Subject: Re: [Qemu-devel] [RFC] postcopy livemigration proposal
To: Cleber Rosa
Cc: qemu-devel@nongnu.org

On 08/08/2011 11:52 AM, Cleber Rosa wrote:
> On 08/08/2011 07:47 AM, Dor Laor wrote:
>> On 08/08/2011 01:59 PM, Nadav Har'El wrote:
>>>>> * What is postcopy livemigration
>>>>> It is yet another live migration mechanism for Qemu/KVM, which
>>>>> implements the migration technique known as "postcopy" or "lazy"
>>>>> migration. Just after the "migrate" command is invoked, the execution
>>>>> host of a VM is instantaneously switched to a destination host.
>>>
>>> Sounds like a cool idea.
>>>
>>>>> The benefit is that total migration time is shorter because it
>>>>> transfers a page only once.
>>>>> On the other hand, precopy may send the same pages
>>>>> again and again because they can be dirtied.
>>>>> The switching time from the source to the destination is several
>>>>> hundred milliseconds, so it enables quick load balancing.
>>>>> For details, please refer to the papers.
>>>
>>> While these are the obvious benefits, the possible downside (that, as
>>> always, depends on the workload) is the amount of time that the guest
>>> workload runs more slowly than usual, waiting for pages it needs to
>>> continue. There is a whole spectrum between the guest pausing
>>> completely (which would solve all the problems of migration, but is
>>> often considered unacceptable) and running at full speed. Is it
>>> acceptable that the guest runs at 90% speed during the migration?
>>> 50%? 10%?
>>> I guess we would have nothing to lose from having both options, and
>>> choosing the most appropriate technique for each guest!
>
> Not sure if it's possible to have smart heuristics on guest memory page
> faults, but maybe a technique that reads ahead more pages if a given
> pattern is detected may help to lower the impact.

It's got to be a user choice. Post-copy can mean unbounded downtime for
a guest with no way to mitigate it. It's impossible to cancel a
post-copy migration.

I actually think the use-cases for post-copy are fairly limited in an
enterprise environment.

Regards,

Anthony Liguori