From: Cleber Rosa
Date: Mon, 08 Aug 2011 12:52:25 -0400
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC] postcopy livemigration proposal
Message-ID: <4E401449.7050404@redhat.com>
In-Reply-To: <4E3FCCBB.4060205@redhat.com>
References: <20110808032438.GC24764@valinux.co.jp> <4E3FAA53.4030602@redhat.com> <20110808105910.GA25964@fermat.math.technion.ac.il> <4E3FCCBB.4060205@redhat.com>

On 08/08/2011 07:47 AM, Dor Laor wrote:
> On 08/08/2011 01:59 PM, Nadav Har'El wrote:
>>>> * What is postcopy live migration?
>>>> It is yet another live migration mechanism for Qemu/KVM, one that
>>>> implements the migration technique known as "postcopy" or "lazy"
>>>> migration. Just after the "migrate" command is invoked, the
>>>> execution host of a VM is instantaneously switched to the
>>>> destination host.
>>
>> Sounds like a cool idea.
>>
>>>> The benefit is that total migration time is shorter, because each
>>>> page is transferred only once. Precopy, on the other hand, may send
>>>> the same pages again and again because they can be dirtied.
>>>> The switching time from the source to the destination is several
>>>> hundred milliseconds, which enables quick load balancing.
>>>> For details, please refer to the papers.
>>
>> While these are the obvious benefits, the possible downside (which,
>> as always, depends on the workload) is the amount of time the guest
>> workload runs more slowly than usual, waiting for pages it needs to
>> continue. There is a whole spectrum between the guest pausing
>> completely (which would solve all the problems of migration, but is
>> often considered unacceptable) and running at full speed. Is it
>> acceptable for the guest to run at 90% speed during the migration?
>> 50%? 10%?
>> I guess we have nothing to lose from having both options, and
>> choosing the most appropriate technique for each guest!

I'm not sure whether it's possible to have smart heuristics on guest
memory page faults, but a technique that reads ahead more pages when a
given access pattern is detected might help lower the impact. (There's
a rough sketch of what I mean at the end of this mail.)

> +1
>
>>
>>> That's terrific (nice video also)!
>>> Orit and myself had the exact same idea too (now we can't patent
>>> it..).
>>
>> I think a new implementation is not the only reason why you cannot
>> patent this idea :-) Demand-paged migration has actually been
>> discussed (and done) for nearly a quarter of a century (!) in the
>> area of *process* migration.
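As an aside, for list readers who haven't seen one before, here is a
minimal user-space sketch of the lazy-copy core being discussed: "guest
RAM" is mapped with no access rights, so the first touch of each page
traps, and the handler fills the page in before the faulting
instruction retries. This is plain illustrative C, not QEMU code, and
fetch_page_from_source() is a made-up placeholder for the real network
transport.

/* Demand ("lazy") paging in user space - illustrative sketch only. */
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define GUEST_RAM_SIZE (16UL * 1024 * 1024)

static uint8_t *guest_ram;
static long page_size;

/* Made-up placeholder: a real implementation would pull the page
 * contents from the migration source over the network. */
static void fetch_page_from_source(void *page)
{
    memset(page, 0xaa, (size_t)page_size);  /* pretend data arrived */
}

static void fault_handler(int sig, siginfo_t *si, void *ctx)
{
    uint8_t *page;

    (void)sig;
    (void)ctx;
    page = (uint8_t *)((uintptr_t)si->si_addr &
                       ~((uintptr_t)page_size - 1));

    /* A fault outside our "guest RAM" is a genuine crash, not a
     * demand-paging event. */
    if (page < guest_ram || page >= guest_ram + GUEST_RAM_SIZE)
        _exit(1);

    /* Make the page accessible, fill it with data from the source,
     * then return so the faulting instruction retries and succeeds.
     * (Doing this much work in a signal handler is fine for a demo,
     * not for production.) */
    mprotect(page, (size_t)page_size, PROT_READ | PROT_WRITE);
    fetch_page_from_source(page);
}

int main(void)
{
    struct sigaction sa;

    page_size = sysconf(_SC_PAGESIZE);

    /* Reserve "guest RAM" with no access rights: the first touch of
     * any page traps into fault_handler(), which fetches it on
     * demand. */
    guest_ram = mmap(NULL, GUEST_RAM_SIZE, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guest_ram == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = fault_handler;
    sigaction(SIGSEGV, &sa, NULL);

    /* Each of these first touches triggers one demand fetch. */
    printf("byte 0: %#x\n", guest_ram[0]);
    printf("byte at 1MB: %#x\n", guest_ram[1024 * 1024]);
    return 0;
}

The actual proposal is of course far more involved; this only shows
why the per-page fetch latency lands directly on the guest's critical
path.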
>>
>> The first use I'm aware of was in CMU's Accent operating system, in
>> 1987 - see [1].
>> Another paper, [2], written in 1991, discusses how process migration
>> is done in UCB's Sprite operating system, and evaluates the various
>> alternatives common at the time (20 years ago), including what it
>> calls "lazy copying", which is more or less the same thing as
>> "postcopy". Mosix (a project which, in some sense, is still alive
>> today) also used some sort of cross between pre-copying (of dirty
>> pages) and copying clean pages on demand (from their backing store
>> on the source machine).
>>
>> References
>> [1] "Attacking the Process Migration Bottleneck"
>>     http://www.nd.edu/~dthain/courses/cse598z/fall2004/papers/accent.pdf
>
> w/o reading the internals, patents enable you to implement an
> existing idea in a new field. Anyway, there won't be any patent in
> this case. Still, let's have the KVM innovation merged.
>
>> [2] "Transparent Process Migration: Design Alternatives and the
>>     Sprite Implementation"
>>     http://nd.edu/~dthain/courses/cse598z/fall2004/papers/sprite-migration.pdf
>>
>>> Advantages:
>>> - Virtual machines are using more and more memory resources, and
>>>   for a virtual machine with a very large working set, doing live
>>>   migration with reasonable downtime is impossible today.
>>
>> If a guest actually constantly uses (as its working set) most of its
>> allocated memory, it will basically be unable to do any significant
>> amount of work on the destination until this large working set is
>> transferred to the destination. So in this scenario, "postcopy"
>> doesn't give any significant advantage over plain old "pause the
>> guest and send it to the destination". Or am I missing something?
>
> There is one key advantage in this scheme/use case - if you have a
> guest with a very large working set, you need a very large downtime
> in order to migrate it with today's algorithm. With postcopy (aka
> streaming/demand paging), the guest won't have any downtime, but it
> will run slower than expected.
>
> There are guests today that are impractical to really live migrate.
>
> btw: Even today, marking pages read-only also carries some
> performance penalty.
>
>>
>>> Disadvantages:
>>> - During the live migration the guest will run slower than in
>>>   today's live migration. We need to remember that even today
>>>   guests suffer a performance penalty on the source during the
>>>   COW stage (memory copy).
>>
>> I wonder if something like asynchronous page faults can help
>> somewhat with multi-process guest workloads (and a modified (PV)
>> guest OS).
>
> They should come into play to some extent. Note that only newer
> Linux guests will benefit from them.
>
>>
>>> - Failure of the source or the destination or the network will
>>>   cause us to lose the running virtual machine. Those failures
>>>   are very rare.
>>
>> How is this different from a VM running on a single machine that
>> fails? Just that the small probability of failure (roughly) doubles
>> for the relatively short duration of the transfer?
>
> Exactly my point: this is not a major disadvantage, because of this
> low probability.
>
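Finally, the read-ahead idea I mentioned above, as a rough sketch: if
the last few demand faults landed on consecutive pages, speculatively
request a window of pages past the faulting one. Everything here is
hypothetical - request_page() stands in for whatever transport the
postcopy implementation ends up using - and a real version would also
have to track in-flight pages so the same page isn't requested twice.

#include <stdint.h>
#include <stdio.h>

#define READAHEAD_WINDOW 8   /* pages to prefetch once a streak is seen */
#define STREAK_THRESHOLD 3   /* consecutive faults before we speculate */

static uint64_t last_fault_pfn;
static unsigned int sequential_streak;

/* Made-up placeholder: ask the migration source for one guest page. */
static void request_page(uint64_t pfn)
{
    printf("requesting pfn %llu\n", (unsigned long long)pfn);
}

/* Called on every demand fault with the faulting page frame number. */
static void on_demand_fault(uint64_t pfn)
{
    /* Track whether faults are marching sequentially through memory. */
    if (pfn == last_fault_pfn + 1)
        sequential_streak++;
    else
        sequential_streak = 0;
    last_fault_pfn = pfn;

    request_page(pfn);  /* the page the guest actually needs right now */

    /* After enough consecutive faults, bet on the pattern continuing
     * and pull in the next window of pages before the guest asks. */
    if (sequential_streak >= STREAK_THRESHOLD) {
        for (uint64_t i = 1; i <= READAHEAD_WINDOW; i++)
            request_page(pfn + i);
    }
}

int main(void)
{
    /* Simulate a guest touching six consecutive pages: read-ahead
     * should kick in from the fourth fault onwards. */
    for (uint64_t pfn = 100; pfn < 106; pfn++)
        on_demand_fault(pfn);
    return 0;
}

The thresholds (a streak of 3, a window of 8 pages) are arbitrary;
picking them well, per workload, is exactly the hard part.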