From: Avi Kivity
Subject: Re: [RFC] postcopy livemigration proposal
Date: Mon, 08 Aug 2011 18:36:50 +0300
Message-ID: <4E400292.1030005@redhat.com>
References: <20110808032438.GC24764@valinux.co.jp> <4E3FAA53.4030602@redhat.com> <4E3FD774.7010502@codemonkey.ws> <4E3FFC9E.8050300@redhat.com> <4E4000BF.20708@codemonkey.ws>
In-Reply-To: <4E4000BF.20708@codemonkey.ws>
To: Anthony Liguori
Cc: kvm@vger.kernel.org, satoshi.itoh@aist.go.jp, t.hirofuchi@aist.go.jp, dlaor@redhat.com, Orit Wasserman, qemu-devel@nongnu.org, Isaku Yamahata
List-Id: kvm.vger.kernel.org

On 08/08/2011 06:29 PM, Anthony Liguori wrote:
>
>>>> - Efficient, reduce needed traffic no need to re-send pages.
>>>
>>> It's not quite that simple. Post-copy needs to introduce a protocol
>>> capable of requesting pages.
>>
>> Just another subsection.. (kidding), still it shouldn't be too
>> complicated, just an offset+pagesize and return page_content/error
>
> What I meant by this is that there is potentially a lot of round trip
> overhead. Pre-copy migration works well with reasonable high latency
> network connections because the downtime is capped only by the maximum
> latency sending from one point to another.
>
> But with something like this, the total downtime is
> 2*max_latency*nb_pagefaults. That's potentially pretty high.

Let's be generous and assume that the latency is dominated by page copy
time. So the total downtime is equal to the first live migration pass,
~20 sec for 2GB on 1GbE. It's distributed over potentially even more
time, though. If the guest does a lot of I/O, it may not be noticeable
(esp. if we don't copy over pages read from disk). If the guest is
cpu/memory bound, it'll probably suck badly.

>
> So it may be desirable to try to reduce nb_pagefaults by prefaulting
> in pages, etc. Suffice to say, this ends up getting complicated and
> may end up burning network traffic too.

Yeah, and prefaulting in the background adds latency to synchronous
requests.

This really needs excellent networking resources to work well.

-- 
error compiling committee.c: too many arguments to function
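
For reference, the two estimates in this exchange (the 2*max_latency*nb_pagefaults round-trip bound and the ~20 sec copy time for 2GB on 1GbE) can be checked with a quick back-of-envelope sketch. The 4 KiB page size, the 0.5 ms one-way latency, and the "every page faults exactly once" worst case are assumptions made for illustration, not figures from the thread, and the function names are hypothetical.

```python
# Back-of-envelope check of the two downtime estimates discussed above.
# From the thread: 2GB guest, 1GbE link, downtime = 2*max_latency*nb_pagefaults.
# Assumed for illustration: 4 KiB pages, 0.5 ms one-way latency,
# every guest page faults exactly once, no protocol overhead.

GUEST_RAM_BYTES = 2 * 1024**3    # 2GB guest (from the thread)
LINK_BITS_PER_SEC = 1e9          # 1GbE (from the thread)
PAGE_SIZE_BYTES = 4096           # assumption
ONE_WAY_LATENCY_SEC = 0.0005     # assumption


def copy_time(ram_bytes, link_bps):
    """Seconds to push all of guest RAM over the link once (copy-dominated case)."""
    return ram_bytes * 8 / link_bps


def round_trip_stall(nb_pagefaults, max_latency):
    """Total stall if each fault costs a full request/response round trip."""
    return 2 * max_latency * nb_pagefaults


nb_pagefaults = GUEST_RAM_BYTES // PAGE_SIZE_BYTES

print(f"copy-dominated total:    {copy_time(GUEST_RAM_BYTES, LINK_BITS_PER_SEC):.1f} s")
print(f"latency-dominated total: {round_trip_stall(nb_pagefaults, ONE_WAY_LATENCY_SEC):.1f} s")
```

Under these assumptions the raw copy time comes out a little under the ~20 sec figure (which presumably includes overhead), while the per-fault round trips alone add up to far more. That gap is why cutting nb_pagefaults by prefaulting looks attractive despite the added complexity.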