From: Peter Lieven
Date: Wed, 04 Jan 2012 14:08:18 +0100
Message-ID: <4F044F42.2050508@dlh.net>
In-Reply-To: <4F0445EE.9010905@redhat.com>
Subject: Re: [Qemu-devel] Stalls on Live Migration of VMs with a lot of memory
To: Paolo Bonzini
Cc: Shu Ming, qemu-devel@nongnu.org, kvm@vger.kernel.org

On 04.01.2012 13:28, Paolo Bonzini wrote:
> On 01/04/2012 12:42 PM, Peter Lieven wrote:
>> Ok, then I misunderstood the ram blocks thing. I thought the guest RAM
>> would consist of a collection of ram blocks.
>> Then let me describe it differently: would it make sense to process
>> bigger portions of memory (e.g. 1M) in stage 2, to reduce the number of
>> calls to cpu_physical_memory_reset_dirty by running it on bigger
>> portions of memory? We might lose a few dirty pages, but they would be
>> tracked in the next iteration of stage 2, or in stage 3 at the latest.
>> What would be necessary is that nobody marks a page dirty while I copy
>> the dirty information for the portion of memory I want to process.
>
> Dirty memory tracking is done by the hypervisor and must be done at
> page granularity.

Ok, so this is unfortunately not an option. Then my only option at the
moment is to limit the runtime of the while loop in stage 2. Or are
there any post-1.0 patches in git that might already help?

I tried limiting it to migrate_max_downtime(), and this at least
resolves the problem with the VM stalls. However, migration speed is
very limited with that (approx. 80 MB/s on a 10G link).

>>>> - in stage 3 the vm is stopped, right? so there can't be any more
>>>> dirty blocks after scanning the whole memory once?
>>>
>>> No, stage 3 is entered when there are very few dirty memory pages
>>> remaining. This may happen after scanning the whole memory many
>>> times. It may even never happen if migration does not converge
>>> because of low bandwidth or too strict downtime requirements.
>>
>> Ok, is there a chance that I lose one final page if it is modified just
>> after I walked over it and I found no other page dirty (so bytes_sent
>> = 0)?
>
> No, of course not. Stage 3 will send all missing pages while the VM
> is stopped. There is a chance that the guest will go crazy and start
> touching lots of pages at exactly the wrong time, and thus the
> downtime will be longer than expected. However, that's a necessary
> evil; if you cannot accept that, post-copy migration would provide a
> completely different set of tradeoffs.

I don't suffer from long downtimes in stage 3. My issue is in stage 2.
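For reference, this is roughly the shape of what I tried. It is only a
self-contained sketch with made-up helper names (clock_ns(),
send_one_dirty_page(), stage2_send_some()), not the actual qemu migration
code, but it shows where the time check sits in the stage-2 loop:

#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* All names below are made up for this sketch; they only stand in for
 * the real migration helpers. */

static int64_t clock_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Stand-in for "send one dirty page": returns bytes put on the wire,
 * or 0 when no dirty page is left in this pass. */
static int send_one_dirty_page(void)
{
    return 4096;
}

/* Time-bounded stage-2 pass: instead of draining the whole dirty bitmap
 * in one go (which is what stalls the guest on VMs with a lot of memory),
 * stop after roughly max_downtime_ns and let the next iteration pick up
 * the remaining dirty pages. */
static uint64_t stage2_send_some(int64_t max_downtime_ns)
{
    int64_t start = clock_ns();
    uint64_t total = 0;
    int bytes;

    while ((bytes = send_one_dirty_page()) > 0) {
        total += bytes;
        if (clock_ns() - start > max_downtime_ns) {
            break;  /* give the guest a chance to run again */
        }
    }
    return total;
}

int main(void)
{
    /* e.g. a 30 ms budget per pass */
    uint64_t sent = stage2_send_some(30 * 1000000LL);
    printf("sent %llu bytes in this pass\n", (unsigned long long)sent);
    return 0;
}

In the real code the bandwidth and expected-downtime calculation of
course still has to happen; this only illustrates bounding the time
spent in one pass.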
>
> (BTW, bytes_sent = 0 is very rare).

I know, but once the VM is stopped there is no issue. I had simply
misread your "No, stage 3 is entered ..." ;-)

Thanks for your help and explanations.

Peter

>
> Paolo