From mboxrd@z Thu Jan  1 00:00:00 1970
From: Peter Lieven
Subject: Re: [Qemu-devel] Stalls on Live Migration of VMs with a lot of memory
Date: Wed, 04 Jan 2012 12:42:10 +0100
Message-ID: <4F043B12.60501@dlh.net>
References: <032f49425e7284e9f050064cd30855bb@mail.dlh.net> <4F03AD98.7020700@linux.vnet.ibm.com> <4F042FA1.5090909@dlh.net> <4F04326F.8080808@redhat.com> <4F043689.2000604@dlh.net> <4F0437DA.8080600@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Shu Ming, qemu-devel@nongnu.org, kvm@vger.kernel.org
To: Paolo Bonzini
Return-path:
Received: from ssl.dlh.net ([91.198.192.8]:47900 "EHLO ssl.dlh.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753694Ab2ADLmM (ORCPT); Wed, 4 Jan 2012 06:42:12 -0500
In-Reply-To: <4F0437DA.8080600@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 04.01.2012 12:28, Paolo Bonzini wrote:
> On 01/04/2012 12:22 PM, Peter Lieven wrote:
>>> There were patches to move RAM migration to a separate thread. The
>>> problem is that they broke block migration.
>>>
>>> However, asynchronous NBD is in and streaming will follow suit soon.
>>> As soon as we have those two features, we might as well remove the
>>> block migration code.
>>
>> ok, so it's a matter of time, right?
>
> Well, there are other solutions of varying complexity in the works
> that might remove the need for the migration thread or at least reduce
> the problem (post-copy migration, XBRLE, vectorized hot loops). But
> yes, we are aware of the problem and we should solve it in one way or
> the other.

I have read about all of these approaches and they all seem promising.

>> would it make sense to patch ram_save_block to always process a full ram
>> block?
>
> If I understand the proposal, then migration would hardly be live
> anymore. The biggest RAM block in a 32G machine is, well, 32G big.
> Other RAM blocks are for the VRAM and for some BIOS data, but they are
> very small in proportion.

OK, then I misunderstood the RAM block concept; I thought the guest RAM
consisted of a collection of many small RAM blocks.

Let me describe it differently then: would it make sense to process bigger
portions of memory (e.g. 1M) in stage 2, so that we reduce the number of
calls to cpu_physical_memory_reset_dirty and instead run it once over a
bigger region? We might lose a few dirty pages that way, but they would be
tracked again in the next iteration of stage 2, or in stage 3 at the
latest. What would be necessary is that nobody marks a page dirty while I
copy the dirty information for the portion of memory I want to process.
(See the sketch at the end of this mail.)

>> - in stage 3 the vm is stopped, right? so there can't be any more dirty
>> blocks after scanning the whole memory once?
>
> No, stage 3 is entered when there are very few dirty memory pages
> remaining. This may happen after scanning the whole memory many
> times. It may even never happen if migration does not converge
> because of low bandwidth or too strict downtime requirements.

OK. Is there a chance that I lose one final page if it is modified just
after I have walked over it and I find no other dirty page (so
bytes_sent = 0)?

Peter

>
> Paolo
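
To make the 1 MiB batching idea above concrete, here is a small
self-contained toy model. It is only a sketch of the ordering I have in
mind, not QEMU code: guest_dirty[], send_page() and migrate_chunk() are
invented names, and a plain bool array stands in for the real dirty bitmap
behind cpu_physical_memory_reset_dirty(). The point is to snapshot the
dirty bits for a whole chunk, clear them in one bulk operation instead of
per page, and then transmit; a page dirtied in between is simply caught on
the next pass over memory.

    /*
     * Toy model of batched dirty-bit handling for live migration stage 2.
     * All names are made up for illustration; this is not QEMU code.
     */
    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define PAGE_SIZE        4096UL
    #define CHUNK_SIZE       (1024UL * 1024UL)          /* process 1 MiB at a time */
    #define PAGES_PER_CHUNK  (CHUNK_SIZE / PAGE_SIZE)   /* 256 pages per chunk */
    #define GUEST_PAGES      (64UL * 1024UL)            /* 256 MiB toy guest */

    static bool guest_dirty[GUEST_PAGES];   /* stand-in for the dirty bitmap */

    /* Stand-in for putting one page on the migration stream. */
    static void send_page(uint64_t page)
    {
        printf("sending page %" PRIu64 "\n", page);
    }

    /*
     * Process one 1 MiB chunk: copy the dirty bits, clear them for the
     * whole chunk in one bulk reset (instead of one reset per 4K page),
     * then transmit every page that was dirty in the snapshot.  A page
     * dirtied between the copy and the clear is missed here, but it will
     * be found again in the next iteration of stage 2.
     */
    static uint64_t migrate_chunk(uint64_t first_page)
    {
        bool snapshot[PAGES_PER_CHUNK];
        uint64_t sent = 0;

        memcpy(snapshot, &guest_dirty[first_page], sizeof(snapshot));
        memset(&guest_dirty[first_page], 0, sizeof(snapshot)); /* bulk reset */

        for (uint64_t i = 0; i < PAGES_PER_CHUNK; i++) {
            if (snapshot[i]) {
                send_page(first_page + i);
                sent++;
            }
        }
        return sent;
    }

    int main(void)
    {
        /* Pretend the guest dirtied a few pages since the last pass. */
        guest_dirty[3] = guest_dirty[300] = guest_dirty[5000] = true;

        uint64_t total = 0;
        for (uint64_t page = 0; page < GUEST_PAGES; page += PAGES_PER_CHUNK) {
            total += migrate_chunk(page);
        }
        printf("sent %" PRIu64 " pages in this pass\n", total);
        return 0;
    }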