From: Lei Li
To: Paolo Bonzini
Cc: aarcange@redhat.com, quintela@redhat.com, qemu-devel@nongnu.org, mrhines@linux.vnet.ibm.com, mdroth@linux.vnet.ibm.com, Anthony Liguori, lagarcia@br.ibm.com, rcj@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH 0/17 v2] Localhost migration with side channel for ram
Date: Fri, 25 Oct 2013 20:24:07 +0800
Message-ID: <526A62E7.9080201@linux.vnet.ibm.com>
In-Reply-To: <526A1DF8.2040406@redhat.com>
References: <1382412341-1173-1-git-send-email-lilei@linux.vnet.ibm.com> <52692C10.3080604@redhat.com> <526A0870.3020401@linux.vnet.ibm.com> <526A1DF8.2040406@redhat.com>

On 10/25/2013 03:30 PM, Paolo Bonzini wrote:
> On 25/10/2013 06:58, Lei Li wrote:
>> Right now I only have inaccurate numbers without the new vmsplice,
>> based on the result of 'info migrate'.  As the guest ram size
>> increases, the 'total time' is several times lower than with the
>> current live migration, but the 'downtime' performs badly.
> Of course.
>
>> For a 1GB ram guest,
>>
>> total time: 702 milliseconds
>> downtime: 692 milliseconds
>>
>> And when the ram size of the guest increases exponentially, those
>> numbers grow proportionally with it.
>>
>> I will make a list of the performance numbers with the new vmsplice
>> later; I am sure it'd be much better than this at least.
> Yes, please.  Is the memory usage still 2x without vmsplice?
>
> I think you have a nice proof of concept, but on the other hand this
> probably needs to be coupled with some kind of postcopy live migration,
> that is:
>
> * the source starts sending data
>
> * but the destination starts running immediately
>
> * if the machine needs a page that is missing, the destination asks the
>   source to send it
>
> * as soon as it arrives, the destination can restart
>
> Using postcopy is problematic for reliability: if the destination fails,
> the virtual machine is lost because the source doesn't have the latest
> content of memory.
> However, this is a much, much smaller problem for
> live QEMU upgrade where the network cannot fail.
>
> If you do this, you can achieve pretty much instantaneous live upgrade,
> well within your original 200 ms goals.  But the flipping code with
> vmsplice should be needed anyway to avoid doubling memory usage, and

Yes, I have read the postcopy migration patches; postcopy does perform
very well on downtime, as it just sends the vmstate and then switches
execution to the destination host.  And as you pointed out, it cannot
avoid doubling memory usage.

The numbers listed above are based on the old vmsplice, as I have not
yet worked on the performance benchmark; the old vmsplice actually
copies data rather than moving it.  As the feedback for this version is
positive, I am now trying to get a real result out with the new
vmsplice.

BTW, the kernel side is looking at a huge page solution to improve
performance.  The recent kernel patches are at this link:
http://article.gmane.org/gmane.linux.kernel/1574277

> it's looking pretty good in this version already!  I'm relieved that
> the RDMA code was designed right!

I am happy with it too. :) Those RDMA hooks really make things more
flexible!

> Paolo
>

--
Lei
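
P.S. For anyone following the copy-vs-move point above, here is a
minimal sketch of gifting pages into a pipe with the stock vmsplice(2)
interface.  This is not code from this series or from the kernel patches
linked above; it only illustrates the existing API: SPLICE_F_GIFT tells
the kernel the sender will not touch the pages again, so a later
splice() with SPLICE_F_MOVE may steal the page frames instead of copying
them, though whether a copy is really avoided is up to the kernel.

/* Sketch only: gift one page-aligned buffer into a pipe. */
#define _GNU_SOURCE
#include <fcntl.h>      /* vmsplice(), SPLICE_F_GIFT */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>    /* struct iovec */
#include <unistd.h>

int main(void)
{
    int pipefd[2];
    long pagesz = sysconf(_SC_PAGESIZE);
    void *page;

    if (pipe(pipefd) < 0) {
        perror("pipe");
        return 1;
    }

    /* Page-aligned buffer standing in for a chunk of guest RAM. */
    if (posix_memalign(&page, pagesz, pagesz) != 0) {
        return 1;
    }
    memset(page, 0x5a, pagesz);

    struct iovec iov = { .iov_base = page, .iov_len = pagesz };

    /* Gift the page to the pipe; the buffer must not be reused after this. */
    ssize_t n = vmsplice(pipefd[1], &iov, 1, SPLICE_F_GIFT);
    if (n < 0) {
        perror("vmsplice");
        return 1;
    }
    printf("gifted %zd bytes into the pipe\n", n);

    /* The consuming end would drain pipefd[0], e.g. with splice() and
     * SPLICE_F_MOVE; how the gifted pages end up mapped as guest RAM on
     * the destination is what the "new vmsplice" discussion is about. */
    return 0;
}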