From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4BDEBC09.5020501@linux.vnet.ibm.com>
Date: Mon, 03 May 2010 07:05:29 -0500
From: Anthony Liguori
References: <1271829445-5328-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
 <1271829445-5328-6-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
 <4BD0A35E.8000205@linux.vnet.ibm.com> <4BD11604.3060309@lab.ntt.co.jp>
 <4BD19F12.2020004@linux.vnet.ibm.com> <4BD1A52C.1090406@redhat.com>
Subject: [Qemu-devel] Re: [RFC PATCH 05/20] Introduce put_vector() and
 get_vector to QEMUFile and qemu_fopen_ops().
List-Id: qemu-devel.nongnu.org
To: Yoshiaki Tamura
Cc: ohmura.kei@lab.ntt.co.jp, kvm@vger.kernel.org, mtosatti@redhat.com,
 Anthony Liguori, qemu-devel@nongnu.org, yoshikawa.takuya@oss.ntt.co.jp,
 Avi Kivity

On 05/03/2010 04:32 AM, Yoshiaki Tamura wrote:
> 2010/4/23 Avi Kivity:
>
>> On 04/23/2010 04:22 PM, Anthony Liguori wrote:
>>
>>>> I currently don't have data, but I'll prepare it.
>>>> There were two things I wanted to avoid:
>>>>
>>>> 1. Pages being copied into the QEMUFile buf through qemu_put_buffer.
>>>> 2. Calling write() every time, even when we want to send multiple
>>>>    pages at once.
>>>>
>>>> I think 2 may be negligible.
>>>> But 1 seems to be problematic if we want to make the latency as small
>>>> as possible, no?
>>>>
>>>
>>> Copying often has strange CPU characteristics depending on whether the
>>> data is already in cache.  It's better to drive these sorts of
>>> optimizations through performance measurement, because the effects are
>>> not always obvious.
>>>
>> Copying always introduces more cache pollution, so even if the data is
>> in the cache, avoiding it is worthwhile (not disagreeing with the need
>> to measure).
>>
> Anthony,
>
> I measured how long it takes to send all guest pages during migration,
> and I would like to share the results in this message.  For convenience,
> I modified the code to do plain migration rather than "live migration",
> which means the buffered file is not used here.
>
> In summary, the performance improvement from using writev instead of
> write/send over GbE seems to be negligible; however, when the underlying
> network was fast (InfiniBand with IPoIB in this case), writev performed
> 17% faster than write/send, and therefore it may be worthwhile to
> introduce vectors.
>
> Since QEMU compresses pages, I copied a junk file to tmpfs to dirty
> pages, so that QEMU would transfer a sufficient number of uncompressed
> pages.  After setting up the guest, I used cpu_get_real_ticks() to
> measure the time spent in the while loop calling ram_save_block() in
> ram_save_live().  I removed the qemu_file_rate_limit() check to disable
> the buffered-file rate limiting, so all of the pages were transferred in
> the first round.
>
> I measured 10 runs for each case and took the average and standard
> deviation.  Considering the results, I think the number of trials was
> enough.  In addition to the elapsed time, the number of writev/write
> calls and the number of pages that were compressed (dup) / not
> compressed (nodup) are shown.
>
> Test environment:
> CPU: 2x Intel Xeon Dual Core 3GHz
> Mem size: 6GB
> Network: GbE, InfiniBand (IPoIB)
>
> Host OS: Fedora 11 (kernel 2.6.34-rc1)
> Guest OS: Fedora 11 (kernel 2.6.33)
> Guest Mem size: 512MB
>
> * GbE writev
> time (sec): 35.732 (std 0.002)
> write count: 4 (std 0)
> writev count: 8269 (std 1)
> dup count: 36157 (std 124)
> nodup count: 1016808 (std 147)
>
> * GbE write
> time (sec): 35.780 (std 0.164)
> write count: 127367 (std 21)
> writev count: 0 (std 0)
> dup count: 36134 (std 108)
> nodup count: 1016853 (std 165)
>
> * IPoIB writev
> time (sec): 13.889 (std 0.155)
> write count: 4 (std 0)
> writev count: 8267 (std 1)
> dup count: 36147 (std 105)
> nodup count: 1016838 (std 111)
>
> * IPoIB write
> time (sec): 16.777 (std 0.239)
> write count: 127364 (std 24)
> writev count: 0 (std 0)
> dup count: 36173 (std 169)
> nodup count: 1016840 (std 190)
>
> Although the improvement wasn't obvious when the network was GbE,
> introducing writev may be worthwhile when we focus on faster networks
> like InfiniBand/10GbE.
>
> I agree with separating this optimization from the main logic of Kemari,
> since this modification must be applied widely and carefully at the same
> time.
>

Okay.  It's clear that this is a win, so let's split it out of the main
series and treat it separately.  I imagine we'll see even more positive
results on 10 GbE, particularly if we move migration out into a separate
thread.

Regards,

Anthony Liguori

> Thanks,
>
> Yoshi
>
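
To make the write-versus-writev comparison above concrete, here is a
minimal, self-contained sketch of the two I/O strategies being measured.
This is illustrative C, not the actual QEMU patch: the function names
send_pages_write()/send_pages_writev(), the BUF_PAGES and IOV_BATCH
constants, and the omission of short-write handling are all simplifying
assumptions; the real series instead adds put_vector()/get_vector()
callbacks to QEMUFile and qemu_fopen_ops(), as the subject says.

#include <stdint.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#define PAGE_SIZE 4096          /* guest page size assumed here */
#define BUF_PAGES 8             /* illustrative staging-buffer size */
#define IOV_BATCH 128           /* illustrative writev batch size */

/*
 * Strategy 1: copy each page into a staging buffer and flush it with
 * write().  Every page is memcpy'd once (extra cache pollution) and each
 * flush is a separate system call.  Short writes are not retried, for
 * brevity.
 */
static ssize_t send_pages_write(int fd, uint8_t *pages[], int npages)
{
    static uint8_t buf[BUF_PAGES * PAGE_SIZE];
    ssize_t total = 0;
    int i, filled = 0;

    for (i = 0; i < npages; i++) {
        memcpy(buf + filled * PAGE_SIZE, pages[i], PAGE_SIZE);
        if (++filled == BUF_PAGES || i == npages - 1) {
            ssize_t n = write(fd, buf, (size_t)filled * PAGE_SIZE);
            if (n < 0) {
                return -1;
            }
            total += n;
            filled = 0;
        }
    }
    return total;
}

/*
 * Strategy 2: gather pointers to the guest pages into an iovec and hand a
 * whole batch to writev().  No copy into a staging buffer is made, and one
 * system call covers up to IOV_BATCH pages.
 */
static ssize_t send_pages_writev(int fd, uint8_t *pages[], int npages)
{
    struct iovec iov[IOV_BATCH];
    ssize_t total = 0;
    int i, cnt = 0;

    for (i = 0; i < npages; i++) {
        iov[cnt].iov_base = pages[i];
        iov[cnt].iov_len  = PAGE_SIZE;
        if (++cnt == IOV_BATCH || i == npages - 1) {
            ssize_t n = writev(fd, iov, cnt);
            if (n < 0) {
                return -1;
            }
            total += n;
            cnt = 0;
        }
    }
    return total;
}

int main(void)
{
    static uint8_t page_a[PAGE_SIZE], page_b[PAGE_SIZE];
    uint8_t *pages[] = { page_a, page_b };

    memset(page_a, 'A', PAGE_SIZE);
    memset(page_b, 'B', PAGE_SIZE);

    /* Send two pages each way to stdout, e.g. piped into "wc -c". */
    send_pages_write(STDOUT_FILENO, pages, 2);
    send_pages_writev(STDOUT_FILENO, pages, 2);
    return 0;
}

The second variant hands guest pages to the kernel by reference, so the
memcpy() into a staging buffer disappears and far fewer system calls are
issued, which is consistent with the write/writev call counts and the
IPoIB timings reported above.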