From: Yoshiaki Tamura
Date: Mon, 3 May 2010 18:32:53 +0900
Subject: [Qemu-devel] Re: [RFC PATCH 05/20] Introduce put_vector() and get_vector to QEMUFile and qemu_fopen_ops().
To: Avi Kivity, Anthony Liguori
Cc: Anthony Liguori, kvm@vger.kernel.org, ohmura.kei@lab.ntt.co.jp, mtosatti@redhat.com, qemu-devel@nongnu.org, yoshikawa.takuya@oss.ntt.co.jp

2010/4/23 Avi Kivity:
> On 04/23/2010 04:22 PM, Anthony Liguori wrote:
>>>
>>> I currently don't have data, but I'll prepare it.
>>> There were two things I wanted to avoid.
>>>
>>> 1. Pages being copied into the QEMUFile buf through qemu_put_buffer.
>>> 2. Calling write() every time, even when we want to send multiple
>>> pages at once.
>>>
>>> I think 2 may be negligible.
>>> But 1 seems to be problematic if we want to make the latency as small
>>> as possible, no?
>>
>> Copying often has strange CPU characteristics depending on whether the
>> data is already in cache.  It's better to drive these sorts of
>> optimizations through performance measurement because changes are not
>> always obvious.
>
> Copying always introduces more cache pollution, so even if the data is in
> the cache, it is worthwhile (not disagreeing with the need to measure).

Anthony,

I measured how long it takes to send all guest pages during migration, and
I would like to share the results in this message.  For convenience, I
modified the code to perform a normal (non-live) migration, which means the
buffered file is not used here.

In summary, the performance improvement from using writev instead of
write/send seems to be negligible on GbE; however, when the underlying
network was fast (InfiniBand with IPoIB in this case), writev performed 17%
faster than write/send, and therefore it may be worthwhile to introduce
vectors.

Since QEMU compresses pages, I copied a junk file to tmpfs to dirty pages,
so that QEMU would transfer a sufficient number of pages.  After setting up
the guest, I used cpu_get_real_ticks() to measure the time spent in the
while loop calling ram_save_block() in ram_save_live().
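(For readers following along, below is a minimal, self-contained sketch of
the idea being measured: queuing page pointers into an iovec array and
flushing them with a single writev(), instead of copying each page into the
QEMUFile buffer and issuing write() per chunk.  The names page_vector,
pv_flush() and pv_add_page() are hypothetical illustrations only; they are
not the put_vector()/get_vector() interfaces from the actual patch.)

/* Hypothetical sketch only -- the pv_* names are illustrative. */
#include <sys/uio.h>
#include <unistd.h>
#include <errno.h>
#include <stddef.h>

#define PV_MAX_IOV 1024    /* flush once this many pages are queued */

struct page_vector {
    struct iovec iov[PV_MAX_IOV];
    int niov;
    int fd;
};

/* Send all queued pages with as few writev() calls as possible. */
static int pv_flush(struct page_vector *pv)
{
    int i = 0;

    while (i < pv->niov) {
        ssize_t n = writev(pv->fd, &pv->iov[i], pv->niov - i);
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        /* skip fully written entries, trim a partially written one */
        while (i < pv->niov && (size_t)n >= pv->iov[i].iov_len) {
            n -= pv->iov[i].iov_len;
            i++;
        }
        if (n > 0) {
            pv->iov[i].iov_base = (char *)pv->iov[i].iov_base + n;
            pv->iov[i].iov_len -= n;
        }
    }
    pv->niov = 0;
    return 0;
}

/* Queue a page by reference; no copy into an intermediate buffer. */
static int pv_add_page(struct page_vector *pv, void *page, size_t len)
{
    if (pv->niov == PV_MAX_IOV && pv_flush(pv) < 0) {
        return -1;
    }
    pv->iov[pv->niov].iov_base = page;
    pv->iov[pv->niov].iov_len  = len;
    pv->niov++;
    return 0;
}

The measurement below exercises exactly this trade-off: the write() variant
copies data through the QEMUFile buffer and issues one syscall per
buffer-full, while the vectored variant leaves the guest pages in place
until the kernel picks them up, which is where the reduced copying and
cache pollution come from.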
I removed the qemu_file_rate_limit() check to disable the buffered-file
rate limiting, so all of the pages were transferred in the first round.  I
measured 10 runs for each case and took the average and standard deviation;
considering the results, I think the number of trials was sufficient.  In
addition to the elapsed time, the number of writev/write calls and the
number of pages which were compressed (dup) / not compressed (nodup) are
shown.

Test Environment:
CPU: 2x Intel Xeon Dual Core 3GHz
Mem size: 6GB
Network: GbE, InfiniBand (IPoIB)
Host OS: Fedora 11 (kernel 2.6.34-rc1)
Guest OS: Fedora 11 (kernel 2.6.33)
Guest Mem size: 512MB

* GbE writev
time (sec): 35.732 (std 0.002)
write count: 4 (std 0)
writev count: 8269 (std 1)
dup count: 36157 (std 124)
nodup count: 1016808 (std 147)

* GbE write
time (sec): 35.780 (std 0.164)
write count: 127367 (std 21)
writev count: 0 (std 0)
dup count: 36134 (std 108)
nodup count: 1016853 (std 165)

* IPoIB writev
time (sec): 13.889 (std 0.155)
write count: 4 (std 0)
writev count: 8267 (std 1)
dup count: 36147 (std 105)
nodup count: 1016838 (std 111)

* IPoIB write
time (sec): 16.777 (std 0.239)
write count: 127364 (std 24)
writev count: 0 (std 0)
dup count: 36173 (std 169)
nodup count: 1016840 (std 190)

Although the improvement wasn't obvious when the network was GbE,
introducing writev may be worthwhile when we focus on faster networks like
InfiniBand/10GE.

I agree with separating this optimization from the main logic of Kemari,
since this modification must be made widely and carefully at the same time.

Thanks,

Yoshi