Date: Mon, 23 Sep 2013 15:27:58 +0200
From: Stefan Hajnoczi
To: Jonghwan Choi
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] I/O performance degradation with Virtio-Blk-Data-Plane
Message-ID: <20130923132758.GC5814@stefanha-thinkpad.redhat.com>
In-Reply-To: <002b01cea9d5$d4340430$7c9c0c90$%choi@samsung.com>

On Thu, Sep 05, 2013 at 10:18:28AM +0900, Jonghwan Choi wrote:

Thanks for posting these details.  Have you tried running x-data-plane=off
with vcpu = 8, and how does the performance compare to x-data-plane=off
with vcpu = 1?

> > 1. The fio results so it's clear which cases performed worse and by how
> > much.
>
> When I set vcpu = 8, read performance decreased by about 25%.
> In my test I got the best performance with vcpu = 1.

So performance with vcpu = 8 is 25% worse than performance with vcpu = 1?

Can you try pinning threads to host CPUs?  See the libvirt emulatorpin and
vcpupin attributes (a small example fragment is appended at the end of this
mail):
http://libvirt.org/formatdomain.html#elementsCPUTuning

> > 2. The fio job files.
>
> [testglobal]
> description=high_iops
> exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
> group_reporting=1
> rw=read
> direct=1
> ioengine=sync
> bs=4m
> numjobs=1
> size=2048m

A couple of points to check:

1. This test case is synchronous and latency-sensitive; you are not
   benchmarking parallel I/Os, so x-data-plane=on is not expected to perform
   any better than x-data-plane=off.  The point of x-data-plane=on is to let
   smp > 1 guests with parallel I/O scale well.  Unless the workload meets
   both of those conditions, I don't expect you to see any gains over
   x-data-plane=off.

   If you want to try parallel I/Os, I suggest using:

     ioengine=libaio
     iodepth=16

2. size=2048m with bs=4m on an SSD seems quite small because the test
   completes quickly.  What is the overall running time of this test?

   In order to collect stable results it's usually a good idea for the test
   to run for a couple of minutes (e.g. 2 minutes minimum).  Otherwise
   outliers can influence the results too much.  You may need to increase
   'size' or use the 'runtime=2m' option instead.  (A job file with both
   changes folded in is sketched at the end of this mail.)

Stefan
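
As a rough sketch of the pinning mentioned above, the following fragment
goes inside the <domain> element of the guest XML.  The cpuset values are
only examples and must be adapted to your host topology (add one <vcpupin>
line per vcpu for an 8-vcpu guest):

  <cputune>
    <!-- example host CPU numbers, adjust for your machine -->
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <emulatorpin cpuset='4-5'/>
  </cputune>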
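
And a rough, untested sketch of the job file with the parallel I/O and
runtime suggestions from points 1 and 2 folded in (libaio support in the
guest is assumed):

  [testglobal]
  description=high_iops
  exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
  group_reporting=1
  rw=read
  direct=1
  ; asynchronous engine with 16 in-flight requests instead of ioengine=sync
  ioengine=libaio
  iodepth=16
  bs=4m
  numjobs=1
  size=2048m
  ; run for a fixed 2 minutes; time_based makes fio keep looping over the
  ; file until the runtime expires even if 'size' has already been read
  runtime=2m
  time_based=1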