From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:33633)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1R22zJ-0003gt-Ad for qemu-devel@nongnu.org;
	Fri, 09 Sep 2011 11:27:30 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1R22zI-0002gj-9J for qemu-devel@nongnu.org;
	Fri, 09 Sep 2011 11:27:29 -0400
Received: from mtagate7.uk.ibm.com ([194.196.100.167]:53175)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1R22zI-0002gF-0x for qemu-devel@nongnu.org;
	Fri, 09 Sep 2011 11:27:28 -0400
Received: from d06nrmr1507.portsmouth.uk.ibm.com
	(d06nrmr1507.portsmouth.uk.ibm.com [9.149.38.233])
	by mtagate7.uk.ibm.com (8.13.1/8.13.1) with ESMTP id p89FRNYA017129
	for ; Fri, 9 Sep 2011 15:27:23 GMT
Received: from d06av02.portsmouth.uk.ibm.com
	(d06av02.portsmouth.uk.ibm.com [9.149.37.228])
	by d06nrmr1507.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0)
	with ESMTP id p89FRNKP2523320
	for ; Fri, 9 Sep 2011 16:27:23 +0100
Received: from d06av02.portsmouth.uk.ibm.com (loopback [127.0.0.1])
	by d06av02.portsmouth.uk.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout)
	with ESMTP id p89FRM6x027805
	for ; Fri, 9 Sep 2011 09:27:23 -0600
Date: Fri, 9 Sep 2011 16:27:22 +0100
From: Stefan Hajnoczi
Message-ID: <20110909152722.GA29109@stefanha-thinkpad.localdomain>
References: <20110909094436.GB23929@f15.cn.ibm.com>
	<20110909103801.GA26148@stefanha-thinkpad.localdomain>
	<4E6A1CD7.70300@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4E6A1CD7.70300@redhat.com>
Subject: Re: [Qemu-devel] Why qemu write/rw speed is so low?
To: Kevin Wolf
Cc: aliguro@us.ibm.com, Stefan Hajnoczi, qemu-devel@nongnu.org,
	ryanh@us.ibm.com, Zhi Yong Wu, Zhi Yong Wu

On Fri, Sep 09, 2011 at 04:04:07PM +0200, Kevin Wolf wrote:
> On 09.09.2011 15:54, Stefan Hajnoczi wrote:
> > On Fri, Sep 9, 2011 at 2:48 PM, Zhi Yong Wu wrote:
> >> On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
> >> wrote:
> >>> On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
> >>>> Today I did some basic I/O testing and found that qemu write and rw speed is now very low. My qemu binary is built on commit 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
> >>>>
> >>>> Does qemu have a regression?
> >>>>
> >>>> The test data is shown below:
> >>>>
> >>>> 1.) write
> >>>>
> >>>> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
> >>>
> >>> Please post your QEMU command-line.  If your -drive is using
> >>> cache=writethrough then small writes are slow because they require the
> >>> physical disk to write and then synchronize its write cache.  Typically
> >>> cache=none is a good setting to use for local disks.
> >> I cannot access my workstation in the office right now.
> >> -drive if=virtio,cache=none,file=xxxx
> >>
> >>>
> >>> The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
> >>> so I think a 512 byte write from the guest could cause a 4 KB
> >>> read-modify-write operation on the host filesystem.
> >> You mean RCU? How does it work? Can you explain in more
> >> detail if you are available?
> >
> > If the host file system manages space in 4 KB blocks, then a 512 byte
> > write to an unallocated part of the file causes the file system to find
> > 4 KB of free space for this data.  Since the write is only 512 bytes and
> > does not cover the entire 4 KB region, the file system initializes the
> > remaining 3.5 KB with zeros and writes out the full 4 KB block.
> >
> > Now if a 512 byte write comes in for an allocated 4 KB block, then we
> > need to read in the existing 4 KB, modify the 512 bytes in place, and
> > write out the 4 KB block again.  This is read-modify-write.  In this
> > worst-case scenario a 512 byte write turns into a 4 KB read followed
> > by a 4 KB write.
>
> But that should only happen with a 4k sector size, otherwise there's no
> reason for RMW.

You're right.  For cache=none (O_DIRECT), the host file system should
not need to do read-modify-write because it can write the single sector
without caring what is in the surrounding 3.5 KB.

Stefan
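
To make the arithmetic in this thread concrete, here is a rough sketch of
the three cases discussed: a partial write to an unallocated block, a
partial write to an allocated block (read-modify-write), and a
sector-granular write as with O_DIRECT on a disk with 512 byte sectors.
It is only a simplified model, not how any real file system accounts for
I/O. The function name host_io_for_guest_write and its parameters are
made up for illustration.

```python
def host_io_for_guest_write(offset, length, block_size=4096,
                            allocated=True, sector_granular=False):
    """Return (bytes_read, bytes_written) on the host for one guest write.

    Simplified, hypothetical model of the cases described in the thread:
    - sector_granular=True: the write goes out as-is, no RMW needed
      (the O_DIRECT case with a sector size no larger than the write).
    - otherwise the host works in whole blocks of block_size bytes:
      a partially covered allocated block is read, modified and rewritten;
      a partially covered unallocated block is zero-filled and written.
    """
    if length <= 0:
        raise ValueError("length must be positive")

    if sector_granular:
        # The sector can be written without touching its neighbours.
        return (0, length)

    bytes_read = 0
    bytes_written = 0
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size

    for block in range(first_block, last_block + 1):
        block_start = block * block_size
        block_end = block_start + block_size
        # Number of bytes of this block that the guest write covers.
        covered = min(offset + length, block_end) - max(offset, block_start)
        if covered < block_size and allocated:
            # Read-modify-write: existing data must be read in first.
            bytes_read += block_size
        # Either way the whole block is written out.
        bytes_written += block_size

    return (bytes_read, bytes_written)


if __name__ == "__main__":
    print(host_io_for_guest_write(0, 512, allocated=True))
    print(host_io_for_guest_write(0, 512, allocated=False))
    print(host_io_for_guest_write(0, 512, sector_granular=True))
```

In this model the worst case for a 512 byte guest write is a 4 KB read
plus a 4 KB write, while the sector-granular case writes only the 512
bytes, which matches the conclusion above for cache=none.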