From: Avi Kivity
Date: Tue, 14 Oct 2008 18:42:45 +0200
Subject: Re: [Qemu-devel] Re: [RFC] Disk integrity in QEMU
To: qemu-devel@nongnu.org
Cc: Chris Wright, Mark McLoughlin, Ryan Harper, Laurent Vivier, kvm-devel

Anthony Liguori wrote:
>
> With 16k writes I think we hit a pathological case with the particular
> storage backend we're using, since it has many disks and the volume is
> striped. Also, the results are a bit different when going through a
> file system versus an LVM partition (the latter being the first data
> set). Presumably, this is because even with no flags, writes happen
> synchronously to an LVM partition.
>

With no flags, writes should hit the buffer cache (which is the page
cache's name when it is used to cache block devices).

> Also, cache=off seems to do pretty terribly when operating on an ext3
> file. I suspect this has to do with how ext3 implements O_DIRECT.

Is the file horribly fragmented? Otherwise ext3 O_DIRECT should be
quite good. Maybe the block mapping is not in the host cache and has to
be brought in.

>
> However, the data demonstrates pretty nicely that O_DSYNC gives you
> native write speed but accelerated read speed, which I think we agree
> is the desirable behavior. cache=off never seems to outperform
> cache=wt, which is another good argument for it being the default over
> cache=off.

Without copyless block I/O, there's no reason to expect cache=none to
outperform cache=writethrough. I expect the read performance to
evaporate with a random access pattern over a large disk (or even
sequential access, given enough running time).

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
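
P.S. For anyone following the thread without the background: a minimal
sketch of how the cache modes under discussion map onto open(2) flags
on Linux. This is illustrative only, not QEMU's actual code; the file
name, buffer size, and alignment are made-up values for the demo.
cache=writeback uses no extra flags (writes land in the host
page/buffer cache), cache=writethrough corresponds to O_DSYNC (write()
returns only once data reaches the device, reads still come from the
host cache), and cache=off corresponds to O_DIRECT (the host cache is
bypassed in both directions, and buffers must be aligned).

#define _GNU_SOURCE          /* O_DIRECT is a GNU extension */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *path = "cache-demo.img";  /* hypothetical test file */
    const size_t len = 4096;

    /* cache=writethrough-like: synchronous writes, cached reads. */
    int fd = open(path, O_CREAT | O_WRONLY | O_DSYNC, 0600);
    if (fd < 0) { perror("open O_DSYNC"); return 1; }

    /* O_DIRECT requires aligned buffers; 4k satisfies any sector size. */
    void *buf;
    if (posix_memalign(&buf, 4096, len)) { return 1; }
    memset(buf, 0, len);

    if (pwrite(fd, buf, len, 0) != (ssize_t)len) { perror("pwrite"); }
    close(fd);

    /* cache=off-like: O_DIRECT bypasses the host cache entirely. */
    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open O_DIRECT");  /* some filesystems reject O_DIRECT */
    } else {
        if (pread(fd, buf, len, 0) != (ssize_t)len) { perror("pread"); }
        close(fd);
    }

    free(buf);
    return 0;
}

The O_DSYNC write gives the "native write speed, accelerated read
speed" behavior described above, since the completed write is still
left in the page cache for later reads.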