Date: Tue, 11 May 2010 19:13:22 +0100
From: Jamie Lokier
To: Anthony Liguori
Cc: Kevin Wolf, Alexander Graf, armbru@redhat.com, qemu-devel@nongnu.org,
 Paul Brook, Christoph Hellwig
Subject: Re: [Qemu-devel] Re: [PATCH 2/2] Add flush=off parameter to -drive
Message-ID: <20100511181322.GA30446@shareable.org>
In-Reply-To: <4BE990CE.40505@codemonkey.ws>

Anthony Liguori wrote:
> qemu-img create -f raw foo.img 10G
> mkfs.ext3 foo.img
> mount -oloop,rw,barrier=1 -t ext3 foo.img mnt
>
> Works perfectly fine.

Hmm, interesting.  Didn't know loop propagated barriers.

So you're suggesting to use qemu with a loop device, and ext2 (a bit
faster than ext3) and barrier=0 (well, that's implied if you use ext2),
and a raw image file on the ext2/3 filesystem, to provide the effect of
flush=off, because the loop device caches block writes on the host
except for explicit barrier requests from the fs, which are turned off?

That wasn't obvious the first time :-)

Does the loop device cache fs writes instead of propagating them
immediately to the underlying fs?  I guess it probably does.

Does the loop device allow the backing file to grow sparsely, to get
behaviour like qcow2?  That's ugly, but it might just work.
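If I understand the suggestion, the whole setup would look something
like the following.  The names, sizes and the cache= choice are made
up, and I haven't tested that the loop device really behaves this way,
so treat it as a sketch only:

    # Container filesystem whose writes the loop device should keep in
    # the host page cache.  ext2 has no journal and issues no barriers,
    # so barrier=0 is implied:
    qemu-img create -f raw container.img 20G
    mkfs.ext2 -F container.img
    mkdir -p /mnt/scratch
    mount -o loop,rw -t ext2 container.img /mnt/scratch

    # Sparse raw image for the guest, stored inside the loop-mounted fs:
    qemu-img create -f raw /mnt/scratch/guest.img 10G

    # Run the guest against it; guest cache flushes should stop at the
    # loop device, because ext2 never sends barriers down:
    qemu -drive file=/mnt/scratch/guest.img,format=raw,cache=writeback

    # Check whether the loop backing file grows sparsely (qcow2-like):
    ls -ls container.img    # allocated blocks vs. apparent size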
> >2. barrier=0 does _not_ provide the cache=off behaviour.  It only
> >disables barriers; it does not prevent writing to the disk hardware.
>
> The proposal has nothing to do with cache=off.

Sorry, I meant flush=off (the proposal).  Mounting the host filesystem
(i.e. not using a loop device anywhere) with barrier=0 doesn't have
even close to the same effect.

> >>The problem with options added for developers is that those options are
> >>very often accidentally used for production.
> >>
> >We already have risky cache= options.  Also, do we call fdatasync
> >(with barrier) on _every_ write for guests which disable the
> >emulated disk cache?
>
> None of our cache= options should result in data corruption on power
> loss.  If they do, it's a bug.

(I might have the details below a bit off.)

If cache=none uses O_DIRECT without calling fdatasync for guest
barriers, then it will get data corruption on power loss.

If cache=none does call fdatasync for guest barriers, then it might
still get corruption on power loss; I am not sure whether recent Linux
host behaviour of O_DIRECT+fdatasync (with no buffered writes to
commit) issues the necessary barriers.  I am quite sure that older
kernels did not.

cache=writethrough will get data corruption on power loss with older
Linux host kernels: O_DSYNC did not issue barriers, and I'm not sure
whether the recently changed O_DSYNC implementation now issues a
barrier after every write.

Provided all the cache= options call fdatasync/fsync when the guest
issues a cache flush, and call fdatasync/fsync following _every_ write
when the guest has disabled the emulated write cache, that should be
as good as Qemu can reasonably do.  It's up to the host from there.

-- 
Jamie
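P.S. One way to check what a given cache= mode actually does is to
watch the flush-related syscalls Qemu issues while the guest writes.
The image name and cache mode below are just for illustration:

    # Trace cache-flush syscalls from qemu and all of its threads:
    strace -f -e trace=fsync,fdatasync,sync_file_range \
        qemu -drive file=guest.img,format=raw,cache=writethrough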