Date: Sun, 12 Oct 2008 17:22:59 +0100
From: Jamie Lokier
Subject: Re: [Qemu-devel] [RFC] Disk integrity in QEMU
Message-ID: <20081012162258.GE18814@shareable.org>
In-Reply-To: <20081012015003.GD9763@acer.localdomain>
References: <48EE38B9.2050106@codemonkey.ws> <48EF1D55.7060307@redhat.com> <48F0E83E.2000907@redhat.com> <48F10DFD.40505@codemonkey.ws> <48F14814.7000805@redhat.com> <20081012015003.GD9763@acer.localdomain>
Reply-To: qemu-devel@nongnu.org
To: qemu-devel@nongnu.org
Cc: Chris Wright, Mark McLoughlin, kvm@vger.kernel.org, Laurent Vivier, Ryan Harper, Mark Wagner

Chris Wright wrote:
> Either wt or uncached (so host O_DSYNC or O_DIRECT) would suffice to get
> it through to host's storage subsystem, and I think that's been the core
> of the discussion (plus defaults, etc).

Just want to point out that the storage commitment from O_DIRECT can be
_weaker_ than O_DSYNC.

On Linux, O_DIRECT never uses storage-device barriers or transactions,
but O_DSYNC sometimes does, and fsync is even more likely to do so than
O_DSYNC.
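To make the distinction concrete, here is a minimal C sketch (Linux/glibc assumed) that writes one block with a caller-chosen open flag and then calls fdatasync() regardless. The helper name write_block_committed, the 4096-byte BLOCK_SIZE, and the idea of always following the write with fdatasync() are my own illustration, not anything qemu does. Whether fdatasync() actually results in a device cache flush still depends on the host kernel, filesystem and mount options.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumed alignment; the real O_DIRECT requirement depends on the
 * device and filesystem. */
#define BLOCK_SIZE 4096

/* Hypothetical helper: write one block to `path`, then explicitly ask
 * the kernel to commit it. `extra_flags` is 0, O_DSYNC or O_DIRECT.
 * Returns 0 on success, -1 on any failure. */
int write_block_committed(const char *path, int extra_flags,
                          unsigned char fill)
{
    void *buf = NULL;
    int rc = -1;

    /* O_DIRECT needs an aligned buffer, length and file offset. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;
    memset(buf, fill, BLOCK_SIZE);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags, 0600);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    if (write(fd, buf, BLOCK_SIZE) == (ssize_t)BLOCK_SIZE) {
        /* Bypassing the page cache with O_DIRECT does not by itself
         * flush the storage device's write cache, so request a commit
         * explicitly instead of relying on the open flag alone. */
        if (fdatasync(fd) == 0)
            rc = 0;
    }

    if (close(fd) != 0)
        rc = -1;
    free(buf);
    return rc;
}
```

A caller would use something like write_block_committed("disk.img", O_DIRECT, 0) and treat a non-zero return as "not committed". Note that open() with O_DIRECT can fail with EINVAL on filesystems that do not support it.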
I'm not certain, but I think the same applies to other host OSes too -
including Windows, which has its own equivalents to O_DSYNC and
O_DIRECT, and extra documented semantics when they are used together.

Although this is a host implementation detail, unfortunately it means
that O_DIRECT=no-cache and O_DSYNC=write-through-cache is not an
accurate characterisation.

Some might be misled into assuming that "cache=off" commits their data
to hard storage as strongly as "cache=wt" would.

I think you can assume this only when the underlying storage devices'
write caches are disabled. You cannot assume this if the host
filesystem uses barriers instead of disabling the storage devices'
write cache.

Unfortunately there's not a lot qemu can do about these various quirks,
but at least they should be documented, so that someone requiring
storage commitment (e.g. for a critical guest database) is advised to
investigate whether O_DIRECT and/or O_DSYNC give them what they require
with their combination of host kernel, filesystem, filesystem options
and storage device(s).

-- 
Jamie