From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1JTkHh-0002r2-ES
	for qemu-devel@nongnu.org; Mon, 25 Feb 2008 15:50:49 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1JTkHe-0002qZ-TE
	for qemu-devel@nongnu.org; Mon, 25 Feb 2008 15:50:48 -0500
Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1JTkHe-0002qW-O0
	for qemu-devel@nongnu.org; Mon, 25 Feb 2008 15:50:46 -0500
Received: from mail2.shareable.org ([80.68.89.115])
	by monty-python.gnu.org with esmtps
	(TLS-1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from )
	id 1JTkHe-0007LU-BQ
	for qemu-devel@nongnu.org; Mon, 25 Feb 2008 15:50:46 -0500
Received: from jamie by mail2.shareable.org with local (Exim 4.63)
	(envelope-from ) id 1JTkHY-0005PI-VJ
	for qemu-devel@nongnu.org; Mon, 25 Feb 2008 20:50:41 +0000
Date: Mon, 25 Feb 2008 20:50:40 +0000
From: Jamie Lokier
Subject: Re: [Qemu-devel] [PATCH] ide.c make write cacheing controllable by guest
Message-ID: <20080225205040.GA18613@shareable.org>
References: <18371.1341.577787.909764@mariner.uk.xensource.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <18371.1341.577787.909764@mariner.uk.xensource.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: qemu-devel@nongnu.org

Ian Jackson wrote:
Content-Description: message body text
> The attached patch implements the ATA write cache feature.  This
> enables a guest to control, in the standard way, whether disk writes
> are immediately committed to disk before the IDE command completes, or
> may be buffered in the host.
>
> In this patch, by default buffering is off, which provides better
> reliability but may have a performance impact.
> It would be straightforward to change the default, or perhaps offer a
> command-line option, if that would be preferred.
>
> This patch is derived from one which was originally submitted to the
> Xen tree by Rik van Riel.

This is a very sensible improvement, imho.

However, I notice that it tells the guest that data is committed to
hard storage when the host has merely called fsync().

On Linux (and other host OSes), fdatasync() and fsync() don't always
commit data to hard storage; they sometimes only commit it to the hard
drive cache.

(Seriously, just look at fs/ext3/fsync.c; only journal writes cause
the flush, and they aren't done if the inode itself hasn't changed).

It may be worth mentioning in documentation that for guests which need
strong durability guarantees, e.g. for critical database work or for
filesystem journalling safety following host power failure, it is not
enough to disable the IDE write cache in the guest, even with this
patch.  It is necessary to disable the host's disk write cache too.

Ideally, the host would provide a variation of fdatasync() which
flushes data to hard storage in the same way that kernel filesystem
journal writes can do, and Qemu would use that.  But, presently, I'm
not aware of any way to do that short of the administrator disabling
the host's disk write cache.

(Darwin provides F_FULLFSYNC.  On Linux, an extra flag to
sync_file_range() suggests itself.  It would need changes to the block
device and elevator APIs, though, as it's a flush command not an
ordering tag, and not always associated with a prior or subsequent
write, although there are some coalescing optimisations when it can
be.)

-- Jamie
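
P.S. To make the Darwin case concrete, here is a rough sketch of what
a "flush as hard as the host allows" helper might look like.  It is
not taken from the patch: the name flush_to_stable_storage() is made
up for illustration, and the fallback to fsync() when the fcntl fails
is my assumption about what a caller would want.  The Darwin flag is
spelled F_FULLFSYNC in <fcntl.h>.  On hosts without it, the sketch can
do no better than fdatasync(), with the caveat described above.

```c
#if !defined(__APPLE__)
#define _POSIX_C_SOURCE 200809L  /* expose fdatasync() in <unistd.h> */
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 0 on success, -1 on error (errno set by the failing call).
 */
int flush_to_stable_storage(int fd)
{
#if defined(F_FULLFSYNC)
    /* Darwin: ask the drive itself to flush its write cache. */
    if (fcntl(fd, F_FULLFSYNC) == 0)
        return 0;
    /* Not every filesystem supports F_FULLFSYNC; fall back (assumption). */
    return fsync(fd);
#else
    /*
     * Other hosts: this may only push data as far as the drive's write
     * cache, depending on the kernel, the filesystem and the drive.
     */
    return fdatasync(fd);
#endif
}
```

Something like this would be called at the point where the IDE
emulation is about to report a write as complete with the guest's
write cache disabled.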