Date: Tue, 1 Aug 2006 11:17:44 +0100
From: Jamie Lokier
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Ensuring data is written to disk
Message-ID: <20060801101743.GA31760@mail.shareable.org>

Armistead, Jason wrote:
> I've been following the thread about disk data consistency with some
> interest.
> Given that many IDE disk drives may choose to hold data in their
> write buffers before actually writing it to disk, and given that the
> ordering of the writes may not be the same as the OS or application expects,
> the only obvious way I can see to overcome this, and ensure the data is
> truly written to the physical platters without disabling write caching is to
> overwhelm the disk drive with more data than can fit in its internal write
> buffer.
>
> So, if you have an IDE disk with an 8Mb cache, guess what, send it an 8Mb
> chunk of random data to write out when you do an fsync(). Better still,
> locate this 8Mb as close to the middle of the travel of its heads, so that
> performance is not affected any more than necessary. If the drive firmware
> uses a LILO or LRU policy to determine when to do its disk writes,
> overwhelming its buffers should ensure that the actual data you sent to it
> gets written out

It doesn't work.  I thought that too, for a while, as a way to avoid
sending CACHEFLUSH commands for fs journal ordering when there is a
lot of data being written anyway.

But there is no guarantee that the drive uses a LILO or LRU policy,
and if the firmware is optimised for cache performance over a range of
benchmarks, it won't use those - there are better strategies.

You could write 8MB to the drive, but it could easily pass through the
cache without evicting some of the other data you want written.
_Especially_ if the 8MB is written to an area in the middle of the
head sweep.

> Of course, guessing the disk drive write buffer size and trying not to kill
> system I/O performance with all these writes is another question entirely
> ... sigh !!!

If you just want to evict all data from the drive's cache, and don't
actually have other data to write, there is a CACHEFLUSH command you
can send to the drive which will be more dependable than writing as
much data as the cache size.

-- Jamie
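
For illustration only, here is a minimal sketch of the application-side alternative to padding the drive's cache: write the data, then make an explicit sync request. The function name write_and_sync is hypothetical and appears nowhere in this thread. Whether fsync() actually leads the kernel to issue a cache flush to the drive depends on the OS, the filesystem and the drive configuration, so treat that part as an assumption and not a guarantee.

```python
import os
import tempfile


def write_and_sync(path, data):
    """Write data to path, then ask the OS to push it to the storage device.

    Hypothetical helper for illustration. os.fsync() requests that the
    kernel write the file's buffers to the device; whether the drive's
    own write cache is flushed as well is platform-dependent (assumption).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write() may write fewer bytes than requested, so loop.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # One explicit sync request instead of cache-sized dummy writes.
        os.fsync(fd)
    finally:
        os.close(fd)
    return len(data)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    demo_path = os.path.join(demo_dir, "journal.bin")
    print(write_and_sync(demo_path, b"commit record"))
```

The point of the sketch is only that an explicit sync request states the intent directly, whereas cache-sized padding relies on guessing the firmware's eviction policy.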