Date: Mon, 13 Oct 2008 18:36:27 +0100
From: Jamie Lokier
Subject: Re: [Qemu-devel] [RFC] Disk integrity in QEMU
Message-ID: <20081013173627.GA2122@shareable.org>
References: <48EE38B9.2050106@codemonkey.ws> <1223914299.4153.22.camel@frecb07144> <48F37E33.7050307@codemonkey.ws>
In-Reply-To: <48F37E33.7050307@codemonkey.ws>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org
Cc: Chris Wright, Mark McLoughlin, Ryan Harper, kvm-devel

Anthony Liguori wrote:
> >perhaps I'm wrong but I think O_DSYNC (in fact O_SYNC for linux) will
> >impact host filesystem performance, at least with ext3, because the
> >synchronicity is done through the commit of the journal of the whole
> >filesystem:
> >
>
> Yes, but this is important because if the journal isn't committed, then
> it's possible that while the data would be on disk, the file system
> metadata is out of sync on disk which could result in the changes to the
> file being lost.
>
> I think that you are in fact correct that the journal write is probably
> unnecessary overhead in a lot of scenarios but Ryan actually has some
> performance data that he should be posting soon that shows that in most
> circumstances, O_DSYNC does pretty well compared to O_DIRECT for write
> so I don't think this is a practical concern.

fsync on ext3 is whacky anyway.  I haven't checked what the _real_
semantics of O_DSYNC are for ext3, but I would be surprised if it's
less whacky than fsync.

Sometimes ext3 fsync takes a very long time, because it's waiting for
lots of dirty data from other processes to be written.  (Firefox 3 was
bitten by this - it made Firefox stall repeatedly for up to half a
minute for some users.)

Sometimes ext3 fsync doesn't write all the dirty pages of a file -
there are some recent kernel patches exploring ways to fix this.

Sometimes ext3 fsync doesn't flush the disk's write cache after
writing data, despite barriers being requested, if only dirty data
blocks are written and there is no inode change.

-- Jamie
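
For reference, here is a minimal Python sketch of the two write paths
compared above: opening the file with O_DSYNC so every write() is
synchronous, versus plain buffered writes each followed by an explicit
fsync().  It is not QEMU code.  The function names, the 4 KiB block size
and the fallback when O_DSYNC is not defined are assumptions made for
illustration, and the timings it returns will depend heavily on the host
filesystem and on what else is dirtying pages at the time.

```python
import os
import tempfile
import time

# O_DSYNC is not defined on every platform.  Fall back to O_SYNC, then to
# no flag at all (in which case this path degrades to a buffered write).
SYNC_FLAG = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))

BLOCK = b"\0" * 4096  # illustrative block size, not taken from the thread


def _write_all(fd, data):
    """Call os.write() until the whole buffer has been handed to the kernel."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def write_with_dsync(path, blocks):
    """Write `blocks` blocks through a descriptor opened with O_DSYNC.

    Each write() should return only once the kernel considers the data
    stable.  Returns the elapsed time in seconds.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | SYNC_FLAG
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(blocks):
            _write_all(fd, BLOCK)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def write_then_fsync(path, blocks):
    """Write `blocks` blocks with buffered writes, calling fsync() after each.

    Returns the elapsed time in seconds.  As noted in the message, fsync
    returning does not by itself prove the disk's write cache was flushed.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(blocks):
            _write_all(fd, BLOCK)
            os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        t_dsync = write_with_dsync(os.path.join(tmp, "dsync.img"), 64)
        t_fsync = write_then_fsync(os.path.join(tmp, "fsync.img"), 64)
        print(f"O_DSYNC writes: {t_dsync:.4f}s")
        print(f"write + fsync:  {t_fsync:.4f}s")
```

Running it in a directory on the filesystem of interest (rather than the
default temporary directory, which may be tmpfs) while another process
writes heavily is one way to look for the long fsync stalls described
above, though a short run like this proves nothing about durability.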