Message-ID: <48F37E33.7050307@codemonkey.ws>
Date: Mon, 13 Oct 2008 11:58:27 -0500
From: Anthony Liguori
Subject: Re: [Qemu-devel] [RFC] Disk integrity in QEMU
References: <48EE38B9.2050106@codemonkey.ws> <1223914299.4153.22.camel@frecb07144>
In-Reply-To: <1223914299.4153.22.camel@frecb07144>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org
Cc: Chris Wright, Mark McLoughlin, Ryan Harper, kvm-devel

Laurent Vivier wrote:
> On Thursday, 09 October 2008 at 12:00 -0500, Anthony Liguori wrote:
> [...]
>
>> So to summarize, I think we should enable O_DSYNC by default to ensure
>> that guest data integrity is not dependent on the host OS, and that
>> practically speaking, cache=off is only useful for very specialized
>> circumstances.  Part of the patch I'll follow up with includes changes
>> to the man page to document all of this for users.
>>
>
> perhaps I'm wrong but I think O_DSYNC (in fact O_SYNC for linux) will
> impact host filesystem performance, at least with ext3, because the
> synchronicity is done through the commit of the journal of the whole
> filesystem:
>

Yes, but this is important: if the journal isn't committed, the data may
be on disk while the file system metadata on disk is out of sync, which
could result in the changes to the file being lost.

I think you are in fact correct that the journal write is probably
unnecessary overhead in a lot of scenarios. But Ryan has some performance
data, which he should be posting soon, showing that in most circumstances
O_DSYNC does pretty well compared to O_DIRECT for writes, so I don't think
this is a practical concern.

Regards,

Anthony Liguori

> see fs/ext3/file.c:ext3_file_write() (I've removed the comments here) :
>
> ...
> 	if (file->f_flags & O_SYNC) {
>
> 		if (!ext3_should_journal_data(inode))
> 			return ret;
>
> 		goto force_commit;
> 	}
>
>
> 	if (!IS_SYNC(inode))
> 		return ret;
>
> force_commit:
> 	err = ext3_force_commit(inode->i_sb);
> 	if (err)
> 		return err;
> 	return ret;
> }
>
> Moreover, the real behavior depends on the type of the journaling system
> you use...
>
> Regards,
> Laurent
>