From mboxrd@z Thu Jan 1 00:00:00 1970
From: Theodore Tso
Subject: Re: [rfc] fsync_range?
Date: Wed, 21 Jan 2009 09:12:07 -0500
Message-ID: <20090121141207.GD31253@mit.edu>
References: <20090120164726.GA24891@wotan.suse.de>
 <20090120183120.GD27464@shareable.org>
 <20090121012900.GD24891@wotan.suse.de>
 <20090121031500.GA2354@shareable.org>
 <20090121041604.GI24891@wotan.suse.de>
 <20090121045921.GA3944@shareable.org>
 <20090121062306.GK24891@wotan.suse.de>
 <20090121121308.GA31253@mit.edu>
 <20090121123711.GA10637@shareable.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Nick Piggin , linux-fsdevel@vger.kernel.org
To: Jamie Lokier
Return-path:
Received: from thunk.org ([69.25.196.29]:38135 "EHLO thunker.thunk.org"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1753282AbZAUOMN (ORCPT ); Wed, 21 Jan 2009 09:12:13 -0500
Content-Disposition: inline
In-Reply-To: <20090121123711.GA10637@shareable.org>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Wed, Jan 21, 2009 at 12:37:11PM +0000, Jamie Lokier wrote:
>
> What about btrfs with data checksums?  Doesn't that count among
> data-retrieval metadata?  What about nilfs, which always writes data
> to a new place?  Etc.
>
> I'm wondering what exactly sync_file_range() definitely writes, and
> what it doesn't write.
>
> If it's just in use by Oracle, and nobody's sure what it does, that
> smacks of those secret APIs in Windows that made Word run a bit faster
> than everyone else's word processor... sort of. :-)

Actually, I take that back; Oracle (and most other enterprise
databases; the world is not just Oracle --- there's also DB2, for
example) generally use Direct I/O, so I wonder if they are using
sync_file_range() at all.

I do wonder though how well or poorly Oracle will work on btrfs, or
indeed any filesystem that uses WAFL-like or log-structured
filesystem-like algorithms.
Most of the enterprise databases have been optimized for use on block
devices and filesystems where you do write-in-place accesses; and some
enterprise databases do their own data checksumming.  So if I had to
guess, I suspect the answer to the question I posed is
"disastrously".  :-)  After all, such db's generally are happiest when
the OS acts as a program loader that then gets the heck out of the way
of the filesystem, hence their use of DIO.

Which again brings me back to the question --- I wonder who is
actually using sync_file_range, and what for?  I would assume it is
some database, most likely; so maybe we should check with MySQL or
Postgres?

						- Ted