From: Theodore Tso <tytso@mit.edu>
To: Jamie Lokier <jamie@shareable.org>
Cc: Christoph Hellwig <hch@infradead.org>,
Jens Axboe <jens.axboe@oracle.com>,
linux-fsdevel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: O_DIRECT and barriers
Date: Fri, 21 Aug 2009 22:06:07 -0400
Message-ID: <20090822020607.GQ9529@mit.edu>
In-Reply-To: <20090822005613.GB22530@shareable.org>

On Sat, Aug 22, 2009 at 01:56:13AM +0100, Jamie Lokier wrote:
> AIX behaves like XFS according to documentation:
>
> [ http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.genprogc/doc/genprogc/fileio.htm ]
>
> Direct I/O and Data I/O Integrity Completion
>
> Although direct I/O writes are done synchronously, they do not
> provide synchronized I/O data integrity completion, as defined by
> POSIX. Applications that need this feature should use O_DSYNC in
> addition to O_DIRECT. O_DSYNC guarantees that all of the data and
> enough of the metadata (for example, indirect blocks) have written
> to the stable store to be able to retrieve the data after a system
> crash. O_DIRECT only writes the data; it does not write the
> metadata.
>
> That's another reason to use O_DIRECT|O_DSYNC in moderately portable
> code.

...or use fsync() when they need to guarantee that data has been
atomically written, but not before.  This becomes critically important
if the application is writing into a sparse file, or writing into
uninitialized blocks that were allocated using fallocate(); otherwise,
with O_DIRECT|O_DSYNC, the file system would have to do a commit
operation after each write, which could be a performance disaster.

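A minimal sketch of the fsync() approach, to make the trade-off
concrete.  It assumes Linux with glibc (so O_DIRECT is visible under
_GNU_SOURCE) and that a 4096-byte alignment satisfies the device and
file system, which is not guaranteed everywhere.  The function name,
BLOCK_SIZE, and the fill pattern are hypothetical and only there for
illustration.  Several O_DIRECT writes are issued and the file is
fsync()ed once at the point where durability is actually needed, so
the file system is not forced into a commit after every write:

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment and transfer size; the real requirement depends
 * on the device and the file system. */
#define BLOCK_SIZE 4096

/* Write nblocks aligned blocks with O_DIRECT, then fsync() once.
 * Returns 0 on success or a negative errno value on failure. */
int write_blocks_then_fsync(const char *path, size_t nblocks)
{
    void *buf = NULL;
    int err = posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE);
    if (err != 0)
        return -err;
    memset(buf, 0xab, BLOCK_SIZE);

    /* Note: no O_DSYNC here, so individual writes carry no
     * data integrity guarantee on their own. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0) {
        err = errno;
        free(buf);
        return -err;
    }

    for (size_t i = 0; i < nblocks; i++) {
        ssize_t n = pwrite(fd, buf, BLOCK_SIZE, (off_t)(i * BLOCK_SIZE));
        if (n != BLOCK_SIZE) {
            err = (n < 0) ? errno : EIO;   /* treat short writes as errors */
            close(fd);
            free(buf);
            return -err;
        }
    }

    /* Single durability point for the whole batch, covering both the
     * data and the metadata needed to find it again. */
    if (fsync(fd) < 0) {
        err = errno;
        close(fd);
        free(buf);
        return -err;
    }

    if (close(fd) < 0) {
        err = errno;
        free(buf);
        return -err;
    }
    free(buf);
    return 0;
}
```

Adding O_DSYNC to the open() flags and dropping the fsync() call would
give the other variant discussed here, at the cost of a data integrity
completion on every pwrite().
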
> > http://ext4.wiki.kernel.org/index.php/Clarifying_Direct_IO's_Semantics
> >
> > Comments are welcome, either on the wiki's talk page, or directly to
> > me, or to the linux-fsdevel or linux-ext4.
>
> I haven't read it yet. One thing which comes to mind is it would be
> good to summarise what other OSes as well as Linux do with O_DIRECT
> w.r.t. data-finding metadata, preallocation, file extending, hole
> filling, unaligned access and what alignment is required, block
> devices vs. files and different filesystems and behaviour-modifying
> mount options, file open for buffered I/O on another descriptor, file
> has mapped pages, mlocked pages, and of course drive cache write
> through or not.

It's a wiki; contributions to define all of that are welcome.  :-)

We may want to carefully consider what we want to guarantee for all
time to application writers, and what we might want to leave open to
allow for performance optimizations by the kernel, though.

- Ted