All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jamie Lokier <jamie@shareable.org>
To: Theodore Tso <tytso@mit.edu>, Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>,
	Jens Axboe <jens.axboe@oracle.com>,
	linux-fsdevel@vger.kernel.org, linux-scsi@
Subject: Re: O_DIRECT and barriers
Date: Thu, 27 Aug 2009 15:50:51 +0100	[thread overview]
Message-ID: <20090827145051.GC31453@shareable.org> (raw)
In-Reply-To: <20090826184700.GD6997@mit.edu>

Theodore Tso wrote:
> On Wed, Aug 26, 2009 at 04:01:02PM +0100, Jamie Lokier wrote:
> >   1. If the automatic O_SYNC fallback mentioned by Christopher is
> >      currently implemented at all, even in a subset of filesystems,
> >      then I think it should be removed.
> 
> Could you clarify what you meant by "it" above?  I'm not sure I
> understood what you were referring to.

I meant the automatic O_SYNC fallback, in other words, if O_DIRECT
falls back to buffered writing, Chris said it automatically did
O_SYNC, and you followed up by saying it doesn't :-)

All I'm saying is if there's _some_ code doing O_SYNC writing when
O_DIRECT falls back to buffered, it should be ripped out.  Leave the
syncing to explicit fsync calls from userspace.

> >   4. On drives which need it, fdatasync/fsync must trigger a drive
> >      cache flush even when there is no dirty page cache to write,
> >      because dirty pages may have been written in the background
> >      already, and because O_DIRECT writes dirty the drive cache but
> >      not the page cache.
> > 
> 
> I agree we *should* do this, but we're going to take a pretty serious
> performance hit when we do.  Mac OS chickened out and added an
> F_FULLSYNC option:

I know about that one.  (I've done quite a lot of research on O_DIRECT
and fsync behaviours).  It's really unfortunate that they didn't
provide F_FULLDATASYNC, which is what a database or VM would ideally
use.

I think Vxfs provides a whole suite of mount options to adjust what
O_SYNC and fdatasync actually do.

> The concern is that there are GUI programers that want to update state
> files after every window resize or move, and after click on a web
> browser.  These GUI programmers then get cranky when changes get lost
> after proprietary video drivers cause the laptop to lock up.  If we
> make fsync() too burdensome, then fewer and fewer applications will
> use it.  Evidently the MacOS developers decided the few applications
> who really cared about doing device cache flushes were much smaller
> than the fast number of applications that need a lightweight file
> flush.  Should we do the same?  

If fsync is cheap but doesn't commit changes properly - what's the
point in encouraging applications to use it?  Without drive cache
flushes, they will still lose changes occasionally.

(Btw, don't blame proprietary video drivers.  I see too many lockups
with open source video drivers too.)

> It seems like an awful cop-out, but having seen, up front and
> personal, how "agressively stupid" some desktop programmers can be[1],
> I can **certainly** understand why Apple chose the F_FULLSYNC route.

I did see a few of those threads, and I think your solution was genius.
Genius at keeping people quiet that is :-)

But it's also a good default.  fsync() isn't practical in shell
scripts or Makefiles, although that's really because "mv" lacks the
fsync option...

Personally I side with "want some kind of full-system asynchronous
transactionality please".  (Possibly aka. O_PONIES :-)

-- Jamie

  reply	other threads:[~2009-08-27 14:51 UTC|newest]

Thread overview: 139+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-19 16:04 [PATCH 0/17] Make O_SYNC handling use standard syncing path Jan Kara
2009-08-19 16:04 ` [PATCH 01/17] vfs: Introduce filemap_fdatawait_range Jan Kara
2009-08-19 16:10   ` Christoph Hellwig
2009-08-19 16:04 ` [Ocfs2-devel] [PATCH 02/17] vfs: Export __generic_file_aio_write() and add some comments Jan Kara
2009-08-19 16:04   ` Jan Kara
2009-08-19 16:11   ` [Ocfs2-devel] " Christoph Hellwig
2009-08-19 16:11     ` Christoph Hellwig
2009-08-20 12:04     ` [Ocfs2-devel] " Jan Kara
2009-08-20 12:04       ` Jan Kara
2009-08-19 20:22   ` Evgeniy Polyakov
2009-08-19 20:22     ` [Ocfs2-devel] " Evgeniy Polyakov
2009-08-20 12:31     ` Jan Kara
2009-08-20 12:31       ` Jan Kara
2009-08-20 13:30       ` Evgeniy Polyakov
2009-08-20 13:30         ` [Ocfs2-devel] " Evgeniy Polyakov
2009-08-20 13:52         ` Jan Kara
2009-08-20 13:52           ` Jan Kara
2009-08-20 13:58           ` Evgeniy Polyakov
2009-08-20 13:58             ` [Ocfs2-devel] " Evgeniy Polyakov
2009-08-19 16:04 ` [PATCH 03/17] vfs: Remove syncing from generic_file_direct_write() and generic_file_buffered_write() Jan Kara
2009-08-19 16:04   ` Jan Kara
2009-08-19 16:04   ` [Ocfs2-devel] " Jan Kara
2009-08-19 16:18   ` Christoph Hellwig
2009-08-19 16:18     ` Christoph Hellwig
2009-08-19 16:18     ` [Ocfs2-devel] " Christoph Hellwig
2009-08-20 13:31     ` Jan Kara
2009-08-20 13:31       ` Jan Kara
2009-08-20 13:31       ` [Ocfs2-devel] " Jan Kara
2009-08-19 16:04 ` [PATCH 04/17] pohmelfs: Use __generic_file_aio_write instead of generic_file_aio_write_nolock Jan Kara
2009-08-19 16:04 ` [Ocfs2-devel] [PATCH 05/17] ocfs2: " Jan Kara
2009-08-19 16:04   ` Jan Kara
2009-08-19 16:04 ` [PATCH 06/17] vfs: Remove sync_page_range_nolock Jan Kara
2009-08-19 16:21   ` Christoph Hellwig
2009-08-19 16:04 ` [PATCH 07/17] vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode Jan Kara
2009-08-19 16:04   ` [Ocfs2-devel] " Jan Kara
2009-08-19 16:04   ` Jan Kara
2009-08-19 16:26   ` Christoph Hellwig
2009-08-19 16:26     ` [Ocfs2-devel] " Christoph Hellwig
2009-08-19 16:26     ` Christoph Hellwig
2009-08-20 12:15     ` Jan Kara
2009-08-20 12:15       ` [Ocfs2-devel] " Jan Kara
2009-08-20 12:15       ` Jan Kara
2009-08-20 16:27       ` Christoph Hellwig
2009-08-20 16:27         ` [Ocfs2-devel] " Christoph Hellwig
2009-08-20 16:27         ` Christoph Hellwig
2009-08-21 15:23         ` Jan Kara
2009-08-21 15:23           ` [Ocfs2-devel] " Jan Kara
2009-08-21 15:23           ` Jan Kara
2009-08-21 15:32           ` Christoph Hellwig
2009-08-21 15:32             ` [Ocfs2-devel] " Christoph Hellwig
2009-08-21 15:32             ` Christoph Hellwig
2009-08-21 15:48             ` Jan Kara
2009-08-21 15:48               ` [Ocfs2-devel] " Jan Kara
2009-08-21 15:48               ` Jan Kara
2009-08-26 18:22         ` Christoph Hellwig
2009-08-26 18:22           ` [Ocfs2-devel] " Christoph Hellwig
2009-08-26 18:22           ` Christoph Hellwig
2009-08-27  0:04           ` Christoph Hellwig
2009-08-27  0:04             ` [Ocfs2-devel] " Christoph Hellwig
2009-08-27  0:04             ` Christoph Hellwig
2009-08-19 16:04 ` [PATCH 08/17] ext2: Update comment about generic_osync_inode Jan Kara
2009-08-19 16:04 ` [PATCH 09/17] ext3: Remove syncing logic from ext3_file_write Jan Kara
2009-08-19 16:04 ` [PATCH 10/17] ext4: Remove syncing logic from ext4_file_write Jan Kara
2009-08-19 16:04   ` Jan Kara
2009-08-19 16:04 ` [PATCH 11/17] fat: Opencode sync_page_range_nolock() Jan Kara
2009-08-19 16:04 ` [PATCH 12/17] ntfs: Use new syncing helpers and update comments Jan Kara
2009-08-19 16:04 ` [Ocfs2-devel] [PATCH 13/17] ocfs2: Update syncing after splicing to match generic version Jan Kara
2009-08-19 16:04   ` Jan Kara
2009-08-21  1:36   ` [Ocfs2-devel] " Joel Becker
2009-08-21  1:36     ` Joel Becker
2009-08-21 14:30     ` Jan Kara
2009-08-21 14:30       ` Jan Kara
2009-08-19 16:04 ` [PATCH 14/17] xfs: Use new syncing helper Jan Kara
2009-08-19 16:04   ` Jan Kara
2009-08-19 16:33   ` Christoph Hellwig
2009-08-19 16:33     ` Christoph Hellwig
2009-08-20 12:22     ` Jan Kara
2009-08-20 12:22       ` Jan Kara
2009-08-19 16:04 ` [PATCH 15/17] pohmelfs: " Jan Kara
2009-08-19 16:04 ` [PATCH 16/17] nfs: Remove reference to generic_osync_inode from a comment Jan Kara
2009-08-19 16:04 ` [PATCH 17/17] vfs: Remove generic_osync_inode() and sync_page_range() Jan Kara
2009-08-20 22:12 ` O_DIRECT and barriers Christoph Hellwig
2009-08-21 11:40   ` Jens Axboe
2009-08-21 13:54     ` Jamie Lokier
2009-08-21 14:26       ` Christoph Hellwig
2009-08-21 15:24         ` Jamie Lokier
2009-08-21 17:45           ` Christoph Hellwig
2009-08-21 19:18             ` Ric Wheeler
2009-08-22  0:50             ` Jamie Lokier
2009-08-22  2:19               ` Theodore Tso
2009-08-22  2:31                 ` Theodore Tso
2009-08-24  2:34               ` Christoph Hellwig
2009-08-27 14:34                 ` Jamie Lokier
2009-08-27 17:10                   ` adding proper O_SYNC/O_DSYNC, was " Christoph Hellwig
2009-08-27 17:24                     ` Ulrich Drepper
2009-08-27 17:24                       ` Ulrich Drepper
2009-08-28 15:46                       ` Christoph Hellwig
2009-08-28 16:06                         ` Ulrich Drepper
2009-08-28 16:06                           ` Ulrich Drepper
2009-08-28 16:17                           ` Christoph Hellwig
2009-08-28 16:33                             ` Ulrich Drepper
2009-08-28 16:33                               ` Ulrich Drepper
2009-08-28 16:41                               ` Christoph Hellwig
2009-08-28 20:51                                 ` Ulrich Drepper
2009-08-28 20:51                                   ` Ulrich Drepper
2009-08-28 21:08                                   ` Christoph Hellwig
2009-08-28 21:16                                     ` Trond Myklebust
2009-08-28 21:29                                       ` Christoph Hellwig
2009-08-28 21:43                                         ` Trond Myklebust
2009-08-28 22:39                                           ` Christoph Hellwig
2009-08-30 16:44                                     ` Jamie Lokier
2009-08-28 16:46                               ` Jamie Lokier
2009-08-29  0:59                                 ` Jamie Lokier
2009-08-28 16:44                         ` Jamie Lokier
2009-08-28 16:50                           ` Jamie Lokier
2009-08-28 21:08                           ` Ulrich Drepper
2009-08-28 21:08                             ` Ulrich Drepper
2009-08-30 16:58                             ` Jamie Lokier
2009-08-30 17:48                             ` Jamie Lokier
2009-08-28 23:06                         ` Jamie Lokier
2009-08-28 23:46                           ` Christoph Hellwig
2009-08-21 22:08         ` Theodore Tso
2009-08-21 22:38           ` Joel Becker
2009-08-21 22:45           ` Joel Becker
2009-08-22  2:11             ` Theodore Tso
2009-08-24  2:42               ` Christoph Hellwig
2009-08-24  2:37             ` Christoph Hellwig
2009-08-24  2:37             ` Christoph Hellwig
2009-08-21 22:45           ` Joel Becker
2009-08-22  0:56           ` Jamie Lokier
2009-08-22  2:06             ` Theodore Tso
2009-08-26  6:34           ` Dave Chinner
2009-08-26 15:01             ` Jamie Lokier
2009-08-26 18:47               ` Theodore Tso
2009-08-27 14:50                 ` Jamie Lokier [this message]
2009-08-26  6:34           ` Dave Chinner
2009-08-21 14:20     ` Christoph Hellwig
2009-08-21 15:06       ` James Bottomley
2009-08-21 15:23         ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090827145051.GC31453@shareable.org \
    --to=jamie@shareable.org \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jens.axboe@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.