From: Jamie Lokier <jamie@shareable.org>
To: Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Theodore Tso <tytso@MIT.EDU>
Subject: Re: [RFC] [PATCH] vfs: Call filesystem callback when backing device caches should be flushed
Date: Wed, 21 Jan 2009 23:55:31 +0000
Message-ID: <20090121235531.GB20407@shareable.org>
In-Reply-To: <20090121232524.GQ10158@disturbed>
Dave Chinner wrote:
> If the inode is dirty and fsync does nothing, then that filesystem
> is *broken*. If writing to the inode doesn't dirty it, then the
> filesystem is broken. Fix the broken filesystem.
*Wrong*. Very, very wrong.
You do not write totally unchanged inode bytes just for the sake of
causing a NOP transaction to make the disk write the fsync as a
side-effect of a broken paradigm. That's _three_ pointless I/Os (one
redundant barrier and two writes), and probably a 50x slowdown in write
performance due to seeking. Now whose filesystem is broken?
> > For efficient fdatasync() you _never_ want a transaction if possible,
> > because it forces the disk head to seek between alternating regions of
> > the disk, two seeks per fsync().
>
> If there is dirty metadata that is need to be logged or flushed,
> then fdatasync() needs to do something. If it doesn't do it
> correctly, then that *filesystem is broken*. Fix the broken
> filesystem.
A series of writes over existing data and fdatasync() should *never*
write to the transaction log, unless you mounted something like ext3
data=journal, which isn't usual.
There is no dirty metadata to write. It is data only. fdatasync()
*means* "do NOT write metadata that is not needed for data retrieval",
that's its whole point. A filesystem which keeps seeking to its
inode area _and_ its journal area _and_ the data area on every
fdatasync() is a poor design indeed.
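For concreteness, the workload in question looks roughly like the
userspace sketch below. It is only an illustration: the function names,
block size and block count are made up for the example, and it assumes a
Linux/POSIX system with pwrite() and fdatasync(). The file is fully
written and fsync()ed once up front, so the later overwrites change no
size or allocation metadata; only data (and timestamps, which fdatasync()
is not required to flush) changes.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative sizes only; nothing here comes from a real filesystem. */
enum { BLOCK_SIZE = 4096, NUM_BLOCKS = 16 };

/* Write every block once, then fsync() so that the file size and block
 * allocation are already on stable storage before the overwrites start. */
static int prepare_file(int fd)
{
    char buf[BLOCK_SIZE];

    memset(buf, 0, sizeof(buf));
    for (int i = 0; i < NUM_BLOCKS; i++) {
        if (pwrite(fd, buf, sizeof(buf), (off_t)i * BLOCK_SIZE)
            != (ssize_t)sizeof(buf))
            return -1;
    }
    return fsync(fd);
}

/* Overwrite existing blocks in place, calling fdatasync() after each
 * write. No write extends the file, so only data needs to reach disk.
 * Returns 0 on success, -1 on any error. */
int overwrite_and_sync(const char *path, int rounds)
{
    char buf[BLOCK_SIZE];
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (prepare_file(fd) != 0) {
        close(fd);
        return -1;
    }
    for (int r = 0; r < rounds; r++) {
        off_t off = (off_t)(r % NUM_BLOCKS) * BLOCK_SIZE;

        memset(buf, 'A' + (r % 26), sizeof(buf));
        if (pwrite(fd, buf, sizeof(buf), off) != (ssize_t)sizeof(buf) ||
            fdatasync(fd) != 0) {
            close(fd);
            return -1;
        }
    }
    return close(fd);
}
```

Whether each fdatasync() in that loop costs one write or several seeks is
up to the filesystem; the sketch only shows the calling pattern, not what
any particular filesystem does with it.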
> > So you can't rely on journalling transactions to flush.
>
> The VFS doesn't even know about transactions....
Whoever brought them up said they can be relied on to flush writes
during fsync/fdatasync. Just saying they can't, is all...
> > > Finally, I prefer maintainers of the filesystems themselves to
> > > decide whether their filesystem needs flushing and thus
> > > knowingly impose this performance penalty on them...
> >
> > I say it should flush be default unless a filesystem hooks an
> > alternative strategy. Certainly, it's silly to have the same code
> > duplicated in nearly every filesystem
>
> So write a *generic helper* for those filesystems that do the same
> thing and hook it to their ->fsync method. Don't hard code it in the
> VFS so other filesystem dev's have to come along afterwards and turn
> it off.
Are there any at the moment which would turn it off?
If so that's a fine idea.
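
To illustrate the "generic helper hooked to ->fsync" arrangement, here is
a toy userspace model. It is not kernel code: every toy_* name is
hypothetical, and the "device" is just a counter. It only shows the
shape of the idea, where the VFS-side entry point dispatches to whatever
the filesystem hooked, and filesystems that want the common
write-then-flush behaviour point their hook at a shared helper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; all names are hypothetical. */
struct toy_device {
    int cache_flushes;      /* cache-flush requests issued so far */
};

struct toy_file;

struct toy_file_ops {
    /* Models the idea of a per-filesystem ->fsync hook. */
    int (*fsync)(struct toy_file *file, bool datasync);
};

struct toy_file {
    const struct toy_file_ops *ops;
    struct toy_device *dev;
    int dirty_pages;        /* pages waiting to be written */
    int pages_written;      /* pages written so far */
};

/* Write out dirty data pages; no journal is involved in this model. */
static void toy_write_dirty_pages(struct toy_file *file)
{
    file->pages_written += file->dirty_pages;
    file->dirty_pages = 0;
}

/* Shared helper: write the data, then ask the device to flush its cache. */
int toy_generic_fsync_flush(struct toy_file *file, bool datasync)
{
    (void)datasync;         /* the toy treats fsync and fdatasync alike */
    toy_write_dirty_pages(file);
    file->dev->cache_flushes++;
    return 0;
}

/* A filesystem with its own strategy: it writes data but does not use
 * the shared helper, so no flush is issued on its behalf. */
int toy_custom_fsync(struct toy_file *file, bool datasync)
{
    (void)datasync;
    toy_write_dirty_pages(file);
    return 0;
}

/* One filesystem opts in to the helper, the other does not. */
const struct toy_file_ops toy_simplefs_ops = { .fsync = toy_generic_fsync_flush };
const struct toy_file_ops toy_customfs_ops = { .fsync = toy_custom_fsync };

/* VFS-side entry point: pure dispatch, with no flush hard-coded here. */
int toy_vfs_fsync(struct toy_file *file, bool datasync)
{
    if (file->ops == NULL || file->ops->fsync == NULL)
        return -1;
    return file->ops->fsync(file, datasync);
}
```

With this layout nobody has to "turn off" anything: a filesystem gets the
flush only if it points its hook at the helper. The opposite default
(flush in the VFS unless a filesystem overrides it) would move the
cache_flushes++ line into toy_vfs_fsync() instead.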
-- Jamie