From: Jamie Lokier <jamie@shareable.org>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Theodore Tso <tytso@MIT.EDU>
Subject: Re: [RFC] [PATCH] vfs: Call filesystem callback when backing device caches should be flushed
Date: Wed, 21 Jan 2009 21:47:48 +0000
Message-ID: <20090121214748.GE16133@shareable.org>
In-Reply-To: <20090121125537.GB3186@duck.suse.cz>

Jan Kara wrote:
> On Tue 20-01-09 15:16:48, Joel Becker wrote:
> > On Tue, Jan 20, 2009 at 05:05:27PM +0100, Jan Kara wrote:
> > > we noted in our testing that ext2 (and it seems some other filesystems as
> > > well) don't flush disk's write caches on cases like fsync() or changing
> > > DIRSYNC directory. This is my attempt to solve the problem in a generic way
> > > by calling a filesystem callback from VFS at appropriate place as Andrew
> > > suggested. For ext2 what I did is enough (it just then fills in
> > > block_flush_device() as .flush_device callback) and I think it could be
> > > fine for other filesystems as well.
> >
> > The only question I have is why this would be optional. It
> > would seem that this would be the preferred default behavior for all
> > block filesystems. We have the backing_dev_info and a way to override
> > the default if a filesystem needs something special.
>
> The reason why I've decided for NOP to be the default is that
> filesystems doing proper journalling with barriers should not need
> this (as the barrier in the transaction commit already does the job
> for them).

No, that doesn't work.

fsync() doesn't always cause a transaction. If there's no inode
change, there may not be a transaction. Writing does not always dirty
mtime, if it's within mtime granularity.

For efficient fdatasync() you _never_ want a transaction if possible,
because it forces the disk head to seek between alternating regions of
the disk, two seeks per fsync().

So you can't rely on journalling transactions to flush.

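To make the gap concrete, here is a toy userspace model of the case
described above. It is only a sketch: the struct and both helpers are
hypothetical names invented for illustration, and it assumes a journal
that commits (and so issues its barrier) only when the inode changed.

```c
/* Toy model only: these types and helpers are hypothetical, not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sync_state {
    bool data_dirty;   /* file data was written since the last sync */
    bool inode_dirty;  /* inode metadata changed (size, mtime tick, ...) */
};

/* Assumption: the journal commits, and issues its barrier, only when
 * the inode changed. */
static bool commits_transaction(const struct sync_state *s)
{
    assert(s != NULL);
    return s->inode_dirty;
}

/* Data has reached the disk's write cache, but no commit barrier has
 * forced it out, so the sync path has to flush the cache itself. */
static bool needs_explicit_flush(const struct sync_state *s)
{
    return s->data_dirty && !commits_transaction(s);
}
```

An overwrite within the mtime granularity has data_dirty set and
inode_dirty clear, so in this model it is exactly the case that needs
an explicit flush.
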
> Finally, I prefer maintainers of the filesystems themselves to decide
> whether their filesystem needs flushing and thus knowingly impose this
> performance penalty on them...

I say it should flush by default unless a filesystem hooks an
alternative strategy. Certainly, it's silly to have the same code
duplicated in nearly every filesystem.
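
As a rough illustration of that default-with-override arrangement,
here is a userspace model, not kernel code. The flush_device member
borrows the callback name from the patch description; the struct, the
counter and the three functions are made up for the example and are
not existing VFS interfaces.

```c
/* Userspace model only; none of these names are real kernel APIs. */
#include <assert.h>
#include <stddef.h>

struct fs_ops {
    /* Optional override; NULL means "use the generic flush". */
    int (*flush_device)(void *dev);
};

/* Counts generic flushes so the behaviour can be observed. */
static int generic_flush_calls;

/* Stand-in for a generic block-device write-cache flush. */
static int generic_flush_device(void *dev)
{
    (void)dev;
    generic_flush_calls++;
    return 0;
}

/* A filesystem that is sure its own strategy already flushed the
 * cache can hook this to opt out of the generic flush. */
static int noop_flush_device(void *dev)
{
    (void)dev;
    return 0;
}

/* Called at the end of a sync: flush by default, unless the
 * filesystem supplied its own strategy. */
static int sync_flush(const struct fs_ops *ops, void *dev)
{
    assert(ops != NULL);
    if (ops->flush_device)
        return ops->flush_device(dev);
    return generic_flush_device(dev);
}
```

A filesystem that leaves flush_device unset gets the generic flush for
free; only one that sets it takes on responsibility for its own
strategy, so the common code lives in one place.
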
-- Jamie