From: Jamie Lokier <jamie@shareable.org>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Theodore Tso <tytso@MIT.EDU>
Subject: Re: [RFC] [PATCH] vfs: Call filesystem callback when backing device caches should be flushed
Date: Wed, 21 Jan 2009 21:47:48 +0000
Message-ID: <20090121214748.GE16133@shareable.org>
In-Reply-To: <20090121125537.GB3186@duck.suse.cz>

Jan Kara wrote:
> On Tue 20-01-09 15:16:48, Joel Becker wrote:
> > On Tue, Jan 20, 2009 at 05:05:27PM +0100, Jan Kara wrote:
> > > we noted in our testing that ext2 (and it seems some other filesystems as
> > > well) don't flush disk's write caches on cases like fsync() or changing
> > > DIRSYNC directory. This is my attempt to solve the problem in a generic way
> > > by calling a filesystem callback from VFS at appropriate place as Andrew
> > > suggested. For ext2 what I did is enough (it just then fills in
> > > block_flush_device() as .flush_device callback) and I think it could be
> > > fine for other filesystems as well.
> >
> > The only question I have is why this would be optional. It
> > would seem that this would be the preferred default behavior for all
> > block filesystems. We have the backing_dev_info and a way to override
> > the default if a filesystem needs something special.
>
> The reason why I've decided for NOP to be the default is that
> filesystems doing proper journalling with barriers should not need
> this (as the barrier in the transaction commit already does the job
> for them).

No, that doesn't work.

fsync() doesn't always cause a transaction. If there's no inode
change, there may not be a transaction. Writing does not always dirty
mtime, if it's within mtime granularity.

For efficient fdatasync() you _never_ want a transaction if possible,
because it forces the disk head to seek between alternating regions of
the disk, two seeks per fsync().

So you can't rely on journalling transactions to flush.

> Finally, I prefer maintainers of the filesystems themselves to decide
> whether their filesystem needs flushing and thus knowingly impose this
> performance penalty on them...

I say it should flush by default unless a filesystem hooks an
alternative strategy. Certainly, it's silly to have the same code
duplicated in nearly every filesystem.

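Roughly what I mean, as a userspace sketch rather than a patch. Every
name here (toy_sb, toy_fs_ops, toy_sync_flush and so on) is invented for
the example and is not the interface from Jan's patch. The counters
stand in for real device flushes. The generic path flushes unless the
filesystem supplies its own hook.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sb;

/* Hypothetical per-filesystem operations table. */
struct toy_fs_ops {
    /* NULL means "no special strategy": the generic flush is used. */
    int (*flush_device)(struct toy_sb *sb);
};

/* Hypothetical superblock; the counters only record which path ran. */
struct toy_sb {
    const struct toy_fs_ops *ops;
    int default_flushes;
    int custom_flushes;
};

/* Generic behaviour: stands in for flushing the backing device's cache. */
static int toy_default_flush(struct toy_sb *sb)
{
    sb->default_flushes++;
    return 0;
}

/* Example override, e.g. a filesystem that already knows a barrier was sent. */
static int toy_custom_flush(struct toy_sb *sb)
{
    sb->custom_flushes++;
    return 0;
}

/* Called at sync time: flush by default, defer to the hook if one is set. */
static int toy_sync_flush(struct toy_sb *sb)
{
    assert(sb != NULL && sb->ops != NULL);
    if (sb->ops->flush_device)
        return sb->ops->flush_device(sb);
    return toy_default_flush(sb);
}
```

A filesystem that leaves the hook NULL gets the flush without adding any
code of its own, and only the ones with a better strategy have to write
anything.
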
-- Jamie