linux-fsdevel.vger.kernel.org archive mirror
From: Steven Whitehouse <swhiteho@redhat.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Nick Piggin <npiggin@kernel.dk>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	mfasheh@suse.com, joel.becker@oracle.com
Subject: Re: [patch 7/8] fs: fix or note I_DIRTY handling bugs in filesystems
Date: Mon, 03 Jan 2011 15:03:29 +0000
Message-ID: <1294067009.2429.109.camel@dolmen>
In-Reply-To: <20101229150108.GA13358@infradead.org>

Hi,

On Wed, 2010-12-29 at 10:01 -0500, Christoph Hellwig wrote:
> As mentioned last round I think the exporting of inode_lock and pushing
> of the I_DIRTY* complexities into the filesystems can be avoided.  See
> the patch below, which compiles and passes xfstests for xfs, but
> otherwise isn't quite done yet.  The only code change vs the opencoded
> variant in the patch is that we do a useless inode_lock roundtrip
> for a non-dirty inode on gfs2, which I think is acceptable,
> especially once we have the lock split anyway.
> 
> The other thing I don't like yet is passing the datasync flag - the
> callback shouldn't care about what we were called with, but rather
> about which bits it needs to sync out - which the dirty flag already
> tells us about.
> 
> IMHO the behaviour in ocfs2 and gfs2 that relies on it is plain wrong:
> 
>  - ocfs2 really should always force the journal if any bit we care about
>    in the inode is dirty, and only do the pure cache flush if nothing
>    we care about is dirty (similar to the more complex code in XFS)
>  - gfs2 seems really weird.  Doesn't it need to do any log force
>    if an inode has a pending transaction?  Currently it only does for
>    stuffed inodes, and if datasync was set, which seems weird.  Also
>    I can't see why we'd never want to call into ->write_inode to write
>    out the inode for the datasync case - except for not catching
>    timestamp updates fdatasync really isn't any different from fsync.
> 
The algorithm was intended to be:

 - With "journaled data" files
   - Do a log flush conditional upon the inode's glock
   - The core code then writes back any dirty pages

 - With regular files/directories
   - If datasync is not set, we need to write back the metadata including
     timestamp updates, so that is done via ->write_inode. Note that an
     extra complication here is that we need to get the glock on the
     inode, if we don't already have it, in order to check and
     conditionally update the atime. The call to ->write_inode includes
     an implicit (conditional) log flush.
   - If datasync is set, we assume that only the data pages need to be
     written out. My understanding of datasync was that it was only
     supposed to write out data and never any of the metadata. The reason
     for the call to flush the log for "stuffed" files is that the data
     shares a disk block with the inode metadata, so we cannot avoid the
     log flush in this case, since we must unpin the block to write it
     back.
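
The decision described above can be sketched as a small user-space C
model. This is only an illustration of the intended algorithm as stated
in this message, not the gfs2 implementation: the types, field names and
the plan_fsync() function are all hypothetical, and the writeback of
dirty pages by the core code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the intended decision; none of this is gfs2 code. */
enum file_kind { JDATA_FILE, REGULAR_FILE };

struct fsync_input {
	enum file_kind kind;
	bool datasync;     /* caller asked for fdatasync semantics */
	bool stuffed;      /* data shares a disk block with the inode */
	bool holds_glock;  /* inode glock is already held */
};

struct fsync_plan {
	bool log_flush;      /* explicit log flush */
	bool write_inode;    /* metadata written via ->write_inode, which
	                        carries its own conditional log flush */
	bool acquire_glock;  /* glock must be taken to check the atime */
};

struct fsync_plan plan_fsync(const struct fsync_input *in)
{
	struct fsync_plan plan = { false, false, false };

	assert(in != NULL);

	if (in->kind == JDATA_FILE) {
		/* Journaled data: log flush conditional upon the glock. */
		plan.log_flush = true;
		return plan;
	}

	if (!in->datasync) {
		/* Full sync: metadata and timestamps go via ->write_inode. */
		plan.write_inode = true;
		plan.acquire_glock = !in->holds_glock;
		return plan;
	}

	/* datasync: data only, but a stuffed inode's block is pinned by
	 * the log, so the log has to be flushed before writing it back. */
	plan.log_flush = in->stuffed;
	return plan;
}
```

Note that in this model the datasync case for a non-stuffed file
requests no metadata work at all, which is the behaviour questioned in
the quoted text.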

There is something strange going on here though, since I think there
should be a metadata sync included as well. I'm just working back
through the changes at the moment to see where that was lost.

Steve.



