linux-fsdevel.vger.kernel.org archive mirror
From: Christoph Hellwig <hch@infradead.org>
To: Steven Whitehouse <swhiteho@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Nick Piggin <npiggin@kernel.dk>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	mfasheh@suse.com, joel.becker@oracle.com
Subject: Re: [patch 7/8] fs: fix or note I_DIRTY handling bugs in filesystems
Date: Mon, 3 Jan 2011 11:58:21 -0500
Message-ID: <20110103165821.GA1336@infradead.org>
In-Reply-To: <1294067009.2429.109.camel@dolmen>

On Mon, Jan 03, 2011 at 03:03:29PM +0000, Steven Whitehouse wrote:
> 
>  - With "journaled data" files
>    - Do a log flush conditional upon the inode's glock
>    - The core code then writes back any dirty pages

Any data writeback is done before calling ->fsync.

>  - With regular files/directories
>   - If datasync is not set, we need to write back the metadata including
> timestamp updates, so that is done via ->write_inode. Note that an extra
> complication here is that we need to get the glock on the inode if we
> don't already have it in order to check and conditionally update the
> atime. The call to ->write_inode includes an implicit (conditional) log
> flush.
>  - If datasync is set, we assume that only the data pages need to be
> written out. My understanding of datasync was that it was only supposed
> to write out data and never any of the metadata. The reason for the call
> to flush the log for "stuffed" files is that the data shares a disk
> block with the inode metadata, so we cannot avoid the log flush in this
> case, since we must unpin the block to write it back.

What happens to indirect blocks, inode size updates, etc?  In general
the only correct form to use the datasync argument is along the lines
of:

	if ((inode->i_state & I_DIRTY_DATASYNC) ||
	    ((inode->i_state & I_DIRTY_SYNC) && !datasync)) {
		/* write out the inode */
	} else {
		/*
		 * VFS inode not dirty, no need to write it out.
		 *
		 * If the filesystem supports asynchronous inode writes,
		 * we may have to wait for them here.
		 */
	}

Or rather mostly correct, as pointed out by Nick in this series.  That's
why this series replaces the check above with an equivalent one that
also participates in the writeback locking protocol.
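
To spell out what the check above decides, here is a minimal userspace
sketch of the same predicate.  It is not kernel code: the helper name
inode_needs_writeout() is made up for illustration, the flag values are
stand-ins rather than the kernel's definitions, and it ignores the
writeback locking entirely.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the i_state bits; values are assumptions. */
#define I_DIRTY_SYNC     (1u << 0)	/* inode dirty, fdatasync may skip it */
#define I_DIRTY_DATASYNC (1u << 1)	/* inode dirty, fdatasync needs it too */

/*
 * Hypothetical helper: returns true if fsync (datasync == 0) or
 * fdatasync (datasync != 0) has to write out the inode.
 */
static bool inode_needs_writeout(unsigned int i_state, int datasync)
{
	/* Changes needed to find the data again always force a write. */
	if (i_state & I_DIRTY_DATASYNC)
		return true;

	/* Other inode changes (e.g. timestamps) only matter for fsync. */
	return (i_state & I_DIRTY_SYNC) && !datasync;
}
```

So an inode with only I_DIRTY_SYNC set is written by fsync but skipped
by fdatasync, while I_DIRTY_DATASYNC forces the write in both cases.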

For gfs2 on current mainline an fsync respecting that would look like:

static int gfs2_fsync(struct file *file, int datasync)
{
	struct inode *inode = file->f_mapping->host;
	struct gfs2_inode *ip = GFS2_I(inode);

	/* Journaled data: only flush the log for this inode's glock. */
	if (gfs2_is_jdata(ip)) {
		gfs2_log_flush(GFS2_SB(inode), ip->i_gl);
		return 0;
	}

	/* Synchronously write the inode if fsync/fdatasync requires it. */
	if ((inode->i_state & I_DIRTY_DATASYNC) ||
	    ((inode->i_state & I_DIRTY_SYNC) && !datasync))
		return sync_inode_metadata(inode, 1);

	/* Stuffed data shares a block with the inode: flush the log. */
	if (gfs2_is_stuffed(ip))
		gfs2_log_flush(GFS2_SB(inode), ip->i_gl);
	return 0;
}

Note that the asynchronous write_inode_now is replaced with a
sync_inode_metadata, which doesn't incorrectly write data again, and
makes sure we do a synchronous write.

I'm still not quite sure how the gfs2_log_flush calls are supposed to
work.  What's the reason we don't need the ->write_inode call for
journaled data mode?  Also, is it guaranteed that there can't be an
asynchronous transaction that updates the inode in the log?  In other
words, why doesn't gfs2 need some sort of log flush even if the VFS
inode is not dirty, unlike most other journaled filesystems?
