linux-fsdevel.vger.kernel.org archive mirror
From: Nick Piggin <npiggin@suse.de>
To: akpm@linux-foundation.org, xfs@oss.sgi.com,
	linux-fsdevel@vger.kernel.org,
	Chris Mason <chris.mason@oracle.com>
Subject: Re: [patch 0/9] writeback data integrity and other fixes (take 3)
Date: Wed, 29 Oct 2008 06:06:59 +0100
Message-ID: <20081029050659.GD17624@wotan.suse.de>
In-Reply-To: <20081029045707.GF17077@disturbed>

On Wed, Oct 29, 2008 at 03:57:07PM +1100, Dave Chinner wrote:
> On Wed, Oct 29, 2008 at 05:11:12AM +0100, Nick Piggin wrote:
> > Just be careful -- in your xfs_flush_pages, I think after the first
> > filemap_fdatawrite, the mapping may no longer be tagged with
> > PAGECACHE_TAG_DIRTY, so you may not pick up those writeback ones
> > you need to wait on.
> 
> Yes, I realised this as soon as I looked at the code.
> I added xfs_wait_on_pages() to do this wait. ;)
 
Oh good.
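
To illustrate the tagging concern quoted above, here is a minimal user-space sketch. It is a toy model, not kernel code: the `toy_*` helpers and the `page_tag` enum are hypothetical names, and a real page can carry both tags at once (for example if it is redirtied during writeback). The point is only that once writeout has started, a lookup keyed on the dirty tag finds nothing, while the pages that still need waiting on sit under the writeback tag.

```c
#include <stddef.h>

/* Toy page states; the real page cache keeps these as radix-tree tags. */
enum page_tag { TAG_CLEAN, TAG_DIRTY, TAG_WRITEBACK };

#define NPAGES 4

/* Start writeout: every dirty page moves to the writeback state. */
static size_t toy_fdatawrite(enum page_tag *pages)
{
	size_t started = 0;

	for (size_t i = 0; i < NPAGES; i++) {
		if (pages[i] == TAG_DIRTY) {
			pages[i] = TAG_WRITEBACK;
			started++;
		}
	}
	return started;
}

/* Count the pages currently carrying a given tag. */
static size_t toy_count_tagged(const enum page_tag *pages, enum page_tag tag)
{
	size_t n = 0;

	for (size_t i = 0; i < NPAGES; i++)
		if (pages[i] == tag)
			n++;
	return n;
}

/* Wait for writeout: every writeback page completes and becomes clean. */
static size_t toy_fdatawait(enum page_tag *pages)
{
	size_t waited = 0;

	for (size_t i = 0; i < NPAGES; i++) {
		if (pages[i] == TAG_WRITEBACK) {
			pages[i] = TAG_CLEAN;
			waited++;
		}
	}
	return waited;
}
```

In this model, a check for dirty pages made after `toy_fdatawrite()` reports zero, so a wait that is gated on the dirty tag would be skipped even though writeback is still in flight. Gating on the writeback tag instead, as the quoted `xfs_wait_on_pages()` does with `mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK)`, avoids that.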


> > Might need a different variant, or we could just bite the bullet and
> > push through the ->fsync conversion so you get full control of the
> > writeout.
> 
> Not important right now, though....

OK.

 
> > BTW, the Linux pagecache APIs should support range operations quite
> > nicely for these. Any reason not to use them (it looks like the
> > wrappers can take ranges)?
> 
> Because I haven't got around to modifying these wrappers now that
> the range primitives are in place - XFS inherited the range
> operations from Irix, and they have sat unimplemented since being
> ported to Linux.

OK, fair enough. Just something easy to stick on a todo list somewhere ;)
Probably doesn't make much difference now, but we might start seeing
apps using range syncs.
 

> The patch below should fix this entire class of error value screwup
> in XFS. I've just started running it through XFSQA, so it's not
> really tested yet.

I'll sanity check it by running it through my basic fault injection
tests here.


> FWIW, your entire patch series made it through XFSQA without any new
> regressions, so it looks good from that POV.

Thanks, very good to know.

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> XFS: fix error inversion problems with data flushing
> 
> XFS gets the sign of the error wrong in several places when
> gathering the error from generic Linux functions. These functions
> return negative error values, while the core XFS code returns
> positive error values. Hence when XFS inverts the error to be
> returned to the VFS, it can incorrectly invert a negative
> error into a positive one, and the syscall return path will then
> ignore it.
> 
> Fix all the problems related to calling filemap_* functions.
> 
> Problem initially identified by Nick Piggin in xfs_fsync().
> 
> Signed-off-by: Dave Chinner <david@fromorbit.com>
> ---
>  fs/xfs/linux-2.6/xfs_fs_subr.c |   23 ++++++++++++++++++++---
>  fs/xfs/linux-2.6/xfs_lrw.c     |    2 +-
>  fs/xfs/linux-2.6/xfs_super.c   |   13 +++++++++----
>  fs/xfs/xfs_vnodeops.c          |    2 +-
>  fs/xfs/xfs_vnodeops.h          |    1 +
>  5 files changed, 32 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/linux-2.6/xfs_fs_subr.c b/fs/xfs/linux-2.6/xfs_fs_subr.c
> index 36caa6d..5aeb777 100644
> --- a/fs/xfs/linux-2.6/xfs_fs_subr.c
> +++ b/fs/xfs/linux-2.6/xfs_fs_subr.c
> @@ -24,6 +24,10 @@ int  fs_noerr(void) { return 0; }
>  int  fs_nosys(void) { return ENOSYS; }
>  void fs_noval(void) { return; }
>  
> +/*
> + * note: all filemap functions return negative error codes. These
> + * need to be inverted before returning to the xfs core functions.
> + */
>  void
>  xfs_tosspages(
>  	xfs_inode_t	*ip,
> @@ -53,7 +57,7 @@ xfs_flushinval_pages(
>  		if (!ret)
>  			truncate_inode_pages(mapping, first);
>  	}
> -	return ret;
> +	return -ret;
>  }
>  
>  int
> @@ -72,10 +76,23 @@ xfs_flush_pages(
>  		xfs_iflags_clear(ip, XFS_ITRUNCATED);
>  		ret = filemap_fdatawrite(mapping);
>  		if (flags & XFS_B_ASYNC)
> -			return ret;
> +			return -ret;
>  		ret2 = filemap_fdatawait(mapping);
>  		if (!ret)
>  			ret = ret2;
>  	}
> -	return ret;
> +	return -ret;
> +}
> +
> +int
> +xfs_wait_on_pages(
> +	xfs_inode_t	*ip,
> +	xfs_off_t	first,
> +	xfs_off_t	last)
> +{
> +	struct address_space *mapping = VFS_I(ip)->i_mapping;
> +
> +	if (mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
> +		return -filemap_fdatawait(mapping);
> +	return 0;
>  }
> diff --git a/fs/xfs/linux-2.6/xfs_lrw.c b/fs/xfs/linux-2.6/xfs_lrw.c
> index 1957e53..4959c87 100644
> --- a/fs/xfs/linux-2.6/xfs_lrw.c
> +++ b/fs/xfs/linux-2.6/xfs_lrw.c
> @@ -243,7 +243,7 @@ xfs_read(
>  
>  	if (unlikely(ioflags & IO_ISDIRECT)) {
>  		if (inode->i_mapping->nrpages)
> -			ret = xfs_flushinval_pages(ip, (*offset & PAGE_CACHE_MASK),
> +			ret = -xfs_flushinval_pages(ip, (*offset & PAGE_CACHE_MASK),
>  						    -1, FI_REMAPF_LOCKED);
>  		mutex_unlock(&inode->i_mutex);
>  		if (ret) {
> diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
> index 9dc977d..c5cfc1e 100644
> --- a/fs/xfs/linux-2.6/xfs_super.c
> +++ b/fs/xfs/linux-2.6/xfs_super.c
> @@ -1012,21 +1012,26 @@ xfs_fs_write_inode(
>  	struct inode		*inode,
>  	int			sync)
>  {
> +	struct xfs_inode	*ip = XFS_I(inode);
>  	int			error = 0;
>  	int			flags = 0;
>  
> -	xfs_itrace_entry(XFS_I(inode));
> +	xfs_itrace_entry(ip);
>  	if (sync) {
> -		filemap_fdatawait(inode->i_mapping);
> +		error = xfs_wait_on_pages(ip, 0, -1);
> +		if (error)
> +			goto out_error;
>  		flags |= FLUSH_SYNC;
>  	}
> -	error = xfs_inode_flush(XFS_I(inode), flags);
> +	error = xfs_inode_flush(ip, flags);
> +
> +out_error:
>  	/*
>  	 * if we failed to write out the inode then mark
>  	 * it dirty again so we'll try again later.
>  	 */
>  	if (error)
> -		xfs_mark_inode_dirty_sync(XFS_I(inode));
> +		xfs_mark_inode_dirty_sync(ip);
>  
>  	return -error;
>  }
> diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
> index 2e2fbd9..5809c42 100644
> --- a/fs/xfs/xfs_vnodeops.c
> +++ b/fs/xfs/xfs_vnodeops.c
> @@ -713,7 +713,7 @@ xfs_fsync(
>  		return XFS_ERROR(EIO);
>  
>  	/* capture size updates in I/O completion before writing the inode. */
> -	error = filemap_fdatawait(VFS_I(ip)->i_mapping);
> +	error = xfs_wait_on_pages(ip, 0, -1);
>  	if (error)
>  		return XFS_ERROR(error);
>  
> diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
> index e932a96..f9cd376 100644
> --- a/fs/xfs/xfs_vnodeops.h
> +++ b/fs/xfs/xfs_vnodeops.h
> @@ -78,5 +78,6 @@ int xfs_flushinval_pages(struct xfs_inode *ip, xfs_off_t first,
>  		xfs_off_t last, int fiopt);
>  int xfs_flush_pages(struct xfs_inode *ip, xfs_off_t first,
>  		xfs_off_t last, uint64_t flags, int fiopt);
> +int xfs_wait_on_pages(struct xfs_inode *ip, xfs_off_t first, xfs_off_t last);
>  
>  #endif /* _XFS_VNODEOPS_H */
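
The sign problem described in the commit message above can be shown with a small user-space sketch. This is not the XFS code: `generic_wait`, `core_wait`, `core_wait_buggy` and `vfs_entry` are hypothetical stand-ins for a generic helper that returns a negative errno, a core-side wrapper that should return a positive one, and the VFS-facing entry point that negates the core error exactly once.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a generic helper such as filemap_fdatawait():
 * returns 0 on success or a negative errno on failure.
 */
static int generic_wait(int simulate_failure)
{
	return simulate_failure ? -EIO : 0;
}

/*
 * Core-side wrapper with the conversion in place: the negative errno is
 * flipped to the positive convention used by the core code.
 */
static int core_wait(int simulate_failure)
{
	return -generic_wait(simulate_failure);
}

/*
 * Core-side wrapper without the conversion: the negative errno leaks
 * into code that expects positive error values.
 */
static int core_wait_buggy(int simulate_failure)
{
	return generic_wait(simulate_failure);
}

/*
 * VFS-facing entry point: assumes the core returned a positive error
 * and negates it once before handing it back.
 */
static int vfs_entry(int (*core_fn)(int), int simulate_failure)
{
	int error = core_fn(simulate_failure);

	return -error;
}
```

With the converting wrapper, `vfs_entry(core_wait, 1)` yields `-EIO`, which the caller sees as a failure. With the buggy wrapper, the same call yields a positive `EIO`, which a caller that checks for negative return values will not treat as an error. Doing the inversion at the boundary where the generic function is called, as the patch does, keeps every later negation consistent.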
