From: Nick Piggin <npiggin@suse.de>
To: akpm@linux-foundation.org, xfs@oss.sgi.com,
linux-fsdevel@vger.kernel.org,
Chris Mason <chris.mason@oracle.com>
Subject: Re: [patch 0/9] writeback data integrity and other fixes (take 3)
Date: Wed, 29 Oct 2008 06:06:59 +0100
Message-ID: <20081029050659.GD17624@wotan.suse.de>
In-Reply-To: <20081029045707.GF17077@disturbed>

On Wed, Oct 29, 2008 at 03:57:07PM +1100, Dave Chinner wrote:
> On Wed, Oct 29, 2008 at 05:11:12AM +0100, Nick Piggin wrote:
> > Just be careful -- in your xfs_flush_pages, I think after the first
> > filemap_fdatawrite, the mapping may no longer be tagged with
> > PAGECACHE_TAG_DIRTY, so you may not pick up those writeback ones
> > you need to wait on.
>
> Yes, I realised this as soon as I looked at the code.
> I added xfs_wait_on_pages() to do this wait. ;)
Oh good.
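
To spell out the concern, here is a rough userspace toy model. It is not
kernel code: every toy_* name is made up for illustration, and the real
pagecache tags live in the mapping's radix tree rather than in an array.
It only shows why a wait guarded by the dirty tag can be skipped once
writeout has started, while a guard on the writeback tag is not.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the pagecache tags; not the kernel's data structures. */
enum toy_page_state { TOY_CLEAN, TOY_DIRTY, TOY_WRITEBACK };

struct toy_mapping {
	enum toy_page_state pages[8];
	size_t nr_pages;
};

/* True if any page is in the given state, i.e. the "tag" is set. */
static bool toy_mapping_tagged(const struct toy_mapping *m,
			       enum toy_page_state tag)
{
	for (size_t i = 0; i < m->nr_pages; i++)
		if (m->pages[i] == tag)
			return true;
	return false;
}

/* Start writeout: each dirty page moves to writeback, clearing "dirty". */
static void toy_fdatawrite(struct toy_mapping *m)
{
	for (size_t i = 0; i < m->nr_pages; i++)
		if (m->pages[i] == TOY_DIRTY)
			m->pages[i] = TOY_WRITEBACK;
}

/* Wait for completion: writeback pages become clean. Returns pages waited on. */
static size_t toy_fdatawait(struct toy_mapping *m)
{
	size_t waited = 0;

	for (size_t i = 0; i < m->nr_pages; i++) {
		if (m->pages[i] == TOY_WRITEBACK) {
			m->pages[i] = TOY_CLEAN;
			waited++;
		}
	}
	return waited;
}

/* Wrong guard: after writeout has started nothing is tagged dirty. */
static size_t toy_wait_if_dirty(struct toy_mapping *m)
{
	return toy_mapping_tagged(m, TOY_DIRTY) ? toy_fdatawait(m) : 0;
}

/* Guard on the writeback tag instead, in the spirit of xfs_wait_on_pages(). */
static size_t toy_wait_if_writeback(struct toy_mapping *m)
{
	return toy_mapping_tagged(m, TOY_WRITEBACK) ? toy_fdatawait(m) : 0;
}
```

With two dirty pages, calling toy_fdatawrite() and then
toy_wait_if_dirty() waits on nothing, whereas toy_wait_if_writeback()
waits on both pages.
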
> > Might need a different variant, or we could just bite the bullet and
> > push through the ->fsync conversion so you get full control of the
> > writeout.
>
> Not important right now, though....
OK.
> > BTW, the Linux pagecache APIs should support range operations quite
> > nicely for these. Any reason not to use them (it looks like the
> > wrappers can take ranges)?
>
> Because I haven't got around to modifying these wrappers now that
> the range primitives are in place - XFS inherited the range
> operations from Irix and they have sat unimplemented since being
> ported to Linux.
OK, fair enough. Just something easy to stick on a todo list somewhere ;)
Probably doesn't make much difference now, but we might start seeing
apps using range syncs.
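
As an illustration of what such an app might do, here is a minimal
userspace sketch built on sync_file_range(2). The helper name
sync_byte_range() is made up for this example. Note that
sync_file_range() flushes neither file metadata nor the disk write
cache, so this is not a replacement for fsync() where durability
matters.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for sync_file_range() and its flags */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Write out and wait on bytes [offset, offset + nbytes) of fd.
 * Returns 0 on success or a negative errno value on failure.
 */
static int sync_byte_range(int fd, off_t offset, off_t nbytes)
{
	unsigned int flags = SYNC_FILE_RANGE_WAIT_BEFORE |
			     SYNC_FILE_RANGE_WRITE |
			     SYNC_FILE_RANGE_WAIT_AFTER;

	if (sync_file_range(fd, offset, nbytes, flags) < 0)
		return -errno;
	return 0;
}
```

A caller would use it as sync_byte_range(fd, 0, 4096) after writing the
first 4096 bytes of a file.
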
> The patch below should fix this entire class of error value screwup
> in XFS. I've just started running it through XFSQA, so it's not
> really tested yet.
I'll sanity check it by running it through my basic fault injection
tests here.
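
For anyone following along, the class of sign error can be sketched in
plain userspace C as below. All toy_* names are hypothetical stand-ins:
toy_generic_wait() plays the role of a generic helper such as
filemap_fdatawait(), and toy_vfs_entry() plays the role of the point
where XFS negates its positive error before returning to the VFS.

```c
#include <errno.h>

/* Stand-in for a generic Linux helper: negative errno on failure. */
static int toy_generic_wait(int fail)
{
	return fail ? -EIO : 0;
}

/* Buggy core helper: passes the negative value straight through. */
static int toy_core_wait_buggy(int fail)
{
	return toy_generic_wait(fail);
}

/* Fixed core helper: convert to the positive convention of the core code. */
static int toy_core_wait_fixed(int fail)
{
	return -toy_generic_wait(fail);
}

/* Boundary to the VFS: core errors are positive, so negate exactly once. */
static int toy_vfs_entry(int (*core_wait)(int), int fail)
{
	int error = core_wait(fail);

	return -error;
}
```

With the buggy helper, a failure leaves toy_vfs_entry() returning
positive EIO, which a caller that only checks for negative values treats
as success. With the fixed helper the same failure comes back as -EIO.
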
> FWIW, your entire patch series made it through XFSQA without any new
> regressions, so it looks good from that POV.
Thanks, very good to know.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
> XFS: fix error inversion problems with data flushing
>
> XFS gets the sign of the error wrong in several places when
> gathering the error from generic linux functions. These functions
> return negative error values, while the core XFS code returns
> positive error values. Hence when XFS inverts the error to be
> returned to the VFS, it can incorrectly invert a negative
> error and this error will be ignored by the syscall return.
>
> Fix all the problems related to calling filemap_* functions.
>
> Problem initially identified by Nick Piggin in xfs_fsync().
>
> Signed-off-by: Dave Chinner <david@fromorbit.com>
> ---
> fs/xfs/linux-2.6/xfs_fs_subr.c | 23 ++++++++++++++++++++---
> fs/xfs/linux-2.6/xfs_lrw.c | 2 +-
> fs/xfs/linux-2.6/xfs_super.c | 13 +++++++++----
> fs/xfs/xfs_vnodeops.c | 2 +-
> fs/xfs/xfs_vnodeops.h | 1 +
> 5 files changed, 32 insertions(+), 9 deletions(-)
>
> diff --git a/fs/xfs/linux-2.6/xfs_fs_subr.c b/fs/xfs/linux-2.6/xfs_fs_subr.c
> index 36caa6d..5aeb777 100644
> --- a/fs/xfs/linux-2.6/xfs_fs_subr.c
> +++ b/fs/xfs/linux-2.6/xfs_fs_subr.c
> @@ -24,6 +24,10 @@ int fs_noerr(void) { return 0; }
> int fs_nosys(void) { return ENOSYS; }
> void fs_noval(void) { return; }
>
> +/*
> + * note: all filemap functions return negative error codes. These
> + * need to be inverted before returning to the xfs core functions.
> + */
> void
> xfs_tosspages(
> xfs_inode_t *ip,
> @@ -53,7 +57,7 @@ xfs_flushinval_pages(
> if (!ret)
> truncate_inode_pages(mapping, first);
> }
> - return ret;
> + return -ret;
> }
>
> int
> @@ -72,10 +76,23 @@ xfs_flush_pages(
> xfs_iflags_clear(ip, XFS_ITRUNCATED);
> ret = filemap_fdatawrite(mapping);
> if (flags & XFS_B_ASYNC)
> - return ret;
> + return -ret;
> ret2 = filemap_fdatawait(mapping);
> if (!ret)
> ret = ret2;
> }
> - return ret;
> + return -ret;
> +}
> +
> +int
> +xfs_wait_on_pages(
> + xfs_inode_t *ip,
> + xfs_off_t first,
> + xfs_off_t last)
> +{
> + struct address_space *mapping = VFS_I(ip)->i_mapping;
> +
> + if (mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
> + return -filemap_fdatawait(mapping);
> + return 0;
> }
> diff --git a/fs/xfs/linux-2.6/xfs_lrw.c b/fs/xfs/linux-2.6/xfs_lrw.c
> index 1957e53..4959c87 100644
> --- a/fs/xfs/linux-2.6/xfs_lrw.c
> +++ b/fs/xfs/linux-2.6/xfs_lrw.c
> @@ -243,7 +243,7 @@ xfs_read(
>
> if (unlikely(ioflags & IO_ISDIRECT)) {
> if (inode->i_mapping->nrpages)
> - ret = xfs_flushinval_pages(ip, (*offset & PAGE_CACHE_MASK),
> + ret = -xfs_flushinval_pages(ip, (*offset & PAGE_CACHE_MASK),
> -1, FI_REMAPF_LOCKED);
> mutex_unlock(&inode->i_mutex);
> if (ret) {
> diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
> index 9dc977d..c5cfc1e 100644
> --- a/fs/xfs/linux-2.6/xfs_super.c
> +++ b/fs/xfs/linux-2.6/xfs_super.c
> @@ -1012,21 +1012,26 @@ xfs_fs_write_inode(
> struct inode *inode,
> int sync)
> {
> + struct xfs_inode *ip = XFS_I(inode);
> int error = 0;
> int flags = 0;
>
> - xfs_itrace_entry(XFS_I(inode));
> + xfs_itrace_entry(ip);
> if (sync) {
> - filemap_fdatawait(inode->i_mapping);
> + error = xfs_wait_on_pages(ip, 0, -1);
> + if (error)
> + goto out_error;
> flags |= FLUSH_SYNC;
> }
> - error = xfs_inode_flush(XFS_I(inode), flags);
> + error = xfs_inode_flush(ip, flags);
> +
> +out_error:
> /*
> * if we failed to write out the inode then mark
> * it dirty again so we'll try again later.
> */
> if (error)
> - xfs_mark_inode_dirty_sync(XFS_I(inode));
> + xfs_mark_inode_dirty_sync(ip);
>
> return -error;
> }
> diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
> index 2e2fbd9..5809c42 100644
> --- a/fs/xfs/xfs_vnodeops.c
> +++ b/fs/xfs/xfs_vnodeops.c
> @@ -713,7 +713,7 @@ xfs_fsync(
> return XFS_ERROR(EIO);
>
> /* capture size updates in I/O completion before writing the inode. */
> - error = filemap_fdatawait(VFS_I(ip)->i_mapping);
> + error = xfs_wait_on_pages(ip, 0, -1);
> if (error)
> return XFS_ERROR(error);
>
> diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
> index e932a96..f9cd376 100644
> --- a/fs/xfs/xfs_vnodeops.h
> +++ b/fs/xfs/xfs_vnodeops.h
> @@ -78,5 +78,6 @@ int xfs_flushinval_pages(struct xfs_inode *ip, xfs_off_t first,
> xfs_off_t last, int fiopt);
> int xfs_flush_pages(struct xfs_inode *ip, xfs_off_t first,
> xfs_off_t last, uint64_t flags, int fiopt);
> +int xfs_wait_on_pages(struct xfs_inode *ip, xfs_off_t first, xfs_off_t last);
>
> #endif /* _XFS_VNODEOPS_H */