From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 2/2] xfs: flush both inodes in xfs_swap_extents
Date: Thu, 31 Jul 2014 16:12:08 +1000
Message-ID: <1406787128-11897-3-git-send-email-david@fromorbit.com>
In-Reply-To: <1406787128-11897-1-git-send-email-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
We need to treat both inodes identically from a page cache point of
view when preparing them for extent swapping. We don't do this
right now - we assume that one of the inodes is empty, because
that's what xfs_fsr currently does. Remove this assumption from the
code.

While factoring out the flushing and related checks, move the
transaction reservation to immediately after the flushes so that we
don't need to pick up and then drop the ilock to do the transaction
reservation. There are no issues with aborting the transaction if
the checks fail before we join the inodes to the transaction and
dirty them, so this is a safe change to make.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_bmap_util.c | 81 +++++++++++++++++++++++---------------------------
1 file changed, 37 insertions(+), 44 deletions(-)
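For readers following the flush ordering described in the commit
message, here is a minimal user-space sketch of the per-inode
preparation step. It is not kernel code: struct model_inode, its fields
(including the hypothetical pinned_pages, which stands in for pages
that survive truncation) and the model_* functions are assumptions made
purely for illustration. Only the order of the checks and the error
codes are taken from xfs_swap_extent_flush() in the diff below.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page cache state of one inode. */
struct model_inode {
	size_t	dirty_pages;	/* pages awaiting writeback */
	size_t	cached_pages;	/* pages resident in the cache (nrpages) */
	size_t	pinned_pages;	/* assumption: pages truncation cannot drop */
	bool	mmapped;	/* file currently has a mapping */
	int	writeback_err;	/* error writeback should report, 0 if none */
};

/* Models filemap_write_and_wait(): clean the dirty pages or fail. */
int model_write_and_wait(struct model_inode *ip)
{
	if (ip->writeback_err)
		return ip->writeback_err;
	ip->dirty_pages = 0;
	return 0;
}

/* Models truncate_pagecache_range(inode, 0, -1): drop what can be dropped. */
void model_truncate_pagecache(struct model_inode *ip)
{
	ip->cached_pages = ip->pinned_pages;
}

/* Same check order as xfs_swap_extent_flush() in the patch. */
int model_swap_extent_flush(struct model_inode *ip)
{
	int error;

	error = model_write_and_wait(ip);
	if (error)
		return error;
	model_truncate_pagecache(ip);

	/* Pages still cached after truncation: refuse the swap. */
	if (ip->cached_pages)
		return -EINVAL;

	/* Mapped files cannot be protected against page fault races. */
	if (ip->mmapped)
		return -EBUSY;
	return 0;
}

/* Both inodes go through the same helper; the first failure wins. */
int model_prepare_swap(struct model_inode *ip, struct model_inode *tip)
{
	int error;

	error = model_swap_extent_flush(ip);
	if (error)
		return error;
	return model_swap_extent_flush(tip);
}
```

Calling model_prepare_swap() with either argument dirty, cached or
mapped gives the same result regardless of which side it is passed on,
which is the symmetry the patch is after. In the real function the
transaction reservation and the ilock acquisition follow only once both
calls have succeeded.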
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 3c60c43..2f1e30d 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1619,6 +1619,30 @@ xfs_swap_extents_check_format(
}
int
+xfs_swap_extent_flush(
+ struct xfs_inode *ip)
+{
+ int error;
+
+ error = filemap_write_and_wait(VFS_I(ip)->i_mapping);
+ if (error)
+ return error;
+ truncate_pagecache_range(VFS_I(ip), 0, -1);
+
+ /* Verify O_DIRECT for ftmp */
+ if (VFS_I(ip)->i_mapping->nrpages)
+ return -EINVAL;
+
+ /*
+ * Don't try to swap extents on mmap()d files because we can't lock
+ * out races against page faults safely.
+ */
+ if (mapping_mapped(VFS_I(ip)->i_mapping))
+ return -EBUSY;
+ return 0;
+}
+
+int
xfs_swap_extents(
xfs_inode_t *ip, /* target inode */
xfs_inode_t *tip, /* tmp inode */
@@ -1662,26 +1686,28 @@ xfs_swap_extents(
goto out_unlock;
}
- error = filemap_write_and_wait(VFS_I(tip)->i_mapping);
+ error = xfs_swap_extent_flush(ip);
+ if (error)
+ goto out_unlock;
+ error = xfs_swap_extent_flush(tip);
if (error)
goto out_unlock;
- truncate_pagecache_range(VFS_I(tip), 0, -1);
-
- xfs_lock_two_inodes(ip, tip, XFS_ILOCK_EXCL);
- lock_flags |= XFS_ILOCK_EXCL;
- /* Verify O_DIRECT for ftmp */
- if (VFS_I(tip)->i_mapping->nrpages) {
- error = -EINVAL;
+ tp = xfs_trans_alloc(mp, XFS_TRANS_SWAPEXT);
+ error = xfs_trans_reserve(tp, &M_RES(mp)->tr_ichange, 0, 0);
+ if (error) {
+ xfs_trans_cancel(tp, 0);
goto out_unlock;
}
+ xfs_lock_two_inodes(ip, tip, XFS_ILOCK_EXCL);
+ lock_flags |= XFS_ILOCK_EXCL;
/* Verify all data are being swapped */
if (sxp->sx_offset != 0 ||
sxp->sx_length != ip->i_d.di_size ||
sxp->sx_length != tip->i_d.di_size) {
error = -EFAULT;
- goto out_unlock;
+ goto out_trans_cancel;
}
trace_xfs_swap_extent_before(ip, 0);
@@ -1693,7 +1719,7 @@ xfs_swap_extents(
xfs_notice(mp,
"%s: inode 0x%llx format is incompatible for exchanging.",
__func__, ip->i_ino);
- goto out_unlock;
+ goto out_trans_cancel;
}
/*
@@ -1708,41 +1734,8 @@ xfs_swap_extents(
(sbp->bs_mtime.tv_sec != VFS_I(ip)->i_mtime.tv_sec) ||
(sbp->bs_mtime.tv_nsec != VFS_I(ip)->i_mtime.tv_nsec)) {
error = -EBUSY;
- goto out_unlock;
- }
-
- /* We need to fail if the file is memory mapped. Once we have tossed
- * all existing pages, the page fault will have no option
- * but to go to the filesystem for pages. By making the page fault call
- * vop_read (or write in the case of autogrow) they block on the iolock
- * until we have switched the extents.
- */
- if (mapping_mapped(VFS_I(ip)->i_mapping)) {
- error = -EBUSY;
- goto out_unlock;
- }
-
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
- xfs_iunlock(tip, XFS_ILOCK_EXCL);
- lock_flags &= ~XFS_ILOCK_EXCL;
-
- /*
- * There is a race condition here since we gave up the
- * ilock. However, the data fork will not change since
- * we have the iolock (locked for truncation too) so we
- * are safe. We don't really care if non-io related
- * fields change.
- */
- truncate_pagecache_range(VFS_I(ip), 0, -1);
-
- tp = xfs_trans_alloc(mp, XFS_TRANS_SWAPEXT);
- error = xfs_trans_reserve(tp, &M_RES(mp)->tr_ichange, 0, 0);
- if (error)
goto out_trans_cancel;
-
- xfs_lock_two_inodes(ip, tip, XFS_ILOCK_EXCL);
- lock_flags |= XFS_ILOCK_EXCL;
-
+ }
/*
* Count the number of extended attribute blocks
*/
--
2.0.0