* fix locking for the reflink operation
@ 2016-10-17 12:05 Christoph Hellwig
2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw)
To: linux-xfs; +Cc: darrick.wong
When creating a reflink we need to take the iolock much earlier, as
various early checks done in xfs_file_share_range currently are racy
without it. Patches 1-3 sort that out in a minimal invasive way,
but I think we should just merge xfs_file_share_range and
xfs_reflink_remap_range, which is what patch 4 does.
Patches 1-3 are something I'd like to see in 4.9, patch 4 might not
fully qualify, but just getting it in might make everyones life easier.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range
2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig
@ 2016-10-17 12:05 ` Christoph Hellwig
2016-10-17 21:11 ` Darrick J. Wong
2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig
` (3 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw)
To: linux-xfs; +Cc: darrick.wong
The VFS already does the check, and the placement of this duplicate is in
the way of the following locking rework.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index a314fc7..20563f2 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1003,9 +1003,6 @@ xfs_file_share_range(
IS_SWAPFILE(inode_out))
return -ETXTBSY;
- /* Reflink only works within this filesystem. */
- if (inode_in->i_sb != inode_out->i_sb)
- return -EXDEV;
same_inode = (inode_in->i_ino == inode_out->i_ino);
/* Don't reflink dirs, pipes, sockets... */
--
2.1.4
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range
2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig
2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig
@ 2016-10-17 12:05 ` Christoph Hellwig
2016-10-17 21:12 ` Darrick J. Wong
2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig
` (2 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw)
To: linux-xfs; +Cc: darrick.wong
The VFS i_ino is an unsigned long, while XFS inode numbers are 64-bit
wide, so checking i_ino for equality could lead to rate false positives
on 32-bit architectures. Just compare the inode pointers themselves
to be safe.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 20563f2..012a960 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1003,7 +1003,7 @@ xfs_file_share_range(
IS_SWAPFILE(inode_out))
return -ETXTBSY;
- same_inode = (inode_in->i_ino == inode_out->i_ino);
+ same_inode = (inode_in == inode_out);
/* Don't reflink dirs, pipes, sockets... */
if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
--
2.1.4
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range
2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig
2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig
2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig
@ 2016-10-17 12:05 ` Christoph Hellwig
2016-10-17 21:16 ` Darrick J. Wong
2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig
2016-10-17 21:35 ` fix locking for the reflink operation Darrick J. Wong
4 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw)
To: linux-xfs; +Cc: darrick.wong
We need the iolock protection to stabilizie the IS_SWAPFILE and IS_IMMUTABLE
values, as well as preventing new buffered writers re-dirtying the file data
that we just wrote out.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 62 ++++++++++++++++++++++++++++++++++------------------
fs/xfs/xfs_reflink.c | 15 -------------
2 files changed, 41 insertions(+), 36 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 012a960..0960264 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -996,38 +996,54 @@ xfs_file_share_range(
inode_out = file_inode(file_out);
bs = inode_out->i_sb->s_blocksize;
+ /* Lock both files against IO */
+ same_inode = (inode_in == inode_out);
+ if (same_inode) {
+ xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL);
+ xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL);
+ } else {
+ xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out),
+ XFS_IOLOCK_EXCL);
+ xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out),
+ XFS_MMAPLOCK_EXCL);
+ }
+
/* Don't touch certain kinds of inodes */
+ ret = -EPERM;
if (IS_IMMUTABLE(inode_out))
- return -EPERM;
- if (IS_SWAPFILE(inode_in) ||
- IS_SWAPFILE(inode_out))
- return -ETXTBSY;
-
- same_inode = (inode_in == inode_out);
+ goto out_unlock;
+ ret = -ETXTBSY;
+ if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out))
+ goto out_unlock;
/* Don't reflink dirs, pipes, sockets... */
+ ret = -EISDIR;
if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
- return -EISDIR;
+ goto out_unlock;
+ ret = -EINVAL;
if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode))
- return -EINVAL;
+ goto out_unlock;
if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
- return -EINVAL;
+ goto out_unlock;
/* Don't share DAX file data for now. */
if (IS_DAX(inode_in) || IS_DAX(inode_out))
- return -EINVAL;
+ goto out_unlock;
/* Are we going all the way to the end? */
isize = i_size_read(inode_in);
- if (isize == 0)
- return 0;
+ if (isize == 0) {
+ ret = 0;
+ goto out_unlock;
+ }
+
if (len == 0)
len = isize - pos_in;
/* Ensure offsets don't wrap and the input is inside i_size */
if (pos_in + len < pos_in || pos_out + len < pos_out ||
pos_in + len > isize)
- return -EINVAL;
+ goto out_unlock;
/* Don't allow dedupe past EOF in the dest file */
if (is_dedupe) {
@@ -1035,7 +1051,7 @@ xfs_file_share_range(
disize = i_size_read(inode_out);
if (pos_out >= disize || pos_out + len > disize)
- return -EINVAL;
+ goto out_unlock;
}
/* If we're linking to EOF, continue to the block boundary. */
@@ -1047,28 +1063,32 @@ xfs_file_share_range(
/* Only reflink if we're aligned to block boundaries */
if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) ||
!IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs))
- return -EINVAL;
+ goto out_unlock;
/* Don't allow overlapped reflink within the same file */
if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen)
- return -EINVAL;
+ goto out_unlock;
/* Wait for the completion of any pending IOs on srcfile */
ret = xfs_file_wait_for_io(inode_in, pos_in, len);
if (ret)
- goto out;
+ goto out_unlock;
ret = xfs_file_wait_for_io(inode_out, pos_out, len);
if (ret)
- goto out;
+ goto out_unlock;
if (is_dedupe)
flags |= XFS_REFLINK_DEDUPE;
ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out),
pos_out, len, flags);
- if (ret < 0)
- goto out;
-out:
+out_unlock:
+ xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL);
+ xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL);
+ if (!same_inode) {
+ xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL);
+ xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL);
+ }
return ret;
}
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 5965e94..d012746 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1341,15 +1341,6 @@ xfs_reflink_remap_range(
trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff);
- /* Lock both files against IO */
- if (src->i_ino == dest->i_ino) {
- xfs_ilock(src, XFS_IOLOCK_EXCL);
- xfs_ilock(src, XFS_MMAPLOCK_EXCL);
- } else {
- xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL);
- xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL);
- }
-
/*
* Check that the extents are the same.
*/
@@ -1401,12 +1392,6 @@ xfs_reflink_remap_range(
goto out_error;
out_error:
- xfs_iunlock(src, XFS_MMAPLOCK_EXCL);
- xfs_iunlock(src, XFS_IOLOCK_EXCL);
- if (src->i_ino != dest->i_ino) {
- xfs_iunlock(dest, XFS_MMAPLOCK_EXCL);
- xfs_iunlock(dest, XFS_IOLOCK_EXCL);
- }
if (error)
trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_);
return error;
--
2.1.4
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range
2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig
` (2 preceding siblings ...)
2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig
@ 2016-10-17 12:05 ` Christoph Hellwig
2016-10-17 21:29 ` Darrick J. Wong
2016-10-17 21:35 ` fix locking for the reflink operation Darrick J. Wong
4 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw)
To: linux-xfs; +Cc: darrick.wong
There is no clear division of responsibility between those functions, so
just merge them into one to keep the code simple. Also move
xfs_file_wait_for_io to xfs_reflink.c together with its only caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 145 -------------------------------------
fs/xfs/xfs_reflink.c | 199 ++++++++++++++++++++++++++++++++++++++++-----------
fs/xfs/xfs_reflink.h | 8 +--
3 files changed, 161 insertions(+), 191 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 0960264..cd37573 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -947,151 +947,6 @@ xfs_file_fallocate(
return error;
}
-/*
- * Flush all file writes out to disk.
- */
-static int
-xfs_file_wait_for_io(
- struct inode *inode,
- loff_t offset,
- size_t len)
-{
- loff_t rounding;
- loff_t ioffset;
- loff_t iendoffset;
- loff_t bs;
- int ret;
-
- bs = inode->i_sb->s_blocksize;
- inode_dio_wait(inode);
-
- rounding = max_t(xfs_off_t, bs, PAGE_SIZE);
- ioffset = round_down(offset, rounding);
- iendoffset = round_up(offset + len, rounding) - 1;
- ret = filemap_write_and_wait_range(inode->i_mapping, ioffset,
- iendoffset);
- return ret;
-}
-
-/* Hook up to the VFS reflink function */
-STATIC int
-xfs_file_share_range(
- struct file *file_in,
- loff_t pos_in,
- struct file *file_out,
- loff_t pos_out,
- u64 len,
- bool is_dedupe)
-{
- struct inode *inode_in;
- struct inode *inode_out;
- ssize_t ret;
- loff_t bs;
- loff_t isize;
- int same_inode;
- loff_t blen;
- unsigned int flags = 0;
-
- inode_in = file_inode(file_in);
- inode_out = file_inode(file_out);
- bs = inode_out->i_sb->s_blocksize;
-
- /* Lock both files against IO */
- same_inode = (inode_in == inode_out);
- if (same_inode) {
- xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL);
- xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL);
- } else {
- xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out),
- XFS_IOLOCK_EXCL);
- xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out),
- XFS_MMAPLOCK_EXCL);
- }
-
- /* Don't touch certain kinds of inodes */
- ret = -EPERM;
- if (IS_IMMUTABLE(inode_out))
- goto out_unlock;
- ret = -ETXTBSY;
- if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out))
- goto out_unlock;
-
- /* Don't reflink dirs, pipes, sockets... */
- ret = -EISDIR;
- if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
- goto out_unlock;
- ret = -EINVAL;
- if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode))
- goto out_unlock;
- if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
- goto out_unlock;
-
- /* Don't share DAX file data for now. */
- if (IS_DAX(inode_in) || IS_DAX(inode_out))
- goto out_unlock;
-
- /* Are we going all the way to the end? */
- isize = i_size_read(inode_in);
- if (isize == 0) {
- ret = 0;
- goto out_unlock;
- }
-
- if (len == 0)
- len = isize - pos_in;
-
- /* Ensure offsets don't wrap and the input is inside i_size */
- if (pos_in + len < pos_in || pos_out + len < pos_out ||
- pos_in + len > isize)
- goto out_unlock;
-
- /* Don't allow dedupe past EOF in the dest file */
- if (is_dedupe) {
- loff_t disize;
-
- disize = i_size_read(inode_out);
- if (pos_out >= disize || pos_out + len > disize)
- goto out_unlock;
- }
-
- /* If we're linking to EOF, continue to the block boundary. */
- if (pos_in + len == isize)
- blen = ALIGN(isize, bs) - pos_in;
- else
- blen = len;
-
- /* Only reflink if we're aligned to block boundaries */
- if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) ||
- !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs))
- goto out_unlock;
-
- /* Don't allow overlapped reflink within the same file */
- if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen)
- goto out_unlock;
-
- /* Wait for the completion of any pending IOs on srcfile */
- ret = xfs_file_wait_for_io(inode_in, pos_in, len);
- if (ret)
- goto out_unlock;
- ret = xfs_file_wait_for_io(inode_out, pos_out, len);
- if (ret)
- goto out_unlock;
-
- if (is_dedupe)
- flags |= XFS_REFLINK_DEDUPE;
- ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out),
- pos_out, len, flags);
-
-out_unlock:
- xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL);
- xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL);
- if (!same_inode) {
- xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL);
- xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL);
- }
- return ret;
-}
-
STATIC ssize_t
xfs_file_copy_range(
struct file *file_in,
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index d012746..11ed2ad 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1308,23 +1308,56 @@ xfs_compare_extents(
}
/*
+ * Flush all file writes out to disk.
+ */
+static int
+xfs_file_wait_for_io(
+ struct inode *inode,
+ loff_t offset,
+ size_t len)
+{
+ loff_t rounding;
+ loff_t ioffset;
+ loff_t iendoffset;
+ loff_t bs;
+ int ret;
+
+ bs = inode->i_sb->s_blocksize;
+ inode_dio_wait(inode);
+
+ rounding = max_t(xfs_off_t, bs, PAGE_SIZE);
+ ioffset = round_down(offset, rounding);
+ iendoffset = round_up(offset + len, rounding) - 1;
+ ret = filemap_write_and_wait_range(inode->i_mapping, ioffset,
+ iendoffset);
+ return ret;
+}
+
+/*
* Link a range of blocks from one file to another.
*/
int
-xfs_reflink_remap_range(
- struct xfs_inode *src,
- xfs_off_t srcoff,
- struct xfs_inode *dest,
- xfs_off_t destoff,
- xfs_off_t len,
- unsigned int flags)
+xfs_file_share_range(
+ struct file *file_in,
+ loff_t pos_in,
+ struct file *file_out,
+ loff_t pos_out,
+ u64 len,
+ bool is_dedupe)
{
+ struct inode *inode_in = file_inode(file_in);
+ struct xfs_inode *src = XFS_I(inode_in);
+ struct inode *inode_out = file_inode(file_out);
+ struct xfs_inode *dest = XFS_I(inode_out);
struct xfs_mount *mp = src->i_mount;
+ loff_t bs = inode_out->i_sb->s_blocksize;
+ bool same_inode = (inode_in == inode_out);
xfs_fileoff_t sfsbno, dfsbno;
xfs_filblks_t fsblen;
- int error;
xfs_extlen_t cowextsize;
- bool is_same;
+ loff_t isize;
+ ssize_t ret;
+ loff_t blen;
if (!xfs_sb_version_hasreflink(&mp->m_sb))
return -EOPNOTSUPP;
@@ -1332,48 +1365,128 @@ xfs_reflink_remap_range(
if (XFS_FORCED_SHUTDOWN(mp))
return -EIO;
+ /* Lock both files against IO */
+ if (same_inode) {
+ xfs_ilock(src, XFS_IOLOCK_EXCL);
+ xfs_ilock(src, XFS_MMAPLOCK_EXCL);
+ } else {
+ xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL);
+ xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL);
+ }
+
+ /* Don't touch certain kinds of inodes */
+ ret = -EPERM;
+ if (IS_IMMUTABLE(inode_out))
+ goto out_unlock;
+
+ ret = -ETXTBSY;
+ if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out))
+ goto out_unlock;
+
+
+ /* Don't reflink dirs, pipes, sockets... */
+ ret = -EISDIR;
+ if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
+ goto out_unlock;
+ ret = -EINVAL;
+ if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode))
+ goto out_unlock;
+ if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
+ goto out_unlock;
+
/* Don't reflink realtime inodes */
if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest))
- return -EINVAL;
+ goto out_unlock;
+
+ /* Don't share DAX file data for now. */
+ if (IS_DAX(inode_in) || IS_DAX(inode_out))
+ goto out_unlock;
+
+ /* Are we going all the way to the end? */
+ isize = i_size_read(inode_in);
+ if (isize == 0) {
+ ret = 0;
+ goto out_unlock;
+ }
+
+ if (len == 0)
+ len = isize - pos_in;
- if (flags & ~XFS_REFLINK_ALL)
- return -EINVAL;
+ /* Ensure offsets don't wrap and the input is inside i_size */
+ if (pos_in + len < pos_in || pos_out + len < pos_out ||
+ pos_in + len > isize)
+ goto out_unlock;
+
+ /* Don't allow dedupe past EOF in the dest file */
+ if (is_dedupe) {
+ loff_t disize;
+
+ disize = i_size_read(inode_out);
+ if (pos_out >= disize || pos_out + len > disize)
+ goto out_unlock;
+ }
- trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff);
+ /* If we're linking to EOF, continue to the block boundary. */
+ if (pos_in + len == isize)
+ blen = ALIGN(isize, bs) - pos_in;
+ else
+ blen = len;
+
+ /* Only reflink if we're aligned to block boundaries */
+ if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) ||
+ !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs))
+ goto out_unlock;
+
+ /* Don't allow overlapped reflink within the same file */
+ if (same_inode) {
+ if (pos_out + blen > pos_in && pos_out < pos_in + blen)
+ goto out_unlock;
+ }
+
+ /* Wait for the completion of any pending IOs on srcfile */
+ ret = xfs_file_wait_for_io(inode_in, pos_in, len);
+ if (ret)
+ goto out_unlock;
+ ret = xfs_file_wait_for_io(inode_out, pos_out, len);
+ if (ret)
+ goto out_unlock;
+
+ trace_xfs_reflink_remap_range(src, pos_in, len, dest, pos_out);
/*
* Check that the extents are the same.
*/
- if (flags & XFS_REFLINK_DEDUPE) {
- is_same = false;
- error = xfs_compare_extents(VFS_I(src), srcoff, VFS_I(dest),
- destoff, len, &is_same);
- if (error)
- goto out_error;
+ if (is_dedupe) {
+ bool is_same = false;
+
+ ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out,
+ len, &is_same);
+ if (ret)
+ goto out_unlock;;
if (!is_same) {
- error = -EBADE;
- goto out_error;
+ ret = -EBADE;
+ goto out_unlock;
}
}
- error = xfs_reflink_set_inode_flag(src, dest);
- if (error)
- goto out_error;
+ ret = xfs_reflink_set_inode_flag(src, dest);
+ if (ret)
+ goto out_unlock;
/*
* Invalidate the page cache so that we can clear any CoW mappings
* in the destination file.
*/
- truncate_inode_pages_range(&VFS_I(dest)->i_data, destoff,
- PAGE_ALIGN(destoff + len) - 1);
+ truncate_inode_pages_range(&inode_out->i_data, pos_out,
+ PAGE_ALIGN(pos_out + len) - 1);
- dfsbno = XFS_B_TO_FSBT(mp, destoff);
- sfsbno = XFS_B_TO_FSBT(mp, srcoff);
+ dfsbno = XFS_B_TO_FSBT(mp, pos_out);
+ sfsbno = XFS_B_TO_FSBT(mp, pos_in);
fsblen = XFS_B_TO_FSB(mp, len);
- error = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen,
- destoff + len);
- if (error)
- goto out_error;
+ ret = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen,
+ pos_out + len);
+ if (ret)
+ goto out_unlock;
/*
* Carry the cowextsize hint from src to dest if we're sharing the
@@ -1381,20 +1494,24 @@ xfs_reflink_remap_range(
* has a cowextsize hint, and the destination file does not.
*/
cowextsize = 0;
- if (srcoff == 0 && len == i_size_read(VFS_I(src)) &&
+ if (pos_in == 0 && len == i_size_read(inode_in) &&
(src->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE) &&
- destoff == 0 && len >= i_size_read(VFS_I(dest)) &&
+ pos_out == 0 && len >= i_size_read(inode_out) &&
!(dest->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE))
cowextsize = src->i_d.di_cowextsize;
- error = xfs_reflink_update_dest(dest, destoff + len, cowextsize);
- if (error)
- goto out_error;
+ ret = xfs_reflink_update_dest(dest, pos_out + len, cowextsize);
-out_error:
- if (error)
- trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_);
- return error;
+out_unlock:
+ xfs_iunlock(src, XFS_MMAPLOCK_EXCL);
+ xfs_iunlock(src, XFS_IOLOCK_EXCL);
+ if (src->i_ino != dest->i_ino) {
+ xfs_iunlock(dest, XFS_MMAPLOCK_EXCL);
+ xfs_iunlock(dest, XFS_IOLOCK_EXCL);
+ }
+ if (ret)
+ trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_);
+ return ret;
}
/*
diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
index 5dc3c8a..25a8956 100644
--- a/fs/xfs/xfs_reflink.h
+++ b/fs/xfs/xfs_reflink.h
@@ -43,11 +43,6 @@ extern int xfs_reflink_cancel_cow_range(struct xfs_inode *ip, xfs_off_t offset,
extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset,
xfs_off_t count);
extern int xfs_reflink_recover_cow(struct xfs_mount *mp);
-#define XFS_REFLINK_DEDUPE 1 /* only reflink if contents match */
-#define XFS_REFLINK_ALL (XFS_REFLINK_DEDUPE)
-extern int xfs_reflink_remap_range(struct xfs_inode *src, xfs_off_t srcoff,
- struct xfs_inode *dest, xfs_off_t destoff, xfs_off_t len,
- unsigned int flags);
extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip,
struct xfs_trans **tpp);
extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,
@@ -55,4 +50,7 @@ extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,
extern bool xfs_reflink_has_real_cow_blocks(struct xfs_inode *ip);
+int xfs_file_share_range(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe);
+
#endif /* __XFS_REFLINK_H */
--
2.1.4
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range
2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig
@ 2016-10-17 21:11 ` Darrick J. Wong
0 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2016-10-17 21:11 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs
On Mon, Oct 17, 2016 at 02:05:17PM +0200, Christoph Hellwig wrote:
> The VFS already does the check, and the placement of this duplicate is in
> the way of the following locking rework.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/xfs/xfs_file.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index a314fc7..20563f2 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1003,9 +1003,6 @@ xfs_file_share_range(
> IS_SWAPFILE(inode_out))
> return -ETXTBSY;
>
> - /* Reflink only works within this filesystem. */
> - if (inode_in->i_sb != inode_out->i_sb)
> - return -EXDEV;
> same_inode = (inode_in->i_ino == inode_out->i_ino);
>
> /* Don't reflink dirs, pipes, sockets... */
> --
> 2.1.4
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range
2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig
@ 2016-10-17 21:12 ` Darrick J. Wong
0 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2016-10-17 21:12 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs
On Mon, Oct 17, 2016 at 02:05:18PM +0200, Christoph Hellwig wrote:
> The VFS i_ino is an unsigned long, while XFS inode numbers are 64-bit
> wide, so checking i_ino for equality could lead to rate false positives
> on 32-bit architectures. Just compare the inode pointers themselves
> to be safe.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/xfs/xfs_file.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 20563f2..012a960 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1003,7 +1003,7 @@ xfs_file_share_range(
> IS_SWAPFILE(inode_out))
> return -ETXTBSY;
>
> - same_inode = (inode_in->i_ino == inode_out->i_ino);
> + same_inode = (inode_in == inode_out);
>
> /* Don't reflink dirs, pipes, sockets... */
> if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
> --
> 2.1.4
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range
2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig
@ 2016-10-17 21:16 ` Darrick J. Wong
0 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2016-10-17 21:16 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs
On Mon, Oct 17, 2016 at 02:05:19PM +0200, Christoph Hellwig wrote:
> We need the iolock protection to stabilizie the IS_SWAPFILE and IS_IMMUTABLE
> values, as well as preventing new buffered writers re-dirtying the file data
> that we just wrote out.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/xfs/xfs_file.c | 62 ++++++++++++++++++++++++++++++++++------------------
> fs/xfs/xfs_reflink.c | 15 -------------
> 2 files changed, 41 insertions(+), 36 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 012a960..0960264 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -996,38 +996,54 @@ xfs_file_share_range(
> inode_out = file_inode(file_out);
> bs = inode_out->i_sb->s_blocksize;
>
> + /* Lock both files against IO */
> + same_inode = (inode_in == inode_out);
> + if (same_inode) {
> + xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL);
> + xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL);
> + } else {
> + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out),
> + XFS_IOLOCK_EXCL);
> + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out),
> + XFS_MMAPLOCK_EXCL);
> + }
> +
> /* Don't touch certain kinds of inodes */
> + ret = -EPERM;
> if (IS_IMMUTABLE(inode_out))
> - return -EPERM;
> - if (IS_SWAPFILE(inode_in) ||
> - IS_SWAPFILE(inode_out))
> - return -ETXTBSY;
> -
> - same_inode = (inode_in == inode_out);
> + goto out_unlock;
> + ret = -ETXTBSY;
> + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out))
> + goto out_unlock;
>
> /* Don't reflink dirs, pipes, sockets... */
> + ret = -EISDIR;
> if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
> - return -EISDIR;
> + goto out_unlock;
> + ret = -EINVAL;
> if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode))
> - return -EINVAL;
> + goto out_unlock;
> if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
> - return -EINVAL;
> + goto out_unlock;
>
> /* Don't share DAX file data for now. */
> if (IS_DAX(inode_in) || IS_DAX(inode_out))
> - return -EINVAL;
> + goto out_unlock;
>
> /* Are we going all the way to the end? */
> isize = i_size_read(inode_in);
> - if (isize == 0)
> - return 0;
> + if (isize == 0) {
> + ret = 0;
> + goto out_unlock;
> + }
> +
> if (len == 0)
> len = isize - pos_in;
>
> /* Ensure offsets don't wrap and the input is inside i_size */
> if (pos_in + len < pos_in || pos_out + len < pos_out ||
> pos_in + len > isize)
> - return -EINVAL;
> + goto out_unlock;
>
> /* Don't allow dedupe past EOF in the dest file */
> if (is_dedupe) {
> @@ -1035,7 +1051,7 @@ xfs_file_share_range(
>
> disize = i_size_read(inode_out);
> if (pos_out >= disize || pos_out + len > disize)
> - return -EINVAL;
> + goto out_unlock;
> }
>
> /* If we're linking to EOF, continue to the block boundary. */
> @@ -1047,28 +1063,32 @@ xfs_file_share_range(
> /* Only reflink if we're aligned to block boundaries */
> if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) ||
> !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs))
> - return -EINVAL;
> + goto out_unlock;
>
> /* Don't allow overlapped reflink within the same file */
> if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen)
> - return -EINVAL;
> + goto out_unlock;
>
> /* Wait for the completion of any pending IOs on srcfile */
> ret = xfs_file_wait_for_io(inode_in, pos_in, len);
> if (ret)
> - goto out;
> + goto out_unlock;
> ret = xfs_file_wait_for_io(inode_out, pos_out, len);
> if (ret)
> - goto out;
> + goto out_unlock;
>
> if (is_dedupe)
> flags |= XFS_REFLINK_DEDUPE;
> ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out),
> pos_out, len, flags);
> - if (ret < 0)
> - goto out;
>
> -out:
> +out_unlock:
> + xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL);
> + xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL);
> + if (!same_inode) {
> + xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL);
> + xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL);
> + }
> return ret;
> }
>
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index 5965e94..d012746 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -1341,15 +1341,6 @@ xfs_reflink_remap_range(
>
> trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff);
>
> - /* Lock both files against IO */
> - if (src->i_ino == dest->i_ino) {
> - xfs_ilock(src, XFS_IOLOCK_EXCL);
> - xfs_ilock(src, XFS_MMAPLOCK_EXCL);
> - } else {
> - xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL);
> - xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL);
> - }
> -
> /*
> * Check that the extents are the same.
> */
> @@ -1401,12 +1392,6 @@ xfs_reflink_remap_range(
> goto out_error;
>
> out_error:
> - xfs_iunlock(src, XFS_MMAPLOCK_EXCL);
> - xfs_iunlock(src, XFS_IOLOCK_EXCL);
> - if (src->i_ino != dest->i_ino) {
> - xfs_iunlock(dest, XFS_MMAPLOCK_EXCL);
> - xfs_iunlock(dest, XFS_IOLOCK_EXCL);
> - }
> if (error)
> trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_);
> return error;
> --
> 2.1.4
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range
2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig
@ 2016-10-17 21:29 ` Darrick J. Wong
2016-10-18 5:17 ` Christoph Hellwig
0 siblings, 1 reply; 11+ messages in thread
From: Darrick J. Wong @ 2016-10-17 21:29 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs
On Mon, Oct 17, 2016 at 02:05:20PM +0200, Christoph Hellwig wrote:
> There is no clear division of responsibility between those functions, so
> just merge them into one to keep the code simple. Also move
> xfs_file_wait_for_io to xfs_reflink.c together with its only caller.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_file.c | 145 -------------------------------------
> fs/xfs/xfs_reflink.c | 199 ++++++++++++++++++++++++++++++++++++++++-----------
> fs/xfs/xfs_reflink.h | 8 +--
> 3 files changed, 161 insertions(+), 191 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 0960264..cd37573 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -947,151 +947,6 @@ xfs_file_fallocate(
> return error;
> }
>
> -/*
> - * Flush all file writes out to disk.
> - */
> -static int
> -xfs_file_wait_for_io(
> - struct inode *inode,
> - loff_t offset,
> - size_t len)
> -{
> - loff_t rounding;
> - loff_t ioffset;
> - loff_t iendoffset;
> - loff_t bs;
> - int ret;
> -
> - bs = inode->i_sb->s_blocksize;
> - inode_dio_wait(inode);
> -
> - rounding = max_t(xfs_off_t, bs, PAGE_SIZE);
> - ioffset = round_down(offset, rounding);
> - iendoffset = round_up(offset + len, rounding) - 1;
> - ret = filemap_write_and_wait_range(inode->i_mapping, ioffset,
> - iendoffset);
> - return ret;
> -}
> -
> -/* Hook up to the VFS reflink function */
> -STATIC int
> -xfs_file_share_range(
> - struct file *file_in,
> - loff_t pos_in,
> - struct file *file_out,
> - loff_t pos_out,
> - u64 len,
> - bool is_dedupe)
> -{
> - struct inode *inode_in;
> - struct inode *inode_out;
> - ssize_t ret;
> - loff_t bs;
> - loff_t isize;
> - int same_inode;
> - loff_t blen;
> - unsigned int flags = 0;
> -
> - inode_in = file_inode(file_in);
> - inode_out = file_inode(file_out);
> - bs = inode_out->i_sb->s_blocksize;
> -
> - /* Lock both files against IO */
> - same_inode = (inode_in == inode_out);
> - if (same_inode) {
> - xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL);
> - xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL);
> - } else {
> - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out),
> - XFS_IOLOCK_EXCL);
> - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out),
> - XFS_MMAPLOCK_EXCL);
> - }
> -
> - /* Don't touch certain kinds of inodes */
> - ret = -EPERM;
> - if (IS_IMMUTABLE(inode_out))
> - goto out_unlock;
> - ret = -ETXTBSY;
> - if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out))
> - goto out_unlock;
> -
> - /* Don't reflink dirs, pipes, sockets... */
> - ret = -EISDIR;
> - if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
> - goto out_unlock;
> - ret = -EINVAL;
> - if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode))
> - goto out_unlock;
> - if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
> - goto out_unlock;
> -
> - /* Don't share DAX file data for now. */
> - if (IS_DAX(inode_in) || IS_DAX(inode_out))
> - goto out_unlock;
> -
> - /* Are we going all the way to the end? */
> - isize = i_size_read(inode_in);
> - if (isize == 0) {
> - ret = 0;
> - goto out_unlock;
> - }
> -
> - if (len == 0)
> - len = isize - pos_in;
> -
> - /* Ensure offsets don't wrap and the input is inside i_size */
> - if (pos_in + len < pos_in || pos_out + len < pos_out ||
> - pos_in + len > isize)
> - goto out_unlock;
> -
> - /* Don't allow dedupe past EOF in the dest file */
> - if (is_dedupe) {
> - loff_t disize;
> -
> - disize = i_size_read(inode_out);
> - if (pos_out >= disize || pos_out + len > disize)
> - goto out_unlock;
> - }
> -
> - /* If we're linking to EOF, continue to the block boundary. */
> - if (pos_in + len == isize)
> - blen = ALIGN(isize, bs) - pos_in;
> - else
> - blen = len;
> -
> - /* Only reflink if we're aligned to block boundaries */
> - if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) ||
> - !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs))
> - goto out_unlock;
> -
> - /* Don't allow overlapped reflink within the same file */
> - if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen)
> - goto out_unlock;
> -
> - /* Wait for the completion of any pending IOs on srcfile */
> - ret = xfs_file_wait_for_io(inode_in, pos_in, len);
> - if (ret)
> - goto out_unlock;
> - ret = xfs_file_wait_for_io(inode_out, pos_out, len);
> - if (ret)
> - goto out_unlock;
> -
> - if (is_dedupe)
> - flags |= XFS_REFLINK_DEDUPE;
> - ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out),
> - pos_out, len, flags);
> -
> -out_unlock:
> - xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL);
> - xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL);
> - if (!same_inode) {
> - xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL);
> - xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL);
> - }
> - return ret;
> -}
> -
> STATIC ssize_t
> xfs_file_copy_range(
> struct file *file_in,
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index d012746..11ed2ad 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -1308,23 +1308,56 @@ xfs_compare_extents(
> }
>
> /*
> + * Flush all file writes out to disk.
> + */
> +static int
> +xfs_file_wait_for_io(
> + struct inode *inode,
> + loff_t offset,
> + size_t len)
> +{
> + loff_t rounding;
> + loff_t ioffset;
> + loff_t iendoffset;
> + loff_t bs;
> + int ret;
> +
> + bs = inode->i_sb->s_blocksize;
> + inode_dio_wait(inode);
> +
> + rounding = max_t(xfs_off_t, bs, PAGE_SIZE);
> + ioffset = round_down(offset, rounding);
> + iendoffset = round_up(offset + len, rounding) - 1;
> + ret = filemap_write_and_wait_range(inode->i_mapping, ioffset,
> + iendoffset);
> + return ret;
> +}
This seems like a file action not specific to reflink.
> +
> +/*
> * Link a range of blocks from one file to another.
> */
> int
> -xfs_reflink_remap_range(
> - struct xfs_inode *src,
> - xfs_off_t srcoff,
> - struct xfs_inode *dest,
> - xfs_off_t destoff,
> - xfs_off_t len,
> - unsigned int flags)
> +xfs_file_share_range(
xfs_reflink_share_file_range() ?
I'd like to maintain the convention that all reflink functions
start with xfs_reflink_*, particularly since the xfs_file_* functions
largely live in xfs_file.c.
> + struct file *file_in,
> + loff_t pos_in,
> + struct file *file_out,
> + loff_t pos_out,
> + u64 len,
> + bool is_dedupe)
> {
> + struct inode *inode_in = file_inode(file_in);
> + struct xfs_inode *src = XFS_I(inode_in);
> + struct inode *inode_out = file_inode(file_out);
> + struct xfs_inode *dest = XFS_I(inode_out);
> struct xfs_mount *mp = src->i_mount;
> + loff_t bs = inode_out->i_sb->s_blocksize;
> + bool same_inode = (inode_in == inode_out);
> xfs_fileoff_t sfsbno, dfsbno;
> xfs_filblks_t fsblen;
> - int error;
> xfs_extlen_t cowextsize;
> - bool is_same;
> + loff_t isize;
> + ssize_t ret;
> + loff_t blen;
>
> if (!xfs_sb_version_hasreflink(&mp->m_sb))
> return -EOPNOTSUPP;
> @@ -1332,48 +1365,128 @@ xfs_reflink_remap_range(
> if (XFS_FORCED_SHUTDOWN(mp))
> return -EIO;
>
> + /* Lock both files against IO */
> + if (same_inode) {
> + xfs_ilock(src, XFS_IOLOCK_EXCL);
> + xfs_ilock(src, XFS_MMAPLOCK_EXCL);
> + } else {
> + xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL);
> + xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL);
> + }
> +
> + /* Don't touch certain kinds of inodes */
> + ret = -EPERM;
> + if (IS_IMMUTABLE(inode_out))
> + goto out_unlock;
> +
> + ret = -ETXTBSY;
> + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out))
> + goto out_unlock;
> +
> +
> + /* Don't reflink dirs, pipes, sockets... */
> + ret = -EISDIR;
> + if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
> + goto out_unlock;
> + ret = -EINVAL;
> + if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode))
> + goto out_unlock;
> + if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
> + goto out_unlock;
> +
> /* Don't reflink realtime inodes */
> if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest))
> - return -EINVAL;
> + goto out_unlock;
> +
> + /* Don't share DAX file data for now. */
> + if (IS_DAX(inode_in) || IS_DAX(inode_out))
> + goto out_unlock;
> +
> + /* Are we going all the way to the end? */
> + isize = i_size_read(inode_in);
> + if (isize == 0) {
> + ret = 0;
> + goto out_unlock;
> + }
> +
> + if (len == 0)
> + len = isize - pos_in;
>
> - if (flags & ~XFS_REFLINK_ALL)
> - return -EINVAL;
> + /* Ensure offsets don't wrap and the input is inside i_size */
> + if (pos_in + len < pos_in || pos_out + len < pos_out ||
> + pos_in + len > isize)
> + goto out_unlock;
> +
> + /* Don't allow dedupe past EOF in the dest file */
> + if (is_dedupe) {
> + loff_t disize;
> +
> + disize = i_size_read(inode_out);
> + if (pos_out >= disize || pos_out + len > disize)
> + goto out_unlock;
> + }
>
> - trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff);
> + /* If we're linking to EOF, continue to the block boundary. */
> + if (pos_in + len == isize)
> + blen = ALIGN(isize, bs) - pos_in;
> + else
> + blen = len;
> +
> + /* Only reflink if we're aligned to block boundaries */
> + if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) ||
> + !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs))
> + goto out_unlock;
> +
> + /* Don't allow overlapped reflink within the same file */
> + if (same_inode) {
> + if (pos_out + blen > pos_in && pos_out < pos_in + blen)
> + goto out_unlock;
> + }
> +
> + /* Wait for the completion of any pending IOs on srcfile */
> + ret = xfs_file_wait_for_io(inode_in, pos_in, len);
> + if (ret)
> + goto out_unlock;
> + ret = xfs_file_wait_for_io(inode_out, pos_out, len);
> + if (ret)
> + goto out_unlock;
> +
> + trace_xfs_reflink_remap_range(src, pos_in, len, dest, pos_out);
>
> /*
> * Check that the extents are the same.
> */
> - if (flags & XFS_REFLINK_DEDUPE) {
> - is_same = false;
> - error = xfs_compare_extents(VFS_I(src), srcoff, VFS_I(dest),
> - destoff, len, &is_same);
> - if (error)
> - goto out_error;
> + if (is_dedupe) {
> + bool is_same = false;
> +
> + ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out,
> + len, &is_same);
> + if (ret)
> + goto out_unlock;;
Double-semicolon here.
> if (!is_same) {
> - error = -EBADE;
> - goto out_error;
> + ret = -EBADE;
> + goto out_unlock;
> }
> }
>
> - error = xfs_reflink_set_inode_flag(src, dest);
> - if (error)
> - goto out_error;
> + ret = xfs_reflink_set_inode_flag(src, dest);
> + if (ret)
> + goto out_unlock;
>
> /*
> * Invalidate the page cache so that we can clear any CoW mappings
> * in the destination file.
> */
> - truncate_inode_pages_range(&VFS_I(dest)->i_data, destoff,
> - PAGE_ALIGN(destoff + len) - 1);
> + truncate_inode_pages_range(&inode_out->i_data, pos_out,
> + PAGE_ALIGN(pos_out + len) - 1);
>
> - dfsbno = XFS_B_TO_FSBT(mp, destoff);
> - sfsbno = XFS_B_TO_FSBT(mp, srcoff);
> + dfsbno = XFS_B_TO_FSBT(mp, pos_out);
> + sfsbno = XFS_B_TO_FSBT(mp, pos_in);
> fsblen = XFS_B_TO_FSB(mp, len);
> - error = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen,
> - destoff + len);
> - if (error)
> - goto out_error;
> + ret = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen,
> + pos_out + len);
> + if (ret)
> + goto out_unlock;
>
> /*
> * Carry the cowextsize hint from src to dest if we're sharing the
> @@ -1381,20 +1494,24 @@ xfs_reflink_remap_range(
> * has a cowextsize hint, and the destination file does not.
> */
> cowextsize = 0;
> - if (srcoff == 0 && len == i_size_read(VFS_I(src)) &&
> + if (pos_in == 0 && len == i_size_read(inode_in) &&
> (src->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE) &&
> - destoff == 0 && len >= i_size_read(VFS_I(dest)) &&
> + pos_out == 0 && len >= i_size_read(inode_out) &&
> !(dest->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE))
> cowextsize = src->i_d.di_cowextsize;
>
> - error = xfs_reflink_update_dest(dest, destoff + len, cowextsize);
> - if (error)
> - goto out_error;
> + ret = xfs_reflink_update_dest(dest, pos_out + len, cowextsize);
>
> -out_error:
> - if (error)
> - trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_);
> - return error;
> +out_unlock:
> + xfs_iunlock(src, XFS_MMAPLOCK_EXCL);
> + xfs_iunlock(src, XFS_IOLOCK_EXCL);
> + if (src->i_ino != dest->i_ino) {
> + xfs_iunlock(dest, XFS_MMAPLOCK_EXCL);
> + xfs_iunlock(dest, XFS_IOLOCK_EXCL);
> + }
> + if (ret)
> + trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_);
> + return ret;
> }
>
> /*
> diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
> index 5dc3c8a..25a8956 100644
> --- a/fs/xfs/xfs_reflink.h
> +++ b/fs/xfs/xfs_reflink.h
> @@ -43,11 +43,6 @@ extern int xfs_reflink_cancel_cow_range(struct xfs_inode *ip, xfs_off_t offset,
> extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset,
> xfs_off_t count);
> extern int xfs_reflink_recover_cow(struct xfs_mount *mp);
> -#define XFS_REFLINK_DEDUPE 1 /* only reflink if contents match */
> -#define XFS_REFLINK_ALL (XFS_REFLINK_DEDUPE)
Hooray, uglyflags went away! :)
Excepting the minor things I complained about, the rest is
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
> -extern int xfs_reflink_remap_range(struct xfs_inode *src, xfs_off_t srcoff,
> - struct xfs_inode *dest, xfs_off_t destoff, xfs_off_t len,
> - unsigned int flags);
> extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip,
> struct xfs_trans **tpp);
> extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,
> @@ -55,4 +50,7 @@ extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset,
>
> extern bool xfs_reflink_has_real_cow_blocks(struct xfs_inode *ip);
>
> +int xfs_file_share_range(struct file *file_in, loff_t pos_in,
> + struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe);
> +
> #endif /* __XFS_REFLINK_H */
> --
> 2.1.4
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: fix locking for the reflink operation
2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig
` (3 preceding siblings ...)
2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig
@ 2016-10-17 21:35 ` Darrick J. Wong
4 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2016-10-17 21:35 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs
On Mon, Oct 17, 2016 at 02:05:16PM +0200, Christoph Hellwig wrote:
> When creating a reflink we need to take the iolock much earlier, as
> various early checks done in xfs_file_share_range currently are racy
> without it. Patches 1-3 sort that out in a minimal invasive way,
> but I think we should just merge xfs_file_share_range and
> xfs_reflink_remap_range, which is what patch 4 does.
>
> Patches 1-3 are something I'd like to see in 4.9, patch 4 might not
> fully qualify, but just getting it in might make everyones life easier.
This series (+ the CoW optimization series before it) seem to run ok here.
I'm ok with (more soak testing and) sending it in for 4.9.
--D
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range
2016-10-17 21:29 ` Darrick J. Wong
@ 2016-10-18 5:17 ` Christoph Hellwig
0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2016-10-18 5:17 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs
On Mon, Oct 17, 2016 at 02:29:33PM -0700, Darrick J. Wong wrote:
> > + bs = inode->i_sb->s_blocksize;
> > + inode_dio_wait(inode);
> > +
> > + rounding = max_t(xfs_off_t, bs, PAGE_SIZE);
> > + ioffset = round_down(offset, rounding);
> > + iendoffset = round_up(offset + len, rounding) - 1;
> > + ret = filemap_write_and_wait_range(inode->i_mapping, ioffset,
> > + iendoffset);
> > + return ret;
> > +}
>
> This seems like a file action not specific to reflink.
But it's only used by the reflink code :)
That being said given that filemap_write_and_wait_range operates
on pages there is no need for the rounding anyway, and we could
just replace it with open coded calls to inode_dio_wait and
filemap_write_and_wait_range. Maybe I should do that before
this patch so we don't have to bother moving it.
> > +xfs_file_share_range(
>
> xfs_reflink_share_file_range() ?
>
> I'd like to maintain the convention that all reflink functions
> start with xfs_reflink_*, particularly since the xfs_file_* functions
> largely live in xfs_file.c.
Ok, fine.
> > + if (is_dedupe) {
> > + bool is_same = false;
> > +
> > + ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out,
> > + len, &is_same);
> > + if (ret)
> > + goto out_unlock;;
>
> Double-semicolon here.
I'll fix it.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-10-18 5:18 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig
2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig
2016-10-17 21:11 ` Darrick J. Wong
2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig
2016-10-17 21:12 ` Darrick J. Wong
2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig
2016-10-17 21:16 ` Darrick J. Wong
2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig
2016-10-17 21:29 ` Darrick J. Wong
2016-10-18 5:17 ` Christoph Hellwig
2016-10-17 21:35 ` fix locking for the reflink operation Darrick J. Wong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).