* fix locking for the reflink operation
@ 2016-10-17 12:05 Christoph Hellwig
2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw)
To: linux-xfs; +Cc: darrick.wong
When creating a reflink we need to take the iolock much earlier, as
various early checks done in xfs_file_share_range currently are racy
without it. Patches 1-3 sort that out in a minimal invasive way,
but I think we should just merge xfs_file_share_range and
xfs_reflink_remap_range, which is what patch 4 does.
Patches 1-3 are something I'd like to see in 4.9, patch 4 might not
fully qualify, but just getting it in might make everyones life easier.
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig @ 2016-10-17 12:05 ` Christoph Hellwig 2016-10-17 21:11 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig ` (3 subsequent siblings) 4 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw) To: linux-xfs; +Cc: darrick.wong The VFS already does the check, and the placement of this duplicate is in the way of the following locking rework. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index a314fc7..20563f2 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1003,9 +1003,6 @@ xfs_file_share_range( IS_SWAPFILE(inode_out)) return -ETXTBSY; - /* Reflink only works within this filesystem. */ - if (inode_in->i_sb != inode_out->i_sb) - return -EXDEV; same_inode = (inode_in->i_ino == inode_out->i_ino); /* Don't reflink dirs, pipes, sockets... */ -- 2.1.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range 2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:11 ` Darrick J. Wong 0 siblings, 0 replies; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:11 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:17PM +0200, Christoph Hellwig wrote: > The VFS already does the check, and the placement of this duplicate is in > the way of the following locking rework. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > fs/xfs/xfs_file.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index a314fc7..20563f2 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -1003,9 +1003,6 @@ xfs_file_share_range( > IS_SWAPFILE(inode_out)) > return -ETXTBSY; > > - /* Reflink only works within this filesystem. */ > - if (inode_in->i_sb != inode_out->i_sb) > - return -EXDEV; > same_inode = (inode_in->i_ino == inode_out->i_ino); > > /* Don't reflink dirs, pipes, sockets... */ > -- > 2.1.4 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig 2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig @ 2016-10-17 12:05 ` Christoph Hellwig 2016-10-17 21:12 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig ` (2 subsequent siblings) 4 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw) To: linux-xfs; +Cc: darrick.wong The VFS i_ino is an unsigned long, while XFS inode numbers are 64-bit wide, so checking i_ino for equality could lead to rate false positives on 32-bit architectures. Just compare the inode pointers themselves to be safe. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 20563f2..012a960 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1003,7 +1003,7 @@ xfs_file_share_range( IS_SWAPFILE(inode_out)) return -ETXTBSY; - same_inode = (inode_in->i_ino == inode_out->i_ino); + same_inode = (inode_in == inode_out); /* Don't reflink dirs, pipes, sockets... */ if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) -- 2.1.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range 2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:12 ` Darrick J. Wong 0 siblings, 0 replies; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:12 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:18PM +0200, Christoph Hellwig wrote: > The VFS i_ino is an unsigned long, while XFS inode numbers are 64-bit > wide, so checking i_ino for equality could lead to rate false positives > on 32-bit architectures. Just compare the inode pointers themselves > to be safe. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > fs/xfs/xfs_file.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 20563f2..012a960 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -1003,7 +1003,7 @@ xfs_file_share_range( > IS_SWAPFILE(inode_out)) > return -ETXTBSY; > > - same_inode = (inode_in->i_ino == inode_out->i_ino); > + same_inode = (inode_in == inode_out); > > /* Don't reflink dirs, pipes, sockets... */ > if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) > -- > 2.1.4 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig 2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig 2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig @ 2016-10-17 12:05 ` Christoph Hellwig 2016-10-17 21:16 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig 2016-10-17 21:35 ` fix locking for the reflink operation Darrick J. Wong 4 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw) To: linux-xfs; +Cc: darrick.wong We need the iolock protection to stabilizie the IS_SWAPFILE and IS_IMMUTABLE values, as well as preventing new buffered writers re-dirtying the file data that we just wrote out. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 62 ++++++++++++++++++++++++++++++++++------------------ fs/xfs/xfs_reflink.c | 15 ------------- 2 files changed, 41 insertions(+), 36 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 012a960..0960264 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -996,38 +996,54 @@ xfs_file_share_range( inode_out = file_inode(file_out); bs = inode_out->i_sb->s_blocksize; + /* Lock both files against IO */ + same_inode = (inode_in == inode_out); + if (same_inode) { + xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); + xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); + } else { + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), + XFS_IOLOCK_EXCL); + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), + XFS_MMAPLOCK_EXCL); + } + /* Don't touch certain kinds of inodes */ + ret = -EPERM; if (IS_IMMUTABLE(inode_out)) - return -EPERM; - if (IS_SWAPFILE(inode_in) || - IS_SWAPFILE(inode_out)) - return -ETXTBSY; - - same_inode = (inode_in == inode_out); + goto out_unlock; + ret = -ETXTBSY; + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) + goto out_unlock; /* Don't reflink dirs, pipes, sockets... */ + ret = -EISDIR; if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) - return -EISDIR; + goto out_unlock; + ret = -EINVAL; if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) - return -EINVAL; + goto out_unlock; if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) - return -EINVAL; + goto out_unlock; /* Don't share DAX file data for now. */ if (IS_DAX(inode_in) || IS_DAX(inode_out)) - return -EINVAL; + goto out_unlock; /* Are we going all the way to the end? */ isize = i_size_read(inode_in); - if (isize == 0) - return 0; + if (isize == 0) { + ret = 0; + goto out_unlock; + } + if (len == 0) len = isize - pos_in; /* Ensure offsets don't wrap and the input is inside i_size */ if (pos_in + len < pos_in || pos_out + len < pos_out || pos_in + len > isize) - return -EINVAL; + goto out_unlock; /* Don't allow dedupe past EOF in the dest file */ if (is_dedupe) { @@ -1035,7 +1051,7 @@ xfs_file_share_range( disize = i_size_read(inode_out); if (pos_out >= disize || pos_out + len > disize) - return -EINVAL; + goto out_unlock; } /* If we're linking to EOF, continue to the block boundary. */ @@ -1047,28 +1063,32 @@ xfs_file_share_range( /* Only reflink if we're aligned to block boundaries */ if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) - return -EINVAL; + goto out_unlock; /* Don't allow overlapped reflink within the same file */ if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) - return -EINVAL; + goto out_unlock; /* Wait for the completion of any pending IOs on srcfile */ ret = xfs_file_wait_for_io(inode_in, pos_in, len); if (ret) - goto out; + goto out_unlock; ret = xfs_file_wait_for_io(inode_out, pos_out, len); if (ret) - goto out; + goto out_unlock; if (is_dedupe) flags |= XFS_REFLINK_DEDUPE; ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), pos_out, len, flags); - if (ret < 0) - goto out; -out: +out_unlock: + xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); + xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); + if (!same_inode) { + xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); + xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); + } return ret; } diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 5965e94..d012746 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1341,15 +1341,6 @@ xfs_reflink_remap_range( trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); - /* Lock both files against IO */ - if (src->i_ino == dest->i_ino) { - xfs_ilock(src, XFS_IOLOCK_EXCL); - xfs_ilock(src, XFS_MMAPLOCK_EXCL); - } else { - xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); - xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); - } - /* * Check that the extents are the same. */ @@ -1401,12 +1392,6 @@ xfs_reflink_remap_range( goto out_error; out_error: - xfs_iunlock(src, XFS_MMAPLOCK_EXCL); - xfs_iunlock(src, XFS_IOLOCK_EXCL); - if (src->i_ino != dest->i_ino) { - xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); - xfs_iunlock(dest, XFS_IOLOCK_EXCL); - } if (error) trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); return error; -- 2.1.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range 2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:16 ` Darrick J. Wong 0 siblings, 0 replies; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:16 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:19PM +0200, Christoph Hellwig wrote: > We need the iolock protection to stabilizie the IS_SWAPFILE and IS_IMMUTABLE > values, as well as preventing new buffered writers re-dirtying the file data > that we just wrote out. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > fs/xfs/xfs_file.c | 62 ++++++++++++++++++++++++++++++++++------------------ > fs/xfs/xfs_reflink.c | 15 ------------- > 2 files changed, 41 insertions(+), 36 deletions(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 012a960..0960264 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -996,38 +996,54 @@ xfs_file_share_range( > inode_out = file_inode(file_out); > bs = inode_out->i_sb->s_blocksize; > > + /* Lock both files against IO */ > + same_inode = (inode_in == inode_out); > + if (same_inode) { > + xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); > + xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); > + } else { > + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), > + XFS_IOLOCK_EXCL); > + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), > + XFS_MMAPLOCK_EXCL); > + } > + > /* Don't touch certain kinds of inodes */ > + ret = -EPERM; > if (IS_IMMUTABLE(inode_out)) > - return -EPERM; > - if (IS_SWAPFILE(inode_in) || > - IS_SWAPFILE(inode_out)) > - return -ETXTBSY; > - > - same_inode = (inode_in == inode_out); > + goto out_unlock; > + ret = -ETXTBSY; > + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) > + goto out_unlock; > > /* Don't reflink dirs, pipes, sockets... */ > + ret = -EISDIR; > if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) > - return -EISDIR; > + goto out_unlock; > + ret = -EINVAL; > if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) > - return -EINVAL; > + goto out_unlock; > if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) > - return -EINVAL; > + goto out_unlock; > > /* Don't share DAX file data for now. */ > if (IS_DAX(inode_in) || IS_DAX(inode_out)) > - return -EINVAL; > + goto out_unlock; > > /* Are we going all the way to the end? */ > isize = i_size_read(inode_in); > - if (isize == 0) > - return 0; > + if (isize == 0) { > + ret = 0; > + goto out_unlock; > + } > + > if (len == 0) > len = isize - pos_in; > > /* Ensure offsets don't wrap and the input is inside i_size */ > if (pos_in + len < pos_in || pos_out + len < pos_out || > pos_in + len > isize) > - return -EINVAL; > + goto out_unlock; > > /* Don't allow dedupe past EOF in the dest file */ > if (is_dedupe) { > @@ -1035,7 +1051,7 @@ xfs_file_share_range( > > disize = i_size_read(inode_out); > if (pos_out >= disize || pos_out + len > disize) > - return -EINVAL; > + goto out_unlock; > } > > /* If we're linking to EOF, continue to the block boundary. */ > @@ -1047,28 +1063,32 @@ xfs_file_share_range( > /* Only reflink if we're aligned to block boundaries */ > if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || > !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) > - return -EINVAL; > + goto out_unlock; > > /* Don't allow overlapped reflink within the same file */ > if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) > - return -EINVAL; > + goto out_unlock; > > /* Wait for the completion of any pending IOs on srcfile */ > ret = xfs_file_wait_for_io(inode_in, pos_in, len); > if (ret) > - goto out; > + goto out_unlock; > ret = xfs_file_wait_for_io(inode_out, pos_out, len); > if (ret) > - goto out; > + goto out_unlock; > > if (is_dedupe) > flags |= XFS_REFLINK_DEDUPE; > ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), > pos_out, len, flags); > - if (ret < 0) > - goto out; > > -out: > +out_unlock: > + xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); > + xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); > + if (!same_inode) { > + xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); > + xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); > + } > return ret; > } > > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c > index 5965e94..d012746 100644 > --- a/fs/xfs/xfs_reflink.c > +++ b/fs/xfs/xfs_reflink.c > @@ -1341,15 +1341,6 @@ xfs_reflink_remap_range( > > trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); > > - /* Lock both files against IO */ > - if (src->i_ino == dest->i_ino) { > - xfs_ilock(src, XFS_IOLOCK_EXCL); > - xfs_ilock(src, XFS_MMAPLOCK_EXCL); > - } else { > - xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); > - xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); > - } > - > /* > * Check that the extents are the same. > */ > @@ -1401,12 +1392,6 @@ xfs_reflink_remap_range( > goto out_error; > > out_error: > - xfs_iunlock(src, XFS_MMAPLOCK_EXCL); > - xfs_iunlock(src, XFS_IOLOCK_EXCL); > - if (src->i_ino != dest->i_ino) { > - xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); > - xfs_iunlock(dest, XFS_IOLOCK_EXCL); > - } > if (error) > trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); > return error; > -- > 2.1.4 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig ` (2 preceding siblings ...) 2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig @ 2016-10-17 12:05 ` Christoph Hellwig 2016-10-17 21:29 ` Darrick J. Wong 2016-10-17 21:35 ` fix locking for the reflink operation Darrick J. Wong 4 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw) To: linux-xfs; +Cc: darrick.wong There is no clear division of responsibility between those functions, so just merge them into one to keep the code simple. Also move xfs_file_wait_for_io to xfs_reflink.c together with its only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 145 ------------------------------------- fs/xfs/xfs_reflink.c | 199 ++++++++++++++++++++++++++++++++++++++++----------- fs/xfs/xfs_reflink.h | 8 +-- 3 files changed, 161 insertions(+), 191 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 0960264..cd37573 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -947,151 +947,6 @@ xfs_file_fallocate( return error; } -/* - * Flush all file writes out to disk. - */ -static int -xfs_file_wait_for_io( - struct inode *inode, - loff_t offset, - size_t len) -{ - loff_t rounding; - loff_t ioffset; - loff_t iendoffset; - loff_t bs; - int ret; - - bs = inode->i_sb->s_blocksize; - inode_dio_wait(inode); - - rounding = max_t(xfs_off_t, bs, PAGE_SIZE); - ioffset = round_down(offset, rounding); - iendoffset = round_up(offset + len, rounding) - 1; - ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, - iendoffset); - return ret; -} - -/* Hook up to the VFS reflink function */ -STATIC int -xfs_file_share_range( - struct file *file_in, - loff_t pos_in, - struct file *file_out, - loff_t pos_out, - u64 len, - bool is_dedupe) -{ - struct inode *inode_in; - struct inode *inode_out; - ssize_t ret; - loff_t bs; - loff_t isize; - int same_inode; - loff_t blen; - unsigned int flags = 0; - - inode_in = file_inode(file_in); - inode_out = file_inode(file_out); - bs = inode_out->i_sb->s_blocksize; - - /* Lock both files against IO */ - same_inode = (inode_in == inode_out); - if (same_inode) { - xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); - xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); - } else { - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), - XFS_IOLOCK_EXCL); - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), - XFS_MMAPLOCK_EXCL); - } - - /* Don't touch certain kinds of inodes */ - ret = -EPERM; - if (IS_IMMUTABLE(inode_out)) - goto out_unlock; - ret = -ETXTBSY; - if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) - goto out_unlock; - - /* Don't reflink dirs, pipes, sockets... */ - ret = -EISDIR; - if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) - goto out_unlock; - ret = -EINVAL; - if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) - goto out_unlock; - if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) - goto out_unlock; - - /* Don't share DAX file data for now. */ - if (IS_DAX(inode_in) || IS_DAX(inode_out)) - goto out_unlock; - - /* Are we going all the way to the end? */ - isize = i_size_read(inode_in); - if (isize == 0) { - ret = 0; - goto out_unlock; - } - - if (len == 0) - len = isize - pos_in; - - /* Ensure offsets don't wrap and the input is inside i_size */ - if (pos_in + len < pos_in || pos_out + len < pos_out || - pos_in + len > isize) - goto out_unlock; - - /* Don't allow dedupe past EOF in the dest file */ - if (is_dedupe) { - loff_t disize; - - disize = i_size_read(inode_out); - if (pos_out >= disize || pos_out + len > disize) - goto out_unlock; - } - - /* If we're linking to EOF, continue to the block boundary. */ - if (pos_in + len == isize) - blen = ALIGN(isize, bs) - pos_in; - else - blen = len; - - /* Only reflink if we're aligned to block boundaries */ - if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || - !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) - goto out_unlock; - - /* Don't allow overlapped reflink within the same file */ - if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) - goto out_unlock; - - /* Wait for the completion of any pending IOs on srcfile */ - ret = xfs_file_wait_for_io(inode_in, pos_in, len); - if (ret) - goto out_unlock; - ret = xfs_file_wait_for_io(inode_out, pos_out, len); - if (ret) - goto out_unlock; - - if (is_dedupe) - flags |= XFS_REFLINK_DEDUPE; - ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), - pos_out, len, flags); - -out_unlock: - xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); - xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); - if (!same_inode) { - xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); - xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); - } - return ret; -} - STATIC ssize_t xfs_file_copy_range( struct file *file_in, diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index d012746..11ed2ad 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1308,23 +1308,56 @@ xfs_compare_extents( } /* + * Flush all file writes out to disk. + */ +static int +xfs_file_wait_for_io( + struct inode *inode, + loff_t offset, + size_t len) +{ + loff_t rounding; + loff_t ioffset; + loff_t iendoffset; + loff_t bs; + int ret; + + bs = inode->i_sb->s_blocksize; + inode_dio_wait(inode); + + rounding = max_t(xfs_off_t, bs, PAGE_SIZE); + ioffset = round_down(offset, rounding); + iendoffset = round_up(offset + len, rounding) - 1; + ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, + iendoffset); + return ret; +} + +/* * Link a range of blocks from one file to another. */ int -xfs_reflink_remap_range( - struct xfs_inode *src, - xfs_off_t srcoff, - struct xfs_inode *dest, - xfs_off_t destoff, - xfs_off_t len, - unsigned int flags) +xfs_file_share_range( + struct file *file_in, + loff_t pos_in, + struct file *file_out, + loff_t pos_out, + u64 len, + bool is_dedupe) { + struct inode *inode_in = file_inode(file_in); + struct xfs_inode *src = XFS_I(inode_in); + struct inode *inode_out = file_inode(file_out); + struct xfs_inode *dest = XFS_I(inode_out); struct xfs_mount *mp = src->i_mount; + loff_t bs = inode_out->i_sb->s_blocksize; + bool same_inode = (inode_in == inode_out); xfs_fileoff_t sfsbno, dfsbno; xfs_filblks_t fsblen; - int error; xfs_extlen_t cowextsize; - bool is_same; + loff_t isize; + ssize_t ret; + loff_t blen; if (!xfs_sb_version_hasreflink(&mp->m_sb)) return -EOPNOTSUPP; @@ -1332,48 +1365,128 @@ xfs_reflink_remap_range( if (XFS_FORCED_SHUTDOWN(mp)) return -EIO; + /* Lock both files against IO */ + if (same_inode) { + xfs_ilock(src, XFS_IOLOCK_EXCL); + xfs_ilock(src, XFS_MMAPLOCK_EXCL); + } else { + xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); + xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); + } + + /* Don't touch certain kinds of inodes */ + ret = -EPERM; + if (IS_IMMUTABLE(inode_out)) + goto out_unlock; + + ret = -ETXTBSY; + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) + goto out_unlock; + + + /* Don't reflink dirs, pipes, sockets... */ + ret = -EISDIR; + if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) + goto out_unlock; + ret = -EINVAL; + if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) + goto out_unlock; + if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) + goto out_unlock; + /* Don't reflink realtime inodes */ if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest)) - return -EINVAL; + goto out_unlock; + + /* Don't share DAX file data for now. */ + if (IS_DAX(inode_in) || IS_DAX(inode_out)) + goto out_unlock; + + /* Are we going all the way to the end? */ + isize = i_size_read(inode_in); + if (isize == 0) { + ret = 0; + goto out_unlock; + } + + if (len == 0) + len = isize - pos_in; - if (flags & ~XFS_REFLINK_ALL) - return -EINVAL; + /* Ensure offsets don't wrap and the input is inside i_size */ + if (pos_in + len < pos_in || pos_out + len < pos_out || + pos_in + len > isize) + goto out_unlock; + + /* Don't allow dedupe past EOF in the dest file */ + if (is_dedupe) { + loff_t disize; + + disize = i_size_read(inode_out); + if (pos_out >= disize || pos_out + len > disize) + goto out_unlock; + } - trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); + /* If we're linking to EOF, continue to the block boundary. */ + if (pos_in + len == isize) + blen = ALIGN(isize, bs) - pos_in; + else + blen = len; + + /* Only reflink if we're aligned to block boundaries */ + if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || + !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) + goto out_unlock; + + /* Don't allow overlapped reflink within the same file */ + if (same_inode) { + if (pos_out + blen > pos_in && pos_out < pos_in + blen) + goto out_unlock; + } + + /* Wait for the completion of any pending IOs on srcfile */ + ret = xfs_file_wait_for_io(inode_in, pos_in, len); + if (ret) + goto out_unlock; + ret = xfs_file_wait_for_io(inode_out, pos_out, len); + if (ret) + goto out_unlock; + + trace_xfs_reflink_remap_range(src, pos_in, len, dest, pos_out); /* * Check that the extents are the same. */ - if (flags & XFS_REFLINK_DEDUPE) { - is_same = false; - error = xfs_compare_extents(VFS_I(src), srcoff, VFS_I(dest), - destoff, len, &is_same); - if (error) - goto out_error; + if (is_dedupe) { + bool is_same = false; + + ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out, + len, &is_same); + if (ret) + goto out_unlock;; if (!is_same) { - error = -EBADE; - goto out_error; + ret = -EBADE; + goto out_unlock; } } - error = xfs_reflink_set_inode_flag(src, dest); - if (error) - goto out_error; + ret = xfs_reflink_set_inode_flag(src, dest); + if (ret) + goto out_unlock; /* * Invalidate the page cache so that we can clear any CoW mappings * in the destination file. */ - truncate_inode_pages_range(&VFS_I(dest)->i_data, destoff, - PAGE_ALIGN(destoff + len) - 1); + truncate_inode_pages_range(&inode_out->i_data, pos_out, + PAGE_ALIGN(pos_out + len) - 1); - dfsbno = XFS_B_TO_FSBT(mp, destoff); - sfsbno = XFS_B_TO_FSBT(mp, srcoff); + dfsbno = XFS_B_TO_FSBT(mp, pos_out); + sfsbno = XFS_B_TO_FSBT(mp, pos_in); fsblen = XFS_B_TO_FSB(mp, len); - error = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, - destoff + len); - if (error) - goto out_error; + ret = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, + pos_out + len); + if (ret) + goto out_unlock; /* * Carry the cowextsize hint from src to dest if we're sharing the @@ -1381,20 +1494,24 @@ xfs_reflink_remap_range( * has a cowextsize hint, and the destination file does not. */ cowextsize = 0; - if (srcoff == 0 && len == i_size_read(VFS_I(src)) && + if (pos_in == 0 && len == i_size_read(inode_in) && (src->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE) && - destoff == 0 && len >= i_size_read(VFS_I(dest)) && + pos_out == 0 && len >= i_size_read(inode_out) && !(dest->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE)) cowextsize = src->i_d.di_cowextsize; - error = xfs_reflink_update_dest(dest, destoff + len, cowextsize); - if (error) - goto out_error; + ret = xfs_reflink_update_dest(dest, pos_out + len, cowextsize); -out_error: - if (error) - trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); - return error; +out_unlock: + xfs_iunlock(src, XFS_MMAPLOCK_EXCL); + xfs_iunlock(src, XFS_IOLOCK_EXCL); + if (src->i_ino != dest->i_ino) { + xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); + xfs_iunlock(dest, XFS_IOLOCK_EXCL); + } + if (ret) + trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_); + return ret; } /* diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 5dc3c8a..25a8956 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -43,11 +43,6 @@ extern int xfs_reflink_cancel_cow_range(struct xfs_inode *ip, xfs_off_t offset, extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset, xfs_off_t count); extern int xfs_reflink_recover_cow(struct xfs_mount *mp); -#define XFS_REFLINK_DEDUPE 1 /* only reflink if contents match */ -#define XFS_REFLINK_ALL (XFS_REFLINK_DEDUPE) -extern int xfs_reflink_remap_range(struct xfs_inode *src, xfs_off_t srcoff, - struct xfs_inode *dest, xfs_off_t destoff, xfs_off_t len, - unsigned int flags); extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip, struct xfs_trans **tpp); extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset, @@ -55,4 +50,7 @@ extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset, extern bool xfs_reflink_has_real_cow_blocks(struct xfs_inode *ip); +int xfs_file_share_range(struct file *file_in, loff_t pos_in, + struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe); + #endif /* __XFS_REFLINK_H */ -- 2.1.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range 2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:29 ` Darrick J. Wong 2016-10-18 5:17 ` Christoph Hellwig 0 siblings, 1 reply; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:29 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:20PM +0200, Christoph Hellwig wrote: > There is no clear division of responsibility between those functions, so > just merge them into one to keep the code simple. Also move > xfs_file_wait_for_io to xfs_reflink.c together with its only caller. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_file.c | 145 ------------------------------------- > fs/xfs/xfs_reflink.c | 199 ++++++++++++++++++++++++++++++++++++++++----------- > fs/xfs/xfs_reflink.h | 8 +-- > 3 files changed, 161 insertions(+), 191 deletions(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 0960264..cd37573 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -947,151 +947,6 @@ xfs_file_fallocate( > return error; > } > > -/* > - * Flush all file writes out to disk. > - */ > -static int > -xfs_file_wait_for_io( > - struct inode *inode, > - loff_t offset, > - size_t len) > -{ > - loff_t rounding; > - loff_t ioffset; > - loff_t iendoffset; > - loff_t bs; > - int ret; > - > - bs = inode->i_sb->s_blocksize; > - inode_dio_wait(inode); > - > - rounding = max_t(xfs_off_t, bs, PAGE_SIZE); > - ioffset = round_down(offset, rounding); > - iendoffset = round_up(offset + len, rounding) - 1; > - ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, > - iendoffset); > - return ret; > -} > - > -/* Hook up to the VFS reflink function */ > -STATIC int > -xfs_file_share_range( > - struct file *file_in, > - loff_t pos_in, > - struct file *file_out, > - loff_t pos_out, > - u64 len, > - bool is_dedupe) > -{ > - struct inode *inode_in; > - struct inode *inode_out; > - ssize_t ret; > - loff_t bs; > - loff_t isize; > - int same_inode; > - loff_t blen; > - unsigned int flags = 0; > - > - inode_in = file_inode(file_in); > - inode_out = file_inode(file_out); > - bs = inode_out->i_sb->s_blocksize; > - > - /* Lock both files against IO */ > - same_inode = (inode_in == inode_out); > - if (same_inode) { > - xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); > - xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); > - } else { > - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), > - XFS_IOLOCK_EXCL); > - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), > - XFS_MMAPLOCK_EXCL); > - } > - > - /* Don't touch certain kinds of inodes */ > - ret = -EPERM; > - if (IS_IMMUTABLE(inode_out)) > - goto out_unlock; > - ret = -ETXTBSY; > - if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) > - goto out_unlock; > - > - /* Don't reflink dirs, pipes, sockets... */ > - ret = -EISDIR; > - if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) > - goto out_unlock; > - ret = -EINVAL; > - if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) > - goto out_unlock; > - if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) > - goto out_unlock; > - > - /* Don't share DAX file data for now. */ > - if (IS_DAX(inode_in) || IS_DAX(inode_out)) > - goto out_unlock; > - > - /* Are we going all the way to the end? */ > - isize = i_size_read(inode_in); > - if (isize == 0) { > - ret = 0; > - goto out_unlock; > - } > - > - if (len == 0) > - len = isize - pos_in; > - > - /* Ensure offsets don't wrap and the input is inside i_size */ > - if (pos_in + len < pos_in || pos_out + len < pos_out || > - pos_in + len > isize) > - goto out_unlock; > - > - /* Don't allow dedupe past EOF in the dest file */ > - if (is_dedupe) { > - loff_t disize; > - > - disize = i_size_read(inode_out); > - if (pos_out >= disize || pos_out + len > disize) > - goto out_unlock; > - } > - > - /* If we're linking to EOF, continue to the block boundary. */ > - if (pos_in + len == isize) > - blen = ALIGN(isize, bs) - pos_in; > - else > - blen = len; > - > - /* Only reflink if we're aligned to block boundaries */ > - if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || > - !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) > - goto out_unlock; > - > - /* Don't allow overlapped reflink within the same file */ > - if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) > - goto out_unlock; > - > - /* Wait for the completion of any pending IOs on srcfile */ > - ret = xfs_file_wait_for_io(inode_in, pos_in, len); > - if (ret) > - goto out_unlock; > - ret = xfs_file_wait_for_io(inode_out, pos_out, len); > - if (ret) > - goto out_unlock; > - > - if (is_dedupe) > - flags |= XFS_REFLINK_DEDUPE; > - ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), > - pos_out, len, flags); > - > -out_unlock: > - xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); > - xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); > - if (!same_inode) { > - xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); > - xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); > - } > - return ret; > -} > - > STATIC ssize_t > xfs_file_copy_range( > struct file *file_in, > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c > index d012746..11ed2ad 100644 > --- a/fs/xfs/xfs_reflink.c > +++ b/fs/xfs/xfs_reflink.c > @@ -1308,23 +1308,56 @@ xfs_compare_extents( > } > > /* > + * Flush all file writes out to disk. > + */ > +static int > +xfs_file_wait_for_io( > + struct inode *inode, > + loff_t offset, > + size_t len) > +{ > + loff_t rounding; > + loff_t ioffset; > + loff_t iendoffset; > + loff_t bs; > + int ret; > + > + bs = inode->i_sb->s_blocksize; > + inode_dio_wait(inode); > + > + rounding = max_t(xfs_off_t, bs, PAGE_SIZE); > + ioffset = round_down(offset, rounding); > + iendoffset = round_up(offset + len, rounding) - 1; > + ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, > + iendoffset); > + return ret; > +} This seems like a file action not specific to reflink. > + > +/* > * Link a range of blocks from one file to another. > */ > int > -xfs_reflink_remap_range( > - struct xfs_inode *src, > - xfs_off_t srcoff, > - struct xfs_inode *dest, > - xfs_off_t destoff, > - xfs_off_t len, > - unsigned int flags) > +xfs_file_share_range( xfs_reflink_share_file_range() ? I'd like to maintain the convention that all reflink functions start with xfs_reflink_*, particularly since the xfs_file_* functions largely live in xfs_file.c. > + struct file *file_in, > + loff_t pos_in, > + struct file *file_out, > + loff_t pos_out, > + u64 len, > + bool is_dedupe) > { > + struct inode *inode_in = file_inode(file_in); > + struct xfs_inode *src = XFS_I(inode_in); > + struct inode *inode_out = file_inode(file_out); > + struct xfs_inode *dest = XFS_I(inode_out); > struct xfs_mount *mp = src->i_mount; > + loff_t bs = inode_out->i_sb->s_blocksize; > + bool same_inode = (inode_in == inode_out); > xfs_fileoff_t sfsbno, dfsbno; > xfs_filblks_t fsblen; > - int error; > xfs_extlen_t cowextsize; > - bool is_same; > + loff_t isize; > + ssize_t ret; > + loff_t blen; > > if (!xfs_sb_version_hasreflink(&mp->m_sb)) > return -EOPNOTSUPP; > @@ -1332,48 +1365,128 @@ xfs_reflink_remap_range( > if (XFS_FORCED_SHUTDOWN(mp)) > return -EIO; > > + /* Lock both files against IO */ > + if (same_inode) { > + xfs_ilock(src, XFS_IOLOCK_EXCL); > + xfs_ilock(src, XFS_MMAPLOCK_EXCL); > + } else { > + xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); > + xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); > + } > + > + /* Don't touch certain kinds of inodes */ > + ret = -EPERM; > + if (IS_IMMUTABLE(inode_out)) > + goto out_unlock; > + > + ret = -ETXTBSY; > + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) > + goto out_unlock; > + > + > + /* Don't reflink dirs, pipes, sockets... */ > + ret = -EISDIR; > + if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) > + goto out_unlock; > + ret = -EINVAL; > + if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) > + goto out_unlock; > + if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) > + goto out_unlock; > + > /* Don't reflink realtime inodes */ > if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest)) > - return -EINVAL; > + goto out_unlock; > + > + /* Don't share DAX file data for now. */ > + if (IS_DAX(inode_in) || IS_DAX(inode_out)) > + goto out_unlock; > + > + /* Are we going all the way to the end? */ > + isize = i_size_read(inode_in); > + if (isize == 0) { > + ret = 0; > + goto out_unlock; > + } > + > + if (len == 0) > + len = isize - pos_in; > > - if (flags & ~XFS_REFLINK_ALL) > - return -EINVAL; > + /* Ensure offsets don't wrap and the input is inside i_size */ > + if (pos_in + len < pos_in || pos_out + len < pos_out || > + pos_in + len > isize) > + goto out_unlock; > + > + /* Don't allow dedupe past EOF in the dest file */ > + if (is_dedupe) { > + loff_t disize; > + > + disize = i_size_read(inode_out); > + if (pos_out >= disize || pos_out + len > disize) > + goto out_unlock; > + } > > - trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); > + /* If we're linking to EOF, continue to the block boundary. */ > + if (pos_in + len == isize) > + blen = ALIGN(isize, bs) - pos_in; > + else > + blen = len; > + > + /* Only reflink if we're aligned to block boundaries */ > + if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || > + !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) > + goto out_unlock; > + > + /* Don't allow overlapped reflink within the same file */ > + if (same_inode) { > + if (pos_out + blen > pos_in && pos_out < pos_in + blen) > + goto out_unlock; > + } > + > + /* Wait for the completion of any pending IOs on srcfile */ > + ret = xfs_file_wait_for_io(inode_in, pos_in, len); > + if (ret) > + goto out_unlock; > + ret = xfs_file_wait_for_io(inode_out, pos_out, len); > + if (ret) > + goto out_unlock; > + > + trace_xfs_reflink_remap_range(src, pos_in, len, dest, pos_out); > > /* > * Check that the extents are the same. > */ > - if (flags & XFS_REFLINK_DEDUPE) { > - is_same = false; > - error = xfs_compare_extents(VFS_I(src), srcoff, VFS_I(dest), > - destoff, len, &is_same); > - if (error) > - goto out_error; > + if (is_dedupe) { > + bool is_same = false; > + > + ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out, > + len, &is_same); > + if (ret) > + goto out_unlock;; Double-semicolon here. > if (!is_same) { > - error = -EBADE; > - goto out_error; > + ret = -EBADE; > + goto out_unlock; > } > } > > - error = xfs_reflink_set_inode_flag(src, dest); > - if (error) > - goto out_error; > + ret = xfs_reflink_set_inode_flag(src, dest); > + if (ret) > + goto out_unlock; > > /* > * Invalidate the page cache so that we can clear any CoW mappings > * in the destination file. > */ > - truncate_inode_pages_range(&VFS_I(dest)->i_data, destoff, > - PAGE_ALIGN(destoff + len) - 1); > + truncate_inode_pages_range(&inode_out->i_data, pos_out, > + PAGE_ALIGN(pos_out + len) - 1); > > - dfsbno = XFS_B_TO_FSBT(mp, destoff); > - sfsbno = XFS_B_TO_FSBT(mp, srcoff); > + dfsbno = XFS_B_TO_FSBT(mp, pos_out); > + sfsbno = XFS_B_TO_FSBT(mp, pos_in); > fsblen = XFS_B_TO_FSB(mp, len); > - error = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, > - destoff + len); > - if (error) > - goto out_error; > + ret = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, > + pos_out + len); > + if (ret) > + goto out_unlock; > > /* > * Carry the cowextsize hint from src to dest if we're sharing the > @@ -1381,20 +1494,24 @@ xfs_reflink_remap_range( > * has a cowextsize hint, and the destination file does not. > */ > cowextsize = 0; > - if (srcoff == 0 && len == i_size_read(VFS_I(src)) && > + if (pos_in == 0 && len == i_size_read(inode_in) && > (src->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE) && > - destoff == 0 && len >= i_size_read(VFS_I(dest)) && > + pos_out == 0 && len >= i_size_read(inode_out) && > !(dest->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE)) > cowextsize = src->i_d.di_cowextsize; > > - error = xfs_reflink_update_dest(dest, destoff + len, cowextsize); > - if (error) > - goto out_error; > + ret = xfs_reflink_update_dest(dest, pos_out + len, cowextsize); > > -out_error: > - if (error) > - trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); > - return error; > +out_unlock: > + xfs_iunlock(src, XFS_MMAPLOCK_EXCL); > + xfs_iunlock(src, XFS_IOLOCK_EXCL); > + if (src->i_ino != dest->i_ino) { > + xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); > + xfs_iunlock(dest, XFS_IOLOCK_EXCL); > + } > + if (ret) > + trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_); > + return ret; > } > > /* > diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h > index 5dc3c8a..25a8956 100644 > --- a/fs/xfs/xfs_reflink.h > +++ b/fs/xfs/xfs_reflink.h > @@ -43,11 +43,6 @@ extern int xfs_reflink_cancel_cow_range(struct xfs_inode *ip, xfs_off_t offset, > extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset, > xfs_off_t count); > extern int xfs_reflink_recover_cow(struct xfs_mount *mp); > -#define XFS_REFLINK_DEDUPE 1 /* only reflink if contents match */ > -#define XFS_REFLINK_ALL (XFS_REFLINK_DEDUPE) Hooray, uglyflags went away! :) Excepting the minor things I complained about, the rest is Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > -extern int xfs_reflink_remap_range(struct xfs_inode *src, xfs_off_t srcoff, > - struct xfs_inode *dest, xfs_off_t destoff, xfs_off_t len, > - unsigned int flags); > extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip, > struct xfs_trans **tpp); > extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset, > @@ -55,4 +50,7 @@ extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset, > > extern bool xfs_reflink_has_real_cow_blocks(struct xfs_inode *ip); > > +int xfs_file_share_range(struct file *file_in, loff_t pos_in, > + struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe); > + > #endif /* __XFS_REFLINK_H */ > -- > 2.1.4 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range 2016-10-17 21:29 ` Darrick J. Wong @ 2016-10-18 5:17 ` Christoph Hellwig 0 siblings, 0 replies; 11+ messages in thread From: Christoph Hellwig @ 2016-10-18 5:17 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs On Mon, Oct 17, 2016 at 02:29:33PM -0700, Darrick J. Wong wrote: > > + bs = inode->i_sb->s_blocksize; > > + inode_dio_wait(inode); > > + > > + rounding = max_t(xfs_off_t, bs, PAGE_SIZE); > > + ioffset = round_down(offset, rounding); > > + iendoffset = round_up(offset + len, rounding) - 1; > > + ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, > > + iendoffset); > > + return ret; > > +} > > This seems like a file action not specific to reflink. But it's only used by the reflink code :) That being said given that filemap_write_and_wait_range operates on pages there is no need for the rounding anyway, and we could just replace it with open coded calls to inode_dio_wait and filemap_write_and_wait_range. Maybe I should do that before this patch so we don't have to bother moving it. > > +xfs_file_share_range( > > xfs_reflink_share_file_range() ? > > I'd like to maintain the convention that all reflink functions > start with xfs_reflink_*, particularly since the xfs_file_* functions > largely live in xfs_file.c. Ok, fine. > > + if (is_dedupe) { > > + bool is_same = false; > > + > > + ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out, > > + len, &is_same); > > + if (ret) > > + goto out_unlock;; > > Double-semicolon here. I'll fix it. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: fix locking for the reflink operation 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig ` (3 preceding siblings ...) 2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:35 ` Darrick J. Wong 4 siblings, 0 replies; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:35 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:16PM +0200, Christoph Hellwig wrote: > When creating a reflink we need to take the iolock much earlier, as > various early checks done in xfs_file_share_range currently are racy > without it. Patches 1-3 sort that out in a minimal invasive way, > but I think we should just merge xfs_file_share_range and > xfs_reflink_remap_range, which is what patch 4 does. > > Patches 1-3 are something I'd like to see in 4.9, patch 4 might not > fully qualify, but just getting it in might make everyones life easier. This series (+ the CoW optimization series before it) seem to run ok here. I'm ok with (more soak testing and) sending it in for 4.9. --D > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-10-18 5:18 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig 2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig 2016-10-17 21:11 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig 2016-10-17 21:12 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig 2016-10-17 21:16 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig 2016-10-17 21:29 ` Darrick J. Wong 2016-10-18 5:17 ` Christoph Hellwig 2016-10-17 21:35 ` fix locking for the reflink operation Darrick J. Wong
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).