All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls
Date: Sun, 6 Dec 2020 15:24:54 -0800	[thread overview]
Message-ID: <20201206232454.GL629293@magnolia> (raw)
In-Reply-To: <20201201152548.GB1205666@bfoster>

On Tue, Dec 01, 2020 at 10:25:48AM -0500, Brian Foster wrote:
> On Mon, Nov 30, 2020 at 07:37:16PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Add a copy_file_range handler to XFS so that we can accelerate file
> > copies with reflink when the source and destination ranges are not
> > block-aligned.  We'll use the generic pagecache copy to handle the
> > unaligned edges and attempt to reflink the middle.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/xfs_file.c |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 99 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 5b0f93f73837..9d1bb0dc30e2 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -1119,6 +1119,104 @@ xfs_file_remap_range(
> >  	return remapped > 0 ? remapped : ret;
> >  }
> >  
> ...
> > +STATIC ssize_t
> > +xfs_file_copy_range(
> > +	struct file		*src_file,
> > +	loff_t			src_off,
> > +	struct file		*dst_file,
> > +	loff_t			dst_off,
> > +	size_t			len,
> > +	unsigned int		flags)
> > +{
> > +	struct inode		*inode_src = file_inode(src_file);
> > +	struct xfs_inode	*src = XFS_I(inode_src);
> > +	struct inode		*inode_dst = file_inode(dst_file);
> > +	struct xfs_inode	*dst = XFS_I(inode_dst);
> > +	struct xfs_mount	*mp = src->i_mount;
> > +	loff_t			copy_ret;
> > +	loff_t			next_block;
> > +	size_t			copy_len;
> > +	ssize_t			total_copied = 0;
> > +
> > +	/* Bypass all this if no copy acceleration is possible. */
> > +	if (!xfs_want_reflink_copy_range(src, src_off, dst, dst_off, len))
> > +		goto use_generic;
> > +
> > +	/* Use the regular copy until we're block aligned at the start. */
> > +	next_block = round_up(src_off + 1, mp->m_sb.sb_blocksize);
> 
> Why the +1? AFAICT this means we manually copy the first block if
> src_off does happen to be block aligned. Is this an assumption based on
> the caller attempting ->remap_file_range() first?

Yes.  The VFS always tries that first.

> BTW, if we do happen to be called in some (theoretical) corner case
> where remap doesn't work unrelated to alignment, it seems this would
> unconditionally break the manual copy into multiple parts (first block +
> the rest). It's not immediately clear to me if that's significant from a
> performance perspective,

I doubt it, since that's usually just copying around the pagecache.

> but I wonder if it would be nicer here to
> filter that out more explicitly. For example, run the remap checks on
> the block aligned offset/len first, or skip the remap if the caller has
> provided a block aligned start (i.e. hinting that remap failed for other
> reasons),

Yes, checking the block alignment is a good suggestion.  Will fix.

> or perhaps even implement this so it conditionally performs a
> short manual copy so the next retry would fall into ->remap_file_range()
> with aligned offsets, etc.

Hm.  That could be a thing too, though my opinion is that we should make
as much progress as we can before exiting the kernel.

--D

> Thoughts?
> 
> > +	copy_len = min_t(size_t, len, next_block - src_off);
> > +	if (copy_len > 0) {
> > +		copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
> > +					dst_off, copy_len, flags);
> > +		if (copy_ret < 0)
> > +			return copy_ret;
> > +
> > +		src_off += copy_ret;
> > +		dst_off += copy_ret;
> > +		len -= copy_ret;
> > +		total_copied += copy_ret;
> > +		if (copy_ret < copy_len || len == 0)
> > +			return total_copied;
> > +	}
> > +
> > +	/*
> > +	 * Now try to reflink as many full blocks as we can.  If the end of the
> > +	 * copy request wasn't block-aligned or the reflink fails, we'll just
> > +	 * fall into the generic copy to do the rest.
> > +	 */
> > +	copy_len = round_down(len, mp->m_sb.sb_blocksize);
> > +	if (copy_len > 0) {
> > +		copy_ret = xfs_file_remap_range(src_file, src_off, dst_file,
> > +				dst_off, copy_len, REMAP_FILE_CAN_SHORTEN);
> > +		if (copy_ret >= 0) {
> > +			src_off += copy_ret;
> > +			dst_off += copy_ret;
> > +			len -= copy_ret;
> > +			total_copied += copy_ret;
> > +			if (copy_ret < copy_len || len == 0)
> > +				return total_copied;
> 
> Any reason we return a potential short copy here, but fall into the
> manual copy if the reflink outright fails?
> 
> > +		}
> > +	}
> > +
> > +use_generic:
> > +	/* Use the regular copy to deal with leftover bytes. */
> > +	copy_ret = generic_copy_file_range(src_file, src_off, dst_file,
> > +			dst_off, len, flags);
> > +	if (copy_ret < 0)
> > +		return copy_ret;
> 
> Perhaps this should also check/return total_copied in the event we've
> already done some work..?
> 
> Brian
> 
> > +	return total_copied + copy_ret;
> > +}
> > +
> >  STATIC int
> >  xfs_file_open(
> >  	struct inode	*inode,
> > @@ -1381,6 +1479,7 @@ const struct file_operations xfs_file_operations = {
> >  	.get_unmapped_area = thp_get_unmapped_area,
> >  	.fallocate	= xfs_file_fallocate,
> >  	.fadvise	= xfs_file_fadvise,
> > +	.copy_file_range = xfs_file_copy_range,
> >  	.remap_file_range = xfs_file_remap_range,
> >  };
> >  
> > 
> 

  reply	other threads:[~2020-12-06 23:25 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-01  3:37 [PATCH 0/1] xfs: faster unaligned copy_file_range Darrick J. Wong
2020-12-01  3:37 ` [PATCH 1/1] xfs: use reflink to assist unaligned copy_file_range calls Darrick J. Wong
2020-12-01 10:02   ` Christoph Hellwig
2020-12-06 23:21     ` Darrick J. Wong
2020-12-07 14:20       ` Christoph Hellwig
2020-12-01 15:25   ` Brian Foster
2020-12-06 23:24     ` Darrick J. Wong [this message]
2020-12-07 14:01       ` Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201206232454.GL629293@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.