All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Luís Henriques" <lhenriques@suse.de>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Namjae Jeon <linkinjeon@kernel.org>,
	Olga Kornievskaia <olga.kornievskaia@gmail.com>,
	Luis Henriques <lhenriques@suse.com>,
	linux-fsdevel@vger.kernel.org, linux-cifs@vger.kernel.org,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH] vfs: fix copy_file_range() averts filesystem freeze protection
Date: Mon, 14 Nov 2022 11:33:18 +0000	[thread overview]
Message-ID: <877czx22kh.fsf@suse.de> (raw)
In-Reply-To: <20221110155522.556225-1-amir73il@gmail.com> (Amir Goldstein's message of "Thu, 10 Nov 2022 17:55:22 +0200")

Amir Goldstein <amir73il@gmail.com> writes:

> Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs
> copies") removed fallback to generic_copy_file_range() for cross-fs
> cases inside vfs_copy_file_range().
>
> To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to
> generic_copy_file_range() was added in nfsd and ksmbd code, but that
> call is missing sb_start_write(), fsnotify hooks and more.
>
> Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that
> will take care of the fallback, but that code would be subtle and we got
> vfs_copy_file_range() logic wrong too many times already.
>
> Instead, add a flag to explicitly request vfs_copy_file_range() to
> perform only generic_copy_file_range() and let nfsd and ksmbd use this
> flag only in the fallback path.
>
> This choise keeps the logic changes to minimum in the non-nfsd/ksmbd code
> paths to reduce the risk of further regressions.
>
> Fixes: 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies")
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>
> Hi Al,
>
> Another fix for the long tradition of copy_file_range() regressions.
> This one only affected cross-fs server-side-copy from nfsd/ksmbd.
>
> I ran the copy_range fstests group on ext4/xfs/overlay to verify no
> regressions in local fs and nfsv3/nfsv4 to test server-side-copy.
>
> I also patched copy_file_range() to test the nfsd fallback code on
> local fs.
>
> Namje, could you please test ksmbd.

For what is worth, I've also done some testing with ceph and I didn't saw
any regression either.  So, feel free to add my

Tested-by: Luís Henriques <lhenriques@suse.de>

Cheers,
-- 
Luís

>
> Thanks,
> Amir.
>
>  fs/ksmbd/vfs.c     |  6 +++---
>  fs/nfsd/vfs.c      |  4 ++--
>  fs/read_write.c    | 19 +++++++++++++++----
>  include/linux/fs.h |  8 ++++++++
>  4 files changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/fs/ksmbd/vfs.c b/fs/ksmbd/vfs.c
> index 8de970d6146f..94b8ed4ef870 100644
> --- a/fs/ksmbd/vfs.c
> +++ b/fs/ksmbd/vfs.c
> @@ -1794,9 +1794,9 @@ int ksmbd_vfs_copy_file_ranges(struct ksmbd_work *work,
>  		ret = vfs_copy_file_range(src_fp->filp, src_off,
>  					  dst_fp->filp, dst_off, len, 0);
>  		if (ret == -EOPNOTSUPP || ret == -EXDEV)
> -			ret = generic_copy_file_range(src_fp->filp, src_off,
> -						      dst_fp->filp, dst_off,
> -						      len, 0);
> +			ret = vfs_copy_file_range(src_fp->filp, src_off,
> +						  dst_fp->filp, dst_off, len,
> +						  COPY_FILE_SPLICE);
>  		if (ret < 0)
>  			return ret;
>  
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index f650afedd67f..5cf11cde51f8 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -596,8 +596,8 @@ ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst,
>  	ret = vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0);
>  
>  	if (ret == -EOPNOTSUPP || ret == -EXDEV)
> -		ret = generic_copy_file_range(src, src_pos, dst, dst_pos,
> -					      count, 0);
> +		ret = vfs_copy_file_range(src, src_pos, dst, dst_pos, count,
> +					  COPY_FILE_SPLICE);
>  	return ret;
>  }
>  
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 328ce8cf9a85..24b9668d6377 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1388,6 +1388,8 @@ ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
>  				struct file *file_out, loff_t pos_out,
>  				size_t len, unsigned int flags)
>  {
> +	lockdep_assert(sb_write_started(file_inode(file_out)->i_sb));
> +
>  	return do_splice_direct(file_in, &pos_in, file_out, &pos_out,
>  				len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
>  }
> @@ -1424,7 +1426,9 @@ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in,
>  	 * and several different sets of file_operations, but they all end up
>  	 * using the same ->copy_file_range() function pointer.
>  	 */
> -	if (file_out->f_op->copy_file_range) {
> +	if (flags & COPY_FILE_SPLICE) {
> +		/* cross sb splice is allowed */
> +	} else if (file_out->f_op->copy_file_range) {
>  		if (file_in->f_op->copy_file_range !=
>  		    file_out->f_op->copy_file_range)
>  			return -EXDEV;
> @@ -1474,8 +1478,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  			    size_t len, unsigned int flags)
>  {
>  	ssize_t ret;
> +	bool splice = flags & COPY_FILE_SPLICE;
>  
> -	if (flags != 0)
> +	if (flags & ~COPY_FILE_SPLICE)
>  		return -EINVAL;
>  
>  	ret = generic_copy_file_checks(file_in, pos_in, file_out, pos_out, &len,
> @@ -1501,14 +1506,14 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  	 * same sb using clone, but for filesystems where both clone and copy
>  	 * are supported (e.g. nfs,cifs), we only call the copy method.
>  	 */
> -	if (file_out->f_op->copy_file_range) {
> +	if (!splice && file_out->f_op->copy_file_range) {
>  		ret = file_out->f_op->copy_file_range(file_in, pos_in,
>  						      file_out, pos_out,
>  						      len, flags);
>  		goto done;
>  	}
>  
> -	if (file_in->f_op->remap_file_range &&
> +	if (!splice && file_in->f_op->remap_file_range &&
>  	    file_inode(file_in)->i_sb == file_inode(file_out)->i_sb) {
>  		ret = file_in->f_op->remap_file_range(file_in, pos_in,
>  				file_out, pos_out,
> @@ -1528,6 +1533,8 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  	 * consistent story about which filesystems support copy_file_range()
>  	 * and which filesystems do not, that will allow userspace tools to
>  	 * make consistent desicions w.r.t using copy_file_range().
> +	 *
> +	 * We also get here if caller (e.g. nfsd) requested COPY_FILE_SPLICE.
>  	 */
>  	ret = generic_copy_file_range(file_in, pos_in, file_out, pos_out, len,
>  				      flags);
> @@ -1582,6 +1589,10 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
>  		pos_out = f_out.file->f_pos;
>  	}
>  
> +	ret = -EINVAL;
> +	if (flags != 0)
> +		goto out;
> +
>  	ret = vfs_copy_file_range(f_in.file, pos_in, f_out.file, pos_out, len,
>  				  flags);
>  	if (ret > 0) {
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index e654435f1651..59ae95ddb679 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2089,6 +2089,14 @@ struct dir_context {
>   */
>  #define REMAP_FILE_ADVISORY		(REMAP_FILE_CAN_SHORTEN)
>  
> +/*
> + * These flags control the behavior of vfs_copy_file_range().
> + * They are not available to the user via syscall.
> + *
> + * COPY_FILE_SPLICE: call splice direct instead of fs clone/copy ops
> + */
> +#define COPY_FILE_SPLICE		(1 << 0)
> +
>  struct iov_iter;
>  struct io_uring_cmd;
>  
> -- 
>
> 2.25.1
>


      parent reply	other threads:[~2022-11-14 11:32 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-10 15:55 [PATCH] vfs: fix copy_file_range() averts filesystem freeze protection Amir Goldstein
2022-11-11 14:53 ` Namjae Jeon
2022-11-14 11:33 ` Luís Henriques [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877czx22kh.fsf@suse.de \
    --to=lhenriques@suse.de \
    --cc=amir73il@gmail.com \
    --cc=lhenriques@suse.com \
    --cc=linkinjeon@kernel.org \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=olga.kornievskaia@gmail.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.