linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>,
	Jeff Layton <jlayton@kernel.org>,
	Josef Bacik <josef@toxicpanda.com>,
	Christoph Hellwig <hch@lst.de>, Jan Kara <jack@suse.cz>,
	David Howells <dhowells@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 3/3] fs: use do_splice_direct() for nfsd/ksmbd server-side-copy
Date: Thu, 30 Nov 2023 17:49:09 +0100
Message-ID: <20231130164909.vqafeznxlyxbqsmh@quack3>
In-Reply-To: <20231130141624.3338942-4-amir73il@gmail.com>

On Thu 30-11-23 16:16:24, Amir Goldstein wrote:
> nfsd/ksmbd call vfs_copy_file_range() with flag COPY_FILE_SPLICE to
> perform kernel copy between two files on any two filesystems.
> 
> Splicing from the input file while holding file_start_write() on the
> output file, which is on a different sb, poses a risk of fanotify-related
> deadlocks.
> 
> We only need to call splice_file_range() from within the context of
> ->copy_file_range() filesystem methods with file_start_write() held.
> 
> To avoid the possible deadlocks, always use do_splice_direct() instead of
> splice_file_range() for the kernel copy fallback in vfs_copy_file_range()
> without holding file_start_write().
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
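
To spell out the resulting ordering, here is a minimal userspace model of
the control flow in vfs_copy_file_range() after this patch. It is only a
sketch, not kernel code: struct model_file, model_splice(), model_vfs_copy()
and the write_holds counter are hypothetical names made up for illustration,
the copy and remap methods are reduced to flags, and the exact condition
guarding ->copy_file_range() is assumed since it lies outside the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only what the sketch needs. */
struct model_file {
	int sb;          /* identifies the superblock the file lives on */
	bool has_copy;   /* models ->copy_file_range being implemented */
	bool has_remap;  /* models ->remap_file_range being implemented */
	long remap_ret;  /* value the modelled remap attempt returns */
};

/* Models file_start_write()/file_end_write() nesting on the output file. */
static int write_holds;
/* Set if a splice is ever issued while the write hold is still taken. */
static bool splice_under_hold;

/* Models do_splice_direct(): pretend the whole length was copied. */
static long model_splice(long len)
{
	if (write_holds > 0)
		splice_under_hold = true;
	return len;
}

/* Mirrors the branch structure of the hunk as posted. */
static long model_vfs_copy(const struct model_file *in,
			   const struct model_file *out,
			   long len, bool splice)
{
	long ret = 0;

	write_holds++;				/* file_start_write(file_out) */
	if (!splice && out->has_copy) {		/* assumed guard, see lead-in */
		ret = len;			/* ->copy_file_range() */
	} else if (!splice && in->has_remap && in->sb == out->sb) {
		ret = in->remap_ret;		/* ->remap_file_range() */
		if (ret <= 0)
			splice = true;		/* fallback to splice */
	}
	write_holds--;				/* file_end_write(file_out) */

	if (!splice)
		return ret;

	/* The splice fallback must run with the write hold dropped. */
	assert(write_holds == 0);
	return model_splice(len);
}
```

In this model, a call with splice set (the nfsd/ksmbd case) or with a
failing remap reaches model_splice() only after the hold on the output
file has been dropped, which is the property the patch is after.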

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/read_write.c | 36 +++++++++++++++++++++++-------------
>  1 file changed, 23 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 0bc99f38e623..e0c2c1b5962b 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1421,6 +1421,10 @@ ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
>  				struct file *file_out, loff_t pos_out,
>  				size_t len, unsigned int flags)
>  {
> +	/* May only be called from within ->copy_file_range() methods */
> +	if (WARN_ON_ONCE(flags))
> +		return -EINVAL;
> +
>  	return splice_file_range(file_in, &pos_in, file_out, &pos_out,
>  				 min_t(size_t, len, MAX_RW_COUNT));
>  }
> @@ -1541,19 +1545,22 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  		ret = file_out->f_op->copy_file_range(file_in, pos_in,
>  						      file_out, pos_out,
>  						      len, flags);
> -		goto done;
> -	}
> -
> -	if (!splice && file_in->f_op->remap_file_range &&
> -	    file_inode(file_in)->i_sb == file_inode(file_out)->i_sb) {
> +	} else if (!splice && file_in->f_op->remap_file_range &&
> +		   file_inode(file_in)->i_sb == file_inode(file_out)->i_sb) {
>  		ret = file_in->f_op->remap_file_range(file_in, pos_in,
>  				file_out, pos_out,
>  				min_t(loff_t, MAX_RW_COUNT, len),
>  				REMAP_FILE_CAN_SHORTEN);
> -		if (ret > 0)
> -			goto done;
> +		/* fallback to splice */
> +		if (ret <= 0)
> +			splice = true;
>  	}
>  
> +	file_end_write(file_out);
> +
> +	if (!splice)
> +		goto done;
> +
>  	/*
>  	 * We can get here for same sb copy of filesystems that do not implement
>  	 * ->copy_file_range() in case filesystem does not support clone or in
> @@ -1565,11 +1572,16 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  	 * and which filesystems do not, that will allow userspace tools to
>  	 * make consistent desicions w.r.t using copy_file_range().
>  	 *
> -	 * We also get here if caller (e.g. nfsd) requested COPY_FILE_SPLICE.
> +	 * We also get here if caller (e.g. nfsd) requested COPY_FILE_SPLICE
> +	 * for server-side-copy between any two sb.
> +	 *
> +	 * In any case, we call do_splice_direct() and not splice_file_range(),
> +	 * without file_start_write() held, to avoid possible deadlocks related
> +	 * to splicing from input file, while file_start_write() is held on
> +	 * the output file on a different sb.
>  	 */
> -	ret = generic_copy_file_range(file_in, pos_in, file_out, pos_out, len,
> -				      flags);
> -
> +	ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
> +			       min_t(size_t, len, MAX_RW_COUNT), 0);
>  done:
>  	if (ret > 0) {
>  		fsnotify_access(file_in);
> @@ -1581,8 +1593,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  	inc_syscr(current);
>  	inc_syscw(current);
>  
> -	file_end_write(file_out);
> -
>  	return ret;
>  }
>  EXPORT_SYMBOL(vfs_copy_file_range);
> -- 
> 2.34.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

Thread overview: 23+ messages
2023-11-30 14:16 [PATCH v2 0/3] Avert possible deadlock with splice() and fanotify Amir Goldstein
2023-11-30 14:16 ` [PATCH v2 1/3] fs: fork splice_file_range() from do_splice_direct() Amir Goldstein
2023-11-30 16:27   ` Jeff Layton
2023-12-04  8:37   ` Christoph Hellwig
2023-12-04  8:38     ` Christoph Hellwig
2023-12-04 13:29       ` Amir Goldstein
2023-12-04 14:07         ` Christoph Hellwig
2023-12-04 14:29           ` Amir Goldstein
2023-12-04 17:16             ` Jan Kara
2023-12-04 18:53               ` Amir Goldstein
2023-11-30 14:16 ` [PATCH v2 2/3] fs: move file_start_write() into direct_splice_actor() Amir Goldstein
2023-12-04  8:38   ` Christoph Hellwig
2023-11-30 14:16 ` [PATCH v2 3/3] fs: use do_splice_direct() for nfsd/ksmbd server-side-copy Amir Goldstein
2023-11-30 16:49   ` Jan Kara [this message]
2023-12-04  8:39   ` Christoph Hellwig
2023-12-04 13:19     ` Amir Goldstein
2023-12-04 14:02       ` Christoph Hellwig
2023-12-05  0:16   ` [PATCH] fs: read_write: make default in vfs_copy_file_range() reachable Bert Karwatzki
2023-12-05  3:45     ` Amir Goldstein
2023-12-05  5:01       ` Amir Goldstein
2023-12-05  9:50         ` Bert Karwatzki
2023-11-30 16:32 ` [PATCH v2 0/3] Avert possible deadlock with splice() and fanotify Jeff Layton
2023-12-01 10:40 ` Christian Brauner
