linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Anna Schumaker <Anna.Schumaker@netapp.com>
Cc: linux-nfs@vger.kernel.org, linux-btrfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
	zab@zabbo.net, viro@zeniv.linux.org.uk, clm@fb.com,
	mtk.manpages@gmail.com, andros@netapp.com, hch@infradead.org
Subject: Re: [PATCH v1 8/8] vfs: Fall back on splice if no copy function defined
Date: Fri, 4 Sep 2015 14:08:13 -0700	[thread overview]
Message-ID: <20150904210813.GA30681@birch.djwong.org> (raw)
In-Reply-To: <1441397823-1203-9-git-send-email-Anna.Schumaker@Netapp.com>

On Fri, Sep 04, 2015 at 04:17:02PM -0400, Anna Schumaker wrote:
> The NFS server will need a fallback for filesystems that don't have any
> kind of copy acceleration yet, and it should be generally useful to have
> an in-kernel copy to avoid lots of switches between kernel and
> user space.  Users who only want a reflink can pass the COPY_REFLINK
> flag to disable the fallback.
> 
> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
> ---
>  fs/read_write.c           | 16 ++++++++++++----
>  include/linux/copy.h      |  6 ++++++
>  include/uapi/linux/Kbuild |  1 +
>  include/uapi/linux/copy.h |  6 ++++++
>  4 files changed, 25 insertions(+), 4 deletions(-)
>  create mode 100644 include/linux/copy.h
>  create mode 100644 include/uapi/linux/copy.h
> 
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 9dafb7f..bd7e7e2 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -7,6 +7,7 @@
>  #include <linux/slab.h> 
>  #include <linux/stat.h>
>  #include <linux/fcntl.h>
> +#include <linux/copy.h>
>  #include <linux/file.h>
>  #include <linux/uio.h>
>  #include <linux/fsnotify.h>
> @@ -1342,7 +1343,7 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  	struct inode *inode_out;
>  	ssize_t ret;
>  
> -	if (flags)
> +	if (!((flags == 0) || (flags == COPY_REFLINK)))

I think it's a good idea for the copy_file_range flags to have the name of
the syscall in them at the beginning, both to try to avoid namespace collisions
and also to make it clear where the flag comes from.  So, could we prefix the
values with "COPY_FILE_RANGE_" instead of just "COPY_"?

Also... I'm confused by what the structure of check implies about 'flags' --
are each of 'flags's values supposed to be mutually exclusive, or is it a
straight bitset like flags arguments tend to be?

Say we want to add dedupe and ponies as potential behavioral variants.  Then
we'd end up with something like:

/* raw flags */
#define	COPY_FILE_RANGE_SHARE_BLOCKS	(1)
#define COPY_FILE_RANGE_CHECK_SAME	(2)
#define COPY_FILE_RANGE_PONIES		(4)

/* syntactic sugar */
#define COPY_FILE_RANGE_DEDUPE		(COPY_FILE_RANGE_SHARE_BLOCKS | \
					 COPY_FILE_RANGE_CHECK_SAME)
#define COPY_FILE_RANGE_ALL		(COPY_FILE_RANGE_SHARE_BLOCKS | \
					 COPY_FILE_RANGE_CHECK_SAME | \
					 COPY_FILE_RANGE_PONIES)

and in copy_file_range():

if (flags & ~COPY_FILE_RANGE_ALL)
	return -EINVAL;
if ((flags & COPY_FILE_RANGE_CHECK_SAME) && 
    !(flags & COPY_FILE_RANGE_SHARE_BLOCKS))
	return -EINVAL;	/* or is it return 0? */

if (flags & COPY_FILE_RANGE_PONIES)
	panic();
if (flags & COPY_FILE_RANGE_CHECK_SAME)
	check_same_contents(...);
err = vfs_copy_range(...);
if (err < 0 && !(flags & COPY_FILE_RANGE_SHARE_BLOCKS))
	err = splice(...);

One hard part is figuring out just what actions we want userland to be able to
ask for.  I can think of three: "share blocks via remapping" (i.e.  reflink),
"share blocks via remapping but only if they're identical" (i.e. dedupe), and
"classic copy via pagecache".  I lean towards giving each variant its own bit
and only allowing through the cases that make sense, rather than giving each
valid combination its own unique number, but maybe I misinterpreted the intent
behind the code.

(I guess there could also be 'do it with hardware assist' but that's digging
myself a pretty deep hole.)

--D

>  		return -EINVAL;
>  
>  	/* copy_file_range allows full ssize_t len, ignoring MAX_RW_COUNT  */
> @@ -1355,7 +1356,7 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  	if (!(file_in->f_mode & FMODE_READ) ||
>  	    !(file_out->f_mode & FMODE_WRITE) ||
>  	    (file_out->f_flags & O_APPEND) ||
> -	    !file_out->f_op || !file_out->f_op->copy_file_range)
> +	    !file_in->f_op)
>  		return -EBADF;
>  
>  	inode_in = file_inode(file_in);
> @@ -1377,8 +1378,15 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>  	if (ret)
>  		return ret;
>  
> -	ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out, pos_out,
> -					      len, flags);
> +	ret = -EOPNOTSUPP;
> +	if (file_out->f_op->copy_file_range)
> +		ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
> +						      pos_out, len, flags);
> +	if ((ret < 0) && !(flags & COPY_REFLINK)) {
> +		file_start_write(file_out);
> +		ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out, len, 0);
> +		file_end_write(file_out);
> +	}
>  	if (ret > 0) {
>  		fsnotify_access(file_in);
>  		add_rchar(current, ret);
> diff --git a/include/linux/copy.h b/include/linux/copy.h
> new file mode 100644
> index 0000000..fd54543
> --- /dev/null
> +++ b/include/linux/copy.h
> @@ -0,0 +1,6 @@
> +#ifndef _LINUX_COPY_H
> +#define _LINUX_COPY_H
> +
> +#include <uapi/linux/copy.h>
> +
> +#endif /* _LINUX_COPY_H */
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index 1ff9942..b6109f3 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -90,6 +90,7 @@ header-y += coda_psdev.h
>  header-y += coff.h
>  header-y += connector.h
>  header-y += const.h
> +header-y += copy.h
>  header-y += cramfs_fs.h
>  header-y += cuda.h
>  header-y += cyclades.h
> diff --git a/include/uapi/linux/copy.h b/include/uapi/linux/copy.h
> new file mode 100644
> index 0000000..68f3d65
> --- /dev/null
> +++ b/include/uapi/linux/copy.h
> @@ -0,0 +1,6 @@
> +#ifndef _UAPI_LINUX_COPY_H
> +#define _UAPI_LINUX_COPY_H
> +
> +#define COPY_REFLINK	1	/* only perform a reflink */
> +
> +#endif /* _UAPI_LINUX_COPY_H */
> -- 
> 2.5.1
> 

  reply	other threads:[~2015-09-04 21:08 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-04 20:16 [PATCH v1 0/8] VFS: In-kernel copy system call Anna Schumaker
2015-09-04 20:16 ` [PATCH v1 2/8] x86: add sys_copy_file_range to syscall tables Anna Schumaker
2015-09-04 20:16 ` [PATCH v1 3/8] btrfs: add .copy_file_range file operation Anna Schumaker
     [not found]   ` <1441397823-1203-4-git-send-email-Anna.Schumaker-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-09-04 21:02     ` Josef Bacik
2015-09-09  8:39   ` David Sterba
2015-09-04 20:17 ` [PATCH v1 7/8] vfs: Copy should use file_out rather than file_in Anna Schumaker
     [not found] ` <1441397823-1203-1-git-send-email-Anna.Schumaker-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-09-04 20:16   ` [PATCH v1 1/9] vfs: add copy_file_range syscall and vfs helper Anna Schumaker
2015-09-04 21:50     ` Darrick J. Wong
2015-09-04 20:16   ` [PATCH v1 4/8] btrfs: Add mountpoint checking during btrfs_copy_file_range Anna Schumaker
2015-09-09  9:18     ` David Sterba
2015-09-09 15:56       ` Anna Schumaker
2015-09-04 20:16   ` [PATCH v1 5/8] vfs: Remove copy_file_range mountpoint checks Anna Schumaker
2015-09-04 20:17   ` [PATCH v1 6/8] vfs: Copy should check len after file open mode Anna Schumaker
2015-09-04 20:17   ` [PATCH v1 8/8] vfs: Fall back on splice if no copy function defined Anna Schumaker
2015-09-04 21:08     ` Darrick J. Wong [this message]
     [not found]       ` <20150904210813.GA30681-PTl6brltDGh4DFYR7WNSRA@public.gmane.org>
2015-09-08 14:57         ` Anna Schumaker
2015-09-04 20:17   ` [PATCH v1 9/8] copy_file_range.2: New page documenting copy_file_range() Anna Schumaker
2015-09-04 21:38     ` Darrick J. Wong
     [not found]       ` <20150904213856.GC10391-PTl6brltDGh4DFYR7WNSRA@public.gmane.org>
2015-09-04 22:31         ` Andreas Dilger
     [not found]           ` <95674806-645C-410C-8A4B-A46F03AFFE20-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>
2015-09-08 15:05             ` Anna Schumaker
2015-09-08 15:04         ` Anna Schumaker
2015-09-08 20:39           ` Darrick J. Wong
2015-09-09  9:16             ` David Sterba
     [not found]             ` <20150908203918.GB30681-PTl6brltDGh4DFYR7WNSRA@public.gmane.org>
2015-09-09 11:38               ` Austin S Hemmelgarn
2015-09-09 17:17                 ` Darrick J. Wong
     [not found]                   ` <20150909171757.GE10391-PTl6brltDGh4DFYR7WNSRA@public.gmane.org>
2015-09-09 17:31                     ` Anna Schumaker
     [not found]                       ` <55F06CEC.5040208-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-09-09 18:12                         ` Darrick J. Wong
2015-09-09 19:25                           ` Anna Schumaker
2015-09-10 15:42                     ` David Sterba
     [not found]                       ` <20150910154251.GM8891-1ReQVI26iDCaZKY3DrU6dA@public.gmane.org>
2015-09-10 16:43                         ` Darrick J. Wong
2015-09-04 22:25   ` [PATCH v1 0/8] VFS: In-kernel copy system call Andreas Dilger
     [not found]     ` <4B41043F-5D85-42D6-8F20-2DCC45930EF4-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>
2015-09-05  8:33       ` Al Viro
     [not found]         ` <20150905083342.GG22011-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2015-09-08 15:08           ` Anna Schumaker
2015-09-08 20:45             ` Darrick J. Wong
     [not found]               ` <20150908204517.GC30681-PTl6brltDGh4DFYR7WNSRA@public.gmane.org>
2015-09-08 20:49                 ` Anna Schumaker
2015-09-08 15:07     ` Anna Schumaker
2015-09-08 15:21   ` Pádraig Brady
     [not found]     ` <55EEFCEE.5090000-V8g9lnOeT5ydJdNcDFJN0w@public.gmane.org>
2015-09-08 18:23       ` Anna Schumaker
     [not found]         ` <55EF279B.3020101-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-09-08 19:10           ` Andy Lutomirski
     [not found]             ` <CALCETrXxRB-LXVb+=nkwfj0zEjWuXXTctkSAc9Oec0fgyOQ5Yg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-09-08 20:03               ` Pádraig Brady
     [not found]                 ` <55EF3EFD.3080302-V8g9lnOeT5ydJdNcDFJN0w@public.gmane.org>
2015-09-08 21:29                   ` Darrick J. Wong
2015-09-08 21:45                     ` Andy Lutomirski
2015-09-08 22:39                       ` Darrick J. Wong
2015-09-08 23:08                         ` Andy Lutomirski
2015-09-09  1:19                           ` Darrick J. Wong
2015-09-09 20:09                           ` Chris Mason
     [not found]                             ` <20150909200921.GD9511-DzB2rL6jT1BHfPKRx072akEOCMrvLtNR@public.gmane.org>
2015-09-09 20:26                               ` Trond Myklebust
     [not found]                                 ` <CAHQdGtTSZ1beMMF4DJv=OuA1j2ww0xzJj3+9HMRAf3UpCCLaZg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-09-09 20:38                                   ` Chris Mason
     [not found]                                     ` <20150909203805.GE9511-DzB2rL6jT1BHfPKRx072akEOCMrvLtNR@public.gmane.org>
2015-09-09 20:41                                       ` Anna Schumaker
     [not found]                                         ` <55F0997E.1040105-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-09-09 21:42                                           ` Darrick J. Wong
2015-09-09 20:37                               ` Andy Lutomirski
     [not found]                                 ` <CALCETrXPcxHWGwqhtkGStVabWDOsRbBy+VzrN+XxVZA_F9O0qA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-09-09 20:42                                   ` Chris Mason
     [not found]                           ` <CALCETrVsWBdqvAgwxHcG=gbcWRNPG2ZziWUg1g=siKDrDu7S2Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-09-13 23:25                             ` Dave Chinner
2015-09-14 17:53                               ` Andy Lutomirski
2015-09-09 18:52                         ` Anna Schumaker
     [not found]                           ` <55F07FD8.4020507-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-09-09 21:16                             ` Darrick J. Wong
2015-09-10 15:10                               ` Anna Schumaker
     [not found]                                 ` <55F19D7F.5090907-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-09-10 15:49                                   ` Austin S Hemmelgarn
2015-09-10 11:40                           ` Austin S Hemmelgarn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150904210813.GA30681@birch.djwong.org \
    --to=darrick.wong@oracle.com \
    --cc=Anna.Schumaker@netapp.com \
    --cc=andros@netapp.com \
    --cc=clm@fb.com \
    --cc=hch@infradead.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=zab@zabbo.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).