linux-fsdevel.vger.kernel.org archive mirror
From: Anna Schumaker <Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
To: Andreas Dilger <adilger-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>,
	"Darrick J. Wong"
	<darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: <linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	<linux-btrfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	<linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	<zab-ugsP4Wv/S6ZeoWH0uzbU5w@public.gmane.org>,
	<viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	<clm-b10kYP2dOMg@public.gmane.org>,
	<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	<andros-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>,
	<hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Subject: Re: [PATCH v1 9/8] copy_file_range.2: New page documenting copy_file_range()
Date: Tue, 8 Sep 2015 11:05:18 -0400
Message-ID: <55EEF92E.2090201@Netapp.com>
In-Reply-To: <95674806-645C-410C-8A4B-A46F03AFFE20-m1MBpc4rdrD3fQ9qLvQP4Q@public.gmane.org>

On 09/04/2015 06:31 PM, Andreas Dilger wrote:
> On Sep 4, 2015, at 3:38 PM, Darrick J. Wong <darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
>>
>> On Fri, Sep 04, 2015 at 04:17:03PM -0400, Anna Schumaker wrote:
>>> copy_file_range() is a new system call for copying ranges of data
>>> completely in the kernel.  This gives filesystems an opportunity to
>>> implement some kind of "copy acceleration", such as reflinks or
>>> server-side-copy (in the case of NFS).
>>>
>>> Signed-off-by: Anna Schumaker <Anna.Schumaker-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
>>> ---
>>> man2/copy_file_range.2 | 168 +++++++++++++++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 168 insertions(+)
>>> create mode 100644 man2/copy_file_range.2
>>>
>>> diff --git a/man2/copy_file_range.2 b/man2/copy_file_range.2
>>> new file mode 100644
>>> index 0000000..4a4cb73
>>> --- /dev/null
>>> +++ b/man2/copy_file_range.2
>>> @@ -0,0 +1,168 @@
>>> +.\"This manpage is Copyright (C) 2015 Anna Schumaker <Anna.Schumaker-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
>>> +.TH COPY 2 2015-8-31 "Linux" "Linux Programmer's Manual"
>>> +.SH NAME
>>> +copy_file_range \- Copy a range of data from one file to another
>>> +.SH SYNOPSIS
>>> +.nf
>>> +.B #include <linux/copy.h>
>>> +.B #include <sys/syscall.h>
>>> +.B #include <unistd.h>
>>> +
>>> +.BI "ssize_t syscall(__NR_copy_file_range, int " fd_in ", loff_t * " off_in ",
>>> +.BI "                int " fd_out ", loff_t * " off_out ", size_t " len ",
>>> +.BI "                unsigned int " flags );
>>> +.fi
>>> +.SH DESCRIPTION
>>> +The
>>> +.BR copy_file_range ()
>>> +system call performs an in-kernel copy between two file descriptors
>>> +without all that tedious mucking about in userspace.
>>
>> ;)
>>
>>> +It copies up to
>>> +.I len
>>> +bytes of data from file descriptor
>>> +.I fd_in
>>> +to file descriptor
>>> +.I fd_out
>>> +at
>>> +.IR off_out .
>>> +The file descriptors must not refer to the same file.
>>
>> Why?  btrfs (and XFS) reflink can handle the case of a file sharing blocks
>> with itself.
>>
>>> +
>>> +The following semantics apply for
>>> +.IR fd_in ,
>>> +and similar statements apply to
>>> +.IR off_out :
>>> +.IP * 3
>>> +If
>>> +.I off_in
>>> +is NULL, then bytes are read from
>>> +.I fd_in
>>> +starting from the current file offset and the current
>>> +file offset is adjusted appropriately.
>>> +.IP *
>>> +If
>>> +.I off_in
>>> +is not NULL, then
>>> +.I off_in
>>> +must point to a buffer that specifies the starting
>>> +offset where bytes from
>>> +.I fd_in
>>> +will be read.  The current file offset of
>>> +.I fd_in
>>> +is not changed, but
>>> +.I off_in
>>> +is adjusted appropriately.
>>> +.PP
>>> +The default behavior of
>>> +.BR copy_file_range ()
>>> +is filesystem specific, and might result in creating a
>>> +copy-on-write reflink.
>>> +In the event that a given filesystem does not implement
>>> +any form of copy acceleration, the kernel will perform
>>> +a deep copy of the requested range by reading bytes from
>>
>> I wonder if it's wise to allow deep copies -- what happens if
>> len == 1T? Will this syscall just block for a really long time?
> 
> It should be interruptible, and return the length of the number of
> bytes copied so far, just like read() and write().  That allows
> the caller to continue where it left off, or abort and delete the
> target file, or whatever it wants to do.

We already return the number of bytes copied so far, so I'll look into making it interruptible!
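
For illustration only, here is a rough sketch of how a caller might cope with short copies once the call is interruptible. It assumes the syscall signature from the quoted man page, and it assumes an interrupted call either returns the bytes copied so far or fails with EINTR, which is the behavior proposed above rather than anything confirmed. The helper name copy_range_resumable() is hypothetical and not part of the patch series.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>        /* open() and O_* flags for callers */
#include <sys/syscall.h>  /* SYS_copy_file_range */
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to len bytes from fd_in to fd_out using
 * the current file offsets of both descriptors (off_in/off_out = NULL).
 *
 * Returns the number of bytes copied, which may be less than len at EOF
 * or after an error that followed partial progress. Returns -1 with
 * errno set only if nothing was copied at all.
 */
static ssize_t copy_range_resumable(int fd_in, int fd_out, size_t len)
{
    size_t done = 0;

    while (done < len) {
        long n = syscall(SYS_copy_file_range, fd_in, NULL,
                         fd_out, NULL, len - done, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* assumed: interrupted before any progress */
            return done > 0 ? (ssize_t)done : -1;
        }
        if (n == 0)
            break;              /* end of input file */

        /* A short count just means "call again for the rest". */
        assert((size_t)n <= len - done);
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A caller that gets back fewer bytes than requested can call the helper again to continue where it left off, or give up and unlink the target, as Andreas describes.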

Thanks,
Anna

> 
> Cheers, Andreas
> 
>>> +.I fd_in
>>> +and writing them to
>>> +.IR fd_out .
>>
>> "...if COPY_REFLINK is not set in flags."
>>
>>> +
>>> +Currently, Linux only supports the following flag:
>>> +.TP 1.9i
>>> +.B COPY_REFLINK
>>> +Only perform the copy if the filesystem can do it as a reflink.
>>> +Do not fall back on performing a deep copy.
>>> +.SH RETURN VALUE
>>> +Upon successful completion,
>>> +.BR copy_file_range ()
>>> +will return the number of bytes copied between files.
>>> +This could be less than the length originally requested.
>>> +
>>> +On error,
>>> +.BR copy_file_range ()
>>> +returns \-1 and
>>> +.I errno
>>> +is set to indicate the error.
>>> +.SH ERRORS
>>> +.TP
>>> +.B EBADF
>>> +One or more file descriptors are not valid,
>>> +or do not have proper read-write mode.
>>
>> "or fd_out is not opened for writing"?
>>
>>> +.TP
>>> +.B EINVAL
>>> +Requested range extends beyond the end of the file;
>>> +.I flags
>>> +argument is set to an invalid value.
>>> +.TP
>>> +.B EOPNOTSUPP
>>> +.B COPY_REFLINK
>>> +was specified in
>>> +.IR flags ,
>>> +but the target filesystem does not support reflinks.
>>> +.TP
>>> +.B EXDEV
>>> +Target filesystem doesn't support cross-filesystem copies.
>>> +.SH VERSIONS
>>
>> Perhaps this ought to list a few more errors (EIO, ENOSPC, ENOSYS, EPERM...)
>> that can be returned?  (I was looking at the fallocate manpage.)
>>
>> --D
>>
>>> +The
>>> +.BR copy_file_range ()
>>> +system call first appeared in Linux 4.3.
>>> +.SH CONFORMING TO
>>> +The
>>> +.BR copy_file_range ()
>>> +system call is a nonstandard Linux extension.
>>> +.SH EXAMPLE
>>> +.nf
>>> +
>>> +#define _GNU_SOURCE
>>> +#include <fcntl.h>
>>> +#include <linux/copy.h>
>>> +#include <stdio.h>
>>> +#include <stdlib.h>
>>> +#include <sys/stat.h>
>>> +#include <sys/syscall.h>
>>> +#include <unistd.h>
>>> +
>>> +
>>> +int main(int argc, char **argv)
>>> +{
>>> +    int fd_in, fd_out;
>>> +    struct stat stat;
>>> +    loff_t len, ret;
>>> +
>>> +    if (argc != 3) {
>>> +        fprintf(stderr, "Usage: %s <pathname> <pathname>\n", argv[0]);
>>> +        exit(EXIT_FAILURE);
>>> +    }
>>> +
>>> +    fd_in = open(argv[1], O_RDONLY);
>>> +    if (fd_in == -1) {
>>> +        perror("open (argv[1])");
>>> +        exit(EXIT_FAILURE);
>>> +    }
>>> +
>>> +    if (fstat(fd_in, &stat) == -1) {
>>> +        perror("fstat");
>>> +        exit(EXIT_FAILURE);
>>> +    }
>>> +    len = stat.st_size;
>>> +
>>> +    fd_out = open(argv[2], O_WRONLY | O_CREAT, 0644);
>>> +    if (fd_out == -1) {
>>> +        perror("open (argv[2])");
>>> +        exit(EXIT_FAILURE);
>>> +    }
>>> +
>>> +    do {
>>> +        ret = syscall(__NR_copy_file_range, fd_in, NULL,
>>> +                      fd_out, NULL, len, 0);
>>> +        if (ret == -1) {
>>> +            perror("copy_file_range");
>>> +            exit(EXIT_FAILURE);
>>> +        }
>>> +
>>> +        len -= ret;
>>> +    } while (len > 0);
>>> +
>>> +    close(fd_in);
>>> +    close(fd_out);
>>> +    exit(EXIT_SUCCESS);
>>> +}
>>> +.fi
>>> +.SH SEE ALSO
>>> +.BR splice (2)
>>> -- 
>>> 2.5.1
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> Cheers, Andreas
> 
> 
> 
> 
> 


