From: Anna Schumaker <Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
To: "Darrick J. Wong" <darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: <linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
<linux-btrfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
<linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
<zab-ugsP4Wv/S6ZeoWH0uzbU5w@public.gmane.org>,
<viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
<clm-b10kYP2dOMg@public.gmane.org>,
<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
<andros-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>,
<hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Subject: Re: [PATCH v1 9/8] copy_file_range.2: New page documenting copy_file_range()
Date: Tue, 8 Sep 2015 11:04:03 -0400
Message-ID: <55EEF8E3.8030501@Netapp.com>
In-Reply-To: <20150904213856.GC10391-PTl6brltDGh4DFYR7WNSRA@public.gmane.org>
On 09/04/2015 05:38 PM, Darrick J. Wong wrote:
> On Fri, Sep 04, 2015 at 04:17:03PM -0400, Anna Schumaker wrote:
>> copy_file_range() is a new system call for copying ranges of data
>> completely in the kernel. This gives filesystems an opportunity to
>> implement some kind of "copy acceleration", such as reflinks or
>> server-side-copy (in the case of NFS).
>>
>> Signed-off-by: Anna Schumaker <Anna.Schumaker-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
>> ---
>> man2/copy_file_range.2 | 168 +++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 168 insertions(+)
>> create mode 100644 man2/copy_file_range.2
>>
>> diff --git a/man2/copy_file_range.2 b/man2/copy_file_range.2
>> new file mode 100644
>> index 0000000..4a4cb73
>> --- /dev/null
>> +++ b/man2/copy_file_range.2
>> @@ -0,0 +1,168 @@
>> +.\"This manpage is Copyright (C) 2015 Anna Schumaker <Anna.Schumaker-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
>> +.TH COPY 2 2015-8-31 "Linux" "Linux Programmer's Manual"
>> +.SH NAME
>> +copy_file_range \- Copy a range of data from one file to another
>> +.SH SYNOPSIS
>> +.nf
>> +.B #include <linux/copy.h>
>> +.B #include <sys/syscall.h>
>> +.B #include <unistd.h>
>> +
>> +.BI "ssize_t syscall(__NR_copy_file_range, int " fd_in ", loff_t * " off_in ",
>> +.BI " int " fd_out ", loff_t * " off_out ", size_t " len ",
>> +.BI " unsigned int " flags );
>> +.fi
>> +.SH DESCRIPTION
>> +The
>> +.BR copy_file_range ()
>> +system call performs an in-kernel copy between two file descriptors
>> +without all that tedious mucking about in userspace.
>
> ;)
>
>> +It copies up to
>> +.I len
>> +bytes of data from file descriptor
>> +.I fd_in
>> +to file descriptor
>> +.I fd_out
>> +at
>> +.IR off_out .
>> +The file descriptors must not refer to the same file.
>
> Why? btrfs (and XFS) reflink can handle the case of a file sharing blocks
> with itself.
I've never really thought about it... Zach had that in his initial submission, so I mentioned it in the man page. Should I remove that bit?
>
>> +
>> +The following semantics apply for
>> +.IR fd_in ,
>> +and similar statements apply to
>> +.IR off_out :
>> +.IP * 3
>> +If
>> +.I off_in
>> +is NULL, then bytes are read from
>> +.I fd_in
>> +starting from the current file offset and the current
>> +file offset is adjusted appropriately.
>> +.IP *
>> +If
>> +.I off_in
>> +is not NULL, then
>> +.I off_in
>> +must point to a buffer that specifies the starting
>> +offset where bytes from
>> +.I fd_in
>> +will be read. The current file offset of
>> +.I fd_in
>> +is not changed, but
>> +.I off_in
>> +is adjusted appropriately.
>> +.PP
>> +The default behavior of
>> +.BR copy_file_range ()
>> +is filesystem specific, and might result in creating a
>> +copy-on-write reflink.
>> +In the event that a given filesystem does not implement
>> +any form of copy acceleration, the kernel will perform
>> +a deep copy of the requested range by reading bytes from
>
> I wonder if it's wise to allow deep copies -- what happens if len == 1T?
> Will this syscall just block for a really long time?
We use rw_verify_area() (similar to read and write), so we won't allow a value of len that large. I can mention this in an updated version of this man page!
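
For illustration only, here is a minimal sketch of how a caller could cope with a call that copies fewer bytes than requested. It is not part of the patch: copy_step_fn, copy_fully() and fake_step() are made-up names, and the callback merely stands in for one copy_file_range() call so the loop can be exercised without the new syscall. The 4096-byte cap in fake_step() is an arbitrary assumption, not the kernel's real limit.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback standing in for a single copy_file_range() call.
 * Returns bytes copied (possibly fewer than len), 0 at end of file,
 * or -1 on error. */
typedef ssize_t (*copy_step_fn)(int fd_in, int fd_out, size_t len);

/* Call step() repeatedly until len bytes are copied, the source is
 * exhausted, or an error occurs. Returns total bytes copied or -1. */
static ssize_t copy_fully(copy_step_fn step, int fd_in, int fd_out, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t ret = step(fd_in, fd_out, len - done);

        if (ret < 0)
            return -1;
        if (ret == 0)   /* nothing left to copy: stop instead of spinning */
            break;
        done += (size_t)ret;
    }
    return (ssize_t)done;
}

/* Illustrative stand-in: pretends each call copies at most 4096 bytes. */
static ssize_t fake_step(int fd_in, int fd_out, size_t len)
{
    (void)fd_in;
    (void)fd_out;
    return (ssize_t)(len > 4096 ? 4096 : len);
}
```

With a loop like this, a clamped len only costs the caller extra iterations; a real program would pass a wrapper around syscall(__NR_copy_file_range, ...) in place of fake_step().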
>
>> +.I fd_in
>> +and writing them to
>> +.IR fd_out .
>
> "...if COPY_REFLINK is not set in flags."
Sure.
>
>> +
>> +Currently, Linux only supports the following flag:
>> +.TP 1.9i
>> +.B COPY_REFLINK
>> +Only perform the copy if the filesystem can do it as a reflink.
>> +Do not fall back on performing a deep copy.
>> +.SH RETURN VALUE
>> +Upon successful completion,
>> +.BR copy_file_range ()
>> +will return the number of bytes copied between files.
>> +This could be less than the length originally requested.
>> +
>> +On error,
>> +.BR copy_file_range ()
>> +returns \-1 and
>> +.I errno
>> +is set to indicate the error.
>> +.SH ERRORS
>> +.TP
>> +.B EBADF
>> +One or more file descriptors are not valid,
>> +or do not have proper read-write mode.
>
> "or fd_out is not opened for writing"?
I'll add that.
>
>> +.TP
>> +.B EINVAL
>> +Requested range extends beyond the end of the file;
>> +.I flags
>> +argument is set to an invalid value.
>> +.TP
>> +.B EOPNOTSUPP
>> +.B COPY_REFLINK
>> +was specified in
>> +.IR flags ,
>> +but the target filesystem does not support reflinks.
>> +.TP
>> +.B EXDEV
>> +Target filesystem doesn't support cross-filesystem copies.
>> +.SH VERSIONS
>
> Perhaps this ought to list a few more errors (EIO, ENOSPC, ENOSYS, EPERM...)
> that can be returned? (I was looking at the fallocate manpage.)
Okay. I'll poke around for what else could be returned!
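
As a rough sketch of how userspace might sort the errors discussed here (an assumption about caller policy, not something the patch defines; copy_should_fall_back() is a made-up helper), an application could treat "not supported" style errors as a cue to fall back to a plain read/write copy and treat everything else as fatal:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical policy helper: returns true when the errno value suggests
 * the in-kernel copy is simply unavailable for this pair of files, so the
 * caller may retry with an ordinary read()/write() loop. */
static bool copy_should_fall_back(int err)
{
    switch (err) {
    case ENOSYS:      /* kernel has no copy_file_range() at all */
    case EOPNOTSUPP:  /* filesystem cannot do the requested kind of copy */
    case EXDEV:       /* files live on different filesystems */
        return true;
    default:          /* EIO, ENOSPC, EBADF, ...: report to the user */
        return false;
    }
}
```

Which errors deserve a fallback is a judgment call for each application; the list above only covers values already mentioned in this thread.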
Thanks,
Anna
>
> --D
>
>> +The
>> +.BR copy_file_range ()
>> +system call first appeared in Linux 4.3.
>> +.SH CONFORMING TO
>> +The
>> +.BR copy_file_range ()
>> +system call is a nonstandard Linux extension.
>> +.SH EXAMPLE
>> +.nf
>> +
>> +#define _GNU_SOURCE
>> +#include <fcntl.h>
>> +#include <linux/copy.h>
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <sys/stat.h>
>> +#include <sys/syscall.h>
>> +#include <unistd.h>
>> +
>> +
>> +int main(int argc, char **argv)
>> +{
>> + int fd_in, fd_out;
>> + struct stat stat;
>> + loff_t len, ret;
>> +
>> + if (argc != 3) {
>> + fprintf(stderr, "Usage: %s <pathname> <pathname>\n", argv[0]);
>> + exit(EXIT_FAILURE);
>> + }
>> +
>> + fd_in = open(argv[1], O_RDONLY);
>> + if (fd_in == -1) {
>> + perror("open (argv[1])");
>> + exit(EXIT_FAILURE);
>> + }
>> +
>> + if (fstat(fd_in, &stat) == -1) {
>> + perror("fstat");
>> + exit(EXIT_FAILURE);
>> + }
>> + len = stat.st_size;
>> +
>> + fd_out = open(argv[2], O_WRONLY | O_CREAT, 0644);
>> + if (fd_out == -1) {
>> + perror("open (argv[2])");
>> + exit(EXIT_FAILURE);
>> + }
>> +
>> + do {
>> + ret = syscall(__NR_copy_file_range, fd_in, NULL,
>> + fd_out, NULL, len, 0);
>> + if (ret == -1) {
>> + perror("copy_file_range");
>> + exit(EXIT_FAILURE);
>> + }
>> +
>> + len -= ret;
>> + } while (len > 0);
>> +
>> + close(fd_in);
>> + close(fd_out);
>> + exit(EXIT_SUCCESS);
>> +}
>> +.fi
>> +.SH SEE ALSO
>> +.BR splice (2)
>> --
>> 2.5.1
>>