From: Andreas Dilger <adilger@sun.com>
To: "Jörn Engel" <joern@logfs.org>
Cc: Theodore Tso <tytso@mit.edu>, Jamie Lokier <jamie@shareable.org>,
jmorris@namei.org, ocfs2-devel@oss.oracle.com,
linux-fsdevel@vger.kernel.org,
Chris Mason <chris.mason@oracle.com>,
viro@zeniv.linux.org.uk
Subject: Re: copyfile semantics.
Date: Tue, 05 May 2009 15:44:54 -0600 [thread overview]
Message-ID: <20090505214454.GP3209@webber.adilger.int> (raw)
In-Reply-To: <20090505164619.GA32180@logfs.org>
On May 05, 2009 18:46 +0200, Jörn Engel wrote:
> On Tue, 5 May 2009 16:36:29 +0100, Jamie Lokier wrote:
> > What is the advantage of adding the system call for the special case
> > of reflink(), when we choose not to have, say, a copyfile() system
> > call which does what "cp -a" does because doing it in user space is
> > good enough?
>
> Given an ignorant filesystem, copyfile() will simply do the read/write
> loop in kernelspace. So either copyfile() is just a fancy name for
> splice()
Sure, except splice() (AFAIK) doesn't allow a splice between two regular
files, only between a pipe and a file. Maybe it has changed since the
last time I looked. On high performance filesystems the copy_to_user()
and copy_from_user() can be a major limiting factor on IO performance,
and it is getting more significant because the single-core performance
is not improving at all. At 1GB/s just a single copy_{to,from}_user
(read or write) will consume 40% of a single core.
If it is possible to use splice() to copy between two regular files then
that is great. Does anything (e.g. cp) actually use this yet?
> or copyfile() will also have to create a tempfile, rename the
> tempfile when the copy is done and deal with all possible errors. And
> if the system crashes, who will remove the tempfile on reboot? Will the
> tempfile have a well-known name, allowing for easy DoS? Or will it be
> random, causing much fun locating it after reboot.
Maybe I'm missing something, but why do we need a tempfile at all?
I can't imagine that people expect atomic semantics for copyfile(),
any more than they expect atomic sematics for "cp" in the face of a
crash.
> When implemented in the filesystem itself, copyfile() can be quite nice.
> The filesystem can create a temporary inode without visibly exposing it
> to userspace. It can delete temporary inodes in journal replay after a
> crash. And depending on the fs design, the read/write loop can be
> replaced with finer-grained reference counting.
I would think that copyfile() is of primary interest when it involves
a network filesystem, so there is no need to ship data to the client
doing the copy at all. This is possible for NFS and CIFS protocol today,
AFAIK. The problem with splice is that the filesystem only knows about
->splice_read() and ->splice_write(), it doesn't have any opportunity
to optimize this further (e.g. by sending a "copyfile" RPC, or implementing
a reflink or whatever).
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
next prev parent reply other threads:[~2009-05-05 21:44 UTC|newest]
Thread overview: 151+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-05-03 6:15 [RFC] The reflink(2) system call Joel Becker
2009-05-03 6:15 ` [PATCH 1/3] fs: Document the " Joel Becker
2009-05-03 8:01 ` Christoph Hellwig
2009-05-04 2:46 ` Joel Becker
2009-05-04 6:36 ` Michael Kerrisk
2009-05-04 7:12 ` Joel Becker
2009-05-03 13:08 ` Boaz Harrosh
2009-05-03 23:08 ` Al Viro
2009-05-04 2:49 ` Joel Becker
2009-05-03 23:45 ` Theodore Tso
2009-05-04 1:44 ` Tao Ma
2009-05-04 18:25 ` Joel Becker
2009-05-04 21:18 ` [Ocfs2-devel] " Joel Becker
2009-05-04 22:23 ` Theodore Tso
2009-05-05 6:55 ` Joel Becker
2009-05-05 1:07 ` Jamie Lokier
2009-05-05 7:16 ` Joel Becker
2009-05-05 8:09 ` Andreas Dilger
2009-05-05 16:56 ` Joel Becker
2009-05-05 21:24 ` Andreas Dilger
2009-05-05 21:32 ` Joel Becker
2009-05-06 7:15 ` [Ocfs2-devel] " Theodore Tso
2009-05-06 14:24 ` jim owens
2009-05-06 14:30 ` jim owens
2009-05-06 17:50 ` jim owens
2009-05-12 19:20 ` Jamie Lokier
2009-05-12 19:30 ` Jamie Lokier
2009-05-12 19:11 ` Jamie Lokier
2009-05-12 19:37 ` jim owens
2009-05-12 20:11 ` Jamie Lokier
2009-05-05 13:01 ` Theodore Tso
2009-05-05 13:19 ` Jamie Lokier
2009-05-05 13:39 ` Chris Mason
2009-05-05 15:36 ` Jamie Lokier
2009-05-05 15:41 ` Chris Mason
2009-05-05 16:03 ` Jamie Lokier
2009-05-05 16:18 ` Chris Mason
2009-05-05 20:48 ` jim owens
2009-05-05 21:57 ` Jamie Lokier
2009-05-05 22:04 ` Joel Becker
2009-05-05 22:11 ` Jamie Lokier
2009-05-05 22:24 ` Joel Becker
2009-05-05 23:14 ` Jamie Lokier
2009-05-05 22:12 ` Jamie Lokier
2009-05-05 22:21 ` Joel Becker
2009-05-05 22:32 ` James Morris
2009-05-05 22:39 ` Joel Becker
2009-05-12 19:40 ` Jamie Lokier
2009-05-05 22:28 ` jim owens
2009-05-05 23:12 ` Jamie Lokier
2009-05-05 16:46 ` Jörn Engel
2009-05-05 16:54 ` Jörn Engel
2009-05-05 22:03 ` Jamie Lokier
2009-05-05 21:44 ` Andreas Dilger [this message]
2009-05-05 21:48 ` copyfile semantics Matthew Wilcox
2009-05-05 22:25 ` Trond Myklebust
2009-05-05 22:06 ` Jamie Lokier
2009-05-06 5:57 ` Jörn Engel
2009-05-05 14:21 ` [PATCH 1/3] fs: Document the reflink(2) system call Theodore Tso
2009-05-05 15:32 ` Jamie Lokier
2009-05-05 22:49 ` James Morris
2009-05-05 17:05 ` Joel Becker
2009-05-05 17:00 ` Joel Becker
2009-05-05 17:29 ` Theodore Tso
2009-05-05 22:36 ` Jamie Lokier
2009-05-05 22:30 ` Jamie Lokier
2009-05-05 22:37 ` Joel Becker
2009-05-05 23:08 ` jim owens
2009-05-05 13:01 ` Jamie Lokier
2009-05-05 17:09 ` Joel Becker
2009-05-03 6:15 ` [PATCH 2/3] fs: Add vfs_reflink() and the ->reflink() inode operation Joel Becker
2009-05-03 8:03 ` Christoph Hellwig
2009-05-04 2:51 ` Joel Becker
2009-05-03 6:15 ` [PATCH 3/3] fs: Add the reflink(2) system call Joel Becker
2009-05-03 6:27 ` Matthew Wilcox
2009-05-03 6:39 ` Al Viro
2009-05-03 7:48 ` Christoph Hellwig
2009-05-03 11:16 ` Al Viro
2009-05-04 2:53 ` Joel Becker
2009-05-04 2:53 ` Joel Becker
2009-05-03 8:04 ` Christoph Hellwig
2009-05-07 22:15 ` [RFC] The reflink(2) system call v2 Joel Becker
2009-05-08 1:39 ` James Morris
2009-05-08 1:49 ` Joel Becker
2009-05-08 13:01 ` Tetsuo Handa
2009-05-08 2:59 ` jim owens
2009-05-08 3:10 ` Joel Becker
2009-05-08 11:53 ` jim owens
2009-05-08 12:16 ` jim owens
2009-05-08 14:11 ` jim owens
2009-05-11 20:40 ` [RFC] The reflink(2) system call v4 Joel Becker
2009-05-11 22:27 ` James Morris
2009-05-11 22:34 ` Joel Becker
2009-05-12 1:12 ` James Morris
2009-05-12 12:18 ` Stephen Smalley
2009-05-12 17:22 ` Joel Becker
2009-05-12 17:32 ` Stephen Smalley
2009-05-12 18:03 ` Joel Becker
2009-05-12 18:04 ` Stephen Smalley
2009-05-12 18:28 ` Joel Becker
2009-05-12 18:37 ` Stephen Smalley
2009-05-14 18:06 ` Stephen Smalley
2009-05-14 18:25 ` Stephen Smalley
2009-05-14 23:25 ` James Morris
2009-05-15 11:54 ` Stephen Smalley
2009-05-15 13:35 ` James Morris
2009-05-15 15:44 ` Stephen Smalley
2009-05-13 1:47 ` Casey Schaufler
2009-05-13 16:43 ` Joel Becker
2009-05-13 17:23 ` Stephen Smalley
2009-05-13 18:27 ` Joel Becker
2009-05-12 12:01 ` Stephen Smalley
2009-05-11 23:11 ` jim owens
2009-05-11 23:42 ` Joel Becker
2009-05-12 11:31 ` Jörn Engel
2009-05-12 13:12 ` jim owens
2009-05-12 20:24 ` Jamie Lokier
2009-05-14 18:43 ` Jörn Engel
2009-05-12 15:04 ` Sage Weil
2009-05-12 15:23 ` jim owens
2009-05-12 16:16 ` Sage Weil
2009-05-12 17:45 ` jim owens
2009-05-12 20:29 ` Jamie Lokier
2009-05-12 17:28 ` Joel Becker
2009-05-13 4:30 ` Sage Weil
2009-05-14 3:57 ` Andy Lutomirski
2009-05-14 18:12 ` Stephen Smalley
2009-05-14 22:00 ` Joel Becker
2009-05-15 1:20 ` Jamie Lokier
2009-05-15 12:01 ` Stephen Smalley
2009-05-15 15:22 ` Joel Becker
2009-05-15 15:55 ` Stephen Smalley
2009-05-15 16:42 ` Joel Becker
2009-05-15 17:01 ` Shaya Potter
2009-05-15 20:53 ` [Ocfs2-devel] " Joel Becker
2009-05-18 9:17 ` Jörn Engel
2009-05-18 13:02 ` Stephen Smalley
2009-05-18 14:33 ` Stephen Smalley
2009-05-18 17:15 ` Stephen Smalley
2009-05-18 18:26 ` Joel Becker
2009-05-19 16:32 ` [Ocfs2-devel] " Sage Weil
2009-05-19 19:33 ` Jonathan Corbet
2009-05-19 20:15 ` Jamie Lokier
[not found] ` <20090519132057.419b9de0@bike.lwn.net>
[not found] ` <20090519193244.GB25521@mail.oracle.com>
2009-05-19 19:41 ` Jonathan Corbet
2009-05-28 0:24 ` [RFC] The reflink(2) system call v5 Joel Becker
2009-09-14 22:24 ` Joel Becker
2009-05-11 20:49 ` [RFC] The reflink(2) system call v2 Joel Becker
2009-05-11 22:49 ` jim owens
2009-05-11 23:46 ` Joel Becker
2009-05-12 0:54 ` Chris Mason
2009-05-12 20:36 ` Jamie Lokier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090505214454.GP3209@webber.adilger.int \
--to=adilger@sun.com \
--cc=chris.mason@oracle.com \
--cc=jamie@shareable.org \
--cc=jmorris@namei.org \
--cc=joern@logfs.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=ocfs2-devel@oss.oracle.com \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).