From: Jamie Lokier <jamie@shareable.org>
To: jim owens <jowens@hp.com>
Cc: Theodore Tso <tytso@mit.edu>,
    joel.becker@oracle.com, Andreas Dilger <adilger@sun.com>,
    linux-fsdevel@vger.kernel.org, jmorris@namei.org,
    ocfs2-devel@oss.oracle.com, viro@zeniv.linux.org.uk
Subject: Re: [Ocfs2-devel] [PATCH 1/3] fs: Document the reflink(2) system call.
Date: Tue, 12 May 2009 20:11:29 +0100
Message-ID: <20090512191129.GA10436@shareable.org>
In-Reply-To: <4A019DA9.4050700@hp.com>

jim owens wrote:
> You ask why not use a 2-step "cowfilecopy" and "attrfilecopy"
> to do "snapfile"... because that is not an atomic snapshot.

Understood, no problem with that. (Though it would be nice to have a
realistic example showing the atomicity being useful for a single file
snapshot).

Being able to create a _new file_ with the security attributes of an
existing file is sometimes useful too. Lots of programs do that, of
course, but a lot of them get it wrong when non-traditional security
attributes are used.

reflink() followed by truncate() would be useful for that - and in
that case, returning EPERM if it can't clone the attributes would be
essential - because a program which wants to copy "all the security
attributes" without the knowledge to parse them itself and set them in
the right order won't have the code to check if they were cloned
reliably either.
> The security and "might not know about it" concerns are bogus:
> No extra visibility exists to future updates of the original
> file that would not exist without either snapfile or cowfilecopy.
> That BOTH point at your old data is no different than if root
> or raid was copying every disk block to permanent storage. If
> you write it, someone can have it later.

I agree with that _as long as_ reflink() does not permit you to clone
a file when you are not the owner and you don't have read access.

It looks like reflink() V4 does not permit it in that case - good!

(A more precise statement of the rule is "as long as you could not
copy the file normally and then change its attributes to match what
reflink() produces").

That's different from link(), which _does_ allow links when you have
no read access and aren't the owner, but it always bumps i_nlink.

That's where I was coming from with the "might not know about it"
concern, because it looked like earlier reflink() proposals applied
the same weak permission checks as link().

V4 seems much better.
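
As a concrete check of the i_nlink point, here is a minimal sketch using
only standard POSIX calls. The helper names and the paths passed in are
illustrative assumptions, not anything from the thread. It creates a file,
hard-links it, and reads st_nlink (the userspace view of i_nlink) through
the original name, which is how an owner could notice an extra link().

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the hard-link count of path, or -1 on error. */
static long link_count(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    return (long)st.st_nlink;
}

/* Create orig, hard-link it as alias, and return the link count seen
 * through orig. Both names are removed again before returning.
 * Returns -1 if any step fails. */
static long demo_link_bumps_nlink(const char *orig, const char *alias)
{
    long n;
    int fd;

    /* Start from a clean slate; errors here are deliberately ignored. */
    unlink(orig);
    unlink(alias);

    fd = open(orig, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd == -1)
        return -1;
    close(fd);

    if (link(orig, alias) == -1) {
        unlink(orig);
        return -1;
    }

    n = link_count(orig);

    unlink(alias);
    unlink(orig);
    return n;
}
```

Called with two unused paths on a filesystem that supports hard links,
demo_link_bumps_nlink() should return 2. A reflink-style clone creates a
separate inode, so the same check on the original would not reveal it.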
> So bottom line... I see no reason (except someone has to document)
> why we should not have 2 system calls since there are good uses
> and good definitions for both and the code is 99% identical.

I doubt if anyone cares deeply if there are two system calls or one
system call with a flag(*), since they are so similar. The main thing
is having useful behaviours.

(*) Except for aesthetics.

I'm with the folks who think it's better for userspace to explicitly
request one behaviour or the other, rather than having reflink()
"automatically" decide for itself whether it will clone the attributes
or use new-file attributes.

The reason is that the "automatic" behaviour will certainly require
some applications to work around it, by guessing what it's going to do
beforehand (which is difficult to do accurately), or checking what it
did afterwards.
That will be these applications:

   - Sometimes an app will want to clone the attributes, and tell the
     user "sorry, no" if that's not possible. So the app will have to
     stat the file first, check the file owner against its euid,
     reflink, then stat the resulting file afterwards and check what
     happened (because ownership might have changed between the first
     stat and reflink calls, changing reflink's behaviour from what it
     expected), and then call unlink if the wrong thing happened - *and*
     it will still be wrong 1% of the time when the security model is
     not what the application expected. Applications should not
     have to hard-code every known security model. And linking then
     unlinking because you got it wrong is another security issue.

     "cp --cow -a" might be in this category, so would "rsync --cow -a"
     and generic backup applications. I expect most applications
     wanting to copy exactly care about this.

   - Sometimes an app will want to warn the user if the attributes
     couldn't be cloned, but succeed in making the copy. reflink() V4
     does that, but the app will have to check the new attributes against
     the old ones to know whether to warn, and then guess what errno
     would be appropriate.

     Maybe "cp --cow -a" will be like this.

   - Sometimes an app really just wants to copy a file with COW for
     efficient data sharing. It will have to change the resulting
     attributes to "new file" attributes - and that will be wrong 1%
     of the time because it's not necessarily easy to get those
     attributes right, especially with non-standard security models.
     Even with traditional security, getting setgid-directory
     behaviour right is extremely difficult - because it depends on
     the filesystem's mount options among other things. Basically
     "new file" attributes are something that should always be left to
     the kernel.

     While it might not be obvious when root would want to copy a file
     without preserving attributes with COW performance, the argument
     "I nearly always forget -p when writing cp" is arguing for "alias
     cp='cp -p'" in your /root/.profile, not for making the system
     call do it in a way you can't disable :-)

     Besides I can think of when you would want it: When running *any*
     shell script that you didn't write with the environment variable
     CP_USE_COW_WHEN_POSSIBLE_TO_SAVE_SPACE set ;-)
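
The "stat the resulting file afterwards and check what happened" step
from the first case above might look roughly like the sketch below. It
uses only lstat(), which exists today; the function name is hypothetical.
Note what it cannot see: it compares owner, group and permission bits
only, so ACLs, SELinux labels and any other non-traditional attributes go
unchecked, which is the "wrong 1% of the time" problem.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Compare the traditional security attributes of two paths.
 * Returns 1 if owner, group and mode bits match, 0 if they differ,
 * and -1 if either path cannot be examined.
 * Assumption: this is an application-side check, not part of any
 * proposed reflink() interface. */
static int traditional_attrs_match(const char *a, const char *b)
{
    struct stat sa, sb;

    /* lstat() so that a symlink planted at either name is not followed. */
    if (lstat(a, &sa) == -1 || lstat(b, &sb) == -1)
        return -1;

    return sa.st_uid == sb.st_uid &&
           sa.st_gid == sb.st_gid &&
           (sa.st_mode & 07777) == (sb.st_mode & 07777);
}
```

Even when this returns 1, the check is racy (attributes can change between
the clone and the lstat() calls) and incomplete, so an application relying
on it still has to guess about the security model in use.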
Now the opposite of "automatic" is that the app requests whether to
clone attributes or use "new file" attributes. In contrast to the above
problems, this doesn't cause any difficulty to applications, because
any app wanting the automatic choice can just do this:

    ret = reflink(a, b);
    if (ret == -1 && errno == EPERM)
        ret = cowlink(a, b);

Ok, that's not perfect because EPERM can mean other things.
Which brings us back to a flag ;-) like this:

    REFLINK_ATTR_CLONE                   (EPERM if can't clone attributes)
    REFLINK_ATTR_CLONE_IF_OWNER_OR_ROOT  (choose, as proposed in reflink V4)

One last annoyance. If you're making a new file, then like open() you
need another argument: the new file's mode, which is combined with
umask. But not if you're cloning the attributes.
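
For reference, this is how the mode argument and umask combine in open()
today: the kernel applies (mode & ~umask). The sketch below uses only
standard calls; the helper name and the path are illustrative. It assumes
the directory has no default ACL, which would replace the umask step.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create path with the requested mode while the given umask is in
 * effect, and return the permission bits the kernel actually applied.
 * The file is removed again and the previous umask restored.
 * Returns -1 if the file could not be created or examined. */
static int created_mode(const char *path, mode_t requested, mode_t mask)
{
    struct stat st;
    mode_t old_mask = umask(mask);
    int ret = -1;
    int fd;

    unlink(path); /* ignore errors: the file may simply not exist */

    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    if (fd != -1) {
        if (fstat(fd, &st) == 0)
            ret = (int)(st.st_mode & 0777);
        close(fd);
        unlink(path);
    }

    umask(old_mask);
    return ret;
}
```

With requested mode 0666, a umask of 022 yields 0644 and a umask of 077
yields 0600. A "new file" variant of reflink would need the same input to
let the kernel compute the result.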
That's a good reason why there should be two functions for
applications. The names reflink/cowlink (and reflinkat/cowlinkat)
make sense to me. The cowlink functions have an extra mode argument,
like the last argument to open().

(They could all be one system call at the kernel level, but different
in libc, as is already planned for the reflink/reflinkat distinction.)
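
A rough sketch of that split, under heavy assumptions: none of these
functions exist in any libc or kernel. sys_reflink_stub() stands in for a
single kernel entry point and touches no files; REFLINK_ATTR_NEW, the flag
values, the "_sketch" names and the stub_deny_clone knob are all made up
for illustration. Only REFLINK_ATTR_CLONE is a name from this message.

```c
#include <errno.h>
#include <sys/types.h>

/* REFLINK_ATTR_CLONE is named in the message; REFLINK_ATTR_NEW and the
 * numeric values are hypothetical. */
#define REFLINK_ATTR_CLONE 0x1
#define REFLINK_ATTR_NEW   0x2

/* Demo knob: when nonzero, the stub refuses to clone attributes. */
static int stub_deny_clone;

/* What the stub was last successfully asked to do. */
static int last_flags;
static mode_t last_mode;

/* Stand-in for one kernel-level system call. It performs no I/O; it
 * only records the request, or fails with EPERM when cloning
 * attributes is denied. */
static int sys_reflink_stub(const char *oldpath, const char *newpath,
                            int flags, mode_t mode)
{
    (void)oldpath;
    (void)newpath;

    if ((flags & REFLINK_ATTR_CLONE) && stub_deny_clone) {
        errno = EPERM;
        return -1;
    }
    last_flags = flags;
    last_mode = mode;
    return 0;
}

/* libc-level wrapper: clone data and attributes, or fail. */
static int reflink_sketch(const char *oldpath, const char *newpath)
{
    return sys_reflink_stub(oldpath, newpath, REFLINK_ATTR_CLONE, 0);
}

/* libc-level wrapper: clone data, "new file" attributes, with a mode
 * argument like the last argument to open(). */
static int cowlink_sketch(const char *oldpath, const char *newpath,
                          mode_t mode)
{
    return sys_reflink_stub(oldpath, newpath, REFLINK_ATTR_NEW, mode);
}

/* The "automatic" behaviour, built by the application from the two
 * explicit calls. EPERM may have other causes in a real system. */
static int reflink_auto_sketch(const char *a, const char *b, mode_t mode)
{
    int ret = reflink_sketch(a, b);

    if (ret == -1 && errno == EPERM)
        ret = cowlink_sketch(a, b, mode);
    return ret;
}
```

The point of the layout is that an application picks reflink_sketch() or
cowlink_sketch() explicitly, and only code that really wants the fallback
pays for it.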
Oh, and please implement AT_SYMLINK_FOLLOW the same as link().
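
For comparison, this is the existing AT_SYMLINK_FOLLOW behaviour in
linkat(2), which is the call that takes the flag. The sketch uses only
standard calls; the helper name and paths are illustrative. It makes a
file, a symlink to it, hard-links through the symlink's name, and reports
whether the new name is a regular file (the symlink was followed) or a
symlink.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create target, make sym a symlink to it, then hard-link sym to
 * newname with the given linkat() flags.
 * Returns 1 if newname is a regular file (symlink followed), 0 if
 * newname is not a regular file, and -1 if any step fails.
 * All three names are removed before returning. */
static int link_through_symlink(const char *target, const char *sym,
                                const char *newname, int flags)
{
    struct stat st;
    int ret = -1;
    int fd;

    /* Clean up leftovers from earlier runs; errors are ignored. */
    unlink(newname);
    unlink(sym);
    unlink(target);

    fd = open(target, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd == -1)
        return -1;
    close(fd);

    if (symlink(target, sym) == 0 &&
        linkat(AT_FDCWD, sym, AT_FDCWD, newname, flags) == 0 &&
        lstat(newname, &st) == 0)
        ret = S_ISREG(st.st_mode) ? 1 : 0;

    unlink(newname);
    unlink(sym);
    unlink(target);
    return ret;
}
```

With flags set to AT_SYMLINK_FOLLOW the new name refers to the target
file itself. A reflinkat() taking the same flag would presumably resolve
its source path the same way.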
Thanks :-)

-- Jamie