From: Andreas Dilger <adilger@sun.com>
To: Jamie Lokier <jamie@shareable.org>,
linux-fsdevel@vger.kernel.org, jmorris@namei.org,
ocfs2-devel@oss.oracle.com, viro@zeniv.linux.org.uk
Subject: Re: [PATCH 1/3] fs: Document the reflink(2) system call.
Date: Tue, 05 May 2009 15:24:17 -0600 [thread overview]
Message-ID: <20090505212417.GO3209@webber.adilger.int> (raw)
In-Reply-To: <20090505165628.GC7835@mail.oracle.com>
On May 05, 2009 09:56 -0700, Joel Becker wrote:
> On Tue, May 05, 2009 at 02:09:36AM -0600, Andreas Dilger wrote:
> > If the reflink caller is always charged for the full space used (as if
> > it were a real copy) by virtue of the user doing the reflink() owning the
> > new inode. Doing anything else seems broken. If the owner of the file
> > wasn't charged for the reflink's quota then if the reflink inode was
> > chowned the new owner would be charged for the new file, but the quota
> > code would have to special case the decrement of EACH of the reflink's
> > blocks because otherwise the original owner might "release" quota that
> > it was never originally charged.
>
> If the caller is creating an inode in someone else's name, then
> who do you charge for the quota?
IMHO, it shouldn't be possible to create an inode in someone else's
name (CAP_* excluded), just like it isn't possible to create a new
file in someone elses name. The caller of reflink() should be the
one creating the file, hence the owner of the file, and the owner of
the quota.
> If you charge the caller, how do you know to decrement the caller's
> quota when the actual owner does truncate, given that the inode has
> no knowledge of the caller anymore.
No, if the owner of the inode (== caller) is charged the quota then
when the inode is truncated (regardless of who does the truncate)
the quota will just work correctly.
> You've hit the nail on the head - without backrefs for each
> refcounted hunk, you can't figure out who it owns it from a quota
> perspective. And that's just a non-starter to try and maintain.
No, I don't think my proposal is _more_ complex than the original.
It is actually _less_ complex, because the fact that this is a reflink
and not a complete file copy is a purely internal detail of the filesystem
and is not exposed outside the filesystem. The fact that a reflink
consumes less space and is faster than a real copy is an implementation
detail, not really any different than if the file were compressed by
the filesystem internally.
> > > Here's another fun trick. Overwriting rsync, instead of copying
> > > blocks from the already-existing source could reflink the source to the
> > > .temporary, then only write the changed blocks. And since you own both
> > > files, it just works. If you're overwriting someone else's file? The
> > > old copy behavior is fine.
> >
> > Well, "fine" as in it works, but if there are only a few changed blocks,
> > and the old copy is now part of a snapshot (so it won't be released when
> > rsync is finished) the space consumption has doubled instead of just
> > using a few extra blocks.
>
> No, because the last thing rsync will do is rename(.temporary,
> source). All the references from the source will be decremented, and
> any blocks only owned by the source will be freed. Space usage is
> identical before and after, like a copying rsync, but there is less
> space used and less I/O done during the rsync process.
What I was objecting to is "when overwriting someone elses file, the old
copy behaviour is fine". If we are implementing a copy-on-write API,
why hamstring it to not work in the expected manner by a normal "cp"?
> > Is there anything about changing the owner/group of the new inode during
> > reflink that makes the implementation more complex? If the process doing
> > the reflink is the same as the file owner then the semantics are unchanged
> > from what you have proposed.
>
> If you define that 'reflink sets the attributes as if it was a
> new file', then you should be creating the file with a new security
> context, not with the security context from the existing inode. And
> then you can't really snapshot.
> A mixed behavior, like "if you own it, I'll preserve the entire
> security context, but if not I will treat it with a new context" is
> confusing at best.
I don't find it confusing. The security context would be inherited from
the creating process, just like creating a new file would. If it is the
same user as the file owner then the security context will be the same.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
next prev parent reply other threads:[~2009-05-05 21:24 UTC|newest]
Thread overview: 151+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-05-03 6:15 [RFC] The reflink(2) system call Joel Becker
2009-05-03 6:15 ` [PATCH 1/3] fs: Document the " Joel Becker
2009-05-03 8:01 ` Christoph Hellwig
2009-05-04 2:46 ` Joel Becker
2009-05-04 6:36 ` Michael Kerrisk
2009-05-04 7:12 ` Joel Becker
2009-05-03 13:08 ` Boaz Harrosh
2009-05-03 23:08 ` Al Viro
2009-05-04 2:49 ` Joel Becker
2009-05-03 23:45 ` Theodore Tso
2009-05-04 1:44 ` Tao Ma
2009-05-04 18:25 ` Joel Becker
2009-05-04 21:18 ` [Ocfs2-devel] " Joel Becker
2009-05-04 22:23 ` Theodore Tso
2009-05-05 6:55 ` Joel Becker
2009-05-05 1:07 ` Jamie Lokier
2009-05-05 7:16 ` Joel Becker
2009-05-05 8:09 ` Andreas Dilger
2009-05-05 16:56 ` Joel Becker
2009-05-05 21:24 ` Andreas Dilger [this message]
2009-05-05 21:32 ` Joel Becker
2009-05-06 7:15 ` [Ocfs2-devel] " Theodore Tso
2009-05-06 14:24 ` jim owens
2009-05-06 14:30 ` jim owens
2009-05-06 17:50 ` jim owens
2009-05-12 19:20 ` Jamie Lokier
2009-05-12 19:30 ` Jamie Lokier
2009-05-12 19:11 ` Jamie Lokier
2009-05-12 19:37 ` jim owens
2009-05-12 20:11 ` Jamie Lokier
2009-05-05 13:01 ` Theodore Tso
2009-05-05 13:19 ` Jamie Lokier
2009-05-05 13:39 ` Chris Mason
2009-05-05 15:36 ` Jamie Lokier
2009-05-05 15:41 ` Chris Mason
2009-05-05 16:03 ` Jamie Lokier
2009-05-05 16:18 ` Chris Mason
2009-05-05 20:48 ` jim owens
2009-05-05 21:57 ` Jamie Lokier
2009-05-05 22:04 ` Joel Becker
2009-05-05 22:11 ` Jamie Lokier
2009-05-05 22:24 ` Joel Becker
2009-05-05 23:14 ` Jamie Lokier
2009-05-05 22:12 ` Jamie Lokier
2009-05-05 22:21 ` Joel Becker
2009-05-05 22:32 ` James Morris
2009-05-05 22:39 ` Joel Becker
2009-05-12 19:40 ` Jamie Lokier
2009-05-05 22:28 ` jim owens
2009-05-05 23:12 ` Jamie Lokier
2009-05-05 16:46 ` Jörn Engel
2009-05-05 16:54 ` Jörn Engel
2009-05-05 22:03 ` Jamie Lokier
2009-05-05 21:44 ` copyfile semantics Andreas Dilger
2009-05-05 21:48 ` Matthew Wilcox
2009-05-05 22:25 ` Trond Myklebust
2009-05-05 22:06 ` Jamie Lokier
2009-05-06 5:57 ` Jörn Engel
2009-05-05 14:21 ` [PATCH 1/3] fs: Document the reflink(2) system call Theodore Tso
2009-05-05 15:32 ` Jamie Lokier
2009-05-05 22:49 ` James Morris
2009-05-05 17:05 ` Joel Becker
2009-05-05 17:00 ` Joel Becker
2009-05-05 17:29 ` Theodore Tso
2009-05-05 22:36 ` Jamie Lokier
2009-05-05 22:30 ` Jamie Lokier
2009-05-05 22:37 ` Joel Becker
2009-05-05 23:08 ` jim owens
2009-05-05 13:01 ` Jamie Lokier
2009-05-05 17:09 ` Joel Becker
2009-05-03 6:15 ` [PATCH 2/3] fs: Add vfs_reflink() and the ->reflink() inode operation Joel Becker
2009-05-03 8:03 ` Christoph Hellwig
2009-05-04 2:51 ` Joel Becker
2009-05-03 6:15 ` [PATCH 3/3] fs: Add the reflink(2) system call Joel Becker
2009-05-03 6:27 ` Matthew Wilcox
2009-05-03 6:39 ` Al Viro
2009-05-03 7:48 ` Christoph Hellwig
2009-05-03 11:16 ` Al Viro
2009-05-04 2:53 ` Joel Becker
2009-05-04 2:53 ` Joel Becker
2009-05-03 8:04 ` Christoph Hellwig
2009-05-07 22:15 ` [RFC] The reflink(2) system call v2 Joel Becker
2009-05-08 1:39 ` James Morris
2009-05-08 1:49 ` Joel Becker
2009-05-08 13:01 ` Tetsuo Handa
2009-05-08 2:59 ` jim owens
2009-05-08 3:10 ` Joel Becker
2009-05-08 11:53 ` jim owens
2009-05-08 12:16 ` jim owens
2009-05-08 14:11 ` jim owens
2009-05-11 20:40 ` [RFC] The reflink(2) system call v4 Joel Becker
2009-05-11 22:27 ` James Morris
2009-05-11 22:34 ` Joel Becker
2009-05-12 1:12 ` James Morris
2009-05-12 12:18 ` Stephen Smalley
2009-05-12 17:22 ` Joel Becker
2009-05-12 17:32 ` Stephen Smalley
2009-05-12 18:03 ` Joel Becker
2009-05-12 18:04 ` Stephen Smalley
2009-05-12 18:28 ` Joel Becker
2009-05-12 18:37 ` Stephen Smalley
2009-05-14 18:06 ` Stephen Smalley
2009-05-14 18:25 ` Stephen Smalley
2009-05-14 23:25 ` James Morris
2009-05-15 11:54 ` Stephen Smalley
2009-05-15 13:35 ` James Morris
2009-05-15 15:44 ` Stephen Smalley
2009-05-13 1:47 ` Casey Schaufler
2009-05-13 16:43 ` Joel Becker
2009-05-13 17:23 ` Stephen Smalley
2009-05-13 18:27 ` Joel Becker
2009-05-12 12:01 ` Stephen Smalley
2009-05-11 23:11 ` jim owens
2009-05-11 23:42 ` Joel Becker
2009-05-12 11:31 ` Jörn Engel
2009-05-12 13:12 ` jim owens
2009-05-12 20:24 ` Jamie Lokier
2009-05-14 18:43 ` Jörn Engel
2009-05-12 15:04 ` Sage Weil
2009-05-12 15:23 ` jim owens
2009-05-12 16:16 ` Sage Weil
2009-05-12 17:45 ` jim owens
2009-05-12 20:29 ` Jamie Lokier
2009-05-12 17:28 ` Joel Becker
2009-05-13 4:30 ` Sage Weil
2009-05-14 3:57 ` Andy Lutomirski
2009-05-14 18:12 ` Stephen Smalley
2009-05-14 22:00 ` Joel Becker
2009-05-15 1:20 ` Jamie Lokier
2009-05-15 12:01 ` Stephen Smalley
2009-05-15 15:22 ` Joel Becker
2009-05-15 15:55 ` Stephen Smalley
2009-05-15 16:42 ` Joel Becker
2009-05-15 17:01 ` Shaya Potter
2009-05-15 20:53 ` [Ocfs2-devel] " Joel Becker
2009-05-18 9:17 ` Jörn Engel
2009-05-18 13:02 ` Stephen Smalley
2009-05-18 14:33 ` Stephen Smalley
2009-05-18 17:15 ` Stephen Smalley
2009-05-18 18:26 ` Joel Becker
2009-05-19 16:32 ` [Ocfs2-devel] " Sage Weil
2009-05-19 19:33 ` Jonathan Corbet
2009-05-19 20:15 ` Jamie Lokier
[not found] ` <20090519132057.419b9de0@bike.lwn.net>
[not found] ` <20090519193244.GB25521@mail.oracle.com>
2009-05-19 19:41 ` Jonathan Corbet
2009-05-28 0:24 ` [RFC] The reflink(2) system call v5 Joel Becker
2009-09-14 22:24 ` Joel Becker
2009-05-11 20:49 ` [RFC] The reflink(2) system call v2 Joel Becker
2009-05-11 22:49 ` jim owens
2009-05-11 23:46 ` Joel Becker
2009-05-12 0:54 ` Chris Mason
2009-05-12 20:36 ` Jamie Lokier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090505212417.GO3209@webber.adilger.int \
--to=adilger@sun.com \
--cc=jamie@shareable.org \
--cc=jmorris@namei.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=ocfs2-devel@oss.oracle.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).