linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andreas Dilger <adilger@sun.com>
To: Jamie Lokier <jamie@shareable.org>,
	linux-fsdevel@vger.kernel.org, jmorris@namei.org,
	ocfs2-devel@oss.oracle.com, viro@zeniv.linux.org.uk
Subject: Re: [PATCH 1/3] fs: Document the reflink(2) system call.
Date: Tue, 05 May 2009 15:24:17 -0600	[thread overview]
Message-ID: <20090505212417.GO3209@webber.adilger.int> (raw)
In-Reply-To: <20090505165628.GC7835@mail.oracle.com>

On May 05, 2009  09:56 -0700, Joel Becker wrote:
> On Tue, May 05, 2009 at 02:09:36AM -0600, Andreas Dilger wrote:
> > If the reflink caller is always charged for the full space used (as if
> > it were a real copy) by virtue of the user doing the reflink() owning the
> > new inode.  Doing anything else seems broken.  If the owner of the file
> > wasn't charged for the reflink's quota then if the reflink inode was
> > chowned the new owner would be charged for the new file, but the quota
> > code would have to special case the decrement of EACH of the reflink's
> > blocks because otherwise the original owner might "release" quota that
> > it was never originally charged.
> 
>  If the caller is creating an inode in someone else's name, then
> who do you charge for the quota?

IMHO, it shouldn't be possible to create an inode in someone else's
name (CAP_* excluded), just like it isn't possible to create a new
file in someone elses name.  The caller of reflink() should be the
one creating the file, hence the owner of the file, and the owner of
the quota.

> If you charge the caller, how do you know to decrement the caller's
> quota when the actual owner does truncate, given that the inode has
> no knowledge of the caller anymore.

No, if the owner of the inode (== caller) is charged the quota then
when the inode is truncated (regardless of who does the truncate)
the quota will just work correctly.

> 	You've hit the nail on the head - without backrefs for each
> refcounted hunk, you can't figure out who it owns it from a quota
> perspective.  And that's just a non-starter to try and maintain.

No, I don't think my proposal is _more_ complex than the original.
It is actually _less_ complex, because the fact that this is a reflink
and not a complete file copy is a purely internal detail of the filesystem
and is not exposed outside the filesystem.  The fact that a reflink
consumes less space and is faster than a real copy is an implementation
detail, not really any different than if the file were compressed by
the filesystem internally.

> > > 	Here's another fun trick.  Overwriting rsync, instead of copying
> > > blocks from the already-existing source could reflink the source to the
> > > .temporary, then only write the changed blocks.  And since you own both
> > > files, it just works.  If you're overwriting someone else's file?  The
> > > old copy behavior is fine.
> > 
> > Well, "fine" as in it works, but if there are only a few changed blocks,
> > and the old copy is now part of a snapshot (so it won't be released when
> > rsync is finished) the space consumption has doubled instead of just
> > using a few extra blocks.
> 
> 	No, because the last thing rsync will do is rename(.temporary,
> source).  All the references from the source will be decremented, and
> any blocks only owned by the source will be freed.  Space usage is
> identical before and after, like a copying rsync, but there is less
> space used and less I/O done during the rsync process.

What I was objecting to is "when overwriting someone elses file, the old
copy behaviour is fine".  If we are implementing a copy-on-write API,
why hamstring it to not work in the expected manner by a normal "cp"?

> > Is there anything about changing the owner/group of the new inode during
> > reflink that makes the implementation more complex?  If the process doing
> > the reflink is the same as the file owner then the semantics are unchanged
> > from what you have proposed.
> 
> 	If you define that 'reflink sets the attributes as if it was a
> new file', then you should be creating the file with a new security
> context, not with the security context from the existing inode.  And
> then you can't really snapshot.
> 	A mixed behavior, like "if you own it, I'll preserve the entire
> security context, but if not I will treat it with a new context" is
> confusing at best.

I don't find it confusing.  The security context would be inherited from
the creating process, just like creating a new file would.  If it is the
same user as the file owner then the security context will be the same.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


  reply	other threads:[~2009-05-05 21:24 UTC|newest]

Thread overview: 151+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-03  6:15 [RFC] The reflink(2) system call Joel Becker
2009-05-03  6:15 ` [PATCH 1/3] fs: Document the " Joel Becker
2009-05-03  8:01   ` Christoph Hellwig
2009-05-04  2:46     ` Joel Becker
2009-05-04  6:36       ` Michael Kerrisk
2009-05-04  7:12         ` Joel Becker
2009-05-03 13:08   ` Boaz Harrosh
2009-05-03 23:08     ` Al Viro
2009-05-04  2:49     ` Joel Becker
2009-05-03 23:45   ` Theodore Tso
2009-05-04  1:44     ` Tao Ma
2009-05-04 18:25       ` Joel Becker
2009-05-04 21:18         ` [Ocfs2-devel] " Joel Becker
2009-05-04 22:23           ` Theodore Tso
2009-05-05  6:55             ` Joel Becker
2009-05-05  1:07   ` Jamie Lokier
2009-05-05  7:16     ` Joel Becker
2009-05-05  8:09       ` Andreas Dilger
2009-05-05 16:56         ` Joel Becker
2009-05-05 21:24           ` Andreas Dilger [this message]
2009-05-05 21:32             ` Joel Becker
2009-05-06  7:15               ` [Ocfs2-devel] " Theodore Tso
2009-05-06 14:24                 ` jim owens
2009-05-06 14:30                   ` jim owens
2009-05-06 17:50                     ` jim owens
2009-05-12 19:20                       ` Jamie Lokier
2009-05-12 19:30                       ` Jamie Lokier
2009-05-12 19:11                   ` Jamie Lokier
2009-05-12 19:37                     ` jim owens
2009-05-12 20:11                       ` Jamie Lokier
2009-05-05 13:01       ` Theodore Tso
2009-05-05 13:19         ` Jamie Lokier
2009-05-05 13:39           ` Chris Mason
2009-05-05 15:36             ` Jamie Lokier
2009-05-05 15:41               ` Chris Mason
2009-05-05 16:03                 ` Jamie Lokier
2009-05-05 16:18                   ` Chris Mason
2009-05-05 20:48                   ` jim owens
2009-05-05 21:57                     ` Jamie Lokier
2009-05-05 22:04                       ` Joel Becker
2009-05-05 22:11                         ` Jamie Lokier
2009-05-05 22:24                           ` Joel Becker
2009-05-05 23:14                             ` Jamie Lokier
2009-05-05 22:12                         ` Jamie Lokier
2009-05-05 22:21                           ` Joel Becker
2009-05-05 22:32                             ` James Morris
2009-05-05 22:39                               ` Joel Becker
2009-05-12 19:40                               ` Jamie Lokier
2009-05-05 22:28                         ` jim owens
2009-05-05 23:12                           ` Jamie Lokier
2009-05-05 16:46               ` Jörn Engel
2009-05-05 16:54                 ` Jörn Engel
2009-05-05 22:03                   ` Jamie Lokier
2009-05-05 21:44                 ` copyfile semantics Andreas Dilger
2009-05-05 21:48                   ` Matthew Wilcox
2009-05-05 22:25                     ` Trond Myklebust
2009-05-05 22:06                   ` Jamie Lokier
2009-05-06  5:57                   ` Jörn Engel
2009-05-05 14:21           ` [PATCH 1/3] fs: Document the reflink(2) system call Theodore Tso
2009-05-05 15:32             ` Jamie Lokier
2009-05-05 22:49             ` James Morris
2009-05-05 17:05           ` Joel Becker
2009-05-05 17:00         ` Joel Becker
2009-05-05 17:29           ` Theodore Tso
2009-05-05 22:36             ` Jamie Lokier
2009-05-05 22:30           ` Jamie Lokier
2009-05-05 22:37             ` Joel Becker
2009-05-05 23:08             ` jim owens
2009-05-05 13:01       ` Jamie Lokier
2009-05-05 17:09         ` Joel Becker
2009-05-03  6:15 ` [PATCH 2/3] fs: Add vfs_reflink() and the ->reflink() inode operation Joel Becker
2009-05-03  8:03   ` Christoph Hellwig
2009-05-04  2:51     ` Joel Becker
2009-05-03  6:15 ` [PATCH 3/3] fs: Add the reflink(2) system call Joel Becker
2009-05-03  6:27   ` Matthew Wilcox
2009-05-03  6:39     ` Al Viro
2009-05-03  7:48       ` Christoph Hellwig
2009-05-03 11:16         ` Al Viro
2009-05-04  2:53       ` Joel Becker
2009-05-04  2:53     ` Joel Becker
2009-05-03  8:04   ` Christoph Hellwig
2009-05-07 22:15 ` [RFC] The reflink(2) system call v2 Joel Becker
2009-05-08  1:39   ` James Morris
2009-05-08  1:49     ` Joel Becker
2009-05-08 13:01       ` Tetsuo Handa
2009-05-08  2:59   ` jim owens
2009-05-08  3:10     ` Joel Becker
2009-05-08 11:53       ` jim owens
2009-05-08 12:16       ` jim owens
2009-05-08 14:11         ` jim owens
2009-05-11 20:40       ` [RFC] The reflink(2) system call v4 Joel Becker
2009-05-11 22:27         ` James Morris
2009-05-11 22:34           ` Joel Becker
2009-05-12  1:12             ` James Morris
2009-05-12 12:18               ` Stephen Smalley
2009-05-12 17:22                 ` Joel Becker
2009-05-12 17:32                   ` Stephen Smalley
2009-05-12 18:03                     ` Joel Becker
2009-05-12 18:04                       ` Stephen Smalley
2009-05-12 18:28                         ` Joel Becker
2009-05-12 18:37                           ` Stephen Smalley
2009-05-14 18:06                         ` Stephen Smalley
2009-05-14 18:25                           ` Stephen Smalley
2009-05-14 23:25                             ` James Morris
2009-05-15 11:54                               ` Stephen Smalley
2009-05-15 13:35                                 ` James Morris
2009-05-15 15:44                                   ` Stephen Smalley
2009-05-13  1:47                       ` Casey Schaufler
2009-05-13 16:43                         ` Joel Becker
2009-05-13 17:23                           ` Stephen Smalley
2009-05-13 18:27                             ` Joel Becker
2009-05-12 12:01           ` Stephen Smalley
2009-05-11 23:11         ` jim owens
2009-05-11 23:42           ` Joel Becker
2009-05-12 11:31         ` Jörn Engel
2009-05-12 13:12           ` jim owens
2009-05-12 20:24             ` Jamie Lokier
2009-05-14 18:43             ` Jörn Engel
2009-05-12 15:04         ` Sage Weil
2009-05-12 15:23           ` jim owens
2009-05-12 16:16             ` Sage Weil
2009-05-12 17:45               ` jim owens
2009-05-12 20:29                 ` Jamie Lokier
2009-05-12 17:28           ` Joel Becker
2009-05-13  4:30             ` Sage Weil
2009-05-14  3:57         ` Andy Lutomirski
2009-05-14 18:12           ` Stephen Smalley
2009-05-14 22:00             ` Joel Becker
2009-05-15  1:20               ` Jamie Lokier
2009-05-15 12:01               ` Stephen Smalley
2009-05-15 15:22                 ` Joel Becker
2009-05-15 15:55                   ` Stephen Smalley
2009-05-15 16:42                     ` Joel Becker
2009-05-15 17:01                       ` Shaya Potter
2009-05-15 20:53                       ` [Ocfs2-devel] " Joel Becker
2009-05-18  9:17                         ` Jörn Engel
2009-05-18 13:02                         ` Stephen Smalley
2009-05-18 14:33                           ` Stephen Smalley
2009-05-18 17:15                             ` Stephen Smalley
2009-05-18 18:26                           ` Joel Becker
2009-05-19 16:32                             ` [Ocfs2-devel] " Sage Weil
2009-05-19 19:33                         ` Jonathan Corbet
2009-05-19 20:15                           ` Jamie Lokier
     [not found]                         ` <20090519132057.419b9de0@bike.lwn.net>
     [not found]                           ` <20090519193244.GB25521@mail.oracle.com>
2009-05-19 19:41                             ` Jonathan Corbet
2009-05-28  0:24         ` [RFC] The reflink(2) system call v5 Joel Becker
2009-09-14 22:24         ` Joel Becker
2009-05-11 20:49     ` [RFC] The reflink(2) system call v2 Joel Becker
2009-05-11 22:49       ` jim owens
2009-05-11 23:46         ` Joel Becker
2009-05-12  0:54           ` Chris Mason
2009-05-12 20:36           ` Jamie Lokier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090505212417.GO3209@webber.adilger.int \
    --to=adilger@sun.com \
    --cc=jamie@shareable.org \
    --cc=jmorris@namei.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=ocfs2-devel@oss.oracle.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).