linux-fsdevel.vger.kernel.org archive mirror
From: Joel Becker <Joel.Becker@oracle.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: linux-fsdevel@vger.kernel.org, jmorris@namei.org,
	ocfs2-devel@oss.oracle.com, viro@zeniv.linux.org.uk
Subject: Re: [PATCH 1/3] fs: Document the reflink(2) system call.
Date: Tue, 5 May 2009 00:16:09 -0700
Message-ID: <20090505071608.GB10258@mail.oracle.com>
In-Reply-To: <20090505010703.GA12731@shareable.org>

On Tue, May 05, 2009 at 02:07:03AM +0100, Jamie Lokier wrote:
> Joel Becker wrote:
> > +All file attributes and extended attributes of the new file must
> > +identical to the source file with the following exceptions:
> 
> reflink() sounds useful already, but is there a compelling reason why
> both files must have the same attributes, and changing attributes will
> break the COW?

	Yeah, because without it you can't use it for snapshotting.
That's where the original design came from - inode snapshots.  The big
thing that excited me was that defining reflink() as I did, instead of
a more specific snapshot call, allows all sorts of generic uses (some of
which you outline below).
	If reflink() creates a snapshot, you can then break it to make
things a little different.  But if it changes things, you can never
change them back.

> Being able to have different attributes would allow:
> 
>    - reflink() to be used for fast space-efficient copying, i.e. an
>      optimisation to "cp", "git checkout" and things like that.

	It can right now, just not of other people's files.  Actually,
the only real difficulty with doing it to other people's files is quota.
But I can't come up with a way to prevent quota DoS.
	Here's another fun trick.  An overwriting rsync, instead of
copying blocks from the already-existing source, could reflink the source
to the .temporary, then only write the changed blocks.  And since you own both
files, it just works.  If you're overwriting someone else's file?  The
old copy behavior is fine.
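
	A minimal sketch of that rsync trick, under some loud assumptions.
It uses the FICLONE ioctl from <linux/fs.h> as a stand-in, not the
reflink(2) call proposed in this thread, and FICLONE shares data blocks
only.  The constant below is the x86/ARM encoding of _IOW(0x94, 9, int).
The helper names and the ".tmp" suffix are hypothetical.  When the clone
fails for any reason, it falls back to the old copy behavior.

```python
import fcntl
import os
import shutil

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int), x86/ARM encoding.
# Assumption: a stand-in for the reflink(2) call proposed in this thread.
FICLONE = 0x40049409


def clone_or_copy(src_path, tmp_path):
    """Make tmp_path a copy of src_path, sharing blocks when the fs allows.

    Returns True if the blocks were shared via FICLONE, False if we fell
    back to an ordinary byte copy.
    """
    with open(src_path, "rb") as src, open(tmp_path, "wb") as tmp:
        try:
            fcntl.ioctl(tmp.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # No clone support, cross-device, etc.: plain copy instead.
            shutil.copyfileobj(src, tmp)
            return False


def patch_blocks(tmp_path, changes):
    """Write only the changed (offset, data) pairs into tmp_path."""
    with open(tmp_path, "r+b") as tmp:
        for offset, data in changes:
            tmp.seek(offset)
            tmp.write(data)


def update_in_place(dest_path, changes):
    """Clone the existing file to a temporary, patch it, rename it back."""
    tmp_path = dest_path + ".tmp"  # hypothetical temporary name
    shared = clone_or_copy(dest_path, tmp_path)
    patch_blocks(tmp_path, changes)
    os.replace(tmp_path, dest_path)
    return shared
```

	Calling update_in_place("big.img", [(4096, b"new data")]) rewrites
one region.  On a filesystem with clone support the untouched blocks stay
shared with the old file until it is replaced.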

>    - reflink() to be used for merging files with identical contents
>      (something I find surprisingly often on my disks).
> 
>    - reflink() to be used for merging files from different
>      cgroup-style VMs in particular.

	While it would be great to have a way to do this, reflink() is
not the way.  It's really simple to understand with its link-like
semantic, and I see no point in making it a seven-different-operation
kitchen sink call.

> Requiring all attributes except nlink and ino to be identical makes
> reflink() unsuitable for transparently doing those things, except in
> cases where they happen to have the same attributes anyway.

	We've had a lot of fun thinking up many uses for reflink(), and
almost all of them are within the context of one's own files.

> I'm thinking particularly of file permissions, owner/group and atime.

	People do cp -p all the time.  I don't see how keeping those
things the same will break anything.  It's a new call, not an existing
semantic.

> Since each reflink has its own nlink and ino, I'm wondering why the
> other attributes cannot also be separate.  (I realise extended
> attributes complicate the picture and it's desirable to share them,
> especially if they are large).

	The biggest reason is snapshotting.  The second biggest reason
is a simple-to-understand call.  "Everything is identical except those
things that *have* to be different".

> But is there an efficient way for reflink-aware applications to detect
> these files have the same contents, other than reading the contents
> twice and comparing?  Occasionally that would be good.  E.g. It would
> be nice if "diff -r" could be patched to do that.

	I would think FIEMAP would tell you what you want to know,
wouldn't it?
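
	A rough sketch of what such a FIEMAP check might look like,
assuming the struct layouts and flag values from <linux/fiemap.h> and the
x86/ARM encoding of FS_IOC_FIEMAP.  The helper names are hypothetical.  It
compares the physical extents of two files on the same filesystem.  True
means the two files map the same blocks, so the contents match.  False
only means "can't tell, go read them".  It does not guard against
concurrent writers.

```python
import fcntl
import os
import struct

# Constants and layouts assumed from <linux/fs.h> and <linux/fiemap.h>.
FS_IOC_FIEMAP = 0xC020660B            # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1                # flush dirty data before mapping
FIEMAP_EXTENT_LAST = 0x1
# Extents whose physical address cannot be trusted for comparison.
UNRELIABLE = 0x2 | 0x4 | 0x8 | 0x200  # UNKNOWN, DELALLOC, ENCODED, DATA_INLINE
HEADER = struct.Struct("=QQIIII")     # struct fiemap without fm_extents[]
EXTENT = struct.Struct("=QQQQQIIII")  # struct fiemap_extent
BATCH = 128                           # extents requested per ioctl


def parse_extents(buf):
    """Decode a FIEMAP reply into (logical, physical, length, flags) tuples."""
    mapped = HEADER.unpack_from(buf, 0)[3]  # fm_mapped_extents
    extents = []
    for i in range(mapped):
        fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        extents.append((fields[0], fields[1], fields[2], fields[5]))
    return extents


def file_extents(path):
    """Return every extent of path, asking the kernel in batches."""
    extents, start = [], 0
    with open(path, "rb") as f:
        while True:
            buf = bytearray(HEADER.size + BATCH * EXTENT.size)
            HEADER.pack_into(buf, 0, start, 2**64 - 1,
                             FIEMAP_FLAG_SYNC, 0, BATCH, 0)
            fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf)
            batch = parse_extents(buf)
            extents.extend(batch)
            if not batch or batch[-1][3] & FIEMAP_EXTENT_LAST:
                return extents
            start = batch[-1][0] + batch[-1][2]  # resume after last extent


def normalize(extents):
    """Merge extents that are contiguous both logically and physically."""
    merged = []
    for logical, physical, length, _flags in extents:
        if merged:
            last_l, last_p, last_len = merged[-1]
            if last_l + last_len == logical and last_p + last_len == physical:
                merged[-1] = (last_l, last_p, last_len + length)
                continue
        merged.append((logical, physical, length))
    return merged


def same_layout(ext_a, ext_b):
    """True only if both extent lists are trustworthy and map identically."""
    if any(e[3] & UNRELIABLE for e in ext_a + ext_b):
        return False
    return normalize(ext_a) == normalize(ext_b)


def same_blocks(path_a, path_b):
    """True: same blocks, hence same contents.  False: unknown, go read."""
    sa, sb = os.stat(path_a), os.stat(path_b)
    if sa.st_dev != sb.st_dev or sa.st_size != sb.st_size:
        return False
    try:
        return same_layout(file_extents(path_a), file_extents(path_b))
    except OSError:  # e.g. a filesystem without FIEMAP support
        return False
```

	A patched "diff -r" could call same_blocks(a, b) first and only
fall back to reading both files when it returns False.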

> > +- The ctime of the source file only changes if the source's metadata
> > +  must be changed to accommodate the copy-on-write linkage.  The ctime of
> > +  the new file is set to represent its creation.
> 
> What change to the source metadata would require ctime to change?

	ocfs2 flags all extents in the source file with a "this is now
shared, go check the reference count before writing" flag if they don't
have it already.  I'd call that a metadata update.

> > +- The link count of the source file is unchanged, and the link count of
> > +  the new file is one.
> 
> Can you hard link to the source file and the reflink afterwards,
> incrementing the reflink's link count?  (I presume yes).  Can you
> reflink to both of them too?

	Yes, absolutely.  Once reflinked, they look like two separate
POSIX files.

Joel

-- 

"Depend on the rabbit's foot if you will, but remember, it didn't
 help the rabbit."
	- R. E. Shay

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
