From: ebiederm@xmission.com (Eric W. Biederman)
To: Jamie Lokier <jamie@shareable.org>
Cc: "Jörn Engel" <joern@wohnheim.fh-wedel.de>,
	"Davide Libenzi" <davidel@xmailserver.org>,
	"Patrick J. LoPresti" <patl@users.sourceforge.net>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] cowlinks v2
Date: 28 Mar 2004 18:31:34 -0700
Message-ID: <m1zna0tp55.fsf@ebiederm.dsl.xmission.com>
In-Reply-To: <20040328235528.GA2693@mail.shareable.org>

Jamie Lokier <jamie@shareable.org> writes:

> Eric W. Biederman wrote:
> > All of which works great until you have a file that has one hard link
> > in your cow directory structure and another hard link outside of
> > any cow.  An application can come in and modify the file through that
> > second cow link causing problems.
> 
> I don't see that problem (although I see another, see below).  The
> application will modify only one instance of the file, and it's the
> correct instance.  If the application writes through the link outside
> both trees, or the link inside the original tree, it will only affect
> the tree that was cowlinked _from_, which is correct.  If the
> application writes to the name inside the snapshot tree, it will only
> affect that tree, which is also correct.

What I see is a race.  An application may write through the link outside
both trees before any of the links is marked cow, with the result
that you don't have a snapshot of your data.
 
> You cowlinked a directory.  That converts the original directory inode
> to a cowlink, creates another cowlink, and creates a shared inode
> which now contains the directory.
> 
> Then you modify the directory or anything below it.  That duplicates
> the directory, breaking the directory cowlinks and duplicating the
> shared directory inode -- so that the two directory cowlink inodes
> become normal directory inodes.  The directory duplication results in
> two directories which are full of cowlinks -- every object in the
> original directory is cowlinked by this operation.
> 
> A file which was originally hard-linked inside the tree and also
> outside both trees retains the correct hard-link identity: the hard
> link is simply two directory entries referring to the same inode,
> which at all times is the inode visible inside the original tree and
> not visible inside the snapshot, cowlinked tree.  That inode changes
> its underlying representation from file-inode to cowlink-inode
> (pointing to a shared file-inode) and back again during these
> operations.  However, the hard link identity remains correct at all
> times.  Writing to a file won't ever modify the wrong file.

Correct to a point, and we seem to be imagining the same operations.
However, while you will always modify the correct file, because the
metadata is correct, there is no guarantee that the data will be
correct.  The file will become a cow file only after it is modified or
its containing directory is modified.  Thus you can have data in the
file that was written after the snapshot operation finished, but before
the individual file itself is marked cow.
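
To make the race concrete, here is a toy user-space model in Python.
It is only a sketch under stated assumptions, not the cowlinks patch:
the Inode class, its "frozen" field, and snapshot_then_write() are all
hypothetical names invented for illustration, and a whole file is
reduced to a single string.

```python
class Inode:
    """Toy inode: live data plus the copy a snapshot sees once marked cow."""

    def __init__(self, data):
        self.data = data      # seen via the original tree and any outside hard link
        self.frozen = None    # seen via the snapshot after mark_cow()

    def mark_cow(self):
        # Freeze the current contents for the snapshot (idempotent).
        if self.frozen is None:
            self.frozen = self.data

    def write(self, data):
        self.data = data

    def snapshot_view(self):
        # Until the file is marked cow, the snapshot still shares live data.
        return self.data if self.frozen is None else self.frozen


def snapshot_then_write(eager):
    inode = Inode("v1")
    tree = {"file": inode}    # link inside the cowlinked tree
    outside = inode           # second hard link, outside any cow tree

    # "Snapshot" the tree. The eager variant marks every file cow now;
    # the lazy variant defers that until the directory is first modified.
    if eager:
        for i in tree.values():
            i.mark_cow()

    outside.write("v2")       # write arrives after the snapshot returned

    # Later, the directory is modified and the lazy marking finally happens.
    for i in tree.values():
        i.mark_cow()

    return tree["file"].snapshot_view()
```

In this model snapshot_then_write(eager=False) returns "v2": the
snapshot holds data written after the snapshot operation finished.
With eager=True it returns "v1", at the price of touching every file
up front.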

> There is a different problem, though: cowlinking whole trees like that
> doesn't preserve hard-linkage _within_ the tree being copied.

> I see a different problem: the equivalent of something
> semantically equivalent to "cp -pr" is fine and fast, but "cp -dpr"
> (aka. "cp -a") must, unless it's quite complicated with filesystem
> metadata, duplicate the whole directories immediately rather than
> lazily, or at least scan them looking for hard links.

Now that you bring it up I see that problem as well.  I have seen it
in other proposed implementations too.  Keeping hard links linked
requires some amount of context to be maintained for the entire
copy operation.  If necessary you could keep that context where it is
available to the lazy copy, but it is far from trivial.
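
As a rough illustration of the context an eager copy has to carry,
here is a minimal user-space sketch in Python.  The function name
copy_tree_preserving_links() and the "seen" table are assumptions made
for this example; it handles only regular files and symlinks to files,
and it is not how cp or any cowlink proposal is actually implemented.

```python
import os
import shutil


def copy_tree_preserving_links(src, dst):
    """Copy src to dst, recreating hard links that exist within src."""
    # The per-copy context: (device, inode) of each source file already
    # copied, mapped to the destination path that now holds its data.
    seen = {}
    for dirpath, dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            source = os.path.join(dirpath, name)
            target = os.path.join(target_dir, name)
            st = os.lstat(source)
            key = (st.st_dev, st.st_ino)
            if st.st_nlink > 1 and key in seen:
                # Same inode met earlier in this copy: link, don't copy.
                os.link(seen[key], target)
            else:
                shutil.copy2(source, target, follow_symlinks=False)
                seen[key] = target
    return seen
```

The table has to live for the whole walk, because the second name of a
hard-linked file can turn up in any directory.  That is exactly the
state a lazy, per-directory copy has nowhere obvious to keep.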

In short, lazy copying for creating snapshots is dangerous.  The
data you are copying may be modified before you are done, and it is
difficult to maintain state across the entire copy.

All of which sounds like a job for user space to me.

Eric

