From: Gabriel de Perthuis <g2p.code@gmail.com>
To: Hugo Mills <hugo@carfax.org.uk>
Cc: Linux Btrfs <linux-btrfs@vger.kernel.org>,
	Jerome Haltom <wasabi@cogito.cx>
Subject: Re: Q: Why subvolumes?
Date: Tue, 23 Jul 2013 19:47:41 +0200
Message-ID: <51EEC1BD.9030001@gmail.com>
In-Reply-To: <20130723150620.GG20517@carfax.org.uk>

>    Now... since the snapshot's FS tree is a direct duplicate of the
> original FS tree (actually, it's the same tree, but they look like
> different things to the outside world), they share everything --
> including things like inode numbers. This is OK within a subvolume,
> because we have the semantics that subvolumes have their own distinct
> inode-number spaces. If we could snapshot arbitrary subsections of the
> FS, we'd end up having to fix up inode numbers to ensure that they
> were unique -- which can't really be an atomic operation (unless you
> want to have the FS locked while the kernel updates the inodes of the
> billion files you just snapshotted).

I don't think so; I just checked some snapshots, and the inode numbers
are the same as in the original. Btrfs just gives each subvolume a
different device ID (st_dev); somehow the VFS allows this.
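
A quick way to repeat that check from userspace is to stat a file and
its counterpart inside a snapshot and compare the two IDs. This is only
a sketch: compare_ids is a made-up helper name and the paths in the
usage note are hypothetical.

```python
import os

def compare_ids(original, snapshot_copy):
    """Return (same_inode, same_device) for two paths."""
    a = os.stat(original)
    b = os.stat(snapshot_copy)
    # st_ino is the inode number; st_dev is the device ID that
    # stat(2) reports for the filesystem (or subvolume) holding it.
    return a.st_ino == b.st_ino, a.st_dev == b.st_dev
```

For a file and its copy in a Btrfs snapshot, e.g.
compare_ids("/mnt/vol/file", "/mnt/vol/snap/file"), I would expect
(True, False), going by the check described above.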

>    The other thing to talk about here is that while the FS tree is a
> tree structure, it's not a direct one-to-one map to the directory tree
> structure. In fact, it looks more like a list of inodes, in inode
> order, with some extra info for easily tracking through the list. The
> B-tree structure of the FS tree is just a fast indexing method. So
> snapshotting a directory entry within the FS tree would require
> (somehow) making an atomic copy, or CoW copy, of only the parts of the
> FS tree that fall under the directory in question -- so you'd end up
> trying to take a sequence of records in the FS tree, of arbitrary size
> (proportional roughly to the number of entries in the directory) and
> copying them to somewhere else in the same tree in such a way that you
> can automatically dereference the copies when you modify them. So,
> ultimately, it boils down to being able to do CoW operations at the
> byte level, which is going to introduce huge quantities of extra
> metadata, and it all starts looking really awkward to implement (plus
> having to deal with the long time taken to copy the directory entries
> for the thing you're snapshotting).

Btrfs already does CoW of arbitrarily large files (extent lists);
doing the same for directories doesn't seem impossible.
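
As a toy illustration of CoW on an extent list (not Btrfs's actual data
structures; clone and cow_write are hypothetical names, and the sketch
says nothing about how directories would be handled): a clone shares
every extent with the original, and a write replaces only the extent it
touches.

```python
def clone(extents):
    # The clone gets a new list, but every entry still refers to the
    # same extent object as the original; no data is copied.
    return list(extents)

def cow_write(extents, index, data):
    # Build a new extent list in which only the written extent is
    # replaced; all untouched extents remain shared with the old list.
    new = list(extents)
    new[index] = bytes(data)
    return new

original = [b"aaaa", b"bbbb", b"cccc"]
snap = clone(original)
modified = cow_write(original, 1, b"BBBB")
```

Here snap still sees the old contents of extent 1, while modified
shares extents 0 and 2 with original and differs only in extent 1.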

