Re: Q: Why subvolumes? - Gabriel de Perthuis

From: Gabriel de Perthuis <g2p.code@gmail.com>
To: Hugo Mills <hugo@carfax.org.uk>
Cc: Linux Btrfs <linux-btrfs@vger.kernel.org>,
	Jerome Haltom <wasabi@cogito.cx>
Subject: Re: Q: Why subvolumes?
Date: Tue, 23 Jul 2013 19:47:41 +0200
Message-ID: <51EEC1BD.9030001@gmail.com>
In-Reply-To: <20130723150620.GG20517@carfax.org.uk>

>    Now... since the snapshot's FS tree is a direct duplicate of the
> original FS tree (actually, it's the same tree, but they look like
> different things to the outside world), they share everything --
> including things like inode numbers. This is OK within a subvolume,
> because we have the semantics that subvolumes have their own distinct
> inode-number spaces. If we could snapshot arbitrary subsections of the
> FS, we'd end up having to fix up inode numbers to ensure that they
> were unique -- which can't really be an atomic operation (unless you
> want to have the FS locked while the kernel updates the inodes of the
> billion files you just snapshotted).

I don't think so; I just checked some snapshots, and the inode numbers
are the same as in the originals. Btrfs just gives each subvolume a
different dev_id (the st_dev that stat reports); somehow the VFS allows this.
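
For anyone who wants to repeat the check, here is a minimal sketch. The
helper name compare_identity is made up for illustration, and the paths
you pass in are your own; it only relies on the st_ino and st_dev fields
that os.lstat returns.

```python
import os


def compare_identity(path_a, path_b):
    """Return (same_ino, same_dev) for two paths.

    lstat is used so that symlinks are compared as themselves
    rather than as their targets.
    """
    a = os.lstat(path_a)
    b = os.lstat(path_b)
    return a.st_ino == b.st_ino, a.st_dev == b.st_dev
```

Pointing it at a file in a subvolume and at the same file inside a
snapshot of that subvolume should, if the observation above holds on your
system, give a matching inode number and a differing device ID.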

>    The other thing to talk about here is that while the FS tree is a
> tree structure, it's not a direct one-to-one map to the directory tree
> structure. In fact, it looks more like a list of inodes, in inode
> order, with some extra info for easily tracking through the list. The
> B-tree structure of the FS tree is just a fast indexing method. So
> snapshotting a directory entry within the FS tree would require
> (somehow) making an atomic copy, or CoW copy, of only the parts of the
> FS tree that fall under the directory in question -- so you'd end up
> trying to take a sequence of records in the FS tree, of arbitrary size
> (proportional roughly to the number of entries in the directory) and
> copying them to somewhere else in the same tree in such a way that you
> can automatically dereference the copies when you modify them. So,
> ultimately, it boils down to being able to do CoW operations at the
> byte level, which is going to introduce huge quantities of extra
> metadata, and it all starts looking really awkward to implement (plus
> having to deal with the long time taken to copy the directory entries
> for the thing you're snapshotting).

Btrfs already does CoW of arbitrarily large files (extent lists);
doing the same for directories doesn't seem impossible.
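
To make the extent-list idea concrete, here is a toy model in Python. It
is not btrfs's on-disk format or kernel code; Extent and ToyFile are
hypothetical names, and the flat Python list stands in for whatever index
structure a real filesystem would use. It only shows that a clone can
share every extent and diverge one extent at a time on write.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Extent:
    """An immutable chunk of data; it is never modified in place."""
    data: bytes


@dataclass
class ToyFile:
    """A file modelled as an ordered list of references to extents."""
    extents: list = field(default_factory=list)

    def clone(self):
        # Copy only the list of references; the Extent objects are shared.
        return ToyFile(list(self.extents))

    def write_extent(self, index, data):
        # Copy-on-write: point this file at a fresh extent and leave the
        # old one untouched for any other file that still references it.
        self.extents[index] = Extent(data)

    def read(self):
        return b"".join(e.data for e in self.extents)
```

In this toy the clone still copies the reference list, so it costs time
proportional to the number of extents; a tree-structured index could
share whole subtrees instead.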
