From: Gabriel de Perthuis <g2p.code@gmail.com>
To: Hugo Mills <hugo@carfax.org.uk>
Cc: Linux Btrfs <linux-btrfs@vger.kernel.org>,
    Jerome Haltom <wasabi@cogito.cx>
Subject: Re: Q: Why subvolumes?
Date: Tue, 23 Jul 2013 19:47:41 +0200
Message-ID: <51EEC1BD.9030001@gmail.com>
In-Reply-To: <20130723150620.GG20517@carfax.org.uk>
> Now... since the snapshot's FS tree is a direct duplicate of the
> original FS tree (actually, it's the same tree, but they look like
> different things to the outside world), they share everything --
> including things like inode numbers. This is OK within a subvolume,
> because we have the semantics that subvolumes have their own distinct
> inode-number spaces. If we could snapshot arbitrary subsections of the
> FS, we'd end up having to fix up inode numbers to ensure that they
> were unique -- which can't really be an atomic operation (unless you
> want to have the FS locked while the kernel updates the inodes of the
> billion files you just snapshotted).
I don't think so; I just checked some snapshots, and the inode numbers
are the same as in the original subvolume. Btrfs just gives each
subvolume a different device ID (st_dev), which the VFS somehow allows.
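
For anyone who wants to repeat the check, here is a minimal sketch in
Python. It only uses os.stat; the helper names are made up for this
example, and the two paths you pass in (a file in a subvolume and the
same file inside a snapshot of it) are up to you. The expectation, based
on the observation above, is equal st_ino and differing st_dev.

```python
import os
import sys


def file_identity(path):
    """Return the (st_dev, st_ino) pair that userspace sees for a path."""
    st = os.stat(path, follow_symlinks=False)
    return st.st_dev, st.st_ino


def same_ino_other_dev(original, snapshot_copy):
    """True if both paths report the same inode number on different devices."""
    dev_a, ino_a = file_identity(original)
    dev_b, ino_b = file_identity(snapshot_copy)
    return ino_a == ino_b and dev_a != dev_b


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical paths): script.py /subvol/file /snapshot/file
    a, b = sys.argv[1], sys.argv[2]
    print(file_identity(a), file_identity(b), same_ino_other_dev(a, b))
```

Tools that detect hard links or already-visited files usually compare the
whole (st_dev, st_ino) pair, which is why a repeated inode number across
subvolumes does not confuse them.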
> The other thing to talk about here is that while the FS tree is a
> tree structure, it's not a direct one-to-one map to the directory tree
> structure. In fact, it looks more like a list of inodes, in inode
> order, with some extra info for easily tracking through the list. The
> B-tree structure of the FS tree is just a fast indexing method. So
> snapshotting a directory entry within the FS tree would require
> (somehow) making an atomic copy, or CoW copy, of only the parts of the
> FS tree that fall under the directory in question -- so you'd end up
> trying to take a sequence of records in the FS tree, of arbitrary size
> (proportional roughly to the number of entries in the directory) and
> copying them to somewhere else in the same tree in such a way that you
> can automatically dereference the copies when you modify them. So,
> ultimately, it boils down to being able to do CoW operations at the
> byte level, which is going to introduce huge quantities of extra
> metadata, and it all starts looking really awkward to implement (plus
> having to deal with the long time taken to copy the directory entries
> for the thing you're snapshotting).
Btrfs already does CoW of arbitrarily large files (extent lists);
doing the same for directories doesn't seem impossible.
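
To illustrate the file case from userspace, here is a rough sketch of a
reflink copy using the FICLONE ioctl, which asks the filesystem to make
the destination share the source's extents instead of copying data. The
numeric value of FICLONE below is an assumption (it matches
_IOW(0x94, 9, int) on x86 and ARM Linux), the function name is made up
for this example, and the fallback to a plain copy is only there so the
sketch also runs on filesystems without reflink support.

```python
import fcntl
import shutil

# Assumption: FICLONE from <linux/fs.h> on x86/ARM, i.e. _IOW(0x94, 9, int).
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Try to reflink src to dst; fall back to an ordinary copy.

    Returns True if the filesystem shared the extents (CoW clone),
    False if the data had to be copied byte by byte.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # On success the destination references the same extents.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # Not supported here (other filesystem, other OS, etc.).
            pass
    shutil.copyfile(src_path, dst_path)
    return False
```

On Btrfs the clone should complete without reading the file's data,
whatever its size; later writes to either file then go through the usual
CoW path.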