From: David Pottage <david@chrestomanci.org>
To: Marc MERLIN <marc@merlins.org>, linux-btrfs@vger.kernel.org
Subject: Re: btrfs snapshot sizes
Date: Fri, 09 May 2014 08:42:22 +0100
Message-ID: <536C86DE.9090003@chrestomanci.org>
In-Reply-To: <20140507111949.GT10159@merlins.org>
On 07/05/14 12:19, Marc MERLIN wrote:
> So have others found a good way to have an idea about how much space is
> taken by each snapshot?
>
> I've tried quota trees, but I'm not sure how to read the output, or if it's
> correct (including the negative numbers some have mentioned). Are there
> other options?
>
> I think the main problem is that the shared data field is not working,
> making it harder to know which blocks are only used in a given snapshot.
In my understanding (devs please correct me if I am wrong), a snapshot
is just a subvolume that happens to share a lot of data with another
subvolume. The idea of taking regular snapshots to preserve the state of
the filing system at a point in time is a userland concept. From the
kernel's point of view the user has asked for a clone of a subvolume,
and both copies are equal. What the user does with one or other clone
after that is their affair.
This means that if you have a subvolume representing your home
directory that contains around 1 GB of data, and you take daily
snapshots, asking the kernel how big each snapshot is will not give the
answer you expect. They all contain roughly 1 GB.
The better question is how two subvolumes compare (e.g. the current
/home and a snapshot taken of it last week): how much data differs
between the two? Depending on how you count, the "size" of the snapshot
will be either the total amount of data that is not shared, or just the
data that is in the snapshot but not in the base.
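To make the two ways of counting concrete, here is a toy sketch in
Python. It does not query btrfs at all: each subvolume is modelled as a
dict from a made-up extent ID to a size in bytes, and all names and
numbers are assumptions chosen purely for illustration.

```python
# Toy model only: extent IDs and sizes are invented, not btrfs output.
MIB = 1024 * 1024

# Each subvolume maps a hypothetical extent ID to its size in bytes.
home = {"a": 400 * MIB, "b": 300 * MIB, "d": 200 * MIB}
snapshot_last_week = {"a": 400 * MIB, "b": 300 * MIB, "c": 300 * MIB}


def only_in(first, second):
    """Bytes in extents referenced by `first` but not by `second`."""
    return sum(size for ext, size in first.items() if ext not in second)


def not_shared(first, second):
    """Bytes in extents referenced by exactly one of the two subvolumes."""
    return only_in(first, second) + only_in(second, first)


# Count 1: data that is in the snapshot but not in the base.
snapshot_only = only_in(snapshot_last_week, home)
# Count 2: all data that is not shared between the two.
total_unshared = not_shared(snapshot_last_week, home)

print(snapshot_only // MIB, "MiB only in the snapshot")
print(total_unshared // MIB, "MiB not shared in total")
```

In this model both subvolumes reference about 1 GB, yet the snapshot
"costs" 300 MiB by the first count and the pair differs by 500 MiB by
the second.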
The thing is, I don't think there is an easy way to get a report of the
amount of non-shared data without walking the file trees of both
subvolumes and building a large data structure of inodes or suchlike.
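As a rough illustration of what such a walk might look like, the Python
sketch below indexes both trees and adds up the files in the snapshot
that are missing from, or look different in, the base. It is only a
crude file-level proxy: it compares size and modification time rather
than inspecting extents, so it will miscount partially modified files
and reflinked copies. The function names are hypothetical.

```python
import os
import stat


def index_tree(root):
    """Map each regular file's path relative to `root` to (size, mtime_ns)."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets and other special files
            index[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return index


def bytes_probably_unshared(snapshot_root, base_root):
    """Sum sizes of snapshot files that are absent or changed in the base.

    Heuristic only: a file counts as shared when its relative path,
    size and mtime all match. No extent information is consulted.
    """
    base = index_tree(base_root)
    total = 0
    for rel, signature in index_tree(snapshot_root).items():
        if base.get(rel) != signature:
            total += signature[0]
    return total
```

It would be called with the two mount paths, for example the snapshot
directory first and the current /home second. The in-memory index grows
with the number of files, which is the "large data structure" cost
mentioned above.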
Measuring the size of snapshots gets even thornier when you take
many snapshots. For example, suppose you take one every hour, and you
have just deleted a large file. All your old hourly snapshots will
contain a reference to that large file, but the data will only be on
disc once, so you don't want to count its size more than once when
considering how much of your disc is taken up by snapshots.
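The hourly example can be put into numbers with another toy model in
Python. Again this is only a sketch with invented extent names and
sizes, not anything read from btrfs: it contrasts naively summing every
snapshot with counting each extent once, and only when the live
subvolume no longer references it.

```python
# Toy model only: extent names and sizes are invented for illustration.
MIB = 1024 * 1024

# "big" stands for the large file that has just been deleted.
extent_size = {"big": 700 * MIB, "docs": 200 * MIB, "mail": 100 * MIB}

live = {"docs", "mail"}  # current subvolume, after the deletion
hourly = [{"big", "docs", "mail"} for _ in range(24)]  # 24 old snapshots


def size_of(extents):
    """Total bytes of a set of extent IDs, each counted once."""
    return sum(extent_size[e] for e in extents)


# Naive approach: add up what every snapshot references.
naive_total = sum(size_of(s) for s in hourly)

# Count each extent once, and only if the live subvolume has dropped it.
held_by_snapshots = set().union(*hourly) - live
snapshot_cost = size_of(held_by_snapshots)

print(naive_total // MIB, "MiB if every snapshot is summed")
print(snapshot_cost // MIB, "MiB actually kept alive by snapshots")
```

Here the naive sum is 24000 MiB, while the space that would be freed by
deleting all the snapshots is just the 700 MiB of the deleted file.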
NB: I am not a btrfs developer, just an interested user, and lurker on
this list.
--
David Pottage