From: Hugo Mills <hugo@carfax.org.uk>
To: James Pharaoh <james@pharaoh.uk>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: BTRFS backup questions
Date: Sat, 27 Sep 2014 17:59:29 +0100
Message-ID: <20140927165929.GC7191@carfax.org.uk>
In-Reply-To: <5426E6F6.5070701@pharaoh.uk>


On Sat, Sep 27, 2014 at 06:33:58PM +0200, James Pharaoh wrote:
> On 27/09/14 18:17, Hugo Mills wrote:
> >On Sat, Sep 27, 2014 at 05:39:07PM +0200, James Pharaoh wrote:
> 
> >>2. Duplicating NOCOW files
> >>
> >>This is obviously possible, since it takes place when you make a snapshot.
> >>So why can't I create a clone of a snapshot of a NOCOW file? I am hoping the
> >>answer to this is that it is possible but not implemented yet...
> >
> >    Umm... you should be able to, I think.
> 
> Well I've tried with the haskell btrfs library, using clone, and also using
> cp --reflink=auto. Here's an example using cp:
> 
> root@host:/btrfs# btrfs subvolume snapshot -r src dest
> Create a readonly snapshot of 'src' in './dest'
> root@host:/btrfs# cp --reflink dest/test test
> cp: failed to clone 'test' from 'dest/test': Invalid argument

   Are you trying to cross a mount-point with that? It works for me:

hrm@amelia:/media/btrfs/amelia/test $ sudo btrfs sub create bar
Create subvolume './bar'
hrm@amelia:/media/btrfs/amelia/test $ sudo dd if=/dev/zero of=bar/data bs=1024 count=500
500+0 records in
500+0 records out
512000 bytes (512 kB) copied, 0.0047491 s, 108 MB/s
hrm@amelia:/media/btrfs/amelia/test $ sudo btrfs sub snap -r bar foo
Create a readonly snapshot of 'bar' in './foo'
hrm@amelia:/media/btrfs/amelia/test $ sudo cp --reflink=always bar/data bar-data
hrm@amelia:/media/btrfs/amelia/test $ sudo cp --reflink=always foo/data foo-data
hrm@amelia:/media/btrfs/amelia/test $ ls -l
total 1000
drwxr-xr-x 1 root root      8 Sep 27 17:55 bar
-rw-r--r-- 1 root root 512000 Sep 27 17:57 bar-data
drwxr-xr-x 1 root root      8 Sep 27 17:55 foo
-rw-r--r-- 1 root root 512000 Sep 27 17:57 foo-data

[snip]
> >>3. Performance penalty of fragmentation on SSD systems with lots of memory
> >>
> >    There are two performance problems with fragmentation -- seek time
> >to find the fragments (which affects only rotational media), and the
> >amount of time taken to manage the fragments. As the number of
> >fragments increases, so does the number of extents that the FS has to
> >keep track of. Ultimately, with very fragmented files, this will have
> >an effect, as the metadata size will increase hugely.
> 
> Ok so this sounds like the answer I wanted to hear ;-) Presumably so long as
> the load is not too great, and I run the occasional defrag, then this
> shouldn't be much to worry about then?

   Be aware that the current implementation of (manual) defrag will
separate the shared extents, so you no longer get the deduplication
effect. There was a snapshot-aware defrag implementation, but it
caused filesystem corruption, and has been removed for now until a
working version can be written. I think Josef was working on this.

> >>4. Generations and tree structures
> >>
> >>I am planning to use lots more clever tricks which I think should be
> >>available in BTRFS, but I can't see much documentation. Can anyone point out
> >>any good examples or documentation of how to access the tree structures
> >>directly. I'm particularly interested in finding changed files and portions
> >>of files using the generations and the tree search.
> >
> >    You need the TREE SEARCH ioctl -- that gives you direct access to
> >all the internal trees of the FS. There's some documentation on the
> >wiki about how these fit together:
> >
> >https://btrfs.wiki.kernel.org/index.php/Data_Structures
> >https://btrfs.wiki.kernel.org/index.php/Trees
> >
> >    What "tricks" are you thinking of, exactly?
> 
> Principally I want to be able to detect exactly what has changed, so that I
> can perform backups very quickly. I want to be able to update a small
> portion of a large file and then identify exactly which parts changed and
> only back those up, for example.

   send/receive does this.

[snip]
> >    Are you aware of btrfs send/receive? It should allow you to do all
> >of this. The main part of the code then comes down to managing the
> >send/receive, and all the distributed error handling. Then the only
> >direct access to the internal metadata you need is being able to read
> >UUIDs to work out what you have on each side -- which can also be done
> >by "btrfs sub list".
> 
> Yes, this is one of my main inspirations. The problem is that I am pretty
> sure it won't handle deduplication of the data.

   It does. That's one of the things it's explicitly designed to do.

> I'm planning to have a LOT of containers running the same stuff, on fast
> (expensive) SSD media, and deduplication is essential to make that work
> properly. I can already see huge savings from this.
> 
> As far as I can tell, btrfs send/receive operates on a subvolume basis, and
> any shared data between those subvolumes is duplicated if you copy them
> separately.

   Not so.

   You can tell send that there are subvolumes with known IDs on the
receive side, using the -c option (arbitrarily many subvols). If the
subvol you are sending (on the send side) shares extents with any of
those, then the data is not sent -- just a reference to it. On the
receive side, if that happens, the shared extents are reconstructed.
It will also do this with the -p option.

> I'll be very happy if this is already possible, or if there is some simple
> way around this!
> 
> My current solution, which I have already implemented in the project I
> shared, is to first snapshot all the subvolumes into an identical tree, then
> to reflink copy (or normal(ish) copy for nocow) all of the files over to
> another subvolume, which I am planning to then send/receive as a single
> entity.
> 
> I believe this will allow the deduplication to be transferred over to the
> receiving machine, and that this won't take place if I transfer the
> subvolumes separately.

   You send each one in turn, and add the -c option for the ones
you've already sent:

for n in A B C D; do   # ...and so on for each subvol
   btrfs sub snap -r live/subvol$n backups/subvol$n.1
done
btrfs send backups/subvolA.1 | ...
btrfs send -c backups/subvolA.1 backups/subvolB.1 | ...
btrfs send -c backups/subvolA.1 -c backups/subvolB.1 backups/subvolC.1 | ...
btrfs send -c backups/subvolA.1 -c backups/subvolB.1 -c backups/subvolC.1 backups/subvolD.1 | ...
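
   If you have more than a handful of subvols, the send lines above
can be generated rather than typed out. This is only a rough sketch:
the function name is made up for illustration, the backups/subvolX.1
naming is taken from the example above, and it prints the commands
instead of running them, so you still have to add your own pipe to the
receiving side.

```shell
#!/bin/sh
# Sketch only: print (do not run) one "btrfs send" command per subvol
# suffix, adding a -c option for every snapshot already sent.
# print_full_sends is a hypothetical helper name, not a btrfs tool.
print_full_sends() {
    clones=""
    for n in "$@"; do
        # Each send may reference everything sent before it.
        echo "btrfs send${clones} backups/subvol$n.1"
        clones="${clones} -c backups/subvol$n.1"
    done
}

print_full_sends A B C
```

   Run as written, it prints the three send commands for A, B and C in
the same form as the hand-written ones.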

   You can then use the same process to do incrementals against each
subvol, by keeping the last snapshot you sent and doing an incremental
against it:

for n in A B C D; do   # ...and so on for each subvol
   btrfs sub snap -r live/subvol$n backups/subvol$n.2
done
btrfs send -p backups/subvolA.1 backups/subvolA.2 | ...
btrfs send -c backups/subvolA.2 -p backups/subvolB.1 backups/subvolB.2 | ...
btrfs send -c backups/subvolA.2 -c backups/subvolB.2 -p backups/subvolC.1 backups/subvolC.2 | ...

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- I am an opera lover from planet Zog. Take me to your lieder ---   
