From: James Pharaoh <james@pharaoh.uk>
To: Hugo Mills <hugo@carfax.org.uk>, linux-btrfs@vger.kernel.org
Subject: Re: BTRFS backup questions
Date: Mon, 29 Sep 2014 13:02:35 +0200
Message-ID: <54293C4B.5080201@pharaoh.uk>
In-Reply-To: <20140927165929.GC7191@carfax.org.uk>
On 27/09/14 18:59, Hugo Mills wrote:
>>>> 2. Duplicating NOCOW files
> Are you trying to cross a mount-point with that? It works for me:
Here's a script which replicates what I'm doing:
https://gist.github.com/jamespharaoh/d693067ffd203689ebea
And here's the output when I run it:
https://gist.github.com/jamespharaoh/75cb937fd73b05c9128d
> Be aware that the current implementation of (manual) defrag will
> separate the shared extents, so you no longer get the deduplication
> effect. There was a snapshot-aware defrag implementation, but it
> caused filesystem corruption, and has been removed for now until a
> working version can be written. I think Josef was working on this.
Good to know, but it shouldn't be a major problem. I'll probably leave
CoW enabled in almost all cases, even for database files: I'll
defragment those files and deduplicate everything else. For very large
sites, which will be rare, I'll use nocow for the database files and
provision replication or something similar instead.
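As a rough sketch of that plan, assuming hypothetical paths and two
made-up policy labels ("db" and "nocow"), the per-path commands could be
printed like this. Nothing is executed, and deduplication is left out
because no tool for it has been chosen here. Note that chattr +C only
takes effect on new or empty files, so it is applied to a directory whose
future files inherit it, and that defragmenting unshares extents as Hugo
describes above.

```shell
#!/bin/sh
# Print (do not run) the command matching the policy for one path.
# The policy labels and all paths are hypothetical:
#   db     keep CoW and defragment the file (this unshares its extents)
#   nocow  mark a directory No_COW so files created in it later inherit it
plan_for() {
    policy=$1
    path=$2
    case $policy in
        db)    printf 'btrfs filesystem defragment %s\n' "$path" ;;
        nocow) printf 'chattr +C %s\n' "$path" ;;
        *)     printf 'unknown policy: %s\n' "$policy" >&2; return 1 ;;
    esac
}

plan_for db /srv/site-a/db/data.ibd
plan_for nocow /srv/big-site/db
```

Printing the commands first makes it easy to review which files would
lose their shared extents before anything is changed.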
I'll do some performance testing at some point and post some code and
the results here ;-)
>> Yes, this is one of my main inspirations. The problem is that I am pretty
>> sure it won't handle deduplication of the data.
> It does. That's one of the things it's explicitly designed to do.
Ok, so I think I understand this now. I believe that the only type of
object with a universal ID is a subvolume, so the receive side can't
identify data which already exists on its own, or at least it would be
expensive to do so.
Providing a "parent" subvolume allows it to do that. As long as the
parent subvolume shares data with the subvolume being sent, that data
will also be shared on the target after the receive takes place.
I think the issue for me is the word "parent". These are really
"reference" subvolumes.
The subvolumes you've told me to list as the parents are not parents of
the one I'm sending at all, except, of course, for the previous version
of the same subvolume.
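To make the distinction concrete, here is a minimal sketch of how such a
send command line might be assembled. It assumes the previous snapshot of
the same subvolume is passed with -p and the other "reference" snapshots
with -c, both of which are documented btrfs send options. The function
name and all snapshot paths are hypothetical, the command is only printed,
and paths containing spaces are not handled.

```shell
#!/bin/sh
# Build (but do not run) a `btrfs send` command line.
# Usage: build_send PARENT SNAPSHOT [REFERENCE...]
#   PARENT     previous snapshot of the same subvolume (passed with -p)
#   SNAPSHOT   the read-only snapshot being sent
#   REFERENCE  other snapshots already present on the target (passed with -c)
# All paths below are hypothetical.
build_send() {
    parent=$1
    snapshot=$2
    shift 2
    cmd="btrfs send -p $parent"
    for ref in "$@"; do
        cmd="$cmd -c $ref"
    done
    printf '%s %s\n' "$cmd" "$snapshot"
}

build_send /snap/site-a.1 /snap/site-a.2 /snap/site-b.2 /snap/template.1
```

Every snapshot named with -p or -c has to exist on the receiving side as
well, since the stream refers to their data instead of carrying it again.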
Is that all correct?
James