Linux Btrfs filesystem development
From: Kai Krakow <hurikhan77+btrfs@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Possible to deduplicate read-only snapshots for space-efficient backups
Date: Sun, 05 May 2013 12:07:17 +0200
Message-ID: <mjnh5a-mcf.ln1@hurikhan.ath.cx>

Hey list,

I wonder if it is possible to deduplicate read-only snapshots.

Background:

I'm using a bash/rsync script[1] to back up my whole system nightly to an
attached USB3 drive. The script syncs into a scratch area and then takes a
snapshot of that area. I'd like these snapshots to be immutable, so they
should be read-only.
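
A minimal sketch of the workflow described above, not the script from [1]:
sync into a scratch subvolume, then freeze it with a read-only snapshot. The
paths under /mnt/backup and the helper names run and nightly_backup are
assumptions made for illustration. The sketch defaults to a dry run that
only prints the commands.

```shell
#!/bin/sh
# Sketch only: all paths are hypothetical placeholders.
BACKUP_ROOT="${BACKUP_ROOT:-/mnt/backup}"
SCRATCH="${SCRATCH:-$BACKUP_ROOT/scratch}"        # writable btrfs subvolume
SNAPSHOTS="${SNAPSHOTS:-$BACKUP_ROOT/snapshots}"  # directory holding snapshots
DRY_RUN="${DRY_RUN:-1}"                           # 1 = print commands only

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sync the system into the scratch area, then take a read-only snapshot
# named after the given date stamp.
nightly_backup() {
    stamp="$1"
    # -x keeps rsync on the root filesystem, so /proc, /sys and the
    # backup drive itself are not descended into.
    run rsync -aHAXx --delete / "$SCRATCH/" || return 1
    # -r makes the new snapshot read-only.
    run btrfs subvolume snapshot -r "$SCRATCH" "$SNAPSHOTS/$stamp"
}

nightly_backup "$(date +%Y-%m-%d)"
```

Setting DRY_RUN=0 and running as root would execute the commands for real;
the scratch area has to be a btrfs subvolume for the snapshot step to work.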

Since rsync does not detect moved files but instead places a new copy of
them in the backup, I run the wonderful bedup application[2] to deduplicate
my backup drive from time to time, and it almost always gains back a good
pile of gigabytes. The remaining storage space issues are taken care of by
rsync's --inplace option (although this won't cover files that were both
moved and changed between backup runs) and by mounting with
compress-force=zlib.
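
The three space-saving measures from this paragraph could be sketched as
follows. The device name, mount point and helper names are hypothetical.
zlib is the name btrfs uses for its gzip-style (deflate) compressor. The
bedup call assumes the dedup subcommand as described in bedup's README, and
--no-whole-file is an addition of this sketch, not something the post
mentions. As before, the default is a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: device, mount point and source path are hypothetical.
DEVICE="${DEVICE:-/dev/sdX1}"
MOUNTPOINT="${MOUNTPOINT:-/mnt/backup}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# compress-force compresses every file, including ones the btrfs
# heuristic would otherwise leave uncompressed.
mount_backup_drive() {
    run mount -o compress-force=zlib "$DEVICE" "$MOUNTPOINT"
}

# --inplace updates changed files in the destination directly instead of
# writing a fresh copy, so unchanged blocks stay shared with snapshots.
# --no-whole-file (assumption of this sketch) keeps the delta algorithm
# enabled for local copies, where rsync would otherwise send whole files.
sync_in_place() {
    run rsync -a --inplace --no-whole-file "$1" "$MOUNTPOINT/scratch/"
}

# File-level deduplication of the whole backup filesystem with bedup.
dedup_backup_drive() {
    run bedup dedup "$MOUNTPOINT"
}
```

A run would call mount_backup_drive, then sync_in_place with a source
directory such as /home/, and dedup_backup_drive from time to time.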

Since bedup sets the immutable attribute on files while it touches them, I
suspect the process will no longer work once I make the snapshots read-only.
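
Whether a given snapshot is read-only can be checked before pointing a
deduplication tool at it. The sketch below assumes a btrfs-progs version
that provides the property subcommand, which prints ro=true or ro=false for
a subvolume. The function names ro_state and snapshot_state are
hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes `btrfs property get -ts <path> ro` is available.

# Translate the "ro=true" / "ro=false" line into a readable state.
ro_state() {
    case "$1" in
        ro=true)  echo "read-only" ;;
        ro=false) echo "writable" ;;
        *)        echo "unknown"; return 1 ;;
    esac
}

# Query the read-only flag of the subvolume or snapshot at path $1.
snapshot_state() {
    ro_state "$(btrfs property get -ts "$1" ro 2>/dev/null)"
}

# Usage (hypothetical path):
#   snapshot_state /mnt/backup/snapshots/2013-05-05
```

On a read-only snapshot, any tool that needs to change file attributes
first, such as setting the immutable flag, would be expected to fail because
the snapshot rejects modifications.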

I've read about ongoing work to integrate offline (and even online)
deduplication into the kernel so that this process can be made atomic (and
even block-based instead of file-based). This would, to my understanding,
make the immutable attribute unnecessary. So, if read-only snapshots
currently cannot be deduplicated this way, will these patches address the
problem so that read-only snapshots can be deduplicated? Or are read-only
snapshots meant to be what the name suggests: immutable, even for
deduplication?

Regards,
Kai

[1]: https://gist.github.com/kakra/5520370
[2]: https://github.com/g2p/bedup

