linux-btrfs.vger.kernel.org archive mirror
From: "Norbert Scheibner" <scno@gmx.net>
To: Konstantinos Skarlatos <k.skarlatos@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: cross-subvolume cp --reflink
Date: Sun, 01 Apr 2012 20:11:54 +0200
Message-ID: <20120401181154.61780@gmx.net>
In-Reply-To: <4F788E1C.6080404@gmail.com>

On Sun, 01 Apr 2012 20:19:24 +0300, Konstantinos Skarlatos wrote:

> > I use btrfs for my backups. Once a day I rsync --delete --inplace the
> > complete system to a subvolume, snapshot it, delete some tempfiles in the
> > snapshot.
> 
> In my setup I rsync --inplace many servers and workstations, 4-6 times 
> a day into a 12TB btrfs volume, each one in its own subvolume. After 
> every backup a new ro snapshot is created.
> 
> I have many cross-subvolume duplicate files (OS files, programs, many 
> huge media files that are copied locally from the servers to the 
> workstations etc), so a good "dedupe" script could save lots of space, 
> and allow me to keep snapshots for much longer.

So the script should be optimized to deduplicate only the newly written files instead of scanning the whole fs every time. You could take such a file list from the rsync output or from the btrfs subvolume find-new command.
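A minimal sketch of building such a file list from find-new output, written in Python rather than bash to keep the parsing readable. It assumes find-new prints one line per changed extent in the form "inode ... gen N flags FLAGS PATH" followed by a final "transid marker was N" line; please check this against the output of your btrfs-progs version. The function name changed_paths is hypothetical.

```python
import re

# Assumed line shape of `btrfs subvolume find-new` output (verify locally):
#   inode 258 file offset 0 len 4096 disk start 0 offset 0 gen 7 flags NONE etc/hosts
EXTENT_LINE = re.compile(
    r"^inode \d+ file offset \d+ len \d+ disk start \d+ "
    r"offset \d+ gen \d+ flags \S+ (.+)$"
)


def changed_paths(find_new_output):
    """Return the unique paths named in find-new output, in first-seen order.

    Lines that do not match the assumed extent format (for example the
    trailing "transid marker was N" line) are ignored.
    """
    seen = set()
    paths = []
    for line in find_new_output.splitlines():
        match = EXTENT_LINE.match(line)
        if match is None:
            continue
        path = match.group(1)
        if path not in seen:  # a file appears once per changed extent
            seen.add(path)
            paths.append(path)
    return paths
```

The text to parse could come from running btrfs subvolume find-new with the subvolume and the generation number of the previous backup, captured for example with subprocess.run(..., capture_output=True, text=True). Paths are relative to the subvolume, so the dedupe step has to prepend the subvolume mount point.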

Regardless of the reflink patch, you could use such a bash script inside one subvolume, after the rsync and before the snapshot. I don't know how much space it would save you in this situation, but it's worth a try and a good way to develop such a script, because before you write anything to disk you can see how many duplicates there are and how much space could be freed.
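The "look before you write" step could be sketched like this, again in Python instead of bash. It only reads files: it groups them by size, then by SHA-256 digest, and reports how many redundant copies exist and how many bytes they occupy. All function names are hypothetical, and the byte count is an upper bound, because files that already share extents (earlier reflinks or hard links) are counted as if they were separate copies.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group regular files under root that have identical content.

    Returns a list of (size, [paths]) entries with at least two paths each.
    Nothing is modified on disk.
    """
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files count.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for size, paths in by_size.items():
        if size == 0 or len(paths) < 2:
            continue  # empty or unique-sized files cannot save space
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        for same in by_hash.values():
            if len(same) > 1:
                groups.append((size, sorted(same)))
    return groups


def summarize(groups):
    """Return (number of redundant copies, bytes that could be freed)."""
    redundant = sum(len(paths) - 1 for _size, paths in groups)
    freeable = sum(size * (len(paths) - 1) for size, paths in groups)
    return redundant, freeable
```

Calling summarize(find_duplicates(path_to_subvolume)) after the rsync and before the snapshot gives the two numbers without touching the data. Hashing only files of equal size keeps the read load down, which matters on a volume of several TB.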

Best regards
    Norbert

