From: "Norbert Scheibner" <scno@gmx.net>
To: Konstantinos Skarlatos <k.skarlatos@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: cross-subvolume cp --reflink
Date: Sun, 01 Apr 2012 19:07:54 +0200
Message-ID: <20120401170754.238550@gmx.net>
In-Reply-To: <4F788619.1040403@gmail.com>

> On: Sun, 01 Apr 2012 19:45:13 +0300 Konstantinos Skarlatos wrote

> > That's my point. This poor man's dedupe would solve my problems here
> > very well. I don't need a zfs-variant of dedupe. I can implement such a
> > file-based dedupe with userland tools and would be happy.
> 
> do you have any scripts that can search a btrfs filesystem for dupes 
> and replace them with cp --reflink?

Nothing that really works or is well tested. After I learned that cross-subvolume cp --reflink is missing, I stopped developing the script any further.

I use btrfs for my backups. Once a day I rsync (with --delete --inplace) the complete system to a subvolume, snapshot it, and delete some temp files in the snapshot.
In addition, I wanted to shrink the space taken by duplicate files.
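
A minimal sketch of that daily routine, written in Python rather than as a shell script. All paths and the function names `backup_commands` and `run_backup` are hypothetical, the `-a` rsync flag is an assumption not mentioned above, and excludes for pseudo-filesystems such as /proc are left out:

```python
import datetime
import os
import subprocess


def backup_commands(source, subvolume, snapshot_root, day):
    """Build the rsync and snapshot commands for one backup run."""
    # The snapshot is named after the date, e.g. <snapshot_root>/2012-04-01.
    snapshot = os.path.join(snapshot_root, day.isoformat())
    # The trailing slash makes rsync copy the contents of `source`.
    # "-a" is an assumption; --delete and --inplace come from the message.
    rsync = ["rsync", "-a", "--delete", "--inplace",
             source.rstrip("/") + "/", subvolume]
    # Without -r the snapshot is writable, so temp files can be removed later.
    snap = ["btrfs", "subvolume", "snapshot", subvolume, snapshot]
    return snapshot, [rsync, snap]


def run_backup(source, subvolume, snapshot_root, temp_files=()):
    """Sync, snapshot, then delete the listed temp files inside the snapshot."""
    snapshot, commands = backup_commands(
        source, subvolume, snapshot_root, datetime.date.today())
    for command in commands:
        subprocess.run(command, check=True)
    # `temp_files` holds paths relative to the snapshot; which files count
    # as temporary is left to the caller.
    for relative in temp_files:
        target = os.path.join(snapshot, relative)
        if os.path.isfile(target):
            os.remove(target)
    return snapshot
```

Only `run_backup` touches the system; `backup_commands` just builds the argument lists, so the commands can be inspected before anything is executed.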

What the script should do:
1. md5sum every file
2. If the checksums are identical, compare the files
3. If 2 or more files are really identical:
   - move the first to a temp-dir
   - cp --reflink the second to the position and name of the first
   - do a chown --reference, chmod --reference and touch --reference
     to copy owner, file mode bits and time from the original to the
     reflink-copy and then delete the original in temp-dir
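
The steps above could be sketched roughly as follows. This is Python rather than bash, and it is not the unfinished script mentioned earlier. The function names `md5sum`, `find_identical` and `replace_with_reflink` are hypothetical. The sketch assumes GNU coreutils (`cp --reflink=always` and the `--reference=` options of chown, chmod and touch) and has not been tested across btrfs subvolumes:

```python
import filecmp
import hashlib
import os
import shutil
import subprocess
from collections import defaultdict


def md5sum(path, chunk_size=1 << 20):
    """Step 1: return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical(root):
    """Steps 1-2: group regular files under root with identical contents."""
    by_sum = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in sorted(names):
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_sum[md5sum(path)].append(path)
    groups = []
    for paths in by_sum.values():
        # A matching checksum is only a hint; compare the bytes to be sure.
        remaining = paths
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            same = [p for p in rest if filecmp.cmp(first, p, shallow=False)]
            if same:
                groups.append([first] + same)
            remaining = [p for p in rest if p not in same]
    return groups


def replace_with_reflink(original, duplicate, temp_dir):
    """Step 3: swap `original` for a reflink copy of `duplicate`."""
    parked = os.path.join(temp_dir, os.path.basename(original))
    shutil.move(original, parked)
    try:
        # --reflink=always fails instead of silently making a full copy.
        subprocess.run(["cp", "--reflink=always", duplicate, original],
                       check=True)
        # Copy owner, mode bits and timestamps from the parked original.
        for tool in ("chown", "chmod", "touch"):
            subprocess.run([tool, "--reference=" + parked, original],
                           check=True)
    except (OSError, subprocess.CalledProcessError):
        # Put the original back if any step failed.
        if os.path.exists(original):
            os.remove(original)
        shutil.move(parked, original)
        raise
    os.remove(parked)
```

A caller would loop over `find_identical(root)` and, for each group, call `replace_with_reflink(group[0], other, temp_dir)` for one of the other paths in the group. Keeping `temp_dir` on the same filesystem keeps the move cheap.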

Everything could be done with bash. A database for the md5sums is conceivable, which could also be used for other purposes in the future.
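
One way such a checksum database might look, sketched with SQLite from Python's standard library. The message does not specify a schema, so the table name `checksums`, its columns and the helper names are all hypothetical. Storing size and mtime is an added assumption that lets later runs skip files that have not changed:

```python
import hashlib
import os
import sqlite3


def open_db(db_path):
    """Open (or create) the checksum database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checksums ("
        " path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, md5 TEXT)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_md5 ON checksums (md5)")
    return conn


def record(conn, path):
    """Return the MD5 of `path`, rehashing only if size or mtime changed."""
    st = os.stat(path)
    row = conn.execute(
        "SELECT size, mtime_ns, md5 FROM checksums WHERE path = ?", (path,)
    ).fetchone()
    if row and row[0] == st.st_size and row[1] == st.st_mtime_ns:
        return row[2]
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    md5 = digest.hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO checksums VALUES (?, ?, ?, ?)",
        (path, st.st_size, st.st_mtime_ns, md5),
    )
    conn.commit()
    return md5


def duplicate_candidates(conn):
    """Return lists of paths that share an MD5 (still to be byte-compared)."""
    rows = conn.execute(
        "SELECT md5, path FROM checksums WHERE md5 IN ("
        " SELECT md5 FROM checksums GROUP BY md5 HAVING COUNT(*) > 1)"
        " ORDER BY md5, path"
    ).fetchall()
    groups = {}
    for md5, path in rows:
        groups.setdefault(md5, []).append(path)
    return list(groups.values())
```

The groups returned by `duplicate_candidates` are only candidates; a byte-for-byte comparison is still needed before replacing anything with a reflink copy.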
