From: "Norbert Scheibner" <scno@gmx.net>
To: Konstantinos Skarlatos <k.skarlatos@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: cross-subvolume cp --reflink
Date: Sun, 01 Apr 2012 19:07:54 +0200
Message-ID: <20120401170754.238550@gmx.net>
In-Reply-To: <4F788619.1040403@gmail.com>
On Sun, 01 Apr 2012 19:45:13 +0300, Konstantinos Skarlatos wrote:
> > That's my point. This poor man's dedupe would solve my problems here
> > very well. I don't need a zfs-variant of dedupe. I can implement such a
> > file-based dedupe with userland tools and would be happy.
>
> do you have any scripts that can search a btrfs filesystem for dupes
> and replace them with cp --reflink?
Nothing that really works or is well tested. After I learned that the cross-subvolume cp --reflink feature is missing, I stopped developing the script.
I use btrfs for my backups. Once a day I rsync --delete --inplace the complete system to a subvolume, snapshot it, and delete some temp files in the snapshot.
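The daily routine could be sketched roughly as below. This is only an illustration: the function names run and backup_once, the DRY_RUN switch, all paths, the rsync options -a and -x, and the choice of tmp/ as the place to clean are assumptions, not details from my actual setup.

```shell
#!/usr/bin/env bash
# Sketch of: rsync into a subvolume, snapshot it, clean temp files in the snapshot.
# All names and paths here are hypothetical.

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# backup_once SRC DEST SNAPDIR [NAME]
#   SRC      tree to back up (assumed default: /)
#   DEST     btrfs subvolume that receives the rsync copy
#   SNAPDIR  directory on the same btrfs filesystem that holds the snapshots
#   NAME     snapshot name (assumed default: today's date)
backup_once() {
    local src=${1:-/} dest=${2:-/mnt/backup/current}
    local snapdir=${3:-/mnt/backup/snapshots}
    local name=${4:-$(date +%Y-%m-%d)}
    local snap="$snapdir/$name"

    # -a and -x are my additions: archive mode, and stay on one filesystem.
    run rsync -a -x --delete --inplace "$src" "$dest/" &&
    # Writable snapshot, because temp files are removed from it afterwards.
    run btrfs subvolume snapshot "$dest" "$snap" &&
    # Which temp files to delete is site-specific; tmp/ is only an example.
    run find "$snap/tmp" -mindepth 1 -delete
}

# Example (prints the three commands without running them):
#   DRY_RUN=1 backup_once / /mnt/backup/current /mnt/backup/snapshots
```

With --inplace, rsync rewrites changed files in place instead of writing a new copy, so unchanged blocks can stay shared with older snapshots.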
In addition to that, I wanted to shrink the space taken up by duplicate files.
What the script should do:
1. md5sum every file.
2. If the checksums are identical, compare the files.
3. If two or more files are really identical:
   - move one to a temp dir
   - cp --reflink the second to the position and name of the first
   - do a chown --reference, chmod --reference and touch --reference
     to copy owner, file mode bits and time from the original to the
     reflink copy, and then delete the original in the temp dir
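A minimal, untested-in-production sketch of these three steps, assuming GNU coreutils and findutils. The function names dedupe_pair and dedupe_tree are hypothetical. Two deviations from the list above are my own choices: the original is parked in a temp file in its own directory rather than in a separate temp dir (so the move is a plain rename), and cp uses --reflink=auto, which silently falls back to a normal copy on filesystems without reflink support. Use --reflink=always if a failure should be loud instead.

```shell
#!/usr/bin/env bash
# Sketch only: file-based dedupe via md5sum, cmp and cp --reflink.

# dedupe_pair KEEP DUP
# Replace DUP with a reflink copy of KEEP, keeping DUP's owner, mode and times.
dedupe_pair() {
    local keep=$1 dup=$2 tmp
    # Step 2: a matching checksum is not proof, so compare byte by byte.
    cmp -s -- "$keep" "$dup" || return 1
    # Step 3a: park the original next to itself (assumption: same directory).
    tmp=$(mktemp "$(dirname -- "$dup")/.dedupe.XXXXXX") || return 1
    mv -f -- "$dup" "$tmp" || return 1
    # Step 3b: reflink the other file to the original's position and name.
    if cp --reflink=auto -- "$keep" "$dup"; then
        # Step 3c: copy owner, mode bits and timestamps, then drop the original.
        chown --reference="$tmp" -- "$dup"
        chmod --reference="$tmp" -- "$dup"
        touch --reference="$tmp" -- "$dup"
        rm -f -- "$tmp"
    else
        # Copy failed: put the original back.
        mv -f -- "$tmp" "$dup"
        return 1
    fi
}

# dedupe_tree ROOT
# Step 1: checksum every regular file under ROOT; sorting puts equal sums
# on adjacent lines, so each group is deduplicated against its first file.
dedupe_tree() {
    find "$1" -type f -print0 | xargs -0 -r md5sum | sort | {
        prev_sum='' keep=''
        while IFS= read -r line; do
            # md5sum marks names containing backslashes or newlines with a
            # leading backslash; this sketch simply skips those files.
            case $line in \\*) continue ;; esac
            sum=${line%% *}     # checksum: everything before the first space
            file=${line#*  }    # name: everything after the two-space gap
            if [ "$sum" = "$prev_sum" ]; then
                dedupe_pair "$keep" "$file" && printf 'deduped: %s\n' "$file"
            else
                prev_sum=$sum keep=$file
            fi
        done
    }
}

# Example: dedupe_tree /mnt/backup/current
```

Files that change between the md5sum pass and the cmp call are simply left alone, because cmp then reports a difference.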
Everything could be done with bash. It would also be conceivable to keep the md5sums in a database, which could be used for other purposes in the future.