From: Konstantinos Skarlatos <k.skarlatos@gmail.com>
To: Norbert Scheibner <scno@gmx.net>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: cross-subvolume cp --reflink
Date: Sun, 01 Apr 2012 20:19:24 +0300
Message-ID: <4F788E1C.6080404@gmail.com>
In-Reply-To: <20120401170754.238550@gmx.net>
On Sunday, 1 April 2012 8:07:54 pm, Norbert Scheibner wrote:
>> On: Sun, 01 Apr 2012 19:45:13 +0300 Konstantinos Skarlatos wrote
>
>>> That's my point. This poor man's dedupe would solve my problems here
>>> very well. I don't need a zfs-variant of dedupe. I can implement such a
>>> file-based dedupe with userland tools and would be happy.
>>
>> do you have any scripts that can search a btrfs filesystem for dupes
>> and replace them with cp --reflink?
>
> Nothing really working and well tested. After I learned about the
> missing cp --reflink feature, I stopped developing the script any
> further.
>
> I use btrfs for my backups. Once a day I rsync --delete --inplace the
> complete system to a subvolume, snapshot it, and delete some tempfiles
> in the snapshot.
In my setup I rsync --inplace many servers and workstations, 4-6 times
a day into a 12TB btrfs volume, each one in its own subvolume. After
every backup a new ro snapshot is created.
I have many cross-subvolume duplicate files (OS files, programs, many
huge media files that are copied locally from the servers to the
workstations etc), so a good "dedupe" script could save lots of space,
and allow me to keep snapshots for much longer.
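
As a rough illustration of the backup cycle described above (rsync
--inplace into a per-machine subvolume, then a read-only snapshot), here
is a minimal Python sketch. The function names, the paths, the timestamp
naming scheme and the rsync -a flag are assumptions made for the example,
not details of the actual setup.

```python
import subprocess
from datetime import datetime, timezone


def backup_commands(source, subvolume, snapshot_dir, now=None):
    """Build the two commands for one backup cycle (names are hypothetical)."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d-%H%M%S")  # assumed snapshot naming scheme
    snapshot = f"{snapshot_dir.rstrip('/')}/{stamp}"
    return [
        # --inplace rewrites changed files in place, so unchanged blocks
        # stay shared with older snapshots; -a is an assumption.
        ["rsync", "-a", "--inplace", source, subvolume],
        # -r makes the new snapshot read-only.
        ["btrfs", "subvolume", "snapshot", "-r", subvolume, snapshot],
    ]


def run_backup(source, subvolume, snapshot_dir):
    """Run the cycle, stopping at the first command that fails."""
    for command in backup_commands(source, subvolume, snapshot_dir):
        subprocess.run(command, check=True)
```

Calling run_backup("server1:/", "/backup/server1", "/backup/snap/server1")
would sync one machine and snapshot it; those three paths are made up.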
> In addition to that I wanted to shrink file-duplicates.
>
> What the script should do:
> 1. I md5sum every file
> 2. If the checksums are identical, I compare the files
> 3. If 2 or more files are really identical:
> - move one to a temp-dir
> - cp --reflink the second to the position and name of the first
> - do a chown --reference, chmod --reference and touch --reference
> to copy owner, file mode bits and time from the original to the
> reflink-copy and then delete the original in temp-dir
>
> Everything could be done with bash. A database for the md5sums is
> conceivable, and it could be used for other purposes in the future.
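
The steps quoted above could be sketched roughly as follows. This is
Python rather than bash, and only an untested-on-btrfs illustration: the
function names and the ".dedupe-tmp" suffix are made up, the file is moved
aside in its own directory instead of a separate temp-dir, and it assumes
GNU cp, chown, chmod and touch. With --reflink=always, cp fails instead of
silently making a full copy when the extents cannot be shared, and in that
case the moved-aside file is put back.

```python
import filecmp
import hashlib
import os
import subprocess
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Step 1: return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return sorted lists of paths under root with byte-identical content."""
    by_sum = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            # Regular files only; symlinks and special files are skipped.
            if os.path.isfile(path) and not os.path.islink(path):
                by_sum[(os.path.getsize(path), md5_of(path))].append(path)

    groups = []
    for paths in by_sum.values():
        remaining = sorted(paths)
        while len(remaining) > 1:
            first = remaining[0]
            # Step 2: equal checksums are only a hint, so compare the bytes.
            same = [first] + [p for p in remaining[1:]
                              if filecmp.cmp(first, p, shallow=False)]
            if len(same) > 1:
                groups.append(same)
            remaining = [p for p in remaining if p not in same]
    return groups


def replace_with_reflink(keep, duplicate):
    """Step 3: replace `duplicate` with a reflink copy of `keep`, preserving
    the owner, mode and timestamps that `duplicate` had."""
    temp = duplicate + ".dedupe-tmp"  # hypothetical name for the moved file
    os.rename(duplicate, temp)
    try:
        subprocess.run(["cp", "--reflink=always", "--", keep, duplicate],
                       check=True)
        # chown first, since changing the owner can clear mode bits.
        for tool in ("chown", "chmod", "touch"):
            subprocess.run([tool, f"--reference={temp}", "--", duplicate],
                           check=True)
    except Exception:
        os.replace(temp, duplicate)  # put the original back on any failure
        raise
    os.remove(temp)
```

A driver would call find_duplicates() on the mounted volume and then
replace_with_reflink(group[0], other) for every other path in each group.
It should never be pointed at read-only snapshots, and it does not handle
files that change while the script runs.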