From: Saint Germain <saintger@gmail.com>
To: <linux-btrfs@vger.kernel.org>
Cc: "Niccolò Belli" <darkbasic@linuxsystems.it>
Subject: Re: Announcing btrfs-dedupe
Date: Wed, 9 Nov 2016 13:47:51 +0100
Message-ID: <20161109134751.434b5e83@system>
In-Reply-To: <8f0cf023-7189-4de1-a72c-38a4deb8a049@linuxsystems.it>

On Wed, 09 Nov 2016 12:24:51 +0100, Niccolò Belli
<darkbasic@linuxsystems.it> wrote:
> 
> On Tuesday 8 November 2016 23:36:25 CET, Saint Germain wrote:
> > Please be aware of these other similar tools:
> > - jdupes: https://github.com/jbruchon/jdupes
> > - rmlint: https://github.com/sahib/rmlint
> > And of course fdupes.
> >
> > Some interesting points I have seen in them:
> > - use xxhash to identify potential duplicates (huge speedup)
> > - ability to deduplicate read-only snapshots
> > - identify potential reflinked files (see also my email here:
> >   https://www.spinics.net/lists/linux-btrfs/msg60081.html)
> > - ability to filter out hardlinks
> > - triangle problem: see jdupes readme
> > - jdupes has started the process to be included in Debian
> >
> > I hope that helps and that you can share some code with them!
> > 
> Hi,
> What do you think about jdupes? I'm looking for an alternative to
> duperemove, and rmlint doesn't seem to support btrfs deduplication, so
> I would like to try jdupes. My main problem with duperemove is a
> memory leak; it also seems to lead to greater disk usage:
> https://github.com/markfasheh/duperemove/issues/163

rmlint does support btrfs deduplication:
rmlint --algorithm=xxhash --types="duplicates" --hidden --config=sh:handler=clone --no-hardlinked
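
To illustrate the two ideas behind --algorithm=xxhash and --no-hardlinked (a fast hash to shortlist potential duplicates, and skipping hardlinks), here is a minimal Python sketch. It is not how rmlint or jdupes are implemented: the function name find_duplicate_candidates is hypothetical, and hashlib.blake2b stands in for xxhash, which is not in the Python standard library. Files that end up in the same group are only candidates; a real tool would still compare them before cloning.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicate_candidates(paths, chunk_size=1 << 20):
    """Return groups of paths that have the same size and the same digest.

    Hypothetical helper for illustration only; blake2b is a stand-in
    for a fast non-cryptographic hash such as xxhash.
    """
    # Step 1: keep one path per (device, inode) pair so that hardlinks
    # to the same file are not reported as duplicates of each other.
    seen_inodes = set()
    by_size = defaultdict(list)
    for path in paths:
        st = os.stat(path)
        key = (st.st_dev, st.st_ino)
        if key in seen_inodes:
            continue
        seen_inodes.add(key)
        by_size[st.st_size].append(path)

    # Step 2: hash only files whose size matches at least one other file.
    groups = defaultdict(list)
    for size, same_size in by_size.items():
        if len(same_size) < 2:
            continue
        for path in same_size:
            digest = hashlib.blake2b(digest_size=16)
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(chunk_size), b""):
                    digest.update(chunk)
            groups[(size, digest.hexdigest())].append(path)

    # Only groups with two or more members are duplicate candidates.
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Grouping by size first means most files are never read at all, which is where much of the speedup comes from regardless of the hash chosen.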

I've used jdupes and rmlint to deduplicate 2TB with 4GB of RAM, and it
took a few hours, so performance is acceptable.
The problems I found have since been fixed in both tools.

The jdupes author is really kind and responsive!
