From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Christoph Anton Mitterer <calestyo@scientia.net>,
	dsterba@suse.cz, James Pharaoh <james@wellbehavedsoftware.com>,
	linux-btrfs@vger.kernel.org, mark@fasheh.com
Subject: Re: Announcing btrfs-dedupe
Date: Tue, 8 Nov 2016 12:04:16 -0500
Message-ID: <ec4e9e7a-6c85-b4a7-4ae1-e54b94ec0db3@gmail.com>
In-Reply-To: <20161108165706.GB16801@birch.djwong.org>

On 2016-11-08 11:57, Darrick J. Wong wrote:
> On Tue, Nov 08, 2016 at 08:26:02AM -0500, Austin S. Hemmelgarn wrote:
>> On 2016-11-07 21:40, Christoph Anton Mitterer wrote:
>>> On Mon, 2016-11-07 at 15:02 +0100, David Sterba wrote:
>>>> I think adding a whole-file dedup mode to duperemove would be better
>>>> (from user's POV) than writing a whole new tool
>>>
>>> What would IMO be really good from a user's POV was, if one of the
>>> tools, deemed to be the "best", would be added to the btrfs-progs and
>>> simply become "the official" one.
>>
>> The problem is that for deduplication, most tools won't work well for
>> everything.  For example the cases I use it in are very specific and have
>> horrible performance using pretty much any available tool (I have a couple
>> cases where I have disjoint subsets of the same directory tree with
>> different prefixes, so I can tell exactly which files are duplicated, and
>> that any duplicate file is 100% duplicate, as well as a couple of cases
>> where changes are small, scattered, and highly predictable (and thus it's
>> easier to find what's changed and dedupe everything else instead of finding
>> what's the same), and none of the existing options do well in either
>> situation).
>>
>> I'd argue at minimum for having the extent-same tool from duperemove in
>> btrfs-progs, as that lets people do deduplication how they want without
>> having to write C code.  Something equivalent that would let you call any
>> BTRFS ioctl with (reasonably) arbitrary arguments might actually be even
>> better (I can see such a tool being wonderful for debugging).
>
> Since xfsprogs 4.3, xfs_io has a 'dedupe' command that can talk to
> FIDEDUPERANGE (f.k.a. EXTENT SAME):
>
> $ xfs_io -c 'dedupe /mnt/srcfile srcoffset dstoffset length' /mnt/destfile
>
I actually hadn't known about this, thanks.  It means that xfs_io just 
got even more useful despite me not running XFS.

