* btrfs dedup - available or experimental? Or yet to be?
@ 2015-03-23 23:10 Martin
2015-03-23 23:22 ` Hugo Mills
0 siblings, 1 reply; 14+ messages in thread
From: Martin @ 2015-03-23 23:10 UTC (permalink / raw)
To: linux-btrfs
As titled:
Does btrfs have dedup (on raid1 multiple disks) that can be enabled?
Can anyone relate any experiences?
Is there (or will there be) a bad fragmentation penalty?
(For kernel 3.18.9)
Thanks,
Martin
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-23 23:10 btrfs dedup - available or experimental? Or yet to be? Martin
@ 2015-03-23 23:22 ` Hugo Mills
2015-03-25 1:30 ` Rich Freeman
2015-05-13 16:23 ` Learner Study
0 siblings, 2 replies; 14+ messages in thread
From: Hugo Mills @ 2015-03-23 23:22 UTC (permalink / raw)
To: Martin; +Cc: linux-btrfs
On Mon, Mar 23, 2015 at 11:10:46PM +0000, Martin wrote:
> As titled:
>
>
> Does btrfs have dedup (on raid1 multiple disks) that can be enabled?
The current state of play is on the wiki:
https://btrfs.wiki.kernel.org/index.php/Deduplication
> Can anyone relate any experiences?
duperemove is reported as working.
> Is there (or will there be,) a bad penalty of fragmentation?
duperemove operates at the extent level, not the block level, so the
fragmentation isn't so bad.
Hugo.
--
Hugo Mills | ©1973 Unclear Research Ltd
hugo@... carfax.org.uk |
http://carfax.org.uk/ |
PGP: 65E74AC0 |
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-23 23:22 ` Hugo Mills
@ 2015-03-25 1:30 ` Rich Freeman
2015-03-27 0:07 ` Martin
2015-03-27 20:44 ` Mark Fasheh
2015-05-13 16:23 ` Learner Study
1 sibling, 2 replies; 14+ messages in thread
From: Rich Freeman @ 2015-03-25 1:30 UTC (permalink / raw)
To: Hugo Mills, Martin, Btrfs BTRFS
On Mon, Mar 23, 2015 at 7:22 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> On Mon, Mar 23, 2015 at 11:10:46PM +0000, Martin wrote:
>> As titled:
>>
>>
>> Does btrfs have dedup (on raid1 multiple disks) that can be enabled?
>
> The current state of play is on the wiki:
>
> https://btrfs.wiki.kernel.org/index.php/Deduplication
>
I hadn't realized that bedup was deprecated.
This seems unfortunate since it seemed to be a lot smarter about
detecting what has and hasn't already been scanned, and it also
supported defragmenting files while de-duplicating them.
I'll give duperemove a shot. I just packaged it on Gentoo.
--
Rich
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-25 1:30 ` Rich Freeman
@ 2015-03-27 0:07 ` Martin
2015-03-27 0:30 ` Rich Freeman
2015-03-27 20:51 ` Mark Fasheh
2015-03-27 20:44 ` Mark Fasheh
1 sibling, 2 replies; 14+ messages in thread
From: Martin @ 2015-03-27 0:07 UTC (permalink / raw)
To: linux-btrfs
On 25/03/15 01:30, Rich Freeman wrote:
> On Mon, Mar 23, 2015 at 7:22 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
>> On Mon, Mar 23, 2015 at 11:10:46PM +0000, Martin wrote:
>>> As titled:
>>>
>>>
>>> Does btrfs have dedup (on raid1 multiple disks) that can be enabled?
>>
>> The current state of play is on the wiki:
>>
>> https://btrfs.wiki.kernel.org/index.php/Deduplication
>>
>
> I hadn't realized that bedup was deprecated.
>
> This seems unfortunate since it seemed to be a lot smarter about
> detecting what has and hasn't already been scanned, and it also
> supported defragmenting files while de-duplicating them.
>
> I'll give duperemove a shot. I just packaged it on Gentoo.
Excellent and very rapid packaging, thanks!
Already compiled, installed, and soon to be tried on a test subvolume...
Anyone with any comments on how well duperemove performs for TB-sized
volumes?
Does it work across subvolumes? (Presumably not...)
Thanks,
Martin
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-27 0:07 ` Martin
@ 2015-03-27 0:30 ` Rich Freeman
2015-03-29 11:43 ` Kai Krakow
2015-03-27 20:51 ` Mark Fasheh
1 sibling, 1 reply; 14+ messages in thread
From: Rich Freeman @ 2015-03-27 0:30 UTC (permalink / raw)
To: Martin; +Cc: Btrfs BTRFS
On Thu, Mar 26, 2015 at 8:07 PM, Martin <m_btrfs@ml1.co.uk> wrote:
>
> Anyone with any comments on how well duperemove performs for TB-sized
> volumes?
It took many hours, but less than a day, for a few TB. I'm not sure
whether it is smart enough to take less time on subsequent scans, the
way bedup does.
>
> Does it work across subvolumes? (Presumably not...)
As far as I can tell, yes. Unless you pass a command-line option it
crosses filesystem boundaries and even scans non-btrfs filesystems
(like /proc, /dev, etc). Obviously you'll want to avoid that since it
only wastes time and I can just imagine it trying to hash kcore and
such.
Other than being less-than-ideal intelligence-wise, it seemed
effective. I can live with that in an early release like this.
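For illustration: the boundary check a scanner needs here is the same one find -xdev applies — compare each directory's st_dev against the starting point's device and prune anything that differs. A rough Python sketch of that idea (illustrative only, not duperemove's actual code):

```python
import os

def walk_one_filesystem(root):
    """Yield regular files under root without crossing mount points,
    the way find -xdev does: stay on the device root itself lives on."""
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune subdirectories on a different device (i.e. mount points),
        # so os.walk never descends into /proc, /dev, other disks, etc.
        dirnames[:] = [d for d in dirnames
                       if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield path
```

Note that on btrfs, subvolumes of the same filesystem report different st_dev values, so a naive check like this would also refuse to cross subvolume boundaries — which is exactly the trade-off being discussed in this thread.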
--
Rich
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-25 1:30 ` Rich Freeman
2015-03-27 0:07 ` Martin
@ 2015-03-27 20:44 ` Mark Fasheh
1 sibling, 0 replies; 14+ messages in thread
From: Mark Fasheh @ 2015-03-27 20:44 UTC (permalink / raw)
To: Rich Freeman; +Cc: Hugo Mills, Martin, Btrfs BTRFS
On Tue, Mar 24, 2015 at 09:30:52PM -0400, Rich Freeman wrote:
> On Mon, Mar 23, 2015 at 7:22 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> > On Mon, Mar 23, 2015 at 11:10:46PM +0000, Martin wrote:
> >> As titled:
> >>
> >>
> >> Does btrfs have dedup (on raid1 multiple disks) that can be enabled?
> >
> > The current state of play is on the wiki:
> >
> > https://btrfs.wiki.kernel.org/index.php/Deduplication
> >
>
> I hadn't realized that bedup was deprecated.
>
> This seems unfortunate since it seemed to be a lot smarter about
> detecting what has and hasn't already been scanned, and it also
> supported defragmenting files while de-duplicating them.
Hi, just FYI: only rescanning files that have changed since the last scan is
a feature I've been working on in duperemove for some time now. I have some
rudimentary code that works, which will be going into the master branch in a
week or so (I wanted to finish it this week, but other things kept me busy).
Anyway, that should help with the lack of intelligence about which files to
scan.
--Mark
--
Mark Fasheh
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-27 0:07 ` Martin
2015-03-27 0:30 ` Rich Freeman
@ 2015-03-27 20:51 ` Mark Fasheh
1 sibling, 0 replies; 14+ messages in thread
From: Mark Fasheh @ 2015-03-27 20:51 UTC (permalink / raw)
To: Martin; +Cc: linux-btrfs
On Fri, Mar 27, 2015 at 12:07:29AM +0000, Martin wrote:
> Excellent and very rapid packaging, thanks!
>
>
> Already compiled, installed, and soon to be tried on a test subvolume...
>
>
> Anyone with any comments on how well duperemove performs for TB-sized
> volumes?
https://github.com/markfasheh/duperemove/wiki/Performance-Numbers
That page has some sample performance numbers. Keep in mind that the tests
were done on reasonably nice hardware.
TB-sized is definitely on the larger end of what I expect it to handle
these days. The biggest problem you would see is memory usage: versions
0.09 and below store all hashes in memory, so if everything else is
fast enough, that's likely the first bump you'll hit.
The master branch has some code which reduces memory consumption
dramatically by using a Bloom filter and temporarily storing hashes on disk.
That branch needs some more features and bug fixes before I'm ready to call
it stable.
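To illustrate the pre-filtering idea (a toy sketch, not duperemove's implementation): a Bloom filter answers "have I seen this hash before?" compactly and with no false negatives, so only hashes seen a second time need to enter the full hash table — unique blocks, the vast majority, never touch it.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: membership test with no false negatives
    and a small, tunable false-positive rate."""
    def __init__(self, num_bits=1 << 20, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item):
        # Derive k independent bit positions by salting one hash function.
        for i in range(self.num_hashes):
            h = hashlib.sha256(item + i.to_bytes(4, "little")).digest()
            yield int.from_bytes(h[:8], "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

A scanner would add each block hash on first sight and promote a hash to the real candidate table only when the filter already claims to contain it, so memory scales with the duplicated data rather than with the whole filesystem.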
> Does it work across subvolumes? (Presumably not...)
Yep it will dedupe across subvolumes for you!
--Mark
--
Mark Fasheh
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-27 0:30 ` Rich Freeman
@ 2015-03-29 11:43 ` Kai Krakow
2015-03-29 12:31 ` Rich Freeman
2015-03-29 17:51 ` Christoph Anton Mitterer
0 siblings, 2 replies; 14+ messages in thread
From: Kai Krakow @ 2015-03-29 11:43 UTC (permalink / raw)
To: linux-btrfs
Rich Freeman <r-btrfs@thefreemanclan.net> schrieb:
> On Thu, Mar 26, 2015 at 8:07 PM, Martin <m_btrfs@ml1.co.uk> wrote:
>>
>> Anyone with any comments on how well duperemove performs for TB-sized
>> volumes?
>
> Took many hours but less than a day for a few TB - I'm not sure
> whether it is smart enough to take less time on subsequent scans like
> bedup.
>
>>
>> Does it work across subvolumes? (Presumably not...)
>
> As far as I can tell, yes. Unless you pass a command-line option it
> crosses filesystem boundaries and even scans non-btrfs filesystems
> (like /proc, /dev, etc). Obviously you'll want to avoid that since it
> only wastes time and I can just imagine it trying to hash kcore and
> such.
>
> Other than being less-than-ideal intelligence-wise, it seemed
> effective. I can live with that in an early release like this.
This is mainly there to support deduping across different subvolumes
within the same device pool. So I think the design is neither
less-than-ideal nor unintelligent, and it has nothing to do with performance.
But your warning is still valid: one should take care not to "dedupe"
special filesystems (though the same goes for every other tool that supports
recursion, like rsync or cp), and it is rarely useful for the deduplication
process to cross a boundary onto a non-btrfs device - with a few exceptions:
you may want duperemove to write hashes for a non-btrfs device and use the
result for other purposes outside duperemove's scope, or you may be nesting
btrfs inside non-btrfs inside btrfs mounts, or...
Concluding that: duperemove should probably not try to become smart about
filesystem boundaries. It should either cross them or not, as it does now -
the choice is left to the user (as is the task of supplying the proper
command-line arguments).
With the planned performance improvements, I'm guessing the best way will
become mounting the root subvolume (subvolid 0) and letting duperemove work
on that as a whole - including crossing all fs boundaries.
--
Replies to list only preferred.
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-29 11:43 ` Kai Krakow
@ 2015-03-29 12:31 ` Rich Freeman
2015-03-29 14:44 ` Kai Krakow
2015-03-29 17:51 ` Christoph Anton Mitterer
1 sibling, 1 reply; 14+ messages in thread
From: Rich Freeman @ 2015-03-29 12:31 UTC (permalink / raw)
To: Kai Krakow; +Cc: Btrfs BTRFS
On Sun, Mar 29, 2015 at 7:43 AM, Kai Krakow <hurikhan77@gmail.com> wrote:
>
> With the planned performance improvements, I'm guessing the best way will
> become mounting the root subvolume (subvolid 0) and letting duperemove work
> on that as a whole - including crossing all fs boundaries.
>
Why cross filesystem boundaries by default? If you scan from the root
subvolume you're guaranteed to traverse every file on the filesystem
(which is all that can be deduped) without crossing any filesystem
boundaries. Even if you have btrfs on non-btrfs on btrfs there must
be some other path that reaches the same files when scanning from
subvolid 0.
--
Rich
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-29 12:31 ` Rich Freeman
@ 2015-03-29 14:44 ` Kai Krakow
2015-03-29 17:54 ` Christoph Anton Mitterer
0 siblings, 1 reply; 14+ messages in thread
From: Kai Krakow @ 2015-03-29 14:44 UTC (permalink / raw)
To: linux-btrfs
Rich Freeman <r-btrfs@thefreemanclan.net> schrieb:
> On Sun, Mar 29, 2015 at 7:43 AM, Kai Krakow <hurikhan77@gmail.com> wrote:
>>
>> With the planned performance improvements, I'm guessing the best way will
>> become mounting the root subvolume (subvolid 0) and letting duperemove
>> work on that as a whole - including crossing all fs boundaries.
>>
>
> Why cross filesystem boundaries by default? If you scan from the root
> subvolume you're guanteed to traverse every file on the filesystem
> (which is all that can be deduped) without crossing any filesystem
> boundaries. Even if you have btrfs on non-btrfs on btrfs there must
> be some other path that reaches the same files when scanning from
> subvolid 0.
Yes, the chosen "default" is probably not the best for this kind of utility.
But I suppose it follows the principle of least surprise: every utility I
use daily (like find) follows the same default. By the way, I wrote
"default" because one should keep in mind that duperemove is not recursive
by default (so crossing a boundary wouldn't even happen in the default
configuration), which only strengthens my point about least surprise. I'd
leave changing the default open for discussion here; all I suggested was
that duperemove should not try to become smart about boundaries as its only
choice (otherwise behavior becomes unpredictable when deployed across a vast
number of individually configured systems). I could imagine a command-line
option that makes it smart.
As for subvolid 0: that is just how I intend to use it for my own purposes.
By no means should it be part of any default deployment.
--
Replies to list only preferred.
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-29 11:43 ` Kai Krakow
2015-03-29 12:31 ` Rich Freeman
@ 2015-03-29 17:51 ` Christoph Anton Mitterer
1 sibling, 0 replies; 14+ messages in thread
From: Christoph Anton Mitterer @ 2015-03-29 17:51 UTC (permalink / raw)
To: linux-btrfs
On Sun, 2015-03-29 at 13:43 +0200, Kai Krakow wrote:
> Concluding that: duperemove should probably not try to become smart about
> filesystem boundaries. It should either cross them or not as it is now - the
> option is left to the user (as is the task to supply proper cmdline
> arguments with that).
Couldn't it, per default, simply cross boundaries only within the same
btrfs fs (i.e. amongst all its subvolumes), since this seems to be the
natural choice users want in most cases... and a --no-xdev option or
something like that would allow it to cross other boundaries?
Cheers,
Chris.
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-29 14:44 ` Kai Krakow
@ 2015-03-29 17:54 ` Christoph Anton Mitterer
0 siblings, 0 replies; 14+ messages in thread
From: Christoph Anton Mitterer @ 2015-03-29 17:54 UTC (permalink / raw)
To: linux-btrfs
On Sun, 2015-03-29 at 16:44 +0200, Kai Krakow wrote:
> Yes, the chosen "default" is probably not the best for this kind of utility.
> But I suppose it follows the principle of least surprise. At least every
> utility I'm daily using (like find) follows this default route.
But the default with all these tools is that they operate on the file
hierarchy and don't care about filesystems at all - or at least not in
their original meaning.
Dedup, however, is IMHO a more filesystem-internal operation... more like
defragmentation or tune2fs.
Cheers.
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-03-23 23:22 ` Hugo Mills
2015-03-25 1:30 ` Rich Freeman
@ 2015-05-13 16:23 ` Learner Study
2015-05-13 21:08 ` Zygo Blaxell
1 sibling, 1 reply; 14+ messages in thread
From: Learner Study @ 2015-05-13 16:23 UTC (permalink / raw)
To: linux-btrfs; +Cc: Learner Study
Hello,
I have been reading about de-duplication and how structures such as Bloom
and Cuckoo filters are used for this purpose.
Does BTRFS dedup use any of these, or are there plans to incorporate
these in future?
Thanks for your guidance!
* Re: btrfs dedup - available or experimental? Or yet to be?
2015-05-13 16:23 ` Learner Study
@ 2015-05-13 21:08 ` Zygo Blaxell
0 siblings, 0 replies; 14+ messages in thread
From: Zygo Blaxell @ 2015-05-13 21:08 UTC (permalink / raw)
To: Learner Study; +Cc: linux-btrfs
On Wed, May 13, 2015 at 09:23:25AM -0700, Learner Study wrote:
> I have been reading on de-duplication and how algorithms such as Bloom
> and Cuckoo filters are used for this purpose.
>
> Does BTRFS dedup use any of these, or are there plans to incorporate
> these in future?
btrfs dedup currently lives in user space, and there are multiple
dedup userspace projects in development.
Administrators can choose the dedup tool, and can use multiple dedup tools
on the same filesystem. This is particularly handy if you know something
about your data that a naive hashing algorithm might not (e.g. you have
two large trees derived from a common base, so you can use a much more
efficient algorithm than you would if you knew nothing about the data).
The basic kernel interface for dedup is the extent-same ioctl.
A userspace program creates a list of (fd, length, offset) tuples
referencing identical content and passes them to the kernel. The kernel
locks the file contents, compares them, and replaces identical data
copies with references to a single extent in an atomic operation.
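A userspace sketch of that call (the struct layout and ioctl number here reflect my reading of the kernel's btrfs_ioctl_same_args in linux/btrfs.h, not code from any of the tools discussed; it needs a real btrfs filesystem and identical data at both offsets to do anything):

```python
import fcntl
import struct

# BTRFS_IOC_FILE_EXTENT_SAME = _IOWR(0x94, 54, struct btrfs_ioctl_same_args),
# assuming sizeof(struct btrfs_ioctl_same_args) == 24 as on current kernels.
BTRFS_IOC_FILE_EXTENT_SAME = 0xC0189436

# struct btrfs_ioctl_same_args: logical_offset, length, dest_count, reserved1,
# reserved2 -- followed in the same buffer by one info record per destination.
SAME_ARGS = struct.Struct("=QQHHI")   # 24 bytes
# struct btrfs_ioctl_same_extent_info: fd, logical_offset, bytes_deduped,
# status, reserved.
SAME_INFO = struct.Struct("=qQQiI")   # 32 bytes

def dedupe_range(src_fd, src_off, length, dst_fd, dst_off):
    """Ask the kernel to dedupe `length` bytes of src into dst.

    Returns (bytes_deduped, status); status 0 means the ranges were
    identical and now share a single on-disk extent."""
    buf = bytearray(SAME_ARGS.size + SAME_INFO.size)
    SAME_ARGS.pack_into(buf, 0, src_off, length, 1, 0, 0)
    SAME_INFO.pack_into(buf, SAME_ARGS.size, dst_fd, dst_off, 0, 0, 0)
    fcntl.ioctl(src_fd, BTRFS_IOC_FILE_EXTENT_SAME, buf)  # kernel fills buf
    _, _, bytes_deduped, status, _ = SAME_INFO.unpack_from(buf, SAME_ARGS.size)
    return bytes_deduped, status
```

Because the kernel locks and byte-compares both ranges itself, a hash collision in the userspace scanner merely wastes this one call; it can never corrupt data.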
The kernel also provides interfaces to efficiently discover recently
modified extents in a filesystem, enabling deduplicators to follow new
data without the need to block writes.
Most btrfs deduplicators are based on a block-level hash table built by
scanning files, but every other aspect of the tools (e.g. the mechanism by
which files are discovered, block sizes, scalability, use of prefiltering
algorithms such as Bloom filters, whether the hash table is persistent
or ephemeral, etc) is different from one tool to another, and changing
over time as the tools are developed.
Because the kernel interface implies a read of both copies of duplicate
data, it is not necessary to use a collision-free hash. Optimizing the
number of bits in the hash function for the size of the filesystem and
exploiting the statistical tendency for identical blocks to be adjacent
to other identical blocks in files enables considerable space efficiency
in the hash table--possibly so much that the Bloom/Cuckoo-style
pre-filtering benefit becomes irrelevant. I'm not aware of a released
btrfs deduplicator that currently exploits these optimizations.
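As a toy illustration of the adjacency point (not any real deduplicator's code): hash two files block by block, then merge adjacent matching blocks into runs, so one extent-same submission can cover many blocks at once.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096

def block_hashes(path):
    """Hash a file in filesystem-block-sized chunks (files are assumed
    block-aligned to keep the sketch short).  A short truncated hash is
    fine: the kernel re-compares the bytes, so a collision only costs a
    wasted extent-same call, never corruption."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            yield offset, hashlib.sha256(block).digest()[:8]
            offset += len(block)

def duplicate_runs(path_a, path_b):
    """Return (offset_a, offset_b, length) runs of adjacent identical
    blocks shared between two files."""
    index = defaultdict(list)           # hash -> offsets in path_a
    for off, h in block_hashes(path_a):
        index[h].append(off)
    runs, current = [], None            # current = (off_a, off_b, length)
    for off_b, h in block_hashes(path_b):
        extended = False
        if current is not None:
            a_next = current[0] + current[2]
            # Extend the run only if this block continues it on both sides.
            if off_b == current[1] + current[2] and a_next in index.get(h, ()):
                current = (current[0], current[1], current[2] + BLOCK_SIZE)
                extended = True
        if not extended:
            if current is not None:
                runs.append(current)
            offs = index.get(h)
            current = (offs[0], off_b, BLOCK_SIZE) if offs else None
    if current is not None:
        runs.append(current)
    return runs
```

Storing runs rather than individual block hashes is one way the hash table can shrink far below one entry per block on data with long shared stretches.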
> Thanks for your guidance!