From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Chris Murphy <lists@colorremedies.com>,
	Konstantin Svist <fry.kun@gmail.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: bedup --defrag freezing
Date: Thu, 13 Aug 2015 07:30:38 -0400
Message-ID: <55CC7FDE.5000209@gmail.com>
In-Reply-To: <CAJCQCtSaO7jXeJKe+jkMWcRkG6zmUrc+T=jiSD-54ewfAZXV1A@mail.gmail.com>

On 2015-08-12 15:30, Chris Murphy wrote:
> On Wed, Aug 12, 2015 at 12:44 PM, Konstantin Svist <fry.kun@gmail.com> wrote:
>> On 08/06/2015 04:10 AM, Austin S Hemmelgarn wrote:
>>> On 2015-08-05 17:45, Konstantin Svist wrote:
>>>> Hi,
>>>>
>>>> I've been running btrfs on Fedora for a while now, with bedup --defrag
>>>> running in a night-time cronjob.
>>>> Last few runs seem to have gotten stuck, without possibility of even
>>>> killing the process (kill -9 doesn't work) -- all I could do is hard
>>>> power cycle.
>>>>
>>>> Did something change recently? Is bedup simply too out of date? What
>>>> should I use to de-duplicate across snapshots instead? Etc.?
>>>>
>>> AFAIK, bedup hasn't been actively developed for quite a while (I'm
>>> actually kind of surprised it runs with the newest btrfs-progs).
>>> Personally, I'd suggest using duperemove
>>> (https://github.com/markfasheh/duperemove)
>>
>> Thanks, good to know.
>> Tried duperemove -- it looks like it builds a database of its own
>> checksums every time it runs... why won't it use BTRFS internal
>> checksums for fast rejection? Would run a LOT faster...
>
> I think the reason is duperemove does extent based deduplication.
> Where Btrfs checksums are 4KiB block based, not extent based. And so
> many 4KiB CRC32C checksums would need to be in memory, that could be
> kinda expensive. And also, I don't know if CRC32C checksums have
> essentially no practical chance of collision. If it's really rare,
> rather than "so improbable as to be impossible" then you could end up
> with "really rare" corruption where incorrect deduplication happens.
Yeah, duperemove doesn't use them because of the memory limitations.
Theoretically it's possible to take the CRC checksums of the
individual blocks and then combine them to get a checksum of the blocks
as a whole, but it really isn't worth it for that (it would take just
about as long as the current hashing).
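
To make the "combine them" idea concrete, here is a minimal Python
sketch of combining per-block CRC32C values into the CRC32C of the
concatenated data. It is not how BTRFS or duperemove does anything; the
function names (crc32c_combine, combine_blocks) and the 4 KiB block
size are assumptions for illustration. It relies on the CRC being
linear: running crc(A) through len(B) zero bytes and XORing in crc(B)
gives crc(A + B).

```python
def _make_table():
    """Build the 256-entry lookup table for reflected CRC32C (Castagnoli)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def _update(state, data):
    """Advance the raw CRC register over data (no init, no final XOR)."""
    for b in data:
        state = _TABLE[(state ^ b) & 0xFF] ^ (state >> 8)
    return state


def crc32c(data):
    """Standard CRC32C: register starts at all ones, result is inverted."""
    return _update(0xFFFFFFFF, data) ^ 0xFFFFFFFF


def crc32c_combine(crc_a, crc_b, len_b):
    """Return crc32c(A + B) given crc32c(A), crc32c(B) and len(B).

    Naive version: push crc_a through len_b zero bytes, then XOR in
    crc_b. The work is proportional to len_b.
    """
    return _update(crc_a, bytes(len_b)) ^ crc_b


def combine_blocks(block_crcs, block_size):
    """Fold a list of equal-sized block CRCs into one CRC (hypothetical helper)."""
    total = block_crcs[0]
    for crc in block_crcs[1:]:
        total = crc32c_combine(total, crc, block_size)
    return total


# Sample data: 12 KiB, i.e. three 4 KiB blocks (block size is an assumption).
data = bytes(range(256)) * 48
blocks = [data[i:i + 4096] for i in range(0, len(data), 4096)]
combined = combine_blocks([crc32c(b) for b in blocks], 4096)
print(hex(combined), combined == crc32c(data))
```

Note that this naive combine still does work proportional to the amount
of data covered, which fits the point above that it isn't an obvious
win over just hashing.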

As for the collision properties of CRC32C, it's actually almost trivial 
to construct collisions.  The reason that it is used in BTRFS is because 
there is a functional guarantee that any single bit error in a block 
_will_ result in a different CRC, and most larger errors will also.  In 
other words, the usage of CRC32C in BTRFS is for error detection and 
because it's ridiculously fast on all modern processors.  As far as the 
possibility of incorrect deduplication, the kernel does a bytewise 
comparison of the extents submitted before actually deduplicating them, 
so there's no chance (barring hardware issues and/or external influence 
from an ill-intentioned third party) of it happening.  Because of this,
you could theoretically just call the ioctl on every possible 
combination of extents in the FS, but that would take a ridiculous 
amount of time (especially because calls involving the same byte ranges 
get internally serialized by the kernel), which is why we have programs 
like duperemove (while the hashing has to read all the data too, it's 
still a lot faster than just comparing all of it directly).
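
As a rough illustration of the points above, the following Python
sketch builds a CRC32C collision on purpose and then shows why a
bytewise comparison after the hash lookup keeps deduplication safe. It
is only a toy: the helper names (collision_delta, xor_at,
dedupe_candidates) are made up, the 512-byte buffers stand in for real
blocks, and the real comparison is done by the kernel, not by userspace
code like this. The collision works because XORing a multiple of the
CRC's generator polynomial into a block leaves its CRC unchanged.

```python
from collections import defaultdict
from itertools import combinations

POLY_REFLECTED = 0x82F63B78   # CRC32C (Castagnoli), bit-reflected form
POLY_NORMAL = 0x11EDC6F41     # same polynomial, all 33 bits, x^32 term on top


def crc32c(data):
    """Bitwise CRC32C; slow, but short enough to check by eye."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY_REFLECTED if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def collision_delta():
    """Five bytes that can be XORed into a block without changing its CRC.

    The bytes spell out the generator polynomial itself, highest term
    first, with the first bit stored in the least significant bit of
    each byte (the order this CRC consumes bits), padded to 40 bits.
    """
    bits = [(POLY_NORMAL >> (32 - i)) & 1 for i in range(33)] + [0] * 7
    return bytes(sum(bits[8 * j + k] << k for k in range(8)) for j in range(5))


def xor_at(block, offset, delta):
    """Return a copy of block with delta XORed in starting at offset."""
    out = bytearray(block)
    for i, d in enumerate(delta):
        out[offset + i] ^= d
    return bytes(out)


def dedupe_candidates(blocks):
    """Group block indices by CRC, then keep only byte-identical pairs."""
    by_crc = defaultdict(list)
    for idx, blk in enumerate(blocks):
        by_crc[crc32c(blk)].append(idx)
    pairs = []
    for group in by_crc.values():
        for a, b in combinations(group, 2):
            if blocks[a] == blocks[b]:      # the bytewise check
                pairs.append((a, b))
    return by_crc, pairs


original = bytes(range(256)) * 2            # 512-byte stand-in for a block
forged = xor_at(original, 100, collision_delta())
blocks = [original, bytes(original), forged, b"\x00" * 512]
by_crc, pairs = dedupe_candidates(blocks)
print("same CRC, different data:",
      crc32c(forged) == crc32c(original) and forged != original)
print("pairs that survive the bytewise check:", pairs)
```

The forged block lands in the same CRC bucket as the original, but only
the genuinely identical pair gets past the comparison. Hashing just
narrows down which pairs are worth comparing.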
>
> There was a patch late last  year I think to re-introduce sha256 hash
> as the checksum, but as far as I know it's not in btrfs-progs yet. I
> forget if that's file, extent or block based.
I'm pretty sure that patch never made it into the kernel.  The original
one was for the kernel, not the userspace programs, and it never got
brought in because the argument for it (better protection against
malicious intent) was inherently invalid for the way checksums are used
in BTRFS: if someone can rewrite your data arbitrarily on disk, they
can rewrite the checksums as well.  I'm also pretty sure that it was
block based (and as such less useful for deduplication than the CRC32C
that we are currently using).



Thread overview: 5+ messages
2015-08-05 21:45 bedup --defrag freezing Konstantin Svist
2015-08-06 11:10 ` Austin S Hemmelgarn
2015-08-12 18:44   ` Konstantin Svist
2015-08-12 19:30     ` Chris Murphy
2015-08-13 11:30       ` Austin S Hemmelgarn [this message]
