From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Chris Murphy <lists@colorremedies.com>,
Konstantin Svist <fry.kun@gmail.com>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: bedup --defrag freezing
Date: Thu, 13 Aug 2015 07:30:38 -0400
Message-ID: <55CC7FDE.5000209@gmail.com>
In-Reply-To: <CAJCQCtSaO7jXeJKe+jkMWcRkG6zmUrc+T=jiSD-54ewfAZXV1A@mail.gmail.com>
On 2015-08-12 15:30, Chris Murphy wrote:
> On Wed, Aug 12, 2015 at 12:44 PM, Konstantin Svist <fry.kun@gmail.com> wrote:
>> On 08/06/2015 04:10 AM, Austin S Hemmelgarn wrote:
>>> On 2015-08-05 17:45, Konstantin Svist wrote:
>>>> Hi,
>>>>
>>>> I've been running btrfs on Fedora for a while now, with bedup --defrag
>>>> running in a night-time cronjob.
>>>> Last few runs seem to have gotten stuck, without possibility of even
>>>> killing the process (kill -9 doesn't work) -- all I could do is hard
>>>> power cycle.
>>>>
>>>> Did something change recently? Is bedup simply too out of date? What
>>>> should I use to de-duplicate across snapshots instead? Etc.?
>>>>
>>> AFAIK, bedup hasn't been actively developed for quite a while (I'm
>>> actually kind of surprised it runs with the newest btrfs-progs).
>>> Personally, I'd suggest using duperemove
>>> (https://github.com/markfasheh/duperemove)
>>
>> Thanks, good to know.
>> Tried duperemove -- it looks like it builds a database of its own
>> checksums every time it runs... why won't it use BTRFS internal
>> checksums for fast rejection? Would run a LOT faster...
>
> I think the reason is that duperemove does extent-based deduplication,
> whereas Btrfs checksums are 4KiB block based, not extent based. So
> many 4KiB CRC32C checksums would need to be in memory that it could be
> kinda expensive. And also, I don't know if CRC32C checksums have
> essentially no practical chance of collision. If it's really rare,
> rather than "so improbable as to be impossible" then you could end up
> with "really rare" corruption where incorrect deduplication happens.
Yeah, duperemove doesn't use them because of the memory limitations.
Theoretically it's possible to take the CRC checksums of the
individual blocks and then combine them to get a checksum of the blocks
as a whole, but it really isn't worth it for that (it would take just
about as long as the current hashing).
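
To make the "combine them" idea concrete, here is a minimal sketch, not what duperemove does. It assumes zlib's CRC32 as a stand-in for CRC32C (the Python standard library has no CRC32C), and it computes the per-block values from the data, whereas a real tool would have to read them from the filesystem's checksum tree. The names block_crcs and extent_fingerprint are made up for illustration.

```python
import hashlib
import struct
import zlib

BLOCK_SIZE = 4096  # data checksums cover 4KiB blocks


def block_crcs(data: bytes) -> list:
    """Per-block checksums (zlib CRC32 here, standing in for CRC32C)."""
    return [zlib.crc32(data[off:off + BLOCK_SIZE])
            for off in range(0, len(data), BLOCK_SIZE)]


def extent_fingerprint(crcs: list) -> bytes:
    """Fold the per-block checksums of one extent into a single digest.

    Hypothetical helper: it packs the CRCs as little-endian 32-bit values
    and hashes the result. The digest is only as strong as the CRCs fed
    into it, so it can pre-filter candidates but cannot prove equality.
    """
    packed = struct.pack("<%dI" % len(crcs), *crcs)
    return hashlib.sha256(packed).digest()


extent_a = bytes(range(256)) * 64           # 16KiB, four blocks
extent_b = bytes(extent_a)                  # identical copy
extent_c = b"\xff" + extent_a[1:]           # differs in the first byte

fp_a = extent_fingerprint(block_crcs(extent_a))
fp_b = extent_fingerprint(block_crcs(extent_b))
fp_c = extent_fingerprint(block_crcs(extent_c))
```

Equal extents give equal fingerprints, and the one-byte change in extent_c alters the CRC of its first block and therefore the fingerprint. Any match would still need a full comparison before deduplicating.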
As for the collision properties of CRC32C, it's actually almost trivial
to construct collisions. The reason that it is used in BTRFS is because
there is a functional guarantee that any single bit error in a block
_will_ result in a different CRC, and most larger errors will also. In
other words, BTRFS uses CRC32C because it is good at error detection and
ridiculously fast on all modern processors, not because it resists
collisions. As for the
possibility of incorrect deduplication, the kernel does a bytewise
comparison of the extents submitted before actually deduplicating them,
so there's no chance (barring hardware issues and/or external influence
from an ill-intentioned third party) of it happening. Because of this,
you could theoretically just call the ioctl on every possible
combination of extents in the FS, but that would take a ridiculous
amount of time (especially because calls involving the same byte ranges
get internally serialized by the kernel), which is why we have programs
like duperemove (while the hashing has to read all the data too, it's
still a lot faster than just comparing all of it directly).
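
As a rough illustration of how easy collisions are, and why the bytewise comparison matters, here is a sketch that forges one. It assumes zlib's CRC32 rather than CRC32C (the two differ only in the polynomial, so the same linear-algebra trick should apply), and the helper names solve_xor and forge_crc32 are made up. It relies on the CRC being affine over GF(2) for a fixed message length: each bit flip changes the checksum by a fixed amount, so four patch bytes give 32 independent knobs.

```python
import zlib


def solve_xor(effects, target):
    """Find a subset of `effects` whose XOR equals `target`.

    Returns a bitmask of the chosen indices (Gaussian elimination over GF(2)).
    """
    basis = {}  # highest set bit -> (value, mask of contributing indices)
    for i, val in enumerate(effects):
        combo = 1 << i
        while val:
            hb = val.bit_length() - 1
            if hb not in basis:
                basis[hb] = (val, combo)
                break
            bval, bcombo = basis[hb]
            val ^= bval
            combo ^= bcombo
    combo = 0
    while target:
        hb = target.bit_length() - 1
        if hb not in basis:
            raise ValueError("target not reachable")
        bval, bcombo = basis[hb]
        target ^= bval
        combo ^= bcombo
    return combo


def forge_crc32(prefix: bytes, target_crc: int) -> bytes:
    """Append 4 bytes to `prefix` so the result has CRC32 == target_crc."""
    base = bytearray(prefix + b"\x00" * 4)
    base_crc = zlib.crc32(bytes(base))
    start = len(prefix)
    # Measure how each of the 32 patch bits changes the checksum.
    effects = []
    for i in range(32):
        flipped = bytearray(base)
        flipped[start + i // 8] ^= 1 << (i % 8)
        effects.append(zlib.crc32(bytes(flipped)) ^ base_crc)
    # Pick the bit flips whose combined effect bridges the gap.
    combo = solve_xor(effects, base_crc ^ target_crc)
    for i in range(32):
        if (combo >> i) & 1:
            base[start + i // 8] ^= 1 << (i % 8)
    return bytes(base)


block_a = b"A" * 4096
forged = forge_crc32(b"B" * 4092, zlib.crc32(block_a))

same_checksum = zlib.crc32(forged) == zlib.crc32(block_a)  # collision
same_bytes = forged == block_a  # stand-in for the kernel's bytewise check
```

The forged block has the same checksum as block_a but completely different contents, so a deduplicator trusting the checksum alone would merge them, while the bytewise comparison rejects the pair.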
>
> There was a patch late last year I think to re-introduce sha256 hash
> as the checksum, but as far as I know it's not in btrfs-progs yet. I
> forget if that's file, extent or block based.
I'm pretty sure that patch never made it into the kernel. The
original one was for the kernel, not the userspace programs, and it
never got brought in because the argument for it (better protection
against malicious intent) was inherently invalid for the usage of
checksums in BTRFS: if someone can rewrite your data arbitrarily on
disk, they can do so for the checksums also. I'm also pretty sure it
was block based (and as such less useful for deduplication than the
CRC32C that we are currently using).