linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Marcel Ritter <ritter.marcel@gmail.com>
Cc: btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Feedback on inline dedup patches
Date: Wed, 30 Dec 2015 16:58:07 +0800
Message-ID: <56839C9F.2090401@cn.fujitsu.com>
In-Reply-To: <568397F5.7070706@cn.fujitsu.com>



Qu Wenruo wrote on 2015/12/30 16:38 +0800:
> Hi Marcel
>
> Thanks a lot for the feedback.
>
> Marcel Ritter wrote on 2015/12/30 09:15 +0100:
>> Hi Qu Wenruo,
>>
>> I just wanted to give some feedback on yesterday's dedup patches:
>>
>> I just applied them to a 4.4-rc7 kernel and did some (very basic)
>> testing:
>>
>> Test1: in-memory
>>
>> It didn't crash on my 350 GB of test files. I then copied those files
>> again, but "btrfs fi df" didn't show much space savings (maybe that's
>> not the right tool to check anyway?).
> Two reasons. One is the default limit of 4096 hashes.
>
> The other is a bug that skips dedup right after a transaction is committed.
> We already have a fix for it internally, but we are busy fixing/testing
> the on-disk backend.
> It's not a big problem, though, as it only skips the first several hits.
>
>> Looking further I found the (default) limit of 4096 hashes (is it really
>> hashes? with 16k blocks that'd cover a dataset of only 64 MB?).
>
> Yes, that's the default value.
>
> It allows even embedded devices to try btrfs dedup.
> The default shouldn't be so big that it makes the system OOM, so I just
> chose the small default of 4096.
>
>> I think I'll start a new test run, with a much higher number of hashes,
>> but I'd like to know the memory requirements involved - is there
>> a formula for calculating those memory needs?
>
> The formula is very easy:
> Memory usage = btrfs_dedup_hash_size * limit.
>
> Currently, btrfs_dedup_hash_size for SHA-256 is 112 bytes.
>
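As a rough sketch of the formula above, the arithmetic can be written out in a few lines of Python. The 112-byte entry size and the 4096 default come from this thread; the function names are made up for illustration, and reading "350 GB" as 350 GiB of entirely unique 16 KiB blocks is an assumption (a worst case), not something measured.

```python
# Back-of-the-envelope numbers for the in-memory dedup backend.
# Only the 112-byte entry size and the 4096 default limit come from
# the thread; every name below is hypothetical.

DEDUP_HASH_ENTRY_BYTES = 112   # per-hash size quoted for SHA-256
DEFAULT_LIMIT = 4096           # default number of hashes


def dedup_memory_bytes(limit, entry_bytes=DEDUP_HASH_ENTRY_BYTES):
    """Memory usage = hash entry size * limit."""
    return limit * entry_bytes


def dedup_coverage_bytes(limit, block_bytes):
    """Amount of data the hash pool can describe at a given dedup block size."""
    return limit * block_bytes


def limit_for_dataset(dataset_bytes, block_bytes):
    """Hashes needed if every block in the dataset were unique (rounded up)."""
    return -(-dataset_bytes // block_bytes)


KIB = 1024
GIB = 1024 ** 3

default_memory = dedup_memory_bytes(DEFAULT_LIMIT)
default_coverage = dedup_coverage_bytes(DEFAULT_LIMIT, 16 * KIB)
big_limit = limit_for_dataset(350 * GIB, 16 * KIB)
big_memory = dedup_memory_bytes(big_limit)

print(default_memory, default_coverage, big_limit, big_memory)
```

With the defaults this gives 448 KiB of memory covering 64 MiB of data, which matches the 64 MB estimate earlier in the thread. Under the assumptions above, covering 350 GiB would need about 22.9 million hashes, or roughly 2.4 GiB of memory.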
>>
>> Test2: ondisk
>>
>> Created a filesystem with "-O dedup", ran "btrfs dedup enable -s ondisk"
>> and started copying the same data (see above). Just a few seconds
>> later I got a kernel crash :-(
>> I'll try to get a kernel dump - maybe this helps to track down the
>> problem.
>
> We're aware of the bug, and are trying our best to fix it.
> But the bug seems quite weird and it may take some time to fix.

OK....

I just confused "btrfs_item_offset_nr" with "btrfs_item_ptr_offset".

And that's the root cause of the problem for the on-disk backend.
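To show why mixing the two up goes wrong, here is a toy Python model. It assumes the ctree.h helpers of that era behave as follows: btrfs_item_offset_nr() returns an item's data offset relative to the start of the leaf's item area, while btrfs_item_ptr_offset() adds the leaf header size on top so the result is relative to the start of the extent buffer. The 101-byte header size and the sample offsets are assumptions for illustration only; the actual buggy code is not shown in this thread.

```python
# Toy model of the two leaf-offset helpers. This is NOT kernel code;
# it only illustrates how the two values are assumed to relate.

BTRFS_HEADER_BYTES = 101  # assumed size of the packed struct btrfs_header


def item_offset_nr(item_offsets, slot):
    """Item data offset, relative to the start of the leaf's item area."""
    return item_offsets[slot]


def item_ptr_offset(item_offsets, slot):
    """Item data offset, relative to the start of the whole extent buffer."""
    return BTRFS_HEADER_BYTES + item_offset_nr(item_offsets, slot)


# Hypothetical leaf with two items; item data grows down from the leaf's end.
offsets = [16000, 15800]

relative = item_offset_nr(offsets, 0)
absolute = item_ptr_offset(offsets, 0)

# Handing `relative` to a routine that expects `absolute` would read or
# write the wrong bytes, off by the header size.
print(relative, absolute, absolute - relative)
```

In this model, using one value where the other is expected touches the wrong bytes of the leaf, which would be consistent with a crash shortly after on-disk dedup starts writing.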

What a SUUUUUUUUUUUUPER STUUUUUUUUUUUUUPID bug!!!

So the fix will come much sooner than I'd expected.

Thanks,
Qu

>
> So on-disk is not recommended, unless you want to help fix the bug.
>
> Thanks,
> Qu
>
>>
>>
>>     Marcel
>>
>>
>
>


