From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Marcel Ritter <ritter.marcel@gmail.com>
Cc: btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Feedback on inline dedup patches
Date: Wed, 30 Dec 2015 16:38:13 +0800
Message-ID: <568397F5.7070706@cn.fujitsu.com>
In-Reply-To: <CADurGxzoCAk7EFVGiRD+8cc1LMYTFiFBLd02ZcPFtjn=MhBuwQ@mail.gmail.com>
Hi Marcel
Thanks a lot for the feedback.
Marcel Ritter wrote on 2015/12/30 09:15 +0100:
> Hi Qu Wenrou,
>
> I just wanted to give some feedback on yesterdays dedup patches:
>
> I just applied them to a 4.4-rc7 kernel and did some (very basic)
> testing:
>
> Test1: in-memory
>
> Didn't crash on my 350 GB test files. Copying those files again,
> but "btrfs fi df" didn't show much space savings (maybe that's not
> the tool to check anyway?).
There are two reasons. One is the default limit of 4096 hashes.
The other is a bug: dedup is skipped right after a transaction is committed.
We already have a fix for it internally, but we are busy fixing/testing
the on-disk backend.
It's not a big problem, though, as it only skips the first several hits.
> Looking further I found the (default) limit of 4096 hashes (is it really
> hashes? with 16k blocks that'd cover a dataset of only 64 MB?).
Yes, that's the default value.
It allows even embedded devices to give btrfs dedup a try.
The default shouldn't be so big that it makes the system OOM, so I just
chose the small default of 4096.
> I think I'll start a new test run, with a much higher number of hashes,
> but I'd like to know the memory requirements involved - is there
> a formula for calculating those memory needs?
The formula is very easy:
Memory usage = btrfs_dedup_hash_size * limit.
Currently, btrfs_dedup_hash_size for SHA-256 is 112 bytes.
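
To make the formula concrete, here is a minimal Python sketch. It is not
part of the patchset, and the function names are made up for illustration.
It assumes the 112-byte entry size for SHA-256 and the 16K dedup block
size from your question; real memory use may be somewhat higher than the
raw product.

```python
# Sketch only: constants come from this thread, all names are hypothetical.
HASH_ENTRY_SIZE = 112          # bytes per in-memory SHA-256 hash entry
DEDUP_BLOCK_SIZE = 16 * 1024   # 16K dedup block size, as in the question


def dedup_memory_bytes(limit, entry_size=HASH_ENTRY_SIZE):
    """Memory usage = hash entry size * limit."""
    return entry_size * limit


def dedup_coverage_bytes(limit, block_size=DEDUP_BLOCK_SIZE):
    """Amount of data that 'limit' hashes can describe."""
    return limit * block_size


def limit_for_dataset(dataset_bytes, block_size=DEDUP_BLOCK_SIZE):
    """Number of hashes needed to cover a dataset (rounded up)."""
    return -(-dataset_bytes // block_size)


if __name__ == "__main__":
    default_limit = 4096
    print(dedup_memory_bytes(default_limit))    # 458752 bytes (448 KiB)
    print(dedup_coverage_bytes(default_limit))  # 67108864 bytes (64 MiB)

    # Assumption: the 350 GB test set is treated as 350 GiB here.
    needed = limit_for_dataset(350 * 1024**3)
    print(needed, dedup_memory_bytes(needed))
```

So the default limit costs about 448 KiB and covers 64 MiB of data, while
covering a 350 GiB data set at 16K blocks would need roughly 22.9 million
hashes, or about 2.4 GiB of memory under these assumptions.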
>
> Test2: ondisk
>
> Created filesystem with "-O dedup", did a "btrfs dedup enable -s ondisk"
> and started copying the same data (see above). Just a few seconds
> later I got a kernel crash :-(
> I'll try to get a kernel dump - maybe this helps to track down the problem.
We're aware of the bug, and are trying our best to fix it.
But the bug seems quite weird, and it may take some time to fix.
So the on-disk backend is not recommended unless you want to help fix the bug.
Thanks,
Qu
>
>
> Marcel
>
>