From: David Sterba <dsterba@suse.cz>
To: Mark Fasheh <mfasheh@suse.de>
Cc: Qu Wenruo <quwenruo@cn.fujitsu.com>, Chris Mason <clm@fb.com>,
Josef Bacik <jbacik@fb.com>, btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: About in-band dedupe for v4.7
Date: Wed, 11 May 2016 19:36:59 +0200
Message-ID: <20160511173659.GI29353@suse.cz>
In-Reply-To: <20160511025211.GF7633@wotan.suse.de>
On Tue, May 10, 2016 at 07:52:11PM -0700, Mark Fasheh wrote:
> Taking your history with qgroups out of this btw, my opinion does not
> change.
>
> With respect to in-memory only dedupe, it is my honest opinion that such a
> limited feature is not worth the extra maintenance work. In particular
> there's about 800 lines of code in the userspace patches which I'm sure
> you'd want merged, because how could we test this then?
I like the in-memory dedup backend. It's lightweight, only a heuristic,
and does not need any IO or persistent storage. OTOH I consider it a
subpart of the in-band deduplication that does all the persistency etc.
So I look at the ioctl interface from that broader perspective.
A use case I find interesting is to keep the in-memory dedup cache and
then flush it to disk on demand, as opposed to automatically synced
dedup (e.g. at commit time).
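To make the idea concrete, here is a minimal userspace sketch of a
bounded in-memory dedup cache that is only written out when asked. It is
an illustration, not the code from the patchset: the class name, the LRU
eviction policy, the JSON dump format and the integer "block addresses"
are all assumptions made for the example.

```python
import hashlib
import json
from collections import OrderedDict


class InMemoryDedupCache:
    """Hypothetical bounded map of block digest -> block address.

    It is only a heuristic: entries can be evicted at any time, so a
    miss does not prove that the data is unique.
    """

    def __init__(self, limit):
        self.limit = limit
        self.entries = OrderedDict()  # sha256 hex digest -> block address

    def write_block(self, data, new_addr):
        """Return the address to reference for `data`.

        On a hit the existing address is reused (the write is deduped),
        otherwise `new_addr` is recorded and returned.
        """
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.entries:
            self.entries.move_to_end(digest)  # mark as recently used
            return self.entries[digest]
        self.entries[digest] = new_addr
        if len(self.entries) > self.limit:
            self.entries.popitem(last=False)  # evict least recently used
        return new_addr

    def flush(self, path):
        """Persist the current cache contents on demand."""
        with open(path, "w") as f:
            json.dump(self.entries, f)
        return len(self.entries)
```

Nothing touches storage until flush() is called, which is the difference
from a backend that syncs its hashes automatically.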
> A couple of example sore points in my review so far:
>
> - Internally you're using a mutex (instead of a spinlock) to lock out queries
> to the in-memory hash, which I can see becoming a performance problem in the
> write path.
>
> - Also, we're doing SHA256 in the write path which I expect will
> slow it down even more dramatically. Given that all the work done gets
> thrown out every time we fill the hash (or remount), I just don't see much
> benefit to the user with this.
I had some ideas to use faster hashes and do sha256 only when it's going
to be stored on disk, but there were some concerns. The objection about
the performance hit at write time is valid. But we'll need to verify
that in real performance tests, which haven't happened yet to my
knowledge.
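A rough sketch of the two-tier idea, under stated assumptions: a cheap
hash (crc32 here, picked only because it is in the Python standard
library) keys the in-memory index, and sha256 is computed only for
records that are about to be persisted. Because a weak hash can collide,
this sketch confirms every in-memory hit with a byte comparison. The
class and method names are hypothetical and nothing here reflects the
actual patches.

```python
import hashlib
import zlib


class TwoTierIndex:
    """Hypothetical index: cheap hash in memory, sha256 only on persist."""

    def __init__(self):
        self.buckets = {}  # crc32 value -> list of distinct blocks (bytes)

    def add(self, data):
        """Return True if `data` duplicates a block already in the index."""
        key = zlib.crc32(data)
        bucket = self.buckets.setdefault(key, [])
        if data in bucket:  # byte comparison guards against crc32 collisions
            return True
        bucket.append(data)
        return False

    def persistent_records(self):
        """Compute the strong hash lazily, at persistence time only."""
        return sorted(
            hashlib.sha256(block).hexdigest()
            for bucket in self.buckets.values()
            for block in bucket
        )
```

Whether this is a net win on the write path depends on the relative
cost of the hash and the byte comparison, which is the kind of thing
only a real benchmark can answer.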
> Users can get better dedupe via the ioctl today than with what
> you propose go in as an experimental feature so I don't see many people
> caring to test it. IMHO you would have to provide a more compelling reason
> to include this code.
I see it as a complementary feature in the deduplication capabilities,
covering more use cases.