From: Mark Fasheh <mfasheh@suse.de>
To: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: dsterba@suse.cz, Timofey Titovets <nefelim4ag@gmail.com>,
linux-btrfs@vger.kernel.org
Subject: Re: Btrfs offline deduplication
Date: Fri, 1 Aug 2014 13:18:09 -0700
Message-ID: <20140801201809.GH2203@wotan.suse.de>
In-Reply-To: <53DBE816.9050209@gmail.com>
On Fri, Aug 01, 2014 at 03:18:46PM -0400, Austin S Hemmelgarn wrote:
> > Why does this have to be kernel side? There's userspace software already to
> > dedupe that can be run on a regular basis. Exporting checksums is a
> > different story (you can do that via ioctl) but running the dedupe software
> > itself inside the kernel is exactly what we want to avoid by having the
> > dedupe ioctl in the first place.
> > --Mark
> >
> > --
> > Mark Fasheh
> >
> Based on the same logic however, we don't need scrub to be done kernel
> side, as it wouldn't take but one more ioctl to be able to tell it which
> block out of a set to treat as valid. I'm not saying that things need
> to be done in the kernel, but duperemove doesn't use the ioctl interface
> even if it exists, and bedup is buggy as hell (unless it's improved
> greatly in the last two weeks), and neither of them is at all efficient.
Duperemove absolutely *does* use the ioctl interface for offline dedupe.
> I do understand that this isn't something that is computationally
> simple (especially on x86 with its deficiency of registers), but rsync
> does almost the same thing for data transmission over the network, and
> it does so seemingly much more efficiently than either option available
> at the moment.
None of the problems you mentioned get solved by pushing the entirety of
offline deduplication into the kernel. If anything, it's more dangerous to do
that as bugs tend to be far more critical when we hit them from kernel.
Regarding duperemove there's a series to fix up some performance issues that
I'm working on importing at the moment.
--Mark
--
Mark Fasheh
Thread overview: 7+ messages
2014-07-31 23:54 Btrfs offline deduplication Timofey Titovets
2014-08-01 10:17 ` Austin S Hemmelgarn
2014-08-01 13:23 ` David Sterba
2014-08-01 14:16 ` Austin S Hemmelgarn
2014-08-01 18:55 ` Mark Fasheh
2014-08-01 19:18 ` Austin S Hemmelgarn
2014-08-01 20:18 ` Mark Fasheh [this message]