From: Kai Krakow <hurikhan77@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: Tiered storage?
Date: Thu, 16 Nov 2017 17:42:34 +0100
Message-ID: <20171116174234.0b6c21cc@jupiter.sol.kaishome.de>
In-Reply-To: 08cbefb3-43eb-8d76-1dd6-191e2709bdc7@dirtcellar.net
On Wed, 15 Nov 2017 08:11:04 +0100,
waxhead <waxhead@dirtcellar.net> wrote:
> As for dedupe there is (to my knowledge) nothing fully automatic yet.
> You have to run a program to scan your filesystem but all the
> deduplication is done in the kernel.
> duperemove works apparently quite well when I tested it, but there
> may be some performance implications.
There's bees, a near-line deduplication tool: it watches for
generation changes in the filesystem and walks the inodes. It only
looks at extents, not at files. Deduplication itself is then delegated
to the kernel, which ensures all changes are data-safe. It runs as a
daemon and processes your changes in near real time (delayed by a few
seconds to minutes, of course, due to the transaction commit and
hashing phases).
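To illustrate the general idea of hash-based duplicate detection (this
is only a conceptual sketch, not how bees is actually implemented), the
following hashes fixed-size blocks and reports candidate duplicates.
The 4 KiB block size, the use of SHA-256, and the function name are all
assumptions made for the example:

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK_SIZE = 4096  # assumed block size, chosen only for this illustration


def find_duplicate_blocks(
    data: bytes, block_size: int = BLOCK_SIZE
) -> List[Tuple[int, int]]:
    """Return (duplicate_offset, first_seen_offset) pairs of identical blocks."""
    seen: Dict[bytes, int] = {}  # block hash -> offset where it was first seen
    duplicates: List[Tuple[int, int]] = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        first = seen.get(digest)
        # Compare the bytes as well, so a hash match alone is never trusted.
        if first is not None and data[first:first + block_size] == block:
            duplicates.append((offset, first))
        else:
            seen.setdefault(digest, offset)
    return duplicates


if __name__ == "__main__":
    sample = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
    print(find_duplicate_blocks(sample))
```

In a real setup the candidate pairs would only be suggestions: the
actual sharing of extents is left to the kernel, as described above.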
You need to dedicate part of your RAM to it; around 1 GB is usually
enough for it to work well. That RAM is locked and cannot be swapped
out, so your system should have enough memory to spare.
Works very well here (2TB of data, 1GB hash table, 16GB RAM).
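As a rough back-of-the-envelope check of those numbers, here is a small
sketch. The 16-byte hash table entry size is an assumption on my part
(check the bees documentation for the real figure), as are the binary
units and the helper names:

```python
GIB = 1024 ** 3
TIB = 1024 ** 4
ENTRY_BYTES = 16  # ASSUMPTION: size of one hash table entry, not from this mail


def hash_table_entries(table_bytes: int, entry_bytes: int = ENTRY_BYTES) -> int:
    """Number of entries that fit into a hash table of the given size."""
    return table_bytes // entry_bytes


def data_per_entry(
    data_bytes: int, table_bytes: int, entry_bytes: int = ENTRY_BYTES
) -> int:
    """Average amount of data each hash table entry has to cover."""
    return data_bytes // hash_table_entries(table_bytes, entry_bytes)


if __name__ == "__main__":
    print(hash_table_entries(1 * GIB))       # entries in a 1 GiB table
    print(data_per_entry(2 * TIB, 1 * GIB))  # bytes of data per entry
```

Under that assumption, a 1 GiB table holds about 67 million entries, so
with 2 TiB of data each entry covers 32 KiB on average.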
New duplicated files are picked up within seconds, scanned (mostly
hitting the cache, so no physical IO is required), and then submitted
to the kernel for deduplication.
I'd call that fully automatic: Once set up, it just works, and works
well. Performance impact is very low once the initial scan is done.
https://github.com/Zygo/bees
--
Regards,
Kai
Replies to list-only preferred.