From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: waxhead@dirtcellar.net, Roy Sigurd Karlsbakk <roy@karlsbakk.net>,
	linux-btrfs@vger.kernel.org
Subject: Re: Tiered storage?
Date: Wed, 15 Nov 2017 07:52:13 -0500
Message-ID: <45d19817-ebf8-65ae-693f-2324ba637a67@gmail.com>
In-Reply-To: <08cbefb3-43eb-8d76-1dd6-191e2709bdc7@dirtcellar.net>

On 2017-11-15 02:11, waxhead wrote:
> As a regular BTRFS user I can tell you that there is no such thing as 
> hot data tracking yet. Some people seem to use bcache together with 
> btrfs and come asking for help on the mailing list.
Bcache works fine on recent versions; only older versions had issues.
dm-cache similarly works fine on recent versions.  In both cases though,
you need to be sure you know what you're doing, otherwise you are liable
to break things.
> 
> Raid5/6 have received a few fixes recently, and it *may* soon be worth 
> trying out raid5/6 for data, but keeping metadata in raid1/10 (I would 
> rather lose a file or two than the entire filesystem).
> I had plans to run some tests on this a while ago, but forgot about it.
> As all good citizens, remember to have good backups. Last time I tested 
> for Raid5/6 I ran into issues easily. For what it's worth - raid1/10 
> seems pretty rock solid as long as you have sufficient disks (hint: you 
> need more than two for raid1 if you want to stay safe)
Parity profiles (raid5 and raid6) still have issues, although there are 
fewer than there were, with most of the remaining issues surrounding 
recovery.  I would still recommend against them for production usage.

Simple replication (raid1) is pretty much rock solid as long as you keep 
on top of replacing failing hardware and aren't stupid enough to run the 
array degraded for any extended period of time (converting to a single 
device volume instead of leaving things with half a volume is vastly 
preferred for multiple reasons).
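
For a two-device raid1 volume that has lost one device, the conversion
mentioned above can be sketched roughly as follows.  This is only one
commonly described sequence, not a tested recipe: the device path and
mount point are hypothetical placeholders, the helper name is made up for
illustration, and btrfs-balance(8) and btrfs-device(8) should be checked
before running anything.  The Python helper only builds and prints the
command lines; it does not execute them.

```python
import shlex


def degraded_raid1_to_single_cmds(surviving_dev, mountpoint):
    """Return (but do not run) commands to turn a degraded 2-device raid1
    volume into a single-device volume.  Paths are caller-supplied."""
    return [
        # Mount the surviving device even though its partner is missing.
        ["mount", "-o", "degraded", surviving_dev, mountpoint],
        # Rewrite data as 'single' and metadata as 'dup' on the one device.
        # -f permits a conversion that reduces metadata redundancy.
        ["btrfs", "balance", "start", "-f",
         "-dconvert=single", "-mconvert=dup", mountpoint],
        # Drop the record of the missing device from the volume.
        ["btrfs", "device", "remove", "missing", mountpoint],
    ]


if __name__ == "__main__":
    # Hypothetical device and mount point, for illustration only.
    for cmd in degraded_raid1_to_single_cmds("/dev/sdb", "/mnt"):
        print(shlex.join(cmd))
```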

Striped replication (raid10) is generally fine, but you can get much 
better performance by running BTRFS with a raid1 profile on top of two 
MD/LVM/Hardware RAID0 volumes (BTRFS still doesn't do a very good job of 
parallelizing things).
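
As a rough sketch of the layout described above (BTRFS raid1 across two
MD RAID0 arrays), the following Python helper builds the mdadm and
mkfs.btrfs command lines.  All device names are hypothetical, the helper
name is made up for illustration, and the commands are only printed, not
executed; creating arrays or filesystems destroys existing data on the
devices involved.

```python
import shlex


def raid1_over_md_raid0_cmds(pair_a, pair_b, md_a="/dev/md0", md_b="/dev/md1"):
    """Return (but do not run) commands that stripe each group of disks
    with MD RAID0 and then mirror the two stripes with BTRFS raid1."""
    cmds = []
    for md, members in ((md_a, pair_a), (md_b, pair_b)):
        # One RAID0 array per group of member disks.
        cmds.append(["mdadm", "--create", md, "--level=0",
                     f"--raid-devices={len(members)}", *members])
    # BTRFS keeps two copies of data and metadata, one on each stripe.
    cmds.append(["mkfs.btrfs", "-d", "raid1", "-m", "raid1", md_a, md_b])
    return cmds


if __name__ == "__main__":
    # Hypothetical disks, for illustration only.
    for cmd in raid1_over_md_raid0_cmds(["/dev/sda", "/dev/sdb"],
                                        ["/dev/sdc", "/dev/sdd"]):
        print(shlex.join(cmd))
```

With this arrangement BTRFS still checksums and repairs from the second
copy, while the striping is left to MD.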
> 
> As for dedupe there is (to my knowledge) nothing fully automatic yet. 
> You have to run a program to scan your filesystem but all the 
> deduplication is done in the kernel.
> duperemove apparently worked quite well when I tested it, but there may 
> be some performance implications.
Correct, there is nothing automatic (and there are pretty significant 
arguments against doing automatic deduplication in most cases), but the 
off-line options (via the EXTENT_SAME ioctl) are reasonably reliable. 
Duperemove in particular does a good job, though it may take a long time 
for large data sets.

As far as performance goes, it's no worse than having large numbers of 
snapshots.  The issues arise from using very large numbers of reflinks.
> 
> Roy Sigurd Karlsbakk wrote:
>> Hi all
>>
>> I've been following this project on and off for quite a few years, and 
>> I wonder if anyone has looked into tiered storage on it. With tiered 
>> storage, I mean hot data lying on fast storage and cold data on slow 
>> storage. I'm not talking about caching (where you just keep a copy of 
>> the hot data on the fast storage).
>>
>> And btw, how far is raid[56] and block-level dedup from something 
>> useful in production?
>>
>> Kind regards
>>
>> roy
>> -- 
>> Roy Sigurd Karlsbakk
>> (+47) 98013356
>> http://blogg.karlsbakk.net/
>> GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
>> -- 
>> Hið góða skaltu í stein höggva, hið illa í snjó rita.

