linux-btrfs.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: John Petrini <jpetrini@coredial.com>
Cc: Chris Murphy <lists@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Volume appears full but TB's of space available
Date: Fri, 7 Apr 2017 09:50:16 -0400
Message-ID: <56b58b49-a4ab-56f9-25e5-94d64699da83@gmail.com>
In-Reply-To: <CAD4AmV7-tFesnWbYR=geRVcFedgb4zcfKeVXuEVNnNtJN0qg1w@mail.gmail.com>

On 2017-04-07 09:28, John Petrini wrote:
> Hi Austin,
>
> Thanks for taking the time to provide all of this great information!
Glad I could help.
>
> You've got me curious about RAID1. If I were to convert the array to
> RAID1 could it then sustain a multi drive failure? Or in other words
> do I actually end up with mirrored pairs or can a chunk still be
> mirrored to any disk in the array? Are there performance implications
> to using RAID1 vs RAID10?
>
For raid10, your data is stored as 2 replicas, striped at or below the 
filesystem-block level across all the disks in the array.  Because of 
how the data striping is currently done, you're functionally guaranteed 
to lose data if you lose more than one disk in raid10 mode.  This could 
theoretically be improved so that partial losses are recoverable, but 
doing so with the current implementation would be extremely 
complicated, and as such it is not a high priority (although patches 
would almost certainly be welcome).

For raid1, your data is stored as 2 replicas with each entirely on one 
disk, but individual chunks (the higher level allocation in BTRFS) are 
distributed in a round-robin fashion among the disks, so any given 
filesystem block is on exactly 2 disks.  With the current 
implementation, for any reasonably utilized filesystem, you will lose 
data if you lose 2 or more disks in raid1 mode.  That said, there are 
plans (still vaporware for now, deferred in favor of getting raid5/6 
working) to add arbitrary replication levels to BTRFS, so once that 
lands, you could set things to have as many replicas as you want.
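If you do decide to convert, the restriping is done online with a 
balance filter.  A minimal sketch, assuming the array is mounted at 
/mnt/array (the mount point is just an example):

```shell
# Convert both data and metadata chunks to raid1.  This rewrites
# every chunk on the FS, so expect it to take a while on a big array.
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/array

# From another terminal, check how far along it is:
btrfs balance status /mnt/array

# Once it finishes, verify the new profiles:
btrfs filesystem df /mnt/array
```

Note that the balance needs enough unallocated space to write out the 
new chunks, so given the subject of this thread, check `btrfs 
filesystem usage` first if the array is nearly full.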

In effect, both can currently sustain only one disk failure.  Losing 2 
disks in raid10 will probably corrupt files (currently it will 
functionally kill the FS, although with a bit of theoretically simple 
work this could be changed), while losing 2 disks in raid1 mode will 
usually just make whole files disappear, unless they are larger than 
the data chunk size (usually between 1 and 5 GB depending on the size 
of the FS).  So if you're just storing small files, you'll have an 
easier time quantifying data loss with raid1 than with raid10.  Both 
modes can lose the FS entirely if the lost disks happen to take out 
the System chunk.

As for performance, raid10 mode in BTRFS gets better performance, but 
you can get even better performance than that by running BTRFS in 
raid1 mode on top of 2 LVM or MD raid0 volumes.  Such a configuration 
provides the same effective data safety as BTRFS raid10, but can get 
anywhere from 5% to 30% better performance depending on the workload.
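For reference, that layered setup looks roughly like this.  This is a 
sketch only; the device names and mount point are made up for 
illustration:

```shell
# Two 2-disk raid0 stripes built with MD (device names hypothetical):
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda /dev/sdb
mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdc /dev/sdd

# BTRFS then mirrors across the two stripes, so every block exists
# once on each raid0 volume:
mkfs.btrfs -d raid1 -m raid1 /dev/md0 /dev/md1
mount /dev/md0 /mnt/array
```

Because BTRFS only sees two devices here, each replica is guaranteed 
to land on a different stripe, which is what gives you the same 
effective safety as raid10.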

If you care about both performance and data safety, I would suggest 
using BTRFS raid1 mode on top of LVM or MD RAID0, together with good 
backups and good monitoring.  Statistically speaking, catastrophic 
hardware failures are rare, and you'll usually have more than enough 
warning that a device is failing before it actually does, so provided 
you keep on top of monitoring and replace disks that are showing signs 
of impending failure as soon as possible, you will be no worse off in 
terms of data integrity than running ext4 or XFS on top of an LVM or 
MD RAID10 volume.
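On the monitoring side, the basics are easy to script.  A minimal 
sketch (mount point and device name hypothetical):

```shell
# Per-device error counters kept by BTRFS itself; anything non-zero
# is a red flag:
btrfs device stats /mnt/array

# Periodic scrubs catch latent corruption while the redundancy to
# repair it still exists:
btrfs scrub start /mnt/array
btrfs scrub status /mnt/array

# SMART attributes give early warning (watch reallocated and
# current-pending sector counts):
smartctl -H -A /dev/sda
```

Run the scrub from cron (monthly is a common choice) and have smartd 
mail you when SMART attributes change.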


Thread overview: 20+ messages
2017-04-07  0:47 Volume appears full but TB's of space available John Petrini
2017-04-07  1:15 ` John Petrini
2017-04-07  1:21   ` Chris Murphy
2017-04-07  1:31     ` John Petrini
2017-04-07  2:42       ` Chris Murphy
2017-04-07  3:25         ` John Petrini
2017-04-07 11:41           ` Austin S. Hemmelgarn
2017-04-07 13:28             ` John Petrini
2017-04-07 13:50               ` Austin S. Hemmelgarn [this message]
2017-04-07 16:28                 ` Chris Murphy
2017-04-07 16:58                   ` Austin S. Hemmelgarn
2017-04-07 17:05                     ` John Petrini
2017-04-07 17:11                       ` Austin S. Hemmelgarn
2017-04-07 16:04             ` Chris Murphy
2017-04-07 16:51               ` Austin S. Hemmelgarn
2017-04-07 16:58                 ` John Petrini
2017-04-07 17:04                   ` Austin S. Hemmelgarn
2017-04-08  5:12             ` Duncan
2017-04-10 11:31               ` Austin S. Hemmelgarn
2017-04-07  1:17 ` Chris Murphy
