From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Brendan Hide <brendan@swiftspirit.co.za>,
ashford@whisperpc.com, Phillip Susi <psusi@ubuntu.com>
Cc: Jose Manuel Perez Bethencourt <jmperezbeth@gmail.com>,
Chris Murphy <lists@colorremedies.com>,
"sys.syphus" <syssyphus@gmail.com>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: I need to P. are we almost there yet?
Date: Fri, 02 Jan 2015 14:41:10 -0500
Message-ID: <54A6F456.6090701@gmail.com>
In-Reply-To: <54A6D951.9080706@swiftspirit.co.za>
On 2015-01-02 12:45, Brendan Hide wrote:
> On 2015/01/02 15:42, Austin S Hemmelgarn wrote:
>> On 2014-12-31 12:27, ashford@whisperpc.com wrote:
>>> I see this as a CRITICAL design flaw. The reason for calling it CRITICAL
>>> is that System Administrators have been trained for >20 years that RAID-10
>>> can usually handle a dual-disk failure, but the BTRFS implementation has
>>> effectively ZERO chance of doing so.
>> No, some rather simple math
> That's the problem. The math isn't as simple as you'd expect:
>
> The example below is probably a pathological case - but here goes. Let's
> say in this 4-disk example that chunks are striped as d1,d2,d1,d2 where
> d1 is the first bit of data and d2 is the second:
> Chunk 1 might be striped across disks A,B,C,D d1,d2,d1,d2
> Chunk 2 might be striped across disks B,C,A,D d3,d4,d3,d4
> Chunk 3 might be striped across disks D,A,C,B d5,d6,d5,d6
> Chunk 4 might be striped across disks A,C,B,D d7,d8,d7,d8
> Chunk 5 might be striped across disks A,C,D,B d9,d10,d9,d10
>
> Lose any two disks and you have a 50% chance on *each* chunk of having
> lost that chunk. With traditional RAID10 you have a 50% chance of losing
> the array entirely. With btrfs, the more data you have stored, the closer
> the chance of losing *some* data in a 2-disk failure gets to 100%.
>
> In the above example, losing A and B means you lose d3, d6, and d7
> (which ends up being 60% of all chunks).
> Losing A and C means you lose d1 (20% of all chunks).
> Losing A and D means you lose d9 (20% of all chunks).
> Losing B and C means you lose d10 (20% of all chunks).
> Losing B and D means you lose d2 (20% of all chunks).
> Losing C and D means you lose d4, d5, AND d8 (60% of all chunks).
>
> The above skewed example has an average of 40% of all chunks failed. As
> you add more data and randomise the allocation, this will approach 50% -
> BUT the chance of losing *some* data is already clearly shown to be
> very close to 100%.
>
OK, I forgot about the randomization effect that chunk allocation and
freeing have. We really should slap a *BIG* warning label on that
(and ideally find some better way to allocate chunks so it's more reliable).
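For anyone who wants to check the five-chunk example quoted above, here is
a minimal Python sketch that enumerates every two-disk failure. The names
(CHUNKS, mirror_pairs, lost_chunks, summarize, p_some_loss) are purely
illustrative and have nothing to do with btrfs code. p_some_loss also
assumes independent, uniformly random chunk layouts, which is my own
simplification and not a description of the real allocator.

```python
from itertools import combinations

# Layouts copied from the quoted example. Each string lists a chunk's four
# disks in stripe order; stripes alternate first-half, second-half, so
# positions 0 and 2 hold one mirrored half and positions 1 and 3 the other.
CHUNKS = ["ABCD", "BCAD", "DACB", "ACBD", "ACDB"]


def mirror_pairs(layout):
    """Return the two disk pairs that each hold both copies of one half."""
    return [frozenset(layout[0::2]), frozenset(layout[1::2])]


def lost_chunks(failed, chunks=CHUNKS):
    """List the 1-based chunk numbers that lose a half when `failed` disks die."""
    failed = set(failed)
    return [
        number
        for number, layout in enumerate(chunks, start=1)
        if any(pair <= failed for pair in mirror_pairs(layout))
    ]


def summarize(chunks=CHUNKS):
    """Map every two-disk failure (e.g. "AB") to the chunks it destroys."""
    disks = sorted(set("".join(chunks)))
    return {"".join(pair): lost_chunks(pair, chunks) for pair in combinations(disks, 2)}


def p_some_loss(n_chunks):
    """Chance a given two-disk failure hits at least one of n chunks.

    Assumption (mine): layouts are independent and uniformly random, so a
    given disk pair is a mirror pair in 1 of the 3 ways to pair up 4 disks.
    """
    return 1 - (2 / 3) ** n_chunks


if __name__ == "__main__":
    result = summarize()
    for pair, lost in result.items():
        print(pair, "loses chunks", lost)
    average = sum(len(lost) for lost in result.values()) / (len(result) * len(CHUNKS))
    print(f"average fraction of chunks lost: {average:.1%}")
    print(f"P(some loss) with 10 random chunks: {p_some_loss(10):.1%}")
```

By this tally every one of the six possible pairs takes out at least one
chunk, which is the point being made. The per-pair average works out to
one third of all chunks rather than the 40% quoted, but that does not
change the conclusion about losing *some* data.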
As an aside, I've found that a BTRFS raid1 set on top of 2 LVM/MD RAID0
sets is actually faster than using a BTRFS raid10 set with the same
number of disks (how much faster is workload dependent), and provides
better guarantees than a BTRFS raid10 set.