From: David Brown <david@westcontrol.com>
To: linux-raid@vger.kernel.org
Subject: Re: Triple-parity raid6
Date: Fri, 10 Jun 2011 16:28:09 +0200
Message-ID: <ist9n7$khq$1@dough.gmane.org>
In-Reply-To: <4DF20C18.3030604@christoph-d.de>
On 10/06/2011 14:20, Christoph Dittmann wrote:
> On 06/10/2011 10:45 AM, David Brown wrote:
>> Making multiple parity syndromes is easy enough mathematically:
>
> Adam Leventhal, who wrote the double parity and triple parity code for
> ZFS, mentioned on his blog [1] that going beyond triple parity poses
> significant challenges if write performance should not suffer.
>
> In particular, it looks like a relevant math paper (originally) had a
> flaw in the claim that quad parity and above can be implemented just
> like triple parity.
>
> I don't know the implementation you were going to use, neither am I
> knowledgeable about multi-parity in general. I only thought it might be
> relevant to add to the current discussion that other people had issues
> with implementing N-parity for N > 3.
>
>
I've looked at Adam's blog article - in fact, stumbling over it when
wandering around the 'net was one of my inspirations to think about
multi-parity raid again.
But there are a few key differences between the description in the
referenced James Plank papers and the implementation I've looked at.
One point is that he is looking at GF(2^4), which quickly becomes a
limiting factor. It's hard to get enough independent parity syndromes,
and you can easily start to think that things won't work for larger parity.
The other is the choice of factors for the syndromes. Assuming I
understand the papers correctly, the first paper is suggesting this
arrangement (all arithmetic done in GF()):
P_0 = 1^0 . d_0 + 2^0 . d_1 + 3^0 . d_2 + 4^0 . d_3
P_1 = 1^1 . d_0 + 2^1 . d_1 + 3^1 . d_2 + 4^1 . d_3
P_2 = 1^2 . d_0 + 2^2 . d_1 + 3^2 . d_2 + 4^2 . d_3
P_3 = 1^3 . d_0 + 2^3 . d_1 + 3^3 . d_2 + 4^3 . d_3
The second paper changes it to:
P_0 = 1^0 . d_0 + 1^1 . d_1 + 1^2 . d_2 + 1^3 . d_3
P_1 = 2^0 . d_0 + 2^1 . d_1 + 2^2 . d_2 + 2^3 . d_3
P_2 = 3^0 . d_0 + 3^1 . d_1 + 3^2 . d_2 + 3^3 . d_3
P_3 = 4^0 . d_0 + 4^1 . d_1 + 4^2 . d_2 + 4^3 . d_3
For the first two parity blocks, this is the same as for Linux raid6:
P = d_0 + d_1 + d_2 + d_3
Q = g^0 . d_0 + g^1 . d_1 + g^2 . d_2 + g^3 . d_3
(where g = 2)
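As an aside, to make the GF() arithmetic concrete, here is a rough
Python sketch of the P and Q calculation (using the 0x11d polynomial
that Linux raid6 uses; the helper names are just made up for
illustration - this is not the kernel code):

GF_POLY = 0x11d  # x^8 + x^4 + x^3 + x^2 + 1, as used by Linux raid6

def gf_mul(a, b):
    """Multiply two bytes as elements of GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return r

def gf_pow(a, n):
    """Raise a GF(2^8) element to an integer power."""
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def raid6_pq(blocks):
    """P and Q over a stripe of equal-length data blocks d_0 .. d_{n-1}."""
    p = bytearray(len(blocks[0]))
    q = bytearray(len(blocks[0]))
    for i, d in enumerate(blocks):
        gi = gf_pow(2, i)                 # g^i, with g = 2
        for j, byte in enumerate(d):
            p[j] ^= byte                  # P = d_0 + d_1 + ... (plain XOR)
            q[j] ^= gf_mul(gi, byte)      # Q = g^0.d_0 + g^1.d_1 + ...
    return bytes(p), bytes(q)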
But the third (and later) lines are not good - the restoration equations
for multiple disk failures are not independent for all combinations of
failed disks. For example, if you have 20 data disks and 3 parity
disks, there are 1140 different combinations of three data-disk
failures. Of these, 92 combinations lead to non-independent equations
when you try to restore the data.
The equations I am using are:
P_0 = 1^0 . d_0 + 1^1 . d_1 + 1^2 . d_2 + 1^3 . d_3
P_1 = 2^0 . d_0 + 2^1 . d_1 + 2^2 . d_2 + 2^3 . d_3
P_2 = 4^0 . d_0 + 4^1 . d_1 + 4^2 . d_2 + 4^3 . d_3
P_3 = 8^0 . d_0 + 8^1 . d_1 + 8^2 . d_2 + 8^3 . d_3
P_4 would use powers of 16, etc.
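Extending that to more parity blocks is then just a change of generator
per parity row - something along these lines (reusing gf_mul/gf_pow from
the sketch above; again, only an illustration):

def multi_parity(blocks, n_parity):
    """P_k = (2^k)^0 . d_0 + (2^k)^1 . d_1 + ... for k = 0 .. n_parity-1."""
    size = len(blocks[0])
    parities = [bytearray(size) for _ in range(n_parity)]
    for k in range(n_parity):
        gk = gf_pow(2, k)                 # generators 1, 2, 4, 8, ...
        for i, d in enumerate(blocks):
            coeff = gf_pow(gk, i)         # (2^k)^i
            for j, byte in enumerate(d):
                parities[k][j] ^= gf_mul(coeff, byte)
    return [bytes(p) for p in parities]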
I have checked that the restoration equations are solvable for all
combinations of 3 disk failures for triple parity, and all combinations
of 4 disk failures for quad parity. The restoration equations can be
written in matrix form and are dependent on the indexes of the failed
disks. To solve them, you simply invert the matrix and this gives you a
linear formula for re-calculating each missing data block. So to check
that the data can be restored, you need to check that the determinant of
this matrix is non-zero for all combinations of n-disk failures.
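The check itself could look roughly like this (not my actual python
program, just a sketch of the idea, reusing gf_mul/gf_pow from above):

from itertools import combinations

def gf_inv(a):
    """Multiplicative inverse in GF(2^8): a^254."""
    return gf_pow(a, 254)

def gf_det(m):
    """Determinant over GF(2^8) by Gaussian elimination (addition is XOR,
    and a row swap doesn't change the sign, since -1 = 1 here)."""
    m = [row[:] for row in m]
    n = len(m)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col]), None)
        if pivot is None:
            return 0                      # singular
        m[col], m[pivot] = m[pivot], m[col]
        det = gf_mul(det, m[col][col])
        inv = gf_inv(m[col][col])
        for r in range(col + 1, n):
            if m[r][col]:
                f = gf_mul(m[r][col], inv)
                for c in range(col, n):
                    m[r][c] ^= gf_mul(f, m[col][c])
    return det

def syndrome_matrix(failed):
    """Rows are P_0 .. P_{k-1}, columns the failed data-disk indexes;
    entry (k, i) is (2^k)^i."""
    return [[gf_pow(gf_pow(2, k), i) for i in failed]
            for k in range(len(failed))]

def singular_failures(n_data, n_parity):
    """All n_parity-way data-disk failure combinations whose restoration
    matrix is singular."""
    return [f for f in combinations(range(n_data), n_parity)
            if gf_det(syndrome_matrix(f)) == 0]

# singular_failures(20, 3) should come back empty for this scheme

Swapping the coefficient rule in syndrome_matrix() to the second paper's
(k+1)^i arrangement is how you would count the non-independent
combinations I mentioned above.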
I /believe/ these should work fine for at least up to P_7 (i.e., 8
parity disks), but I haven't yet checked more than a few larger parity
modes. Checking all 4-disk failures from 253 disks took my inefficient
python program about 45 minutes - to check for 8-disk failures would
mean finding all combinations of 8 choices from up to 249 disks. If my
maths serves me right, that's 316,528,258,780,856 combinations - and for
each one, I'd have to find the determinant of an 8x8 matrix over
GF(2^8). Fortunately, I don't think there'll be much call for
octal-parity RAID in the near future :-)
Of course, all this assumes that my maths is correct!