From: David Brown
Subject: Re: Triple-parity raid6
Date: Fri, 10 Jun 2011 00:42:41 +0200
Sender: linux-raid-owner@vger.kernel.org
To: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

On 09/06/11 02:01, David Brown wrote:
> Has anyone considered triple-parity raid6? As far as I can see, it
> should not be significantly harder than normal raid6 - either to
> implement, or for the processor at run-time. Once you have the GF(2⁸)
> field arithmetic in place for raid6, it's just a matter of making
> another parity block in the same way but using a different generator:
>
> P = D_0 + D_1 + D_2 + .. + D_(n-1)
> Q = D_0 + g.D_1 + g².D_2 + .. + g^(n-1).D_(n-1)
> R = D_0 + h.D_1 + h².D_2 + .. + h^(n-1).D_(n-1)
>
> The raid6 implementation in mdraid uses g = 0x02 to generate the second
> parity (based on "The mathematics of RAID-6" - I haven't checked the
> source code). You can make a third parity using h = 0x04 and then get a
> redundancy of 3 disks. (Note - I haven't yet confirmed that this is
> valid for more than 100 data disks - I need to make my checker program
> more efficient first.)
>
> Rebuilding a disk, or running in degraded mode, is just an obvious
> extension to the current raid6 algorithms. If you are missing three data
> blocks, the maths looks hard to start with - but if you express the
> equations as a set of linear equations and use standard matrix inversion
> techniques, it should not be hard to implement. You only need to do this
> inversion once when you find that one or more disks have failed - then
> you pre-compute the multiplication tables in the same way as is done for
> raid6 today.
>
> In normal use, calculating the R parity is no more demanding than
> calculating the Q parity.
> And most rebuilds or degraded situations will
> only involve a single disk, and the data can thus be re-constructed
> using the P parity just like raid5 or two-parity raid6.
>
> I'm sure there are situations where triple-parity raid6 would be
> appealing - it has already been implemented in ZFS, and it is only a
> matter of time before two-parity raid6 has a real probability of hitting
> an unrecoverable read error during a rebuild.
>
> And of course, there is no particular reason to stop at three parity
> blocks - the maths can easily be generalised. 1, 2, 4 and 8 can be used
> as generators for quad-parity (checked up to 60 disks), and adding 16
> gives you quintuple parity (checked up to 30 disks) - but that's maybe
> getting a bit paranoid.
>
> Best regards,
>
> David
>

Just to follow up on my numbers here - I've now checked the validity of
triple-parity using generators 1, 2 and 4 for up to 254 data disks
(i.e., 257 disks altogether). I've checked the validity of quad-parity
up to 120 disks - checking the full 253 disks will probably take the
machine most of the night. I'm sure there is some mathematical way to
prove this, and it could certainly be checked more efficiently than with
a Python program - but my computer has more spare time than me!
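
For anyone who wants to experiment with the scheme discussed in this message, here is a minimal Python sketch. It is neither the mdraid code nor the checker program mentioned above. It assumes the GF(2⁸) field polynomial 0x11d from "The mathematics of RAID-6", and the helper names (gf_mul, gf_pow, parities, gf_det, is_valid, gf_solve, recover) are made up for illustration. It computes P, Q and R for one byte per data disk using generators 1, 2 and 4, checks validity by brute force (every square submatrix of the generator matrix must have a non-zero determinant, which is one way of saying that any combination of lost disks up to the parity count can be solved for), and rebuilds three missing data bytes by solving the linear equations.

```python
from itertools import combinations

POLY = 0x11d             # assumed field polynomial, x^8 + x^4 + x^3 + x^2 + 1
GENERATORS = (1, 2, 4)   # generators for P, Q and R, as in the message


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r


def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r


def parities(data, generators=GENERATORS):
    """Return one parity byte per generator: sum over i of g^i . D_i."""
    out = []
    for g in generators:
        acc = 0
        for i, d in enumerate(data):
            acc ^= gf_mul(gf_pow(g, i), d)
        out.append(acc)
    return out


def gf_det(m):
    """Determinant by cofactor expansion; signs vanish because + is XOR."""
    if len(m) == 1:
        return m[0][0]
    det = 0
    for c, v in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        det ^= gf_mul(v, gf_det(minor))
    return det


def is_valid(n_data, generators=GENERATORS):
    """Brute force: every square submatrix of [g^i] must be invertible."""
    rows = [[gf_pow(g, i) for i in range(n_data)] for g in generators]
    for k in range(1, len(generators) + 1):
        for rsel in combinations(range(len(generators)), k):
            for csel in combinations(range(n_data), k):
                if gf_det([[rows[r][c] for c in csel] for r in rsel]) == 0:
                    return False
    return True


def gf_solve(a, b):
    """Solve a.x = b over GF(2^8) by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Raises StopIteration if the matrix is singular.
        piv = next(r for r in range(col, n) if m[r][col])
        m[col], m[piv] = m[piv], m[col]
        inv = gf_pow(m[col][col], 254)   # a^254 is 1/a for non-zero a
        m[col] = [gf_mul(inv, v) for v in m[col]]
        for r in range(n):
            if r != col and m[r][col]:
                f = m[r][col]
                m[r] = [v ^ gf_mul(f, w) for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]


def recover(data, lost, parity, generators=GENERATORS):
    """Rebuild the bytes at indices `lost`; they are None in `data`."""
    syndromes = []
    for g, p in zip(generators, parity):
        for i, d in enumerate(data):
            if d is not None:
                p ^= gf_mul(gf_pow(g, i), d)   # strip the surviving disks
        syndromes.append(p)
    k = len(lost)
    coeffs = [[gf_pow(g, i) for i in lost] for g in generators[:k]]
    return gf_solve(coeffs, syndromes[:k])


data = [0x11, 0x22, 0x33, 0x44, 0x55]
pqr = parities(data)
damaged = [None, 0x22, None, 0x44, None]
print(recover(damaged, [0, 2, 4], pqr) == [0x11, 0x33, 0x55])
print(is_valid(16))
```

This works on single bytes and recomputes powers on every call, so it is only suitable for small experiments. A real implementation would operate on whole blocks and use precomputed multiplication tables, as the message describes for raid6. The brute-force check grows quickly with the number of disks and parities, which fits the remark about the machine needing most of the night.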