From: NeilBrown <neilb@suse.de>
To: David Brown <david.brown@hesbynett.no>
Cc: linux-raid@vger.kernel.org
Subject: Re: Triple-parity raid6
Date: Thu, 9 Jun 2011 11:49:54 +1000
Message-ID: <20110609114954.243e9e22@notabene.brown>
In-Reply-To: <isp2g2$rf$1@dough.gmane.org>

On Thu, 09 Jun 2011 02:01:06 +0200 David Brown <david.brown@hesbynett.no>
wrote:

> Has anyone considered triple-parity raid6?  As far as I can see, it
> should not be significantly harder than normal raid6 - either to
> implement, or for the processor at run-time.  Once you have the GF(2⁸)
> field arithmetic in place for raid6, it's just a matter of making 
> another parity block in the same way but using a different generator:
> 
> P = D_0 + D_1 + D_2 + .. + D_(n-1)
> Q = D_0 + g.D_1 + g².D_2 + .. + g^(n-1).D_(n-1)
> R = D_0 + h.D_1 + h².D_2 + .. + h^(n-1).D_(n-1)
> 
> The raid6 implementation in mdraid uses g = 0x02 to generate the second 
> parity (based on "The mathematics of RAID-6" - I haven't checked the 
> source code).  You can make a third parity using h = 0x04 and then get a 
> redundancy of 3 disks.  (Note - I haven't yet confirmed that this is 
> valid for more than 100 data disks - I need to make my checker program 
> more efficient first.)
> 
> Rebuilding a disk, or running in degraded mode, is just an obvious 
> extension to the current raid6 algorithms.  If you are missing three 
> data blocks, the maths looks hard to start with - but if you express the 
> equations as a set of linear equations and use standard matrix inversion 
> techniques, it should not be hard to implement.  You only need to do 
> this inversion once when you find that one or more disks have failed - 
> then you pre-compute the multiplication tables in the same way as is 
> done for raid6 today.
> 
> In normal use, calculating the R parity is no more demanding than 
> calculating the Q parity.  And most rebuilds or degraded situations will 
> only involve a single disk, and the data can thus be re-constructed 
> using the P parity just like raid5 or two-parity raid6.
> 
> 
> I'm sure there are situations where triple-parity raid6 would be 
> appealing - it has already been implemented in ZFS, and it is only a 
> matter of time before two-parity raid6 has a real probability of hitting 
> an unrecoverable read error during a rebuild.
> 
> 
> And of course, there is no particular reason to stop at three parity 
> blocks - the maths can easily be generalised.  1, 2, 4 and 8 can be used 
> as generators for quad-parity (checked up to 60 disks), and adding 16 
> gives you quintuple parity (checked up to 30 disks) - but that's maybe 
> getting a bit paranoid.
> 
> 
> ref.:
> 
> <http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf>
> <http://blogs.oracle.com/ahl/entry/acm_triple_parity_raid>
> <http://queue.acm.org/detail.cfm?id=1670144>
> <http://blogs.oracle.com/ahl/entry/triple_parity_raid_z>
> 

 -ENOPATCH  :-)

I have a series of patches nearly ready which removes a lot of the remaining
duplication in raid5.c between raid5 and raid6 paths.  So there will be
relatively few places where RAID5 and RAID6 do different things - only the
places where they *must* do different things.
After that, adding a new level or layout which has 'max_degraded == 3' would
be quite easy.
The most difficult part would be the enhancements to libraid6 to generate the
new 'syndrome', and to handle the different recovery possibilities.
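
To make the libraid6 part more concrete, here is a rough userspace sketch in
Python of the scheme quoted above: generating P, Q and R with generators 1, 2
and 4, and rebuilding up to three missing data blocks by inverting a small
matrix over GF(2^8).  It is not kernel code and says nothing about how
libraid6 is or should be structured.  The function names (gf_mul, syndromes,
invert, recover) are made up for illustration.  It assumes the field
polynomial 0x11d from "The mathematics of RAID-6", and it only handles the
case where all parity blocks survive and only data blocks are lost.

```python
# GF(2^8) log/antilog tables, assuming the RAID-6 polynomial 0x11d.
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11d
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]

def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def gf_inv(a):
    """Multiplicative inverse; a must be nonzero."""
    return EXP[255 - LOG[a]]

def gf_pow(a, n):
    """a raised to the n-th power (n >= 0)."""
    if n == 0:
        return 1
    if a == 0:
        return 0
    return EXP[(LOG[a] * n) % 255]

def syndromes(data, gens=(1, 2, 4)):
    """One parity block per generator: sum over i of gen^i * D_i."""
    size = len(data[0])
    out = []
    for g in gens:
        blk = bytearray(size)
        for i, d in enumerate(data):
            c = gf_pow(g, i)
            for k in range(size):
                blk[k] ^= gf_mul(c, d[k])
        out.append(bytes(blk))
    return out

def invert(m):
    """Gauss-Jordan inversion over GF(2^8); ValueError if singular."""
    n = len(m)
    a = [list(row) + [int(r == c) for c in range(n)]
         for r, row in enumerate(m)]
    for col in range(n):
        piv = next((r for r in range(col, n) if a[r][col]), None)
        if piv is None:
            raise ValueError("singular matrix")
        a[col], a[piv] = a[piv], a[col]
        s = gf_inv(a[col][col])
        a[col] = [gf_mul(s, v) for v in a[col]]
        for r in range(n):
            if r != col and a[r][col]:
                f = a[r][col]
                a[r] = [v ^ gf_mul(f, w) for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]

def recover(data, parity, lost, gens=(1, 2, 4)):
    """Rebuild the data blocks at indices `lost` (at most len(gens)).

    Entries of `data` at lost indices are ignored.  Uses the first
    len(lost) parity blocks, so a single loss needs only P.
    """
    size = len(parity[0])
    used = gens[:len(lost)]
    # Remove the surviving disks' contribution from each parity block.
    survivors = [bytes(size) if i in lost else d for i, d in enumerate(data)]
    partial = syndromes(survivors, used)
    rhs = [bytes(p ^ q for p, q in zip(parity[j], partial[j]))
           for j in range(len(used))]
    # Solve the small linear system; this inversion is done once per failure set.
    inv = invert([[gf_pow(g, i) for i in lost] for g in used])
    result = {}
    for r, i in enumerate(lost):
        blk = bytearray(size)
        for j in range(len(used)):
            for k in range(size):
                blk[k] ^= gf_mul(inv[r][j], rhs[j][k])
        result[i] = bytes(blk)
    return result
```

With p, q, r = syndromes(data), calling recover(data, [p, q, r], [0, 2, 5])
returns a dict mapping each lost index to its rebuilt block.  Cases where a
parity block is lost together with data blocks need a different choice of
rows and are not covered here.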

So if you're not otherwise busy this weekend, a patch would be nice :-)

Thanks,
NeilBrown
