* Re: raid6 and parity calculations
@ 2010-09-15 15:55 Michael Sallaway
2010-09-15 16:07 ` Andre Noll
0 siblings, 1 reply; 4+ messages in thread
From: Michael Sallaway @ 2010-09-15 15:55 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
> -------Original Message-------
> From: Neil Brown <neilb@suse.de>
> To: Michael Sallaway <michael@sallaway.com>
> Cc: linux-raid@vger.kernel.org
> Subject: Re: raid6 and parity calculations
> Sent: 15 Sep '10 10:26
> It is best not to think too much about chunks. Think about strips
> (not stripes).
> A strip is a set of blocks, one per device, each at the same offset.
> Think of page-sized blocks/strips.
> Each strip has a P block, a Q block, and a bunch of data blocks. Which
> block is P, which is Q, and where each data block goes is a function of the
> offset, the layout and the chunk size. Once you have used the chunk size to
> perform that calculation, don't think about chunks any more - just blocks and strips.
>
Aah, perfect -- that makes sense, thanks for that.
As a sort-of follow-up question, would anyone know if the data size of a Q calculation affects the result at all? e.g. if I do a 64 KB Q calculation on 10 drives of data, would that be the same as doing sixteen 4 KB Q calculations on sequential blocks of the same data, then concatenating the results? (I can't remember what that operation property is called....?)
I've been reading the maths-of-RAID6 PDF (http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf), but I'm a bit too rusty on Galois fields to tell whether the data size matters. I presume the data ordering is also critical for a Q calculation, correct? (e.g. drives have to be in order d0 -> d10, not just random).
And, in contrast, for the P calculation, data size and input order make no difference, correct? (since it's just a simple bitwise XOR of all the inputs).
>
> Ouch... good luck.
Thanks! I'm the only one to blame, though -- it happened in the month between "getting the new system set up" and "setting up backups for the new system". So it's the only copy of the data.... whoops. :-)
Thanks for the help/advice!
Cheers,
Michael
* Re: raid6 and parity calculations
2010-09-15 15:55 raid6 and parity calculations Michael Sallaway
@ 2010-09-15 16:07 ` Andre Noll
0 siblings, 0 replies; 4+ messages in thread
From: Andre Noll @ 2010-09-15 16:07 UTC (permalink / raw)
To: Michael Sallaway; +Cc: Neil Brown, linux-raid
On Wed, Sep 15, 15:55, Michael Sallaway wrote:
> As a sort-of follow-up question, would anyone know if the data size of
> a Q calculation affects the result at all? e.g. if I do a 64 KB Q
> calculation on 10 drives of data, would that be the same as doing
> sixteen 4 KB Q calculations on sequential blocks of the same data, then
> concatenating the results? (I can't remember what that operation
> property is called....?)
Yes, the result would be the same. In fact, byte n of Q depends only
on byte n of each of the data drives.
> I've been reading the maths-of-RAID6 PDF
> (http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf), but I'm a
> bit too rusty on Galois fields to tell whether the data size matters. I
> presume the data ordering is also critical for a Q calculation,
> correct? (e.g. drives have to be in order d0 -> d10, not just random).
Right, order matters.
> And, in contrast, for the P calculation, data size and input order
> make no difference, correct? (since it's just a simple bitwise XOR of
> all the inputs).
Also correct.
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
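To make both answers concrete, here is a minimal C sketch of the P and Q calculations -- an illustration using the GF(2^8) generator-2 arithmetic from hpa's paper, not the kernel code or Andre's. Every output byte depends only on the bytes at the same offset on each data drive, so one 64 KB Q calculation and sixteen concatenated 4 KB Q calculations give identical results; P is a plain XOR, so neither block size nor input order matters for it, while Q does depend on the drive order.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* multiply by the generator g = 2 in GF(2^8), RAID6 polynomial 0x11d */
static uint8_t gf_mul2(uint8_t v)
{
	return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * Compute P and Q over 'ndisks' data buffers of 'len' bytes each.
 * Q is evaluated with Horner's rule:
 *   Q = ((D[n-1]*g + D[n-2])*g + ...)*g + D[0]
 * i.e. drive d contributes g^d * D[d], so the drive order matters for Q.
 * Each output byte depends only on the same byte offset of every input,
 * so 'len' can be 4 KB or 64 KB without changing the result.
 */
static void compute_pq(uint8_t *const *data, int ndisks, size_t len,
		       uint8_t *p, uint8_t *q)
{
	memset(p, 0, len);
	memset(q, 0, len);
	for (int d = ndisks - 1; d >= 0; d--) {
		for (size_t i = 0; i < len; i++) {
			p[i] ^= data[d][i];
			q[i] = gf_mul2(q[i]) ^ data[d][i];
		}
	}
}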
* raid6 and parity calculations
@ 2010-09-14 14:45 Michael Sallaway
2010-09-15 10:26 ` Neil Brown
0 siblings, 1 reply; 4+ messages in thread
From: Michael Sallaway @ 2010-09-14 14:45 UTC (permalink / raw)
To: linux-raid
Hi,
I've been looking through the drivers/md code, and I've got a few questions about the RAID6 parity calculations that have me stumped.
I can see that when recovering 1 or 2 data sections, it calls functions based on the content that we're recovering (e.g. async_gen_syndrome, async_xor, async_raid6_datap_recov, etc.). However, the length parameter is always given as STRIPE_SIZE, which from what I can tell is the same as PAGE_SIZE, which for vanilla systems like the one I'm playing with is 4096 bytes.
The thing that I can't figure out is how this interacts with the RAID6 chunk size; the array I'm playing with has the default chunk size (64 KB), which I understand means that there's 64 KB of data striped across each disk (bar two), then 64 KB of P, then 64 KB of Q for the first stripe, correct? If so, I can't figure out where the whole parity calculation is done for all 64 KB. There are no loops, no recursion, or anything else that I can find that would process it. I'm obviously missing something here, can anyone enlighten me?
Thanks for any advice or pointers!
Cheers,
Michael
(as a side note: I'm playing with all this as I've managed to royally screw up an array which had 2 dropped drives, by re-adding them (in what appears to be the wrong order). That would have been fine if the rebuild had finished completely, however the rebuild failed a few percent in, so now I have 2 drives with "swapped" data. That is, drive A contains the data for raid member 4 for the first x%, and raid member 5 for the rest, and drive B contains the data for raid member 5 for the first x% and raid member 4 for the rest. So I'm trying to write a userspace program to manually go through the array members, inspecting each stripe, and manually doing parity calculations for a range of drive permutations to see what looks sensible, hence I'm trying to understand what's ON the drive to reverse engineer it.)
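A minimal sketch of the kind of check such a program might perform, under some loud assumptions: the file descriptors, the fixed data/P/Q positions, and the 4 KB block size below are hypothetical, and in a real array P and Q rotate from strip to strip, so the layout mapping has to be applied per strip first. The useful property is that Q depends on the drive order, so a wrong permutation shows up as Q mismatches even where P (a plain, order-independent XOR) still checks out.

/* Hypothetical userspace check -- a sketch, not a finished tool. */
#define _FILE_OFFSET_BITS 64
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK	 4096	/* one page-sized strip block */
#define MAX_DEVS 16

/* multiply by the generator g = 2 in GF(2^8), RAID6 polynomial 0x11d */
static uint8_t gf_mul2(uint8_t v)
{
	return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * Candidate ordering: fd[0..ndev-3] are the data members in the guessed
 * order, fd[ndev-2] is assumed to hold P and fd[ndev-1] to hold Q for the
 * strip at 'offset'.  Returns 0 if the strip is consistent with that guess,
 * non-zero on mismatch or read error.
 */
static int check_strip(const int *fd, int ndev, off_t offset)
{
	uint8_t buf[MAX_DEVS][BLOCK];
	uint8_t p[BLOCK] = {0}, q[BLOCK] = {0};
	int ndata = ndev - 2;

	if (ndev < 4 || ndev > MAX_DEVS)
		return -1;
	for (int d = 0; d < ndev; d++)
		if (pread(fd[d], buf[d], BLOCK, offset) != BLOCK)
			return -1;

	/* P is a plain XOR; Q uses Horner's rule, so drive order matters */
	for (int d = ndata - 1; d >= 0; d--) {
		for (int i = 0; i < BLOCK; i++) {
			p[i] ^= buf[d][i];
			q[i] = gf_mul2(q[i]) ^ buf[d][i];
		}
	}
	return memcmp(p, buf[ndata], BLOCK) || memcmp(q, buf[ndata + 1], BLOCK);
}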
* Re: raid6 and parity calculations
2010-09-14 14:45 Michael Sallaway
@ 2010-09-15 10:26 ` Neil Brown
0 siblings, 0 replies; 4+ messages in thread
From: Neil Brown @ 2010-09-15 10:26 UTC (permalink / raw)
To: Michael Sallaway; +Cc: linux-raid
On Tue, 14 Sep 2010 14:45:40 +0000
"Michael Sallaway" <michael@sallaway.com> wrote:
> Hi,
>
> I've been looking through the drivers/md code, and I've got a few questions about the RAID6 parity calculations that have me stumped.
>
> I can see that when recovering 1 or 2 data sections, it calls functions based on the content that we're recovering (e.g. async_gen_syndrome, async_xor, async_raid6_datap_recov, etc.). However, the length parameter is always given as STRIPE_SIZE, which from what I can tell is the same as PAGE_SIZE, which for vanilla systems like the one I'm playing with is 4096 bytes.
>
> The thing that I can't figure out is how this interacts with the RAID6 chunk size; the array I'm playing with has the default chunk size (64 KB), which I understand means that there's 64 KB of data striped across each disk (bar two), then 64 KB of P, then 64 KB of Q for the first stripe, correct? If so, I can't figure out where the whole parity calculation is done for all 64 KB. There are no loops, no recursion, or anything else that I can find that would process it. I'm obviously missing something here, can anyone enlighten me?
>
> Thanks for any advice or pointers!
It is best not to think too much about chunks. Think about strips
(not stripes).
A strip is a set of blocks, one per device, each at the same offset.
Think of page-sized blocks/strips.
Each strip has a P block, a Q block, and a bunch of data blocks. Which
block is P, which is Q, and where each data block goes is a function of the
offset, the layout and the chunk size. Once you have used the chunk size to
perform that calculation, don't think about chunks any more - just blocks and strips.
Hope that helps.
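As a concrete illustration of that mapping step, here is a rough userspace-style sketch; it assumes the md default left-symmetric RAID6 layout and follows the general shape of raid6_compute_sector() in drivers/md/raid5.c, but the structure and names are invented for the example and it is not the kernel code.

/*
 * Sketch only: assumes ALGORITHM_LEFT_SYMMETRIC, the md RAID6 default.
 * Given a logical chunk number (array offset / chunk size) and the number
 * of devices, work out which device holds P, which holds Q, and which
 * holds that data chunk.  After this step everything is per-page within
 * the strip.
 */
struct pq_map {
	int pd_idx;	/* device index holding P for this strip */
	int qd_idx;	/* device index holding Q for this strip */
	int dd_idx;	/* device index holding this data chunk */
};

static struct pq_map map_chunk(unsigned long long logical_chunk, int raid_disks)
{
	int data_disks = raid_disks - 2;
	unsigned long long stripe = logical_chunk / data_disks;
	int dd = (int)(logical_chunk % data_disks);
	struct pq_map m;

	m.pd_idx = raid_disks - 1 - (int)(stripe % raid_disks);
	m.qd_idx = (m.pd_idx + 1) % raid_disks;
	/* data chunks start just after Q and wrap around the device set */
	m.dd_idx = (m.pd_idx + 2 + dd) % raid_disks;
	return m;
}

With a 64 KB chunk and 4 KB pages the same mapping applies to each of the sixteen pages in the chunk, which is why the kernel only ever operates on STRIPE_SIZE at a time and never needs an explicit loop over the whole chunk.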
>
> Cheers,
> Michael
>
>
> (as a side note: I'm playing with all this as I've managed to royally screw up an array which had 2 dropped drives, by re-adding them (in what appears to be the wrong order). That would have been fine if the rebuild had finished completely, however the rebuild failed a few percent in, so now I have 2 drives with "swapped" data. That is, drive A contains the data for raid member 4 for the first x%, and raid member 5 for the rest, and drive B contains the data for raid member 5 for the first x% and raid member 4 for the rest. So I'm trying to write a userspace program to manually go through the array members, inspecting each stripe, and manually doing parity calculations for a range of drive permutations to see what looks sensible, hence I'm trying to understand what's ON the drive to reverse engineer it.)
Ouch... good luck.
NeilBrown
>