* max number of devices in raid6 array
@ 2009-08-12 9:06 David Cure
  2009-08-12 9:38 ` John Robinson
  0 siblings, 1 reply; 13+ messages in thread
From: David Cure @ 2009-08-12 9:06 UTC (permalink / raw)
  To: linux-raid

Hello,

I want to create a raid6 array and I see a limit on the number of
devices: with 27 devices the array is created, but with 28 I get this
error:
mdadm: invalid number of raid devices: 28

I use kernel 2.6.30 (from backports) and mdadm - v2.6.7.2 - 14th
November 2008 on Debian Lenny.

Is there a way to get past this limit?

Thanks,

David.

PS: I'm not subscribed to linux-raid (only linux-kernel), so please copy
your answer to my mail.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: max number of devices in raid6 array
  2009-08-12 9:06 max number of devices in raid6 array David Cure
@ 2009-08-12 9:38 ` John Robinson
  2009-08-12 12:19   ` David Cure
  0 siblings, 1 reply; 13+ messages in thread
From: John Robinson @ 2009-08-12 9:38 UTC (permalink / raw)
  To: David Cure; +Cc: linux-raid

On Wed, 12 August, 2009 10:06 am, David Cure wrote:
> I want to create a raid6 array and I see a limit on the number of
> devices: with 27 devices the array is created, but with 28 I get this
> error:
> mdadm: invalid number of raid devices: 28
>
> I use kernel 2.6.30 (from backports) and mdadm - v2.6.7.2 - 14th
> November 2008 on Debian Lenny.
>
> Is there a way to get past this limit?

The limit is only with version 0.90 metadata, which is the default. I'd
recommend you do a bit of reading to pick metadata version 1.0, 1.1 or
1.2, according to any requirements you may have, then tell mdadm what
your choice is with --metadata= while creating your array.

Cheers,

John.

^ permalink raw reply	[flat|nested] 13+ messages in thread
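[A creation command along the lines John describes might look like the
following sketch; the device names and the 28-device layout are only
placeholders, not taken from the thread:

  # 1.x metadata avoids the device-count limit of the default 0.90 superblock
  mdadm --create /dev/md0 --metadata=1.2 --level=6 --raid-devices=28 \
        /dev/sd[b-z]1 /dev/sda[a-c]1

mdadm also accepts -e as a short form of --metadata.]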
* Re: max number of devices in raid6 array
  2009-08-12 9:38 ` John Robinson
@ 2009-08-12 12:19   ` David Cure
  2009-08-12 14:53     ` Goswin von Brederlow
  0 siblings, 1 reply; 13+ messages in thread
From: David Cure @ 2009-08-12 12:19 UTC (permalink / raw)
  To: John Robinson; +Cc: linux-raid

Le Wed, Aug 12, 2009 at 10:38:40AM +0100, John Robinson ecrivait :
>
> The limit is only with version 0.90 metadata, which is the default. I'd
> recommend you do a bit of reading to pick metadata version 1.0, 1.1 or
> 1.2, according to any requirements you may have, then tell mdadm what
> your choice is with --metadata= while creating your array.

Thanks for your quick reply; I'm going to read the specs of these
metadata versions.

David.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: max number of devices in raid6 array
  2009-08-12 12:19   ` David Cure
@ 2009-08-12 14:53     ` Goswin von Brederlow
  2009-08-12 16:27       ` Billy Crook
  2009-08-12 16:33       ` John Robinson
  0 siblings, 2 replies; 13+ messages in thread
From: Goswin von Brederlow @ 2009-08-12 14:53 UTC (permalink / raw)
  To: David Cure; +Cc: John Robinson, linux-raid

David Cure <lnk@cure.nom.fr> writes:

> Le Wed, Aug 12, 2009 at 10:38:40AM +0100, John Robinson ecrivait :
>>
>> The limit is only with version 0.90 metadata, which is the default. I'd
>> recommend you do a bit of reading to pick metadata version 1.0, 1.1 or
>> 1.2, according to any requirements you may have, then tell mdadm what
>> your choice is with --metadata= while creating your array.
>
> Thanks for your quick reply; I'm going to read the specs of these
> metadata versions.
>
> David.

And compute the overall MTBF. With how many devices does the MTBF of
a raid6 drop below that of a single disk?

MfG
Goswin

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: max number of devices in raid6 array
  2009-08-12 14:53     ` Goswin von Brederlow
@ 2009-08-12 16:27       ` Billy Crook
  2009-08-12 16:33       ` John Robinson
  1 sibling, 0 replies; 13+ messages in thread
From: Billy Crook @ 2009-08-12 16:27 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: David Cure, John Robinson, linux-raid

On Wed, Aug 12, 2009 at 09:53, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> With how many devices does the MTBF of
> a raid6 drop below that of a single disk?

Four devices. The more important thing to worry about is running into an
uncorrectable error during a rebuild, twice during the same rebuild. I'd
say a good rule of thumb is no more than 8TB in a RAID5 array, or 16TB in
a RAID6 array, if you want to remain very safe. YMMV. Your risk comfort
and willingness to restore from backups will vary as well.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: max number of devices in raid6 array
  2009-08-12 14:53     ` Goswin von Brederlow
  2009-08-12 16:27       ` Billy Crook
@ 2009-08-12 16:33       ` John Robinson
  2009-08-13 2:52         ` Goswin von Brederlow
  1 sibling, 1 reply; 13+ messages in thread
From: John Robinson @ 2009-08-12 16:33 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: David Cure, linux-raid

On Wed, 12 August, 2009 3:53 pm, Goswin von Brederlow wrote:
[...]
> And compute the overall MTBF. With how many devices does the MTBF of
> a raid6 drop below that of a single disk?

First up, we probably want to be talking about Mean Time To Data Loss.
It'll vary enormously depending on how fast you think you can replace
dead drives, which in turn depends on how long a rebuild takes (since a
dead drive doesn't count as having been replaced until the new drive is
fully sync'ed). And building an array that big, it's going to be hard to
get drives all from different batches.

Anyway, someone asked Google a similar question:
http://answers.google.com/answers/threadview/id/730165.html and the
MTTDL for an 11-disc RAID-5 with 100,000-hour drives and a 24-hour
replacement+rebuild turnaround was 3.8 million hours (433 years), and a
RAID-6 was said to be "hundreds of times" more reliable. The 433 years
figure assumes that one drive failure doesn't cause another one, though,
so it's to be taken with a pinch of salt.

Cheers,

John.

^ permalink raw reply	[flat|nested] 13+ messages in thread
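[The 3.8-million-hour figure matches the usual first-order MTTDL
approximation for an N-disc RAID-5, MTTDL ~= MTTF^2 / (N * (N-1) * MTTR);
the sketch below only reproduces that textbook formula and, as noted
above, ignores correlated failures and unrecoverable read errors:

  echo 'scale=0; 100000^2 / (11 * 10 * 24)' | bc -l            # ~3787878 hours
  echo 'scale=0; 100000^2 / (11 * 10 * 24) / (24*365)' | bc -l # ~432 years
]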
* Re: max number of devices in raid6 array
  2009-08-12 16:33       ` John Robinson
@ 2009-08-13 2:52         ` Goswin von Brederlow
  2009-08-13 3:39           ` Guy Watkins
                            ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Goswin von Brederlow @ 2009-08-13 2:52 UTC (permalink / raw)
  To: John Robinson; +Cc: Goswin von Brederlow, David Cure, linux-raid

"John Robinson" <john.robinson@anonymous.org.uk> writes:

> On Wed, 12 August, 2009 3:53 pm, Goswin von Brederlow wrote:
> [...]
>> And compute the overall MTBF. With how many devices does the MTBF of
>> a raid6 drop below that of a single disk?
>
> First up, we probably want to be talking about Mean Time To Data Loss.
> It'll vary enormously depending on how fast you think you can replace
> dead drives, which in turn depends on how long a rebuild takes (since a
> dead drive doesn't count as having been replaced until the new drive is
> fully sync'ed). And building an array that big, it's going to be hard to
> get drives all from different batches.
>
> Anyway, someone asked Google a similar question:
> http://answers.google.com/answers/threadview/id/730165.html and the
> MTTDL for an 11-disc RAID-5 with 100,000-hour drives and a 24-hour
> replacement+rebuild turnaround was 3.8 million hours (433 years), and a
> RAID-6 was said to be "hundreds of times" more reliable. The 433 years
> figure assumes that one drive failure doesn't cause another one, though,
> so it's to be taken with a pinch of salt.
>
> Cheers,
>
> John.

I would take that with a very large pinch of salt. From the little
experience I have, that value doesn't reflect reality.

Unfortunately the MTBF values disk vendors give are pretty much totally
dreamed up, so the 100,000 hours for a single drive already has a huge
uncertainty. It shouldn't affect the cut-off point where the MTBF of the
raid is less than that of a single disk, though.

Secondly, disk failures in a raid are not unrelated. The disks all age,
and most people don't rotate in new disks regularly. The chance of a
disk failure is not uniform over time.

On top of that, the stress of rebuilding usually greatly increases the
chances. And with large raids and today's large disks we are talking
days to weeks of rebuild time. As you said, the 433 years assume that
one drive failure doesn't cause another one to fail. In reality that
seems to be a real factor though.


If I understood the math in the URL right, then the chance of a disk
failing within a week is:

168/100000 = 0.00168

The chance of 2 disks failing within a week with 25 disks would be:

(1-(1-168/100000)^25)^2 = ~0.00169448195081717874

The chance of 3 disks failing within a week with 75 disks would be:

(1-(1-168/100000)^75)^3 = ~0.00166310371815668874

So the cut-off values are roughly 25 and 75 disks for raid 5/6. Right?

Now let's assume, and I'm totally guessing here, that the failure is 4
times more likely during a rebuild:

(1-(1-168/100000*4)^7)^2  = ~0.00212541503635
(1-(1-168/100000*4)^19)^3 = ~0.00173857193240
(1-(1-336/100000*4)^10)^3 = ~0.00202697761277 (two weeks rebuild time)

So the cut-off is 7 and 19 (10 for a 2-week rebuild) disks. Or am I
totally doing the wrong math?

MfG
Goswin

^ permalink raw reply	[flat|nested] 13+ messages in thread
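[Goswin's figures can be checked directly; this only reproduces the same
simplified independence model (weekly failure probability p = 168/100000
per disk), not a rigorous reliability calculation:

  echo 'p=168/100000; p'              | bc -l   # one disk:        .00168
  echo 'p=168/100000; (1-(1-p)^25)^2' | bc -l   # 2 of 25 (raid5): ~.001694
  echo 'p=168/100000; (1-(1-p)^75)^3' | bc -l   # 3 of 75 (raid6): ~.001663

Both array figures land right around the single-disk 0.00168, which is
why 25 and 75 come out as the approximate cut-offs.]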
* RE: max number of devices in raid6 array
  2009-08-13 2:52         ` Goswin von Brederlow
@ 2009-08-13 3:39           ` Guy Watkins
  2009-08-14 3:40             ` Leslie Rhorer
  2009-08-17 7:31             ` Goswin von Brederlow
  2009-08-13 4:22           ` Richard Scobie
  2009-08-17 16:06          ` John Robinson
  2 siblings, 2 replies; 13+ messages in thread
From: Guy Watkins @ 2009-08-13 3:39 UTC (permalink / raw)
  To: 'Goswin von Brederlow', 'John Robinson'
  Cc: 'David Cure', linux-raid

} -----Original Message-----
} From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Goswin von Brederlow
} Sent: Wednesday, August 12, 2009 10:52 PM
} To: John Robinson
} Cc: Goswin von Brederlow; David Cure; linux-raid@vger.kernel.org
} Subject: Re: max number of devices in raid6 array
}
} "John Robinson" <john.robinson@anonymous.org.uk> writes:
}
} [...]
}
} I would take that with a very large pinch of salt. From the little
} experience I have, that value doesn't reflect reality.
}
} Unfortunately the MTBF values disk vendors give are pretty much totally
} dreamed up, so the 100,000 hours for a single drive already has a huge
} uncertainty. It shouldn't affect the cut-off point where the MTBF of the
} raid is less than that of a single disk, though.
}
} Secondly, disk failures in a raid are not unrelated. The disks all age,
} and most people don't rotate in new disks regularly. The chance of a
} disk failure is not uniform over time.
}
} On top of that, the stress of rebuilding usually greatly increases the
} chances. And with large raids and today's large disks we are talking
} days to weeks of rebuild time. As you said, the 433 years assume that
} one drive failure doesn't cause another one to fail. In reality that
} seems to be a real factor though.
}
} If I understood the math in the URL right, then the chance of a disk
} failing within a week is:
}
} 168/100000 = 0.00168
}
} The chance of 2 disks failing within a week with 25 disks would be:
}
} (1-(1-168/100000)^25)^2 = ~0.00169448195081717874
}
} The chance of 3 disks failing within a week with 75 disks would be:
}
} (1-(1-168/100000)^75)^3 = ~0.00166310371815668874
}
} So the cut-off values are roughly 25 and 75 disks for raid 5/6. Right?
}
} Now let's assume, and I'm totally guessing here, that the failure is 4
} times more likely during a rebuild:
}
} (1-(1-168/100000*4)^7)^2  = ~0.00212541503635
} (1-(1-168/100000*4)^19)^3 = ~0.00173857193240
} (1-(1-336/100000*4)^10)^3 = ~0.00202697761277 (two weeks rebuild time)
}
} So the cut-off is 7 and 19 (10 for a 2-week rebuild) disks. Or am I
} totally doing the wrong math?
}
} MfG
} Goswin

I don't believe a block read error is considered in the MTBF. A current
2TB disk has a "<1 in 10^15" "Non-recoverable read errors per bits read".
That is about 1 error per 114 TB read (10^15/8/1024/1024/1024/1024). So,
you should get 1 failure per about 114 TB read. If you had 57 2TB disks
+ 1 parity, your chance of a read error should be 1 during a recovery.
If you had 29 2TB disks and 1 parity, you should have about 1 failure
per 2 recoveries. With 6 2TB disks and 1 parity, you should have about
1 failure per 10 recoveries. This assumes you had no other disk reads
to increase the failure rate.

I got the 10^15 from here:
http://www.wdc.com/en/library/sata/2879-701229.pdf

I hope my math is correct!

Guy

^ permalink raw reply	[flat|nested] 13+ messages in thread
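[Guy's 114 TB figure is a unit conversion of the quoted error rate; a
sketch of the arithmetic, with the expected error count for the 57-disk
case added for comparison:

  echo 'scale=2; 10^15 / 8 / 1024^4' | bc -l           # ~113.68 TiB per URE
  echo 'scale=3; 57 * 2 * 10^12 * 8 / 10^15' | bc -l   # ~0.912 expected UREs
                                                       # when reading 57 x 2 TB

As the next message points out, an expected count near 1 is not the same
as a probability of 1.]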
* RE: max number of devices in raid6 array
  2009-08-13 3:39           ` Guy Watkins
@ 2009-08-14 3:40             ` Leslie Rhorer
  1 sibling, 0 replies; 13+ messages in thread
From: Leslie Rhorer @ 2009-08-14 3:40 UTC (permalink / raw)
  To: linux-raid

> I don't believe a block read error is considered in the MTBF. A current
> 2TB disk has a "<1 in 10^15" "Non-recoverable read errors per bits read".
> That is about 1 error per 114 TB read (10^15/8/1024/1024/1024/1024). So,
> you should get 1 failure per about 114 TB read. If you had 57 2TB disks
> + 1 parity, your chance of a read error should be 1 during a recovery.

It doesn't work that way. Just because there is a 1 in 2 chance of
getting heads in a coin flip does not mean you must get heads if you flip
the coin twice. The odds of getting at least 1 heads with two coin tosses
are 75%. The odds of rolling a 1 with six throws of a single die are
66.5%. If one takes a card from a deck, replaces the card in the deck,
shuffles, and pulls another card from the deck, repeating 52 times, the
odds of drawing the Ace of Spades at least once are 63.6%. Only if one
removes the card from the deck completely each time do the odds rise to 1.

In this case it means on average there may be 1 error per 1E15 bits read,
provided one reads many hundreds of TB. To calculate the odds of at least
one error for precisely 1E15 bits read, take 1 - 1E-15 and raise it to the
power of the total number of bits read. This equals the odds of NOT
getting any error. Subtract that number from 1, and you have the odds of
getting an error. If we read 1E15 bits, then the odds of getting at least
one error are 1 - .999999999999999^1E15. It's about 63%.

^ permalink raw reply	[flat|nested] 13+ messages in thread
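[The 63% can be reproduced with the exponential approximation
1-(1-p)^n ~= 1-e^(-n*p), using the same 1-in-10^15 rate; only a sketch:

  echo 'scale=6; 1 - e(-1)'     | bc -l   # reading 10^15 bits (125 TB): ~.632
  echo 'scale=6; 1 - e(-0.912)' | bc -l   # reading 114 TB (57 x 2 TB):  ~.598

So even reading the full 114 TB of the 57-disk example gives roughly a
60% chance of hitting at least one unrecoverable read error, not a
certainty.]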
* Re: max number of devices in raid6 array
  2009-08-13 3:39           ` Guy Watkins
  2009-08-14 3:40             ` Leslie Rhorer
@ 2009-08-17 7:31             ` Goswin von Brederlow
  1 sibling, 0 replies; 13+ messages in thread
From: Goswin von Brederlow @ 2009-08-17 7:31 UTC (permalink / raw)
  To: Guy Watkins
  Cc: 'Goswin von Brederlow', 'John Robinson', 'David Cure', linux-raid

"Guy Watkins" <linux-raid@watkins-home.com> writes:

[...]
> I don't believe a block read error is considered in the MTBF. A current
> 2TB disk has a "<1 in 10^15" "Non-recoverable read errors per bits read".
> That is about 1 error per 114 TB read (10^15/8/1024/1024/1024/1024). So,
> you should get 1 failure per about 114 TB read. If you had 57 2TB disks
> + 1 parity, your chance of a read error should be 1 during a recovery.
> If you had 29 2TB disks and 1 parity, you should have about 1 failure
> per 2 recoveries. With 6 2TB disks and 1 parity, you should have about
> 1 failure per 10 recoveries. This assumes you had no other disk reads
> to increase the failure rate.
>
> I got the 10^15 from here:
> http://www.wdc.com/en/library/sata/2879-701229.pdf
>
> I hope my math is correct!
>
> Guy

But does that cause data loss? If only one disk has failed on a raid6, a
read error is still correctable, and rewriting would remap the block
internally in the drive and avoid having to fail the drive. The kernel
does that nowadays, or not?

MfG
Goswin

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: max number of devices in raid6 array
  2009-08-13 2:52         ` Goswin von Brederlow
  2009-08-13 3:39           ` Guy Watkins
@ 2009-08-13 4:22           ` Richard Scobie
  2009-08-17 16:06          ` John Robinson
  2 siblings, 0 replies; 13+ messages in thread
From: Richard Scobie @ 2009-08-13 4:22 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: linux-raid

Goswin von Brederlow wrote:

> On top of that, the stress of rebuilding usually greatly increases the
> chances. And with large raids and today's large disks we are talking
> days to weeks of rebuild time. As you said, the 433 years assume that
> one drive failure doesn't cause another one to fail. In reality that
> seems to be a real factor though.

I am intrigued as to what this extra stress actually is.

I could understand it if the drives were head-thrashing for hours, but as
I understand it, a rebuild just has all drives reading/writing in an
orderly cylinder-by-cylinder fashion, so while the read/write electronics
are being exercised continuously, mechanically there is not much going
on, except I guess for the odd remapped sector that would involve a seek.

I figure that by far the more common reason for the array to fail, due to
another disc being kicked out, is undiscovered uncorrectable read errors.

The risk of striking these can be reduced by regularly performing md
"check" or "repair" passes - echo check > /sys/block/mdX/md/sync_action.

Regards,

Richard

^ permalink raw reply	[flat|nested] 13+ messages in thread
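[A sketch of how such a regular check might be run; the array name and
schedule here are only examples, and many distributions already ship a
similar periodic scrub job for md arrays:

  # start a scrub of md0 by hand, then inspect the result
  echo check > /sys/block/md0/md/sync_action
  cat /sys/block/md0/md/mismatch_cnt

  # or schedule it, e.g. with a line in /etc/cron.d/md-scrub:
  # 0 4 1 * *  root  echo check > /sys/block/md0/md/sync_action
]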
* Re: max number of devices in raid6 array
  2009-08-13 2:52         ` Goswin von Brederlow
  2009-08-13 3:39           ` Guy Watkins
  2009-08-13 4:22           ` Richard Scobie
@ 2009-08-17 16:06          ` John Robinson
  2009-08-17 16:09            ` John Robinson
  2 siblings, 1 reply; 13+ messages in thread
From: John Robinson @ 2009-08-17 16:06 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: David Cure, linux-raid

On 13/08/2009 03:52, Goswin von Brederlow wrote:
[...]
> I would take that with a very large pinch of salt. From the little
> experience I have, that value doesn't reflect reality.

You're probably right, given that the data error rate hasn't dropped (it
might have grown, given the higher areal densities, but in fact it's been
about 1 in 10^15 for a while) while drive size has grown.

While I was looking at the IDEMA LBA standard stuff, I found their paper
about long sectors (also discussed here recently), which features a wee
graph of hard drive capacity vs rebuilds before data loss in a 7-drive
RAID-5 with 512-byte sectors, showing about 450 rebuilds for 36G drives
but only 10 for 1.5T drives. See page 9:
http://www.idema.org/_smartsite/modules/local/data_file/show_file.php?cmd=download&data_file_id=1779

Cheers,

John.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: max number of devices in raid6 array
  2009-08-17 16:06          ` John Robinson
@ 2009-08-17 16:09            ` John Robinson
  0 siblings, 0 replies; 13+ messages in thread
From: John Robinson @ 2009-08-17 16:09 UTC (permalink / raw)
  To: Linux RAID

On 17/08/2009 17:06, John Robinson wrote:
[...]
> graph of hard drive capacity vs rebuilds before data loss in a 7-drive
> RAID-5 with 512-byte sectors, showing about 450 rebuilds for 36G drives
> but only 10 for 1.5T drives. See page 9:
> http://www.idema.org/_smartsite/modules/local/data_file/show_file.php?cmd=download&data_file_id=1779

Page 7 actually.

Cheers,

John.

^ permalink raw reply	[flat|nested] 13+ messages in thread
Thread overview: 13+ messages

2009-08-12 9:06  max number of devices in raid6 array  David Cure
2009-08-12 9:38  ` John Robinson
2009-08-12 12:19   ` David Cure
2009-08-12 14:53     ` Goswin von Brederlow
2009-08-12 16:27       ` Billy Crook
2009-08-12 16:33       ` John Robinson
2009-08-13 2:52          ` Goswin von Brederlow
2009-08-13 3:39            ` Guy Watkins
2009-08-14 3:40              ` Leslie Rhorer
2009-08-17 7:31              ` Goswin von Brederlow
2009-08-13 4:22            ` Richard Scobie
2009-08-17 16:06           ` John Robinson
2009-08-17 16:09             ` John Robinson