* If you're using large SATA drives in RAID 5/6 ....
       [not found] <87f94c371002021440o3b30414bk3a7ccf9d2fa9b8af@mail.gmail.com>
@ 2010-02-02 22:46 ` Greg Freemyer
  2010-02-03  9:27   ` Steven Haigh
                     ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Greg Freemyer @ 2010-02-02 22:46 UTC (permalink / raw)
  To: Linux RAID

All,

I think the below is accurate, but please correct me if I'm wrong or
have misunderstood.

===
If you're using normal big drives (1TB, etc.) in a raid-5 array, the
general consensus of this list is that it is a bad idea.  The reason is
that the per-sector error rate has not improved as drive density has
increased.

So in the days of 1GB drives, the likelihood of an undetected /
unrepaired bad sector was actually pretty low for the drive as a whole.
But for today's 1TB drives, the odds are 1000x worse, i.e. 1000x more
sectors with the same basic failure rate per sector.

So a raid-5 composed of 1TB drives is 1000x more likely to be unable
to rebuild itself after a drive failure than a raid-5 built from 1 GB
drives of yesteryear.  Thus the current recommendation is to use
raid-6 with high-density drives.
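
A back-of-the-envelope sketch of that argument (the 1-in-10^15-bit URE
rate below is just a commonly quoted vendor figure, so treat it as an
assumption):

import math

def rebuild_ure_probability(drive_bytes, surviving_drives, ure_per_bit=1e-15):
    # P(at least one unrecoverable read error while reading every surviving
    # drive end to end during a rebuild), assuming independent per-bit errors.
    bits_read = drive_bytes * 8 * surviving_drives
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

GB, TB = 10**9, 10**12
for label, size in (("1 GB", GB), ("1 TB", TB)):
    p = rebuild_ure_probability(size, surviving_drives=3)
    print("4-drive raid-5 of %s drives: P(URE during rebuild) = %.4g" % (label, p))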

The good news is that Western Digital is apparently introducing a new
series of drives with an error rate "2 orders of magnitude" better
than the current generation.

See <http://www.anandtech.com/storage/showdoc.aspx?i=3691&p=1>

The whole article is good, but this paragraph is what really got my attention:

"From a numbers perspective, Western Digital estimates that the use of
4K sectors will give them an immediate 7%-11% increase in format
efficiency. ECC burst error correction stands to improve by 50%, and
the overall error rate capability improves by 2 orders of magnitude.
In theory these reliability benefits should immediately apply to all
4K sector drives (making the Advanced Format drives more reliable than
regular drives), but Western Digital is not pushing that idea at this
time."

So maybe raid-5 will once again be a reasonable choice in the
future.

(I think these drives may already be available, at least as
engineering samples.  Basic Linux kernel support went in during summer
2009, I believe, and 2.6.33 will be the first kernel to have been
tested with this new class of drives.)

I don't know if there is an mdraid wiki, but if there is and someone
wants to post the above there, please do.

Greg
--
Greg Freemyer


* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-02 22:46 ` If you're using large SATA drives in RAID 5/6 Greg Freemyer
@ 2010-02-03  9:27   ` Steven Haigh
  2010-02-03 10:56   ` Mikael Abrahamsson
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Steven Haigh @ 2010-02-03  9:27 UTC (permalink / raw)
  To: Greg Freemyer; +Cc: Linux RAID

Numbers are a funny game.

Sure, there is a 1000x increase in the chance of an error occurring
*somewhere* on the drive - as we have 1000x more space. What you need to
keep in perspective is that although building a 2TB RAID5 from 3 x 1TB
drives may seem more likely to have problems than a 2GB RAID5 from 3 x 1GB
drives, if you built that same 2TB RAID5 out of 1GB drives you would
probably push the likelihood of a failure up to around the same level.

From what I read of the article, it seems like more of a sales pitch, and
although it makes what seem to be good points, it ignores the question of
scaling: i.e. which is more likely to fail horribly, 3 x 1TB drives,
3000 x 1GB drives, or even 300 x 10GB drives?
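
The per-sector arithmetic backs that up - for read errors, what matters
is the total number of bits read, not how they are split across drives
(whole-drive failures are another matter, and those do scale with drive
count). A quick sketch, again assuming a 1-in-10^15-bit rate:

import math

def ure_probability(total_bytes, ure_per_bit=1e-15):
    # P(at least one unrecoverable read error in one full read of the array)
    return -math.expm1(total_bytes * 8 * math.log1p(-ure_per_bit))

TB, GB = 10**12, 10**9
print("3 x 1TB   :", ure_probability(3 * TB))
print("3000 x 1GB:", ure_probability(3000 * GB))   # same total capacity, same odds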

Data storage always seems to be a minefield of misinformation and sales
pitches masked as information. The best practice is to use RAID to
minimise downtime if a drive fails, but also to keep a copy of your data
elsewhere that you can easily get to if you need it. After that, it
doesn't really matter if you have a single drive failure - or even a
double drive failure in the case of RAID6.

If you have more failures than that and the array crashes horribly,
there's not much you can do but rebuild and restore.

On Tue, 2 Feb 2010 17:46:25 -0500, Greg Freemyer <greg.freemyer@gmail.com>
wrote:
> All,
> 
> I think the below is accurate, but please correct me if I'm wrong or
> have misunderstood.
> 
> ===
> If you're using normal big drives (1TB, etc.) in a raid-5 array, the
> general consensus of this list is that it is a bad idea.  The reason
> is that the per-sector error rate has not improved as drive density
> has increased.
> 
> So in the days of 1GB drives, the likelihood of an undetected /
> unrepaired bad sector was actually pretty low for the drive as a whole.
> But for today's 1TB drives, the odds are 1000x worse, i.e. 1000x more
> sectors with the same basic failure rate per sector.
> 
> So a raid-5 composed of 1TB drives is 1000x more likely to be unable
> to rebuild itself after a drive failure than a raid-5 built from 1 GB
> drives of yesteryear.  Thus the current recommendation is to use raid
> 6 with high density drives.
> 
> The good news is that Western Digital is apparently introducing a new
> series of drives with an error rate "2 orders of magnitude" better
> than the current generation.
> 
> See <http://www.anandtech.com/storage/showdoc.aspx?i=3691&p=1>
> 
> The whole article is good, but this paragraph is what really got my
> attention:
> 
> "From a numbers perspective, Western Digital estimates that the use of
> 4K sectors will give them an immediate 7%-11% increase in format
> efficiency. ECC burst error correction stands to improve by 50%, and
> the overall error rate capability improves by 2 orders of magnitude.
> In theory these reliability benefits should immediately apply to all
> 4K sector drives (making the Advanced Format drives more reliable than
> regular drives), but Western Digital is not pushing that idea at this
> time."
> 
> So maybe raid-5 will once again be a reasonable choice in the
> future.
> 
> (I think these drives may already be available at least as
> engineering samples.  Basic linux kernel support went in summer 2009 I
> believe. I believe 2.6.33 will be the first kernel to have been tested
> with these new class of drives.)
> 
> I don't know if there is a mdraid wiki, but if so and someone wants to
> post the above there, please do.
> 
> Greg
> --
> Greg Freemyer

-- 
Steven Haigh
 
Email: netwiz@crc.id.au
Web: http://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
Fax: (03) 8338 0299


* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-02 22:46 ` If you're using large SATA drives in RAID 5/6 Greg Freemyer
  2010-02-03  9:27   ` Steven Haigh
@ 2010-02-03 10:56   ` Mikael Abrahamsson
  2010-02-03 12:24     ` Goswin von Brederlow
  2010-02-03 11:25   ` Goswin von Brederlow
  2010-02-03 14:08   ` John Robinson
  3 siblings, 1 reply; 11+ messages in thread
From: Mikael Abrahamsson @ 2010-02-03 10:56 UTC (permalink / raw)
  To: Greg Freemyer; +Cc: Linux RAID

On Tue, 2 Feb 2010, Greg Freemyer wrote:

> (I think these drives may already be available at least as engineering 
> samples.  Basic linux kernel support went in summer 2009 I believe. I 
> believe 2.6.33 will be the first kernel to have been tested with these 
> new class of drives.)

I purchased two WD20EARS drives yesterday and am going to play around with
them tonight. They emulate 512-byte sectors to the SATA layer but write them
as 4k on the platter, so you want to make sure that everything that writes
to the drive does so in 4k blocks (which is the case for most of Linux, as
far as I understand).

So basically they'll work with any OS, but for best performance you need
to make sure that things are aligned to a 4k boundary and that you write
4k at a time.
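
A quick alignment sanity check is easy to script; a minimal sketch,
assuming the kernel's usual /sys/block/<disk>/<partition>/start
attribute (which is in 512-byte units):

#!/usr/bin/env python3
# Report whether each partition on a disk starts on a 4 KiB boundary.
import glob, os, sys

disk = sys.argv[1] if len(sys.argv) > 1 else "sda"   # e.g. "sdb" for the WD20EARS

for path in sorted(glob.glob("/sys/block/%s/%s*[0-9]/start" % (disk, disk))):
    start = int(open(path).read())                   # start LBA, 512-byte units
    offset = start * 512
    name = os.path.basename(os.path.dirname(path))
    print("%s: sector %d (%d bytes) %s" %
          (name, start, offset,
           "aligned" if offset % 4096 == 0 else "NOT 4k-aligned"))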

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-02 22:46 ` If you're using large SATA drives in RAID 5/6 Greg Freemyer
  2010-02-03  9:27   ` Steven Haigh
  2010-02-03 10:56   ` Mikael Abrahamsson
@ 2010-02-03 11:25   ` Goswin von Brederlow
  2010-02-03 14:08   ` John Robinson
  3 siblings, 0 replies; 11+ messages in thread
From: Goswin von Brederlow @ 2010-02-03 11:25 UTC (permalink / raw)
  To: Greg Freemyer; +Cc: Linux RAID

Greg Freemyer <greg.freemyer@gmail.com> writes:

> All,
>
> I think the below is accurate, but please correct me if I'm wrong or
> have misunderstood.
>
> ===
> If you're using normal big drives (1TB, etc.) in a raid-5 array, the
> general consensus of this list is that it is a bad idea.  The reason is
> that the per-sector error rate has not improved as drive density has
> increased.
>
> So in the days of 1GB drives, the likelihood of an undetected /
> unrepaired bad sector was actually pretty low for the drive as a whole.
> But for today's 1TB drives, the odds are 1000x worse, i.e. 1000x more
> sectors with the same basic failure rate per sector.

I just had such a case yesterday. It happens all too often, especially
as drives get older. Rebuilding a raid5 becomes more and more dangerous
as you increase drive capacity.

> So a raid-5 composed of 1TB drives is 1000x more likely to be unable
> to rebuild itself after a drive failure than a raid-5 built from 1 GB
> drives of yesteryear.  Thus the current recommendation is to use raid
> 6 with high density drives.

If you can spare the drive and the CPU, then raid6 is definitely preferable.

Although I think the only future is in combining the RAID and the
filesystem into one. If you have some corrupted blocks, the FS can then
tell you which files are affected. RAID on its own only deals well with
the case of a whole drive failing.

MfG
        Goswin


* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-03 10:56   ` Mikael Abrahamsson
@ 2010-02-03 12:24     ` Goswin von Brederlow
  0 siblings, 0 replies; 11+ messages in thread
From: Goswin von Brederlow @ 2010-02-03 12:24 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: Linux RAID

Mikael Abrahamsson <swmike@swm.pp.se> writes:

> On Tue, 2 Feb 2010, Greg Freemyer wrote:
>
>> (I think these drives may already be available at least as
>> engineering samples.  Basic linux kernel support went in summer 2009
>> I believe. I believe 2.6.33 will be the first kernel to have been
>> tested with these new class of drives.)
>
> I purchased two WD20EARS drives yesterday, going to play around with
> them tonight. They emulate 512 bytes sectors to the SATA layer but
> write them as 4k on drive, so you want to make sure that everything
> that writes to the drive, writes in 4k block size (which is the case
> with most in linux as far as I have understood).
>
> So basically they'll work with any OS, but for best performance you
> need to make sure that things are aligned to 4k boundary and you want
> to write 4k at a time.

Make sure that your partitions are aligned, then. Older partitioning tools
will start the first partition at sector 63, i.e. 31.5KiB into the disk,
which is not a multiple of 4KiB.
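
The arithmetic, for anyone wondering why that hurts on a 4k-sector drive:

start_sector = 63                # traditional DOS/fdisk first-partition start
offset = start_sector * 512      # 32256 bytes = 31.5 KiB
print(offset, offset % 4096)     # -> 32256 3584: not a multiple of 4096
# So every aligned 4 KiB write from the OS straddles two physical sectors,
# and the drive has to do a read-modify-write internally.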

MfG
        Goswin


* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-02 22:46 ` If you're using large SATA drives in RAID 5/6 Greg Freemyer
                     ` (2 preceding siblings ...)
  2010-02-03 11:25   ` Goswin von Brederlow
@ 2010-02-03 14:08   ` John Robinson
  2010-02-05 15:38     ` Bill Davidsen
  3 siblings, 1 reply; 11+ messages in thread
From: John Robinson @ 2010-02-03 14:08 UTC (permalink / raw)
  To: Linux RAID

On 02/02/2010 22:46, Greg Freemyer wrote:
> All,
> 
> I think the below is accurate, but please correct me if I'm wrong or
> have misunderstood.
> 
> ===
> If you're using normal big drives (1TB, etc.) in a raid-5 array, the
> general consensus of this list is that it is a bad idea.  The reason is
> that the per-sector error rate has not improved as drive density has
> increased.
> 
> So in the days of 1GB drives, the likelihood of an undetected /
> unrepaired bad sector was actually pretty low for the drive as a whole.
> But for today's 1TB drives, the odds are 1000x worse, i.e. 1000x more
> sectors with the same basic failure rate per sector.
> 
> So a raid-5 composed of 1TB drives is 1000x more likely to be unable
> to rebuild itself after a drive failure than a raid-5 built from 1 GB
> drives of yesteryear.  Thus the current recommendation is to use raid
> 6 with high density drives.

That sounds about right. One might still see RAID-5 as a way of pushing 
the risk of data loss from bad sectors back into a comfortable zone. After 
all, the likelihood of the same sector going bad on one of the other drives 
should be relatively small. Unfortunately it's too long since I studied 
probability for me to work it out properly. Then, to also protect 
yourself against dead drives, adding another drive a la RAID-6 sounds 
like the answer. But you can't think of RAID-6 as protecting you from two 
whole-drive failures any more.

What is more, you need Linux md's implementation of single-sector 
recovery/rewriting for this to work. You cannot go around failing arrays 
because occasional single-sector reads fail.

> The good news is that Western Digital is apparently introducing a new
> series of drives with an error rate "2 orders of magnitude" better
> than the current generation.

It's not borne out in their figures; WD quote "less than 1 in 10^15 
bits" which is the same as they quote for their older drives.

What sums I've done, on the basis of a 1 in 10^15 bit unrecoverable 
error rate, suggest you've a 1 in 63 chance of getting an uncorrectable 
error while reading the whole surface of their 2TB disc. Read the whole 
disc 44 times and you've a 50/50 chance of hitting an uncorrectable error.

You could read the whole drive in about 5 hours, according to the spec 
(at 110MB/s), so if you keep your drive busy you're going to reach this 
point in about 9 days. If you had a 5-drive array, you're going to get 
here inside 2 days.
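
For anyone who wants to check the working, it only takes a few lines
(assuming the 2TB capacity, 110MB/s and 1-in-10^15-bit figures from the
spec sheet):

import math

ure_per_bit = 1e-15                 # "less than 1 in 10^15 bits"
drive_bytes = 2 * 10**12            # 2TB
bits = drive_bytes * 8

p_one_pass = -math.expm1(bits * math.log1p(-ure_per_bit))
print("one full read: 1 in %.0f" % (1 / p_one_pass))          # ~ 1 in 63

passes = math.ceil(math.log(0.5) / math.log1p(-p_one_pass))
print("50/50 after %d full reads" % passes)                    # ~ 44

hours_per_pass = drive_bytes / (110 * 10**6) / 3600.0          # ~ 5 hours
print("about %.0f days of continuous reading" %
      (passes * hours_per_pass / 24))                          # ~ 9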

Bear in mind that this is on a drive working perfectly correctly as 
specified. We have to expect to be recovering from failed reads daily.

</doom> ;-)

Cheers,

John.

PS. Wish I'd written down my working for this.
PPS. I'm not having a go at WD; other manufacturers' specs are similar.



* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-03 14:08   ` John Robinson
@ 2010-02-05 15:38     ` Bill Davidsen
  2010-02-05 16:14       ` Greg Freemyer
  2010-02-05 16:59       ` John Robinson
  0 siblings, 2 replies; 11+ messages in thread
From: Bill Davidsen @ 2010-02-05 15:38 UTC (permalink / raw)
  To: John Robinson; +Cc: Linux RAID

John Robinson wrote:
> On 02/02/2010 22:46, Greg Freemyer wrote:
>> All,
>>
>> I think the below is accurate, but please correct me if I'm wrong or
>> have misunderstood.
>>
>> ===
>> If you're using normal big drives (1TB, etc.) in a raid-5 array, the
>> general consensus of this list is that it is a bad idea.  The reason is
>> that the per-sector error rate has not improved as drive density has
>> increased.
>>
>> So in the days of 1GB drives, the likelihood of an undetected /
>> unrepaired bad sector was actually pretty low for the drive as a whole.
>> But for today's 1TB drives, the odds are 1000x worse, i.e. 1000x more
>> sectors with the same basic failure rate per sector.
>>
>> So a raid-5 composed of 1TB drives is 1000x more likely to be unable
>> to rebuild itself after a drive failure than a raid-5 built from 1 GB
>> drives of yesteryear.  Thus the current recommendation is to use raid
>> 6 with high density drives.
>
> That sounds about right. One might still see RAID-5 as a way of 
> pushing data loss through bad sectors back into a comfortable zone. 
> After all, the likelihood of the same sector going bad on one of the 
> other drives should be relatively small. Unfortunately it's too long 
> since I studied probability for me to work it out properly. Then, to 
> also protect yourself against dead drives, adding another drive a la 
> RAID-6 sounds like the answer. But you can't think of RAID-6 
> protecting you from 2 drive failures any more.
>
> What is more, you need Linux md's implementation of single-sector 
> recovery/rewriting for this to work. You cannot go around failing 
> arrays because occasional single-sector reads fail.
>
>> The good news is that Western Digital is apparently introducing a new
>> series of drives with an error rate "2 orders of magnitude" better
>> than the current generation.
>
> It's not borne out in their figures; WD quote "less than 1 in 10^15 
> bits" which is the same as they quote for their older drives.
>
> What sums I've done, on the basis of a 1 in 10^15 bit unrecoverable 
> error rate, suggest you've a 1 in 63 chance of getting an 
> uncorrectable error while reading the whole surface of their 2TB disc. 
> Read the whole disc 44 times and you've a 50/50 chance of hitting an 
> uncorrectable error.
>
Rethink that: virtually all errors happen during a write; reading is 
non-destructive in terms of what's on the drive. So the data is valid after 
the write or it isn't, and once it has been written correctly, barring 
failures in the media (including the mechanical parts) or the electronics, 
the chance of it "going bad" is probably vanishingly small. And since "write 
in the wrong place" errors are proportional to actual writes, long-term 
storage of unchanging data fares better than active drives with lots of change.

> You could read the whole drive in about 5 hours, according to the spec 
> (at 110MB/s), so if you keep your drive busy you're going to reach 
> this point in about 9 days. If you had a 5-drive array, you're going 
> to get here inside 2 days.
>
> Bear in mind that this is on a drive working perfectly correctly as 
> specified. We have to expect to be recovering from failed reads daily.
>
> </doom> ;-)
>
> Cheers,
>
> John.
>
> PS. Wish I'd written down my working for this.
> PPS. I'm not having a go at WD; other manufacturers' specs are similar.


-- 
Bill Davidsen <davidsen@tmr.com>
  "We can't solve today's problems by using the same thinking we
   used in creating them." - Einstein



* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-05 15:38     ` Bill Davidsen
@ 2010-02-05 16:14       ` Greg Freemyer
  2010-02-05 17:12         ` Bill Davidsen
  2010-02-05 16:59       ` John Robinson
  1 sibling, 1 reply; 11+ messages in thread
From: Greg Freemyer @ 2010-02-05 16:14 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: John Robinson, Linux RAID

On Fri, Feb 5, 2010 at 10:38 AM, Bill Davidsen <davidsen@tmr.com> wrote:
<snip>
>>> The good news is that Western Digital is apparently introducing a new
>>> series of drives with an error rate "2 orders of magnitude" better
>>> than the current generation.
>>
>> It's not borne out in their figures; WD quote "less than 1 in 10^15 bits"
>> which is the same as they quote for their older drives.
>>
>> What sums I've done, on the basis of a 1 in 10^15 bit unrecoverable error
>> rate, suggest you've a 1 in 63 chance of getting an uncorrectable error
>> while reading the whole surface of their 2TB disc. Read the whole disc 44
>> times and you've a 50/50 chance of hitting an uncorrectable error.
>>
> Rethink that, virtually all errors happen during write, reading is
> non-destructive, in terms of what's on the drive. So it's valid after write
> or it isn't, but having been written correctly, other than failures in the
> media (including mechanical parts) or electronics, the chances of "going
> bad" are probably vanishingly small. And since "write in the wrong place"
> errors are proportional to actual writes, long term storage of unchanging
> data is better than active drives with lots of change.

Bill,

I thought writes went to the media unverified.  Thus if you write data
to a newly bad sector you won't know until some future point when you
try to read it.

It is during a read that the bad ECC is detected and the sector marked
for future relocation.  The relocation of course does not happen until
another write comes along.  Hence the importance of routinely doing a
background scan to detect bad sectors and, when one is encountered,
rebuilding the data from the other drives and rewriting it, thus
triggering the remapping to a spare sector.
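
For what it's worth, a minimal sketch of such a scan using md's standard
sysfs interface (assuming the array is /dev/md0 and you are root):

#!/usr/bin/env python3
# Kick off an md "check" pass (read and verify every sector), wait for it
# to finish, then report the mismatch count.
import time

MD = "/sys/block/md0/md"

with open(MD + "/sync_action", "w") as f:
    f.write("check\n")

while open(MD + "/sync_action").read().strip() != "idle":
    time.sleep(60)

print("mismatch_cnt:", open(MD + "/mismatch_cnt").read().strip())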

If I've got that wrong, I'd appreciate a correction.

Greg


* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-05 15:38     ` Bill Davidsen
  2010-02-05 16:14       ` Greg Freemyer
@ 2010-02-05 16:59       ` John Robinson
  2010-02-05 17:40         ` Bill Davidsen
  1 sibling, 1 reply; 11+ messages in thread
From: John Robinson @ 2010-02-05 16:59 UTC (permalink / raw)
  To: Linux RAID

On 05/02/2010 15:38, Bill Davidsen wrote:
> John Robinson wrote:
[...]
>> What sums I've done, on the basis of a 1 in 10^15 bit unrecoverable 
>> error rate, suggest you've a 1 in 63 chance of getting an 
>> uncorrectable error while reading the whole surface of their 2TB disc. 
>> Read the whole disc 44 times and you've a 50/50 chance of hitting an 
>> uncorrectable error.
>>
> Rethink that, virtually all errors happen during write, reading is 
> non-destructive, in terms of what's on the drive. So it's valid after 
> write or it isn't, but having been written correctly, other than 
> failures in the media (including mechanical parts) or electronics, the 
> chances of "going bad" are probably vanishingly small.

They're quite small, at 1 in 10^15 bits read. On 1GB discs, you probably 
could call it vanishingly small. But now with 1TB and larger discs, I 
wouldn't characterise it as vanishingly small. It's entirely on the 
basis of the given specs that I did my calculations.

Bear in mind that the operation of the disc is now deliberately designed 
to use ECC all the time. Have a look at the vast numbers you get from 
the SMART data for ECC errors corrected. I just checked a 160GB 
single-platter disc with 4500 power-on hours; it quotes 200,000,000 
hardware ECC errors recovered.
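
Easy enough to pull those counters out of smartctl and watch them over
time; a rough sketch (assumes smartmontools is installed, and note that
attribute names vary by vendor - Hardware_ECC_Recovered is a Seagate
name, for instance):

#!/usr/bin/env python3
# Print the ECC / reallocation related lines from smartctl's attribute table.
import subprocess, sys

dev = sys.argv[1] if len(sys.argv) > 1 else "/dev/sda"
out = subprocess.run(["smartctl", "-A", dev],
                     stdout=subprocess.PIPE).stdout.decode()

for line in out.splitlines():
    if any(key in line for key in ("ECC", "Reallocated", "Pending")):
        print(line)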

Cheers,

John.



* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-05 16:14       ` Greg Freemyer
@ 2010-02-05 17:12         ` Bill Davidsen
  0 siblings, 0 replies; 11+ messages in thread
From: Bill Davidsen @ 2010-02-05 17:12 UTC (permalink / raw)
  To: Greg Freemyer; +Cc: John Robinson, Linux RAID

Greg Freemyer wrote:
> On Fri, Feb 5, 2010 at 10:38 AM, Bill Davidsen <davidsen@tmr.com> wrote:
> <snip>
>   
>>>> The good news is that Western Digital is apparently introducing a new
>>>> series of drives with an error rate "2 orders of magnitude" better
>>>> than the current generation.
>>>>         
>>> It's not borne out in their figures; WD quote "less than 1 in 10^15 bits"
>>> which is the same as they quote for their older drives.
>>>
>>> What sums I've done, on the basis of a 1 in 10^15 bit unrecoverable error
>>> rate, suggest you've a 1 in 63 chance of getting an uncorrectable error
>>> while reading the whole surface of their 2TB disc. Read the whole disc 44
>>> times and you've a 50/50 chance of hitting an uncorrectable error.
>>>
>>>       
>> Rethink that, virtually all errors happen during write, reading is
>> non-destructive, in terms of what's on the drive. So it's valid after write
>> or it isn't, but having been written correctly, other than failures in the
>> media (including mechanical parts) or electronics, the chances of "going
>> bad" are probably vanishingly small. And since "write in the wrong place"
>> errors are proportional to actual writes, long term storage of unchanging
>> data is better than active drives with lots of change.
>>     
>
> Bill,
>
> I thought writes went to the media unverified.  Thus if you write data
> to a newly bad sector you won't know until some future point when you
> try to read it.
>
> During the read is when the bad CRC is detected and the sector marked
> for future relocation.  The relocation of course does not happen until
> another write comes along.  Thus the importance of doing a background
> scan routinely to detect bad sectors and when encountered to rebuild
> the info from other drives and then rewrite it thus triggering the
> remapping to a spare sector.
>
> If I've got that wrong, I'd appreciate a correction.
>   

No, modern drives are less robust than the "read after write" technology 
used in the past, but my point about the "read the whole disc 44 times" 
idea is that the chance of a good read being followed by a bad read is 
smaller than the chance of a failure during the first read after a write.

The whole idea of using larger sectors is hardly new. Back in the days 
of eight-inch floppy disks, capacity was increased by going to larger 
sectors, and eventually speed was boosted by using track buffers: 
reading an entire track into cache and delivering 256-byte "sectors" 
from that. On writes, the data was modified in the buffer, and the 
modified 256-byte pseudo-sectors were tracked with a bitmap flagging 
which 1k hardware sectors were dirty, or the whole track could be 
rewritten.

I wrote code to do that using 128k of bank-select memory from CCS in the 
late '70s, and sold enough hardware to buy a new car. So there is little 
new under the sun: the problem of handling small pseudo-sectors has been 
studied for ages, and current hardware allows the solution to move into 
the drive itself, just like the variable number of sectors per track 
(although everyone uses LBA now).

-- 
Bill Davidsen <davidsen@tmr.com>
  "We can't solve today's problems by using the same thinking we
   used in creating them." - Einstein



* Re: If you're using large SATA drives in RAID 5/6 ....
  2010-02-05 16:59       ` John Robinson
@ 2010-02-05 17:40         ` Bill Davidsen
  0 siblings, 0 replies; 11+ messages in thread
From: Bill Davidsen @ 2010-02-05 17:40 UTC (permalink / raw)
  To: John Robinson; +Cc: Linux RAID

John Robinson wrote:
> On 05/02/2010 15:38, Bill Davidsen wrote:
>> John Robinson wrote:
> [...]
>>> What sums I've done, on the basis of a 1 in 10^15 bit unrecoverable 
>>> error rate, suggest you've a 1 in 63 chance of getting an 
>>> uncorrectable error while reading the whole surface of their 2TB 
>>> disc. Read the whole disc 44 times and you've a 50/50 chance of 
>>> hitting an uncorrectable error.
>>>
>> Rethink that, virtually all errors happen during write, reading is 
>> non-destructive, in terms of what's on the drive. So it's valid after 
>> write or it isn't, but having been written correctly, other than 
>> failures in the media (including mechanical parts) or electronics, 
>> the chances of "going bad" are probably vanishingly small.
>
> They're quite small, at 1 in 10^15 bits read. On 1GB discs, you 
> probably could call it vanishingly small. But now with 1TB and larger 
> discs, I wouldn't characterise it as vanishingly small. It's entirely 
> on the basis of the given specs that I did my calculations.
>
> Bear in mind that the operation of the disc is now deliberately 
> designed to use ECC all the time. Have a look at the vast numbers you 
> get from the SMART data for ECC errors corrected. I just checked a 
> 160GB single-platter disc with 4500 power-on hours; it quotes 
> 200,000,000 hardware ECC errors recovered.

I don't know how to read the power-on-hours (POH) SMART reports for 
Seagate. I just checked a server which has been up 167 days most recently, 
and for all but two weeks (moves and such) of the last four years. It 
shows 622 POH, and the other drives in the same raid-5 array show times 
from 1600 down to 470. Two report ECC counts of 50-60 million over four 
years; the other reports 6. Yes, six. None show any relocated sectors. My 
set of WD 1TB drives showed no relocations in a year, and no errors (the 
field may not be shown if it is zero).

I keep a table of MD5 sums for all significant files on the arrays, and 
haven't seen an error in years. Since I do a "check" regularly, I know 
all sectors are being read. My main issue with your post was the "read 
44 times" part, as explained in another reply, not your original 
calculation.

-- 
Bill Davidsen <davidsen@tmr.com>
  "We can't solve today's problems by using the same thinking we
   used in creating them." - Einstein


