* feature suggestion to handle read errors during re-sync of raid5
From: Mikael Abrahamsson @ 2010-01-30 12:37 UTC
To: linux-raid
So, a couple of times I've had the following problem: something goes
wrong on a raid5, a drive gets kicked and thus has a lower event number,
I re-add it, and during the resync a single block on one of the other
drives has a read error (surprisingly common on WD20EADS 2TB drives).
The resync then stops, and I have to take down the array, ddrescue the
whole drive with the read error to another drive (losing that block),
start the array degraded, and add the first drive again.
It would be nice if there were an option so that, when resyncing a drive
which earlier belonged to the array, a read error on another drive is
satisfied with the corresponding block from the drive being re-added.
In my case it's highly likely to be valid, and if it's not, I haven't
lost anything, because the unreadable block is gone anyway.
Does this make sense? It would of course also be nice if the md layer
could tell the difference between SATA timeouts and UNC errors, because
UNC really means something is wrong, whereas a SATA timeout might be a
transient problem (?).
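To make the idea concrete, here's a rough sketch of the intended policy
in Python-style pseudocode (all names invented for illustration; this
is not the actual md code):

from functools import reduce

class ReadError(Exception):
    pass

def xor_blocks(a, b):
    # XOR two equal-sized blocks (RAID5 parity arithmetic).
    return bytes(x ^ y for x, y in zip(a, b))

def resync_stripe(stripe_no, readded, others):
    # Rebuild one stripe of the re-added disk from the other members.
    blocks = []
    for disk in others:
        try:
            blocks.append(disk.read(stripe_no))
        except ReadError:
            # Today: the resync aborts here and the array degrades.
            # Proposed: keep the stale-but-probably-valid block that
            # is already on the disk being re-added and carry on.
            return
    readded.write(stripe_no, reduce(xor_blocks, blocks))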
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Giovanni Tessore @ 2010-01-30 17:51 UTC
To: linux-raid
Mikael Abrahamsson wrote:
> It would be nice if there were an option so that, when resyncing a
> drive which earlier belonged to the array, a read error on another
> drive is satisfied with the corresponding block from the drive being
> re-added. [...]
I had a similar problem recently, and I'm discussing read errors on
another thread too.
I think your proposal could be useful for recovering from a panic
situation, and it would surely have helped me recover from my disaster:
my failed drive was not 100% dead, and if I could have used it to
correct read errors on the other disk, I would have saved a lot of
time, pain and anger.
But I also think some work needs to be done on the raid read-error
policy, to avoid these situations arising in the first place where
possible.
By the way, I have a doubt. You say these errors are common on these
drives, and in another post I read that 'with modern drives it is
possible to have some failed sectors'.
I had supposed that a modern hard drive's firmware would recover and
relocate dying sectors on its own (using SMART and other techniques),
and that the OS would get read errors only when the drive is actually
in very bad shape, can't cope with the problem, and it's time to trash
it. Having the OS recover and rewrite the sectors makes me feel like
I'm back in the past, when under DOS we used PCTools and other
utilities to do this recovery stuff on ST-506 drives... This works well
with raid, but in a single-disk configuration, wouldn't this mean data
loss?
I'm confused... how reliable are modern disks?
--
Cordiali saluti.
Yours faithfully.
Giovanni Tessore
* Re: feature suggestion to handle read errors during re-sync of raid5
From: John Robinson @ 2010-01-30 19:04 UTC
To: Giovanni Tessore; +Cc: linux-raid
On 30/01/2010 17:51, Giovanni Tessore wrote:
> [...] Having the OS recover and rewrite the sectors makes me feel
> like I'm back in the past, when under DOS we used PCTools and other
> utilities to do this recovery stuff on ST-506 drives... This works
> well with raid, but in a single-disk configuration, wouldn't this
> mean data loss?
>
> I'm confused... how reliable are modern disks?
I think the problem is that modern discs haven't got more reliable, but
they have got much, much bigger. A "modern" 2G disc had an
unrecoverable read error rate of 1 bit per 10^14 bits read, and current
2T discs quote the same, so while you could read the whole surface of a
2G disc thousands of times before being likely to hit a read error, on
a 2T disc it's only a handful of times. Someone recently posted links
to a formal article analysing this subject on this list, but I can't
find it :-(
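A back-of-envelope check in Python, assuming the quoted 1-in-10^14
rate (numbers approximate):

bits_2g = 2e9 * 8       # ~1.6e10 bits on a 2G disc
bits_2t = 2e12 * 8      # ~1.6e13 bits on a 2T disc
print(1e14 / bits_2g)   # ~6250 whole-surface reads per expected error
print(1e14 / bits_2t)   # ~6 whole-surface reads per expected error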
Cheers,
John.
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Mikael Abrahamsson @ 2010-01-30 21:33 UTC
To: linux-raid
On Sat, 30 Jan 2010, John Robinson wrote:
> I think the problem is that modern discs haven't got more reliable,
> but they have got much, much bigger. A "modern" 2G disc had an
> unrecoverable read error rate of 1 bit per 10^14 bits read, and
> current 2T discs quote the same [...]
I think the 4k sector size on the WD20EARS (for instance) is supposed
to add more ECC information, but I'm not sure how this will affect the
10^14 error rate. I think the manufacturers need to work a bit more on
this; I don't see how these drives can be used in a single-drive
configuration at the current 10^14 value.
I don't have experience with any other drives than the WD20EADS but they
develop read errors at a rate higher than any other drive I've
experienced before (and I started doing computer stuff in the 80ties).
Yes, the large size is absolutely a factor, so the per-block read error
rate might not be too high, but the read error per drive is definitely so
(and I use the default ubuntu behaviour and read scrub every week).
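For reference, a scrub can also be kicked off by hand through the md
sysfs interface; a minimal sketch in Python, with the device name as an
example:

with open("/sys/block/md0/md/sync_action", "w") as f:
    f.write("check")

This is essentially what the distribution's periodic cron job does.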
I'm going to RAID6 as soon as I can because of this; I'm tired of the
bit-rot I find when I have to ddrescue things (hopefully it's just 4k
of data in the middle of a file I care little about).
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Asdo @ 2010-01-30 22:04 UTC
To: Mikael Abrahamsson; +Cc: linux-raid
Mikael Abrahamsson wrote:
> I think the 4k sector size on the WD20EARS (for instance) is supposed
> to add more ECC information, but I'm not sure how this will affect
> the 10^14 error rate. [...]
>
> I don't have experience with any drives other than the WD20EADS, but
> they develop read errors at a higher rate than any other drive I've
> used....
We are thinking about buying a large number of WD2002FYPS RE4-GP drives
for a large array (raid-6).
Does anyone know whether the error rate is as high on those drives too?
Do they also use 4k sectors? Mikael, how do you find this out?
Are the 4k sectors an RMW emulation, as described in
http://lwn.net/Articles/322777/, or does Linux really see them as 4k
block devices? And if it's the latter, were there side effects?
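One way to check, assuming a recent kernel that exposes these sysfs
attributes (sdb is an example device; note that a drive emulating
512-byte sectors may still report 512 here, which is partly why I'm
asking):

base = "/sys/block/sdb/queue/"
for attr in ("logical_block_size", "physical_block_size"):
    with open(base + attr) as f:
        print(attr, f.read().strip())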
Thank you
* Re: feature suggestion to handle read errors during re-sync of raid5
From: John Robinson @ 2010-01-31 16:17 UTC
To: Linux RAID
On 30/01/2010 21:33, Mikael Abrahamsson wrote:
[...]
> I think the 4k sector size on the WD20EARS (for instance) is supposed
> to add more ECC information, but I'm not sure how this will affect
> the 10^14 error rate.
IIRC, part of the point of moving to 4K sectors is to improve the error
correction to something like 1 in 10^20 or 10^22 without losing storage
density, partly by using what was lost before in inter-sector gaps and
partly because you can do better with more bits of ECC over more data.
Frankly, I wish they'd sacrificed a little storage density and improved
the error rate a long time ago.
Cheers,
John.
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Asdo @ 2010-01-31 16:34 UTC
To: John Robinson; +Cc: Linux RAID
John Robinson wrote:
> On 30/01/2010 21:33, Mikael Abrahamsson wrote:
> [...]
>> I think the 4k sector size on the WD20EARS (for instance) is
>> supposed to add more ECC information, but I'm not sure how this
>> will affect the 10^14 error rate.
>
> IIRC, part of the point of moving to 4K sectors is to improve the
> error correction to something like 1 in 10^20 or 10^22 without losing
> storage density, partly by using what was lost before in inter-sector
> gaps and partly because you can do better with more bits of ECC over
> more data.
I remember it the other way around: the purpose of 4k was to keep the
same error rate while saving storage space.
Do you have links? I got my impression from
http://lwn.net/Articles/322777/ but it's not explicitly written there.
> Frankly, I wish they'd sacrificed a little storage density and
> improved the error rate a long time ago.
Me too.
Anyway, I think you'll never know how many ECC bits a vendor uses; it's
not declared. They just declare the estimated error rate, which I think
is an estimate of how likely a surface defect is and how likely it is
to be fixable by the ECC algorithm. So the discussion has no real
meaning...
There is another, maybe more important, factor: how likely is silent
data corruption? IIRC, Reed-Solomon codes can be tuned (at the same
number of check bits) for more correction and less detection of errors,
or for more detection and less correction. If the detection is
insufficient, you get garbage data when reading the sector, but the
disk does not return a read error, so the OS takes it as good data.
Every bit of error correction costs the same number of ECC bits as two
bits of error detection. The disk makers never tell you how their
algorithm is balanced.
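(For reference, the standard coding-theory bound for a Reed-Solomon
code with r check symbols -- this is textbook material, not anything
the vendors publish about their firmware -- is

    2*e + d <= r

i.e. the same r check symbols can be spent on correcting e symbol
errors plus detecting d further ones, so pure detection (e = 0) handles
twice as many bad symbols as full correction (d = 0); that is the
1-to-2 trade-off above.)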
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Goswin von Brederlow @ 2010-01-31 18:04 UTC
To: Asdo; +Cc: John Robinson, Linux RAID
Asdo <asdo@shiftmail.org> writes:
> John Robinson wrote:
>> IIRC, part of the point of moving to 4K sectors is to improve the
>> error correction to something like 1 in 10^20 or 10^22 without
>> losing storage density [...]
> I remember it the other way around: the purpose of 4k was to keep the
> same error rate while saving storage space.
I thought the purpose of 4k was to reduce the number of blocks so the
wear leveling has to handle less data.
MfG
Goswin
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Mikael Abrahamsson @ 2010-01-31 17:56 UTC
To: John Robinson; +Cc: Linux RAID
On Sun, 31 Jan 2010, John Robinson wrote:
> IIRC, part of the point of moving to 4K sectors is to improve the
> error correction to something like 1 in 10^20 or 10^22 without losing
> storage density, partly by using what was lost before in inter-sector
> gaps and partly because you can do better with more bits of ECC over
> more data.
>
> Frankly, I wish they'd sacrificed a little storage density and
> improved the error rate a long time ago.
Looking at the data sheets for the WD20EADS and WD20EARS, they both
quote 10^15 for "non-recoverable read errors per bits read", so at
least on the data sheet nothing has really changed in the error rate
department.
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Roger Heflin @ 2010-02-01 1:30 UTC
To: Mikael Abrahamsson; +Cc: John Robinson, Linux RAID
Mikael Abrahamsson wrote:
> On Sun, 31 Jan 2010, John Robinson wrote:
> [...]
> Looking at the data sheets for the WD20EADS and WD20EARS, they both
> quote 10^15 for "non-recoverable read errors per bits read", so at
> least on the data sheet nothing has really changed in the error rate
> department.
Based on my experience, I am not sure how meaningful that parameter is
without a time constraint attached, unless it applies to the underlying
platter, meaning the error is still correctable and does not represent
an error reported back to the user by the disk.
Bit errors seem more likely the longer a sector has sat since being
written. I would not expect to see errors if you read the entire disk
and then reread it once a day for a year; but if you read the disk
once, let it sit spinning for a year without any reads, and then read
it again, you will almost certainly get a read error. When you keep
rereading it, the disk should move a sector long before the bits go bad
enough for data loss; but if you only read it once a year, it is likely
that by the time you find the data bad there will be too many bad bits
to correct.
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Mikael Abrahamsson @ 2010-02-01 7:15 UTC
To: Linux RAID
On Sun, 31 Jan 2010, Roger Heflin wrote:
> Bit errors seem more likely the longer a sector has sat since being
> written. [...] When you keep rereading it, the disk should move a
> sector long before the bits go bad enough for data loss; but if you
> only read it once a year, it is likely that by the time you find the
> data bad there will be too many bad bits to correct.
Are you sure that drives today remap like that (i.e. if a read only
succeeds on the fifth try, the drive reallocates the sector)?
Looking at "reallocated sectors" on my drives, I've never seen this
happen. That SMART parameter has never increased unless I had a hard
UNC and rewrote the sector (and a lot of the time, not even then).
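For reference, this is roughly what I watch, sketched in Python
(smartctl is from smartmontools; the exact attribute names vary a
little by vendor):

import subprocess

# Dump the SMART attribute table and pick out the two counters that
# track reallocation activity.
out = subprocess.run(["smartctl", "-A", "/dev/sda"],
                     capture_output=True, text=True).stdout
for line in out.splitlines():
    if "Reallocated_Sector_Ct" in line or "Current_Pending_Sector" in line:
        print(line)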
--
Mikael Abrahamsson email: swmike@swm.pp.se
* RE: feature suggestion to handle read errors during re-sync of raid5
From: Guy Watkins @ 2010-02-01 13:33 UTC
To: 'Mikael Abrahamsson', 'Linux RAID'
I have always read that drives remap on failed writes, but I haven't
read anything new in 2+ years.
} -----Original Message-----
} From: Mikael Abrahamsson
}
} Are you sure that drives today remap like that (i.e. if a read only
} succeeds on the fifth try, the drive reallocates the sector)?
}
} Looking at "reallocated sectors" on my drives, I've never seen this
} happen. That SMART parameter has never increased unless I had a hard
} UNC and rewrote the sector (and a lot of the time, not even then).
* RE: feature suggestion to handle read errors during re-sync of raid5
From: Mikael Abrahamsson @ 2010-02-01 13:42 UTC
To: 'Linux RAID'
On Mon, 1 Feb 2010, Guy Watkins wrote:
> I have always read that drives remap on failed writes, but I haven't
> read anything new in 2+ years.
Yes, a failed write is one thing, and I agree that remapping happens
there. The question was whether there will be a remap after a read that
"was hard to do because it took several attempts before it succeeded".
My experience is that there isn't.
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Goswin von Brederlow @ 2010-02-01 15:15 UTC
To: Mikael Abrahamsson; +Cc: 'Linux RAID'
Mikael Abrahamsson <swmike@swm.pp.se> writes:
> Yes, a failed write is one thing, and I agree that remapping happens
> there. The question was whether there will be a remap after a read
> that "was hard to do because it took several attempts before it
> succeeded".
>
> My experience is that there isn't.
Why should it remap? The drive should first rewrite the pattern to
refresh the bits, and only when that fails is a remap required.
It is true that drives do not remap on FAILED reads, but that means
reads so broken that the ECC cannot repair them. Simple bit errors that
the ECC can repair are never seen by the OS. So maybe you have just
never had the rewrite of a bit error produce a write error?
MfG
Goswin
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Mikael Abrahamsson @ 2010-02-01 16:28 UTC
To: 'Linux RAID'
On Mon, 1 Feb 2010, Goswin von Brederlow wrote:
> It is true that drives do not remap on FAILED reads, but that means
> reads so broken that the ECC cannot repair them. Simple bit errors
> that the ECC can repair are never seen by the OS. So maybe you have
> just never had the rewrite of a bit error produce a write error?
Could be.
Since the drive has ECC, it would be interesting to see the number of
read errors the ECC has corrected. It would also be interesting to know
whether the drive rewrites a sector after an ECC-corrected read, so
that the ECC doesn't have to kick in next time (or whether drives
actually operate in such narrow margins that the ECC is used constantly
because the s/n ratio is bad, and this is part of the design).
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Richard Scobie @ 2010-02-01 20:30 UTC
To: Mikael Abrahamsson; +Cc: 'Linux RAID'
Mikael Abrahamsson wrote:
> Since the drive has ECC, it would be interesting to see the number of
> read errors the ECC has corrected. It would also be interesting to
> know whether the drive rewrites a sector after an ECC-corrected read,
> so that the ECC doesn't have to kick in next time (or whether drives
> actually operate in such narrow margins that the ECC is used
> constantly because the s/n ratio is bad, and this is part of the
> design).
I guess this is manufacturer- and model-specific, but looking at the
smartctl output from a representative Seagate SCSI drive (one with
similar errors to others of the same model), they are not rewriting on
ECC corrections:
Error counter log:
          Errors Corrected by          Total   Correction    Gigabytes    Total
              ECC         rereads/    errors   algorithm     processed    uncorrected
          fast | delayed  rewrites  corrected  invocations  [10^9 bytes]  errors
read:   2205106        0         0   2205106      2205106     18342.407        0
write:        0        0         0         0            0      3831.185        0
Regards,
Richard
* Re: feature suggestion to handle read errors during re-sync of raid5
From: John Robinson @ 2010-02-02 11:06 UTC
To: Mikael Abrahamsson; +Cc: 'Linux RAID'
On 01/02/2010 16:28, Mikael Abrahamsson wrote:
[...]
> Since the drive has ECC, it would be interesting to see the number of
> read errors the ECC has corrected. It would also be interesting to
> know whether the drive rewrites a sector after an ECC-corrected read,
> so that the ECC doesn't have to kick in next time [...]
I think that's exactly what's going on. There may also be some level of
correction above which the drive will transparently rewrite after a
read; e.g. if the limit of correctability is X bits per sector, it
automatically rewrites once it has had to correct 90% of X bits, and
anything less than that is just normal operation.
Cheers,
John.
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Asdo @ 2010-01-30 21:09 UTC
To: Giovanni Tessore; +Cc: linux-raid
Giovanni Tessore wrote:
> I had supposed that a modern hard drive's firmware would recover and
> relocate dying sectors on its own (using SMART and other techniques),
> and that the OS would get read errors only when the drive is actually
> in very bad shape, can't cope with the problem, and it's time to
> trash it.
The disk must find the dying sector somehow.
It will certainly find it if you scrub the array often, and the sector
can be relocated at that time (by the disk itself or by MD's rewrite).
If the disk is very smart (I'm not sure how many disks really do this),
it can relocate during the read, if it finds that the sector has a
partial read error that is still correctable by the internal
Reed-Solomon code. Otherwise it will report the read error to Linux, MD
will regenerate the block from the other disks and parity and write
over the sector, and at that point the sector will be relocated.
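A rough sketch of that repair path in Python-style pseudocode (names
invented, not the actual md implementation):

from functools import reduce

def xor_blocks(a, b):
    # XOR two equal-sized blocks (RAID5 parity arithmetic).
    return bytes(x ^ y for x, y in zip(a, b))

def fix_read_error(stripe_no, bad_disk, good_disks):
    # Reconstruct the unreadable block from all the other members
    # (XOR of the remaining data blocks and parity)...
    data = reduce(xor_blocks, [d.read(stripe_no) for d in good_disks])
    # ...then write it back: a marginal sector is refreshed in place,
    # a truly bad one is reallocated by the drive during the write.
    bad_disk.write(stripe_no, data)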
I'm not sure the SMART self-test performs a surface scan, so I would
rely on MD scrubs for that.
> Having the OS recover and rewrite the sectors makes me feel like I'm
> back in the past, when under DOS we used PCTools and other utilities
> to do this recovery stuff on ST-506 drives... This works well with
> raid, but in a single-disk configuration, wouldn't this mean data
> loss?
The OS rewrites the sector ONLY if the disk is in a raid setup.
Otherwise how could it guess the data that should be in that sector, if
the disk itself cannot?
* Re: feature suggestion to handle read errors during re-sync of raid5
From: Goswin von Brederlow @ 2010-01-30 18:59 UTC
To: Mikael Abrahamsson; +Cc: linux-raid
Mikael Abrahamsson <swmike@swm.pp.se> writes:
> So, a couple of times I've had the following problem: something goes
> wrong on a raid5, a drive gets kicked and thus has a lower event
> number, I re-add it, and during the resync a single block on one of
> the other drives has a read error (surprisingly common on WD20EADS
> 2TB drives). [...]
>
> Does this make sense? It would of course also be nice if the md layer
> could tell the difference between SATA timeouts and UNC errors,
> because UNC really means something is wrong, whereas a SATA timeout
> might be a transient problem (?).
Have you looked into adding a bitmap? That way the resync only covers
the parts where something changed, is done within minutes, and is
unlikely to hit another error.
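For the archives: on a reasonably recent mdadm an internal write-intent
bitmap can be added to a live array with something like

    mdadm --grow --bitmap=internal /dev/md0

after which re-adding a recently kicked member only resyncs the regions
marked dirty.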
MfG
Goswin