* apparent but not real raid1 failure. what happened? still confused. Gurus Please help...
From: Mitchell Laks @ 2005-05-02 16:24 UTC (permalink / raw)
To: linux-raid
I recently had what appeared to be
a raid1 failure and was wondering what insight I should draw. The kernel
diagnostics suggested a dual drive failure - but the data turned out to still
be there. What does this mean?
I described what happened in an earlier post, but I really don't understand
and would be very grateful for insight from the gurus on the list.
Is it a bug in the kernel? in software raid? Is it my stupidity?
My system:
Asus K8V-X motherboard with an AMD64 processor,
uname -a
Linux A2 2.6.8-1-386 #1 Mon Jan 24 03:01:58 EST 2005 i686 GNU/Linux
(this was the stock Debian 2.6.8 kernel circa January)
mdadm v1.9.0
All hard drives are 250GB parallel ATA IDE drives,
WD2500JB drives (3-year warranty)
Initially, one raid failed:
/dev/md0, mirrored between /dev/hda1 and /dev/hdg1,
with /dev/hdg1 on a HighPoint Rocket 133 controller.
From reading the log files I see that initially /dev/hda1 died
Apr 21 07:36:01 A2 kernel: hda: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Apr 21 07:36:01 A2 kernel: hda: dma_intr: error=0x40 { UncorrectableError },
LBAsect=209715335, high=12, low=8388743, sector=209715335
Apr 21 07:36:01 A2 kernel: end_request: I/O error, dev hda, sector 209715335
Apr 21 07:36:01 A2 kernel: raid1: Disk failure on hda1, disabling device.
Apr 21 07:36:01 A2 kernel: ^IOperation continuing on 1 devices
Apr 21 07:36:01 A2 kernel: raid1: hda1: rescheduling sector 209715272
Apr 21 07:36:01 A2 kernel: RAID1 conf printout:
Apr 21 07:36:01 A2 kernel: --- wd:1 rd:2
Apr 21 07:36:01 A2 kernel: disk 0, wo:1, o:0, dev:hda1
Apr 21 07:36:01 A2 kernel: disk 1, wo:0, o:1, dev:hdg1
Apr 21 07:36:01 A2 kernel: RAID1 conf printout:
Apr 21 07:36:01 A2 kernel: --- wd:1 rd:2
Apr 21 07:36:01 A2 kernel: disk 1, wo:0, o:1, dev:hdg1
Apr 21 07:36:01 A2 kernel: raid1: hdg1: redirecting sector 209715272 to
another mirror
Apr 21 07:36:21 A2 kernel: hdg: dma_timer_expiry: dma status == 0x20
Apr 21 07:36:21 A2 kernel: hdg: DMA timeout retry
Apr 21 07:36:21 A2 kernel: hdg: timeout waiting for DMA
Apr 21 07:36:21 A2 kernel: hdg: status error: status=0x58 { DriveReady
SeekComplete DataRequest }
Apr 21 07:36:21 A2 kernel:
Apr 21 07:36:21 A2 kernel: hdg: drive not ready for command
Apr 21 07:36:21 A2 kernel: raid1: hdg1: rescheduling sector 209715272
Apr 21 07:36:21 A2 kernel: raid1: hdg1: redirecting sector 209715272 to
another mirror
Apr 21 07:36:21 A2 kernel: hdg: status error: status=0x58 { DriveReady
SeekComplete DataRequest }
Apr 21 07:36:21 A2 kernel:
Apr 21 07:36:21 A2 kernel: hdg: drive not ready for command
Apr 21 07:36:41 A2 kernel: hdg: dma_timer_expiry: dma status == 0x20
Apr 21 07:36:41 A2 kernel: hdg: DMA timeout retry
Apr 21 07:36:41 A2 kernel: hdg: timeout waiting for DMA
Apr 21 07:36:41 A2 kernel: hdg: status error: status=0x58 { DriveReady
SeekComplete DataRequest }
Apr 21 07:36:41 A2 kernel:
Apr 21 07:36:41 A2 kernel: hdg: drive not ready for command
Apr 21 07:36:41 A2 kernel: raid1: hdg1: rescheduling sector 209715272
Apr 21 07:36:41 A2 kernel: raid1: hdg1: redirecting sector 209715272 to
another mirror
Apr 21 07:36:41 A2 kernel: hdg: status error: status=0x58 { DriveReady
SeekComplete DataRequest }
Apr 21 07:36:41 A2 kernel:
and then /dev/hdg1 immediately began to spew forth error messages of the
following sort
Apr 22 22:29:21 A2 kernel: hdg: status error: status=0x58 { DriveReady
SeekComplete DataRequest }
Apr 22 22:29:21 A2 kernel:
Apr 22 22:29:21 A2 kernel: hdg: drive not ready for command
Apr 22 22:29:21 A2 kernel: raid1: hdg1: rescheduling sector 209715272
Apr 22 22:29:21 A2 kernel: raid1: hdg1: redirecting sector 209715272 to
another mirror
Apr 22 22:29:21 A2 kernel: hdg: status error: status=0x58 { DriveReady
SeekComplete DataRequest }
Apr 22 22:29:21 A2 kernel:
Apr 22 22:29:21 A2 kernel: hdg: drive not ready for command
Apr 22 22:29:21 A2 kernel: raid1: hdg1: rescheduling sector 209715272
Apr 22 22:29:21 A2 kernel: raid1: hdg1: redirecting sector 209715272 to other
....
These errors continued nonstop all day/night until /var ran out of
space and errors filled the 6GB /var partition.
2.6GB of /var/log/kern.log and
2.6GB of /var/log/syslog and
1GB of /var/log/messages
were filled by these errors.
I then pulled the two drives out of the system,
put a pair of new drives in for /dev/hda1 and /dev/hdg1,
created /dev/md0 anew, and restored the data to my servers from backups.
I then took the two drives /dev/hda1 and /dev/hdg1 to another machine
and ran the Western Digital drive diagnostics on both of them and they
are both fine. No errors.
I then took /dev/hda1 on the new system and did
modprobe raid1
mknod /dev/md0 b 9 0
mdadm -A /dev/md0 /dev/hda1
and then mount /dev/md0 /mnt, and I see my data, which looks intact.
Similarly, if I do that with /dev/hdg1 I see the same data.
(Note: if I then try to do mdadm -A /dev/md0 /dev/hda1 /dev/hdc1
(where /dev/hdc1 was /dev/hdg1 on the other machine) then I get a message
saying, effectively, that they are not up to date with each other ...)
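For reference, that "not up to date" complaint presumably refers to the md
superblock event counters; a rough way to compare them is something like the
following (assuming mdadm 1.x syntax; the exact field names may differ):
  mdadm -E /dev/hda1 | grep -Ei 'events|update'
  mdadm -E /dev/hdc1 | grep -Ei 'events|update'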
Has anyone else had this trouble? Could someone explain what happened?
What should I have done when I found the errors when my system failed?
Is it safe for me to continue to use raid1?
Thanks,
Mitchell
* Re: apparent but not real raid1 failure. what happened? still confused. Gurus Please help...
From: Peter T. Breuer @ 2005-05-02 18:20 UTC (permalink / raw)
To: linux-raid
Mitchell Laks <mlaks@verizon.net> wrote:
> Initially, one raid failed:
> /dev/md0 between /dev/hda1 and
> /dev/hdg1 with the /dev/hdg1 on a highpoint rocket 133 controller.
> From reading the log files I see that initially /dev/hda1 died
Yes, but then so did hdg1.
Or at least they were slow replying, or perhaps spun down.
> Apr 21 07:36:01 A2 kernel: hda: dma_intr: status=0x51 { DriveReady
> SeekComplete Error }
> Apr 21 07:36:01 A2 kernel: hda: dma_intr: error=0x40 { UncorrectableError },
> LBAsect=209715335, high=12, low=8388743, sector=209715335
> Apr 21 07:36:01 A2 kernel: end_request: I/O error, dev hda, sector 209715335
Well, this seems to be a bona fide error on hda. It's about 100GB in.
Is that right? Looks like slow wakeup, or tracking problem.
Anyway, it might have been a read error. Those happen. I posted a patch
("robust raid") to stop the disk being faulted out of the array on
those errors, letting the other disk be tried instead.
> Apr 21 07:36:01 A2 kernel: disk 1, wo:0, o:1, dev:hdg1
> Apr 21 07:36:01 A2 kernel: RAID1 conf printout:
> Apr 21 07:36:01 A2 kernel: --- wd:1 rd:2
> Apr 21 07:36:01 A2 kernel: disk 1, wo:0, o:1, dev:hdg1
> Apr 21 07:36:01 A2 kernel: raid1: hdg1: redirecting sector 209715272 to
> another mirror
Oh, well after faulting out hda1, hdg1 got tried anyway.
> Apr 21 07:36:21 A2 kernel: hdg: dma_timer_expiry: dma status == 0x20
> Apr 21 07:36:21 A2 kernel: hdg: DMA timeout retry
> Apr 21 07:36:21 A2 kernel: hdg: timeout waiting for DMA
> Apr 21 07:36:21 A2 kernel: hdg: status error: status=0x58 { DriveReady
> SeekComplete DataRequest }
But it was not ready either. It looks like neither disk is especially
happy using that mode of dma. I'd play with hdparm!
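Roughly (a sketch only - the usable mode strings depend on the drive,
cable and controller, so check hdparm(8) and the -i output first):
  hdparm -i /dev/hdg              # shows the DMA/PIO modes the drive claims; '*' marks the active one
  hdparm -d1 -X udma2 /dev/hdg    # keep DMA enabled but drop to a slower UDMA mode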
> Apr 21 07:36:21 A2 kernel: hdg: drive not ready for command
> Apr 21 07:36:21 A2 kernel: raid1: hdg1: rescheduling sector 209715272
> Apr 21 07:36:21 A2 kernel: raid1: hdg1: redirecting sector 209715272 to
> another mirror
That's a silly (harmless) raid bug. There is no other mirror.
> and then /dev/hdg1 immediately began to spew forth error messages of the
> following sort
> Apr 22 22:29:21 A2 kernel: hdg: status error: status=0x58 { DriveReady
> SeekComplete DataRequest }
> Apr 22 22:29:21 A2 kernel:
> Apr 22 22:29:21 A2 kernel: hdg: drive not ready for command
> Apr 22 22:29:21 A2 kernel: raid1: hdg1: rescheduling sector 209715272
> Apr 22 22:29:21 A2 kernel: raid1: hdg1: redirecting sector 209715272 to
> another mirror
Well, maybe not so harmless. Same sector every time. It never learns.
> These errors continued nonstop all day/night until /var ran out of
> space and errors filled the 6GB /var partition.
Shrug. They were harmless otherwise.
> 2.6GB of /var/log/kern.log and
You need to practice better log control!
> 2.6GB of /var/log/syslog and
> 1GB of /var/log/messages
> were filled by these errors.
Unfortunate, but not terribly harmful.
> I then pulled the two drives out of the system and
There's nothing wrong with them as far as I can see! Just let them cool
down or use another dma mode, or rewrite the god*mn bad sector and let
them go on their merry way.
> put a pair of new drives in for /dev/hda1 and /dev/hdg1 and
Nooooooo.
> and created /dev/md0 anew, and restored the data to my servers from backups.
It is not likely that both disks are kaput. It's not even likely that
one disk was. It's likely that your controller or disk was not happy
with the dma mode, or simply that you had the immense bad luck to get a
read error from the same sector on both. That's not disastrous. It
sounds like it's something in common about your machine, since errors
are really unlikely on the same sector in two independent drives!
> I then took the two drives /dev/hda1 and /dev/hdg1 to another machine
> and ran the Western Digital drive diagnostics on both of them and they
> are both fine. No errors.
See!
> Has anyone else had this trouble? Could someone explain what happened?
Sure - you saw what happened. Read error. Leading to raid ejection.
Leading to admin panic.
> What should I have done when I found the errors when my system failed?
Nothing. Nothing was very wrong. Rewrite the failed sector with zeros,
at worst, and restart the raid array, and take the 512B loss like a man.
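Concretely, something like this would do it for the sector named in the log
above - a sketch only, and double-check the number first, since it knowingly
overwrites 512 bytes of the raw disk:
  # LBAsect=209715335 from the dma_intr error; bs=512 makes seek count in sectors
  dd if=/dev/zero of=/dev/hda bs=512 count=1 seek=209715335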
But I'd be looking hard at your dma settings! You can't leave the disks
in that setting - they error!
You might also be inspecting cables or anything else that occurs to you
to check. Heat?
But by all accounts (your report), nothing was really deeply wrong!
You needed to force the raid to restart. I think a mkraid --force would
have done the trick. I'm wondering whether to use
--dangerous-no-resync, because a rewrite would be nice in order to at
least warm up and fix one disk! But would it read the bad sector?
Maybe. Maybe not. That dma setting needs changing.
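With mdadm rather than the old raidtools, the forced restart would look
roughly like this (a sketch; --force trusts the freshest superblock, so only
do it once you're satisfied the disks themselves are healthy):
  mdadm --assemble --force /dev/md0 /dev/hda1 /dev/hdg1
  cat /proc/mdstat    # check both members came back and whether a resync kicked off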
Perhaps you also want to check for disk settings (if you can get at
them) like "error on read error", or "replace bad sector on read".
You might have a SMART control that can do that.
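With smartmontools, for instance (assuming the drive and driver expose
SMART, which these parallel-ATA WDs should):
  smartctl -H /dev/hda        # overall health verdict
  smartctl -t long /dev/hda   # start an offline surface scan; read the result later with smartctl -a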
> Is it safe for me to continue to use raid1?
Not if you continue to do that, unfortunately - most likely nothing was
wrong at all and the array merely needed restarting to correct its
overzealous action in ejecting the disk after a read error. I think
that's going a bit far, and one should tell it not to do that and get on
with life. But you rather panicked and took it all offline instead of
kicking its pants and telling it to calm down and be a good boy.
My "robost raid" patch purports to at least stop the disk being kicked
on a read error, but it's not clear that it would have helped here
because BOTH disks failed on that sector. It looks as though your
motherboard was a bit hot to the touch at that moment ...
Peter
* Is there a drive error "retry" parameter?
From: Carlos Knowlton @ 2005-06-02 15:23 UTC (permalink / raw)
To: linux-raid
I want to understand exactly what is going on in the Software RAID 5
code when a drive is marked "dirty", and booted from the array. Based
on what I've read so far, it seems that this happens any time the RAID
software runs into a read or write error that might have been corrected
by fsck (if it had been there first). Is this true?
Is there a "retry" parameter that can be set in the kernel parameters,
or else in the code itself to prolong the existence of a drive in an
array before it is considered dirty?
If so, I would like to increase it in my environment, because it seems
like I'm losing drives in my array that are often still quite stable.
Thanks!
Carlos Knowlton
* Re: Is there a drive error "retry" parameter?
From: Michael Tokarev @ 2005-06-02 17:16 UTC (permalink / raw)
To: cknowlton; +Cc: linux-raid
Carlos Knowlton wrote:
> I want to understand exactly what is going on in the Software RAID 5
> code when a drive is marked "dirty", and booted from the array. Based
> on what I've read so far, it seems that this happens any time the RAID
> software runs into a read or write error that might have been corrected
> by fsck (if it had been there first). Is this true?
You're mixing up 2 very different things here. Very different.
Fsck has nothing to do with raid, per se. Fsck checks the filesystem
which is on top of a block device (be it a raid array, a disk, or a
loopback device, whatever). It does not understand/know what is "raid",
at all. Speaking of raid, the filesystem is upper-level stuff. Again,
raid code knows nothing about filesystems or any data it stores. Also,
filesystem obviously does not know about underlying components of the
raid array where the filesystem resides -- so fsck can NOT "fix" whatever
error happened two layers down the stack (fs, raid, underlying devices).
From the other side, raid code ensures (or tries to, anyway) that any
errors in underlying (components) devices will not propagate to the
upper level (be it a filesystem, database or anything else - raid does
not care what data it stores). It is here to "hide" whatever errors
may happen on the physical device (disk drive). Currently, if enough
drives fail, the raid array will be "shut down" so that the upper level
(eg filesystem) can't even access the whole raid array. Until that
happens, there should be no errors propagated to the filesystem layer,
all such errors will be corrected by raid code, ensuring that it will
read the same data as has been written to it.
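A quick way to see that layering on a running box, for what it's worth:
  cat /proc/mdstat       # the md array and its component disks - the level fsck never sees
  mount | grep /dev/md   # the filesystem sitting on top of the md device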
> Is there a "retry" parameter that can be set in the kernel parameters,
> or else in the code itself to prolong the existence of a drive in an
> array before it is considered dirty?
There's no such parameter currently. But there were several discussions
about how to make raid code more robust - in particular, in case of
read error, raid code may keep the errored drive in the array and mark
it dirty only in case of write error.
> If so, I would like to increase it in my environment, because it seems
> like I'm losing drives in my array that are often still quite stable.
I think you have to provide some more information. Kernel logging tells
a lot of details about what exactly is happening and what the raid code is
doing as a result of that.
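For example, something along these lines usually pulls out the interesting
bits (the log file name depends on your syslog configuration):
  grep -E 'md:|raid[0-9]+:|end_request|dma_intr' /var/log/kern.log | tail -n 100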
Raid code is quite stable and is used in a lot of machines all over the
world. If you're experiencing such weird behaviour, I think it's due
to some other problem on your side, and the best would be to find and fix
the real error, not the symptom.
/mjt
* Re: Is there a drive error "retry" parameter?
From: danci @ 2005-06-03 9:21 UTC (permalink / raw)
To: linux-raid; +Cc: cknowlton
On Thu, 2 Jun 2005, Michael Tokarev wrote:
> Raid code is quite stable and is used in a lot of machines all over the
> world. If you're experiencing such weird behaviour, I think it's due
> to some other problem on your side, and the best would be to find and fix
> the real error, not the symptom.
Sometimes (not very often) a similar thing happens to my server - two of
the drives are marked faulty (seemingly at the same time).
Fortunately I have always been able to re-construct the array so my data
was intact (and there was no need for a week of backup restoration), but
it still is very annoying.
Each time after that happened I tested the failed disks with 'badblocks
-n', but there were no read/write errors. Unfortunately, libata doesn't
support SMART (in that version, at least).
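For the record, a full non-destructive pass plus re-adding the kicked member
looks roughly like this (device names are placeholders; don't run badblocks
on a disk that is mounted or still in the array):
  badblocks -nsv /dev/sda          # -n: non-destructive read-write test, -s/-v: progress and verbosity
  mdadm /dev/md0 --add /dev/sda1   # hot-add the partition back; md resyncs it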
To add some details, I'm using a Promise SATA150 TX4 controller and 4
Maxtor 6Y200M0 SATA drives for a RAID5 array (holding most of the data).
Additionally there is an on-board Adaptec AIC-7901A U320 with two IBM
IC35L073UWDY10-0 for a RAID1 array (holding operating system, homes, ...).
The kernel is 2.6.8-24.11-smp, the OS is SuSE Linux 9.2.
D.
* Re: Is there a drive error "retry" parameter?
From: Carlos Knowlton @ 2005-06-14 21:53 UTC (permalink / raw)
To: linux-raid
Hi Michael,
Michael Tokarev wrote:
>Carlos Knowlton wrote:
>
>
>>I want to understand exactly what is going on in the Software RAID 5
>>code when a drive is marked "dirty", and booted from the array. Based
>>on what I've read so far, it seems that this happens any time the RAID
>>software runs into a read or write error that might have been corrected
>>by fsck (if it had been there first). Is this true?
>>
>>
>
>You're mixing up 2 very different things here. Very different.
>
>Fsck has nothing to do with raid, per se. Fsck checks the filesystem
>which is on top of a block device (be it a raid array, a disk, or a
>loopback device, whatever). It does not understand/know what is "raid",
>at all. Speaking of raid, the filesystem is upper-level stuff. Again,
>raid code knows nothing about filesystems or any data it stores. Also,
>filesystem obviously does not know about underlying components of the
>raid array where the filesystem resides -- so fsck can NOT "fix" whatever
>error happened two layers down the stack (fs, raid, underlying devices).
>
>From the other side, raid code ensures (or tries to, anyway) that any
>errors in underlying (components) devices will not propagate to the
>upper level (be it a filesystem, database or anything else - raid does
>not care what data it stores). It is here to "hide" whatever errors
>may happen on the physical device (disk drive). Currently, if enough
>drives fail, the raid array will be "shut down" so that the upper level
>(eg filesystem) can't even access the whole raid array. Until that
>happens, there should be no errors propagated to the filesystem layer,
>all such errors will be corrected by raid code, ensuring that it will
>read the same data as has been written to it.
>
>
Thanks, that is good to know! I had read a discussion on this list a
few months ago from which I must have gotten the wrong impression:
<http://marc.theaimsgroup.com/?l=linux-raid&m=108852478803297&w=2>.
Maybe you can help me clarify some other misconceptions I have. For
instance, I had heard that with most modern hard disks, when they run
into a bad sector, they will map around that sector, and copy the data
to another place on the disk. Do you know if this is true? If so, how
does this impact RAID? (i.e., is RAID benefited by this, or does it
override it?)
>>Is there a "retry" parameter that can be set in the kernel parameters,
>>or else in the code itself to prolong the existence of a drive in an
>>array before it is considered dirty?
>>
>>
>
>There's no such parameter currently. But there were several discussions
>about how to make raid code more robust - in particular, in case of
>read error, raid code may keep the errored drive in the array and mark
>it dirty only in case of write error.
>
>
That would be nice. Do you know if anyone has done any work toward such
a fix?
>>If so, I would like to increase it in my environment, because it seems
>>like I'm losing drives in my array that are often still quite stable.
>>
>>
>
>I think you have to provide some more information. Kernel logging tells
>a lot of details about what exactly is happening and what the raid code is
>doing as a result of that.
>
>
Unfortunately, I don't have the logs handy, but I'll post something next
time I see it. I built several RAID servers for some customers over a
year ago, and they have reported drive failures. We have replaced these
and when we tested the old drives they were still in fairly good
condition. So for the last little while, I have just reinserted the
drive back into the array, and it usually doesn't cause any trouble
again (though occasionally a different drive will fail). If there is a
way to keep the drive in the array a little longer, when a read error
is detected, it would really help!
Thanks!
Carlos Knowlton
* Re: Is there a drive error "retry" parameter?
From: Michael Tokarev @ 2005-06-14 22:46 UTC (permalink / raw)
To: cknowlton; +Cc: linux-raid
Carlos Knowlton wrote:
> Hi Michael,
>
> Michael Tokarev wrote:
[]
> Maybe you can help me clarify some other misconceptions I have. For
> instance, I had heard that with most modern hard disks, when they run
> into a bad sector, they will map around that sector, and copy the data
> to another place on the disk. Do you know if this is true? If so, how
> does this impact RAID? (ie, Is RAID benefited by this, or does it
> override it?)
There are two types of *read* failure possible.
One is when a drive, on its own, probably after several retries,
is finally able to read that bad block. In that case, it will be
able to reallocate said block to some other place, without losing
any data.
And another type is when the drive can not recover whatever data
has been written to that block. In that case, the drive, obviously,
is unable to reallocate that block, at least automatically, without
losing some data, and it should NOT reallocate the block without
telling the operating system.
Unfortunately, the latter kind of failure occurs more often than
the former. And the drive - at least a fair one - has no other
choice but to return a "read error" to the OS in that case,
leaving the block alone in case the OS does not react
properly (whatever "properly" means) and tries to read the same
block again after a while.
So, to answer your first question, it's both yes and no. Yes,
most drives do have such an ability; in particular, all SCSI
drives I know support read-error relocation. And no, because
in most cases the drive can't recover the bad block anyway,
so there's nothing to reallocate.
Of course such reallocation - again, a fair one, where the drive will
not silently zero-fill the missing bits while relocating -- helps
both raid (software or hardware, it does not matter) and any other
subsystem which is using it (it does not really matter if it's
raid or something else), in a transparent way.
(For completeness: there's another reallocation feature supported
by most drives - write-error relocation, where a drive relocates
a bad block on *write* error, because it knows which data should be
there. A block that was unreadable may become good again after
re-write, either "just because", after refreshing its pieces,
it is now in a cleaner state, or because the write-error relocation
mechanism in the drive did its work. That's why re-writing
a drive with bad blocks often results in a good drive, and often
that good state persists; it's more or less normal for a drive
to develop one or two bad blocks during its lifetime and reallocate
them.)
>>> Is there a "retry" parameter that can be set in the kernel parameters,
>>> or else in the code itself to prolong the existence of a drive in an
>>> array before it is considered dirty?
>>
>> There's no such parameter currently. But there were several discussions
>> about how to make raid code more robust - in particular, in case of
>> read error, raid code may keep the errored drive in the array and mark
>> it dirty only in case of write error.
>>
> That would be nice. Do you know if anyone has done any work toward such
> a fix?
Looks like this is a "FAQ #1" candidate for linux softraid ;)
I tried to do just that myself, with help from Peter T. Breuer.
The code even worked here on a test machine for some time.
But it's umm.. quite a bit ugly, and Neil is going in a slightly
different direction (which I for one don't like much - the
persistent bitmaps stuff -- I think a simpler approach is better).
>>> If so, I would like to increase it in my environment, because it seems
>>> like I'm losing drives in my array that are often still quite stable.
>>
>> I think you have to provide some more information. Kernel logging tells
>> a lot of details about what exactly is happening and what the raid code is
>> doing as a result of that.
>>
> Unfortunately, I don't have the logs handy, but I'll post something next
> time I see it. I built several RAID servers for some customers over a
> year ago, and they have reported drive failures. We have replaced these
> and when we tested the old drives they were still in fairly good
> condition. So for the last little while, I have just reinserted the
> drive back into the array, and it usually doesn't cause any trouble
> again (though occasionally a different drive will fail). If there is a
> way to keep the drive in the array a little longer, when a read error
> is detected, it would really help!
If memory serves me right, you mentioned *several* drives going off
all at once. This is not a bad sector on one drive, it's something
else like bad cabling or power supplies, whatever.
Speaking of drives and bad sectors -- see above. On SCSI drives
there's a way to see all the relocations (scsiinfo utility for
example).
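For ATA drives, smartmontools gives a comparable view, assuming the drive
reports the usual attributes (parallel ATA at least; SMART through libata
was still patchy at the time):
  smartctl -A /dev/hda | grep -Ei 'realloc|pending|uncorrect'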
And yes indeed, it'd be nice to keep the drive in the array in case
of a read error, and only kick it off on write errors - a huge step in
the right direction.
/mjt
* Re: Is there a drive error "retry" parameter?
From: Carlos Knowlton @ 2005-06-15 21:40 UTC (permalink / raw)
To: Michael Tokarev; +Cc: linux-raid
Hello Michael,
Michael Tokarev wrote:
...
> (For completeness: there's another reallocation feature supported
> by most drives - write-error relocation, where a drive relocates
> a bad block on *write* error, because it knows which data should be
> there. A block that was unreadable may become good again after
> re-write, either "just because", after refreshing its pieces,
> it is now in a cleaner state, or because the write-error relocation
> mechanism in the drive did its work. That's why re-writing
> a drive with bad blocks often results in a good drive, and often
> that good state persists; it's more or less normal for a drive
> to develop one or two bad blocks during its lifetime and reallocate
> them.)
Thanks! This is useful info.
I did some googling on sector relocation, and it appears that SpinRite
6.0 (on their features page <http://www.grc.com/srfeatures.htm>)
claims to be able to turn off sector relocation, re-read and analyze
the "bad" sector in different ways until it can get a good read (or
deduce the correct data from the statistical outcome of multiple failed
reads), then turn relocation back on and map around the sector. Any
reason this couldn't be done in the block device driver (or some other,
more appropriate layer)? It seems that this kind of transparent data
recovery would be a real plus! Do you know if any thought has gone into
this kind of thing?
>
>>>> Is there a "retry" parameter that can be set in the kernel parameters,
>>>> or else in the code itself to prolong the existence of a drive in an
>>>> array before it is considered dirty?
>>>
>>>
>>> There's no such parameter currently. But there were several discussions
>>> about how to make raid code more robust - in particular, in case of
>>> read error, raid code may keep the errored drive in the array and mark
>>> it dirty only in case of write error.
>>>
>> That would be nice. Do you know if anyone has done any work toward
>> such a fix?
>
>
> Looks like this is a "FAQ #1" candidate for linux softraid ;)
> I tried to do just that myself, with help from Peter T. Breuer.
> The code even worked here on a test machine for some time.
> But it's umm.. quite a bit ugly, and Neil is going in a slightly
> different direction (which I for one don't like much - the
> persistent bitmaps stuff -- I think a simpler approach is better).
Is that the journal stuff mentioned here
<http://lwn.net/2002/0523/a/jbd-md.php3> between Neil and Stephen
Tweedie? What is the status of it? (a complex approach to a solution
is better than nothing, as long as it solves the problem, right?)
> If memory serves me right, you mentioned *several* drives goes off
> all at once. This is not a bad sector on one drive, it's something
> else like bad cabling or power supplies, whatever.
I've looked into cable and power issues, and if they are the culprit,
the problem is terribly intermittent, and my setup is generally within
spec (although on some servers we have mounted two drives on a 40-pin
ATA cable, we've rarely seen two drives fail that shared a
cable). After a reboot, the drives that had these errors are happily
restored back into the array as if nothing happened. If these are issues
with a standard setup, this is all the more reason to want RAID to be a
little bit more lenient on the isolated read error.
I've been looking into the IDE code to see if I can get it to give me a
few more read retries before declaring a read error. The "ERROR_MAX"
variable in ".../linux-x.x.x/include/linux/ide.h" looks like it might
afford me some extra time. Is there a better place to find this kind of
relief?
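For anyone else looking, the define is easy to locate in an unpacked tree
(path here assumes the kernel source lives under /usr/src/linux):
  grep -rn 'ERROR_MAX' /usr/src/linux/include/linux/ide.h /usr/src/linux/drivers/ide/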
> Speaking of drives and bad sectors -- see above. On SCSI drives
> there's a way to see all the relocations (scsiinfo utility for
> example).
Is there anything similar to this for S-ATA, or P-ATA drives?
> And yes indeed, it'd be nice to keep the drive in the array in case
> of read error, and only kick it off on write errors - huge step in
> the right direction.
I appreciate your effort toward this end. Thanks again for your help!
Regards,
Carlos
* Re: Is there a drive error "retry" parameter?
From: Paul Clements @ 2005-06-16 0:20 UTC (permalink / raw)
To: Michael Tokarev; +Cc: cknowlton, linux-raid
Michael Tokarev wrote:
> Carlos Knowlton wrote:
>>>> Is there a "retry" parameter that can be set in the kernel parameters,
>>>> or else in the code itself to prolong the existence of a drive in an
>>>> array before it is considered dirty?
>>>
>>>
>>> There's no such parameter currently. But there were several discussions
>>> about how to make raid code more robust - in particular, in case of
>>> read error, raid code may keep the errored drive in the array and mark
>>> it dirty only in case of write error.
>>>
>> That would be nice. Do you know if anyone has done any work toward
>> such a fix?
>
>
> Looks like this is a "FAQ #1" candidate for linux softraid ;)
> I tried to do just that myself, with help from Peter T. Breuer.
> The code even worked here on a test machine for some time.
> But it's umm.. quite a bit ugly, and Neil is going in a slightly
> different direction (which I for one don't like much - the
> persistent bitmaps stuff -- I think a simpler approach is better).
The persistent bitmap code has got nothing to do with read/write error
correction. The bitmap simply keeps track of what's out of sync between
the component drives, so you never need a full resync. On the other
hand, read/write error correction tries to limit the conditions under
which a drive would be kicked out of an array (thus resulting in a
resync). Ultimately, I think we'd like to see both capabilities in md,
though...
--
Paul
* Re: Is there a drive error "retry" parameter?
From: Michael Tokarev @ 2005-06-16 16:23 UTC (permalink / raw)
To: linux-raid
Paul Clements wrote:
> Michael Tokarev wrote:
[]
>>>> There's no such parameter currently. But there were several discussions
>>>> about how to make raid code more robust - in particular, in case of
>>>> read error, raid code may keep the errored drive in the array and mark
>>>> it dirty only in case of write error.
>>>>
>>> That would be nice. Do you know if anyone has done any work toward
>>> such a fix?
>>
>> Looks like this is a "FAQ #1" candidate for linux softraid ;)
>> I tried to do just that myself, with help from Peter T. Breuer.
>> The code even worked here on a test machine for some time.
>> But it's umm.. quite a bit ugly, and Neil is going in a slightly
>> different direction (which I for one don't like much - the
>> persistent bitmaps stuff -- I think a simpler approach is better).
>
> The persistent bitmap code has got nothing to do with read/write error
> correction. The bitmap simply keeps track of what's out of sync between
> the component drives, so you never need a full resync. On the other
> hand, read/write error correction tries to limit the conditions under
> which a drive would be kicked out of an array (thus resulting in a
> resync). Ultimately, I think we'd like to see both capabilities in md,
> though...
The two features are sorta independent of each other, but if
I understand Neil correctly, he wants to implement "robust
raid" (not kicking a drive on the first error etc) "on top" of
the bitmap code (which somehow makes sense, of course).
/mjt