From: Johan Schön
Subject: Re: Disks keep disappearing
Date: Sat, 10 May 2003 18:47:40 +0200
Message-ID: <3EBD2D2C.1050900@visiarc.com>
To: "Peter L. Ashford"
Cc: linux-raid@vger.kernel.org

Peter L. Ashford wrote:
> WD has had problems similar to this with many of their drives. It just
> decides to 'go away'. There is a fix available on their web site for the
> 180GB and 200GB drives (and a better description of the problem), but the
> problem is NOT limited to those drives.

How do these problems appear in the log files?

I have a machine with two Promise Ultra100 TX2 cards and five WD2000JB
200 GB drives in RAID-5. In a month, I've had a few disk "failures"
that typically look like this in the logs:

|hdg: dma_intr: status=0x63 { DriveReady DeviceFault Index Error }
|hdg: dma_intr: error=0x04 { DriveStatusError }
|hdg: DMA disabled
|hdh: DMA disabled
|PDC202XX: Secondary channel reset.
|ide3: reset: success
|hdg: irq timeout: status=0xd2 { Busy }
|
|PDC202XX: Secondary channel reset.
|ide3: reset: success
|hdg: irq timeout: status=0xd2 { Busy }
|
|end_request: I/O error, dev 22:00 (hdg), sector 280277504
|raid5: Disk failure on hdg, disabling device. Operation continuing on 4 devices
|hdg: status timeout: status=0xd2 { Busy }
|
|PDC202XX: Secondary channel reset.
|hdg: drive not ready for command
|md: updating md0 RAID superblock on device
|md: hdh [events: 00000007]<6>(write) hdh's sb offset: 195360896
|md: recovery thread got woken up ...
|md0: no spare disk to reconstruct array! -- continuing in degraded mode
|ide3: reset: success
|md: (skipping faulty hdg )
|md: hdf [events: 00000007]<6>(write) hdf's sb offset: 195360896
|md: hde [events: 00000007]<6>(write) hde's sb offset: 195360896
|md: hdb [events: 00000007]<6>(write) hdb's sb offset: 195360896
|hdg: irq timeout: status=0xd2 { Busy }

The disk itself doesn't appear to know about any failure (according to
smartctl), and it works fine again once it is hot-added back into the
raidset.

I've also had a multiple-drive "failure" twice, both times involving two
drives on the same IDE channel.

I'm not sure whether these problems are caused by buggy Promise ATA
drivers in my kernel (RH9, 2.4.20) or by the WDC problem with the
180/200 GB drives. From WDC's description of the problem, I got the
impression that it only happened when the drives were connected to
hardware RAID cards such as 3Ware IDE RAID controllers.

Can anyone advise?

// Johan

-- 
Johan Schön                          www.visiarc.com
VISIARC AB                           Cell: +46-708-343002
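
P.S. For reference, the check-and-recover sequence I go through after one
of these "failures" is roughly the following (raidhotadd shown here;
mdadm's "mdadm /dev/md0 -a /dev/hdg" does the same job, and the device
name obviously depends on which drive dropped out):

  # look at the drive's own SMART attributes and error log
  smartctl -a /dev/hdg

  # put the drive back into the array; md starts reconstruction
  raidhotadd /dev/md0 /dev/hdg

  # watch the resync progress
  cat /proc/mdstat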