Thanks for software RAID!

From: Robin Whittle @ 2003-06-26 12:31 UTC (permalink / raw)
  To: Linux RAID

A few days ago one of the two IBM 60 GXP drives (20 gig) in my RH 7.2
server failed.  Two sectors were unreadable, generating these lines in
/var/log/messages, all within the same second:

 hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
 hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=32778538,
      sector=12810992
 end_request: I/O error, dev 03:09 (hda), sector 12810992
 raid1: Disk failure on hda9, disabling device.
 ^IOperation continuing on 1 devices
 raid1: hda9: rescheduling block 12810992
 md: updating md5 RAID superblock on device
 md: hdc9 [events: 000000c9]<6>(write) hdc9's sb offset: 10080384
 md: recovery thread got woken up ...
 md5: no spare disk to reconstruct array! -- continuing in degraded mode
 md: recovery thread finished ...
 hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
 hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=32778538,
      sector=12811000
 end_request: I/O error, dev 03:09 (hda), sector 12811000
 raid1: hda9: rescheduling block 12811000
 md: (skipping faulty hda9 )
 raid1: hdc9: redirecting sector 12810992 to another mirror
 raid1: hdc9: redirecting sector 12811000 to another mirror

An hourly cron job uses "cat /proc/mdstat" to look for trouble and
emails me if it finds any.  There are no doubt other ways of doing this
which are faster and more direct.
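
For anyone wanting to set up a similar check, here is a minimal sketch in
shell.  It is not my actual cron script.  It assumes the usual
/proc/mdstat layout, where each array starts with a line like "md5 :
active raid1 ..." and is followed by a status line ending in something
like "[2/1] [_U]", with "_" marking a missing member.  The function
names, the default recipient "root" and the use of mail(1) are all
assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: assumes the common /proc/mdstat layout described above.

# Print the name of every md array whose status map (e.g. [_U]) shows
# at least one missing member.  Reads /proc/mdstat unless a file is given.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                 # remember the current array
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}

# Hypothetical wrapper: mail the list to a recipient (default root)
# only when at least one array is degraded.
report_degraded() {
    bad=$(degraded_arrays "$1")
    [ -z "$bad" ] && return 0
    printf 'Degraded md arrays:\n%s\n' "$bad" |
        mail -s "RAID degraded on $(hostname)" "${2:-root}"
}
```

Adding a call such as "report_degraded /proc/mdstat root" as the last
line of the script and running it from an hourly cron entry would
approximate the setup described above.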

The computer kept running like a charm, and the next day I replaced the
two 20 gig IBM drives with 40 gig Seagate Barracuda IV drives.  After
booting single user, I used "cat /dev/hda > /dev/hdc" to clone each old
drive byte-for-byte onto the first half of a new drive.  (This is
possible since both drives have the same number of heads and sectors as
far as Linux is concerned.  I could have used the second half of the 40
gig drives for another partition, but I don't need it.)
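
A variation on the cloning step, as a sketch: this uses dd rather than
the cat command I actually ran, and adds a comparison pass that I did not
do at the time.  The function name clone_prefix is made up for
illustration, and "cmp -n" assumes GNU cmp.  Everything at the start of
the destination is overwritten, so the argument order matters.

```shell
#!/bin/sh
# Sketch only: copy SRC over the start of DST, then confirm that the
# copied region matches.  DST must be at least as large as SRC.
clone_prefix() {
    src=$1
    dst=$2
    # conv=notrunc only matters when DST is a regular file (for example
    # a dry run on disk images); a block device is never truncated.
    dd if="$src" of="$dst" bs=64k conv=notrunc 2>/dev/null || return 1
    # Size of the source in bytes.  wc reads the whole source, which is
    # slow on a real disk but keeps the sketch portable.
    size=$(($(wc -c < "$src")))
    # Compare only the first $size bytes, since DST may be larger.
    cmp -n "$size" "$src" "$dst"
}
```

On real hardware the call would look like "clone_prefix /dev/hda
/dev/hdc", run from single-user mode with nothing mounted from either
drive.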

Next I recreated the md5 device, which was the one with the failed
partition, and created a file system on it (I first had to temporarily
delete the md5 section of /etc/raidtab and reboot; probably there is a
better way):

  mkraid /dev/md5     (It took a while to synch the drives.)
  mkfs -j /dev/md5

I was nearly ready to roll.  I copied the data from the good 20 gig
drive by mounting its raw partition (not as part of a RAID device), and
then the system was ready to run.

Software RAID-1 worked perfectly - the computer kept running and no data
was lost.  There was no extra hardware and so no extra cost and no extra
sources of unreliability.

Thanks for Software RAID!

  Cheers

    - Robin
