replacing failed hard drives in RAID 5 configuration

From: Saurabh Barve @ 2004-10-11 21:33 UTC (permalink / raw)
  To: linux-raid

Hi,

I am running a server that has four 250 GB hard drives in a RAID 5
configuration. Recently, two of the hard drives failed. I copied the
data bit for bit from one of the failed hard drives (/dev/hdc1) to a
new drive (/dev/hdd1) using dd_rescue
(http://www.garloff.de/kurt/linux/ddrescue/). The failed hard drive had
about 300 bad blocks (I checked using the badblocks utility). Because
two of the hard drives failed, the array (/dev/md0) wouldn't start.

I tried to add the new hard drive (/dev/hdd1) to the RAID using mdadm. I 
kept the failed hard drive (/dev/hdc1) in the machine. The other two 
functional hard drives are /dev/hdg1 and /dev/hdh1. Initially I tried 
starting the array with 'raidstart'. When I did this, I got the 
following error messages in /var/log/messages:

Oct 11 14:41:15 server-name kernel: md: invalid raid superblock magic on 
hdd1
Oct 11 14:41:15 server-name kernel: md: hdd1 has invalid sb, not importing!
Oct 11 14:41:15 server-name kernel: md: could not import hdd1, trying to 
run array nevertheless.
Oct 11 14:41:15 server-name kernel:  [events: 00000017]
Oct 11 14:41:15 server-name kernel:  [events: 00000017]
Oct 11 14:41:15 server-name kernel: md: autorun ...
Oct 11 14:41:15 server-name kernel: md: considering hdh1 ...
Oct 11 14:41:15 server-name kernel: md:  adding hdh1 ...
Oct 11 14:41:15 server-name kernel: md:  adding hdg1 ...
Oct 11 14:41:15 server-name kernel: md:  adding hdc1 ...
Oct 11 14:41:15 server-name kernel: md: created md0
Oct 11 14:41:15 server-name kernel: md: bind<hdc1,1>
Oct 11 14:41:15 server-name kernel: md: bind<hdg1,2>
Oct 11 14:41:15 server-name kernel: md: bind<hdh1,3>
Oct 11 14:41:15 server-name kernel: md: running: <hdh1><hdg1><hdc1>
Oct 11 14:41:15 server-name kernel: md: hdh1's event counter: 00000017
Oct 11 14:41:15 server-name kernel: md: hdg1's event counter: 00000017
Oct 11 14:41:15 server-name kernel: md: hdc1's event counter: 0000000f
Oct 11 14:41:15 server-name kernel: md: superblock update time 
inconsistency -- using the most recent one
Oct 11 14:41:15 server-name kernel: md: freshest: hdh1
Oct 11 14:41:15 server-name kernel: md: kicking non-fresh hdc1 from array!
Oct 11 14:41:15 server-name kernel: md: unbind<hdc1,2>
Oct 11 14:41:15 server-name kernel: md: export_rdev(hdc1)
Oct 11 14:41:15 server-name kernel: md0: removing former faulty hdd1!
Oct 11 14:41:15 server-name kernel: md0: max total readahead window set 
to 768k
Oct 11 14:41:15 server-name kernel: md0: 3 data-disks, max readahead per 
data-disk: 256k
Oct 11 14:41:15 server-name kernel: raid5: device hdh1 operational as 
raid disk 3
Oct 11 14:41:15 server-name kernel: raid5: device hdg1 operational as 
raid disk 2
Oct 11 14:41:15 server-name kernel: raid5: not enough operational 
devices for md0 (2/4 failed)
Oct 11 14:41:15 server-name kernel: RAID5 conf printout:
Oct 11 14:41:15 server-name kernel:  --- rd:4 wd:2 fd:2
Oct 11 14:41:15 server-name kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 
dev:[dev 00:00]
Oct 11 14:41:15 server-name kernel:  disk 1, s:0, o:0, n:1 rd:1 us:1 
dev:[dev 00:00]
Oct 11 14:41:15 server-name kernel:  disk 2, s:0, o:1, n:2 rd:2 us:1 
dev:hdg1
Oct 11 14:41:15 server-name kernel:  disk 3, s:0, o:1, n:3 rd:3 us:1 
dev:hdh1
Oct 11 14:41:15 server-name kernel: raid5: failed to run raid set md0
Oct 11 14:41:15 server-name kernel: md: pers->run() failed ...
Oct 11 14:41:15 server-name kernel: md :do_md_run() returned -22
Oct 11 14:41:15 server-name kernel: md: md0 stopped.
Oct 11 14:41:15 server-name kernel: md: unbind<hdh1,1>
Oct 11 14:41:15 server-name kernel: md: export_rdev(hdh1)
Oct 11 14:41:15 server-name kernel: md: unbind<hdg1,0>
Oct 11 14:41:15 server-name kernel: md: export_rdev(hdg1)
Oct 11 14:41:15 server-name kernel: md: ... autorun DONE.
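
For anyone reading the log above: the "kicking non-fresh hdc1" line
follows from the event counters, where hdc1 reports a lower count than
hdg1 and hdh1. Below is a minimal Python sketch of that comparison. It
assumes the counters are hexadecimal (as the value 0000000f suggests),
and the helper names event_counters and stale_members are hypothetical,
made up for this illustration and not part of md or mdadm.

```python
import re

# Excerpt of the kernel messages quoted above (timestamps dropped).
LOG = """\
md: hdh1's event counter: 00000017
md: hdg1's event counter: 00000017
md: hdc1's event counter: 0000000f
"""

# Assumption: the counter is printed as a hexadecimal number.
EVENT_RE = re.compile(r"md: (\w+)'s event counter: ([0-9a-fA-F]+)")


def event_counters(log_text):
    """Return {device: event count} parsed from md kernel messages."""
    return {dev: int(count, 16) for dev, count in EVENT_RE.findall(log_text)}


def stale_members(counters):
    """Return devices whose event count is below the highest one seen."""
    freshest = max(counters.values())
    return sorted(dev for dev, count in counters.items() if count < freshest)


counters = event_counters(LOG)
print(counters)                 # parsed counts per device
print(stale_members(counters))  # devices that are behind the freshest one
```

With the three lines above, only hdc1 comes out as stale, which matches
the device the kernel refused to use.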

I also tried to run the array using mdadm: 'mdadm --assemble --scan 
/dev/md0 /dev/hdc1 /dev/hdd1 /dev/hdg1 /dev/hdh1'. However, doing this 
gave me a "Segmentation Fault" error.

Can anybody help me replace the old hard drive (/dev/hdc1) with the new 
hard drive (/dev/hdd1) that has data copied off of the old drive?

Thanks,
Saurabh Barve.
