From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Dmytriw - NetCetera
Subject: I have managed to pickle my RAID 1 install after a disk crash
Date: Sun, 19 Sep 2004 08:08:10 -0600
Sender: linux-raid-owner@vger.kernel.org
Message-ID: <414D92CA.2060903@netcetera-solutions.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
To: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Hi,

I recently had the misfortune of a disk failure. Luckily the disk was part of a RAID 1 setup, so nothing was lost - yet... I should mention that I am mirroring /boot, swap and /, and am using mdadm.

I had thought that I had set up grub correctly to allow booting off of either disk, but did not test it - my bad. So when I replaced the drive - hda - I thought that the system would boot off of hdd and then go through the process of rebuilding the array with the new drive. But the system would not boot. I tried various things, BIOS settings, etc, but the Grub splash screen would not appear when I tried to boot off of hdd.

I swapped cables - and drive jumpers - so that my previous hdd was now hda, and then successfully re-booted the system. So far so good. Not sure why the disk would boot as hda and not hdd - maybe a BIOS issue with my motherboard, even though it does allow specifying IDE 0-4 as boot devices.

So I had the system up and running - in a degraded RAID state - and started working on bringing the RAID 1 scenario back. I partitioned the replacement drive, now hdd, and all looked well. It didn't look like I could simply add the drive to the array, as cat /proc/mdstat implied to me that the first disk in the array had failed, and I was worried about copying the contents of the second drive - which mdadm thought was good - over the drive that actually had the good stuff on it. I tried various other things with mdadm, like stopping and re-creating the raid devices, etc, but with no success - probably user error.

So now I am not sure how to proceed. cat /proc/mdstat yields this:

Code:
lucky root # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [multipath]
read_ahead 1024 sectors
md3 : active raid1 ide/host0/bus0/target0/lun0/part3[1]
      12377984 blocks [2/1] [_U]

md1 : active raid1 ide/host0/bus1/target1/lun0/part1[1]
      64128 blocks [2/1] [_U]

md2 : active raid1 ide/host0/bus1/target1/lun0/part2[1]
      248896 blocks [2/1] [_U]

unused devices:

Which implies to me that things are very messed up. I think this because of the following snippets:

Code:
md3 : active raid1 ide/host0/bus0/target0/lun0/part3[1]
      12377984 blocks [2/1] [_U]

and

Code:
md1 : active raid1 ide/host0/bus1/target1/lun0/part1[1]
      64128 blocks [2/1] [_U]

Different busses and targets - so different disks are active....

I spent some time searching around and came to the conclusion that my RAID config is definitely borked. I am thinking that the best thing for me to do now is to deactivate RAID completely, then come back and do a complete RAID re-config with my disks the way they are. But I can't find a way to stop/delete the meta devices so that I can start from scratch. I am running on my /dev/hdaX config with no /dev/mdX devices mounted.

Any thoughts?

Thanx.

--
Dave Dmytriw
Principal, NetCetera Solutions Inc.
Calgary, AB
403-703-1399
daved@netcetera-solutions.com
http://www.netcetera-solutions.com
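
P.S. In case it helps to see where my head is at, here is roughly what I was thinking of running to tear the arrays down and rebuild them, pieced together from the mdadm man page. I have NOT run any of this yet, the partition numbers are just my guess at how md1/md2/md3 map onto the hda/hdd partitions, and I realize the --create step would resync one disk over the other - which is exactly the part that worries me - so please tell me if this is the wrong approach:

Code:
# stop the degraded arrays (no /dev/mdX is mounted at the moment)
mdadm --stop /dev/md1
mdadm --stop /dev/md2
mdadm --stop /dev/md3

# wipe the old RAID superblocks on both disks
# (partition numbers are my guess; devfs names may differ on my box)
mdadm --zero-superblock /dev/hda1 /dev/hda2 /dev/hda3
mdadm --zero-superblock /dev/hdd1 /dev/hdd2 /dev/hdd3

# then re-create each mirror with the disks the way they sit now,
# e.g. for /boot on md1:
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/hda1 /dev/hdd1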