From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brad Campbell
Subject: Solved : Re: Time to ask for help. Raid-5 Dual drive failure
Date: Wed, 05 Nov 2008 12:50:22 +0400
Message-ID: <49115E4E.7000002@wasp.net.au>
References: <4910BC0E.9000307@wasp.net.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <4910BC0E.9000307@wasp.net.au>
Sender: linux-raid-owner@vger.kernel.org
To: RAID Linux
List-Id: linux-raid.ids

Brad Campbell wrote:
> Ok, so it finally died.
>
> I was doing a large copy to an ext3 filesystem on md0 when one drive
> dropped out (SATA error). 3 minutes later a second drive dropped out
> (SATA error).
>
> I've tried to re-assemble the array with
> mdadm --assemble --force /dev/md0 but it errors out with
>
> mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
>

So I re-read my archives of the linux-raid list, consulted Google and
decided I had enough information available to be able to re-create the
array.

I figured looking at the output from --examine on the first drive to die
would give me a good indication of what the array *should* look like.

/dev/sdj1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 05cc3f43:de1ecfa4:83a51293:78015f1e
  Creation Time : Sun May 2 18:02:14 2004
     Raid Level : raid5
  Used Dev Size : 244198400 (232.89 GiB 250.06 GB)
     Array Size : 2197785600 (2095.97 GiB 2250.53 GB)
   Raid Devices : 10
  Total Devices : 10
Preferred Minor : 0

    Update Time : Tue Nov 4 22:23:33 2008
          State : active
 Active Devices : 10
Working Devices : 10
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 210701c1 - correct
         Events : 0.1338267

         Layout : left-asymmetric
     Chunk Size : 128K

      Number   Major   Minor   RaidDevice State
this     0       8      145        0      active sync   /dev/sdj1

   0     0       8      145        0      active sync   /dev/sdj1
   1     1       8      161        1      active sync   /dev/sdk1
   2     2       8      176        2      active sync   /dev/sdl
   3     3       8      193        3      active sync   /dev/sdm1
   4     4       8      225        4      active sync   /dev/sdo1
   5     5       8      209        5      active sync   /dev/sdn1
   6     6       8      113        6      active sync   /dev/sdh1
   7     7       8      129        7      active sync   /dev/sdi1
   8     8       8       81        8      active sync   /dev/sdf1
   9     9       8       96        9      active sync   /dev/sdg

I supposed the most important thing was the order of the disks, so I
tried this magic incantation:

mdadm --create /dev/md0 --assume-clean --level 5 --raid-devices=10 missing /dev/sdk1 /dev/sdl /dev/sdm1 /dev/sdo1 /dev/sdn1 /dev/sdh1 /dev/sdi1 /dev/sdf1 /dev/sdg

That failed: the filesystem superblock could not be located at all.

Then I wondered if perhaps it was defaulting to a different chunk size
(I never thought to check with --examine on one of the newly created
components). The second time I added --chunk 128, and e2fsck found a
superblock, but it was very mangled.

The third time I did an --examine on one of the newly created components
and noticed that the new array had defaulted to left-symmetric, so I
added --layout left-asymmetric and it all came back up.

mdadm --create /dev/md0 --assume-clean --level 5 --chunk 128 --layout left-asymmetric --raid-devices=10 missing /dev/sdk1 /dev/sdl /dev/sdm1 /dev/sdo1 /dev/sdn1 /dev/sdh1 /dev/sdi1 /dev/sdf1 /dev/sdg

For those following along at home, double check everything!
Don't _ever_ try to see if it's right by mounting the array. Use fsck -n,
which does a read-only check of the filesystem and will not try to write
anything. A mount will try to replay the journal.

Regards,
Brad
-- 
Dolphins are so intelligent that within a few weeks they can train
Americans to stand at the edge of the pool and throw them fish.