From mboxrd@z Thu Jan 1 00:00:00 1970
From: "T. Ermlich"
Subject: Re: Broken harddisk
Date: Sat, 29 Jan 2005 17:47:23 +0100
Message-ID: <41FBBE1B.7060406@gmx.net>
References: <41FAD73F.1070504@gmx.net> <41FBAD0B.2080408@gmx.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: linux-raid-owner@vger.kernel.org
To: Gordon Henderson
Cc: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Hi again,

well, thanks to those really handy hints I subscribed to the list ... ;)

Gordon Henderson scribbled on 29.01.2005 16:56:
> On Sat, 29 Jan 2005, T. Ermlich wrote:
>
>> That's right: each harddisk is partitioned absolutely identically, like:
>>      0 - 19456 - /dev/sda1 - extended partition
>>      1 -  6528 - /dev/sda5 - /dev/md0
>>   6529 -  9138 - /dev/sda6 - /dev/md1
>>   9139 - 16970 - /dev/sda7 - /dev/md2
>>  16971 - 19456 - /dev/sda8 - /dev/md3
>> And after doing those partitionings I 'combined' them to act as raid1.
>
>> I have two additional IDE drives in that system.
>> /dev/hda contains some data, and is the boot drive; /dev/hdb contains
>> some less important data.
>
> Just as a point of note - if the boot disk goes down it will be harder to
> recover the data... Consider making the boot disk mirrored too!

Yeah .. I thought about that in the past ... and decided to buy a 3Ware
controller (9500S-4LP) for those things in ~2-3 months (as I don't have
the money yet). Currently I'm using the onboard SATA controller
(Asus A7V8X with a Promise controller).

>>> mdadm --add /dev/md0 /dev/sda1
>>> mdadm --add /dev/md1 /dev/sda2
>>> mdadm --add /dev/md2 /dev/sda3
>>> mdadm --add /dev/md3 /dev/sda4
>>
>> Now some new trouble starts ...?
>> 'mdadm --add /dev/md0 /dev/sda1' started just fine - but exactly at 50%
>> it started giving tons of errors, like:
>
> You should be using:
>
> mdadm --add /dev/md0 /dev/sda5

Yes, I did - I just made a mistake when writing the command above.

>> [quote]
>> Jan 29 16:10:24 suse92 kernel: Additional sense: Unrecovered read error
>> - auto reallocate failed
>> Jan 29 16:10:24 suse92 kernel: end_request: I/O error, dev sdb, sector
>> 52460420
>
> That is a read error from /dev/sdb. What it's saying is that sdb has bad
> sectors which can't be recovered.
>
> You have 2 bad drives in a RAID-1 - and that's really bad )-:

All I have ... better than nothing ... it will be improved in the future ;)

>> Personalities : [raid1]
>> md3 : active raid1 sdb8[1]
>>       19960640 blocks [2/1] [_U]
>>
>> md2 : active raid1 sdb7[1]
>>       62910400 blocks [2/1] [_U]
>>
>> md1 : active raid1 sdb6[1]
>>       20964672 blocks [2/1] [_U]
>>
>> md0 : active raid1 sdb5[1] sda5[2]
>>       52436032 blocks [2/1] [_U]
>>       [==========>..........] recovery = 50.0% (26230016/52436032)
>> finish=121.7min speed=1050K/sec
>> unused devices: <none>
>> [/quote]
>>
>> Can I stop that process for /dev/md0, and start with /dev/md1 (just to
>> compare whether it's a problem with that partition only, or a general
>> problem, so that e.g. the second drive has problems, too)?
>
> Yes - just fail & remove the drive partition:
>
> mdadm --fail /dev/md0 /dev/sda5
> mdadm --remove /dev/md0 /dev/sda5
>
> At this point, I'd run a badblocks on the other partitions before doing
> the resync:
>
> badblocks -s -c 256 /dev/sdb6
> badblocks -s -c 256 /dev/sdb7
> badblocks -s -c 256 /dev/sdb8
>
> If these pass, you can do the hot-add, however, it looks like the sdb
> disk is also faulty.
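
Ok - I'll fail/remove sda5 first and then let badblocks run over the sdb
partitions. I'll probably wrap it in a small loop so the bad-sector lists
end up in files I can keep (that's just my own idea of using badblocks'
-o option, so please correct me if I got that wrong):

  # my own addition: -o writes the bad-sector list to a file per partition
  for p in sdb6 sdb7 sdb8; do
      badblocks -s -c 256 -o /root/badblocks-$p.log /dev/$p
  done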
>
> At this point, I'd be looking to replace both disks and restore from
> backup, but if you can re-sync the other 3 partitions, then remove the
> also-faulty sdb and replace it with a new one, you can re-sync the 3
> good partitions and only have to restore the '5' partition (md0) from
> backup.
>
> You could try mkfs'ing the new partition sda5, mounting it, and copying
> the data on md0 over to it - there's a chance the bad sectors on sdb lie
> outside the filing system... This would save you having to restore from
> backup; however, it then becomes trickier as you then have to re-create
> the raid set on a new disk with a missing drive, and copy it again.

Ok, I'll do that.
I attached an older 80GB harddisk (/dev/hdc), and right now I'm copying
the content of /dev/md0 there, using 'cp -a'.
Once that's finished I'll start checking for bad blocks ... and I guess
the backups I made in the past might be full of damaged data ... :-(

Should I delete /dev/md0 completely after the copy process has finished?
Or just check for bad blocks and continue using it?

>> btw: does mdadm also format the partitions?
>
> No... You don't need to format/mkfs the partitions, as the raid resync
> will take care of making it a mirror of the existing working disk.

Ah .. ok. :-)

> Gordon

Thanks a lot!!
Torsten
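
P.S.: Just to make sure I understood the 'missing drive' part correctly:
once the copy to /dev/hdc is safe and a replacement disk is in, I'd
re-create the degraded mirror and copy the data back, roughly like this
(all device, filesystem and mount point names below are only placeholders
I made up - I'll adjust them to the real setup):

  # placeholders only - adjust devices, filesystem and mount points
  mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb5
  mkfs.ext3 /dev/md0
  mount /dev/md0 /mnt/newmd0
  cp -a /mnt/hdc-copy/. /mnt/newmd0/
  mdadm --add /dev/md0 /dev/sda5

Does that look about right?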