In a pickle got out of most of it.

Linux RAID subsystem development

* In a pickle got out of most of it.
@ 2002-09-28 11:01 luterac
  2002-09-29 17:13 ` Adam Luter
  0 siblings, 1 reply; 2+ messages in thread
From: luterac @ 2002-09-28 11:01 UTC (permalink / raw)
  To: linux-raid

I have four hard drives, hd[efgh].  The RAID order should be hd[fgeh].
However, I did not have drive hdf (device position 0 in the array) when I
created the array.  Now that it is here, I am trying to add it to the array
at position zero (where it belongs).

The only thing I can manage to do right now is to add it as a spare drive.

So either I need to know how to add it correctly to begin with, or I need to
be able to convert hdf's spare status into full-fledged member status (though
I suppose that doesn't really matter much).

I was actually in a much worse position earlier.  Not only was hdf added as a
spare, but before I even realized this problem a DMA error popped up and
caused the whole array to fail.

I was in a terrible panic!  I didn't know what to do.  When I finally found
documentation on the mkraid command (as a recovery tool), I realized I still
had a major problem: I didn't know which drives were where!

The reason was that I had just gotten some fans for the drives too, so I
rearranged them without labeling them!

After trying 8 different combinations (half of the total possible) I finally
found the correct one (fgeh).  I can now mount my drive again, phew!  I am
infinitely relieved about that.  But the array is still in degraded mode,
since I don't know how to add hdf in.
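
The trial-and-error search over drive orders can be sketched roughly as
below.  This is only an illustration in Python: candidate_orders and its
known argument are made-up names, and the sketch only lists orderings to
try.  Each candidate still has to be checked by hand (edit raidtab, run
mkraid, try to mount), which nothing here automates.

```python
from itertools import permutations


def candidate_orders(devices, known=None):
    """Yield every ordering of `devices` consistent with `known`.

    `known` (hypothetical helper argument) maps a slot index to the
    device already known to belong in that slot.
    """
    known = known or {}
    for order in permutations(devices):
        if all(order[slot] == dev for slot, dev in known.items()):
            yield order


devices = ["hde", "hdf", "hdg", "hdh"]

# Every possible ordering of the four drives.
all_orders = list(candidate_orders(devices))

# Orderings left once hdf is pinned to position 0.
with_hdf_first = list(candidate_orders(devices, known={0: "hdf"}))

for order in with_hdf_first:
    print(" ".join(order))
```

Pinning any slot that is already known shrinks the list of candidates that
have to be tried by hand.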

Here is where things stand:

Right now my /etc/raidtab has hdf marked as faulty.

I tried just 'raidstop /dev/md1', then marking hdf as not faulty in
/etc/raidtab, and then running 'raidstart /dev/md1'.  However, this caused it
to read /dev/hdf1's superblock instead of those of the other three!  Luckily
it didn't try to recover and I could just start over again using mkraid.

So my best guess was to run mdadm --zero-superblock /dev/hdf1 (which I have
already done), and then try again without marking it faulty.  But I am
simply too scared to do that without some sort of OK from someone :) .

I'm afraid that it will start reconstruction or something -- in fact I'm
mainly worried that it's just too late at night for me to think rationally --
it's already 6am.

Could someone help?  I do appreciate it greatly -- I have been unlucky to have
my series of accidents lead up to this point but very lucky that I still have
my data (which is 80% backed up elsewhere, but I would really rather not
restore that copy, since the other 20% is about 40 hours' work that I have
nowhere to back up -- not without deleting some of the other stuff [and
deletion is not an option either, of course]).

Thank you, thank you, thank you for any advice, or at the least for sympathy
enough to read this far -- I know I tend to be verbose when I get tired.

So I'll stop here.  I'm including my raidtab -- hopefully my situation (at
this point) is plain enough that I do not need any md event logs posted here
-- but I will be glad to do that too.

raiddev /dev/md1
	raid-level	5
	nr-raid-disks	4
	persistent-superblock	1
	parity-algorithm	left-symmetric
	chunk-size	64k

	device	/dev/hdf1
	failed-disk	0
#	raid-disk	0
	device	/dev/hdg1
	raid-disk	1
	device	/dev/hde1
	raid-disk	2
	device	/dev/hdh1
	raid-disk	3
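
For anyone trying many drive orders, writing a raidtab like the one above by
hand for each attempt is error-prone.  A rough Python sketch of generating
the same layout from an ordered device list follows.  render_raidtab and its
parameters are hypothetical names, and only the keywords that appear in the
raidtab above are emitted; nothing here runs mkraid or touches a disk.

```python
def render_raidtab(md, order, failed=(), chunk="64k"):
    """Return raidtab text for a RAID5 array with members in `order`.

    `order` lists member devices by slot; devices in `failed` get a
    failed-disk entry instead of raid-disk.  Hypothetical helper.
    """
    lines = [
        f"raiddev {md}",
        "\traid-level\t5",
        f"\tnr-raid-disks\t{len(order)}",
        "\tpersistent-superblock\t1",
        "\tparity-algorithm\tleft-symmetric",
        f"\tchunk-size\t{chunk}",
        "",
    ]
    for slot, dev in enumerate(order):
        keyword = "failed-disk" if dev in failed else "raid-disk"
        lines.append(f"\tdevice\t{dev}")
        lines.append(f"\t{keyword}\t{slot}")
    return "\n".join(lines) + "\n"


text = render_raidtab(
    "/dev/md1",
    ["/dev/hdf1", "/dev/hdg1", "/dev/hde1", "/dev/hdh1"],
    failed={"/dev/hdf1"},
)
print(text, end="")
```

The generated text should be reviewed before it goes anywhere near
/etc/raidtab.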

-Gryn (Adam Luter, the ever thankful).


* Re: In a pickle got out of most of it.
  2002-09-28 11:01 In a pickle got out of most of it luterac
@ 2002-09-29 17:13 ` Adam Luter
  0 siblings, 0 replies; 2+ messages in thread
From: Adam Luter @ 2002-09-29 17:13 UTC (permalink / raw)
  To: linux-raid

I found out that the spare disk replaces the old disk transparently,
after it syncs.  However, I also found out that one of my hard drives
had bad sectors! ( :( :( ).  And it wasn't the spare disk either.  So
I could not sync (because trying to read the bad sectors would cause
that device to fail, bringing the whole array down).
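
Why a bad sector on a different disk blocks the sync: RAID5 parity is the
XOR of the data blocks in a stripe, so rebuilding a missing member's block
means reading the matching block from every other member.  A toy Python
illustration, with made-up four-byte blocks standing in for real chunks
(xor_blocks is a hypothetical helper, not anything from the md driver):

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(
        reduce(lambda a, b: a ^ b, column) for column in zip(*blocks)
    )


# Made-up data blocks for one stripe of a four-disk RAID5 array.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding data[0] is the one being rebuilt: its block
# is recovered from ALL the surviving blocks plus parity.
rebuilt = xor_blocks([data[1], data[2], parity])
print(rebuilt == data[0])
```

If any one of the surviving blocks cannot be read, that stripe cannot be
rebuilt, which is why the sync had to be avoided here.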

Instead I just mounted read-only, and copied the most important bits out
(well, the ones that would fit on my system drive), then ran a bad-block
program that I got from the manufacturer's site to "remove" the
bad blocks (that is, hide them).  And finally I rebuilt the array from
scratch.

The only lucky thing was that I still had the old array available,
just not onsite.  So after procuring it and copying it back onto the
new array (along with the saved bits) I've lost only about 10% of my
data.

-Gryn (Adam Luter)
