* two-disk-failure question
@ 2003-10-17 21:15 maarten van den Berg
From: maarten van den Berg @ 2003-10-17 21:15 UTC (permalink / raw)
  To: linux-raid


Hi List,

I have a problem which unfortunately hit me yesterday.  I have the dreaded 
"two-disk failure" on a 6-disk RAID-5 volume.  I read the fine HOWTO, chapter 
6.1, at unthought.net, and before I start (with somewhat shaking hands... ;-| ) 
I need some clarification on one issue...

Quote:

"To get this to work, you'll need to have an up to date /etc/raidtab - if it 
doesn't EXACTLY match devices and ordering of the original disks this will 
not work as expected, but will most likely completely obliterate whatever 
data you used to have on your disks."

Now I can't be completely absolutely sure I did not -at some point- re-order 
cables and such. So my obvious question is:  Is this step (mkraid --force 
with one of the offline disks defined as failed-disk) destructive, or could I 
(theoretically) experiment endlessly with the order in which the disks are 
defined in /etc/raidtab before I decide to mount it read-write and raidhotadd 
a fresh disk ?
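For concreteness, the raidtab I'd be feeding to mkraid would look something 
like the stanza below, with the suspect disk marked as failed-disk so mkraid 
leaves it out of the reconstruction.  The device names and slot order here 
are purely illustrative, not necessarily my real layout -- which is exactly 
the thing I'm unsure about:

```
raiddev /dev/md0
    raid-level            5
    nr-raid-disks         6
    nr-spare-disks        0
    persistent-superblock 1
    chunk-size            32
    device                /dev/sda1
    raid-disk             0
    device                /dev/sdb1
    raid-disk             1
    device                /dev/sdc1
    raid-disk             2
    device                /dev/sdd1
    raid-disk             3
    device                /dev/se1
    raid-disk             4
    device                /dev/sdf1
    failed-disk           5
```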

Second question: if one is sufficiently adept at looking at raw disk 
structures (notably the superblocks), can a human find out which disk is 
which, i.e. in which order they DO belong?
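For instance, I gather mdadm's examine mode can dump the on-disk superblock, 
which supposedly records each member's slot in the array.  Something like the 
sketch below (device names are just examples, not my actual partitions) -- 
would that reliably tell me the true order?

```shell
# Dump each member's RAID superblock; the "this" line in mdadm's device
# table shows which slot the disk itself believes it occupies.
# NOTE: /dev/sd[a-f]1 are placeholder names, substitute real partitions.
for d in /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1; do
    echo "== $d =="
    mdadm --examine "$d" | grep -E 'UUID|Raid Devices|this'
done
```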

Thanks for any and all help regarding this. I have 400 Gigs at stake here...
:-(

Maarten 

-- 
Yes of course I'm sure it's the red cable. I guarante[^%!/+)F#0c|'NO CARRIER


* Re: two-disk-failure question
From: Ryan B. Lynch @ 2003-10-22 16:55 UTC (permalink / raw)
  To: maarten van den Berg; +Cc: linux-raid

Hey Maarten,

Maarten van den Berg wrote:

>cables and such. So my obvious question is:  Is this step (mkraid --force 
>with one of the offline disks defined as failed-disk) destructive, or could I 
>(theoretically) experiment endlessly with the order in which the disks are 
>defined in /etc/raidtab before I decide to mount it read-write and raidhotadd 
>a fresh disk ?
>
I had to do this about a year ago, on account of a bad IDE controller, 
and was successful on the first try.  I recall from the HOWTO that the 
'mkraid --force' command IS destructive, and it WILL lose the array if 
you get it wrong.

My incident involved data which would have been a bear to restore from 
backup, so I didn't take any chances.  Prior to the 'mkraid --force' 
step, I labelled each physical disk with its number in the array, and 
then copied each disk to an identical disk using the 'dd' command 
[something like 'dd if=/dev/sda of=/dev/sdb bs=8192' where /dev/sda is 
the original and /dev/sdb is a handy blank disk, repeated for each 
original disk in the array].  Of course, this required five extra hard 
disks, but I had them lying around anyway.  And that's the foolproof 
method.  If you're really paranoid (you don't have a backup of the 
failed array), you might want to also do an 'md5sum /dev/sdb; md5sum 
/dev/sda' and compare the two hash values to ensure that each copy is 
faithful--re-do the copy that doesn't pass the hash check.  Since you're 
juggling so many hard drives, make sure you label all the 
disks--scotch-taped sticky notes on the cover, with the array #s written 
in felt pen worked for me.
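In outline, the clone-and-verify loop for each disk looks like this.  I've 
demonstrated it on ordinary temp files so it's safe to try as-is; for the 
real thing you'd substitute your actual /dev/sdX devices (the names in the 
comments are examples, not your layout):

```shell
# Stand-ins for one original disk and one blank spare; with real hardware
# these would be e.g. /dev/sda (original) and /dev/sdb (spare).
src=$(mktemp)
dst=$(mktemp)

# Fake some "disk" contents so the example is self-contained.
dd if=/dev/urandom of="$src" bs=8192 count=4 2>/dev/null

# Clone the disk, then verify the copy with md5sum; if the hashes differ,
# the copy is not faithful and you redo the dd.
dd if="$src" of="$dst" bs=8192 2>/dev/null
a=$(md5sum < "$src" | cut -d' ' -f1)
b=$(md5sum < "$dst" | cut -d' ' -f1)
if [ "$a" = "$b" ]; then
    echo "copy verified"
else
    echo "MISMATCH - redo the dd"
fi

rm -f "$src" "$dst"
```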

Then, you can run off and do whatever with the originals, and you'll 
always have something to go back to if you utterly wipe out your array 
while trying to restore it.  If that happens, you just take the 
clobbered disks, and re-do the 'dd' command to write the copy back to 
the originals.  Hash if necessary, and go to town again.  Repeat until 
you either successfully restore the array, or you swear off computers 
and enroll in florists' school.

Keep in mind that this can add a LOT of time to a restore operation.  
With Western Digital WD200 IDE disks (7200 RPM, 18.6 GB), I can move 
about 30 MB/sec at the start of the disk, falling to about 22 MB/sec at 
the end, on direct disk-to-disk copies like this, with an average around 
25 MB/sec.  That's ~1.5 GB/min, so you can estimate your own ETAs.  
Hashing will take a similar amount of time if you use 'md5sum' on a 
reasonably fast machine (a P4 1.8 GHz or faster).
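The arithmetic, spelled out (my disk size and throughput numbers; plug in 
your own):

```shell
# Back-of-the-envelope ETA for one disk-to-disk copy.
rate_mb_per_sec=25                              # average copy rate
disk_mb=18600                                   # 18.6 GB disk, in MB
mb_per_min=$(( rate_mb_per_sec * 60 ))          # 1500 MB/min, i.e. ~1.5 GB/min
minutes_per_disk=$(( disk_mb / mb_per_min ))    # roughly 12 minutes per disk
echo "${mb_per_min} MB/min, ~${minutes_per_disk} min per disk copy"
```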

-R


