mdadm: what if - crashed OS


* mdadm: what if - crashed OS
@ 2007-01-05  6:23 Vince Spinelli
  2007-01-05 11:21 ` David Greaves
  2007-01-05 21:51 ` Bill Davidsen
  0 siblings, 2 replies; 5+ messages in thread
From: Vince Spinelli @ 2007-01-05  6:23 UTC (permalink / raw)
  To: linux-raid

Hello,

My name is Vince Spinelli, from Buffalo, NY (US).  I am currently using
'mdadm' under Fedora Core 5 (32-bit) to run two Soft-RAID arrays.

1) RAID-1 (mirror) for mission critical data.  #drives = 2 ea. PATA ATA100
2) RAID-5 (striped+parity) for multimedia data.  #drives = 5 ea. SATA 3G

My question is this...

In case of catastrophic machine failure, such as the operating system
(which is on a separate PATA ATA100 drive) failing or even the OS hard
drive being physically destroyed, how would I go about rebuilding my RAID
arrays?

Obviously, this would assume that the 7 disks which make up my arrays had
survived and were not damaged.

-I would obviously then build a new computer,
-install Linux, make sure 'mdadm' was installed,
-physically install all of my drives into the computer,
-copy my old /etc/mdadm.conf file (which has been saved on cd-rom but is
easily re-made) onto the new computer,
- and then what?

I have thought about this, and I can't understand how 'mdadm' decides the
health of an array.

For example, if I type at prompt:

/sbin/mdadm --detail /dev/md1

then I am given the current status of array 'md1'.  It may be clean,
degraded, recovering, or whatever.  Therefore, on a fresh install of
Linux, with a fresh copy of 'mdadm', I am led to believe that the result
of the previous command would be something like...

Active Devices = 0
Working Devices = 4
Failed Devices = 0
Spare Devices = 4

That, obviously, would be no good.

So, please, if anyone has rebuilt a Soft-RAID array from scratch WHILE
STILL PRESERVING THE DATA ON THAT ARRAY with 'mdadm', please explain how
this is accomplished, as I'm sitting on 1.5 TB of data that I truly do not
want to lose.

Thank You,
- Vince

---------------------------------------
Vince Spinelli
University at Buffalo: EE
---------------------------------------
"Kind of off his mental reservation."
- ancient cowboy wisdom.
---------------------------------------
Vince@SpinelliCreations.com
[vfs@buffalo.edu / vfs@eng.buffalo.edu]

---------------------------------------------------------
SpinelliCreations Secure Webmail: Powered by SquirrelMail


* Re: mdadm: what if - crashed OS
  2007-01-05  6:23 mdadm: what if - crashed OS Vince Spinelli
@ 2007-01-05 11:21 ` David Greaves
  2007-01-06  4:10   ` Vince Spinelli
  2007-01-05 21:51 ` Bill Davidsen
  1 sibling, 1 reply; 5+ messages in thread
From: David Greaves @ 2007-01-05 11:21 UTC (permalink / raw)
  To: Vince; +Cc: linux-raid

Assuming you can allow some downtime, get yourself a rescue CD such as 'RIP'

This will let you boot into the machine and run mdadm commands.

You don't mention kernel/mdadm versions, so you may want to check that the
versions on the rescue CD are close to yours.

Then try looking at the manpage around --assemble.
In particular you may want to try --scan and --uuid (if your RIP/live
kernel/mdadm support it)

Also check out the examples...

Assuming this is a sane machine and you're not in real disaster recovery mode
with drives pulled in from random boxes then look at using the literal string
"--config=partitions" (see the manpage) to avoid creating an mdadm.conf with the
"DEVICE partitions" line - PITA on live CDs where you just want a command line ;)

If you can manage it, this will give you a nice warm feeling about recovering
from a problem and it's pretty safe - just common sense like making sure the
live CD kernel/mdadm are either up-to-date or match your production system.
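
As a rough sketch of the options mentioned above, and not a tested recipe:
the functions below build an assemble command that combines --scan, --uuid
and --config=partitions, and by default only print it so it can be reviewed
first. The helper names run and assemble_by_uuid, the DRY_RUN switch and the
placeholder UUID are assumptions made for illustration; check the exact
option behaviour against the manpage for the mdadm version on your CD.

```shell
#!/bin/sh
# Sketch only: dry-run by default, so nothing touches the disks unless
# DRY_RUN=0 is set explicitly (and the script is run as root).
DRY_RUN=${DRY_RUN:-1}

# run: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# assemble_by_uuid UUID MD_DEVICE
# Uses --config=partitions so no mdadm.conf is needed on the live CD.
assemble_by_uuid() {
    run mdadm --assemble --scan --config=partitions --uuid="$1" "$2"
}

# Usage (placeholder UUID, hypothetical md device):
#   assemble_by_uuid 'REPLACE-WITH-ARRAY-UUID' /dev/md1
```

Printing first is a deliberate choice here: on a rescue system it is cheap to
read the command line once more before letting it loose on the only copy of
the data.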

HTH

Also:
> I have thought about this, and I can't understand how 'mdadm' decides the
> health of an array.

Each disk/partition used by md has a superblock which contains a unique UUID and
other info, like the number of devices and the raid level. mdadm --scan looks
into each partition for a superblock and notes this data. It can then group all
the superblocks with the same UUID together. For each group, knowing how many
devices it should have, how many it has and how many it needs, it can decide
if the array can safely be assembled.
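
The grouping step described above can be sketched roughly as follows. This is
not mdadm's code: the input format (one "UUID expected-count device" record
per line) and the name group_by_uuid are made up for illustration, and the
real tool also takes things such as the RAID level into account when deciding
how many members it needs.

```shell
#!/bin/sh
# Sketch only: group "UUID EXPECTED_COUNT DEVICE" records by UUID and report
# how many members of each array were found. The record format is a
# hypothetical simplification, not real mdadm --examine output.
group_by_uuid() {
    awk '
        {
            found[$1]++                          # members seen for this UUID
            expected[$1] = $2                    # member count the superblock claims
            members[$1] = members[$1] " " $3     # remember which devices matched
        }
        END {
            for (u in found) {
                status = (found[u] >= expected[u]) ? "complete" : "incomplete"
                printf "%s %d/%d %s:%s\n", u, found[u], expected[u], status, members[u]
            }
        }
    ' | sort
}

# Usage with made-up records:
#   printf '%s\n' 'aaaa 2 /dev/hda1' 'aaaa 2 /dev/hdc1' | group_by_uuid
```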

David
PS Yes, I've done this (too many times!)


* Re: mdadm: what if - crashed OS
  2007-01-05  6:23 mdadm: what if - crashed OS Vince Spinelli
  2007-01-05 11:21 ` David Greaves
@ 2007-01-05 21:51 ` Bill Davidsen
  2007-01-05 22:25   ` Andrew Geppert
  1 sibling, 1 reply; 5+ messages in thread
From: Bill Davidsen @ 2007-01-05 21:51 UTC (permalink / raw)
  To: Vince; +Cc: linux-raid

Vince Spinelli wrote:
> Hello,
>
> My name is Vince Spinelli, from Buffalo, NY (US).  I am currently using
> 'mdadm' under Fedora Core 5 (32-bit) to run two Soft-RAID arrays.
>
> 1) RAID-1 (mirror) for mission critical data.  #drives = 2 ea. PATA ATA100
> 2) RAID-5 (striped+parity) for multimedia data.  #drives = 5 ea. SATA 3G
>
> My question is this...
>
> In case of catastrophic machine failure, such as the operating system
> (which is on a separate PATA ATA100 drive) failing or even the OS hard
> drive being physically destroyed, how would I go about rebuilding my RAID
> arrays?
>   
May I say that if you don't want to lose your data, then the o/s is 
"critical data." I regard boot, root, and swap as critical, because if 
they fail you have a much more complex recovery issue. Also note that 
you still need backup, because some hardware failure modes will write 
bad data (maybe silently) without an actual "crash" you notice. Power 
supplies, controllers, and disk drives will help you test your backup 
procedures.
> Obviously, this would assume that the 7 disks which make up my arrays had
> survived and were not damaged.
>
> -I would obviously then build a new computer,
> -install Linux, make sure 'mdadm' was installed,
> -physically install all of my drives into the computer,
> -copy my old /etc/mdadm.conf file (which has been saved on cd-rom but is
> easily re-made) onto the new computer,
> - and then what?
>
> I have thought about this, and I can't understand how 'mdadm' decides the
> health of an array.
>
> For example, if I type at prompt:
>
> /sbin/mdadm --detail /dev/md1
>
> then I am given the current status of array 'md1'.  It may be clean,
> degraded, recovering, or whatever.  Therefore, on a fresh install of
> Linux, with a fresh copy of 'mdadm', I am led to believe that the result
> of the previous command would be something like...
>
> Active Devices = 0
> Working Devices = 4
> Failed Devices = 0
> Spare Devices = 4
>
> That, obviously would be no good.
>
> So, please, if anyone has rebuilt a Soft-RAID array from scratch WHILE
> STILL PRESERVING THE DATA ON THAT ARRAY with 'mdadm', please explain how
> this is accomplished, as I'm sitting on 1.5 TB of data that I truly do not
> want to lose.
You just set DEVICE to partitions in mdadm.conf and use the --assemble
command. Oh, and each member's superblock carries a UUID, so assemble can
figure out what to do.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979



* Re: mdadm: what if - crashed OS
  2007-01-05 21:51 ` Bill Davidsen
@ 2007-01-05 22:25   ` Andrew Geppert
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Geppert @ 2007-01-05 22:25 UTC (permalink / raw)
  To: linux-raid

This actually happened to me recently where my OS HDD had a complete
physical failure. Unfortunately, I do not have a copy of the config files.
What additional tasks are needed to re-create the config files, or is this
done by --assemble?
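
One possible approach, offered as a sketch rather than a verified procedure
for any particular mdadm version: the superblocks on the member disks already
hold the array identities, so `mdadm --examine --scan` can print ARRAY lines
suitable for a config file. The helper name write_mdadm_conf is hypothetical;
it reads the ARRAY lines on stdin so it can be tried without real arrays.

```shell
#!/bin/sh
# Sketch only: rebuild a minimal mdadm.conf from ARRAY lines read on stdin.
# Anything that is not an ARRAY line is dropped.
write_mdadm_conf() {
    conf_path=$1
    {
        echo 'DEVICE partitions'
        grep '^ARRAY'
    } > "$conf_path"
}

# Real use (as root, not executed here); review the file before relying on it:
#   mdadm --examine --scan --config=partitions | write_mdadm_conf /etc/mdadm.conf
```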

AMG


* Re: mdadm: what if - crashed OS
  2007-01-05 11:21 ` David Greaves
@ 2007-01-06  4:10   ` Vince Spinelli
  0 siblings, 0 replies; 5+ messages in thread
From: Vince Spinelli @ 2007-01-06  4:10 UTC (permalink / raw)
  To: David Greaves; +Cc: vince, linux-raid

Thank you all for the responses and help (you guys are FAST!)...

I have successfully simulated a worst case scenario and 'rebuilt the
arrays from scratch' while preserving all data.

This is how I did it, for anyone who may run into the same jam:

1) Removed the OS HDD and temporarily installed a scrap hard drive (read: an
old 4 GB IDE HDD).  Installed the same version of Linux (FC5) and the same
kernel.
2) Copied (from CD backup) the original mdadm.conf to /etc/mdadm.conf.
3) As recommended, executed the following at the command line...
[abc@123]# /sbin/mdadm --assemble --uuid=[UUIDOFARRAY] --scan /dev/md1
4) Rebooted.

This resulted in the array popping right up and being accessible with all
data intact.

Thank you again for all the help!  I will now be working on a live CD to
suit my taste, so that this would all be a little easier.
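
A quick check that the arrays really came up could look like the sketch below.
It assumes the usual /proc/mdstat layout ("mdN : active raidX ..."); the
function name active_arrays is made up for illustration.

```shell
#!/bin/sh
# Sketch only: list the md arrays that /proc/mdstat reports as active.
# An alternative file can be passed as $1, which is handy for trying the
# function on a saved copy of /proc/mdstat.
active_arrays() {
    awk '$2 == ":" && $3 == "active" { print $1 }' "${1:-/proc/mdstat}"
}

# Real use on the rebuilt machine (not executed here):
#   active_arrays
#   /sbin/mdadm --detail /dev/md1
```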

FYI --- kernel being used is a prepackaged one from Livna Repository,
2.6.17-1.2187_FC5.stk16 #1 Mon Sep 25 17:32:45 EDT 2006 i686
mdadm version being used is,
mdadm.i386 2.3.1-3

-Regards
Vince

