Re: Best Practice for Raid1 Root

From: Terrence Martin <tmartin@physics.ucsd.edu>
Cc: linux-raid@vger.kernel.org
Subject: Re: Best Practice for Raid1 Root
Date: Wed, 14 Jan 2004 16:59:56 -0800
Message-ID: <4005E60C.1040504@physics.ucsd.edu>
In-Reply-To: <4005DE22.30604@tls.msk.ru>

Thank you for the detailed post. My primary concern is the complete
failure case: even if bad blocks cause a partial boot (and subsequent
failure), quickly unplugging the disk will simulate the complete
failure state. It is also fairly easy to document that. :)

I had not considered that grub might not be the better solution in this
case and that the older lilo would be preferred.

While I have managed to grok some of the details of grub, it is fairly
complex. Your technique for lilo gives me a hint, though, on what I may
have to do to get grub to work. Of course I have lilo to fall back on.

I do have a concern that moving forward lilo may disappear as an option 
from RH, but it is in RHAS3.0 so I guess I am good for a while.

Also thank you for the tip about swap. I had not considered placing swap 
on an md device to ensure reliability. I will do that as well.
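
A minimal sketch of how one might check an fstab for swap that is not yet
on an md device. The sample fstab, the ext3 filesystem type and the
function name are assumptions for illustration, and the /dev/md prefix
test is only a heuristic (it would not recognize a LABEL= entry, for
example).

```python
def unmirrored_swap(fstab_text):
    """Return swap devices from fstab text that are not md devices."""
    devices = []
    for line in fstab_text.splitlines():
        # Drop comments and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # fstab fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "swap":
            if not fields[0].startswith("/dev/md"):
                devices.append(fields[0])
    return devices


# Hypothetical fstab matching the layout discussed in this thread:
# / and /boot are mirrored, swap is two plain partitions.
example = """\
/dev/md1   /      ext3  defaults  1 1
/dev/md0   /boot  ext3  defaults  1 2
/dev/hda2  swap   swap  pri=1     0 0
/dev/hdc2  swap   swap  pri=1     0 0
"""

print(unmirrored_swap(example))
```

An empty list would mean every swap entry found already sits on an md
device.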

Thanks again,

Terrence





Michael Tokarev wrote:
> Terrence Martin wrote:
> 
>> Hi,
>>
>> I wanted to post this question for a while.
>>
>> On several systems I have configured a root software raid setup with 
>> two IDE hard drives. The systems are always some version of redhat. 
>> Each disk has its own controller and is partitioned similar to the 
>> following, maybe with more partitions, but this is the minimum.
>>
>> hda1 fd   100M
>> hda2 swap 1024M
>> hda3 fd   10G
>>
>> hdc1 fd   100M
>> hdc2 swap 1024M
>> hdc3 fd   10G
>>
>> The Raid devices would be
>>
>> /dev/md0 mounted under /boot made of /dev/hda1 and /dev/hdc1
>> /dev/md1 mounted under / made of /dev/hda3 and /dev/hdc3
> 
> 
> You aren't using raid1 for swap, yes?
> Using two (or more) swap partitions as the equivalent of a raid0 array
> (listing them all in fstab with the same priority) looks like a
> rather common case, and indeed it works well (you're getting
> stripe speed this way)... until one disk crashes.  And in case
> of disk failure, your running system descends into complete havoc,
> including possible filesystem corruption and very probable data
> corruption due to bad ("missing") parts of virtual memory.
> It happened to us recently - we were using 2-disk systems,
> mirroring everything but swap... it was not a nice lesson... ;)
> From now on, I'm using raid1 for swap too.  Yes, it is much
> slower than using several plain swap partitions, and less
> efficient too, but it is much safer.
> 
>> The boot loader is grub and I want both /boot and / raided.
>>
>> In the event of a failure of hda I would like the system to switch to 
>> hdc. This works fine. However what I have had problems with is if the 
>> system reboots. If /dev/hda is unavailable I no longer have a disk 
>> with a boot sector set up correctly. Unless I have a floppy or CDROM 
>> with a boot loader the system will not come up.
>>
>> So my main question is what is the best practice to get a workable 
>> boot sector on /dev/hdc? How are other people making sure that their 
>> system remains bootable after a disk failure of the boot disk? Is it 
>> even possible with software raid and PC BIOS? Also when you replace 
>> /dev/hda how are you getting a valid boot sector on that disk?
> 
> 
> The answer really depends.  There's no boot program set out there (where
> boot program set is everything from BIOS to the OS boot loader) that is
> able to deal with every kind of first (boot) disk failure.  There are 2
> scenarios of disk failure.  The first is when your failed /dev/hda is
> completely dead, just as if it were unplugged, so the BIOS and OS boot
> loader do not even see/recognize it (from my experience this is the most
> common scenario, YMMV).  The second is when your boot disk is alive but has
> some bad/unreadable/whatever sectors that belong to data used during the
> boot sequence, so the disk is recognized but boot fails due to read errors.
> 
> It's easy to deal with the first case (first disk completely dead).  I wasn't
> able to use grub in that case, but lilo works just fine.  For that, I
> use a standard MBR on both /dev/hda and /dev/hdc (your case), and install
> lilo into /dev/md0 (boot=/dev/md0 in lilo.conf), marking the corresponding
> /dev/hd[ac]1 partitions bootable ("active").  This way, the boot sector gets
> "mirrored" manually when installing the MBR, and lilo maps are mirrored
> by the raid code.  Lilo uses BIOS disk number 0x80 for the boot map on all
> the disks that form /dev/md0 (regardless of how many there are) - it
> treats the /dev/md0 array like a single disk.  This way, you may remove/fail
> the first (or second or 3rd in a multidisk config) disk and your system will
> boot from the first disk available, provided your BIOS skips missing
> disks and assigns number 0x80 to the first disk really present.  There's one
> limitation of this method: the disk layout must be exactly the same on all
> disks (at least the placement of the /dev/hd[ac]1 partitions), or else the
> lilo map will be invalid on some disks and valid on others.
> 
> But there's no good way to deal with the second scenario.  Especially since
> the problem (failed read) may happen when the BIOS reads the partition table
> or MBR - and the BIOS is a piece of code you usually can't modify/control.
> Provided the MBR is read correctly by the BIOS and loaded into memory, and the
> first stage of lilo/whatever is executing, the next steps depend on the OS
> boot loader (lilo, grub, ...).  It *may* recognize/know about the raid1 array
> it is booting from, and try other disks in case a read from the first disk
> fails.  But none of the currently existing Linux boot loaders do that as far
> as I know.
> 
> So to summarize: using lilo, installing it into the raid array
> instead of the MBR, and using a standard MBR to boot the machine lets you
> deal with at least one disk failure scenario, while the other scenario is
> problematic in all cases....
> 
> /mjt
>
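
The layout restriction mentioned for the lilo approach (identical
placement of /dev/hd[ac]1 on both disks) could be checked before running
lilo. The following is only a sketch: it assumes partition dumps in the
"start=..., size=..." line style printed by `sfdisk -d`, and the sector
numbers and function names are made up for illustration.

```python
import re

# Matches lines such as: /dev/hda1 : start=63, size=208782, Id=fd, bootable
LINE = re.compile(r"^/dev/[a-z]+(\d+)\s*:\s*start=\s*(\d+),\s*size=\s*(\d+)")


def parse_dump(text):
    """Map partition number -> (start, size) from sfdisk -d style text."""
    table = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            number, start, size = (int(g) for g in m.groups())
            table[number] = (start, size)
    return table


def same_placement(dump_a, dump_b, partition=1):
    """True if the given partition has the same start and size in both dumps."""
    a, b = parse_dump(dump_a), parse_dump(dump_b)
    return partition in a and a.get(partition) == b.get(partition)


# Hypothetical dumps; the sector values are illustrative only.
hda = """\
/dev/hda1 : start=63, size=208782, Id=fd, bootable
/dev/hda2 : start=208845, size=2104515, Id=82
"""
hdc_ok = hda.replace("hda", "hdc")
hdc_shifted = hdc_ok.replace("start=63,", "start=126,")

print(same_placement(hda, hdc_ok))
print(same_placement(hda, hdc_shifted))
```

Only the first partition is compared by default, since that is the one
holding /dev/md0 in the layout above.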
