From: "Keld Jørn Simonsen" <keld@dkuug.dk>
To: linux-raid@vger.kernel.org
Subject: draft howto on making raids for surviving a disk crash
Date: Sat, 2 Feb 2008 20:41:31 +0100
Message-ID: <20080202194131.GA7875@rap.rap.dk>
This is intended for the linux raid howto. Please give comments.
It is not fully ready /keld
Howto prepare for a failing disk
The following describes how to prepare a system to survive
the failure of one disk. This can be important for a server which is
intended to always run. The description is mostly aimed at
small servers, but it can also be used for
workstations, to protect them against losing data and to keep them running even if a
disk fails. Some recommendations on larger server setups are given
at the end of the howto.
This requires some extra hardware, especially disks, and the description
will also touch on how to make the most of the disks, be it in terms of
available disk space or input/output speed.
1. Creating the partitions
We recommend creating partitions for /boot, root, swap and other file systems.
This can be done with fdisk, parted or maybe a graphical interface
like the Mandriva/PCLinuxOS harddrake2. It is recommended to use drives
with equal sizes and performance characteristics.
If we are using the 2 drives sda and sdb, then sfdisk
may be used to mark all the partitions as raid partitions (type fd):
sfdisk -c /dev/sda 1 fd
sfdisk -c /dev/sda 2 fd
sfdisk -c /dev/sda 3 fd
sfdisk -c /dev/sda 5 fd
sfdisk -c /dev/sdb 1 fd
sfdisk -c /dev/sdb 2 fd
sfdisk -c /dev/sdb 3 fd
sfdisk -c /dev/sdb 5 fd
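The eight commands above can also be generated in a loop. The following is a minimal sketch: the helper name mark_raid and the DRY_RUN variable are made up for this example and are not part of sfdisk. By default it only prints the commands; set DRY_RUN=0 to actually run them. It assumes the older sfdisk syntax used above; newer util-linux releases spell the same operation --part-type.

```shell
#!/bin/sh
# Sketch: mark partitions 1, 2, 3 and 5 on each disk as type fd
# (Linux raid autodetect). mark_raid and DRY_RUN are illustrative names.
DRY_RUN=${DRY_RUN:-1}

mark_raid() {
    for disk in "$@"; do
        for part in 1 2 3 5; do
            if [ "$DRY_RUN" = 1 ]; then
                # Only show what would be run.
                echo "sfdisk -c $disk $part fd"
            else
                sfdisk -c "$disk" "$part" fd
            fi
        done
    done
}

mark_raid /dev/sda /dev/sdb
```

Printing first makes it easy to check the disk names before any partition table is touched.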
The result can be listed with:
fdisk -l /dev/sda /dev/sdb
The partition layout could then look like this:
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot     Start         End      Blocks   Id  System
/dev/sda1              1          37      297171   fd  Linux raid autodetect
/dev/sda2             38        1132     8795587+  fd  Linux raid autodetect
/dev/sda3           1133        1619     3911827+  fd  Linux raid autodetect
/dev/sda4           1620      121601   963755415    5  Extended
/dev/sda5           1620      121601   963755383+  fd  Linux raid autodetect
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot     Start         End      Blocks   Id  System
/dev/sdb1              1          37      297171   fd  Linux raid autodetect
/dev/sdb2             38        1132     8795587+  fd  Linux raid autodetect
/dev/sdb3           1133        1619     3911827+  fd  Linux raid autodetect
/dev/sdb4           1620      121601   963755415    5  Extended
/dev/sdb5           1620      121601   963755383+  fd  Linux raid autodetect
2. Prepare for boot
The system should be set up to boot from multiple devices, so that
if one disk fails, the system can boot from another disk.
On Intel hardware, there are two common boot loaders, grub and lilo.
Both grub and lilo can only boot off a raid1; they cannot boot off
any other software raid device type. The reason they can boot off
the raid1 is that they see the raid1 as a normal disk, and they then use only
one of the disks when booting. The boot stage only involves loading the kernel
with an initrd image, so not much data is needed for this. The kernel,
the initrd and other boot files can be put in a small /boot partition.
We recommend something like 200 MB on an ext3 raid1.
Make the raid1 and ext3 filesystem:
mdadm --create /dev/md0 -R -l 1 -n 2 /dev/sda1 /dev/sdb1
mkfs -t ext3 /dev/md0
Make each of the disks bootable by lilo:
lilo -b /dev/sda -C /etc/lilo.conf1
lilo -b /dev/sdb -C /etc/lilo.conf2
Make each of the disks bootable by grub
(to be described)
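Until this part is written, here is a hedged sketch of one commonly used approach with GRUB legacy (0.9x): install the loader on each disk in turn, each time mapping that disk to (hd0), so that the loader refers to whichever disk the BIOS ends up booting from. The helper name grub_setup_cmds is made up for this example, and it assumes /boot is the first partition on both disks, as in the layout above. The function only prints grub shell commands; nothing is installed until the output is piped into grub --batch.

```shell
#!/bin/sh
# Sketch for GRUB legacy: print the grub shell commands that install the
# boot loader on one disk. grub_setup_cmds is an illustrative name.
grub_setup_cmds() {
    disk=$1
    printf 'device (hd0) %s\n' "$disk"   # treat this disk as the first BIOS disk
    printf 'root (hd0,0)\n'              # first partition holds /boot (assumption)
    printf 'setup (hd0)\n'               # write the loader to the MBR of that disk
    printf 'quit\n'
}

for d in /dev/sda /dev/sdb; do
    grub_setup_cmds "$d"
done
```

After reviewing the output, a single disk could be handled with: grub_setup_cmds /dev/sdb | grub --batch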
3. The root file system
The root file system can be on another raid than the /boot partition.
We recommend a raid10,f2, as the root file system will mostly see reads, and
the raid10,f2 raid type is the fastest for reads, while also being sufficiently
fast for writes. Other relevant raid types would be raid10,o2 or raid1.
It is recommended to use udev, as it keeps /dev on a file system in RAM (tmpfs), and you
can thus avoid a number of reads and writes to disk.
It is recommended that all file systems are mounted with the noatime option; this
avoids writing to the filesystem inodes every time a file has been read.
Make the raid10,f2 and ext3 filesystem:
mdadm --create /dev/md1 --chunk=256 -R -l 10 -n 2 -p f2 /dev/sda2 /dev/sdb2
mkfs -t ext3 /dev/md1
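The noatime recommendation above ends up as a mount option in /etc/fstab. The following sketch prints suitable lines for the two arrays created so far; fstab_entry is a made-up helper name, and the output is meant to be reviewed and merged into /etc/fstab by hand rather than written there automatically.

```shell
#!/bin/sh
# Sketch: print /etc/fstab lines that mount the arrays with noatime.
# fstab_entry is an illustrative helper, not a standard tool.
fstab_entry() {
    # $1 = device, $2 = mount point, $3 = fs type, $4 = fsck pass number
    printf '%s\t%s\t%s\tnoatime\t0\t%s\n' "$1" "$2" "$3" "$4"
}

fstab_entry /dev/md1 /     ext3 1
fstab_entry /dev/md0 /boot ext3 2
```

An already mounted file system can be switched without a reboot, for example with: mount -o remount,noatime /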
4. The swap file system
If a disk that processes are swapped to fails, then all these processes fail.
These may be vital processes for the system, or vital jobs on the system. You can prevent
the processes from failing by having the swap partitions on a raid. The swap area
needed is normally relatively small compared to the overall disk space available,
so we recommend the faster raid types over the more space-economic ones. The raid10,f2
type seems to be the fastest here; other relevant raid types could be raid10,o2 or raid1.
Given that you have created a raid array, you can just make the swap area directly
on it:
mdadm --create /dev/md2 --chunk=256 -R -l 10 -n 2 -p f2 /dev/sda3 /dev/sdb3
mkswap /dev/md2
Maybe something on /var and /tmp could go here.
5. The rest of the file systems.
Other file systems can also be protected against one failing disk.
Which technique to recommend depends on what you use the
disk space for. You may mix the different raid types if you have different types
of use on the same server, e.g. a database and serving of large files
from the same server. (This is one of the advantages of software raid
over hardware raid: you may have different types of raids on
a disk with a software raid, whereas a hardware raid may only take one
type for the whole disk.)
If disk capacity is the main priority, and you have more than 2 drives,
then raid5 is recommended. Raid5 only uses the capacity of 1 drive for securing the
data, while raid1 and raid10 use at least half the capacity.
For example with 4 drives, raid5 provides 75 % of the total disk
space as usable, while raid1 and raid10 at most (dependent on the number
of copies) give 50 % of the disk space as usable. This becomes even better
for raid5 with more disks: with 10 disks you only use 10 % for security.
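The percentages above follow from simple arithmetic, which can be sketched as a small shell function. The name usable_percent is made up for this example; it assumes equally sized disks and uses integer division, so results are rounded down.

```shell
#!/bin/sh
# Sketch: usable share of the raw disk space, in percent, for an array of
# n equal disks. usable_percent is an illustrative helper, not an mdadm tool.
usable_percent() {
    level=$1 n=$2 copies=${3:-2}
    case $level in
        raid5)  echo $(( (n - 1) * 100 / n )) ;;   # one disk's worth of parity
        raid6)  echo $(( (n - 2) * 100 / n )) ;;   # two disks' worth of parity
        raid1)  echo $(( 100 / n )) ;;             # every disk holds a full copy
        raid10) echo $(( 100 / copies )) ;;        # each block stored 'copies' times
        *)      echo "unknown level: $level" >&2; return 1 ;;
    esac
}

usable_percent raid5 4      # prints 75
usable_percent raid5 10     # prints 90
usable_percent raid10 4 2   # prints 50
usable_percent raid6 10     # prints 80
```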
If speed is your main priority, then raid10,f2, raid10,o2 or raid1 would give you
the most speed during normal operation. This even works if you only have 2 drives.
If speed with a failed disk is a concern, then raid10,o2 could be the choice, as
raid10,f2 is somewhat slower in operation when a disk has failed.
Examples (alternatives, use one of these):
mdadm --create /dev/md3 --chunk=256 -R -l 10 -n 2 -p f2 /dev/sda5 /dev/sdb5
mdadm --create /dev/md3 --chunk=256 -R -l 10 -n 2 -p o2 /dev/sd[ab]5
mdadm --create /dev/md3 --chunk=256 -R -l 5 -n 4 /dev/sd[abcd]5
6. /etc/mdadm.conf
Something here on /etc/mdadm.conf. What would be safe, allowing
a system to boot even if a disk has crashed?
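As a possible starting point for this section, a common way to build the file is to take the ARRAY lines that mdadm reports for the running arrays, which identify each array by UUID rather than by member device names. The sketch below uses a made-up helper name, gen_mdadm_conf, and reads the ARRAY lines on stdin so that nothing is written until the result has been reviewed. Whether a degraded array is actually started at boot also depends on the initrd of the distribution, which this sketch does not address.

```shell
#!/bin/sh
# Sketch: build an mdadm.conf from the ARRAY lines of the running arrays.
# gen_mdadm_conf is an illustrative helper name.
gen_mdadm_conf() {
    echo 'DEVICE partitions'   # consider every partition in /proc/partitions
    echo 'MAILADDR root'       # where mdadm --monitor sends failure mail
    grep '^ARRAY'              # keep only the ARRAY lines read from stdin
}

# Example with a made-up UUID. On a real system one would run:
#   mdadm --detail --scan | gen_mdadm_conf > /etc/mdadm.conf.new
echo 'ARRAY /dev/md0 UUID=00000000:00000000:00000000:00000000' | gen_mdadm_conf
```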
7. Recommendation for the setup of larger servers.
Given a larger server setup, with more disks, it is possible to
survive more than one disk crash. The raid6 array type can be used
to survive 2 disk crashes, at the expense of the space of 2 disks.
The /boot, root and swap partitions can be set up with more disks, e.g. a
/boot partition made up from a raid1 of 3 disks, and root and swap partitions
made up from raid10,f3 arrays. Given that raid6 cannot survive more than the crashes
of 2 disks, the system disks need not be prepared for more than 2 crashes
either, and you can use the rest of the disk IO capacity to speed up the system.
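For three disks, the system arrays described above could be created along the following lines. This is a sketch under assumptions: a third disk sdc partitioned like sda and sdb, and a made-up run helper with a DRY_RUN variable that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch for three disks sda, sdb, sdc (sdc is assumed here).
# run and DRY_RUN are illustrative names; by default commands are only printed.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# /boot: raid1 over three disks, so each disk carries a full copy
run mdadm --create /dev/md0 -R -l 1 -n 3 /dev/sda1 /dev/sdb1 /dev/sdc1
# root and swap: raid10,f3 keeps three copies of every block
run mdadm --create /dev/md1 --chunk=256 -R -l 10 -n 3 -p f3 /dev/sda2 /dev/sdb2 /dev/sdc2
run mdadm --create /dev/md2 --chunk=256 -R -l 10 -n 3 -p f3 /dev/sda3 /dev/sdb3 /dev/sdc3
```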