[linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?


From: hansbkk@gmail.com
To: linux-lvm@redhat.com
Subject: [linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
Date: Sun, 28 Nov 2010 22:31:51 +0700
Message-ID: <AANLkTimGkMJGiJC+7L+Pu3+yf-J_s0Ex3hM2-g-0+UqQ@mail.gmail.com>
In-Reply-To: <AANLkTimTYse7vnqZydc1Z5z60vNuQw3juj7J_W7+bQEm@mail.gmail.com>

My abject apologies to all for improper addressing in my
previous messages (thanks to all those who set me straight :)

Hope you're all still willing to consider my request for feedback.
Start with a bit of context:

- SAN/NAS (call it FILER-A) hosting say a dozen TB and servicing a few
dozen client machines and servers, mostly virtual hosts. A second,
larger host (FILER-B, still just tens of TB) stores backup sets: not
only Amanda's, but also filesystems comprising gazillions of
hard-linked archive sets created by (e.g.) rdiff-backup, rsnapshot and
BackupPC. We're on a very limited budget, therefore no tape storage
for backups.

- I plan to run LVM over RAID (likely RAID1 or RAID10) for IMO an
ideal combination of fault tolerance, performance and flexibility.

- I am not at this point overly concerned about performance issues -
reliability/redundancy and ease of recovery are my main priorities.

Problem:

For off-site data rotation, the hard-linked filesystems on FILER-B
require full filesystem cloning with block-level tools rather than
file-level copying or sync'ing. My current plan is to swap out disks
mirrored via RAID, marking them as "failed" and then rebuilding using
the (re-initialized) incoming rotation set.

HOWEVER - the use of LVM (and possibly RAID10) adds complexity to the
filesystems, which makes disaster recovery from the detached disk sets
much more difficult than regular partitions on physical disks.

Theoretical solution:

Use RAID1 as a "top layer" to mirror the data stored in LVM logical
volumes on the one hand (call this the TopRAID1 side) to ***regular
partitions*** on actual physical disks on the other (call this the
TopRAID2 side). Note that TopRAID1 and TopRAID2 name the two halves of
the mirror, not RAID levels.

(ASCII art best viewed with a monospaced font)

"TopRAID1" side
 ______________________________________
|                LVM VG                |
|  _____   _____________   __________  |
| | LV1 | |     LV2     | |    LV3   | |
| |     | |             | |          | |
| |     | |             | |          | |
| |     | |             | |          | |
| |     | |             | |          | |
| |     | |             | |          | |
| |_____| |_____________| |__________| |
|____v___________v______________v______|
    v           v              v
    v           v              v
  RAID1       RAID1          RAID1
    v           v              v
  __v__   ______v______   _____v____
 | HD1 | |     HD2     | |    HD3   |
 |     | |             | |          |
 |     | |             | |          |
 |     | |             | |          |
 |     | |             | |          |
 |     | |             | |          |
 |_____| |_____________| |__________|

"TopRAID2" side

The mirroring at the top level would be set up between the individual
LVs on the TopRAID1 side and regular filesystem partitions (no RAID or
LVM) on the TopRAID2 side. In the event of a massive host failure, the
filesystems on the TopRAID2 side could be easily mounted for data
recovery and/or service resumption on another machine, and the
TopRAID1 disk set rebuilt from scratch and then re-mirrored from the
TopRAID2 disks.
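
A minimal sketch of how one such top-level mirror might be assembled
with mdadm, written as a Python helper that only builds the command
line and does not run anything. The device names (/dev/md10,
/dev/vg0/lv1, /dev/sdc1) are hypothetical. Two of the options are
assumptions rather than part of the plan above: --metadata=1.0 keeps
the md superblock at the end of each member, so the filesystem starts
at offset zero of the plain partition and should be mountable on
another machine without assembling the array; --write-mostly asks md
to prefer the LV side for reads.

```python
def top_mirror_create_cmd(md_dev, lv_dev, plain_part):
    """Build (not run) an mdadm command mirroring one LV to one plain partition."""
    return [
        "mdadm", "--create", md_dev,
        "--level=1",            # RAID1 mirror
        "--raid-devices=2",     # exactly two members: the LV and the partition
        "--metadata=1.0",       # assumption: superblock at the end of the device
        "--bitmap=internal",    # write-intent bitmap, so a re-added member resyncs faster
        lv_dev,                 # TopRAID1 side: the logical volume
        "--write-mostly",       # applies to the devices listed after it
        plain_part,             # TopRAID2 side: the plain partition
    ]


# Hypothetical device names, for illustration only.
cmd = top_mirror_create_cmd("/dev/md10", "/dev/vg0/lv1", "/dev/sdc1")
print(" ".join(cmd))
```

The filesystem would then be created on /dev/md10 rather than on the
LV itself, so that both halves carry identical contents.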

One design goal would be to not allow any LV to get so large that it
won't fit on a single physical disk on the TopRAID2 side. If this is
not possible, then the corresponding TopRAID2 side would need to
comprise a multiple disk set, perhaps striped by RAID0 - not as
straightforward to recover as single disks, but at least without the
LVM layer.
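
The "every LV must fit on one disk" goal can be checked mechanically.
The following is a rough sketch, assuming one LV per TopRAID2 disk as
in the diagram; the sizes and names are made up, and it ignores the
small amount of space md metadata and partition tables would take.

```python
def plan_plain_side(lv_sizes_gb, disk_sizes_gb):
    """Pair each LV with the smallest unused disk that can hold it.

    Returns (plan, unplaced): plan maps LV name -> disk name, and
    unplaced lists LVs for which no remaining single disk is big enough.
    """
    free = sorted(disk_sizes_gb.items(), key=lambda kv: kv[1])
    plan, unplaced = {}, []
    # Place the largest LVs first so they get first pick of the big disks.
    for lv, size in sorted(lv_sizes_gb.items(), key=lambda kv: -kv[1]):
        match = next((d for d in free if d[1] >= size), None)
        if match is None:
            unplaced.append(lv)
        else:
            plan[lv] = match[0]
            free.remove(match)
    return plan, unplaced


# Hypothetical sizes in GB.
lvs = {"lv1": 400, "lv2": 1800, "lv3": 900}
disks = {"hd1": 500, "hd2": 2000, "hd3": 1000}
plan, unplaced = plan_plain_side(lvs, disks)
print(plan, unplaced)
```

Any LV that ends up in the unplaced list is one that would need the
multi-disk (e.g. RAID0) arrangement on the TopRAID2 side, or should be
kept from growing that large in the first place.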

Remember, the main purpose of this arrangement is so that the disks in
the TopRAID2 set can be rotated out for offsite storage. Ideally this
would be done by using an extra identical set (TopRAID2a and
TopRAID2b) to minimize the time windows when the live data is running
on TopRAID1 only.
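
As a sketch of one rotation cycle for a single mirror, the helper
below lists the mdadm manage-mode commands in order: mark the outgoing
TopRAID2 member as failed, remove it, then add the incoming one so the
array rebuilds onto it. It only builds the commands; the device names
are hypothetical, and it assumes the array was fully in sync (check
/proc/mdstat) before the outgoing disk is pulled.

```python
def rotation_cmds(md_dev, outgoing_part, incoming_part):
    """Build (not run) the mdadm commands for swapping one mirror member."""
    return [
        ["mdadm", md_dev, "--fail", outgoing_part],    # mark outgoing member faulty
        ["mdadm", md_dev, "--remove", outgoing_part],  # detach it from the array
        ["mdadm", md_dev, "--add", incoming_part],     # attach replacement; resync starts
    ]


# Hypothetical names: sdc1 is on the "a" disk set, sdd1 on the "b" set.
for cmd in rotation_cmds("/dev/md10", "/dev/sdc1", "/dev/sdd1"):
    print(" ".join(cmd))
```

Between the --remove and the end of the resync the data lives on the
TopRAID1 side only, which is the window the second disk set is meant
to shorten.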

Note that on the TopRAID1 side the LVM layers could be running on top
of another set of RAID disks (call it the BottomRAID), again either
RAID1 or perhaps RAID10 mirroring at the lowest layer. This disk set
could be allowed to grow in both size and complexity, with an
expectation that in the event of massive failure I won't even attempt
to rebuild/recover it, just tear it down and set it up again from
scratch, then mirror the data back from TopRAID2.

At this point this is all idle speculation on my part, and although I
think the technology makes it possible, I don't know whether it is a
practical scheme.

An enhancement of this idea would be to implement the "TopRAID" as a
full-server mirror using DRBD and Heartbeat, perhaps eliminating the
need for intra-server disk mirroring. In this case the TopRAID1 server
would have the flexible disk space allocation of LVM, while the
TopRAID2 server's disks would all be just regular partitions (no LVM),
again, easily swapped out for offsite rotation.

Any feedback on these ideas would be most appreciated.
