[linux-lvm] RAID 1 on Device Mapper

* [linux-lvm] RAID 1 on Device Mapper - best practices?
@ 2003-10-17 15:10 John Stoffel
  2003-10-17 15:29 ` Mike Williams
  0 siblings, 1 reply; 28+ messages in thread
From: John Stoffel @ 2003-10-17 15:10 UTC (permalink / raw)
  To: linux-lvm

Hi All,

I've just recently upgraded my home system to a fresh install of Debian
3.0/Unstable, and moved my data to a pair of 120gb disks.  My goal is
to mirror the disks so I don't have to worry as much about device
failures losing all my data (files and MPs, and junk).  Of course I'll
be doing backups, but this is more the first step.

Anyway, what I'd like to do is set up the disks using Device Mapper so
that I can grow and shrink volumes as needed, along with the filesystems
mounted on top of those volumes.  I now have it working, except that
the RAID volumes on top of the Device Mapper LVs aren't detected
properly on reboot.

But first some background:

    Kernel:		2.4.22 (or .21)
    Device Mapper:      2.4.22-dm-1
    lvdisplay --version
	LVM version:     2.00.06 (2003-08-20)
	Library version: 1.00.05-ioctl-cvs (2003-09-01)
	Driver version:  4.0.1

Nor do I think I'm really doing this in the proper way.  Currently I
have one volume group per disk, left_vg and right_vg.  Each has a pair
of LVs on them, [left,right]_lv_[home,local], for a total of four LVs.

Then I assemble these into MD volumes using mdadm, with each volume
having an LV from each of the VGs.  

Do I really need to use MD here, or can I build a mirrored LV to do
what I want?

Thanks,
John
   John Stoffel - Senior Unix Systems Administrator - Lucent Technologies
	 stoffel@lucent.com - http://www.lucent.com - 978-952-7548

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [linux-lvm] RAID 1 on Device Mapper - best practices?
  2003-10-17 15:10 [linux-lvm] RAID 1 on Device Mapper - best practices? John Stoffel
@ 2003-10-17 15:29 ` Mike Williams
  2003-10-17 15:51   ` John Stoffel
  0 siblings, 1 reply; 28+ messages in thread
From: Mike Williams @ 2003-10-17 15:29 UTC (permalink / raw)
  To: linux-lvm

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Friday 17 October 2003 21:07, John Stoffel wrote:

> Anyway, what I'd like to do is setup the disks using Device Mapper so
> that I can grow and shrink volumes at need, along with the filesystems
> mounted on top of those volumes.  So I now have it working, except
> that reboots don't detect the RAID volumes on top of the Device Mapper
> LVs properly on reboot.

> Nor do I think I'm really doing this in the proper way.  Currently I
> have one volume group per disk, left_vg and right_vg.  Each has a pair
> of LVs on them, [left,right]_lv_[home,local], for a total of four LVs.
>
> Then I assemble these into MD volumes using mdadm, with each volume
> having an LV from each of the VGs.
>
> Do I really need to use MD here, or can I build a mirrored LV to do
> what I want?

Wouldn't it be more sensible to mirror the disks, then stick your LVs on top 
of that? This is what I have done with a software RAID5 array.

At first this didn't work, as the startup scripts Gentoo provide initialise 
LVM before MD, but that was easily changed.

- -- 
Mike Williams
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/kFDaInuLMrk7bIwRAlVKAKCL7QcheuL6uw7vgX9y2+bT6FA/AQCgjAZZ
Koz9o0FWAHHzz6gSfFBBMF0=
=vntA
-----END PGP SIGNATURE-----


* Re: [linux-lvm] RAID 1 on Device Mapper - best practices?
  2003-10-17 15:29 ` Mike Williams
@ 2003-10-17 15:51   ` John Stoffel
  2003-10-17 15:56     ` [linux-lvm] " Måns Rullgård
  2003-10-17 16:13     ` [linux-lvm] " Mike Williams
  0 siblings, 2 replies; 28+ messages in thread
From: John Stoffel @ 2003-10-17 15:51 UTC (permalink / raw)
  To: linux-lvm

Mike> Wouldn't it be more sensible to mirror the disks, then stick
Mike> your LVs on top of that? This is what I have done with a
Mike> software RAID5 array.

Well, it could make more sense that way, but I was trying to get it so
that I could move and expand/shrink the filesystems as needed on the
various volumes.

The other downside of MD underneath LVM is that when an MD RAID
goes bad, I need to resync the entire disk.  In my setup, if one volume
isn't mounted (or in use) and it gets corrupted, only that volume has
to be re-mirrored, so there's less data to resync.

At least that was my goal.  

Mike> At first this didn't work, as the startup scripts Gentoo provide
Mike> initialise LVM before MD, but that was easily changed.

Yeah, I'm thinking I need to just tweak the scripts under Debian as
well, but I wanted to find out what people found to be the best way to
do this.
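
As a rough sketch of the ordering issue discussed here (not any
distribution's actual init script): with MD mirrors built from LVs, the
volume groups have to be activated before the arrays can be assembled;
with LVM on top of MD the order is reversed. The function name
storage_start_order and the run helper are hypothetical, the script only
prints the commands instead of running them, and it assumes mdadm's
config file lists the relevant member devices.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print each command instead of executing it.
run() { echo "+ $*"; }

# Print the activation order for a given stacking.
#   md-on-lvm : MD arrays are built from LVs
#   lvm-on-md : VGs are built from MD arrays
storage_start_order() {
    case "$1" in
        md-on-lvm)
            run vgscan
            run vgchange -ay               # LVs must exist first
            run mdadm --assemble --scan    # then MD can find its members
            ;;
        lvm-on-md)
            run mdadm --assemble --scan    # arrays first
            run vgscan
            run vgchange -ay               # then the VGs on top of them
            ;;
        *)
            echo "usage: storage_start_order md-on-lvm|lvm-on-md" >&2
            return 1
            ;;
    esac
}

storage_start_order md-on-lvm
```

Dropping the run prefix would execute the commands for real, which
should only be done after checking them against the local setup.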

John


* [linux-lvm] Re: RAID 1 on Device Mapper - best practices?
  2003-10-17 15:51   ` John Stoffel
@ 2003-10-17 15:56     ` Måns Rullgård
  2003-10-17 16:13     ` [linux-lvm] " Mike Williams
  1 sibling, 0 replies; 28+ messages in thread
From: Måns Rullgård @ 2003-10-17 15:56 UTC (permalink / raw)
  To: linux-lvm

"John Stoffel" <stoffel@lucent.com> writes:

> Mike> At first this didn't work, as the startup scripts Gentoo provide
> Mike> initialise LVM before MD, but that was easily changed.
>
> Yeah, I'm thinking I need to just tweak the scripts under Debian as
> well, but I wanted to find out what people found to be the best way to
> do this.

I've got mine set up so the kernel detects the arrays by itself.  It
can't get simpler.

-- 
Måns Rullgård
mru@users.sf.net


* Re: [linux-lvm] RAID 1 on Device Mapper - best practices?
  2003-10-17 15:51   ` John Stoffel
  2003-10-17 15:56     ` [linux-lvm] " Måns Rullgård
@ 2003-10-17 16:13     ` Mike Williams
  2003-10-22  8:02       ` wopp
  1 sibling, 1 reply; 28+ messages in thread
From: Mike Williams @ 2003-10-17 16:13 UTC (permalink / raw)
  To: linux-lvm

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Friday 17 October 2003 21:48, John Stoffel wrote:
> Well, it could make more sense that way, but I was trying to get it so
> that I could move and expand/shrink the filesystems as needed on the
> various volumes.

Ahh, but that's the beauty of RAID and LVM, what you end up with is just 
another block device. Whichever way you do it you'll get the same benefit.

> The other downside of the MD underneath LVM is that when a MD RAID
> goes bad, I need to resync the entire disk.  In my setup, if I don't
> have it mounted (or being used) and I corrupt it, there's less data to
> have to re-mirror.
>
> At least that was my goal.

That's certainly a very good idea.

> Yeah, I'm thinking I need to just tweak the scripts under Debian as
> well, but I wanted to find out what people found to be the best way to
> do this.

From what I've read, LVM is only capable of linear or striped mappings,
so RAID seems to be the only option.

- -- 
Mike Williams
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/kFsmInuLMrk7bIwRAp2WAKCO+zdHrAIYEQm/2ov+crkhCgcLxQCfe81i
0s7o0RZ1gCTf2I4deV4kRNk=
=NwaX
-----END PGP SIGNATURE-----


* Re: [linux-lvm] RAID 1 on Device Mapper - best practices?
  2003-10-17 16:13     ` [linux-lvm] " Mike Williams
@ 2003-10-22  8:02       ` wopp
  2003-10-23 17:52         ` [linux-lvm] Drive gone bad, now what? Gert van der Knokke
                           ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: wopp @ 2003-10-22  8:02 UTC (permalink / raw)
  To: linux-lvm


Hi all,

Mike Williams wrote on 17.10.2003 at 22:12:06 [Re: [linux-lvm] RAID 1 on Device Mapper - best practices?]:
> On Friday 17 October 2003 21:48, John Stoffel wrote: 
> > Well, it could make more sense that way, but I was trying to get it so
> > that I could move and expand/shrink the filesystems as needed on the
> > various volumes.
> 
> Ahh, but that's the beauty of RAID and LVM, what you end up with is just
> another block device. Whichever way you do it you'll get the same benefit.

I believe that is not true. How do you resize a RAID device? The only
option I can think of is to re-create it, which is clearly beside the 
point. With LVM on top of RAID, you can lvextend (or lvreduce), pvmove
and so on - where's the problem?

To put it differently, LVM devices are not just "another block device",
they're resizeable block devices. You get this benefit at the level you use
LVM on. Below RAID it's not really worth much (which is probably why Debian
starts RAID first).
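
To illustrate the resizing that LVM on top of RAID allows, here is a
minimal dry-run sketch. It assumes an ext3 filesystem resized with
resize2fs, and that the filesystem is unmounted before shrinking; the
names vg0 and home and the run helper are hypothetical. The script only
prints the commands.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print each command instead of executing it.
run() { echo "+ $*"; }

# Grow: extend the LV first, then let the filesystem fill it.
grow_lv() {    # usage: grow_lv <vg> <lv> <extra-size, e.g. 5G>
    run lvextend -L "+$3" "/dev/$1/$2"
    run resize2fs "/dev/$1/$2"
}

# Shrink: check and shrink the filesystem first, then the LV, so the LV
# never becomes smaller than the filesystem it holds.
shrink_lv() {  # usage: shrink_lv <vg> <lv> <new-size, e.g. 10G>
    run e2fsck -f "/dev/$1/$2"
    run resize2fs "/dev/$1/$2" "$3"
    run lvreduce -L "$3" "/dev/$1/$2"
}

grow_lv vg0 home 5G
shrink_lv vg0 home 10G
```

The order is the point of the sketch: growing goes bottom-up (LV, then
filesystem), shrinking goes top-down (filesystem, then LV).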

> > The other downside of the MD underneath LVM is that when a MD RAID
> > goes bad, I need to resync the entire disk.

I've thought about that too. At the moment, I'm experimenting with several
disk partitions on each physical disk and RAID-1 devices made up of one
partition from each disk. Each MD is a PV in my VG. This way, if one
partition fails (i.e. its array runs in degraded mode), the others will
still be mirrored. Maybe some ASCII art can make this a bit clearer:

     +------+   +------+
     | hda1 | + | hdb1 |  -> md0 \
     +------+   +------+          +- VG0
     | hda2 | + | hdb2 |  -> md1 /
     +------+   +------+

[Yes, I'd prefer sda/sdb too ;-]
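
The layout in the diagram could be built roughly as follows. This is a
dry-run sketch that only prints the commands; it assumes mdadm is used
to create the arrays (the message does not say which tool was used), and
the run helper and build_layout function are hypothetical names.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print each command instead of executing it.
run() { echo "+ $*"; }

build_layout() {
    # One RAID-1 array per partition pair, one partition from each disk.
    run mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hda1 /dev/hdb1
    run mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/hda2 /dev/hdb2
    # Each array becomes a PV, and both PVs go into a single VG.
    run pvcreate /dev/md0 /dev/md1
    run vgcreate VG0 /dev/md0 /dev/md1
}

build_layout
```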

I'm not totally happy with this setup, because it's partly pointless :).
First of all, I'd like to point out that redundancy-providing RAID is,
PRIMARILY, a means of minimizing downtime, NOT a means of preventing data
loss. RAID reacts to disk failures (which affect the whole disk) and to
read/write errors. In these cases, normal operation proceeds without
interruption. If your data goes bad on disk but does not trigger a read
error, RAID doesn't care, i.e. it does not compare the contents of your
mirrors. Backups are what protect against data loss and data corruption.

Of course, there's also the case of a partial failure (surface damage or
something like that) - I've just recently experienced it myself. In my case,
it hit a non-mirrored LVM PV, resulting in one or more filesystems being
remounted read-only, which was a pain ...
I've replaced the faulty disk with a new one, and now everything is
mirrored as described above (so "next time", there will hopefully be no
service interruption due to an FS which is unexpectedly read-only).
So? I'd have had to resync the whole disk in any case.

My conclusion is: Either you're only "playing around" with RAID, in which
case you should probably do whatever is most fun or gives you the most
learning experience, or you're serious about it, in which case you'll
immediately replace any disk showing errors anyway.

Just for the sake of contradicting myself, I'd like to add one thing which
is not stressed often enough here for my taste :).

People, if you're spreading out file systems over several physical disks
without providing some sort of redundancy, you're asking for trouble.
You're increasing the points of failure, making it much more probable for
a hardware error on any one of the disks to take all your data with it.
This is the reason RAID level 1+0 and RAID level 5 (yes, and 4 ...) were
invented shortly after RAID 0. Learn from other people's mistakes :).

Redundancy does HELP to keep your data safe :).

I'm sure more people have thought about these topics. What are your
conclusions?

Regards,
Holger


* [linux-lvm] Drive gone bad, now what?
  2003-10-22  8:02       ` wopp
@ 2003-10-23 17:52         ` Gert van der Knokke
  2003-10-23 18:59           ` John Stoffel
  2003-10-23 18:53         ` [linux-lvm] RAID 1 on Device Mapper - best practices? John Stoffel
  2003-11-20  7:20         ` Gregory K. Ruiz-Ade
  2 siblings, 1 reply; 28+ messages in thread
From: Gert van der Knokke @ 2003-10-23 17:52 UTC (permalink / raw)
  To: linux-lvm

Well, LVM 'lured' me too into a false sense of security and I think 
there should be a warning label on it :-)
I know it's my own fault but still...

The problem:

We've set up a simple server machine with a bunch of hard disks of 60 and 
80 GB.
With 6 drives and an lvm(1) setup it provided us with a nice amount of 
storage space. Of course there was always the risk of a drive going bad, 
but I had thought that lvm would be robust enough to cope with that sort 
of thing (no, I didn't expect redundancy or something like that, just that 
I would be able to access data on the surviving disks).

Alas, a drive went bad (really bad, beyond repair, so no chance of 
getting any data from it).
OK, time for plan B: how do I access the data on this 'limp' lvm system?
Googling and reading the FAQs, there were 3 options:

1. replacing the disk with a fresh one (still the data would be gone, as 
   well as the lvm volume data stuff)
2. installing LVM2 and accessing the lvm in 'partial mode'
3. quick hacks to access the lvm, with the risk of hanging when data on the 
   missing drive is accessed.

Option 1:
tried with an extra drive to no avail, how does one get the metadata 
stuff back on such a drive ? I assume there is some kind of 'numbering' 
scheme internally in lvm so it knows which drive is mapped where.

Option 2:
Installed LVM2 tools. Still running with the old kernel, vgscan/pvscan 
says that there is an lvm consisting of 6 devices with one 
missing. vgchange says (naturally) that it needs the device 
mapper stuff in the kernel.
Compiled a new kernel with devicemapper (1.03 and 1.05 tested) and then 
pvscan says something about data being inconsistent on devices hda and 
hdc and such. vgscan finds some vague lvm stuff but at the end it says 
found 1 volume expected 0.. vgchange -ay -P exits with a segmentation 
fault (whereas it runs without segfaulting on the kernel without the 
devicemapper).

Option 3: lvm can't be activated because it needs 6 and finds 5 devices...

Raah.. :-)

For now we let the system rest until lvm2 matures and maybe the tools 
will be there to rescue this set of disks. The data on the drives is 
about 300 GB worth of music; part of it is still on CD-ROM 
backup, but much of the music was added later and must be 
restored/re-ripped from the original audio CDs.

On the lvm is ReiserFS as filesystem. With the missing drive and maybe 
partially reactivating the lvm, what is ReiserFS going to do  after 
mounting it ?

So for our new server system:

What is the best way to make a 'reliable' lvm system ?
Is mirroring the most viable option or is raid 5 also usable, keeping in 
mind the number of drives you can normally connect to a PC motherboard 
(some boards, ours too, have an on board ide-raid controller which we 
used as a simple ide extension since the bios onboard was only the 
'lite' version and handled 2 drives in raid config only)
On our system the OS was installed on a small 2 GB SCSI drive and 6 IDE 
drives were used for 'massive amounts of storage' with still two IDE 
places available.
LVM seemed an easy way to expand when needed..

If we used mirroring the total number of effective drives will be 8/2 
and the drives would have to be the same in pairs.
Upgrading the lvm would mean that 1 IDE port must be free to hook up a 
new (larger) set of drives, pvmove the data from the old (smaller) pair 
of drives we wish to replace to the new set and removing the smaller set 
out of the lvm.
But how about raid 5 ?

With raid 5 it is possible to hook up, say, 7 drives with 1 spare. But then 
the upgrade path is almost impossible since all the drives have to be 
the same size for raid 5 to work...
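
The capacity side of this tradeoff can be worked out with a little
shell arithmetic. The sketch below assumes 8 drive slots filled with
equal 80 GB drives (the drive size is an assumption taken from the
"60 and 80" mentioned earlier); the function names are hypothetical.

```shell
#!/bin/sh
# Usable capacity in GB, assuming all member drives have the same size.

# Mirrored pairs: half the drives hold copies.
raid1_pairs_usable() {  # usage: raid1_pairs_usable <drives> <size_gb>
    echo $(( $1 / 2 * $2 ))
}

# RAID 5: one drive's worth of space goes to parity.
raid5_usable() {        # usage: raid5_usable <active_drives> <size_gb>
    echo $(( ($1 - 1) * $2 ))
}

raid1_pairs_usable 8 80   # 8 drives as 4 mirrored pairs
raid5_usable 7 80         # 7 active drives, the 8th kept as a hot spare
```

With these assumed numbers the mirrored setup yields 320 GB and the
RAID 5 setup 480 GB, which is the gap being weighed against the harder
upgrade path.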

Can anyone shed some light on this ?

Gert van der Knokke


* Re: [linux-lvm] RAID 1 on Device Mapper - best practices?
  2003-10-22  8:02       ` wopp
  2003-10-23 17:52         ` [linux-lvm] Drive gone bad, now what? Gert van der Knokke
@ 2003-10-23 18:53         ` John Stoffel
  2003-11-20  7:20         ` Gregory K. Ruiz-Ade
  2 siblings, 0 replies; 28+ messages in thread
From: John Stoffel @ 2003-10-23 18:53 UTC (permalink / raw)
  To: linux-lvm

wopp> I believe that is not true. How do you resize a RAID device? The
wopp> only option I can think of is to re-create it, which is clearly
wopp> beside the point. With LVM on top of RAID, you can lvextend (or
wopp> lvreduce), pvmove and so on - where's the problem?

Well, the problem as I see it is that it really turns the model for
device/block management on its head.

wopp> To put it differently, LVM devices are not just "another block
wopp> device", they're resizeable block devices. You get this benefit
wopp> at the level you use LVM on. Below RAID it's not really worth
wopp> much (which is probably why Debian starts RAID first).

I don't think this is really right, but if that's what we have, that's
what I have to deal with. 

Basically, I'm very used to the Veritas Volume Manager (VxVM) and
other mature LVM offerings from other Unix vendors.  In those setups,
you have the low level disk(s).  On top of them you create logical
disks (or sub-disks) which are then strung together at the next higher
layer into sub-volumes (or plexes).  At this point you have a lot of
options.  You can mirror sub-volumes, or you can build them into a
RAID0 stripe set, or even RAID0+1 (or the more flexible and resilient
RAID 1+0).  Then on top of that you have your actual volumes, which
provide the block devices to build the file systems on.

In my case, I setup some PVs (a pair of disks), then made a pair of
VGs, then a pair of LVs per VG, and then I used those LVs to create a
pair of MD devices, upon which I put my ext3 filesystems.

Now I think I need to go back and invert the model, where instead I
take the two disks, mirror them, then build my VGs, LVs and
filesystems up from there.  Which is mostly a pain, and mostly not how
I think it should be done.  

Time for more research into EVMS and DM and how they can work together
under the 2.4.22+ and 2.6.0-test8+ kernels.  

I'd love to see more of a discussion on this.  I've read the EVMS web
site, but it's poorly written and doesn't do a good job of explaining
the basics and how they layer together, which is a shame since it
looks like a fairly flexible model to manage block devices.

Really, all most people want is a way to grow/shrink their file
systems, and spread them across multiple physical disks in various
flavors of RAID 0, RAID 1 and RAID 5.  I'll ignore RAID 3 & 4,
since they are just variations on a theme.

John
   John Stoffel - Senior Unix Systems Administrator - Lucent Technologies
	 stoffel@lucent.com - http://www.lucent.com - 978-952-7548


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-23 17:52         ` [linux-lvm] Drive gone bad, now what? Gert van der Knokke
@ 2003-10-23 18:59           ` John Stoffel
  2003-10-24  0:22             ` Rickard Olsson
  0 siblings, 1 reply; 28+ messages in thread
From: John Stoffel @ 2003-10-23 18:59 UTC (permalink / raw)
  To: linux-lvm

Gert> We've setup a simple server machine with a bunch of harddisks of
Gert> 60 and 80 Gb.  With 6 drives and lvm(1) setup it provided us
Gert> with a nice amount of storage space, of course there was always
Gert> the risk of a drive going bad but I had thought that lvm would
Gert> be robust enough to cope with that sort of thing (no I didn't
Gert> expect redundancy or something like that, just I would be able
Gert> to access data on the surviving disks)

Wait a second, let me try to understand this.  Did you just
concatenate or stripe the data across all the drives?  Did you use
RAID5 in this setup, or RAID1?  Or was it just a 6 x 60gb = 360gb
volume without any redundancy?  You need to be precise in specifying
what you had here.

Gert> Alas a drive went bad (really bad, beyond repair so no chance
Gert> of getting any data from it).  Ok time for plan B how do I
Gert> access the data on this 'limp' lvm system.  Googling and reading
Gert> the FAQ's there were 3 options:

If you had just a simple concatenation of all the disks, then you are
toast.  How do you expect LVM to restore the missing 60gb if there's
no parity information or mirrored blocks?  It's impossible!

Gert> For now we let the system rest until lvm2 matures and maybe the
Gert> tools will be there to rescue this set of disks, the data on the
Gert> drives is about 300 Gb worth of music and part of the data is
Gert> still on cdrom backup but much of the music was added later and
Gert> must be restored/re-ripped from the original audio CD's..

This leads me to believe that you just concatenated the disks into one
big volume, without using RAID5 or RAID 1, correct?

Gert> What is the best way to make a 'reliable' lvm system ?  Is
Gert> mirroring the most viable option or is raid 5 also usable,
Gert> keeping in mind the number of drives you can normally connect to
Gert> a PC motherboard (some boards, ours too, have an on board
Gert> ide-raid controller which we used as a simple ide extension
Gert> since the bios onboard was only the 'lite' version and handled 2
Gert> drives in raid config only) On our system the OS was installed
Gert> on a small 2Gb SCSI drive and 6 IDE drives were used for
Gert> 'massive amounts of storage' with still two IDE places
Gert> available.  LVM seemed an easy way to expand when needed..

Gert> If we used mirroring the total number of effective drives will
Gert> be 8/2 and the drives would have to be the same in pairs.
Gert> Upgrading the lvm would mean that 1 IDE port must be free to
Gert> hook up a new (larger) set of drives, pvmove the data from the
Gert> old (smaller) pair of drives we wish to replace to the new set
Gert> and removing the smaller set out of the lvm.  But how about raid
Gert> 5 ?

Gert> With raid 5 it is possible to hook up say 7 drives with 1 spare
Gert> But then the upgrade path is almost impossible since all the
Gert> drives have to be the same size for raid 5 to work...

You've basically hit upon the basic tradeoffs here, though you're
missing a performance issue, in that you should really try to keep
just one drive per IDE channel if at all possible from a performance
point of view.  

John
   John Stoffel - Senior Unix Systems Administrator - Lucent Technologies
	 stoffel@lucent.com - http://www.lucent.com - 978-952-7548


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-23 18:59           ` John Stoffel
@ 2003-10-24  0:22             ` Rickard Olsson
  2003-10-24 15:23               ` Gert van der Knokke
  0 siblings, 1 reply; 28+ messages in thread
From: Rickard Olsson @ 2003-10-24  0:22 UTC (permalink / raw)
  To: linux-lvm

John Stoffel wrote:

Gert> We've setup a simple server machine with a bunch of harddisks of
Gert> 60 and 80 Gb.

John> Did you just concatenate or stripe the data across all the drives?

Butting in, I would assume (dangerous, I know) concatenation and 
non-striped since that is the default LVM mode when creating LVs. This 
is the exact same setup I have, BTW.

John> If you had just a simple concatenation of all the disks, then you are
John> toast.  How do you expect LVM to restore the missing 60gb if there's
John> no parity information or mirrored blocks?  It's impossible!

Yes, but he can restore the LV (sans the missing 60 gigs, of course) and 
access the rest of the archive. I believe that's what he's after.

Gert> For now we let the system rest until lvm2 matures and maybe the
Gert> tools will be there to rescue this set of disks

The tools are already here. I did the same thing a while back. But it 
ain't always easy. :-)

Go back to LVM1. Then, find another disk with the _exact same size_ as 
the deceased disk. Plug it in and pvcreate it (unless the old one had an 
LVM partition, in which case you fdisk and create an LVM partition on the 
new one):
# pvcreate -ff /dev/ide/host0/bus1/target0/lun0/disc

Restore your metadata to the new, empty disk, so LVM can restore the LV:
# vgcfgrestore -n YourVGName /dev/ide/host0/bus1/target0/lun0/disc
# vgscan
# vgchange -ay
# reiserfsck --rebuild-tree /dev/YourVGName/YourLVName
# mount /dev/YourVGName/YourLVName

There are a number of pitfalls along the way, but this is the basic 
layout. Not finding a disk of the same size is probably #1, but there is a 
way around it: if you can't find one, pvcreate a larger disk with -s. Use 
vgcfgrestore -ll to list the VG metadata stored in the backup, including 
the exact size of the dead PV. Heinz was kind enough to walk me through 
this when I had problems so now I feel like an expert. ;-)

Gert> LVM seemed an easy way to expand when needed..

It is. It's also the primary reason I use it instead of RAID. However, 
if there's money behind the archive (in my case, there isn't) you can go 
for a hardware RAID solution that offers the ability to grow the RAID. 
You will still need a bunch of same-size disks, but I could live with 
that, maybe you could too.

You can also combine RAID and LVM in various ways in an attempt to 
minimize the need for spare disks and maximize the amount of usable 
space. Perhaps build one RAID-5 array from disks of one size and use LVM 
to concatenate it with another RAID-5 set of differently-sized disks. You 
can use NFS or Coda to glue two or more file servers together over the 
network if you run out of physical space in one of the machines.

John> You've basically hit upon the basic tradeoffs here, though you're
John> missing a performance issue, in that you should really try to keep
John> just one drive per IDE channel if at all possible from a performance
John> point of view.

If he's doing the same trade-offs I am, he values size over performance 
(which is 'good enough' for many uses even with shared IDE channels).

    / Rickard Olsson,IT-Konsult/
   / Telefon: +46 70 635 01 42/
  / http://www.webhackande.se/


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-24  0:22             ` Rickard Olsson
@ 2003-10-24 15:23               ` Gert van der Knokke
  2003-10-27  8:59                 ` John Stoffel
  2003-10-28  8:40                 ` Mark H. Wood
  0 siblings, 2 replies; 28+ messages in thread
From: Gert van der Knokke @ 2003-10-24 15:23 UTC (permalink / raw)
  To: linux-lvm

Rickard Olsson wrote:

>
> John> Did you just concatenate or stripe the data across all the drives?
>
>
> Butting in, I would assume (dangerous, I know) concatenation and 
> non-striped since that is the default LVM mode when creating LVs. This 
> is the exact same setup I have, BTW.

Yes, it was just concatenation, no striping, since lvm is 'expandable as 
needed' we opted for this.

>
> John> If you had just a simple concatenation of all the disks, then 
> you are
> John> toast.  How do you expect LVM to restore the missing 60gb if 
> there's
> John> no parity information or mirrored blocks?  It's impossible! 

I didn't expect lvm to restore the missing data, I guessed it would just 
let me access the rest of the data.

> Yes, but he can restore the LV (sans the missing 60 gigs, of course) 
> and access the rest of the archive. I believe that's what he's after. 

Yes

> The tools are already here. I did the same thing a while back. But it 
> ain't always easy. :-)
>
> Go back to LVM1. Then, find another disk with the _exact same size_ as 
> the deceased disk. Plug it in and pvcreate it (unless the old one had 
> a LVM partition, in which case you fdisk and create a LVM partition on 
> the new one):
> # pvcreate -ff /dev/ide/host0/bus1/target0/lun0/disc
>
> Restore your metadata to the new, empty disk, so LVM can restore the LV:
> # vgcfgrestore -n YourVGName /dev/ide/host0/bus1/target0/lun0/disc
> # vgscan
> # vgchange -ay
> # reiserfsck --rebuild-tree /dev/YourVGName/YourLVName
> # mount /dev/YourVGName/YourLVName
>
> There are a number of pitfalls along the way (not finding a disk of 
> the same size is probably #1, but there is a way. If you can't find 
> one, pvcreate a larger disk with -s. Use vgcfgrestore -ll to list the 
> VG metadata stored in the backup, including the exact size of the dead 
> PV.) but this is the basic layout. Heinz was kind enough to walk me 
> through this when I had problems so now I feel like an expert. ;-) 

Hmm this gives some handles to try again.

> It is. It's also the primary reason I use it instead of RAID. However, 
> if there's money behind the archive (in my case, there isn't) you can 
> go for a hardware RAID solution that offers the ability to grow the 
> RAID. You will still need a bunch of same-size disks, but I could live 
> with that, maybe you could too. 

It's not so much money in this server, it's more a question of time and 
effort... :-)

> You can also combine RAID and LVM in various ways in an attempt to 
> minimize the need for spare disks and maximize the size of useable 
> space. Perhaps one RAID-5 array of one size disks and use LVM to 
> concatenate it with another RAID-5 set of differently-sized disks. You 
> can use NFS or Coda to glue two or more file servers together over the 
> network if you run out of physical space in one of the machines. 

Expandability is the main objective and now also reliability ;-)
I guess mirroring is the best option for expanding as IDE drives tend to 
grow fast these days. So for every upgrade we just add two new 
larger drives, move the data and remove the two smallest ones, expanding 
the lvm effectively by the size of 1 new drive minus the size of the 
smallest one.
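
A dry-run sketch of that upgrade step, assuming each mirrored pair is an
MD RAID-1 array used as a PV and that the new pair has already been
assembled as an array. The device names, the VG name vg0, the 60 GB and
120 GB sizes, and the helper names are all hypothetical; the script only
prints the commands.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print each command instead of executing it.
run() { echo "+ $*"; }

swap_mirror_pair() {  # usage: swap_mirror_pair <vg> <old_md> <new_md>
    run pvcreate "$3"
    run vgextend "$1" "$3"     # add the new pair to the VG
    run pvmove "$2" "$3"       # migrate all extents off the old pair
    run vgreduce "$1" "$2"     # drop the old pair from the VG
}

# Net growth in GB when a mirrored pair of <old_gb> drives is replaced
# by a pair of <new_gb> drives: one new drive minus one old drive.
net_gain_gb() {       # usage: net_gain_gb <old_gb> <new_gb>
    echo $(( $2 - $1 ))
}

swap_mirror_pair vg0 /dev/md0 /dev/md3
net_gain_gb 60 120
```

Note that the free IDE ports are needed for the duration of the pvmove,
since both the old and the new pair must be attached at the same time.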

> John> You've basically hit upon the basic tradeoffs here, though you're
> John> missing a performance issue, in that you should really try to keep
> John> just one drive per IDE channel if at all possible from a 
> performance
> John> point of view. 

Since this server machine is connected over a 100 Mbit network, performance 
is not an issue; maybe in time, when Gigabit switches are more affordable, 
this will become a problem. Nevertheless, the Linux server already 
performed far better than the Windows 98/XP system that was used before.

> If he's doing the same trade-offs I am, he values size over 
> performance (which is 'good enough' for many uses even with shared IDE 
> channels).

Gert


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-24 15:23               ` Gert van der Knokke
@ 2003-10-27  8:59                 ` John Stoffel
  2003-10-27 18:28                   ` Gert van der Knokke
  2003-10-28  8:40                 ` Mark H. Wood
  1 sibling, 1 reply; 28+ messages in thread
From: John Stoffel @ 2003-10-27  8:59 UTC (permalink / raw)
  To: linux-lvm

Gert> Yes, it was just concatenation, no striping, since lvm is
Gert> 'expandable as needed' we opted for this.

Ok.

Gert> I didn't expect lvm to restore the missing data, I guessed it
Gert> would just let me access the rest of the data.

At this point, you have to think, how can my filesystem cope with the
loss of a 60gb chunk of data in the middle (start or end even) of the
300+ gb of data?  There's all sorts of meta-data and true data which
is now gone, and re-building the filesystem into a consistent state is
really impossible.  

Sure, if you spend heroic amounts of time, you might be able to pull back
lots of data from individual sections where it's still around, but in
general, it's not going to happen.

If you are looking for a large/cheap/reliable bunch of storage,
instead of mirroring, you might want to think about RAID5 instead.  In
your case, you had a mix of disks, so what you could do is build a
pair of RAID5 arrays using disks of the same size for each array
(minimum of three disks each of course) and then stripe the filesystem
across both arrays.  

To add more storage, you need to work on chunks of three disks, but
since 120gb disks are going for around $100 these days, it's not that
expensive.

Good luck! 

John


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-27  8:59                 ` John Stoffel
@ 2003-10-27 18:28                   ` Gert van der Knokke
  2003-10-28  2:20                     ` Patrick Caulfield
  2003-10-28  8:57                     ` Mark H. Wood
  0 siblings, 2 replies; 28+ messages in thread
From: Gert van der Knokke @ 2003-10-27 18:28 UTC (permalink / raw)
  To: linux-lvm

John Stoffel wrote:

>Gert> I didn't expect lvm to restore the missing data, I guessed it
>Gert> would just let me access the rest of the data.
>
>At this point, you have to think, how can my filesystem cope with the
>loss of a 60gb chunk of data in the middle (start or end even) of the
>300+ gb of data?  There's all sorts of meta-data and true data which
>is now gone, and re-building the filesystem into a consistent state is
>really impossible.  
>
Hmm, and so I think LVM still needs a warning label :-)

I wonder why LVM doesn't work the other way around:
Create filesystems on several disks and then concatenate these to the 
outside as one large filesystem. This way if one drive goes bad you can 
always individually mount the drives and use the data.

>If you are looking for a large/cheap/reliable bunch of storage,
>instead of mirroring, you might want to think about RAID5 instead.
>
No, what we're looking for is an 'expandable as needed' filesystem and 
this is what LVM pretends to be.
Our server acts as a NAS and when it gets full you add more drives or 
exchange them for (a set of) larger ones.
To the user it still is the same network share, just bigger.

>your case, you had a mix of disks, so what you could do is build a
>pair of RAID5 arrays using disks of the same size for each array
>(minimum of three disks each of course) and then stripe the filesystem
>across both arrays.  
>
>To add more storage, you need to work on chunks of three disks, but
>since 120gb disks are going for around $100 these days, it's not that
>expensive.
>
We will look into raid5, but considering the hardware limitations 
(number of onboard ports and such), for step by step upgrades the 2 disk 
mirror option is best I think.

Gert


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-27 18:28                   ` Gert van der Knokke
@ 2003-10-28  2:20                     ` Patrick Caulfield
  2003-10-28 13:52                       ` Gert van der Knokke
  2003-10-28  8:57                     ` Mark H. Wood
  1 sibling, 1 reply; 28+ messages in thread
From: Patrick Caulfield @ 2003-10-28  2:20 UTC (permalink / raw)
  To: linux-lvm

On Tue, Oct 28, 2003 at 01:27:04AM +0100, Gert van der Knokke wrote:
> John Stoffel wrote:
> 
> >Gert> I didn't expect lvm to restore the missing data, I guessed it
> >Gert> would just let me access the rest of the data.
> >
> >At this point, you have to think, how can my filesystem cope with the
> >loss of a 60gb chunk of data in the middle (start or end even) of the
> >300+ gb of data?  There's all sorts of meta-data and true data which
> >is now gone, and re-building the filesystem into a consistent state is
> >really impossible.  
> >
> Hmm, and so I think LVM still needs a warning label :-)
> 
> I wonder why LVM doesn't work the other way around:
> Create filesystems on several disks and then concatenate these to the 
> outside as one large filesystem. This way if one drive goes bad you can 
> always individually mount the drives and use the data.
> 
> >If you are looking for a large/cheap/reliable bunch of storage,
> >instead of mirroring, you might want to think about RAID5 instead.
> >
> No, what we're looking for is an 'expandable as needed' filesystem and 
> this is what LVM pretends to be.

No. LVM does in no way "pretend to be a file system". It's an expandable block
device. What the filesystem does with that block device is up to it. 

If a disk fails and you're not using RAID then you restore from backups.

-- 

patrick


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-24 15:23               ` Gert van der Knokke
  2003-10-27  8:59                 ` John Stoffel
@ 2003-10-28  8:40                 ` Mark H. Wood
  1 sibling, 0 replies; 28+ messages in thread
From: Mark H. Wood @ 2003-10-28  8:40 UTC (permalink / raw)
  To: linux-lvm

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, 24 Oct 2003, Gert van der Knokke wrote:
> Rickard Olsson wrote:
> > John> If you had just a simple concatenation of all the disks, then
> > you are
> > John> toast.  How do you expect LVM to restore the missing 60gb if
> > there's
> > John> no parity information or mirrored blocks?  It's impossible!
>
> I didn't expect lvm to restore the missing data, I guessed it would just
> let me access the rest of the data.

To elaborate a bit, there are three cases to consider:

o  An LV with no extents stored on the failed drive.  This LV is intact
   and LVM should be able to provide it as if nothing had happened,
   because nothing *has* happened to this LV.

o  An LV all of whose extents were stored on the failed drive.  This LV's
   content is entirely lost and may only be recovered from other media.

o  An LV, some but not all of whose extents were stored on the failed
   drive.  Once the lost storage has been replaced, LVM *could* present
   this LV, which would contain a damaged filesystem.  'fsck' might be
   able to repair the filesystem enough to recover files which were stored
   in the undamaged extents, or a filesystem debugger might be available
   to facilitate manual recovery.  Some of the content is lost and may
   only be recovered from other media, but other content is undamaged.

   I don't yet know LVM well enough to say whether it *does* handle this
   case, but it can in theory and I would expect it to be written to do
   so.  Others' comments suggest that this is so.
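
The three cases can be sketched in a few lines of Python.  This is only an
illustration under stated assumptions: the extent map is a hypothetical
dictionary (on a real system it would have to come from the LVM metadata),
and the LV and PV names are made up.

```python
def classify_lvs(lv_extents, failed_pv):
    """Sort LVs into the three cases described above.

    lv_extents maps an LV name to the list of PV names holding its extents,
    one entry per extent.  Both the mapping and the names are hypothetical.
    """
    result = {}
    for lv, pvs in lv_extents.items():
        on_failed = sum(1 for pv in pvs if pv == failed_pv)
        if on_failed == 0:
            result[lv] = "intact"      # case 1: nothing on the failed drive
        elif on_failed == len(pvs):
            result[lv] = "lost"        # case 2: everything on the failed drive
        else:
            result[lv] = "partial"     # case 3: damaged, maybe salvageable
    return result


if __name__ == "__main__":
    layout = {
        "lv_a": ["hda1", "hda1"],
        "lv_b": ["hdb1"],
        "lv_c": ["hda1", "hdb1", "hdc1"],
    }
    print(classify_lvs(layout, "hdb1"))
```

Only the "partial" LVs need fsck or manual recovery; the "intact" ones
should be usable as they are.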

BTW I've used various logical-volume schemes for years on top of hardware
RAID.  It's a combination that works well.  I don't understand why some
people want to put (soft) RAID on top of LVM rather than underneath it.

- -- 
Mark H. Wood, Lead System Programmer   mwood@IUPUI.Edu
MS Windows *is* user-friendly, but only for certain values of "user".
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: pgpenvelope 2.10.2 - http://pgpenvelope.sourceforge.net/

iD8DBQE/nn9/s/NR4JuTKG8RArSyAJ93Bdog6U2fNrbpF/dAJ6BZOVlMyQCeKhmU
fEB18by5nkGWlKt/Ge35olE=
=8ETo
-----END PGP SIGNATURE-----


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-27 18:28                   ` Gert van der Knokke
  2003-10-28  2:20                     ` Patrick Caulfield
@ 2003-10-28  8:57                     ` Mark H. Wood
  1 sibling, 0 replies; 28+ messages in thread
From: Mark H. Wood @ 2003-10-28  8:57 UTC (permalink / raw)
  To: linux-lvm

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 28 Oct 2003, Gert van der Knokke wrote:
> John Stoffel wrote:
> >Gert> I didn't expect lvm to restore the missing data, I guessed it
> >Gert> would just let me access the rest of the data.
> >
> >At this point, you have to think, how can my filesystem cope with the
> >loss of a 60gb chunk of data in the middle (start or end even) of the
> >300+ gb of data?  There's all sorts of meta-data and true data which
> >is now gone, and re-building the filesystem into a consistent state is
> >really impossible.
> >
> Hmm, and so I think LVM still needs a warning label :-)

Maybe, in the same way that McDonald's now feels compelled to remind its
customers that hot coffee is hot.  I don't recall anything in the
description of LVM which suggests that it provides redundancy or other
data-protection mechanisms.  RAID, regular backups, and retention of
distribution media are still required if you value your data.

> I wonder why LVM doesn't work the other way around:
> Create filesystems on several disks and then concatenate these to the
> outside as one large filesystem. This way if one drive goes bad you can
> always individually mount the drives and use the data.

man mount

> >If you are looking for a large/cheap/reliable bunch of storage,
> >instead of mirroring, you might want to think about RAID5 instead.
> >
> No, what we're looking for is an 'expandable as needed' filesystem and
> this is what LVM pretends to be.

That's what it *is*.  You can slice and dice your physical storage and
recombine as needed.  Do that to redundant physical storage and you have a
highly reliable expandable storage stack.  Do it to a simple concatenation
of cheap disks and you have a cheap failure-prone expandable storage
stack.  Expandability and reliability are orthogonal, and you use separate
tools to provide them.

- -- 
Mark H. Wood, Lead System Programmer   mwood@IUPUI.Edu
MS Windows *is* user-friendly, but only for certain values of "user".
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: pgpenvelope 2.10.2 - http://pgpenvelope.sourceforge.net/

iD8DBQE/noNls/NR4JuTKG8RAjZZAJ9/bdnzRL07utWgLrw0FKR3Jw2hFQCaAwMr
+J7f6p3l6Arpds78jK2TKgE=
=QaKC
-----END PGP SIGNATURE-----


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-28  2:20                     ` Patrick Caulfield
@ 2003-10-28 13:52                       ` Gert van der Knokke
  2003-10-28 14:14                         ` Jayson Garrell
  2003-10-28 14:55                         ` Chris Cox
  0 siblings, 2 replies; 28+ messages in thread
From: Gert van der Knokke @ 2003-10-28 13:52 UTC (permalink / raw)
  To: linux-lvm

Patrick Caulfield wrote:

>>>      
>>>
>>No, what we're looking for is an 'expandable as needed' filesystem and 
>>this is what LVM pretends to be.
>>    
>>
>
>No. LVM does in no way "pretend to be a file system". It's an expandable block
>device. What the filesystem does with that block device is up to it. 
>  
>
Ok, this is a 'slight' mixup between what we need and what LVM provides.

>If a disk fails and you're not using RAID then you restore from backups.
>
>  
>
Where on earth do you back up 300 Gb? On tapes? For the price of a 
tape device including tapes which can handle this amount of data you can 
buy a lot of harddisks...

This is a very common problem nowadays, with ultralarge drives becoming 
available dirt cheap while their reliability is, shall we say, 'not so good'.

We are searching for a way to store large amounts of data reliably and 
affordably on a system running as mass storage/archive for a small group 
of users. Starting point will be around 300 to 400 Gb but in the near 
future 1 Tb (and more...) The users must simply be able to 'store and 
forget' on this system.
Traffic is fairly low but occasionally large amounts have to be 
'restored or copied' to local (smaller) systems to be taken out on the road.

Raid5 is ok for fixed size systems, but a simple mirroring system would 
be the most expandable with various size drives (in pairs, of course).
Maybe even two servers at different locations (then upgrading would mean 
buying 4 drives at a time).

Now we have to find a reliable resizable/expandable filesystem (or 
resizable/expandable block device system) on this hardware.

Gert


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-28 13:52                       ` Gert van der Knokke
@ 2003-10-28 14:14                         ` Jayson Garrell
  2003-10-28 14:30                           ` Gert van der Knokke
  2003-10-28 14:55                         ` Chris Cox
  1 sibling, 1 reply; 28+ messages in thread
From: Jayson Garrell @ 2003-10-28 14:14 UTC (permalink / raw)
  To: linux-lvm

On Tue, 2003-10-28 at 11:51, Gert van der Knokke wrote:

> Where on earth do you backup 300 Gb on ? On tapes ? For the price of a 
> tape device including tapes which can handle this amount of data you can 
> buy a lot of harddisks...
At my office we are currently using an Overland DLT autoloader,
LoaderXpress. It uses 10 40/80G tapes for a total of 400G native and
800G compressed @ 6MB/s. Yes, it wasn't cheap, but you can't put a price
on someone else's data. 

Jayson Garrell


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-28 14:14                         ` Jayson Garrell
@ 2003-10-28 14:30                           ` Gert van der Knokke
  2003-10-28 15:36                             ` John Stoffel
  2003-10-28 16:06                             ` Glen Harris
  0 siblings, 2 replies; 28+ messages in thread
From: Gert van der Knokke @ 2003-10-28 14:30 UTC (permalink / raw)
  To: linux-lvm

Jayson Garrell wrote:

>On Tue, 2003-10-28 at 11:51, Gert van der Knokke wrote:
>
>  
>
>>Where on earth do you backup 300 Gb on ? On tapes ? For the price of a 
>>tape device including tapes which can handle this amount of data you can 
>>buy a lot of harddisks...
>>    
>>
>At my office we are currently using a OverLand DLT autoloader,
>LoaderXpress. It uses 10 40/80G tapes for a total of 400G native and
>800G compressed @ 6Mb/s. Yes it wasn't cheap but you can't put a price
>on someone else's data. 
>  
>
True, but if I stick 2 IDE drives of 250 Gb in an external USB2 or 
Firewire box, I have 500 Gb and speeds of 20 to 30 Mbyte/s at a fraction 
of the cost.
And those boxes can be put into a safe too.

Mind you, I'm just stirring things up a bit to get some perspective 
on cost versus reliability.

Gert


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-28 13:52                       ` Gert van der Knokke
  2003-10-28 14:14                         ` Jayson Garrell
@ 2003-10-28 14:55                         ` Chris Cox
  1 sibling, 0 replies; 28+ messages in thread
From: Chris Cox @ 2003-10-28 14:55 UTC (permalink / raw)
  To: linux-lvm

(originally wasn't going to send this.. but changed my mind)
First, let me say that 100's of GB is nothing today.
A good LTO2 unit will do the trick quite nicely.  The problem
is that we are talking about 100's of TB now.  Even if
you have a 100 tape LTO2 drive or multiple LTO2 arrays
capable of moving 50M/sec+... you still end up in a crunch.

SAN folks will tell you to live mirror to a remote DR site...
but the pipe between sites could be VERY expensive (if you
have the money, it is definitely something worth considering
though).  What I'm suggesting below.. and I've tried to
use LVM... is the use of mirrored drives and the idea of
archival of hard disk storage.

Gert van der Knokke wrote:
...
> Now we have to find a reliable resizable/expandable filesystem (or 
> resizable/expandable block device system) on this hardware.

Mirrored drives.  So a logical drive includes a mirror as well.
You'll need a raid controller that can configure many drive instances
(drive + mirror).  Use LVM to add new logical drives into your
volume group, then extend your volumes and resize your filesystems.

(ascii art)

   pv1       pv2
+-----+   +-----+
| d 1 |   | d 2 |
+-----+   +-----+
| m 1 |   | m 2 |
+-----+   +-----+

pvcreate on pv1 and pv2 (can add pv3, etc later as needed)
put pv1, pv2 into a vg (extend with additional pv's later)
(if you like the pv's could be a set of mirrored drives
which are also striped).

Mirrored set could be pulled and archived at a storage center or
moved to a disaster recovery site.  Of course, this does imply
the ability to take out the entire range of mirrors safely.
You'll have to look at your HW raid controller to see what
is possible.  Afterwards, replace the "m" drives and let them
rebuild.

You may ask about joining the drives at the HW RAID level.. but as
mentioned, this might not give you the flexibility desired (see
the idea of striped sets mentioned earlier).

There are probably better solutions to this... obviously some
high end SAN devices may be better suited (not sure about
Linux filesystem compatibility though).

Just an idea.

Today, we're using a 5 TB disk based cache feeding into a
72 tape LTO library with 2 drives... adequate for our needs,
but we just don't have the storage needs that some require.
Moving forward, the idea of using off site disk storage,
off site DR replication or live DR mirroring are things we'll
have to look at.


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-28 14:30                           ` Gert van der Knokke
@ 2003-10-28 15:36                             ` John Stoffel
  2003-10-29  8:54                               ` Brian J. Murrell
  2003-10-28 16:06                             ` Glen Harris
  1 sibling, 1 reply; 28+ messages in thread
From: John Stoffel @ 2003-10-28 15:36 UTC (permalink / raw)
  To: linux-lvm

Gert> True, but what if I stick 2 IDE drives of 250 Gb in an external
Gert> USB2 or Firewire box I have 500 Gb and speeds of 20 to 30
Gert> Mbyte/s at a fraction of the cost.  And those boxes can be put
Gert> into a safe too.

Very true, but in this case, what happens to your data if one of those
drives dies?  There's a similar issue with backups spanning multiple
tapes as well, so it's not even.

Also, at 30mb/s, it will take around three hours to fill that 250gb
disk.  Are you sure you can sustain that kind of throughput?  It's
becoming a big issue with tape backups, being able to drive the tapes
at their rated speed for the best compression/performance possible.
But moving that much data in a short amount of time isn't always easy.
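
As a rough check of that estimate, a small Python sketch (decimal units
assumed, i.e. 1 GB = 1000 MB, and a perfectly sustained rate, which is
optimistic):

```python
def fill_time_hours(capacity_gb, rate_mb_per_s):
    """Hours needed to write capacity_gb at a sustained rate (decimal units)."""
    seconds = (capacity_gb * 1000) / rate_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    for rate in (20, 30):
        hours = fill_time_hours(250, rate)
        print(f"250 GB at {rate} MB/s: {hours:.2f} hours")
```

At the 20 to 30 MB/s quoted for the external box, this works out to
somewhere between roughly 2.3 and 3.5 hours per 250 GB disk, and any stall
in the data source pushes it higher.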

For a cheap fileserver, I'd probably go with a 3ware controller,
either four or 8 ports, with 250gb drives.  For the four port
controller you can do RAID5, but you give up the hot spare.  For the
eight port controller, you'd get 1.5tb of RAW blocks, with one parity
and one hotspare disk.  You'd probably get closer to 1.2tb of useable
storage from the filesystem.
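
The raw figure follows from simple RAID5 arithmetic, sketched here in
Python.  The function name and parameters are just for illustration; the
filesystem overhead that takes 1.5tb down to about 1.2tb is an estimate and
is not modelled.

```python
def raid5_usable_gb(ports, drive_gb, hot_spares=0):
    """Capacity of one RAID5 set: one drive's worth of parity, spares excluded."""
    members = ports - hot_spares
    if members < 3:
        raise ValueError("RAID5 needs at least three member drives")
    return (members - 1) * drive_gb


if __name__ == "__main__":
    # Eight ports, 250 GB drives, one hot spare: (8 - 1 - 1) * 250
    print(raid5_usable_gb(8, 250, hot_spares=1))
    # Four ports, no hot spare: (4 - 1) * 250
    print(raid5_usable_gb(4, 250))
```

Keeping a hot spare on the four port controller would leave only three
members, i.e. 500 GB, which is why the spare is given up there.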

Now how do you back that up?  

John


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-28 14:30                           ` Gert van der Knokke
  2003-10-28 15:36                             ` John Stoffel
@ 2003-10-28 16:06                             ` Glen Harris
  1 sibling, 0 replies; 28+ messages in thread
From: Glen Harris @ 2003-10-28 16:06 UTC (permalink / raw)
  To: linux-lvm

Gert van der Knokke wrote:
 > Mind this, I'm just stirring up things a bit to get some perspective
 > view of cost versus reliability.

Here's my AU$0.02.

We have roughly 400Gb of disk spread across 7 SMTP/IMAP/Oracle/GIS/
Web servers. We've just bought a Sun 3510 16-bay FC/SCSI RAID box to
serve as a baby-SAN to 5 of the servers and will consolidate the rest.
There's another 5 servers which have about half that but are squid or
CDROM servers and as such don't need full backups.

Right now we back up to two DLT drives, a Mammoth, and a Mammoth2,
several of them across the network to a machine with a physical
drive.

This is expensive and prone to failure.

Our solution is to buy a cheap Arena PA-8211 FC/IDE RAID box, connect
it to a cheap Intel/Debian server with Gb ethernet and rsync all the
servers to it several times per day. We can do a final rsync then
take a snapshot early AM when the rsync will be as consistent as it
will get, and back that up to a 7-tape autoloader for the Mammoth2.

This is still expensive - those Mammoth2 tapes are worth their weight
in gold. The chance of failure is much reduced, however.

The next stage is to buy a new tape drive, one of the new ones which
does 300Gb native, which will keep us to only two tapes for a full
backup. With a 5 or 10 slot autoloader, it should keep us going for
maybe two years before we need to look at a new tape drive.

Since the Arena box has a maximum capacity of 3.5Tb with 250Gb disks,
way above the Sun box, we intend to also take hourly/daily snapshots
as each rsync is done to reduce the need for restoring from tape.
(That was the on-topic LVM part of this post!)

Hey, while I have your attention, has anyone written some scripts to
check free space on a number of snapshots and add PE's as needed to
keep the snapshot valid, deleting the oldest as needed to free PE's?

Alternately, are there plans to add a trigger to a snapshot to do this
automatically, so the script only needs to keep an eye on the total free
PE's and delete the oldest snapshot as necessary? That would be *much*
nicer!
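
For what it's worth, the decision logic of such a script could look
something like the Python sketch below.  It is not an existing tool, only
one possible policy: the dictionary keys, the 80% threshold and the 16 PE
growth step are all assumptions, and gathering the usage figures from LVM
is left out entirely.

```python
def plan_snapshot_actions(snapshots, free_pe, threshold=0.8, grow_by=16):
    """Decide which snapshots to extend and which to delete.

    snapshots: list of dicts ordered oldest first, each with the hypothetical
               keys "name", "size_pe" (allocated PEs) and "used" (0.0 to 1.0).
    free_pe:   free physical extents left in the volume group.
    Returns a list of ("extend", name, pe_count) and ("remove", name) tuples.
    """
    alive = list(snapshots)              # oldest first
    actions = []
    for snap in reversed(snapshots):     # serve the newest snapshots first
        if all(s is not snap for s in alive):
            continue                     # already scheduled for removal
        if snap["used"] < threshold:
            continue                     # still has enough room
        # Drop the oldest snapshots until the extension fits.
        while free_pe < grow_by and alive[0] is not snap:
            oldest = alive.pop(0)
            free_pe += oldest["size_pe"]
            actions.append(("remove", oldest["name"]))
        if free_pe >= grow_by:
            free_pe -= grow_by
            actions.append(("extend", snap["name"], grow_by))
    return actions


if __name__ == "__main__":
    snaps = [
        {"name": "s1", "size_pe": 32, "used": 0.50},
        {"name": "s2", "size_pe": 32, "used": 0.90},
        {"name": "s3", "size_pe": 32, "used": 0.85},
    ]
    for action in plan_snapshot_actions(snaps, free_pe=20):
        print(action)
```

Each "extend" tuple would correspond to an lvextend on the snapshot and
each "remove" to an lvremove; running those is deliberately kept separate
so the plan can be reviewed before anything is deleted.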

Cheers, glen.


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-28 15:36                             ` John Stoffel
@ 2003-10-29  8:54                               ` Brian J. Murrell
  2003-10-30  8:00                                 ` Petro
  0 siblings, 1 reply; 28+ messages in thread
From: Brian J. Murrell @ 2003-10-29  8:54 UTC (permalink / raw)
  To: linux-lvm

On Tue, 2003-10-28 at 16:05, John Stoffel wrote:
> But moving that much data in a short amount of time isn't always easy.

Never underestimate the bandwidth of a station wagon full of tapes
hurtling down the highway.
                                         Andrew Tanenbaum
b.
-- 
My other computer is your Microsoft Windows server.

Brian J. Murrell



* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-29  8:54                               ` Brian J. Murrell
@ 2003-10-30  8:00                                 ` Petro
  2003-11-26 10:15                                   ` Harri Haataja
  0 siblings, 1 reply; 28+ messages in thread
From: Petro @ 2003-10-30  8:00 UTC (permalink / raw)
  To: linux-lvm

On Tue, Oct 28, 2003 at 01:41:21PM -0800, Brian J. Murrell wrote:
> On Tue, 2003-10-28 at 16:05, John Stoffel wrote:
> > But moving that much data in a short amount of time isn't always easy.
> Never underestimate the bandwidth of a station wagon full of tapes
> hurtling down the highway.
>                                          Andrew Tanenbaum

    Yes, but it takes *DAYS* to write that many tapes. 
    
-- 
Petro@corp.vendio.com                           ccpetro at vtext.com [sms] 
Unix Administrator                           2766480 at skytel.com [pager] 
Vendio Service Inc.                                  (650) 793-1650 [cell]


* Re: [linux-lvm] RAID 1 on Device Mapper - best practices?
  2003-10-22  8:02       ` wopp
  2003-10-23 17:52         ` [linux-lvm] Drive gone bad, now what? Gert van der Knokke
  2003-10-23 18:53         ` [linux-lvm] RAID 1 on Device Mapper - best practices? John Stoffel
@ 2003-11-20  7:20         ` Gregory K. Ruiz-Ade
  2003-11-20  8:33           ` [linux-lvm] " Måns Rullgård
  2 siblings, 1 reply; 28+ messages in thread
From: Gregory K. Ruiz-Ade @ 2003-11-20  7:20 UTC (permalink / raw)
  To: linux-lvm


On Tuesday 21 October 2003 10:18 am, wopp@parplies.de wrote:
> > Ahh, but that's the beauty of RAID and LVM, what you end up with is
> > just another block device. Which ever way you do it you'll get the same
> > benefit.
>
> I believe that is not true. How do you resize a RAID device? The only
> option I can think of is to re-create it, which is clearly beside the
> point. With LVM on top of RAID, you can lvextend (or lvreduce), pvmove
> and so on - where's the problem?

At the risk of saying something _way_ past the point that this thread seems 
to have died down (hey, I'm only now catching up on email!):

You don't extend the RAID device, you add a new one, and add it as a new 
physical volume to your volume group, and _then_ you can extend your 
volumes onto the new underlying RAID device.

And _THAT_ is the beauty of LVM.

>      +------+   +------+
>
>      | hda1 | + | hdb1 |  -> md0 \
>
>      +------+   +------+          +- VG0
>
>      | hda2 | + | hdb2 |  -> md1 /
>
>      +------+   +------+

Personally, I'd do it this way:

+------+   +------+    +-----+
| hda1 | + | hdc1 | -> | md0 | = /boot
+------+   +------+    +-----+
+------+   +------+    +-----+    +------+
| hda2 | + | hdc2 | -> | md1 | -> | vg00 |
+------+   +------+    +-----+    +------+

vg00 would then contain logical volumes for all the remaining filesystems.

Two points, though.  First, if you're going to do mirroring on IDE drives, 
regardless of how hooptie your IDE controller is, _always_ put the drives 
as masters on _separate_ channels, otherwise all your writes could take up 
to twice as long, since only one device on an IDE channel can be accessed 
at a time.

Second, the beauty of LVM is that you can add a new physical volume to the 
volume group whenever you feel like it, and then extend your logical 
volumes into that new space.  So, you buy a Promise 2-channel IDE 
controller, and add two 250GB drives to them.  Then, you create a new RAID 
mirror set with those.

+------+   +------+    +-----+
| hde1 | + | hdg1 | -> | md2 |
+------+   +------+    +-----+

You can then add that to your VG:

+------+   +------+    +-----+
| hda2 | + | hdc2 | -> | md1 |       +------+
+------+   +------+    +-----+ \_____| vg00 |
+------+   +------+    +-----+ /     |      |
| hde1 | + | hdg1 | -> | md2 |       +------+
+------+   +------+    +-----+

You don't lose anything in terms of reliability, unless you lose an IDE 
controller, in this case.

This way, you're leveraging the capabilities of both the RAID and LVM 
subsystems.
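
The expansion steps above can be sketched as a small Python helper that
only builds the command lines and does not run anything.  The device and VG
names are the ones from the diagrams; the LV name "home" and the "+100G"
size are made up for illustration.  Growing the filesystem afterwards
depends on which filesystem is in use, so that step is left out.

```python
def expansion_commands(md_dev, members, vg, lv, grow_by):
    """Return the argv lists for growing `lv` onto a new RAID1 mirror.

    Nothing is executed here; the caller can print the commands for review.
    """
    return [
        # 1. Build the new mirror from the two new partitions.
        ["mdadm", "--create", md_dev, "--level=1",
         f"--raid-devices={len(members)}", *members],
        # 2. Turn the mirror into an LVM physical volume.
        ["pvcreate", md_dev],
        # 3. Add the new PV to the existing volume group.
        ["vgextend", vg, md_dev],
        # 4. Grow the logical volume into the new space.
        ["lvextend", "-L", f"+{grow_by}", f"/dev/{vg}/{lv}"],
    ]


if __name__ == "__main__":
    # "home" and "100G" are hypothetical values for illustration.
    for argv in expansion_commands("/dev/md2", ["/dev/hde1", "/dev/hdg1"],
                                   "vg00", "home", "100G"):
        print(" ".join(argv))
```

Printing the commands first, rather than executing them, makes it easy to
double-check the device names before touching any disks.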

In terms of mirror resync times, I personally have not had to deal with 
that, because so far I haven't had a hard drive in a software RAID 
system fail on me.  Either I'm lucky or careful to buy good hard 
drives, but I'm honestly not sure which it is. :)

Doesn't MD allow background resyncs, anyway?  Sure, it might suck the living 
daylights out of system performance, but at least it should be usable.

Gregory 

-- 
Gregory K. Ruiz-Ade <gregory@castandcrew.com>
Sr. Systems Administrator
Cast & Crew Entertainment Services, Inc.



* [linux-lvm] Re: RAID 1 on Device Mapper - best practices?
  2003-11-20  7:20         ` Gregory K. Ruiz-Ade
@ 2003-11-20  8:33           ` Måns Rullgård
  2003-11-21 13:02             ` Micah Anderson
  0 siblings, 1 reply; 28+ messages in thread
From: Måns Rullgård @ 2003-11-20  8:33 UTC (permalink / raw)
  To: linux-lvm

"Gregory K. Ruiz-Ade" <gregory@castandcrew.com> writes:

> Doesn't MD allow background resyncs, anyway?

It does.

> Sure, it might suck the living daylights out of system performance,

It does.

> but at least it should be usable.

Sort of.  Anything timing critical, like music, is out of the
question.

-- 
Måns Rullgård
mru@kth.se


* Re: [linux-lvm] Re: RAID 1 on Device Mapper - best practices?
  2003-11-20  8:33           ` [linux-lvm] " Måns Rullgård
@ 2003-11-21 13:02             ` Micah Anderson
  0 siblings, 0 replies; 28+ messages in thread
From: Micah Anderson @ 2003-11-21 13:02 UTC (permalink / raw)
  To: linux-lvm

> The md code decided it was going to resync the mirror
> at between 100KB/sec and 100000KB/sec. The actual rate was 100KB/sec,
> while the device was otherwise idle. By increasing
> /proc/.../speed_limit_min, I was able to crank the resync rate up to
> 20MB/sec, which is slightly more reasonable but still short of the
> ~60MB/sec this RAID is capable of.

On Thu, 20 Nov 2003, Måns Rullgård wrote:

> > Sure, it might suck the living daylights out of system performance,
> 
> It does.

Doesn't have to... With 2.2 kernels, you can simply use 'cat
/proc/sys/dev/md/speed-limit', to see the *minimum* rebuild rate. You
can change it using something like:
'echo 200000 > /proc/sys/dev/md/speed-limit'.

With 2.4 kernels, there are two files in '/proc/sys/dev/raid/', called
'speed_limit_max' and 'speed_limit_min'. I think the rest is pretty
obvious. You don't have to destroy system performance...
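
For the 2.4 case, a minimal Python sketch of reading and adjusting the two
limits.  The helper names are my own, the values are in KB/sec as in the
report quoted above, and writing to /proc needs root.  The directory is a
parameter so the functions can be tried against a scratch directory first.

```python
from pathlib import Path

# Location of the 2.4 md tunables, as named above.
RAID_PROC_DIR = Path("/proc/sys/dev/raid")


def read_limits(base=RAID_PROC_DIR):
    """Return the (min, max) resync speed limits in KB/sec as integers."""
    lo = int((Path(base) / "speed_limit_min").read_text().strip())
    hi = int((Path(base) / "speed_limit_max").read_text().strip())
    return lo, hi


def set_min_limit(kb_per_sec, base=RAID_PROC_DIR):
    """Write a new minimum resync speed; needs root on a real system."""
    (Path(base) / "speed_limit_min").write_text(f"{kb_per_sec}\n")


if __name__ == "__main__":
    if RAID_PROC_DIR.is_dir():
        print("min/max resync speed (KB/sec):", read_limits())
    else:
        print("no md tunables found on this system")
```

Raising the minimum speeds up the resync at the cost of foreground I/O;
lowering the maximum does the opposite, so the trade-off can be tuned
either way.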

micah


* Re: [linux-lvm] Drive gone bad, now what?
  2003-10-30  8:00                                 ` Petro
@ 2003-11-26 10:15                                   ` Harri Haataja
  0 siblings, 0 replies; 28+ messages in thread
From: Harri Haataja @ 2003-11-26 10:15 UTC (permalink / raw)
  To: linux-lvm

On Wed, Oct 29, 2003 at 04:41:10PM -0800, Petro wrote:
> On Tue, Oct 28, 2003 at 01:41:21PM -0800, Brian J. Murrell wrote:
> > On Tue, 2003-10-28 at 16:05, John Stoffel wrote:
> > > But moving that much data in a short amount of time isn't always easy.
> > Never underestimate the bandwidth of a station wagon full of tapes
> > hurtling down the highway.
> >                                          Andrew Tanenbaum
>     Yes, but it takes *DAYS* to write that many tapes. 

Unless you have some serious RAIT and write them in parallel.

-- 
CAUTION: The Mass of This Product Contains the Energy Equivalent of 85
Million Tons of TNT per Net Ounce of Weight.
		-- http://www.xs4all.nl/~jcdverha/scijokes/


end of thread, other threads:[~2003-11-26 10:15 UTC | newest]

Thread overview: 28+ messages
2003-10-17 15:10 [linux-lvm] RAID 1 on Device Mapper - best practices? John Stoffel
2003-10-17 15:29 ` Mike Williams
2003-10-17 15:51   ` John Stoffel
2003-10-17 15:56     ` [linux-lvm] " Måns Rullgård
2003-10-17 16:13     ` [linux-lvm] " Mike Williams
2003-10-22  8:02       ` wopp
2003-10-23 17:52         ` [linux-lvm] Drive gone bad, now what? Gert van der Knokke
2003-10-23 18:59           ` John Stoffel
2003-10-24  0:22             ` Rickard Olsson
2003-10-24 15:23               ` Gert van der Knokke
2003-10-27  8:59                 ` John Stoffel
2003-10-27 18:28                   ` Gert van der Knokke
2003-10-28  2:20                     ` Patrick Caulfield
2003-10-28 13:52                       ` Gert van der Knokke
2003-10-28 14:14                         ` Jayson Garrell
2003-10-28 14:30                           ` Gert van der Knokke
2003-10-28 15:36                             ` John Stoffel
2003-10-29  8:54                               ` Brian J. Murrell
2003-10-30  8:00                                 ` Petro
2003-11-26 10:15                                   ` Harri Haataja
2003-10-28 16:06                             ` Glen Harris
2003-10-28 14:55                         ` Chris Cox
2003-10-28  8:57                     ` Mark H. Wood
2003-10-28  8:40                 ` Mark H. Wood
2003-10-23 18:53         ` [linux-lvm] RAID 1 on Device Mapper - best practices? John Stoffel
2003-11-20  7:20         ` Gregory K. Ruiz-Ade
2003-11-20  8:33           ` [linux-lvm] " Måns Rullgård
2003-11-21 13:02             ` Micah Anderson
