* [linux-lvm] Allocation Policy for Cloud Computing needed
@ 2012-02-16 13:50 Sebastian Riemer
  2012-02-16 14:41 ` James Hawtin
  2012-02-20 21:59 ` Lars Ellenberg
  0 siblings, 2 replies; 7+ messages in thread

From: Sebastian Riemer @ 2012-02-16 13:50 UTC (permalink / raw)
To: linux-lvm

Hi LVM list,

I'm experimenting with storage for many QEMU/KVM virtual machines in
cloud computing. I've got many concurrent IO processes and 24 hard
drives. I've tested the scalability with a single IO reader process per
hard drive. Single drives scale best and have the best performance, of
course, but we need mirroring and volume management. So I've created MD
RAID-1 arrays and created a VG and two LVs on each. This gives me good
overall performance (up to 2 GB/s; HBA limit: 2.2 GB/s).

Then I tested putting all my RAID-1 arrays into a single VG, because
the LV size should be adjustable across all hard drives. I've tried all
allocation policies, but none does what I want to achieve here. The man
page does note that this isn't fully implemented.

I want an allocation policy which distributes the LVs equally over the
PVs as long as space is left and the LVs aren't resized. The goal is to
minimize the number of concurrent IO processes per hard drive (striping
is total crap in this situation).

I've tested LVM2 2.02.66 and kernel 3.0.15. Is something like this
implemented in newer releases, or is it intended to be implemented in
the near future? Or does someone want to implement it together with me?

Thanks, cheers,

Sebastian

-- 
Sebastian Riemer
Linux Kernel Developer

ProfitBricks GmbH
Greifswalder Str. 207
10405 Berlin, Germany

Tel.:  +49 - 30 - 60 98 56 991 - 303
Fax:   +49 - 30 - 51 64 09 22
Email: sebastian.riemer@profitbricks.com
Web:   http://www.profitbricks.com/

RG: Amtsgericht Charlottenburg, HRB 125506 B
GF: Andreas Gauger, Achim Weiss

^ permalink raw reply	[flat|nested] 7+ messages in thread
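The layout Sebastian describes (one MD RAID-1 array per disk pair, one VG per array, two LVs per VG) can be sketched roughly as below. All device names, VG/LV names, and sizes are hypothetical illustrations, not his actual commands.

```shell
# Hypothetical devices: /dev/sda and /dev/sdb form one mirror pair.
# Repeat per disk pair (12 pairs for 24 drives).
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb

pvcreate /dev/md0                 # make the array an LVM physical volume
vgcreate vg_md0 /dev/md0          # one VG per array
lvcreate -n vm_a -L 100g vg_md0   # two LVs per VG, e.g. one per VM image
lvcreate -n vm_b -L 100g vg_md0
```

With this per-array layout, each LV is guaranteed to sit on exactly one spindle pair; the downside, as the mail says, is that LV size cannot grow beyond a single array.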
* Re: [linux-lvm] Allocation Policy for Cloud Computing needed
  2012-02-16 13:50 [linux-lvm] Allocation Policy for Cloud Computing needed Sebastian Riemer
@ 2012-02-16 14:41 ` James Hawtin
  2012-02-16 15:27   ` Sebastian Riemer
  2012-02-20 21:59 ` Lars Ellenberg
  1 sibling, 1 reply; 7+ messages in thread

From: James Hawtin @ 2012-02-16 14:41 UTC (permalink / raw)
To: LVM general discussion and development

Sebastian Riemer wrote:
> I want to have an allocation which distributes the LVs equally over the
> PVs as long as space is left and LVs aren't resized. The goal is to
> minimize the number of concurrent IO processes per hard drive (striping
> is total crap in this situation).
> [...]

I wrote a script to do this with LVM 1 that just extended the LV in
chunks over a range of PVs until it was full. While it worked, it
produced lots of metadata backups in /etc.

I later felt that this was not necessary with the LVM stripe function,
and that I could achieve the same thing by using a large stripe size; as
LVM allows concatenating segments of different stripedness, extending
the LV was not a problem.

James.
* Re: [linux-lvm] Allocation Policy for Cloud Computing needed
  2012-02-16 14:41 ` James Hawtin
@ 2012-02-16 15:27   ` Sebastian Riemer
  0 siblings, 0 replies; 7+ messages in thread

From: Sebastian Riemer @ 2012-02-16 15:27 UTC (permalink / raw)
To: James Hawtin; +Cc: LVM general discussion and development

Hi James,

thank you for your idea, but this is exactly what I don't want. I want
to have the LV completely on one PV if possible (but not all on the
first one). Only if a customer requests more than the capacity of a
single hard drive would I provide striping, and then only on that
volume.

If all customers can access all hard drives at the same time, this
results in very slow random IO. I've seen that on SW RAID-10: all the
r/w heads need to be repositioned over and over again.

Cheers,

Sebastian

On 16/02/12 15:41, James Hawtin wrote:
> I wrote a script to do this with LVM 1 that just extended the LV in
> chunks over a range of PVs until it was full. While it worked, it
> produced lots of metadata backups in /etc. I later felt that this was
> not necessary with the LVM stripe function, and that I could achieve
> the same thing by using a large stripe size; as LVM allows
> concatenating segments of different stripedness, extending the LV was
> not a problem.
* Re: [linux-lvm] Allocation Policy for Cloud Computing needed
  2012-02-16 13:50 [linux-lvm] Allocation Policy for Cloud Computing needed Sebastian Riemer
  2012-02-16 14:41 ` James Hawtin
@ 2012-02-20 21:59 ` Lars Ellenberg
  2012-02-20 22:41   ` Ray Morris
  2012-02-21  9:34   ` Sebastian Riemer
  1 sibling, 2 replies; 7+ messages in thread

From: Lars Ellenberg @ 2012-02-20 21:59 UTC (permalink / raw)
To: linux-lvm

On Thu, Feb 16, 2012 at 02:50:07PM +0100, Sebastian Riemer wrote:
> [...]
> I want to have an allocation which distributes the LVs equally over the
> PVs as long as space is left and LVs aren't resized. The goal is to
> minimize the number of concurrent IO processes per hard drive (striping
> is total crap in this situation).
>
> I've tested LVM2 2.02.66 and kernel 3.0.15. Is something like that
> implemented in newer releases or is something like that intended to be
> implemented in near future?

I don't know. Does not look like it, though.

> Or does someone want to implement this together with me?

I would certainly be here for discussions.

Though, as you always will be more flexible with scripts than with
pre-implemented fixed algorithms, I probably would first check if I can
solve it with some scripting.
[completely untested, but you get the idea]

#!/bin/bash
export LANG=C LC_ALL=C
name=$1 vg=$2 size_in_MiB=$3
PVS=$(vgs --noheadings --units m -o pv_name,pv_free -O -pv_free,pv_name "$vg" |
      awk -v need=$size_in_MiB '{ print $1; sum += $2;
          if (sum >= need) exit; }')
lvcreate -n "$name" -L ${size_in_MiB}m "$vg" $PVS

(similar for lvextend)

Which basically implements this allocation policy: use the PVs with the
most free space available, and no more than necessary.

If I understand you correctly, that would almost do what you asked for.

You can get pretty complex in similar scripts, if you really want to...
consider using

  pvs -o vg_name,lv_name,pv_name,pvseg_start,pvseg_size,seg_pe_ranges

and explicitly listing not only the PVs, but even the PE ranges in your
lvcreate commands...

	Lars

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
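Lars's sketch can be extended to the lvextend case he mentions. In the sketch below, the PV-selection step is factored into a function so it can be exercised without a live VG; the LVM commands and report columns are stock LVM2, but the function names and wrapper are illustrative, not an established tool.

```shell
#!/bin/bash
# pick_pvs NEED_MiB: read "pv_name pv_free" lines (already sorted by free
# space, largest first) and print the fewest PVs whose free space covers
# NEED_MiB, in order.
pick_pvs() {
    awk -v need="$1" '{ printf "%s ", $1; sum += $2; if (sum >= need) exit }'
}

# grow_lv LV VG SIZE_MiB: extend LV by SIZE_MiB, preferring the emptiest
# PVs -- the lvextend analogue of Lars's lvcreate one-liner.
grow_lv() {
    lv=$1; vg=$2; size=$3
    pvlist=$(vgs --noheadings --units m -o pv_name,pv_free \
                 -O -pv_free,pv_name "$vg" | pick_pvs "$size")
    lvextend -L "+${size}m" "/dev/$vg/$lv" $pvlist
}
```

For example, `grow_lv vm17 bigvg 2048` would ask for 2 GiB from the PVs with the most free space. Note that lvextend still applies the VG's allocation policy *within* the listed PVs, so extents are not guaranteed to land on exactly one of them.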
* Re: [linux-lvm] Allocation Policy for Cloud Computing needed
  2012-02-20 21:59 ` Lars Ellenberg
@ 2012-02-20 22:41   ` Ray Morris
  2012-02-21  9:48     ` Sebastian Riemer
  1 sibling, 1 reply; 7+ messages in thread

From: Ray Morris @ 2012-02-20 22:41 UTC (permalink / raw)
To: LVM general discussion and development

> > Then, I've tested to put all my RAID-1 arrays into a single VG,
> > because LV size should be adjustable over all hard drives. ...
> > I want to have an allocation which distributes the LVs equally over
> > the PVs as long as space is left and LVs aren't resized.

Since you're using RAID anyway, consider testing RAID 10, which will
distribute IO across spindles.

> You can get pretty complex in similar scripts, if you really want
> to... consider using
>   pvs -o vg_name,lv_name,pv_name,pvseg_start,pvseg_size,seg_pe_ranges
> and explicitly listing not only the PVs, but even the PE ranges in
> your lvcreate commands...

For scripting, see Linux::LVM on CPAN. It gives you that information as
a nice data structure. I welcome feature requests and patches.
(Linux::LVM::Do is coming soon, for modifying rather than just querying
LVM objects.)

-- 
Ray Morris
support@bettercgi.com

Strongbox - The next generation in site security:
http://www.bettercgi.com/strongbox/

Throttlebox - Intelligent Bandwidth Control
http://www.bettercgi.com/throttlebox/

Strongbox / Throttlebox affiliate program:
http://www.bettercgi.com/affiliates/user/register.php

On Mon, 20 Feb 2012 22:59:11 +0100
Lars Ellenberg <lars.ellenberg@linbit.com> wrote:
> On Thu, Feb 16, 2012 at 02:50:07PM +0100, Sebastian Riemer wrote:
> > I'm experimenting with storage for many QEMU/KVM virtual machines in
> > cloud computing. I've got many concurrent IO processes and 24 hard
> > drives. [...]
* Re: [linux-lvm] Allocation Policy for Cloud Computing needed
  2012-02-20 22:41 ` Ray Morris
@ 2012-02-21  9:48   ` Sebastian Riemer
  0 siblings, 0 replies; 7+ messages in thread

From: Sebastian Riemer @ 2012-02-21 9:48 UTC (permalink / raw)
To: LVM general discussion and development

On 20/02/12 23:41, Ray Morris wrote:
> Since you're using RAID anyway, consider testing RAID 10, which will
> distribute IO across spindles.

I've already tested RAID-10, both with MD and with LVM doing the
striping. The result is exactly the same: there are too many IO
processes on all spindles for good performance, and all r/w heads in
the RAID-10 have to be repositioned over and over again. Even for
random IO this only reads from a single spindle per RAID-1 array.

Without striping, MD RAID-1 has a good read-balancing algorithm. With
that I can read from both spindles in the RAID-1 array simultaneously.

> For scripting, see Linux::LVM on CPAN. It gives you that information
> as a nice data structure. I welcome feature requests and patches.
> (Linux::LVM::Do coming soon for modifying rather than just querying
> LVM objects.)

Thanks, I'll look at that.

Cheers,

Sebastian
* Re: [linux-lvm] Allocation Policy for Cloud Computing needed
  2012-02-20 21:59 ` Lars Ellenberg
@ 2012-02-21  9:34   ` Sebastian Riemer
  1 sibling, 0 replies; 7+ messages in thread

From: Sebastian Riemer @ 2012-02-21 9:34 UTC (permalink / raw)
To: LVM general discussion and development

On 20/02/12 22:59, Lars Ellenberg wrote:
>> Or does someone want to implement this together with me?
>
> I would certainly be here for discussions.
>
> Though, as you always will be more flexible with scripts than with
> pre-implemented fixed algorithms, I probably would first check if I can
> solve it with some scripting.
> [completely untested, but you get the idea]
>
> #!/bin/bash
> export LANG=C LC_ALL=C
> name=$1 vg=$2 size_in_MiB=$3
> PVS=$(vgs --noheadings --units m -o pv_name,pv_free -O -pv_free,pv_name "$vg" |
>       awk -v need=$size_in_MiB '{ print $1; sum += $2;
>           if (sum >= need) exit; }')
> lvcreate -n "$name" -L ${size_in_MiB}m "$vg" $PVS
>
> (similar for lvextend)
>
> Which basically implements this allocation policy: use the PVs with the
> most free space available, and no more than necessary.
>
> If I understand you correctly, that would almost do what you asked for.

Yes, this really helps. I've also thought about allocating the LVs
directly to distinct PVs. Thanks for the confirmation.

> You can get pretty complex in similar scripts, if you really want to...
> consider using
>   pvs -o vg_name,lv_name,pv_name,pvseg_start,pvseg_size,seg_pe_ranges
> and explicitly listing not only the PVs, but even the PE ranges in your
> lvcreate commands...

Thank you very much for your response.

Regards,

Sebastian
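The explicit PE-range placement Lars suggests uses the standard `PV[:PE[-PE]]` positional syntax of lvcreate. A sketch with made-up VG, LV, and device names (the extent numbers assume the default 4 MiB extent size):

```shell
# Show per-PV segment usage, including the PE ranges Lars suggested:
pvs -o vg_name,lv_name,pv_name,pvseg_start,pvseg_size,seg_pe_ranges

# Pin a new LV entirely to one PV -- and even to specific physical
# extents: with 4 MiB extents, PEs 0-10239 are the first 40 GiB of
# /dev/md3 (names here are hypothetical).
lvcreate -n vm17 -L 40g bigvg /dev/md3:0-10239
```

This gives a script full control over placement, which is essentially the "one LV per disk while space lasts" policy from the original question, implemented outside of LVM's built-in allocation policies.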
end of thread, other threads: [~2012-02-21 9:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-16 13:50 [linux-lvm] Allocation Policy for Cloud Computing needed Sebastian Riemer
2012-02-16 14:41 ` James Hawtin
2012-02-16 15:27   ` Sebastian Riemer
2012-02-20 21:59 ` Lars Ellenberg
2012-02-20 22:41   ` Ray Morris
2012-02-21  9:48     ` Sebastian Riemer
2012-02-21  9:34   ` Sebastian Riemer