* Severe slowdown with LVM on RAID, alignment problem?
@ 2008-02-29  0:05 Michael Guntsche
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Guntsche @ 2008-02-29  0:05 UTC (permalink / raw)
  To: linux-raid

Hello list,

I know the subject of LVM on RAID and the ever-popular alignment question
has been beaten to death by now, but whatever I try I cannot seem to work
it out, so please bear with me and my lengthy description.

Currently I am testing a SATA RAID-5 for deployment in my little home
server. I am mostly interested in space but still do not want to give up
all the speed.
So for testing purposes I created a 10GB partition on every disk and
created a RAID-5 across them.

mdadm --detail /dev/md0
/dev/md0:
         Version : 01.00.03
   Creation Time : Thu Feb 28 18:56:07 2008
      Raid Level : raid5
      Array Size : 29326080 (27.97 GiB 30.03 GB)
   Used Dev Size : 19550720 (9.32 GiB 10.01 GB)
    Raid Devices : 4
   Total Devices : 4
Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Fri Feb 29 00:40:44 2008
           State : clean
  Active Devices : 4
Working Devices : 4
  Failed Devices : 0
   Spare Devices : 0

          Layout : left-symmetric
      Chunk Size : 256K

            Name : gibson:0  (local to host gibson)
            UUID : 46cb01e4:ea969e20:535d91a1:234004ed
          Events : 8

     Number   Major   Minor   RaidDevice State
        0       8       33        0      active sync   /dev/sdc1
        1       8       49        1      active sync   /dev/sdd1
        2       8       65        2      active sync   /dev/sde1
        4       8       81        3      active sync   /dev/sdf1

I set the chunk size to 256K as you can see and also used superblock
format 1.0; apart from that I changed nothing. I let the RAID resync and
then I started testing.
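
For reference, the array was created with roughly the following command
(reconstructed from the details above, so treat it as a sketch):

  mdadm --create /dev/md0 --level=5 --raid-devices=4 --chunk=256 \
        --metadata=1.0 /dev/sd[cdef]1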

The first one was a simple XFS FS without LVM. I called mkfs.xfs without
any parameters and got the following:

  mkfs.xfs /dev/md0
meta-data=/dev/md0               isize=256    agcount=16, agsize=458240 blks
         =                       sectsz=4096  attr=2
data     =                       bsize=4096   blocks=7331520, imaxpct=25
         =                       sunit=64     swidth=192 blks
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=3579, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=786432 blocks=0, rtextents=0
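
(Here sunit=64 and swidth=192 are in 4 KiB filesystem blocks: 64 x 4 KiB =
256 KiB matches the chunk size, and 192 x 4 KiB = 768 KiB is one full
stripe across the three data disks.)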

Running a quick bonnie benchmark on this one, I got the following  
results.
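
For the record, the bonnie invocation was roughly the following; the mount
point is just a placeholder:

  bonnie++ -d /mnt/test -s 1G -m xfs   # -m sets the label in the Machine column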

Version  1.03c      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
xfs              1G           31359  20 24135  19           79587  29 154.8   1

Well, this is not the fastest machine, so I did not think much about the
results; in my opinion they look OK, but please correct me if I am totally
wrong.

Next I created a VG/LV on the RAID, formatted it with XFS, and ran bonnie
again.

!! No special switches were used for vgcreate or lvcreate !!
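
The sequence was essentially the defaults, along the lines of (VG/LV names
as they appear in the mkfs output further down; the LV size is approximate):

  pvcreate /dev/md0
  vgcreate lamo /dev/md0
  lvcreate -L 10G -n gabo lamo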


Version  1.03c      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
xfs              1G           31359  20 24135  19           79587  29 154.8   1
lvm-nal          1G           28917  21 15344  19           33645  26 136.3   1

We lost a little bit on writing, but what happened to reading? I lost  
more than half of the performance here.

So I had a look at mkfs.xfs again and saw that sunit and swidth were 0.
Looking at the values I got the first time I called mkfs.xfs on the
/dev/md0 device, I came up with:
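
(The -d sunit/swidth options are given in 512-byte sectors: a 256 KiB chunk
is 512 sectors, and with three data disks in a four-disk RAID-5 the stripe
width is 3 x 512 = 1536 sectors.)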

mkfs.xfs -dsunit=512,swidth=1536 /dev/mapper/lamo-gabo -f
meta-data=/dev/mapper/lamo-gabo  isize=256    agcount=4, agsize=655360 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=2621440, imaxpct=25
         =                       sunit=64     swidth=192 blks
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Result:

Version  1.03c      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
xfs              1G           31359  20 24135  19           79587  29 154.8   1
lvm-nal          1G           28917  21 15344  19           33645  26 136.3   1
xfsopt-lvm-nal   1G           29503  22 15369  19           36604  30 135.0   1

Ok, some improvement but still not really satisfying.

Reading through the archives I also found hints about improving
performance by aligning the VG to the stripe/chunk size. Although those
hints only talked about writing, I tried that too.

I came up with the following: the LVM header is 64K and my chunk size is
256K, so 256-64=192. I called pvcreate --metadatasize 192k, recreated the
VG/LV combo, formatted it again with plain mkfs.xfs (no sunit or swidth),
and ran bonnie again.
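
In command form that was roughly (the pvs check is added here as a sketch;
the field name may differ between lvm2 versions):

  pvcreate --metadatasize 192k /dev/md0
  pvs -o +pe_start /dev/md0    # verify that the first PE starts at 256 KiB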

Version  1.03c      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
xfs              1G           31359  20 24135  19           79587  29 154.8   1
lvm-nal          1G           28917  21 15344  19           33645  26 136.3   1
xfsopt-lvm-nal   1G           29503  22 15369  19           36604  30 135.0   1
lvm-al           1G           29561  21 15857  20           35242  28 130.6   1

A little bit better but not really that much.

Finally, I reformatted it with the sunit and swidth values from above.

Version  1.03c      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
xfs              1G           31359  20 24135  19           79587  29 154.8   1
lvm-nal          1G           28917  21 15344  19           33645  26 136.3   1
xfsopt-lvm-nal   1G           29503  22 15369  19           36604  30 135.0   1
lvm-al           1G           29561  21 15857  20           35242  28 130.6   1
xfsopt-lvm-al    1G           30948  22 15404  19           33715  26 118.4   1

Nope still not doing great here.

OK, this is where I am at now.
Do I really lose that much reading speed just by using LVM on RAID? I
would have understood some decrease in writing speed, but why is reading
affected that much?

Once again, sorry for the long E-Mail, but this is really bugging me  
right now.

Kind regards,
Michael 
  



* Re: Severe slowdown with LVM on RAID, alignment problem?
       [not found] <47C75436.9010301@harddata.com>
@ 2008-02-29  7:37 ` Michael Guntsche
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Guntsche @ 2008-02-29  7:37 UTC (permalink / raw)
  To: Maurice Hilarius; +Cc: linux-raid

On Thu, 28 Feb 2008 17:39:18 -0700, Maurice Hilarius <maurice@harddata.com>
wrote:
> If you use bonnie++ with a file size of 1GB, I hope you only have 256MB
> of RAM in this machine.
> Otherwise buffering will horribly skew any results to the point that
> they are unusable

Thank you for the info.
Currently I have 512MB RAM and 500MB swap in this box. I upped the size to
8GB, so eight 1GB files are created.
While the result for XFS is different (reading is actually faster), the
difference between XFS and XFS on LVM is still there.

pvcreate was called so that the first PE starts exactly at 256K; no
sunit/swidth values were used with mkfs.xfs for the LVM case.

I still do not understand the read output at all.

Version  1.03c      ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
xfs              8G           38876  25 28244  22           103891  40 161.4   2
lvm-al           8G           37089  24 18821  23           48489  40 155.8   2

While writing looks pretty similar, reading is still way down.

Totally at a loss here,
Michael

* Re: Severe slowdown with LVM on RAID, alignment problem?
@ 2008-02-29  8:12 Michael Guntsche
  2008-02-29 10:37 ` Peter Rabbitson
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Guntsche @ 2008-02-29  8:12 UTC (permalink / raw)
  To: Maurice Hilarius; +Cc: linux-raid


On Fri, 29 Feb 2008 00:53:06 -0700, Maurice Hilarius <maurice@harddata.com>
wrote:
> Michael Guntsche wrote:
>> ..
>> While the result for XFS is different (reading is actually faster), the
>> difference between XFS and XFS on LVM is still there.
>>
>>
> Great. At least now the figures are more realistic.

>> pvcreate was called so that the first PE starts exactly at 256K; no
>> sunit/swidth values were used with mkfs.xfs for the LVM case.
>>
>> I still do not understand the read output at all.
>>
> That is certainly a puzzle.

Is it possible that my computer is just too slow to get good read results?
I wonder, since writing seems to be nearly identical.
I just tried with an ext3 FS.
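
For a chunk-aligned ext3 one would typically pass the stride option, for
example (just a sketch, not necessarily the exact command used here):

  mkfs.ext3 -E stride=64 /dev/mapper/lamo-gabo   # 256 KiB chunk / 4 KiB block = 64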

Version 1.03c       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
lvm-al8g-ext3    8G           45029  27 22436  29           55034  45 192.0   3

While reading is a little faster, it's nowhere near the speed I get on
md0 itself.

Kind regards,
Michael



* Re: Severe slowdown with LVM on RAID, alignment problem?
  2008-02-29  8:12 Michael Guntsche
@ 2008-02-29 10:37 ` Peter Rabbitson
  2008-02-29 10:45   ` Michael Guntsche
  2008-03-01 20:45   ` Bill Davidsen
  0 siblings, 2 replies; 8+ messages in thread
From: Peter Rabbitson @ 2008-02-29 10:37 UTC (permalink / raw)
  To: Michael Guntsche; +Cc: Maurice Hilarius, linux-raid

Michael Guntsche wrote:
> 
> Is it possible that my computer is just too slow to get good read results?
unlikely

> While reading is a little bit faster it's nowhere near the speed I get on
> md0 itself.
> 

I would guess that you did not set the correct read-ahead value for the LV.
If you do not specify anything it will default to 128k (256 sectors), which is
terribly small for sequential reads. By contrast, the MD device does some
clever calculations and sets its read-ahead correctly depending on the RAID
level and the number of disks. Do:

blockdev --setra 65536 <your lv device>

and run the tests again. You are almost certainly going to get the results you 
are after.

Peter


* Re: Severe slowdown with LVM on RAID, alignment problem?
  2008-02-29 10:37 ` Peter Rabbitson
@ 2008-02-29 10:45   ` Michael Guntsche
  2008-03-01 20:45   ` Bill Davidsen
  1 sibling, 0 replies; 8+ messages in thread
From: Michael Guntsche @ 2008-02-29 10:45 UTC (permalink / raw)
  To: Peter Rabbitson; +Cc: Maurice Hilarius, linux-raid

Hello Peter

On Fri, 29 Feb 2008 11:37:58 +0100, Peter Rabbitson <rabbit+list@rabbit.us>
wrote:
> Michael Guntsche wrote:
> I would guess that you did not set the correct read-ahead value for the
> LV. If you do not specify anything it will default to 128k (256 sectors),
> which is terribly small for sequential reads. By contrast, the MD device
> does some clever calculations and sets its read-ahead correctly depending
> on the RAID level and the number of disks. Do:
> 
> blockdev --setra 65536 <your lv device>
> 
> and run the tests again. You are almost certainly going to get the results
> you are after.

I checked the read-ahead value on md0 (3072) and set this on the LV as
well.
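
Concretely, using the LV path from earlier in the thread:

  blockdev --getra /dev/md0                    # reports 3072
  blockdev --setra 3072 /dev/mapper/lamo-gabo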

Here is the result:

Version 1.03c       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
lvm              8G           37251  25 27620  25           103996  49 160.0   2

I did not test it with the proper sunit/swidth values yet, but the result
is now looking much better.
I'll play around with it some more this afternoon and post the results of
what works best for me.
In the meantime, thank you all for your quick and helpful responses.


Kind regards,
Michael



* Re: Severe slowdown with LVM on RAID, alignment problem?
  2008-02-29 10:37 ` Peter Rabbitson
  2008-02-29 10:45   ` Michael Guntsche
@ 2008-03-01 20:45   ` Bill Davidsen
  2008-03-01 21:26     ` Michael Guntsche
  1 sibling, 1 reply; 8+ messages in thread
From: Bill Davidsen @ 2008-03-01 20:45 UTC (permalink / raw)
  To: Peter Rabbitson; +Cc: Michael Guntsche, Maurice Hilarius, linux-raid

Peter Rabbitson wrote:
> Michael Guntsche wrote:
>>
>> Is it possible that my computer is just too slow to get good read 
>> results?
> unlikely
>
>> While reading is a little bit faster it's nowhere near the speed I 
>> get on
>> md0 itself.
>>
>
> I would guess that you did not set the correct read-ahead value for 
> the LV. If you do not specify anything it will default to 128k (256 
> sectors), which is terribly small for sequential reads. By contrast, 
> the MD device does some clever calculations and sets its read-ahead 
> correctly depending on the RAID level and the number of disks. Do:
>
> blockdev --setra 65536 <your lv device>
>
> and run the tests again. You are almost certainly going to get the 
> results you are after.

I will just comment that really large readahead values may cause 
significant memory usage and transfer of unused data. My observations 
and some posts indicate that very large readahead and/or chunk size may 
reduce random access performance. I believe you said you had 512MB RAM; 
that may be a factor as well.

Also, blockdev will allow you to diddle readahead on the device, 
/dev/sdX, the array /dev/mdX, and the LV /dev/mapper/NAME. The 
interaction of these, and the performance results of having the same 
exact amount of readahead memory used in different ways, is a fine topic 
for a thesis, conference paper, magazine article, or nightmare.
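
For example, to see all three levels at once (device names as used in this
thread; just a sketch):

  for dev in /dev/sdc /dev/md0 /dev/mapper/lamo-gabo; do
      echo -n "$dev: "; blockdev --getra $dev
  done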

Unless you are planning to use this machine mainly for running 
benchmarks, I would tune it for your actual load and a bit of worst case 
avoidance.

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 




* Re: Severe slowdown with LVM on RAID, alignment problem?
  2008-03-01 20:45   ` Bill Davidsen
@ 2008-03-01 21:26     ` Michael Guntsche
  2008-03-02 20:14       ` Bill Davidsen
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Guntsche @ 2008-03-01 21:26 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: linux-raid


On Mar 1, 2008, at 21:45, Bill Davidsen wrote:

>> blockdev --setra 65536 <your lv device>
>>
>> and run the tests again. You are almost certainly going to get the  
>> results you are after.
>
> I will just comment that really large readahead values may cause  
> significant memory usage and transfer of unused data. My observations  
> and some posts indicate that very large readahead and/or chunk size  
> may reduce random access performance. I believe you said you had  
> 512MB RAM; that may be a factor as well.
>

I did not set such a large read-ahead. I had a look at the md0 device  
which had a value of 3072 and set this on the LV device as well.  
Performance really improved after this.

>
> Unless you are planning to use this machine mainly for running  
> benchmarks, I would tune it for your actual load and a bit of worst  
> case avoidance.
>

The last part is exactly what I am aiming at right now.
I tried to keep my changes to a bare minimum.

* Change chunk size to 256K
* Align the physical extent of the LVM to it
* Use the same parameters for mkfs.xfs that are chosen automatically by
mkfs.xfs when called on the md0 device itself.

* Set the read-ahead of the LVM block device to the same value as the  
md0 device
* Change the stripe_cache_size to 2048


With these settings applied to my setup here, RAID+XFS and RAID+LVM+XFS
perform nearly identically, which was my goal from the beginning.
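
In command form the whole recipe boils down to roughly the following, on
top of the 256K-chunk array created at the start of this thread (device,
VG and LV names as used above; a sketch rather than a literal transcript):

  pvcreate --metadatasize 192k /dev/md0           # first PE starts at 256 KiB
  vgcreate lamo /dev/md0
  lvcreate -L 10G -n gabo lamo
  mkfs.xfs -d sunit=512,swidth=1536 /dev/mapper/lamo-gabo
  blockdev --setra 3072 /dev/mapper/lamo-gabo     # match md0's read-ahead
  echo 2048 > /sys/block/md0/md/stripe_cache_size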

Now I am off to figure out what's happening during the initial  
rebuild of the RAID-5 but see my other mail for this.

Once again, thank you all for your valuable input and support.

Kind regards,
Michael


* Re: Severe slowdown with LVM on RAID, alignment problem?
  2008-03-01 21:26     ` Michael Guntsche
@ 2008-03-02 20:14       ` Bill Davidsen
  0 siblings, 0 replies; 8+ messages in thread
From: Bill Davidsen @ 2008-03-02 20:14 UTC (permalink / raw)
  To: Michael Guntsche; +Cc: linux-raid

Michael Guntsche wrote:
>
> On Mar 1, 2008, at 21:45, Bill Davidsen wrote:
>
>>> blockdev --setra 65536 <your lv device>
>>>
>>> and run the tests again. You are almost certainly going to get the 
>>> results you are after.
>>
>> I will just comment that really large readahead values may cause 
>> significant memory usage and transfer of unused data. My observations 
>> and some posts indicate that very large readahead and/or chunk size 
>> may reduce random access performance. I believe you said you had 
>> 512MB RAM, that may be a factor as well.
>>
>
> I did not set such a large read-ahead. I had a look at the md0 device 
> which had a value of 3072 and set this on the LV device as well. 
> Performance really improved after this.
>
>>
>> Unless you are planning to use this machine mainly for running 
>> benchmarks, I would tune it for your actual load and a bit of worst 
>> case avoidance.
>>
>
> The last part is exactly what I am aiming at right now.
> I tried to keep my changes to a bare minimum.
>
> * Change chunk size to 256K
> * Align the physical extent of the LVM to it
> * Use the same parameters for mkfs.xfs that are chosen automatically 
> by mkfs.xfs when called on the md0 device itself.
>
> * Set the read-ahead of the LVM block device to the same value as the 
> md0 device
> * Change the stripe_cache_size to 2048
>
>
> With these settings applied to my setup here, RAID+XFS and 
> RAID+LVM+XFS perform nearly identically, which was my goal from the 
> beginning.
>
> Now I am off to figure out what's happening during the initial rebuild 
> of the RAID-5 but see my other mail for this.
>
> Once again, thank you all for your valuable input and support.
Thank you for reporting your results; hopefully they will be useful to
some future seeker of the same info.

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 




