* How to boost performance
@ 2010-06-16 22:23 aragonx
2010-06-17 4:01 ` Roman Mamedov
0 siblings, 1 reply; 16+ messages in thread
From: aragonx @ 2010-06-16 22:23 UTC (permalink / raw)
To: linux-raid
Hello all,
I have a software RAID on my server that I would like to get better
performance out of. I routinely copy large files and/or groups of files
to and from this array over the network. When I start, I get a reasonable
xfer rate of about 50MB/sec. It gradually drops to around 20MB/sec after
say 1GB of data. Looking at the server, CPU usage is usually at 100%, with
md0_raid5 at the top of the list and flush right behind it. Perhaps 30-40% of
CPU time is spent on waits (that figure is from memory, though; if it is the
critical piece of information, I'll get a real number).
Before I start spewing server information, my question is this. Is there
anything I can tweak to improve my performance? It seems that the server
is CPU bound when I am transferring large amounts of data to or from it.
Any suggestions will be considered. Faster processor, switching to RAID
0, etc. I'm hoping there is something I can do with the RAID software
though. Maybe a different chunk size or different algorithm?
Now to the information:
# hdparm -t /dev/md0
/dev/md0:
Timing buffered disk reads: 954 MB in 3.00 seconds = 317.83 MB/sec
# uname -a
Linux homeserv.local 2.6.32.11-99.fc12.x86_64 #1 SMP Mon Apr 5 19:59:38
UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf1[0] sdg1[4] sdd1[3] sdc1[2] sdb1[1]
2930287616 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
# df -h /dev/md0
Filesystem            Size  Used Avail Use% Mounted on
/dev/md0              2.7T  724G  1.9T  28% /home/data
# free
             total       used       free     shared    buffers     cached
Mem:       8194940    5483640    2711300          0     171392    3131220
-/+ buffers/cache:    2181028    6013912
Swap:      4096532      50108    4046424
# mdadm -D /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Mon Jan 25 16:14:08 2010
Raid Level : raid5
Array Size : 2930287616 (2794.54 GiB 3000.61 GB)
Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Wed Jun 16 17:55:14 2010
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 18928390:76024ba7:d9fdb3bf:6408b6d2 (local to host
homeserv.local)
Events : 0.54506
    Number   Major   Minor   RaidDevice   State
       0       8       81        0        active sync   /dev/sdf1
       1       8       17        1        active sync   /dev/sdb1
       2       8       33        2        active sync   /dev/sdc1
       3       8       49        3        active sync   /dev/sdd1
       4       8       97        4        active sync   /dev/sdg1
# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 67
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 6000+
stepping : 3
cpu MHz : 3000.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy
svm extapic cr8_legacy
bogomips : 5999.16
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 67
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 6000+
stepping : 3
cpu MHz : 3000.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy
svm extapic cr8_legacy
bogomips : 5999.62
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
# dmidecode
Handle 0x003C, DMI type 17, 27 bytes
Memory Device
Array Handle: 0x003A
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 72 bits
Size: 2048 MB
Form Factor: DIMM
Set: None
Locator: DIMM0
Bank Locator: BANK0
Type: DDR2
Type Detail: Synchronous
Speed: 400 MHz
Manufacturer: Manufacturer00
Serial Number: SerNum00
Asset Tag: AssetTagNum0
Part Number: ModulePartNumber00
---
Will Y.
* Re: How to boost performance
2010-06-16 22:23 How to boost performance aragonx
@ 2010-06-17 4:01 ` Roman Mamedov
2010-06-17 8:17 ` Michael Evans
2010-06-17 13:49 ` aragonx
0 siblings, 2 replies; 16+ messages in thread
From: Roman Mamedov @ 2010-06-17 4:01 UTC (permalink / raw)
To: aragonx; +Cc: linux-raid
On Wed, 16 Jun 2010 18:23:36 -0400
aragonx@dcsnow.com wrote:
> Before I start spewing server information, my question is this. Is there
> anything I can tweak to improve my performance? It seems that the server
> is CPU bound when I am transferring large amounts of data to or from it.
> Any suggestions will be considered. Faster processor, switching to RAID
> 0, etc. I'm hoping there is something I can do with the RAID software
> though. Maybe a different chunk size or different algorithm?
Increasing stripe cache size from the default of 256 should help immensely.
echo 16384 > /sys/block/md0/md/stripe_cache_size
Be warned that this consumes (^that amount * 4096 * number of disks) bytes of
RAM. Some benchmarks:
http://peterkieser.com/2009/11/29/raid-mdraid-stripe_cache_size-vs-write-transfer/
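As a rough sketch (plain shell, assuming the 5-disk md0 from the original post;
the 16384 value is just the example above, not a recommendation), the memory
cost can be checked before applying the change:

cat /sys/block/md0/md/stripe_cache_size           # current value, default 256
echo $((16384 * 4096 * 5))                        # 335544320 bytes, i.e. about 320 MiB for 5 disks
echo 16384 > /sys/block/md0/md/stripe_cache_size  # apply (as root); does not persist across reboots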
On a related note -- Neil, are there plans to implement a stripe cache which
would be shared between all RAID devices? I have two RAID5s in my system, and
when one has a lot of writes, the other is often idle (or vice versa), so that
array's stripe cache is just sitting there wasting memory. Would be nice to
be able to have a shared pool of RAM for stripe-caching all the arrays and
the active one(s) using it to the fullest.
--
With respect,
Roman
* Re: How to boost performance
2010-06-17 4:01 ` Roman Mamedov
@ 2010-06-17 8:17 ` Michael Evans
2010-06-17 13:49 ` aragonx
1 sibling, 0 replies; 16+ messages in thread
From: Michael Evans @ 2010-06-17 8:17 UTC (permalink / raw)
To: Roman Mamedov; +Cc: aragonx, linux-raid
On Wed, Jun 16, 2010 at 9:01 PM, Roman Mamedov <roman@rm.pp.ru> wrote:
> On Wed, 16 Jun 2010 18:23:36 -0400
> aragonx@dcsnow.com wrote:
>
>> Before I start spewing server information, my question is this. Is there
>> anything I can tweak to improve my performance? It seems that the server
>> is CPU bound when I am transferring large amounts of data to or from it.
>> Any suggestions will be considered. Faster processor, switching to RAID
>> 0, etc. I'm hoping there is something I can do with the RAID software
>> though. Maybe a different chunk size or different algorithm?
>
> Increasing stripe cache size from the default of 256 should help immensely.
>
> echo 16384 > /sys/block/md0/md/stripe_cache_size
>
> Be warned that this consumes (^that amount * 4096 * number of disks) bytes of
> RAM. Some benchmarks:
>
> http://peterkieser.com/2009/11/29/raid-mdraid-stripe_cache_size-vs-write-transfer/
>
> On a related note -- Neil, are there plans to implement a stripe cache which
> would be shared between all RAID devices? I have two RAID5s in my system, and
> when one has a lot of writes, the other is often idle (or vice versa), so that
> array's stripe cache is just sitting there wasting memory. Would be nice to
> be able to have a shared pool of RAM for stripe-caching all the arrays and
> the active one(s) using it to the fullest.
>
> --
> With respect,
> Roman
>
Yet another idea: does your stripe (easily) fit within your CPU's cache?
(That way you can process transactions at CPU speed, and spend the rest of
the wait for the next stripe's delivery on other threads that aren't
blocked. Actually, I should benchmark this to see if I'm correct...)
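As a rough back-of-the-envelope check for the array in this thread (hedged:
64K chunk and 5 devices from the mdstat output above, 1024 KB of L2 cache per
core from /proc/cpuinfo):

echo $((64 * 5))KiB   # one full stripe, data plus parity: 320 KiB, which fits in the 1024 KB L2 cache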
I second the idea of shared stripe caches. However, I suggest there also be
an extra parameter, via another sysfs file, to set a stripe-cache id
(initially 0 or something).
Also, the above formula for stripe_cache_size memory use, as well as what
the other sysfs files do, should be noted in the man pages or somewhere the
man page points to.
* Re: How to boost performance
2010-06-17 4:01 ` Roman Mamedov
2010-06-17 8:17 ` Michael Evans
@ 2010-06-17 13:49 ` aragonx
2010-06-17 16:13 ` Roman Mamedov
1 sibling, 1 reply; 16+ messages in thread
From: aragonx @ 2010-06-17 13:49 UTC (permalink / raw)
To: Roman Mamedov; +Cc: aragonx, linux-raid
> On Wed, 16 Jun 2010 18:23:36 -0400
> Increasing stripe cache size from the default of 256 should help
> immensely.
>
> echo 16384 > /sys/block/md0/md/stripe_cache_size
>
> Be warned that this consumes (^that amount * 4096 * number of disks) bytes
> of
> RAM. Some benchmarks:
Hello. Thank you for the quick suggestions!
First I need to correct myself. I only have the slowdown when writing to
the server. Reads from the server are usually in the 60-70MB/sec range
regardless of file size.
Prior to the change above, on a 2GB file I would start the write (to the
server) at 70MB/sec and end at about 35MB/sec. CPU usage was at 100%, with
md0 using about 70% CPU, smb using 30%, and flush sometimes jumping in at
30%. Wait states remained below 10%. After the change, on a 2GB file I would
start the write at 70MB/sec and end at about 55MB/sec (nice improvement!).
Switching to a 5GB file, the write ended around 35MB/sec. At the beginning
smbd would start with 50% CPU but gradually go down to 30%, with md0 ramping
up to 70% and flush jumping in from time to time at 30%.
This information is all coming from top. If I should use something else
that will get more accurate readings, please let me know.
With one change my performance greatly increased but I would like to see
more. Is there anything else I can do? I'm hoping to get sustainable
60MB/sec writes. Is that possible?
Thank you if you have read this far! :)
---
Will Y.
* Re: How to boost performance
2010-06-17 13:49 ` aragonx
@ 2010-06-17 16:13 ` Roman Mamedov
2010-06-17 16:44 ` Stefan /*St0fF*/ Hübner
2010-06-17 19:46 ` aragonx
0 siblings, 2 replies; 16+ messages in thread
From: Roman Mamedov @ 2010-06-17 16:13 UTC (permalink / raw)
To: aragonx; +Cc: linux-raid
On Thu, 17 Jun 2010 09:49:42 -0400
aragonx@dcsnow.com wrote:
> Prior to the change above, on a 2GB file, I would start off the write (to
> the server) at 70MB/sec and end at about 35MB/sec. CPU usage was at 100%
> with the md0 using about 70% CPU and smb using 30% with flush sometimes
> jumping in at 30%. Wait states remained below 10%. After the change, on
> a 2GB file I would start the write at 70MB/sec and end about 55MB/sec
> (nice improvement!).
A more consistent way to test would be to cd into a directory on the array,
and repeatedly run something like:
dd if=/dev/zero of=zerofile bs=1M count=2048 conv=fdatasync,notrunc
...and implement various tweaks you are trying out between the runs, to see
their effect.
Also, the reason you see the write speed dropping off at the end is that
your server first fills up its write cache at almost the maximum attainable
sender (and network) speed; then, as the space in RAM for that cache runs
out, it starts flushing to disk, reducing the rate at which it accepts new
data from the network. So you see that the 70 MB/sec figure is totally
unrelated to the RAID's performance. The dd test described above, thanks to
those "conv" flags (see the dd man page), makes much more sense as a
benchmark.
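A hedged sketch of how one might script that comparison (plain shell; the
stripe_cache_size values tried here are arbitrary examples, and this has to
be run as root from a directory on the array):

for sc in 256 1024 4096 16384; do
    echo $sc > /sys/block/md0/md/stripe_cache_size
    echo "stripe_cache_size=$sc"
    dd if=/dev/zero of=zerofile bs=1M count=2048 conv=fdatasync,notrunc 2>&1 | tail -n 1
    rm -f zerofile
done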
--
With respect,
Roman
* Re: How to boost performance
2010-06-17 16:13 ` Roman Mamedov
@ 2010-06-17 16:44 ` Stefan /*St0fF*/ Hübner
2010-06-17 19:51 ` aragonx
2010-06-17 19:46 ` aragonx
1 sibling, 1 reply; 16+ messages in thread
From: Stefan /*St0fF*/ Hübner @ 2010-06-17 16:44 UTC (permalink / raw)
To: Roman Mamedov; +Cc: aragonx, linux-raid
Actually, you have not said a word about which controllers you use for the
drives. Using the wrong controller can drain a lot of speed. Judging from
the kernel benchmarks, it seems neither RAM nor computing power is the
bottleneck.
Some SATA controllers handle "nearly parallel" writes to multiple drives
better than others. SiI products, for example, show a noticeable drop-off
for each disk you add, while fairly recent Intel controllers show almost no
impact from many disks in parallel. So maybe that is the area you should
look into. (And maybe an lspci could help :)
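A hedged sketch of commands that could narrow this down (the /dev/sd[bcdfg]
names are assumptions taken from the mdstat output earlier in the thread):

lspci -k | grep -i -A 2 sata                     # controller models and which kernel driver is bound
for d in /dev/sd[bcdfg]; do hdparm -t $d; done   # raw read speed per member, to spot a slow disk or port
ls -l /sys/block/sd*/device                      # which controller each disk actually hangs off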
Stefan
On 17.06.2010 18:13, Roman Mamedov wrote:
> On Thu, 17 Jun 2010 09:49:42 -0400
> aragonx@dcsnow.com wrote:
>
>> Prior to the change above, on a 2GB file, I would start off the write (to
>> the server) at 70MB/sec and end at about 35MB/sec. CPU usage was at 100%
>> with the md0 using about 70% CPU and smb using 30% with flush sometimes
>> jumping in at 30%. Wait states remained below 10%. After the change, on
>> a 2GB file I would start the write at 70MB/sec and end about 55MB/sec
>> (nice improvement!).
>
> A more consistent way to test would be to cd into a directory on the array,
> and repeatedly run something like:
>
> dd if=/dev/zero of=zerofile bs=1M count=2048 conv=fdatasync,notrunc
>
> ...and implement various tweaks you are trying out between the runs, to see
> their effect.
>
> Also, the reason you see the write speed dropping off at the end is that
> your server first fills up its write cache at almost the maximum attainable
> sender (and network) speed; then, as the space in RAM for that cache runs
> out, it starts flushing to disk, reducing the rate at which it accepts new
> data from the network. So you see that the 70 MB/sec figure is totally
> unrelated to the RAID's performance. The dd test described above, thanks to
> those "conv" flags (see the dd man page), makes much more sense as a
> benchmark.
>
* Re: How to boost performance
2010-06-17 16:13 ` Roman Mamedov
2010-06-17 16:44 ` Stefan /*St0fF*/ Hübner
@ 2010-06-17 19:46 ` aragonx
2010-06-17 19:56 ` aragonx
2010-06-17 20:02 ` Roman Mamedov
1 sibling, 2 replies; 16+ messages in thread
From: aragonx @ 2010-06-17 19:46 UTC (permalink / raw)
To: Roman Mamedov; +Cc: linux-raid
> On Thu, 17 Jun 2010 09:49:42 -0400
> A more consistent way to test would be to cd into a directory on the array,
> and repeatedly run something like:
>
> dd if=/dev/zero of=zerofile bs=1M count=2048 conv=fdatasync,notrunc
>
> ...and implement various tweaks you are trying out between the runs, to see
> their effect.
>
> Also, the reason you see the write speed dropping off at the end is that
> your server first fills up its write cache at almost the maximum attainable
> sender (and network) speed; then, as the space in RAM for that cache runs
> out, it starts flushing to disk, reducing the rate at which it accepts new
> data from the network. So you see that the 70 MB/sec figure is totally
> unrelated to the RAID's performance. The dd test described above, thanks to
> those "conv" flags (see the dd man page), makes much more sense as a
> benchmark.
Hi Roman,
While I would agree with you if the performance was the same for reads as
it was for writes, that is not the case here. Additionally, I was not sure
where my bottleneck was; it did not have to be storage related, although I
had my suspicions. That is why I included all the information I thought was
relevant.
That being said, I did try two different dd tests, and they appear to give
the same results. This is a cleaner way of testing the storage subsystem,
though, and I will use it for further testing on this issue.
time dd if=/dev/zero of=zerofile bs=1M count=2048 conv=fdatasync,notrunc
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 28.728 s, 74.8 MB/s
real 0m28.732s
user 0m0.004s
sys 0m23.600s
time dd if=/dev/zero of=zerofile bs=1M count=6144 conv=fdatasync,notrunc
6144+0 records in
6144+0 records out
6442450944 bytes (6.4 GB) copied, 196.618 s, 32.8 MB/s
real 3m16.622s
user 0m0.012s
sys 0m27.726s
* Re: How to boost performance
2010-06-17 16:44 ` Stefan /*St0fF*/ Hübner
@ 2010-06-17 19:51 ` aragonx
2010-06-17 22:25 ` Roger Heflin
0 siblings, 1 reply; 16+ messages in thread
From: aragonx @ 2010-06-17 19:51 UTC (permalink / raw)
To: st0ff; +Cc: linux-raid
> Actually, you have not said a word about which controllers you use for the
> drives. Using the wrong controller can drain a lot of speed. Judging from
> the kernel benchmarks, it seems neither RAM nor computing power is the
> bottleneck.
>
> Some SATA controllers handle "nearly parallel" writes to multiple drives
> better than others. SiI products, for example, show a noticeable drop-off
> for each disk you add, while fairly recent Intel controllers show almost no
> impact from many disks in parallel. So maybe that is the area you should
> look into. (And maybe an lspci could help :)
>
> Stefan
Hi Stefan,
Here is the output of lspci. There are 5 disks on the ATI controller.
Four of them are part of the RAID. The last disk is on the JMicron
controller which is a PCIE-1x card. I can move one more disk to the
JMicron controller if you think that would help. Or just purchase a new 4
port controller?
lspci|grep SATA
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller
[IDE mode]
02:00.0 SATA controller: JMicron Technologies, Inc. 20360/20363 Serial ATA
Controller (rev 03)
---
Will Y.
* Re: How to boost performance
2010-06-17 19:46 ` aragonx
@ 2010-06-17 19:56 ` aragonx
2010-06-17 20:02 ` Roman Mamedov
1 sibling, 0 replies; 16+ messages in thread
From: aragonx @ 2010-06-17 19:56 UTC (permalink / raw)
To: Roman Mamedov; +Cc: linux-raid
> time dd if=/dev/zero of=zerofile bs=1M count=2048 conv=fdatasync,notrunc
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 28.728 s, 74.8 MB/s
>
> real 0m28.732s
> user 0m0.004s
> sys 0m23.600s
>
> time dd if=/dev/zero of=zerofile bs=1M count=6144 conv=fdatasync,notrunc
> 6144+0 records in
> 6144+0 records out
> 6442450944 bytes (6.4 GB) copied, 196.618 s, 32.8 MB/s
>
> real 3m16.622s
> user 0m0.012s
> sys 0m27.726s
Bleh, sorry for so many posts. I ran the same dd test on my OS disk, which
is on the same controller as 4 of the RAID disks above. It did much better,
even though that drive is slower than the ones that belong to the RAID.
time dd if=/dev/zero of=zerofile bs=1M count=6144 conv=fdatasync,notrunc
6144+0 records in
6144+0 records out
6442450944 bytes (6.4 GB) copied, 109.823 s, 58.7 MB/s
real 1m49.827s
user 0m0.014s
sys 0m20.965s
* Re: How to boost performance
2010-06-17 19:46 ` aragonx
2010-06-17 19:56 ` aragonx
@ 2010-06-17 20:02 ` Roman Mamedov
2010-06-17 20:43 ` Keld Simonsen
1 sibling, 1 reply; 16+ messages in thread
From: Roman Mamedov @ 2010-06-17 20:02 UTC (permalink / raw)
To: aragonx; +Cc: linux-raid
On Thu, 17 Jun 2010 15:46:05 -0400
aragonx@dcsnow.com wrote:
> While I would agree with you if the performance was the same for reads as
> it was for writes, that is not the case here.
Reads are a massively faster operation on RAID5 than writes; in my experience
an array easily reads at close to the theoretical limit, i.e. the speed of its
slowest member multiplied by the member count minus one.
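As a rough worked example for the array in this thread (hedged: the ~80 MB/s
per-disk figure is an assumption, since individual drive speeds were not
posted):

echo $(( (5 - 1) * 80 ))   # ~320 MB/s expected read ceiling, close to the 317 MB/sec hdparm -t result above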
--
With respect,
Roman
* Re: How to boost performance
2010-06-17 20:02 ` Roman Mamedov
@ 2010-06-17 20:43 ` Keld Simonsen
2010-06-18 17:55 ` aragonx
0 siblings, 1 reply; 16+ messages in thread
From: Keld Simonsen @ 2010-06-17 20:43 UTC (permalink / raw)
To: Roman Mamedov; +Cc: aragonx, linux-raid
On Fri, Jun 18, 2010 at 02:02:20AM +0600, Roman Mamedov wrote:
> On Thu, 17 Jun 2010 15:46:05 -0400
> aragonx@dcsnow.com wrote:
>
> > While I would agree with you if the performance was the same for reads as
> > it was for writes,
>
> Reads are a massively faster operation on RAID5 than writes; in my experience
> an array easily reads at close to the theoretical limit, i.e. the speed of its
> slowest member multiplied by the member count minus one.
actually raid5 sequential reads can be faster than n-1 times the slowest disk,
as it may skip the parity blocks faster than it could read them. Not much,
probably.
raid10,f2 is theoretically the fastest of the redundant RAID levels for
sequential reads, as it approaches the speed of raid0, with a theoretical
performance of the number of drives times the speed of the slowest disk.
For more on raid performance, see https://raid.wiki.kernel.org/index.php/Performance
and for bottlenecks, see eg https://raid.wiki.kernel.org/index.php/Performance#Bottlenecks
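For reference, a hedged example of how such an array is typically created
(hypothetical device names and member count; not a suggestion to rebuild the
existing md0):

mdadm --create /dev/md1 --level=10 --layout=f2 --raid-devices=4 /dev/sd[b-e]1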
Best regards
keld
* Re: How to boost performance
2010-06-17 19:51 ` aragonx
@ 2010-06-17 22:25 ` Roger Heflin
0 siblings, 0 replies; 16+ messages in thread
From: Roger Heflin @ 2010-06-17 22:25 UTC (permalink / raw)
To: aragonx; +Cc: st0ff, linux-raid
On 06/17/2010 02:51 PM, aragonx@dcsnow.com wrote:
>> Actually, you have not said a word about which controllers you use for the
>> drives. Using the wrong controller can drain a lot of speed. Judging from
>> the kernel benchmarks, it seems neither RAM nor computing power is the
>> bottleneck.
>>
>> Some SATA controllers handle "nearly parallel" writes to multiple drives
>> better than others. SiI products, for example, show a noticeable drop-off
>> for each disk you add, while fairly recent Intel controllers show almost no
>> impact from many disks in parallel. So maybe that is the area you should
>> look into. (And maybe an lspci could help :)
>>
>> Stefan
>
> Hi Stefan,
>
> Here is the output of lspci. There are 5 disks on the ATI controller.
> Four of them are part of the RAID. The last disk is on the JMicron
> controller which is a PCIE-1x card. I can move one more disk to the
> JMicron controller if you think that would help. Or just purchase a new 4
> port controller?
>
> lspci|grep SATA
> 00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller
> [IDE mode]
> 02:00.0 SATA controller: JMicron Technologies, Inc. 20360/20363 Serial ATA
> Controller (rev 03)
>
> ---
> Will Y.
>
>
[IDE mode] is probably a bad idea... a BIOS setting will change that.
Mine shows this:
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA
Controller [AHCI mode]
And my results on a similar test (3x500GB RAID5) are below. A single disk
shows the first speed, so the RAID write speed comes out to about 60-70% of
single-disk speed:
dd if=/dev/sdb of=/dev/null bs=1M count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 28.8118 s, 74.5 MB/s
dd if=/dev/zero of=zerofile bs=1M count=2048 conv=fdatasync,notrunc
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 36.6523 s, 58.6 MB/s
dd if=/dev/zero of=zerofile bs=1M count=6144 conv=fdatasync,notrunc
6144+0 records in
6144+0 records out
6442450944 bytes (6.4 GB) copied, 113.893 s, 56.6 MB/s
dd if=/dev/zero of=zerofile bs=1M count=16384 conv=fdatasync,notrunc
^C15006+0 records in
15006+0 records out
15734931456 bytes (16 GB) copied, 277.814 s, 56.6 MB/s
And this for reads:
dd if=zerofile of=/dev/null bs=1M count=6144 conv=fdatasync,notrunc
6144+0 records in
6144+0 records out
6442450944 bytes (6.4 GB) copied, 53.8029 s, 120 MB/s
* Re: How to boost performance
2010-06-17 20:43 ` Keld Simonsen
@ 2010-06-18 17:55 ` aragonx
2010-06-18 20:12 ` Roger Heflin
0 siblings, 1 reply; 16+ messages in thread
From: aragonx @ 2010-06-18 17:55 UTC (permalink / raw)
To: Keld Simonsen; +Cc: linux-raid
> On Fri, Jun 18, 2010 at 02:02:20AM +0600, Roman Mamedov wrote:
> actually raid5 sequential reads can be faster than n-1 times the slowest
> disk,
> as it may skip the parity blocks faster than it could read them. Not much,
> probably.
Hello all,
So would I be correct in concluding that I am CPU bound at this point? My
machine is busy creating the parity information, so it cannot receive data
any faster. A CPU replacement may be in order, but I have one question in
that direction. It appears that the md0 process is only using one core. Is
this because it is a single file being written? Should it be using both?
The main thrust of this question is which type of CPU upgrade will help my
situation. I could go for a higher clock speed or more cores. Actually, I
can't go much faster on the clock front; I think 3.4GHz is the max and I'm
at 3.0GHz.
As always, thank you for all your advice.
---
Will Y.
* Re: How to boost performance
2010-06-18 17:55 ` aragonx
@ 2010-06-18 20:12 ` Roger Heflin
2010-06-20 23:30 ` How to boost performance [SOLVED] aragonx
0 siblings, 1 reply; 16+ messages in thread
From: Roger Heflin @ 2010-06-18 20:12 UTC (permalink / raw)
To: aragonx; +Cc: Keld Simonsen, linux-raid
On Fri, Jun 18, 2010 at 12:55 PM, <aragonx@dcsnow.com> wrote:
>> On Fri, Jun 18, 2010 at 02:02:20AM +0600, Roman Mamedov wrote:
>> actually raid5 sequential reads can be faster than n-1 times the slowest
>> disk,
>> as it may skip the parity blocks faster than it could read them. Not much,
>> probably.
>
> Hello all,
>
> So would I be correct in concluding that I am CPU bound at this point? My
> machine is busy creating the parity information, so it cannot receive data
> any faster. A CPU replacement may be in order, but I have one question in
> that direction. It appears that the md0 process is only using one core. Is
> this because it is a single file being written? Should it be using both?
>
> The main thrust of this question is which type of CPU upgrade will help my
> situation. I could go for a higher clock speed or more cores. Actually, I
> can't go much faster on the clock front; I think 3.4GHz is the max and I'm
> at 3.0GHz.
>
> As always, thank you for all your advice.
Regardless of what top says, there is almost no way you could be CPU bound.
If you were CPU bound, the I/O rate should not get slower, since the CPU
does not go away; you would have to be backing up because the disk subsystem
cannot keep up.
You have some other issue, probably the fact that your SATA ports are
running in IDE mode and not AHCI mode. AHCI mode is newer and faster; IDE
mode is older, less capable, and slower.
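A hedged way to double-check which mode the ports are actually in from a
running system (exact driver names vary; in IDE mode a legacy PATA driver
would typically be bound instead of ahci):

lspci -k -s 00:11.0      # look for "Kernel driver in use: ahci" rather than a pata_* driver
dmesg | grep -i ahci     # AHCI capability and per-port link speed lines, if AHCI is active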
* Re: How to boost performance [SOLVED]
2010-06-18 20:12 ` Roger Heflin
@ 2010-06-20 23:30 ` aragonx
2010-06-21 0:21 ` Bernd Schubert
0 siblings, 1 reply; 16+ messages in thread
From: aragonx @ 2010-06-20 23:30 UTC (permalink / raw)
To: Roger Heflin; +Cc: linux-raid
Thank you all for your many suggestions.
After performing all the tweaks suggested, I was able to write a 5GB file
from my workstation to my server at 100MB/sec!!! Reads (from the server)
are equally as fast and are mainly limited by my workstation now. The
server has dual bonded gigabit NICs so it can throw the data pretty fast.
Here is what I did:
echo 16384 > /sys/block/md0/md/stripe_cache_size
Added to /etc/fstab for md0: barrier=1,journal_checksum,stripe=5
Switched to AHCI mode for both my SATA controllers
One of my disks was not set to 32-bit transfer mode in the BIOS while the
others were; I changed that.
And finally, my kernel version changed when I rebooted. I'm not sure whether
it is relevant, but it should be noted: 2.6.32.14-127.fc12.x86_64 instead of
2.6.32.11-99.fc12.x86_64.
Additionally, it should be noted that my wait states went down to about
2-3% while writing to the array (instead of 20-40%), and total CPU usage was
only about 40% instead of being maxed out.
dd if=/dev/zero of=zerofile bs=1M count=6144 conv=fdatasync,notrunc
6144+0 records in
6144+0 records out
6442450944 bytes (6.4 GB) copied, 37.1289 s, 174 MB/s
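For completeness, a hedged sketch of how ext4 stride/stripe-width are usually
derived for this geometry (assuming ext4 with 4 KiB blocks, the 64 KiB chunk
and 4 data disks from above), which may be worth comparing against the
stripe=5 mount option, if the e2fsprogs on the box supports these extended
options:

# stride = chunk / block size = 64 / 4 = 16; stripe-width = stride * data disks = 16 * 4 = 64
tune2fs -E stride=16,stripe_width=64 /dev/md0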
Thank you all for your help!
---
Will Y.
* Re: How to boost performance [SOLVED]
2010-06-20 23:30 ` How to boost performance [SOLVED] aragonx
@ 2010-06-21 0:21 ` Bernd Schubert
0 siblings, 0 replies; 16+ messages in thread
From: Bernd Schubert @ 2010-06-21 0:21 UTC (permalink / raw)
To: aragonx; +Cc: Roger Heflin, linux-raid
On Monday, June 21, 2010, aragonx@dcsnow.com wrote:
> Thank you all for your many suggestions.
>
> After performing all the tweaks suggested, I was able to write a 5GB file
> from my workstation to my server at 100MB/sec!!! Reads (from the server)
> are equally as fast and are mainly limited by my workstation now. The
> server has dual bonded gigabit NICs so it can throw the data pretty fast.
>
> Here is what I did:
>
> echo 16384 > /sys/block/md0/md/stripe_cache_size
> Added to /etc/fstab for md0: barrier=1,journal_checksum,stripe=5
Uh, be careful with journal_checksum. I have a nice Lustre bug where journal
checksums actually caused filesystem corruption. If you really want to use it,
you should first ask on the ext4 list what the state of that is. CC me and I
will jump in.
I'm not sure if my issue has already been fixed, but the last time I looked
at the code, the replay procedure in case of a failure was not optimal.
Unfortunately, it has a low priority on my long Lustre bug list... (Lustre
uses a patched ext3/ext4, so such issues come straight through to Lustre).
Cheers,
Bernd