linux-raid.vger.kernel.org archive mirror
* slow raid5 performance
@ 2007-10-18 22:21 nefilim
  2007-10-20 12:38 ` Peter Grandi
  0 siblings, 1 reply; 10+ messages in thread
From: nefilim @ 2007-10-18 22:21 UTC (permalink / raw)
  To: linux-raid


Hi

Pretty new to software raid, I have the following setup in a file server:

/dev/md0:
        Version : 00.90.03
  Creation Time : Wed Oct 10 11:05:46 2007
     Raid Level : raid5
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Oct 18 15:02:16 2007
          State : active
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 9dcbd480:c5ca0550:ca45cdab:f7c9f29d
         Events : 0.9

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       49        1      active sync   /dev/sdd1
       2       8       65        2      active sync   /dev/sde1

3 x 500GB WD RE2 hard drives
AMD Athlon XP 2400 (2.0Ghz), 1GB RAM
/dev/sd[ab] are connected to Sil 3112 controller on PCI bus
/dev/sd[cde] are connected to Sil 3114 controller on PCI bus

Transferring large media files from /dev/sdb to /dev/md0 I see the following
with iostat:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.01    0.00   55.56   40.40    0.00    3.03

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               0.00         0.00         0.00          0          0
sdb             261.62        31.09         0.00         30          0
sdc             148.48         0.15        16.40          0         16
sdd             102.02         0.41        16.14          0         15
sde             113.13         0.29        16.18          0         16
md0            8263.64         0.00        32.28          0         31
 
which is pretty much what I see with hdparm etc. 32MB/s seems pretty slow
for drives that can easily do 50MB/s each. Read performance is better, at
around 85MB/s (although I expected somewhat higher), so it doesn't seem that
the PCI bus is the limiting factor here quite yet (127MB/s theoretical
throughput, maybe 100MB/s real world?). I do see a lot of time being spent
in the kernel, and significant iowait time. The CPU is pretty old, but where
exactly is the bottleneck?

Any thoughts, insights or recommendations welcome!

Cheers
Peter
-- 
View this message in context: http://www.nabble.com/slow-raid5-performance-tf4650085.html#a13284909
Sent from the linux-raid mailing list archive at Nabble.com.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: slow raid5 performance
  2007-10-18 22:21 nefilim
@ 2007-10-20 12:38 ` Peter Grandi
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Grandi @ 2007-10-20 12:38 UTC (permalink / raw)
  To: Linux RAID

>>> On Thu, 18 Oct 2007 16:45:20 -0700 (PDT), nefilim
>>> <thenephilim13@yahoo.com> said:

[ ... ]

> 3 x 500GB WD RE2 hard drives
> AMD Athlon XP 2400 (2.0Ghz), 1GB RAM
[ ... ]
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            1.01    0.00   55.56   40.40    0.00    3.03
[ ... ]
> which is pretty much what I see with hdparm etc. 32MB/s seems
> pretty slow for drives that can easily do 50MB/s each. Read
> performance is better around 85MB/s (although I expected
> somewhat higher).

> So it doesn't seem that PCI bus is limiting factor here

Most 500GB drives can do 60-80MB/s on the outer tracks
(30-40MB/s on the inner ones), and 3 together can easily swamp
the PCI bus. While the net rate you see equals roughly two disks'
worth of writes, the OS is really writing to all three disks at
the same time, and it will do read-modify-write (RMW) unless the
writes are exactly stripe aligned. When RMW happens, the write
speed drops below that of writing to a single disk.
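
As a rough sketch with the iostat figures you posted (my
arithmetic, not a measurement):

  md0 written:          32.28 MB/s of user data
  sdc+sdd+sde written:  16.40 + 16.14 + 16.18 = 48.72 MB/s  (data + parity, ~1.5x)
  sdb read:             31.09 MB/s
  total on the shared PCI bus: roughly 80 MB/s, already most of
  a ~100 MB/s real-world ceiling

and with a 64K chunk on 3 drives a full stripe holds 2 x 64K =
128K of data, so anything the kernel cannot issue as aligned
128K stripes ends up as RMW.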

> I see a lot of time being spent in the kernel.. and a
> significant iowait time.

The system time is because the Linux page cache etc. is CPU
bound (never mind RAID5 XOR computation, which is not that
big). The IO wait is because IO is taking place.

  http://www.sabi.co.uk/blog/anno05-4th.html#051114

Almost all kernel developers of note have been hired by wealthy
corporations who sell to people buying large servers. The typical
systems that these developers have, and also target, are high-end
2-4 CPU workstations and servers with CPUs many times faster than
your PC, and on those systems the CPU overhead of the page cache
at speeds like yours is less than 5%.

My impression is that something that takes less than 5% on a
developer's system does not get looked at, even if it takes 50%
on your system. The Linux kernel was very efficient when most
developers were using old cheap PCs themselves; "scratch your
itch" rules.

Anyhow, try to bypass the page cache with 'O_DIRECT', or test
with 'dd oflag=direct' and similar, for an alternative code path.
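
For example (only a sketch -- the mount point and sizes here are
made up, adjust them to your setup, and note the test file gets
created and then overwritten):

  # sequential write, bypassing the page cache
  dd if=/dev/zero of=/mnt/md0/ddtest bs=1M count=2048 oflag=direct
  # sequential read back, also bypassing the cache
  dd if=/mnt/md0/ddtest of=/dev/null bs=1M iflag=direct
  rm /mnt/md0/ddtest

Comparing those rates with your buffered copy should show how
much of the gap is page cache overhead rather than the disks or
the bus.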

> The CPU is pretty old but where exactly is the bottleneck?

Misaligned writes and page cache CPU time most likely.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: slow raid5 performance
@ 2007-10-22 16:15 Peter
  2007-10-22 16:58 ` Justin Piszcz
  0 siblings, 1 reply; 10+ messages in thread
From: Peter @ 2007-10-22 16:15 UTC (permalink / raw)
  To: linux-raid

Does anyone have any insights here? How do I interpret the seemingly competing system & iowait numbers... is my system both CPU and PCI bus bound? 

----- Original Message ----
From: nefilim
To: linux-raid@vger.kernel.org
Sent: Thursday, October 18, 2007 4:45:20 PM
Subject: slow raid5 performance

[ ... original message quoted in full above ... ]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: slow raid5 performance
  2007-10-22 16:15 Peter
@ 2007-10-22 16:58 ` Justin Piszcz
  0 siblings, 0 replies; 10+ messages in thread
From: Justin Piszcz @ 2007-10-22 16:58 UTC (permalink / raw)
  To: Peter; +Cc: linux-raid

With SW RAID 5 on the PCI bus you are not going to see faster than 38-42 
MiB/s, and with only three drives it may be slower than that. Forget / stop 
using the PCI bus if you expect high transfer rates.

For writes = 38-42 MiB/s sw raid5.
For reads = you will get close to 120-122 MiB/s sw raid5.
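
Rough reasoning behind those numbers (back-of-envelope, not a
benchmark):

  PCI 32-bit/33MHz:  133 MB/s theoretical, shared by everything on the bus
  reads:   only user data crosses the bus, so the net rate can sit
           close to the practical bus limit (hence ~120 MiB/s)
  writes:  user data plus parity crosses the bus, N/(N-1) times the
           data, and read-modify-write can roughly double that again,
           so the usable rate lands well under half of the read figure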

This is from a lot of testing going up to 400GB x 10 drives using PCI 
cards on a regular PCI bus.

Then I went PCI-e and used faster disks to get 0.5 gigabytes/sec with SW raid5.

Justin.

On Mon, 22 Oct 2007, Peter wrote:

> Does anyone have any insights here? How do I interpret the seemingly competing system & iowait numbers... is my system both CPU and PCI bus bound?
>
> [ ... original message quoted in full above ... ]
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: slow raid5 performance
@ 2007-10-22 17:18 Peter
  2007-10-22 20:52 ` Peter Grandi
  0 siblings, 1 reply; 10+ messages in thread
From: Peter @ 2007-10-22 17:18 UTC (permalink / raw)
  To: linux-raid


----- Original Message ----
From: Peter Grandi <pg_lxra@lxra.for.sabi.co.UK>

Thank you for your insightful response Peter (the Yahoo spam filter hid it
from me until now).

> Most 500GB drives can do 60-80MB/s on the outer tracks
> (30-40MB/s on the inner ones), and 3 together can easily swamp
> the PCI bus. [ ... ] it will do read-modify-write (RMW) unless
> the writes are exactly stripe aligned. When RMW happens, the
> write speed drops below that of writing to a single disk.

I can understand that if an RMW happens it will effectively lower the write
throughput substantially, but I'm not entirely sure why this would happen
while writing new content; I don't know enough about RAID internals. Would
this be the case the majority of the time?

> The system time is because the Linux page cache etc. is CPU
> bound (never mind RAID5 XOR computation, which is not that
> big). The IO wait is because IO is taking place.
>
>   http://www.sabi.co.uk/blog/anno05-4th.html#051114
>
> Almost all kernel developers of note have been hired by wealthy
> corporations who sell to people buying large servers. [ ... ]
> My impression is that something that takes less than 5% on a
> developer's system does not get looked at, even if it takes 50%
> on your system. The Linux kernel was very efficient when most
> developers were using old cheap PCs themselves; "scratch your
> itch" rules.

This is a rather unfortunate situation; it seems that some of the roots have
been forgotten, especially in a case like this where one would think a
modest CPU should be enough to run a file server. I was waiting for Phenom
and AM2+ motherboards to become available before relegating this X2 4600+ to
file server duty, so I guess I'll have to live with the slow performance for
a few more months.

> Anyhow, try to bypass the page cache with 'O_DIRECT', or test
> with 'dd oflag=direct' and similar, for an alternative code path.

I'll give this a try, thanks.

> Misaligned writes and page cache CPU time most likely.

What influence would adding more hard drives to this RAID have? I know in
terms of a Netapp filer they always talk about spindle count for performance.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: slow raid5 performance
@ 2007-10-22 17:21 Peter
  2007-10-22 19:23 ` Richard Scobie
  0 siblings, 1 reply; 10+ messages in thread
From: Peter @ 2007-10-22 17:21 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-raid

Thanks Justin, good to hear about some real world experience. 

----- Original Message ----
From: Justin Piszcz <jpiszcz@lucidpixels.com>
To: Peter <thenephilim13@yahoo.com>
Cc: linux-raid@vger.kernel.org
Sent: Monday, October 22, 2007 9:58:16 AM
Subject: Re: slow raid5 performance


[ ... Justin's message and the original post quoted in full above ... ]




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: slow raid5 performance
  2007-10-22 17:21 slow raid5 performance Peter
@ 2007-10-22 19:23 ` Richard Scobie
  2007-10-22 19:33   ` Justin Piszcz
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Scobie @ 2007-10-22 19:23 UTC (permalink / raw)
  To: linux-raid

Peter wrote:
> Thanks Justin, good to hear about some real world experience. 

Hi Peter,

I recently built a 3 drive RAID5 using the onboard SATA controllers on 
an MCP55 based board and get around 115MB/s write and 141MB/s read.

A fourth drive was added some time later, and after growing the array and 
filesystem (XFS), I saw 160MB/s write and 178MB/s read, with the array 60% 
full.

Regards,

Richard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: slow raid5 performance
  2007-10-22 19:23 ` Richard Scobie
@ 2007-10-22 19:33   ` Justin Piszcz
  2007-10-22 20:18     ` Peter Grandi
  0 siblings, 1 reply; 10+ messages in thread
From: Justin Piszcz @ 2007-10-22 19:33 UTC (permalink / raw)
  To: Richard Scobie; +Cc: linux-raid



On Tue, 23 Oct 2007, Richard Scobie wrote:

> Peter wrote:
>> Thanks Justin, good to hear about some real world experience. 
>
> Hi Peter,
>
> I recently built a 3 drive RAID5 using the onboard SATA controllers on an 
> MCP55 based board and get around 115MB/s write and 141MB/s read.
>
> A fourth drive was added some time later and after growing the array and 
> filesystem (XFS), saw 160MB/s write and 178MB/s read, with the array 60% 
> full.
>
> Regards,
>
> Richard

Yes, your chipset must be PCI-e based and not PCI.

Justin.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: slow raid5 performance
  2007-10-22 19:33   ` Justin Piszcz
@ 2007-10-22 20:18     ` Peter Grandi
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Grandi @ 2007-10-22 20:18 UTC (permalink / raw)
  To: Linux RAID

>>> On Mon, 22 Oct 2007 15:33:09 -0400 (EDT), Justin Piszcz
>>> <jpiszcz@lucidpixels.com> said:

[ ... speed difference between PCI and PCIe RAID HAs ... ]

>> I recently built a 3 drive RAID5 using the onboard SATA
>> controllers on an MCP55 based board and get around 115MB/s
>> write and 141MB/s read.  A fourth drive was added some time
>> later and after growing the array and filesystem (XFS), saw
>> 160MB/s write and 178MB/s read, with the array 60% full.

jpiszcz> Yes, your chipset must be PCI-e based and not PCI.

Broadly speaking yes (the MCP55 is a PCIe chipset), but it is
more complicated than that. The "south bridge" chipset host
adapters often have a rather faster link to memory and to the CPU
interconnect than the PCI or PCIe buses can provide, even when
they are externally "PCI".

Also, when the RAID HA is not in-chipset, it matters a fair bit
how many lanes the PCIe slot it is plugged into provides (or
whether it is PCI-X at 64-bit/66MHz) -- most PCIe RAID HAs can
use 4 or 8 lanes (or the equivalent for PCI-X).
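
For a sense of scale, the usual theoretical peaks (before
protocol overhead; the PCIe figures are per direction):

  PCI   32-bit/33MHz    ~133 MB/s, shared by every device on the bus
  PCI-X 64-bit/66MHz    ~533 MB/s
  PCIe 1.x x1           ~250 MB/s
  PCIe 1.x x4           ~1 GB/s
  PCIe 1.x x8           ~2 GB/s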


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: slow raid5 performance
  2007-10-22 17:18 Peter
@ 2007-10-22 20:52 ` Peter Grandi
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Grandi @ 2007-10-22 20:52 UTC (permalink / raw)
  To: Linux RAID

>>> On Mon, 22 Oct 2007 10:18:59 -0700 (PDT), Peter
>>> <thenephilim13@yahoo.com> said:

[ ... ]

thenephilim13> I can understand that if a RMW happens it will
thenephilim13> effectively lower the write throughput
thenephilim13> substantially but I'm not sure entirely sure why
thenephilim13> this would happen while writing new content,

It does not really depend on new vs. old content, but on whether
the filesystem knows the real addresses of the blocks it is
writing to and assembles writes into sequences beginning at the
appropriate boundary and of the appropriate length. XFS (at least
in some recent versions) makes a valiant attempt to deduce the
right values from the underlying storage system, but double
checking usually helps.
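
For instance (a sketch only -- the device and mount point are
just examples, and option details depend on your xfsprogs
version):

  # see what geometry an existing filesystem was created with
  xfs_info /mnt/md0
  # or state the geometry explicitly at creation time;
  # su = md chunk size, sw = number of data disks (3-drive RAID5 -> 2)
  mkfs.xfs -d su=64k,sw=2 /dev/md0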

>> [ ... ] My impression is that something that takes less than
>> 5% on a developer's system does not get looked at, even if
>> it takes 50% on your system. The Linux kernel was very
>> efficient when most developers were using old cheap PCs
>> themselves. "scratch your itch" rules.

thenephilim13> This is a rather unfortunate situation, it seems
thenephilim13> that some of the roots are forgotten,

If somebody's roots are being a poor postdoc/postgrad scrounging
used/broken PC parts, and then they get hired at a huge salary
and given a 4-CPU 4GB PC to play with, usually they are very
happy to forget those roots and don't think it unfortunate.

It is just another instance of the "who pays the piper calls the
tune" principle.

thenephilim13> especially in a case like this where one would
thenephilim13> think running a file server on a modest CPU
thenephilim13> should be enough.

Let's say that I have myself been astonished by how CPU
intensive the Linux page cache is. But then I think that quite a
few aspects of the Linux page/IO management subsystems (never
mind sick, mad horrors like 'vm/page-cluster') could be
substantially improved by someone who had read the relevant
literature (I am not volunteering, unfortunately) instead of
trying to wing it and floundering.

[ ... ]

>> Misaligned writes and page cache CPU time most likely.

thenephilim13> What influence would adding more harddrives to
thenephilim13> this RAID have?

If you can guarantee that writes (and reads, less importantly)
are aligned and of the right size, adding more drives improves
performance as long as this does not hit the bus bandwidth and
CPU limits (and your system is likely near those); unfortunately,
adding more drives makes it less likely that writes will be
properly aligned and of the right size, never mind all the other
downsides. A 3-drive RAID5 is one of only two common (or
uncommon) cases in which RAID5 is a tolerable idea.
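
To put rough numbers on the alignment point: with your 64K chunk
the full data stripe that has to be filled to avoid RMW grows
with the drive count,

  3 drives:  2 x 64K = 128K
  5 drives:  4 x 64K = 256K
  9 drives:  8 x 64K = 512K

so the larger the array, the less often ordinary write traffic
lands exactly on those boundaries.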

thenephilim13> I know in terms of a Netapp filer they always
thenephilim13> talk about spindle count for performance.

NetApp use something very different from RAID5 (which is not
quite RAID4), and they also have a custom filesystem, described
in an interesting paper:

  http://www.netapp.com/library/tr/3002.pdf
  http://blogs.sun.com/val/entry/zfs_faqs_freenix_cfp_new

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2007-10-22 20:52 UTC | newest]

Thread overview: 10+ messages
-- links below jump to the message on this page --
2007-10-22 17:21 slow raid5 performance Peter
2007-10-22 19:23 ` Richard Scobie
2007-10-22 19:33   ` Justin Piszcz
2007-10-22 20:18     ` Peter Grandi
  -- strict thread matches above, loose matches on Subject: below --
2007-10-22 17:18 Peter
2007-10-22 20:52 ` Peter Grandi
2007-10-22 16:15 Peter
2007-10-22 16:58 ` Justin Piszcz
2007-10-18 22:21 nefilim
2007-10-20 12:38 ` Peter Grandi
