* Help with chunksize on raid10 -p o3 array
@ 2007-03-06 11:26 Peter Rabbitson
From: Peter Rabbitson @ 2007-03-06 11:26 UTC (permalink / raw)
  To: linux-raid

Hi,
I have been trying to figure out the best chunk size for raid10 before
migrating my server to it (currently raid1). I am looking at an offset
layout with 3 copies (-p o3), as I want redundancy against two drive
failures, and offset striping is said to have the best write
performance, with read performance equal to the far layout. Information
on the internet is scarce, so I decided to test chunk sizes myself. I
used the script http://rabbit.us/pool/misc/raid_test.txt to iterate
through different chunk sizes and dd the resulting array to /dev/null.
I deliberately did not put a filesystem on top of the array - I was
only interested in raw performance, and with no FS layer involved no
caching/optimization takes place. I also monitored the process with
dstat in a separate window, and the memory usage confirmed that this
method is valid.
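
For reference, the test boils down to a loop of roughly this shape (a
sketch, not necessarily identical to the script; the device names,
chunk sizes and the explicit cache drop are only illustrative):

  #!/bin/sh
  # Create an o3 array at a given chunk size, read it back with dd,
  # then tear the array down and move on to the next chunk size.
  for chunk in 16 32 64 128 512 1024 4096; do
      mdadm --create /dev/md0 --level=10 --layout=o3 --chunk=$chunk \
            --raid-devices=4 /dev/sd[b-e]1
      # (add --assume-clean above to skip the initial resync, or wait
      #  for the resync to finish before timing the read)
      echo 3 > /proc/sys/vm/drop_caches    # start each run with a cold cache
      dd if=/dev/md0 of=/dev/null bs=1M count=1200
      mdadm --stop /dev/md0
      mdadm --zero-superblock /dev/sd[b-e]1
  done
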
I got some pretty weird results:
http://rabbit.us/pool/misc/raid_test_results.txt
From everything I have read so far, I expected that as the chunk size
increases, large-block throughput decreases while small-block read
throughput increases, so it is just a matter of finding a "sweet spot"
that balances the two. The results, however, clearly show something
else. There are some inconsistencies, which I attribute to my
non-scientific approach, but the trend is clear.

Here are the questions I have:

* Why did the test show the best consistent performance with a 16k
chunk? Is there a way to determine this number from the drive
characteristics alone, without running a lengthy benchmark?

* Why, although there are 3 identical copies of every chunk, did dstat
never show simultaneous reads from more than 2 drives? Every dd run
maxed out one of the drives at 58MB/s while another one tried to catch
up to varying degrees depending on the chunk size. Then on the next dd
run two other drives would be selected (seemingly at random) and the
process would repeat.
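
(The per-drive figures come from watching dstat; an invocation of
roughly this form shows per-disk throughput - the device names are
examples:)

  # one read/write column pair per member drive, refreshed every second
  dstat -d -D sdb,sdc,sdd,sde 1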

* Something the test results don't show, but dstat did, is how the
array resync behaved after array creation. Although my system can
sustain reads from all 4 drives at their maximum speed of 58MB/s each,
here is what the resync looked like at different chunk sizes:

32k	-	simultaneous reads from all 4 drives at 47MB/s sustained
64k	-	simultaneous reads from all 4 drives at 56MB/s sustained
128k	-	simultaneous reads from all 4 drives at 54MB/s sustained
512k	-	simultaneous reads from all 4 drives at 30MB/s sustained
1024k	-	simultaneous reads from all 4 drives at 38MB/s sustained
4096k	-	simultaneous reads from all 4 drives at 44MB/s sustained
16384k	-	simultaneous reads from all 4 drives at 46MB/s sustained
32768k	-	simultaneous reads from 2 drives at 58MB/s sustained and
  		the other two at 26MB/s sustained alternating the speed
		between the pairs of drives every 3 seconds or so
65536k	-	All 4 drives started at 58MB/s sustained gradually
		reducing to 44MB/s sustained at the same time

I repeated just the array creation - the results are consistent. Is
there any explanation for this?
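
(For completeness: md reports the resync progress and speed in
/proc/mdstat, and the resync rate is clamped between two sysctls; the
values in the comments are the usual kernel defaults, mentioned only
for reference:)

  watch -n 1 cat /proc/mdstat              # progress and current resync speed
  cat /proc/sys/dev/raid/speed_limit_min   # typically 1000 (KB/s per device)
  cat /proc/sys/dev/raid/speed_limit_max   # typically 200000 (KB/s per device)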

* raid10 far layout outperforms offset at writing? (was: Help with chunksize on raid10 -p o3 array)
@ 2008-12-17 12:41 Keld Jørn Simonsen
From: Keld Jørn Simonsen @ 2008-12-17 12:41 UTC (permalink / raw)
  To: linux-raid

I found this old message:

> Peter Rabbitson
> Mon, 19 Mar 2007 06:14:38 -0800
> 
> Peter Rabbitson wrote:
> 
>     I have been trying to figure out the best chunk size for raid10
> before migrating my server to it (currently raid1). I am looking at 3
> offset stripes, as I want to have two drive failure redundancy, and
> offset striping is said to have the best write performance, with read
> performance equal to far. 
> 
> Incorporating suggestions from previous posts (thank you everyone), I
> used this modified script at http://rabbit.us/pool/misc/raid_test2.txt
> To negate effects of caching memory was jammed below 200mb free by using
> a full tmpfs mount with no swap. Here is what I got with far layout (-p
> f3): http://rabbit.us/pool/misc/raid_far.html The clear winner is 1M
> chunks, which is very consistent at any block size. I was even more
> surprised to see that my read speed was identical to that of a raid0,
> getting near the _maximum_ physical speed of 4 drives (roughly 55MB/s
> sustained across 1.2G). Unlike the offset layout, far really shines at
> reading data back. The write speed did not suffer noticeably compared
> to offset striping. Here are the results for the offset layout (-p o3)
> for comparison: http://rabbit.us/pool/misc/raid_offset.html, and they
> roughly seem to correlate with my earlier dd testing.
> 
> So I guess the way to go for this system will be f3, although md(4)
> says that the offset layout should be more beneficial. Is there
> anything I missed while setting up my o3 array that would explain
> worse performance for both read and write compared to f3?
> 
> Once again thanks everyone for the help.
> Peter
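
(The cache-jamming step Peter describes can be reproduced roughly like
this; the mount point and size are only examples and have to be tuned
to the machine's RAM, with swap turned off:)

  mkdir -p /mnt/jam
  mount -t tmpfs -o size=3800m tmpfs /mnt/jam
  dd if=/dev/zero of=/mnt/jam/filler bs=1M   # fills tmpfs, leaving little RAM
                                             # for the page cache
  free -m                                    # check that ~200MB stays free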

The links are not valid anymore. I wanted to see the results and
possibly include them in the performance wiki page. I would appreciate
some new links.

Furthermore, some comments on the post: my take on o3 vs f3 is that,
both in theory and in practice, f3 should be much faster for sequential
reading, as the layout is equivalent to raid0. For random reading, and
for sequential and random writing, f3 and o3 (and the same goes for the
more common f2 vs o2) should be about the same, especially when a
filesystem and its associated elevator algorithm are employed.
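
For anyone wanting to reproduce the comparison, the layout is chosen at
array creation time, for example like this (device names are
illustrative; the 1024k chunk is the value Peter's far-layout results
favoured):

  # far layout, 3 copies over 4 drives - sequential reads stripe like raid0
  mdadm --create /dev/md0 --level=10 --layout=f3 --chunk=1024 \
        --raid-devices=4 /dev/sd[b-e]1

  # offset layout, 3 copies, for comparison (re-create on the same drives)
  mdadm --create /dev/md0 --level=10 --layout=o3 --chunk=1024 \
        --raid-devices=4 /dev/sd[b-e]1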

Best regards
keld


Thread overview: 11+ messages
2007-03-06 11:26 Help with chunksize on raid10 -p o3 array Peter Rabbitson
2007-03-07  0:31 ` Bill Davidsen
2007-03-07  9:28   ` Peter Rabbitson
2007-03-12  4:28 ` Neil Brown
2007-03-12 14:46   ` Peter Rabbitson
2007-03-12 18:45     ` Richard Scobie
2007-03-12 21:16       ` Peter Rabbitson
2007-03-19 14:14 ` raid10 far layout outperforms offset at writing? (was: Help with chunksize on raid10 -p o3 array) Peter Rabbitson
  -- strict thread matches above, loose matches on Subject: below --
2008-12-17 12:41 Keld Jørn Simonsen
2008-12-17 12:50 ` Peter Rabbitson
2008-12-17 14:34   ` Keld Jørn Simonsen
