Heh, would help if I actually attached the file ;) Phillip Susi wrote: > Attached are the results of some simple testing I did in ods format. > These tests involved having dd read the first GB of data from my two > drive sata (fake)raid0 array with varying numbers of concurrent aio > operations ( except for the original, non aio dd of course ). > > I performed these tests with cpufreq disabled and filesystems mounted > with noatime to insure no disturbances. I also set the IO scheduler to > noop, otherwise the default scheduler reordered the IO requests which > was not good for sequential throughput. I used commands like this: > > sync > dd bs=512 count=1 iflag=direct if=/dev/sda of=/dev/null > dd bs=512 count=1 iflag=direct if=/dev/sdb of=/dev/null > time dd bs=128KiB count=32768 iflag=direct if=/dev/mapper/via_hfciifae > of=/dev/null > > The first two commands were to make sure the drive head was on track > zero, otherwise the TCQ on the drives kicked in and reordered some of > the earlier reads as the head seeked to track zero. > > The results show a rather large increase in throughput for block sizes > under 128 KB, with a smaller improvement on larger block size. Likewise, > the cpu time used was significantly lower, especially with block sizes > less than 128 KB. In most cases, the original dd uses 2-3 times more > cpu time than the aio dd. > > The original dd reached near peak throughput ( 93.4 MB/s ) at a block > size of 128 KB. I believe this is due in part to that being the stripe > width of the array, so smaller block sizes did not keep both drives > operating full time. In contrast all of the aio trials reached peak > throughput of 97.x MB/s with a block size of only 32 KB, and at the > smallest block size of 16 KB, the aio(16) trial managed more than 20% > higher throughput than the non aio dd ( 72.1 vs 59.7 MB/s ), and did so > using 1/7th the cpu time. > > To show the difference O_DIRECT makes, at 128 KB block size the original > dd with O_DIRECT managed 93.4 MB/s using 0.906 seconds of CPU time. > Without O_DIRECT, the original dd only sustains 82.6 MB/s and uses a > whopping 2.912 seconds of cpu time, or more than triple the time without > O_DIRECT, and 13x more cpu time than the aio(4) test at that block size! >