From: Chris Snook <csnook@redhat.com>
To: "Pádraig Brady" <P@draigBrady.com>
Cc: Emmanuel Florac <eflorac@intellique.com>, linux-kernel@vger.kernel.org
Subject: Re: RAID-1 performance under 2.4 and 2.6
Date: Wed, 16 Jul 2008 14:18:25 -0400
Message-ID: <487E3B71.9040902@redhat.com>
In-Reply-To: <487E0B30.6050007@draigBrady.com>
Pádraig Brady wrote:
> Chris Snook wrote:
>> Bill Davidsen wrote:
>>> Chris Snook wrote:
>>>> Emmanuel Florac wrote:
>>>>> I post here because I couldn't find any information about this
>>>>> elsewhere: on the same hardware (Athlon X2 3500+, 512MB RAM, 2x400 GB
>>>>> Hitachi SATA2 hard drives) the 2.4 Linux software RAID-1 (tested
>>>>> 2.4.32 and 2.4.36.2, slightly patched to recognize the hardware :p) is
>>>>> way faster than 2.6 (tested 2.6.17.13, 2.6.18.8, 2.6.22.16, 2.6.24.3),
>>>>> especially for writes. I actually made the test on several different
>>>>> machines (same hard drives though) and it remained consistent across
>>>>> the board, with /mountpoint a software RAID-1.
>>>>> Actually checking disk activity with iostat or vmstat shows clearly a
>>>>> cache effect much more pronounced on 2.4 (i.e. writing goes on much
>>>>> longer in the background) but it doesn't really account for the
>>>>> difference. I've also tested it thru NFS from another machine (Giga
>>>>> ethernet network):
>>>>>
>>>>> dd if=/dev/zero of=/mountpoint/testfile bs=1M count=1024
>>>>>
>>>>> kernel   2.4       2.6       2.4 thru NFS   2.6 thru NFS
>>>>>
>>>>> write    90 MB/s   65 MB/s   70 MB/s        45 MB/s
>>>>> read     90 MB/s   80 MB/s   75 MB/s        65 MB/s
>>>>>
>>>>> Duh. That's terrible. Does it mean I should stick to (heavily
>>>>> patched...) 2.4 for my file servers or... ? :)
>>>>>
>>>> It means you shouldn't use dd as a benchmark.
>>>>
>>> What do you use as a benchmark for writing large sequential files or
>>> reading them, and why is it better than dd at modeling programs which
>>> read or write in a similar fashion?
>>>
>>> Media programs often do data access in just this fashion,
>>> multi-channel video capture, streaming video servers, and similar.
>>>
>> dd uses unaligned stack-allocated buffers, and defaults to block sized
>> I/O. To call this inefficient is a gross understatement. Modern
>> applications which care about streaming I/O performance use large,
>> aligned buffers which allow the kernel to efficiently optimize things,
>> or they use direct I/O to do it themselves, or they make use of system
>> calls like fadvise, madvise, splice, etc. that inform the kernel how
>> they intend to use the data or pass the work off to the kernel
>> completely. dd is designed to be incredibly lightweight, so it works
>> very well on a box with a 16 MHz CPU. It was *not* designed to take
>> advantage of the resources modern systems have available to enable
>> scalability.
>>
>> I suggest an application-oriented benchmark that resembles the
>> application you'll actually be using.
>
> I was trying to speed up an app¹ I wrote which streams parts of a large file,
> to separate files, and tested your advice above (on ext3 on 2.6.24.5-85.fc8).
>
> I tested reading blocks of 4096 bytes, both to stack and page aligned
> buffers, but there was a negligible difference in CPU usage between the
> aligned and non-aligned buffer cases.
> I guess the kernel could be clever and only copy the page to userspace
> on modification in the page aligned case, but the benchmarks at least
> don't suggest this is what's happening?
>
> What difference exactly should be expected from using page aligned buffers?
>
> Note I also tested using mmap to stream the data, and there is a significant
> decrease in CPU usage in user and kernel space as expected due to the
> data not being copied from the page cache.
>
> thanks,
> Pádraig.
>
> ¹ http://www.pixelbeat.org/programs/dvd-vr/

Page alignment, by itself, doesn't do much, but it implies a couple of
things:

1) cache line alignment, which matters more with some architectures than
   others
2) block alignment, which is necessary for direct I/O

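To illustrate the two implications above, here is a minimal sketch of
allocating a page-aligned buffer and checking its alignment. The helper
names (is_aligned, alloc_page_aligned) are made up for this example, and
the 64-byte cache line and 512-byte block size used when checking are
assumptions about typical hardware, not values taken from this thread.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: nonzero if ptr is a multiple of `alignment` bytes. */
int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)ptr % alignment) == 0;
}

/*
 * Hypothetical helper: allocate `size` bytes aligned to the system page
 * size. Returns NULL on failure; release the buffer with free().
 */
void *alloc_page_aligned(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;

    if (page <= 0)
        return NULL;
    /* posix_memalign() returns 0 on success and an error number otherwise. */
    if (posix_memalign(&buf, (size_t)page, size) != 0)
        return NULL;
    return buf;
}
```

Because the page size is a power of two that is at least as large as
the cache line and the block size on common systems, a buffer from
alloc_page_aligned() should also pass is_aligned(buf, 64) and
is_aligned(buf, 512). For direct I/O (O_DIRECT) the file offset and
transfer length generally need to be block-aligned as well, not just
the buffer address.
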
You're on the right track with mmap, but you may want to use madvise()
to tune the readahead on the pagecache.
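
A rough sketch of the mmap-plus-madvise() approach, assuming the whole
input file is streamed front to back into one output descriptor. The
function name copy_file_mmap is hypothetical, and MADV_SEQUENTIAL is
only one possible hint; whether it helps depends on the access pattern
and the kernel, so treat it as something to measure rather than a
guaranteed win.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy all of in_fd to out_fd through a read-only
 * mapping, so the data is written straight from the page cache mapping
 * instead of being read() into a user buffer first.
 * Returns the number of bytes copied, or -1 on error.
 */
long long copy_file_mmap(int in_fd, int out_fd)
{
    struct stat st;
    size_t len, done = 0;
    char *map;

    if (fstat(in_fd, &st) != 0)
        return -1;
    if (st.st_size == 0)
        return 0;                 /* mmap() rejects zero-length mappings */

    len = (size_t)st.st_size;
    map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, in_fd, 0);
    if (map == MAP_FAILED)
        return -1;

    /* Advisory hint that access will be sequential; failure is not fatal. */
    (void)madvise(map, len, MADV_SEQUENTIAL);

    while (done < len) {
        ssize_t n = write(out_fd, map + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;         /* interrupted before writing; retry */
            munmap(map, len);
            return -1;
        }
        done += (size_t)n;
    }

    munmap(map, len);
    return (long long)done;
}
```

To stream only part of a file, the offset passed to mmap() has to be a
multiple of the page size, so a caller would round the start down to a
page boundary and skip the leading bytes.
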
-- Chris