From: Greg Stark <gsstark@mit.edu>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Helge Hafting <helgehaf@aitel.hist.no>,
jw schultz <jw@pegasys.ws>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: raid0 slower than devices it is assembled of?
Date: 07 Jan 2004 23:54:05 -0500
Message-ID: <87llojujki.fsf@stark.dyndns.tv>
In-Reply-To: <Pine.LNX.4.58.0312160825570.1599@home.osdl.org>
Linus Torvalds <torvalds@osdl.org> writes:
> The fact is, modern disks are GOOD at streaming data. They're _really_
> good at it compared to just about anything else they ever do. The win you
> get from even medium-sized stripes on RAID0 are likely to not be all that
> noticeable, and you can definitely lose _big_ just because it tends to
> hack your IO patterns to pieces.
I'm not sure how you reach this conclusion. 50MB/s may sound like a lot, and
it's certainly a whole lot more than the 2MB/s I get on this 486 over here. But
the hard drive I have that gets 50MB/s is also 250G and the one in the 486 is
425M, a factor of 588 difference in size against a factor of 25 in throughput.
So as good as the drives are getting at streaming data, the amount of data we
want to stream is going up even faster.
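To put rough numbers on that, here is a small Python sketch of the arithmetic. The function name is my own, and it assumes decimal units (250G = 250,000 MB) and that the quoted rates hold across the whole platter, which real drives don't quite manage:

```python
def full_scan_seconds(capacity_mb, rate_mb_s):
    """Seconds needed to stream the entire drive once at a sustained rate."""
    return capacity_mb / rate_mb_s

# Figures from the message: 425 MB at 2 MB/s versus 250 GB at 50 MB/s.
old = full_scan_seconds(425, 2)
new = full_scan_seconds(250_000, 50)

capacity_growth = 250_000 / 425   # roughly 588x
rate_growth = 50 / 2              # 25x

print(f"old drive: {old:.1f} s, new drive: {new:.0f} s")
print(f"capacity grew {capacity_growth:.0f}x, streaming rate grew {rate_growth:.0f}x")
```

Under those assumptions, reading the old drive end to end takes about three and a half minutes, while the new one takes well over an hour.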
> My personal guess is that modern RAID0 stripes should be on the order of
> several MEGABYTES in size rather than the few hundred kB that most people
> use (not to mention the people who have 32kB stripes or smaller - they
> just kill their IO access patterns with that, and put the CPU at
> ridiculous strain).
> Big stripes help because:
>
> - disks already do big transfers well, so you shouldn't split them up.
> Quite frankly, the kinds of access patterns that let you stream
> multiple streams of 50MB/s and get N-way throughput increases just
> don't exists in the real world outside of some very special niches (DoD
> satellite data backup, or whatever).
Or just about any moderate-sized SQL database. Virtually any large query will
cause what Oracle calls a "full table scan" or what postgres calls a
"sequential scan", precisely because reading sequential data is way faster than
random access. Often a single query will generate several such streams, and
often large on-disk sorts, which have sequential access patterns as well.
It seems to me that having a stripe size of several megabytes will defeat the
read-ahead and essentially limit the database to 50MB/s, which, while it seems
like a lot, really isn't fast enough to keep up with the increase in the amount
of data being handled. Even a small database with tables around 1GB will
benefit enormously from being able to stream the data at 100MB/s or 150MB/s.
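A toy model of what I mean, in Python. Everything here is an assumption for illustration: a 3-disk RAID0, a single 128kB read-ahead window in flight at a time, and a plain round-robin chunk layout. It ignores request queueing and any growth of the read-ahead window, so treat it as a sketch of the argument rather than a description of what md actually does:

```python
KB = 1024
MB = 1024 * KB

def disks_touched(offset, length, chunk, n_disks):
    """Set of member disks that a contiguous read hits on a round-robin RAID0."""
    first = offset // chunk
    last = (offset + length - 1) // chunk
    return {c % n_disks for c in range(first, last + 1)}

WINDOW = 128 * KB   # assumed read-ahead window
N_DISKS = 3         # assumed array width

small = disks_touched(0, WINDOW, 32 * KB, N_DISKS)   # 32kB chunks
large = disks_touched(0, WINDOW, 4 * MB, N_DISKS)    # 4MB chunks

print(f"32kB chunks: window spans {len(small)} disks")
print(f"4MB chunks:  window spans {len(large)} disk")

# Time to scan a 1GB (1000 MB) table at each of the rates mentioned above.
scan_seconds = {rate: 1000 / rate for rate in (50, 100, 150)}
for rate, secs in scan_seconds.items():
    print(f"{rate} MB/s -> {secs:.1f} s")
```

With small chunks the one window keeps all three disks busy; with multi-megabyte chunks it lands on a single disk, so the scan runs at one drive's speed unless something keeps several windows in flight.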
> I may be wrong, of course. But I doubt it.
Well, it should be easy enough to test. Multi-megabyte stripes would be quite a
radical change in thinking, though. All the raidtools documentation suggests
starting with 32kB and experimenting, largely with smaller stripe sizes. I've
certainly never considered anything much larger. It would be really interesting
to know how even a typical database query runs on RAID arrays of varying stripe
sizes.
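As a starting point for such a test, a minimal sequential-read timer in Python might look like the following. The function name and the throwaway demo file are mine, not from any existing tool. It does nothing about the page cache, and the array itself would have to be rebuilt with each stripe size outside this script, so it is only a harness sketch and no substitute for running a real query:

```python
import os
import tempfile
import time

def sequential_read_mb_s(path, block_size=1024 * 1024):
    """Read `path` front to back and return (bytes_read, MB/s)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # unbuffered at the Python level only
        while True:
            buf = f.read(block_size)
            if not buf:
                break
            total += len(buf)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return total, total / (1024 * 1024) / elapsed

# Self-check on a throwaway 4 MiB file. A real run would point at a large
# file or block device on the array under test (hypothetical: /dev/md0).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (4 * 1024 * 1024))
    demo_path = tmp.name
try:
    demo_bytes, demo_rate = sequential_read_mb_s(demo_path)
finally:
    os.remove(demo_path)

print(f"read {demo_bytes} bytes at {demo_rate:.1f} MB/s")
```

Running it against the same large file on arrays built with, say, 32kB, 256kB and 4MB stripes would at least show whether the single-stream case behaves the way either of us expects.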
--
greg