linux-raid.vger.kernel.org archive mirror
From: Michael Tokarev <mjt@tls.msk.ru>
To: dean gaudet <dean@arctic.org>
Cc: Robin Bowes <robin-lists@robinbowes.com>, linux-raid@vger.kernel.org
Subject: Re: raid5 software vs hardware: parity calculations?
Date: Mon, 15 Jan 2007 14:48:38 +0300
Message-ID: <45AB6A16.8010001@tls.msk.ru>
In-Reply-To: <Pine.LNX.4.64.0701131903270.16472@twinlark.arctic.org>

dean gaudet wrote:
[]
> if this is for a database or fs requiring lots of small writes then 
> raid5/6 are generally a mistake... raid10 is the only way to get 
> performance.  (hw raid5/6 with nvram support can help a bit in this area, 
> but you just can't beat raid10 if you need lots of writes/s.)

A small nitpick.

At least some databases never do "small" I/O, at least not against the
datafiles.  Oracle, for example, uses a fixed I/O block size, specified at
database (or tablespace) creation time: by default it's 4KB or 8KB, but it
may be 16KB or 32KB as well.  Now, if you make your raid array's stripe size
match the database block size, *and* ensure the files are properly aligned
on disk, it will just work, without the needless reads otherwise required to
recalculate parity blocks during writes.
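
To illustrate why a full-stripe write needs no reads, here is a minimal
Python sketch of raid5-style XOR parity.  The function names and the tiny
4-byte chunks are hypothetical, chosen only for illustration; this is not
the md driver's code.

```python
from functools import reduce

def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def full_stripe_parity(new_chunks):
    """Full-stripe write: parity comes from the new data alone, no reads."""
    return xor_blocks(*new_chunks)

def rmw_parity(old_parity, old_chunk, new_chunk):
    """Partial write: the old chunk and old parity must be read first."""
    return xor_blocks(old_parity, old_chunk, new_chunk)

# Two data chunks plus parity, as in a 3-disk raid5 (toy 4-byte chunks).
chunks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40"]
parity = full_stripe_parity(chunks)

# Overwriting only the first chunk forces a read-modify-write cycle.
new_first = b"\xff\x00\xff\x00"
updated_parity = rmw_parity(parity, chunks[0], new_first)
```

Both paths end up with the same parity; the difference is that the partial
write has to read the old chunk and the old parity from disk before it can
write anything.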

The problem is that this is nearly impossible to do.

First, even if the db writes in 32KB blocks, the stripe size then has to be
32KB, which only suits raid5 with 3 disks and a chunk size of 16KB, or with
5 disks and a chunk size of 8KB (the latter variant is quite bad, because a
chunk size of 8KB is too small).  In other words, only a very limited set of
configurations will be more or less good.
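
The arithmetic behind those two configurations can be sketched as follows,
assuming one parity chunk per stripe (raid5) and a chunk size that must
divide the stripe evenly.  The helper name chunk_size_kb is hypothetical.

```python
def chunk_size_kb(stripe_kb, n_disks):
    """Chunk size (KB) that makes a raid5 data stripe equal stripe_kb.

    raid5 spends one chunk per stripe on parity, so the data is spread
    over n_disks - 1 chunks.  Returns None when it does not divide evenly.
    """
    data_disks = n_disks - 1
    if data_disks < 2:
        raise ValueError("raid5 needs at least 3 disks")
    if stripe_kb % data_disks != 0:
        return None
    return stripe_kb // data_disks

# Candidate layouts for a 32KB database block.
for disks in range(3, 10):
    print(disks, "disks ->", chunk_size_kb(32, disks))
```

For a 32KB stripe, only 3, 5 and 9 disks divide evenly in that range, giving
chunks of 16KB, 8KB and 4KB respectively.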

And second, most filesystems used for databases don't care about "correct"
file placement.  For example, ext[23]fs, with its maximum blocksize of 4KB,
will align files to 4KB, not to the stripe size.  That means a whole 32KB
block may be laid out like this: the first 4KB on one stripe and the
remaining 28KB on the next.  Both parts then need a full read-modify-write
cycle to update the parity blocks, which is exactly what we tried to avoid
by choosing the sizes in the previous step.  Only xfs so far (of the
filesystems I've checked) pays attention to the stripe size and tries to
ensure files are aligned to it.  (Yes, I know about mke2fs's stride=xxx
parameter, but it only affects metadata, not data.)
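
The misalignment case can be worked through with a short sketch.  The helper
stripe_pieces is hypothetical; it only splits a write at stripe boundaries
and flags which pieces cover a whole stripe, with all sizes in KB.

```python
def stripe_pieces(offset_kb, length_kb, stripe_kb):
    """Split a write into per-stripe pieces.

    Returns a list of (stripe_index, piece_kb, is_full_stripe) tuples.
    Any piece that is not a full stripe needs a read-modify-write cycle
    to update parity.
    """
    pieces = []
    pos = offset_kb
    end = offset_kb + length_kb
    while pos < end:
        stripe_index = pos // stripe_kb
        stripe_end = (stripe_index + 1) * stripe_kb
        piece = min(end, stripe_end) - pos
        pieces.append((stripe_index, piece, piece == stripe_kb))
        pos += piece
    return pieces

# A 32KB block aligned to the stripe: one full-stripe write.
aligned = stripe_pieces(0, 32, 32)

# The same block aligned only to 4KB, starting 28KB into a stripe.
misaligned = stripe_pieces(28, 32, 32)
```

With a 32KB stripe, the aligned write is a single full-stripe piece, while
the write starting at offset 28KB splits into 4KB on one stripe and 28KB on
the next, neither of which is a full stripe.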

That's why all of the above is a "small nitpick": in theory it IS possible
to use raid5 for a database workload in certain cases, but given all the
gory details, it's nearly impossible to do right.

/mjt
