From: NeilBrown <neilb@suse.de>
To: Christoph Anton Mitterer <calestyo@scientia.net>
Cc: linux-raid@vger.kernel.org
Subject: Re: some general questions on RAID
Date: Mon, 8 Jul 2013 14:48:23 +1000
Message-ID: <20130708144823.5cc872b9@notabene.brown>
In-Reply-To: <1372980882.5249.88.camel@fermat.scientia.net>
On Fri, 05 Jul 2013 01:34:42 +0200 Christoph Anton Mitterer
<calestyo@scientia.net> wrote:
>
> > > 2) Chunks / Chunk size
> > > a) How does MD work in that matter... is it that it _always_ reads
> > > and/or writes FULL chunks?
> >
> > No. It does not. It doesn't go below 4k though.
> So what does that mean exactly? It always reads/writes at least 4k
> blocks?
RAID1 reads or writes whatever block size the filesystem sends.
RAID0 and RAID10 read or write whatever block size the filesystem sends, up to
the chunk size (and obviously less than the chunk size when the request is not
aligned with a chunk boundary).
RAID4/5/6 read the same way as RAID0 when not degraded.
When degraded or when writing, RAID4/5/6 do all IO in 4K blocks (hoping
that the lower layers will merge them as appropriate).
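To make the "up to the chunk size" rule concrete, here is a minimal sketch of how a request could be cut at chunk boundaries. It is an illustration only, not the md driver's code; the function name and the 64 KiB chunk size are assumptions chosen for the example.

```python
def split_at_chunk_boundaries(offset, length, chunk_size):
    """Split a byte range into pieces that each stay inside one chunk."""
    pieces = []
    end = offset + length
    while offset < end:
        # Bytes left before the next chunk boundary.
        room = chunk_size - (offset % chunk_size)
        size = min(room, end - offset)
        pieces.append((offset, size))
        offset += size
    return pieces


chunk = 64 * 1024  # assumed chunk size for the example

# An aligned 64 KiB request stays in one chunk: a single piece.
print(split_at_chunk_boundaries(0, chunk, chunk))

# The same size shifted by 4 KiB straddles a boundary: two pieces.
print(split_at_chunk_boundaries(4096, chunk, chunk))
```

The second call returns two pieces, one of 60 KiB and one of 4 KiB, which is the "less than chunksize when not aligned" case.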
>
>
> > > Guess it must at least do so on _write_ for the RAID levels with parity
> > > (5/6)... but what about read?
> > No, not even for write.
> :-O
>
> > If an isolated 4k block is written to a raid6,
> > the corresponding 4k blocks from the other data drives in that stripe
> > are read, both corresponding parity blocks are computed, and the three
> > blocks are written.
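The write path described in the quote above can be sketched for the P parity alone. This is a simplified model under stated assumptions: four data drives, one 4 KiB block per drive at the same offset, and only the XOR parity. The Q parity of RAID6 uses Galois-field arithmetic and is left out, and the helper name is made up for the example.

```python
import os

BLOCK = 4096  # 4 KiB, the unit discussed in this thread


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized blocks (the P parity)."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


# Hypothetical slice of one stripe: one 4 KiB block per data drive.
data = [os.urandom(BLOCK) for _ in range(4)]

# An isolated 4 KiB write replaces the block on data drive 1 ...
data[1] = os.urandom(BLOCK)

# ... the blocks at the same offset on the other data drives are read,
# and P is recomputed over all four data blocks.
p = xor_blocks(data)

# P allows any single missing data block to be rebuilt from the rest.
rebuilt = xor_blocks([data[0], data[2], data[3], p])
assert rebuilt == data[1]
```

Note that nothing here depends on the chunk size: only the 4 KiB blocks at one offset within the stripe are involved.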
> okay that's clear... but uhm... why have chunk sizes then? I mean
> what's the difference between having a 128k chunk vs. a 256k one... when
> the parity/data blocks seem to be split in 4k blocks,... or did I get
> that completely wrong?
A sequential read that only hits one chunk will be served faster than one
which hits two chunks. So making the chunksize 1-2 times your typical block
size for random reads can help read performance.
For very large sequential reads it shouldn't really matter, though large chunk
sizes tend to result in larger IO requests to the underlying devices.
For very small random reads it shouldn't really matter either.
For writes, you want the stripe size (chunksize * (drives - parity_drives))
to match the typical size for writes - and you want those writes to be
aligned.
So the ideal load is smallish reads and largish writes with a chunksize
between the two.
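The stripe size formula and the alignment condition can be written out as a short sketch. The function names and the 6-drive RAID6 with 128 KiB chunks are assumptions picked for the example, not values from this thread.

```python
KIB = 1024


def stripe_size(chunk_size, drives, parity_drives):
    """Data bytes per stripe: chunksize * (drives - parity_drives)."""
    return chunk_size * (drives - parity_drives)


def is_full_stripe_write(offset, length, chunk_size, drives, parity_drives):
    """True if the write starts on a stripe boundary and covers whole stripes."""
    stripe = stripe_size(chunk_size, drives, parity_drives)
    return length > 0 and offset % stripe == 0 and length % stripe == 0


# Assumed example array: 6-drive RAID6 with a 128 KiB chunk size.
stripe = stripe_size(128 * KIB, drives=6, parity_drives=2)
print(stripe // KIB, "KiB per stripe")

# An aligned write of exactly one stripe needs no extra reads for parity.
print(is_full_stripe_write(0, stripe, 128 * KIB, 6, 2))

# The same write shifted by 4 KiB touches two partial stripes.
print(is_full_stripe_write(4096, stripe, 128 * KIB, 6, 2))
```

For this assumed array the stripe is 512 KiB, so writes of about that size, aligned to 512 KiB, are the ones that match the advice above.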
>
>
> > > And what about read/write with the non-parity RAID levels (1, 0, 10,
> > > linear)... is the chunk size of any real influence here (in terms of
> > > reading/writing)?
> > Not really. At least, I've seen nothing on this list that shows any
> > influence.
> So AFAIU now:
> a) Regardless of the RAID level and regardless of the chunk size,
> - data blocks are read/written in 4KiB blocks
> - when there IS parity information... then that parity information is
>   _ALWAYS_ read/computed/written in 4KiB blocks.
> b) The chunks basically just control how much consecutive data is on one
> device, thereby allowing to speed up / slow down reads/writes for small /
> large files.
> But that should basically only matter on seeking devices, i.e. not on
> SSDs... thus the chunk size is irrelevant on SSDs...
Seeks are cheaper on SSDs than on spinning rust, but the cost is not zero.
If you are concerned about the effect of chunksize on performance, you should
measure the performance of your hardware with your workload with differing
chunk sizes and come to your own conclusion.
All anyone else can do is offer generalities.
NeilBrown
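As a starting point for the kind of measurement suggested above, here is a minimal sketch that times random reads of a given block size against a file or block device. It is an assumption-laden illustration, not a substitute for a real benchmark of your workload: the function name and defaults are made up, it uses `os.pread` (available on Unix-like systems), and it does not bypass the page cache, so repeated runs on a small target mostly measure memory.

```python
import os
import random
import time


def time_random_reads(path, block_size, count=1000, seed=0):
    """Return the average seconds per read for `count` random aligned reads."""
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size for files and block devices.
        size = os.lseek(fd, 0, os.SEEK_END)
        blocks = size // block_size
        if blocks == 0:
            raise ValueError("target is smaller than one block")
        start = time.perf_counter()
        for _ in range(count):
            # Read one block at a random block-aligned offset.
            os.pread(fd, block_size, rng.randrange(blocks) * block_size)
        return (time.perf_counter() - start) / count
    finally:
        os.close(fd)
```

One way to use it would be to recreate the array with each candidate chunk size, run the same call with a block size typical of your workload against the array device, and compare the averages. Reading a device node usually requires root.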