From: NeilBrown <neilb@suse.de>
To: Chris Worley <worleys@gmail.com>
Cc: linuxraid <linux-raid@vger.kernel.org>
Subject: Re: RAID50, despite chunk setting, does everything in 4KB blocks
Date: Tue, 20 Dec 2011 10:24:15 +1100
Message-ID: <20111220102415.1bb30e78@notabene.brown>
In-Reply-To: <CANWz5fhEz20oi9fXVuxngywMu6mZQtghMKCBCAwm5uOVXOD__w@mail.gmail.com>
On Mon, 19 Dec 2011 15:43:13 -0700 Chris Worley <worleys@gmail.com> wrote:
> It doesn't really matter what chunk sizes I set, but, for example, I
> create three RAID5's of 5 drives each with a chunk size of 32K, and
> create a RAID0 comprised of the three RAID5's with a chunk size of
> 64K:
>
> md0 : active raid0 md27[2] md26[1] md25[0]
> 1885098048 blocks super 1.2 64k chunks
>
> If I write to one of the RAID5's, using:
>
> # dd of=/dev/md27 if=/dev/zero bs=1024k oflag=direct
>
> ... then "iostat -dmx 2" shows the drives being written to in 32K
> chunks (avgrq-sz=64), as you'd expect.
>
> But, writing to the RAID0 that's striping the RAID5's, shows
> everything being written in 4KB chunks (iostat shows avgrq-sz=8) to
> the RAID0 as well as to the RAID5's.
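For reference, iostat reports avgrq-sz in 512-byte sectors, so the two figures quoted above can be converted to request sizes as follows. This is a minimal sketch: the function name is made up for illustration, and the 4 KiB page size is an assumption (typical for x86), not something stated in the report.

```python
SECTOR_BYTES = 512   # iostat's avgrq-sz is counted in 512-byte sectors
PAGE_BYTES = 4096    # assumption: 4 KiB pages, as on typical x86 systems


def avgrq_sz_to_kib(avgrq_sz):
    """Convert an iostat avgrq-sz value (sectors) to KiB per request."""
    return avgrq_sz * SECTOR_BYTES / 1024


for sectors in (64, 8):
    kib = avgrq_sz_to_kib(sectors)
    pages = sectors * SECTOR_BYTES / PAGE_BYTES
    print(f"avgrq-sz={sectors}: {kib:g} KiB per request, {pages:g} page(s)")
```

Under those assumptions avgrq-sz=64 corresponds to one full 32K chunk per request, and avgrq-sz=8 to a single page per request.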
When writing to a RAID5 it *always* submits requests to the lower layers in
PAGE-sized units. This makes it much easier to keep parity and data aligned.
The queue on the underlying device should sort the requests and group them
together and your evidence suggests that it does.
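As a toy illustration of the sorting and grouping described above (not the kernel's actual elevator code, which also has size limits and timing windows), the sketch below merges page-sized requests that turn out to be back-to-back on disk. The function name and the request layout are hypothetical.

```python
PAGE = 4096    # assumption: 4 KiB pages
SECTOR = 512   # iostat counts request sizes in 512-byte sectors


def merge_contiguous(requests):
    """Sort (offset, length) requests and merge those that are adjacent."""
    merged = []
    for offset, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            # This request starts exactly where the previous one ends.
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((offset, length))
    return merged


# One 32 KiB chunk arriving as eight page-sized writes, out of order.
pages = [(i * PAGE, PAGE) for i in (3, 0, 1, 2, 7, 4, 6, 5)]
merged = merge_contiguous(pages)
print(merged)
print([length // SECTOR for _, length in merged])
```

If the eight pages are merged as expected, the device sees a single 64-sector request, which is consistent with the avgrq-sz=64 observed when writing to the RAID5 directly.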
When writing to the RAID5 through a RAID0 it will only see 64K at a time but
that shouldn't make any difference to its behaviour and shouldn't change
the way the requests finally get to the device.
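To make "only see 64K at a time" concrete, here is a simplified sketch of how a striped array hands a large request to its members in chunk-sized pieces. It assumes equal-sized members and plain round-robin striping, and it ignores md's handling of zones for unequal members; the function name is hypothetical.

```python
KIB = 1024


def split_raid0(offset, length, chunk, members):
    """Split one request into (member, member_offset, size) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        chunk_no = offset // chunk
        within = offset % chunk
        # Never cross a chunk boundary within a single piece.
        size = min(chunk - within, end - offset)
        member = chunk_no % members
        member_offset = (chunk_no // members) * chunk + within
        pieces.append((member, member_offset, size))
        offset += size
    return pieces


# A 1 MiB direct write (bs=1024k) to a 3-member RAID0 with 64K chunks.
pieces = split_raid0(0, 1024 * KIB, 64 * KIB, 3)
print(len(pieces), "pieces, largest", max(p[2] for p in pieces) // KIB, "KiB")
for member, member_offset, size in pieces[:4]:
    print(f"member {member}: offset {member_offset // KIB} KiB, {size // KIB} KiB")
```

Each RAID5 therefore receives 64K requests rather than the full 1 MiB, but each of those still covers two whole 32K chunks.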
So I have no idea why you see a difference.
I suspect lots of block-layer tracing, lots of staring at code, and lots
of head scratching would be needed to understand what is really going on.
>
> Why is that? Note that this is true for reading too. Note I don't
> see the same problem when using RAID10 (via striped RAID1's) or
> RAID100 (via striped RAID10's).
RAID1 and RAID10 don't split things into pages so I can imagine that they
might make life easier for the scheduler.
But the scheduler should still get it right for RAID5 ....
So - it's a mystery. Sorry.
NeilBrown
>
> ... this is on SLES11 using a 2.6.32.43-0.5 kernel.
>
> Thanks,
>
> Chris
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html