From: Joe Williams <jwilliams315@gmail.com>
To: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: increasing stripe_cache_size decreases RAID-6 read throughput
Date: Mon, 3 May 2010 17:06:17 -0700
Message-ID: <k2k11f0870e1005031706id17d1b17q6e85485ab6e69b34@mail.gmail.com>
In-Reply-To: <20100429143403.44bef7a1@notabene.brown>

On Wed, Apr 28, 2010 at 9:34 PM, Neil Brown <neilb@suse.de> wrote:
> On Tue, 27 Apr 2010 10:18:36 -0700
> Joe Williams <jwilliams315@gmail.com> wrote:
>> The default setting for stripe_cache_size was 256. So 256 x 4K = 1024K
>> per device, which would be two stripes, I think (you commented to that
>> effect earlier). But somehow the default setting was not optimal for
>> sequential write throughput. When I increased stripe_cache_size, the
>> sequential write throughput improved. Does that make sense? Why would
>> it be necessary to cache more than 2 stripes to get optimal sequential
>> write performance?
>
> The individual devices have some optimal write size - possibly one
> track or one cylinder (if we pretend those words mean something useful these
> days).
> To be able to fill that you really need that much cache for each device.
> Maybe your drives work best when they are sent 8M (16 stripes, as you say in
> a subsequent email) before expecting the first write to complete..
>
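
To spell that out, here is a quick Python sketch of the sizing arithmetic
(rough numbers only; it assumes the 4K stripe-cache page and my 512K chunk
size mentioned earlier in the thread):

# stripe_cache_size arithmetic -- back-of-the-envelope, not exact md internals
PAGE = 4 * 1024             # bytes cached per member device per cache entry
CHUNK = 512 * 1024          # chunk size (per-device stripe unit) on my array

def cache_per_device(entries):
    """Bytes of stripe cache held for each member device."""
    return entries * PAGE

def stripes_held(entries):
    """Full stripes' worth of data the cache can hold per device."""
    return cache_per_device(entries) // CHUNK

print(cache_per_device(256) // 1024, "KB per device,", stripes_held(256), "stripes")
# default of 256 -> 1024 KB per device, 2 stripes

target = 8 * 1024 * 1024    # ~8M per device, i.e. 16 stripes
print("stripe_cache_size needed:", target // PAGE)    # -> 2048
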
> You say you get about 250MB/sec, so that is about 80MB/sec per drive
> (3 drives worth of data).
> Rotational speed is what? 10K? That is 166 revs-per-second.
Actually, 5400rpm.
> So about 500K per revolution.
About twice that, about 1 MB per revolution.
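
Working that out with the rough numbers from this thread:

# data written per revolution -- rough sketch
rpm = 5400
revs_per_sec = rpm / 60.0         # 90 revolutions per second
per_drive_mb_s = 250.0 / 3        # ~83 MB/s per drive (3 drives' worth of data)
print(per_drive_mb_s / revs_per_sec)  # ~0.93 MB per revolution, call it 1 MB
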
> I imagine you would need at least 3 revolutions worth of data in the cache,
> one that is currently being written, one that is ready to be written next
> (so the drive knows it can just keep writing) and one that you are in the
> process of filling up.
> You find that you need about 16 revolutions (it seems to be about one
> revolution per stripe). That is more than I would expect .... maybe there is
> some extra latency somewhere.
So about 8 revolutions in the cache, which is 2 to 3 times what might be
expected to be needed for optimal performance. Hmmm.
16 stripes comes to 16*512KB per drive, or about 8MB per drive. At
about 100MB/s, that is about 80 msec worth of writing. I don't see
where 80 msec of latency might come from.
Could it be a quirk of NCQ? I think each HDD has an NCQ queue depth of 31.
But 31 512-byte sectors is only about 16KB. That does not seem relevant.
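
For completeness, the arithmetic behind those last two paragraphs (rough
figures only, using the ~100MB/s streaming rate from above):

# cache residency and NCQ arithmetic -- rough sketch
CHUNK = 512 * 1024                       # bytes per drive per stripe
cache_per_drive = 16 * CHUNK             # 16 stripes -> ~8 MB per drive
drive_bytes_per_sec = 100 * 1000 * 1000  # ~100 MB/s streaming per drive
print(cache_per_drive / drive_bytes_per_sec * 1000)  # ~84 ms of writing buffered

# NCQ is a queue of up to 31 commands; even if each command were a single
# 512-byte sector, that would only be:
print(31 * 512 / 1024.0)                 # ~15.5 KB -- too small to matter here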