From: Stan Hoeppner <stan@hardwarefreak.com>
To: Martin T <m4rtntns@gmail.com>
Cc: "linux-raid@vger.kernel.org List" <linux-raid@vger.kernel.org>
Subject: Re: calculating optimal chunk size for Linux software-RAID
Date: Fri, 07 Mar 2014 23:37:37 -0600
Message-ID: <531AACA1.6000606@hardwarefreak.com>
In-Reply-To: <CAJx5YvH1GE0KUcsSnVwbtUz6qP4zMv3N2qt3HhrjdF69NT2vLQ@mail.gmail.com>
On 3/7/2014 9:15 PM, Martin T wrote:
> Stan,
>
> ok, I see. However, are there utilities out there which help one to
> analyze how applications on a server use the file-system over the time
> and help to make an educated decision regarding the chunk size?
My apologies. You're a complete novice and I'm leading you down the
textbook storage architectural design path. Let's short circuit that as
I don't have the time.
As you're starting from zero, let me give you what works best with 99%
of workloads. Use a chunk size of 32KB or 64KB. Such a chunk will work
extremely well with any singular or mixed workloads, on parity and
non-parity RAID. The only workload that should have a significantly
larger chunk than this is a purely streaming allocation workload of
large files.
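One way to see what a given chunk size implies is to compute the full stripe width, i.e. chunk size multiplied by the number of data-bearing disks. On parity RAID, a write that does not cover a full stripe generally needs a read-modify-write cycle, so a larger chunk raises the bar for a write to count as "full stripe". The following is a minimal sketch under stated assumptions, not part of mdadm or any md tooling: the function names are hypothetical, RAID10 is assumed to keep two copies of each block, and sizes are in KB as used above.

```python
# Illustrative sketch only; data_disks() and stripe_width_kb() are
# hypothetical helpers, not functions from mdadm or the md driver.

def data_disks(level, disks):
    """Return how many disks hold data (not parity or mirror copies) per stripe."""
    if level == "raid0":
        return disks
    if level == "raid5":
        return disks - 1          # one disk's worth of parity per stripe
    if level == "raid6":
        return disks - 2          # two disks' worth of parity per stripe
    if level == "raid10":
        return disks // 2         # assumption: two copies of every block
    raise ValueError(f"unsupported RAID level: {level}")


def stripe_width_kb(level, disks, chunk_kb):
    """Full stripe width in KB: chunk size times the number of data disks."""
    return data_disks(level, disks) * chunk_kb


if __name__ == "__main__":
    # Compare the suggested small chunks with a much larger one.
    for chunk_kb in (32, 64, 512):
        for level, disks in (("raid5", 4), ("raid6", 8), ("raid10", 4)):
            width = stripe_width_kb(level, disks, chunk_kb)
            print(f"{level} x{disks}, chunk {chunk_kb}KB: full stripe {width}KB")
```

For a four-disk RAID5, for instance, a 64KB chunk yields a 192KB full stripe, while a 512KB chunk yields 1536KB, which only a large streaming write is likely to fill.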
If you want a more technical explanation, you can read all of my
relevant posts in the linux-raid or XFS archives, as I've explained this
hundreds of times in great detail. Or you can wait a few months to read
the kernel documentation I'm working on, which will teach the reader the
formal storage stack design process, soup to nuts. I wish it were
already finished, as I could simply paste the link for you, which,
coincidentally, is the exact reason I'm writing it. :)
> regards,
> Martin
>
> On Fri, Mar 7, 2014 at 11:58 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
>> On 3/6/2014 8:06 PM, Martin T wrote:
>>> Am I correct that optimal chunk size is usually the size of the
>>> average file read/written to disk divided by number of block devices
>>> in RAID array storing the data? For example if the average file size
>>> is 1024KiB and I have four disks in RAID1, then I should choose the
>>> chunk size around 256KiB to get the optimal read performance? Or if I
>>> have two drives in RAID0, then I should choose the chunk size 512KiB
>>> instead? Or are there better methods/benchmarks to determine the
>>> optimal chunk size for software-RAID?
>>
>> You're asking the wrong question. Storage architecture design always
>> begins with the workload. The correct question is:
>>
>> My workload (application mix) performs *most* IO in manner X, where X is
>>
>> 1. large streaming write/read
>> 2. small file write/read
>> 3. metadata heavy
>>
>> I have Y number of disk drives. I plan to use XFS/EXT4/etc filesystem.
>> What RAID level and chunk size are optimal for my workload, and how do
>> I properly tune my filesystem to my workload and storage stack?
>>
>>> Last but not least, is there a
>>> good utility which could help one to measure the average I/O
>>> read/write size?
>>
>> In flight IO size has no correlation to stripe and chunk size. What you
>> need to know is how your application(s) write to the filesystem and how
>> your filesystem issues write IOs. You should already know the
>> former, and it's easy to determine the latter.
--
Stan