linux-raid.vger.kernel.org archive mirror
From: Bill Davidsen <davidsen@tmr.com>
To: "Keld Jørn Simonsen" <keld@dkuug.dk>
Cc: linux-raid@vger.kernel.org
Subject: Re: recommendations for stripe/chunk size
Date: Wed, 06 Feb 2008 14:22:15 -0500
Message-ID: <47AA08E7.5000801@tmr.com>
In-Reply-To: <20080205182421.GA32250@rap.rap.dk>

Keld Jørn Simonsen wrote:
> Hi
>
> I am looking at revising our howto. I see a number of places where a
> chunk size of 32 kiB is recommended, and even recommendations on
> maybe using sizes of 4 kiB. 
>
>   
Depending on the RAID level, a write smaller than the chunk size causes
the chunk to be read, altered, and rewritten, vs. just written if the
write is a multiple of the chunk size. Many filesystems by default use a
4k page size and 4k writes. I believe this is the reasoning behind the
suggestion of small chunk sizes. Sequential vs. random access and RAID
level are important here; there's no one size that works best in all cases.
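
As a rough illustration of the rule described above, here is a minimal
sketch that flags a write as needing a read-alter-rewrite cycle when it
does not cover whole chunks. The function name and the 64 KiB chunk size
are hypothetical, and the check models only the claim as stated in this
message, not what the md driver actually does for any particular RAID level.

```python
def needs_read_modify_write(offset: int, length: int, chunk_size: int) -> bool:
    """Return True if the write does not cover whole chunks exactly.

    Simplified model of the rule stated in the message: a write is
    "just written" only when it starts on a chunk boundary and its
    length is a multiple of the chunk size.
    """
    return offset % chunk_size != 0 or length % chunk_size != 0


CHUNK = 64 * 1024  # hypothetical 64 KiB chunk size, chosen for illustration

# A 4 KiB filesystem write into a 64 KiB chunk is a partial-chunk write.
print(needs_read_modify_write(0, 4096, CHUNK))
# Two full, aligned chunks can simply be written.
print(needs_read_modify_write(0, 2 * CHUNK, CHUNK))
# With a 4 KiB chunk, the same 4 KiB write covers the chunk exactly.
print(needs_read_modify_write(0, 4096, 4096))
```

Under this model, matching the chunk size to the filesystem's 4k write
size avoids the partial-chunk case, which would explain the small-chunk
recommendations.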
> My own take on that is that this really hurts performance. 
> Normal disks have a rotation speed of between 5400 (laptop),
> 7200 (ide/sata) and 10000 (SCSI) rounds per minute, giving an average
> spinning time for one round of 6 to 12 ms, and average latency of half
> this, that is 3 to 6 ms. Then you need to add head movement which
> is something like 2 to 20 ms - in total average seek time 5 to 26 ms,
> averaging around 13-17 ms. 
>
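
The rotation figures quoted above can be checked with a few lines of
arithmetic. This is only a sketch; the function names are made up for the
example, and the head-movement times are not modelled.

```python
def rotation_ms(rpm: float) -> float:
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def avg_rotational_latency_ms(rpm: float) -> float:
    """On average the wanted sector is half a revolution away."""
    return rotation_ms(rpm) / 2.0


for rpm in (5400, 7200, 10000):
    print(f"{rpm:>5} rpm: {rotation_ms(rpm):5.2f} ms per revolution, "
          f"{avg_rotational_latency_ms(rpm):4.2f} ms average latency")
```

This gives roughly 6 to 11 ms per revolution and 3 to 5.6 ms average
latency, consistent with the rounded ranges in the quoted text.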
>   
Having a write that is not some multiple of the chunk size would seem to
require a read-alter-wait_for_disk_rotation-write cycle, and for large
sustained sequential i/o using multiple drives helps transfer. For small
random i/o small chunks are good; I find little benefit to chunks over
256k or maybe 1024k.
> In about 15 ms you can read, on current SATA-II (300 MB/s) or ATA/133,
> something like 600 to 1200 kB, given actual transfer rates of
> 80 MB/s on SATA-II and 40 MB/s on ATA/133. So to get some bang for the buck
> and transfer some data, you should have something like 256/512 kiB
> chunks. With a transfer rate of 50 MB/s and chunk sizes of 256 kiB,
> giving a time of about 20 ms per transaction,
> you should be able with random reads to transfer 12 MB/s - my
> actual figure is about 30 MB/s, which is possibly because of the
> elevator effect of the file system driver. With a size of 4 kiB per chunk
> you should have a time of 15 ms per transaction, or 66 transactions per
> second, or a transfer rate of 250 kB/s. So 256 kiB vs 4 kiB speeds up
> the transfer by a factor of 50.
>
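
The per-transaction estimate quoted above can be written as a small model:
each random read costs a fixed positioning time plus the time to transfer
one chunk. The sketch below assumes the quoted 15 ms positioning time and
50 MB/s transfer rate; the function names are invented for the example,
and readahead, caching and the elevator are deliberately ignored.

```python
def transaction_ms(size_bytes: int, positioning_ms: float, rate_mb_s: float) -> float:
    """Positioning time plus transfer time for one random read, in ms."""
    transfer_ms = size_bytes / (rate_mb_s * 1e6) * 1e3
    return positioning_ms + transfer_ms


def random_read_mb_s(size_bytes: int, positioning_ms: float, rate_mb_s: float) -> float:
    """Sustained throughput in MB/s when every read pays a full seek."""
    seconds = transaction_ms(size_bytes, positioning_ms, rate_mb_s) / 1e3
    return size_bytes / 1e6 / seconds


POSITIONING_MS = 15.0  # seek + rotational latency, from the quoted text
RATE_MB_S = 50.0       # sustained transfer rate, from the quoted text

large = random_read_mb_s(256 * 1024, POSITIONING_MS, RATE_MB_S)
small = random_read_mb_s(4 * 1024, POSITIONING_MS, RATE_MB_S)
print(f"256 KiB chunks: {large:.2f} MB/s")
print(f"  4 KiB chunks: {small:.3f} MB/s")
print(f"ratio: {large / small:.1f}")
```

With these inputs the model lands at roughly 13 MB/s for 256 KiB reads and
about 0.27 MB/s for 4 KiB reads, a ratio a little under 50. Measured
figures that differ from this indicate how much the elevator, readahead
and write caching are contributing.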
>   
If you actually see anything like this your write caching and readahead 
aren't doing what they should!

> I actually think the kernel should operate with block sizes
> like this and not with 4 kiB blocks. It is the readahead and the elevator
> algorithms that save us from randomly reading 4 kiB at a time.
>
>   
Exactly, and nothing saves you from an R-A-RW (read-alter-rewrite) cycle
if the write is a partial chunk.
> I also see that there are some memory constraints on this.
> Having maybe 1000 processes reading, as for my mirror service,
> 256 kib buffers would be acceptable, occupying 256 MB RAM.
> That is reasonable, and I could even tolerate 512 MB ram used.
> But going to 1 MiB buffers would be overdoing it for my configuration.
>
> What would be the recommended chunk size for today's equipment?
>
>   
I think usage is more important than hardware. My opinion only.
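
For the buffer-memory estimate quoted above, a minimal sketch of the
arithmetic, assuming one chunk-sized buffer per reading process (the
function name is made up for the example):

```python
def buffer_memory_mib(readers: int, buffer_kib: int) -> float:
    """Total buffer memory in MiB for one buffer of buffer_kib per reader."""
    return readers * buffer_kib / 1024


READERS = 1000  # concurrent reading processes, from the quoted text

for buffer_kib in (256, 512, 1024):
    total = buffer_memory_mib(READERS, buffer_kib)
    print(f"{buffer_kib:>4} KiB buffers: {total:.0f} MiB")
```

So 256 KiB buffers cost about 250 MiB, 512 KiB about 500 MiB, and 1 MiB
buffers about 1000 MiB for the same 1000 readers.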

> Best regards
> Keld


-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismarck



