Re: Awful RAID5 random read performance


From: Bill Davidsen <davidsen@tmr.com>
To: tfjellstrom@shaw.ca
Cc: lrhorer@satx.rr.com, linux-raid@vger.kernel.org
Subject: Re: Awful RAID5 random read performance
Date: Tue, 02 Jun 2009 14:54:07 -0400
Message-ID: <4A25754F.5030107@tmr.com>
In-Reply-To: <200905311056.30521.tfjellstrom@shaw.ca>

Thomas Fjellstrom wrote:
> On Sun May 31 2009, Leslie Rhorer wrote:
>   
>>>> I happen to be the friend Maurice was talking about. I let the raid layer
>>>> keep its default chunk size of 64K. The smaller size (below like 2MB)
>>>> tests in iozone are very very slow. I recently tried disabling readahead,
>>>> Acoustic Management, and played with the io scheduler and all any of it
>>>> has done is make the sequential access slower and has barely touched the
>>>> smaller sized random access test results. Even with the 64K iozone test
>>>> random read/write is only in the 7 and 11MB/s range.
>>>>
>>>> It just seems too low to me.
>>>>
>>> I don't think so; can you try a similar test on single drives not using
>>> md RAID-5?
>>>
>>> The killer is seeks, which is what random I/O uses lots of; with a 10ms
>>> seek time you're only going to get ~100 seeks/second and if you're only
>>> reading 512 bytes after each seek you're only going to get ~50
>>> kbytes/second. Bigger block sizes will show higher throughput, but
>>> you'll still only get ~100 seeks/second.
>>>
>>> Clearly when you're doing this over 4 drives you can have ~400
>>> seeks/second but that's still limiting you to ~400 reads/second for
>>> smallish block sizes.
>>>       
>> 	John is perfectly correct, although of course a 10ms seek is a
>> fairly slow one.  The point is, it is drive dependent, and there may not be
>> much one can do about it at the software layer.  That said, you might try a
>> different scheduler, as the seek order can make a difference.  Drives with
>> larger caches may help some, although the increase in performance with
>> larger cache sizes diminishes rapidly beyond a certain point.  As one would
>> infer from John's post, increasing the number of drives in the array will
>> help a lot, since increasing the number of drives raises the limit on the
>> number of seeks / second.
>>
>> 	What file system are you using?  It can make a difference, and
>> surely has a bigger impact than most tweaks to the RAID subsystem.
>>
>> 	The biggest question in my mind, however, is why is random access a
>> big issue for you?  Are you running a very large relational database with
>> tens of thousands of tiny files?  For most systems, high volume accesses
>> consist mostly of large sequential I/O.  The majority of random I/O is of
>> rather short duration, meaning even with comparatively poor performance, it
>> doesn't take long to get the job done.  Fifty to eighty Megabits per second
>> is nothing at which to sneeze for random access of small files.  A few
>> years ago, many drives would have been barely able to manage that on a
>> sustained basis for sequential I/O.
>>     
>
> I thought the numbers were way too low. But I guess I was wrong. I really only 
> have three use cases for my arrays. One will be hosting VM images/volumes, and 
> iso disk images, while another will be hosting large media which will be 
> streaming off, p2p downloads, and rsync/rsnapshot backups of several machines.
> I imagine the vm array will appreciate faster random io (boot times will 
> improve, as will things like database and http disk access), and the p2p 
> surely will appreciate faster random io.
>
> I currently have them all on one disk array, but I'm thinking it's a good idea
> to separate the media from the VMs. When ktorrent is downloading a linux iso
> or something similar, atop shows very high disk utilization for ktorrent; same
> goes for booting VMs. And the backups, oh my lord does that take a while. I
> even tell it to skip a lot of stuff I don't need to back up.
>
> When I get around to it I may utilize the raid10 module for the VM's and 
> backups. Though that may decrease performance a little bit in the small random 
> io case. 
>   
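For anyone who wants to redo the seek arithmetic quoted above, here is a
rough sketch in Python. It assumes every random read costs exactly one full
seek and ignores transfer time, rotational latency, caching and request
reordering, so it is only a back-of-envelope ceiling, not a benchmark. The
function name and its parameters are made up for illustration.

```python
def seek_limited_throughput(seek_ms, block_bytes, drives=1):
    """Rough upper bound on random-read rate when each read costs one seek.

    Assumption: reads are spread evenly over `drives` spindles and nothing
    but seek time matters. Returns (reads per second, bytes per second).
    """
    reads_per_sec = drives * 1000.0 / seek_ms
    return reads_per_sec, reads_per_sec * block_bytes


if __name__ == "__main__":
    # 10 ms seek as in the quoted estimate; block sizes chosen for illustration.
    for drives in (1, 4):
        for block in (512, 4096, 65536):
            rate, bps = seek_limited_throughput(10.0, block, drives)
            print(f"{drives} drive(s), {block:>6} B reads: "
                  f"{rate:.0f} reads/s, {bps / 1e6:.2f} MB/s")
```

Under these assumptions a single drive doing 64K reads at 100 seeks/second
tops out around 6.5 MB/s, which is the same order of magnitude as the 7 and
11MB/s iozone figures reported earlier in the thread.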
The accesses on the VM will be similar to a real disk, so you want the 
VM on whatever you would use for bare iron. I run on raid10, and many of 
my machines are VMs (including this one, my main desktop). Raid10 is a 
good general-use array. I use it for a lot, except in cases where I 
need cheap space and don't need blinding speed; there I use raid[56] to 
get more bytes/$. Archival storage, for instance.
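
To put rough numbers on the bytes/$ point, the sketch below compares usable
capacity per level. It assumes equal-sized drives, one drive's worth of
parity for raid5, two for raid6, and raid10 keeping two copies of every
block. The function name and the `copies` parameter are hypothetical, chosen
only for this illustration.

```python
def usable_fraction(level, n_drives, copies=2):
    """Fraction of raw capacity left for data, assuming equal-sized drives.

    Assumptions: raid5 spends one drive on parity, raid6 spends two, and
    raid10 stores `copies` copies of every block.
    """
    if level == "raid5":
        if n_drives < 3:
            raise ValueError("raid5 sketch assumes at least 3 drives")
        return (n_drives - 1) / n_drives
    if level == "raid6":
        if n_drives < 4:
            raise ValueError("raid6 sketch assumes at least 4 drives")
        return (n_drives - 2) / n_drives
    if level == "raid10":
        if n_drives < 2:
            raise ValueError("raid10 sketch assumes at least 2 drives")
        return 1.0 / copies
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    for level in ("raid5", "raid6", "raid10"):
        print(f"{level:>6} on 4 drives: "
              f"{usable_fraction(level, 4):.0%} of raw capacity usable")
```

With more drives the raid5 and raid6 fractions climb toward 100% while
raid10 with two copies stays at half, which is why the parity levels win on
cost per byte.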

-- 
Bill Davidsen <davidsen@tmr.com>
  Even purely technical things can appear to be magic, if the documentation is
obscure enough. For example, PulseAudio is configured by dancing naked around a
fire at midnight, shaking a rattle with one hand and a LISP manual with the
other, while reciting the GNU manifesto in hexadecimal. The documentation fails
to note that you must circle the fire counter-clockwise in the southern
hemisphere.
