From: Bill Davidsen <davidsen@tmr.com>
To: tfjellstrom@shaw.ca
Cc: linux-raid@vger.kernel.org
Subject: Re: Awful RAID5 random read performance
Date: Tue, 02 Jun 2009 15:47:40 -0400
Message-ID: <4A2581DC.3060005@tmr.com>
In-Reply-To: <200905312339.12234.tfjellstrom@shaw.ca>

Thomas Fjellstrom wrote:
> On Sun May 31 2009, Leslie Rhorer wrote:
>
>   
>>> Unfortunately it doesn't seem to be. Take a well-considered drive such
>>> as the WD RE3; its spec for average latency is 4.2ms. However, does it
>>> include the rotational latency (the time the head takes to reach the
>>> sector once it's on the track)? I bet it doesn't. Taking it to be only
>>> the average seek time, this drive is still among the fastest. For a
>>> 7200rpm drive this latency is just 4.2ms, so we'd have for this fast
>>> drive an average total latency of 8.4ms.
>>>       
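
The arithmetic quoted above can be checked with a minimal Python sketch.
It assumes, as the quoted text does, that 4.2ms is the average seek time
and that average rotational latency is half a revolution. The function
names are made up for illustration, and transfer time and command
overhead are ignored.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution


def serial_random_reads_per_sec(seek_ms: float, rpm: float) -> float:
    """Random reads per second for one drive serving one request at a time.

    Counts only seek time plus rotational latency (an assumption of this sketch).
    """
    return 1000.0 / (seek_ms + avg_rotational_latency_ms(rpm))


if __name__ == "__main__":
    seek_ms = 4.2  # figure taken from the quoted text, treated as seek time
    rot_ms = avg_rotational_latency_ms(7200)
    print(f"rotational latency: {rot_ms:.2f} ms")
    print(f"seek + rotation:    {seek_ms + rot_ms:.2f} ms")
    print(f"reads/s, one drive: {serial_random_reads_per_sec(seek_ms, 7200):.0f}")
```

Half a revolution at 7200rpm comes to about 4.17ms, which is where the
4.2ms rotational figure comes from. Under these assumptions one drive
manages roughly 120 serial random reads per second.
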
>> That's an average.  For a random seek to exceed that, it's going to have to
>> span many cylinders.  Given the capacity of a modern cylinder, that's
>> a pretty big jump.  Single applications will tend to have their data lumped
>> somewhat together on the drive.
>>
>>> No, random I/O is the most common case for busy servers, when there
>>> are lots of processes doing uncorrelated reads and writes. Even if a
>>>       
>> Yes, exactly.  By definition, such a scenario represents a multithreaded
>> set of seeks, and as we already established, multithreaded seeks are vastly
>> more efficient than serial random seeks.  The 400 seeks per second number
>> for 4 drives applies.  I don't know the details of the Linux schedulers,
>> but most schedulers employ some variation of an elevator seek to maximize
>> seek efficiency.  This brings the average latency way down and brings the
>> seek frequency way up.
>>     
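
To illustrate why a queue of outstanding requests helps, here is a toy
Python sketch of an elevator-style ordering. It is not the Linux I/O
scheduler: the cylinder numbers are invented, the helper names are
hypothetical, and head travel in cylinders is used as a crude stand-in
for seek time.

```python
from __future__ import annotations


def head_travel(start: int, requests: list[int]) -> int:
    """Total cylinders crossed when servicing requests in the given order."""
    travel, pos = 0, start
    for cyl in requests:
        travel += abs(cyl - pos)
        pos = cyl
    return travel


def elevator_order(start: int, requests: list[int]) -> list[int]:
    """Sweep upward from the head position first, then sweep back down."""
    up = sorted(c for c in requests if c >= start)
    down = sorted((c for c in requests if c < start), reverse=True)
    return up + down


if __name__ == "__main__":
    start = 53                                   # made-up head position
    queue = [98, 183, 37, 122, 14, 124, 65, 67]  # made-up pending requests
    print("arrival order travel: ", head_travel(start, queue))
    print("elevator order travel:", head_travel(start, elevator_order(start, queue)))
```

With this queue the arrival order crosses 640 cylinders while the
elevator order crosses 299. The deeper the queue, the more chances the
scheduler has to pick up a nearby request along the way, which is why
more concurrent random load can raise the number of seeks per second.
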
>
> Ah, I never really understood how adding more random load could increase 
> performance. Now I get it :)
>
>   
>>> single application does sequential access the head will likely have
>>> moved between them. The only solution is to have lots of ram for
>>> cache, and/or lots of disks. It'd be better if they were connected to
>>> several controllers...
>>>       
>> A large RAM cache will help, but as I already pointed out, the increases in
>> returns for increasing cache size diminish rapidly past a certain point.
>> Most quality drives these days have a 32MB cache, or 128MB for a 4 drive
>> array.  Add the Linux cache on top of that, and it should be sufficient for
>> most purposes.  Remember, random seeks imply small data extents.  Lots of
>> disks will bring the biggest benefit, and disks are cheap.  Multiple
>> controllers really are not necessary, especially if the controller and
>> drives support NCQ, but having multiple controllers certainly doesn't
>> hurt.
>>     
>
> Yet I've heard NCQ makes some things worse. Some raid tweaking pages tell you 
> to try disabling NCQ.
>
> I've actually been thinking about trying md-cache with an SSD on top of my new 
> raid and seeing how that works long term. But I can't really think of a good 
> benchmark that actually imitates my particular use cases well enough to show 
> me if it'd help me at all ::)
>
> I doubt my puny little 30G OCZ Vertex would really help all that much
> anyhow.
>   

For ext[34] you might want to put the journal on the SSD; if you are
doing any significant writing, that will help.
Mounting with data=journal may also help writes: supposedly a write
completes once the data hits the journal, without waiting for the platter.

-- 
Bill Davidsen <davidsen@tmr.com>
  Even purely technical things can appear to be magic, if the documentation is
obscure enough. For example, PulseAudio is configured by dancing naked around a
fire at midnight, shaking a rattle with one hand and a LISP manual with the
other, while reciting the GNU manifesto in hexadecimal. The documentation fails
to note that you must circle the fire counter-clockwise in the southern
hemisphere.


