From: Roger Heflin <rogerheflin@gmail.com>
To: Martin Sustrik <sustrik@fastmq.com>
Cc: Martin Lucina <mato@kotelna.sk>, linux-kernel@vger.kernel.org
Subject: Re: Higher than expected disk write(2) latency
Date: Mon, 30 Jun 2008 14:02:24 -0500
Message-ID: <48692DC0.6060904@gmail.com>
In-Reply-To: <486921AD.8060308@fastmq.com>
Martin Sustrik wrote:
> Hi Roger,
>
>>> If these figures are to be believed, then why are we seeing latencies of
>>> 8.3 msec? Is this normal? Or are we just being overly optimistic in
>>> our performance expectations?
>>
>> Consider this, 60/7200rpm=8.3ms for one rotation.
>>
>> You write sector n and n+1, it takes some amount of time for that
>> first set of sectors to come under the head, when it does you write it
>> and immediately return. Immediately after that you attempt to write
>> sectors n+2 and n+3, which just a bit ago passed under the head, so you
>> have to wait an *ENTIRE* revolution for those sectors to again come
>> under the head to be written, another ~8.3ms, and you continue to
>> repeat this with each block being written. If the sector was
>> randomly placed in the rotation (i.e. a 50% chance of the disk being off
>> by 1/2 a rotation or less), you would have a ~4.15 ms average rotational
>> delay for your test, but the case of sequential sync writes leaves the
>> sector about as far as possible from the head (it just passed under
>> the head).
>
> Fair enough. That explains the behaviour. Would AIO help here? If we
> are able to enqueue next write before the first one is finished, it can
> start writing it immediately without waiting for a revolution.
It might, if you can get the writes queued at the disk level. The things to
watch are whether the disk can queue requests (and whether the controller and
driver support it), how many requests the disk can queue, and how large each
request can be. If they aren't queued at the disk, there is a chance that
the machine cannot get the data to the disk fast enough for that next sector.

I have always avoided fully sync operations, as things *ALWAYS* got really,
really slow because of all of the requirements needed to make sure that the
data always got to disk correctly on an unexpected crash. In the type of
applications I dealt with, if the machine crashed, the data currently being
output was known to be incomplete and generally useless, so things were rerun.

Depending on your application, you could always get a small, fast solid state
device (no seek or RPM issues) and use it to keep a journal that could be
replayed after an unexpected crash, and then just use various syncs to force
things to disk at various points.
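
The journal idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not something taken from the thread: the names Journal, replay, HEADER and sync_every are hypothetical, and the record layout (length plus CRC32 in front of each payload) is one possible choice. Records are appended without a sync per write, fsync is issued every few records, and replay stops at the first truncated or corrupt record, which is what a crash in the middle of a write would leave behind.

```python
import os
import struct
import zlib

# Hypothetical record layout: 4-byte payload length, 4-byte CRC32, then payload.
HEADER = struct.Struct("<II")


class Journal:
    """Append-only journal that syncs every `sync_every` records (assumed policy)."""

    def __init__(self, path, sync_every=8):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.sync_every = sync_every
        self.pending = 0  # records written since the last fsync

    def append(self, payload):
        record = HEADER.pack(len(payload), zlib.crc32(payload)) + payload
        view = memoryview(record)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(self.fd, view)
            view = view[written:]
        self.pending += 1
        if self.pending >= self.sync_every:
            self.sync()

    def sync(self):
        # Ask the kernel to flush this file's data to the device.
        os.fsync(self.fd)
        self.pending = 0

    def close(self):
        self.sync()
        os.close(self.fd)


def replay(path):
    """Yield intact payloads; stop at the first truncated or corrupt record."""
    with open(path, "rb") as f:
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                return
            length, crc = HEADER.unpack(header)
            payload = f.read(length)
            if len(payload) < length or zlib.crc32(payload) != crc:
                return
            yield payload
```

With this policy, up to sync_every - 1 of the most recent records can be lost in a crash. That is the trade-off against paying roughly one rotation per synchronous write.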
>
>>> We also ran the same test on a different system with recent SAS disks
>>> connected via a HP/Compaq CCISS controller. I don't have the exact
>>> details of the drives used, since I don't know how to get them out of
>>> the cciss driver, but the latencies we got were around 4 msec. Whilst
>>> this is better than the "commodity" hardware used in the tests above, it
>>> still seems excessive.
>>
>> Almost the same case as for the 7200 rpm disk, but I bet these SAS
>> drives are 15k drives? If so 60/15000=4ms.
>
> Bingo!
Note that in my experience SAS drives deal with concurrency a lot
better than SATA drives. One would expect a SAS drive to scale about 2x
better than a SATA drive (faster RPM), but the test results indicate that they
were considerably better than would be expected when hit with more
concurrent streams.
Roger
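
For reference, the rotation arithmetic used in this thread can be reproduced with a few lines of Python. This is a minimal sketch. The function names are hypothetical, and the model deliberately ignores seek time, transfer time and any caching or queueing in the drive.

```python
def rotation_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def random_placement_avg_ms(rpm):
    """Average rotational delay if the target sector is at a random angle."""
    return rotation_ms(rpm) / 2.0


for rpm in (7200, 15000):
    # Sequential sync writes hit the worst case: the next sector has just
    # passed under the head, so each write waits about one full rotation.
    print(f"{rpm:>5} rpm: full rotation {rotation_ms(rpm):.2f} ms, "
          f"random placement average {random_placement_avg_ms(rpm):.2f} ms")
```

The full-rotation figures (about 8.33 ms at 7200 rpm and 4.00 ms at 15000 rpm) line up with the ~8.3 msec and ~4 msec latencies reported for the two systems.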