From: Roger Heflin <rogerheflin@gmail.com>
To: Martin Lucina <mato@kotelna.sk>
Cc: linux-kernel@vger.kernel.org, Martin Sustrik <sustrik@fastmq.com>
Subject: Re: Higher than expected disk write(2) latency
Date: Sat, 28 Jun 2008 08:11:15 -0500
Message-ID: <48663873.5010200@gmail.com>
In-Reply-To: <20080628121131.GA14181@nodbug.moloch.sk>
Martin Lucina wrote:
> Hi,
>
> we're getting some rather high figures for write(2) latency when testing
> synchronous writing to disk. The test I'm running writes 2000 blocks of
> contiguous data to a raw device, using O_DIRECT and various block sizes
> down to a minimum of 512 bytes.
>
> The disk is a Seagate ST380817AS SATA connected to an Intel ICH7
> using ata_piix. Write caching has been explicitly disabled on the
> drive, and there is no other activity that should affect the test
> results (all system filesystems are on a separate drive). The system is
> running Debian etch, with a 2.6.24 kernel.
>
> Observed results:
>
> size=1024, N=2000, took=4.450788 s, thput=3 mb/s seekc=1
> write: avg=8.388851 max=24.998846 min=8.335624 ms
> 8 ms: 1992 cases
> 9 ms: 2 cases
> 10 ms: 1 cases
> 14 ms: 1 cases
> 16 ms: 3 cases
> 24 ms: 1 cases
>
> size=512, N=2000, took=4.401289 s, thput=1 mb/s seekc=1
> write: avg=8.364283 max=16.692206 min=2.010072 ms
> 2 ms: 1 cases
> 7 ms: 1 cases
> 8 ms: 1995 cases
> 16 ms: 3 cases
>
> Measurement of the write(2) time is performed using the TSC, so any
> latency there is negligible.
>
> The datasheet for the drive being used gives the following figures:
>
> Average latency (msec): 4.16
> Track-to-track seek time (msec typical): <1.2 (write)
> Average seek, write (msec typical): 9.5
>
> If these figures are to be believed, then why are we seeing latencies of
> 8.3 msec? Is this normal? Or are we just being overly optimistic in
> our performance expectations?
Consider this: 60 s / 7200 rpm = 8.3 ms for one rotation.

You write sectors n and n+1. It takes some amount of time for that first
set of sectors to come under the head; when it does, the drive writes it
and returns immediately. Right after that you attempt to write sectors n+2
and n+3, which passed under the head just a moment ago, so you have to wait
an *ENTIRE* revolution for those sectors to come under the head again,
another ~8.3 ms, and this repeats for every block written.

If the target sector were randomly placed in the rotation (i.e. on average
half a rotation away), you would see an average rotational latency of about
4.2 ms in your test, which is what the datasheet's 4.16 ms "average latency"
figure refers to. It is a rotational figure, not a seek time. But with
sequential sync writes the next sector is about as far from the head as it
can be, because it has just passed under it.
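
To make that concrete, here is a toy model of the sequence described above.
It is only a sketch: the 1000 sectors per track and the 0.05 ms host
turnaround are made-up numbers (the real geometry of the ST380817AS is not
known here), and track switches and seeks are ignored.

```python
def sync_sequential_latencies(rpm=7200, sectors_per_track=1000,
                              sectors_per_write=2, n_writes=5,
                              issue_delay_ms=0.05):
    """Per-write latency (ms) for back-to-back synchronous sequential writes.

    sectors_per_track and issue_delay_ms are hypothetical values chosen
    for illustration, not measured properties of any real drive.
    """
    rev_ms = 60_000.0 / rpm                 # one platter revolution
    sector_ms = rev_ms / sectors_per_track  # time for one sector to pass the head
    now = 0.0                               # head angle is (now % rev_ms)
    next_sector = 0
    latencies = []
    for _ in range(n_writes):
        now += issue_delay_ms               # host turnaround before the next write
        issued = now
        # Offset within a revolution at which the target sector starts.
        target_ms = (next_sector % sectors_per_track) * sector_ms
        # The sector start has just gone by, so this is almost a full turn.
        wait_ms = (target_ms - now) % rev_ms
        now += wait_ms + sectors_per_write * sector_ms
        latencies.append(now - issued)
        next_sector += sectors_per_write
    return latencies


for i, ms in enumerate(sync_sequential_latencies()):
    print(f"write {i}: {ms:.3f} ms")
```

Under these assumptions every write costs roughly one revolution, however
small the turnaround delay is made, which is the pattern in your histogram.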
>
> What we find suspicious is that the latency we see is so close to the
> Average seek latency specified for the drive, almost as if the drive was
> performing a seek on every write.
>
> For comparison, here are the results of the same test with the disk
> write cache *enabled*:
>
> size=1024, N=2000, took=0.296284 s, thput=55 mb/s seekc=1
> write: avg=0.147745 max=0.606990 min=0.117246 ms
> 0 ms: 2000 cases
>
> size=512, N=2000, took=0.304614 s, thput=26 mb/s seekc=1
> write: avg=0.152089 max=0.533234 min=0.125370 ms
> 0 ms: 2000 cases
Write cache allows the drive to return before the data is actually written
to the platter, so you don't have to wait. On top of that, the drive queues
up the writes that sit together on the track and does them all in one pass
of the head over those sectors.
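
A rough comparison of the media time in the two cases, as a sketch only.
The 1000 sectors per track is an invented figure, track switches are
ignored, and the merged case assumes the drive coalesces every queued write
into one continuous pass, which a real drive may or may not manage.

```python
def revolution_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def uncached_total_ms(rpm, n_writes):
    """Cache off: each synchronous write waits roughly one revolution."""
    return n_writes * revolution_ms(rpm)


def merged_total_ms(rpm, n_writes, sectors_per_write, sectors_per_track):
    """Cache on (idealised): at most one revolution to reach the first
    sector, then all sectors are streamed in one continuous pass."""
    rev = revolution_ms(rpm)
    total_sectors = n_writes * sectors_per_write
    return rev + total_sectors * rev / sectors_per_track


# 2000 writes of 1024 bytes (two 512-byte sectors each), as in the test.
# sectors_per_track=1000 is a hypothetical value.
print(f"uncached: {uncached_total_ms(7200, 2000):.0f} ms")
print(f"merged:   {merged_total_ms(7200, 2000, 2, 1000):.1f} ms")
```

The absolute numbers should not be taken seriously, but the ratio shows why
the cached run is orders of magnitude faster.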
>
> We also ran the same test on a different system with recent SAS disks
> connected via a HP/Compaq CCISS controller. I don't have the exact
> details of the drives used, since I don't know how to get them out of
> the cciss driver, but the latencies we got were around 4 msec. Whilst
> this is better than the "commodity" hardware used in the tests above, it
> still seems excessive.
Almost the same case as for the 7200 rpm disk, but I bet these SAS drives
are 15k rpm drives? If so, 60 s / 15000 rpm = 4 ms per rotation.
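
The arithmetic for both drives, spelled out as a small sketch. The 15000 rpm
figure is my guess about the SAS drives, not something confirmed in this
thread.

```python
def rotation_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def avg_rotational_latency_ms(rpm):
    """Expected wait for a randomly placed sector: half a revolution."""
    return rotation_ms(rpm) / 2.0


# 7200 rpm is the SATA drive; 15000 rpm is an assumed speed for the SAS drives.
for rpm in (7200, 15000):
    print(f"{rpm} rpm: full rotation {rotation_ms(rpm):.2f} ms, "
          f"average rotational latency {avg_rotational_latency_ms(rpm):.2f} ms")
```

The full-rotation column is the one to compare against the latencies of your
sequential sync writes; the half-rotation column is what a datasheet
"average latency" entry usually means.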
>
> Any advice would be appreciated.
>
> Thanks,
>
Roger