* Higher than expected disk write(2) latency
From: Martin Lucina @ 2008-06-28 12:11 UTC (permalink / raw)
  To: linux-kernel; +Cc: Martin Sustrik

Hi,

we're getting some rather high figures for write(2) latency when testing
synchronous writing to disk.  The test I'm running writes 2000 blocks of
contiguous data to a raw device, using O_DIRECT and various block sizes
down to a minimum of 512 bytes.  

The disk is a Seagate ST380817AS SATA connected to an Intel ICH7
using ata_piix.  Write caching has been explicitly disabled on the
drive, and there is no other activity that should affect the test
results (all system filesystems are on a separate drive).  The system is
running Debian etch, with a 2.6.24 kernel.

Observed results:

size=1024, N=2000, took=4.450788 s, thput=3 mb/s seekc=1
write: avg=8.388851 max=24.998846 min=8.335624 ms
8 ms: 1992 cases
9 ms: 2 cases
10 ms: 1 cases
14 ms: 1 cases
16 ms: 3 cases
24 ms: 1 cases

size=512, N=2000, took=4.401289 s, thput=1 mb/s seekc=1
write: avg=8.364283 max=16.692206 min=2.010072 ms
2 ms: 1 cases
7 ms: 1 cases
8 ms: 1995 cases
16 ms: 3 cases

The write(2) time is measured using the TSC, so the overhead of the
measurement itself is negligible.
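
For readers who want to reproduce this kind of measurement, here is a
minimal sketch of the test loop in Python.  It is not the program that
produced the figures above: the names measure_writes and histogram_ms
are made up for illustration, a monotonic clock stands in for the TSC,
and the millisecond bucketing (truncating to whole milliseconds) is an
assumption inferred from the output format.

```python
import mmap
import os
import time
from collections import Counter


def measure_writes(path, size, count, direct=True):
    """Write `count` contiguous blocks of `size` bytes to `path`.

    Returns the latency of each write(2) call in milliseconds.
    """
    flags = os.O_WRONLY
    if direct:
        # Linux-only flag: bypass the page cache, as in the test described.
        flags |= os.O_DIRECT
    # An anonymous mmap is page-aligned, which satisfies the buffer
    # alignment that O_DIRECT requires.
    buf = mmap.mmap(-1, size)
    buf.write(b"\xaa" * size)
    fd = os.open(path, flags)
    latencies = []
    try:
        for _ in range(count):
            start = time.perf_counter_ns()
            written = os.write(fd, buf)
            elapsed = time.perf_counter_ns() - start
            if written != size:
                raise OSError("short write")
            latencies.append(elapsed / 1e6)  # ns -> ms
    finally:
        os.close(fd)
        buf.close()
    return latencies


def histogram_ms(latencies):
    """Count writes per whole-millisecond bucket (assumed: truncation)."""
    return sorted(Counter(int(ms) for ms in latencies).items())
```

Calling measure_writes on a raw device node with size=512 and
count=2000 would mirror the 512-byte run, but note that it overwrites
whatever is on that device and needs appropriate permissions.  On a
regular file, pass direct=False if the filesystem rejects O_DIRECT.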

The datasheet for the drive being used gives the following figures:

Average latency (msec): 4.16
Track-to-track seek time (msec typical): <1.2 (write)
Average seek, write (msec typical): 9.5

If these figures are to be believed, then why are we seeing latencies of
8.3 msec?  Is this normal?  Or are we just being overly optimistic in
our performance expectations?

What we find suspicious is that the latency we see is so close to the
average seek latency specified for the drive, almost as if the drive were
performing a seek on every write.
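
One more data point that may be relevant: the arithmetic below relates
spindle speed to rotation time.  The 7200 RPM figure is an assumption;
it is not stated above, but it is consistent with the datasheet's
4.16 msec average latency, which would be half a revolution.  The
function name rotation_ms is made up for illustration.

```python
def rotation_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


# Assumed spindle speed; not taken from the figures quoted in this mail.
full = rotation_ms(7200)
half = full / 2  # average rotational latency is half a revolution

print(f"full rotation: {full:.2f} ms, average rotational latency: {half:.2f} ms")
```

Under that assumption one full revolution takes about 8.33 ms, which
is very close to the observed minimum of 8.335624 ms per 1024-byte
write, so a full rotation per write is a candidate explanation
alongside the seek hypothesis.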

For comparison, here are the results of the same test with the disk
write cache *enabled*:

size=1024, N=2000, took=0.296284 s, thput=55 mb/s seekc=1
write: avg=0.147745 max=0.606990 min=0.117246 ms
0 ms: 2000 cases

size=512, N=2000, took=0.304614 s, thput=26 mb/s seekc=1
write: avg=0.152089 max=0.533234 min=0.125370 ms
0 ms: 2000 cases

We also ran the same test on a different system with recent SAS disks
connected via an HP/Compaq CCISS controller.  I don't have the exact
details of the drives used, since I don't know how to get them out of
the cciss driver, but the latencies we got were around 4 msec.  Whilst
this is better than the "commodity" hardware used in the tests above, it
still seems excessive.

Any advice would be appreciated.

Thanks,

-mato

