From: Daniel Pocock <daniel@pocock.com.au>
To: Martin Steigerwald <Martin@lichtvoll.de>
Cc: Andreas Dilger <adilger@dilger.ca>, linux-ext4@vger.kernel.org
Subject: Re: ext4, barrier, md/RAID1 and write cache
Date: Mon, 07 May 2012 22:56:29 +0200
Message-ID: <4FA836FD.2070506@pocock.com.au>
In-Reply-To: <201205072059.10256.Martin@lichtvoll.de>
On 07/05/12 20:59, Martin Steigerwald wrote:
> On Monday, 7 May 2012, Daniel Pocock wrote:
>
>>> Possibly the older disk is lying about doing cache flushes. The
>>> wonderful disk manufacturers do that with commodity drives to make
>>> their benchmark numbers look better. If you run some random IOPS
>>> test against this disk, and it has performance much over 100 IOPS
>>> then it is definitely not doing real cache flushes.
>>>
> […]
>
> I think an IOPS benchmark would be better. I.e. something like:
>
> /usr/share/doc/fio/examples/ssd-test
>
> (from flexible I/O tester debian package, also included in upstream tarball
> of course)
>
> adapted to your needs.
>
> Maybe with different iodepth or numjobs (to simulate several threads
> generating higher iodepths). With iodepth=1 I have seen 54 IOPS on a
> Hitachi 5400 rpm harddisk connected via eSATA.
>
> It is important to set direct=1 to bypass the page cache.
>
>
Thanks for suggesting this tool. I've run it against the USB disk and an
LV on my AHCI/SATA/md array.
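For anyone wanting to reproduce this, here is a minimal sketch of a job
file in the spirit of the ssd-test example, generated from Python. The
values iodepth=4, bs=4k and runtime=60 are inferred from the results
below rather than copied from the job that was actually run;
ioengine=libaio, size=1g and the /mnt/test directory are assumptions
for illustration only, and build_fio_job is a hypothetical helper name.

```python
import configparser
import io


def build_fio_job(directory, iodepth=4, runtime_s=60):
    """Return the text of a small fio job file for a random-write test.

    All values here are illustrative assumptions, not the exact job
    that produced the results in this message.
    """
    job = configparser.ConfigParser()
    job["global"] = {
        "ioengine": "libaio",      # assumption: Linux native async I/O
        "direct": "1",             # bypass the page cache
        "bs": "4k",                # block size per request
        "iodepth": str(iodepth),   # requests kept in flight
        "size": "1g",              # assumption: test file size
        "runtime": str(runtime_s), # stop after this many seconds
        "directory": directory,    # where the test file is created
    }
    job["rand-write"] = {"rw": "randwrite"}

    buf = io.StringIO()
    # Write key=value without spaces around "=", the usual job file style.
    job.write(buf, space_around_delimiters=False)
    return buf.getvalue()


job_text = build_fio_job("/mnt/test")
print(job_text)
```

Saving the printed text to a file and passing that file to fio should
run a single random-write job; changing iodepth is the simplest way to
see how queue depth affects the reported IOPS.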
Incidentally, I upgraded the Seagate firmware (model 7200.12 from CC34
to CC49) and one of the disks went offline shortly after I brought the
system back up. To avoid the risk that a bad drive might interfere with
the SATA performance, I completely removed it before running any tests.
Tomorrow I'm going out to buy some enterprise-grade drives; I'm thinking
about Seagate Constellation SATA or even SAS.
Anyway, on to the test results:
USB disk (Seagate 9SD2A3-500 320GB):
rand-write: (groupid=3, jobs=1): err= 0: pid=22519
  write: io=46680KB, bw=796512B/s, iops=194, runt= 60012msec
    slat (usec): min=13, max=25264, avg=106.02, stdev=525.18
    clat (usec): min=993, max=103568, avg=20444.19, stdev=11622.11
    bw (KB/s) : min= 521, max= 1224, per=100.06%, avg=777.48, stdev=97.07
  cpu : usr=0.73%, sys=2.33%, ctx=12024, majf=0, minf=20
  IO depths : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
    submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    issued r/w: total=0/11670, short=0/0
    lat (usec): 1000=0.01%
    lat (msec): 2=0.01%, 4=0.24%, 10=2.75%, 20=64.64%, 50=29.97%
    lat (msec): 100=2.31%, 250=0.08%
And from the SATA disk on the AHCI controller:
- Barracuda 7200.12 ST31000528AS connected to
- AMD RS785E/SB820M chipset (lspci reports SB700/SB800 AHCI mode)
rand-write: (groupid=3, jobs=1): err= 0: pid=23038
  write: io=46512KB, bw=793566B/s, iops=193, runt= 60018msec
    slat (usec): min=13, max=35317, avg=97.09, stdev=541.14
    clat (msec): min=2, max=214, avg=20.53, stdev=18.56
    bw (KB/s) : min= 0, max= 882, per=98.54%, avg=762.72, stdev=114.51
  cpu : usr=0.85%, sys=2.27%, ctx=11972, majf=0, minf=21
  IO depths : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
    submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    issued r/w: total=0/11628, short=0/0
    lat (msec): 4=1.81%, 10=32.65%, 20=31.30%, 50=26.82%, 100=6.71%
    lat (msec): 250=0.71%
The IOPS scores look similar, but I checked carefully and I'm fairly
certain the disks were mounted correctly when the tests ran.
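As a sanity check on how similar the two runs are, the headline figures
can be recomputed from the raw counters in the fio output. The sketch
below uses only numbers copied from the two reports above; the FioRun
class and its field names are my own, and the queue-depth estimate
(IOPS is roughly iodepth divided by mean latency) is a back-of-envelope
approximation that assumes the queue of 4 stayed full for the whole run.

```python
from dataclasses import dataclass


@dataclass
class FioRun:
    """Counters copied from one fio rand-write report."""
    name: str
    io_kb: int          # "io=" total data written, in KB
    runt_ms: int        # "runt=" run time, in milliseconds
    issued_writes: int  # second number in "issued r/w: total="
    slat_avg_us: float  # mean submission latency, microseconds
    clat_avg_us: float  # mean completion latency, microseconds
    iodepth: int = 4    # "IO depths" shows 4=100.0% for both runs

    @property
    def iops(self) -> float:
        # Requests completed per second of run time.
        return self.issued_writes / (self.runt_ms / 1000.0)

    @property
    def block_kb(self) -> float:
        # Average request size implied by total data / request count.
        return self.io_kb / self.issued_writes

    @property
    def bandwidth_bps(self) -> float:
        # Bytes per second, using 1 KB = 1024 bytes as fio does here.
        return self.io_kb * 1024 / (self.runt_ms / 1000.0)

    @property
    def iops_from_latency(self) -> float:
        # Queue-depth estimate: in-flight requests / mean time per request.
        mean_latency_s = (self.slat_avg_us + self.clat_avg_us) / 1e6
        return self.iodepth / mean_latency_s


usb = FioRun("USB", 46680, 60012, 11670, 106.02, 20444.19)
# The SATA report gives clat in msec (avg=20.53), converted to usec here.
sata = FioRun("SATA", 46512, 60018, 11628, 97.09, 20530.0)

for run in (usb, sata):
    print(f"{run.name}: iops={run.iops:.1f} "
          f"block={run.block_kb:.1f}KB "
          f"bw={run.bandwidth_bps:.0f}B/s "
          f"iops_from_latency={run.iops_from_latency:.1f}")
```

Both runs come out at about 194 writes per second with 4 KB requests,
and the latency-based estimate lands within one IOPS of the counted
value, so at this queue depth the two disks really do behave almost
identically.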
Should I run this tool over NFS? Will the results be meaningful?
Given the need to replace a drive anyway, I'm really thinking about one
of the following approaches:
- same controller, upgrade to enterprise SATA drives
- buy a dedicated SAS/SATA controller, upgrade to enterprise SATA drives
- buy a dedicated SAS/SATA controller, upgrade to SAS drives
My HP N36L is quite small: it has one PCIe x16 slot, and the internal
drive cage has an SFF-8087 (mini SAS) plug, so I'm thinking I can grab
something small like the Adaptec 1405. Will any of these solutions offer
a definite win with my NFS issues, though?