From: Bill Davidsen <davidsen@tmr.com>
To: Roger Lucas <roger@planbit.co.uk>
Cc: linux-raid@vger.kernel.org, neilb@suse.de
Subject: Re: Odd (slow) RAID performance
Date: Sat, 02 Dec 2006 00:27:56 -0500 [thread overview]
Message-ID: <45710EDC.9050805@tmr.com> (raw)
In-Reply-To: <20061201092211.4ACDB12EDE@bluewhale.planbit.co.uk>
Roger Lucas wrote:
>> Roger Lucas wrote:
>>>>> What drive configuration are you using (SCSI / ATA / SATA), what
>>>>> chipset is providing the disk interface and what cpu are you
>>>>> running with?
>>>>>
>>>> 3xSATA, Seagate 320 ST3320620AS, Intel 6600, ICH7 controller using the
>>>> ata-piix driver, with drive cache set to write-back. It's not obvious
>>>> to me why that matters, but if it helps you see the problem I'm glad
>>>> to provide the info. I'm seeing ~50MB/s on the raw drive, and 3x that
>>>> on plain stripes, so I'm assuming that either the RAID-5 code is not
>>>> working well or I haven't set it up optimally.
>>>>
>>> If it had been ATA, and you had two drives as master+slave on the same
>>> cable, then they would be fast individually but slow as a pair.
>>>
>>> RAID-5 is higher overhead than RAID-0/RAID-1, so if your CPU was slow
>>> then you would see some degradation from that too.
>>>
>>> We have similar hardware here so I'll run some tests here and see what I
>>> get...
>> Much appreciated. Since my last note I tried adding --bitmap=internal to
>> the array. Boy, is that a write performance killer. I will have the chart
>> updated in a minute, but write dropped to ~15MB/s with bitmap. Since
>> Fedora can't seem to shut the last array down cleanly, I get a rebuild
>> on every boot :-( So the array for the LVM has bitmap on, as I hate to
>> rebuild 1.5TB regularly. Have to do some compromises on that!
>>
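For anyone wanting to reproduce the bitmap comparison: the write-intent bitmap can be toggled on a live array with --grow. A minimal sketch, assuming the array is /dev/md0 and a reasonably recent mdadm; the chunk size shown is an assumption, tune to taste:

```shell
# Sketch: toggle the write-intent bitmap on a live array (assumes /dev/md0).
# An internal bitmap speeds post-crash resync but costs write throughput;
# a larger --bitmap-chunk means fewer extra bitmap updates per write.
mdadm --grow /dev/md0 --bitmap=internal --bitmap-chunk=65536
# ...benchmark, then remove it again for comparison:
mdadm --grow /dev/md0 --bitmap=none
```

Removing and re-adding the bitmap this way makes it easy to measure the overhead on the same array rather than rebuilding from scratch.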
>
> Hi Bill,
>
> Here are the results of my tests here:
>
> CPU: Intel Celeron 2.7GHz socket 775
> MB: Abit LG-81 (Lakeport ICH7 chipset)
> HDD: 4 x Seagate SATA ST3160812AS (directly connected to ICH7)
> OS: Linux 2.6.16-xen
>
> root@hydra:~# uname -a
> Linux hydra 2.6.16-xen #1 SMP Thu Apr 13 18:46:07 BST 2006 i686 GNU/Linux
> root@hydra:~#
>
> All four disks are built into a RAID-5 array to provide ~420GB real storage.
> Most of this is then used by the other Xen virtual machines but there is a
> bit of space left on this server to play with in the Dom-0.
>
> I wasn't able to run I/O tests with "dd" on the disks themselves as I don't
> have a spare partition to corrupt, but hdparm gives:
>
> root@hydra:~# hdparm -tT /dev/sda
>
> /dev/sda:
> Timing cached reads: 3296 MB in 2.00 seconds = 1648.48 MB/sec
> Timing buffered disk reads: 180 MB in 3.01 seconds = 59.78 MB/sec
> root@hydra:~#
>
> This is exactly what I would expect, as it is the performance limit of the
> disk. We have a lot of ICH7/ICH7R-based servers here and all can run the
> disk at their maximum physical speed without problems.
>
> root@hydra:~# cat /proc/mdstat
> Personalities : [raid5] [raid4]
> md0 : active raid5 sda2[0] sdd2[3] sdc2[2] sdb2[1]
> 468647808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
>
> unused devices: <none>
> root@hydra:~# df -h
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/bigraid-root
> 10G 1.3G 8.8G 13% /
> <snip>
> root@hydra:~# vgs
> VG #PV #LV #SN Attr VSize VFree
> bigraid 1 13 0 wz--n- 446.93G 11.31G
> root@hydra:~# lvcreate --name testspeed --size 2G bigraid
> Logical volume "testspeed" created
> root@hydra:~#
>
> *** Now for the LVM over RAID-5 read/write tests ***
>
> root@hydra:~# sync; time bash -c "dd if=/dev/zero bs=1024k count=2048
> of=/dev/bigraid/testspeed; sync"
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 33.7345 seconds, 63.7 MB/s
>
> real 0m34.211s
> user 0m0.020s
> sys 0m2.970s
> root@hydra:~# sync; time bash -c "dd of=/dev/zero bs=1024k count=2048
> if=/dev/bigraid/testspeed; sync"
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 38.1175 seconds, 56.3 MB/s
>
> real 0m38.637s
> user 0m0.010s
> sys 0m3.260s
> root@hydra:~#
>
> During the above two tests, the CPU showed about 35% idle using "top".
>
> *** Now for the file system read/write tests ***
> (Reiser over LVM over RAID-5)
>
> root@hydra:~# mount
> /dev/mapper/bigraid-root on / type reiserfs (rw)
> <snip>
> root@hydra:~#
>
>
> root@hydra:~# sync; time bash -c "dd if=/dev/zero bs=1024k count=2048
> of=~/testspeed; sync"
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 29.8863 seconds, 71.9 MB/s
>
> real 0m32.289s
> user 0m0.000s
> sys 0m4.440s
> root@hydra:~# sync; time bash -c "dd of=/dev/null bs=1024k count=2048
> if=~/testspeed; sync"
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 40.332 seconds, 53.2 MB/s
>
> real 0m40.973s
> user 0m0.010s
> sys 0m2.640s
> root@hydra:~#
>
> During the above two tests, the CPU showed between 0% and 30% idle using
> "top".
>
> Just out of curiosity, I started the RAID-5 check process to see what load it
> generated...
>
> root@hydra:~# cat /sys/block/md0/md/mismatch_cnt
> 0
> root@hydra:~# echo check > /sys/block/md0/md/sync_action
> root@hydra:~# cat /sys/block/md0/md/sync_action
> check
> root@hydra:~# cat /proc/mdstat
> Personalities : [raid5] [raid4]
> md0 : active raid5 sda2[0] sdd2[3] sdc2[2] sdb2[1]
> 468647808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
> [>....................] resync = 1.0% (1671552/156215936)
> finish=101.8min speed=25292K/sec
>
> unused devices: <none>
> root@hydra:~#
>
> Whilst the above test was running, the CPU load was between 3% and 7%, so
> running the RAID array isn't that hard for it...
>
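(Aside: the check/resync rate shown above is also subject to md's global speed limits. A sketch of those knobs; the values in the comments are what I believe the kernel defaults to be:)

```shell
# md throttles resync/check between these two limits (KB/s per device).
cat /proc/sys/dev/raid/speed_limit_min   # default believed to be 1000
cat /proc/sys/dev/raid/speed_limit_max   # default believed to be 200000
# Raising the minimum forces a check to make progress even under load:
echo 50000 > /proc/sys/dev/raid/speed_limit_min
```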
> -------------------------
>
> So, using a 4-disk RAID-5 array with an ICH7, I get about 64 MB/s write
> and 54 MB/s read performance. The processor is about 35% idle whilst the
> test is running - I'm not sure why; I would have expected it to be 0%
> idle, since it should be hitting the hard disks as fast as possible and
> waiting for them otherwise...
>
> If I run over Reiser, the processor load changes a lot more, varying between
> 0% and 35% idle. It also takes a couple of seconds after the test has
> finished before the load drops down to zero on the write test, so I suspect
> these results are basically the same as the raw LVM-over-RAID5 performance.
>
> Summary - with four disks it is a little faster than the 37.5 MB/s that
> you see with just three, but it is WAY off the theoretical target of
> 3 x 60 MB/s = 180 MB/s that could be expected from a 4-disk RAID-5
> array.
>
> On the flip side, the performance is good enough for me, so it is not
> causing me a problem, but it seems that there should be a performance boost
> available somewhere!
>
> Best regards,
>
> Roger
Thank you so much for verifying this. I do keep enough room on my drives
to run tests by creating whatever kind of array I need, but the point is
clear: with N drives striped, the transfer rate is N x the base rate of
one drive; with RAID-5 it is about the speed of a single drive,
suggesting that the md code serializes writes.
If true, BOO, HISS!
Can you explain and educate us, Neil? This looks like terrible performance.
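To put numbers on that, a minimal back-of-the-envelope sketch; the per-disk rates are the ~50 MB/s and ~60 MB/s figures quoted in this thread, the rest is arithmetic:

```shell
# Expected sequential-transfer ceilings, using per-disk rates quoted in
# this thread: ~50 MB/s (my drives) and ~60 MB/s (Roger's drives).
base_stripe=50   # MB/s per disk, 3-disk stripe case
base_roger=60    # MB/s per disk, 4-disk RAID-5 case
# RAID-0: N disks stream in parallel, no parity overhead.
echo "3-disk RAID-0 ceiling: $((3 * base_stripe)) MB/s"       # 150
# RAID-5: full-stripe writes get N-1 disks' worth of data bandwidth.
echo "4-disk RAID-5 ceiling: $(((4 - 1) * base_roger)) MB/s"  # 180
```

The RAID-0 ceiling matches the ~3x single-drive stripe result above; the RAID-5 ceiling is roughly triple what either of us actually measures.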
--
Bill Davidsen
He was a full-time professional cat, not some moonlighting
ferret or weasel. He knew about these things.
Thread overview: 20+ messages
2006-11-30 14:13 Odd (slow) RAID performance Bill Davidsen
2006-11-30 14:31 ` Roger Lucas
2006-11-30 15:30 ` Bill Davidsen
2006-11-30 15:32 ` Roger Lucas
2006-11-30 21:09 ` Bill Davidsen
2006-12-01 9:24 ` Roger Lucas
2006-12-02 5:27 ` Bill Davidsen [this message]
2006-12-05 1:33 ` Dan Williams
2006-12-07 15:51 ` Bill Davidsen
2006-12-08 1:15 ` Corey Hickey
2006-12-08 8:21 ` Gabor Gombas
2006-12-08 6:01 ` Neil Brown
2006-12-08 7:28 ` Neil Brown
2006-12-09 20:20 ` Bill Davidsen
2006-12-12 17:44 ` Bill Davidsen
2006-12-12 18:48 ` Raz Ben-Jehuda(caro)
2006-12-12 21:51 ` Bill Davidsen
2006-12-13 17:44 ` Mark Hahn
2006-12-20 4:05 ` Bill Davidsen
2006-12-09 20:16 ` Bill Davidsen