* Re: raid5 performance - 2.4.28
From: Tim Moore @ 2005-01-10 16:26 UTC
To: linux-raid; +Cc: Mark Hahn
Mark Hahn wrote:
>>Here's a data point in favor of raw horsepower when considering
>>software raid performance.
>
>
> mostly sw r5 write performance, right?
Correct. Writes increased by 3X, Rewrites by 50%, Reads about the same.
>>Athlon K7 (18u) @ 840MHz, 1GB PC133, Abit KA7
>>Athlon XP 2800 @ 2075MHz, 1GB PC2700, Asus A7V400-MX
>
>
> so your dram bandwidth (measured by stream, say) went from maybe
> .8 GB/s to around 2 GB/s. do you still have boot logs from the
> older configuration around? it would be interesting to know
> the in-cache checksumming speed gain, ie:
>
> raid5: using function: pIII_sse (3128.000 MB/sec)
The Abit KA7 was the first consumer motherboard to use leading+trailing memory
clocking and bank interleaving, so memory speed has only slightly more than doubled:
Athlon slot-A @ 850MHz + PC133 SDRAM
------------------------------------
kernel: raid5: measuring checksumming speed
kernel: 8regs : 1285.600 MB/sec
kernel: 32regs : 780.800 MB/sec
kernel: pII_mmx : 1972.400 MB/sec
kernel: p5_mmx : 2523.600 MB/sec
kernel: raid5: using function: p5_mmx (2523.600 MB/sec)
kernel: md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
Athlon XP @ 2075MHz + PC2700 DDR
--------------------------------
kernel: raid5: measuring checksumming speed
kernel: 8regs : 3172.800 MB/sec
kernel: 32regs : 1932.400 MB/sec
kernel: pIII_sse : 3490.800 MB/sec
kernel: pII_mmx : 4868.400 MB/sec
kernel: p5_mmx : 6229.200 MB/sec
kernel: raid5: using function: pIII_sse (3490.800 MB/sec)
kernel: md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
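For reference, a minimal user-space sketch of what those "measuring checksumming
speed" lines reflect: each candidate xor routine is timed over a fixed buffer and
the fastest wins. The two routines below are illustrative stand-ins, not the
kernel's 8regs/p5_mmx/pIII_sse implementations.

/* Sketch only: time xor candidates over a 4 KB buffer, keep the fastest. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define BUF_WORDS (4096 / sizeof(uint64_t))
#define PASSES    200000

static uint64_t dst[BUF_WORDS], src[BUF_WORDS];

static void xor_bytewise(uint64_t *d, const uint64_t *s)
{
	uint8_t *db = (uint8_t *)d;
	const uint8_t *sb = (const uint8_t *)s;
	for (size_t i = 0; i < BUF_WORDS * sizeof(uint64_t); i++)
		db[i] ^= sb[i];
}

static void xor_wordwise(uint64_t *d, const uint64_t *s)
{
	for (size_t i = 0; i < BUF_WORDS; i++)
		d[i] ^= s[i];
}

int main(void)
{
	struct { const char *name; void (*fn)(uint64_t *, const uint64_t *); } c[] = {
		{ "bytewise", xor_bytewise },
		{ "wordwise", xor_wordwise },
	};
	const char *best = NULL;
	double best_mbs = 0.0;

	memset(src, 0x5a, sizeof(src));
	for (size_t i = 0; i < sizeof(c) / sizeof(c[0]); i++) {
		clock_t t0 = clock();
		for (int p = 0; p < PASSES; p++)
			c[i].fn(dst, src);
		double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
		double mbs = sizeof(dst) * (double)PASSES / secs / 1e6;
		printf("%-10s: %8.1f MB/sec\n", c[i].name, mbs);
		if (mbs > best_mbs) {
			best_mbs = mbs;
			best = c[i].name;
		}
	}
	printf("using function: %s (%.1f MB/sec)\n", best, best_mbs);
	/* touch the result so the xor loops are not optimized away */
	printf("(dst[0] = %llu)\n", (unsigned long long)dst[0]);
	return 0;
}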
I'm also experimenting with the patch below to see whether the hard-wired xor
selection (XOR_SELECT_TEMPLATE) is still valid for modern Intel/AMD
architectures. With the old processor, p5_mmx was always picked and was always
within a few MB/s of the alternatives. The new XP is all over the map.
pre-patch: always pIII_sse (35xx)
post-patch: always p5_mmx (62xx)
--- ./include/asm-i386/xor.h.orig	Fri Aug 2 17:39:45 2002
+++ ./include/asm-i386/xor.h	Sun Jan 9 22:32:37 2005
@@ -876,3 +876,8 @@
    deals with a load to a line that is being prefetched. */
 #define XOR_SELECT_TEMPLATE(FASTEST) \
 	(cpu_has_xmm ? &xor_block_pIII_sse : FASTEST)
+
+/* This may have been true in 1998, but let's try what appears to be
+   nearly 4x faster */
+#define XOR_SELECT_TEMPLATE(FASTEST) \
+	(cpu_has_xmm ? &xor_block_p5_mmx : FASTEST)
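Note that, as posted, the hunk leaves the original #define in place, so the
preprocessor will warn about redefining XOR_SELECT_TEMPLATE (the later
definition still wins). A sketch of the same override with an explicit #undef,
which avoids the warning; this is only an alternative, not the diff as sent:

/* same override, but without redefining an existing macro */
#undef XOR_SELECT_TEMPLATE
#define XOR_SELECT_TEMPLATE(FASTEST) \
	(cpu_has_xmm ? &xor_block_p5_mmx : FASTEST)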
* Re: raid5 performance - 2.4.28
From: Tim Moore @ 2005-01-12 5:18 UTC
To: linux-raid
Mark Hahn wrote:
...
>>I'm also experimenting with this patch to see if the xor hardwire for
>>modern intel/AMD architectures is still valid. With the old processor
>>p5_mmx was always picked and always within a few MB/s. The new XP is all
>>over the map.
>
>
> this choice (using pIII_sse even if there's a faster alternative)
> is made because it doesn't dirty the cache. are you suggesting that
> whole-system performance is better with p5_mmx, even though or *because*
> the blocks are then in cache? seems unlikely to me, since the checksum
> is only used on writes or degraded mode. having written blocks in cache
> seems worthless, almost...
I'm an experimenter, not a theorist. I'm running a series of experiments to
determine whether, and by how much, the xor algorithm in
include/asm-<arch>/xor.h influences RAID5 write performance and reconstruction.
I'll post results.
--
* raid5 performance - 2.4.28
From: Tim Moore @ 2005-01-09 5:28 UTC
To: linux-raid
Here's a data point in favor of raw horsepower when considering
software raid performance. I recently upgraded an Athlon Slot-A/SDRAM
system to an Athlon XP/DDR system without changing OS, disks,
disk controller or filesystems.
Athlon K7 (18u) @ 840MHz, 1GB PC133, Abit KA7
---------------------------------------------
abit,1500M:4k,,,12862,36,11132,18,,,44953,32,284.0,1,,,,,,,,,,,,,
abit,1500M:4k,,,13132,36,10231,17,,,57147,40,278.2,1,,,,,,,,,,,,,
abit,1500M:4k,,,12595,36,11039,18,,,51269,36,281.6,1,,,,,,,,,,,,,
Athlon XP 2800 @ 2075MHz, 1GB PC2700, Asus A7V400-MX
----------------------------------------------------
abit,1500M:4k,,,41065,52,14636,15,,,49355,23,429.3,1,,,,,,,,,,,,,
abit,1500M:4k,,,41330,58,15932,16,,,47045,22,415.7,2,,,,,,,,,,,,,
abit,1500M:4k,,,43364,73,15395,17,,,52025,25,418.0,1,,,,,,,,,,,,,
test: bonnie++ v1.03 (-q -s 1500:4k -r 0 -n 0 -f -u root -x 3)
      (CSV fields above: block write KB/s, %CPU, rewrite KB/s, %CPU,
      block read KB/s, %CPU, random seeks/s, %CPU; -f skips the
      per-character tests, hence the empty fields)
array: RAID 5 (2xIBM-DTLA-307020, 1xMAXTOR 6L020J1)
md partition: 3x10GB (last 1/2 of each 20GB disk)
controller: 3ware 6400, PATA, 66MHz (JBOD)
filesystem: ext3, 4k block size, stride=16, noatime,data=ordered
OS: RedHat 7.3, stock 2.4.28, raidtools 1.00.2, glibc 2.2.5, gcc 2.96
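The raw CSV is dense, so here is a tiny helper with the block-write, rewrite,
and block-read fields copied from the runs above; it averages the three runs
per machine and prints the before/after ratios (illustrative only, numbers
hardcoded from the CSV):

/* Illustrative: average the three bonnie++ runs per machine and compare.
   Values are the block-write, rewrite and block-read KB/s fields from the
   CSV lines above. */
#include <stdio.h>

static double avg3(const double v[3])
{
	return (v[0] + v[1] + v[2]) / 3.0;
}

int main(void)
{
	const char *test[3] = { "block write", "rewrite", "block read" };
	/* [test][run], KB/s */
	const double slot_a[3][3] = {
		{ 12862, 13132, 12595 },
		{ 11132, 10231, 11039 },
		{ 44953, 57147, 51269 },
	};
	const double xp[3][3] = {
		{ 41065, 41330, 43364 },
		{ 14636, 15932, 15395 },
		{ 49355, 47045, 52025 },
	};

	for (int t = 0; t < 3; t++) {
		double before = avg3(slot_a[t]);
		double after  = avg3(xp[t]);
		printf("%-11s: %8.0f -> %8.0f KB/s (%.2fx)\n",
		       test[t], before, after, after / before);
	}
	return 0;
}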
# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid5]
read_ahead 1024 sectors
md0 : active raid5 sdc5[2] sdb5[0] sda5[1]
      4192768 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
md1 : active raid5 sdc6[2] sdb6[0] sda6[1]
      2425600 blocks level 5, 32k chunk, algorithm 2 [3/3] [UUU]
md2 : active raid5 sdc7[2] sdb7[0] sda7[1]
      12707200 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
md3 : active raid5 sdc8[2] sdb8[0] sda8[1]
      20113152 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
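For reference, a raidtab stanza consistent with md0 above (3 disks, 64k chunk,
algorithm 2 = left-symmetric, disk order sdb5/sda5/sdc5). This is a sketch
assuming raidtools 1.00.2 syntax, not the raidtab actually used; the other
arrays follow the same pattern:

raiddev /dev/md0
        raid-level              5
        nr-raid-disks           3
        nr-spare-disks          0
        persistent-superblock   1
        parity-algorithm        left-symmetric
        chunk-size              64
        device                  /dev/sdb5
        raid-disk               0
        device                  /dev/sda5
        raid-disk               1
        device                  /dev/sdc5
        raid-disk               2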
# fdisk -l
Disk /dev/sda: 255 heads, 63 sectors, 2501 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1   *         1        13    104391   83  Linux
/dev/sda2            14        45    257040   82  Linux swap
/dev/sda3            46      2500  19719787+   5  Extended
/dev/sda5            46       306   2096451   fd  Linux raid autodetect
/dev/sda6           307       457   1212876   fd  Linux raid autodetect
/dev/sda7           458      1248   6353676   fd  Linux raid autodetect
/dev/sda8          1249      2500  10056658+  fd  Linux raid autodetect

Disk /dev/sdb: 255 heads, 63 sectors, 2501 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdb1   *         1        13    104391   83  Linux
/dev/sdb2            14        45    257040   82  Linux swap
/dev/sdb3            46      2500  19719787+   5  Extended
/dev/sdb5            46       306   2096451   fd  Linux raid autodetect
/dev/sdb6           307       457   1212876   fd  Linux raid autodetect
/dev/sdb7           458      1248   6353676   fd  Linux raid autodetect
/dev/sdb8          1249      2500  10056658+  fd  Linux raid autodetect

Disk /dev/sdc: 255 heads, 63 sectors, 2498 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdc1   *         1        13    104391   83  Linux
/dev/sdc2            14        43    240975   82  Linux swap
/dev/sdc3            44      2498  19719787+   5  Extended
/dev/sdc5            44       304   2096451   fd  Linux raid autodetect
/dev/sdc6           305       455   1212876   fd  Linux raid autodetect
/dev/sdc7           456      1246   6353676   fd  Linux raid autodetect
/dev/sdc8          1247      2498  10056658+  fd  Linux raid autodetect
Cheers,
--