linux-raid.vger.kernel.org archive mirror
* RE: raid6's using not the best bandwidth method && raid6 algo is significantly slower in x86_64.
@ 2008-11-17 22:35 H. Peter Anvin
  2008-11-18 12:03 ` Igor Podlesny
  0 siblings, 1 reply; 16+ messages in thread
From: H. Peter Anvin @ 2008-11-17 22:35 UTC
  To: for.poige+linux; +Cc: linux-raid, neilb

Interesting... Perhaps you could send me your "bad" kernel vmlinux or raid456.ko file, preferably compiled with CONFIG_DEBUG_INFO.

-- 
Sent from my mobile phone (pardon any lack of formatting)


-----Original Message-----
From: Igor Podlesny <for.poige+linux@gmail.com>
Sent: Monday, November 17, 2008 12:56
To: H. Peter Anvin <hpa@zytor.com>
Cc: linux-raid@vger.kernel.org; neilb@suse.de
Subject: Re: raid6's using not the best bandwidth method && raid6 algo is significantly slower in x86_64.

2008/11/17 H. Peter Anvin <hpa@zytor.com>:
> Igor Podlesny wrote:
>>
[...]
>> Has anyone on the list made similar observations? Can the difference
>> in gcc versions matter that much? I doubt it, but I can try building
>> x86_32 with gcc 4.3.1 (as x86_64 was).
>>
>
> The SSE modes have nicer cache behaviours and are therefore preferred
> even if they are slower.
>
My guess was that they're preferable (though I wasn't aware of the exact
reason, thanks!). :-)
>
> It is very odd that your SSE2 modes are that much slower in 64-bit mode.
> It could just be an artifact of the way the test is done (cache
> anomalies?), but I kind of suspect there is something more fishy going on
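
The rule hpa describes above (SSE modes are preferred even when they benchmark slower) can be modelled with a short sketch. This is not the kernel's code: the `Algo` type, the `prefer` values and the tie-breaking rule are assumptions made for illustration, and the throughput figures are the x86_64 numbers from the original post in this thread.

```python
from typing import NamedTuple, Sequence


class Algo(NamedTuple):
    name: str
    mb_per_s: int  # benchmark result as reported in dmesg
    prefer: int    # ASSUMPTION: 1 for SSE variants, 0 for integer variants


# x86_64 benchmark figures from the original post in this thread.
X86_64 = [
    Algo("int64x1", 2722, 0),
    Algo("int64x2", 3660, 0),
    Algo("int64x4", 3265, 0),
    Algo("int64x8", 2593, 0),
    Algo("sse2x1", 1476, 1),
    Algo("sse2x2", 2316, 1),
    Algo("sse2x4", 3175, 1),
]


def pick(algos: Sequence[Algo]) -> Algo:
    """Highest preference wins; speed only breaks ties within one level."""
    return max(algos, key=lambda a: (a.prefer, a.mb_per_s))


def fastest(algos: Sequence[Algo]) -> Algo:
    """Raw benchmark winner, ignoring any preference."""
    return max(algos, key=lambda a: a.mb_per_s)


if __name__ == "__main__":
    print("preference-aware choice:", pick(X86_64).name)
    print("raw fastest:", fastest(X86_64).name)
```

Under these assumptions the preference-aware choice is sse2x4 even though int64x2 has the higher raw figure, which matches the "using algorithm" line in the x86_64 dmesg.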

* raid6's using not the best bandwidth method && raid6 algo is significantly slower in x86_64.
@ 2008-11-16 16:18 Igor Podlesny
  2008-11-17  0:36 ` H. Peter Anvin
  0 siblings, 1 reply; 16+ messages in thread
From: Igor Podlesny @ 2008-11-16 16:18 UTC
  To: linux-raid; +Cc: neilb

Hi!

Recently I decided to try the x86_64 version of my distro of choice
(subject to change).
I prefer to compile my own kernels, so I know that the .configs for the
x86_32 and x86_64 builds are very similar to each other.

The first excerpt is the x86_32 kernel's dmesg (2.6.27.6-1khz i686):

    [    0.000000] Linux version 2.6.27.6-1khz (poige@arch.localdomain) (gcc version 4.2.4)
    [    0.021589] CPU0: AMD Athlon(tm) 64 X2 Dual Core Processor 6000+ stepping 03

    [    0.092818] Total of 2 processors activated (12061.15 BogoMIPS).

    [    0.093095] xor: automatically using best checksumming function: pIII_sse
    [    0.097988]    pIII_sse  :  9164.000 MB/sec
    [    0.098027] xor: using function: pIII_sse (9164.000 MB/sec)

    [    2.392055] raid6: int32x1   1121 MB/s
    [    2.409048] raid6: int32x2   1226 MB/s
    [    2.412224] input: AT Translated Set 2 keyboard as /class/input/input0
    [    2.426040] raid6: int32x4   1191 MB/s
    [    2.443066] raid6: int32x8    882 MB/s
    [    2.460013] raid6: mmxx1     2453 MB/s
    [    2.477014] raid6: mmxx2     4574 MB/s
    [    2.494024] raid6: sse1x1    2441 MB/s
    [    2.511014] raid6: sse1x2    4222 MB/s
    [    2.528013] raid6: sse2x1    4187 MB/s
    [    2.545004] raid6: sse2x2    5562 MB/s
    [    2.545042] raid6: using algorithm sse2x2 (5562 MB/s)

And here is the x86_64 one:

    [    0.000000] Linux version 2.6.27.6-64_1khz (root@archlive) (gcc version 4.3.1 (GCC) )
    [    0.019180] CPU0: AMD Athlon(tm) 64 X2 Dual Core Processor 6000+ stepping 03

    [    0.091750] Total of 2 processors activated (12061.66 BogoMIPS).

    [    0.092073] xor: automatically using best checksumming function: generic_sse
    [    0.096986]    generic_sse:  9192.000 MB/sec
    [    0.097024] xor: using function: generic_sse (9192.000 MB/sec)

    [    2.583571] md: raid0 personality registered for level 0
    [    2.583614] md: raid1 personality registered for level 1
    [    2.600025] raid6: int64x1   2722 MB/s
    [    2.617010] raid6: int64x2   3660 MB/s
    [    2.634006] raid6: int64x4   3265 MB/s
    [    2.651012] raid6: int64x8   2593 MB/s
    [    2.668034] raid6: sse2x1    1476 MB/s
    [    2.685021] raid6: sse2x2    2316 MB/s
    [    2.702022] raid6: sse2x4    3175 MB/s
    [    2.702060] raid6: using algorithm sse2x4 (3175 MB/s)

So, there are two strange things in those dmesgs. The first might be
unrelated to Linux RAID but affects it: have you noticed that on
x86_64 the raid6 algorithm is ~ 50 % slower than on x86_32? Is that
because the code is not well optimized for x86_64 mode? The second: why
is raid6 using algorithm sse2x4 (3175 MB/s) when int64x2 gives
slightly better (~ 15 %) throughput, 3660 MB/s?
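
To make the two percentages above easy to check, here is a small sketch that recomputes them from the dmesg figures. The dictionaries are copied by hand from the excerpts above; the helper name `pct_change` is just an illustrative choice.

```python
# raid6 benchmark results in MB/s, copied from the two dmesg excerpts above.
x86_32 = {"int32x1": 1121, "int32x2": 1226, "int32x4": 1191, "int32x8": 882,
          "mmxx1": 2453, "mmxx2": 4574, "sse1x1": 2441, "sse1x2": 4222,
          "sse2x1": 4187, "sse2x2": 5562}
x86_64 = {"int64x1": 2722, "int64x2": 3660, "int64x4": 3265, "int64x8": 2593,
          "sse2x1": 1476, "sse2x2": 2316, "sse2x4": 3175}


def pct_change(old: float, new: float) -> float:
    """Percentage change going from old to new (negative means slower)."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Algorithm actually selected by each kernel.
    chosen = pct_change(x86_32["sse2x2"], x86_64["sse2x4"])
    print(f"selected algorithm, x86_64 vs x86_32: {chosen:+.1f} %")

    # Algorithms that were benchmarked under both kernels.
    for name in sorted(set(x86_32) & set(x86_64)):
        delta = pct_change(x86_32[name], x86_64[name])
        print(f"{name}, x86_64 vs x86_32: {delta:+.1f} %")

    # Fastest integer variant against the selected SSE2 variant on x86_64.
    gap = pct_change(x86_64["sse2x4"], x86_64["int64x2"])
    print(f"int64x2 vs sse2x4 on x86_64: {gap:+.1f} %")
```

By this arithmetic the selected algorithm is about 43 % slower on x86_64, the SSE2 variants common to both kernels lose more than that, and int64x2 is about 15 % ahead of sse2x4.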

Has anyone on the list made similar observations? Can the difference
in gcc versions matter that much? I doubt it, but I can try building
x86_32 with gcc 4.3.1 (as x86_64 was).

-- 
End of message. Next message?


end of thread, other threads: [~2008-12-05 17:34 UTC]

Thread overview: 16+ messages
2008-11-17 22:35 raid6's using not the best bandwidth method && raid6 algo is significantly slower in x86_64 H. Peter Anvin
2008-11-18 12:03 ` Igor Podlesny
2008-11-18 15:47   ` H. Peter Anvin
2008-11-21 19:22     ` Igor Podlesny
2008-11-21 19:31       ` H. Peter Anvin
2008-11-21 19:33         ` Igor Podlesny
2008-11-21 20:15           ` H. Peter Anvin
2008-11-22  5:40             ` Igor Podlesny
2008-11-22  5:42               ` H. Peter Anvin
2008-11-22  5:45                 ` Igor Podlesny
2008-11-23  1:12                   ` John Robinson
2008-12-05 13:36                   ` Igor Podlesny
2008-12-05 17:34                     ` H. Peter Anvin
  -- strict thread matches above, loose matches on Subject: below --
2008-11-16 16:18 Igor Podlesny
2008-11-17  0:36 ` H. Peter Anvin
2008-11-17 20:56   ` Igor Podlesny
