From: David Brown <david.brown@hesbynett.no>
To: linux-raid@vger.kernel.org
Subject: Re: Triple-parity raid6
Date: Sun, 12 Jun 2011 11:05:40 +0200
Message-ID: <it1vh4$3fh$1@dough.gmane.org>
In-Reply-To: <4DF39E8B.6090106@gmail.com>

On 11/06/11 18:57, Joe Landman wrote:
> On 06/11/2011 12:31 PM, David Brown wrote:
>
>> What has changed over the years is that there is no longer such a need
>> for manual assembly code to get optimal speed out of the cpu. While
>
> Hmmm ... I've done studies on this using an incredibly simple function
> (Riemann Zeta Function c.f. http://scalability.org/?p=470 ). The short
> version is that hand optimized SSE2 is ~4x faster (for this case) than
> best optimization of high level code. Hand optimized assembler is even
> better.
>
>> writing such assembly is fun, it is time-consuming to write and hard to
>> maintain, especially for code that must run on so many different
>> platforms.
>
> Yes, it is generally hard to write and maintain. But with it you can get the
> rest of the language semantics out of the way. If you look at the tests
> that Linux does when it starts up, you can see a fairly wide
> distribution in the performance.
>
> raid5: using function: generic_sse (13356.000 MB/sec)
> raid6: int64x1 3507 MB/s
> raid6: int64x2 3886 MB/s
> raid6: int64x4 3257 MB/s
> raid6: int64x8 3054 MB/s
> raid6: sse2x1 8347 MB/s
> raid6: sse2x2 9695 MB/s
> raid6: sse2x4 10972 MB/s
>
> Some of these are hand coded assembly. See
> ${KERNEL_SOURCE}/drivers/md/raid6sse2.c and look at the
> raid6_sse24_gen_syndrome code.
>
> Really, getting the best performance out of the system requires a fairly
> deep understanding of how the processor/memory system operates. These
> functions do use the SSE registers, but we can have only so many SSE
> operations in flight at once. These processors can generally have quite
> a few operations in flight simultaneously, so knowledge of that, of the
> mix of operations, and of how they interact with the instruction
> scheduler in the hardware is fairly essential to getting good
> performance.
>
I am not suggesting that hand-coding assembly won't make the
calculations faster - just that better compiler optimisations (which
will automatically make use of SSE instructions) will bring the generic
code closer to the theoretical maximum.

Out of curiosity, have you re-tried your zeta function code using a more
modern version of gcc? A lot has happened with gcc since 4.1 - in
particular, the "graphite" code in gcc 4.4 can make a big difference to
code that loops through a lot of data (it re-arranges the loops to
unroll inner blocks, and to make loop strides match cache sizes).
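
To make "generic code" concrete, here is a minimal sketch of a portable
P/Q syndrome generator of the kind a compiler could be left to optimise.
It assumes the usual RAID-6 field, GF(2^8) with polynomial 0x11d and
generator 2. The function names are hypothetical and this is not the
kernel's code, though the word-wide variant is in the same spirit as the
int64 routines in the boot log quoted above. Whether a given gcc version
turns these loops into SSE instructions is something to measure, not
something the sketch shows.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Multiply one GF(2^8) element by 2, reducing by the polynomial 0x11d. */
static inline uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0x00));
}

/* The same multiply applied to eight packed bytes at once. */
static inline uint64_t gf_mul2_x8(uint64_t v)
{
    uint64_t high = v & 0x8080808080808080ULL;  /* top bit of each byte */
    uint64_t mask = (high << 1) - (high >> 7);  /* 0xff where it was set */
    return ((v << 1) & 0xfefefefefefefefeULL) ^
           (mask & 0x1d1d1d1d1d1d1d1dULL);
}

/* Byte-at-a-time reference: P = sum of D_d, Q = sum of 2^d * D_d. */
static void gen_syndrome_bytes(int ndata, size_t len,
                               const uint8_t *const *data,
                               uint8_t *p, uint8_t *q)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t wp = data[ndata - 1][i];
        uint8_t wq = wp;
        /* Horner's rule, working down from the highest data disk. */
        for (int d = ndata - 2; d >= 0; d--) {
            wp ^= data[d][i];
            wq = (uint8_t)(gf_mul2(wq) ^ data[d][i]);
        }
        p[i] = wp;
        q[i] = wq;
    }
}

/* Eight bytes per step; len is assumed to be a multiple of 8. */
static void gen_syndrome_words(int ndata, size_t len,
                               const uint8_t *const *data,
                               uint8_t *p, uint8_t *q)
{
    for (size_t i = 0; i + 8 <= len; i += 8) {
        uint64_t wp, wq, wd;
        memcpy(&wp, data[ndata - 1] + i, 8);  /* alignment-safe load */
        wq = wp;
        for (int d = ndata - 2; d >= 0; d--) {
            memcpy(&wd, data[d] + i, 8);
            wp ^= wd;
            wq = gf_mul2_x8(wq) ^ wd;
        }
        memcpy(p + i, &wp, 8);
        memcpy(q + i, &wq, 8);
    }
}
```

Both routines should produce identical P and Q blocks, so the byte
version can serve as a reference when benchmarking or checking a faster
one.
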
>>
>>> We are interested in working on this capability (and more generic
>>> capability) as well.
>>>
>>> Is anyone in particular starting to design/code this? Please let me
>>> know.
>>>
>>
>> Well, I am currently trying to write up some of the maths - I started
>> the thread because I had been playing around with the maths, and thought
>> it should work. I made a brief stab at writing a
>> "raid7_int$#_gen_syndrome()" function, but I haven't done any testing
>> with it (or even tried to compile it) - first I want to be sure of the
>> algorithms.
>
> I've been coding various bits as "pseudocode" using Octave. Makes
> checking with the built in Galois functions pretty easy.
>
> I haven't looked at the math behind the triple parity syndrome calc yet,
> though I'd imagine someone has, and can write it down. If someone hasn't
> done that yet, it's a good first step. Then we can code the simple
> version from there with test drivers/cases, and then start optimizing
> the implementation.
>
>
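
As a starting point for such a "simple version with test cases", a
byte-wise triple-syndrome generator might look like the sketch below.
It assumes the RAID-6 field (GF(2^8), polynomial 0x11d) with P and Q
defined as in raid6. The third syndrome R, built here with generator 4,
is an assumption made for illustration and not an agreed design.
gen_syndrome3() and check_small_case() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply one GF(2^8) element by 2, reducing by the polynomial 0x11d. */
static inline uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0x00));
}

/* ASSUMPTION: the third syndrome uses generator 4, i.e. 2 * 2. */
static inline uint8_t gf_mul4(uint8_t v)
{
    return gf_mul2(gf_mul2(v));
}

/*
 * P = sum of D_d, Q = sum of 2^d * D_d, R = sum of 4^d * D_d,
 * all evaluated with Horner's rule from the highest data disk down.
 */
static void gen_syndrome3(int ndata, size_t len,
                          const uint8_t *const *data,
                          uint8_t *p, uint8_t *q, uint8_t *r)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t wp = data[ndata - 1][i];
        uint8_t wq = wp;
        uint8_t wr = wp;
        for (int d = ndata - 2; d >= 0; d--) {
            uint8_t wd = data[d][i];
            wp ^= wd;
            wq = (uint8_t)(gf_mul2(wq) ^ wd);
            wr = (uint8_t)(gf_mul4(wr) ^ wd);
        }
        p[i] = wp;
        q[i] = wq;
        r[i] = wr;
    }
}

/* Tiny hand-checked case: two one-byte "disks" holding 0x01 and 0x80. */
static int check_small_case(void)
{
    const uint8_t d0[1] = { 0x01 };
    const uint8_t d1[1] = { 0x80 };
    const uint8_t *disks[2] = { d0, d1 };
    uint8_t p, q, r;

    gen_syndrome3(2, 1, disks, &p, &q, &r);
    /* P = 01^80 = 81, Q = 01^(2*80) = 01^1d, R = 01^(4*80) = 01^3a */
    return p == 0x81 && q == 0x1c && r == 0x3b;
}
```

The hand-checked case only confirms that the arithmetic matches the
definitions. It says nothing about whether every combination of three
failed disks can be recovered, which is the part the maths has to
establish first.
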