linux-raid.vger.kernel.org archive mirror
From: David Brown <david.brown@hesbynett.no>
To: linux-raid@vger.kernel.org
Subject: Re: Triple-parity raid6
Date: Sun, 12 Jun 2011 11:05:40 +0200
Message-ID: <it1vh4$3fh$1@dough.gmane.org>
In-Reply-To: <4DF39E8B.6090106@gmail.com>

On 11/06/11 18:57, Joe Landman wrote:
> On 06/11/2011 12:31 PM, David Brown wrote:
>
>> What has changed over the years is that there is no longer such a need
>> for manual assembly code to get optimal speed out of the cpu. While
>
> Hmmm ... I've done studies on this using an incredibly simple function
> (Riemann Zeta Function c.f. http://scalability.org/?p=470 ). The short
> version is that hand optimized SSE2 is ~4x faster (for this case) than
> best optimization of high level code. Hand optimized assembler is even
> better.
>
>> writing such assembly is fun, it is time-consuming to write and hard to
>> maintain, especially for code that must run on so many different
>> platforms.
>
> Yes, it is generally hard to write and maintain. But with it you can get the
> rest of the language semantics out of the way. If you look at the tests
> that Linux does when it starts up, you can see a fairly wide
> distribution in the performance.
>
> raid5: using function: generic_sse (13356.000 MB/sec)
> raid6: int64x1 3507 MB/s
> raid6: int64x2 3886 MB/s
> raid6: int64x4 3257 MB/s
> raid6: int64x8 3054 MB/s
> raid6: sse2x1 8347 MB/s
> raid6: sse2x2 9695 MB/s
> raid6: sse2x4 10972 MB/s
>
> Some of these are hand coded assembly. See
> ${KERNEL_SOURCE}/drivers/md/raid6sse2.c and look at the
> raid6_sse24_gen_syndrome code.
>
> Really, getting the best performance out of the system requires a fairly
> deep understanding of how the processor/memory system operates. These
> functions do use the SSE registers, but only so many SSE operations can
> be in flight at once. These processors can generally keep quite a few
> operations in flight simultaneously, so knowing that limit, the mix of
> operations, and how they interact with the instruction scheduler in the
> hardware is fairly essential to getting good performance.
>

I am not suggesting that hand-coding assembly won't make the 
calculations faster - just that better compiler optimisations (which 
will automatically make use of sse instructions) will make the generic 
code closer to the theoretical maximum.
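
To make "generic code" concrete, here is a minimal byte-at-a-time sketch
of a RAID-6 P/Q syndrome calculation in portable C. It is not the kernel's
implementation: the names gf_mul2() and gen_syndrome_sketch() are
hypothetical, and the sketch assumes the usual GF(2^8) representation
with polynomial 0x11d and generator 2. A loop of this shape is the kind
of thing a vectorising compiler has a chance of turning into SSE code on
its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing by the polynomial 0x11d. */
static inline uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0x00));
}

/*
 * Hypothetical helper, for illustration only.
 * P = D0 + D1 + ... + D(n-1)              (plain XOR)
 * Q = D0 + 2*D1 + ... + 2^(n-1)*D(n-1)    (Horner's rule in GF(2^8))
 */
static void gen_syndrome_sketch(int ndisks, size_t len,
                                const uint8_t *const *data,
                                uint8_t *p, uint8_t *q)
{
    assert(ndisks >= 1);

    for (size_t i = 0; i < len; i++) {
        uint8_t wp = data[ndisks - 1][i];
        uint8_t wq = wp;

        /* Walk from the highest data disk down to disk 0. */
        for (int d = ndisks - 2; d >= 0; d--) {
            wp ^= data[d][i];
            wq = gf_mul2(wq) ^ data[d][i];
        }
        p[i] = wp;
        q[i] = wq;
    }
}
```

Calling gen_syndrome_sketch(3, len, data, p, q) with three data buffers
fills p and q with one parity byte each per data byte.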

Out of curiosity, have you re-tried your zeta function code using a more 
modern version of gcc?  A lot has happened with gcc since 4.1 - in 
particular, the "graphite" code in gcc 4.4 will make a big difference to 
code that loops through a lot of data (it re-arranges the loops to 
unroll inner blocks, and to make loop strides match cache sizes).
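
As a rough illustration of the kind of loop re-arrangement meant here,
the sketch below sums a row-major array twice: once walking it column by
column (a large stride between accesses), and once with the two loops
interchanged by hand. The function names are hypothetical. Whether gcc
performs this interchange by itself depends on the version, on whether
it was built with Graphite support, and on flags such as
-floop-interchange, -floop-strip-mine and -floop-block, so treat this as
a picture of the transformation rather than a promise about any
particular compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Column-by-column walk: consecutive accesses are 'cols' ints apart. */
long sum_column_order(const int *a, size_t rows, size_t cols)
{
    assert(a != NULL);
    long total = 0;

    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            total += a[i * cols + j];
    return total;
}

/* Same sum with the loops interchanged: accesses are now contiguous. */
long sum_row_order(const int *a, size_t rows, size_t cols)
{
    assert(a != NULL);
    long total = 0;

    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += a[i * cols + j];
    return total;
}
```

Both functions return the same value; only the memory access order
differs, which is what matters once the array no longer fits in cache.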


>>
>>> We are interested in working on this capability (and more generic
>>> capability) as well.
>>>
>>> Is anyone in particular starting to design/code this? Please let me
>>> know.
>>>
>>
>> Well, I am currently trying to write up some of the maths - I started
>> the thread because I had been playing around with the maths, and thought
>> it should work. I made a brief stab at writing a
>> "raid7_int$#_gen_syndrome()" function, but I haven't done any testing
>> with it (or even tried to compile it) - first I want to be sure of the
>> algorithms.
>
> I've been coding various bits as "pseudocode" using Octave. Makes
> checking with the built-in Galois functions pretty easy.
>
> I haven't looked at the math behind the triple parity syndrome calc yet,
> though I'd imagine someone has, and can write it down. If someone hasn't
> done that yet, it's a good first step. Then we can code the simple
> version from there with test drivers/cases, and then start optimizing
> the implementation.
>
>
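
For reference, a "simple version" of a triple-parity syndrome calculation
could look like the sketch below. It rests on assumptions that this
message does not confirm: that the third syndrome R uses powers of 4
(the square of the RAID-6 generator 2) in the same GF(2^8) field with
polynomial 0x11d, and that a byte-at-a-time loop is good enough as a
starting point. The names gf_mul2() and gen_syndrome3_sketch() are
hypothetical. It only generates P, Q and R; it does no recovery and has
been checked only against a few hand-computed bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing by the polynomial 0x11d. */
static inline uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0x00));
}

/*
 * Hypothetical helper, for illustration only.
 * P = sum of D_i            (plain XOR)
 * Q = sum of 2^i * D_i      (as in RAID-6)
 * R = sum of 4^i * D_i      (assumed third syndrome)
 * All three are evaluated with Horner's rule from the highest disk down.
 */
static void gen_syndrome3_sketch(int ndisks, size_t len,
                                 const uint8_t *const *data,
                                 uint8_t *p, uint8_t *q, uint8_t *r)
{
    assert(ndisks >= 1);

    for (size_t i = 0; i < len; i++) {
        uint8_t wp = data[ndisks - 1][i];
        uint8_t wq = wp;
        uint8_t wr = wp;

        for (int d = ndisks - 2; d >= 0; d--) {
            uint8_t wd = data[d][i];

            wp ^= wd;
            wq = gf_mul2(wq) ^ wd;
            wr = gf_mul2(gf_mul2(wr)) ^ wd;  /* multiply by 4, then add */
        }
        p[i] = wp;
        q[i] = wq;
        r[i] = wr;
    }
}
```

A test driver can compare its output against values computed
independently, for example with Octave's Galois field functions.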



