Accelerating crush with SIMD

* Accelerating crush with SIMD
@ 2016-08-28 17:59 Loic Dachary
  2016-08-29  4:28 ` Gregory Farnum
  0 siblings, 1 reply; 5+ messages in thread
From: Loic Dachary @ 2016-08-28 17:59 UTC
  To: Ceph Development

Hi,

Could we significantly accelerate crush with SIMD instructions? I don't remember the idea being discussed, but maybe I missed it.

Cheers
-- 
Loïc Dachary, Artisan Logiciel Libre

* Re: Accelerating crush with SIMD
  2016-08-28 17:59 Accelerating crush with SIMD Loic Dachary
@ 2016-08-29  4:28 ` Gregory Farnum
  2016-08-29  8:56   ` Loic Dachary
  0 siblings, 1 reply; 5+ messages in thread
From: Gregory Farnum @ 2016-08-29  4:28 UTC
  To: Loic Dachary; +Cc: Ceph Development

On Sun, Aug 28, 2016 at 10:59 AM, Loic Dachary <loic@dachary.org> wrote:
> Hi,
>
> Could we significantly accelerate crush with SIMD instructions? I don't remember the idea being discussed, but maybe I missed it.

I think it was attempted, but using a lookup table method turned out
to be much faster. Sage did some prototyping and then some folks from
Intel did a lot of heavy optimization; I'd be surprised if anybody
managed to speed up the CRUSH calculations much at this point (at
least, without changing the fundamental math involved).

Sorry I can't be more detailed; the actual CRUSH implementation is
something I've largely left alone. I imagine the optimization points
become pretty clear running git blame or something though. ;)
-Greg

* Re: Accelerating crush with SIMD
  2016-08-29  4:28 ` Gregory Farnum
@ 2016-08-29  8:56   ` Loic Dachary
  2016-08-29  9:16     ` Piotr Dałek
  2016-08-29 18:15     ` Gregory Farnum
  0 siblings, 2 replies; 5+ messages in thread
From: Loic Dachary @ 2016-08-29  8:56 UTC
  To: Gregory Farnum; +Cc: Ceph Development

Hi Greg,

On 29/08/2016 06:28, Gregory Farnum wrote:
> On Sun, Aug 28, 2016 at 10:59 AM, Loic Dachary <loic@dachary.org> wrote:
>> Hi,
>>
>> Could we significantly accelerate crush with SIMD instructions? I don't remember the idea being discussed, but maybe I missed it.
> 
> I think it was attempted, but using a lookup table method turned out
> to be much faster. Sage did some prototyping and then some folks from
> Intel did a lot of heavy optimization; I'd be surprised if anybody
> managed to speed up the CRUSH calculations much at this point (at
> least, without changing the fundamental math involved).
> 
> Sorry I can't be more detailed; the actual CRUSH implementation is
> something I've largely left alone. I imagine the optimization points
> become pretty clear running git blame or something though. ;)

I was not thinking of accelerating the crush hash function or the straw2 function themselves, but of running them on 4/8/16 items at a time using _mm, _mm256 or _mm512 instructions[1], when possible. I'll put together a proof of concept later today to clarify what I have in mind.
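
To make the idea concrete, here is a minimal sketch, not the proof of concept itself. It assumes a made-up per-item hash (toy_hash is hypothetical and is NOT the real crush hash or straw2) and only shows how the same arithmetic could be applied to four items in one pass with 128-bit _mm intrinsics. On a build without SSE2 it falls back to the scalar loop.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2: four 32-bit integer lanes per register */
#endif

/* Hypothetical stand-in for a per-item hash; NOT the real CRUSH hash. */
uint32_t toy_hash(uint32_t x, uint32_t item)
{
    uint32_t h = x ^ item;
    h -= item;
    h ^= h >> 13;
    h -= x;
    h ^= h << 8;
    return h;
}

/* Scalar reference: hashes the four items one after the other. */
void toy_hash_scalar(uint32_t x, const uint32_t items[4], uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = toy_hash(x, items[i]);
}

/* Same arithmetic, applied to all four items in one pass. */
void toy_hash_x4(uint32_t x, const uint32_t items[4], uint32_t out[4])
{
#if defined(__SSE2__)
    __m128i vx = _mm_set1_epi32((int)x);                  /* broadcast x */
    __m128i vi = _mm_loadu_si128((const __m128i *)items); /* load 4 items */
    __m128i h = _mm_xor_si128(vx, vi);
    h = _mm_sub_epi32(h, vi);                     /* wraps like uint32_t */
    h = _mm_xor_si128(h, _mm_srli_epi32(h, 13));  /* logical shift right */
    h = _mm_sub_epi32(h, vx);
    h = _mm_xor_si128(h, _mm_slli_epi32(h, 8));
    _mm_storeu_si128((__m128i *)out, h);
#else
    toy_hash_scalar(x, items, out);               /* portable fallback */
#endif
}

/* Sanity check: every lane must equal the scalar result. */
void check_x4_matches_scalar(void)
{
    const uint32_t items[4] = {1u, 2u, 0xdeadbeefu, 0xffffffffu};
    uint32_t a[4], b[4];

    toy_hash_scalar(12345u, items, a);
    toy_hash_x4(12345u, items, b);
    for (int i = 0; i < 4; i++)
        assert(a[i] == b[i]);
}
```

The _mm256 and _mm512 variants would follow the same pattern with 8 and 16 lanes; whether this pays off for the real functions is exactly what a proof of concept would have to measure.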

Cheers

[1] https://software.intel.com/sites/landingpage/IntrinsicsGuide/
-- 
Loïc Dachary, Artisan Logiciel Libre

* Re: Accelerating crush with SIMD
  2016-08-29  8:56   ` Loic Dachary
@ 2016-08-29  9:16     ` Piotr Dałek
  2016-08-29 18:15     ` Gregory Farnum
  1 sibling, 0 replies; 5+ messages in thread
From: Piotr Dałek @ 2016-08-29  9:16 UTC
  To: Ceph Development

On Mon, Aug 29, 2016 at 10:56:47AM +0200, Loic Dachary wrote:
> Hi Greg,
> 
> On 29/08/2016 06:28, Gregory Farnum wrote:
> > On Sun, Aug 28, 2016 at 10:59 AM, Loic Dachary <loic@dachary.org> wrote:
> >> Hi,
> >>
> >> Could we significantly accelerate crush with SIMD instructions? I don't remember the idea being discussed, but maybe I missed it.
> > 
> > I think it was attempted, but using a lookup table method turned out
> > to be much faster. Sage did some prototyping and then some folks from
> > Intel did a lot of heavy optimization; I'd be surprised if anybody
> > managed to speed up the CRUSH calculations much at this point (at
> > least, without changing the fundamental math involved).
> > 
> > Sorry I can't be more detailed; the actual CRUSH implementation is
> > something I've largely left alone. I imagine the optimization points
> > become pretty clear running git blame or something though. ;)
> 
> I was not thinking of accelerating the crush hash function or the straw2 function themselves, but of running them on 4/8/16 items at a time using _mm, _mm256 or _mm512 instructions[1], when possible. I'll put together a proof of concept later today to clarify what I have in mind.
> 
> Cheers
> 
> [1] https://software.intel.com/sites/landingpage/IntrinsicsGuide/

Last time I checked, it didn't make sense, as the crush functions were
already fast enough and there was little room for parallelizing the
calculations. It *is* possible, but it requires careful rework of every
part that actually uses them. Note that just calculating 4/8/16 hashes at
once doesn't mean an instant benefit, as the calculation is only part of the
story: you also need to pack data from the source and unpack it to the
destination, and this takes time too. Also, I don't think Ceph does enough
crush recalculations per second to make such a rework worthwhile - but feel
free to prove me wrong.
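
To illustrate the pack/unpack point, here is a rough sketch under stated assumptions: struct toy_item and toy_draw_x4 are hypothetical names (the real crush structures differ), and the arithmetic is a placeholder rather than the real hash. Because the ids sit interleaved with other fields, they have to be copied into a contiguous buffer before the SIMD load, and the results copied back afterwards. Without SSE2 the same arithmetic runs as a scalar loop.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical bucket entry: the ids are NOT contiguous in memory. */
struct toy_item {
    uint32_t id;
    uint32_t weight;
    uint32_t draw;   /* result written back per item */
};

void toy_draw_x4(struct toy_item it[4], uint32_t x)
{
    uint32_t ids[4], out[4];

    /* Pack: gather the interleaved ids into contiguous lanes. */
    for (int i = 0; i < 4; i++)
        ids[i] = it[i].id;

#if defined(__SSE2__)
    /* Compute: placeholder arithmetic on four lanes at once. */
    __m128i v = _mm_loadu_si128((const __m128i *)ids);
    v = _mm_xor_si128(v, _mm_set1_epi32((int)x));
    v = _mm_add_epi32(v, _mm_slli_epi32(v, 3));
    _mm_storeu_si128((__m128i *)out, v);
#else
    for (int i = 0; i < 4; i++) {
        uint32_t h = ids[i] ^ x;
        out[i] = h + (h << 3);
    }
#endif

    /* Unpack: scatter the results back into the structures. */
    for (int i = 0; i < 4; i++)
        it[i].draw = out[i];
}
```

In this toy the two copy loops touch as much memory as the computation itself, which is the overhead in question; a layout that keeps ids contiguous would avoid it, but that is the kind of rework mentioned above.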

Best regards,

-- 
Piotr Dałek
branch@predictor.org.pl
http://blog.predictor.org.pl

* Re: Accelerating crush with SIMD
  2016-08-29  8:56   ` Loic Dachary
  2016-08-29  9:16     ` Piotr Dałek
@ 2016-08-29 18:15     ` Gregory Farnum
  1 sibling, 0 replies; 5+ messages in thread
From: Gregory Farnum @ 2016-08-29 18:15 UTC
  To: Loic Dachary; +Cc: Ceph Development

On Mon, Aug 29, 2016 at 1:56 AM, Loic Dachary <loic@dachary.org> wrote:
> Hi Greg,
>
> On 29/08/2016 06:28, Gregory Farnum wrote:
>> On Sun, Aug 28, 2016 at 10:59 AM, Loic Dachary <loic@dachary.org> wrote:
>>> Hi,
>>>
>>> Could we significantly accelerate crush with SIMD instructions? I don't remember the idea being discussed, but maybe I missed it.
>>
>> I think it was attempted, but using a lookup table method turned out
>> to be much faster. Sage did some prototyping and then some folks from
>> Intel did a lot of heavy optimization; I'd be surprised if anybody
>> managed to speed up the CRUSH calculations much at this point (at
>> least, without changing the fundamental math involved).
>>
>> Sorry I can't be more detailed; the actual CRUSH implementation is
>> something I've largely left alone. I imagine the optimization points
>> become pretty clear running git blame or something though. ;)
>
> I was not thinking of accelerating the crush hash function or the straw2 function themselves, but of running them on 4/8/16 items at a time using _mm, _mm256 or _mm512 instructions[1], when possible. I'll put together a proof of concept later today to clarify what I have in mind.

I know this moved threads, but now I get it and that sounds cool. :)
*thumbs up*
-Greg
