Re: crush: straw is dead, long live straw2

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Joe Landman <joe.landman@gmail.com>
To: Sage Weil <sweil@redhat.com>, Thorsten Behrens <tbehrens@suse.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: crush: straw is dead, long live straw2
Date: Fri, 12 Dec 2014 10:39:02 -0500	[thread overview]
Message-ID: <548B0C16.4000509@gmail.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1412120644340.23559@cobra.newdream.net>

On 12/12/2014 09:46 AM, Sage Weil wrote:
> On Fri, 12 Dec 2014, Thorsten Behrens wrote:
>> Sage Weil wrote:
>>> Calculating the ln() function is a bit annoying because it is a floating
>>> point function and CRUSH is all fixed-point arithmetic (integer-based).
>>> The current draft implementation uses a 128 KB lookup table (2 bytes per
>>> entry for 16 bits of input precision).  It seems to be about 25% slower
>>> than the original straw code in my simple microbenchmark, but I'm not sure
>>> that's a number we should trust since it loops through a zillion inputs
>>> and will pull most of the lookup table into the processor caches.
>>>
>>> We could probably try a few others things:
>>>
>> [snip]
>>
>> My Hacker's Delight copy has a number of further variants (on
>> pp. 215), unfortunately only covering log_2 (which is trivially the
>> number of leading zeros in the binary representation) and log_10. But
>> since the logarithms are simply related by a multiplicative constant,
>> something clever might be doable here with a bit of thinking.
> Any log will definitely do; the next steps are to do a division and then
> take the max.  Simply counting 0's isn't precise enough, though.. we need
> a lot more precision than that.

You want integer log?  Looks like ln(x * 2^-16) from your original 
description., with x = ( 0 .. 2^16 )
So your log is going from undefined to 0 (ln 0 is not defined). Fire up 
gnuplot and do a quick "plot [0:1] (log(x)) " to see the shape of the 
function.

It can be a real valued function scaled in a particular way, then we can 
use a nice set of series to calculate it in FP (single precision if you 
like).    Conversion to int should be fast, and we could do this in 
SSE2/AVX to do multiple calcs at the same time.  I would think the 
random value portion of the code would be the slow part. Where are you 
getting your random source?  Could you not start that out with an 
exponential distribution rather than generate your own?

Aside from that, I'd suggest doing something like this (as you don't 
really need the ln(x/65536)/weight term, just the ln(x) term ... as you 
are effectively throwing away the value and just using side effects

  max_x = -1
  max_item = -1
  mult_const = ln(1/65536)/weight
  for each item:
     x = random value from 0..65535
     x = ln(x)
     if x > max_x:
        max_x = x
        max_item = item
  return item

The random value is likely to be an expensive portion of the algorithm 
... possibly have a thread to pre-queue a few thousand random values 
that you queue up in a ring buffer somewhere?

I'd be happy to code up a quick log calculator if that really is the 
most expensive portion of this.  I think we might be able to do the logs 
and even the comparisons in SSEx/AVX if you are open for that (exploit 
some loop unrolling), though we'd need a serial finishing loop to deal 
with the unrollable portion.

>
> The table lookup is (I think) on the order of 50-100 cycles with a cold
> cache; that's what we need to beat!
>
> sage
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

next prev parent reply	other threads:[~2014-12-12 15:39 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-08 23:48 crush: straw is dead, long live straw2 Sage Weil
2014-12-09  0:21 ` Josh Durgin
2014-12-12  9:14 ` Thorsten Behrens
2014-12-12 14:46   ` Sage Weil
2014-12-12 15:39     ` Joe Landman [this message]
2014-12-12 16:20       ` Sage Weil
2014-12-12 16:39         ` Joe Landman
2014-12-12 18:18         ` Joe Landman
2014-12-12 21:29           ` Sage Weil
2014-12-12 15:43     ` Milosz Tanski
2014-12-12 17:27 ` Yehuda Sadeh
2014-12-12 21:42   ` Mark Nelson
2017-01-27  0:26 ` Loic Dachary

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=548B0C16.4000509@gmail.com \
    --to=joe.landman@gmail.com \
    --cc=ceph-devel@vger.kernel.org \
    --cc=sweil@redhat.com \
    --cc=tbehrens@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.