public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "George Spelvin" <linux@horizon.com>
To: torvalds@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, linux@horizon.com, tglx@linutronix.de
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism
Date: 29 Apr 2016 19:31:15 -0400	[thread overview]
Message-ID: <20160429233115.8864.qmail@ns.horizon.com> (raw)
In-Reply-To: <CA+55aFxS9zZ8Gpyg9cn=xgkamx-G6dDZ3DTR2B=iQzXYcAhGoQ@mail.gmail.com>

On Fri, Apr 29, 2016 at 9:32 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
wrote:
> For example, that _long_ range of bits set ("7fffffffc" in the middle)
> is effectively just one bit set with a subtraction. And it's *right*
> in that bit area that is supposed to shuffle bits 14-40 to the high bits
> (which is what we actually *use*. So it effectively shuffles none of those
> bits around at all, and if you have a stride of 4096, your'e pretty much
> done for.

Gee, I recall saying something a lot like that.
> 64 bits is just as bad...  0x9e37fffffffc0001 becomes
> 0x7fffffffc0001, which is 2^51 - 2^18 + 1.

After researching it, I think that the "high bits of a multiply" is
in fact a decent way to do such a hash.  Interestingly, for a randomly
chosen odd multiplier A, the high k bits of the w-bit product A*x is a
universal hash function in the cryptographic sense.  See section 2.3 of
http://arxiv.org/abs/1504.06804


One thing I note is that the advice in the comments to choose a prime
number is misquoting Knuth!  Knuth says (vol. 3 section 6.4) the number
should be *relatively* prime to the word size, which for binary computers
simply means odd.

When we have a hardware multiplier, keeping the Hamming weight low is
a waste of time.  When we don't, clever organization can do
better than the very naive addition/subtraction chain in the
current hash_64().

To multiply by the 32-bit constant 1640531527 = 0x61c88647 (which is
the negative of the golden ratio, so has identical distribution
properties) can be done in 6 shifts + adds, with a critical path
length of 7 operations (3 shifts + 4 adds).

#define GOLDEN_RATIO_32 0x61c88647	/* phi^2 = 1-phi */
/* Returns x * GOLDEN_RATIO_32 without a hardware multiplier */
unsigned hash_32(unsigned x)
{
        unsigned y, z;
				/* Path length */
        y = (x << 19) + x;	/* 1 shift + 1 add */
        z = (x << 9) + y;	/* 1 shift + 2 add */
        x = (x << 23) + z;	/* 1 shift + 3 add */
        z = (z << 8) + y;	/* 2 shift + 3 add */
	x = (x << 6) - x;	/* 2 shift + 4 add */
        return (z << 3) + x;	/* 3 shift + 4 add */
}

Finding a similarly efficient chain for the 64-bit golden ratio
0x9E3779B97F4A7C15 = 11400714819323198485
or
0x61C8864680B583EB = 7046029254386353131

is a bit of a challenge, but algorithms are known.

  parent reply	other threads:[~2016-04-29 23:31 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-29  2:57 [patch 2/7] lib/hashmod: Add modulo based hash mechanism George Spelvin
2016-04-29  3:16 ` Linus Torvalds
2016-04-29  4:12   ` George Spelvin
2016-04-29 23:31   ` George Spelvin [this message]
2016-04-30  0:05     ` Linus Torvalds
2016-04-30  0:32       ` George Spelvin
2016-04-30  1:12         ` Linus Torvalds
2016-04-30  3:04           ` George Spelvin
     [not found] <CA+55aFxBWfAHQNAdBbdVr+z8ror4GVteyce3D3=vwDWxhu5KqQ@mail.gmail.com>
2016-04-30 20:52 ` George Spelvin
2016-05-01  8:35   ` Thomas Gleixner
2016-05-01  9:43     ` George Spelvin
2016-05-01 16:51       ` Linus Torvalds
2016-05-14  3:54         ` George Spelvin
2016-05-14 18:35           ` Linus Torvalds
2016-05-02  7:11       ` Thomas Gleixner
  -- strict thread matches above, loose matches on Subject: below --
2016-04-28 16:42 [patch 0/7] futex: Add support for process private hashing Thomas Gleixner
2016-04-28 16:42 ` [patch 2/7] lib/hashmod: Add modulo based hash mechanism Thomas Gleixner
2016-04-28 18:32   ` Linus Torvalds
2016-04-28 23:26     ` Thomas Gleixner
2016-04-29  2:25       ` Linus Torvalds
2016-04-30 13:02         ` Thomas Gleixner
2016-04-30 16:45           ` Eric Dumazet
2016-04-30 17:12             ` Linus Torvalds
2016-04-30 17:37               ` Eric Dumazet
2016-06-12 12:18         ` Sandy Harris
2016-04-29 21:10     ` Linus Torvalds
2016-04-29 23:51       ` Linus Torvalds
2016-04-30  1:34         ` Rik van Riel
2016-05-02  9:39         ` Torvald Riegel
2016-04-30 15:22       ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160429233115.8864.qmail@ns.horizon.com \
    --to=linux@horizon.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox