From: Linus Torvalds <torvalds@linux-foundation.org>
To: Stephen Hemminger <shemminger@vyatta.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
Stephen Hemminger <stephen.hemminger@vyatta.com>,
Andrew Morton <akpm@linux-foundation.org>,
Octavian Purdila <opurdila@ixiacom.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH] dcache: better name hash function
Date: Tue, 27 Oct 2009 16:41:52 -0700 (PDT) [thread overview]
Message-ID: <alpine.LFD.2.01.0910271636100.31845@localhost.localdomain> (raw)
In-Reply-To: <20091027160822.384dc109@nehalam>
On Tue, 27 Oct 2009, Stephen Hemminger wrote:
>
> Going back to basics. Run tests across different input sets.
> Dropping off the slow ones like crc, md5, ...
> Not using jhash because it doesn't have good character at a time
> interface.
>
> Also, the folding algorithm used matters. Since the kernel
> already uses hash_long() to fold back to N bits, all the
> tests were rerun with that.
Yeah, the 'hash_long()' folding matters for anything that doesn't multiply
big some big number to spread the bits out, because otherwise the bits
from the last character hashed will always be in the low bits.
That explains why our current hash looked so bad with your previous code.
>From your numbers, I think we can dismiss 'kr_hash()' as having horrible
behavior with names like pppXXX (and that isn't just a special case: it's
also noticeably worse for your /home directory case, which means that the
bad behavior shows up in practice too, not just in some special cases).
'elf' and 'pjw' don't have quite the same bad case, but the stddev for the
pppXXX cases are still clearly worse than the other alternatives. They
also seem to always be slower than what we already have.
The 'fnv32' algorithm gets fairly good behavior, but seems bad on Itanium.
Looks like it depends on a fast multiplication unit, and all even your
"slow" ULV chip seems to be a Core2 one, so all your x86 targets have
that. And our current name hash still actually seems to do better in all
cases (maybe I missed some case) even if fnv32 is slightly faster on x86.
>From your list 'string10' seems to get consistently good results and is at
or near the top of performance too. It seems to be the one that
consistently beats 'full_name_hash()' both in performance and in behavior
(string_hash17/31 come close, but aren't as clearly better performing).
But I haven't actually seen the hashes. Maybe there's something that makes
string10 bad?
Regardless, one thing your new numbers do say is that our current hash
really isn't that bad.
Linus
next prev parent reply other threads:[~2009-10-27 23:41 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <9986527.24561256620662709.JavaMail.root@tahiti.vyatta.com>
2009-10-27 5:19 ` [PATCH] dcache: better name hash function Stephen Hemminger
2009-10-27 5:24 ` David Miller
2009-10-27 17:22 ` [PATCH] net: fold network name hash Stephen Hemminger
2009-10-27 18:02 ` Octavian Purdila
2009-10-27 22:04 ` [PATCH] net: fold network name hash (v2) Stephen Hemminger
2009-10-28 6:07 ` Eric Dumazet
2009-10-28 9:28 ` David Miller
2009-10-28 15:57 ` Stephen Hemminger
2009-10-27 6:07 ` [PATCH] dcache: better name hash function Eric Dumazet
2009-10-27 6:50 ` Eric Dumazet
2009-10-27 7:29 ` Eric Dumazet
2009-10-27 17:07 ` Stephen Hemminger
2009-10-27 17:32 ` Linus Torvalds
2009-10-27 23:08 ` Stephen Hemminger
2009-10-27 23:41 ` Linus Torvalds [this message]
2009-10-28 0:10 ` Stephen Hemminger
2009-10-28 0:58 ` Linus Torvalds
2009-10-28 1:56 ` Stephen Hemminger
[not found] ` <4AE72B91.7040700@gmail.com>
2009-10-27 17:35 ` Stephen Hemminger
2009-10-25 19:58 [PATCH next-next-2.6] netdev: better dev_name_hash Octavian Purdila
2009-10-26 4:43 ` Stephen Hemminger
2009-10-26 22:36 ` [PATCH] dcache: better name hash function Stephen Hemminger <shemminger@vyatta.com>, Al Viro
2009-10-27 2:45 ` Eric Dumazet
2009-10-27 3:53 ` Stephen Hemminger
2009-10-27 16:38 ` Rick Jones
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.01.0910271636100.31845@localhost.localdomain \
--to=torvalds@linux-foundation.org \
--cc=akpm@linux-foundation.org \
--cc=eric.dumazet@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=opurdila@ixiacom.com \
--cc=shemminger@vyatta.com \
--cc=stephen.hemminger@vyatta.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).