From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: "Michael K. Edwards" <medwards.linux@gmail.com>,
David Miller <davem@davemloft.net>,
akepner@sgi.com, linux@horizon.com, netdev@vger.kernel.org,
bcrl@kvack.org
Subject: Re: Extensible hashing and RCU
Date: Tue, 20 Feb 2007 22:06:34 +0300
Message-ID: <20070220190634.GA12193@2ka.mipt.ru>
In-Reply-To: <200702201955.15567.dada1@cosmosbay.com>
On Tue, Feb 20, 2007 at 07:55:15PM +0100, Eric Dumazet (dada1@cosmosbay.com) wrote:
> On Tuesday 20 February 2007 19:00, Evgeniy Polyakov wrote:
> > As you can see, jhash has problems in a really trivial case of 2^16,
> > which in local lan is a disaster. The only reason jenkins hash is good
> > for hashing purposes is that it is more complex than xor one, and thus
> > it is harder to find a collision law. That's all.
> > And it is two times slower.
>
> I see no problems at all. An attacker can not exploit the fact that two (or
> three) different values of sport will hit the same hash bucket.
>
> A hash function may have collisions. This is *designed* like that.
>
> The complexity of the hash function is a tradeoff. A perfect hash would be :
> - Perfect distribution
> - Hard (or even : impossible) to guess for an attacker.
> - No CPU cost.
>
> There is no perfect hash function... given 96 bits in input.
> So what ? hashes are 'badly broken' ?
> Thats just not even funny Evgeniy.
The Jenkins hash has a _worse_ distribution than the xor one.
_That_ is what is bad, not the fact that a hash has collisions.
hash(val) = val >> 16;
is a hash too, and it has an even worse distribution - so it is designed
even worse, and so we do not use it.
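To make the `val >> 16` point concrete, here is a minimal sketch that counts how many buckets get used and how long the longest chain is. The names `shift_hash`, `xor_fold_hash` and `bucket_stats` are hypothetical, the xor fold is only an illustrative stand-in and not the kernel's actual hash, and the key set (all 2^16 values of a 16-bit port) is an assumption.

```python
from collections import Counter

def shift_hash(val: int) -> int:
    """The deliberately poor hash from the text: keep only the high 16 bits."""
    return (val & 0xFFFFFFFF) >> 16

def xor_fold_hash(val: int) -> int:
    """Hypothetical xor-style hash: fold the high half onto the low half."""
    val &= 0xFFFFFFFF
    return (val ^ (val >> 16)) & 0xFFFF

def bucket_stats(hash_fn, keys, nbuckets):
    """Return (used_buckets, longest_chain) for keys hashed into nbuckets."""
    counts = Counter(hash_fn(k) % nbuckets for k in keys)
    return len(counts), max(counts.values())

# Assumed input: every value of a 16-bit port, with all other bits zero.
keys = range(1 << 16)

print(bucket_stats(shift_hash, keys, 1 << 16))     # (1, 65536)
print(bucket_stats(xor_fold_hash, keys, 1 << 16))  # (65536, 1)
```

With this input the shift hash puts every key into a single bucket, while the fold spreads them one per bucket. Other inputs will of course behave differently; the sketch only shows how such a distribution can be measured.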
> The 'two times slower' is a simplistic view, or maybe you have an alien CPU,
> or a CPU from the past ?
It is a Core Duo at 3.7 GHz.
The timings are printed by the test I posted to the list.
> On my oprofile, rt_hash_code() uses 0.24% of cpu (about 50 x86_64
> instructions)
>
> Each time a cache miss is done because your bucket length is (X+1) instead of
> (X), your CPU is stuck while it could have done 150 instructions. Next CPUs
> will do 300 instructions per cache miss, maybe 1000 one day... yes, life is
> hard.
>
> I added to my 'simulator_plugged_on_real_server' the average cost calculation,
> relative to number of cache line per lookup.
>
> ehash_size=2^20
> xor hash :
> 386290 sockets, Avg lookup cost=3.2604 cache lines/lookup
> 393667 sockets, Avg lookup cost=3.30579 cache lines/lookup
> 400777 sockets, Avg lookup cost=3.3493 cache lines/lookup
> 404720 sockets, Avg lookup cost=3.36705 cache lines/lookup
> 406671 sockets, Avg lookup cost=3.37677 cache lines/lookup
> jenkin hash:
> 386290 sockets, Avg lookup cost=2.36763 cache lines/lookup
> 393667 sockets, Avg lookup cost=2.37533 cache lines/lookup
> 400777 sockets, Avg lookup cost=2.38211 cache lines/lookup
> 404720 sockets, Avg lookup cost=2.38582 cache lines/lookup
> 406671 sockets, Avg lookup cost=2.38679 cache lines/lookup
>
> (you can see that when the number of sockets increases, the xor hash becomes worse)
>
> So the jenkin hash function CPU cost is balanced by the fact its distribution
> is better. In the end you are faster.
A very strange test - it shows that the jenkins distribution for your setup is
better than the xor one, although for truly random data they are roughly
the same, and the jenkins one takes more instructions.
But _you_ have shown that with truly random data over 2^16 ports the jenkins
distribution is _worse_ than xor, with no gain in exchange.
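For comparing distributions like the ones discussed here, one possible cost model is the average number of chain entries visited per successful lookup. The sketch below is only an assumed model (every stored key looked up equally often, chains walked from the head); it is not the simulator quoted above, and `avg_lookup_cost` is a hypothetical name.

```python
def avg_lookup_cost(chain_lengths):
    """Average entries visited per successful lookup.

    Assumes every stored key is looked up equally often and each chain
    is walked from its head until the key is found.
    """
    total_keys = sum(chain_lengths)
    if total_keys == 0:
        return 0.0
    # Looking up each of the n keys in a chain once costs
    # 1 + 2 + ... + n = n * (n + 1) / 2 visits in total.
    visits = sum(n * (n + 1) // 2 for n in chain_lengths)
    return visits / total_keys

print(avg_lookup_cost([1, 1, 1, 1]))  # 1.0, four keys spread over four buckets
print(avg_lookup_cost([4, 0, 0, 0]))  # 2.5, the same four keys in one bucket
```

Two hashes can then be compared on the same key set by feeding in the per-bucket counts each one produces.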
--
Evgeniy Polyakov