From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: Extensible hashing and RCU
Date: Tue, 20 Feb 2007 20:17:31 +0100
Message-ID: <200702202017.31965.dada1@cosmosbay.com>
References: <200702191913.08125.dada1@cosmosbay.com> <200702201955.15567.dada1@cosmosbay.com> <20070220190634.GA12193@2ka.mipt.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset="koi8-r"
Content-Transfer-Encoding: 7bit
Cc: "Michael K. Edwards" , David Miller , akepner@sgi.com, linux@horizon.com, netdev@vger.kernel.org, bcrl@kvack.org
To: Evgeniy Polyakov
Return-path:
Received: from pfx2.jmh.fr ([194.153.89.55]:45862 "EHLO pfx2.jmh.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751021AbXBTTRl (ORCPT ); Tue, 20 Feb 2007 14:17:41 -0500
In-Reply-To: <20070220190634.GA12193@2ka.mipt.ru>
Content-Disposition: inline
Sender: netdev-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

On Tuesday 20 February 2007 20:06, Evgeniy Polyakov wrote:
> > I added to my 'simulator_plugged_on_real_server' the average cost
> > calculation, relative to number of cache line per lookup.
> >
> > ehash_size=2^20
> > xor hash :
> > 386290 sockets, Avg lookup cost=3.2604 cache lines/lookup
> > 393667 sockets, Avg lookup cost=3.30579 cache lines/lookup
> > 400777 sockets, Avg lookup cost=3.3493 cache lines/lookup
> > 404720 sockets, Avg lookup cost=3.36705 cache lines/lookup
> > 406671 sockets, Avg lookup cost=3.37677 cache lines/lookup
> > jenkin hash:
> > 386290 sockets, Avg lookup cost=2.36763 cache lines/lookup
> > 393667 sockets, Avg lookup cost=2.37533 cache lines/lookup
> > 400777 sockets, Avg lookup cost=2.38211 cache lines/lookup
> > 404720 sockets, Avg lookup cost=2.38582 cache lines/lookup
> > 406671 sockets, Avg lookup cost=2.38679 cache lines/lookup
> >
> > (you can see that when number of sockets increase, the xor hash becomes
> > worst)
> >
> > So the jenkin hash function CPU cost is balanced by the fact its
> > distribution is better. In the end you are faster.
>
> Very strange test - it shows that jenkins distribution for your setup is
> better than xor one, although for the true random data they are roughly
> the same, and jenkins one has more instructions.
>
> But _you_ have shown that with true random data of 2^16 ports jenkins
> distribution is _worse_ than xor without any gain to buy.

I showed your test was bogus. All your claims are just bogus.
I claim your 'true random data' is a joke. rand() in your program is a pure
joke.

Given 48 bits of input, you *can* find a lot of addr/port pairs that hit one
particular cache line if the XOR function is used. With jhash, without
knowing the 32-bit random secret, you *can't*.

Again, you don't take the chain length into account. If all chains were of
length <= 1, then yes, xor would be faster. In real life, we *know* chain
length can be larger, especially in DoS situations.
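
The 48-bit argument can be sketched in a few lines of Python. This is an
illustration under stated assumptions, not the kernel code: xor_hash is a
simplified xor-fold over the 4-tuple, jhash_3words is a transcription of a
lookup2-style Jenkins mix assumed to resemble the kernel helper of the same
name, and the addresses, port range and SECRET value are made up. The
attacker picks faddr = BASE ^ fport, so the port cancels out of the xor and
every flow lands in the same bucket. No secret is needed for that.

```python
MASK32 = 0xFFFFFFFF
EHASH_SIZE = 1 << 20           # ehash_size=2^20, as in the quoted test
GOLDEN_RATIO = 0x9E3779B9      # constant used by the Jenkins mix


def xor_hash(laddr, lport, faddr, fport):
    """Simplified xor-fold hash (assumed shape, for illustration only)."""
    h = (laddr ^ lport) ^ (faddr ^ fport)
    h ^= h >> 16
    h ^= h >> 8
    return h & (EHASH_SIZE - 1)


def jhash_mix(a, b, c):
    """lookup2-style mix of three 32-bit words."""
    a = (a - b - c) & MASK32
    a ^= c >> 13
    b = (b - c - a) & MASK32
    b ^= (a << 8) & MASK32
    c = (c - a - b) & MASK32
    c ^= b >> 13
    a = (a - b - c) & MASK32
    a ^= c >> 12
    b = (b - c - a) & MASK32
    b ^= (a << 16) & MASK32
    c = (c - a - b) & MASK32
    c ^= b >> 5
    a = (a - b - c) & MASK32
    a ^= c >> 3
    b = (b - c - a) & MASK32
    b ^= (a << 10) & MASK32
    c = (c - a - b) & MASK32
    c ^= b >> 15
    return a, b, c


def jhash_3words(a, b, c, secret):
    """Hash three words, seeded with a secret the attacker does not know."""
    a = (a + GOLDEN_RATIO) & MASK32
    b = (b + GOLDEN_RATIO) & MASK32
    c = (c + secret) & MASK32
    return jhash_mix(a, b, c)[2]


def keyed_hash(laddr, lport, faddr, fport, secret):
    """Bucket index from the keyed hash (word packing is an assumption)."""
    h = jhash_3words(laddr, faddr, (lport << 16) | fport, secret)
    return h & (EHASH_SIZE - 1)


LADDR, LPORT = 0xC0A80001, 80  # hypothetical server, 192.168.0.1:80
BASE = 0x0A000000              # attacker-chosen constant
SECRET = 0x12345678            # stands in for the 32-bit random secret

# 1000 distinct flows built so that faddr ^ fport == BASE for all of them.
flows = [(BASE ^ port, port) for port in range(1024, 2024)]

xor_buckets = {xor_hash(LADDR, LPORT, fa, fp) for fa, fp in flows}
jhash_buckets = {keyed_hash(LADDR, LPORT, fa, fp, SECRET) for fa, fp in flows}

print("flows:", len(flows))
print("xor buckets hit:", len(xor_buckets))    # 1: all flows share one chain
print("jhash buckets hit:", len(jhash_buckets))
```

With xor, the 1000 flows form a single chain of length 1000, which is
exactly the chain-length cost a per-lookup instruction count ignores. To
get the same effect against the keyed hash, the attacker would have to know
or guess SECRET.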