From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: akepner@sgi.com, linux@horizon.com, davem@davemloft.net,
netdev@vger.kernel.org, bcrl@kvack.org
Subject: Re: Extensible hashing and RCU
Date: Tue, 20 Feb 2007 12:25:23 +0300 [thread overview]
Message-ID: <20070220092523.GA6238@2ka.mipt.ru> (raw)
In-Reply-To: <200702191913.08125.dada1@cosmosbay.com>
On Mon, Feb 19, 2007 at 07:13:07PM +0100, Eric Dumazet (dada1@cosmosbay.com) wrote:
> On Monday 19 February 2007 16:14, Eric Dumazet wrote:
> >
> > Because O(1) is different of O(log(N)) ?
> > if N = 2^20, it certainly makes a difference.
> > Yes, 1% of chains might have a length > 10, but yet 99% of the lookups are
> > touching less than 4 cache lines.
> > With a binary tree, log(2^20) is 20. or maybe not ? If you tell me it's 4,
> > I will be very pleased.
> >
>
> Here is the tcp ehash chain length distribution on a real server :
> ehash_addr=0xffff810476000000
> ehash_size=1048576
> 333835 used chains, 3365 used twchains
> Distribution of sockets/chain length
> [chain length]:number of sockets
> [1]:221019 37.4645%
> [2]:56590 56.6495%
> [3]:21250 67.4556%
> [4]:12534 75.9541%
> [5]:8677 83.3082%
> [6]:5862 89.2701%
> [7]:3640 93.5892%
> [8]:2219 96.5983%
> [9]:1083 98.2505%
> [10]:539 99.1642%
> [11]:244 99.6191%
> [12]:112 99.8469%
> [13]:39 99.9329%
> [14]:16 99.9708%
> [15]:6 99.9861%
> [16]:3 99.9942%
> [17]:2 100%
> total : 589942 sockets
>
> So even with a lazy hash function, 89 % of lookups are satisfied with less
> than 6 compares.
This is only 2^19 sockets.
What you miss is the fact, that upper layers of the tree are always in the
cache. So its access cost nothing compared to every time new entry in
the hash table.
And there is _no_ O(1) - O(1) is just a hash entry selection, then you
need to add the whole chain path, which can be long enough.
For example for the length 9 you have 1000 entries - it is exactly size
of the tree - sum of all entries upto and including 2^9.
One third of the table is accessible within 17 lookups (hash chain
length is 1), but given into account size of the entry - let's say it
is equal to
32+32+32 - size of the key
32+32+32 - size of the left/right/parent pointer
so 192 bytes per entry - given into acount that 1/4 of the 1-2 MB cache is
filled with it, we get about 1300 top-level entries in the hash, so
about _10_ first lookups are in the cache _always_ due to nature of the
tree lookup.
--
Evgeniy Polyakov
next prev parent reply other threads:[~2007-02-20 9:26 UTC|newest]
Thread overview: 102+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-02-04 7:41 Extensible hashing and RCU linux
2007-02-05 18:02 ` akepner
2007-02-17 13:13 ` Evgeniy Polyakov
2007-02-18 18:46 ` Eric Dumazet
2007-02-18 19:10 ` Evgeniy Polyakov
2007-02-18 20:21 ` Eric Dumazet
2007-02-18 21:23 ` Michael K. Edwards
2007-02-18 22:04 ` Michael K. Edwards
2007-02-19 12:04 ` Andi Kleen
2007-02-19 19:18 ` Michael K. Edwards
2007-02-19 11:41 ` Evgeniy Polyakov
2007-02-19 13:38 ` Eric Dumazet
2007-02-19 13:56 ` Evgeniy Polyakov
2007-02-19 14:14 ` Eric Dumazet
2007-02-19 14:25 ` Evgeniy Polyakov
2007-02-19 15:14 ` Eric Dumazet
2007-02-19 18:13 ` Eric Dumazet
2007-02-19 18:26 ` Benjamin LaHaise
2007-02-19 18:38 ` Benjamin LaHaise
2007-02-20 9:25 ` Evgeniy Polyakov [this message]
2007-02-20 9:57 ` David Miller
2007-02-20 10:22 ` Evgeniy Polyakov
2007-02-20 10:04 ` Eric Dumazet
2007-02-20 10:12 ` David Miller
2007-02-20 10:30 ` Evgeniy Polyakov
2007-02-20 11:10 ` Eric Dumazet
2007-02-20 11:23 ` Evgeniy Polyakov
2007-02-20 11:30 ` Eric Dumazet
2007-02-20 11:41 ` Evgeniy Polyakov
2007-02-20 10:49 ` Eric Dumazet
2007-02-20 15:07 ` Michael K. Edwards
2007-02-20 15:11 ` Evgeniy Polyakov
2007-02-20 15:49 ` Michael K. Edwards
2007-02-20 15:59 ` Evgeniy Polyakov
2007-02-20 16:08 ` Eric Dumazet
2007-02-20 16:20 ` Evgeniy Polyakov
2007-02-20 16:38 ` Eric Dumazet
2007-02-20 16:59 ` Evgeniy Polyakov
2007-02-20 17:05 ` Evgeniy Polyakov
2007-02-20 17:53 ` Eric Dumazet
2007-02-20 18:00 ` Evgeniy Polyakov
2007-02-20 18:55 ` Eric Dumazet
2007-02-20 19:06 ` Evgeniy Polyakov
2007-02-20 19:17 ` Eric Dumazet
2007-02-20 19:36 ` Evgeniy Polyakov
2007-02-20 19:44 ` Michael K. Edwards
2007-02-20 17:20 ` Eric Dumazet
2007-02-20 17:55 ` Evgeniy Polyakov
2007-02-20 18:12 ` Evgeniy Polyakov
2007-02-20 19:13 ` Michael K. Edwards
2007-02-20 19:44 ` Evgeniy Polyakov
2007-02-20 20:03 ` Michael K. Edwards
2007-02-20 20:09 ` Michael K. Edwards
2007-02-21 8:56 ` Evgeniy Polyakov
2007-02-21 9:34 ` David Miller
2007-02-21 9:51 ` Evgeniy Polyakov
2007-02-21 10:03 ` David Miller
2007-02-21 8:54 ` Evgeniy Polyakov
2007-02-21 9:15 ` Eric Dumazet
2007-02-21 9:27 ` Evgeniy Polyakov
2007-02-21 9:38 ` Eric Dumazet
2007-02-21 9:57 ` Evgeniy Polyakov
2007-02-21 21:15 ` Michael K. Edwards
2007-02-22 9:06 ` David Miller
2007-02-22 11:00 ` Michael K. Edwards
2007-02-22 11:07 ` David Miller
2007-02-22 19:24 ` Stephen Hemminger
2007-02-20 16:04 ` Eric Dumazet
2007-02-22 23:49 ` linux
2007-02-23 2:31 ` Michael K. Edwards
2007-02-20 10:44 ` Evgeniy Polyakov
2007-02-20 11:09 ` Eric Dumazet
2007-02-20 11:29 ` Evgeniy Polyakov
2007-02-20 11:34 ` Eric Dumazet
2007-02-20 11:45 ` Evgeniy Polyakov
2007-02-21 12:41 ` Andi Kleen
2007-02-21 13:19 ` Eric Dumazet
2007-02-21 13:37 ` David Miller
2007-02-21 23:13 ` Robert Olsson
2007-02-22 6:06 ` Eric Dumazet
2007-02-22 11:41 ` Andi Kleen
2007-02-22 11:44 ` David Miller
2007-02-20 12:11 ` Evgeniy Polyakov
2007-02-19 22:10 ` Andi Kleen
2007-02-19 12:02 ` Andi Kleen
2007-02-19 12:35 ` Robert Olsson
2007-02-19 14:04 ` Evgeniy Polyakov
2007-03-02 8:52 ` Evgeniy Polyakov
2007-03-02 9:56 ` Eric Dumazet
2007-03-02 10:28 ` Evgeniy Polyakov
2007-03-02 20:45 ` Michael K. Edwards
2007-03-03 10:46 ` Evgeniy Polyakov
2007-03-04 10:02 ` Michael K. Edwards
2007-03-04 20:36 ` David Miller
2007-03-05 7:12 ` Michael K. Edwards
2007-03-05 10:02 ` Robert Olsson
2007-03-05 10:00 ` Evgeniy Polyakov
2007-03-13 9:32 ` Evgeniy Polyakov
2007-03-13 10:08 ` Eric Dumazet
2007-03-13 10:24 ` Evgeniy Polyakov
2007-02-05 18:41 ` [RFC/TOY]Extensible " akepner
2007-02-06 19:09 ` linux
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070220092523.GA6238@2ka.mipt.ru \
--to=johnpol@2ka.mipt.ru \
--cc=akepner@sgi.com \
--cc=bcrl@kvack.org \
--cc=dada1@cosmosbay.com \
--cc=davem@davemloft.net \
--cc=linux@horizon.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).