netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: akepner@sgi.com, linux@horizon.com, davem@davemloft.net,
	netdev@vger.kernel.org, bcrl@linux.intel.com
Subject: Re: Extensible hashing and RCU
Date: Mon, 19 Feb 2007 14:41:18 +0300	[thread overview]
Message-ID: <20070219114117.GA22622@2ka.mipt.ru> (raw)
In-Reply-To: <45D8B54A.70903@cosmosbay.com>

On Sun, Feb 18, 2007 at 09:21:30PM +0100, Eric Dumazet (dada1@cosmosbay.com) wrote:
> Evgeniy Polyakov a e'crit :
> >On Sun, Feb 18, 2007 at 07:46:22PM +0100, Eric Dumazet 
> >(dada1@cosmosbay.com) wrote:
> >>>Why anyone do not want to use trie - for socket-like loads it has
> >>>exactly constant search/insert/delete time and scales as hell.
> >>>
> >>Because we want to be *very* fast. You cannot beat hash table.
> >>
> >>Say you have 1.000.000 tcp connections, with 50.000 incoming packets per 
> >>second to *random* streams...
> >
> >What is really good in trie, that you may have upto 2^32 connections
> >without _any_ difference in lookup performance of random streams.
> 
> So are you speaking of one memory cache miss per lookup ?
> If not, you loose.

With trie big part of it _does_ live in cache compared to hash table
where similar addresses ends up in a completely different hash entries.

> >>With a 2^20 hashtable, a lookup uses one cache line (the hash head 
> >>pointer) plus one cache line to get the socket (you need it to access its 
> >>refcounter)
> >>
> >>Several attempts were done in the past to add RCU to ehash table (last 
> >>done by Benjamin LaHaise last March). I believe this was delayed a bit, 
> >>because David would like to be able to resize the hash table...
> >
> >This is a theory.
> 
> Not theory, but actual practice, on a real machine.
> 
> # cat /proc/net/sockstat
> sockets: used 918944
> TCP: inuse 925413 orphan 7401 tw 4906 alloc 926292 mem 304759
> UDP: inuse 9
> RAW: inuse 0
> FRAG: inuse 9 memory 18360

Theory is a speculation about performance.

Highly cache usage optimized bubble sorting still much worse than 
cache usage non-optimized binary tree.
 
> >Practice includes cost for hashing, locking, and list traversal
> >(each pointer is in own cache line btw, which must be fetched) and plus
> >the same for time wait sockets (if we are unlucky).
> >
> >No need to talk about price of cache miss when there might be more
> >serious problems - for example length of the linked list to traverse each 
> >time new packet is received.
> >
> >For example lookup time in trie with 1.6 millions random 3-dimensional
> >32bit (saddr/daddr/ports) entries is about 1 microsecond on amd athlon64 
> >3500 cpu (test was ran in userspace emulator though).
> 
> 1 microsecond ? Are you kidding ? We want no more than 50 ns.

Theory again.

Existing table does not scale that good - I created (1<<20)/2 (to cover only
established part) entries table and filled it with 1 million of random entries 
-search time is about half of microsecod.

Wanna see a code? I copied Linux hash table magic into userspace and run
the same inet_hash() and inet_lookup() in a loop. Result above.

Trie is still 2 times worse, but I've just found a bug in my implementation.

-- 
	Evgeniy Polyakov

  parent reply	other threads:[~2007-02-19 11:47 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-04  7:41 Extensible hashing and RCU linux
2007-02-05 18:02 ` akepner
2007-02-17 13:13   ` Evgeniy Polyakov
2007-02-18 18:46     ` Eric Dumazet
2007-02-18 19:10       ` Evgeniy Polyakov
2007-02-18 20:21         ` Eric Dumazet
2007-02-18 21:23           ` Michael K. Edwards
2007-02-18 22:04             ` Michael K. Edwards
2007-02-19 12:04             ` Andi Kleen
2007-02-19 19:18               ` Michael K. Edwards
2007-02-19 11:41           ` Evgeniy Polyakov [this message]
2007-02-19 13:38             ` Eric Dumazet
2007-02-19 13:56               ` Evgeniy Polyakov
2007-02-19 14:14                 ` Eric Dumazet
2007-02-19 14:25                   ` Evgeniy Polyakov
2007-02-19 15:14                     ` Eric Dumazet
2007-02-19 18:13                       ` Eric Dumazet
2007-02-19 18:26                         ` Benjamin LaHaise
2007-02-19 18:38                           ` Benjamin LaHaise
2007-02-20  9:25                         ` Evgeniy Polyakov
2007-02-20  9:57                           ` David Miller
2007-02-20 10:22                             ` Evgeniy Polyakov
2007-02-20 10:04                           ` Eric Dumazet
2007-02-20 10:12                             ` David Miller
2007-02-20 10:30                               ` Evgeniy Polyakov
2007-02-20 11:10                                 ` Eric Dumazet
2007-02-20 11:23                                   ` Evgeniy Polyakov
2007-02-20 11:30                                   ` Eric Dumazet
2007-02-20 11:41                                     ` Evgeniy Polyakov
2007-02-20 10:49                               ` Eric Dumazet
2007-02-20 15:07                               ` Michael K. Edwards
2007-02-20 15:11                                 ` Evgeniy Polyakov
2007-02-20 15:49                                   ` Michael K. Edwards
2007-02-20 15:59                                     ` Evgeniy Polyakov
2007-02-20 16:08                                       ` Eric Dumazet
2007-02-20 16:20                                         ` Evgeniy Polyakov
2007-02-20 16:38                                           ` Eric Dumazet
2007-02-20 16:59                                             ` Evgeniy Polyakov
2007-02-20 17:05                                               ` Evgeniy Polyakov
2007-02-20 17:53                                                 ` Eric Dumazet
2007-02-20 18:00                                                   ` Evgeniy Polyakov
2007-02-20 18:55                                                     ` Eric Dumazet
2007-02-20 19:06                                                       ` Evgeniy Polyakov
2007-02-20 19:17                                                         ` Eric Dumazet
2007-02-20 19:36                                                           ` Evgeniy Polyakov
2007-02-20 19:44                                                           ` Michael K. Edwards
2007-02-20 17:20                                               ` Eric Dumazet
2007-02-20 17:55                                                 ` Evgeniy Polyakov
2007-02-20 18:12                                                   ` Evgeniy Polyakov
2007-02-20 19:13                                                     ` Michael K. Edwards
2007-02-20 19:44                                                       ` Evgeniy Polyakov
2007-02-20 20:03                                                         ` Michael K. Edwards
2007-02-20 20:09                                                           ` Michael K. Edwards
2007-02-21  8:56                                                             ` Evgeniy Polyakov
2007-02-21  9:34                                                               ` David Miller
2007-02-21  9:51                                                                 ` Evgeniy Polyakov
2007-02-21 10:03                                                                   ` David Miller
2007-02-21  8:54                                                           ` Evgeniy Polyakov
2007-02-21  9:15                                                             ` Eric Dumazet
2007-02-21  9:27                                                               ` Evgeniy Polyakov
2007-02-21  9:38                                                                 ` Eric Dumazet
2007-02-21  9:57                                                                   ` Evgeniy Polyakov
2007-02-21 21:15                                                                     ` Michael K. Edwards
2007-02-22  9:06                                                                       ` David Miller
2007-02-22 11:00                                                                         ` Michael K. Edwards
2007-02-22 11:07                                                                           ` David Miller
2007-02-22 19:24                                                                             ` Stephen Hemminger
2007-02-20 16:04                                     ` Eric Dumazet
2007-02-22 23:49                                     ` linux
2007-02-23  2:31                                       ` Michael K. Edwards
2007-02-20 10:44                             ` Evgeniy Polyakov
2007-02-20 11:09                               ` Eric Dumazet
2007-02-20 11:29                                 ` Evgeniy Polyakov
2007-02-20 11:34                                   ` Eric Dumazet
2007-02-20 11:45                                     ` Evgeniy Polyakov
2007-02-21 12:41                                 ` Andi Kleen
2007-02-21 13:19                                   ` Eric Dumazet
2007-02-21 13:37                                     ` David Miller
2007-02-21 23:13                                       ` Robert Olsson
2007-02-22  6:06                                         ` Eric Dumazet
2007-02-22 11:41                                         ` Andi Kleen
2007-02-22 11:44                                           ` David Miller
2007-02-20 12:11                           ` Evgeniy Polyakov
2007-02-19 22:10                 ` Andi Kleen
2007-02-19 12:02           ` Andi Kleen
2007-02-19 12:35             ` Robert Olsson
2007-02-19 14:04       ` Evgeniy Polyakov
2007-03-02  8:52     ` Evgeniy Polyakov
2007-03-02  9:56       ` Eric Dumazet
2007-03-02 10:28         ` Evgeniy Polyakov
2007-03-02 20:45         ` Michael K. Edwards
2007-03-03 10:46           ` Evgeniy Polyakov
2007-03-04 10:02             ` Michael K. Edwards
2007-03-04 20:36               ` David Miller
2007-03-05  7:12                 ` Michael K. Edwards
2007-03-05 10:02                   ` Robert Olsson
2007-03-05 10:00               ` Evgeniy Polyakov
2007-03-13  9:32       ` Evgeniy Polyakov
2007-03-13 10:08         ` Eric Dumazet
2007-03-13 10:24           ` Evgeniy Polyakov
2007-02-05 18:41 ` [RFC/TOY]Extensible " akepner
2007-02-06 19:09   ` linux

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070219114117.GA22622@2ka.mipt.ru \
    --to=johnpol@2ka.mipt.ru \
    --cc=akepner@sgi.com \
    --cc=bcrl@linux.intel.com \
    --cc=dada1@cosmosbay.com \
    --cc=davem@davemloft.net \
    --cc=linux@horizon.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).