netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: akepner@sgi.com, linux@horizon.com, davem@davemloft.net,
	netdev@vger.kernel.org, bcrl@linux.intel.com
Subject: Re: Extensible hashing and RCU
Date: Sun, 18 Feb 2007 22:10:10 +0300	[thread overview]
Message-ID: <20070218191009.GA28216@2ka.mipt.ru> (raw)
In-Reply-To: <45D89EFE.4080103@cosmosbay.com>

On Sun, Feb 18, 2007 at 07:46:22PM +0100, Eric Dumazet (dada1@cosmosbay.com) wrote:
> >Why anyone do not want to use trie - for socket-like loads it has
> >exactly constant search/insert/delete time and scales as hell.
> >
> 
> Because we want to be *very* fast. You cannot beat hash table.
> 
> Say you have 1.000.000 tcp connections, with 50.000 incoming packets per 
> second to *random* streams...

What is really good in trie, that you may have upto 2^32 connections
without _any_ difference in lookup performance of random streams.

> With a 2^20 hashtable, a lookup uses one cache line (the hash head pointer) 
> plus one cache line to get the socket (you need it to access its refcounter)
> 
> Several attempts were done in the past to add RCU to ehash table (last done 
> by Benjamin LaHaise last March). I believe this was delayed a bit, because 
> David would like to be able to resize the hash table...

This is a theory.
Practice includes cost for hashing, locking, and list traversal
(each pointer is in own cache line btw, which must be fetched) and plus
the same for time wait sockets (if we are unlucky).

No need to talk about price of cache miss when there might be more
serious problems - for example length of the linked list to traverse each 
time new packet is received.

For example lookup time in trie with 1.6 millions random 3-dimensional
32bit (saddr/daddr/ports) entries is about 1 microsecond on amd athlon64 
3500 cpu (test was ran in userspace emulator though).
Data obtained from netchannels research:
http://tservice.net.ru/~s0mbre/blog/2006/12/02#2006_12_02

I think I need to setup similar emulator for hash table of the above
size to check hash/list goodness.

> I am not really interested in hash resizing, because an admin can size it 
> at boot time. But RCU is definitly *wanted*

Trie in that regard is true winner - its RCU'fication is trivial.

> Note : It would be good to also use RCU for UDP, because the current rwlock 
> protecting udp_hash[] is a scalability problem.

-- 
	Evgeniy Polyakov

  reply	other threads:[~2007-02-18 19:12 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-04  7:41 Extensible hashing and RCU linux
2007-02-05 18:02 ` akepner
2007-02-17 13:13   ` Evgeniy Polyakov
2007-02-18 18:46     ` Eric Dumazet
2007-02-18 19:10       ` Evgeniy Polyakov [this message]
2007-02-18 20:21         ` Eric Dumazet
2007-02-18 21:23           ` Michael K. Edwards
2007-02-18 22:04             ` Michael K. Edwards
2007-02-19 12:04             ` Andi Kleen
2007-02-19 19:18               ` Michael K. Edwards
2007-02-19 11:41           ` Evgeniy Polyakov
2007-02-19 13:38             ` Eric Dumazet
2007-02-19 13:56               ` Evgeniy Polyakov
2007-02-19 14:14                 ` Eric Dumazet
2007-02-19 14:25                   ` Evgeniy Polyakov
2007-02-19 15:14                     ` Eric Dumazet
2007-02-19 18:13                       ` Eric Dumazet
2007-02-19 18:26                         ` Benjamin LaHaise
2007-02-19 18:38                           ` Benjamin LaHaise
2007-02-20  9:25                         ` Evgeniy Polyakov
2007-02-20  9:57                           ` David Miller
2007-02-20 10:22                             ` Evgeniy Polyakov
2007-02-20 10:04                           ` Eric Dumazet
2007-02-20 10:12                             ` David Miller
2007-02-20 10:30                               ` Evgeniy Polyakov
2007-02-20 11:10                                 ` Eric Dumazet
2007-02-20 11:23                                   ` Evgeniy Polyakov
2007-02-20 11:30                                   ` Eric Dumazet
2007-02-20 11:41                                     ` Evgeniy Polyakov
2007-02-20 10:49                               ` Eric Dumazet
2007-02-20 15:07                               ` Michael K. Edwards
2007-02-20 15:11                                 ` Evgeniy Polyakov
2007-02-20 15:49                                   ` Michael K. Edwards
2007-02-20 15:59                                     ` Evgeniy Polyakov
2007-02-20 16:08                                       ` Eric Dumazet
2007-02-20 16:20                                         ` Evgeniy Polyakov
2007-02-20 16:38                                           ` Eric Dumazet
2007-02-20 16:59                                             ` Evgeniy Polyakov
2007-02-20 17:05                                               ` Evgeniy Polyakov
2007-02-20 17:53                                                 ` Eric Dumazet
2007-02-20 18:00                                                   ` Evgeniy Polyakov
2007-02-20 18:55                                                     ` Eric Dumazet
2007-02-20 19:06                                                       ` Evgeniy Polyakov
2007-02-20 19:17                                                         ` Eric Dumazet
2007-02-20 19:36                                                           ` Evgeniy Polyakov
2007-02-20 19:44                                                           ` Michael K. Edwards
2007-02-20 17:20                                               ` Eric Dumazet
2007-02-20 17:55                                                 ` Evgeniy Polyakov
2007-02-20 18:12                                                   ` Evgeniy Polyakov
2007-02-20 19:13                                                     ` Michael K. Edwards
2007-02-20 19:44                                                       ` Evgeniy Polyakov
2007-02-20 20:03                                                         ` Michael K. Edwards
2007-02-20 20:09                                                           ` Michael K. Edwards
2007-02-21  8:56                                                             ` Evgeniy Polyakov
2007-02-21  9:34                                                               ` David Miller
2007-02-21  9:51                                                                 ` Evgeniy Polyakov
2007-02-21 10:03                                                                   ` David Miller
2007-02-21  8:54                                                           ` Evgeniy Polyakov
2007-02-21  9:15                                                             ` Eric Dumazet
2007-02-21  9:27                                                               ` Evgeniy Polyakov
2007-02-21  9:38                                                                 ` Eric Dumazet
2007-02-21  9:57                                                                   ` Evgeniy Polyakov
2007-02-21 21:15                                                                     ` Michael K. Edwards
2007-02-22  9:06                                                                       ` David Miller
2007-02-22 11:00                                                                         ` Michael K. Edwards
2007-02-22 11:07                                                                           ` David Miller
2007-02-22 19:24                                                                             ` Stephen Hemminger
2007-02-20 16:04                                     ` Eric Dumazet
2007-02-22 23:49                                     ` linux
2007-02-23  2:31                                       ` Michael K. Edwards
2007-02-20 10:44                             ` Evgeniy Polyakov
2007-02-20 11:09                               ` Eric Dumazet
2007-02-20 11:29                                 ` Evgeniy Polyakov
2007-02-20 11:34                                   ` Eric Dumazet
2007-02-20 11:45                                     ` Evgeniy Polyakov
2007-02-21 12:41                                 ` Andi Kleen
2007-02-21 13:19                                   ` Eric Dumazet
2007-02-21 13:37                                     ` David Miller
2007-02-21 23:13                                       ` Robert Olsson
2007-02-22  6:06                                         ` Eric Dumazet
2007-02-22 11:41                                         ` Andi Kleen
2007-02-22 11:44                                           ` David Miller
2007-02-20 12:11                           ` Evgeniy Polyakov
2007-02-19 22:10                 ` Andi Kleen
2007-02-19 12:02           ` Andi Kleen
2007-02-19 12:35             ` Robert Olsson
2007-02-19 14:04       ` Evgeniy Polyakov
2007-03-02  8:52     ` Evgeniy Polyakov
2007-03-02  9:56       ` Eric Dumazet
2007-03-02 10:28         ` Evgeniy Polyakov
2007-03-02 20:45         ` Michael K. Edwards
2007-03-03 10:46           ` Evgeniy Polyakov
2007-03-04 10:02             ` Michael K. Edwards
2007-03-04 20:36               ` David Miller
2007-03-05  7:12                 ` Michael K. Edwards
2007-03-05 10:02                   ` Robert Olsson
2007-03-05 10:00               ` Evgeniy Polyakov
2007-03-13  9:32       ` Evgeniy Polyakov
2007-03-13 10:08         ` Eric Dumazet
2007-03-13 10:24           ` Evgeniy Polyakov
2007-02-05 18:41 ` [RFC/TOY]Extensible " akepner
2007-02-06 19:09   ` linux

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070218191009.GA28216@2ka.mipt.ru \
    --to=johnpol@2ka.mipt.ru \
    --cc=akepner@sgi.com \
    --cc=bcrl@linux.intel.com \
    --cc=dada1@cosmosbay.com \
    --cc=davem@davemloft.net \
    --cc=linux@horizon.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).