netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Paper on lookup in Linux.
@ 2006-09-04 11:43 Robert Olsson
  2006-09-04 12:24 ` Andi Kleen
  0 siblings, 1 reply; 5+ messages in thread
From: Robert Olsson @ 2006-09-04 11:43 UTC (permalink / raw)
  To: netdev; +Cc: snilsson, Robert.Olsson


Hello.

People on this list might find this paper interesting:
http://www.csc.kth.se/~snilsson/public/papers/trash/

Abstract is below. Feel free to redistribute.

Cheers.	
					--ro


TRASH - A dynamic LC-trie and hash data structure
Robert Olsson and Stefan Nilsson

A dynamic LC-trie is currently used in the Linux kernel to implement address 
lookup in the IP routing table. The main virtue of this data structure is that 
it supports both fast address lookups and frequent updates of the table. Also, 
it has an efficient memory management scheme and supports multi-processor 
architectures using the RCU locking mechanism. The structure scales nicely: 
the expected number of memory accesses for one lookup is O(log log n), where 
n is the number of entries in the lookup table. In particular, the time does 
not depend on the length of the keys, 32-bit IPv4 addresses and 128-bit 
addresses does not make a difference in this respect.

In this article we introduce TRASH, a combination of a dynamic LC-trie and a 
hash function. TRASH is a general purpose data structure supporting fast 
lookup, insert and delete operations for arbitrarily long bit strings. TRASH 
enhances the level-compression part of the LC-trie by prepending a header to 
each key. The header is a hash value based on the complete key. The extended 
keys will behave like uniformly distributed data and hence the average and 
maximum depth is typically very small, in practice less than 1.5 and 5, 
respectively.

We have implemented the scheme in the Linux kernel as a replacement for the 
dst cache (IPv4) and performed a full scale test on a production router using 
128-bit flow-based lookups. The Linux implementation of TRASH inherits the 
efficient RCU locking mechanism from the dynamic LC-trie implementation. In 
particular, the lookup time increases only marginally for longer keys and TRASH
is highly insensitive to different types of data. The performance figures are 
very promising and the cache mechanism could easily be extended to serve as a 
unified lookup for fast socket lookup, flow logging, connection tracking and 
stateful networking in general.

Keywords: trie, LC-trie, hash, hashtrie, Linux, flow lookup, garbage collection
Trita-CSC-TCS 2006:2, ISRN/KTH/CSC/TCS-2006/2-SE, ISSN 1653-7092. 

-- 
VGER BF report: U 0.830787

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Paper on lookup in Linux.
  2006-09-04 11:43 Paper on lookup in Linux Robert Olsson
@ 2006-09-04 12:24 ` Andi Kleen
  2006-09-04 12:53   ` Robert Olsson
  0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2006-09-04 12:24 UTC (permalink / raw)
  To: Robert Olsson; +Cc: netdev, snilsson

On Monday 04 September 2006 13:43, Robert Olsson wrote:
> 
> Hello.
> 
> People on this list might find this paper interesting:
> http://www.csc.kth.se/~snilsson/public/papers/trash/

Looks nice. Have you looked at using it for local TCP/UDP socket
lookups too or would that be part of the "unified flow cache"? 

-Andi

-- 
VGER BF report: U 0.499718

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Paper on lookup in Linux.
  2006-09-04 12:24 ` Andi Kleen
@ 2006-09-04 12:53   ` Robert Olsson
  2006-09-04 13:29     ` Andi Kleen
  0 siblings, 1 reply; 5+ messages in thread
From: Robert Olsson @ 2006-09-04 12:53 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Robert Olsson, netdev, snilsson


Andi Kleen writes:
 > On Monday 04 September 2006 13:43, Robert Olsson wrote:
 > > 
 > > Hello.
 > > 
 > > People on this list might find this paper interesting:
 > > http://www.csc.kth.se/~snilsson/public/papers/trash/
 > 
 > Looks nice. Have you looked at using it for local TCP/UDP socket
 > lookups too or would that be part of the "unified flow cache"? 


 No we haven't put struct socket in the leafs (result node) yet we just kept 
 dst entries and some stateful flow variables that we used for active GC 
 and flow logging so far. So 128 bit flow version of the dst hash. It would 
 have taken a lot a of more work and we wanted to get this paper out. Also 
 ip_route_input is not the place have the lookup if we with unified include 
 ip6. So we focused on designing and building the lookup core.


 Cheers.
					--ro

 struct leaf {
        __u32 key[LPK];
        unsigned long parent;
        void *obj;  /* Generic leaf object */
        __u32 state;
        unsigned int len;
        int iif;
        struct timeval start;
        struct rcu_head rcu;
};

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Paper on lookup in Linux.
  2006-09-04 12:53   ` Robert Olsson
@ 2006-09-04 13:29     ` Andi Kleen
  2006-09-04 13:54       ` Robert Olsson
  0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2006-09-04 13:29 UTC (permalink / raw)
  To: Robert Olsson; +Cc: netdev, snilsson

On Monday 04 September 2006 14:53, Robert Olsson wrote:

>  No we haven't put struct socket in the leafs (result node) yet we just kept 
>  dst entries and some stateful flow variables that we used for active GC 
>  and flow logging so far. So 128 bit flow version of the dst hash. It would 
>  have taken a lot a of more work and we wanted to get this paper out. Also 
>  ip_route_input is not the place have the lookup if we with unified include 
>  ip6. So we focused on designing and building the lookup core.

Thanks.

The reason I'm asking is that we still have trouble with the TCP hash tables
taking far too much memory, and your new data structure might provide a nice 
alternative. 

-Andi

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Paper on lookup in Linux.
  2006-09-04 13:29     ` Andi Kleen
@ 2006-09-04 13:54       ` Robert Olsson
  0 siblings, 0 replies; 5+ messages in thread
From: Robert Olsson @ 2006-09-04 13:54 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Robert Olsson, netdev, snilsson


Andi Kleen writes:

 > The reason I'm asking is that we still have trouble with the TCP hash tables
 > taking far too much memory, and your new data structure might provide a nice 
 > alternative. 

 Yes it's dynamic and selftuning so no need reserve memory in advance and still 
 comparable performance to a perfect hash. Insert/delete and GC is fast and 
 locking is also doable we used RCU like fib_trie but at _bh in this "application". 
 
 Cheers.
						--ro

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-09-04 13:54 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-09-04 11:43 Paper on lookup in Linux Robert Olsson
2006-09-04 12:24 ` Andi Kleen
2006-09-04 12:53   ` Robert Olsson
2006-09-04 13:29     ` Andi Kleen
2006-09-04 13:54       ` Robert Olsson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).