* Paper on lookup in Linux.
@ 2006-09-04 11:43 Robert Olsson
2006-09-04 12:24 ` Andi Kleen
0 siblings, 1 reply; 5+ messages in thread
From: Robert Olsson @ 2006-09-04 11:43 UTC (permalink / raw)
To: netdev; +Cc: snilsson, Robert.Olsson
Hello.
People on this list might find this paper interesting:
http://www.csc.kth.se/~snilsson/public/papers/trash/
Abstract is below. Feel free to redistribute.
Cheers.
--ro
TRASH - A dynamic LC-trie and hash data structure
Robert Olsson and Stefan Nilsson
A dynamic LC-trie is currently used in the Linux kernel to implement address
lookup in the IP routing table. The main virtue of this data structure is that
it supports both fast address lookups and frequent updates of the table. Also,
it has an efficient memory management scheme and supports multi-processor
architectures using the RCU locking mechanism. The structure scales nicely:
the expected number of memory accesses for one lookup is O(log log n), where
n is the number of entries in the lookup table. In particular, the time does
not depend on the length of the keys, 32-bit IPv4 addresses and 128-bit
addresses does not make a difference in this respect.
In this article we introduce TRASH, a combination of a dynamic LC-trie and a
hash function. TRASH is a general purpose data structure supporting fast
lookup, insert and delete operations for arbitrarily long bit strings. TRASH
enhances the level-compression part of the LC-trie by prepending a header to
each key. The header is a hash value based on the complete key. The extended
keys will behave like uniformly distributed data and hence the average and
maximum depth is typically very small, in practice less than 1.5 and 5,
respectively.
We have implemented the scheme in the Linux kernel as a replacement for the
dst cache (IPv4) and performed a full scale test on a production router using
128-bit flow-based lookups. The Linux implementation of TRASH inherits the
efficient RCU locking mechanism from the dynamic LC-trie implementation. In
particular, the lookup time increases only marginally for longer keys and TRASH
is highly insensitive to different types of data. The performance figures are
very promising and the cache mechanism could easily be extended to serve as a
unified lookup for fast socket lookup, flow logging, connection tracking and
stateful networking in general.
Keywords: trie, LC-trie, hash, hashtrie, Linux, flow lookup, garbage collection
Trita-CSC-TCS 2006:2, ISRN/KTH/CSC/TCS-2006/2-SE, ISSN 1653-7092.
--
VGER BF report: U 0.830787
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Paper on lookup in Linux.
2006-09-04 11:43 Paper on lookup in Linux Robert Olsson
@ 2006-09-04 12:24 ` Andi Kleen
2006-09-04 12:53 ` Robert Olsson
0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2006-09-04 12:24 UTC (permalink / raw)
To: Robert Olsson; +Cc: netdev, snilsson
On Monday 04 September 2006 13:43, Robert Olsson wrote:
>
> Hello.
>
> People on this list might find this paper interesting:
> http://www.csc.kth.se/~snilsson/public/papers/trash/
Looks nice. Have you looked at using it for local TCP/UDP socket
lookups too or would that be part of the "unified flow cache"?
-Andi
--
VGER BF report: U 0.499718
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Paper on lookup in Linux.
2006-09-04 12:24 ` Andi Kleen
@ 2006-09-04 12:53 ` Robert Olsson
2006-09-04 13:29 ` Andi Kleen
0 siblings, 1 reply; 5+ messages in thread
From: Robert Olsson @ 2006-09-04 12:53 UTC (permalink / raw)
To: Andi Kleen; +Cc: Robert Olsson, netdev, snilsson
Andi Kleen writes:
> On Monday 04 September 2006 13:43, Robert Olsson wrote:
> >
> > Hello.
> >
> > People on this list might find this paper interesting:
> > http://www.csc.kth.se/~snilsson/public/papers/trash/
>
> Looks nice. Have you looked at using it for local TCP/UDP socket
> lookups too or would that be part of the "unified flow cache"?
No we haven't put struct socket in the leafs (result node) yet we just kept
dst entries and some stateful flow variables that we used for active GC
and flow logging so far. So 128 bit flow version of the dst hash. It would
have taken a lot a of more work and we wanted to get this paper out. Also
ip_route_input is not the place have the lookup if we with unified include
ip6. So we focused on designing and building the lookup core.
Cheers.
--ro
struct leaf {
__u32 key[LPK];
unsigned long parent;
void *obj; /* Generic leaf object */
__u32 state;
unsigned int len;
int iif;
struct timeval start;
struct rcu_head rcu;
};
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Paper on lookup in Linux.
2006-09-04 12:53 ` Robert Olsson
@ 2006-09-04 13:29 ` Andi Kleen
2006-09-04 13:54 ` Robert Olsson
0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2006-09-04 13:29 UTC (permalink / raw)
To: Robert Olsson; +Cc: netdev, snilsson
On Monday 04 September 2006 14:53, Robert Olsson wrote:
> No we haven't put struct socket in the leafs (result node) yet we just kept
> dst entries and some stateful flow variables that we used for active GC
> and flow logging so far. So 128 bit flow version of the dst hash. It would
> have taken a lot a of more work and we wanted to get this paper out. Also
> ip_route_input is not the place have the lookup if we with unified include
> ip6. So we focused on designing and building the lookup core.
Thanks.
The reason I'm asking is that we still have trouble with the TCP hash tables
taking far too much memory, and your new data structure might provide a nice
alternative.
-Andi
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Paper on lookup in Linux.
2006-09-04 13:29 ` Andi Kleen
@ 2006-09-04 13:54 ` Robert Olsson
0 siblings, 0 replies; 5+ messages in thread
From: Robert Olsson @ 2006-09-04 13:54 UTC (permalink / raw)
To: Andi Kleen; +Cc: Robert Olsson, netdev, snilsson
Andi Kleen writes:
> The reason I'm asking is that we still have trouble with the TCP hash tables
> taking far too much memory, and your new data structure might provide a nice
> alternative.
Yes it's dynamic and selftuning so no need reserve memory in advance and still
comparable performance to a perfect hash. Insert/delete and GC is fast and
locking is also doable we used RCU like fib_trie but at _bh in this "application".
Cheers.
--ro
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2006-09-04 13:54 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-09-04 11:43 Paper on lookup in Linux Robert Olsson
2006-09-04 12:24 ` Andi Kleen
2006-09-04 12:53 ` Robert Olsson
2006-09-04 13:29 ` Andi Kleen
2006-09-04 13:54 ` Robert Olsson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).