Re: [RFC PATCH]: Dynamically sized routing cache hash table.

From: David Miller <davem@davemloft.net>
To: Robert.Olsson@data.slu.se
Cc: netdev@vger.kernel.org, dada1@cosmosbay.com,
	robert.olsson@its.uu.se, npiggin@suse.de
Subject: Re: [RFC PATCH]: Dynamically sized routing cache hash table.
Date: Tue, 06 Mar 2007 14:20:55 -0800 (PST)
Message-ID: <20070306.142055.14973231.davem@davemloft.net>
In-Reply-To: <17901.27628.548105.353342@robur.slu.se>

From: Robert Olsson <Robert.Olsson@data.slu.se>
Date: Tue, 6 Mar 2007 14:26:04 +0100

> David Miller writes:
>  
>  > Actually, more accurately, the conflict exists in how this GC
>  > logic is implemented.  The core issue is that hash table size
>  > guides the GC processing, and hash table growth therefore
>  > modifies those GC goals.  So with the patch below we'll just
>  > keep growing the hash table instead of giving GC some time to
>  > try to keep the working set in equilibrium before doing the
>  > hash grow.
>  
>  AFAIK the equilibrium is a resizing function as well, but using a
>  fixed hash table. So can we do without equilibrium resizing if
>  tables are dynamic?  I think so....
> 
>  With the hash data structure we could monitor the average chain
>  length, or just the size, and resize the hash based on that.

I'm not so sure; it may be a mistake to eliminate the equilibrium
logic.  One error I think it does have is its use of chain length.

Even a nearly perfect hash has small lumps in distribution, and we
should not penalize entries which fall into these lumps.

Let us call T the threshold at which we would grow the routing hash
table.  As we approach T we start to GC.  Let's assume the hash table
has shift = 2 (N = 4 buckets), so T would (with the T = N + (N >> 1)
algorithm) be 6.

TABLE:	[0]	DST1, DST2
	[1]	DST3, DST4, DST5

DST6 arrives, what should we do?

If we just accept it and don't GC some existing entries, we
will grow the hash table.  This is the wrong thing to do if
our true working set is smaller than 6 entries, because then
some of the existing entries are unlikely to be reused and
could be purged to keep us from hitting T.

If they are all active, growing is the right thing to do.
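
To make the DST6 decision concrete, here is a minimal sketch in C.
It is not the kernel's route cache code: grow_threshold(),
on_new_entry(), the rt_action values and the "idle" count are all
hypothetical names, and it assumes GC is triggered only when the new
entry would reach T.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the kernel's route cache code. */
enum rt_action { RT_INSERT, RT_INSERT_AFTER_GC, RT_GROW };

/* Grow threshold from the mail: T = N + (N >> 1), with N = 1 << shift buckets. */
static size_t grow_threshold(unsigned int shift)
{
	assert(shift < 8 * sizeof(size_t) - 1);
	size_t n = (size_t)1 << shift;
	return n + (n >> 1);
}

/*
 * Decide what to do when a new entry arrives.
 *   entries: number of entries currently cached
 *   idle:    how many of those a GC pass judges unlikely to be reused
 *            (how to judge that is the hard part and is not modelled here)
 */
static enum rt_action on_new_entry(unsigned int shift, size_t entries, size_t idle)
{
	size_t t = grow_threshold(shift);

	if (entries + 1 < t)
		return RT_INSERT;		/* still below T */
	if (idle > 0)
		return RT_INSERT_AFTER_GC;	/* working set < T: purge, don't grow */
	return RT_GROW;				/* every entry is active: grow */
}
```

With shift = 2 and five cached entries, on_new_entry(2, 5, 1) picks
the GC path and on_new_entry(2, 5, 0) picks growth, which matches the
two cases described above.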

This is the crux of the whole routing cache problem.

I am of the opinion that LRU, for routes not attached to sockets, is
probably the best thing to do here.

Furthermore, at high packet rates the current rt_may_expire() logic
is probably not very effective, since its granularity is limited to
jiffies.  We can quite easily create 100,000 or more entries per
jiffy when HZ=100 during rDOS, for example.  So perhaps some global
LRU algorithm using ktime is more appropriate.
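
A rough illustration of the granularity problem, as a sketch in C
using simulated time rather than real kernel clocks.  The helpers
ns_to_ticks() and distinct_stamps() are made up for this example, and
the constant creation rate is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert a nanosecond timestamp to a tick count at the given HZ. */
static uint64_t ns_to_ticks(uint64_t ns, unsigned int hz)
{
	assert(hz > 0 && hz <= NSEC_PER_SEC);
	return ns / (NSEC_PER_SEC / hz);
}

/*
 * Count how many distinct age stamps n entries receive when they are
 * created at a constant rate, one every gap_ns nanoseconds from time 0.
 * hz == 0 means "stamp with the raw nanosecond time"; otherwise the
 * stamp is the tick count at that HZ.
 */
static uint64_t distinct_stamps(uint64_t n, uint64_t gap_ns, unsigned int hz)
{
	uint64_t distinct = 0, prev = 0;

	for (uint64_t i = 0; i < n; i++) {
		uint64_t now = i * gap_ns;
		uint64_t stamp = hz ? ns_to_ticks(now, hz) : now;

		if (i == 0 || stamp != prev)
			distinct++;
		prev = stamp;
	}
	return distinct;
}
```

At HZ=100 a tick is 10 ms, so 100,000 entries created 100 ns apart all
land in the same tick and an age comparison cannot order them, while
nanosecond stamps keep every one of them distinct.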

Global LRU is not easy without touching a lot of memory.  But I'm
sure some clever trick can be discovered by someone :)

It is amusing, but it seems that for an rDOS workload the optimal
routing hash would be a tiny one like my example above.  All packets
essentially miss the routing cache and create a new entry.  So
keeping the working set as small as possible is what you want
to do, since no matter how large you grow, your hit rate will be
zero :-)
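
The zero hit rate can be checked with a toy model.  The sketch below
is a deliberately simplified direct-mapped cache (one key per bucket,
no chains, no GC), not the real routing cache; replay_hits() is a
hypothetical helper and the key streams are whatever the caller
supplies.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Toy direct-mapped cache: one key per bucket, a new key overwrites
 * whatever was in its bucket.  Replays keys[0..n) through a table of
 * 1 << shift buckets and returns the number of cache hits.
 */
static size_t replay_hits(const uint32_t *keys, size_t n, unsigned int shift)
{
	assert(shift < 31);
	size_t nbuckets = (size_t)1 << shift;
	uint32_t *slot = malloc(nbuckets * sizeof(*slot));
	unsigned char *used = calloc(nbuckets, 1);
	size_t hits = 0;

	assert(slot && used);
	for (size_t i = 0; i < n; i++) {
		size_t b = keys[i] & (nbuckets - 1);

		if (used[b] && slot[b] == keys[i]) {
			hits++;
		} else {
			slot[b] = keys[i];	/* miss: create a new entry */
			used[b] = 1;
		}
	}
	free(slot);
	free(used);
	return hits;
}
```

Feeding it a stream in which every key is unique gives zero hits
whether shift is 1 or 16, whereas a stream that cycles over four keys
hits on everything after the first four lookups once the table has
four buckets.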
