From: Eric Dumazet <dada1@cosmosbay.com>
To: Anton Blanchard <anton@samba.org>
Cc: netdev@vger.kernel.org
Subject: Re: [PATCH] Limit size of route cache hash table
Date: Mon, 27 Apr 2009 07:17:42 +0200
Message-ID: <49F53FF6.2040603@cosmosbay.com>
In-Reply-To: <20090427030433.GA17454@kryten>

Anton Blanchard wrote:
> Right now we have no upper limit on the size of the route cache hash table.
> On a 128GB POWER6 box it ends up as 32MB:
> 
>     IP route cache hash table entries: 4194304 (order: 9, 33554432 bytes)
> 
> It would be nice to cap this just for memory consumption reasons, but this
> massive hashtable also causes a significant spike when measuring OS
> jitter.
> 
> With a 32MB hashtable and 4 million entries, rt_worker_func is taking
> 5 ms to complete. On another system with more memory it's taking 14 ms.
> Even though rt_worker_func does call cond_resched() to limit its impact,
> in an HPC environment we want to keep all sources of OS jitter to a minimum.

Then boot with rhash_entries=8000?
Or:

    echo 1 >/proc/sys/net/ipv4/route/gc_interval
> 
> With the patch applied we limit the number of entries to 64k which
> can still be overridden by using the rhash_entries boot option:
> 
>     IP route cache hash table entries: 65536 (order: 3, 524288 bytes)
> 
> With this patch rt_worker_func takes 0.060 ms on the same system.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---
> 
> Is 64k a reasonable default for the limit?
> 
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index c40debe..5064c26 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -3397,7 +3397,7 @@ int __init ip_rt_init(void)
>  					0,
>  					&rt_hash_log,
>  					&rt_hash_mask,
> -					0);
> +					rhash_entries ? 0 : 64 * 1024);
>  	memset(rt_hash_table, 0, (rt_hash_mask + 1) * sizeof(struct rt_hash_bucket));
>  	rt_hash_lock_init();
>  
> 


Sorry, this limit is too small. Many of my customers' machines would collapse.

Would it not be smarter to change ip_rt_gc_interval from 60 seconds
to 1 second on such machines? Dividing 5 ms by 60 gives about 83 us per run,
which is acceptable.



