netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <dada1@cosmosbay.com>
To: David Miller <davem@davemloft.net>
Cc: kuznet@ms2.inr.ac.ru, dev@sw.ru, netdev@vger.kernel.org
Subject: Re: [PATCH] limit rt cache size
Date: Tue, 8 Aug 2006 10:57:31 +0200	[thread overview]
Message-ID: <200608081057.32022.dada1@cosmosbay.com> (raw)
In-Reply-To: <20060807.204214.68039839.davem@davemloft.net>

On Tuesday 08 August 2006 05:42, David Miller wrote:
> From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Date: Mon, 7 Aug 2006 20:48:42 +0400
>
> > The patch looks OK. But I am not sure too.
> >
> > To be honest, I do not understand the sense of HASH_HIGHMEM flag.
> > At the first sight, hash table eats low memory, objects hashed in this
> > table also eat low memory. Why is its size calculated from total memory?
> > But taking into account that this flag is used only by tcp.c and route.c,
> > both of which feed on low memory, I miss something important.
> >
> > Let's ask people on netdev.
>
> Is it not so hard to check history of the change to see where these
> things come from?  :-) If we study the output of command:
>
> 	git whatchanged net/core/route.c
>
> we quickly discover this GIT commit:
>
> 424c4b70cc4ff3930ee36a2ef7b204e4d704fd26
>
> [IPV4]: Use the fancy alloc_large_system_hash() function for route hash
> table
>
> - rt hash table allocated using alloc_large_system_hash() function.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
>
> And it is clear that old code used num_physpages, which counts low
> memory only.  This shows clearly that Eric's usage of the HASH_HIGHMEM
> flag here is erroneous.  So we should remove it.

Yes probably.

If I recall well, I blindly copied code from net/ipv4/tcp.c (tcp ehash table 
allocation). I was not aware of this HASH_HIGHMEM part.

As the allocation of routes are SLAB_ATOMIC, while TCP sockets are allocated 
SLAB_KERNEL , it makes sense to size the route hash table accordingly to 
nr_kernel_pages instead of nr_all_pages

For TCP, an OOM is OK since sock_alloc_inode() should returns NULL and this 
should be handled fine.

I think we had discussion about being able to dynamically resize route hash 
table (or tcp hash table), using RCU. Did someone worked on this ?
For most current machines (ram size >= 1GB) , the default hash table sizes are 
just insane for 99% of uses.


>
> Look!  This thing even uses num_physpages in current code to compute
> the "scale" argument to alloc_large_system_hash() :)))
>
> > What's about routing cache size, it looks like it is another bug.
> > route.c should not force rt_max_size = 16*rt_hash_size.
> > I think it should consult available memory and to limit rt_max_size
> > to some reasonable value, even if hash size is too high.
>
> Sure.  This current setting of 16*rt_hash_size is meant to
> try to limit hash chain lengths I guess.  2.4.x does the same
> thing.  Note also that by basing it upon number of routing cache
> hash chains, it is effectively consulting available memory.
> This is why when hash table sizing is crap so it rt_max_size
> calculation.  Fix one and you fix them both :)
>
> Once the HASH_HIGHMEM flag is removed, assuming system has > 128K of
> memory, what we get is:
>
> 	hash_chains = lowmem / 128K
> 	rt_max_size = ((lowmem / 128K) * 16) == lowmem / 8K
>
> So we allow one routing cache entry for each 8K of lowmem we have :)
>
> So for now it is probably sufficient to just get rid of the
> HASH_HIGHMEM flag here.  Later we can try changing this multiplier
> of "16" to something like "8" or even "4".

  parent reply	other threads:[~2006-08-08  8:57 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <44D75EF8.1070901@sw.ru>
2006-08-07 16:48 ` [PATCH] limit rt cache size Alexey Kuznetsov
2006-08-08  3:42   ` David Miller
2006-08-08  5:11     ` Andi Kleen
2006-08-08  6:18       ` David Miller
2006-08-08  6:53         ` Andi Kleen
2006-08-08  7:01           ` David Miller
2006-08-08 12:54             ` Kirill Korotaev
2006-08-08 12:58               ` Andi Kleen
2006-08-08 20:37       ` akepner
2006-08-08 23:23         ` Andi Kleen
2006-08-09  0:06           ` akepner
2006-08-09  0:11           ` David Miller
2006-08-09  0:11             ` akepner
2006-08-09  0:22               ` David Miller
2006-08-09  1:02               ` Andi Kleen
2006-08-09 16:16                 ` akepner
2006-08-09 16:32                   ` Andi Kleen
2006-08-10  0:02                     ` David Miller
2006-08-09  8:05               ` Kirill Korotaev
2006-08-09  0:24             ` Andi Kleen
2006-08-09  0:32               ` David Miller
2006-08-09  8:09               ` Kirill Korotaev
2006-08-09  8:53                 ` Eric Dumazet
2006-08-09  9:22                   ` David Miller
2006-08-08  8:17     ` Kirill Korotaev
2006-08-08  8:34       ` David Miller
2006-08-08  8:57     ` Eric Dumazet [this message]
2006-08-08  9:12       ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200608081057.32022.dada1@cosmosbay.com \
    --to=dada1@cosmosbay.com \
    --cc=davem@davemloft.net \
    --cc=dev@sw.ru \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).