public inbox for linux-ia64@vger.kernel.org
From: Andrew Morton <akpm@osdl.org>
To: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org
Subject: Re: Limit hash table size
Date: Thu, 05 Feb 2004 23:58:13 +0000
Message-ID: <20040205155813.726041bd.akpm@osdl.org>
In-Reply-To: <B05667366EE6204181EABE9C1B1C0EB5802441@scsmsx401.sc.intel.com>


Ken, I remain unhappy with this patch.  If a big box has 500 million
dentries or inodes in cache (which is possible), those hash chains will be
more than 200 entries long on average.  Lookups will be very slow.
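
To make that arithmetic concrete, here is a minimal sketch. It assumes the
2M-entry cap from the patch below and a uniformly distributing hash;
avg_chain_len() is a hypothetical helper for illustration, not kernel code.

```c
#include <stdint.h>

/*
 * Average hash chain length, assuming objects spread evenly over buckets.
 * Hypothetical helper for illustration only.
 */
static uint64_t avg_chain_len(uint64_t nr_objects, uint64_t nr_buckets)
{
	return nr_objects / nr_buckets;
}
```

With 500 million objects and 2*1024*1024 buckets this comes out above 200
entries per chain, which is the figure quoted above.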

We need to do something smarter.  At least, for machines which do not have
the ia64 proliferation-of-zones problem.

Maybe we should leave the sizing of these tables as-is, and add some hook
which allows the architecture to scale them back.
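
One possible shape for such a hook, purely as a sketch: the generic code
keeps its sizing and an architecture may define a smaller ceiling.  The
names ARCH_HASH_ENTRIES_MAX and arch_scale_hash_entries() are assumptions
made up for this example; nothing like them exists in the patch below.

```c
#include <limits.h>

/*
 * Hypothetical override point: an architecture header could define
 * ARCH_HASH_ENTRIES_MAX to a smaller value.  The default is "no limit".
 */
#ifndef ARCH_HASH_ENTRIES_MAX
#define ARCH_HASH_ENTRIES_MAX ULONG_MAX
#endif

/* Clamp the generically computed entry count to the architecture's limit. */
static unsigned long arch_scale_hash_entries(unsigned long nr_entries)
{
	return nr_entries < ARCH_HASH_ENTRIES_MAX ?
	       nr_entries : ARCH_HASH_ENTRIES_MAX;
}
```

Architectures that define nothing would see their table sizes unchanged.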




From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>

The issue of exceedingly large hash tables was discussed on the
mailing list a while back, but seems to have slipped through the cracks.

What we found is that it's not a problem for x86 (and most other
architectures) because __get_free_pages won't be able to get anything
beyond order MAX_ORDER-1 (10), which means those hash tables are at most
4MB each (assuming a 4K page size).  However, on ia64, in order to support
larger hugeTLB page sizes, MAX_ORDER is bumped up to 18, which now
means a 2GB upper limit enforced by the page allocator (assuming a 16K page
size).  PPC64 is another example that bumps up MAX_ORDER.
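
As a quick check of those two figures, a minimal sketch: the largest
allocation is order MAX_ORDER-1 pages.  max_alloc_bytes() is a hypothetical
helper written for this illustration, and the MAX_ORDER values of 11 and 18
are the ones implied by the paragraph above.

```c
#include <stdint.h>

/*
 * Largest contiguous allocation the page allocator can hand out:
 * 2^(max_order - 1) pages of 2^page_shift bytes each.
 * Hypothetical helper for illustration only.
 */
static uint64_t max_alloc_bytes(unsigned int max_order, unsigned int page_shift)
{
	return (uint64_t)1 << (max_order - 1 + page_shift);
}
```

max_alloc_bytes(11, 12) gives 4MB for the x86 case and
max_alloc_bytes(18, 14) gives 2GB for the ia64 case.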

Last time I checked, the tcp ehash table was taking a whopping (insane!)
2GB on one of our large machines.  The dentry and inode hash tables also
take a considerable amount of memory.

We enforce the maximum size based on the number of entries instead of the
page order.  The upper bound is capped at 2M entries.  All numbers on x86
remain the same, as we don't want to disturb already established and
working numbers.
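
The sizing rule can be sketched outside the kernel as follows.  This is an
illustration under assumptions, not the patch itself: HASHMAX mirrors
D_HASHMAX/I_HASHMAX, while order_for_bytes() and capped_hash_order() are
hypothetical names used only here.

```c
#include <stddef.h>

#define HASHMAX (2UL * 1024 * 1024)	/* cap on number of entries */

/* Smallest order such that 2^order pages cover `bytes` bytes. */
static unsigned int order_for_bytes(unsigned long bytes, unsigned int page_shift)
{
	unsigned int order;

	for (order = 0; ((1UL << order) << page_shift) < bytes; order++)
		;
	return order;
}

/* Cap the entry count first, then convert the table size to a page order. */
static unsigned int capped_hash_order(unsigned long nr_entries,
				      size_t entry_size,
				      unsigned int page_shift)
{
	unsigned long capped = nr_entries < HASHMAX ? nr_entries : HASHMAX;

	return order_for_bytes(capped * entry_size, page_shift);
}
```

For example, with 8-byte list heads and 16K pages, any request above 2M
entries ends up as a 16MB table (order 10) instead of growing with memory.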

(acked by davem)


---

 25-akpm/fs/dcache.c      |    9 +++++----
 25-akpm/fs/inode.c       |    7 +++++--
 25-akpm/net/ipv4/route.c |    2 +-
 25-akpm/net/ipv4/tcp.c   |    2 +-
 4 files changed, 12 insertions(+), 8 deletions(-)

diff -puN fs/dcache.c~limit-hash-table-sizes fs/dcache.c
--- 25/fs/dcache.c~limit-hash-table-sizes	Thu Feb  5 15:43:40 2004
+++ 25-akpm/fs/dcache.c	Thu Feb  5 15:43:40 2004
@@ -49,6 +49,7 @@ static kmem_cache_t *dentry_cache; 
  */
 #define D_HASHBITS     d_hash_shift
 #define D_HASHMASK     d_hash_mask
+#define D_HASHMAX	(2*1024*1024UL)	/* max number of entries */
 
 static unsigned int d_hash_mask;
 static unsigned int d_hash_shift;
@@ -1565,10 +1566,10 @@ static void __init dcache_init(unsigned 
 	
 	set_shrinker(DEFAULT_SEEKS, shrink_dcache_memory);
 
-#if PAGE_SHIFT < 13
-	mempages >>= (13 - PAGE_SHIFT);
-#endif
-	mempages *= sizeof(struct hlist_head);
+	mempages = PAGE_SHIFT < 13 ?
+		   mempages >> (13 - PAGE_SHIFT) :
+		   mempages << (PAGE_SHIFT - 13);
+	mempages = min(D_HASHMAX, mempages) * sizeof(struct hlist_head);
 	for (order = 0; ((1UL << order) << PAGE_SHIFT) < mempages; order++)
 		;
 
diff -puN fs/inode.c~limit-hash-table-sizes fs/inode.c
--- 25/fs/inode.c~limit-hash-table-sizes	Thu Feb  5 15:43:40 2004
+++ 25-akpm/fs/inode.c	Thu Feb  5 15:43:40 2004
@@ -53,6 +53,7 @@
  */
 #define I_HASHBITS	i_hash_shift
 #define I_HASHMASK	i_hash_mask
+#define I_HASHMAX	(2*1024*1024UL)	/* max number of entries */
 
 static unsigned int i_hash_mask;
 static unsigned int i_hash_shift;
@@ -1325,8 +1326,10 @@ void __init inode_init(unsigned long mem
 	for (i = 0; i < ARRAY_SIZE(i_wait_queue_heads); i++)
 		init_waitqueue_head(&i_wait_queue_heads[i].wqh);
 
-	mempages >>= (14 - PAGE_SHIFT);
-	mempages *= sizeof(struct hlist_head);
+	mempages = PAGE_SHIFT < 14 ?
+		   mempages >> (14 - PAGE_SHIFT) :
+		   mempages << (PAGE_SHIFT - 14);
+	mempages = min(I_HASHMAX, mempages) * sizeof(struct hlist_head);
 	for (order = 0; ((1UL << order) << PAGE_SHIFT) < mempages; order++)
 		;
 
diff -puN net/ipv4/route.c~limit-hash-table-sizes net/ipv4/route.c
--- 25/net/ipv4/route.c~limit-hash-table-sizes	Thu Feb  5 15:43:40 2004
+++ 25-akpm/net/ipv4/route.c	Thu Feb  5 15:43:40 2004
@@ -2744,7 +2744,7 @@ int __init ip_rt_init(void)
 
 	goal = num_physpages >> (26 - PAGE_SHIFT);
 
-	for (order = 0; (1UL << order) < goal; order++)
+	for (order = 0; (order < 10) && ((1UL << order) < goal); order++)
 		/* NOTHING */;
 
 	do {
diff -puN net/ipv4/tcp.c~limit-hash-table-sizes net/ipv4/tcp.c
--- 25/net/ipv4/tcp.c~limit-hash-table-sizes	Thu Feb  5 15:43:40 2004
+++ 25-akpm/net/ipv4/tcp.c	Thu Feb  5 15:43:40 2004
@@ -2611,7 +2611,7 @@ void __init tcp_init(void)
 	else
 		goal = num_physpages >> (23 - PAGE_SHIFT);
 
-	for (order = 0; (1UL << order) < goal; order++)
+	for (order = 0; (order < 10) && ((1UL << order) < goal); order++)
 		;
 	do {
 		tcp_ehash_size = (1UL << order) * PAGE_SIZE /

_

