All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matt Mackall <mpm@selenic.com>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: Andrew Morton <akpm@osdl.org>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] shrink core hashes on small systems
Date: Wed, 7 Apr 2004 13:10:45 -0500	[thread overview]
Message-ID: <20040407181044.GV6248@waste.org> (raw)
In-Reply-To: <20040407052103.GB8738@logos.cnet>

On Wed, Apr 07, 2004 at 02:21:03AM -0300, Marcelo Tosatti wrote:
> On Mon, Apr 05, 2004 at 02:38:24PM -0700, Andrew Morton wrote:
> > Matt Mackall <mpm@selenic.com> wrote:
> > >
> > > Longer term, I think some serious thought needs to go into scaling
> > > hash sizes across the board, but this serves my purposes on the
> > > low-end without changing behaviour asymptotically.
> > 
> > Can we make longer-term a bit shorter?  init_per_zone_pages_min() only took
> > a few minutes thinking..
> > 
> > I suspect what we need is to replace num_physpages with nr_free_pages(),
> > then account for PAGE_SIZE, then muck around with sqrt() for a while then
> > apply upper and lower bounds to it.
> > 
> > That hard-wired breakpoint seems just too arbitrary to me - some sort of
> > graduated thing which has a little thought and explanation behind it would
> > be preferred please.
> 
> Hi,
> 
> Arjan told me his changes were not to allow orders higher than 5 
> (thus maximizing hash table size to 128K) to avoid possible cache thrashing.
> 
> I've done some tests with dbench during the day with different
> dentry hash table sizes, here are the results on a 2-way P4 2GB box
> (default of 1MB hashtable).
> 
> I ran three consecutive dbenchs (with 32 clients) each reboot (each
> line), and then six consecutive dbenchs on the last run.
> 
> Output is "Mb/s" output from dbench 2.1.
> 
> 128K dentry hashtable:
> 
> 160 145 155
> 152 147 148
> 170 132 132 156 154 127
> 
> 512K dentry hashtable:
> 
> 156 135 144
> 153 146 159
> 157 135 148 149 148 143
> 
> 1Mb dentry hashtable:
> 
> 167 127 139
> 160 144 139
> 144 137 162 153 132 121
> 
> Not much of noticeable difference between them. I was told
> SPECWeb benefits from big hash tables. Any other recommended test? 
> I have no access to SPECWeb. 
> 
> Testing the different hash sizes is not trivial because there are 
> so many different workloads...
> 
> Another thing is we allow the cache to grow too big: 1MB for 2GB, 
> 4Mb for 32Gb, 8Mb for 64Gb (on 32-bit, twice as much on 64-bit). 
> 
> What about the following patch to calculate the sizes of the VFS 
> caches based on more correct variables.
> 
> It might be good to shrink the caches a half (passing "4" instead of "3" to  
> vfs_cache_size() does it). We gain lowmem pinned memory and dont seem 
> to loose performance. Help with testing is very much appreciated.

I'm working on something similar, core code below.
 
> PS: Another improvement which might be interesting is non-power-of-2
> allocations? That would make the increase on the cache size
> "smoother" when memory size increases. AFAICS we dont do that
> because of our allocator is limited.

The other problem is hash functions frequently take advantage of bit
masking to wrap results so powers of two is nice.


My hash-sizing code:

/* calc_hash_order - calculate the page order for a hash table
 * @loworder: smallest allowed hash order
 * @lowpages: keep hash size minimal below this number of pages
 * @hiorder: largest order for linear growth
 * @hipages: point at which to switch to logarithmic growth
 * @pages: number of available pages
 *
 * This calculates the page order for hash tables. It uses linear
 * interpolation for memory sizes between lowpages and hipages and
 * then switches to a logarithmic algorithm. For every factor of 10
 * pages beyond hipages, the hash order is increased by 1. The
 * logarithmic piece is used to future-proof the code on large boxes.
 *
 */

int calc_hash_order(int loworder, unsigned long lowpages,
		    int hiorder, unsigned long hipages, unsigned long pages)
{
	unsigned long lowhash = 1<<loworder, hihash = 1<<hiorder, hash, order;

	if (pages <= hipages) {
		/* linear interpolation on hash sizes, not hash order */
		hash = minhash + ((pages - lopages) * (lowhash - hihash) /
				  (lowpages - hipages));
		order = ffs(hash);
	} else {
		/* for every factor of 10 beyond hipages, increase order
		   by one */
		for (order = hiorder; pages > hipages; pages /= 10)
			order++;
	}

	/* clip order range */
	return max(minorder, order);
}


-- 
Matt Mackall : http://www.selenic.com : Linux development and consulting

  reply	other threads:[~2004-04-07 18:11 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-04-05 20:49 [PATCH] shrink core hashes on small systems Matt Mackall
2004-04-05 21:02 ` Andrew Morton
2004-04-05 21:19   ` Matt Mackall
2004-04-05 21:38     ` Andrew Morton
2004-04-05 22:59       ` Matt Mackall
2004-04-06  6:31         ` Bryan Rittmeyer
2004-04-07  5:21       ` Marcelo Tosatti
2004-04-07 18:10         ` Matt Mackall [this message]
2004-04-07 22:53     ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040407181044.GV6248@waste.org \
    --to=mpm@selenic.com \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marcelo.tosatti@cyclades.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.