From: Matt Mackall <mpm@selenic.com>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: Andrew Morton <akpm@osdl.org>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] shrink core hashes on small systems
Date: Wed, 7 Apr 2004 13:10:45 -0500
Message-ID: <20040407181044.GV6248@waste.org>
In-Reply-To: <20040407052103.GB8738@logos.cnet>

On Wed, Apr 07, 2004 at 02:21:03AM -0300, Marcelo Tosatti wrote:
> On Mon, Apr 05, 2004 at 02:38:24PM -0700, Andrew Morton wrote:
> > Matt Mackall <mpm@selenic.com> wrote:
> > >
> > > Longer term, I think some serious thought needs to go into scaling
> > > hash sizes across the board, but this serves my purposes on the
> > > low-end without changing behaviour asymptotically.
> >
> > Can we make longer-term a bit shorter? init_per_zone_pages_min() only took
> > a few minutes thinking..
> >
> > I suspect what we need is to replace num_physpages with nr_free_pages(),
> > then account for PAGE_SIZE, then muck around with sqrt() for a while then
> > apply upper and lower bounds to it.
> >
> > That hard-wired breakpoint seems just too arbitrary to me - some sort of
> > graduated thing which has a little thought and explanation behind it would
> > be preferred please.
>
> Hi,
>
> Arjan told me the point of his changes was to not allow orders higher than 5
> (thus capping the hash table size at 128K), to avoid possible cache thrashing.
>
> I've done some tests with dbench during the day with different
> dentry hash table sizes; here are the results on a 2-way P4 box with
> 2GB of RAM (1MB hashtable by default).
>
> I ran three consecutive dbench runs (with 32 clients) after each
> reboot (one line per boot), and six consecutive runs after the last
> reboot.
>
> The numbers are the MB/s throughput reported by dbench 2.1.
>
> 128K dentry hashtable:
>
> 160 145 155
> 152 147 148
> 170 132 132 156 154 127
>
> 512K dentry hashtable:
>
> 156 135 144
> 153 146 159
> 157 135 148 149 148 143
>
> 1MB dentry hashtable:
>
> 167 127 139
> 160 144 139
> 144 137 162 153 132 121
>
> Not much of a noticeable difference between them. I was told
> SPECWeb benefits from big hash tables, but I have no access to it.
> Any other recommended tests?
>
> Testing the different hash sizes is not trivial because there are
> so many different workloads...
>
> Another thing is that we allow the cache to grow too big: 1MB for 2GB
> of RAM, 4MB for 32GB, 8MB for 64GB (on 32-bit; twice as much on 64-bit).
>
> What about the following patch, which calculates the sizes of the VFS
> caches based on more appropriate variables?
>
> It might be good to shrink the caches by half (passing "4" instead of "3"
> to vfs_cache_size() does it). We get back pinned lowmem and don't seem
> to lose performance. Help with testing is very much appreciated.

I'm working on something similar; core code is below.

> PS: Another improvement which might be interesting is non-power-of-2
> allocations. That would make the increase in cache size "smoother"
> as memory size increases. AFAICS we don't do that because our
> allocator is limited.

The other problem is that hash functions frequently take advantage of
bit masking to wrap their results, so power-of-two table sizes are
convenient.
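
For instance (illustration only; these helpers aren't from the tree,
they just show the cost difference):

/* With a power-of-two table size the bucket index is a cheap AND;
 * with an arbitrary size it needs a (slower) divide/modulo. */
static inline unsigned long bucket_pow2(unsigned long hash, unsigned long size)
{
	return hash & (size - 1);	/* valid only when size == 2^n */
}

static inline unsigned long bucket_any(unsigned long hash, unsigned long size)
{
	return hash % size;		/* works for any size, costs a divide */
}
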
My hash-sizing code:

/**
 * calc_hash_order - calculate the page order for a hash table
 * @loworder: smallest allowed hash order
 * @lowpages: keep hash size minimal below this number of pages
 * @hiorder: largest order for linear growth
 * @hipages: point at which to switch to logarithmic growth
 * @pages: number of available pages
 *
 * This calculates the page order for hash tables. It uses linear
 * interpolation for memory sizes between lowpages and hipages and
 * then switches to a logarithmic algorithm. For every factor of 10
 * pages beyond hipages, the hash order is increased by 1. The
 * logarithmic piece is used to future-proof the code on large boxes.
 */
int calc_hash_order(int loworder, unsigned long lowpages,
		    int hiorder, unsigned long hipages, unsigned long pages)
{
	unsigned long lowhash = 1UL << loworder, hihash = 1UL << hiorder, hash;
	int order;

	if (pages <= hipages) {
		/* linear interpolation on hash sizes, not hash order */
		if (pages < lowpages)
			pages = lowpages;
		hash = lowhash + (pages - lowpages) * (hihash - lowhash) /
			(hipages - lowpages);
		order = fls(hash) - 1;
	} else {
		/* for every factor of 10 beyond hipages, increase order
		   by one */
		for (order = hiorder; pages > hipages; pages /= 10)
			order++;
	}

	/* clip to the smallest allowed order */
	return max(loworder, order);
}
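
To give a feel for the numbers, here's how a cache-init path might use
it. The breakpoints are made up for illustration, not proposed values:

/* Example only: clamp at order 3 (a 32KB table with 4K pages) below
 * 8MB of memory (2048 pages), interpolate up to order 8 (a 1MB table)
 * at 1GB (256K pages), logarithmic growth beyond that. */
void __init example_hash_init(unsigned long mempages)
{
	int order = calc_hash_order(3, 2048, 8, 256 * 1024, mempages);
	unsigned long entries = (PAGE_SIZE << order) / sizeof(void *);

	printk("Example hash table: %lu entries (order %d, %lu bytes)\n",
	       entries, order, PAGE_SIZE << order);
}
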
--
Matt Mackall : http://www.selenic.com : Linux development and consulting