From: Greg Banks <gnb@sgi.com>
To: Neil Brown <neilb@suse.de>
Cc: Linux NFS Mailing List <nfs@lists.sourceforge.net>
Subject: Re: [PATCH 4/8] knfsd: repcache: split hash index
Date: Mon, 16 Oct 2006 19:51:21 +1000
Message-ID: <20061016095121.GB8568@sgi.com>
In-Reply-To: <17714.59304.768727.298610@cse.unsw.edu.au>
On Mon, Oct 16, 2006 at 12:00:08PM +1000, Neil Brown wrote:
> On Wednesday October 11, gnb@melbourne.sgi.com wrote:
> > */
> > #define CACHESIZE 1024
> > -#define HASHSIZE 64
> > +/* number of buckets used to manage LRU lists and cache locks (power of 2) */
> > +#ifdef CONFIG_SMP
> > +#define CACHE_NUM_BUCKETS 64
> > +#else
> > +#define CACHE_NUM_BUCKETS 1
> > +#endif
> > +/* largest possible number of entries in all LRU lists (power of 2) */
> > +#define CACHE_MAX_SIZE (16*1024*CACHE_NUM_BUCKETS)
> > +/* largest possible number of entries in LRU per bucket */
> > +#define CACHE_BUCKET_MAX_SIZE (CACHE_MAX_SIZE/CACHE_NUM_BUCKETS)
> > +/* log2 of largest desired hash chain length */
> > +#define MAX_CHAIN_ORDER 2
> > +/* size of the per-bucket hash table */
> > +#define HASHSIZE ((CACHE_MAX_SIZE>>MAX_CHAIN_ORDER)/CACHE_NUM_BUCKETS)
>
> If I've done my sums right (there is always room for doubt), then HASHSIZE == 4096.
Correct.
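For anyone who wants to redo the sums, here is a minimal sketch of the arithmetic. The constants are copied from the quoted patch; the helper name hashsize_for() and its num_buckets parameter are made up for illustration, standing in for CACHE_NUM_BUCKETS.

```c
#include <assert.h>

/* Copied from the quoted patch: log2 of largest desired hash chain length. */
#define MAX_CHAIN_ORDER 2

/*
 * Hypothetical helper (not in the patch): evaluates the HASHSIZE macro
 * for a given value of CACHE_NUM_BUCKETS.
 */
static unsigned long hashsize_for(unsigned long num_buckets)
{
    /* CACHE_MAX_SIZE is (16*1024*CACHE_NUM_BUCKETS) in the patch. */
    unsigned long cache_max_size = 16UL * 1024 * num_buckets;

    assert(num_buckets != 0);
    /* HASHSIZE is ((CACHE_MAX_SIZE>>MAX_CHAIN_ORDER)/CACHE_NUM_BUCKETS). */
    return (cache_max_size >> MAX_CHAIN_ORDER) / num_buckets;
}
```

Because CACHE_MAX_SIZE scales with the bucket count, the bucket count cancels out: hashsize_for(64) (SMP) and hashsize_for(1) (UP) both come to 4096.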
> > +
> > + b->hash = kmalloc (HASHSIZE * sizeof(struct hlist_head), GFP_KERNEL);
>
> So this kmalloc asks for 16K or 32K depending on pointer size. On
> most machines that would be an order 2 or 3 allocation which is more
> likely to fail than order 0.
This has run without allocation failures on 2 classes of machines:
* Altix (ia64), PAGE_SIZE=16K, sizeof(void*)=8 => order=1
* Altix XE (x86_64), PAGE_SIZE=4K, sizeof(void*)=8 => order=3
Of course, under normal conditions these allocations happen when
the NFS server is started at boot and thus we have the best chance
of them succeeding.
But I take your point, the nfsd buffer saga teaches us that there's
value in strictly limiting the order of these allocations.
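The orders quoted above can be checked with a rough userspace sketch. alloc_order() is a hypothetical stand-in for the kernel's get_order(), and the byte count relies on struct hlist_head holding a single pointer; neither helper is part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's get_order():
 * the smallest order such that (page_size << order) >= size.
 */
static unsigned int alloc_order(size_t size, size_t page_size)
{
    unsigned int order = 0;

    assert(size > 0 && page_size > 0);
    while ((page_size << order) < size)
        order++;
    return order;
}

/*
 * Bytes requested by kmalloc(HASHSIZE * sizeof(struct hlist_head)),
 * given that struct hlist_head is one pointer wide.
 */
static size_t hash_table_bytes(size_t hashsize, size_t pointer_size)
{
    return hashsize * pointer_size;
}
```

With HASHSIZE == 4096 and 8-byte pointers the request is 32K, which is order 1 on 16K pages and order 3 on 4K pages; with 4-byte pointers and 4K pages it is 16K, or order 2.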
> I would really like to see HASHSIZE limited to PAGE_SIZE, and if
> needed, push CACHE_NUM_BUCKETS up ... that might make the
> 'cache_buckets' array bigger than a page, but we don't kmalloc that so
> it shouldn't be a problem.
Sounds reasonable.
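To put rough numbers on that suggestion, here is a sketch that caps the per-bucket table at one page of pointers and works out how many buckets are then needed. It assumes we want to keep the same total number of hash chains as the posted patch (64 buckets * 4096 chains); that target and both helper names are my assumptions, not something the patch defines.

```c
#include <assert.h>

/* Largest per-bucket hash table (in chain heads) that fits in one page. */
static unsigned long capped_hashsize(unsigned long page_size,
                                     unsigned long pointer_size)
{
    assert(pointer_size != 0);
    return page_size / pointer_size;
}

/*
 * Buckets needed to provide total_chains hash chains when each bucket
 * holds at most hashsize of them (rounded up).
 */
static unsigned long buckets_needed(unsigned long total_chains,
                                    unsigned long hashsize)
{
    assert(hashsize != 0);
    return (total_chains + hashsize - 1) / hashsize;
}
```

Under those assumptions, 4K pages with 8-byte pointers give 512 chains per bucket and 512 buckets; 16K pages with 8-byte pointers give 2048 chains per bucket and 128 buckets.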
> Hmmm.. but if we wanted to scale the hash table size based on memory,
> we would want to kmalloc cache_buckets which might limit its size...
Let's look at the maths. If we were to limit cache_buckets[] to a
single page, I calculate that would give us 186 entries on ia64, 46
on x86_64, and 68 on i386 (fewer if various spinlock-related config
options are enabled). That's too low on x86_64 but fine on the other
platforms. With a single order-1 allocation we could cover most bases.
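The per-page counts above can be reproduced with the sketch below. The bucket sizes are not stated anywhere in this thread: 88 bytes (64-bit) and 60 bytes (i386) are values inferred from the quoted counts, and they would change with the spinlock-related config options mentioned above. The helper name is hypothetical.

```c
#include <assert.h>

/*
 * Number of buckets of bucket_size bytes that fit in a single
 * allocation of the given order.  bucket_size is an assumed value
 * inferred from the counts in the message, not taken from the patch.
 */
static unsigned long buckets_per_alloc(unsigned long page_size,
                                       unsigned int order,
                                       unsigned long bucket_size)
{
    assert(bucket_size != 0);
    return (page_size << order) / bucket_size;
}
```

With those inferred sizes an order-0 allocation holds 186 buckets on ia64, 46 on x86_64 and 68 on i386, while an order-1 allocation on x86_64 holds 93, which is enough for the 64 buckets in the posted patch.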
Alternatively, we could allocate the buckets separately and make
cache_buckets[] an array of pointers to buckets. Then we could do a
single (say) 128*sizeof(svc_cache_bucket*) allocation plus (say) 128 *
sizeof(svc_cache_bucket) allocations, all of which would be order 0.
Now we've effectively got a 3-level fat tree keyed on hash value.
The more I think about it, the more I like this idea.
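A rough userspace sketch of that layout follows, with calloc() standing in for kmalloc(). Everything here is illustrative: struct bucket is a cut-down stand-in for svc_cache_bucket, BUCKET_HASHSIZE is an assumed value small enough to stay within one page, and the way the hash value is split between levels is just one possible choice.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_BUCKETS      128   /* the "(say) 128" above */
#define BUCKET_HASHSIZE  256   /* assumed: 256 pointers fit in one 4K page */

struct chain_head { void *first; };   /* stand-in for struct hlist_head */

struct bucket {                       /* stand-in for svc_cache_bucket */
    struct chain_head *hash;          /* per-bucket hash table */
};

/* Free a (possibly partially built) array of bucket pointers. */
static void free_buckets(struct bucket **buckets)
{
    if (!buckets)
        return;
    for (size_t i = 0; i < NUM_BUCKETS; i++) {
        if (buckets[i]) {
            free(buckets[i]->hash);
            free(buckets[i]);
        }
    }
    free(buckets);
}

/* Level 1: pointer array.  Level 2: one small allocation per bucket. */
static struct bucket **alloc_buckets(void)
{
    struct bucket **buckets = calloc(NUM_BUCKETS, sizeof(*buckets));

    if (!buckets)
        return NULL;
    for (size_t i = 0; i < NUM_BUCKETS; i++) {
        buckets[i] = calloc(1, sizeof(struct bucket));
        if (buckets[i])
            buckets[i]->hash = calloc(BUCKET_HASHSIZE,
                                      sizeof(struct chain_head));
        if (!buckets[i] || !buckets[i]->hash) {
            free_buckets(buckets);
            return NULL;
        }
    }
    return buckets;
}

/* Level 3: walk pointer array -> bucket -> hash chain for a hash value. */
static struct chain_head *find_chain(struct bucket **buckets,
                                     unsigned int hash)
{
    struct bucket *b = buckets[hash % NUM_BUCKETS];

    return &b->hash[(hash / NUM_BUCKETS) % BUCKET_HASHSIZE];
}
```

Every allocation in the sketch is at most one 4K page (128 pointers, one small struct, or 256 chain heads), which is the property being argued for; a real version would also carry the per-bucket lock and LRU list.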
> So for now I would like to see this limit HASHSIZE to
> PAGE_SIZE/sizeof(void*), and possibly make CACHE_NUM_BUCKETS bigger in
> some circumstances. Allocating cache_buckets based on memory size can
> come later if it is needed.
>
> Sound fair?
Yep. I'll work up a new version of the patch with both the above ideas.
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.