From: Greg Banks <gnb@sgi.com>
To: Neil Brown <neilb@suse.de>
Cc: Linux NFS Mailing List <nfs@lists.sourceforge.net>
Subject: Re: [PATCH 4/8] knfsd: repcache: split hash index
Date: Mon, 16 Oct 2006 19:51:21 +1000
Message-ID: <20061016095121.GB8568@sgi.com>
In-Reply-To: <17714.59304.768727.298610@cse.unsw.edu.au>
On Mon, Oct 16, 2006 at 12:00:08PM +1000, Neil Brown wrote:
> On Wednesday October 11, gnb@melbourne.sgi.com wrote:
> > */
> > #define CACHESIZE 1024
> > -#define HASHSIZE 64
> > +/* number of buckets used to manage LRU lists and cache locks (power of 2) */
> > +#ifdef CONFIG_SMP
> > +#define CACHE_NUM_BUCKETS 64
> > +#else
> > +#define CACHE_NUM_BUCKETS 1
> > +#endif
> > +/* largest possible number of entries in all LRU lists (power of 2) */
> > +#define CACHE_MAX_SIZE (16*1024*CACHE_NUM_BUCKETS)
> > +/* largest possible number of entries in LRU per bucket */
> > +#define CACHE_BUCKET_MAX_SIZE (CACHE_MAX_SIZE/CACHE_NUM_BUCKETS)
> > +/* log2 of largest desired hash chain length */
> > +#define MAX_CHAIN_ORDER 2
> > +/* size of the per-bucket hash table */
> > +#define HASHSIZE ((CACHE_MAX_SIZE>>MAX_CHAIN_ORDER)/CACHE_NUM_BUCKETS)
>
> If I've done my sums right (there is always room for doubt), then HASHSIZE == 4096.
Correct.
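For anyone who wants to re-check the sums, here is a small sketch that copies the SMP-branch macros from the quoted patch and computes how many bytes the kmalloc would request. The helper name hash_table_bytes() is made up for illustration; it relies only on struct hlist_head being a single pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Constants copied from the CONFIG_SMP branch of the quoted patch. */
#define CACHE_NUM_BUCKETS 64
#define CACHE_MAX_SIZE (16*1024*CACHE_NUM_BUCKETS)
#define CACHE_BUCKET_MAX_SIZE (CACHE_MAX_SIZE/CACHE_NUM_BUCKETS)
#define MAX_CHAIN_ORDER 2
#define HASHSIZE ((CACHE_MAX_SIZE>>MAX_CHAIN_ORDER)/CACHE_NUM_BUCKETS)

/*
 * Hypothetical helper: bytes requested for one per-bucket hash table.
 * struct hlist_head holds one pointer, so the caller passes the
 * pointer size of the target machine.
 */
static size_t hash_table_bytes(size_t pointer_size)
{
	assert(pointer_size > 0);
	return (size_t)HASHSIZE * pointer_size;
}
```

Note that CACHE_NUM_BUCKETS cancels out of HASHSIZE as the macros are written, so the per-bucket table is 4096 heads whatever the bucket count.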
> > +
> > + b->hash = kmalloc (HASHSIZE * sizeof(struct hlist_head), GFP_KERNEL);
>
> So this kmalloc asks for 16K or 32K depending on pointer size. On
> most machines that would be an order 2 or 3 allocation which is more
> likely to fail than order 0.
This has run without allocation failures on 2 classes of machines:
* Altix (ia64), PAGE_SIZE=16K, sizeof(void*)=8 => order=1
* Altix XE (x86_64), PAGE_SIZE=4K, sizeof(void*)=8 => order=3
Of course, under normal conditions these allocations happen when
the NFS server is started at boot and thus we have the best chance
of them succeeding.
But I take your point: the nfsd buffer saga teaches us that there's
value in strictly limiting the order of these allocations.
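The orders quoted for the two machines can be reproduced with a few lines of C. This is only a userspace sketch: alloc_order() is an illustrative name, and it simply finds the smallest order such that PAGE_SIZE << order covers the request, which is the role get_order() plays in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: smallest order n such that (page_size << n)
 * is at least 'size' bytes.
 */
static unsigned int alloc_order(size_t size, size_t page_size)
{
	unsigned int order = 0;

	assert(size > 0 && page_size > 0);
	while ((page_size << order) < size)
		order++;
	return order;
}
```

With HASHSIZE == 4096 and 8-byte pointers the request is 32K, so alloc_order(32768, 16384) gives 1 for ia64 and alloc_order(32768, 4096) gives 3 for x86_64.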
> I would really like to see HASHSIZE limited to PAGE_SIZE, and if
> needed, push CACHE_NUM_BUCKETS up ... that might make the
> 'cache_buckets' array bigger than a page, but we don't kmalloc that so
> it shouldn't be a problem.
Sounds reasonable.
> Hmmm.. but if we wanted to scale the hash table size based on memory,
> we would want to kmalloc cache_buckets which might limit its size...
Let's look at the maths. If we were to limit cache_buckets[] to a
single page, I calculate that would give us 186 entries on ia64, 46
on x86_64, and 68 on i386 (fewer if various spinlock-related config
options are enabled). That's too low on x86_64 but fine on the other
platforms. With a single order-1 allocation we could cover most bases.
Alternatively, we could allocate the buckets separately and make
cache_buckets[] an array of pointers to buckets. Then we could do a
single (say) 128*sizeof(svc_cache_bucket*) allocation plus (say) 128 *
sizeof(svc_cache_bucket) allocations, all of which would be order 0.
Now we've effectively got a 3-level fat tree keyed on hash value.
The more I think about it, the more I like this idea.
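A rough userspace sketch of that layout follows, with calloc() standing in for kmalloc(). The struct fields, the function names and the power-of-2 masking are placeholders chosen for illustration, not taken from the patch; the real svc_cache_bucket would carry the lock, the LRU list and the hash table pointer.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_BUCKETS 128	/* the "(say) 128" above; must be a power of 2 */

/* Placeholder layout, not the real structure from the patch. */
struct svc_cache_bucket {
	void **hash;		/* per-bucket hash table, allocated separately */
	unsigned long nentries;	/* illustrative field */
};

/* Free every bucket and then the pointer array; NULL slots are fine. */
static void free_buckets(struct svc_cache_bucket **tbl, size_t n)
{
	size_t i;

	if (tbl == NULL)
		return;
	for (i = 0; i < n; i++)
		free(tbl[i]);
	free(tbl);
}

/* One small allocation for the pointer array, then one per bucket. */
static struct svc_cache_bucket **alloc_buckets(size_t n)
{
	struct svc_cache_bucket **tbl;
	size_t i;

	assert(n > 0 && (n & (n - 1)) == 0);
	tbl = calloc(n, sizeof(*tbl));
	if (tbl == NULL)
		return NULL;
	for (i = 0; i < n; i++) {
		tbl[i] = calloc(1, sizeof(*tbl[i]));
		if (tbl[i] == NULL) {
			free_buckets(tbl, n);
			return NULL;
		}
	}
	return tbl;
}

/* First level of the lookup: the low bits of the hash pick the bucket. */
static struct svc_cache_bucket *bucket_for(struct svc_cache_bucket **tbl,
					   size_t n, unsigned long hash)
{
	return tbl[hash & (n - 1)];
}
```

Every allocation here is far smaller than a page, which is the point: the pointer array is 128 pointers, and each bucket is a few words.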
> So for now I would like to see this limit HASHSIZE to
> PAGE_SIZE/sizeof(void*), and possibly make CACHE_NUM_BUCKETS bigger in
> some circumstances. Allocating cache_buckets based on memory size can
> come later if it is needed.
>
> Sound fair?
Yep. I'll work up a new version of the patch with both the above ideas.
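To put numbers on the HASHSIZE limit, the sketch below computes PAGE_SIZE/sizeof(void*) and the bucket count that would keep the total number of hash heads where the current patch has it (4096 * 64 = 262144). Holding that total constant is my assumption for the sake of the arithmetic, and both helper names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hash heads that fit in one page: PAGE_SIZE / sizeof(void *). */
static size_t hashsize_for_page(size_t page_size, size_t pointer_size)
{
	assert(page_size > 0 && pointer_size > 0);
	return page_size / pointer_size;
}

/*
 * Buckets needed to reach 'total_heads' hash heads overall when each
 * bucket holds 'hashsize' of them (rounded up).
 */
static size_t buckets_needed(size_t total_heads, size_t hashsize)
{
	assert(hashsize > 0);
	return (total_heads + hashsize - 1) / hashsize;
}
```

Under that assumption, 4K pages with 8-byte pointers give 512 heads per bucket and 512 buckets, 16K pages with 8-byte pointers give 2048 heads and 128 buckets, and 4K pages with 4-byte pointers give 1024 heads and 256 buckets.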
Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.