From: Miquel van Smoorenburg <mikevs@xs4all.net>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Yohan <ytordjman@corp.free.fr>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org,
Neil Brown <neilb@suse.de>,
"J. Bruce Fields" <bfields@fieldses.org>,
mikevs@xs4all.net
Subject: sunrpc: dynamically allocate credcache hashtables [was: Re: VM issue causing high CPU loads]
Date: Thu, 03 Sep 2009 16:35:14 +0200
Message-ID: <1251988514.20818.22.camel@laptop>
In-Reply-To: <1251986526.18338.29.camel@heimdal.trondhjem.org>
[-- Attachment #1: Type: text/plain, Size: 2128 bytes --]
On Thu, 2009-09-03 at 10:02 -0400, Trond Myklebust wrote:
> On Thu, 2009-09-03 at 15:39 +0200, Yohan wrote:
> > > As far as I can see, there is no RPCSEC_GSS involved, so credentials
> > > should never expire. They will be reused as long as processes aren't
> > > switching between thousands and thousands of different combinations of
> > > uid, gid and groups.
> > My servers are imap servers.
> > Foreach user (~15 million) it have a specific uid over ~10 nfs netapp
> > storage.
>
> OK, so 16 hash buckets are likely to be filled with ~10^6 entries each.
> I can see that might be a performance issue...
>
> So afaics, you did try adjusting the hashtable size. How much larger
> does it have to be before you start to get acceptable performance? If it
> solves your problem we could make hash table sizes adjustable via a
> module parameter, for instance.
That is *exactly* what my patch does :)
I ported it to 2.6.31-rc8-git2 this afternoon; that was trivial.
What I wanted to discuss was whether there is another solution,
whether we should build something that auto-tunes hashtable sizes, or
whether there is some other way to limit the size of the cache.
I have the same usage pattern as Yohan (also an IMAP server for
potentially a few million different uids) - lots of uids are used, but
not simultaneously (maybe a few hundred or a thousand at the same time).
It's just that the inode/dentry/cred caches never expire because modern
boxes have lots and lots of memory.
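As a rough illustration of the bucket-length estimate quoted above (not
part of the patch), the following user-space C sketch computes the average
chain length for a given number of cached credentials and hash bits. It
assumes a perfectly uniform hash, which hash_long() only approximates;
avg_chain_len() is a hypothetical helper, and 15 million is the uid count
Yohan mentioned.

```c
/*
 * Illustrative user-space sketch, not kernel code.
 * avg_chain_len() is a hypothetical helper name.
 */

/* Average entries per bucket, assuming a perfectly uniform hash. */
static unsigned long avg_chain_len(unsigned long entries, unsigned int hashbits)
{
	/* A table with 'hashbits' bits has 2^hashbits buckets. */
	unsigned long buckets = 1UL << hashbits;

	return entries / buckets;
}
```

Under that assumption, avg_chain_len(15000000UL, 4) gives 937500 entries
per bucket, in line with the ~10^6 estimate, while 12 bits gives about 3662
and 14 bits (16384 buckets, the RPC_CREDCACHE_MAX in the patch) about 915.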
Due to personal circumstances though I haven't been able to work on
anything much for the last few months. I apologize for keeping quiet.
Patch attached. I've removed the debugging stuff; this is only the
"dynamically allocate credcache hashtables" patch.
Patch description:
auth.h: increase RPC_CREDCACHE_HASHBITS from 4 to 12
(16 hashtable entries -> 4096). This is just the default.
auth.c: allocate hashtables dynamically
add sysctl for credcache_hashsize
auth_generic.c: use rpcauth_init_credcache
auth_unix.c: use rpcauth_init_credcache
sunrpc_syms.c: add hashsize module parameter
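
The two ways of setting the size treat bad values differently: the module
parameter is clamped and rounded down to a power of two, while the sysctl
rejects anything out of range or not a power of two. A minimal user-space
sketch of that logic follows. All helper names here are hypothetical
stand-ins (the patch itself uses the kernel's min(), max(),
rounddown_pow_of_two(), is_power_of_2() and ilog2()); only the limits and
the default are taken from the patch.

```c
#include <stdbool.h>

/* Values from the patch: RPC_CREDCACHE_MIN/MAX and the 1 << 12 default. */
#define CREDCACHE_MIN     4
#define CREDCACHE_MAX     16384
#define CREDCACHE_DEFAULT 4096

/* Hypothetical stand-in for is_power_of_2(). */
static bool is_pow2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical stand-in for rounddown_pow_of_two(); requires n >= 1. */
static unsigned int round_down_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Hypothetical stand-in for ilog2(): floor(log2(n)) for n >= 1. */
static unsigned int log2_floor(unsigned int n)
{
	unsigned int bits = 0;

	while (n >>= 1)
		bits++;
	return bits;
}

/* Module-parameter path: 0 keeps the default, other values are adjusted. */
static unsigned int normalize_hashsize(int requested)
{
	if (requested == 0)
		return CREDCACHE_DEFAULT;
	if (requested < CREDCACHE_MIN)
		requested = CREDCACHE_MIN;
	if (requested > CREDCACHE_MAX)
		requested = CREDCACHE_MAX;
	return round_down_pow2((unsigned int)requested);
}

/* sysctl path: invalid values are rejected rather than adjusted. */
static bool sysctl_hashsize_valid(int value)
{
	return value >= CREDCACHE_MIN && value <= CREDCACHE_MAX &&
	       is_pow2((unsigned int)value);
}
```

For example, a module parameter of 5000 would end up as 4096 buckets
(12 hash bits), whereas writing 5000 to the sysctl would be refused.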
Mike.
[-- Attachment #2: linux-2.6.31-rc8-git2-sunprc-credcache_hashsize.patch --]
[-- Type: text/x-patch, Size: 9129 bytes --]
diff -ruN linux-2.6.31-rc8-git2.orig/include/linux/sunrpc/auth.h linux-2.6.31-rc8-git2/include/linux/sunrpc/auth.h
--- linux-2.6.31-rc8-git2.orig/include/linux/sunrpc/auth.h 2009-08-28 02:59:04.000000000 +0200
+++ linux-2.6.31-rc8-git2/include/linux/sunrpc/auth.h 2009-09-03 12:29:45.000000000 +0200
@@ -60,10 +60,14 @@
/*
* Client authentication handle
*/
-#define RPC_CREDCACHE_HASHBITS 4
+#define RPC_CREDCACHE_HASHBITS 12
#define RPC_CREDCACHE_NR (1 << RPC_CREDCACHE_HASHBITS)
+#define RPC_CREDCACHE_MIN 4
+#define RPC_CREDCACHE_MAX 16384
struct rpc_cred_cache {
- struct hlist_head hashtable[RPC_CREDCACHE_NR];
+ int hashsize;
+ int hashbits;
+ struct hlist_head *hashtable;
spinlock_t lock;
};
@@ -124,9 +128,8 @@
extern const struct rpc_authops authunix_ops;
extern const struct rpc_authops authnull_ops;
-void __init rpc_init_authunix(void);
-void __init rpc_init_generic_auth(void);
-void __init rpcauth_init_module(void);
+int __init rpc_init_generic_auth(void);
+int __init rpcauth_init_module(int);
void __exit rpcauth_remove_module(void);
void __exit rpc_destroy_generic_auth(void);
diff -ruN linux-2.6.31-rc8-git2.orig/net/sunrpc/auth.c linux-2.6.31-rc8-git2/net/sunrpc/auth.c
--- linux-2.6.31-rc8-git2.orig/net/sunrpc/auth.c 2009-08-28 02:59:04.000000000 +0200
+++ linux-2.6.31-rc8-git2/net/sunrpc/auth.c 2009-09-03 13:59:01.000000000 +0200
@@ -14,6 +14,8 @@
#include <linux/hash.h>
#include <linux/sunrpc/clnt.h>
#include <linux/spinlock.h>
+#include <linux/vmalloc.h>
+#include <linux/sysctl.h>
#ifdef RPC_DEBUG
# define RPCDBG_FACILITY RPCDBG_AUTH
@@ -28,6 +30,7 @@
static LIST_HEAD(cred_unused);
static unsigned long number_cred_unused;
+int credcache_hashsize = RPC_CREDCACHE_NR;
static u32
pseudoflavor_to_flavor(u32 flavor) {
@@ -147,7 +150,14 @@
new = kmalloc(sizeof(*new), GFP_KERNEL);
if (!new)
return -ENOMEM;
- for (i = 0; i < RPC_CREDCACHE_NR; i++)
+ new->hashsize = credcache_hashsize;
+ new->hashbits = ilog2(new->hashsize);
+ new->hashtable = vmalloc(new->hashsize * sizeof(struct hlist_head));
+ if (!new->hashtable) {
+ kfree(new);
+ return -ENOMEM;
+ }
+ for (i = 0; i < new->hashsize; i++)
INIT_HLIST_HEAD(&new->hashtable[i]);
spin_lock_init(&new->lock);
auth->au_credcache = new;
@@ -184,7 +194,7 @@
spin_lock(&rpc_credcache_lock);
spin_lock(&cache->lock);
- for (i = 0; i < RPC_CREDCACHE_NR; i++) {
+ for (i = 0; i < cache->hashsize; i++) {
head = &cache->hashtable[i];
while (!hlist_empty(head)) {
cred = hlist_entry(head->first, struct rpc_cred, cr_hash);
@@ -213,6 +223,8 @@
if (cache) {
auth->au_credcache = NULL;
rpcauth_clear_credcache(cache);
+ if (cache->hashtable)
+ vfree(cache->hashtable);
kfree(cache);
}
}
@@ -291,7 +303,7 @@
*entry, *new;
unsigned int nr;
- nr = hash_long(acred->uid, RPC_CREDCACHE_HASHBITS);
+ nr = hash_long(acred->uid, cache->hashbits);
rcu_read_lock();
hlist_for_each_entry_rcu(entry, pos, &cache->hashtable[nr], cr_hash) {
@@ -568,19 +580,87 @@
test_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags) != 0;
}
+#ifdef RPC_DEBUG
+static int proc_credcache_hashsize(struct ctl_table *table, int write,
+ struct file *file, void __user *buffer,
+ size_t *length, loff_t *ppos)
+{
+ int tmp = credcache_hashsize;
+
+ table->data = &tmp;
+ table->maxlen = sizeof(int);
+ proc_dointvec(table, write, file, buffer, length, ppos);
+ if (write) {
+ if (tmp < RPC_CREDCACHE_MIN ||
+ tmp > RPC_CREDCACHE_MAX ||
+ !is_power_of_2(tmp))
+ return -EINVAL;
+ credcache_hashsize = tmp;
+ }
+ return 0;
+}
+
+static ctl_table sunrpc_credcache_knobs_table [] = {
+ {
+ .procname = "credcache_hashsize",
+ .data = NULL,
+ .mode = 0644,
+ .proc_handler = &proc_credcache_hashsize,
+ },
+ {
+ .ctl_name = 0,
+ }
+};
+
+static ctl_table sunrpc_credcache_table[] = {
+ {
+ .ctl_name = CTL_SUNRPC,
+ .procname = "sunrpc",
+ .mode = 0555,
+ .child = sunrpc_credcache_knobs_table,
+ },
+ {
+ .ctl_name = 0,
+ }
+};
+
+static struct ctl_table_header *sunrpc_credcache_table_header;
+#endif
+
static struct shrinker rpc_cred_shrinker = {
.shrink = rpcauth_cache_shrinker,
.seeks = DEFAULT_SEEKS,
};
-void __init rpcauth_init_module(void)
+int __init rpcauth_init_module(int hashsize)
{
- rpc_init_authunix();
- rpc_init_generic_auth();
+ int err;
+
+ if (hashsize) {
+ hashsize = min(hashsize, RPC_CREDCACHE_MAX);
+ hashsize = max(hashsize, RPC_CREDCACHE_MIN);
+ credcache_hashsize = rounddown_pow_of_two(hashsize);
+ printk(KERN_INFO "RPC: credcache hashtable size %d\n",
+ credcache_hashsize);
+ }
+
+ err = rpc_init_generic_auth();
+ if (err)
+ goto out;
+#ifdef RPC_DEBUG
+ sunrpc_credcache_table_header =
+ register_sysctl_table(sunrpc_credcache_table);
+#endif
register_shrinker(&rpc_cred_shrinker);
+out:
+ return err;
}
void __exit rpcauth_remove_module(void)
{
+#ifdef RPC_DEBUG
+ if (sunrpc_credcache_table_header)
+ unregister_sysctl_table(sunrpc_credcache_table_header);
+#endif
unregister_shrinker(&rpc_cred_shrinker);
}
diff -ruN linux-2.6.31-rc8-git2.orig/net/sunrpc/auth_generic.c linux-2.6.31-rc8-git2/net/sunrpc/auth_generic.c
--- linux-2.6.31-rc8-git2.orig/net/sunrpc/auth_generic.c 2009-08-28 02:59:04.000000000 +0200
+++ linux-2.6.31-rc8-git2/net/sunrpc/auth_generic.c 2009-09-03 12:29:45.000000000 +0200
@@ -26,7 +26,6 @@
};
static struct rpc_auth generic_auth;
-static struct rpc_cred_cache generic_cred_cache;
static const struct rpc_credops generic_credops;
/*
@@ -158,20 +157,16 @@
return 0;
}
-void __init rpc_init_generic_auth(void)
+int __init rpc_init_generic_auth(void)
{
- spin_lock_init(&generic_cred_cache.lock);
+ return rpcauth_init_credcache(&generic_auth);
}
void __exit rpc_destroy_generic_auth(void)
{
- rpcauth_clear_credcache(&generic_cred_cache);
+ rpcauth_destroy_credcache(&generic_auth);
}
-static struct rpc_cred_cache generic_cred_cache = {
- {{ NULL, },},
-};
-
static const struct rpc_authops generic_auth_ops = {
.owner = THIS_MODULE,
.au_name = "Generic",
@@ -182,7 +177,6 @@
static struct rpc_auth generic_auth = {
.au_ops = &generic_auth_ops,
.au_count = ATOMIC_INIT(0),
- .au_credcache = &generic_cred_cache,
};
static const struct rpc_credops generic_credops = {
diff -ruN linux-2.6.31-rc8-git2.orig/net/sunrpc/auth_unix.c linux-2.6.31-rc8-git2/net/sunrpc/auth_unix.c
--- linux-2.6.31-rc8-git2.orig/net/sunrpc/auth_unix.c 2009-08-28 02:59:04.000000000 +0200
+++ linux-2.6.31-rc8-git2/net/sunrpc/auth_unix.c 2009-09-03 12:29:45.000000000 +0200
@@ -28,15 +28,23 @@
#endif
static struct rpc_auth unix_auth;
-static struct rpc_cred_cache unix_cred_cache;
static const struct rpc_credops unix_credops;
static struct rpc_auth *
unx_create(struct rpc_clnt *clnt, rpc_authflavor_t flavor)
{
+ int err;
+
dprintk("RPC: creating UNIX authenticator for client %p\n",
clnt);
atomic_inc(&unix_auth.au_count);
+ if (!unix_auth.au_credcache) {
+ err = rpcauth_init_credcache(&unix_auth);
+ if (err) {
+ atomic_dec(&unix_auth.au_count);
+ return ERR_PTR(err);
+ }
+ }
return &unix_auth;
}
@@ -202,11 +210,6 @@
return p;
}
-void __init rpc_init_authunix(void)
-{
- spin_lock_init(&unix_cred_cache.lock);
-}
-
const struct rpc_authops authunix_ops = {
.owner = THIS_MODULE,
.au_flavor = RPC_AUTH_UNIX,
@@ -218,17 +221,12 @@
};
static
-struct rpc_cred_cache unix_cred_cache = {
-};
-
-static
struct rpc_auth unix_auth = {
.au_cslack = UNX_WRITESLACK,
.au_rslack = 2, /* assume AUTH_NULL verf */
.au_ops = &authunix_ops,
.au_flavor = RPC_AUTH_UNIX,
.au_count = ATOMIC_INIT(0),
- .au_credcache = &unix_cred_cache,
};
static
diff -ruN linux-2.6.31-rc8-git2.orig/net/sunrpc/sunrpc_syms.c linux-2.6.31-rc8-git2/net/sunrpc/sunrpc_syms.c
--- linux-2.6.31-rc8-git2.orig/net/sunrpc/sunrpc_syms.c 2009-08-28 02:59:04.000000000 +0200
+++ linux-2.6.31-rc8-git2/net/sunrpc/sunrpc_syms.c 2009-09-03 12:29:45.000000000 +0200
@@ -23,6 +23,7 @@
#include <linux/sunrpc/xprtsock.h>
extern struct cache_detail ip_map_cache, unix_gid_cache;
+static int hashsize;
static int __init
init_sunrpc(void)
@@ -31,13 +32,14 @@
if (err)
goto out;
err = rpc_init_mempool();
- if (err) {
- unregister_rpc_pipefs();
- goto out;
- }
+ if (err)
+ goto out_err1;
#ifdef RPC_DEBUG
rpc_register_sysctl();
#endif
+ err = rpcauth_init_module(hashsize);
+ if (err)
+ goto out_err2;
#ifdef CONFIG_PROC_FS
rpc_proc_init();
#endif
@@ -45,7 +47,14 @@
cache_register(&unix_gid_cache);
svc_init_xprt_sock(); /* svc sock transport */
init_socket_xprt(); /* clnt sock transport */
- rpcauth_init_module();
+ goto out;
+out_err2:
+ rpc_destroy_mempool();
+#ifdef RPC_DEBUG
+ rpc_unregister_sysctl();
+#endif
+out_err1:
+ unregister_rpc_pipefs();
out:
return err;
}
@@ -68,6 +77,8 @@
#endif
rcu_barrier(); /* Wait for completion of call_rcu()'s */
}
+module_param(hashsize, int, 0);
+MODULE_PARM_DESC(hashsize, "size of hashtables for credential caches");
MODULE_LICENSE("GPL");
module_init(init_sunrpc);
module_exit(cleanup_sunrpc);