* [PATCH] sunrpc: use supplimental groups in auth hash.
@ 2017-10-24 0:29 NeilBrown
2017-10-25 16:13 ` Chuck Lever
0 siblings, 1 reply; 3+ messages in thread
From: NeilBrown @ 2017-10-24 0:29 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker; +Cc: Linux NFS Mailing List
[-- Attachment #1: Type: text/plain, Size: 2956 bytes --]
Some sites vary some supplimental groups a lot.
To avoid unduely long hash chains, include all of these
in the hash calculation.
Signed-off-by: NeilBrown <neilb@suse.com>
---
Hi,
I have a customer who had thousands of auth entries with the same
uid and gid, so the hashtable was unbalanced and some lookups were
noticeably slow. This fix helped.
Relatedly, I wonder if we should set a default auth_max_cred_cachesize,
and nfs_access_max_cachesize - smaller than ULONG_MAX.
For auth_max_cred_cachesize, a default of e.g. 256*(1<<auth_hashbits)
is appropriate I think. If there are more auth entries than that,
you'll start to notice lookups taking a long time.
For nfs_access_max_cachesize we want a similar limit as each access
entry pins an auth entry and so keeps a hash chain long.
Or maybe we could change the auth lookup to use an rbtree (or
hash-table of rbtrees?) so that time scales with log of size.
One option is to restore the 'jiffies' field to the access cache, and
discard entries older than (say) 10 minutes.
Thoughts?
Thanks,
NeilBrown
net/sunrpc/auth_generic.c | 15 ++++++++++++---
net/sunrpc/auth_unix.c | 12 +++++++++---
2 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/net/sunrpc/auth_generic.c b/net/sunrpc/auth_generic.c
index f1df9837f1ac..035f7c5f184f 100644
--- a/net/sunrpc/auth_generic.c
+++ b/net/sunrpc/auth_generic.c
@@ -81,9 +81,18 @@ static struct rpc_cred *generic_bind_cred(struct rpc_task *task,
static int
generic_hash_cred(struct auth_cred *acred, unsigned int hashbits)
{
- return hash_64(from_kgid(&init_user_ns, acred->gid) |
- ((u64)from_kuid(&init_user_ns, acred->uid) <<
- (sizeof(gid_t) * 8)), hashbits);
+ int ret = hash_32(from_kuid(&init_user_ns, acred->uid), 32);
+
+ if (acred->group_info) {
+ int g;
+
+ for (g = 0; g < acred->group_info->ngroups; g++)
+ ret = hash_32(ret ^
+ from_kgid(&init_user_ns,
+ acred->group_info->gid[g]),
+ 32);
+ }
+ return hash_32(ret ^ from_kgid(&init_user_ns, acred->gid), hashbits);
}
/*
diff --git a/net/sunrpc/auth_unix.c b/net/sunrpc/auth_unix.c
index 82337e1ec9cd..20cd4ab3b339 100644
--- a/net/sunrpc/auth_unix.c
+++ b/net/sunrpc/auth_unix.c
@@ -47,9 +47,15 @@ unx_destroy(struct rpc_auth *auth)
static int
unx_hash_cred(struct auth_cred *acred, unsigned int hashbits)
{
- return hash_64(from_kgid(&init_user_ns, acred->gid) |
- ((u64)from_kuid(&init_user_ns, acred->uid) <<
- (sizeof(gid_t) * 8)), hashbits);
+ int ret = hash_32(from_kuid(&init_user_ns, acred->uid), 32);
+ if (acred->group_info) {
+ int g;
+
+ for (g = 0; g < acred->group_info->ngroups && g < UNX_NGROUPS; g++)
+ ret = hash_32(ret ^ from_kgid(&init_user_ns,
+ acred->group_info->gid[g]), 32);
+ }
+ return hash_32(ret ^ from_kgid(&init_user_ns, acred->gid), hashbits);
}
/*
--
2.14.0.rc0.dirty
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] sunrpc: use supplimental groups in auth hash.
2017-10-24 0:29 [PATCH] sunrpc: use supplimental groups in auth hash NeilBrown
@ 2017-10-25 16:13 ` Chuck Lever
2017-10-25 21:19 ` NeilBrown
0 siblings, 1 reply; 3+ messages in thread
From: Chuck Lever @ 2017-10-25 16:13 UTC (permalink / raw)
To: NeilBrown; +Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List
> On Oct 23, 2017, at 8:29 PM, NeilBrown <neilb@suse.com> wrote:
>
>
> Some sites vary some supplimental groups a lot.
> To avoid unduely long hash chains, include all of these
> in the hash calculation.
>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>
> Hi,
> I have a customer who had thousands of auth entries with the same
> uid and gid, so the hashtable was unbalanced and some lookups were
> noticeably slow. This fix helped.
>
> Relatedly, I wonder if we should set a default auth_max_cred_cachesize,
> and nfs_access_max_cachesize - smaller than ULONG_MAX.
>
> For auth_max_cred_cachesize, a default of e.g. 256*(1<<auth_hashbits)
> is appropriate I think. If there are more auth entries than that,
> you'll start to notice lookups taking a long time.
>
> For nfs_access_max_cachesize we want a similar limit as each access
> entry pins an auth entry and so keeps a hash chain long.
>
> Or maybe we could change the auth lookup to use an rbtree (or
> hash-table of rbtrees?) so that time scales with log of size.
>
> One option is to restore the 'jiffies' field to the access cache, and
> discard entries older than (say) 10 minutes.
>
> Thoughts?
Seems like we are always revisiting the hash function, and
there are always workloads where long chains occur. Since
the lookup-to-insertion ratio is very high, maybe it's time
to consider a data structure that self-balances on insertion,
like an rb-tree.
Would it make sense to protect the cache with a rwlock
instead of spin lock to allow concurrent readers?
Is there any sane way to make the cred cache into a set of
per CPU caches instead, to reduce the need for locking?
> Thanks,
> NeilBrown
>
>
> net/sunrpc/auth_generic.c | 15 ++++++++++++---
> net/sunrpc/auth_unix.c | 12 +++++++++---
> 2 files changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/net/sunrpc/auth_generic.c b/net/sunrpc/auth_generic.c
> index f1df9837f1ac..035f7c5f184f 100644
> --- a/net/sunrpc/auth_generic.c
> +++ b/net/sunrpc/auth_generic.c
> @@ -81,9 +81,18 @@ static struct rpc_cred *generic_bind_cred(struct rpc_task *task,
> static int
> generic_hash_cred(struct auth_cred *acred, unsigned int hashbits)
> {
> - return hash_64(from_kgid(&init_user_ns, acred->gid) |
> - ((u64)from_kuid(&init_user_ns, acred->uid) <<
> - (sizeof(gid_t) * 8)), hashbits);
> + int ret = hash_32(from_kuid(&init_user_ns, acred->uid), 32);
> +
> + if (acred->group_info) {
> + int g;
> +
> + for (g = 0; g < acred->group_info->ngroups; g++)
> + ret = hash_32(ret ^
> + from_kgid(&init_user_ns,
> + acred->group_info->gid[g]),
> + 32);
> + }
> + return hash_32(ret ^ from_kgid(&init_user_ns, acred->gid), hashbits);
> }
>
> /*
> diff --git a/net/sunrpc/auth_unix.c b/net/sunrpc/auth_unix.c
> index 82337e1ec9cd..20cd4ab3b339 100644
> --- a/net/sunrpc/auth_unix.c
> +++ b/net/sunrpc/auth_unix.c
> @@ -47,9 +47,15 @@ unx_destroy(struct rpc_auth *auth)
> static int
> unx_hash_cred(struct auth_cred *acred, unsigned int hashbits)
> {
> - return hash_64(from_kgid(&init_user_ns, acred->gid) |
> - ((u64)from_kuid(&init_user_ns, acred->uid) <<
> - (sizeof(gid_t) * 8)), hashbits);
> + int ret = hash_32(from_kuid(&init_user_ns, acred->uid), 32);
> + if (acred->group_info) {
> + int g;
> +
> + for (g = 0; g < acred->group_info->ngroups && g < UNX_NGROUPS; g++)
> + ret = hash_32(ret ^ from_kgid(&init_user_ns,
> + acred->group_info->gid[g]), 32);
> + }
> + return hash_32(ret ^ from_kgid(&init_user_ns, acred->gid), hashbits);
> }
>
> /*
> --
> 2.14.0.rc0.dirty
>
--
Chuck Lever
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] sunrpc: use supplimental groups in auth hash.
2017-10-25 16:13 ` Chuck Lever
@ 2017-10-25 21:19 ` NeilBrown
0 siblings, 0 replies; 3+ messages in thread
From: NeilBrown @ 2017-10-25 21:19 UTC (permalink / raw)
To: Chuck Lever; +Cc: Trond Myklebust, Anna Schumaker, Linux NFS Mailing List
[-- Attachment #1: Type: text/plain, Size: 2317 bytes --]
On Wed, Oct 25 2017, Chuck Lever wrote:
>> On Oct 23, 2017, at 8:29 PM, NeilBrown <neilb@suse.com> wrote:
>>
>>
>> Some sites vary some supplimental groups a lot.
>> To avoid unduely long hash chains, include all of these
>> in the hash calculation.
>>
>> Signed-off-by: NeilBrown <neilb@suse.com>
>> ---
>>
>> Hi,
>> I have a customer who had thousands of auth entries with the same
>> uid and gid, so the hashtable was unbalanced and some lookups were
>> noticeably slow. This fix helped.
>>
>> Relatedly, I wonder if we should set a default auth_max_cred_cachesize,
>> and nfs_access_max_cachesize - smaller than ULONG_MAX.
>>
>> For auth_max_cred_cachesize, a default of e.g. 256*(1<<auth_hashbits)
>> is appropriate I think. If there are more auth entries than that,
>> you'll start to notice lookups taking a long time.
>>
>> For nfs_access_max_cachesize we want a similar limit as each access
>> entry pins an auth entry and so keeps a hash chain long.
>>
>> Or maybe we could change the auth lookup to use an rbtree (or
>> hash-table of rbtrees?) so that time scales with log of size.
>>
>> One option is to restore the 'jiffies' field to the access cache, and
>> discard entries older than (say) 10 minutes.
>>
>> Thoughts?
>
> Seems like we are always revisiting the hash function, and
> there are always workloads where long chains occur. Since
> the lookup-to-insertion ratio is very high, maybe it's time
> to consider a data structure that self-balances on insertion,
> like an rb-tree.
I'm certainly open to that possibility.
>
> Would it make sense to protect the cache with a rwlock
> instead of spin lock to allow concurrent readers?
I doubt if rwlocks are ever really useful. If there is noticeable
contention between readers, then a lockless/RCU based approach should be
preferred.
>
> Is there any sane way to make the cred cache into a set of
> per CPU caches instead, to reduce the need for locking?
Encouraged by your outside-the-box thinking, I have another question.
Do we need the auth cache at all? What is it actually caching?
For gss, I can easily imagine there is something worth storing in a
cache, but what value do the 'generic' and 'auth_unix' caches provide?
Does anyone know?
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-10-25 21:19 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-24 0:29 [PATCH] sunrpc: use supplimental groups in auth hash NeilBrown
2017-10-25 16:13 ` Chuck Lever
2017-10-25 21:19 ` NeilBrown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).