linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Greg Banks <ghb-xTcybq6BZ68@public.gmane.org>
Cc: Linux NFS ML <linux-nfs@vger.kernel.org>
Subject: Re: [patch 22/29] knfsd: make svc_authenticate() scale
Date: Tue, 12 May 2009 17:24:47 -0400	[thread overview]
Message-ID: <20090512212447.GH20719@fieldses.org> (raw)
In-Reply-To: <20090331202946.068419000@sgi.com>

On Wed, Apr 01, 2009 at 07:28:22AM +1100, Greg Banks wrote:
> Replace the global spinlock which protects the table of registered
> RPC authentication flavours, with an RCU scheme.  The spinlock was
> taken by nfsd on every CPU for every NFS call, resulting in lots
> of spinlock contention and one very hot and bouncy cacheline.
> 
> Tests on a 16 CPU Altix A4700 with 2 10gige Myricom cards, configured
> separately (no bonding).  Workload is 640 client threads doing directory
> traverals with random small reads, from server RAM.
> 
> Before: 242 KIOPS, with an oprofile like:
>   %   cumulative   self              self     total
>  time   samples   samples    calls   1/call   1/call  name
>   5.01   2276.00  2276.00     2666     0.85     1.00  nfsd_ofcache_lookup
>   4.61   4370.00  2094.00     2092     1.00     1.00  ia64_spinlock_contention	<----
>   4.20   6279.00  1909.00     3141     0.61     0.78  svc_sock_enqueue
>   4.03   8108.00  1829.00     1824     1.00     1.00  spin_unlock_irqrestore
>   3.32   9618.00  1510.00     3588     0.42     1.00  spin_lock
> 
>              2090.00    0.00    2088/2092        spin_lock [22]
> [40]     4.6 2094.00    0.00    2092         ia64_spinlock_contention [40]
> 
>              1473.39 2039.32    3501/3588        svc_authenticate [21]
> [22]     7.9 1510.00 2090.00    3588         spin_lock [22]
> 
> After: 253 KIOPS, with a oprofile like:
>   %   cumulative   self              self     total
>  time   samples   samples    calls   1/call   1/call  name
>   5.20   2250.00  2250.00     2638     0.85     1.00  nfsd_ofcache_lookup
>   4.31   4117.00  1867.00     1863     1.00     1.00  spin_unlock_irqrestore
>   3.13   5470.00  1353.00     1447     0.94     1.01  svcauth_unix_set_client
>   2.79   6677.00  1207.00     1203     1.00     1.00  exp_readunlock
>   2.77   7875.00  1198.00     1186     1.01     1.01  svc_export_put
>   ...
>   0.03  43095.00    13.00       13     1.00     1.00  ia64_spinlock_contention	<----
> 
> Before anyone asks, going to a rwlock_t kept similar performance and
> just turned the time spent spinning on the lock to time spent waiting
> for the cacheline to bounce.
> 
> Signed-off-by: Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
> Reviewed-by: David Chinner <dgc@sgi.com>
> ---
> 
>  net/sunrpc/svcauth.c |   26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
> 
> Index: bfields/net/sunrpc/svcauth.c
> ===================================================================
> --- bfields.orig/net/sunrpc/svcauth.c
> +++ bfields/net/sunrpc/svcauth.c
> @@ -42,17 +42,19 @@ svc_authenticate(struct svc_rqst *rqstp,
>  	*authp = rpc_auth_ok;
>  
>  	flavor = svc_getnl(&rqstp->rq_arg.head[0]);
> +	if (flavor >= RPC_AUTH_MAXFLAVOR)
> +		return SVC_DENIED;
>  
>  	dprintk("svc: svc_authenticate (%d)\n", flavor);
>  
> -	spin_lock(&authtab_lock);
> -	if (flavor >= RPC_AUTH_MAXFLAVOR || !(aops = authtab[flavor])
> -			|| !try_module_get(aops->owner)) {
> -		spin_unlock(&authtab_lock);
> +	rcu_read_lock();
> +	aops = rcu_dereference(authtab[flavor]);
> +	if (!aops || !try_module_get(aops->owner)) {
> +		rcu_read_unlock();
>  		*authp = rpc_autherr_badcred;
>  		return SVC_DENIED;
>  	}
> -	spin_unlock(&authtab_lock);
> +	rcu_read_unlock();
>  
>  	rqstp->rq_authop = aops;
>  	return aops->accept(rqstp, authp);
> @@ -87,9 +89,13 @@ int
>  svc_auth_register(rpc_authflavor_t flavor, struct auth_ops *aops)
>  {
>  	int rv = -EINVAL;
> +
> +	if (flavor >= RPC_AUTH_MAXFLAVOR)
> +		return -EINVAL;
> +
>  	spin_lock(&authtab_lock);
> -	if (flavor < RPC_AUTH_MAXFLAVOR && authtab[flavor] == NULL) {
> -		authtab[flavor] = aops;
> +	if (authtab[flavor] == NULL) {
> +		rcu_assign_pointer(authtab[flavor], aops);
>  		rv = 0;
>  	}
>  	spin_unlock(&authtab_lock);
> @@ -100,9 +106,11 @@ EXPORT_SYMBOL_GPL(svc_auth_register);
>  void
>  svc_auth_unregister(rpc_authflavor_t flavor)
>  {
> +	if (flavor >= RPC_AUTH_MAXFLAVOR)
> +		return;
> +
>  	spin_lock(&authtab_lock);
> -	if (flavor < RPC_AUTH_MAXFLAVOR)
> -		authtab[flavor] = NULL;
> +	rcu_assign_pointer(authtab[flavor], NULL);

Despite having seen Paul McKenney explain rcu quite well at least a
couple times, I still have to go look at the documentation for these
functions....  Fortunately the documentation is good.

Looks like rcu_assign_pointer() is just a memory barrier, and doesn't
ensure, e.g, that this assignment won't happen during a read-side
critical section.  Don't we need more than that?  Maybe something like:

	rcu_assign_pointer(authtab[flavor], NULL);
	synchronize_rcu();

to ensure the aops doesn't go away before someone's even had a chance to
call try_module_get() on it?

--b.

>  	spin_unlock(&authtab_lock);
>  }
>  EXPORT_SYMBOL_GPL(svc_auth_unregister);
> 
> --
> Greg

  reply	other threads:[~2009-05-12 21:24 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-31 20:28 [patch 00/29] SGI enhancedNFS patches Greg Banks
2009-03-31 20:28 ` [patch 01/29] knfsd: Add infrastructure for measuring RPC service times Greg Banks
2009-04-25  2:13   ` J. Bruce Fields
2009-04-25  2:14     ` J. Bruce Fields
2009-04-25  2:52     ` Greg Banks
2009-03-31 20:28 ` [patch 02/29] knfsd: Add stats table infrastructure Greg Banks
2009-04-25  3:56   ` J. Bruce Fields
2009-04-26  4:12     ` Greg Banks
2009-03-31 20:28 ` [patch 03/29] knfsd: add userspace controls for stats tables Greg Banks
2009-04-25 21:57   ` J. Bruce Fields
2009-04-25 22:03     ` J. Bruce Fields
2009-04-27 16:06       ` Chuck Lever
2009-04-27 23:22         ` J. Bruce Fields
2009-04-28 15:37           ` Chuck Lever
2009-04-28 15:57             ` J. Bruce Fields
2009-04-28 16:03               ` Chuck Lever
2009-04-28 16:26                 ` J. Bruce Fields
2009-04-29  1:45               ` Greg Banks
     [not found]         ` <ac442c870904271827w6041a67ew82fe36a843beeac3@mail.gmail.com>
     [not found]           ` <ac442c870904271827w6041a67ew82fe36a843beeac3-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-04-28  1:31             ` Greg Banks
2009-04-26  4:14     ` Greg Banks
2009-03-31 20:28 ` [patch 04/29] knfsd: Add stats updating API Greg Banks
2009-03-31 20:28 ` [patch 05/29] knfsd: Infrastructure for providing stats to userspace Greg Banks
2009-04-01  0:28   ` J. Bruce Fields
2009-04-01  3:43     ` Greg Banks
2009-03-31 20:28 ` [patch 06/29] knfsd: Gather per-export stats Greg Banks
2009-03-31 20:28 ` [patch 07/29] knfsd: Prefetch the per-export stats entry Greg Banks
2009-03-31 20:28 ` [patch 08/29] knfsd: Gather per-client stats Greg Banks
2009-03-31 20:28 ` [patch 09/29] knfsd: Cache per-client stats entry on TCP transports Greg Banks
2009-03-31 20:28 ` [patch 10/29] knfsd: Update per-client & per-export stats from NFSv3 Greg Banks
2009-03-31 20:28 ` [patch 11/29] knfsd: Update per-client & per-export stats from NFSv2 Greg Banks
2009-03-31 20:28 ` [patch 12/29] knfsd: Update per-client & per-export stats from NFSv4 Greg Banks
2009-03-31 20:28 ` [patch 13/29] knfsd: reply cache cleanups Greg Banks
2009-05-12 19:54   ` J. Bruce Fields
2009-03-31 20:28 ` [patch 14/29] knfsd: better hashing in the reply cache Greg Banks
2009-05-08 22:01   ` J. Bruce Fields
2009-03-31 20:28 ` [patch 15/29] knfsd: fix reply cache memory corruption Greg Banks
2009-05-12 19:55   ` J. Bruce Fields
2009-03-31 20:28 ` [patch 16/29] knfsd: use client IPv4 address in reply cache hash Greg Banks
2009-05-11 21:48   ` J. Bruce Fields
2009-03-31 20:28 ` [patch 17/29] knfsd: make the reply cache SMP-friendly Greg Banks
2009-03-31 20:28 ` [patch 18/29] knfsd: dynamically expand the reply cache Greg Banks
2009-05-26 18:57   ` J. Bruce Fields
2009-05-26 19:04     ` J. Bruce Fields
2009-05-26 21:24     ` Rob Gardner
2009-05-26 21:52       ` J. Bruce Fields
2009-05-27  0:28       ` Greg Banks
2009-03-31 20:28 ` [patch 19/29] knfsd: faster probing in " Greg Banks
2009-03-31 20:28 ` [patch 20/29] knfsd: add extended reply cache stats Greg Banks
2009-03-31 20:28 ` [patch 21/29] knfsd: remove unreported filehandle stats counters Greg Banks
2009-05-12 20:00   ` J. Bruce Fields
2009-03-31 20:28 ` [patch 22/29] knfsd: make svc_authenticate() scale Greg Banks
2009-05-12 21:24   ` J. Bruce Fields [this message]
2009-03-31 20:28 ` [patch 23/29] knfsd: introduce SVC_INC_STAT Greg Banks
2009-03-31 20:28 ` [patch 24/29] knfsd: remove the program field from struct svc_stat Greg Banks
2009-03-31 20:28 ` [patch 25/29] knfsd: allocate svc_serv.sv_stats dynamically Greg Banks
2009-03-31 20:28 ` [patch 26/29] knfsd: make svc_serv.sv_stats per-CPU Greg Banks
2009-03-31 20:28 ` [patch 27/29] knfsd: move hot procedure count field out of svc_procedure Greg Banks
2009-03-31 20:28 ` [patch 28/29] knfsd: introduce NFSD_INC_STAT() Greg Banks
2009-03-31 20:28 ` [patch 29/29] knfsd: make nfsdstats per-CPU Greg Banks
2009-04-01  0:23 ` [patch 00/29] SGI enhancedNFS patches J. Bruce Fields
2009-04-01  3:32   ` Greg Banks
     [not found]     ` <ac442c870903312032t34630c6dvdbb644cb510f8079-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-04-01  6:34       ` Jeff Garzik
2009-04-01  6:41         ` Greg Banks

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090512212447.GH20719@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=ghb-xTcybq6BZ68@public.gmane.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).