All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fengguang Wu <fengguang.wu@intel.com>
To: Andi Kleen <ak@linux.intel.com>
Cc: Linux-NFS <linux-nfs@vger.kernel.org>
Subject: Re: rpcauth_lookup_credcache() lock contentions
Date: Mon, 25 Jun 2012 10:42:11 +0800	[thread overview]
Message-ID: <20120625024211.GA11057@localhost> (raw)
In-Reply-To: <20120624213417.GI4152@tassilo.jf.intel.com>

On Sun, Jun 24, 2012 at 02:34:17PM -0700, Andi Kleen wrote:
> On Sat, Jun 23, 2012 at 08:26:04PM +0800, Fengguang Wu wrote:
> > Hi,
> > 
> > When doing parallel kernel builds on an NFSROOT system with 32 logical
> > CPUs, I see half CPU time spent in kernel:
> > 
> > top - 20:08:28 up  5:01,  8 users,  load average: 62.97, 55.82, 54.56
> > Tasks: 751 total,  80 running, 671 sleeping,   0 stopped,   0 zombie
> > Cpu(s):  0.1%us, 53.5%sy, 46.4%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
> > Mem:  65401132k total, 36455604k used, 28945528k free,        0k buffers
> > Swap:        0k total,        0k used,        0k free, 31858244k cached
> > 
> > The attached lock_stat shows that rpcauth_lookup_credcache() has the
> > outstanding contentions.
> 
> Hmm, the cache is already using RCU for the read case, just takes
> a lock for writing. Or when checking an entry.
> 
> So you either have a lot of lookups on the same entry, or you thrash
> the cache.
> 
> When you look up the look address, is it the one in the first loop?
> If yes then it's the first.
> 
> I'm not fully sure why it takes the lock in the read case anyways. Neither
> test_bit nor atomic_inc under RCU should need a lock, so I suppose
> it can be just removed.
> 
> 367                 spin_lock(&cache->lock);
> 368                 if (test_bit(RPCAUTH_CRED_HASHED, &entry->cr_flags) == 0) {
> 369                         spin_unlock(&cache->lock);
> 370                         continue;
> 371                 }
> 372                 cred = get_rpccred(entry);
> 373                 spin_unlock(&cache->lock);
> 
> and get_rpccred is just
> 
> 154 static inline
> 155 struct rpc_cred *       get_rpccred(struct rpc_cred *cred)
> 156 {
> 157         atomic_inc(&cred->cr_count);
> 158         return cred;
> 159 }
> 
> Can you try this patch?

Thank you! This patch brings %sys down to 24%, a pretty large improvement!

root@snb /home/wfg# vmstat 3
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
78  0      0 21294520      0 8822056    0    0     0     0   46   56  7  3 90  0
71  0      0 21312104      0 8832776    0    0     0     0 32831 14960 76 24  0  0
76  0      0 21203800      0 8844552    0    0     0     0 32892 15100 75 25  0  0
73  0      0 21019004      0 8856028    0    0     0     0 32837 14770 75 25  0  0
72  0      0 21021084      0 8865996    0    0     0     0 32841 14901 76 24  0  0
76  0      0 20907260      0 8876084    0    0     0     0 32678 14643 76 24  0  0
68  0      0 20936304      0 8886732    0    0     0     0 32724 14858 78 22  0  0
72  0      0 20889164      0 8897520    0    0     0     0 32732 14813 76 24  0  0
81  0      0 20742408      0 8907368    0    0     0     0 32745 14669 77 23  0  0
70  0      0 20662060      0 8917840    0    0     0     0 32748 14751 76 24  0  0
74  0      0 21035616      0 8927836    0    0     0     0 32781 14856 77 23  0  0
^C

It completely removes the function from lock_stat. Now the new
contented functions are

                              class name    con-bounces    contentions   waittime-min   waittime-max waittime-total    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

               &(&dentry->d_lock)->rlock:     197445146      201224451           0.10        2777.33   461033150.94     1345525784     6033591257           0.08         321.13  1158851748.52
               -------------------------
               &(&dentry->d_lock)->rlock      111609824          [<ffffffff811609f1>] dput+0x33/0x169
               &(&dentry->d_lock)->rlock       18819236          [<ffffffff81156b86>] path_get+0x2c/0x3c
               &(&dentry->d_lock)->rlock       17621335          [<ffffffff81156c20>] unlazy_walk+0x8a/0x1ac
               &(&dentry->d_lock)->rlock       34148964          [<ffffffff81160233>] dget_parent+0x4b/0xaf
               -------------------------
               &(&dentry->d_lock)->rlock       14495017          [<ffffffff81156b86>] path_get+0x2c/0x3c
               &(&dentry->d_lock)->rlock       89313409          [<ffffffff811609f1>] dput+0x33/0x169
               &(&dentry->d_lock)->rlock       19487816          [<ffffffff81156c20>] unlazy_walk+0x8a/0x1ac
               &(&dentry->d_lock)->rlock       59151508          [<ffffffff81160233>] dget_parent+0x4b/0xaf

...............................................................................................................................................................................................

               &sb->s_type->i_lock_key#1:      24313829       24330907           0.09         185.01    22864745.70      572924538     1422754983           0.08         244.15   852837006.69
               -------------------------
               &sb->s_type->i_lock_key#1       24263909          [<ffffffff81244ad9>] nfs_do_access+0x2b/0x350
               &sb->s_type->i_lock_key#1           1438          [<ffffffff8125069d>] nfs_unblock_sillyrename+0x9b/0xce
               &sb->s_type->i_lock_key#1          24986          [<ffffffff81247049>] nfs_file_set_open_context+0x3a/0x62
               &sb->s_type->i_lock_key#1          18689          [<ffffffff813f8285>] _atomic_dec_and_lock+0x39/0x50
               -------------------------
               &sb->s_type->i_lock_key#1       24280596          [<ffffffff81244ad9>] nfs_do_access+0x2b/0x350
               &sb->s_type->i_lock_key#1            578          [<ffffffff8125069d>] nfs_unblock_sillyrename+0x9b/0xce
               &sb->s_type->i_lock_key#1           8602          [<ffffffff813f8285>] _atomic_dec_and_lock+0x39/0x50
               &sb->s_type->i_lock_key#1          15649          [<ffffffff81247186>] nfs_release+0x47/0x93

Thanks,
Fengguang

> commit fa1eef2ec22f2fc31e0381b864044fbb753dd572
> Author: Andi Kleen <ak@linux.intel.com>
> Date:   Sun Jun 24 14:31:06 2012 -0700
> 
>     sunrpc: remove useless spinlocks in credential lookup path
>     
>     Fengguang noticed that rpcauth_lookup_credcache has high lock contention
>     on the nfs server when doing kernel builds on nfsroot.
>     
>     There is no reason to take the spinlock in the read loop: the RCU
>     makes sure the object does not go away, and either test_bit
>     nor atomic_inc in get_rpccred() needs a lock.
>     
>     So just remove the spinlock in the read lookup path.
>     
>     Reported-by: Fengguang Wu
>     Signed-off-by: Andi Kleen <ak@linux.intel.com>
> 
> diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
> index 727e506..13c2b58 100644
> --- a/net/sunrpc/auth.c
> +++ b/net/sunrpc/auth.c
> @@ -364,13 +364,10 @@ rpcauth_lookup_credcache(struct rpc_auth *auth, struct auth_cred * acred,
>  	hlist_for_each_entry_rcu(entry, pos, &cache->hashtable[nr], cr_hash) {
>  		if (!entry->cr_ops->crmatch(acred, entry, flags))
>  			continue;
> -		spin_lock(&cache->lock);
>  		if (test_bit(RPCAUTH_CRED_HASHED, &entry->cr_flags) == 0) {
> -			spin_unlock(&cache->lock);
>  			continue;
>  		}
>  		cred = get_rpccred(entry);
> -		spin_unlock(&cache->lock);
>  		break;
>  	}
>  	rcu_read_unlock();

  parent reply	other threads:[~2012-06-25  2:42 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20120623122604.GA10887@localhost>
2012-06-23 15:07 ` rpcauth_lookup_credcache() lock contentions Fengguang Wu
2012-06-24 21:34 ` Andi Kleen
2012-06-25  1:21   ` Myklebust, Trond
2012-06-25  2:45     ` Andi Kleen
2012-06-25  2:42   ` Fengguang Wu [this message]
2012-06-27 18:03     ` Andi Kleen
2012-06-27 18:36       ` Myklebust, Trond
2012-07-05 13:11       ` Fengguang Wu
2012-07-05 15:05         ` Malahal Naineni
2012-07-05 16:29           ` Andi Kleen
2012-07-05 16:27         ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120625024211.GA11057@localhost \
    --to=fengguang.wu@intel.com \
    --cc=ak@linux.intel.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.