From: Harry Edmon <harry-qmPYOCrcNLLyFCzt5hm0YvZ8FUJU4vz8@public.gmane.org>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Max Kellermann <max-hDT0AjmEH7RAfugRpC6u6w@public.gmane.org>,
linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: Re: High load in 2.6.27, NFS / rpcauth_lookup_credcache()?
Date: Mon, 15 Dec 2008 15:44:58 -0800
Message-ID: <4946EBFA.60700@atmos.washington.edu>
In-Reply-To: <1224773745.7625.4.camel@localhost>
Trond Myklebust wrote:
> On Thu, 2008-10-23 at 14:36 +0200, Max Kellermann wrote:
>
>> On 2008/10/22 11:12, Max Kellermann <max-hDT0AjmEH7RAfugRpC6u6w@public.gmane.org> wrote:
>>
>>> after I was able to fix http://lkml.org/lkml/2008/10/17/147, the
>>> server which was already upgraded to 2.6.27.2 still gets very high
>>> load. It is a web server with NFS file storage (NetApp), and while
>>> the others in the cluster (kernel 2.6.25) have a load of 1-3, 2.6.27.2
>>> gets 30-50.
>>>
>>> I did an oprofile, with the following results (server just started,
>>> load "only" 5-10):
>>>
>>> 87593  56.1116  (no location information)  vmlinux      vmlinux      rpcauth_lookup_credcache
>>> 16037  10.2732  auth_generic.c:0           vmlinux      vmlinux      generic_match
>>>  6460   4.1382  (no location information)  php4         php4         (no symbols)
>>>  2478   1.5874  (no location information)  libc-2.7.so  libc-2.7.so  (no symbols)
>>> [...]
>>>
>>> We haven't configured any special authentication method. It is an NFSv3
>>> over UDP mount, but the kernel has NFSv4 and therefore KRB5 enabled.
>>>
>>> Any ideas why rpcauth_lookup_credcache() goes overboard with CPU
>>> usage?
>>>
>> I have bisected the problem: 98a8e323 is the result ("SUNRPC: Add a
>> helper rpcauth_lookup_generic_cred()"). 5c691044 is ok.
>>
>> See the attached oprofile annotation data for both commits. I guess
>> that the function rpcauth_lookup_credcache() is waiting for a spinlock
>> too often and too long. Trond, any idea?
>>
>
> Can you add a '-v' to the rpc.gssd daemon startup line? I'd like to see
> how often you are creating new gss contexts.
>
>
>> Harry: added you to Cc because your problem sounds similar.
>>
>
> Harry's problem should be unrelated. afaik, he is seeing a problem
> with userland RPC code, not kernel rpc code.
>
> Trond
>
>
I am finally getting some time to look at my problem that I originally
reported in October (SUNRPC problem with 2.6.26 and beyond), and I am
seeing the same behavior as Max Kellermann when my machine slows as I
described earlier. The system in question is currently running
2.6.27.7. Here is what I see when it is misbehaving:
samples   %        image name  app name  symbol name
11380517  57.4191  sunrpc.ko   sunrpc    rpcauth_lookup_credcache
 3263657  16.4664  sunrpc.ko   sunrpc    generic_match
 1081287   5.4555  vmlinux     vmlinux   copy_user_generic_string
  499407   2.5197  vmlinux     vmlinux   __posix_lock_file
[...]
And here is what I see when I stop the programs that are chewing up all
the system time, and then start them up again:
samples   %        image name  app name  symbol name
 6372650  21.7978  vmlinux     vmlinux   copy_user_generic_string
 5401386  18.4755  sunrpc.ko   sunrpc    rpcauth_lookup_credcache
 3018753  10.3257  vmlinux     vmlinux   __posix_lock_file
 1050095   3.5919  sunrpc.ko   sunrpc    generic_match
I am not using Kerberos with NFSv4 (i.e. no rpc.gssd). Did you ever
find a solution for this problem with rpcauth_lookup_credcache?
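To compare the two profiles at a glance, the share of samples attributed to
sunrpc.ko can be totalled with a short script. This is only a minimal sketch:
it assumes each opreport row has been joined onto a single line with five
whitespace-separated columns (samples, %, image name, app name, symbol name),
and the helper name sunrpc_share is hypothetical, not part of oprofile.

```python
def sunrpc_share(report: str) -> float:
    """Sum the '%' column for rows whose image name is sunrpc.ko."""
    total = 0.0
    for line in report.splitlines():
        fields = line.split()
        # Expected columns: samples, %, image name, app name, symbol name.
        # Skip headers, "[...]" markers, and anything else that does not fit.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        if fields[2] == "sunrpc.ko":
            total += float(fields[1])
    return total


# Rows copied from the two profiles above, one row per line.
misbehaving = """\
11380517 57.4191 sunrpc.ko sunrpc rpcauth_lookup_credcache
3263657 16.4664 sunrpc.ko sunrpc generic_match
1081287 5.4555 vmlinux vmlinux copy_user_generic_string
499407 2.5197 vmlinux vmlinux __posix_lock_file
"""

after_restart = """\
6372650 21.7978 vmlinux vmlinux copy_user_generic_string
5401386 18.4755 sunrpc.ko sunrpc rpcauth_lookup_credcache
3018753 10.3257 vmlinux vmlinux __posix_lock_file
1050095 3.5919 sunrpc.ko sunrpc generic_match
"""

print(f"misbehaving:   {sunrpc_share(misbehaving):.2f}% in sunrpc.ko")
print(f"after restart: {sunrpc_share(after_restart):.2f}% in sunrpc.ko")
```

Only the rows listed above are counted, so the totals cover the top symbols
rather than the whole profile.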
--
Dr. Harry Edmon E-MAIL: harry-qmPYOCrcNLLyFCzt5hm0YvZ8FUJU4vz8@public.gmane.org
206-543-0547 harry-B93hV6UPU7Z2icitjWtXSw@public.gmane.org
Dept of Atmospheric Sciences FAX: 206-543-0308
University of Washington, Box 351640, Seattle, WA 98195-1640