* lockd using up 60% CPU and won't let go
@ 2008-09-29 16:46 Just Marc
From: Just Marc @ 2008-09-29 16:46 UTC
To: linux-nfs
Hi everyone,
Doing a seemingly innocent operation such as opening a file with vim on
a CFS (yes, that old crypto file system) NFS mount, lockd would wake up
and take 60% of my CPU away - probably doing nothing important but
certainly keeping the CPU busy, forever.
I use kernel 2.6.26 and kernel NFS. Some detail is available below:
$ grep nfs /proc/mounts
nfsd /proc/fs/nfsd nfsd rw 0 0localhost:/var/lib/cfs/.cfsfs /var/cfs nfs
rw,vers=2,rsize=8192,wsize=8192,namlen=255,hard,intr,proto=udp,timeo=11,retrans=3,sec=sys,addr=127.0.0.1
0 0
localhost:/var/lib/cfs/.cfsfs/x /var/cfs/x nfs
rw,vers=2,rsize=8192,wsize=8192,namlen=255,hard,intr,proto=udp,timeo=11,retrans=3,sec=sys,addr=127.0.0.1
0 0
$ egrep 'NFS|_LOCKD' .config
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFSD=y
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_ROOT_NFS=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
I noticed this a few weeks ago. I don't quite know what causes it, but
I certainly know how to trigger it. Stopping CFS and NFS completely
doesn't help - as soon as NFS is restarted, lockd starts eating CPU again
just like before.
I'd appreciate any hints on what I can do to find the root cause of the
problem and help get this bug out of the way.
Best,
Marc
* Re: lockd using up 60% CPU and won't let go
@ 2008-09-29 17:14 J. Bruce Fields
From: J. Bruce Fields @ 2008-09-29 17:14 UTC
To: Just Marc; +Cc: linux-nfs
On Mon, Sep 29, 2008 at 12:46:15PM -0400, Just Marc wrote:
> Doing a seemingly innocent operation such as opening a file with vim on
> a CFS (yes, that old crypto file system)
It's basically just a userspace NFS server, right?
> NFS mount, lockd would wake up
> and take 60% of my CPU away - probably doing nothing important but
> certainly keeping the CPU busy, forever.
Could you work around the problem by mounting with -onolock?
> I use kernel 2.6.26 and kernel NFS. Some detail is available below:
>
> $ grep nfs /proc/mounts
> nfsd /proc/fs/nfsd nfsd rw 0 0localhost:/var/lib/cfs/.cfsfs /var/cfs nfs
(Missing end-of-line before "localhost"?)
> rw,vers=2,rsize=8192,wsize=8192,namlen=255,hard,intr,proto=udp,timeo=11,retrans=3,sec=sys,addr=127.0.0.1
> 0 0
> localhost:/var/lib/cfs/.cfsfs/x /var/cfs/x nfs
> rw,vers=2,rsize=8192,wsize=8192,namlen=255,hard,intr,proto=udp,timeo=11,retrans=3,sec=sys,addr=127.0.0.1
> 0 0
>
> $ egrep 'NFS|_LOCKD' .config
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_NFS_FS=y
> CONFIG_NFS_V3=y
> CONFIG_NFS_V3_ACL=y
> CONFIG_NFS_V4=y
> CONFIG_NFSD=y
> CONFIG_NFSD_V2_ACL=y
> CONFIG_NFSD_V3=y
> CONFIG_NFSD_V3_ACL=y
> CONFIG_NFSD_V4=y
> CONFIG_ROOT_NFS=y
> CONFIG_LOCKD=y
> CONFIG_LOCKD_V4=y
> CONFIG_NFS_ACL_SUPPORT=y
> CONFIG_NFS_COMMON=y
>
> I noticed this a few weeks ago. I don't quite know what causes it, but
> I certainly know how to trigger it. Stopping CFS and NFS completely
> doesn't help - as soon as NFS is restarted, lockd starts eating CPU again
> just like before.
>
> I'd appreciate any hints on what I can do to find the root cause of the
> problem and help get this bug out of the way.
You might try running wireshark on the "lo" interface and seeing whether
there's any NLM traffic from lockd.
Or a sysrq-t trace ("echo t >/proc/sysrq-trigger", then look in the
logs) might show what lockd's doing.
--b.
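The two checks Bruce suggests can be run from a shell. What follows is a rough sketch and not something posted in the thread: `nlm_ports` is a hypothetical helper name, and the sketch assumes `rpcinfo` and `tcpdump` are installed. lockd registers with the portmapper as `nlockmgr`, so its port can be looked up rather than guessed. The sysrq-t dump can be large, so lockd's entry may have scrolled out of the `dmesg` buffer by the time you look.

```shell
# Hypothetical helper: read `rpcinfo -p` output on stdin and print the
# protocol and port of every nlockmgr (lockd/NLM) registration.
nlm_ports() {
    awk '$5 == "nlockmgr" { print $3, $4 }' | sort -u
}

# Usage (as root; substitute a port printed by the first command):
#   rpcinfo -p localhost | nlm_ports
#   tcpdump -n -i lo port <port>
#
# Dump all task stacks to the kernel log, then look for lockd's entry:
#   echo t > /proc/sysrq-trigger
#   dmesg | grep -A 20 lockd
```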
* Re: lockd using up 60% CPU and won't let go
@ 2008-09-30 0:18 Just Marc
From: Just Marc @ 2008-09-30 0:18 UTC
To: linux-nfs; +Cc: bfields
Hi,
> It's basically just a userspace NFS server, right?
Correct.
> Could you work around the problem by mounting with -onolock?
That doesn't seem to help.
> You might try running wireshark on the "lo" interface and seeing
> whether there's any NLM traffic from lockd.
You guessed right. There's 12 megabytes per second of NLM traffic on lo.
It's unlock call requests and unlock replies saying permission denied, and
it looks like it just repeats forever in a tight loop.
Marc
* Re: lockd using up 60% CPU and won't let go
@ 2008-09-30 12:25 Trond Myklebust
From: Trond Myklebust @ 2008-09-30 12:25 UTC
To: Just Marc; +Cc: linux-nfs, bfields
On Mon, 2008-09-29 at 20:18 -0400, Just Marc wrote:
> Hi,
>
> > It's basically just a userspace NFS server, right?
>
> Correct.
>
> > Could you work around the problem by mounting with -onolock?
>
> That doesn't seem to help.
>
> > You might try running wireshark on the "lo" interface and seeing
> > whether there's any NLM traffic from lockd.
>
> You guessed right. There's 12 megabytes per second of NLM traffic on lo.
>
> It's unlock call requests and unlock replies saying permission denied, and
> it looks like it just repeats forever in a tight loop.
As Bruce said, you need to mount with -onolock. Please unmount _all_
your cfs partitions, then mount them again with -onolock.
Note that -oremount,nolock will not work, and on some kernels, mounting
while you have the same cfs partition mounted somewhere else will cause
the kernel to use the 'old' mount options.
See /proc/mounts to find out which mount options the kernel is actually
using.
Cheers
Trond
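Trond's check of /proc/mounts can be scripted. This is a minimal sketch and not part of the thread: `nfs_without_nolock` is a hypothetical helper name, and it assumes the kernel lists `nolock` among the options in /proc/mounts when the mount uses it, which is worth confirming on your kernel version. The mount options to use when mounting again are not given in the thread, so only the unmount steps are shown.

```shell
# Hypothetical helper: print the mount point of every NFS mount in the
# given mounts file (default /proc/mounts) whose options lack "nolock".
nfs_without_nolock() {
    awk '$3 == "nfs" && $4 !~ /(^|,)nolock(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}

# Usage: unmount everything it lists (innermost first), mount again with
# nolock added to your usual CFS mount options, then re-run the check.
#   nfs_without_nolock
#   umount /var/cfs/x
#   umount /var/cfs
```

If the helper prints nothing after the fresh mount, every NFS mount is using nolock as far as the kernel reports it.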
* Re: lockd using up 60% CPU and won't let go
@ 2008-09-30 18:26 J. Bruce Fields
From: J. Bruce Fields @ 2008-09-30 18:26 UTC
To: Just Marc; +Cc: linux-nfs
On Mon, Sep 29, 2008 at 08:18:10PM -0400, Just Marc wrote:
> Hi,
>
> > It's basically just a userspace NFS server, right?
>
> Correct.
>
> > Could you work around the problem by mounting with -onolock?
>
> That doesn't seem to help.
>
> > You might try running wireshark on the "lo" interface and seeing
> > whether there's any NLM traffic from lockd.
>
> You guessed right. There's 12 megabytes per second of NLM traffic on lo.
>
> It's unlock call requests and unlock replies saying permission denied, and
> it looks like it just repeats forever in a tight loop.
Permission denied on an unlock sounds pretty weird--probably a bug on
the server (CFS) side. The client might be able to handle it
gracefully, but that's probably not a high priority.
So -onolock is the way to go; see Trond's suggestions. Locking will
still work, CFS just won't be told about it. That would only be a
problem if you had multiple NFS clients doing locking on the same CFS
filesystem, but you're only loopback-mounting, so only one client is
involved.
--b.