Linux NFS development
* lockd using up 60% CPU and won't let go
From: Just Marc @ 2008-09-29 16:46 UTC
  To: linux-nfs

Hi everyone,

Doing a seemingly innocent operation such as opening a file with vim on 
a CFS (yes, that old crypto file system) NFS mount, lockd would wake up 
and take 60% of my CPU away - probably doing nothing important but 
certainly keeping the CPU busy, forever.

I use kernel 2.6.26 and kernel NFS.   Some detail is available below:
 
$ grep nfs /proc/mounts
nfsd /proc/fs/nfsd nfsd rw 0 0localhost:/var/lib/cfs/.cfsfs /var/cfs nfs 
rw,vers=2,rsize=8192,wsize=8192,namlen=255,hard,intr,proto=udp,timeo=11,retrans=3,sec=sys,addr=127.0.0.1 
0 0
localhost:/var/lib/cfs/.cfsfs/x /var/cfs/x nfs 
rw,vers=2,rsize=8192,wsize=8192,namlen=255,hard,intr,proto=udp,timeo=11,retrans=3,sec=sys,addr=127.0.0.1 
0 0

$ egrep 'NFS|_LOCKD' .config
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFSD=y
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_ROOT_NFS=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y

I noticed this a few weeks ago.  I don't know what causes it, but I 
certainly know how to trigger it.   Stopping CFS and NFS completely 
doesn't help - as soon as NFS is restarted lockd starts eating CPU again 
just like before.

I'd appreciate any hints on what I can do to find the root cause of the 
problem and help get this bug out of the way.

Best,
Marc



* Re: lockd using up 60% CPU and won't let go
From: J. Bruce Fields @ 2008-09-29 17:14 UTC
  To: Just Marc; +Cc: linux-nfs

On Mon, Sep 29, 2008 at 12:46:15PM -0400, Just Marc wrote:
> Doing a seemingly innocent operation such as opening a file with vim on  
> a CFS (yes, that old crypto file system)

It's basically just a userspace NFS server, right?

> NFS mount, lockd would wake up  
> and take 60% of my CPU away - probably doing nothing important but  
> certainly keeping the CPU busy, forever.

Could you work around the problem by mounting with -onolock?

> I use kernel 2.6.26 and kernel NFS.   Some detail is available below:
>
> $ grep nfs /proc/mounts
> nfsd /proc/fs/nfsd nfsd rw 0 0localhost:/var/lib/cfs/.cfsfs /var/cfs nfs  

(Missing end-of-line before "localhost"?)

> rw,vers=2,rsize=8192,wsize=8192,namlen=255,hard,intr,proto=udp,timeo=11,retrans=3,sec=sys,addr=127.0.0.1 
> 0 0
> localhost:/var/lib/cfs/.cfsfs/x /var/cfs/x nfs  
> rw,vers=2,rsize=8192,wsize=8192,namlen=255,hard,intr,proto=udp,timeo=11,retrans=3,sec=sys,addr=127.0.0.1 
> 0 0
>
> $ egrep 'NFS|_LOCKD' .config
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_NFS_FS=y
> CONFIG_NFS_V3=y
> CONFIG_NFS_V3_ACL=y
> CONFIG_NFS_V4=y
> CONFIG_NFSD=y
> CONFIG_NFSD_V2_ACL=y
> CONFIG_NFSD_V3=y
> CONFIG_NFSD_V3_ACL=y
> CONFIG_NFSD_V4=y
> CONFIG_ROOT_NFS=y
> CONFIG_LOCKD=y
> CONFIG_LOCKD_V4=y
> CONFIG_NFS_ACL_SUPPORT=y
> CONFIG_NFS_COMMON=y
>
> I noticed this a few weeks ago.  I don't know what causes it, but I  
> certainly know how to trigger it.   Stopping CFS and NFS completely  
> doesn't help - as soon as NFS is restarted lockd starts eating CPU again  
> just like before.
>
> I'd appreciate any hints on what I can do to find the root cause of the  
> problem and help get this bug out of the way.

You might try running wireshark on the "lo" interface and seeing whether
there's any NLM traffic from lockd.

Or a sysrq-t trace ("echo t >/proc/sysrq-trigger", then look in the
logs) might show what lockd's doing.

--b.
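
A minimal Python sketch of the sysrq-t suggestion above. It assumes root privileges, a kernel with magic SysRq enabled, and that the task dump can be read back as text (for example from the output of `dmesg`). The helper names `request_task_dump` and `lines_near` are made up for illustration. The layout of the dump differs between kernel versions, so the filter does not try to parse it; it only keeps each line that mentions the task, plus a few lines after it.

```python
import re


def request_task_dump(trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel to log the state of every task (same effect as `echo t`)."""
    with open(trigger_path, "w") as trigger:
        trigger.write("t")


def lines_near(log_text, task="lockd", following=15):
    """Return one group per line mentioning `task`: that line plus what follows it."""
    lines = log_text.splitlines()
    pattern = re.compile(r"\b" + re.escape(task) + r"\b")
    groups = []
    for index, line in enumerate(lines):
        if pattern.search(line):
            # Keep the matching line and up to `following` lines after it.
            groups.append(lines[index:index + 1 + following])
    return groups
```

One way to use it would be to call `request_task_dump()` a few times, some seconds apart, and pass the kernel log text to `lines_near()` each time to see whether lockd keeps showing up in the same place.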


* Re: lockd using up 60% CPU and won't let go
From: Just Marc @ 2008-09-30  0:18 UTC
  To: linux-nfs; +Cc: bfields

Hi,

 > It's basically just a userspace NFS server, right?

Correct.

 > Could you work around the problem by mounting with -onolock?

That doesn't seem to help.

 > You might try running wireshark on the "lo" interface and seeing
 > whether there's any NLM traffic from lockd.

You guessed right.  There's 12 megabytes per second of NLM traffic on lo.

It's unlock call requests and unlock replies saying permission denied; it
looks like this just repeats forever in a tight loop.

Marc


* Re: lockd using up 60% CPU and won't let go
From: Trond Myklebust @ 2008-09-30 12:25 UTC
  To: Just Marc; +Cc: linux-nfs, bfields

On Mon, 2008-09-29 at 20:18 -0400, Just Marc wrote:
> Hi,
> 
>  > It's basically just a userspace NFS server, right?
> 
> Correct.
> 
>  > Could you work around the problem by mounting with -onolock?
> 
> That doesn't seem to help.
> 
>  > You might try running wireshark on the "lo" interface and seeing
>  > whether there's any NLM traffic from lockd.
> 
> You guessed right.  There's 12 megabytes per second of NLM traffic on lo.
> 
> It's unlock call requests and unlock replies saying permission denied; it
> looks like this just repeats forever in a tight loop.

As Bruce said, you need to mount with -onolock. Please unmount _all_
your cfs partitions, then mount them again with -onolock.

Note that -oremount,nolock will not work, and on some kernels, mounting
while you have the same cfs partition mounted somewhere else will cause
the kernel to use the 'old' mount options.
See /proc/mounts to find out which mount options the kernel is actually
using.

Cheers
  Trond
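
To follow the advice about checking /proc/mounts, a Python sketch along these lines could list the NFS mounts that are still going through lockd. It assumes the kernel reports the option literally as `nolock` in the fourth field of each line. `nfs_mounts_without_nolock` is a hypothetical helper name, not part of any NFS tooling.

```python
def nfs_mounts_without_nolock(mounts_text):
    """Return (mount_point, options) for every nfs mount lacking 'nolock'."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete /proc/mounts entry
        _device, mount_point, fstype, options = fields[:4]
        if fstype != "nfs":
            continue  # skips nfsd, nfs4 and every non-NFS filesystem
        option_list = options.split(",")
        if "nolock" not in option_list:
            found.append((mount_point, option_list))
    return found


def check_live_mounts(path="/proc/mounts"):
    """Read the kernel's view of the mounts and report the ones to fix."""
    with open(path) as mounts:
        return nfs_mounts_without_nolock(mounts.read())
```

An empty list from `check_live_mounts()` after unmounting and remounting would suggest that every NFS mount picked up the new option. Any entry still listed is being served with the old options.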



* Re: lockd using up 60% CPU and won't let go
From: J. Bruce Fields @ 2008-09-30 18:26 UTC
  To: Just Marc; +Cc: linux-nfs

On Mon, Sep 29, 2008 at 08:18:10PM -0400, Just Marc wrote:
> Hi,
>
> > It's basically just a userspace NFS server, right?
>
> Correct.
>
> > Could you work around the problem by mounting with -onolock?
>
> That doesn't seem to help.
>
> > You might try running wireshark on the "lo" interface and seeing
> > whether there's any NLM traffic from lockd.
>
> You guessed right.  There's 12 megabytes per second of NLM traffic on lo.
>
> It's unlock call requests and unlock replies saying permission denied; it
> looks like this just repeats forever in a tight loop.

Permission denied on an unlock sounds pretty weird--probably a bug on
the server (CFS) side.  The client might be able to handle it
gracefully, but that's probably not a high priority.

So -onolock is the way to go; see Trond's suggestions.  Locking will
still work, CFS just won't be told about it.  That would only be a
problem if you had multiple NFS clients doing locking on the same CFS
filesystem, but you're only loopback-mounting, so only one client is
involved.

--b.
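
The point that locking still works locally can be checked with a small lock/unlock cycle. This is only a sketch in Python. `lock_unlock_cycle` is a made-up helper name, and the path passed to it is whatever test file you choose on the mount. On an NFSv2/v3 mount without nolock, a cycle like this is sent to the server as NLM lock and unlock calls. With nolock it is handled entirely on the client.

```python
import fcntl


def lock_unlock_cycle(path):
    """Take and release an exclusive POSIX lock on `path`; True if both worked."""
    # "a+" opens read/write and creates the file if needed; an exclusive
    # lock needs a descriptor that is open for writing.
    with open(path, "a+") as handle:
        try:
            # Non-blocking, so a conflicting lock is reported instead of hanging.
            fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        fcntl.lockf(handle, fcntl.LOCK_UN)
        return True
```

Running it against a scratch file under the CFS mount while watching "lo" in wireshark should show whether any NLM traffic is still generated after the remount.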
