From: Patrick McLean <chutz@cs.mcgill.ca>
To: Neil Brown <neilb@suse.de>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Andrew Bogecho <andrewb@cs.mcgill.ca>
Subject: Re: NFS locking
Date: Wed, 10 May 2006 21:46:52 -0400
Message-ID: <4462978C.6080809@cs.mcgill.ca>
In-Reply-To: <17506.33247.884320.387785@cse.unsw.edu.au>

Neil Brown wrote:
> On Wednesday May 10, chutz@cs.mcgill.ca wrote:
>> We have an NFS server here with a fairly high load. The clients are
>> Linux, FreeBSD and Solaris. The exported filesystem is XFS, which is on
>> an LVM drive. After between 3 and 30 days it seems that locking
>> completely stops working; clients generally either error or simply lock
>> up when they try to lock a file. The only way to fix it seems to be a
>> reboot.
> 
> Reboot the client or the server?
> 

The server. Rebooting the clients had no effect.

>> The last time it happened was on 2.6.17-rc2; it started around 2.6.15.
>>
>> There is nothing in the dmesg on the server. The (Linux) clients
>> print this in the dmesg when something tries to create a lock:
>>
>> lockd: server xxx.xxx.xxx.xxx not responding, still trying
>> lockd: server xxx.xxx.xxx.xxx not responding, still trying
> 
> Sounds like the server has locked up.
> What does 'ps' on the server show for 'lockd'?  Is it in 'D'?  What is
> the 'wchan'?  Are any 'nfsd's permanently in 'D'?
> 
> Try
>  echo t > /proc/sysrq-trigger
> 
> and see what the stack trace for lockd is - probably only useful if it
> is in 'D'.
> 
> Maybe a 'tcpdump -s 1500' of traffic between client and server would
> help.

We have already rebooted the server this time around. We will get the stack
trace and a tcpdump from a client the next time it happens.

I do seem to remember, though, that lockd was in the "D" state on the server
when it happened this afternoon. Restarting the NFS service on the server did
spawn a new lockd process, but did not fix the problem.
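
For the next occurrence, the 'ps' check suggested above could be scripted
roughly as follows. This is only a sketch: it assumes the procps version of
'ps' on the server, and the function name blocked_nfs_threads is hypothetical.
It lists lockd and nfsd threads whose state starts with 'D', along with their
wchan.

```shell
# Filter "PID STAT WCHAN COMMAND" lines on stdin and keep only lockd/nfsd
# threads in uninterruptible sleep (state beginning with 'D').
# The ps header line is dropped automatically because "STAT" does not
# start with 'D'.
blocked_nfs_threads() {
    awk '$2 ~ /^D/ && ($4 == "lockd" || $4 == "nfsd") { print $1, $2, $3, $4 }'
}

# Intended use on the server (assumes procps ps):
#   ps -eo pid,stat,wchan:32,comm | blocked_nfs_threads
```

If this prints anything, that would be the moment to run the sysrq 't' dump
and save the lockd stack trace from dmesg before rebooting.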

Thread overview: 3+ messages
2006-05-10 20:02 NFS locking Patrick McLean
2006-05-11  0:14 ` Neil Brown
2006-05-11  1:46   ` Patrick McLean [this message]
