Linux NFS development
From: Matthew Mitchell <matthew@geodev.com>
To: nfs@lists.sourceforge.net
Subject: Need help with NFSD hang in 2.4.20+NFS_ALL
Date: Tue, 10 Jun 2003 10:55:33 -0500
Message-ID: <3EE5FF75.5020104@geodev.com>

Everyone,

This morning I arrived at the office to find an NFS server hung up and 
users whining at me.

It appeared that an updatedb launched from a cron job had hung.  Perhaps
it caused some sort of overload; I'm not really sure.  System load was
over 130, which is about what I would expect given that we run 128 nfsd
threads, all of which were presumably blocked.
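
A rough way to check that reading of the load figure: the Linux load
average counts tasks in uninterruptible sleep (state D) as well as
runnable ones (state R), so 128 blocked nfsd threads alone would account
for most of it.  The sketch below tallies task states from
/proc/<pid>/stat.  The helper names are made up for illustration, and it
assumes /proc is still readable on the hung machine.

```python
import glob


def task_state(stat_text):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split after the LAST closing parenthesis.
    return stat_text[stat_text.rindex(")") + 1:].split()[0]


def count_states(stat_texts):
    """Tally how many tasks are in each state (R, S, D, ...)."""
    counts = {}
    for text in stat_texts:
        state = task_state(text)
        counts[state] = counts.get(state, 0) + 1
    return counts


def read_stat_files():
    """Read every /proc/<pid>/stat, skipping tasks that exit meanwhile."""
    texts = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path, errors="replace") as handle:
                texts.append(handle.read())
        except OSError:
            continue  # the process went away between glob and open
    return texts


if __name__ == "__main__":
    counts = count_states(read_stat_files())
    print("runnable (R):       ", counts.get("R", 0))
    print("uninterruptible (D):", counts.get("D", 0))
```

If the D count is close to the number of nfsd threads, the load number
is mostly blocked I/O rather than CPU contention.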

Since nothing was responding (I couldn't touch the affected disk and
couldn't successfully sync), I tried to start a "graceful" shutdown, and
the exportfs -ua step hung.  An strace -p of that process showed it
stuck in

	nfsservctl(0x4, <some address>, 0)

on the first export listed in /var/lib/nfs/xtab.  Nothing I did would
make it move, and it wouldn't get any closer to shutdown, so I had to
cycle the power.  Joy.

Now, some background info: the NFS-exported filesystem is a
loopback-mounted reiserfs image whose backing file sits on a big
software-RAID volume.  (It's every bit as awful as it sounds.)  I don't
think NFS is necessarily the culprit here, but it did seize up in the
most painful way.

There were some messages in the kernel ring buffer that looked like they
came from lockd, but (I see now) they never got written to the messages
file.  Damn.

Does it sound plausible to anyone who knows this code that the system
simply fell over under the load of the updatedb process?  That's not
exactly good, but I can easily prevent it from running again...

More germane to this list: if I find this hung up again, is there
anything I can do to diagnose the problem?  I don't know whether changing
the value of /proc/sys/sunrpc/nfsd_debug would have any effect at that
point, but if someone suggests a good value I will try it.
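
One thing I can at least do next time is save the evidence before
rebooting, since the lockd messages were lost this time.  A minimal
sketch, assuming dmesg and a procps-style ps are installed and that the
destination directory (the hypothetical /root/hang-logs below) lives on
a filesystem that is not part of the hang:

```python
import subprocess
import time
from pathlib import Path


def save_output(cmd, dest_dir, label):
    """Run cmd and write its output to dest_dir/<label>-<timestamp>.log."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    out_path = Path(dest_dir) / f"{label}-{stamp}.log"
    # Capture stderr together with stdout so error text is kept too.
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out_path.write_bytes(result.stdout)
    return out_path


if __name__ == "__main__":
    # Hypothetical location; pick a disk unaffected by the hang.
    dest = Path("/root/hang-logs")
    dest.mkdir(parents=True, exist_ok=True)
    # Kernel ring buffer, which holds the messages that never reached syslog.
    print(save_output(["dmesg"], dest, "dmesg"))
    # Process list with state and kernel wait channel for each task.
    print(save_output(["ps", "-eo", "pid,stat,wchan:30,args"], dest, "ps"))
```

The wchan column should show where each nfsd thread is sleeping in the
kernel, which may be more useful to this list than my description.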

This is an SMP box.

I'd greatly appreciate any help or suggestions or even questions to try 
to figure out what is going on.

-- 
Matthew Mitchell
Systems Programmer/Administrator            matthew@geodev.com
Geophysical Development Corporation         phone 713 782 1234
1 Riverway Suite 2100, Houston, TX  77056     fax 713 782 1829



