From: Matthew Mitchell <matthew@geodev.com>
To: nfs@lists.sourceforge.net
Subject: Need help with NFSD hang in 2.4.20+NFS_ALL
Date: Tue, 10 Jun 2003 10:55:33 -0500
Message-ID: <3EE5FF75.5020104@geodev.com>
Everyone,
This morning I arrived at the office to find an NFS server hung up and
users whining at me.
It appeared that an updatedb launched from a cron job had gotten hung
up. Perhaps it caused some sort of overload, I'm not really sure.
System load was over 130, which is about what I expected given that we
had 128 nfs daemon threads, all of which were presumably waiting.
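The load figure fits the way Linux computes load average, which counts tasks in uninterruptible sleep (state D) as well as runnable ones. As a minimal sketch, assuming a standard Linux /proc layout, something like the following could confirm how many threads are blocked and which commands they belong to. The function names are made up for illustration.

```python
import os
from collections import Counter


def parse_stat(line):
    """Return (command, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the LAST ')'.
    head, _, tail = line.rpartition(")")
    comm = head.split("(", 1)[1]
    state = tail.split()[0]
    return comm, state


def count_blocked(proc_root="/proc"):
    """Count tasks in uninterruptible sleep ('D'), grouped by command."""
    blocked = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state = parse_stat(f.read())
        except (OSError, IndexError):
            continue  # task exited, or unreadable entry, while scanning
        if state == "D":
            blocked[comm] += 1
    return blocked


if __name__ == "__main__":
    for comm, n in count_blocked().most_common():
        print("%4d %s" % (n, comm))
```

If nearly all of the blocked tasks turn out to be nfsd, that would support the guess that the daemon threads were waiting on the disk rather than doing work.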
Since nothing was responding (couldn't touch the affected disk, couldn't
successfully sync), I tried to start a "graceful" shutdown, and the
exportfs -ua step hung. A strace -p of that process showed it hung up in
nfsservctl(0x4, <some address>, 0)
on the first mount listed in /var/lib/nfs/xtab. Nothing I could do
would make it move, and it wouldn't get any closer to shutdown so I had
to cycle the power. Joy.
Now, some background info: the NFS-shared partition is a
loopback-mounted reiserfs partition whose backing file sits on a big
SW-raid volume. (It's every bit as awful as it sounds.) I don't
think NFS is necessarily the culprit here but it did seize up in the
most painful way.
There were some messages that looked like they were from lockd in the
ring buffer but (I see now) they never got written to the messages file.
Damn.
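For next time, a minimal sketch of saving the ring buffer before a power cycle, assuming dmesg is on the PATH and the destination sits on a filesystem that is not itself hung. The helper name and the directory /root/hang-snapshots are hypothetical.

```python
import os
import subprocess
import time


def snapshot(cmd, dest_dir):
    """Run cmd and write its output to a timestamped file in dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    name = "%s-%s.log" % (os.path.basename(cmd[0]),
                          time.strftime("%Y%m%d-%H%M%S"))
    path = os.path.join(dest_dir, name)
    # Capture stderr too, so a failure message ends up in the file.
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    with open(path, "wb") as f:
        f.write(result.stdout)
    return path


if __name__ == "__main__":
    # Hypothetical destination; pick a disk unaffected by the hang.
    print(snapshot(["dmesg"], "/root/hang-snapshots"))
```

This only helps if the write to the destination does not block as well, so a local disk separate from the exported volume is the safer choice.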
Does it sound plausible, to anyone who might know, that the system
just hiccuped under the load of the updatedb process? That's not
exactly good, but I can easily prevent it from running again...
More germane to this list: if I find this hung up again, is there
anything I can do to diagnose the problem? I don't know whether changing
the value of /proc/sys/sunrpc/nfsd_debug would have any effect, but if
someone suggests a good value I will try it.
This is an SMP box.
I'd greatly appreciate any help or suggestions or even questions to try
to figure out what is going on.
--
Matthew Mitchell
Systems Programmer/Administrator matthew@geodev.com
Geophysical Development Corporation phone 713 782 1234
1 Riverway Suite 2100, Houston, TX 77056 fax 713 782 1829