From: Steve Dickson <SteveD@redhat.com>
To: John Nitis <jnitis@ati.com>
Cc: "'nfs@lists.sourceforge.net'" <nfs@lists.sourceforge.net>
Subject: Re: RHEL3 Update 1 and NetApp NFS freeze issues
Date: Fri, 10 Sep 2004 07:12:12 -0400
Message-ID: <41418C0C.1030405@RedHat.com>
In-Reply-To: <5D5AD1BFE69EDE4283AE056291B89499074861ED@ca00exh03.ca.atitech.com>

John Nitis wrote:

>Greetings,
>
>This is a bit of a stab in the dark but I thought this might be a good forum
>to ask for input as it has both Linux NFS experts and a NetApp expert who
>are regular contributors.  We are unsure what the cause of the problem is
>(even whether it's hardware or software) but one of our focuses is NFS
>hangs.
>
>Our problem is this, essentially we have an entire rack of machines (57 of
>them) that lock up on a very regular basis.  They respond to ping but do not
>respond to telnet, ssh, etc.  When you plug in a VGA monitor/PS2 keyboard
>the screen pops up but you can't login.  When you hit enter it just echoes
>back a linefeed on the screen.  A small percentage of them kernel panic
>
Set up netdump so that when an oops occurs, a system image (or core) will be
created. Then use the crash utility to examine the core. This will give you
a wealth of information on what is going on in the system.
(Note: You'll have to install the matching kernel-debuginfo package for this
to work.)

When the system just hangs, make sure the Alt-SysRq keys are enabled
(by doing an "echo 1 > /proc/sys/kernel/sysrq"). Then use:
Alt-SysRq-P to see where the current process is executing (register dump)
Alt-SysRq-T to get a stack trace of every task
Alt-SysRq-M to get memory information
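
Since the keys only help if they were enabled before the hang, it may be
worth checking the setting on each box ahead of time. A minimal sketch:
the function name sysrq_status and its optional file argument are made up
for illustration (the argument only exists so the function can be tried
against a sample file). It treats "0" as disabled and any other value as
enabled, which is an assumption that also covers kernels that store a
bitmask there.

```shell
#!/bin/sh
# sysrq_status [FILE]: print "enabled", "disabled" or "unknown".
# FILE defaults to /proc/sys/kernel/sysrq (hypothetical helper, not a
# standard tool).
sysrq_status() {
    file="${1:-/proc/sys/kernel/sysrq}"
    if [ ! -r "$file" ]; then
        # The file is missing or unreadable, e.g. SysRq support not built in.
        echo "unknown"
        return 0
    fi
    value=$(cat "$file")
    if [ "$value" = "0" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}
```

Calling sysrq_status with no argument on a live machine reports the
current setting; if it says "disabled", the echo above turns it on until
the next reboot.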

>We have "top" and "ps augxww" output logging to a file once per minute and
>some of them show excessive load averages before they freeze with many
>processes stuck in D (uninterruptible sleep or disk wait).  If you catch
>these before the load average gets too high you can tell that a mount has
>locked up (df hangs after displaying a few mounts and you can't access the
>mount that's locked up).  Each new process that gets stuck adds 1 to the
>load average.  The machine locks up in exactly the same way when we yank the
>Ethernet cable from the box.
>  
>
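
The once-per-minute ps logging described above could also be reduced to a
single number to make the build-up easier to spot. A rough sketch, assuming
"ps aux"-style output where STAT is the eighth column; the function name
count_d_state is invented for this example.

```shell
#!/bin/sh
# count_d_state: read "ps aux"-style output on stdin and print how many
# processes are in D (uninterruptible sleep) state.
# Assumes the first line is the header and STAT is column 8.
count_d_state() {
    awk 'NR > 1 && $8 ~ /^D/ { n++ } END { print n + 0 }'
}
```

For example, "ps auxww | count_d_state" could be appended to the same log
each minute; a count that keeps climbing would line up with a mount that
has stopped responding.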
Before things go south, does ifconfig ethX show any interface errors?
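
If it is easier to log than to eyeball, the same counters that ifconfig
prints can be read from /proc/net/dev. A minimal sketch, assuming the usual
two header lines followed by one line per interface; the function name
iface_errors and its optional second argument (a stats file, used only for
trying it against sample data) are hypothetical.

```shell
#!/bin/sh
# iface_errors IFACE [FILE]: print receive/transmit error and drop
# counters for one interface. FILE defaults to /proc/net/dev.
iface_errors() {
    awk -v want="$1" '
        NR > 2 {
            # Older kernels print "eth0:1234" with no space after the colon,
            # so split the name from the first counter before reading fields.
            sub(/:/, " ")
            if ($1 == want)
                printf "%s rx_errs=%s rx_drop=%s tx_errs=%s tx_drop=%s\n",
                       $1, $4, $5, $12, $13
        }' "${2:-/proc/net/dev}"
}
```

Running "iface_errors eth0" once a minute alongside the top and ps logging
would show whether the counters start climbing before a freeze.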

>Does anyone have any ideas as to what might be the problem or how we might
>go about debugging it further?  I've recently set the debugging levels to
>"10" in /proc/sys/sunrpc/rpc_debug and /proc/sys/sunrpc/nfs_debug to see if
>that will garner some information.  A few details follow below.
>
>  
>
If you're using autofs/amd, turn it off (if you can) to see what happens.

I hope this helps....

SteveD.


