From: Frank Steiner <fsteiner-mail@bio.ifi.lmu.de>
To: nfs@lists.sourceforge.net
Subject: server freeze with 2.6.8.1
Date: Wed, 08 Sep 2004 09:32:52 +0200
Message-ID: <413EB5A4.9040505@bio.ifi.lmu.de>

Hi,

I know that the following description is not helpful for reproducing
the freeze, but I'm hoping that someone might have encountered something
similar or has an idea...

Our NFS server serves about 60 clients, 25 of them with root-over-nfs,
the rest just gets some stuff like /home etc. We recently switched from
2.4.21 to 2.6.8.1 and use self-compiled kernel rpms (based on the SuSE
rpms). We installed some kernel upgrades in the last weeks, all 2.6.8.1 based,
but enhanced by some security fixes or patches like packet writing etc.
So we got versions 2.6.8.1-1 to -3 which differed just in the security
fixes nfsd-xdr-patch, reiserfs-xattr-acl.patch, as well as the cdwriting-
patch. So basically the same kernels.

During these updates and reboots we encountered some mysterious server
freezes. They are not 100% reproducible, but when they happened we
had always installed a new kernel rpm on the server (in parallel to the
old ones, so that diskless clients keep the needed /lib/modules/ until they
reboot) and then either

- rebooted several clients in parallel to the new kernel while the
   server is still running the older version.

- or had the server (and some clients) run the new version and then
   rebooted some clients which are still running the old version.

We also had one situation where a user logged in to a client with KDE and
the server froze. The first time this happened, the client had just
booted to the new version that the server was already running. The freeze
then happened 4 times in a row when the user logged in, until we cleaned
up all KDE-related files and directories in that user's home (hosted on
the NFS server).

When the freeze occurs, the NFS server does not give any message on
/dev/tty10. No kernel oops or anything similar. Sometimes, when I'm quick
enough, I can still switch between consoles, e.g., from tty10 to tty1 and
back, but trying e.g. an emergency sync will then freeze the server
completely, so that not even alt+sysrq+b will work. The last messages
I see in /var/log/messages are always the messages that a client has
mounted the NFS directories.

We are using nfs v3 with tcp,hard,intr,lock.
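
For anyone who wants to compare their own clients against this setup, here
is a minimal Python sketch that checks which of the options named above each
NFS mount actually shows. It assumes the usual /proc/mounts field layout
(device, mount point, fstype, options, ...). The function names are made up
for illustration, and the way the kernel spells the options (e.g. "tcp" vs.
"proto=tcp") differs between kernel versions, so the wanted list may need
adjusting.

```python
def parse_nfs_mounts(mounts_text):
    """Return {mount_point: set_of_options} for every NFS entry."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed layout: device mountpoint fstype options dump pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        result[fields[1]] = set(fields[3].split(","))
    return result


def missing_options(mounts_text, wanted):
    """Map each NFS mount point to the wanted options it does not show."""
    report = {}
    for mount_point, options in parse_nfs_mounts(mounts_text).items():
        missing = sorted(set(wanted) - options)
        if missing:
            report[mount_point] = missing
    return report
```

On a client, one would pass the contents of /proc/mounts as mounts_text and
something like ["tcp", "hard", "intr"] as wanted; an empty result means every
NFS mount shows all of the wanted options.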

We had the same problem already when running the official SuSE kernel
2.4.21-xxx (never before with 2.4.19 and 2.4.20), where the NFS server
would freeze the same way. That happened to NFS servers running on
i386 and on IBM pSeries (SLES 8), but not as often.

Now that I have turned on /proc/sys/sunrpc/rpc_debug and
/proc/sys/sunrpc/nfsd_debug, I could (of course) not reproduce the freeze
again by booting back and forth with the different versions.
Maybe it happens only after the server has run for some days (kind
of a pollution effect...?).

I can try to keep /proc/sys/sunrpc/rpc_debug and nfsd_debug running,
although it causes so many messages that the disk performance
of the server really goes down. Would those debugging messages help
at all if the freeze occurred again?
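
One way to limit the log volume might be to switch the debug flags on only
for a short window, e.g. while clients are being rebooted, and to switch them
off again afterwards. A minimal Python sketch of that idea follows. It assumes
that the files under /proc/sys/sunrpc accept a decimal bitmask and that
writing 0 turns the messages off; the mask 65535 for "all flags" is an
assumption, and the helper names are hypothetical. It has to run as root on
the server.

```python
import contextlib
import os

SUNRPC_DIR = "/proc/sys/sunrpc"
ALL_FLAGS = 65535  # assumption: a mask that sets every debug bit


def write_flag(name, value, base=SUNRPC_DIR):
    """Write a decimal debug mask to one sunrpc debug switch."""
    with open(os.path.join(base, name), "w") as handle:
        handle.write(f"{value}\n")


@contextlib.contextmanager
def sunrpc_debug(names=("rpc_debug", "nfsd_debug"), mask=ALL_FLAGS,
                 base=SUNRPC_DIR):
    """Turn the given debug switches on, and always turn them off again."""
    enabled = []
    try:
        for name in names:
            write_flag(name, mask, base)
            enabled.append(name)
        yield
    finally:
        # Write 0 rather than restoring an earlier value; this sketch
        # does not try to read back what was set before.
        for name in enabled:
            write_flag(name, 0, base)
```

Wrapping the risky step in "with sunrpc_debug(): ..." keeps the flags on only
for that step. If the server freezes inside the window, the flags obviously
stay on, but by then the messages of interest should already be in the log.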

Or is there something else I can try? Has anyone seen something similar?

cu,
Frank

-- 
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. Bioinformatik    Mail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17           Phone: +49 89 2180-4049
80333 Muenchen, Germany       Fax:   +49 89 2180-99-4049
* Rekursion kann man erst verstehen, wenn man Rekursion verstanden hat. *


