* Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients
@ 2004-03-18 21:09 Chip Salzenberg
2004-03-19 0:08 ` Trond Myklebust
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Chip Salzenberg @ 2004-03-18 21:09 UTC
To: Neil Brown; +Cc: nfs
Neil (or someone), what should I tell this user?
----- Forwarded message from "Steinar H. Gunderson" <sgunderson@bigfoot.com> -----
Subject: Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients
From: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
To: Debian Bug Tracking System <submit@bugs.debian.org>
Date: Wed, 03 Mar 2004 00:41:21 +0100
X-Mailer: reportbug 1.50
Package: nfs-kernel-server
Version: 1:1.0-2woody1
Severity: important
It appears that from time to time, our NFS servers (both 2.4 and 2.6
servers, but both running woody) go into very high load with almost
no traffic (i.e. 20-30 connections, but almost no file activity, as
confirmed by tcpdump). The load average is typically in the 7-9 range, and the
clients in question seem to hang almost indefinitely (e.g. 20 minutes
for a simple ls). However, top shows no processes wanting CPU time, so
it almost looks like some kind of I/O starvation problem.
In addition, we seem to get strange errors like:
00:16:32.330039 129.241.93.186 > 129.241.93.30: icmp: ip reassembly time exceeded [tos 0xc0]
(.30 is the NFS server, .186 is one of the NFS clients)
Something is clearly wrong here; stopping nfs-kernel-server makes the
load drop to zero almost immediately, and substituting nfs-user-server for
nfs-kernel-server also fixes the problem. The servers in question are
also NFS clients, but there are no stale mounts and we aren't using NFS
re-export.
These problems seem to coincide with the rollout of Linux 2.6.x (seen
the problem with both 2.6.1 and 2.6.3) on the clients, so it seems
plausible that something in the Linux 2.6 client is triggering a problem
in the NFS kernel server code. I'm a bit unsure if I should file this on
nfs-kernel-server or on a kernel package; feel free to reassign as
needed.
-- System Information
Debian Release: 3.0
Architecture: i386
Kernel: Linux cassarossa 2.4.25 #1 SMP Wed Feb 18 22:46:21 CET 2004 i686
Locale: LANG=en_US, LC_CTYPE=en_US.ISO8859-1
Versions of packages nfs-kernel-server depends on:
ii debconf 1.2.35 Debian configuration management sy
ii libc6 2.2.5-11.5 GNU C Library: Shared libraries an
ii libwrap0 7.6-9 Wietse Venema's TCP wrappers libra
ii nfs-common 1:1.0-2woody1 NFS support files common to client
----- End forwarded message -----
--
Chip Salzenberg - a.k.a. - <chip@pobox.com>
"I wanted to play hopscotch with the impenetrable mystery of existence,
but he stepped in a wormhole and had to go in early." // MST3K
-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
* Re: Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients
2004-03-18 21:09 Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients Chip Salzenberg
@ 2004-03-19 0:08 ` Trond Myklebust
2004-03-19 8:48 ` Olaf Kirch
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Trond Myklebust @ 2004-03-19 0:08 UTC
To: Chip Salzenberg; +Cc: Neil Brown, nfs
On Thu, 18/03/2004 at 16:09, Chip Salzenberg wrote:
> Neil (or someone), what should I tell this user?
>
I typically see the "ip reassembly time exceeded" in situations where
the machine is dropping fragments due to missed interrupts.
What's probably happening here is that because the 2.6 clients can cache
huge amounts of writes before everything needs to be written out at
close() time, the server is being overwhelmed...
Cheers
Trond
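One rough way to check whether a machine really is discarding fragments, as suggested above, is to look at the kernel's IP reassembly counters. The sketch below is not something proposed in this thread: it assumes the two-line "Ip:" layout of /proc/net/snmp on Linux (a line of field names followed by a line of values), and the function name reasm_counters is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the IP reassembly counters (the fields whose
# names start with "Reasm") from a /proc/net/snmp-style file.
# Assumption: the first "Ip:" line holds field names, the second the values.
reasm_counters() {
    snmp_file="${1:-/proc/net/snmp}"
    awk '
        $1 == "Ip:" && !seen_header {
            # First "Ip:" line: remember the name of every column.
            for (i = 2; i <= NF; i++) name[i] = $i
            seen_header = 1
            next
        }
        $1 == "Ip:" {
            # Second "Ip:" line: print the reassembly-related columns only.
            for (i = 2; i <= NF; i++)
                if (name[i] ~ /^Reasm/) print name[i] "=" $i
            exit
        }
    ' "$snmp_file"
}
```

Calling reasm_counters twice, a few seconds apart, while the load is high and comparing the ReasmFails values would give a hint as to whether fragments are still being lost.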
* Re: Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients
2004-03-18 21:09 Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients Chip Salzenberg
2004-03-19 0:08 ` Trond Myklebust
@ 2004-03-19 8:48 ` Olaf Kirch
2004-03-19 8:49 ` Olaf Kirch
2004-03-22 0:25 ` Neil Brown
3 siblings, 0 replies; 7+ messages in thread
From: Olaf Kirch @ 2004-03-19 8:48 UTC
To: Chip Salzenberg; +Cc: Neil Brown, nfs
On Thu, Mar 18, 2004 at 04:09:43PM -0500, Chip Salzenberg wrote:
> Neil (or someone), what should I tell this user?
They should enable NFS and RPC debugging on the client when this problem
occurs, put the bzipped logs somewhere and send a pointer to this list.
To turn on debugging, do this:
echo 65535 > /proc/sys/sunrpc/nfs_debug
echo 65535 > /proc/sys/sunrpc/rpc_debug
If something in the RPC client is going berserk, these logs will probably
grow like crazy. It may be helpful to kill syslog and run
"cat /proc/kmsg > /tmp/nfs.log" directly, or even use "head -10000"
instead of cat.
Olaf
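The debugging steps described above could be wrapped up roughly as follows. This is only a sketch: the function names and the optional directory and source-file arguments are invented here (they exist so the functions can be tried against scratch files), and writing 0 to clear the masks afterwards is the usual convention rather than something stated in this message.

```shell
#!/bin/sh
# Set both debug masks; a mask of 0 switches debugging off again.
# The optional second argument is a hypothetical override of the sysctl
# directory, useful only for dry runs against a scratch directory.
set_nfs_rpc_debug() {
    mask="$1"
    sysctl_dir="${2:-/proc/sys/sunrpc}"
    echo "$mask" > "$sysctl_dir/nfs_debug" &&
        echo "$mask" > "$sysctl_dir/rpc_debug"
}

# Copy at most max_lines lines of kernel log into out_file, then stop,
# so a runaway RPC client cannot fill the disk.
capture_kernel_log() {
    max_lines="$1"
    out_file="$2"
    log_source="${3:-/proc/kmsg}"
    head -n "$max_lines" "$log_source" > "$out_file"
}

# Possible sequence on the client, as root, while the hang is in progress
# (syslog stopped first, so it does not compete for /proc/kmsg):
#   set_nfs_rpc_debug 65535
#   capture_kernel_log 10000 /tmp/nfs.log
#   set_nfs_rpc_debug 0
#   bzip2 /tmp/nfs.log
```

Bounding the capture with head rather than cat keeps the log to a size that can still be compressed and posted.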
--
Olaf Kirch | Stop wasting entropy - start using predictable
okir@suse.de | tempfile names today!
---------------+
* Re: Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients
2004-03-18 21:09 Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients Chip Salzenberg
2004-03-19 0:08 ` Trond Myklebust
2004-03-19 8:48 ` Olaf Kirch
@ 2004-03-19 8:49 ` Olaf Kirch
2004-03-19 11:53 ` Bernd Schubert
2004-03-22 0:25 ` Neil Brown
3 siblings, 1 reply; 7+ messages in thread
From: Olaf Kirch @ 2004-03-19 8:49 UTC
To: Chip Salzenberg; +Cc: Neil Brown, nfs
One more thing:
> Something is clearly wrong here; stopping nfs-kernel-server makes the
> load drop to zero almost immediately, and substituting nfs-user-server for
> nfs-kernel-server also fixes the problem.
This probably means the problem is in the NFSv3 client code. The major
difference between knfsd and unfsd is that the latter is v2 only.
Olaf
--
Olaf Kirch | Stop wasting entropy - start using predictable
okir@suse.de | tempfile names today!
---------------+
* Re: Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients
2004-03-19 8:49 ` Olaf Kirch
@ 2004-03-19 11:53 ` Bernd Schubert
2004-03-19 12:03 ` Olaf Kirch
0 siblings, 1 reply; 7+ messages in thread
From: Bernd Schubert @ 2004-03-19 11:53 UTC
To: Olaf Kirch; +Cc: nfs
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Friday 19 March 2004 09:49, Olaf Kirch wrote:
> One more thing:
> > Something is clearly wrong here; stopping nfs-kernel-server makes the
> > load drop to zero almost immediately, and substituting nfs-user-server
> > for nfs-kernel-server also fixes the problem.
>
> This probably means the problem is in the NFSv3 client code. The major
> difference between knfsd and unfsd is that the latter is v2 only.
>
Now there's also unfs3 (http://unfs3.sourceforge.net/), which supports, as the
name suggests, v3.
Cheers,
Bernd
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
iD8DBQFAWt8/C8BUnAF+ydYRAhCVAJ48HQ8l8o/ZKZBhaXFBQxv4VvMoNQCeIsv0
+cn/FO2oRl/XFf9v6Ktf+kE=
=jWh4
-----END PGP SIGNATURE-----
* Re: Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients
2004-03-19 11:53 ` Bernd Schubert
@ 2004-03-19 12:03 ` Olaf Kirch
0 siblings, 0 replies; 7+ messages in thread
From: Olaf Kirch @ 2004-03-19 12:03 UTC
To: Bernd Schubert; +Cc: nfs
On Fri, Mar 19, 2004 at 12:53:30PM +0100, Bernd Schubert wrote:
> Now there's also unfs3 (http://unfs3.sourceforge.net/), which supports as the
> name suggests v3.
Which I think has a striking resemblance to unfsd in some parts of the
code but neglects to acknowledge that, both in the copyright statements
and the READMEs...
Olaf
--
Olaf Kirch | Stop wasting entropy - start using predictable
okir@suse.de | tempfile names today!
---------------+
* Re: Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients
2004-03-18 21:09 Debian Bug#235886: nfs-kernel-server inducing load of 8-9 with no good reason for Linux 2.6 clients Chip Salzenberg
` (2 preceding siblings ...)
2004-03-19 8:49 ` Olaf Kirch
@ 2004-03-22 0:25 ` Neil Brown
3 siblings, 0 replies; 7+ messages in thread
From: Neil Brown @ 2004-03-22 0:25 UTC
To: Chip Salzenberg; +Cc: nfs
On Thursday March 18, chip@pobox.com wrote:
>
> Neil (or someone), what should I tell this user?
>
"use tcp" might be a good answer.... especially if it works :-)
NeilBrown
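The reasoning behind "use tcp" is that over UDP a large NFS request is split into IP fragments, and losing any one fragment loses the whole datagram, whereas TCP does not depend on IP reassembly. A quick way to see which mounts on a client would be affected is sketched below. It assumes the /proc/mounts field layout and that the transport appears in the options as either a bare "udp" or "proto=udp"; the function name nfs_udp_mounts and the paths in the comments are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: list the mount points of NFS mounts whose option
# string names UDP as the transport ("udp" or "proto=udp").
nfs_udp_mounts() {
    mounts_file="${1:-/proc/mounts}"
    awk '
        $3 == "nfs" {
            # Surround the option list with commas so that every option,
            # including the first and last, can be matched as ",name,".
            opts = "," $4 ","
            if (opts ~ /,(proto=)?udp,/) print $2
        }
    ' "$mounts_file"
}

# Remounting one of them over TCP would then look roughly like this
# (server:/export and /mnt/data are placeholders):
#   umount /mnt/data
#   mount -t nfs -o tcp server:/export /mnt/data
```

Mounts that name no transport at all in their options are not reported, so an empty result is not proof that everything is already on TCP.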