public inbox for linux-kernel@vger.kernel.org
* NFS client code slow in 2.4.3
@ 2001-04-03 18:56 Caleb Epstein
  2001-04-03 19:08 ` Caleb Epstein
  0 siblings, 1 reply; 3+ messages in thread
From: Caleb Epstein @ 2001-04-03 18:56 UTC (permalink / raw)
  To: linux-kernel


	I am having problems with timeouts and general throughput in
	the 2.4.3 NFS client-side code which are not present in the
	2.4.2 kernel running in the same configuration on the same
	hardware.  The machines are on a 100 Mbit switched local
	network with essentially no other traffic.

	In both cases I am testing against a 2.4.3 NFS server (using
	knfsd).  My tests involve using "dd" to read a large file on
	an NFS-mounted directory and running the "connectathon" NFS
	test suite.

	When I boot my client machine with 2.4.3, reading a 327 Mbyte
	file over NFS takes on the order of 5-6 minutes to complete.
	If I run the same command with the client running kernel
	2.4.2, the command completes in about 1 minute.
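
To put the two timings above side by side, here is a rough sketch of the effective throughput in each case.  It assumes "327 Mbyte" means 327 * 2**20 bytes and takes 5.5 minutes as the midpoint of the 5-6 minute range; both are assumptions, not measurements.

```python
# Rough effective throughput for the two reported timings.
# Assumptions: 327 Mbyte = 327 * 2**20 bytes; 5.5 min is the midpoint
# of the reported 5-6 minute range for the 2.4.3 client.
FILE_BYTES = 327 * 2**20


def throughput_mib_s(seconds: float) -> float:
    """Return effective throughput in MiB/s for reading FILE_BYTES."""
    return FILE_BYTES / seconds / 2**20


slow = throughput_mib_s(5.5 * 60)  # 2.4.3 client
fast = throughput_mib_s(60)        # 2.4.2 client
print(f"2.4.3: {slow:.2f} MiB/s, 2.4.2: {fast:.2f} MiB/s, ratio {fast / slow:.1f}x")
```

For comparison, a 100 Mbit link can carry at most 12.5 Mbyte/s before protocol overhead, so even the faster run is well below wire speed.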

	Running the "cthon01" test suite, the 2.4.3 client machine
	basically hangs in the "read + write" test section and I
	didn't bother waiting for it to finish.  Again, when switching
	back to 2.4.2, the client runs through the tests quite
	quickly.

	From my tests I'm pretty convinced that something in either
	the NFS client code or the networking layer has changed which
	has drastically reduced NFS client speeds in 2.4.3.

	Is this a known problem?  Can I provide any additional
	information to help debug it?

-- 
cae at bklyn dot org | Caleb Epstein | bklyn . org | Brooklyn Dust Bunny Mfg.


* Re: NFS client code slow in 2.4.3
  2001-04-03 18:56 NFS client code slow in 2.4.3 Caleb Epstein
@ 2001-04-03 19:08 ` Caleb Epstein
  2001-04-03 20:00   ` Trond Myklebust
  0 siblings, 1 reply; 3+ messages in thread
From: Caleb Epstein @ 2001-04-03 19:08 UTC (permalink / raw)
  To: linux-kernel

On Tue, Apr 03, 2001 at 02:56:15PM -0400, Caleb Epstein wrote:

> 	I am having problems with timeouts and general throughput in
> the 2.4.3 NFS client-side code which are not present in the 2.4.2
> kernel running in the same configuration on the same hardware.  The
> machines are on a 100 Mbit switched local network with essentially
> no other traffic.

	On second thought, it looks like 2.4.2 may also exhibit the
	same behavior after a little while.  Now that the machine has
	been up for a half hour or so, NFS traffic has become slow on
	my 2.4.2 client again.  I am seeing messages like this in my
	kernel log:

Apr  3 15:01:54 hagrid kernel: nfs: server tela not responding, still trying
Apr  3 15:01:54 hagrid kernel: nfs: server tela OK

	The machines are *not* having any connectivity problems, at
	least judging from TCP sessions I have open between them.

	So it would seem that NFS performance degrades over a very
	short window in 2.4.2+.  It seems to fairly fly when the
	machine is freshly booted, but after 30 minutes or less, the
	performance is severely degraded.

	Is anyone using 2.4.2+ as an NFS server/client with success?
	Am I missing something?

-- 
cae at bklyn dot org | Caleb Epstein | bklyn . org | Brooklyn Dust Bunny Mfg.


* Re: NFS client code slow in 2.4.3
  2001-04-03 19:08 ` Caleb Epstein
@ 2001-04-03 20:00   ` Trond Myklebust
  0 siblings, 0 replies; 3+ messages in thread
From: Trond Myklebust @ 2001-04-03 20:00 UTC (permalink / raw)
  To: Caleb Epstein; +Cc: linux-kernel

>>>>> " " == Caleb Epstein <cae@bklyn.org> writes:

     > On Tue, Apr 03, 2001 at 02:56:15PM -0400, Caleb Epstein wrote:
    >> I am having problems with timeouts and general throughput in
    >> the 2.4.3 NFS client-side code which are not present in the
    >> 2.4.2 kernel running in the same configuration on the same
    >> hardware.  The machines are on a 100 Mbit switched local
    >> network with essentially no other traffic.

     > 	On second thought, it looks like 2.4.2 may also exhibit the
     > 	same behavior after a little while.  Now that the machine has
     > 	been up for a half hour or so, NFS traffic has become slow on
     > 	my 2.4.2 client again.  I am seeing messages like this in my
     > 	kernel log:

     > Apr 3 15:01:54 hagrid kernel: nfs: server tela not responding, still trying
     > Apr 3 15:01:54 hagrid kernel: nfs: server tela OK

The above is a generic message that simply states that NFS traffic is
congested because the server isn't responding, for whatever reason.

In 99% of all cases, this means that the server is not seeing all the
packets that the client is sending it. This forces the client to
throttle back the number of requests it can have in flight, then
to wait until the given packet times out, and then to resend it.

Try checking whether or not the server is seeing all the packets that
the client is sending by comparing tcpdump/ethereal output captured
on the client with output captured on the server.
If the packet loss is large, try fiddling with the hardware: typically
stuff such as overriding the NIC autoconfiguration, swapping the NIC,
checking for noisy cables, and so on.
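
As a rough illustration of that comparison, the sketch below takes two lists of RPC transaction IDs (XIDs), one pulled from the client-side capture and one from the server-side capture, and reports which transmissions never reached the server.  How the XIDs get extracted from the capture is left out, and the function names and sample values are made up for the example.

```python
from collections import Counter


def missing_on_server(client_xids, server_xids):
    """Return {xid: count} for transmissions sent but not seen by the server.

    Retransmissions reuse the XID, so occurrence counts are compared
    rather than plain sets.
    """
    sent = Counter(client_xids)
    seen = Counter(server_xids)
    return {xid: sent[xid] - seen[xid] for xid in sent if sent[xid] > seen[xid]}


def loss_rate(client_xids, server_xids):
    """Fraction of client transmissions that never showed up on the server."""
    if not client_xids:
        return 0.0
    lost = sum(missing_on_server(client_xids, server_xids).values())
    return lost / len(client_xids)


# Made-up sample: XID 0x2a was sent twice but seen once; 0x2c never arrived.
client = [0x29, 0x2A, 0x2A, 0x2B, 0x2C]
server = [0x29, 0x2A, 0x2B]
print(missing_on_server(client, server))
print(loss_rate(client, server))
```

A loss rate that stays near zero would point away from the network and towards the server itself being slow to reply.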

If you're unable to trace the problem, try playing around with
rsize/wsize, timeo and retrans (man 5 nfs). The smaller the packet,
the fewer UDP fragments each request needs, and the lower the chance
that one of them gets lost.
You might also want to try out the NFS ping patch from
   http://www.fys.uio.no/~trondmy/src/2.4.2
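
To make the rsize/wsize point concrete, here is a back-of-the-envelope sketch of how many IP fragments one NFS-over-UDP request needs at a few transfer sizes, and how likely the whole datagram is to be lost.  It assumes a 1500-byte Ethernet MTU, a 20-byte IPv4 header without options, and independent loss of each fragment; it also ignores the RPC/NFS header bytes, so real requests are slightly larger.  The 1% per-fragment loss figure is purely illustrative.

```python
import math

MTU = 1500       # assumed Ethernet MTU
IP_HEADER = 20   # assumed IPv4 header without options
UDP_HEADER = 8


def fragments(payload_bytes: int) -> int:
    """Number of IP fragments needed for one UDP datagram of this payload."""
    # Fragment data must be a multiple of 8 bytes (except the last one).
    per_fragment = (MTU - IP_HEADER) // 8 * 8
    return math.ceil((payload_bytes + UDP_HEADER) / per_fragment)


def datagram_loss(payload_bytes: int, fragment_loss: float) -> float:
    """Chance the datagram is lost, i.e. at least one fragment is dropped."""
    return 1 - (1 - fragment_loss) ** fragments(payload_bytes)


for size in (1024, 2048, 4096, 8192):
    print(size, fragments(size), f"{datagram_loss(size, 0.01):.3f}")
```

Losing any single fragment costs the whole request, which is why a smaller rsize/wsize can help on a lossy path even though it means more round trips.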

Cheers,
   Trond

