Linux NFS development
From: Haakon Riiser <haakon.riiser@fys.uio.no>
To: nfs@lists.sourceforge.net
Subject: Re: "Server not responding" after periods of client inactivity
Date: Sat, 30 Jul 2005 15:10:31 +0200
Message-ID: <20050730131031.GA1668@fox>
In-Reply-To: <20050714212514.GA23867@fox>

(Replying to my own email, since I still haven't seen any comments.
The full original email is included below, since it's been a while
since I posted it.)

I have now tried this with another NFS client (Fedora Core 4)
connected to the NFS server at the same time as the client used
in the problem report below.  The same problem happens on
the new client as well, even when the other client has just
experienced the hang.

That is, after the stalled NFS operation completes on client A,
it is still possible to observe the same problem on client B.
It looks to me like this is caused by the server disconnecting idle
clients, and when a new request suddenly occurs, it takes a while
before the server manages to resurrect the connection.  I tried
adding a cron job that does 'ls -l /NFS-MOUNT-POINT >/dev/null'
every few minutes, but surprisingly, that didn't keep the
connection alive.  Is this because of client-side caching?  Is there
anything I can do?  These hangs are getting very annoying; it
would be great if I could somehow change the idle timeout.
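
To log exactly when the hangs happen, something like the following
could take the place of the plain ls in the cron job.  It is only a
sketch: probe and SLOW_SECONDS are names made up for illustration,
the 1-second threshold is an arbitrary assumption, and whether these
calls reach the server at all still depends on client-side caching.

```python
import os
import time

SLOW_SECONDS = 1.0  # assumed threshold; the hangs reported here last about 15 s


def probe(path):
    """List `path` and stat every entry; return (elapsed_seconds, entry_count)."""
    start = time.monotonic()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # Roughly the per-file information that 'ls -l' asks for.
            entry.stat(follow_symlinks=False)
            count += 1
    return time.monotonic() - start, count


if __name__ == "__main__":
    target = "."  # replace with the mount point, e.g. /NFS-MOUNT-POINT
    elapsed, count = probe(target)
    flag = "SLOW" if elapsed > SLOW_SECONDS else "ok"
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    print(f"{stamp} {target}: {count} entries in {elapsed:.2f}s [{flag}]")
```

Appending that output to a log file would show how long the client
has to be idle before the first operation stalls.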

> I have noticed that after periods of inactivity on the client
> machine, the first NFS operation will always hang for around
> 15 seconds.  My first guess was that the server had powered down
> its disk drives, and that it was the spin-up time that caused the
> delay, but when I had an ssh session open on the server at the
> same time as the client started complaining about not getting a
> reply, I saw that this was /not/ the case -- there is nothing
> on the server that would explain why the client is stalling.
> No load, and no delay when working directly on the server.
> 
> I did tcpdump (on the server side) while the client was hanging,
> and this is what I found:
> 
> Source  Time  Packets
> ------  ----  -------
> client  0.00  V3 ACCESS Call, FH:0x02120000
> client  0.10  [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client  0.31  [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client  0.71  [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client  1.53  [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client  3.16  [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client  6.42  [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client  7.12  [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client  8.52  [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 11.32  [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> server 15.30  V3 ACCESS Reply
> 
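
The gaps between the retransmissions can be read off the trace.  A
small sketch, assuming the Time column is in seconds since the first
call (calls, reply and gaps are just illustrative names):

```python
# Timestamps (seconds) copied from the tcpdump table in the quoted report.
calls = [0.00, 0.10, 0.31, 0.71, 1.53, 3.16, 6.42, 7.12, 8.52, 11.32]
reply = 15.30


def gaps(times):
    """Return the interval between each consecutive pair of timestamps."""
    return [round(later - earlier, 2) for earlier, later in zip(times, times[1:])]


intervals = gaps(calls)
for sent, gap in zip(calls[1:], intervals):
    print(f"retransmit at {sent:5.2f}s, {gap:.2f}s after the previous send")

wait = round(reply - calls[-1], 2)
print(f"reply at {reply:.2f}s, {wait:.2f}s after the last send")
```

The intervals come out as 0.10, 0.21, 0.40, 0.82, 1.63 and 3.26
seconds, then drop back to 0.70 and double again to 1.40 and 2.80.
That looks like a retransmit timer backing off and then restarting,
but I have not checked it against the client's mount options.
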
> After this, there are no more delays for this shared file system;
> any file operation on the same file system will succeed instantly.
> However, the first operation on any /other/ NFS file system also
> mounted on the client will still hang just like the above example.
> I.e., the hang always happens exactly once for each mount point.
> 
> I have tried setting rsize=1024,wsize=1024, and I have tried both
> tcp and udp, but nothing has helped so far.  Any ideas?  tcpdump
> clearly shows that all the requests arrive at the server, so why
> does the server wait 15 seconds before it replies?
> 
> NFS server:
>   Pentium III 650 MHz, 256 MB RAM
>   Fedora Core 3 (fully updated)
>   nfs-utils 1.0.6-52
>   kernel 2.6.11-1.35_FC3
> 
> NFS client:
>   Athlon XP2500+, 1 GB RAM
>   Slackware 10.1
>   nfs-utils 1.0.7
>   kernel 2.6.11.11

-- 
 Haakon

