Re: why do attempts to access a nfs v3 filesystem (ro,soft) block the process for minutes at a time? (when the nfs server is down)

From: Chuck Lever <chuck.lever@oracle.com>
To: Tom H <tom@limepepper.co.uk>
Cc: linux-nfs@vger.kernel.org
Subject: Re: why do attempts to access a nfs v3 filesystem (ro,soft) block the process for minutes at a time? (when the nfs server is down)
Date: Fri, 16 Jul 2010 11:25:00 -0400
Message-ID: <4C4079CC.8070205@oracle.com>
In-Reply-To: <4C4078AE.5070300@limepepper.co.uk>

On 07/16/2010 11:20 AM, Tom H wrote:
>
> (apologies for the cross post from the deprecated list)
>
> Hi all,
>
> I have a web server which serves some content from an nfs filesystem
> mounted like so;
> nfsserver1:/somemount /var/www/html/somefiles nfs rw,soft 0 0
>
> # mount | grep nfs
> nfsserver1:/somemount on /var/www/html/somefiles type nfs (ro,soft,addr=xx.xx.xx.xx)
>
> According to the documentation, an NFS operation on a soft mount should
> wait for a "major timeout", then report "server not responding" to
> syslog and return an error, where a major timeout occurs after the
> default retrans=3 retransmissions.
>
> I understand the process to be like this;
> call ---> 0.7 secs ---> retransmission ---> 1.4 secs ---> retransmission
> ---> 2.8 secs ---> server not responding (major timeout)
>
> However, it is pretty clear that this is retrying indefinitely (or at
> least many more times than I would like), as the log files show loads
> of;
> Jul 16 07:56:09 server1 kernel: nfs: server server2 not responding, timed out
> Jul 16 07:57:09 server1 last message repeated 4 times
> Jul 16 07:57:09 server1 last message repeated 6 times
>
> and eventually this kills the apache server as all the available
> processes are blocked during "retrying indefinitely", until the apache
> server is restarted. (restarting the nfs server at this point does not
> seem to recover the apache child processes)
>
> So what should my strategy be to stop the failed mount killing apache? I
> care more about apache staying up, as I don't have that much control
> over the nfs server.
>
> (also, I noticed that it seems to time out more quickly with the mount
> options set like (soft, timeo=7, retrans=3), which is unexpected,
> because they are supposed to be the default)

They are the default settings for UDP mounts, but you didn't specify
UDP.  TCP is the default transport protocol, and has been for some time.
TCP uses a long retransmit timeout.  See nfs(5).
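
To make the difference concrete, here is a rough sketch of the arithmetic,
not a description of what the kernel client actually does. It assumes the
backoff rules given in nfs(5): over UDP the timeout doubles after each
retransmission (capped at 60 seconds), while over TCP timeo defaults to 600
(60 seconds) and grows linearly by timeo (capped at 600 seconds). The
function name wait_schedule and the choice of three waits per major timeout
(taken from the 0.7/1.4/2.8 model quoted above) are illustrative
assumptions; the real number of waits depends on the kernel version.

```python
def wait_schedule(timeo_ds, waits, backoff, cap_ds):
    """Return successive wait intervals in deciseconds (tenths of a second).

    timeo_ds: the timeo= mount option value (deciseconds).
    waits:    how many intervals elapse before a major timeout
              (an assumption of this sketch, not a kernel constant).
    backoff:  "exponential" (UDP-style doubling) or "linear" (TCP-style).
    cap_ds:   maximum length of a single interval, in deciseconds.
    """
    schedule = []
    for i in range(waits):
        if backoff == "exponential":
            interval = timeo_ds * (2 ** i)   # 1x, 2x, 4x, ...
        elif backoff == "linear":
            interval = timeo_ds * (i + 1)    # 1x, 2x, 3x, ...
        else:
            raise ValueError("backoff must be 'exponential' or 'linear'")
        schedule.append(min(interval, cap_ds))
    return schedule


# UDP with timeo=7, as in the quoted question (cap: 60 s = 600 ds).
udp = wait_schedule(7, 3, "exponential", 600)
# TCP with the default timeo=600 (cap: 600 s = 6000 ds).
tcp = wait_schedule(600, 3, "linear", 6000)

for name, sched in (("udp", udp), ("tcp", tcp)):
    seconds = [ds / 10 for ds in sched]
    print(name, seconds, "total seconds:", sum(sched) / 10)
```

Under these assumptions a UDP mount with timeo=7 gives up after roughly 5
seconds, whereas a TCP mount with the default timeo blocks for about 6
minutes per major timeout, which is consistent with processes hanging "for
minutes at a time". Setting timeo and retrans explicitly on the mount
changes the inputs to this calculation.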

-- 
Chuck Lever
