Linux NFS development
From: "Gregory Baker" <gregory.baker@amd.com>
To: "Trond Myklebust" <trond.myklebust@fys.uio.no>
Cc: autofs@linux.kernel.org, nfs@lists.sourceforge.net
Subject: Re: [NFS] bug in linux mount? (says NetApp)
Date: Tue, 11 Jul 2006 18:34:10 -0500
Message-ID: <44B43572.7040103@amd.com>
In-Reply-To: <1152660478.5681.38.camel@lade.trondhjem.org>


Thanks Trond!

I was referring to the 'standard' comment from the netapp PDF:

"Due to a bug in the mount command, the default retransmission timeout
value on Linux for NFS over TCP is quite small...To obtain standard
behavior, we strongly recommend using "timeo=600, retrans=2" explicitly
when mounting via TCP."

And was wondering what the 'standard' was.  Chuck politely pointed me to 
Solaris as the NFSv3 reference for 'standard'.

Thanks,

--Greg

Trond Myklebust wrote:
> On Tue, 2006-07-11 at 14:00 -0500, Gregory Baker wrote:
>> We have thousands of linux clients hitting netapp file servers (many 
>> 3500 series, clustered) on a local gigabit LAN.  From time to time, 
>> applications return "file not found" when attempting to automount a 
>> directory and access a file.  An example of this is a long running 
>> process, which reads in data, processes it for hours (in which time the 
>> filesystem is unmounted) then tries to read more data from that mount 
>> point (which causes a "file not found" error in the application).  This 
>> occurs about 1/100th of the time.
>>
>> Researching at Netapp turns up this bit by Chuck Lever (Linux NFS 
>> contributor)
>>
>> "Using the Linux NFS Client with Network Appliance Filers"
>> http://www.netapp.com/library/tr/3183.pdf  (February 2006)
>>
>> page 10 says...
>>
>> "Due to a bug in the mount command, the default retransmission timeout 
>> value on Linux for NFS over TCP is quite small...To obtain standard 
>> behavior, we strongly recommend using "timeo=600, retrans=2" explicitly 
>> when mounting via TCP."
>>
>> Our defaults (assuming man pages are correct, RedHat Enterprise Linux 3) 
>> would be timeo=7, retrans=3, which translates to 7+14+28+56 = 105 tenths 
>> of a second (10.5 seconds).  It appears netapp is suggesting waiting 
>> 600+600 = 1200 tenths (120 seconds) before giving up on the mount command...
> 
> No they are not. See below.
> 
>> * What "bug" in the mount command do you believe NetApp is talking about?
> 
> It has nothing to do with the mount timeout: Chuck is talking about the
> retransmission timeout for TCP connections 'timeo' which should indeed
> be set to a high value since TCP guarantees message delivery (unlike UDP
> which requires a small timeo value). Setting it too low means that you
> end up spamming your server with a load of unnecessary retransmissions.
> 
> This was indeed the case for some older versions of 'mount' and also for
> older versions of the am-utils/amd automounters.
> 
>> * What do you think proper options for NFS auto/mounts would be for 
>> extremely busy centralized NFS filers?
> 
> Something like
> 
> mount -t nfs -ohard,timeo=600,retrans=2,rsize=32768,wsize=32768,tcp foo:/ /bar
> 
> should be a fairly safe bet. You might want to add the 'intr' flag too,
> depending on how you feel about the behaviour w.r.t. pressing ^C.
> 
>> * What is the reference standard behavior?
> 
> To which reference are you referring?
> 
> Cheers,
>   Trond
> 
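
The timeout arithmetic in the quoted question (7+14+28+56 for timeo=7,
retrans=3) can be checked with a short sketch. It assumes, as the question
does, that the timeout doubles after each retransmission; the function
names are hypothetical, and this only reproduces the sum from the message,
not what the kernel actually does for NFS over TCP.

```python
def backoff_total_tenths(timeo: int, retrans: int) -> int:
    """Sum the initial timeout plus `retrans` retries, doubling each time.

    Assumption taken from the quoted message: each retry waits twice as
    long as the previous attempt. Values are in tenths of a second.
    """
    return sum(timeo * 2 ** attempt for attempt in range(retrans + 1))


def tenths_to_seconds(tenths: int) -> float:
    """Convert tenths of a second (the unit used for timeo) to seconds."""
    return tenths / 10


if __name__ == "__main__":
    total = backoff_total_tenths(timeo=7, retrans=3)
    # 7 + 14 + 28 + 56 tenths of a second
    print(total, "tenths =", tenths_to_seconds(total), "seconds")
```

With timeo=7 and retrans=3 the sum is 105 tenths, or 10.5 seconds.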

-- 
----------------------------------------------------------------------
Greg Baker                                         512-602-3287 (work)
gregory.baker@amd.com                              512-602-6970 (fax)
5900 E. Ben White Blvd MS 626                      512-555-1212 (info)
Austin, TX 78741
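
For anyone auditing many automount maps, the option string from Trond's
suggested mount command can be split apart and inspected with something
like the sketch below. The helper name is hypothetical, and the only fact
it relies on from this thread is that timeo is given in tenths of a second.

```python
def parse_mount_options(option_string: str) -> dict:
    """Split a comma-separated -o string into a {name: value} dict.

    Flags without a value (hard, tcp) map to True; numeric values are
    converted to int; anything else is kept as a string.
    """
    options = {}
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        if not sep:
            options[name] = True
        elif value.isdigit():
            options[name] = int(value)
        else:
            options[name] = value
    return options


if __name__ == "__main__":
    opts = parse_mount_options(
        "hard,timeo=600,retrans=2,rsize=32768,wsize=32768,tcp"
    )
    # timeo is in tenths of a second, so divide by 10 for seconds
    print("timeo in seconds:", opts["timeo"] / 10)
    print("uses tcp:", opts.get("tcp", False))
```

A check like this makes it easy to flag TCP mounts whose timeo is still at
a small UDP-style value.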
