* Man page update for timeo= and retrans= options.
From: Neil Brown @ 2008-01-04 2:32 UTC
To: linux-nfs

I've been trying to understand exactly how timeouts work in the NFS
client and find that the man page in nfs-utils is not correct.

In particular, the implementation differentiates between TCP and UDP,
while the man page does not make that distinction.

I have attempted an update to the man page as you can see below. It
is entirely possible that I have not got it completely correct (or
comprehensible) so I'm asking for people to check that what I have
written is correct and clear.

Things I would particularly like comment on:

1/ I have left

     Better overall performance may be achieved by increasing the
     timeout when mounting on a busy network, to a slow server, or through
     several routers or gateways.

   unchanged. Is it still a reasonable thing to say?

2/ I have moved the documentation about major timeouts into the retrans=
   section. Does that break the description up too much?

3/ The old text seems to say that after the first major timeout, a
   slightly different sequence of timeouts is used. I couldn't find
   evidence of this in the code. Did I miss something, or is my text
   correct?

4/ Did this change in some ancient kernel version, and should the
   version number of the change be documented? e.g. is it a 2.4 / 2.6
   difference?

5/ As the behaviour is quite different for UDP and TCP, should we
   introduce a major_timeo= option which calculates an appropriate
   retrans= based on the actual timeo= and proto= used?

and anything else that occurs to anyone.

Thanks,
NeilBrown

diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
index d92da19..0142075 100644
--- a/utils/mount/nfs.man
+++ b/utils/mount/nfs.man
@@ -83,24 +83,50 @@ Note: Setting this size to a value less than the largest supported
block size will adversely affect performance.
.TP 1.5i
.I timeo=n
-The value in tenths of a second before sending the
-first retransmission after an RPC timeout.
-The default value is 7 tenths of a second. After the first timeout,
-the timeout is doubled after each successive timeout until a maximum
-timeout of 60 seconds is reached or the enough retransmissions
-have occured to cause a major timeout. Then, if the filesystem
-is hard mounted, each new timeout cascade restarts at twice the
-initial value of the previous cascade, again doubling at each
-retransmission. The maximum timeout is always 60 seconds.
+The value in tenths of a second for the first RPC timeout. If no
+reply has been received in this much time, the message is
+retransmitted.
+Further timeouts are handled differently depending on the connection
+type.
+
+For UDP (which is unreliable and lacks congestion control),
+each successive timeout is twice the previous timeout. As the default
+is 11 tenths of a second, the timeouts used if
+.I timeo=
+is not specified are 1.1, 2.2, 4.4, 8.8,... seconds. The timeout for
+each retransmission is limited to 60 seconds, so the next few numbers
+in the above sequence would be 17.6, 35.2, 60, 60.
+
+For reliable protocols such as TCP and RDMA, the successive timeouts
+grow linearly rather than exponentially to a maximum of 10 minutes.
+The default is 1 minute, so the default successive timeouts are 1,
+2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10 minutes.
+
+It is unwise to set
+.I timeo=
+explicitly without also setting the protocol to use, as it has a
+significantly different effect depending on protocol.
+
Better overall performance may be achieved by increasing the
timeout when mounting on a busy network, to a slow server, or through
several routers or gateways.
.TP 1.5i
.I retrans=n
The number of minor timeouts and retransmissions that must occur before
-a major timeout occurs. The default is 3 timeouts. When a major timeout
-occurs, the file operation is either aborted or a "server not responding"
-message is printed on the console.
+a major timeout occurs. The default is 2, yielding a total of 3
+attempts (1 transmission and 2 retransmissions). When a major timeout
+occurs the behaviour depends on whether the filesystem was mounted
+.I hard
+or
+.IR soft .
+In the case of a
+.I soft
+mount, the operation will abort and typically return an IO error to
+the application. In the case of a
+.I hard
+mount a "server not responding" message will be printed on the
+console, and the request will be retried with the original series of
+timeouts.
.TP 1.5i
.I acregmin=n
The minimum time in seconds that attributes of a regular file should
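
The two progressions described in the proposed timeo= text can be checked with a few lines of Python. This is only a sketch of the arithmetic as the patch states it, not of the kernel code: the function name and the constants are made up for illustration, and the linear case assumes the increment equals the initial timeout, which the patch's 1, 2, 3, ... minute example suggests but does not spell out.

```python
# Sketch of the timeout progressions described in the patch text.
# All values are integers in tenths of a second, the unit timeo= uses.

UDP_CAP = 600    # 60 seconds, the per-retransmission limit quoted for UDP
TCP_CAP = 6000   # 10 minutes, the limit quoted for TCP and RDMA


def timeout_sequence(timeo, proto, count):
    """Return the first `count` timeouts for one request, in tenths of a second."""
    if proto == "udp":
        # Exponential backoff: each timeout is twice the previous one.
        return [min(timeo * 2 ** i, UDP_CAP) for i in range(count)]
    if proto in ("tcp", "rdma"):
        # Linear backoff; assumes the increment equals the initial timeout.
        return [min(timeo * (i + 1), TCP_CAP) for i in range(count)]
    raise ValueError("unknown protocol: " + proto)


udp_default = timeout_sequence(11, "udp", 8)
tcp_default = timeout_sequence(600, "tcp", 12)

print([t / 10 for t in udp_default])    # seconds
print([t // 600 for t in tcp_default])  # minutes
```

Working in integer tenths of a second keeps the comparison with the figures in the patch exact and avoids floating-point noise.
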
* Re: Man page update for timeo= and retrans= options.
From: Trond Myklebust @ 2008-01-04 21:31 UTC
To: Neil Brown; +Cc: linux-nfs

On Fri, 2008-01-04 at 13:32 +1100, Neil Brown wrote:
> I've been trying to understand exactly how timeouts work in the NFS
> client and find that the man page in nfs-utils is not correct.
>
> In particular, the implementation differentiates between TCP and UDP,
> while the man page does not make that distinction.
>
> I have attempted an update to the man page as you can see below. It
> is entirely possible that I have not got it completely correct (or
> comprehensible) so I'm asking for people to check that what I have
> written is correct and clear.
>
> Things I would particularly like comment on:
>
> 1/ I have left
>
>      Better overall performance may be achieved by increasing the
>      timeout when mounting on a busy network, to a slow server, or through
>      several routers or gateways.
>
>    unchanged. Is it still a reasonable thing to say?

I suppose so, however it might be worth stating that a better solution
is to use TCP. It is also worth pointing out that for TCP, the timeo
mount option is deprecated.

> 2/ I have moved the documentation about major timeouts into the retrans=
>    section. Does that break the description up too much?

No, that sounds like a good idea.

> 3/ The old text seems to say that after the first major timeout, a
>    slightly different sequence of timeouts is used. I couldn't find
>    evidence of this in the code. Did I miss something, or is my text
>    correct?

The text stating that 'each new timeout cascade restarts at twice the
initial value of the previous cascade' is wrong. AFAIK, we restart at
the initial value...

> 4/ Did this change in some ancient kernel version, and should the
>    version number of the change be documented? e.g. is it a 2.4 / 2.6
>    difference?

I'd have to check.

> 5/ As the behaviour is quite different for UDP and TCP, should we
>    introduce a major_timeo= option which calculates an appropriate
>    retrans= based on the actual timeo= and proto= used?

No. We should deprecate use of retrans/timeo altogether for TCP except
possibly for the case of 'soft' mounts (and even then you need to be
careful). It is far too easy to flood the server with redundant RPC
requests...

Cheers
  Trond

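
To make the disagreement over point 3/ concrete, the following Python sketch lists the timeouts of successive cascades on a hard UDP mount under both readings: the old man page wording (each cascade starts at twice the previous cascade's initial value) and the behaviour Trond describes (each cascade restarts at the initial value). The function name and its parameters are hypothetical, and the model is only as accurate as the descriptions in this thread.

```python
UDP_CAP = 600  # tenths of a second (60 s), the limit quoted in the thread


def cascades(timeo, retrans, rounds, double_on_restart):
    """List the timeouts of successive cascades, in tenths of a second.

    Each cascade is one transmission plus `retrans` retransmissions,
    with UDP-style doubling inside the cascade.
    """
    result = []
    for k in range(rounds):
        # Old man page text: cascade k starts at timeo * 2**k.
        # Behaviour described in the reply: every cascade starts at timeo.
        start = timeo * 2 ** k if double_on_restart else timeo
        result.append([min(start * 2 ** i, UDP_CAP) for i in range(retrans + 1)])
    return result


as_old_text_claims = cascades(11, 2, 3, double_on_restart=True)
as_reply_describes = cascades(11, 2, 3, double_on_restart=False)

print(as_old_text_claims)
print(as_reply_describes)
```

With timeo=11 and retrans=2, the second reading repeats the same three timeouts in every cascade, which matches "retried with the original series of timeouts" in the proposed retrans= text.
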
* Re: Man page update for timeo= and retrans= options.
From: Chuck Lever @ 2008-01-07 18:15 UTC
To: Neil Brown, Steve Dickson; +Cc: linux-nfs

Hi Neil-

I just spent two months and rewrote all of nfs(5). It should appear
in the next release of nfs-utils. Steve, when can we expect to see
the updated man page?

On Jan 3, 2008, at 9:32 PM, Neil Brown wrote:
>
> I've been trying to understand exactly how timeouts work in the NFS
> client and find that the man page in nfs-utils is not correct.
>
> In particular, the implementation differentiates between TCP and UDP,
> while the man page does not make that distinction.
>
> I have attempted an update to the man page as you can see below. It
> is entirely possible that I have not got it completely correct (or
> comprehensible) so I'm asking for people to check that what I have
> written is correct and clear.
>
> Things I would particularly like comment on:
>
> 1/ I have left
>
>      Better overall performance may be achieved by increasing the
>      timeout when mounting on a busy network, to a slow server, or through
>      several routers or gateways.
>
>    unchanged. Is it still a reasonable thing to say?
>
> 2/ I have moved the documentation about major timeouts into the retrans=
>    section. Does that break the description up too much?
>
> 3/ The old text seems to say that after the first major timeout, a
>    slightly different sequence of timeouts is used. I couldn't find
>    evidence of this in the code. Did I miss something, or is my text
>    correct?
>
> 4/ Did this change in some ancient kernel version, and should the
>    version number of the change be documented? e.g. is it a 2.4 / 2.6
>    difference?
>
> 5/ As the behaviour is quite different for UDP and TCP, should we
>    introduce a major_timeo= option which calculates an appropriate
>    retrans= based on the actual timeo= and proto= used?
>
> and anything else that occurs to anyone.
>
> Thanks,
> NeilBrown
>
> diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
> index d92da19..0142075 100644
> --- a/utils/mount/nfs.man
> +++ b/utils/mount/nfs.man
> @@ -83,24 +83,50 @@ Note: Setting this size to a value less than the largest supported
> block size will adversely affect performance.
> .TP 1.5i
> .I timeo=n
> -The value in tenths of a second before sending the
> -first retransmission after an RPC timeout.
> -The default value is 7 tenths of a second. After the first timeout,
> -the timeout is doubled after each successive timeout until a maximum
> -timeout of 60 seconds is reached or the enough retransmissions
> -have occured to cause a major timeout. Then, if the filesystem
> -is hard mounted, each new timeout cascade restarts at twice the
> -initial value of the previous cascade, again doubling at each
> -retransmission. The maximum timeout is always 60 seconds.
> +The value in tenths of a second for the first RPC timeout. If no
> +reply has been received in this much time, the message is
> +retransmitted.
> +Further timeouts are handled differently depending on the connection
> +type.
> +
> +For UDP (which is unreliable and lacks congestion control),
> +each successive timeout is twice the previous timeout. As the default
> +is 11 tenths of a second, the timeouts used if
> +.I timeo=
> +is not specified are 1.1, 2.2, 4.4, 8.8,... seconds. The timeout for
> +each retransmission is limited to 60 seconds, so the next few numbers
> +in the above sequence would be 17.6, 35.2, 60, 60.
> +
> +For reliable protocols such as TCP and RDMA, the successive timeouts
> +grow linearly rather than exponentially to a maximum of 10 minutes.
> +The default is 1 minute, so the default successive timeouts are 1,
> +2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10 minutes.
> +
> +It is unwise to set
> +.I timeo=
> +explicitly without also setting the protocol to use, as it has a
> +significantly different effect depending on protocol.
> +
> Better overall performance may be achieved by increasing the
> timeout when mounting on a busy network, to a slow server, or through
> several routers or gateways.
> .TP 1.5i
> .I retrans=n
> The number of minor timeouts and retransmissions that must occur before
> -a major timeout occurs. The default is 3 timeouts. When a major timeout
> -occurs, the file operation is either aborted or a "server not responding"
> -message is printed on the console.
> +a major timeout occurs. The default is 2, yielding a total of 3
> +attempts (1 transmission and 2 retransmissions). When a major timeout
> +occurs the behaviour depends on whether the filesystem was mounted
> +.I hard
> +or
> +.IR soft .
> +In the case of a
> +.I soft
> +mount, the operation will abort and typically return an IO error to
> +the application. In the case of a
> +.I hard
> +mount a "server not responding" message will be printed on the
> +console, and the request will be retried with the original series of
> +timeouts.
> .TP 1.5i
> .I acregmin=n
> The minimum time in seconds that attributes of a regular file should

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

* Re: Man page update for timeo= and retrans= options.
From: Neil Brown @ 2008-01-08 1:32 UTC
To: Chuck Lever; +Cc: Steve Dickson, linux-nfs

On Monday January 7, chuck.lever@oracle.com wrote:
> Hi Neil-
>
> I just spent two months and rewrote all of nfs(5). It should appear
> in the next release of nfs-utils. Steve, when can we expect to see
> the updated man page?

I thought I had seen some rewrite go past, but it wasn't in my inbox
any more and also not in Steve's git so I just went ahead...
I see it is in the .git now (as of Friday).

Comments:
 - It says UDP defaults to 7/10 of a second, but
   nfs_init_timeout_values() says:
         if (!to->to_initval)
                 to->to_initval = 11 * HZ / 10;

   which suggests 11/10 of a second.

 - It says
      If the retrans option is not specified, the NFS client retries
      each request three times.

   but nfs_init_timeout_values() says

         if (!to->to_retries)
                 to->to_retries = 2;

   which suggests it retries 2 times (or tries 3 times).

 - It says:
      After each retransmission, the NFS client doubles the timeout
      for that request, up to a maximum timeout length of 60 seconds.

   but doesn't (to me) make it clear that only applies to UDP.
   For TCP, the timeouts appear to increase linearly up to 600 seconds.

Thanks,
NeilBrown

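
As a rough illustration of what the two quoted defaults add up to, here is a Python sketch that mirrors the two C fragments and then sums the UDP waits before a major timeout. The value of HZ, the function name and the variable names are assumptions for illustration only, and the total assumes (as the proposed man page text describes) that a major timeout is declared once the initial transmission and both retransmissions have each timed out.

```python
HZ = 1000  # assumed tick rate; the real value depends on kernel configuration


def init_timeout_values(initval=0, retries=0):
    """Fill in defaults the way the two quoted C fragments do.

    Returns (initial timeout in jiffies, number of retransmissions).
    """
    if not initval:
        initval = 11 * HZ // 10   # 11/10 of a second
    if not retries:
        retries = 2               # 2 retransmissions, so 3 attempts in total
    return initval, retries


initval, retries = init_timeout_values()
attempts = 1 + retries

# UDP-style doubling: one wait per attempt before the major timeout.
waits = [initval * 2 ** i for i in range(attempts)]
seconds_to_major_timeout = sum(waits) / HZ

print(attempts, waits, seconds_to_major_timeout)
```

Under these assumptions the defaults give three attempts, and the 60 second cap is never reached before the major timeout.
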
* Re: Man page update for timeo= and retrans= options.
From: Chuck Lever @ 2008-01-08 12:38 UTC
To: Neil Brown; +Cc: Steve Dickson, linux-nfs

Hi Neil-

On Jan 7, 2008, at 8:32 PM, Neil Brown wrote:
> On Monday January 7, chuck.lever@oracle.com wrote:
>> Hi Neil-
>>
>> I just spent two months and rewrote all of nfs(5). It should appear
>> in the next release of nfs-utils. Steve, when can we expect to see
>> the updated man page?
>
> I thought I had seen some rewrite go past, but it wasn't in my inbox
> any more and also not in Steve's git so I just went ahead...
> I see it is in the .git now (as of Friday).

Good. I hope others will also have a chance to look it over. And
thanks for your scrutiny, btw.

> Comments:
> - It says UDP defaults to 7/10 of a second, but
>   nfs_init_timeout_values() says:
>         if (!to->to_initval)
>                 to->to_initval = 11 * HZ / 10;
>
>   which suggests 11/10 of a second.

Yup. I forgot about that code change, which I believe was to make UDP
on Linux work more like Solaris does.

> - It says
>      If the retrans option is not specified, the NFS client retries
>      each request three times.
>
>   but nfs_init_timeout_values() says
>
>         if (!to->to_retries)
>                 to->to_retries = 2;
>
>   which suggests it retries 2 times (or tries 3 times).

Yes, nfs(5) should be changed to say "tries each request 3 times."

> - It says:
>      After each retransmission, the NFS client doubles the timeout
>      for that request, up to a maximum timeout length of 60 seconds.
>
>   but doesn't (to me) make it clear that only applies to UDP.

It follows "However, for NFS over UDP"... But perhaps the UDP part
can be wholly split into a separate paragraph to make the distinction
more clear.

I'll post a patch with these updates to nfs(5).

> For TCP, the timeouts appear to increase linearly up to 600 seconds.

The TCP RTT should not change after a timeout. At least that was the
way it worked when I modified it a few years ago.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

* Re: Man page update for timeo= and retrans= options.
From: Steve Dickson @ 2008-01-08 18:54 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing list

Chuck Lever wrote:
> Hi Neil-
>
> I just spent two months and rewrote all of nfs(5). It should appear in
> the next release of nfs-utils. Steve, when can we expect to see the
> updated man page?

I committed the update a few days ago...

steved.