From: Jamie Lokier <jamie@shareable.org>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Andrew Morton <akpm@osdl.org>,
	shannon@widomaker.com, linux-kernel@vger.kernel.org
Subject: Re: NFS and kernel 2.6.x
Date: Fri, 16 Apr 2004 19:48:21 +0100
Message-ID: <20040416184821.GA25402@mail.shareable.org>
In-Reply-To: <1082130906.2581.10.camel@lade.trondhjem.org>

Trond Myklebust wrote:
> > Perhaps because 2.6 changes the UDP retransmit model for NFS, to
> > estimate the round-trip time and thus retransmit faster than 2.4
> > would.  Sometimes _much_ faster: I observed retransmits within a few
> > hundred microseconds.
> 
> Retransmits within a few 100 microsecond should no longer be occurring.
> Have you redone those measurements with a more recent kernel?

No, not since I sent you the packet trace from a 2.5 kernel that
wasn't working with "soft".  I took your advice and stopped using
"soft".  Running without it causes the obvious problem when I (rarely)
turn off the server; otherwise it's been fine.  I'm using 2.6.5 now,
still fine (with "soft" not being used).

> 2.6.x and 2.4.x should have pretty much the same code for RTO
> estimation.
> 
> In fact pretty much all the 2.4.x and 2.6.x RPC code is shared. The one
> difference is that 2.6.x uses zero copy writes.
> 
> > There was also a problem with late 2.5 clients and "soft" NFS mounts.
> > Requests would timeout after a fixed number of retransmits, which on a
> > LAN could be after a few milliseconds due to round-trip estimation and
> > fast server response.  Then when an I/O on the server took longer,
> > e.g. due to a disk seek or contention, the client would timeout and
> > abort requests.  2.4 doesn't have this problem with "soft" due to the
> > longer, fixed retransmit timeout.  I don't know if it is fixed in
> > current 2.6 kernels - but you can avoid it by not using "soft" anyway.
> 
> Or changing the default value of "retrans" to something more sane. As
> usual, Linux has a default that is lower than on any other platform.

If few-100-microsecond retransmits no longer occur, perhaps it's no
longer relevant.

The problem I saw with "soft" was that the retransmit time was quite a
good estimate of the server response time.  That part was fine, nice
even.  But then the server response latency would increase by a factor
of 10000 (ten thousand) due to normal disk I/O activity (compare cache
response with disk response on a busy disk), and of course 3
retransmits doubling each time is not adequate to cover that.  2.4 was
fine because the default rtt and retrans together could never get
shorter than a few seconds.
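
To put rough numbers on that, here is a minimal sketch of the arithmetic.
It assumes a simplified model, not the actual kernel RPC code: one
initial timeout followed by "retrans" retransmits, each timeout double
the previous one.  The 300 microsecond starting value is an illustrative
assumption, as is the function name.

```python
def total_wait_us(initial_timeout_us, retrans):
    """Total microseconds waited before giving up (simplified model).

    Assumes the initial timeout plus `retrans` retransmit timeouts,
    each double the previous one: t * (1 + 2 + 4 + ... + 2**retrans).
    """
    return sum(initial_timeout_us * 2 ** i for i in range(retrans + 1))


# Hypothetical LAN case: the estimate has adapted to ~300 us responses.
fast_estimate_us = 300
budget_us = total_wait_us(fast_estimate_us, retrans=3)

# A response served from a busy disk, 10000 times slower than from cache.
slow_response_us = fast_estimate_us * 10_000

print(budget_us)          # 4500 (4.5 ms in total)
print(slow_response_us)   # 3000000 (3 s)
print(budget_us >= slow_response_us)  # False: the request is aborted
```

Under these assumptions the whole retry budget is a few milliseconds,
roughly three orders of magnitude short of the slow response.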

That's why I felt that iff rtt was adapting to the server response
time, then a fixed number of retransmits was no longer appropriate: a
lower bound on the time before timing out is appropriate, e.g. 3
seconds or 10 seconds or whatever.

In other words, with adaptive rtt the concept of "retrans" being a
fixed number is fundamentally flawed -- unless it's also accompanied
by a minimum timeout time.  You'd need a retrans value of 20 or so for
the above perfectly normal LAN situation, but then that's far too
large on other occasions with other networks or servers.
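
A minimal sketch of that idea, under the same simplified doubling
model (an assumption, not the kernel's implementation): keep "retrans"
small, but never give up before a minimum total time has elapsed.  The
function names and the 3 second floor are hypothetical, and the model
ignores any upper cap an implementation may place on a single timeout.

```python
def total_wait_us(initial_timeout_us, retrans):
    """Initial timeout plus `retrans` doubled retransmit timeouts."""
    return initial_timeout_us * (2 ** (retrans + 1) - 1)


def give_up_after_us(initial_timeout_us, retrans, min_total_us):
    """Time before a soft request is aborted, with a lower bound.

    The retry schedule still follows the adaptive estimate; the floor
    only stops the request from being abandoned too early.
    """
    return max(total_wait_us(initial_timeout_us, retrans), min_total_us)


FLOOR_US = 3_000_000  # hypothetical 3 second minimum

# Fast LAN, 300 us estimate: the floor dominates.
print(give_up_after_us(300, 3, FLOOR_US))        # 3000000

# Slow link, 1 s estimate: the retry schedule dominates.
print(give_up_after_us(1_000_000, 3, FLOOR_US))  # 15000000

# Raising retrans to 20 instead, on that same slow link:
print(total_wait_us(1_000_000, 20))              # 2097151000000
```

In this model the last figure is over 24 days, which is why a large
fixed "retrans" does not substitute for a minimum timeout.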

-- Jamie
