From: Jamie Lokier <jamie@shareable.org>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Andrew Morton <akpm@osdl.org>,
	shannon@widomaker.com, linux-kernel@vger.kernel.org
Subject: Re: NFS and kernel 2.6.x
Date: Fri, 16 Apr 2004 19:48:21 +0100
Message-ID: <20040416184821.GA25402@mail.shareable.org>
In-Reply-To: <1082130906.2581.10.camel@lade.trondhjem.org>

Trond Myklebust wrote:
> > Perhaps because 2.6 changes the UDP retransmit model for NFS, to
> > estimate the round-trip time and thus retransmit faster than 2.4
> > would. Sometimes _much_ faster: I observed retransmits within a few
> > hundred microseconds.
>
> Retransmits within a few 100 microsecond should no longer be occurring.
> Have you redone those measurements with a more recent kernel?

No, not since I sent you the packet trace from a 2.5 kernel that
wasn't working with "soft". I took your advice and stopped using
"soft". Running without it causes the obvious problem when I (rarely)
turn off the server; otherwise it's been fine. I'm using 2.6.5 now,
still fine (with "soft" not being used).

> 2.6.x and 2.4.x should have pretty much the same code for RTO
> estimation.
>
> In fact pretty much all the 2.4.x and 2.6.x RPC code is shared. The one
> difference is that 2.6.x uses zero copy writes.
>
> > There was also a problem with late 2.5 clients and "soft" NFS mounts.
> > Requests would timeout after a fixed number of retransmits, which on a
> > LAN could be after a few milliseconds due to round-trip estimation and
> > fast server response. Then when an I/O on the server took longer,
> > e.g. due to a disk seek or contention, the client would timeout and
> > abort requests. 2.4 doesn't have this problem with "soft" due to the
> > longer, fixed retransmit timeout. I don't know if it is fixed in
> > current 2.6 kernels - but you can avoid it by not using "soft" anyway.
>
> Or changing the default value of "retrans" to something more sane. As
> usual, Linux has a default that is lower than on any other platform.

If few-100-microsecond retransmits no longer occur, perhaps it's no
longer relevant.

The problem I saw with "soft" was that the retransmit time was quite a
good estimate of the server response time. That part was fine, nice
even. But then the server response latency would increase by a factor
of 10000 (ten thousand) due to normal disk I/O activity (compare cache
response with disk response on a busy disk), and of course 3
retransmits doubling each time is not adequate to cover that. 2.4 was
fine because the default rtt and retrans together could never get
shorter than a few seconds.
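
To put rough numbers on that, here is a minimal sketch of the arithmetic.
The 500 microsecond starting timeout is an assumed figure for illustration
(it is not taken from the packet trace), and give_up_after() is a
hypothetical helper, not anything in the RPC code. It assumes the request
is abandoned once the wait following the last retransmit expires.

```python
def give_up_after(initial_timeout_s, retrans):
    """Seconds until a "soft" request is abandoned under doubling backoff.

    Assumes one initial transmission plus `retrans` retransmits, each
    followed by a wait twice as long as the previous one.
    """
    total = 0.0
    timeout = initial_timeout_s
    for _ in range(retrans + 1):
        total += timeout
        timeout *= 2
    return total


# Assumed: timeout has adapted to a server answering from cache on a LAN.
adapted = 500e-6
window = give_up_after(adapted, retrans=3)

# Server latency grows by the factor of 10000 described above.
slow_reply = adapted * 10000

print(f"client gives up after {window * 1e3:.1f} ms")
print(f"slow reply arrives after {slow_reply:.1f} s")
```

Under these assumptions the client stops waiting after a few milliseconds,
while the reply from a busy disk needs seconds.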
That's why I felt that iff rtt was adapting to the server response
time, then a fixed number of retransmits was no longer appropriate: a
lower bound on the time before timing out is appropriate, e.g. 3
seconds or 10 seconds or whatever.

In other words, with adaptive rtt the concept of "retrans" being a
fixed number is fundamentally flawed -- unless it's also accompanied
by a minimum timeout time. You'd need a retrans value of 20 or so for
the above perfectly normal LAN situation, but then that's far too
large on other occasions with other networks or servers.
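
A sketch of that idea, under stated assumptions: should_give_up() and
retrans_needed() are hypothetical names, the 3 second floor and the 500
microsecond starting timeout are illustrative values only, and none of
this is the kernel's actual logic. The first function aborts only when
the retransmit budget is spent and the minimum time has passed; the
second shows how large "retrans" would have to be to reach the same
floor by doubling alone.

```python
def should_give_up(retransmits_sent, elapsed_s, retrans=3, min_timeout_s=3.0):
    """Abort only if the retransmit budget is spent AND the floor has passed."""
    return retransmits_sent >= retrans and elapsed_s >= min_timeout_s


def retrans_needed(initial_timeout_s, floor_s):
    """Smallest retrans count whose doubling waits add up to at least floor_s."""
    retrans = 0
    total = timeout = initial_timeout_s
    while total < floor_s:
        timeout *= 2
        total += timeout
        retrans += 1
    return retrans


# Budget spent after a few milliseconds: keep waiting, the floor has not passed.
print(should_give_up(retransmits_sent=3, elapsed_s=0.0075))
# Budget spent and the floor has passed: time out.
print(should_give_up(retransmits_sent=3, elapsed_s=3.5))
# Retransmits needed to cover the floor without a minimum timeout.
print(retrans_needed(500e-6, 3.0))
```

With a floor in place the retransmit count can stay small on every kind
of network; without one, the count has to grow with the ratio between the
floor and the adapted timeout.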
-- Jamie