Re: Fragment ID wrap workaround (read-only, untested).

From: Olaf Kirch <okir@suse.de>
To: David Stevens <dlstevens@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>,
	netdev@oss.sgi.com, "Rusty Russell (IBM)" <rusty@au1.ibm.com>
Subject: Re: Fragment ID wrap workaround (read-only, untested).
Date: Tue, 27 Jul 2004 14:38:42 +0200
Message-ID: <20040727123842.GF27188@suse.de>
In-Reply-To: <OF97127705.C0C08B6A-ON88256ED2.004DB9A5-88256ED2.005146DA@us.ibm.com>

> you'll reassemble garbage when the IP ID wraps (well before the frag queue
> expires). And the checksum will pass anyway on average about 1/64K of the
> time. If you send at full rate and drop, say, 100 frags a second, it
> doesn't take too long to get a Frankenpacket-- reassembled from parts of
> others. :-)

In the scenarios we were looking at, packet loss rate was fairly low.
What compounded the problem was that the NFS payload wasn't very varied,
so the UDP checksum distribution was far from even.
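
To put rough numbers on the wrap problem, here is a back-of-the-envelope
sketch. The IPv4 ID field is 16 bits wide, and the Linux default reassembly
timeout (ipfrag_time) is 30 seconds; the link speed and the 8 KB datagram
size are assumptions for illustration, not measurements from this thread,
and header overhead is ignored.

```python
# Rough estimate of how quickly the 16-bit IPv4 ID wraps when one sender
# emits fragmented datagrams at full line rate to a single destination.
# Link speed and datagram size are illustrative assumptions.

IP_ID_SPACE = 1 << 16  # the IPv4 Identification field is 16 bits wide


def id_wrap_seconds(link_bits_per_sec, datagram_bytes):
    """Seconds until the IP ID counter wraps at full line rate."""
    datagrams_per_sec = link_bits_per_sec / 8 / datagram_bytes
    return IP_ID_SPACE / datagrams_per_sec


def wraps_per_timeout(link_bits_per_sec, datagram_bytes, timeout_sec):
    """How many times the ID space is reused while one frag queue lives."""
    return timeout_sec / id_wrap_seconds(link_bits_per_sec, datagram_bytes)


if __name__ == "__main__":
    # Assumed: gigabit Ethernet, 8 KB datagrams, 30 s reassembly timeout.
    print(id_wrap_seconds(1e9, 8192))
    print(wraps_per_timeout(1e9, 8192, 30.0))
```

Under these assumptions the ID space is reused several times within a single
reassembly timeout, so any fragment queue left incomplete by a drop can be
completed by fragments of a later datagram that carries the same ID.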

When we looked into the problem, we considered implementing a per-route
parameter that lets the admin set a lower reassembly timeout. I think this
is a solution that both addresses the problem and does not interfere
with WAN traffic. The user space tools could even select reasonable
defaults based on the hardware type when setting up the device.
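
A minimal sketch of how a user space tool might pick such a default, purely
as an illustration: the function name, the safety factor, the floor, and the
assumed 8 KB datagram size are all hypothetical and not part of any existing
tool or kernel interface. The idea is to keep the timeout well below the time
it takes the 16-bit IP ID to wrap at line rate, and never above the usual
30 second default.

```python
# Hypothetical helper: suggest a per-route reassembly timeout from the
# link speed. All constants except the ID width and the 30 s default are
# illustrative assumptions.

IP_ID_SPACE = 1 << 16        # 16-bit IPv4 Identification field
DEFAULT_TIMEOUT_SEC = 30.0   # Linux default ipfrag_time
MIN_TIMEOUT_SEC = 0.1        # assumed floor, to tolerate some jitter


def suggest_reassembly_timeout(link_bits_per_sec,
                               datagram_bytes=8192,
                               safety=0.5):
    """Return a timeout (seconds) below the line-rate ID wrap time."""
    wrap_sec = IP_ID_SPACE * datagram_bytes * 8 / link_bits_per_sec
    candidate = safety * wrap_sec
    # Never exceed the normal default, never drop below the floor.
    return max(MIN_TIMEOUT_SEC, min(DEFAULT_TIMEOUT_SEC, candidate))


if __name__ == "__main__":
    for name, speed in [("10 Mbit", 10e6), ("1 Gbit", 1e9), ("10 Gbit", 10e9)]:
        print(name, suggest_reassembly_timeout(speed))
```

On slow links the wrap time is far longer than 30 seconds, so the default
stays untouched and WAN-style traffic is unaffected; only fast local links
get a shortened timeout.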

(We did not implement this because we decided to go for NFS over TCP by
default instead.)

> > In general handling a link where the RTT increases would seem
> > tricky with your scheme. Unlike TCP there is no retransmit
> > to save the day.
> 
> In the particular case (NFS over UDP), there is both a retransmit (done
> by RPC) and significant loss rate to start with. As long as the time-out
> is conservative, I don't think this has to affect other cases
> significantly.

NFS isn't the only application making heavy use of UDP.  Video and
audio do so too, and these don't have retransmits. Granted, these should
choose a packet size that is below the path MTU, but not all applications
do.

IMO an estimator such as you describe would need to be very sensitive
to jitter in fragment latencies, and it may be fairly hard to find a
solution that works from 802.11 up to 10GE. A per-route reassembly
timeout is probably a lot less of a headache.

Olaf
-- 
Olaf Kirch     |  The Hardware Gods hate me.
okir@suse.de   |
---------------+

Thread overview: 11+ messages
2004-07-15  5:57 Fragment ID wrap workaround (read-only, untested) Rusty Russell (IBM)
2004-07-15  8:28 ` David Stevens
2004-07-15  9:27   ` Andi Kleen
2004-07-15 14:49     ` David Stevens
2004-07-15 16:24       ` John Heffner
2004-07-15 16:27       ` Andi Kleen
2004-07-15 16:54         ` David Stevens
2004-07-15 17:02           ` Andi Kleen
2004-07-27 12:38       ` Olaf Kirch [this message]
  -- strict thread matches above, loose matches on Subject: below --
2004-07-15  6:36 Rusty Russell (IBM)
2004-07-15 17:34 ` Andi Kleen
