From: Tore Anderson <tore@fud.no>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev <netdev@vger.kernel.org>
Subject: Re: [RFC] ipv6: dst_allfrag() not taken into account by TCP
Date: Tue, 17 Jan 2012 21:03:35 +0100
Message-ID: <4F15D417.4050005@fud.no>
In-Reply-To: <1326817699.2259.32.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

* Eric Dumazet

> Bugzilla reference :
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=42572

Hi, and thanks for taking an interest in this issue!

I've got some general comments regarding running IPv6-only Linux servers
behind stateless IPv4/IPv6 translators. (They are not strictly related
to the above bug, but not completely off-topic either I hope.)

1) The Linux kernel doesn't allow reducing the effective IPv6 link MTU
(as recorded in the routing cache) to anything less than 1280. This
means that it can end up in a situation where the effective IPv6 link
MTU is greater than the actual IPv6 Path MTU. In the PCAP in the
bugzilla, they are 1280 and 1279, respectively. However, the kernel
doesn't appear to record the actual Path MTU anywhere, instead setting
the allfrag feature.
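
To make the behaviour described in 1) concrete, here is a minimal sketch
in Python. It is a simplified model of what the message describes, not
kernel code; the function name handle_ptb and its parameters are
hypothetical.

```python
IPV6_MIN_MTU = 1280  # minimum link MTU every IPv6 link must support


def handle_ptb(reported_mtu, route_mtu, allfrag=False):
    """Simplified model of a route's reaction to an ICMPv6 Packet Too Big.

    Returns the new (route_mtu, allfrag) pair. Illustration only.
    """
    if reported_mtu >= route_mtu:
        # A PTB never raises the MTU; keep the existing state.
        return route_mtu, allfrag
    if reported_mtu < IPV6_MIN_MTU:
        # Clamp at 1280 and remember only that fragmentation headers are
        # needed; the reported value itself is not stored.
        return IPV6_MIN_MTU, True
    return reported_mtu, allfrag


# The case from the bugzilla PCAP: the actual path MTU is 1279.
print(handle_ptb(1279, 1500))  # (1280, True)
```

The returned pair has no slot for 1279, which is the point being made:
the effective MTU stays at 1280 and only the allfrag flag records that
the path is narrower.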

While this is perfectly legal behaviour according to RFC 2460, from an
operational point of view it would have been nice if there were some way
(e.g. a sysctl) to tell the kernel to let an ICMPv6 PTB reduce the
effective IPv6 link MTU to values less than 1280 (at least down to the
minimum IPv4 MTU + 20 bytes). That would have removed the need for the
allfrag feature to come into play at all.

The RFC allows for this behaviour, too.
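
A sketch of the suggested knob, assuming the "+ 20 bytes" stands for the
size difference between the 40-byte IPv6 header and an option-less
20-byte IPv4 header. The allow_below_min parameter plays the role of the
hypothetical sysctl; no such sysctl is claimed to exist, and clamping at
the relaxed floor is likewise an assumption of this sketch.

```python
IPV6_MIN_MTU = 1280
IPV4_MIN_MTU = 68        # minimum IPv4 MTU (RFC 791)
HEADER_GROWTH = 40 - 20  # IPv6 header minus option-less IPv4 header

# Lowest value worth accepting under the relaxed policy: 68 + 20 = 88.
RELAXED_FLOOR = IPV4_MIN_MTU + HEADER_GROWTH


def accept_ptb(reported_mtu, route_mtu, allow_below_min=False):
    """Return (new_route_mtu, allfrag) for a route that starts with
    allfrag unset. allow_below_min models the hypothetical sysctl."""
    floor = RELAXED_FLOOR if allow_below_min else IPV6_MIN_MTU
    if reported_mtu >= route_mtu:
        # A PTB never raises the MTU.
        return route_mtu, False
    if reported_mtu < floor:
        # Below the floor: clamp and fall back to allfrag.
        return floor, True
    return reported_mtu, False


print(accept_ptb(1279, 1500))                        # (1280, True)
print(accept_ptb(1279, 1500, allow_below_min=True))  # (1279, False)
```

With the knob enabled, the PTB from the PCAP simply lowers the route MTU
to 1279 and allfrag is never set.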

2) Since the kernel doesn't keep track of the actual Path MTU (if it's
lower than 1280), when the allfrag feature gets set on a route, *every*
packet gets a fragmentation header. (Which is to be expected, really,
given its name.) However, this means that even tiny packets such as a
TCP SYN/ACK get the fragmentation header added. This is clearly not
particularly useful.

If the kernel had kept track of the effective Path MTU, and included the
IPv6 Fragmentation header *only* on packets that were larger than it,
this wouldn't have been a problem. (Alternatively, if it allowed the
effective link MTU to drop below 1280 that would also have avoided this
problem.)
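
The difference between the two policies in 2) can be sketched as two
predicates. Both function names and the example packet sizes are made up
for illustration.

```python
def fragment_header_allfrag(packet_len, allfrag):
    """Behaviour as described above: once allfrag is set on the route,
    every packet gets a Fragmentation header, regardless of size."""
    return allfrag


def fragment_header_pmtu(packet_len, path_mtu):
    """Suggested behaviour: add the Fragmentation header only to packets
    larger than the recorded Path MTU."""
    return packet_len > path_mtu


PATH_MTU = 1279   # actual Path MTU from the bugzilla PCAP
SYN_ACK_LEN = 80  # illustrative size of a small SYN/ACK packet

print(fragment_header_allfrag(SYN_ACK_LEN, allfrag=True))  # True
print(fragment_header_pmtu(SYN_ACK_LEN, PATH_MTU))         # False
print(fragment_header_pmtu(1280, PATH_MTU))                # True
```

The small SYN/ACK only picks up a Fragmentation header under the allfrag
policy; a full 1280-byte packet gets one under both.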

3) There seems to be a bug related to generating the TCP checksum of
SYN/ACK packets to destinations with the allfrag feature set. I just
submitted a bug report about this:

https://bugzilla.kernel.org/show_bug.cgi?id=42595

This makes the allfrag feature pretty much useless for me, as I can only
successfully establish a single TCP session from a client behind a <1280
MTU link for the entire lifetime of the routing cache entry.

Best regards,
-- 
Tore Anderson
