From: Chris Siebenmann <cks@cs.toronto.edu>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Chris Siebenmann <cks@cs.toronto.edu>, netdev@vger.kernel.org
Subject: Re: Bug? GRE tunnel periodically won't transmit some packets
Date: Thu, 10 Nov 2011 00:16:49 -0500
Message-ID: <20111110051649.505C8362D2@apps0.cs.toronto.edu>
In-Reply-To: eric.dumazet's message of Tue, 08 Nov 2011 15:05:53 +0100. <1320761153.3444.4.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

| So it appears the drop is in gre xmit because frame is bigger than
| mtu...
| 
| Maybe you receive some strange ICMP (ICMP_FRAG_NEEDED) from a buggy
| host ?
| 
| You could catch it with "tcpdump -s 1000 -i any icmp" maybe...

 The problem went away for several days and then came roaring back
just now. I couldn't see any outside ICMP messages like this while
the problem was happening. I did see ICMP messages, but they were
locally generated:

	IP 128.100.3.52 > 128.100.3.52: ICMP 128.100.3.51 unreachable - need to frag (mtu 478), length 556

(I verified that these were for the things that were stalling with
'tcpdump -vv -ee'.)
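
(For anyone wanting to pick such messages out of a saved capture, here
is a rough sketch. It assumes the one-line tcpdump text format shown
above and treats "source equals destination" as the sign of a locally
generated message; the function name is made up for illustration.)

```python
import re

# Matches the tcpdump summary line for an ICMP "fragmentation needed" error.
FRAG_RE = re.compile(
    r"IP (\S+) > (\S+): ICMP (\S+) unreachable - need to frag \(mtu (\d+)\)"
)

def local_frag_needed(lines):
    """Yield (host, unreachable_dest, mtu) for frag-needed ICMPs whose
    source and destination addresses are identical (assumed local)."""
    for line in lines:
        m = FRAG_RE.search(line)
        if m and m.group(1) == m.group(2):
            yield m.group(1), m.group(3), int(m.group(4))

SAMPLE = [
    "IP 128.100.3.52 > 128.100.3.52: ICMP 128.100.3.51 unreachable"
    " - need to frag (mtu 478), length 556",
]

if __name__ == "__main__":
    for host, dest, mtu in local_frag_needed(SAMPLE):
        print(host, dest, mtu)
```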

At the time that this was happening, I could see a lot of 'ip route show
table cache' entries like this:

	128.100.3.58 from 66.96.18.208 dev ppp0  src 66.96.18.208
	    cache  expires 286sec ipid 0xdeda mtu 552 rtt 44ms rttvar 30ms ssthresh 7 cwnd 9 iif lo

There were a bunch of other 'mtu 552' routes. Flushing the routing cache
(with 'ip route flush cache; ip route show table cache' to verify that
it had flushed) did not change the situation; the problem continued and
the 'mtu 552' routes came back as fast as I could check (they appeared
to be there the moment the routing cache was repopulated).

 In general the route cache appears to have strangely low MTUs listed
for the remote end of the tunnel (at least after the problem has
happened), e.g.:

	128.100.3.58 from 66.96.18.208 dev ppp0 
	    cache  expires 203sec ipid 0xdeda mtu 934 rtt 44ms rttvar 30ms ssthresh 7 cwnd 9
	128.100.3.58 from 66.96.18.208 dev ppp0  src 66.96.18.208 
	    cache  expires 203sec ipid 0xdeda mtu 934 rtt 44ms rttvar 30ms ssthresh 7 cwnd 9 iif lo

I would normally expect this to be 1492, the PPP link MTU.
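
(A minimal sketch of how one could flag such entries automatically. It
parses saved 'ip route show table cache' text rather than running ip
itself; the function name and the use of 1492 as the threshold are
assumptions for illustration only.)

```python
import re

MTU_RE = re.compile(r"\bmtu (\d+)\b")

def low_mtu_routes(cache_text, expected_mtu):
    """Return (destination, mtu) pairs whose cached MTU is below expected_mtu."""
    found = []
    dest = None
    for raw in cache_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("cache"):
            # Attribute line belonging to the most recent route line.
            m = MTU_RE.search(line)
            if m and dest is not None and int(m.group(1)) < expected_mtu:
                found.append((dest, int(m.group(1))))
        else:
            # Route line: the first field is the destination address.
            dest = line.split()[0]
    return found

SAMPLE = """\
128.100.3.58 from 66.96.18.208 dev ppp0  src 66.96.18.208
    cache  expires 286sec ipid 0xdeda mtu 552 rtt 44ms rttvar 30ms ssthresh 7 cwnd 9 iif lo
128.100.3.58 from 66.96.18.208 dev ppp0
    cache  mtu 1492 advmss 1452 hoplimit 64
"""

if __name__ == "__main__":
    for dest, mtu in low_mtu_routes(SAMPLE, 1492):
        print(dest, mtu)
```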

 This does not seem to happen on the Fedora 14 kernel (the one where
the problem doesn't happen). There the route is listed as:

	128.100.3.58 from 66.96.18.208 dev ppp0 
	    cache  mtu 1492 advmss 1452 hoplimit 64
	128.100.3.58 from 66.96.18.208 dev ppp0 
	    cache  mtu 1492 advmss 1452 hoplimit 64

(to quote what I *think* are the relevant entries.)
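
(As a sanity check on those healthy entries: the advmss of 1452 is what
you get from an MTU of 1492 after subtracting the IPv4 and TCP headers.
The sketch below assumes plain 20-byte headers with no options; the
helper name is made up.)

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options

def advmss_for(mtu):
    """Advertised MSS implied by a path MTU under the assumptions above."""
    return mtu - IPV4_HEADER - TCP_HEADER

if __name__ == "__main__":
    print(advmss_for(1492))
```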

 Since things may be pointing towards routing cache oddities, I should
mention that I have a somewhat peculiar policy-based routing setup
involving this GRE tunnel. While it's been working fine for literally
years, it's possible that recent changes in the area changed something
so that peculiar things now happen; if you think it might be relevant, I
can provide a dump of the routing rules and explain the setup.

(Part of why this occurs to me now is that I know it's possible to have
a connection to 128.100.3.58 (the remote end of the tunnel) that runs
through the tunnel itself, and I've seen such routes appear in the 'ip
route show table cache' output.)

		- cks
