From: Flavio Leitner <fbl@redhat.com>
To: Ivan Zahariev <famzah@icdsoft.com>
Cc: netdev@vger.kernel.org
Subject: Re: Unable to flush ICMP redirect routes in kernel 3.0+
Date: Thu, 17 Nov 2011 11:11:45 -0200
Message-ID: <20111117111145.252924f5@asterix.rh>
In-Reply-To: <4EC4C160.6010804@icdsoft.com>
On Thu, 17 Nov 2011 10:10:08 +0200
Ivan Zahariev <famzah@icdsoft.com> wrote:
> On 17.11.2011 02:33, Flavio Leitner wrote:
> > On Thu, 17 Nov 2011 00:32:18 +0200
> > Ivan Zahariev <famzah@icdsoft.com> wrote:
> >
> >> On 11/15/2011 11:09 PM, Eric Dumazet wrote:
> >>> On Tuesday, 15 November 2011 at 22:23 +0200, Ivan Zahariev wrote:
> >>>> Hello,
> >>>>
> >>>> We have changed nothing in our network infrastructure but only
> >>>> upgraded from Linux kernel 2.6.36.2 to 3.0.3. Here is the problem
> >>>> we are experiencing:
> >>>>
> >>>> ICMP redirected routes are cached forever, and they can be
> >>>> cleared only by a reboot.
> >>>>
> >> ### (bug #1) even though we flushed the route cache, the
> >> ### <redirected> route resurrects from somewhere, even without
> >> ### making any TCP requests
> >> ### this time what "ip" returns is consistent with the real
> >> ### (incorrect) routing behavior of machine5
> >> root@machine5:~# ip route flush cache
> >> root@machine5:~# ip route list cache match 8.8.4.4
> >> root@machine5:~# ip route get 8.8.4.4
> >> 8.8.4.4 via 192.168.0.120 dev eth0 src 192.168.0.244
> >> cache <redirected> ipid 0x303a
> >>
> >> ### only a reboot clears the cached <redirected> routes
> > IIRC, the cache flush doesn't affect the inetpeer where the
> > redirected gateway is now stored, so even after flushing the
> > route cache, the inetpeer will restore the old info later.
> >
> > fbl
> OK, I guess my questions now are:
> * How to flush the inetpeer (redirected cache info) without having to
> reboot the machine?
The inetpeer entry will expire after 10 minutes, provided you don't use
that specific host in the meantime.
> * Why "ip route" returns an incorrect route; example:
I am sorry for not being clear before. It is a bug, indeed.
> ### (bug #2) what "ip route" returns is inconsistent, because we are
> using the <redirected> route 192.168.0.120 in reality
> ### note that the count of the route lines increased with one
> root@machine5:~# ip route list cache match 8.8.4.4
> 8.8.4.4 from 192.168.0.244 tos lowdelay via 192.168.0.8 dev eth0
> cache ipid 0x303a
> 8.8.4.4 tos lowdelay via 192.168.0.8 dev eth0 src 192.168.0.244
> cache ipid 0x303a
> 8.8.4.4 via 192.168.0.8 dev eth0 src 192.168.0.244
> cache
> 8.8.4.4 from 192.168.0.244 tos lowdelay via 192.168.0.8 dev eth0
> cache ipid 0x303a
>
> ### After "ip route flush cache", the output of "ip route" gets
> consistent with the real routing behavior of machine5
> root@machine5:~# ip route flush cache
> root@machine5:~# ip route list cache match 8.8.4.4
> root@machine5:~# ip route get 8.8.4.4
> 8.8.4.4 via 192.168.0.120 dev eth0 src 192.168.0.244
> cache <redirected> ipid 0x303a
>
The redirected gateway is now stored in the inetpeer entry, which
represents a specific peer. In your case, you have one for 8.8.4.4.
When you flush the routing cache, everything is flushed except for
the inetpeer, as far as I can tell. Later, when you try to access
the host 8.8.4.4 again, the lookup creates a fresh route but
also finds the previous 8.8.4.4 inetpeer, so it reuses the
previous redirected gateway.
Therefore, the routing itself is fine, but there is no way to
invalidate or expire the related inetpeer entries when the flush
happens.
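To check which gateway a lookup really picks after a flush, the output
of "ip route get" can be parsed. This is only a rough sketch: the helper
names route_gateway and uses_expected_gateway are hypothetical, and it
assumes the "via <gateway>" field appears as in the output quoted above.
The gateway 192.168.0.8 in the usage note is taken from the quoted
session and is assumed to be the intended one.

```shell
#!/bin/sh
# Diagnostic sketch with hypothetical helpers (not part of iproute2).

# Print the address that follows "via" in `ip route get` output
# read from stdin.
route_gateway() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "via") { print $(i + 1); exit } }'
}

# Succeed if the route to destination $1 goes through gateway $2.
uses_expected_gateway() {
    gw=$(ip route get "$1" | route_gateway)
    [ "$gw" = "$2" ]
}

# Usage (addresses from the quoted session):
#   ip route flush cache
#   uses_expected_gateway 8.8.4.4 192.168.0.8 \
#       || echo "redirected gateway still in use"
```

Note that running "ip route get" is itself a lookup for that host, so it
may touch the inetpeer entry again.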
The inetpeer will expire eventually, so as a workaround you can wait
before trying again:
1) flush
2) wait for it to expire (10 min)
3) try again
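The three steps above could be scripted roughly as follows. This is a
sketch under assumptions, not a tested fix: IP_CMD, WAIT_SECS and the
function names are hypothetical, the 600-second default simply mirrors
the 10-minute expiry mentioned above, and the host must not be contacted
while waiting or the entry will not expire.

```shell
#!/bin/sh
# Workaround sketch: flush, wait for the inetpeer entry to expire, re-check.
# IP_CMD and WAIT_SECS are hypothetical knobs for illustration.
IP_CMD="${IP_CMD:-ip}"
WAIT_SECS="${WAIT_SECS:-600}"   # 10 min, the usual inetpeer time to live

# Succeed if `ip route get` output on stdin carries the <redirected> flag.
is_redirected() {
    grep -q '<redirected>'
}

# $1 = destination host, e.g. 8.8.4.4
retry_after_expiry() {
    dest="$1"
    "$IP_CMD" route flush cache || return 1    # 1) flush
    sleep "$WAIT_SECS"                         # 2) wait (do not use the host)
    if "$IP_CMD" route get "$dest" | is_redirected; then   # 3) try again
        echo "still redirected: $dest"
        return 1
    fi
    echo "redirect gone: $dest"
}

# Usage (as root):
#   retry_after_expiry 8.8.4.4
```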
If you know how to compile a kernel, try lowering the thresholds
below so that entries expire faster; then you have to wait less
instead of rebooting.
net/ipv4/inetpeer.c:
int inet_peer_minttl __read_mostly = 120 * HZ;	/* TTL under high load: 120 sec */
int inet_peer_maxttl __read_mostly = 10 * 60 * HZ;	/* usual time to live: 10 min */
The above is just a workaround, of course.
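For reference, those thresholds are expressed in jiffies, so N * HZ
means N seconds whatever HZ the kernel was built with. A small sketch of
that arithmetic follows; the helper names are hypothetical, HZ=1000 is
only an assumed build setting, and the 30-second value is an example,
not a recommended setting.

```shell
#!/bin/sh
# Arithmetic sketch with hypothetical helpers.
# HZ is the number of timer ticks (jiffies) per second, a kernel build option.

# $1 = seconds, $2 = HZ; prints the equivalent number of jiffies.
secs_to_jiffies() {
    echo $(( $1 * $2 ))
}

# $1 = jiffies, $2 = HZ; prints the equivalent number of whole seconds.
jiffies_to_secs() {
    echo $(( $1 / $2 ))
}

# Usage, assuming HZ=1000:
#   secs_to_jiffies 120 1000            # inet_peer_minttl as quoted above
#   secs_to_jiffies $((10 * 60)) 1000   # inet_peer_maxttl as quoted above
#   secs_to_jiffies 30 1000             # a hypothetical shorter maxttl
```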
I am going to be on vacation for the next couple of weeks, so I won't
be able to help fix this any time soon. However, I am pretty sure
someone else will help :)
fbl