From: Simon Roscic
Subject: gateway icmp redirect handling problem (3.0.36-3.0.23)
Date: Sat, 14 Jul 2012 20:57:13 +0200
Message-ID: <3383bade3b9d45d93fb42ffef3517c15@segfault.info>
Sender: netdev-owner@vger.kernel.org

Hello,

I'm experiencing the following problem with kernel versions 3.0.36
(down to 3.0.23):

on our network we all have one default gateway, it's 10.1.1.254, but
there are some networks for which we have another gateway, and for these
networks the default gateway sends an icmp redirect.

let's assume my test machine has ip 10.1.20.79, netmask 255.255.0.0,
and my default gateway is 10.1.1.254. i now ping the following ip:
10.109.98.11. my default gateway (10.1.1.254) now sends me an icmp
redirect to another gateway (10.1.1.1) ... and now everything works as
expected, i get the replies from 10.109.98.11, but not for long: after
approx. 60 (or so) seconds i only get "ping: sendmsg: Network is down".

(exact same problem with all other tcp/udp protocols, but i used ping
for the tests because it also prints the redirect messages to the
console)

so let's have a closer look:

not ok - kernel versions 3.0.36 down to 3.0.23:
-----------------------------------------------

test-simon:~ # ping 10.109.98.10
...
64 bytes from 10.109.98.11: icmp_seq=62 ttl=60 time=12.1 ms
64 bytes from 10.109.98.11: icmp_seq=63 ttl=60 time=11.6 ms
ping: sendmsg: Network is down
ping: sendmsg: Network is down

when looking at "ip neigh", the "ping: sendmsg: Network is down" message
appears at the exact moment when the arp entry for the default gateway
(10.1.1.254) gets removed from the arp cache:

ping "OK"
test-simon:~ # ip neigh
10.1.1.1 dev eth0 lladdr 00:00:0c:9f:f0:64 REACHABLE
10.1.1.254 dev eth0 lladdr 00:1a:64:8f:23:64 STALE

ping "dead"
test-simon:~ # ip neigh
10.1.1.1 dev eth0 lladdr 00:00:0c:9f:f0:64 REACHABLE

so it seems that when the default gateway is removed from the arp cache
something goes wrong in the kernel route handling. i don't know the
internals of the linux route handling, so now i need your help. any ideas
what's going wrong?

i did a lot of tests. the problem i described first appears in kernel
version 3.0.23. i found the following two commits in the changelog of
3.0.23:
(http://www.kernel.org/pub/linux/kernel/v3.0/ChangeLog-3.0.23)

commit 42ab5316ddcaa0de23e88e8a3d363c767b9ab0b3
Author: Eric Dumazet
Date: Fri Nov 18 15:24:32 2011 -0500

    ipv4: fix redirect handling

commit bebee22bcbf0026f92141990972bd5863ef9b69c
Author: Flavio Leitner
Date: Mon Oct 24 02:56:38 2011 -0400

    route: fix ICMP redirect validation

i then took the net/ipv4/route.c file from kernel 3.0.22 and replaced
the version in 3.0.23 with it. this reverts the two patches mentioned
above (if i haven't overlooked something). after that, the problem
disappears.

so those two patches surely fixed some problem, but for kernel versions
3.0.23-3.0.36 they broke the gateway icmp redirect handling as described
by me here.
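
the arp cache observation above can be checked mechanically. a minimal
python sketch, assuming the "ip neigh" output format shown in this mail;
the helper names (neigh_entries, gateway_cached) and the two sample
strings are hypothetical and only mirror the "OK" and "dead" listings:

```python
def neigh_entries(ip_neigh_output):
    """Map each neighbour IP to its state, given `ip neigh` text output."""
    entries = {}
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # first field is the neighbour IP, last field is the NUD state
        entries[fields[0]] = fields[-1]
    return entries


def gateway_cached(ip_neigh_output, gateway="10.1.1.254"):
    """True if the gateway still has an entry in the neighbour table."""
    return gateway in neigh_entries(ip_neigh_output)


# sample listings copied from the two states reported in this mail
PING_OK = (
    "10.1.1.1 dev eth0 lladdr 00:00:0c:9f:f0:64 REACHABLE\n"
    "10.1.1.254 dev eth0 lladdr 00:1a:64:8f:23:64 STALE\n"
)
PING_DEAD = "10.1.1.1 dev eth0 lladdr 00:00:0c:9f:f0:64 REACHABLE\n"

if __name__ == "__main__":
    print("ping OK:   default gateway cached =", gateway_cached(PING_OK))
    print("ping dead: default gateway cached =", gateway_cached(PING_DEAD))
```

fed with live "ip neigh" output once per second next to a running ping,
this would show whether the first "Network is down" line coincides with
the default gateway entry disappearing.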

i did some further tests with different kernel versions:

3.5-rc6: OK
3.4.4: OK
3.2.22: OK
3.0.1 - 3.0.22: OK
3.0.23 - 3.0.36: not OK
2.6.35.13: OK

now lets have a closer look at a kernel version which works:
------------------------------------------------------------

this is from 3.5-rc6, but 3.4.4, 3.2.22 and 2.6.35.13 also behave
exactly this way. 3.0.1-3.0.22 behave slightly differently, see note
below.

test-simon:~ # ping 10.109.98.11
PING 10.109.98.10 (10.109.98.11) 56(84) bytes of data.
64 bytes from 10.109.98.11: icmp_seq=1 ttl=60 time=15.2 ms
From 10.1.1.254: icmp_seq=2 Redirect Host(New nexthop: 10.1.1.1)
...

test-simon:~ # ip neigh
10.1.1.1 dev eth0 lladdr 00:00:0c:9f:f0:64 REACHABLE
10.1.1.254 dev eth0 lladdr 00:1a:64:8f:23:64 STALE

and after approx. 60 or so seconds:

test-simon:~ # ip neigh
10.1.1.1 dev eth0 lladdr 00:00:0c:9f:f0:64 REACHABLE

and ping (and everything else) is still working, as expected.

note:
-----

on 3.0.1-3.0.22:
i see lots of icmp redirects sent from the default gateway (10.1.1.254)
to my test machine. while running tcpdump on the default gateway
(10.1.1.254) i see every ping packet also arriving there, and also some
icmp redirect messages going out to my test machine.
but everything works, so i think my test machine is correctly talking to
the destination using the other gateway (10.1.1.1).
i also sniffed a windows 7 client pc, it looks the same there, so
possibly no problem, but i mention this because kernel versions 3.5-rc6,
3.4.4, 3.2.22 and 2.6.35.13 act differently (see below).

on 3.0.23-3.0.36:
i see lots of icmp redirects sent from the default gateway (10.1.1.254)
to my test machine. while running tcpdump on the default gateway
(10.1.1.254) i see up to 20 ping packets arriving there and also up to
17 icmp redirect messages going out to my test machine. after the 20th
ping packet i don't see further ping packets arriving at the default
gateway.
so my test machine is then only talking to the other gateway
(10.1.1.1), i think.

...
17:48:41.643952 IP 10.1.1.254 > 10.1.20.79: ICMP redirect 10.109.98.11 to host 10.1.1.1, length 92
...
17:48:44.649008 IP 10.1.20.79 > 10.109.98.11: ICMP echo request, id 30733, seq 20, length 64
17:48:44.649018 IP 10.1.20.79 > 10.109.98.11: ICMP echo request, id 30733, seq 20, length 64

on 3.5-rc6, 3.4.4, 3.2.22 and 2.6.35.13:
here it looks different, and for me this is the expected behavior, or
at least the behavior i have seen from lots of linux machines on my
network. i see 1-2 icmp redirects sent from the default gateway
(10.1.1.254) to my test machine. while running tcpdump on the default
gateway (10.1.1.254) i only see up to 2 ping packets arriving, then
nothing, so my test machine then seems to talk only to the other gateway
(10.1.1.1).

17:50:58.995894 IP 10.1.20.79 > 10.109.98.11: ICMP echo request, id 10766, seq 1, length 64
17:50:58.995914 IP 10.1.20.79 > 10.109.98.11: ICMP echo request, id 10766, seq 1, length 64
17:50:59.997260 IP 10.1.20.79 > 10.109.98.11: ICMP echo request, id 10766, seq 2, length 64
17:50:59.997277 IP 10.1.1.254 > 10.1.20.79: ICMP redirect 10.109.98.11 to host 10.1.1.1, length 92
17:50:59.997287 IP 10.1.20.79 > 10.109.98.11: ICMP echo request, id 10766, seq 2, length 64
...

(before someone asks why i "must" use kernel 3.0.x ... because these are
SLES 11 SP2 VMs and they currently ship kernel 3.0.34)

i hope i described the problem in a way that the kernel network
stack maintainers can understand. please contact me if you
have further questions, and please CC me as i am not subscribed to
linux-netdev. if you wish you can also CC this message or replies to
linux-kernel.

kind regards,
Simon Roscic.