From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jim Westfall
Subject: Re: NOARP devices and NOARP arp_cache entires
Date: Fri, 12 Jan 2018 13:11:18 -0800
Message-ID: <20180112211118.GE740@surrealistic.net>
References: <20180111210524.GD740@surrealistic.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
To: netdev@vger.kernel.org
Return-path:
Received: from whipper.surrealistic.net ([50.251.204.81]:48106 "EHLO whipper.surrealistic.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S964844AbeALVLT (ORCPT );
	Fri, 12 Jan 2018 16:11:19 -0500
Received: from localhost (whipper.surrealistic.net [local])
	by whipper.surrealistic.net (OpenSMTPD) with ESMTPA id e017116a for ;
	Fri, 12 Jan 2018 21:11:18 +0000 (UTC)
Content-Disposition: inline
In-Reply-To: <20180111210524.GD740@surrealistic.net>
Sender: netdev-owner@vger.kernel.org
List-ID:

Jim Westfall wrote [01.11.18]:
> Hi
>
> I'm seeing some weird behavior related to NOARP devices and NOARP
> entries in the arp cache.
>
> I have a couple gre tunnels between a linux box and an upstream router that
> send/recv a large amount of packets with unique ips. On the order of 10k+
> unique ips per second seen by the linux box.
>
> Each one of the ips ends up getting added to the arp cache as
>
> dev tun1234 lladdr 0.0.0.0 NOARP
>
> This of course makes the arp cache grow extremely fast and overflow.
> While I can tweak gc_thresh1/2/3 to make arp cache size huge, it doesn't
> seem like the right answer as the kernel is spinning its wheels having to
> add/expunge entries for the high rate of unique ips.
>
> I'm unclear why the kernel is even tracking them in the arp cache. If
> routing table says to route the packet out a NOARP interface then there is
> no arp, why involve the arp cache at all?
>
> You can see the behavior with the following
>
> [root@jwestfall:~]# uname -a
> Linux jwestfall.jwestfall.net 4.14.10_1 #1 SMP PREEMPT Sun Dec 31 20:23:29 UTC 2017 x86_64 GNU/Linux
>
> [root@jwestfall:~]# ip neigh show nud noarp
> 10.0.0.172 dev lo lladdr 00:00:00:00:00:00 NOARP
> 10.70.50.5 dev tun0 lladdr 08 NOARP
> 127.0.0.1 dev lo lladdr 00:00:00:00:00:00 NOARP
>
> Setup a bogus gre tunnel, the remote ip doesn't matter
> [root@jwestfall:~]# ip tunnel add tun1234 mode gre local 10.0.0.172 remote 10.0.0.156 dev enp4s0
> [root@jwestfall:~]# ip link set up dev tun1234
>
> Route a bogus network to the tunnel
> [root@jwestfall:~]# ip route add 192.168.111.0/24 dev tun1234
>
> Ping ips on the bogus network
> [root@jwestfall:~]# nmap -sP 192.168.111.0/24
>
> Starting Nmap 7.60 ( https://nmap.org ) at 2018-01-11 12:06 PST
> ...
>
> [root@jwestfall:~]# ip neigh show nud noarp
> 192.168.111.18 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.4 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.28 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.17 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.14 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.34 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.3 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.20 dev tun1234 lladdr 0.0.0.0 NOARP
> 10.0.0.172 dev lo lladdr 00:00:00:00:00:00 NOARP
> 192.168.111.6 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.27 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.13 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.33 dev tun1234 lladdr 0.0.0.0 NOARP
> ...
>
> Also somewhat interesting is that on older kernels (3.2 time range) these
> NOARP entries didn't get added for ipv4, but they did for ipv6 if you
> pushed ipv6 through the ipv4 tunnel.
>
> 2804:14c:f281:a1d8:61a2:a30:989f:3eb1 dev tun1 lladdr 0.0.0.0 NOARP
> 2607:8400:2122:4:e9f9:dbb8:2d44:75d1 dev tun2 lladdr 0.0.0.0 NOARP
>
> Thanks
> Jim Westfall
>
>

Digging into this a bit, older kernels had the following:

static struct neighbour *ipv4_neigh_lookup(const struct dst_entry *dst, const void *daddr)
{
	static const __be32 inaddr_any = 0;
	struct net_device *dev = dst->dev;
	const __be32 *pkey = daddr;
	struct neighbour *n;

	if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
		pkey = &inaddr_any;

which was forcing the hash key to be 0.0.0.0 for tunnels. This was removed
as part of a263b3093641fb1ec377582c90986a7fd0625184, which was part of a
larger set to "Disconnect neigh from dst_entry".

Would there be any aversion to me submitting a patch to mimic this older
behavior?

Thanks
jim
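
For illustration only, here is a small userspace sketch of the key-collapsing idea in the older code quoted above. It is not kernel code and not a proposed patch. The names sketch_neigh_key, sketch_count_entries and the SKETCH_IFF_* constants are hypothetical stand-ins, and it assumes the tunnel device has IFF_POINTOPOINT set. It shows that when the lookup key is forced to 0.0.0.0, any number of distinct destinations share a single neighbour entry instead of creating one entry each.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's IFF_LOOPBACK / IFF_POINTOPOINT. */
#define SKETCH_IFF_LOOPBACK    0x8u
#define SKETCH_IFF_POINTOPOINT 0x10u

/*
 * Pick the neighbour-table key for a destination address.
 * On loopback or point-to-point devices every destination collapses
 * to the same key (0.0.0.0), mirroring the older pkey = &inaddr_any logic.
 */
static uint32_t sketch_neigh_key(unsigned int dev_flags, uint32_t daddr)
{
	if (dev_flags & (SKETCH_IFF_LOOPBACK | SKETCH_IFF_POINTOPOINT))
		return 0; /* 0.0.0.0 */
	return daddr;
}

/*
 * Count how many distinct neighbour entries a list of destinations
 * would create on a device with the given flags. Quadratic on purpose:
 * this is a demonstration, not a hash table.
 */
static size_t sketch_count_entries(unsigned int dev_flags,
				   const uint32_t *daddrs, size_t n)
{
	size_t entries = 0;

	assert(daddrs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint32_t key = sketch_neigh_key(dev_flags, daddrs[i]);
		int seen = 0;

		/* Has an earlier destination already produced this key? */
		for (size_t j = 0; j < i; j++) {
			if (sketch_neigh_key(dev_flags, daddrs[j]) == key) {
				seen = 1;
				break;
			}
		}
		if (!seen)
			entries++;
	}
	return entries;
}
```

Calling sketch_count_entries() with SKETCH_IFF_POINTOPOINT and three different addresses yields one entry, while the same addresses with no flags set yield three, which is the growth pattern described in the original report.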