From: Jim Westfall <jwestfall@surrealistic.net>
To: netdev@vger.kernel.org
Subject: Re: NOARP devices and NOARP arp_cache entires
Date: Fri, 12 Jan 2018 13:11:18 -0800
Message-ID: <20180112211118.GE740@surrealistic.net>
In-Reply-To: <20180111210524.GD740@surrealistic.net>

Jim Westfall <jwestfall@surrealistic.net> wrote [01.11.18]:
> Hi
>
> I'm seeing some weird behavior related to NOARP devices and NOARP
> entries in the arp cache.
>
> I have a couple of gre tunnels between a linux box and an upstream router that
> send/recv a large number of packets with unique ips, on the order of 10k+
> unique ips per second seen by the linux box.
>
> Each one of the ips ends up getting added to the arp cache as
>
> <ip> dev tun1234 lladdr 0.0.0.0 NOARP
>
> This of course makes the arp cache grow extremely fast and overflow.
> While I can tweak gc_thresh1/2/3 to make the arp cache huge, it doesn't
> seem like the right answer, as the kernel is spinning its wheels
> adding/expunging entries for the high rate of unique ips.
>
> I'm unclear why the kernel is even tracking them in the arp cache. If the
> routing table says to route the packet out a NOARP interface, then there is
> no arp, so why involve the arp cache at all?
>
> You can see the behavior with the following
>
> [root@jwestfall:~]# uname -a
> Linux jwestfall.jwestfall.net 4.14.10_1 #1 SMP PREEMPT Sun Dec 31 20:23:29 UTC 2017 x86_64 GNU/Linux
>
> [root@jwestfall:~]# ip neigh show nud noarp
> 10.0.0.172 dev lo lladdr 00:00:00:00:00:00 NOARP
> 10.70.50.5 dev tun0 lladdr 08 NOARP
> 127.0.0.1 dev lo lladdr 00:00:00:00:00:00 NOARP
>
> Set up a bogus gre tunnel, the remote ip doesn't matter
> [root@jwestfall:~]# ip tunnel add tun1234 mode gre local 10.0.0.172 remote 10.0.0.156 dev enp4s0
> [root@jwestfall:~]# ip link set up dev tun1234
>
> Route a bogus network to the tunnel
> [root@jwestfall:~]# ip route add 192.168.111.0/24 dev tun1234
>
> Ping ips on the bogus network
> [root@jwestfall:~]# nmap -sP 192.168.111.0/24
>
> Starting Nmap 7.60 ( https://nmap.org ) at 2018-01-11 12:06 PST
> ...
>
> [root@jwestfall:~]# ip neigh show nud noarp
> 192.168.111.18 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.4 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.28 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.17 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.14 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.34 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.3 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.20 dev tun1234 lladdr 0.0.0.0 NOARP
> 10.0.0.172 dev lo lladdr 00:00:00:00:00:00 NOARP
> 192.168.111.6 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.27 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.13 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.33 dev tun1234 lladdr 0.0.0.0 NOARP
> ...
>
> Also somewhat interesting is that on older kernels (3.2 time range) these
> NOARP entries didn't get added for ipv4, but they did for ipv6 if you
> pushed ipv6 through the ipv4 tunnel.
>
> 2804:14c:f281:a1d8:61a2:a30:989f:3eb1 dev tun1 lladdr 0.0.0.0 NOARP
> 2607:8400:2122:4:e9f9:dbb8:2d44:75d1 dev tun2 lladdr 0.0.0.0 NOARP
>
> Thanks
> Jim Westfall
>
>
Digging into this a bit, older kernels had the following:

static struct neighbour *ipv4_neigh_lookup(const struct dst_entry *dst, const void *daddr)
{
	static const __be32 inaddr_any = 0;
	struct net_device *dev = dst->dev;
	const __be32 *pkey = daddr;
	struct neighbour *n;

	if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
		pkey = &inaddr_any;

which forced the hash key to be 0.0.0.0 for tunnels, so every destination
behind a tunnel shared a single entry. This was removed as part of commit
a263b3093641fb1ec377582c90986a7fd0625184, which was part of a larger set
to "Disconnect neigh from dst_entry".

Would there be any aversion to me submitting a patch to mimic this older
behavior?
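
To make the effect of that key selection concrete, here is a rough
user-space sketch. It is not kernel code and not a proposed patch: the
sk_* names and the toy table are made up for illustration, and the flag
constants are local stand-ins for IFF_LOOPBACK and IFF_POINTOPOINT. It
only compares how many entries you end up with when the key is forced to
0.0.0.0 versus when it is the destination address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the kernel's IFF_LOOPBACK / IFF_POINTOPOINT flags. */
#define SK_IFF_LOOPBACK    0x8u
#define SK_IFF_POINTOPOINT 0x10u

#define SK_TABLE_MAX 1024

/* Hypothetical toy neighbour table: just a set of distinct keys. */
struct sk_neigh_table {
	uint32_t keys[SK_TABLE_MAX];
	size_t count;
};

/* Older behaviour: loopback and point-to-point devices share key 0.0.0.0. */
static uint32_t sk_key_old(unsigned int dev_flags, uint32_t daddr)
{
	if (dev_flags & (SK_IFF_LOOPBACK | SK_IFF_POINTOPOINT))
		return 0;
	return daddr;
}

/* Behaviour described in this thread: always key on the destination. */
static uint32_t sk_key_new(unsigned int dev_flags, uint32_t daddr)
{
	(void)dev_flags;
	return daddr;
}

/* Add key if absent. Returns 1 if a new entry was created, 0 otherwise. */
static int sk_neigh_add(struct sk_neigh_table *t, uint32_t key)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->keys[i] == key)
			return 0;
	if (t->count == SK_TABLE_MAX)
		return 0; /* table full, nothing added */
	t->keys[t->count++] = key;
	return 1;
}

/* "Send" to n consecutive destinations and return the resulting entry count. */
static size_t sk_entries_after(uint32_t (*keyfn)(unsigned int, uint32_t),
			       unsigned int dev_flags, uint32_t first, size_t n)
{
	struct sk_neigh_table t = { .count = 0 };

	assert(n <= SK_TABLE_MAX);
	for (size_t i = 0; i < n; i++)
		sk_neigh_add(&t, keyfn(dev_flags, first + (uint32_t)i));
	return t.count;
}
```

With the point-to-point flag set, 254 destinations collapse into one
entry under the older key selection, while keying on the destination
creates one entry per address.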

Thanks
jim