From: "Marco C. Coelho"
Date: Thu, 25 Oct 2007 15:08:16 +0000
Subject: Re: [LARTC] neighbor table overflow
Message-Id: <4720B160.6060602@argontech.net>
References: <200710230146.27081.peet@altlinux.org>
In-Reply-To: <200710230146.27081.peet@altlinux.org>
To: lartc@vger.kernel.org

Looking into it further, an ip route shows:

64.0.0.0/8 via 64.202.224.1 dev eth0  proto zebra  metric 20 equalize

So the 64.0.0.0/8 announcement is coming into this box through OSPF (zebra).

The 169.254.0.0/16 is being automatically added through the sysconfig network scripts.  I'm looking into why.

In either case I still don't see why these entries would make the neighbor table overflow.  Could it have been the previous fix to the hosts file?

mc

Alexandru Dragoi wrote:
> Marco C. Coelho wrote:
>
>> the ip route with a grep for link returns:
>>
>> snip** too long
>> 64.202.227.198 dev ppp436  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.196 dev ppp421  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.197 dev ppp211  proto kernel  scope link  src 10.20.0.1
>> 64.202.227.194 dev ppp13  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.192 dev ppp404  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.254 dev ppp194  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.253 dev ppp130  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.252 dev ppp243  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.249 dev ppp195  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.248 dev ppp254  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.247 dev ppp235  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.242 dev ppp78  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.240 dev ppp328  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.237 dev ppp44  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.236 dev ppp122  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.234 dev ppp316  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.232 dev ppp132  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.231 dev ppp104  proto kernel  scope link  src 10.20.0.1
>> 64.202.227.226 dev ppp179  proto kernel  scope link  src 10.20.0.1
>> 64.202.224.0/24 dev eth0  proto kernel  scope link  src 64.202.224.8
>> 192.168.1.0/24 dev eth3  proto kernel  scope link  src 192.168.1.8
>> 169.254.0.0/16 dev eth3  scope link
>>
>
> The one above must be deleted, many redhat-like distros attach
> 169.254.0.0/16.
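
For anyone following along, the deletion suggested above could look roughly like the sketch below. It assumes a Red Hat-style system where the network initscripts add the zeroconf route unless NOZEROCONF is set in /etc/sysconfig/network. The helper name zeroconf_del_cmds is made up for this example. It reads `ip route` output and only prints the delete commands, so they can be checked before anything is run as root.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read `ip route` output on
# stdin and print one `ip route del` command per 169.254.0.0/16 route
# that is attached directly to a device. Nothing is executed here.
zeroconf_del_cmds() {
    awk '$1 == "169.254.0.0/16" && $2 == "dev" {
        print "ip route del " $1 " dev " $3
    }'
}

# Typical use, as root, after reviewing the printed commands:
#   ip route | zeroconf_del_cmds        # review only
#   ip route | zeroconf_del_cmds | sh   # apply
#
# Assumption: Red Hat-style initscripts. To stop the route from being
# re-added on the next network restart, add this line to
# /etc/sysconfig/network:
#   NOZEROCONF=yes
```

Printing the commands rather than running them is a deliberate choice, since this box carries live PPPoE sessions.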
  
>> All the pppoe terminations (pppd) are shown, as well as the last three
>> subnets.  I'll have to see where the 169.254.0.0/16 is coming from?
>>
>> mc
>>
>> Alexandru Dragoi wrote:
>>
>>> Marco C. Coelho wrote:
>>>
>>>> This box is doing a lot.  It terminates 1000 PPPoE connections,
>>>> provides traffic shaping using TC/HTB, authenticates all users via
>>>> Radius.  It also runs OSPF routing for the internal network.  Looking
>>>> at a simple route output I see all the PPP connections coming through
>>>> the box, and due to the OSPF I also see the rest of my network
>>>> announcements.  The only strange things are:
>>>>
>>>> 1.  The last man working on this box had mistakenly edited the hosts
>>>> file and added the machine name and complete domain name to the local
>>>> host 127.0.0.1 name.  It should only be pointed to the eth0
>>>> interface.   I have changed this.
>>>>
>>>> 2.  The route output is making an announcement
>>>>
>>>>    64.0.0.0        argontech.net   255.0.0.0       UG    20     0        0 eth0
>>>>
>>> This doesn't look dangerous for your problem, I was only talking about
>>> directly connected networks:
>>>
>>> # ip route |grep link
>>>
>>>> My public IP space is a /20 within that space, not the whole Class A.
>>>> I have not found which box is announcing this within my network yet.
>>>>
>>>> Jeff Welling wrote:
>>>>
>>>>>> On 10/23/07 06:56, Alexandru Dragoi wrote:
>>>>>>
>>>>>>> What about checking your routing table? you may have link routes
>>>>>>> for massive subnets (like 85.0.0.0/8 or 140.20.0.0/16). Some
>>>>>>> programs prefer to use "standard" netmask of classes A and B.
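
To make the check above concrete: with a link-scope route on an Ethernet interface the kernel treats every address in the prefix as directly reachable, so each address it tries to reach can take its own neighbor entry. A rough sketch for spotting wide ones follows. The helper name wide_link_routes and the default /16 cutoff are arbitrary choices for illustration, not something from the thread.

```shell
#!/bin/sh
# Hypothetical helper: read `ip route` output on stdin and print the
# scope-link routes whose prefix length is at most $1 (default 16).
# Host routes without a "/len" (such as the per-session ppp entries)
# and routes that go via a gateway are skipped.
wide_link_routes() {
    awk -v max="${1:-16}" '
    {
        link = 0
        for (i = 2; i < NF; i++)
            if ($i == "scope" && $(i + 1) == "link") link = 1
        n = split($1, p, "/")
        if (link && n == 2 && p[2] + 0 <= max + 0) print
    }'
}

# Typical use:
#   ip route | wide_link_routes       # /16 and wider
#   ip route | wide_link_routes 20    # /20 and wider
```

Run against the listing earlier in the thread, only the 169.254.0.0/16 entry on eth3 would be reported, since 64.0.0.0/8 goes via a gateway rather than being link-scoped.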
          
              
>>>>>> I'm betting that the OP has other things going on seeing as how
>>>>>> s/he mentioned PPPoE, which to my knowledge is a layer 2 protocol,
>>>>>> and thus not subject to typical routing scenarios.  In essence the
>>>>>> OP could have thousands of PPPoE connections terminating on one
>>>>>> system with the ARP cache having to deal with where to send traffic
>>>>>> to which MAC address. There is not a lot of room for routing in such
>>>>>> a scenario.

        
            
>>>>> I agree with Peter's suggestion, arpd.  I ran into the neighbor table
>>>>> overflow problem recently, at the hands of our ISP.  I was in the
>>>>> process of recompiling the kernel and mucking with arpd (I couldn't
>>>>> get it to run/start properly) when the problem disappeared as quickly
>>>>> as it showed up.  Lucky for me, this was some kind of ISP problem, I
>>>>> was able to determine that much through `tcpdump -i X -n arp`.
>>>>>
>>>>> My 'two cents' is that you try arpd, I did a bit of looking when I
>>>>> came across that problem and it seemed to be the last ditch effort
>>>>> when changing the gc threshold had no effect.  Wasn't able to confirm
>>>>> that it worked for sure though.
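
For reference, the "gc threshold" mentioned above is normally the set of net.ipv4.neigh.default.gc_thresh1/2/3 sysctls, with gc_thresh3 acting as the hard cap on entries. Below is a minimal sketch for seeing how close a box is to that cap. The helper name neigh_headroom is invented for this example, and the sysctl values in the comments are arbitrary test numbers, not a recommendation.

```shell
#!/bin/sh
# Hypothetical helper: given the current number of neighbor entries ($1)
# and the hard limit gc_thresh3 ($2), report how close the table is.
neigh_headroom() {
    count=$1
    limit=$2
    if [ "$count" -ge "$limit" ]; then
        echo "FULL: $count entries, limit $limit"
    else
        echo "ok: $count entries, limit $limit, $((limit - count)) free"
    fi
}

# Typical use on the affected box (IPv4 table). The count is only
# approximate, because `ip neigh show` hides some entry states by default:
#   neigh_headroom "$(ip -4 neigh show | grep -c .)" \
#                  "$(sysctl -n net.ipv4.neigh.default.gc_thresh3)"
#
# Raising the limits for a test, as root (example values only):
#   sysctl -w net.ipv4.neigh.default.gc_thresh1=1024
#   sysctl -w net.ipv4.neigh.default.gc_thresh2=4096
#   sysctl -w net.ipv4.neigh.default.gc_thresh3=8192
```

If the count sits near the limit while a wide link-scope route is present, that would point at the route rather than at the thresholds.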

>>>>> Cheers.
>>>>> _______________________________________________
>>>>> LARTC mailing list
>>>>> LARTC@mailman.ds9a.nl
>>>>> http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc

      
          