* Re: [LARTC] neighbor table overflow
@ 2007-10-22 21:46 Peter V. Saveliev
  2007-10-22 21:46 ` Marco C. Coelho
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: Peter V. Saveliev @ 2007-10-22 21:46 UTC (permalink / raw)
  To: lartc

<skip />
>
> # Added to stop "neighbor table overflow" messages in the kernel
> net.ipv4.neigh.default.gc_thresh1=512
> net.ipv4.neigh.default.gc_thresh2=2048
> net.ipv4.neigh.default.gc_thresh3=4096
> # Added to increase IP conntrack number (was getting to max)
> net.ipv4.ip_conntrack_max=99999
>
> to sysctl.conf to increase the size, but this only seems to delay the
> problem.
>
> Any thoughts?
<skip />

try arpd?

-- 
Peter V. Saveliev
_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc


* [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
@ 2007-10-22 21:46 ` Marco C. Coelho
  2007-10-22 22:35 ` Grant Taylor
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Marco C. Coelho @ 2007-10-22 21:46 UTC (permalink / raw)
  To: lartc

I've got a Linux router pushing 600-1000 PPPoE connections through it.
I'm getting an on-screen error, "Neighbor Table Overflow", after this box
has been up for between 1 week and 1 month.  When this happens, routing
slows to a crawl, if it works at all, and then dies.  I've added:

# Added to stop "neighbor table overflow" messages in the kernel
net.ipv4.neigh.default.gc_thresh1=512
net.ipv4.neigh.default.gc_thresh2=2048
net.ipv4.neigh.default.gc_thresh3=4096
# Added to increase IP conntrack number (was getting to max)
net.ipv4.ip_conntrack_max=99999

to sysctl.conf to increase the size, but this only seems to delay the 
problem.
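
For readers tuning the same knobs, here is a commented sketch of such a
sysctl.conf fragment.  The threshold meanings follow the kernel's
ip-sysctl documentation; the numbers are illustrative only and are not a
recommendation for any particular box.

```conf
# gc_thresh1: below this many entries the garbage collector does not purge.
net.ipv4.neigh.default.gc_thresh1 = 512
# gc_thresh2: soft limit; above it, entries older than 5 seconds may be purged.
net.ipv4.neigh.default.gc_thresh2 = 2048
# gc_thresh3: hard limit on non-permanent entries; hitting it is what
# produces the "neighbour table overflow" message.
net.ipv4.neigh.default.gc_thresh3 = 4096
```

Running "sysctl -p" as root reloads the file without a reboot.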

Any thoughts?

Marco

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
  2007-10-22 21:46 ` Marco C. Coelho
@ 2007-10-22 22:35 ` Grant Taylor
  2007-10-23 11:56 ` Alexandru Dragoi
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Grant Taylor @ 2007-10-22 22:35 UTC (permalink / raw)
  To: lartc

On 10/22/07 16:46, Peter V. Saveliev wrote:
> try arpd?

You took the statement right out from under my fingertips.



Grant. . . .

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
  2007-10-22 21:46 ` Marco C. Coelho
  2007-10-22 22:35 ` Grant Taylor
@ 2007-10-23 11:56 ` Alexandru Dragoi
  2007-10-23 20:32 ` Grant Taylor
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Alexandru Dragoi @ 2007-10-23 11:56 UTC (permalink / raw)
  To: lartc

Marco C. Coelho wrote:
> I've got a linux router pushing 600-1000 pppoe connections through
> it.  I'm getting a screen error "Neighbor Table Overflow" after this
> box has been up for between 1 week and 1 month.  When this is
> happening, routing slows to a crawl if at all.  Then dies.  I've added:
>
> # Added to stop "neighbor table overflow" messages in the kernel
> net.ipv4.neigh.default.gc_thresh1=512
> net.ipv4.neigh.default.gc_thresh2=2048
> net.ipv4.neigh.default.gc_thresh3=4096
> # Added to increase IP conntrack number (was getting to max)
> net.ipv4.ip_conntrack_max=99999
>
> to sysctl.conf to increase the size, but this only seems to delay the
> problem.
>
> Any thoughts?
>
> Marco
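What about checking your routing table? You may have link routes for
massive subnets (like 85.0.0.0/8 or 140.20.0.0/16). Some programs default
to the "standard" classful netmasks of classes A and B. With such a route
the kernel treats every address in the subnet as directly reachable and
creates a neighbor entry for each one it tries to reach.


# ip route | grep link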
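
A sketch of how that check could be narrowed to only the suspicious
routes, assuming the usual iproute2 "ip route" output format.  The helper
name wide_link_routes and the /24 cutoff are made up for illustration;
pick whatever cutoff suits your network.

```shell
#!/bin/sh
# Read `ip route` output on stdin and print "destination device" for every
# directly connected (scope link) route whose prefix is shorter than the
# given length (default 24).  Hypothetical helper, not part of iproute2.
wide_link_routes() {
    max_prefix="${1:-24}"
    awk -v max="$max_prefix" '
        / scope link/ {
            n = split($1, parts, "/")
            # A destination without "/len" is a host route, i.e. /32.
            len = (n == 2) ? parts[2] : 32
            if (len + 0 < max + 0) print $1, $3
        }'
}

# Usage on a live system:
#   ip route | wide_link_routes 24
```

Host routes such as the per-session PPP entries are skipped, so only wide
on-link subnets remain in the output.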

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (2 preceding siblings ...)
  2007-10-23 11:56 ` Alexandru Dragoi
@ 2007-10-23 20:32 ` Grant Taylor
  2007-10-23 20:43 ` Jeff Welling
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Grant Taylor @ 2007-10-23 20:32 UTC (permalink / raw)
  To: lartc

On 10/23/07 06:56, Alexandru Dragoi wrote:
> What about checking your routing table? you may have link routes for 
> massive subnets (like 85.0.0.0/8 or 140.20.0.0/16). Some programs 
> prefer to use "standard" netmask of classes A and B.

I'm betting that the OP has other things going on, seeing as how s/he 
mentioned PPPoE, which to my knowledge is a layer 2 protocol, and thus 
not subject to typical routing scenarios.  In essence the OP could have 
thousands of PPPoE connections terminating on one system with the ARP 
cache having to deal with where to send traffic to which MAC address. 
There is not a lot of room for routing in such a scenario.



Grant. . . .

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (3 preceding siblings ...)
  2007-10-23 20:32 ` Grant Taylor
@ 2007-10-23 20:43 ` Jeff Welling
  2007-10-23 21:04 ` Grant Taylor
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Jeff Welling @ 2007-10-23 20:43 UTC (permalink / raw)
  To: lartc


> On 10/23/07 06:56, Alexandru Dragoi wrote:
>> What about checking your routing table? you may have link routes  
>> for massive subnets (like 85.0.0.0/8 or 140.20.0.0/16). Some  
>> programs prefer to use "standard" netmask of classes A and B.
>
> I'm betting that the OP has other things going on seeing has how s/ 
> he mentioned PPPoE, which to my knowledge is a layer 2 protocol,  
> and thus not subject to typical routing scenarios.  In essence the  
> OP could have thousands of PPPoE connections terminating on one  
> system with the ARP cache having to deal with where to send traffic  
> to which MAC address. There is not a lot of room for routing in  
> such a scenario.
>
I agree with Peter's suggestion, arpd.  I ran into the neighbor table
overflow problem recently, at the hands of our ISP.  I was in the
process of recompiling the kernel and mucking with arpd (I couldn't
get it to start properly) when the problem disappeared as quickly
as it showed up.  Luckily for me, it was some kind of ISP problem; I
was able to determine that much through `tcpdump -i X -n arp`.

My 'two cents' is that you try arpd.  I did a bit of looking when I
came across that problem, and it seemed to be the last-ditch effort
when changing the gc thresholds had no effect.  I wasn't able to confirm
that it worked, though.

Cheers.

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (4 preceding siblings ...)
  2007-10-23 20:43 ` Jeff Welling
@ 2007-10-23 21:04 ` Grant Taylor
  2007-10-23 21:10 ` Marco C. Coelho
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Grant Taylor @ 2007-10-23 21:04 UTC (permalink / raw)
  To: lartc

On 10/23/07 16:10, Marco C. Coelho wrote:
> This box is doing a lot.  It terminates 1000 PPPoE connections, provides 
> traffic shaping using TC/HTB, authenticates all users via Radius.  It 
> also runs OSPF routing for the internal network.  Looking at a simple 
> route output I see all the PPP connections coming through the box, and 
> due to the OSPF I also see the rest of my network announcements.  The 
> only strange things are:

That's just a few things to do on one box.  How well is it handling it 
if I can ask (aside from the problem that you are working on)?

> 1.  The last man working on this box had mistakenly edited the hosts 
> file and added the machine name and complete domain name to the local 
> host 127.0.0.1 name.  It should only be pointed to the eth0 interface.   
> I have changed this.

D'oh!

> 2.  The route output is making an announcement
> 
>    64.0.0.0        argontech.net   255.0.0.0       UG    20     0        0 eth0
> 
> My public IP space is a /20 within that space, not the whole Class A.  I 
> have not found which box is announcing this within my network yet.

I would think that you could extract that information from OSPF, or at 
least the system that is advertising and work backwards until you find 
the ultimate culprit.



Grant. . . .

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (5 preceding siblings ...)
  2007-10-23 21:04 ` Grant Taylor
@ 2007-10-23 21:10 ` Marco C. Coelho
  2007-10-23 21:23 ` Grant Taylor
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Marco C. Coelho @ 2007-10-23 21:10 UTC (permalink / raw)
  To: lartc


This box is doing a lot.  It terminates 1000 PPPoE connections, provides 
traffic shaping using TC/HTB, authenticates all users via Radius.  It 
also runs OSPF routing for the internal network.  Looking at a simple 
route output I see all the PPP connections coming through the box, and 
due to the OSPF I also see the rest of my network announcements.  The 
only strange things are:

1.  The last man working on this box had mistakenly edited the hosts
file and added the machine name and complete domain name to the
localhost (127.0.0.1) entry.  They should only point to the eth0 address.
I have changed this.

2.  The route output is making an announcement

    64.0.0.0        argontech.net   255.0.0.0       UG    20     0        0 eth0

My public IP space is a /20 within that space, not the whole Class A.  I 
have not found which box is announcing this within my network yet.





Jeff Welling wrote:
>
>> On 10/23/07 06:56, Alexandru Dragoi wrote:
>>> What about checking your routing table? you may have link routes for 
>>> massive subnets (like 85.0.0.0/8 or 140.20.0.0/16). Some programs 
>>> prefer to use "standard" netmask of classes A and B.
>>
>> I'm betting that the OP has other things going on seeing has how s/he 
>> mentioned PPPoE, which to my knowledge is a layer 2 protocol, and 
>> thus not subject to typical routing scenarios.  In essence the OP 
>> could have thousands of PPPoE connections terminating on one system 
>> with the ARP cache having to deal with where to send traffic to which 
>> MAC address. There is not a lot of room for routing in such a scenario.
>>
> I agree with Peter's suggestion, arpd.  I ran into the neighbor table 
> overflow problem recently, at the hands of our ISP.  I was in the 
> process of recompiling the kernel and mucking with arpd (I couldn't 
> get it to run/start properly) when the problem disappeared as quickly 
> as it showed up.  Lucky for me, this was some kind of ISP problem, I 
> was able to determine that much through `tcpdump -i X -n arpd`.
>
> My 'two cents' is that you try arpd, I did a bit of looking when I 
> came across that problem and it seemed to be the last ditch effort 
> when changing the gc threshold had no effect.  Wasn't able to confirm 
> that it worked for sure though.
>
> Cheers.

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (6 preceding siblings ...)
  2007-10-23 21:10 ` Marco C. Coelho
@ 2007-10-23 21:23 ` Grant Taylor
  2007-10-23 21:27 ` Marco C. Coelho
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Grant Taylor @ 2007-10-23 21:23 UTC (permalink / raw)
  To: lartc

On 10/23/07 16:27, Marco C. Coelho wrote:
> Is there a way to probe the kernel to find out how big the neighbor 
> table is on a regular basis?  Without making a smoking hole of course.

Other than querying the ARP cache, I'm not aware of anything.  I'm sure 
there is a way within the kernel to see how many entries are in the ARP 
cache, but I am the wrong person to ask.
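
One possible way to read the count, offered as a sketch: on kernels that
expose /proc/net/stat/arp_cache, the first column of the data rows is the
current number of entries, printed in hexadecimal.  The helper name
neigh_entries is made up here, and whether the file exists on a given
kernel is an assumption to verify.

```shell
#!/bin/sh
# Read /proc/net/stat/arp_cache-style text on stdin (one header line, then
# one data row per CPU) and print the entry count in decimal.
# Hypothetical helper; assumes the first column is a hex entry count.
neigh_entries() {
    awk 'NR == 2 { print $1 }' | {
        read -r hex
        echo "$((0x$hex))"
    }
}

# Usage on a live system, e.g. from cron or a loop, to log the count
# next to the hard limit:
#   echo "entries=$(neigh_entries < /proc/net/stat/arp_cache)" \
#        "limit=$(sysctl -n net.ipv4.neigh.default.gc_thresh3)"
```

Reading a /proc file is passive, so polling it once a minute should not
disturb the box.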

> BTW, Traffic shaping is also controlled via Radius.

*nod*

> It's actually pretty happy on a single processor, single core AMD 3000 
> with 1.5 G of RAM (it was not happy with 512K!!!).   I've actually got a 
> new Dual Core, Dual Processor box loaded and ready to place in 
> production, but would like to fix this problem first.  Unfortunately it 
> takes between a week and a month for the problem to surface.

Good.  It is nice to see Linux doing some things that Cisco and others 
tried to dominate for so long.

> I'm working on it, but time is slim today (but not me)!

I wonder if you can turn up debugging on your OSPF daemon to see who / 
what is being advertised.



Grant. . . .

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (7 preceding siblings ...)
  2007-10-23 21:23 ` Grant Taylor
@ 2007-10-23 21:27 ` Marco C. Coelho
  2007-10-24 10:19 ` Alexandru Dragoi
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Marco C. Coelho @ 2007-10-23 21:27 UTC (permalink / raw)
  To: lartc

Is there a way to probe the kernel to find out how big the neighbor 
table is on a regular basis?  Without making a smoking hole of course.

BTW, Traffic shaping is also controlled via Radius.




Grant Taylor wrote:
> On 10/23/07 16:10, Marco C. Coelho wrote:
>> This box is doing a lot.  It terminates 1000 PPPoE connections, 
>> provides traffic shaping using TC/HTB, authenticates all users via 
>> Radius.  It also runs OSPF routing for the internal network.  Looking 
>> at a simple route output I see all the PPP connections coming through 
>> the box, and due to the OSPF I also see the rest of my network 
>> announcements.  The only strange things are:
>
> That's just a few things to do on one box.  How well is it handling it 
> if I can ask (aside from the problem that you are working on)?

It's actually pretty happy on a single processor, single core AMD 3000 
with 1.5 G of RAM (it was not happy with 512K!!!).   I've actually got a 
new Dual Core, Dual Processor box loaded and ready to place in 
production, but would like to fix this problem first.  Unfortunately it 
takes between a week and a month for the problem to surface.

model name      : AMD Athlon(tm) 64 Processor 3000+
stepping        : 0
cpu MHz         : 2000.000
cache size      : 512 KB

             total       used       free     shared    buffers     cached
Mem:       1554796    1044324     510472          0     221180     430860
-/+ buffers/cache:     392284    1162512
Swap:      4096496        148    4096348

ping times through this box:

64 bytes from f1.www.vip.mud.yahoo.com (209.191.93.52): icmp_seq=1 ttl=58 time=7.74 ms
64 bytes from f1.www.vip.mud.yahoo.com (209.191.93.52): icmp_seq=2 ttl=58 time=8.25 ms
64 bytes from f1.www.vip.mud.yahoo.com (209.191.93.52): icmp_seq=3 ttl=58 time=8.36 ms
64 bytes from f1.www.vip.mud.yahoo.com (209.191.93.52): icmp_seq=4 ttl=58 time=11.9 ms
64 bytes from f1.www.vip.mud.yahoo.com (209.191.93.52): icmp_seq=5 ttl=58 time=8.39 ms


>
>> 1.  The last man working on this box had mistakenly edited the hosts 
>> file and added the machine name and complete domain name to the local 
>> host 127.0.0.1 name.  It should only be pointed to the eth0 
>> interface.   I have changed this.
>
> Dough!
>
>> 2.  The route output is making an announcement
>>
>>    64.0.0.0        argontech.net   255.0.0.0       UG    20     0        0 eth0
>>
>> My public IP space is a /20 within that space, not the whole Class 
>> A.  I have not found which box is announcing this within my network yet.
>
> I would think that you could extract that information from OSPF, or at 
> least the system that is advertising and work backwards until you find 
> the ultimate culprit.

I'm working on it, but time is slim today (but not me)!

>
>
>
> Grant. . . .

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (8 preceding siblings ...)
  2007-10-23 21:27 ` Marco C. Coelho
@ 2007-10-24 10:19 ` Alexandru Dragoi
  2007-10-24 15:19 ` Marco C. Coelho
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Alexandru Dragoi @ 2007-10-24 10:19 UTC (permalink / raw)
  To: lartc

Marco C. Coelho wrote:
>
> This box is doing a lot.  It terminates 1000 PPPoE connections,
> provides traffic shaping using TC/HTB, authenticates all users via
> Radius.  It also runs OSPF routing for the internal network.  Looking
> at a simple route output I see all the PPP connections coming through
> the box, and due to the OSPF I also see the rest of my network
> announcements.  The only strange things are:
>
> 1.  The last man working on this box had mistakenly edited the hosts
> file and added the machine name and complete domain name to the local
> host 127.0.0.1 name.  It should only be pointed to the eth0
> interface.   I have changed this.
>
> 2.  The route output is making an announcement
>
>    64.0.0.0        argontech.net   255.0.0.0       UG    20     0        0 eth0

This doesn't look dangerous for your problem; I was only talking about
directly connected networks:

# ip route | grep link

>
> My public IP space is a /20 within that space, not the whole Class A. 
> I have not found which box is announcing this within my network yet.
>
>
>
>
>
> Jeff Welling wrote:
>>
>>> On 10/23/07 06:56, Alexandru Dragoi wrote:
>>>> What about checking your routing table? you may have link routes
>>>> for massive subnets (like 85.0.0.0/8 or 140.20.0.0/16). Some
>>>> programs prefer to use "standard" netmask of classes A and B.
>>>
>>> I'm betting that the OP has other things going on seeing has how
>>> s/he mentioned PPPoE, which to my knowledge is a layer 2 protocol,
>>> and thus not subject to typical routing scenarios.  In essence the
>>> OP could have thousands of PPPoE connections terminating on one
>>> system with the ARP cache having to deal with where to send traffic
>>> to which MAC address. There is not a lot of room for routing in such
>>> a scenario.
>>>
>> I agree with Peter's suggestion, arpd.  I ran into the neighbor table
>> overflow problem recently, at the hands of our ISP.  I was in the
>> process of recompiling the kernel and mucking with arpd (I couldn't
>> get it to run/start properly) when the problem disappeared as quickly
>> as it showed up.  Lucky for me, this was some kind of ISP problem, I
>> was able to determine that much through `tcpdump -i X -n arpd`.
>>
>> My 'two cents' is that you try arpd, I did a bit of looking when I
>> came across that problem and it seemed to be the last ditch effort
>> when changing the gc threshold had no effect.  Wasn't able to confirm
>> that it worked for sure though.
>>
>> Cheers.

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (9 preceding siblings ...)
  2007-10-24 10:19 ` Alexandru Dragoi
@ 2007-10-24 15:19 ` Marco C. Coelho
  2007-10-24 16:06 ` Alexandru Dragoi
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Marco C. Coelho @ 2007-10-24 15:19 UTC (permalink / raw)
  To: lartc


The ip route output, grepped for "link", returns:

snip** too long
64.202.227.198 dev ppp436  proto kernel  scope link  src 10.20.1.1
64.202.227.196 dev ppp421  proto kernel  scope link  src 10.20.1.1
64.202.227.197 dev ppp211  proto kernel  scope link  src 10.20.0.1
64.202.227.194 dev ppp13  proto kernel  scope link  src 10.20.1.1
64.202.227.192 dev ppp404  proto kernel  scope link  src 10.20.1.1
64.202.227.254 dev ppp194  proto kernel  scope link  src 10.20.1.1
64.202.227.253 dev ppp130  proto kernel  scope link  src 10.20.1.1
64.202.227.252 dev ppp243  proto kernel  scope link  src 10.20.1.1
64.202.227.249 dev ppp195  proto kernel  scope link  src 10.20.1.1
64.202.227.248 dev ppp254  proto kernel  scope link  src 10.20.1.1
64.202.227.247 dev ppp235  proto kernel  scope link  src 10.20.1.1
64.202.227.242 dev ppp78  proto kernel  scope link  src 10.20.1.1
64.202.227.240 dev ppp328  proto kernel  scope link  src 10.20.1.1
64.202.227.237 dev ppp44  proto kernel  scope link  src 10.20.1.1
64.202.227.236 dev ppp122  proto kernel  scope link  src 10.20.1.1
64.202.227.234 dev ppp316  proto kernel  scope link  src 10.20.1.1
64.202.227.232 dev ppp132  proto kernel  scope link  src 10.20.1.1
64.202.227.231 dev ppp104  proto kernel  scope link  src 10.20.0.1
64.202.227.226 dev ppp179  proto kernel  scope link  src 10.20.0.1
64.202.224.0/24 dev eth0  proto kernel  scope link  src 64.202.224.8
192.168.1.0/24 dev eth3  proto kernel  scope link  src 192.168.1.8
169.254.0.0/16 dev eth3  scope link

All the PPPoE terminations (pppd) are shown, as well as the last three 
subnets.  I'll have to see where the 169.254.0.0/16 is coming from.

mc




Alexandru Dragoi wrote:
> Marco C. Coelho wrote:
>   
>> This box is doing a lot.  It terminates 1000 PPPoE connections,
>> provides traffic shaping using TC/HTB, authenticates all users via
>> Radius.  It also runs OSPF routing for the internal network.  Looking
>> at a simple route output I see all the PPP connections coming through
>> the box, and due to the OSPF I also see the rest of my network
>> announcements.  The only strange things are:
>>
>> 1.  The last man working on this box had mistakenly edited the hosts
>> file and added the machine name and complete domain name to the local
>> host 127.0.0.1 name.  It should only be pointed to the eth0
>> interface.   I have changed this.
>>
>> 2.  The route output is making an announcement
>>
>>    64.0.0.0        argontech.net   255.0.0.0       UG    20     0        0 eth0
>>     
>
> This doesn't look dangerous for your problem, I was only talking about
> directly connected networks:
>
> # ip route |grep link
>
>   
>> My public IP space is a /20 within that space, not the whole Class A. 
>> I have not found which box is announcing this within my network yet.
>>
>>
>>
>>
>>
>> Jeff Welling wrote:
>>     
>>>> On 10/23/07 06:56, Alexandru Dragoi wrote:
>>>>         
>>>>> What about checking your routing table? you may have link routes
>>>>> for massive subnets (like 85.0.0.0/8 or 140.20.0.0/16). Some
>>>>> programs prefer to use "standard" netmask of classes A and B.
>>>>>           
>>>> I'm betting that the OP has other things going on seeing has how
>>>> s/he mentioned PPPoE, which to my knowledge is a layer 2 protocol,
>>>> and thus not subject to typical routing scenarios.  In essence the
>>>> OP could have thousands of PPPoE connections terminating on one
>>>> system with the ARP cache having to deal with where to send traffic
>>>> to which MAC address. There is not a lot of room for routing in such
>>>> a scenario.
>>>>
>>>>         
>>> I agree with Peter's suggestion, arpd.  I ran into the neighbor table
>>> overflow problem recently, at the hands of our ISP.  I was in the
>>> process of recompiling the kernel and mucking with arpd (I couldn't
>>> get it to run/start properly) when the problem disappeared as quickly
>>> as it showed up.  Lucky for me, this was some kind of ISP problem, I
>>> was able to determine that much through `tcpdump -i X -n arpd`.
>>>
>>> My 'two cents' is that you try arpd, I did a bit of looking when I
>>> came across that problem and it seemed to be the last ditch effort
>>> when changing the gc threshold had no effect.  Wasn't able to confirm
>>> that it worked for sure though.
>>>
>>> Cheers.

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (10 preceding siblings ...)
  2007-10-24 15:19 ` Marco C. Coelho
@ 2007-10-24 16:06 ` Alexandru Dragoi
  2007-10-25 15:08 ` Marco C. Coelho
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Alexandru Dragoi @ 2007-10-24 16:06 UTC (permalink / raw)
  To: lartc

Marco C. Coelho wrote:
>
> the ip route with a grep for link returns:
>
> snip** too long
> 64.202.227.198 dev ppp436  proto kernel  scope link  src 10.20.1.1
> 64.202.227.196 dev ppp421  proto kernel  scope link  src 10.20.1.1
> 64.202.227.197 dev ppp211  proto kernel  scope link  src 10.20.0.1
> 64.202.227.194 dev ppp13  proto kernel  scope link  src 10.20.1.1
> 64.202.227.192 dev ppp404  proto kernel  scope link  src 10.20.1.1
> 64.202.227.254 dev ppp194  proto kernel  scope link  src 10.20.1.1
> 64.202.227.253 dev ppp130  proto kernel  scope link  src 10.20.1.1
> 64.202.227.252 dev ppp243  proto kernel  scope link  src 10.20.1.1
> 64.202.227.249 dev ppp195  proto kernel  scope link  src 10.20.1.1
> 64.202.227.248 dev ppp254  proto kernel  scope link  src 10.20.1.1
> 64.202.227.247 dev ppp235  proto kernel  scope link  src 10.20.1.1
> 64.202.227.242 dev ppp78  proto kernel  scope link  src 10.20.1.1
> 64.202.227.240 dev ppp328  proto kernel  scope link  src 10.20.1.1
> 64.202.227.237 dev ppp44  proto kernel  scope link  src 10.20.1.1
> 64.202.227.236 dev ppp122  proto kernel  scope link  src 10.20.1.1
> 64.202.227.234 dev ppp316  proto kernel  scope link  src 10.20.1.1
> 64.202.227.232 dev ppp132  proto kernel  scope link  src 10.20.1.1
> 64.202.227.231 dev ppp104  proto kernel  scope link  src 10.20.0.1
> 64.202.227.226 dev ppp179  proto kernel  scope link  src 10.20.0.1
> 64.202.224.0/24 dev eth0  proto kernel  scope link  src 64.202.224.8
> 192.168.1.0/24 dev eth3  proto kernel  scope link  src 192.168.1.8
> 169.254.0.0/16 dev eth3  scope link

The route above (169.254.0.0/16) should be deleted; many Red Hat-like
distros add it automatically.
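
A cautious sketch of the cleanup, assuming the usual iproute2 output
format.  The helper zeroconf_del_cmds is made up for illustration and only
prints the delete commands, so they can be reviewed before being run as
root.  On Red Hat-style initscripts, NOZEROCONF=yes in
/etc/sysconfig/network is, as far as I know, the setting that stops the
route from being re-added at the next network restart; check your
distribution's ifup scripts to be sure.

```shell
#!/bin/sh
# Read `ip route` output on stdin and print (not execute) one
# "ip route del" command per zeroconf link route found.
# Hypothetical helper, not part of iproute2.
zeroconf_del_cmds() {
    awk '$1 == "169.254.0.0/16" && $2 == "dev" {
        print "ip route del " $1 " dev " $3
    }'
}

# Usage on a live system:
#   ip route | zeroconf_del_cmds          # review the output
#   ip route | zeroconf_del_cmds | sh     # then run it as root
```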
>
> All the pppoe terminations (pppd) are shown, as well as the last three
> subnets.  I'll have to see where the 169.254.0.0/16 is coming from?
>
> mc
>
>
>
>
> Alexandru Dragoi wrote:
>> Marco C. Coelho wrote:
>>   
>>> This box is doing a lot.  It terminates 1000 PPPoE connections,
>>> provides traffic shaping using TC/HTB, authenticates all users via
>>> Radius.  It also runs OSPF routing for the internal network.  Looking
>>> at a simple route output I see all the PPP connections coming through
>>> the box, and due to the OSPF I also see the rest of my network
>>> announcements.  The only strange things are:
>>>
>>> 1.  The last man working on this box had mistakenly edited the hosts
>>> file and added the machine name and complete domain name to the local
>>> host 127.0.0.1 name.  It should only be pointed to the eth0
>>> interface.   I have changed this.
>>>
>>> 2.  The route output is making an announcement
>>>
>>>    64.0.0.0        argontech.net   255.0.0.0       UG    20     0        0 eth0
>>>     
>>
>> This doesn't look dangerous for your problem, I was only talking about
>> directly connected networks:
>>
>> # ip route |grep link
>>
>>   
>>> My public IP space is a /20 within that space, not the whole Class A. 
>>> I have not found which box is announcing this within my network yet.
>>>
>>>
>>>
>>>
>>>
>>> Jeff Welling wrote:
>>>     
>>>>> On 10/23/07 06:56, Alexandru Dragoi wrote:
>>>>>         
>>>>>> What about checking your routing table? you may have link routes
>>>>>> for massive subnets (like 85.0.0.0/8 or 140.20.0.0/16). Some
>>>>>> programs prefer to use "standard" netmask of classes A and B.
>>>>>>           
>>>>> I'm betting that the OP has other things going on seeing has how
>>>>> s/he mentioned PPPoE, which to my knowledge is a layer 2 protocol,
>>>>> and thus not subject to typical routing scenarios.  In essence the
>>>>> OP could have thousands of PPPoE connections terminating on one
>>>>> system with the ARP cache having to deal with where to send traffic
>>>>> to which MAC address. There is not a lot of room for routing in such
>>>>> a scenario.
>>>>>
>>>>>         
>>>> I agree with Peter's suggestion, arpd.  I ran into the neighbor table
>>>> overflow problem recently, at the hands of our ISP.  I was in the
>>>> process of recompiling the kernel and mucking with arpd (I couldn't
>>>> get it to run/start properly) when the problem disappeared as quickly
>>>> as it showed up.  Lucky for me, this was some kind of ISP problem, I
>>>> was able to determine that much through `tcpdump -i X -n arpd`.
>>>>
>>>> My 'two cents' is that you try arpd, I did a bit of looking when I
>>>> came across that problem and it seemed to be the last ditch effort
>>>> when changing the gc threshold had no effect.  Wasn't able to confirm
>>>> that it worked for sure though.
>>>>
>>>> Cheers.

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (11 preceding siblings ...)
  2007-10-24 16:06 ` Alexandru Dragoi
@ 2007-10-25 15:08 ` Marco C. Coelho
  2007-10-25 16:30 ` Alexandru Dragoi
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Marco C. Coelho @ 2007-10-25 15:08 UTC (permalink / raw)
  To: lartc


Looking into it further, ip route shows:

64.0.0.0/8 via 64.202.224.1 dev eth0  proto zebra  metric 20 equalize

So the 64.0.0.0 announcement is coming into this box through OSPF (zebra).

The 169.254.0.0/16 route is being added automatically by the sysconfig 
network scripts.  I'm looking into why.

In either case I still don't see why these entries would make the 
neighbor table overflow.  Could it have been the previous fix to the 
hosts file?

mc

Alexandru Dragoi wrote:
> Marco C. Coelho wrote:
>   
>> the ip route with a grep for link returns:
>>
>> snip** too long
>> 64.202.227.198 dev ppp436  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.196 dev ppp421  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.197 dev ppp211  proto kernel  scope link  src 10.20.0.1
>> 64.202.227.194 dev ppp13  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.192 dev ppp404  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.254 dev ppp194  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.253 dev ppp130  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.252 dev ppp243  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.249 dev ppp195  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.248 dev ppp254  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.247 dev ppp235  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.242 dev ppp78  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.240 dev ppp328  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.237 dev ppp44  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.236 dev ppp122  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.234 dev ppp316  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.232 dev ppp132  proto kernel  scope link  src 10.20.1.1
>> 64.202.227.231 dev ppp104  proto kernel  scope link  src 10.20.0.1
>> 64.202.227.226 dev ppp179  proto kernel  scope link  src 10.20.0.1
>> 64.202.224.0/24 dev eth0  proto kernel  scope link  src 64.202.224.8
>> 192.168.1.0/24 dev eth3  proto kernel  scope link  src 192.168.1.8
>> 169.254.0.0/16 dev eth3  scope link
>>     
>
> The one above must be deleted, many redhat-like distros attach
> 169.254.0.0/16.
>   
>> All the pppoe terminations (pppd) are shown, as well as the last three
>> subnets.  I'll have to see where the 169.254.0.0/16 is coming from?
>>
>> mc
>>


* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (12 preceding siblings ...)
  2007-10-25 15:08 ` Marco C. Coelho
@ 2007-10-25 16:30 ` Alexandru Dragoi
  2007-11-19 22:36 ` Marco C. Coelho
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Alexandru Dragoi @ 2007-10-25 16:30 UTC (permalink / raw)
  To: lartc

Marco C. Coelho wrote:
> Looking into it further an ip route shows:
>
> 64.0.0.0/8 via 64.202.224.1 dev eth0  proto zebra  metric 20 equalize
This /8 doesn't affect the neighbor table. There must be a problem at
the site that announces it.
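
The distinction drawn in this thread (routes via a gateway versus large directly connected "scope link" networks) can be checked with a small filter. This is only a sketch: the function name wide_link_routes and the /16 cutoff are illustrative assumptions, not something taken from the thread.

```shell
# Print on-link routes whose prefix is /16 or wider (cutoff adjustable).
# Reads `ip route` output on stdin; prints "<prefix> <device>" per match.
# wide_link_routes is a hypothetical helper name used for illustration.
wide_link_routes() {
    max_len=${1:-16}
    awk -v max="$max_len" '
        # Only directly connected routes that carry a prefix length.
        /scope link/ && $1 ~ /\// {
            split($1, parts, "/")
            if (parts[2] + 0 <= max + 0) print $1, $3
        }'
}

# Usage on a live system (needs iproute2):
#   ip route | wide_link_routes 16
```

Fed the routes quoted in this thread, it would skip the 64.0.0.0/8 route (it goes via a gateway) and the per-session ppp host routes, and report only the 169.254.0.0/16 route on eth3.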
>
> So the 64.0.0.0 announce is coming into this box through OSPF  (zebra)
>
> The 169.254.0.0/16 is being automajically added through the sysconfig
> network scripts.  I'm looking into why.
>
> In either case I still don't see why these entries would make the
> neighbor table overflow.  Could it have been the previous fix to the
> hosts file?
Well, when somebody tries to send traffic through you to an address in
169.254.0.0/16, your server will ARP for it on eth3 and will most
probably record an <incomplete> entry in the ARP table. Viruses and the
like can make this worse. Another quick fix is to drop ARP from/to
169.254.0.0/16 with arptables.
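
A rough way to see whether unresolved entries are piling up, plus the two fixes mentioned above, might look like the following. It is a sketch under assumptions: count_unresolved is a made-up helper name, it relies on `ip neigh show` printing the entry state as the last field, and the commented-out commands need root and take eth3 from the route listing in this thread.

```shell
# Count neighbor entries stuck in INCOMPLETE or FAILED state.
# Reads `ip neigh show` output on stdin (state is the last field).
count_unresolved() {
    awk '$NF == "INCOMPLETE" || $NF == "FAILED" { n++ } END { print n + 0 }'
}

# Usage on a live system (needs iproute2):
#   ip neigh show | count_unresolved
#
# The fixes discussed here, to be run as root after checking them
# against your own setup:
#   ip route del 169.254.0.0/16 dev eth3
#   arptables -A INPUT  -s 169.254.0.0/16 -j DROP
#   arptables -A OUTPUT -d 169.254.0.0/16 -j DROP
```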
>
> mc
>


* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (13 preceding siblings ...)
  2007-10-25 16:30 ` Alexandru Dragoi
@ 2007-11-19 22:36 ` Marco C. Coelho
  2007-11-19 23:15 ` darko
  2007-12-07 17:17 ` Marco C. Coelho
  16 siblings, 0 replies; 18+ messages in thread
From: Marco C. Coelho @ 2007-11-19 22:36 UTC (permalink / raw)
  To: lartc


Still beating the same bush!

I've done all the possible suggestions so far.  I still was getting a 
neighbor table overflow.
Looking at the arp(7) man page, I see:

       gc_thresh1
              The minimum number of entries to keep in the ARP cache.
              The garbage collector will not run if there are fewer
              than this number of entries in the cache.  Defaults to
              128.

       gc_thresh2
              The soft maximum number of entries to keep in the ARP
              cache.  The garbage collector will allow the number of
              entries to exceed this for 5 seconds before collection
              will be performed.  Defaults to 512.

       gc_thresh3
              The hard maximum number of entries to keep in the ARP
              cache.  The garbage collector will always run if there
              are more than this number of entries in the cache.
              Defaults to 1024.
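
Read literally, the three man page descriptions split the table size into four regimes. The sketch below only restates that reading. The function name neigh_gc_state and its output labels are invented for illustration, and the real kernel behaviour has more detail than the man page text.

```shell
# Classify a neighbor-table size against the three thresholds, following
# the arp(7) wording quoted above.
# Arguments: <entries> <gc_thresh1> <gc_thresh2> <gc_thresh3>
neigh_gc_state() {
    entries=$1
    t1=$2
    t2=$3
    t3=$4
    if [ "$entries" -lt "$t1" ]; then
        echo "no-gc"            # fewer than gc_thresh1: collector will not run
    elif [ "$entries" -le "$t2" ]; then
        echo "normal"           # at or below the soft maximum
    elif [ "$entries" -le "$t3" ]; then
        echo "soft-exceeded"    # above gc_thresh2: tolerated for 5 seconds
    else
        echo "hard-exceeded"    # above gc_thresh3: collector always runs
    fi
}

# Usage on a live system (needs iproute2 and sysctl):
#   neigh_gc_state "$(ip neigh show | wc -l)" \
#       "$(sysctl -n net.ipv4.neigh.default.gc_thresh1)" \
#       "$(sysctl -n net.ipv4.neigh.default.gc_thresh2)" \
#       "$(sysctl -n net.ipv4.neigh.default.gc_thresh3)"
```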

Since this box never has fewer than 500 PPPoE connections, this Saturday 
I changed:

               WAS     NOW
gc_thresh1     512    1024
gc_thresh2    2048    2048
gc_thresh3    4096    4096
   
What's strange is that when I do an 'arp -an' I only get these few 
entries back (IPs changed to protect the guilty).  Shouldn't this show 
all the ARP entries?

? (x.202.x.3) at 00:03:47:2D:8B:F9 [ether] on eth0
? (x.202.x.1) at 00:03:E3:88:EC:C2 [ether] on eth0
? (x.202.x.2) at 00:18:8B:76:EC:D8 [ether] on eth0
? (x.202.x.9) at 00:90:27:43:C2:CF [ether] on eth0

ip route | grep link provides:

snip (lots of pppoe connects)
x.202.x.237 dev ppp53  proto kernel  scope link  src 10.20.1.1
x.202.x.235 dev ppp339  proto kernel  scope link  src 10.20.1.1
x.202.x.232 dev ppp185  proto kernel  scope link  src 10.20.1.1
x.202.x.231 dev ppp313  proto kernel  scope link  src 10.20.1.1
x.202.x.230 dev ppp67  proto kernel  scope link  src 10.20.1.1
x.202.x.226 dev ppp74  proto kernel  scope link  src 10.20.1.1
x.202.x.224 dev ppp150  proto kernel  scope link  src 10.20.1.1
x.202.x.0/24 dev eth0  proto kernel  scope link  src x.202.224.8
192.168.1.0/24 dev eth3  proto kernel  scope link  src 192.168.1.8

I don't think we are doing anything too special with this box that we 
would see a kernel issue no one else is seeing.  Can arp poisoning cause 
this?

a dmesg after a clean reboot only gives:

Shorewall:all2all:REJECT:IN=ppp413 OUT= MAC= SRC=x.202.x.165 
DST=10.20.1.1 LEN=60 TOS=0x00 PREC=0x00 TTL=254 ID=39752 PROTO=ICMP 
TYPE=8 CODE=0 ID=25040 SEQ=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=72 TOS=0x00 PREC=0x00 TTL=126 ID=48363 PROTO=UDP 
SPT=427 DPT=427 LEN=52
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48492 DF PROTO=TCP 
SPT=36005 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48493 DF PROTO=TCP 
SPT=36005 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48517 DF PROTO=TCP 
SPT=36005 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48518 DF PROTO=TCP 
SPT=33969 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=72 TOS=0x00 PREC=0x00 TTL=126 ID=48519 PROTO=UDP 
SPT=427 DPT=427 LEN=52
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48522 DF PROTO=TCP 
SPT=33969 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48526 DF PROTO=TCP 
SPT=33969 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48614 DF PROTO=TCP 
SPT=35790 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48630 DF PROTO=TCP 
SPT=35790 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48x6 DF PROTO=TCP 
SPT=35790 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48x8 DF PROTO=TCP 
SPT=34718 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48663 DF PROTO=TCP 
SPT=34718 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48679 DF PROTO=TCP 
SPT=34718 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.y.x.110 
DST=192.168.1.7 LEN=72 TOS=0x00 PREC=0x00 TTL=126 ID=48724 PROTO=UDP 
SPT=427 DPT=427 LEN=52

Kernel Version 2.6.18-8.1.6


Looking for any suggestions.

Marco





Andrei Kovacs wrote:
> On 10/25/07, Marco C. Coelho <maillist1@argontech.net> wrote:
>   
>>  Looking into it further an ip route shows:
>>
>>  x.0.0.0/8 via x.y.224.1 dev eth0  proto zebra  metric 20 equalize
>>
>>  So the x.0.0.0 announce is coming into this box through OSPF  (zebra)
>>
>>  The 169.254.0.0/16 is being automajically added through the sysconfig
>> network scripts.  I'm looking into why.
>>
>>     
>
> Add "NOZEROCONF=yes" in /etc/sysconfig/network and the 169.254.0.0/16
> network won't be created anymore.
>
>   
>>  In either case I still don't see why these entries would make the neighbor
>> table overflow.  Could it have been the previous fix to the hosts file?
>>
>>  mc


* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (14 preceding siblings ...)
  2007-11-19 22:36 ` Marco C. Coelho
@ 2007-11-19 23:15 ` darko
  2007-12-07 17:17 ` Marco C. Coelho
  16 siblings, 0 replies; 18+ messages in thread
From: darko @ 2007-11-19 23:15 UTC (permalink / raw)
  To: lartc

> Still beating the same bush!
> 
> I've done all the possible suggestions so far.  I still was getting 
> a neighbor table overflow. ...

In case this helps: I had the same problem when testing a new server on
our network (kernel 2.6.21.5). Everything on the system seemed OK except
for the neighbor table overflow and, as a consequence, buffer overflow.
The situation was as follows: the server's Internet port was connected to
the network where the server's local port logically belongs. That network
uses mostly 10.x.0.0/16 addresses. One client on it was infected with
viruses that produced excessive ARP scanning of the 10.x.0.0/16 clients.

When the server's local port is connected where it belongs, there are no
overflow messages and no other problems. There was also no problem
whenever the problematic client was offline.
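
One way to spot a client doing that kind of ARP scanning is to tally ARP requests per requester. This is a sketch under assumptions: top_arp_requesters is a hypothetical helper, and it expects tcpdump text output in which each request contains "tell <ip>". The interface name and packet count in the usage line are placeholders.

```shell
# Tally ARP requests per requesting host, busiest first.
# Reads tcpdump text output on stdin; the requester is the address that
# follows the word "tell" in each who-has line.
top_arp_requesters() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "tell") {
                src = $(i + 1)
                sub(/,$/, "", src)   # drop the trailing comma, if any
                count[src]++
            }
    }
    END { for (s in count) print count[s], s }' | sort -rn
}

# Usage on a live system (as root; eth0 and 1000 are placeholders):
#   tcpdump -n -i eth0 -c 1000 arp | top_arp_requesters | head
```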

Darko

* Re: [LARTC] neighbor table overflow
  2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
                   ` (15 preceding siblings ...)
  2007-11-19 23:15 ` darko
@ 2007-12-07 17:17 ` Marco C. Coelho
  16 siblings, 0 replies; 18+ messages in thread
From: Marco C. Coelho @ 2007-12-07 17:17 UTC (permalink / raw)
  To: lartc


Ok, I hope this helps someone else out there when they google neighbor 
table overflow solution linux kernel:

This is just an update to state that since gc_thresh1 was increased to a 
number greater than the number of simultaneous connected PPPoE clients 
on this box, it has not given me the neighbor table problem.

So set gc_thresh1 greater than the number of local connections you get with:

ip route | grep link | wc -l

So in /etc/sysctl.conf add something like:

# Added to stop "neighbor table overflow" messages in the kernel
net.ipv4.neigh.default.gc_thresh1=1024
net.ipv4.neigh.default.gc_thresh2=2048
net.ipv4.neigh.default.gc_thresh3=4096
# Added to increase the IP conntrack limit (it was reaching the maximum)
net.ipv4.ip_conntrack_max=99999
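
The sizing rule above can be turned into a small helper that prints candidate sysctl.conf lines. This is a sketch only. The helper names are invented, and rounding up to a power of two and the 1:2:4 ratio between the thresholds are assumptions that simply mirror the 1024/2048/4096 values above. They are not a kernel requirement.

```shell
# Count "link" routes the same way as `ip route | grep link | wc -l`.
# Reads `ip route` output on stdin.
count_link_routes() {
    grep -c link
}

# Print sysctl.conf lines with gc_thresh1 set to the smallest power of
# two (starting at 128) that is strictly greater than the given count,
# and gc_thresh2/gc_thresh3 at 2x and 4x that value.
suggest_thresholds() {
    count=$1
    t1=128
    while [ "$t1" -le "$count" ]; do
        t1=$((t1 * 2))
    done
    printf 'net.ipv4.neigh.default.gc_thresh1=%d\n' "$t1"
    printf 'net.ipv4.neigh.default.gc_thresh2=%d\n' "$((t1 * 2))"
    printf 'net.ipv4.neigh.default.gc_thresh3=%d\n' "$((t1 * 4))"
}

# Usage on a live system (needs iproute2):
#   suggest_thresholds "$(ip route | count_link_routes)"
# After editing /etc/sysctl.conf, `sysctl -p` (as root) loads the values.
```

For a box holding around 600 sessions this prints the same 1024/2048/4096 values shown above.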


Have a Merry Christmas!

Marco Coelho
Argon Technologies Inc.
www.argontech.net



Marco C. Coelho wrote:
> Still beating the same bush!
>
> I've done all the possible suggestions so far.  I still was getting a 
> neighbor table overflow.
> Looking at the MAN 7 ARP pages, I see:
>
>        gc_thresh1
>               The minimum number of entries to keep in the ARP cache.  
> The garbage collector will not run if there are
>               fewer than this number of entries in the cache.  
> Defaults to 128.
>
>        gc_thresh2
>               The soft maximum number of entries to keep in the ARP 
> cache.  The garbage collector will allow the  num-
>               ber of entries to exceed this for 5 seconds before 
> collection will be performed.  Defaults to 512.
>
>        gc_thresh3
>               The  hard  maximum number of entries to keep in the ARP 
> cache.  The garbage collector will always run if
>               there are more than this number of entries in the 
> cache.  Defaults to 1024.
>
> Since this box never gets less than 500 pppoe connections, this Sat I 
> changed
>                           WAS     NOW  
> gc_thresh1      512         1024
> gc_thresh2     2048        2048
> gc_thresh3     4096        4096
>    
> what's strange is when I do an 'arp -an' I only get three entries 
> back. (ips changed to protect the guilty).  Shouldn't this show the 
> arp entries
>
> ? (x.202.x.3) at 00:03:47:2D:8B:F9 [ether] on eth0
> ? (x.202.x.1) at 00:03:E3:88:EC:C2 [ether] on eth0
> ? (x.202.x.2) at 00:18:8B:76:EC:D8 [ether] on eth0
> ? (x.202.x.9) at 00:90:27:43:C2:CF [ether] on eth0
>
> ip route | grep link provides:
>
> snip (lots of pppoe connects)
> x.202.x.237 dev ppp53  proto kernel  scope link  src 10.20.1.1
> x.202.x.235 dev ppp339  proto kernel  scope link  src 10.20.1.1
> x.202.x.232 dev ppp185  proto kernel  scope link  src 10.20.1.1
> x.202.x.231 dev ppp313  proto kernel  scope link  src 10.20.1.1
> x.202.x.230 dev ppp67  proto kernel  scope link  src 10.20.1.1
> x.202.x.226 dev ppp74  proto kernel  scope link  src 10.20.1.1
> x.202.x.224 dev ppp150  proto kernel  scope link  src 10.20.1.1
> x.202.x.0/24 dev eth0  proto kernel  scope link  src x.202.224.8
> 192.168.1.0/24 dev eth3  proto kernel  scope link  src 192.168.1.8
>
> I don't think we are doing anything too special with this box that we 
> would see a kernel issue no one else is seeing.  Can arp poisoning 
> cause this?
>
> a dmesg after a clean reboot only gives:
>
> Shorewall:all2all:REJECT:IN=ppp413 OUT= MAC= SRC=x.202.x.165 
> DST=10.20.1.1 LEN=60 TOS=0x00 PREC=0x00 TTL=254 ID=39752 PROTO=ICMP 
> TYPE=8 CODE=0 ID=25040 SEQ=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=72 TOS=0x00 PREC=0x00 TTL=126 ID=48363 PROTO=UDP 
> SPT=427 DPT=427 LEN=52
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48492 DF 
> PROTO=TCP SPT=36005 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48493 DF 
> PROTO=TCP SPT=36005 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48517 DF 
> PROTO=TCP SPT=36005 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48518 DF 
> PROTO=TCP SPT=33969 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=72 TOS=0x00 PREC=0x00 TTL=126 ID=48519 PROTO=UDP 
> SPT=427 DPT=427 LEN=52
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48522 DF 
> PROTO=TCP SPT=33969 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48526 DF 
> PROTO=TCP SPT=33969 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48614 DF 
> PROTO=TCP SPT=35790 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48630 DF 
> PROTO=TCP SPT=35790 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48x6 DF PROTO=TCP 
> SPT=35790 DPT=9220 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48x8 DF PROTO=TCP 
> SPT=34718 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48663 DF 
> PROTO=TCP SPT=34718 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.202.x.110 
> DST=192.168.1.7 LEN=48 TOS=0x00 PREC=0x00 TTL=126 ID=48679 DF 
> PROTO=TCP SPT=34718 DPT=16398 WINDOW=16384 RES=0x00 SYN URGP=0
> Shorewall:all2all:REJECT:IN=ppp160 OUT=eth3 SRC=x.y.x.110 
> DST=192.168.1.7 LEN=72 TOS=0x00 PREC=0x00 TTL=126 ID=48724 PROTO=UDP 
> SPT=427 DPT=427 LEN=52
>
> Kernel Version 2.6.18-8.1.6
>
>
> Looking for any suggestions.
>
> Marco
>
>
>
>
>
> Andrei Kovacs wrote:
>> On 10/25/07, Marco C. Coelho <maillist1@argontech.net> wrote:
>>   
>>>  Looking into it further, an ip route shows:
>>>
>>>  x.0.0.0/8 via x.y.224.1 dev eth0  proto zebra  metric 20 equalize
>>>
>>>  So the x.0.0.0 announcement is coming into this box through OSPF (zebra).
>>>
>>>  The 169.254.0.0/16 is being automajically added through the sysconfig
>>> network scripts.  I'm looking into why.
>>>
>>>     
>>
>> Add "NOZEROCONF=yes" in /etc/sysconfig/network and the 169.254.0.0/16
>> network won't be created anymore.
>>
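Spelled out, the change would look roughly like this. The NOZEROCONF line is as given above; the one-off ip route del is my assumption about how to drop the already-installed route without restarting networking, using the eth3 device from the ip route output quoted below:

```shell
# /etc/sysconfig/network (Red Hat-style network scripts)
NOZEROCONF=yes

# One-off removal of the route that is already installed (run as root):
#   ip route del 169.254.0.0/16 dev eth3
```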
>>   
>>>  In either case I still don't see why these entries would make the neighbor
>>> table overflow.  Could it have been the previous fix to the hosts file?
>>>
>>>  mc
>>>
>>>  Alexandru Dragoi wrote:
>>>  Marco C. Coelho wrote:
>>>
>>>
>>>  the ip route with a grep for link returns:
>>>
>>> snip** too long
>>> x.y.x.198 dev ppp436 proto kernel scope link src 10.20.1.1
>>> x.y.x.196 dev ppp421 proto kernel scope link src 10.20.1.1
>>> x.y.x.197 dev ppp211 proto kernel scope link src 10.20.0.1
>>> x.y.x.194 dev ppp13 proto kernel scope link src 10.20.1.1
>>> x.y.x.192 dev ppp404 proto kernel scope link src 10.20.1.1
>>> x.y.x.254 dev ppp194 proto kernel scope link src 10.20.1.1
>>> x.y.x.253 dev ppp130 proto kernel scope link src 10.20.1.1
>>> x.y.x.252 dev ppp243 proto kernel scope link src 10.20.1.1
>>> x.y.x.249 dev ppp195 proto kernel scope link src 10.20.1.1
>>> x.y.x.248 dev ppp254 proto kernel scope link src 10.20.1.1
>>> x.y.x.247 dev ppp235 proto kernel scope link src 10.20.1.1
>>> x.y.x.242 dev ppp78 proto kernel scope link src 10.20.1.1
>>> x.y.x.240 dev ppp328 proto kernel scope link src 10.20.1.1
>>> x.y.x.237 dev ppp44 proto kernel scope link src 10.20.1.1
>>> x.y.x.236 dev ppp122 proto kernel scope link src 10.20.1.1
>>> x.y.x.234 dev ppp316 proto kernel scope link src 10.20.1.1
>>> x.y.x.232 dev ppp132 proto kernel scope link src 10.20.1.1
>>> x.y.x.231 dev ppp104 proto kernel scope link src 10.20.0.1
>>> x.y.x.226 dev ppp179 proto kernel scope link src 10.20.0.1
>>> x.y.224.0/24 dev eth0 proto kernel scope link src x.y.224.8
>>> 192.168.1.0/24 dev eth3 proto kernel scope link src 192.168.1.8
>>> 169.254.0.0/16 dev eth3 scope link
>>>
>>>  The one above must be deleted; many Red Hat-like distros add
>>> 169.254.0.0/16 automatically.
>>>
>>>
>>>  All the pppoe terminations (pppd) are shown, as well as the last three
>>> subnets. I'll have to see where the 169.254.0.0/16 is coming from.
>>>
>>> mc
>>>
>>>
>>>
>>>
>>> Alexandru Dragoi wrote:
>>>
>>>
>>>  Marco C. Coelho wrote:
>>>
>>>
>>>
>>>  This box is doing a lot. It terminates 1000 PPPoE connections,
>>> provides traffic shaping using TC/HTB, authenticates all users via
>>> Radius. It also runs OSPF routing for the internal network. Looking
>>> at a simple route output I see all the PPP connections coming through
>>> the box, and due to the OSPF I also see the rest of my network
>>> announcements. The only strange things are:
>>>
>>> 1. The last man working on this box had mistakenly edited the hosts
>>> file and added the machine name and complete domain name to the
>>> localhost (127.0.0.1) entry. They should only point to the eth0
>>> interface address. I have changed this.
>>>
>>> 2. The route output is making an announcement
>>>
>>>  x.0.0.0 argontech.net 255.0.0.0 UG 20 0 0 eth0
>>>
>>>
>>>  This doesn't look dangerous for your problem, I was only talking about
>>> directly connected networks:
>>>
>>> # ip route |grep link
>>>
>>>
>>>
>>>
>>>  My public IP space is a /20 within that space, not the whole Class A.
>>> I have not found which box is announcing this within my network yet.
>>>
>>>
>>>
>>>
>>>
>>> Jeff Welling wrote:
>>>
>>>
>>>
>>>
>>>  On 10/23/07 06:56, Alexandru Dragoi wrote:
>>>
>>>
>>>
>>>  What about checking your routing table? You may have link routes
>>> for massive subnets (like 85.0.0.0/8 or 140.20.0.0/16). Some
>>> programs prefer to use the "standard" netmasks of classes A and B.
>>>
>>>
>>>  I'm betting that the OP has other things going on, seeing as how
>>> s/he mentioned PPPoE, which to my knowledge is a layer 2 protocol,
>>> and thus not subject to typical routing scenarios. In essence the
>>> OP could have thousands of PPPoE connections terminating on one
>>> system with the ARP cache having to deal with where to send traffic
>>> to which MAC address. There is not a lot of room for routing in such
>>> a scenario.
>>>
>>>
>>>
>>>  I agree with Peter's suggestion, arpd. I ran into the neighbor table
>>> overflow problem recently, at the hands of our ISP. I was in the
>>> process of recompiling the kernel and mucking with arpd (I couldn't
>>> get it to run/start properly) when the problem disappeared as quickly
>>> as it showed up. Lucky for me, this was some kind of ISP problem, I
>>> was able to determine that much through `tcpdump -i X -n arp`.
>>>
>>> My 'two cents' is that you try arpd. I did a bit of looking when I
>>> came across that problem, and it seemed to be the last-ditch effort
>>> when changing the gc threshold had no effect. I wasn't able to confirm
>>> that it worked, though.
>>>
>>> Cheers.
>>
>>   



end of thread, other threads:[~2007-12-07 17:17 UTC | newest]

Thread overview: 18+ messages
2007-10-22 21:46 [LARTC] neighbor table overflow Peter V. Saveliev
2007-10-22 21:46 ` Marco C. Coelho
2007-10-22 22:35 ` Grant Taylor
2007-10-23 11:56 ` Alexandru Dragoi
2007-10-23 20:32 ` Grant Taylor
2007-10-23 20:43 ` Jeff Welling
2007-10-23 21:04 ` Grant Taylor
2007-10-23 21:10 ` Marco C. Coelho
2007-10-23 21:23 ` Grant Taylor
2007-10-23 21:27 ` Marco C. Coelho
2007-10-24 10:19 ` Alexandru Dragoi
2007-10-24 15:19 ` Marco C. Coelho
2007-10-24 16:06 ` Alexandru Dragoi
2007-10-25 15:08 ` Marco C. Coelho
2007-10-25 16:30 ` Alexandru Dragoi
2007-11-19 22:36 ` Marco C. Coelho
2007-11-19 23:15 ` darko
2007-12-07 17:17 ` Marco C. Coelho
