* ip_conntrack table full problem
From: Thomas Jarosch @ 2005-03-14 15:47 UTC
To: netfilter-devel

Hi,

I'm facing a problem with conntrack on a 2.4.21 kernel.
One machine, which firewalls a webradio, reproducibly
becomes unresponsive every week with
"ip_conntrack: table full, dropping packet."

Raising the /proc/sys/net/ipv4/ip_conntrack_max limit
only delayed the problem. I also installed a cron script
that saves the contents of /proc/net/ip_conntrack
to a folder every minute. When the system died, there were around
150 connections in conntrack, far below the maximum limit.

Also interesting is that the system never recovers from the
"table full" error, even though the conntrack table
in /proc is almost empty. It feels like the table is filled
with "ghost entries" and there's no room for new connections.

I googled around and found this:
http://cert.uni-stuttgart.de/archive/suse/security/2005/02/msg00174.html

The problem is at least confirmed by Ludwig Nussel from SuSE:
http://cert.uni-stuttgart.de/archive/suse/security/2005/02/msg00197.html

I want to help track the problem down. I can't upgrade
to a newer kernel version because of various other patches,
but as the "stock" SuSE 9.2 kernel has the same problem,
I assume it's a more generic problem.

Would it be wise to dump the complete internal conntrack table
to syslog when the error occurs? Any patches I could try?
Any other ideas?

Thanks in advance,
Thomas Jarosch

* Re: ip_conntrack table full problem
From: Phil Oester @ 2005-03-14 17:18 UTC
To: Thomas Jarosch; +Cc: netfilter-devel

On Mon, Mar 14, 2005 at 04:47:42PM +0100, Thomas Jarosch wrote:
> Hi,
>
> I'm facing a problem with conntrack on a 2.4.21 kernel.
> One machine, which firewalls a webradio, reproducibly
> becomes unresponsive every week with
> "ip_conntrack: table full, dropping packet."

When this happens, what does output from this look like:

    wc -l /proc/net/ip_conntrack ; grep ip_conntrack /proc/slabinfo

Phil

* Re: ip_conntrack table full problem
From: Thomas Jarosch @ 2005-03-15 10:13 UTC
To: netfilter-devel

> > I'm facing a problem with conntrack on a 2.4.21 kernel.
> > One machine, which firewalls a webradio, reproducibly
> > becomes unresponsive every week with
> > "ip_conntrack: table full, dropping packet."
>
> When this happens, what does output from this look like:
>
> wc -l /proc/net/ip_conntrack ; grep ip_conntrack /proc/slabinfo

Thanks Phil, I've extended my logger. The machine froze
yesterday, so it will take a week until it freezes again.

I'll keep you posted.

Thomas

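Editor's sketch: the extended logger mentioned above is not shown in the thread. A minimal version, assuming it records the same two values Phil asked for (the line count of /proc/net/ip_conntrack and the ip_conntrack line of /proc/slabinfo), might look like the following. The function names, the log format, and the exact cache-name match are illustrative assumptions, not the author's script.

```python
import time


def snapshot(conntrack_path="/proc/net/ip_conntrack",
             slabinfo_path="/proc/slabinfo",
             cache_name="ip_conntrack"):
    """Return (visible_entries, slab_line) for one sample.

    visible_entries is what `wc -l` would report for the conntrack file;
    slab_line is the slabinfo line whose first field equals cache_name
    (an exact match, so related caches with longer names are skipped).
    """
    with open(conntrack_path) as f:
        visible = sum(1 for _ in f)

    slab_line = ""
    with open(slabinfo_path) as f:
        for line in f:
            fields = line.split()
            if fields and fields[0] == cache_name:
                slab_line = line.strip()
                break
    return visible, slab_line


def append_sample(log_path, **paths):
    """Append one timestamped sample to log_path and return it."""
    visible, slab_line = snapshot(**paths)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        log.write(f"{stamp} visible={visible} slab=[{slab_line}]\n")
    return visible, slab_line
```

Calling append_sample("/var/log/conntrack-watch.log") once a minute from cron (the log path is hypothetical) would leave a trail showing whether the slab counters keep climbing while the visible table stays small.
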
* Re: ip_conntrack table full problem
From: Thomas Jarosch @ 2005-03-21 14:13 UTC
To: netfilter-devel

> > I'm facing a problem with conntrack on a 2.4.21 kernel.
> > One machine, which firewalls a webradio, reproducibly
> > becomes unresponsive every week with
> > "ip_conntrack: table full, dropping packet."
>
> When this happens, what does output from this look like:
>
> wc -l /proc/net/ip_conntrack ; grep ip_conntrack /proc/slabinfo

It happened again on Sunday night:

wc -l:
     35 /proc/net/ip_conntrack

/proc/slabinfo:
ip_conntrack       16263  16272    320 1356 1356    1

top:
  9:51pm  up 7 days,  2:04,  0 users,  load average: 0.11, 0.04, 0.01
120 processes: 119 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  2.0% user,  0.5% system,  0.3% nice,  5.8% idle
Mem:   253116K av,  246808K used,    6308K free,       0K shrd,   74200K buff
Swap:  260992K av,   57580K used,  203412K free                   66160K cached

I'm not familiar with the slab stuff, but it looks "full" to me ;-)

Cheers,
Thomas

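Editor's sketch: to read the slabinfo line above, it helps to name its columns. The code below assumes the 2.4-era column order (cache name, active objects, total objects, object size in bytes, active slabs, total slabs, pages per slab); that order is an assumption here, as the thread itself does not spell it out, and the SlabLine type and parse_slab_line helper are hypothetical names.

```python
from typing import NamedTuple


class SlabLine(NamedTuple):
    # Column meanings assume the 2.4-era /proc/slabinfo layout.
    name: str
    active_objs: int
    total_objs: int
    obj_size: int        # bytes per object
    active_slabs: int
    total_slabs: int
    pages_per_slab: int


def parse_slab_line(line: str) -> SlabLine:
    """Parse one slabinfo line, ignoring any extra trailing columns."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError(f"expected at least 7 fields, got {len(fields)}")
    return SlabLine(fields[0], *map(int, fields[1:7]))


reported = parse_slab_line("ip_conntrack 16263 16272 320 1356 1356 1")

# Memory held by conntrack objects the slab allocator still counts as in use.
live_bytes = reported.active_objs * reported.obj_size
# How many objects fit into one slab.
objs_per_slab = reported.total_objs // reported.total_slabs

print(f"{reported.active_objs} live objects, {live_bytes} bytes, "
      f"{objs_per_slab} objects per slab")
```

Under that reading, 16263 conntrack objects are still allocated while only 35 are listed in /proc/net/ip_conntrack, which is the mismatch the rest of the thread investigates.
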
* Re: ip_conntrack table full problem
From: Phil Oester @ 2005-03-21 16:21 UTC
To: Thomas Jarosch; +Cc: netfilter-devel

On Mon, Mar 21, 2005 at 03:13:59PM +0100, Thomas Jarosch wrote:
> > > I'm facing a problem with conntrack on a 2.4.21 kernel.
> > > One machine, which firewalls a webradio, reproducibly
> > > becomes unresponsive every week with
> > > "ip_conntrack: table full, dropping packet."
> >
> > When this happens, what does output from this look like:
> >
> > wc -l /proc/net/ip_conntrack ; grep ip_conntrack /proc/slabinfo
>
> It happened again on Sunday night:
>
> wc -l:
>      35 /proc/net/ip_conntrack
>
> /proc/slabinfo:
> ip_conntrack       16263  16272    320 1356 1356    1

Yes, you're leaking conntracks somewhere. Any possibility of testing
a somewhat newer kernel than 2.4.21? This may have already been
fixed.

Phil

* Re: ip_conntrack table full problem
From: Thomas Jarosch @ 2005-03-21 17:03 UTC
To: netfilter-devel

Phil,

> > /proc/slabinfo:
> > ip_conntrack       16263  16272    320 1356 1356    1
>
> Yes, you're leaking conntracks somewhere. Any possibility of testing
> a somewhat newer kernel than 2.4.21? This may have already been
> fixed.

Thank you for your response.
Unfortunately I cannot update to a newer kernel soon.

Would it be possible to dump the internal conntrack tables
once the error occurs? Then we would at least know what
is filling the table up. Is there some kind of debug macro
I could add before the printk("conntrack table full") code?

Or a more aggressive solution:
Flush the complete conntrack table once the error occurs.
This would kill all running connections, but the machine
would still be reachable afterwards.

Any other ideas?

I'll try to reproduce the problem in a test environment,
but it will be hard to narrow the cause down.

Thomas

* Re: ip_conntrack table full problem
From: Phil Oester @ 2005-03-21 18:08 UTC
To: Thomas Jarosch; +Cc: netfilter-devel

On Mon, Mar 21, 2005 at 06:03:18PM +0100, Thomas Jarosch wrote:
> > Yes, you're leaking conntracks somewhere. Any possibility of testing
> > a somewhat newer kernel than 2.4.21? This may have already been
> > fixed.
>
> Thank you for your response.
> Unfortunately I cannot update to a newer kernel soon.
>
> Would it be possible to dump the internal conntrack tables
> once the error occurs? Then we would at least know what
> is filling the table up. Is there some kind of debug macro
> I could add before the printk("conntrack table full") code?

No easy way. Last week I posted a patch which would have made
this possible by creating a 'cleaned' list, but since you cannot
upgrade kernels, you could not use this anyway.

> Or a more aggressive solution:
> Flush the complete conntrack table once the error occurs.
> This would kill all running connections, but the machine
> would still be reachable afterwards.

Even if conntrack were modular, you would be unable to unload
it (see the thread referenced above).

> Any other ideas?

I'm still studying the root cause and have narrowed it down
somewhat, but no patch yet.

> I'll try to reproduce the problem in a test environment,
> but it will be hard to narrow the cause down.

What's the traffic pattern on this box? In my testing I've
never seen such high rates of leakage.

Phil

* Re: ip_conntrack table full problem
From: Thomas Jarosch @ 2005-03-21 18:23 UTC
To: netfilter-devel

> > Would it be possible to dump the internal conntrack tables
> > once the error occurs? Then we would at least know what
> > is filling the table up. Is there some kind of debug macro
> > I could add before the printk("conntrack table full") code?
>
> No easy way. Last week I posted a patch which would have made
> this possible by creating a 'cleaned' list, but since you cannot
> upgrade kernels, you could not use this anyway.

But I can still patch kernels ;-)

> > Any other ideas?
>
> I'm still studying the root cause and have narrowed it down
> somewhat, but no patch yet.
>
> > I'll try to reproduce the problem in a test environment,
> > but it will be hard to narrow the cause down.
>
> What's the traffic pattern on this box? In my testing I've
> never seen such high rates of leakage.

IIRC the box makes heavy use of SNAT/DNAT for port forwarding.
I'll try to get a copy of the firewall rules tomorrow and
test it locally here.

Is there an easy way to see if it leaked conntracks?
Should the information in /proc/slabinfo be somewhat proportional
to the number of connections/lines in /proc/net/ip_conntrack?

Thomas

* Re: ip_conntrack table full problem
From: Phil Oester @ 2005-03-21 21:14 UTC
To: Thomas Jarosch; +Cc: netfilter-devel

On Mon, Mar 21, 2005 at 07:23:48PM +0100, Thomas Jarosch wrote:
> > No easy way. Last week I posted a patch which would have made
> > this possible by creating a 'cleaned' list, but since you cannot
> > upgrade kernels, you could not use this anyway.
>
> But I can still patch kernels ;-)

OK, I'll send along a 2.4.21 -> 2.6.11 patch shortly ;-)

> IIRC the box makes heavy use of SNAT/DNAT for port forwarding.
> I'll try to get a copy of the firewall rules tomorrow and
> test it locally here.
>
> Is there an easy way to see if it leaked conntracks?
> Should the information in /proc/slabinfo be somewhat proportional
> to the number of connections/lines in /proc/net/ip_conntrack?

Yes, the numbers should be in the same ballpark. Conntracks are being
cleaned from the lists (i.e. /proc/net/ip_conntrack), but never being
destroyed. In my testing this is caused by a process not freeing
the skb. What kinds of processes are running on this box?

Phil

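Editor's sketch: the "same ballpark" rule of thumb above can be turned into a rough check. The thresholds below (a factor of 10 plus a small fixed slack) are arbitrary assumptions chosen only for illustration, and the function names are hypothetical; nothing in the thread prescribes specific limits.

```python
def unaccounted(visible_entries: int, active_objs: int) -> int:
    """Conntrack objects still allocated but no longer listed in /proc."""
    return max(0, active_objs - visible_entries)


def looks_leaked(visible_entries: int, active_objs: int,
                 factor: int = 10, slack: int = 64) -> bool:
    """Heuristic: flag a leak when the slab count is far above the
    number of lines in /proc/net/ip_conntrack.

    factor and slack are illustrative assumptions, not kernel constants.
    """
    return active_objs > visible_entries * factor + slack


# Values reported earlier in the thread: 35 visible, 16263 active objects.
print(unaccounted(35, 16263), looks_leaked(35, 16263))
```

With the figures from the freezing box, 16228 objects are unaccounted for, so the check fires; a box whose two counts stay close together would not trigger it.
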
* Re: ip_conntrack table full problem
From: Thomas Jarosch @ 2005-03-21 22:58 UTC
To: netfilter-devel

> > IIRC the box makes heavy use of SNAT/DNAT for port forwarding.
> > I'll try to get a copy of the firewall rules tomorrow and
> > test it locally here.
> >
> > Is there an easy way to see if it leaked conntracks?
> > Should the information in /proc/slabinfo be somewhat proportional
> > to the number of connections/lines in /proc/net/ip_conntrack?
>
> Yes, the numbers should be in the same ballpark. Conntracks are being
> cleaned from the lists (i.e. /proc/net/ip_conntrack), but never being
> destroyed. In my testing this is caused by a process not freeing
> the skb. What kinds of processes are running on this box?

There are the usual suspects like apache, mailserver etc.,
but I can't think of anything that manipulates skbs directly.
FreeS/WAN is also installed on the box, but I use
the same configuration at home and have never had any trouble.

The best thing would be to find a way of reproducing the problem,
which I'll start on tomorrow. I just checked another box with the same
installation and 32 days of uptime; its /proc/slabinfo looks good.

IMHO the problem is somehow related to the heavy use of SNAT/DNAT,
as that's the main difference between the freezing box and mine.

Thomas

* Re: ip_conntrack table full problem
From: Patrick Schaaf @ 2005-03-21 18:41 UTC
To: Thomas Jarosch; +Cc: netfilter-devel

> Would it be possible to dump the internal conntrack tables
> once the error occurs?

The interface for such dumping is /proc/net/ip_conntrack, where you
only find 35 conntracks in your table-full situation. So I would
strongly doubt that extra "show them all" code would show much
more than that...

Just a random comment - unfortunately I have no idea how to help
more constructively.

best regards
  Patrick

* Re: ip_conntrack table full problem
From: Phil Oester @ 2005-03-21 21:15 UTC
To: Patrick Schaaf; +Cc: Thomas Jarosch, netfilter-devel

On Mon, Mar 21, 2005 at 07:41:06PM +0100, Patrick Schaaf wrote:
> > Would it be possible to dump the internal conntrack tables
> > once the error occurs?
>
> The interface for such dumping is /proc/net/ip_conntrack, where you
> only find 35 conntracks in your table-full situation. So I would
> strongly doubt that extra "show them all" code would show much
> more than that...

It would show the conntracks which have been removed from ip_conntrack
but not yet destroyed.

Phil

* Re: ip_conntrack table full problem
From: Patrick McHardy @ 2005-03-23 2:38 UTC
To: Thomas Jarosch; +Cc: netfilter-devel

Thomas Jarosch wrote:
> Phil,
>
>>> /proc/slabinfo:
>>> ip_conntrack       16263  16272    320 1356 1356    1
>>
>> Yes, you're leaking conntracks somewhere. Any possibility of testing
>> a somewhat newer kernel than 2.4.21? This may have already been
>> fixed.
>
> Thank you for your response.
> Unfortunately I cannot update to a newer kernel soon.

I suggest trying this patch:

http://linux.bkbits.net:8080/linux-2.4/cset@3f219dbcj1MnJqxiJa99m_AcShdk5A?nav=index.html|src/net/|src/net|src/net/ipv4|src/net/ipv4/netfilter|related/net/ipv4/netfilter/ip_conntrack_core.c

Regards
Patrick

* Re: ip_conntrack table full problem
From: Thomas Jarosch @ 2005-03-23 9:11 UTC
To: netfilter-devel

> >>> /proc/slabinfo:
> >>> ip_conntrack       16263  16272    320 1356 1356    1
> >>
> >> Yes, you're leaking conntracks somewhere. Any possibility of testing
> >> a somewhat newer kernel than 2.4.21? This may have already been
> >> fixed.
> >
> > Thank you for your response.
> > Unfortunately I cannot update to a newer kernel soon.
>
> I suggest trying this patch:
>
> http://linux.bkbits.net:8080/linux-2.4/cset@3f219dbcj1MnJqxiJa99m_AcShdk5A?nav=index.html|src/net/|src/net|src/net/ipv4|src/net/ipv4/netfilter|related/net/ipv4/netfilter/ip_conntrack_core.c

Thanks! I'll install a patched kernel on the box today.
Hopefully the problem will vanish...

Thomas

* Re: ip_conntrack table full problem
From: Thomas Jarosch @ 2005-03-29 20:26 UTC
To: netfilter-devel

Hi Patrick,

> >> Yes, you're leaking conntracks somewhere. Any possibility of testing
> >> a somewhat newer kernel than 2.4.21? This may have already been
> >> fixed.
> >
> > Thank you for your response.
> > Unfortunately I cannot update to a newer kernel soon.
>
> I suggest trying this patch:
>
> http://linux.bkbits.net:8080/linux-2.4/cset@3f219dbcj1MnJqxiJa99m_AcShdk5A?nav=index.html|src/net/|src/net|src/net/ipv4|src/net/ipv4/netfilter|related/net/ipv4/netfilter/ip_conntrack_core.c

That patch solved the problem. Thanks!

I'm curious how this bug is triggered in real-life usage?

Cheers,
Thomas