From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: ucc_geth: nf_conntrack: table full, dropping packet.
Date: Tue, 24 Mar 2009 10:12:53 +0100
Message-ID: <49C8A415.1090606@cosmosbay.com>
References: <49C77D71.8090709@trash.net> <49C780AD.70704@trash.net> <49C7CB9B.1040409@trash.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Patrick McHardy , avorontsov@ru.mvista.com, netdev@vger.kernel.org, "Paul E. McKenney"
To: Joakim Tjernlund
Return-path:
Received: from gw1.cosmosbay.com ([212.99.114.194]:48390 "EHLO gw1.cosmosbay.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753651AbZCXJNM convert rfc822-to-8bit (ORCPT ); Tue, 24 Mar 2009 05:13:12 -0400
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

Joakim Tjernlund wrote:
> Patrick McHardy wrote on 23/03/2009 18:49:15:
>> Joakim Tjernlund wrote:
>>> Patrick McHardy wrote on 23/03/2009 13:29:33:
>>>
>>>>> There is no /proc/net/netfilter/nf_conntrack. There is a
>>>>> /proc/net/nf_conntrack though and it is empty. If I telnet
>>>>> to the board I see:
>>>>>
>>>> That means that something is leaking conntrack references, most likely
>>>> by leaking skbs. Since I haven't seen any other reports, my guess would
>>>> be the ucc_geth driver.
>>>>
>>> Mucking around with the ucc_geth driver I found that if I:
>>> - Move TX from IRQ to NAPI context
>>> - double the weight
>>> - after booting up, wait a few mins until the JFFS2 GC kernel thread
>>>   has stopped scanning the FS
>>>
>>> Then the "nf_conntrack: table full, dropping packet." msgs stop.
>>> Does this seem right to you guys?
>> No. As I said, something seems to be leaking packets.
>> You should be able to confirm that by checking the sk_buff slabs
>> in /proc/slabinfo. If that *doesn't* show any signs of a leak,
>> please run "conntrack -E" to capture the conntrack events before
>> the "table full" message appears and post the output.
>
> skbuff does not differ much, but others do.
>
> Before ping:
> skbuff_fclone_cache    0    0  352  11  1 : tunables  54  27  0 : slabdata   0   0  0
> skbuff_head_cache     20   20  192  20  1 : tunables 120  60  0 : slabdata   1   1  0
> size-64              731  767   64  59  1 : tunables 120  60  0 : slabdata  13  13  0
> nf_conntrack          10   19  208  19  1 : tunables 120  60  0 : slabdata   1   1  0
>
> During ping:
> skbuff_fclone_cache    0    0  352  11  1 : tunables  54  27  0 : slabdata   0   0  0
> skbuff_head_cache     40   40  192  20  1 : tunables 120  60  0 : slabdata   2   2  0
> size-64             8909 8909   64  59  1 : tunables 120  60  0 : slabdata 151 151  0
> nf_conntrack        5111 5111  208  19  1 : tunables 120  60  0 : slabdata 269 269  0
>
> This feels more like the freeing of conntrack objects is delayed and
> builds up when ping flooding.
>
> I don't have "conntrack -E" for my embedded board, so that will have
> to wait a bit longer.

I don't understand how your ping can use so many conntrack entries...

Then, as I said yesterday, I believe you have an RCU delay, because of a
misbehaving driver or something...

grep RCU .config
grep CONFIG_SMP .config

You could change qhimark from 10000 to 1000 in kernel/rcuclassic.c (line 80)
as a workaround. It should force a quiescent state after 1000 freed conntracks.
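For anyone reproducing this, the slab growth quoted above can be diffed mechanically instead of by eye. A minimal sketch, with the thread's "before ping" / "during ping" figures stubbed into snapshot files (on a live box you would copy /proc/slabinfo twice instead; the file names and the awk helper are illustrative, not part of any tool mentioned in the thread):

```shell
# Diff <active_objs> (column 2 of /proc/slabinfo) between two snapshots.
# The here-docs stub in the figures quoted above; on real hardware use:
#   cat /proc/slabinfo > /tmp/slab_before   # ...run the ping flood...
#   cat /proc/slabinfo > /tmp/slab_during
cat > /tmp/slab_before <<'EOF'
skbuff_head_cache     20   20  192 20 1
size-64              731  767   64 59 1
nf_conntrack          10   19  208 19 1
EOF
cat > /tmp/slab_during <<'EOF'
skbuff_head_cache     40   40  192 20 1
size-64             8909 8909   64 59 1
nf_conntrack        5111 5111  208 19 1
EOF
# First pass records the "before" counts, second pass prints the growth.
delta=$(awk 'NR==FNR { before[$1] = $2; next }
             { printf "%-20s %+d\n", $1, $2 - before[$1] }' \
        /tmp/slab_before /tmp/slab_during)
echo "$delta"
rm -f /tmp/slab_before /tmp/slab_during
```

This prints +20 for skbuff_head_cache, +8178 for size-64 and +5101 for nf_conntrack: the conntrack and size-64 caches balloon together while the skb caches barely move, which matches the "delayed freeing, not an skb leak" reading above.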