From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: weird problem
Date: Fri, 26 Jun 2009 12:19:04 +0200
Message-ID: <4A44A098.8080006@cosmosbay.com>
References: <4A43DB99.70602@gmail.com> <20090626083719.GA6445@ff.dom.local> <20090626090545.GB6445@ff.dom.local>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Eric Dumazet, Paweł Staszewski, Linux Network Development list
To: Jarek Poplawski
Return-path:
Received: from gw1.cosmosbay.com ([212.99.114.194]:54161 "EHLO gw1.cosmosbay.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750847AbZFZKTX (ORCPT ); Fri, 26 Jun 2009 06:19:23 -0400
In-Reply-To: <20090626090545.GB6445@ff.dom.local>
Sender: netdev-owner@vger.kernel.org
List-ID:

Jarek Poplawski wrote:
> On Fri, Jun 26, 2009 at 08:37:19AM +0000, Jarek Poplawski wrote:
>> On 25-06-2009 22:18, Eric Dumazet wrote:
>>> Paweł Staszewski wrote:
>>>> Ok
>>>>
>>>> After this day of observation im near 100% sure that this cpu load is
>>>> made by route cahce flushes
>>>> When route cache increase to its "net.ipv4.route.gc_thresh" size or is
>>>> near that size
>>>> system is starting to drop some routes from cache then cpu load is
>>>> increase from 2% to near 80%
>>>> after cleaning / flush cache when cache is filling cpu load is again
>>>> normal 2%
>>>>
>>>> Someone know how to resolve this ?
>>>> on kernels < 2.6.29 i don't see this, all start after upgrade from
>>>> 2.6.28 to 2.6.29 - then i try 2.6.29.1 , 2.6.29.3 and 2.6.30 and on all
>>>> this kernels >= 2.6.29 problem with cpu load is the same.
>>>>
>>>> I can minimize this cpu fluctuations by changing of route cache /proc
>>>> parameters but the best result for my router was
>>>>
>>>> 15 sec of 2% cpu
>>>> and after
>>>> 15sec of 80% cpu
>>>>
>>>>
>>>> Regards
>>>> Pawel Staszewski
>>>
>>> I believe this is known 2.6.29 regressions
>>>
>>> Following two commits should correct the problem you have
>>>
>>> Your best bet would be to try 2.6.31-rc1, and tell us if this recent kernel
>>> is ok on your machine ?
>>
>> Btw., the first of these commits is in 2.6.30, which according to
>
> And the second as well.
>

Thanks Jarek.

Pawel made some reporting errors in the fib thread, so I am not sure he really
tried 2.6.30 and got the same oprofile results.

rt_worker_func() taking 13% of cpu0 is an alarm for me :)

And 21% of cpu0 and 34% of cpu6 taken by oprofiled seems odd too...

Pawel, could you give us the output of:

grep . /proc/sys/net/ipv4/route/*
cat /proc/interrupts

on your various kernels (previous to 2.6.29, 2.6.29, 2.6.30, ...)?

I suspect a change in hash table size, and/or a change in interrupt affinities...

The change in hash table size comes from commit c9503e0fe052020e0294cd07d0ecd982eb7c9177

But as Pawel mentioned "net.ipv4.route.gc_thresh = 190536", I believe
his hash table is smaller than 512k entries!

Author: Anton Blanchard
Date: Mon Apr 27 05:42:24 2009 -0700

    ipv4: Limit size of route cache hash table

    Right now we have no upper limit on the size of the route cache hash table.
    On a 128GB POWER6 box it ends up as 32MB:

        IP route cache hash table entries: 4194304 (order: 9, 33554432 bytes)

    It would be nice to cap this for memory consumption reasons, but a massive
    hashtable also causes a significant spike when measuring OS jitter.

    With a 32MB hashtable and 4 million entries, rt_worker_func is taking
    5 ms to complete. On another system with more memory it's taking 14 ms.
    Even though rt_worker_func does call cond_sched() to limit its impact,
    in an HPC environment we want to keep all sources of OS jitter to a minimum.

    With the patch applied we limit the number of entries to 512k which
    can still be overriden by using the rt_entries boot option:

        IP route cache hash table entries: 524288 (order: 6, 4194304 bytes)

    With this patch rt_worker_func now takes 0.460 ms on the same system.

    Signed-off-by: Anton Blanchard
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller
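
For anyone who wants to check the sizes quoted in the commit message, here is a rough sketch of the arithmetic. It assumes 8 bytes per hash bucket and 64 KiB pages; neither value is stated in this thread, both are inferred from the printed boot lines, and the helper name is made up for illustration.

```python
# Assumptions (not stated in the thread, inferred from the boot messages):
BUCKET_BYTES = 8          # one 64-bit pointer per hash bucket
PAGE_BYTES = 64 * 1024    # 64 KiB pages on the POWER6 box


def hash_table_footprint(entries, bucket_bytes=BUCKET_BYTES, page_bytes=PAGE_BYTES):
    """Return (total_bytes, page_order) for a table with `entries` buckets."""
    total_bytes = entries * bucket_bytes
    # Round up to whole pages, then up to the next power of two of pages.
    pages = max(1, -(-total_bytes // page_bytes))
    order = (pages - 1).bit_length()
    return total_bytes, order


if __name__ == "__main__":
    for entries in (4194304, 524288):
        total_bytes, order = hash_table_footprint(entries)
        print(f"entries: {entries} (order: {order}, {total_bytes} bytes)")
```

Under those assumptions the uncapped table comes out at 33554432 bytes (order 9) and the capped one at 4194304 bytes (order 6), which matches both boot lines above.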