From: Jarek Poplawski <jarkao2@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Paweł Staszewski <pstaszewski@itcare.pl>,
"Linux Network Development list" <netdev@vger.kernel.org>
Subject: Re: weird problem
Date: Fri, 26 Jun 2009 09:05:45 +0000
Message-ID: <20090626090545.GB6445@ff.dom.local>
In-Reply-To: <20090626083719.GA6445@ff.dom.local>
On Fri, Jun 26, 2009 at 08:37:19AM +0000, Jarek Poplawski wrote:
> On 25-06-2009 22:18, Eric Dumazet wrote:
> > Paweł Staszewski wrote:
> >> Ok
> >>
> >> After this day of observation I'm nearly 100% sure that this CPU load is
> >> caused by route cache flushes.
> >> When the route cache grows to its "net.ipv4.route.gc_thresh" size, or
> >> gets near that size,
> >> the system starts to drop some routes from the cache, and the CPU load
> >> increases from 2% to nearly 80%.
> >> After the cache has been cleaned / flushed and is filling again, the CPU
> >> load is back to a normal 2%.
> >>
> >> Does anyone know how to resolve this?
> >> On kernels < 2.6.29 I don't see this; it all started after the upgrade from
> >> 2.6.28 to 2.6.29. I then tried 2.6.29.1, 2.6.29.3 and 2.6.30, and on all
> >> these kernels >= 2.6.29 the CPU load problem is the same.
> >>
> >> I can reduce these CPU fluctuations by changing the route cache /proc
> >> parameters, but the best result for my router was
> >>
> >> 15 sec of 2% CPU
> >> followed by
> >> 15 sec of 80% CPU
> >>
> >>
> >> Regards
> >> Pawel Staszewski
> >
> >
> > I believe these are known 2.6.29 regressions.
> >
> > The following two commits should correct the problem you have.
> >
> > Your best bet would be to try 2.6.31-rc1 and tell us whether this recent
> > kernel is OK on your machine.
>
>
> Btw., the first of these commits is in 2.6.30, which according to
And the second as well.
Jarek P.
> Pawel was tried. And IMHO trying -rc1 on a production system needs
> a lot of bravery.
>
> Jarek P.
>
> >
> >
> >
> > commit 1ddbcb005c395518c2cd0df504cff3d4b5c85853
> > Author: Eric Dumazet <dada1@cosmosbay.com>
> > Date: Tue May 19 20:14:28 2009 +0000
> >
> > net: fix rtable leak in net/ipv4/route.c
> >
> > Alexander V. Lukyanov found a regression in 2.6.29 and made a complete
> > analysis found in http://bugzilla.kernel.org/show_bug.cgi?id=13339
> > Quoted here because its a perfect one :
> >
> > begin_of_quotation
> > 2.6.29 patch has introduced flexible route cache rebuilding. Unfortunately the
> > patch has at least one critical flaw, and another problem.
> >
> > rt_intern_hash calculates rthi pointer, which is later used for new entry
> > insertion. The same loop calculates cand pointer which is used to clean the
> > list. If the pointers are the same, rtable leak occurs, as first the cand is
> > removed then the new entry is appended to it.
> >
> > This leak leads to unregister_netdevice problem (usage count > 0).
> >
> > Another problem of the patch is that it tries to insert the entries in certain
> > order, to facilitate counting of entries distinct by all but QoS parameters.
> > Unfortunately, referencing an existing rtable entry moves it to list beginning,
> > to speed up further lookups, so the carefully built order is destroyed.
> >
> > For the first problem the simplest patch it to set rthi=0 when rthi==cand, but
> > it will also destroy the ordering.
> > end_of_quotation
> >
> > Problematic commit is 1080d709fb9d8cd4392f93476ee46a9d6ea05a5b
> > (net: implement emergency route cache rebulds when gc_elasticity is exceeded)
> >
> > Trying to keep dst_entries ordered is too complex and breaks the fact that
> > order should depend on the frequency of use for garbage collection.
> >
> > A possible fix is to make rt_intern_hash() simpler, and only make
> > rt_check_expire() a little bit smarter, being able to cope with an arbitrary
> > entries order. The added loop is running on cache hot data, while cpu
> > is prefetching next object, so should be unnoticed.
> >
> > Reported-and-analyzed-by: Alexander V. Lukyanov <lav@yar.ru>
> >
> > commit cf8da764fc6959b7efb482f375dfef9830e98205
> > Author: Eric Dumazet <dada1@cosmosbay.com>
> > Date: Tue May 19 18:54:22 2009 +0000
> >
> > net: fix length computation in rt_check_expire()
> >
> > rt_check_expire() computes average and standard deviation of chain lengths,
> > but does not correctly reset length to 0 at beginning of each chain.
> > This probably gives overflows for sum2 (and sum) on loaded machines instead
> > of meaningful results.
> >
> > Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> > Signed-off-by: David S. Miller <davem@davemloft.net>
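To make the second commit's description concrete, here is a minimal sketch of how a missing per-chain reset inflates sum and sum2. It is not the kernel code: the array chain_len[], the struct chain_stats and the function scan_chains() are hypothetical names used only for illustration, and the real rt_check_expire() walks rtable chains rather than an array of counts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: chain_len[i] is the number of entries in hash bucket i. */
struct chain_stats {
    unsigned long sum;   /* sum of per-chain lengths */
    unsigned long sum2;  /* sum of squared per-chain lengths */
};

/*
 * Walk every chain and accumulate sum and sum2.
 * With reset_per_chain == 0 the counter keeps growing across chains,
 * which models the behaviour the commit message describes.
 */
static struct chain_stats scan_chains(const unsigned *chain_len,
                                      size_t nbuckets, int reset_per_chain)
{
    struct chain_stats st = {0, 0};
    unsigned long length = 0;

    assert(nbuckets > 0);
    for (size_t i = 0; i < nbuckets; i++) {
        if (reset_per_chain)
            length = 0;                 /* the fix: start each chain at 0 */
        for (unsigned j = 0; j < chain_len[i]; j++)
            length++;                   /* stands in for walking the chain */
        st.sum += length;
        st.sum2 += length * length;
    }
    return st;
}

/* Mean chain length over nbuckets chains. */
static double chain_mean(struct chain_stats st, size_t nbuckets)
{
    return (double)st.sum / (double)nbuckets;
}

/* Variance of chain length; the standard deviation is its square root. */
static double chain_variance(struct chain_stats st, size_t nbuckets)
{
    double avg = chain_mean(st, nbuckets);
    return (double)st.sum2 / (double)nbuckets - avg * avg;
}
```

For chains of length 2, 0, 3 and 1, the version with the reset sees lengths 2, 0, 3, 1, while the version without it sees 2, 2, 5, 6. Since sum2 grows with the square of a counter that never goes back to 0, it is plausible that a large, busy table would overflow it, as the commit message suggests.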