From mboxrd@z Thu Jan 1 00:00:00 1970 From: Neil Horman Subject: Re: [PATCH] net: implement emergency route cache rebulds when gc_elasticity is exceeded Date: Thu, 2 Oct 2008 20:31:58 -0400 Message-ID: <20081003003158.GA19230@localhost.localdomain> References: <48E141F3.9000903@cosmosbay.com> <20080929223801.GA3157@hmsreliant.think-freely.org> <48E1C104.2080801@cosmosbay.com> <20080930.071023.07946874.davem@davemloft.net> <48E25F02.8030303@cosmosbay.com> <20081001180846.GB3566@hmsreliant.think-freely.org> <20081002010101.6f0a1fa5.billfink@mindspring.com> <48E470AC.10305@cosmosbay.com> <48E4832A.3070804@cosmosbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Bill Fink , David Miller , netdev@vger.kernel.org, kuznet@ms2.inr.ac.ru, pekkas@netcore.fi, jmorris@namei.org, yoshfuji@linux-ipv6.org, kaber@trash.net, Evgeniy Polyakov To: Eric Dumazet Return-path: Received: from charlotte.tuxdriver.com ([70.61.120.58]:58010 "EHLO smtp.tuxdriver.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753305AbYJCAeQ (ORCPT ); Thu, 2 Oct 2008 20:34:16 -0400 Content-Disposition: inline In-Reply-To: <48E4832A.3070804@cosmosbay.com> Sender: netdev-owner@vger.kernel.org List-ID: On Thu, Oct 02, 2008 at 10:15:38AM +0200, Eric Dumazet wrote: > Eric Dumazet a =C3=A9crit : >> Bill Fink a =C3=A9crit : >>> On Wed, 1 Oct 2008, Neil Horman wrote: >>> >>>> Hey all- >>>> Since Eric mentioned the use of statistical analysis to determ= ine if >>>> hash chains were growing improperly, I thought I would take a stab= =20 >>>> at such an >>>> approach. I'm no statistics expert, but it would seem to me that = simply >>>> computing the standard deviation of all the non-zero chain lengths= =20 >>>> would give a >>>> good guide point to determine if we needed to invalidate our hash = =20 >>>> table. I've >>>> written the below implementation. I've not tested it (I'll be=20 >>>> doing that here >>>> shortly for the next few days), but I wanted to post it to get =20 >>>> feedback from you >>>> all on the subject. The highlights are: >>>> >>>> 1) We add a counter to rt_hash_bucket structure to track the lengt= h=20 >>>> of each >>>> individual chain. I realize this is sub-optimal, as it adds =20 >>>> potentially lots of >>>> memory to hash table as a whole (4 bytes * number of hash buckets)= =2E=20 >>>> I'm afraid >>>> I've not come up with a better way to track that yet. We also=20 >>>> track the total >>>> number of route entries and the number of non-zero length chains. = =20 >>>> Lastly we >>>> also maintain a global maximum chain length which defines the=20 >>>> longest chain we >>>> will tolerate in the route table. This patch defines it as the=20 >>>> mean chain >>>> length plus one standard deviation. >>> >>> I believe the general rule of thumb for something like this is at >>> least two standard deviations. For a normal distribution, one stan= dard >>> deviation covers about 68 % of the sample universe, while two stand= ard >>> deviations covers about 95 % (three standard deviations covers 99.7= 3 %). >>> See the Wikipedia entry: >>> >>> http://en.wikipedia.org/wiki/Standard_deviation >>> >> >> Thanks Bill for the pointer, this is the trick. >> >> I believe we should target "4=CF=83 99.993666% " case. >> >> But we dont need to really compute Standard deviation at runtime, on= ly =20 >> find an (upper) approximation of it. >> >> For elasticity=3D4 and 512*1024 samples (mean < 4), I guess 4=CF=83 = can be =20 >> approximated by 20 or something. >> > > Good estimation of Standard Deviation could be computed for free > in rt_check_expire(). (each runs scans 20% of hash table with default= =20 > tunables timeout & ip_rt_gc_interval) > > We could update 4=CF=83 estimation in this function, every minute (ip= _rt_gc_interval) > > At softirq time we then can detect a particular hash chain is > longer than 4=CF=83 estimation and trigger an appropriate action. > > This action is to : flush table, and while we do that, expand hash ta= ble > if its current size is under ip_rt_max_size/elasticity... > That is a good idea. I'm still testing the last patch I posted, and Its= going pretty well (I did find a few bugs, so I'll be reposting when I'm done)= =2E Doing the update on demand when seem to be pretty quick, but I'll rewrite to = gather stats on how long it take specifically, and then rewrite/compare to you= r suggested implementation above. It seems like the above would save spa= ce in the hash table, as I could eliminate the chain_length counter per hash buck= et. On the down side though, I don't know if a 1 minute garbage collection = interval would allow too much of a window to grow a chain in (i.e. we could chec= k chain lengths against a too small standard deviation value, and trigger a cac= he rebuild unnecessecarily). Just thinking out loud, I've not got any evi= dence to support or contradict that Best Neil > > > >