netdev.vger.kernel.org archive mirror
From: Eric Dumazet <dada1@cosmosbay.com>
To: David Miller <davem@davemloft.net>
Cc: shemminger@vyatta.com, andi@firstfloor.org, davej@redhat.com,
	netdev@vger.kernel.org, j.w.r.degoede@hhs.nl
Subject: Re: cat /proc/net/tcp takes 0.5 seconds on x86_64
Date: Thu, 28 Aug 2008 09:13:13 +0200
Message-ID: <48B65009.5040805@cosmosbay.com>
In-Reply-To: <20080827.235158.187658055.davem@davemloft.net>

David Miller wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Thu, 28 Aug 2008 08:20:51 +0200
> 
>> But for route cache, it is probably doable since we added the
>> rt_genid thing in commit 29e75252da20f3ab9e132c68c9aed156b87beae6
>> ([IPV4] route cache: Introduce rt_genid for smooth cache
>> invalidation)
>>
>> If we add a hash table for each "struct net"
>> (net->ipv4.rt_hash_table), we then could do something sensible when
>> an admin writes to /proc/sys/net/ipv4/route/hash_size or at
>> rt_check_expire() time, if hash table is found to be full...
> 
> The synchronization and implementation is not a problem for
> the route cache, I implemented this eons ago.
> 
>> 3) In rt_check_expire(), add some metrics to trigger an expansion of
>>   the hash table in case we find too many entries in it.
> 
> This is the problem and why I didn't just commit the patch I had back
> then.
> 
> We could not define a reasonable way to trigger hash table growth.
> 
> GC attempts to keep a resident set of entries in the cache, and these
> heuristics are guided by the table size itself.  So if you grow the
> table too aggressively this never has a chance to work.

Maybe that is because the algorithms in net/ipv4/route.c are
overcomplicated and mix up two separate things: the number of entries
in the cache and the hash table size.

The fact is that nobody wants eight elements per hash bucket,
especially during a DDoS.
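
To make the distinction concrete, here is a minimal userspace sketch of
a growth trigger that looks only at the ratio of entries to buckets. It
is an illustration under stated assumptions, not the code in
net/ipv4/route.c: the names cache_stats, should_grow(),
next_table_size() and the MAX_AVG_CHAIN threshold are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping; these names do not come from route.c. */
struct cache_stats {
	size_t entries;	/* number of entries currently in the cache */
	size_t buckets;	/* number of hash buckets (the table size) */
};

/* Assumed threshold: grow once the average chain is longer than this. */
#define MAX_AVG_CHAIN 2

/* Return true when the average chain length exceeds the threshold. */
static bool should_grow(const struct cache_stats *s)
{
	if (s->buckets == 0)
		return false;
	/* Same test as entries / buckets > MAX_AVG_CHAIN, without division. */
	return s->entries > s->buckets * MAX_AVG_CHAIN;
}

/* Doubling keeps the table size a power of two if it started as one. */
static size_t next_table_size(size_t buckets)
{
	return buckets * 2;
}
```

With 8192 entries in 1024 buckets (eight per bucket on average) this
check asks for a bigger table; with 1024 entries in 1024 buckets it does
not. It deliberately ignores the GC interaction and the burst problem
raised in this thread.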

> 
> You want to respond dynamically to traffic in a reasonable amount of
> time, but you don't want to get tricked by bursts of RCU effects.
> 

Right, but we now also do this processing in process context instead of
in a plain timer softirq, so at least the RCU effects should now be OK.

> We never came up with an algorithm that addresses all of these
> issues.

Could you give us a pointer to your previous work?

Thank you





