netdev.vger.kernel.org archive mirror
* Re: Route cache performance
From: Simon Kirby @ 2005-03-09  1:45 UTC
  To: Robert Olsson, netdev

On Mon, Mar 07, 2005 at 11:03:50AM +0100, Robert Olsson wrote:

> FYI. The preroute12 was incomplete... There is a number 13.

Hi Robert,

Interesting patch!  I haven't had a chance to try it yet, but I have
been thinking about this sort of approach for some time.

I'm wondering, though, if this patch would ever be accepted upstream.
The preroute patches now require a full slow route lookup
before the route cache is checked, right?  E.g., ip_route_input() is
replaced with a call to ip_route_input_nohash(), which may then fall
back on ip_route_input(), which checks the route cache.  The nohash
function, however, still appears to have to do the full fib_lookup()
which is the same as doing at least one slow route lookup for every
packet.

The random src/dst DoS case really kills the route cache because of the
rehashing, locking, and memory allocation and freeing.  I see that the
RCU lists and locking now make it very difficult to recycle the entries,
so I think this patch is probably the right idea for now (although the
route cache should probably still be optimized where possible).

I was wondering if instead it makes sense to still check the route cache
first, but insert the bypass code as in ip_route_input_nohash() between
where the slow route lookup is done and the dst cache entry is created. 
In other words:

- The route cache is checked first.  Entries in the route cache will
  continue immediately as they do now.

- Entries not in the route cache will trigger a slow route lookup as they
  do now.

- Routes which are "INPUT" or "OUTPUT" routes (e.g., in or out of the local
  machine) will be added to the route cache as normal.

- Routes which are "FORWARD" routes will NOT be added to the route cache
  (and thus fall back to "slow" lookups each time, as with the preroute
  patch).  These slow lookups will be faster than maintaining route cache
  entries for these packets, for which we never learn an MSS anyway.

In fact, a heuristic could perhaps be added to make the route cache
bypass conditional, so that it only kicks in when the table is full or
there are too many cache misses.  This would preserve route cache
performance in normal conditions but remove the route cache overhead
under the spoofed src/dst DoS loads that kill us today.  (A rough
sketch follows.)
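
Against 2.6-era net/ipv4/route.c, the idea might look roughly like this
(rt_cache_under_pressure() and its threshold are hypothetical, and
refcount handling for the uncached case is elided):

	/* Hypothetical heuristic: stop caching forwarded flows once the
	 * table is past half of max_size (threshold for illustration). */
	static inline int rt_cache_under_pressure(void)
	{
		return atomic_read(&ipv4_dst_ops.entries) > ip_rt_max_size / 2;
	}

	/* ...at the tail of ip_route_input_slow(), after fib_lookup()
	 * has classified the route and rth has been allocated... */
	if (res.type == RTN_UNICAST && rt_cache_under_pressure()) {
		/* forwarded flow under load: hand the dst to the skb
		 * once, bypassing rt_intern_hash() entirely */
		skb->dst = &rth->u.dst;
		err = 0;
	} else {
		err = rt_intern_hash(hash, rth, (struct rtable **)&skb->dst);
	}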

My guess is this would be an even simpler patch as some of the preroute
patch is a duplication of ip_route_input_slow() that has to happen in
this case anyway.

Simon-

* Re: Route cache performance
From: Robert Olsson @ 2005-03-09 12:05 UTC
  To: Simon Kirby; +Cc: Robert Olsson, netdev


Simon Kirby writes:

 > Interesting patch!  I haven't had a chance to try it yet, but I have
 > been thinking about this sort of approach for some time.
 > 
 > I'm wondering, though, if this patch would ever be accepted upstream.
 > The preroute patches now require a full slow route lookup
 > before the route cache is checked, right?  E.g., ip_route_input() is
 > replaced with a call to ip_route_input_nohash(), which may then fall
 > back on ip_route_input(), which checks the route cache.  The nohash
 > function, however, still appears to have to do the full fib_lookup()
 > which is the same as doing at least one slow route lookup for every
 > packet.

 Well, the intention with the patch was to be able to test performance
 w/o the route hash, as many people think that if just the route hash
 goes away all things will be fine. Jamal and I started hacking on this
 at OLS 2004 and it was used to test LC-trie.

 IMO there is still no definitive answer, and even the answer depends
 on other things, i.e. how much we can improve FIB lookup etc.

 Also a bit interesting are those DDoS stories/cases being heard. I say
 this as someone involved in configuring and solving practical routing
 problems who has never seen DDoS at the levels reported. We should
 design for the worst case but not for something that's totally broken.

 > The random src/dst DoS case really kills the route cache because of the
 > rehashing, locking, and memory allocation and freeing.  I see that the
 > RCU lists and locking now make it very difficult to recycle the entries,
 > so I think this patch is probably the right idea for now (although the
 > route cache should probably still be optimized where possible).

 To some extent true... but you have the patches to test and compare
 with pure FIB lookup yourself. Try single-flow and DDoS and compare.
 DDoS still forces a lot of FIB lookups and keeps your L2 cache very
 busy, so you see degradation here too. I have some results to compare
 with.

 Also, the route hash can be improved and of course it can be tuned,
 but it needs some effort.
 BTW, I have some patches somewhere that make the input route hash
 per-device, a bit of NAPI thinking, to get fairness in case of DDoS etc.

 I have heard some examples of intelligent dropping during DDoS attacks
 in the PREROUTE hook.

 > I was wondering if instead it makes sense to still check the route cache
 > first, but insert the bypass code as in ip_route_input_nohash() between
 > where the slow route lookup is done and the dst cache entry is created. 
 > In other words:
 > 
 > - The route cache is checked first.  Entries in the route cache will
 >   continue immediately as they do now.
 > 
 > - Entries not in the route cache will trigger a slow route lookup as they
 >   do now.
 > 
 > - Routes which are "INPUT" or "OUTPUT" routes (e.g., in or out of the local
 >   machine) will be added to the route cache as normal.

 Maybe a variant. You still have to look up the hash and could still
 hold src/local_IPs/tos entries, so we are open to DDoS again?

 We have some experiments to do...
 

					--ro

* Route cache performance
From: Simon Kirby @ 2005-08-15 21:38 UTC
  To: Robert Olsson, netdev

Hi!

Well, after a few years of other work :), I have finally got around to
setting up some more permanent forwarding / route cache performance
test boxes.  I noticed the route trie option in the newer 2.6 kernels
and figured it would be a good time to revisit things.

Test setup:

[Xeon w/e1000]---[Opteron w/dual e1000]---[Xeon w/e1000]

The Xeons are 2.4 GHz boxes and the Opteron is a 140.  At some point
I intend to compare the performance of a 32 bit versus 64 bit kernel.

I'm only able to get pktgen to spit out about 660 kpps from the test 2.4
Xeon box with onboard e1000 (pause disabled), but already I notice some
disappointing results.  The old 2.4.27 kernel I last did tests with seems
to do a much better job of forwarding small packets (static src/dst) than
2.4.31 and 2.6.12.

On the (leftmost) sending box, 2.4.27, 2.4.31, and 2.6.12 all seem to do
fairly well at transmission with pktgen.  The 2.6 pktgen seems a little
better (no transmission errors and a few more Mbps), so I've been using
2.6.12.  With fixed dst packets and pause disabled via ethtool, about
660 kpps is sent continuously.  juno (spoofed source, userspace) seems to
do about 360 kpps.  The routes and packets are set up to route through
the Opteron box to the receiving (rightmost) box.

I've noticed that e1000 changes integrated in 2.6.11-bk2 are resulting in
the forwarding test box slowing down enough that it seems to be exposing
"dst cache overflow", even though under slightly less load the gc seems
to be able to keep up.  Robert, if I read correctly it seems that the
e1000 NAPI changes were some fixes you submitted?

Something appears to be different in the rtcache GC or perhaps NAPI or
some other interaction, because firing juno at 2.4 does not show any
problems while I can't seem to get 2.6.12 to _not_ print "dst cache
overflow".  2.6.11 (pre-bk2) seems a little better at start, but any kind
of burst seems to make the route cache entries exceed gc and then the
slower hash lookups seem to make it get stuck at max_size (and printing
"dst cache overflow") until the attack stops, even with gc_min_interval
set to 0 (really 0).

Anyway, I'm still in early testing stages here but it seems it's still as
easy as ever to destroy routers (and hosts?) with a fairly small stream
of small packets which create new rtcache entries.  These days, 184 Mbps
is starting to fall under the "small" DoS attack category, too.

I notice the hash table size is still only 4096 buckets for 512 MB, which
isn't that wonderful when it hits a max_size of 65536 (w/512 MB)...
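
(For scale: in these 2.6 kernels max_size is derived as 16 entries per
bucket, so 4096 buckets gives the 65536 above, and the rhash_entries=8191
boot argument used later in this thread, i.e. 8192 buckets, is what makes
max_size 131072.)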

Simon-

* Re: Route cache performance
From: Eric Dumazet @ 2005-08-16  2:23 UTC
  To: Simon Kirby; +Cc: Robert Olsson, netdev

Hi Simon

I think one of the reasons Linux 2.6 has worse results is that HZ=1000
(instead of HZ=100 for Linux 2.4), so a jiffy is 1 ms instead of 10 ms.
If rt_garbage_collect() has heavy work to do, it usually breaks out of
the loop early because of:

} while (!in_softirq() && time_before_eq(jiffies, now));

Could you please test the latest 2.6.13-rc6 kernel on the Opteron machine, compiled with HZ=100, with the appended kernel argument:

rhash_entries=8191  ( or rhash_entries=16383 )

and

echo 1 >/proc/sys/net/ipv4/route/gc_interval
echo 2 >/proc/sys/net/ipv4/route/gc_elasticity

Could you also post some data from your router (like : rtstat -c 20 -i 1)

Eric

* Re: Route cache performance
From: Simon Kirby @ 2005-08-23 19:08 UTC
  To: Eric Dumazet; +Cc: Robert Olsson, netdev

On Tue, Aug 16, 2005 at 04:23:35AM +0200, Eric Dumazet wrote:

> Hi Simon

Hi there!

> I think one of the reasons Linux 2.6 has worse results is that HZ=1000
> (instead of HZ=100 for Linux 2.4), so a jiffy is 1 ms instead of 10 ms.
> If rt_garbage_collect() has heavy work to do, it usually breaks out of
> the loop early because of:
> 
> } while (!in_softirq() && time_before_eq(jiffies, now));

I was under the impression, however, that the code Alexey added last time
I brought up this problem was intended to always allow gc when the
table is full and another entry is attempting to be created, even when
under gc_min_interval.  I'm actually not even interested (yet) in
the gc_interval/timer case because I'm currently testing with a flow
creation rate much larger than max_size per second (the minimum
gc_interval being one second).

> Could you please test the latest 2.6.13-rc6 kernel on the Opteron machine, 
> compiled with HZ=100, with the appended kernel argument:
> 
> rhash_entries=8191  ( or rhash_entries=16383 )
> 
> and
> 
> echo 1 >/proc/sys/net/ipv4/route/gc_interval
> echo 2 >/proc/sys/net/ipv4/route/gc_elasticity
> 
> Could you also post some data from your router (like : rtstat -c 20 -i 1)

Sure.  Here are results from 2.6.13-rc6 with HZ=100 and
rhash_entries=8191, which sets the max_size to 131072.  I'm using
lnstat because the rtstat version I could find doesn't work on
newer kernels:

lnstat -c -1 -i 1 -f rt_cache -k entries,in_hit,in_slow_tot,gc_total,gc_ignored,gc_goal_miss,gc_dst_overflow,in_hlist_search

The sender is running "juno 192.168.1.1 31313 0" (juno-z.101f.c):

pid 18492: ran for 40s, 13595333 packets out, 16241091 bytes/s
(~340kpps)

Without tweaks to gc_interval and gc_elasticity:

rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
        |        |     tot|        |      ed|    miss| verflow| _search|
      32|     117|     419|       0|       0|       0|       0|       0|
      32|       6|       0|       0|       0|       0|       0|       0|
      32|       2|       0|       0|       0|       0|       0|       0|
      33|       2|       4|       0|       0|       0|       0|       0|
    9033|       2|    9002|     840|     839|       0|       0|    4962|
  131062|      22|  125633|  125629|  125447|     182|     181|  837163|
  131062|       0|   13511|   13509|     900|   12609|   12609|      10|
  131062|       0|    8772|    8770|     600|    8170|    8170|       7|
  131062|       0|    8709|    8706|     600|    8106|    8106|       8|
  131062|       0|    8771|    8770|     600|    8170|    8170|       6|
  131062|       0|    8770|    8768|     600|    8168|    8168|       6|
  131062|       0|    8706|    8704|     600|    8104|    8104|      10|
  131062|       0|    8770|    8770|     600|    8170|    8170|       5|
  131062|       0|    8708|    8706|     600|    8106|    8106|       5|
  131062|       0|    8770|    8769|     600|    8169|    8169|       6|
  131062|       0|    8770|    8769|     600|    8169|    8169|      10|
  131062|       0|    8713|    8706|     600|    8106|    8106|       7|
  131062|       0|    8786|    8769|     600|    8169|    8169|       9|

With tweaks (and after 60 seconds to wait for timer expiry):

rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
        |        |     tot|        |      ed|    miss| verflow| _search|
      28|     632|  424656|  413834|  145906|  267927|  267926|  842370|
      28|       2|       3|       0|       0|       0|       0|       0|
      28|       3|       2|       0|       0|       0|       0|       0|
      28|       2|       4|       0|       0|       0|       0|       0|
   35129|       3|   35999|   27826|   27825|       0|       0|   61913|
  131062|       6|  102045|  102043|   99432|    2611|    2610|  288926|
  131062|       0|   13446|   13442|     900|   12542|   12542|      11|
  131062|       0|   11914|   11909|     800|   11109|   11109|       5|
  131062|       0|    8772|    8770|     599|    8171|    8170|       5|
  131062|       0|    8708|    8708|     600|    8108|    8108|       7|
  131062|       0|    8774|    8771|     600|    8171|    8171|       2|
  131062|       0|    8769|    8769|     600|    8169|    8169|       9|
  131062|       0|    8706|    8704|     600|    8104|    8104|       4|
  131062|       0|    8769|    8768|     599|    8169|    8168|       5|
  131062|       0|    8707|    8706|     600|    8106|    8106|       7|
  131062|       0|    8771|    8768|     600|    8168|    8168|       6|
  131062|       0|    8770|    8768|     600|    8168|    8168|       8|
  131062|       0|    8705|    8704|     600|    8104|    8104|       6|
  131062|       0|    8771|    8768|     600|    8168|    8168|       5|

No visible difference to me.

On stock 2.4.31 with no alterations to the gc settings (and no
rhash_entries as it doesn't exist), lnstat shows:

rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
        |        |     tot|        |      ed|    miss| verflow| _search|
      21|      85|     160|       0|       0|       0|       0|       0|
      21|       4|       2|       0|       0|       0|       0|       0|
      21|       2|       3|       0|       0|       0|       0|       0|
      22|       2|       2|       0|       0|       0|       0|       0|
   18432|      11|  136187|  134158|  134156|       1|       0| 1133784|
   18432|       5|  195891|  195889|  195887|       2|       0| 1763070|
   18432|       9|  195585|  195568|  195566|       2|       0| 1758397|
   18432|       7|  195290|  195281|  195279|       0|       0| 1751884|
   18432|       8|  195587|  195579|  195577|       0|       0| 1754813|
   18432|      20|  195276|  195275|  195273|       0|       0| 1752216|
   18432|      11|  194983|  194980|  194978|       0|       0| 1749822|
   18432|       7|  195288|  195287|  195285|       0|       0| 1752655|
   18432|      13|  195282|  195281|  195279|       0|       0| 1752869|
   18432|      12|  194984|  194984|  194982|       1|       0| 1749589|
   18432|      17|  194978|  194974|  194972|       0|       0| 1748817|
   18432|      11|  194985|  194981|  194979|       0|       0| 1749182|
   18432|      14|  194981|  194977|  194975|       0|       0| 1749287|
   18432|      14|  194682|  194679|  194677|       0|       0| 1746847|
   18432|      11|  194983|  194980|  194978|       0|       0| 1749679|

...and the machine is perfectly responsive.  It's dropping packets
(managing to forward ~210 kpps, a little less than 2.4.27), but it's
at least working.  2.6.13-rc6 dribbles out ~33 kpps.

Simon-

* Re: Route cache performance
From: Robert Olsson @ 2005-08-23 19:56 UTC
  To: Simon Kirby; +Cc: Eric Dumazet, Robert Olsson, netdev


 Hello!

Simon Kirby writes:

 > I was under the impression, however, that the code Alexey added last time
 > I brought up this problem was intended to always allow gc when the
 > table is full and another entry is attempting to be created, even when
 > under gc_min_interval.

 Yes your GC does not work at all in your 2.6 setups...Why?


 > rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 >  entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
 >         |        |     tot|        |      ed|    miss| verflow| _search|
 

 >   131062|      22|  125633|  125629|  125447|     182|     181|  837163|
 >   131062|       0|   13511|   13509|     900|   12609|   12609|      10|
 >   131062|       0|    8772|    8770|     600|    8170|    8170|       7|
 >   131062|       0|    8709|    8706|     600|    8106|    8106|       8|
 >   131062|       0|    8771|    8770|     600|    8170|    8170|       6|
 >   131062|       0|    8770|    8768|     600|    8168|    8168|       6|
 >   131062|       0|    8706|    8704|     600|    8104|    8104|      10|


 Can you try  
 
 echo 50 >  /proc/sys/net/ipv4/route/gc_min_interval_ms

 Cheers.
					--ro

* Re: Route cache performance
From: Simon Kirby @ 2005-08-24  0:01 UTC
  To: Robert Olsson; +Cc: Eric Dumazet, netdev

On Tue, Aug 23, 2005 at 09:56:53PM +0200, Robert Olsson wrote:

>  Yes your GC does not work at all in your 2.6 setups...Why?

Good question. :)

>  echo 50 >  /proc/sys/net/ipv4/route/gc_min_interval_ms

The output looks exactly the same with gc_min_interval_ms set to 50.

If I set it to 0, it does change a little but _still_ overflows:

rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|rt_cache|
 entries|  in_hit|in_slow_|gc_total|gc_ignor|gc_goal_|gc_dst_o|in_hlist|
        |        |     tot|        |      ed|    miss| verflow| _search|
       3|       3|       1|       1|       1|       0|       0|       0|
       4|      11|       5|       0|       0|       0|       0|       0|
       5|       5|       2|       0|       0|       0|       0|       0|
   23615|       1|   24002|   15812|       0|       0|       0|   11470|
   68692|       0|   46780|   46777|       0|    4687|       0|    4492|
   86046|       0|   18763|   18754|       0|   18754|       0|     119|
   94884|       0|    9540|    9538|       0|    9538|       0|      47|
  104901|       0|   10819|   10817|       0|   10817|       0|      61|
  114919|       0|   10817|   10818|       0|   10818|       0|      68|
  127424|       0|   13512|   13505|       0|   13505|       0|      74|
  131062|       0|   15113|   15106|       0|   15106|   10368|      28|
  131062|       0|   12503|   12482|       0|   12482|   11582|       9|
  131062|       0|    8146|    8130|       0|    8130|    7530|       5|
  131062|       0|    8204|    8194|       0|    8194|    7594|       2|
  131062|       0|    8132|    8131|       0|    8131|    7531|       5|
  131062|       0|    8196|    8195|       0|    8195|    7595|       4|
  131062|       0|    8130|    8129|       0|    8129|    7529|       8|

Something is definitely broken here.  Are the interrupts (or in this
case, NAPI) able to starve the gc somehow?

Simon-

* Re: Route cache performance
From: Robert Olsson @ 2005-08-24  3:50 UTC
  To: Simon Kirby; +Cc: Robert Olsson, Eric Dumazet, netdev




Simon Kirby writes:

 > Something is definitely broken here.  Are the interrupts (or in this
 > case, NAPI) able to starve the gc somehow?

 Hmm, no: in 2.6 dst entries are freed via an RCU callback. This had
 problems but was redesigned.

 Reading your old email... Didn't you get "dst cache overflow" before
 2.6.11-bk2?

 Otherwise I would like to have your detailed setup, to see if I get
 any ideas or can possibly reproduce it.

 Cheers.
					--ro
 

* Re: Route cache performance
From: Simon Kirby @ 2005-08-24 16:06 UTC
  To: netdev

On Wed, Aug 24, 2005 at 04:59:32PM +0200, Robert Olsson wrote:

> You could probably also try to set the hash table very large
> via the boot option rhash_entries.

I have, and it just grows until it uses up all memory and kills my SSH
session.

> and gc_thresh to 1/4 of that as an experiment.

The threshold appears to make no difference except for where it settles
once I stop the DoS traffic.

> Also if you find any 2.6 version that works a la 2.4 it's 
> a good start.

It's weird because 2.6.11 is a lot better in that the GC appears to work
for some time, but eventually something happens and it also hits
max_size and overflows continually.  I think I'm going to have to find a
version that works consistently as opposed to being "a little better".

I was just testing it again and noticed that on 2.6.11 it seems to be
almost stable at 71,000 entries (max_size = 131072) but as soon as I type
"dmesg" in another SSH window it will hit 131072.  It's as if it's at
equilibrium with the packet creation.

It may just be as simple as something that has always been buggy but
doesn't show up in 2.4 because the e1000 driver is more efficient there
(and/or some other piece of networking, which appears to be more likely).

Simon-

* Re: Route cache performance
From: Simon Kirby @ 2005-08-25 18:11 UTC
  To: Robert Olsson, kuznet; +Cc: Eric Dumazet, netdev

[ Alexey / kuznet, I've included you here as I suspect you'll know what's
  going on. :) ]

I've been working with 2.6.13-rc6 to try to figure out why it's breaking.
I've added more rt_cache statistics and flung more DoS traffic at it. 
I've determined that when the gc_goal_miss counter is increased, it seems
to be because the refcnt is non-zero in rt_may_expire().

The static "expire" variable reaches 0 easily in this case, and the rover
variables and loop aren't overflowing or anything -- it's just that
something is holding the refcnt > 0 for almost all of the entries that
rt_garbage_collect() walks.  Here are some statistics I recorded:

rt_cache
entries|in_slow|gc_tota|gc_igno|gc_aggr|gc_zero|gc_expi|gc_expi|gc_goa|gc_dst|in_hlis|
       |   _tot|      l|    red| essive|_expire|  re_no| re_yes|l_miss|overfl| t_srch|
     14|      4|      0|      0|      0|      0|      0|      0|     0|     0|      0|
  24012|  24003|  15819|  15818|      0|      0|      0|      0|     0|     0|  35182|
 131062| 112232| 112229| 110515|   1714|   1703|  10309|  79489|  1711|  1711| 767998|
 131062|  14279|  14276|    900|  13376|  13376|  75352|    900| 13376| 13376|      8|
 131062|   9542|   9538|    600|   8938|   8938|  50276|    600|  8938|  8938|      5|
 131062|   9543|   9539|    600|   8939|   8939|  50278|    600|  8939|  8939|      5|
 131062|   9542|   9538|    600|   8938|   8938|  50276|    600|  8938|  8938|     10|
 131062|   9542|   9536|    600|   8936|   8936|  50272|    600|  8936|  8936|      6|
 131062|   9475|   9472|    600|   8872|   8872|  50144|    600|  8872|  8872|      5|
 131062|   9540|   9538|    600|   8938|   8938|  50276|    600|  8938|  8938|      4|

gc_aggressive: Times the "we are in dangerous area" block executes.
gc_zero_expire: Times the loop is broken because expire == 0.
gc_expire_no: Times rt_may_expire() said no.
gc_expire_yes: Times rt_may_expire() said yes.
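
Roughly, the instrumentation sits in rt_garbage_collect()'s chain walk
like this (condensed; the gc_expire_* fields are additions for this
experiment, not stock 2.6.13):

	/* inside rt_garbage_collect(), walking one hash chain */
	while ((rth = *rthp) != NULL) {
		if (!rt_may_expire(rth, tmo, expire)) {
			tmo >>= 1;
			RT_CACHE_STAT_INC(gc_expire_no);   /* added field */
			rthp = &rth->u.rt_next;
			continue;
		}
		RT_CACHE_STAT_INC(gc_expire_yes);          /* added field */
		*rthp = rth->u.rt_next;
		rt_free(rth);
		goal--;
	}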

rt_may_expire() itself is the same code as in 2.4, so something outside
it must have changed.  I can't even find anything in route.c that
decrements or zeros the refcnt.  Does anybody know why this is happening?

Simon-

* Re: Route cache performance
From: Alexey Kuznetsov @ 2005-08-25 20:05 UTC
  To: Simon Kirby; +Cc: Robert Olsson, kuznet, Eric Dumazet, netdev

Hello!

> something is holding the refcnt > 0 for almost all of the entries that
> rt_garbage_collect() walks.

Did you try to look at output of "ip -s -s ro ls ca" ?
If it is just a refcnt leakage, leaked routes should appear there
and it is possible to guess, where they leaked.

Alexey

* Re: Route cache performance
From: Simon Kirby @ 2005-08-25 21:22 UTC
  To: Alexey Kuznetsov; +Cc: Robert Olsson, Eric Dumazet, netdev

On Fri, Aug 26, 2005 at 12:05:43AM +0400, Alexey Kuznetsov wrote:

> Hello!
> 
> > something is holding the refcnt > 0 for almost all of the entries that
> > rt_garbage_collect() walks.
> 
> Did you try to look at output of "ip -s -s ro ls ca" ?
> If it is just a refcnt leakage, leaked routes should appear there
> and it is possible to guess, where they leaked.

Hi Alexey,

It appears to be just the DoS traffic I am routing through the box, as
expected, but showing a refcnt for each entry:

    cache  users 1 age 0sec mtu 1500 advmss 1460 hoplimit 64 iif eth3

I can't find in route.c what would ever decrement the refcnt, and it
seems to start out set to 1.  It obviously gets decremented at some
point, or else the table would stay full forever, but when I stop the
DoS it falls back down.
What part of the code will decrement the count?  I can't see it.

The DoS in this case is set up to be from a spoofed source per packet and
to the address of a remote box behind the box in question.  Forwarding is
enabled.

BTW, I hacked a busy loop into juno-z.101f.c to allow fine rate control
and found that with 2.6.13-rc6, the box is unable to keep up with the
traffic starting at about 112 kpps (each packet being a new random
source).

Simon-

* Re: Route cache performance
From: Alexey Kuznetsov @ 2005-08-26 11:55 UTC
  To: Simon Kirby; +Cc: Alexey Kuznetsov, Robert Olsson, Eric Dumazet, netdev

Hello!

> What part of the code will decrement the count?  I can't see it.

It depends. In the case of forwarding, it is kfree_skb(), happening
after the packet is transmitted by the output device.
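
Condensed from the 2.6-era tree (net/core/skbuff.c and
include/net/dst.h), the chain meant here is roughly:

	/* Freeing the forwarded skb after transmit drops the packet's
	 * reference to the cached route, letting GC reclaim the entry. */
	void __kfree_skb(struct sk_buff *skb)
	{
		dst_release(skb->dst);
		/* ... rest of the skb teardown ... */
	}

	static inline void dst_release(struct dst_entry *dst)
	{
		if (dst)
			atomic_dec(&dst->__refcnt);
	}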

Well, it could result in overflow only if the device queue is longer
than route/max_size.

Alexey

* Re: Route cache performance
From: Robert Olsson @ 2005-08-26 19:49 UTC
  To: Alexey Kuznetsov; +Cc: Simon Kirby, Robert Olsson, Eric Dumazet, netdev


Hello!

This thread seems familiar :)

I think Simon uses UP, and it could be an idea to check whether the RCU
deferred deletion causes the problem.

Simon, it would be interesting to see if the patch below makes any
difference, assuming the guess about UP is correct.

Cheers.
					--ro


diff --git a/net/ipv4/route.c b/net/ipv4/route.c
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -485,7 +485,11 @@ static struct file_operations rt_cpu_seq
 static __inline__ void rt_free(struct rtable *rt)
 {
 	multipath_remove(rt);
+#ifdef CONFIG_SMP
 	call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free);
+#else
+	dst_free((struct dst_entry *)rt);
+#endif
 }
 
 static __inline__ void rt_drop(struct rtable *rt)

* Re: Route cache performance
From: Simon Kirby @ 2005-09-06 23:57 UTC
  To: Robert Olsson; +Cc: Alexey Kuznetsov, Eric Dumazet, netdev

On Fri, Aug 26, 2005 at 09:49:11PM +0200, Robert Olsson wrote:

> Hello!
> 
> This thread seems familar :)
> 
> > I think Simon uses UP, and it could be an idea to check whether the RCU
> > deferred deletion causes the problem.
>...
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -485,7 +485,11 @@ static struct file_operations rt_cpu_seq
>  static __inline__ void rt_free(struct rtable *rt)
>  {
>  	multipath_remove(rt);
> +#ifdef CONFIG_SMP
>  	call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free);
> +#else
> +	dst_free((struct dst_entry *)rt);
> +#endif
>  }
>  
>  static __inline__ void rt_drop(struct rtable *rt)

Woot!

Yes, this is the difference.  With the patch applied (just directly
freeing the dst_entry), everything balances easily, there are no
overflows, and the result of rt_may_expire() looks very reasonable.
(Yay!)

So, this seems to be the culprit.  Is NAPI supposed to allow the
queued bh to run or should we just not be queuing this?

Simon-

* Re: Route cache performance
From: Alexey Kuznetsov @ 2005-09-07  1:19 UTC
  To: Simon Kirby; +Cc: Robert Olsson, Alexey Kuznetsov, Eric Dumazet, netdev

Hello!

On Tue, Sep 06, 2005 at 04:57:00PM -0700, Simon Kirby wrote:
> On Fri, Aug 26, 2005 at 09:49:11PM +0200, Robert Olsson wrote:
...
> > I think Simon uses UP, and it could be an idea to check whether the RCU
> > deferred deletion causes the problem.
..
> Yes, this is the difference.  With the patch applied (just directly
> freeing the dst_entry), everything balances easily, there are no
> overflows, and the result of rt_may_expire() looks very reasonable.
> (Yay!)
> 
> So, this seems to be the culprit.  Is NAPI supposed to allow the
> queued bh to run or should we just not be queuing this?

It is supposed to work. :-) The problem is like an unkillable zombie.

Robert, have you seen this phenomenon already? Did you mean that SMP works
or that it never works (but this patch is valid only for UP)? Did it
become worse after 2.6.9?

Alexey

* Re: Route cache performance
From: Robert Olsson @ 2005-09-07 14:45 UTC
  To: Simon Kirby; +Cc: Robert Olsson, Alexey Kuznetsov, Eric Dumazet, netdev


Simon Kirby writes:

 > Woot!
 > 
 > Yes, this is the difference.  With the patch applied (just directly
 > freeing the dst_entry), everything balances easily, there are no
 > overflows, and the result of rt_may_expire() looks very reasonable.
 > (Yay!)
 >
 > So, this seems to be the culprit.  Is NAPI supposed to allow the
 > queued bh to run or should we just not be queuing this?

 Packet processing happens in RX_SOFTIRQ. NAPI or non-NAPI makes no
 difference: with RCU deferred delete, the freeing should happen in the
 RCU tasklet, which runs after the real softirqs.

 There is a limit on RCU work... maxbatch is set to 10.  You could back
 out the patch and try increasing it to 1000/10000, so we know whether
 this is what prevents the freeing of entries.

 module_param(maxbatch, int, 0);  /* rcupdate.c */
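
 The limit in question, condensed from 2.6-era kernel/rcupdate.c:

	static int maxbatch = 10;	/* the module parameter above */

	static void rcu_do_batch(struct rcu_data *rdp)
	{
		struct rcu_head *next, *list;
		int count = 0;

		list = rdp->donelist;
		while (list) {
			next = list->next;
			list->func(list);	/* e.g. dst_rcu_free() */
			list = next;
			/* at most maxbatch callbacks freed per tasklet
			 * run, so a flood can queue dst entries faster
			 * than this drains them */
			if (++count >= maxbatch)
				break;
		}
		rdp->donelist = list;
		/* leftover work gets the tasklet rescheduled (elided) */
	}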
 
 Also, RCU clearly states that it should be used in read-mostly
 situations; rDoS is outside this scope. Anyway, it would be interesting
 to understand what's going on.

 Cheers.
						--ro

 

* Re: Route cache performance
From: Robert Olsson @ 2005-09-07 15:03 UTC
  To: Alexey Kuznetsov; +Cc: Simon Kirby, Robert Olsson, Eric Dumazet, netdev


Alexey Kuznetsov writes:

 > Robert, have you seen this phenomenon already? Did you mean that SMP works
 > or that it never works (but this patch is valid only for UP)? Did it
 > become worse after 2.6.9?

 It has been quite some time since I saw dst cache overflow, and we use
 2.6 in infrastructure. Anyway, I was able to "tune" the route cache so
 I see it in our lab system on an SMP box. I think UP and SMP behave the
 same, but with UP we could disable the deferred delete, as Simon tested.

 I don't know if anything happened in 2.6.9; I don't think so. But any
 improvement in drivers or FIB lookup may increase the burden so we get
 overflows.

 We had some code that checked the RCU latency. 
  
 Cheers.
					--ro

* Re: Route cache performance
From: Simon Kirby @ 2005-09-07 16:28 UTC
  To: Robert Olsson; +Cc: Alexey Kuznetsov, Eric Dumazet, netdev

On Wed, Sep 07, 2005 at 04:45:03PM +0200, Robert Olsson wrote:

>  Packet processing happens in RX_SOFTIRQ. NAPI or non-NAPI makes no
>  difference: with RCU deferred delete, the freeing should happen in the
>  RCU tasklet, which runs after the real softirqs.
> 
>  There is a limit on RCU work... maxbatch is set to 10.  You could back
>  out the patch and try increasing it to 1000/10000, so we know whether
>  this is what prevents the freeing of entries.

Yes, setting maxbatch to 10000 also results in working gc, though routing
throughput is about 5.7% higher when just calling dst_free directly.

>  Also, RCU clearly states that it should be used in read-mostly
>  situations; rDoS is outside this scope. Anyway, it would be interesting
>  to understand what's going on.

There was discussion about this before (recycling of existing entries is
also now impossible, as compared with 2.4).  It's a shame that this win
for the normal case also hurts the DoS case...and it really hurts when
the DoS case is the normal case.

Simon-

* Re: Route cache performance
From: Robert Olsson @ 2005-09-07 16:49 UTC
  To: Simon Kirby; +Cc: Robert Olsson, Alexey Kuznetsov, Eric Dumazet, netdev


Simon Kirby writes:

 > Yes, setting maxbatch to 10000 also results in working gc, though routing
 > throughput is about 5.7% higher when just calling dst_free directly.

 Oh, that's good news... You lose 5.7% for rDoS but should benefit
 in normal conditions.

 > There was discussion about this before (recycling of existing entries is
 > also now impossible, as compared with 2.4).  It's a shame that this win
 > for the normal case also hurts the DoS case...and it really hurts when
 >  > the DoS case is the normal case.

 It's called trade-offs :) rDoS is hardly the normal case? But maybe it's
 time to compare routing via the route hash vs. direct FIB lookup again,
 now that we have RCU in some FIB lookups too.


 Cheers.
						--ro

* Re: Route cache performance
From: Simon Kirby @ 2005-09-07 16:55 UTC
  To: Robert Olsson; +Cc: Alexey Kuznetsov, Eric Dumazet, netdev

On Wed, Sep 07, 2005 at 05:03:17PM +0200, Robert Olsson wrote:

>  It has been quite some time since I saw dst cache overflow, and we use
>  2.6 in infrastructure. Anyway, I was able to "tune" the route cache so
>  I see it in our lab system on an SMP box. I think UP and SMP behave the
>  same, but with UP we could disable the deferred delete, as Simon tested.
> 
>  I don't know if anything happened in 2.6.9; I don't think so. But any
>  improvement in drivers or FIB lookup may increase the burden so we get
>  overflows.

I believe what I've been seeing is a _reduction_ in performance in both
the e1000 driver and other parts of the kernel that result in it handling
these packets much more slowly than in 2.4.  The dst cache only overflows
when the thing is completely pegged, so earlier 2.6 versions that were a
little faster (e.g., 2.6.11) were only overflowing occasionally depending
on the speed of the input traffic.

I've only been able to send 179 Mbps from one box, so that's what has
been killing it.  On the receiving end, 2.6.13-rc6 with the direct
dst_free now drops a bunch but stays responsive with working GC,
routing through about 69.6 Mbps, while 2.4.27 routes 103 Mbps worth.

If it would be helpful, I can build some scripts to do benchmarks with
different kernel combinations, and run them on a bunch of different kernel
versions.

Simon-

* Re: Route cache performance
From: Simon Kirby @ 2005-09-07 16:57 UTC
  To: Robert Olsson; +Cc: Alexey Kuznetsov, Eric Dumazet, netdev

On Wed, Sep 07, 2005 at 06:49:03PM +0200, Robert Olsson wrote:

>  It's called trade-offs :) rDoS is hardly the normal case? But maybe it's
>  time to compare routing via the route hash vs. direct FIB lookup again,
>  now that we have RCU in some FIB lookups too.

I haven't even filled the route tables yet.  I've just been testing with
a bog standard table (three /24s and one /0).

Simon-

* Re: Route cache performance
From: Robert Olsson @ 2005-09-07 17:21 UTC
  To: Simon Kirby; +Cc: Robert Olsson, Alexey Kuznetsov, Eric Dumazet, netdev


Simon Kirby writes:

 > I've only been able to send 179 Mbps from one box, so that's what has
 > been killing it.  On the receiving end, 2.6.13-rc6 with the direct
 > dst_free now drops a bunch but stays responsive with working GC,
 > routing through about 69.6 Mbps, while 2.4.27 routes 103 Mbps worth.

 If the route hash setup is identical (buckets etc.) and HZ is the same,
 I have no idea about the performance difference. Somebody else?
 Otherwise you need to compare (o)profiles and see if they give us any
 hints. To test drivers etc. you might also want to test with a single flow.

 Cheers.
						--ro

* Re: Route cache performance
From: Alexey Kuznetsov @ 2005-09-07 19:59 UTC
  To: Simon Kirby; +Cc: Robert Olsson, Alexey Kuznetsov, Eric Dumazet, netdev

Hello!

> Yes, setting maxbatch to 10000 also results in working gc,

Could you try lower values? F.e. I guess 300 or a little more
(it is netdev_max_backlog) should be enough.


> for the normal case also hurts the DoS case...and it really hurts when
> the DoS case is the normal case.

5.7% is not "really hurts" yet. :-)

Alexey

* Re: Route cache performance
From: Simon Kirby @ 2005-09-13 22:14 UTC
  To: Alexey Kuznetsov, Robert Olsson, Eric Dumazet, netdev

On Wed, Sep 07, 2005 at 11:59:11PM +0400, Alexey Kuznetsov wrote:

> Hello!
> 
> > Yes, setting maxbatch to 10000 also results in working gc,
> 
> Could you try lower values? F.e. I guess 300 or a little more
> (it is netdev_max_backlog) should be enough.

300 seems to be sufficient, but I'm not sure what this depends on (load,
HZ, timing of some sort?).  See below for full tests.

> > for the normal case also hurts the DoS case...and it really hurts when
> > the when the DoS case is the normal case.
> 
> 5.7% is not "really hurts" yet. :-)

I decided to try out FreeBSD in comparison as I've heard people saying
that it handles this case quite well.  The results are interesting.

FreeBSD seems to have a route cache; however, it keys only on
destination.  When a new destination is seen, the route table entry that
matched is "cloned" so that the MTU, etc., is copied, the dst rewritten
to the exact IP (as opposed to a network route), and path MTU discovery
results are maintained in this entry, keyed by destination address only.
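
For contrast, a 2.6 Linux cache hit requires all of these to match
(condensed from the lookup in ip_route_input(), net/ipv4/route.c),
which is why a random-source flood misses and allocates on every packet:

	for (rth = rt_hash_table[hash].chain; rth; rth = rth->u.rt_next) {
		if (rth->fl.fl4_dst == daddr &&
		    rth->fl.fl4_src == saddr &&	/* randomized by the DoS */
		    rth->fl.iif == iif &&
		    rth->fl.oif == 0 &&
		    rth->fl.fl4_tos == tos) {
			/* hit: take a reference and use the cached dst */
			dst_hold(&rth->u.dst);
			skb->dst = (struct dst_entry *)rth;
			return 0;
		}
	}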

I'm not sure if Linux could work in the same way with the source routing
tables enabled, but perhaps it's possible to disable the source side of
the route cache when policy routing is disabled, or to instantiate a
route cache hash per route table or something.

Actually, is there ever a valid case where the source needs to be tracked
in the route cache when policy routing is disabled?  A local socket will
track MSS correctly while a forwarded packet will create or use an entry
without touching it, so I don't see why not.

Anyway, spoofed source or not, packets go the same speed through FreeBSD.
Also, there is a "fastforwarding" sysctl that sends forwarded packets from
the input interrupt/poll without queueing them in a soft interrupt
("NETISR").

Polling mode on FreeBSD isn't as nice as NAPI in that it's fully manual
on or off, and when it's on it triggers entirely from the timer interrupt
unless told to also trigger from the idle loop.  The user/kernel balancing
is also manual but I can't seem to get it to forward as fast as with it
disabled no matter how I adjust it.


TEST RESULTS
------------

All Linux tests with NAPI enabled and the e1000 driver native to that
kernel unless otherwise specified.  maxbatch does not exist in kernels
< 2.6.9, and rhash_size does not exist in 2.4.


Sender: 367 Mbps, 717883 pps valid src/dst, 64 byte (Ethernet) packets

2.4.27-rc1: 297 Mbps forwarded (w/idle time?!)
2.4.31: 296 Mbps forwarded (w/idle time?!)
2.6.13-rc6: 173 Mbps forwarded
FreeBSD 5.4-RELEASE (HZ=1000): 103 Mbps forwarded (dead userland)
`- net.inet.ip.fastforwarding=1: 282 Mbps forwarded (dead userland)
   `- kern.polling.enable=1: 75.3 Mbps forwarded
      `- kern.polling.idle_poll=1: 226 Mbps forwarded


Sender: 348 Mbps, 680416 pps random src, valid dst, 64 bytes

(All FreeBSD tests have identical results.)

2.4.27-rc1: 122 Mbps forwarded
2.4.27-rc1 gc_elasticity=1: 182 Mbps forwarded
2.4.27-rc1+2.4.31_e1000: 117 Mbps forwarded
2.4.27-rc1+2.4.31_e1000 gc_elasticity=1: 170 Mbps forwarded
2.4.31: 95.1 Mbps forwarded
2.4.31 gc_elasticity=1: 122 Mbps forwarded

2.6.13-rc6: <1 Mbps forwarded (dst overflow)
2.6.13-rc6 maxbatch=30: <1 Mbps forwarded (dst overflow)
2.6.13-rc6 maxbatch=60: 1.5 Mbps forwarded (dst overflow)
2.6.13-rc6 maxbatch=100: 2.6 Mbps forwarded (dst overflow)
2.6.13-rc6 maxbatch=150: 3.8 Mbps forwarded (dst overflow)
2.6.13-rc6 maxbatch=200: 6.9 Mbps forwarded (dst overflow)
2.6.13-rc6 maxbatch=250: 15.4 Mbps forwarded (dst overflow)
2.6.13-rc6 maxbatch=300: 58.6 Mbps forwarded (gc balanced)
2.6.13-rc6 maxbatch=350: 60.5 Mbps forwarded
2.6.13-rc6 maxbatch=400: 59.4 Mbps forwarded
2.6.13-rc6 maxbatch=450: 59.1 Mbps forwarded
2.6.13-rc6 maxbatch=500: 62.0 Mbps forwarded
2.6.13-rc6 maxbatch=550: 61.9 Mbps forwarded
2.6.13-rc6 maxbatch=1000: 61.4 Mbps forwarded
2.6.13-rc6 maxbatch=2000: 60.2 Mbps forwarded
2.6.13-rc6 maxbatch=3000: 60.1 Mbps forwarded
2.6.13-rc6 maxbatch=5000: 59.1 Mbps forwarded
2.6.13-rc6 maxbatch=MAXINT: 59.1 Mbps forwarded
2.6.13-rc6 dst_free: 66.0 Mbps forwarded
2.6.13-rc6 dst_free max_size=rhash_size: 79.2 Mbps forwarded

------------

2.6 definitely has better dst cache gc balancing than 2.4.  I can set
the max_size=rhash_size in 2.6.13-rc6 and it will just work, even without
adjusting gc_elasticity or gc_thresh.  In 2.4.27 and 2.4.31, the only
parameter that appears to help is gc_elasticity.  If I just adjust
max_size, it overflows and falls over.

I note that the actual read copy update "maxbatch" limit was added in
2.6.9.  Before then, it seems there was no limit (infinite).  Was it
added for latency reasons?

Time permitting, I'd also like to run some profiles.  It's interesting
to note that 2.6 is slower at forwarding even straight duplicate small
packets.  We should definitely get to the bottom of that.

Simon-

* Re: Route cache performance
From: Robert Olsson @ 2005-09-14  8:04 UTC
  To: Simon Kirby; +Cc: Alexey Kuznetsov, Robert Olsson, Eric Dumazet, netdev


Simon Kirby writes:


 > Sender: 367 Mbps, 717883 pps valid src/dst, 64 byte (Ethernet) packets
 > 
 > 2.4.27-rc1: 297 Mbps forwarded (w/idle time?!)
 > 2.4.31: 296 Mbps forwarded (w/idle time?!)
 > 2.6.13-rc6: 173 Mbps forwarded

 > Time permitting, I'd also like to run some profiles.  It's interesting
 > to note that 2.6 is slower at forwarding even straight duplicate small
 > packets.  We should definitely get to the bottom of that.

 Yes. This is a single flow? Strange.

 Run a fixed-size shot of 10 Mpkts or so for both 2.4 and 2.6 and save
 /proc/interrupts, /proc/net/softnet_stat, netstat -i, and tc -s qdisc to
 start with.
 
 A profile on 2.6 could solve the confusion.

 Cheers.
						--ro

 

* Re: Route cache performance
From: Alexey Kuznetsov @ 2005-09-15 21:04 UTC
  To: Simon Kirby; +Cc: Alexey Kuznetsov, Robert Olsson, Eric Dumazet, netdev

Hello!

> 300 seems to be sufficient, but I'm not sure what this depends on (load,
> HZ, timing of some sort?).

It should be enough; it depends on nothing but the sysctl
net/core/netdev_max_backlog.


> I'm not sure if Linux could work in the same way

It could, but it does not. Still.


> Actually, is there ever a valid case where the source needs to be tracked
> in the route cache when policy routing is disabled?

Unfortunately. It caches lots of information depending on incoming
interface and address. All this is mostly useless and can be eliminated,
but it is not so trivial.


> there is a "fastforwarding" sysctl that sends forwarded packets from
> the input interrupt/poll without queueing them in a soft interrupt
> ("NETISR").

We used to experiment with this too. "fastroute" was killed completely;
NAPI is a little slower, but much better from the viewpoint of
maintainability.


>2.4.27-rc1: 297 Mbps forwarded (w/idle time?!)
vs
>2.6.13-rc6: 173 Mbps forwarded

and

> 2.4.27-rc1 gc_elasticity=1: 182 Mbps forwarded
vs
> 2.6.13-rc6 maxbatch=300: 58.6 Mbps forwarded (gc balanced)

No clue! There should be no such big difference. It is some disaster.

Something is very wrong, and most of the loss is not even related to the routing cache.
Most likely it is driver or something is seriously screwed up in softirq
processing. Profiling is really required...

Robert, did you not see anything like this?


> 2.6 definitely has better dst cache gc balancing than 2.4.  I can set
> the max_size=rhash_size in 2.6.13-rc6 and it will just work, even without
> adjusting gc_elasticity or gc_thresh.  In 2.4.27 and 2.4.31, the only
> parameter that appears to help is gc_elasticity.  If I just adjust
> max_size, it overflows and falls over.

I have no idea why it works. The size of the cache is determined by
gc_elasticity both in 2.4 and 2.6. Nothing changed.

The only difference in 2.4 is that it used to have a wrong default
5-second value for gc_min_interval (0.5 sec in 2.6). Unless this is
fixed, gc just does not work at high rates.

In both 2.6 and 2.4 you must not touch max_size unless you want to _increase_
it; the default value is the minimum that sanity allows. Actually, there is a hard
constraint: gc_elasticity * rhash_size <= max_size/2.
If you break this condition, things must break. Probably you do not see
this because you do not change routing tables while testing.
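
To make the arithmetic concrete, a minimal standalone check of the
constraint (all three numbers below are made-up examples; the real
rhash_size is chosen at boot from available memory):

#include <stdio.h>

int main(void)
{
        unsigned long rhash_size    = 65536;    /* example boot-time hash size */
        unsigned long gc_elasticity = 8;        /* example (era default) */
        unsigned long max_size      = 1048576;  /* example sysctl setting */

        /* the hard constraint: gc_elasticity * rhash_size <= max_size / 2 */
        if (gc_elasticity * rhash_size <= max_size / 2)
                printf("ok: gc can keep the cache below max_size\n");
        else
                printf("broken: the cache overflows before gc catches up\n");
        return 0;
}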


> I note that the actual read copy update "maxbatch" limit was added in
> 2.6.9.  Before then, it seems there was no limit (infinite).  Was it
> added for latency reasons?

Before 2.6.9 RCU worked differently. It ran very rarely and had to do
lots of work on each run, effectively unlimited. Apparently, when the RCU folks
finally implemented the new, better mechanism they also added a job limit
and got it wrong: 10 is a ridiculously low limit.
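
Roughly what that looked like (a sketch from memory of the 2.6.9-era
kernel/rcupdate.c; details may differ):

static int maxbatch = 10;       /* the ridiculously low limit */

static void rcu_do_batch(struct rcu_data *rdp)
{
        struct rcu_head *next, *list;
        int count = 0;

        list = rdp->donelist;
        while (list) {
                next = list->next;
                list->func(list);       /* e.g. the dst free callback */
                list = next;
                if (++count >= maxbatch)
                        break;          /* the rest waits until the
                                         * tasklet runs again */
        }
        rdp->donelist = list;
        if (rdp->donelist)
                tasklet_schedule(&rdp->tasklet);  /* reschedule for the rest */
}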

Alexey

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Route cache performance
  2005-09-15 21:04                             ` Alexey Kuznetsov
@ 2005-09-15 21:30                               ` Robert Olsson
  2005-09-15 22:21                                 ` Alexey Kuznetsov
  0 siblings, 1 reply; 36+ messages in thread
From: Robert Olsson @ 2005-09-15 21:30 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: Simon Kirby, Robert Olsson, Eric Dumazet, netdev


Alexey Kuznetsov writes:

 > Most likely it is the driver, or something seriously screwed up in softirq
 > processing. Profiling is really required...
 > 
 > Robert, did you not see anything like this?

 No. There must be an explanation... I've seen around 1 Mpps in the best
 single-flow tests with 2.6 kernels, on decent HW of course. Simon, can you
 report in pps, since you use 64 byte pkts?

 > Before 2.6.9 RCU worked differently. It ran very rarely and had to do
 > lots of work on each run, effectively unlimited. Apparently, when the RCU folks
 > finally implemented the new, better mechanism they also added a job limit
 > and got it wrong: 10 is a ridiculously low limit.

 Yes. I guess the thinking was that RCU is for read-mostly data and an rDoS
 violates this, but yes, 10 seems dangerously low.

 Also interesting to get BSD numbers? Sounds like they use something like
 the old FASTROUTE.

 Cheers.
					--ro

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Route cache performance
  2005-09-15 21:30                               ` Robert Olsson
@ 2005-09-15 22:21                                 ` Alexey Kuznetsov
  2005-09-16 12:18                                   ` Robert Olsson
  0 siblings, 1 reply; 36+ messages in thread
From: Alexey Kuznetsov @ 2005-09-15 22:21 UTC (permalink / raw)
  To: Robert Olsson; +Cc: Alexey Kuznetsov, Simon Kirby, Eric Dumazet, netdev

Hello!

>  No. There must be an explanation... I've seen around 1 Mpps in the best
>  single-flow tests with 2.6 kernels, on decent HW of course. Simon, can you
>  report in pps, since you use 64 byte pkts?

Sender: 367 Mbps, 717883 pps valid src/dst, 64 byte (Ethernet) packets

2.4.27-rc1: 297 Mbps forwarded (w/idle time?!)

So, his best number is (717883/367)*297 ~= 580kpps

>  Yes. I guess the thinking was that RCU is for read-mostly data

RCU should not add essential overhead under DoS, actually. The difference
between direct dst_free and RCU is strange as well.
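
For reference, the deferred path under discussion, sketched from memory
(2.6-era net/ipv4/route.c; whether it is call_rcu or call_rcu_bh, and the
exact callback name, depend on the version):

static inline void rt_free(struct rtable *rt)
{
        /* destruction is deferred past an RCU grace period and then
         * done from the RCU callback machinery, instead of inline */
        call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free);
}

The "direct" alternative would be to call dst_free(&rt->u.dst) on the spot.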


>  Also interesting to get BSD numbers? Sounds like they use something like
>  the old FASTROUTE.

Yes, it is quite funny. I guess it required irq protection for the radix tree
manipulations, grr... Anyway, I would expect BSD with fastforwarding to beat
NAPI.

Alexey

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Route cache performance
  2005-09-15 22:21                                 ` Alexey Kuznetsov
@ 2005-09-16 12:18                                   ` Robert Olsson
  2005-09-16 19:04                                     ` Alexey Kuznetsov
  0 siblings, 1 reply; 36+ messages in thread
From: Robert Olsson @ 2005-09-16 12:18 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: Robert Olsson, Simon Kirby, Eric Dumazet, netdev


Alexey Kuznetsov writes:

 > Sender: 367 Mbps, 717883 pps valid src/dst, 64 byte (Ethernet) packets
 > 2.4.27-rc1: 297 Mbps forwarded (w/idle time?!)
 > So, his best number is (717883/367)*297 ~= 580kpps

Yes, sounds familiar: XEON with e1000... So why not for 2.6?


Below is a very quick test from our 1.6 GHz Opteron, latest GIT tree, UP.

e1000 at PCI-X 133/100 MHz, 82546GB dual NICs. Input: 881 kpps into eth0.
  
Iface   MTU Met  RX-OK RX-ERR RX-DRP RX-OVR  TX-OK TX-ERR TX-DRP TX-OVR Flags
eth0   1500   0 6733009 3267274 6534548 3267274    180      0      0      0 BRU
eth1   1500   0      6      0      0      0 6732737      0      0      0 BRU

cat /proc/net/softnet_stat 
0066bd27 00000000 000057aa 00000000 00000000 00000000 00000000 00000000 00000000

cat /proc/interrupts 
           CPU0       
 16:        707   IO-APIC-level  eth0
 17:        293   IO-APIC-level  eth1
 18:        286   IO-APIC-level  eth2
 19:        286   IO-APIC-level  eth3

Total routed throughput of 590 kpps.

 > >  Yes. I guess the thinking was that RCU is for read-mostly data
 > 
 > RCU should not add essential overhead under DoS, actually. The difference
 > between direct dst_free and RCU is strange as well.

 I think we saw this before. I proposed disabling deferred deletions
 as with the patch I sent for UP. 

 > >  Also interesting to get BSD numbers? Sounds like they use something like
 > >  the old FASTROUTE.
 > 
 > Yes, it is quite funny. I guess it required irq protection for the radix tree
 > manipulations, grr... Anyway, I would expect BSD with fastforwarding to beat
 > NAPI.

 BSD uses fixed polling from what I understand, so it should be pretty close to
 NAPI. With a radix tree for the FIB they need a route cache even more than
 Linux does. But their code path might be more efficient and have fewer hooks.
 Also, I don't know about SMP/NUMA for BSD; we pay some price for it but
 hopefully get something in return.

 Cheers.
						--ro

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Route cache performance
  2005-09-16 12:18                                   ` Robert Olsson
@ 2005-09-16 19:04                                     ` Alexey Kuznetsov
  2005-09-16 19:22                                       ` Ben Greear
  2005-09-16 19:57                                       ` Robert Olsson
  0 siblings, 2 replies; 36+ messages in thread
From: Alexey Kuznetsov @ 2005-09-16 19:04 UTC (permalink / raw)
  To: Robert Olsson; +Cc: Alexey Kuznetsov, Simon Kirby, Eric Dumazet, netdev

Hello!

> Yes, sounds familiar: XEON with e1000... So why not for 2.6?

Most likely, something is broken in the e1000 driver. Otherwise, no ideas.

>  I think we saw this before. I proposed disabling deferred deletions
>  as with the patch I sent for UP. 

I do not see _why_. Apparently some overhead is present, but I do not
understand why it is so large. Is it just because 300 redundant entries
pollute the cache a little more? I do not see other reasons.

Maybe it makes sense to compare this effect with the effect of incrementing
gc_elasticity by 1. If it is due to cache pollution, the effect of incrementing
gc_elasticity, which increases the size of the cache by rhash_size, should be
even worse.

Alexey

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Route cache performance
  2005-09-16 19:04                                     ` Alexey Kuznetsov
@ 2005-09-16 19:22                                       ` Ben Greear
  2005-09-16 19:57                                       ` Robert Olsson
  1 sibling, 0 replies; 36+ messages in thread
From: Ben Greear @ 2005-09-16 19:22 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: Robert Olsson, Simon Kirby, Eric Dumazet, netdev

Alexey Kuznetsov wrote:
> Hello!
> 
> 
>>Yes, sounds familiar: XEON with e1000... So why not for 2.6?
> 
> 
> Most likely, something is broken in the e1000 driver. Otherwise, no ideas.

Has anyone tried using bridging to compare numbers?  I would assume that
the bridging code is lower-overhead than the routing, so if it's a route
cache problem, the bridge traffic should be significantly higher than
the routed traffic.  If they are both about the same, then either
bridging has lots of overhead too, or the driver (or other network
sub-system) is the bottleneck.

For reference, I was able to bridge only about 200 kpps (in each direction,
64 byte pkts) on a P-IV 3 GHz system with dual Intel e1000 NICs on a PCI-X
64/133 bus....

I would like to hear of any other bridging benchmarks that someone
may have, especially for bi-directional traffic flows.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Route cache performance
  2005-09-16 19:04                                     ` Alexey Kuznetsov
  2005-09-16 19:22                                       ` Ben Greear
@ 2005-09-16 19:57                                       ` Robert Olsson
  1 sibling, 0 replies; 36+ messages in thread
From: Robert Olsson @ 2005-09-16 19:57 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: Robert Olsson, Simon Kirby, Eric Dumazet, netdev


Alexey Kuznetsov writes:

 > Most likely, something is broken in the e1000 driver. Otherwise, no ideas.

 No, it's hard to guess. Simon will hopefully bring some more data.

 > I do not see _why_. Apparently some overhead is present, but I do not
 > understand why it is so large. Is it just because 300 redundant entries
 > pollute the cache a little more? I do not see other reasons.

 Yes: when the RX softirq is done, the RCU tasklet has to take over and probably
 reload the cache with some of the entries to complete the deletion. It might be 
 worth a profile...

 > Maybe it makes sense to compare this effect with the effect of incrementing
 > gc_elasticity by 1. If it is due to cache pollution, the effect of incrementing
 > gc_elasticity, which increases the size of the cache by rhash_size, should be
 > even worse.

 Something like that, yes :) But if we increase gc_elasticity we also add 
 more spinning in the hash chains. So we need to sort out whether the expected 
 performance drop comes from extra hash spinning or from cache effects of 
 the larger hash table.

 Cheers.
					--ro

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Route cache performance
  2005-09-14  8:04                             ` Robert Olsson
@ 2005-09-17  0:28                               ` Simon Kirby
  2005-09-17  9:04                                 ` Martin Josefsson
  2005-09-17 15:17                                 ` jamal
  0 siblings, 2 replies; 36+ messages in thread
From: Simon Kirby @ 2005-09-17  0:28 UTC (permalink / raw)
  To: Robert Olsson; +Cc: Alexey Kuznetsov, Eric Dumazet, netdev

On Wed, Sep 14, 2005 at 10:04:21AM +0200, Robert Olsson wrote:

> 
> Simon Kirby writes:
> 
> 
>  > Sender: 367 Mbps, 717883 pps valid src/dst, 64 byte (Ethernet) packets
>  > 
>  > 2.4.27-rc1: 297 Mbps forwarded (w/idle time?!)
>  > 2.4.31: 296 Mbps forwarded (w/idle time?!)
>  > 2.6.13-rc6: 173 Mbps forwarded
> 
>  > Time permitting, I'd also like to run some profiles.  It's interesting
>  > to note that 2.6 is slower at forwarding even straight duplicate small
>  > packets.  We should definitely get to the bottom of that.
> 
>  Yes. Is this a single flow? Strange.
> 
>  Run a fixed-size shot of 10 Mpkts or so for both 2.4 and 2.6, and save 
>  /proc/interrupts, /proc/net/softnet_stat, netstat -i, and tc -s qdisc to start with.

I got stuck in some mud again, but I was able to run a small oprofile.

nf_iterate was near the top even though the firewall was empty, so I
changed CONFIG_IP_NF_IPTABLES=y to CONFIG_IP_NF_IPTABLES=m (and didn't
load it).  Throughput went up from 173 Mbps to 232 Mbps...yikes. 
Conntrack was never compiled.  I'll do some more profiling when I get
a chance...
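
For the curious: the cost looks structural. With netfilter compiled in,
every packet passes each hook point through something like this (a
simplified sketch of the 2.6-era nf_iterate() loop; the exact signature
and error handling differ):

        /* called from nf_hook_slow() for every packet, at every hook */
        list_for_each_continue_rcu(*i, head) {
                struct nf_hook_ops *elem = (struct nf_hook_ops *)*i;

                if (hook_thresh > elem->priority)
                        continue;
                /* with iptables built in this calls ipt_do_table(),
                 * which still locks and walks the (empty) chain to
                 * apply its policy */
                verdict = elem->hook(hook, skb, indev, outdev, okfn);
                if (verdict != NF_ACCEPT)
                        return verdict;         /* simplified */
        }
        return NF_ACCEPT;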

Simon-

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Route cache performance
  2005-09-17  0:28                               ` Simon Kirby
@ 2005-09-17  9:04                                 ` Martin Josefsson
  2005-09-17 15:17                                 ` jamal
  1 sibling, 0 replies; 36+ messages in thread
From: Martin Josefsson @ 2005-09-17  9:04 UTC (permalink / raw)
  To: Simon Kirby; +Cc: Robert Olsson, Alexey Kuznetsov, Eric Dumazet, netdev


On Fri, 2005-09-16 at 17:28 -0700, Simon Kirby wrote:

> I got stuck in some mud again, but I was able to run a small oprofile.
> 
> nf_iterate was near the top even though the firewall was empty, so I
> changed CONFIG_IP_NF_IPTABLES=y to CONFIG_IP_NF_IPTABLES=m (and didn't
> load it).  Throughput went up from 173 Mbps to 232 Mbps...yikes. 
> Conntrack was never compiled.  I'll do some more profiling when I get
> a chance...

Yes, it's bloody slow at the moment even without any rules loaded; it's
on my todo list...

If you want even less overhead, don't even select CONFIG_NETFILTER; that
way you avoid compiling in the netfilter hooks completely.

-- 
/Martin


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Route cache performance
  2005-09-17  0:28                               ` Simon Kirby
  2005-09-17  9:04                                 ` Martin Josefsson
@ 2005-09-17 15:17                                 ` jamal
  1 sibling, 0 replies; 36+ messages in thread
From: jamal @ 2005-09-17 15:17 UTC (permalink / raw)
  To: Simon Kirby; +Cc: Robert Olsson, Alexey Kuznetsov, Eric Dumazet, netdev

On Fri, 2005-16-09 at 17:28 -0700, Simon Kirby wrote:

> nf_iterate was near the top even though the firewall was empty, so I
> changed CONFIG_IP_NF_IPTABLES=y to CONFIG_IP_NF_IPTABLES=m (and didn't
> load it).  Throughput went up from 173 Mbps to 232 Mbps...yikes. 
> Conntrack was never compiled.  I'll do some more profiling when I get
> a chance...
> 

If you want some basic stateless firewalling, turn off netfilter and use
tc ingress/egress actions instead. The impact on performance is a lot
more tolerable.

cheers,
jamal

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2005-09-17 15:17 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-08-15 21:38 Route cache performance Simon Kirby
2005-08-16  2:23 ` Eric Dumazet
2005-08-23 19:08   ` Simon Kirby
2005-08-23 19:56     ` Robert Olsson
2005-08-24  0:01       ` Simon Kirby
2005-08-24  3:50         ` Robert Olsson
2005-08-25 18:11         ` Simon Kirby
2005-08-25 20:05           ` Alexey Kuznetsov
2005-08-25 21:22             ` Simon Kirby
2005-08-26 11:55               ` Alexey Kuznetsov
2005-08-26 19:49                 ` Robert Olsson
2005-09-06 23:57                   ` Simon Kirby
2005-09-07  1:19                     ` Alexey Kuznetsov
2005-09-07 15:03                       ` Robert Olsson
2005-09-07 16:55                         ` Simon Kirby
2005-09-07 17:21                           ` Robert Olsson
2005-09-07 14:45                     ` Robert Olsson
2005-09-07 16:28                       ` Simon Kirby
2005-09-07 16:49                         ` Robert Olsson
2005-09-07 16:57                           ` Simon Kirby
2005-09-07 19:59                         ` Alexey Kuznetsov
2005-09-13 22:14                           ` Simon Kirby
2005-09-14  8:04                             ` Robert Olsson
2005-09-17  0:28                               ` Simon Kirby
2005-09-17  9:04                                 ` Martin Josefsson
2005-09-17 15:17                                 ` jamal
2005-09-15 21:04                             ` Alexey Kuznetsov
2005-09-15 21:30                               ` Robert Olsson
2005-09-15 22:21                                 ` Alexey Kuznetsov
2005-09-16 12:18                                   ` Robert Olsson
2005-09-16 19:04                                     ` Alexey Kuznetsov
2005-09-16 19:22                                       ` Ben Greear
2005-09-16 19:57                                       ` Robert Olsson
  -- strict thread matches above, loose matches on Subject: below --
2005-08-24 16:06 Simon Kirby
     [not found] <20050301220743.GF2554@netnation.com>
     [not found] ` <16940.9990.975632.115834@robur.slu.se>
2005-03-09  1:45   ` Simon Kirby
2005-03-09 12:05     ` Robert Olsson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).