linux-kernel.vger.kernel.org archive mirror
* route cache DoS testing and softirqs
@ 2004-03-29 18:45 Dipankar Sarma
From: Dipankar Sarma @ 2004-03-29 18:45 UTC
  To: linux-kernel
  Cc: netdev, Robert Olsson, Andrea Arcangeli, Paul E. McKenney,
	Dave Miller, Alexey Kuznetsov, Andrew Morton

Robert Olsson noticed dst cache overflows while doing DoS stress testing in a
2.6-based router setup a few months ago, and davem, alexey, robert and I
have been discussing this privately since then (198 mails, no less!!).
Recently, I set up an environment to test Robert's problem and have
been characterizing it. My setup is -
                                                                                
pktgen box --- in router out --
eth0           eth0 <-> dummy0
                                                                                
10.0.0.1       10.0.0.2  5.0.0.1
                                                                                
The router box is a 2-way P4 Xeon 2.4 GHz with 256MB of memory. I use
Robert's pktgen script -
                                                                                
CLONE_SKB="clone_skb 1"
PKT_SIZE="pkt_size 60"
#COUNT="count 0"
COUNT="count 10000000"
IPG="ipg 0"
                                                                                
PGDEV=/proc/net/pktgen/eth0
echo "Configuring $PGDEV"
pgset "$COUNT"
pgset "$CLONE_SKB"
pgset "$PKT_SIZE"
pgset "$IPG"
pgset "flag IPDST_RND"
pgset "dst_min 5.0.0.0"
pgset "dst_max 5.255.255.255"
pgset "flows 32768"
pgset "flowlen 10"
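
The script calls a pgset helper that is not shown. A minimal sketch of what
such a helper could look like, assuming it writes one command to the pktgen
control file named by $PGDEV and then checks the "Result:" line that pktgen
reports back. The exact helper in Robert's script may differ.

```shell
# Hypothetical pgset helper (not shown in the original script).
# Assumes PGDEV names the pktgen control file, as set in the script above.
pgset() {
    # Send one command to pktgen.
    echo "$1" > "$PGDEV"
    # pktgen reports the outcome of the last command on a "Result:" line;
    # anything other than "Result: OK:" is treated as a failure here.
    if ! grep -q "Result: OK:" "$PGDEV"; then
        grep "Result:" "$PGDEV" >&2 || true
        return 1
    fi
}
```

The helper has to be defined before the first pgset call in the script.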
                                                                                
With this, within a few seconds of starting pktgen, I get dst cache
overflow messages. I use the following instrumentation patch
to look at what's happening -
                                                                                
http://lse.sourceforge.net/locking/rcu/rtcache/pktgen/patches/15-rcu-debug.patch

I tried both vanilla 2.6.0 and 2.6.0 + the throttle-rcu patch, which limits
RCU to 4 updates per RCU tasklet. The results are here -

http://lse.sourceforge.net/locking/rcu/rtcache/pktgen/gracedata/cpu-grace.png

This graph shows the maximum grace period seen in each ~4ms time bucket,
with the buckets along the x-axis.

A couple of things are clear from this -

1. RCU grace periods of up to 300ms are seen. A 300ms grace period at
   100Kpps amounts to about 30000 pending dst entries, which results in
   route cache overflow.

2. throttle-rcu is only marginally better (the worst-case grace period is
   10% shorter).
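
The estimate in point 1 is just the grace period multiplied by the packet
rate, assuming (as the estimate does) one pending dst entry per packet.
A quick check in shell arithmetic; pending_dst_entries is a hypothetical
helper name used only for illustration:

```shell
# pending entries = grace period (ms) * packet rate (pps) / 1000
pending_dst_entries() {
    local grace_ms=$1 pps=$2
    echo $(( grace_ms * pps / 1000 ))
}

pending_dst_entries 300 100000   # prints 30000
```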

So, what causes RCU to stall for 300ms or so? I did some measurements
using the following patch -

http://lse.sourceforge.net/locking/rcu/rtcache/pktgen/patches/25-softirq-debug.patch

It applies on top of the 15-rcu-debug patch. This counts the number of
softirqs (in effect, an approximation) during ~4ms time buckets. The
result is here -

http://lse.sourceforge.net/locking/rcu/rtcache/pktgen/softirq/cpu-softirq.png
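
The patch does its counting inside the kernel. Purely to illustrate the
bucketing idea, here is a sketch that groups event timestamps into 4ms
buckets in userland. The input format (one timestamp in microseconds per
line) and the bucket_counts name are assumptions, not something the patch
produces:

```shell
# Hypothetical post-processing: read one event timestamp (in microseconds)
# per line and print "bucket_start_us count" for each 4 ms bucket.
bucket_counts() {
    awk -v width=4000 '
        { b = int($1 / width) * width; n[b]++ }
        END { for (b in n) print b, n[b] }
    ' | sort -n
}

# Five sample events: two in bucket 0, two in bucket 4000, one in bucket 8000.
printf '%s\n' 100 3900 4000 4100 9000 | bucket_counts
```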

The RCU grace period spikes are always accompanied by softirq frequency
spikes. So, this indicates that it is the large number of quick-running
softirqs that causes userland starvation, which in turn results in RCU
delays. This raises a fundamental question - should we work around
this by providing a quiescent point at the end of every softirq handler
(giving softirqs their own RCU mechanism) or should we address a wider
problem, the system getting overwhelmed by heavy softirq load, and
try to implement a real softirq throttling mechanism that balances
CPU use?

Robert demonstrated to us some time ago, with a small timestamping
user program, that the program can get starved for more than
6 seconds on his system. So userland starvation is an issue.
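
A rough sketch of such a probe, not Robert's program: it takes timestamps
back to back and reports the largest gap between two consecutive ones.
The max_gap_ms name is hypothetical, and the sketch assumes GNU date
(for %N). Each sample forks a process, so it can only show gaps that are
well above a few milliseconds.

```shell
# Hypothetical starvation probe: take N timestamps back to back and print
# the largest gap between consecutive ones, in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
max_gap_ms() {
    local n=$1 prev now gap max=0 i=0
    prev=$(date +%s%N)
    while [ "$i" -lt "$n" ]; do
        now=$(date +%s%N)
        gap=$(( (now - prev) / 1000000 ))
        if [ "$gap" -gt "$max" ]; then
            max=$gap
        fi
        prev=$now
        i=$(( i + 1 ))
    done
    echo "$max"
}

max_gap_ms 200
```

On an idle box the reported gap should stay small; under the softirq load
described above one would expect it to grow with the starvation.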

Thanks
Dipankar

Thread overview: 44+ messages
2004-03-29 18:45 route cache DoS testing and softirqs Dipankar Sarma
2004-03-29 22:29 ` Andrea Arcangeli
2004-03-30  5:06   ` Srivatsa Vaddagiri
2004-03-30  5:35     ` Srivatsa Vaddagiri
2004-03-30 15:11       ` Andrea Arcangeli
2004-03-31  2:36     ` Rusty Russell
2004-03-30 14:43   ` Dipankar Sarma
2004-03-30 19:53     ` Dipankar Sarma
2004-03-30 20:47       ` Andrea Arcangeli
2004-03-30 21:06         ` Dipankar Sarma
2004-03-30 21:27           ` Andrea Arcangeli
2004-03-30 21:29         ` Robert Olsson
2004-03-31  7:36           ` Dipankar Sarma
2004-03-30 20:05   ` kuznet
2004-03-30 20:28     ` Dipankar Sarma
2004-04-01  6:00       ` kuznet
2004-03-30 21:14     ` Andrea Arcangeli
2004-03-30 21:30       ` David S. Miller
2004-03-30 21:37         ` Andrea Arcangeli
2004-03-30 22:22           ` David S. Miller
2004-03-30 22:49             ` Andrea Arcangeli
2004-03-31 20:46               ` Dipankar Sarma
2004-03-31 21:31                 ` Andrea Arcangeli
2004-03-31 21:52                   ` Dipankar Sarma
2004-03-30 22:33           ` Robert Olsson
2004-03-31 17:10           ` Dipankar Sarma
2004-03-31 18:46             ` Robert Olsson
2004-03-31 20:37               ` Dipankar Sarma
2004-03-31 21:28                 ` Andrea Arcangeli
2004-03-31 21:43                   ` Dipankar Sarma
2004-04-05 17:11                     ` Robert Olsson
2004-04-05 21:22                       ` Dipankar Sarma
2004-04-06 12:55                         ` Robert Olsson
2004-04-06 19:52                           ` Dipankar Sarma
2004-04-07 15:23                             ` Robert Olsson
2004-04-07 19:48                               ` Dipankar Sarma
2004-04-08 13:29                           ` kuznet
2004-04-08 14:07                             ` Robert Olsson
2004-03-31 22:36                   ` Robert Olsson
2004-03-31 22:52                     ` Andrea Arcangeli
2004-04-01  6:43                       ` kuznet
2004-04-01 13:16                         ` Andrea Arcangeli
2004-04-08 13:38                           ` kuznet
2004-04-01 13:44                       ` Robert Olsson