* [fw@deneb.enyo.de: Route cache performance under stress]
From: bert hubert @ 2003-04-05 16:50 UTC (permalink / raw)
To: netdev
Forwarded:
----- Forwarded message from Florian Weimer <fw@deneb.enyo.de> -----
To: linux-kernel@vger.kernel.org
Subject: Route cache performance under stress
From: Florian Weimer <fw@deneb.enyo.de>
Date: Sat, 05 Apr 2003 18:37:43 +0200
X-Mailing-List: linux-kernel@vger.kernel.org
Please read the following paper:
<http://www.cs.rice.edu/~scrosby/tr/HashAttack.pdf>
Then look at the 2.4 route cache implementation.
Short summary: It is possible to freeze machines with 1 GB of RAM and
more with a stream of 400 packets per second with carefully chosen
source addresses. Not good.
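A minimal sketch of why chosen source addresses hurt a chained hash table. Everything here is an illustrative assumption: `toy_hash`, the bucket count and the helper names are hypothetical and are not the kernel's route cache code. The point is only that an unkeyed, predictable hash lets a sender place every entry in one bucket.

```python
import random

BUCKETS = 8192  # hypothetical bucket count (a power of two)


def toy_hash(src_addr: int) -> int:
    """Deliberately weak, unkeyed hash: fold the 32-bit address and mask it."""
    return (src_addr ^ (src_addr >> 16)) & (BUCKETS - 1)


def longest_chain(addresses) -> int:
    """Length of the longest bucket chain after inserting all addresses."""
    chains = {}
    for addr in addresses:
        bucket = toy_hash(addr)
        chains[bucket] = chains.get(bucket, 0) + 1
    return max(chains.values())


def random_addresses(count: int, seed: int = 0):
    """Uniformly random 32-bit source addresses (reproducible via the seed)."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]


def colliding_addresses(count: int, target_bucket: int = 0):
    """Distinct addresses that all hash to target_bucket under toy_hash.

    For addr = (hi << 16) | lo, toy_hash sees (lo ^ hi) in its low bits,
    so picking lo = hi ^ target_bucket forces the result to target_bucket.
    """
    if count > 1 << 16:
        raise ValueError("this construction yields at most 65536 addresses")
    return [(hi << 16) | (hi ^ target_bucket) for hi in range(count)]


if __name__ == "__main__":
    n = 4096
    print("random   :", longest_chain(random_addresses(n)))
    print("colliding:", longest_chain(colliding_addresses(n)))
```

With random addresses the chains stay short; with the constructed set every lookup that misses has to walk all `n` entries in a single bucket.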
The route cache is a DoS bottleneck in general (that's why I started
looking at it). You have to apply rate-limits in the PREROUTING
chain, otherwise a modest packet flood will push the machine off the
network (even with truly random source addresses, not triggering hash
collisions). The route cache partially defeats the purpose of SYN
cookies, too, because the kernel keeps (transient) state for spoofed
connection attempts in the route cache.
The following patch can be applied in an emergency, if you face the
hash collision DoS attack. It drastically limits the size of the
cache (but not the bucket count), and decreases performance in some
applications, but
--- route.c 2003/04/05 12:41:51 1.1
+++ route.c 2003/04/05 12:42:42
@@ -2508,8 +2508,8 @@
rt_hash_table[i].chain = NULL;
}
- ipv4_dst_ops.gc_thresh = (rt_hash_mask + 1);
- ip_rt_max_size = (rt_hash_mask + 1) * 16;
+ ipv4_dst_ops.gc_thresh = 512;
+ ip_rt_max_size = 2048;
devinet_init();
ip_fib_init();
(Yeah, I know, it's stupid, but it might help in an emergency.)
I wonder why the route cache is needed at all for hosts which don't
forward any IP packets, and why it has to include the source addresses
and TOS (for policy-based routing, probably). Most hosts simply don't
face such complex routing decisions to make the cache a win.
If you don't believe me, hook a Linux box to a packet generator
(generating packets with random source addresses) and use iptables to
drop the packets, in a first test run in the INPUT chain (after route
cache), and in a second one in the PREROUTING chain (before route
cache). I've observed an incredible difference (not in laboratory
tests, but during actual DoS attacks).
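The two test runs described above could be set up roughly as follows. This is a sketch that only builds the iptables command lines and does not execute them. The destination port 80 and the use of the mangle table for the PREROUTING rule are assumptions; the message does not say which table or match was used.

```python
def drop_rule(chain: str, dport: int = 80):
    """Return an iptables argv that drops TCP SYNs to dport in the given chain.

    INPUT (filter table) is traversed after the route lookup, PREROUTING
    (mangle table here, as an assumption) before it.
    """
    if chain == "INPUT":
        table = "filter"
    elif chain == "PREROUTING":
        table = "mangle"
    else:
        raise ValueError("expected INPUT or PREROUTING, got %r" % chain)
    return [
        "iptables", "-t", table, "-A", chain,
        "-p", "tcp", "--syn", "--dport", str(dport),
        "-j", "DROP",
    ]


if __name__ == "__main__":
    # First run: drop after the route cache; second run: drop before it.
    for chain in ("INPUT", "PREROUTING"):
        print(" ".join(drop_rule(chain)))
```

Only one of the two rules should be installed per run, so that the CPU load of the two placements can be compared under the same packet stream.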
Netfilter ip_conntrack support might have similar issues, but you
can't use it in an uncooperative environment anyway, at least in my
experience. (Note that there appears to be no way to disable
connection tracking while the code is in the kernel.)
----- End forwarded message -----
--
http://www.PowerDNS.com Open source, database driven DNS Software
http://lartc.org Linux Advanced Routing & Traffic Control HOWTO
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [fw@deneb.enyo.de: Route cache performance under stress]
From: jamal @ 2003-04-05 19:02 UTC (permalink / raw)
To: bert hubert; +Cc: netdev
On Sat, 5 Apr 2003, bert hubert wrote:
> Forwarded:
>
> ----- Forwarded message from Florian Weimer <fw@deneb.enyo.de> -----
>
> To: linux-kernel@vger.kernel.org
> Subject: Route cache performance under stress
> From: Florian Weimer <fw@deneb.enyo.de>
> Date: Sat, 05 Apr 2003 18:37:43 +0200
> X-Mailing-List: linux-kernel@vger.kernel.org
>
> Please read the following paper:
>
> <http://www.cs.rice.edu/~scrosby/tr/HashAttack.pdf>
>
> Then look at the 2.4 route cache implementation.
>
> Short summary: It is possible to freeze machines with 1 GB of RAM and
> more with a stream of 400 packets per second with carefully chosen
> source addresses. Not good.
>
I don't think the author has actually done any testing at the rate they
claim - if they had, they wouldn't be wording it as "carefully
chosen source addresses".
> The route cache is a DoS bottleneck in general (that's why I started
> looking at it).
Yes, it is - but not for the reasons described above.
> You have to apply rate-limits in the PREROUTING
> chain, otherwise a modest packet flood will push the machine off the
> network (even with truly random source addresses, not triggering hash
> collisions). The route cache partially defeats the purpose of SYN
> cookies, too, because the kernel keeps (transient) state for spoofed
> connection attempts in the route cache.
>
I think two issues are being mixed here: one, the dst cache, and two,
the SYN attacks. TCP SYN handling is more vulnerable than the dst cache,
and rate control of SYNs may remedy the issue.
> The following patch can be applied in an emergency, if you face the
> hash collision DoS attack. It drastically limits the size of the
> cache (but not the bucket count), and decreases performance in some
> applications, but
>
> --- route.c 2003/04/05 12:41:51 1.1
> +++ route.c 2003/04/05 12:42:42
> @@ -2508,8 +2508,8 @@
> rt_hash_table[i].chain = NULL;
> }
>
> - ipv4_dst_ops.gc_thresh = (rt_hash_mask + 1);
> - ip_rt_max_size = (rt_hash_mask + 1) * 16;
> + ipv4_dst_ops.gc_thresh = 512;
> + ip_rt_max_size = 2048;
>
And why could this not have been done in /proc? Is this supposed
to cure something?
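For reference, the same two limits can be set at run time through the route sysctls. The sketch below assumes a kernel that exposes `gc_thresh` and `max_size` under `/proc/sys/net/ipv4/route/`; the helper names are hypothetical, and by default it only reports what it would write (root is needed to actually apply the values).

```python
ROUTE_SYSCTL_DIR = "/proc/sys/net/ipv4/route"


def planned_settings(gc_thresh: int = 512, max_size: int = 2048):
    """Map sysctl file paths to the values used in the patch above."""
    return {
        ROUTE_SYSCTL_DIR + "/gc_thresh": gc_thresh,
        ROUTE_SYSCTL_DIR + "/max_size": max_size,
    }


def apply_settings(settings, dry_run: bool = True):
    """Write each value to its sysctl file and return the equivalent commands.

    With dry_run=True (the default) nothing is written.
    """
    commands = []
    for path, value in settings.items():
        commands.append("echo %d > %s" % (value, path))
        if not dry_run:
            with open(path, "w") as handle:
                handle.write("%d\n" % value)
    return commands


if __name__ == "__main__":
    for command in apply_settings(planned_settings()):
        print(command)
```

Unlike the patch, values set this way do not survive a reboot unless they are reapplied from a boot script.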
>
> I wonder why the route cache is needed at all for hosts which don't
> forward any IP packets, and why it has to include the source addresses
> and TOS (for policy-based routing, probably). Most hosts simply don't
> face such complex routing decisions to make the cache a win.
>
> If you don't believe me, hook a Linux box to a packet generator
> (generating packets with random source addresses) and use iptables to
> drop the packets, in a first test run in the INPUT chain (after route
> cache), and in a second one in the PREROUTING chain (before route
> cache). I've observed an incredible difference (not in laboratory
> tests, but during actual DoS attacks).
>
> Netfilter ip_conntrack support might have similar issues, but you
> can't use it in a uncooperative environment anyway, at least in my
> experience. (Note that there appears to be no way to disable
> connection tracking while the code is in the kernel.)
There are issues with the dst cache, no doubt. Conntracking, on the
other hand, is appalling. In any case I think it is extremely dangerous to
read some academic paper and start waving hands around about how it
applies here. Do some tests and come up with data on where things need to
be improved. Just pointing a finger at hashing is insufficient
(in fact I don't think hashing is the primary issue, based on some
experiments foo and I did).
cheers,
jamal
* Re: [fw@deneb.enyo.de: Route cache performance under stress]
From: Florian Weimer @ 2003-04-05 22:55 UTC (permalink / raw)
To: netdev
jamal <hadi@cyberus.ca> writes:
>> Short summary: It is possible to freeze machines with 1 GB of RAM and
>> more with a stream of 400 packets per second with carefully chosen
>> source addresses. Not good.
>>
>
> I dont think the author has done any testing actually at the rate they
> claim to have to - if they did they wouldnt be wording it as "carefully
> chosen source addresses".
Why don't you ask the author if the wording is unclear?
>> The route cache is a DoS bottleneck in general (that's why I started
>> looking at it).
>
> Yes it is - but not using the reasons described above.
Have you actually read the paper? Do you understand its implications
for the dst cache?
> I think two issues are being mixed here: One the dst cache and two
> the SYN attacks. The TCP SYNs are more vulnerable than dst cache
> and rate control of SYNs may remedy the issue.
Currently, the dst cache (which is a misnomer, as it includes the
source address and TOS field) is not only a bottleneck, you can
actually abuse it for DoS attacks.
Some people told me that the general performance issues are rather
well-known. Why aren't they being addressed?
> And why could this not have been done in /proc?
Ugh, thanks. I didn't notice that this part is actually tunable.
> Is this supposed to cure something?
Garbage collection is much more aggressive and the chains attached to
the bucket are kept much shorter, so less CPU time is spent processing
them.
> In any case i think it is extremely dangerous to read some academic
> paper
Have you read it? I doubt it.
> and start waving hands around about how it applies here.
I discovered the paper *after* writing the DoS tool.
I wrote the DoS tool *after* investigating why a server froze during a
2 Mbit/s SYN flood which was being dropped by an iptables rule (in the
INPUT chain, admittedly).
Please stop your ad hominem attacks.
> Do some tests and come up with data of where things need to be
> improved.
I have already done this, and I think I have provided the necessary
information to reproduce this. For obvious reasons, I don't want to
release the DoS tool before a real fix is available.
> Just pointing a finger at hashing is insufficient
I'm not pointing a finger at hashing, but at hash collisions.
> (infact i dont think hashing is the primary issue based on some
> experiments foo and i did)
Where can I read about your testing methods? There must be some flaw
because reality looks quite different.
Maybe you tested with non-random source addresses, or some "Internet
mix" traffic. Such tests look fine on glossy paper, but you shouldn't
assume that they tell anything about the robustness of networking
devices in the presence of malicious traffic.
* Re: [fw@deneb.enyo.de: Route cache performance under stress]
From: jamal @ 2003-04-05 23:48 UTC (permalink / raw)
To: Florian Weimer; +Cc: netdev
On Sun, 6 Apr 2003, Florian Weimer wrote:
> jamal <hadi@cyberus.ca> writes:
>
> Why don't you ask the author the wording is unclear?
I meant whoever said that 400 pps would cause a DoS. I didn't see the dst
cache test being described in the source, so I assume it is someone
other than the people who wrote the tool.
>
> > Yes it is - but not using the reasons described above.
>
> Have you actually read the paper? Do you understand its implications
> for the dst cache?
I skimmed through it and got their message. Yes, i understand the
implications ;->
>
> > I think two issues are being mixed here: One the dst cache and two
> > the SYN attacks. The TCP SYNs are more vulnerable than dst cache
> > and rate control of SYNs may remedy the issue.
>
> Currently, the dst cache (which is a misnomer, as it includes the
> source address and TOS field) is not only a bottleneck, you can
> actually abuse it for DoS attacks.
>
yes you can.
> Some people told me that the general performance issues are rather
> well-known. Why aren't they being addressed?
>
"Addressed" is a strong word but certainly they are being looked into
(for some time now). If you are enthusiastic and want to help, send
private mail to me.
> > Is this supposed to cure something?
>
> Garbage collection is much more aggressive and the chains attached to
> the bucket are kept much shorter, so less CPU time is spent processing
> them.
You may find that aggressive gc is in fact one of your problems.
>
> > In any case i think it is extremely dangerous to read some academic
> > paper
>
> Have you read it? I doubt it.
>
The skimming was good enough to get the message. I'll read it in more
detail later, after killing a small branch.
> > and start waving hands around about how it applies here.
>
> I discovered the paper *after* writing the DoS tool.
>
> I wrote the DoS tool *after* investigating why a server froze during a
> 2 Mbit/s SYN flood which was being dropped by an iptables rule (in the
> INPUT chain, admittedly).
>
I don't see the correlation between SYN attacks and the dst cache in your
description. Can you collect some profiles?
400 pps is peanuts for the dst cache even on a 386. OTOH, this may be an
issue for a SYN DoS.
> Please stop your ad hominem attacks.
>
Relax.
> > Do some tests and come up with data of where things need to be
> > improved.
>
> I have already done this, and I think I have provided the necessary
> information to reproduce this. For obvious reasons, I don't want to
> release the DoS tool before a real fix is available.
>
There are a lot of tools already being used by kiddies out
there, but it's a good idea not to arm them with one more.
Please collect some profiles and send them privately.
> > Just pointing a finger at hashing is insufficient
>
> I'm not pointing a finger at hashing, but at hash collisions.
>
I know that's what the paper says - but this is exactly what I have
problems with: in the little study we did, it's not the collisions that
kill us but rather the fact that we are being forced into the slow path
the majority of the time. Please collect the profiles and send me the
tool as well.
> > (infact i dont think hashing is the primary issue based on some
> > experiments foo and i did)
>
> Where can I read about your testing methods? There must be some flaw
> because reality looks quite different.
>
> Maybe you tested with non-random source addresses, or some "Internet
> mix" traffic. Such tests look fine on glossy paper, but you shouldn't
> assume that they tell anything about the robustness of networking
> devices in the presence of malicious traffic.
Actually, try varying the dst address; it gets more interesting.
Our data was collected at a real ISP which hosts a lot of web servers
and was being constantly DoSed. I don't think you can get more real world
than that.
Talk to me privately about our tests - I am a little conservative about
wowing people with results without doing appropriate analysis. It is
possible you hit a different, unrelated issue.
cheers,
jamal
* Re: [fw@deneb.enyo.de: Route cache performance under stress]
From: Florian Weimer @ 2003-04-06 12:08 UTC (permalink / raw)
To: jamal; +Cc: netdev
jamal <hadi@cyberus.ca> writes:
> I meant whoever said that 400pps would cause a DOS. I didnt see the dst
> cache test being described in the source, so i assume it is someone else
> other than people who wrote the tool.
Ah, I see. I'm sorry for the confusion.
> I skimmed through it and got their message. Yes, i understand the
> implications ;->
Good! 8-)
>> > Is this supposed to cure something?
>>
>> Garbage collection is much more aggressive and the chains attached to
>> the bucket are kept much shorter, so less CPU time is spent processing
>> them.
>
> You may find that aggressive gc is one of your problems infact.
I don't think so. During a DoS attack with spoofed source addresses,
the dst cache quickly fills up, and the overwhelming majority of the
entries is useless (they won't be used again). The slabinfo line
looks like this:
ip_dst_cache 116477 131080 192 6554 6554 1
There are only 8192 hash buckets on this system, and if we assume that
the entries are uniformly distributed over the buckets (which is not
necessarily true), the code in ip_route_input() has to look at 14 or
15 cache entries before the miss is detected. I can hardly see how
this is efficient.
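The "14 or 15 entries" estimate follows directly from the slabinfo line, reading its first number as the count of active cache objects. A small worked calculation (the function name is hypothetical):

```python
def average_chain_length(active_entries: int, hash_buckets: int) -> float:
    """Mean entries per bucket, assuming a uniform spread over the buckets."""
    return active_entries / hash_buckets


ACTIVE_ENTRIES = 116477  # first number in the ip_dst_cache slabinfo line
HASH_BUCKETS = 8192      # bucket count quoted for this system

if __name__ == "__main__":
    mean = average_chain_length(ACTIVE_ENTRIES, HASH_BUCKETS)
    print("average entries walked per miss: %.2f" % mean)
    # If every entry collided into one bucket, a miss would walk them all.
    print("worst case entries walked per miss: %d" % ACTIVE_ENTRIES)
```

The mean comes out slightly above 14, which matches the figure in the text; the worst case is the full entry count.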
> I dont see the correlation of syn attacks and the dst cache in your
> description. Can you collect some profiles?
On the machine above, the dst cache has 2**17 entries. Imagine what
happens if all these entries are chained to the same bucket, and the
chain has to be traversed for each packet.
> I know thats what the paper says - but this is exactly what i have
> problems with: In the little study we did, its not the collisions that
> kill us rather the fact we are being forced into the slow path the
> majority of times. Please collect the profiles and send me the tool as
> well.
Okay, I'm going to do this (if I can figure out how to read the
profiling data).
> Our data was collected on a real ISP which hosts a lot of web
> servers and was being constantly DOSed. I dont think you can get
> more real world than that.
Did you look at a router, or at a host?
* Re: [fw@deneb.enyo.de: Route cache performance under stress]
From: jamal @ 2003-04-06 15:14 UTC (permalink / raw)
To: Florian Weimer; +Cc: netdev
On Sun, 6 Apr 2003, Florian Weimer wrote:
> > You may find that aggressive gc is one of your problems infact.
>
> I don't think so. During a DoS attack with spoofed source addresses,
> the dst cache quickly fills up, and the overwhelming majority of the
> entries is useless (they won't be used again). The slabinfo line
> looks like this:
>
> ip_dst_cache 116477 131080 192 6554 6554 1
>
> There are only 8192 hash buckets on this system, and if we assume that
> the entries are uniformly distributed over the buckets (which is not
> necessarily true), the code in ip_route_input() has to look at 14 or
> 15 cache entries before the miss is detected. I can hardly see how
> this is efficient.
>
Do:
cat /proc/net/rt_cache_stat
Should give us a lot more info.
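A rough sketch of turning that output into a hit ratio. It assumes the file starts with a header line of field names followed by one line of hexadecimal counters per CPU, and that the fields include `in_hit` and `in_slow_tot`; that layout is an assumption and may not match every kernel version (a kernel that prints no header line would need the field names supplied by hand). The sample text is synthetic, not measured data.

```python
def parse_rt_cache_stat(text: str):
    """Parse header-plus-hex-rows text into one dict of counters per CPU."""
    lines = [line for line in text.splitlines() if line.strip()]
    names = lines[0].split()
    rows = []
    for line in lines[1:]:
        values = [int(field, 16) for field in line.split()]
        rows.append(dict(zip(names, values)))
    return rows


def input_hit_ratio(rows) -> float:
    """Fraction of input lookups served from the cache, summed over CPUs."""
    hits = sum(row["in_hit"] for row in rows)
    slow = sum(row["in_slow_tot"] for row in rows)
    total = hits + slow
    return hits / total if total else 0.0


# Synthetic two-CPU sample for illustration only.
SAMPLE = """entries in_hit in_slow_tot
0001c6fd 00000010 000000f0
0001c6fd 00000020 000000e0
"""

if __name__ == "__main__":
    rows = parse_rt_cache_stat(SAMPLE)
    print("entries:", rows[0]["entries"])
    print("input hit ratio: %.3f" % input_hit_ratio(rows))
```

A low input hit ratio under attack would indicate that most packets are taking the slow path, which is the quantity the rest of this discussion turns on.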
> > I dont see the correlation of syn attacks and the dst cache in your
> > description. Can you collect some profiles?
>
> On the machine above, the dst cache has 2**17 entries. Imagine what
> happens if all these entries are chained to the same bucket, and the
> chain has to be traversed for each packet.
Yes, in that (worst case) scenario, you have two effects: one is walking a
lot of elements before finding you have a cache miss, and the other is being
forced into the slow path after all that pain. The cache miss is not as
expensive as the slow path execution. You'd have to walk a lot of
entries to get the same effect as being forced into the slow path one time.
Again, this is my qualm with the paper's general point of view.
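The trade-off argued here can be written as a tiny cost model. The cost figures below are hypothetical placeholders in arbitrary units, not measurements from this thread; only the shape of the comparison is the point.

```python
def miss_cost(chain_length: int, per_entry_cost: float, slow_path_cost: float) -> float:
    """Cost of one cache miss: walk the whole chain, then take the slow path."""
    return chain_length * per_entry_cost + slow_path_cost


def break_even_chain_length(per_entry_cost: float, slow_path_cost: float) -> float:
    """Chain length at which the walk costs as much as one slow-path lookup."""
    return slow_path_cost / per_entry_cost


# Hypothetical placeholder costs (arbitrary units, NOT measured values).
PER_ENTRY_COST = 1.0
SLOW_PATH_COST = 50.0

if __name__ == "__main__":
    print("break-even chain length:",
          break_even_chain_length(PER_ENTRY_COST, SLOW_PATH_COST))
    for chain in (14, 116477):
        print(chain, miss_cost(chain, PER_ENTRY_COST, SLOW_PATH_COST))
```

Under these placeholder numbers, a chain of 14 entries is dominated by the slow path, while a single chain holding every entry is dominated by the walk. Which regime applies in practice depends on the real cost ratio, which only profiling can establish.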
> > Our data was collected on a real ISP which hosts a lot of web
> > servers and was being constantly DOSed. I dont think you can get
> > more real world than that.
>
> Did you look at a router, or at a host?
As a router, but the hash computation shouldn't matter.
cheers,
jamal
* Re: [fw@deneb.enyo.de: Route cache performance under stress]
From: Florian Weimer @ 2003-04-06 16:42 UTC (permalink / raw)
To: jamal; +Cc: netdev
jamal <hadi@cyberus.ca> writes:
> cat /proc/net/rt_cache_stat
> Should give us a lot more info.
I'll try to obtain the data points you requested.
>> > Our data was collected on a real ISP which hosts a lot of web
>> > servers and was being constantly DOSed. I dont think you can get
>> > more real world than that.
>>
>> Did you look at a router, or at a host?
>
> As a router, but the hash compute shouldnt matter.
The slow path is faster on hosts than on routers with substantial
routing tables, isn't it?
* Re: [fw@deneb.enyo.de: Route cache performance under stress]
From: Robert Olsson @ 2003-04-06 17:20 UTC (permalink / raw)
To: Florian Weimer; +Cc: jamal, netdev
Florian Weimer writes:
> jamal <hadi@cyberus.ca> writes:
>
> > cat /proc/net/rt_cache_stat
> > Should give us a lot more info.
You can use rtstat to read it.
robur.slu.se:
/pub/Linux/net-development/rt_cache_stat/rtstat.c
It's in iproute2 package too but it seems to be an older version.
Cheers.
-ro