* [PATCH] [1/1] Deprecate tcp_tw_{reuse,recycle}
From: Andi Kleen @ 2008-01-30 8:38 UTC
To: netdev
We've recently had a long discussion about the CVE-2005-0356 time stamp denial-of-service
attack. It turned out that Linux is only vulnerable to this problem when tcp_tw_recycle
is enabled (which it is not by default).
In general these two options are not really usable in today's internet because they
make the (often false) assumption that a single IP address has a single TCP time stamp /
PAWS clock. This assumption breaks both NAT/masquerading and also opens Linux to denial
of service attacks (see the CVE description).
Due to these numerous problems I propose to remove this code for 2.6.26.
Signed-off-by: Andi Kleen <ak@suse.de>
Index: linux/Documentation/feature-removal-schedule.txt
===================================================================
--- linux.orig/Documentation/feature-removal-schedule.txt
+++ linux/Documentation/feature-removal-schedule.txt
@@ -354,3 +354,15 @@ Why: The support code for the old firmwa
and slightly hurts runtime performance. Bugfixes for the old firmware
are not provided by Broadcom anymore.
Who: Michael Buesch <mb@bu3sch.de>
+
+---------------------------
+
+What: Support for /proc/sys/net/ipv4/tcp_tw_{reuse,recycle} = 1
+When: 2.6.26
+Why: Enabling either of those makes Linux TCP incompatible with masquerading and
+ also opens Linux to the CVE-2005-0356 denial of service attack. And these
+ optimizations are explicitly disallowed by some benchmarks. They also have
+ been disabled by default for more than ten years so they're unlikely to be used
+ much. Due to these fatal flaws it doesn't make sense to keep the code.
+Who: Andi Kleen <andi@firstfloor.org>
+
* Re: [PATCH] [1/1] Deprecate tcp_tw_{reuse,recycle}
From: Ben Greear @ 2008-01-30 19:22 UTC
To: Andi Kleen; +Cc: netdev
Andi Kleen wrote:
> We've recently had a long discussion about the CVE-2005-0356 time stamp denial-of-service
> attack. It turned out that Linux is only vulnerable to this problem when tcp_tw_recycle
> is enabled (which it is not by default).
>
> In general these two options are not really usable in today's internet because they
> make the (often false) assumption that a single IP address has a single TCP time stamp /
> PAWS clock. This assumption breaks both NAT/masquerading and also opens Linux to denial
> of service attacks (see the CVE description)
>
> Due to these numerous problems I propose to remove this code for 2.6.26
We use these features to enable creating very high numbers of short-lived
TCP connections, primarily used as a test tool for other network
devices.
Perhaps just document the adverse effects and/or have it print out a warning
on the console whenever the feature is enabled?
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH] [1/1] Deprecate tcp_tw_{reuse,recycle}
From: Andi Kleen @ 2008-01-31 2:59 UTC
To: Ben Greear; +Cc: netdev
On Wednesday 30 January 2008 20:22, Ben Greear wrote:
> We use these features to enable creating very high numbers of short-lived
> TCP connections, primarily used as a test tool for other network
> devices.
Hopefully these other network devices don't do any NAT then
or don't otherwise violate the IP-matches-PAWS assumption.
Most likely they do actually, so enabling TW recycle
for testing is probably not even safe for you.
Modern systems have a lot of RAM so even without tw recycle
you should be able to get a very high number of connections.
A timewait socket is around 128 bytes on 64bit; this means
with a GB of memory you can already support > 8 Million TW sockets.
On 32bit it's even more.
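
As a rough check of that arithmetic, here is a minimal sketch. It assumes the 128-byte figure quoted above for 64-bit and takes 1 GB as 2**30 bytes; the variable names are illustrative and are not kernel identifiers.

```python
# Back-of-the-envelope check of the TIME_WAIT memory estimate.
# Assumptions: 128 bytes per timewait socket (the 64-bit figure quoted
# above) and 1 GB taken as 2**30 bytes.
TW_SOCKET_BYTES = 128
MEMORY_BYTES = 2**30

max_tw_sockets = MEMORY_BYTES // TW_SOCKET_BYTES
print(max_tw_sockets)  # 8388608, i.e. a little over 8 million
```
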
The optimization was originally written at a time when 64MB systems
were common.
If you don't care about data integrity, have you considered just
using some custom UDP-based protocol, or running one of the user space
TCP stacks and disabling all data integrity features? If you do care about
data integrity then you should probably disable tw recycle anyway.
The deprecation period will be some time (several months), so you'll have
enough time to migrate to another method.
> Perhaps just document the adverse effects and/or have it print out a
> warning on the console whenever the feature is enabled?
"This feature is insecure and does not work on the internet or with NAT" ?
Somehow this just does not seem right to me.
-Andi
* Re: [PATCH] [1/1] Deprecate tcp_tw_{reuse,recycle}
From: Ben Greear @ 2008-01-31 6:37 UTC
To: Andi Kleen; +Cc: netdev
Andi Kleen wrote:
> On Wednesday 30 January 2008 20:22, Ben Greear wrote:
>
>
>> We use these features to enable creating very high numbers of short-lived
>> TCP connections, primarily used as a test tool for other network
>> devices.
>>
>
> Hopefully these other network devices don't do any NAT then
> or don't otherwise violate the IP-matches-PAWS assumption.
> Most likely they do actually, so enabling TW recycle
> for testing is probably not even safe for you.
>
> Modern systems have a lot of RAM so even without tw recycle
> you should be able to get a very high number of connections.
> A timewait socket is around 128 bytes on 64bit; this means
> with a GB of memory you can already support > 8 Million TW sockets.
> On 32bit it's even more.
>
I believe the problem was that all of my ports were used up with
TIME_WAIT sockets and so it couldn't create more. My test
case was similar to this:
1 Have one machine B listen for connections on one interface (one IP).
2 Have one machine A make a connection to B, and close connection
  immediately or soon after it was established.
goto 2
The goal was to make a maximum number of TCP connections per second.
The data passed is just filler, and for the fastest settings, we don't
pass data at all. Without setting tcp_tw_recycle to 1, the system could
do only a few thousand connections per second. With it set to 1, I think
I was getting around 10,000. Either way, it was significantly faster
than w/out recycle enabled.
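
The test loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual test tool: both "machines" run in one process over loopback, and the function names are made up for the example.

```python
import socket
import threading
import time


def accept_and_close(listener):
    """Machine B: accept connections and close them straight away."""
    while True:
        try:
            conn, _ = listener.accept()
        except OSError:
            return  # listener was closed
        conn.close()


def connections_per_second(remote, count):
    """Machine A: open and immediately close `count` connections."""
    start = time.perf_counter()
    for _ in range(count):
        # No data is sent, matching the "fastest settings" case.
        socket.create_connection(remote).close()
    elapsed = time.perf_counter() - start
    return count / elapsed


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # loopback stands in for machine B
    listener.listen(128)
    threading.Thread(
        target=accept_and_close, args=(listener,), daemon=True
    ).start()
    print(connections_per_second(listener.getsockname(), 1000))
```

Because A always closes first here, the TIME_WAIT entries accumulate on A's side, one per local port used toward B's single address and port.
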
So, is there a better way to max out the connections per second without
having to use tcp_tw_recycle?
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH] [1/1] Deprecate tcp_tw_{reuse,recycle}
From: Andi Kleen @ 2008-01-31 6:55 UTC
To: Ben Greear; +Cc: netdev
> I believe the problem was that all of my ports were used up with
> TIME_WAIT sockets and so it couldn't create more. My test
> case was similar to this:
Ah that's simple to solve then :- use more IP addresses and bind
to them in RR in your user program.
Arguably the Linux TCP code should be able to do this by itself
when enough IP addresses are available, but it's not very hard
to do in user space using bind(2)
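
A minimal sketch of that round-robin binding, assuming the listed local addresses are already configured on the client machine. The function name and the addresses in the usage note are hypothetical.

```python
import itertools
import socket


def round_robin_connect(local_ips, remote, count):
    """Open `count` connections to `remote`, binding each one to the
    next local IP address in turn. Returns the local (ip, port) used
    by each connection."""
    sources = itertools.cycle(local_ips)
    used = []
    for _ in range(count):
        src = next(sources)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind((src, 0))  # port 0: let the kernel pick the port
            s.connect(remote)
            used.append(s.getsockname())
    return used
```

For example, `round_robin_connect(["192.0.2.10", "192.0.2.11"], ("192.0.2.1", 80), 1000)` would alternate between two (hypothetical) source addresses, so each address gets its own pool of local ports toward the same destination.
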
BTW it's also a very unusual case -- in most cases there are more
remote IP addresses.
> So, is there a better way to max out the connections per second without
> having to use tcp_tw_recycle?
Well, did you profile where the bottlenecks were?
Perhaps also just increase the memory allowed for TCP sockets.
-Andi
* Re: [PATCH] [1/1] Deprecate tcp_tw_{reuse,recycle}
From: Ben Greear @ 2008-01-31 16:41 UTC
To: Andi Kleen; +Cc: netdev
Andi Kleen wrote:
>> I believe the problem was that all of my ports were used up with
>> TIME_WAIT sockets and so it couldn't create more. My test
>> case was similar to this:
>>
>
> Ah that's simple to solve then :- use more IP addresses and bind
> to them in RR in your user program.
>
> Arguably the Linux TCP code should be able to do this by itself
> when enough IP addresses are available, but it's not very hard
> to do in user space using bind(2)
>
> BTW it's also a very unusual case -- in most cases there are more
> remote IP addresses
>
This could be done, but it does decrease our options for testing certain
scenarios.
>> So, is there a better way to max out the connections per second without
>> having to use tcp_tw_recycle?
>>
>
> Well, did you profile where the bottlenecks were?
>
> Perhaps also just increase the memory allowed for TCP sockets.
>
I may be missing something, but I believe the issue is that the sockets
wait around a while (maybe 30 seconds or so) in TIME_WAIT state. So, even
if we use all 64k of the local port range, that will limit us to about
2000 new sockets per second, as we have to wait for old ones to
transition out of TIME_WAIT.
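
The estimate works out as below. This is a sketch with both inputs taken as assumptions: the 64k port count and the 30-second guess come from the paragraph above, and the function name is illustrative. The second line shows the bound if the TIME_WAIT period is 60 seconds instead, which, as far as I know, is what the kernel's TCP_TIMEWAIT_LEN is set to.

```python
# Upper bound on new connections per second to a single remote IP/port
# when every closed connection holds its local port in TIME_WAIT.
def max_new_connections_per_second(usable_ports, time_wait_seconds):
    return usable_ports / time_wait_seconds


print(round(max_new_connections_per_second(65536, 30)))  # 2185
print(round(max_new_connections_per_second(65536, 60)))  # 1092
```
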
I guess I could probably decrease TIME_WAIT, but then all of my
connections would be affected, not just the ones on the ports creating
very large numbers of connections per second. From 'man tcp', it does
not seem I can set the TIME_WAIT on a per-socket basis.
I don't know exactly how the tcp_tw_recycle works, but it seems like it
could be made to only take effect when all local ports are used up in
TIME_WAIT. It could then recycle the oldest one as a new socket is
requested. For any normal program, it would be very unlikely to ever
need to recycle in this case because there would be enough free IP/port
pairs available. But, for weird things like my own, at least it could be
made to work w/out hacking the global TIME_WAIT.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH] [1/1] Deprecate tcp_tw_{reuse,recycle}
From: Andi Kleen @ 2008-01-31 16:49 UTC
To: Ben Greear; +Cc: Andi Kleen, netdev
On Thu, Jan 31, 2008 at 08:41:38AM -0800, Ben Greear wrote:
> I don't know exactly how the tcp_tw_recycle works, but it seems like it
> could be made to only take effect when all local ports are used up in
> TIME_WAIT.
TIME-WAIT does not actually use up local ports; it uses up remote ports
because it is done on the LISTEN socket, which always has a fixed
local port. And it has no idea how many ports the other end has left.
-Andi