* auto recycling of TIME_WAIT connections
@ 2007-09-07 9:21 Pádraig Brady
2007-09-07 17:04 ` Rick Jones
0 siblings, 1 reply; 4+ messages in thread
From: Pádraig Brady @ 2007-09-07 9:21 UTC (permalink / raw)
To: netdev
As I see it, TIME_WAIT state is required for 2 reasons:
  1. To handle wandering duplicate packets
     (so a reincarnation of a connection will not be corrupted by these packets).
  2. To handle the last ACK from the active closer (client) not being received by the remote.
     If that happened, the server, which is in LAST_ACK state, would retransmit its FIN
     (which may also contain data), so the client must be in TIME_WAIT state to handle that.
     If the client is not in TIME_WAIT state, then it could only indicate to the server
     that data was maybe lost (with an RST).
The first issue requires a large timeout, and
the TIME_WAIT timeout is currently 60 seconds on Linux.
That timeout effectively limits the connection rate between
local TCP clients and a server to 32k/60s or around 500 connections/second.
But the first issue (wandering duplicates) can't really happen when the client
and server are on the same machine, can it? And
even if it could, the timeouts involved would be shorter.
Now Linux does have an (undocumented) /proc/sys/net/ipv4/tcp_tw_recycle flag
to enable recycling of TIME_WAIT connections. This is global, however, and could cause
problems in general for external connections.
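For anyone who wants to check how that flag is set on a given box, here is a minimal sketch in Python. The helper name `read_flag` is made up for illustration, and the sketch assumes the file may be missing: as far as I know the knob was later removed from mainline (in 4.12), so `None` is a normal result on newer kernels.

```python
from pathlib import Path

# Path named in the message above; it may not exist on every kernel.
TW_RECYCLE = "/proc/sys/net/ipv4/tcp_tw_recycle"

def read_flag(path=TW_RECYCLE):
    """Return the integer value of a sysctl file, or None if it is unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected contents.
        return None

if __name__ == "__main__":
    value = read_flag()
    if value is None:
        print("tcp_tw_recycle is not available on this system")
    else:
        print("tcp_tw_recycle =", value)
```

The setting is system-wide, which is exactly the concern raised above: it cannot be scoped to local connections only.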
So how about auto enabling recycling for local connections?
cheers,
Pádraig.
* Re: auto recycling of TIME_WAIT connections
2007-09-07 9:21 auto recycling of TIME_WAIT connections Pádraig Brady
@ 2007-09-07 17:04 ` Rick Jones
2007-09-10 10:26 ` Pádraig Brady
0 siblings, 1 reply; 4+ messages in thread
From: Rick Jones @ 2007-09-07 17:04 UTC (permalink / raw)
To: Pádraig Brady; +Cc: netdev
> The first issue, requires a large timeout, and
> the TIME_WAIT timeout is currently 60 seconds on linux.
> That timeout effectively limits the connection rate between
> local TCP clients and a server to 32k/60s or around 500 connections/second.
Actually, it would be more like 60k/60s if the application were making
explicit calls to bind() as arguably it should if it is going to be
churning through so many connections.
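A rough sketch of what "explicit calls to bind()" could look like, in Python for brevity. The function name `connect_from_port` and the choice of candidate ports are assumptions for illustration, not something taken from SPECweb96 or any particular client. The error handling reflects Linux behaviour as I understand it and may differ on other stacks.

```python
import errno
import socket

def connect_from_port(server_addr, source_ip, candidate_ports):
    """Connect to server_addr from the first usable (source_ip, port) pair."""
    for port in candidate_ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # On Linux, without SO_REUSEADDR bind() refuses a port that still
        # has a TIME_WAIT socket on it.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((source_ip, port))   # pick the source port ourselves
            sock.connect(server_addr)
            return sock
        except OSError as exc:
            sock.close()
            # Port busy, or this exact 4-tuple is still in TIME_WAIT:
            # move on to the next candidate. Anything else is a real error.
            if exc.errno not in (errno.EADDRINUSE, errno.EADDRNOTAVAIL):
                raise
    raise OSError(errno.EADDRNOTAVAIL, "no usable source port in the given range")
```

Called with something like `range(1024, 65536)` as `candidate_ports`, the client walks nearly the whole port space instead of only the ephemeral range, which is where the roughly 60k figure comes from.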
This was an issue over a decade ago with SPECweb96 benchmarking. The
initial solution was to make the explicit bind() calls and not rely on
the anonymous/ephemeral port space. After that, one starts adding
additional IP's into the mix (at least where possible). And if that
fails, one has to go back to the beginning and ask oneself exactly why a
client is trying to churn through so many connections per second in the
first place.
If we were slavishly conformant to the RFC's :) that 60 seconds would be
240 seconds...
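The arithmetic behind these figures can be sketched as follows. This assumes "k" means 1024 and that every usable source port stays tied up for the full TIME_WAIT period toward one fixed server IP and port; `max_connection_rate` is just an illustrative name.

```python
def max_connection_rate(usable_ports, time_wait_seconds):
    """Upper bound on new connections/second for one (client IP, server IP, server port)."""
    # Each source port can be reused only once per TIME_WAIT interval.
    return usable_ports / time_wait_seconds

if __name__ == "__main__":
    for ports in (32 * 1024, 60 * 1024):      # ephemeral range vs. explicit bind()
        for timeout in (60, 240):             # Linux today vs. the RFC value cited above
            rate = max_connection_rate(ports, timeout)
            print(f"{ports} ports / {timeout} s = {rate:.0f} connections/second")
```

With the 60 second timeout this gives roughly 546 and 1024 connections/second for the two port counts, and the 240 second timeout cuts both by a factor of four.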
> But that issue can't really happen when the client
> and server are on the same machine can it, and
> even if it could, the timeouts involved would be shorter.
>
> Now linux does have an (undocumented) /proc/sys/net/ipv4/tcp_tw_recycle flag
> to enable recycling of TIME_WAIT connections. This is global however and could cause
> problems in general for external connections.
Rampant speculation begins...
If the client can be convinced to just call shutdown(SHUT_RDWR) rather
than close(), and be the first to do so, ahead of the server, I think it
will retain a link to the TCP endpoint in TIME_WAIT. It could then, in
TCP theory, call connect() again, and go through a path that allows
transition from TIME_WAIT to ESTABLISHED if all the right things wrt
Initial Sequence Number selection happen. Whether randomization of the
ISN allows that today is questionable.
> So how about auto enabling recycling for local connections?
I think the standard response is that one can never _really_ know what
is local and what not, particularly in the presence of netfilter and the
rewriting of headers behind one's back.
rick jones
* Re: auto recycling of TIME_WAIT connections
2007-09-07 17:04 ` Rick Jones
@ 2007-09-10 10:26 ` Pádraig Brady
2007-09-10 17:23 ` Rick Jones
0 siblings, 1 reply; 4+ messages in thread
From: Pádraig Brady @ 2007-09-10 10:26 UTC (permalink / raw)
To: Rick Jones; +Cc: netdev
Rick Jones wrote:
>> The first issue, requires a large timeout, and
>> the TIME_WAIT timeout is currently 60 seconds on linux.
>> That timeout effectively limits the connection rate between
>> local TCP clients and a server to 32k/60s or around 500
>> connections/second.
>
> Actually, it would be more like 60k/60s if the application were making
> explicit calls to bind() as arguably it should if it is going to be
> churning through so many connections.
> This was an issue over a decade ago with SPECweb96 benchmarking. The
> initial solution was to make the explicit bind() calls and not rely on
> the anonymous/ephemeral port space. After that, one starts adding
> additional IP's into the mix (at least where possible). And if that
> fails, one has to go back to the beginning and ask oneself exactly why a
> client is trying to churn through so many connections per second in the
> first place.
right. This is for benchmarking mainly.
Sane applications would use persistent connections,
or a different form of IPC.
>
> If we were slavishly conformant to the RFC's :) that 60 seconds would be
> 240 seconds...
>
>> But that issue can't really happen when the client
>> and server are on the same machine can it, and
>> even if it could, the timeouts involved would be shorter.
>>
>> Now linux does have an (undocumented)
>> /proc/sys/net/ipv4/tcp_tw_recycle flag
>> to enable recycling of TIME_WAIT connections. This is global however
>> and could cause
>> problems in general for external connections.
>
> Rampant speculation begins...
>
> If the client can be convinced to just call shutdown(SHUT_RDWR) rather
> than close(), and be the first to do so, ahead of the server, I think it
> will retain a link to the TCP endpoint in TIME_WAIT. It could then, in
> TCP theory, call connect() again, and go through a path that allows
> transition from TIME_WAIT to ESTABLISHED if all the right things wrt
> Initial Sequence Number selection happen. Whether randomization of the
> ISN allows that today is questionable.
Sounds good, but unfortunately connect() returns EISCONN
unless you do a close() first.
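A small way to reproduce that observation over loopback, sketched in Python. The helper name `errno_on_reconnect` is made up, and the sketch only reports whatever errno the local stack returns rather than assuming a particular one.

```python
import errno
import socket

def errno_on_reconnect():
    """shutdown() a connected socket, connect() it again, return the errno (0 on success)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))        # any free loopback port
        listener.listen(1)
        server_addr = listener.getsockname()
        client.connect(server_addr)
        client.shutdown(socket.SHUT_RDWR)      # close first, but keep the descriptor
        try:
            client.connect(server_addr)        # second connect() on the same socket
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        client.close()
        listener.close()

if __name__ == "__main__":
    err = errno_on_reconnect()
    print(errno.errorcode.get(err, err))
```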
>
>> So how about auto enabling recycling for local connections?
>
> I think the standard response is that one can never _really_ know what
> is local and what not, particularly in the presence of netfilter and the
> rewriting of headers behind one's back.
Hmm, I was afraid someone would say that :)
thanks for the feedback,
Pádraig.
* Re: auto recycling of TIME_WAIT connections
2007-09-10 10:26 ` Pádraig Brady
@ 2007-09-10 17:23 ` Rick Jones
0 siblings, 0 replies; 4+ messages in thread
From: Rick Jones @ 2007-09-10 17:23 UTC (permalink / raw)
To: Pádraig Brady; +Cc: netdev
Pádraig Brady wrote:
> Rick Jones wrote:
>>This was an issue over a decade ago with SPECweb96 benchmarking. The
>>initial solution was to make the explicit bind() calls and not rely on
>>the anonymous/ephemeral port space. After that, one starts adding
>>additional IP's into the mix (at least where possible). And if that
>>fails, one has to go back to the beginning and ask oneself exactly why a
>>client is trying to churn through so many connections per second in the
>>first place.
>
>
> right. This is for benchmarking mainly.
> Sane applications would use persistent connections,
> or a different form of IPC.
All the more reason to go the "add more client IPs" path then. It
gives you more connections per second, and gives you a much broader
number of "client" IPs hitting the server, which will be more realistic.
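One way a load generator might spread connections over several source addresses, as a sketch only. The names `source_endpoints` and `connect_from` are hypothetical, and the addresses passed in are assumed to be already configured on the client host (as aliases, say).

```python
import socket

def source_endpoints(client_ips, ports):
    """Yield (ip, port) pairs, rotating through the IPs before advancing the port."""
    for port in ports:
        for ip in client_ips:
            yield (ip, port)

def connect_from(endpoint, server_addr):
    """Open one connection from a specific local (ip, port) endpoint."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind(endpoint)          # fixes both the source IP and the source port
        sock.connect(server_addr)
    except OSError:
        sock.close()
        raise
    return sock
```

Since TIME_WAIT state is kept per address/port 4-tuple, N client IPs give N times as many distinct tuples toward the same server port, so the ceiling on connections per second scales with the number of addresses.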
That is one thing I like very much about polygraph (based on what I've
read) - its use of _lots_ of client IPs to better simulate reality. I
think other web-oriented benchmarks should start to include that as well
for there are stacks which do indeed make "decisions" based on whether
or not a destination is perceived to be "local" or not.
rick jones