netdev.vger.kernel.org archive mirror
* Re: TCP_DEFER_ACCEPT issues
From: Eric Dumazet @ 2007-11-02  7:24 UTC
  To: Felix von Leitner; +Cc: linux-kernel, Linux Netdev List

Felix von Leitner wrote:
> I am trying to use TCP_DEFER_ACCEPT in my web server.
> 
> There are some operational problems.  First of all: timeout handling.  I
> would like to be able to set a timeout in seconds (or better:
> milliseconds) for how long the socket is allowed to sit there without
> data coming in.  For high load situations, I have been enforcing
> timeouts in the range of 15 seconds, otherwise someone can DoS the
> server by opening a lot of connections and tying up data structures.
> 
> It is still possible, of course, to tie up kernel memory this way, by
> not reacting to the FIN or RST packets and running into a timeout there,
> too, but that is partially tunable via sysctl.
> 
> According to tcp(7) the int argument to TCP_DEFER_ACCEPT is in seconds.
> In the kernel code, it's converted to TCP timeout units.  When I ran my
> server, and connected without sending any data, nothing happened.  No
> timeout.  Minutes later, the connection was still there.  Even worse:
> when I killed (!) the server process (thus closing the server socket),
> the client did not get a reset.  Only when I type something in the
> telnet, I get a reset.  This appears to be very broken.
> 
> My suggestion:
> 
>   1. make the argument to the setsockopt be in seconds, or milliseconds.
>   2. if the server socket is closed, reset all pending connections.
> 
> Comments?
> 

I agree TCP_DEFER_ACCEPT is not worth using at the moment, once you take
attackers or very slow networks into account.

1) Setting a timeout in the millisecond range (< 1000 ms) is not a good idea,
because some clients (very long distance) may need much more time to send
your server the data. So one-second granularity is OK.

2) After the timeout has elapsed, the server TCP stack has no socket
associated with your client's attempt, so closing the server's listening
socket won't be able to send a RST. I agree a RST *should* be sent by the
server once the timeout is triggered.
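
For reference, the seconds-based argument discussed in 1) is passed roughly
like this. This is only a minimal sketch, assuming a Linux host; the helper
name make_deferred_listener, the loopback address and the backlog of 128 are
illustrative choices, not something from this thread. As noted in the quoted
report, the kernel converts the value to its own timeout units, so the value
read back with getsockopt may differ from the one that was set.

```python
import socket


def make_deferred_listener(host="127.0.0.1", port=0, defer_seconds=20):
    """Return a listening TCP socket with TCP_DEFER_ACCEPT set (Linux only).

    defer_seconds is the option argument, documented in tcp(7) as seconds.
    The name and defaults are hypothetical, chosen for illustration.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # With this option set, accept() should only return a connection once
    # the client has sent data, not right after the handshake completes.
    srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_DEFER_ACCEPT, defer_seconds)
    srv.bind((host, port))
    srv.listen(128)
    return srv


if __name__ == "__main__":
    listener = make_deferred_listener()
    readback = listener.getsockopt(socket.IPPROTO_TCP, socket.TCP_DEFER_ACCEPT)
    print("listening on", listener.getsockname(), "defer readback:", readback)
    listener.close()
```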

A typical tcpdump of what happens with a TCP_DEFER_ACCEPT timeout of 20
seconds:

[1] 08:52:47.480291 IP client.60930 > server.http: S 2498995442:2498995442(0) win 5840 <mss 1460,sackOK,timestamp 2685904595 0,nop,wscale 2>
[2] 08:52:47.480302 IP server.http > client.60930: S 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
[3] 08:52:47.481669 IP client.60930 > server.http: . ack 1 win 5840

[4] 08:52:50.757543 IP server.http > client.60930: S 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
[5] 08:52:50.758953 IP client.60930 > server.http: . ack 1 win 5840

[6] 08:52:56.760611 IP server.http > client.60930: S 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
[7] 08:52:56.761886 IP client.60930 > server.http: . ack 1 win 5840

[8] 08:53:08.771254 IP server.http > client.60930: S 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
[9] 08:53:08.772514 IP client.60930 > server.http: . ack 1 win 5840

[10] 08:53:32.782488 IP server.http > client.60930: S 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
[11] 08:53:32.783754 IP client.60930 > server.http: . ack 1 win 5840

<a very long time, then client finally sends 2 bytes>

[12] 08:59:30.509097 IP client.60930 > server.http: P 1:3(2) ack 1 win 5840
[13] 08:59:30.509125 IP server.http > client.60930: R 1173302645:1173302645(0) win 0


So TCP_DEFER_ACCEPT might send way more packets than needed. Packets 4, 6, 8
and 10 (and their corresponding ACKs 5, 7, 9 and 11) seem unnecessary, since
packets 1, 2 and 3 have already completed a normal TCP three-way handshake.
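
The timing of those extra SYN-ACKs can be read straight off the trace. The
small sketch below only does arithmetic on the timestamps of packets 2, 4, 6,
8 and 10 as printed above; the function names are made up for illustration.
The gaps roughly double each time, which looks like ordinary exponential
backoff of the SYN-ACK retransmit timer, and the last retransmit goes out
well after the 20 second value that was configured.

```python
from datetime import datetime

# SYN-ACK send times copied from packets 2, 4, 6, 8 and 10 of the trace.
SYNACK_TIMES = [
    "08:52:47.480302",
    "08:52:50.757543",
    "08:52:56.760611",
    "08:53:08.771254",
    "08:53:32.782488",
]


def _parse(stamps):
    """Parse tcpdump-style HH:MM:SS.ffffff timestamps (same day assumed)."""
    return [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]


def gaps(stamps):
    """Seconds between each consecutive pair of timestamps."""
    parsed = _parse(stamps)
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]


def offsets_from_first(stamps):
    """Seconds between the first timestamp and each later one."""
    parsed = _parse(stamps)
    return [(t - parsed[0]).total_seconds() for t in parsed[1:]]


if __name__ == "__main__":
    print("gaps:   ", [round(g) for g in gaps(SYNACK_TIMES)])
    print("offsets:", [round(o) for o in offsets_from_first(SYNACK_TIMES)])
```

Rounded to whole seconds, the gaps come out as 3, 6, 12 and 24, and the
retransmits fall about 3, 9, 21 and 45 seconds after the first SYN-ACK.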

We should only have to wait for data from the client before passing the new
socket to the listening application.

* Re: TCP_DEFER_ACCEPT issues
From: Felix von Leitner @ 2007-11-02 22:19 UTC
  To: Eric Dumazet; +Cc: linux-kernel, Linux Netdev List

Thus spake Eric Dumazet (dada1@cosmosbay.com):
> 1) Setting a timeout in a millisecond range (< 1000) is not very good 
> because some clients may need much more time to send your server the data 
> (very long distance). So a second granularity is OK.

I want millisecond accuracy for consistency.  select and poll have it and
we have a 1000 Hz timer, so we should also expose that accuracy here.  I
don't want to have sub-second timeouts, in case you were wondering.

> 2) After timeout is elapsed, the server tcp stack has no socket associated 
> to your client attempt. So closing the server listening socket won't be able 
> to send RST. I agree a RST *should* be sent by the server once the timeout 
> is triggered.

I don't see any evidence of a timeout happening at all.
I passed 1 as the argument to setsockopt, so I'd expect a timeout to
happen pretty quickly.  There was no connection reset until I Ctrl-C'd
the server 15 minutes (!) later.

> A typical tcpdump of what is happening for a tcp_defer_accept timeout of 20 
> seconds is :

> [1]08:52:47.480291 IP client.60930 > server.http: S 
> 2498995442:2498995442(0) win 5840 <mss 1460,sackOK,timestamp 2685904595 
> 0,nop,wscale 2>
> [2]08:52:47.480302 IP server.http > client.60930: S 
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [3]08:52:47.481669 IP client.60930 > server.http: . ack 1 win 5840

> [4]08:52:50.757543 IP server.http > client.60930: S 
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [5]08:52:50.758953 IP client.60930 > server.http: . ack 1 win 5840

> [6]08:52:56.760611 IP server.http > client.60930: S 
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [7]08:52:56.761886 IP client.60930 > server.http: . ack 1 win 5840

> [8]08:53:08.771254 IP server.http > client.60930: S 
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [9]08:53:08.772514 IP client.60930 > server.http: . ack 1 win 5840

> [10]08:53:32.782488 IP server.http > client.60930: S 
> 1173302644:1173302644(0) ack 2498995443 win 5840 <mss 1460>
> [11]08:53:32.783754 IP client.60930 > server.http: . ack 1 win 5840

> <a very long time, then client finally sends 2 bytes>

> [12]08:59:30.509097 IP client.60930 > server.http: P 1:3(2) ack 1 win 5840
> [13]08:59:30.509125 IP server.http > client.60930: R 
> 1173302645:1173302645(0) win 0

I see this, too.  If I connect and do not send anything, I expect the
kernel to drop the connection when the timeout is reached.  Nothing like
that happens.

> So TCP_DEFER_ACCEPT might send way more packets than needed.

Only in the face of attackers, and after the handshake.  I could live
with that, if the timeout actually happened.

> We only should wait for the data coming from the client to be able to pass 
> the new socket to the listening application.

Yes.  And we should send a RST if no data is coming in within the
timeout, which is not happening for me (2.6.23).
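
Without TCP_DEFER_ACCEPT, the policy asked for here (reset the connection if
no data shows up in time) can be approximated in user space after accept().
The following is only a rough sketch of that idea and not the kernel
behaviour under discussion: read_or_reset, the 15 second default and the
4096 byte read size are illustrative assumptions. It relies on SO_LINGER
with a zero linger time, which makes close() abort a TCP connection with a
RST instead of a FIN; the two-int struct layout assumes Linux.

```python
import select
import socket
import struct


def read_or_reset(conn, timeout_s=15.0):
    """Wait up to timeout_s for the first data on conn.

    Returns the bytes read, or None after aborting the connection because
    nothing arrived in time. Name and defaults are hypothetical.
    """
    readable, _, _ = select.select([conn], [], [], timeout_s)
    if readable:
        return conn.recv(4096)
    # l_onoff=1, l_linger=0: on a TCP socket close() then sends a RST.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    conn.close()
    return None


if __name__ == "__main__":
    # Local demonstration with a connected pair instead of a real client.
    ours, theirs = socket.socketpair()
    theirs.sendall(b"GET / HTTP/1.0\r\n\r\n")
    print(read_or_reset(ours, timeout_s=1.0))
    ours.close()
    theirs.close()
```

The cost is that every idle connection still reaches the application and
ties up a descriptor until the timer fires, which is the overhead
TCP_DEFER_ACCEPT was supposed to avoid.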

Felix


* Re: TCP_DEFER_ACCEPT issues
From: dean gaudet @ 2007-11-04 17:18 UTC
  To: Felix von Leitner; +Cc: Eric Dumazet, linux-kernel, Linux Netdev List

fwiw i also brought the TCP_DEFER_ACCEPT problems up at the end of last year:

http://www.mail-archive.com/netdev@vger.kernel.org/msg28916.html

it's possible the final message in that thread is how we should define the 
behaviour; i haven't tried the TCP_SYNCNT idea, though.
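
The referenced thread is not reproduced here, so the sketch below only shows
how the TCP_SYNCNT socket option itself can be set on a listening socket
next to TCP_DEFER_ACCEPT. Whether that actually bounds the SYN-ACK
retransmissions for deferred connections is exactly the untested idea, and
this sketch does not demonstrate it. It assumes Linux; the helper name and
the values 3 and 20 are illustrative only.

```python
import socket


def listener_with_syncnt(syn_count=3, defer_seconds=20):
    """Listening socket with TCP_SYNCNT and TCP_DEFER_ACCEPT set (Linux).

    Hypothetical helper: it only sets the options, it makes no claim about
    how the kernel combines them.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_SYNCNT, syn_count)
    srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_DEFER_ACCEPT, defer_seconds)
    srv.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    srv.listen(16)
    return srv


if __name__ == "__main__":
    s = listener_with_syncnt()
    print("TCP_SYNCNT:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_SYNCNT))
    s.close()
```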

-dean
