* [PATCH] tcp: fix premature termination of FIN_WAIT2 time-wait sockets
@ 2009-08-15 0:39 Octavian Purdila
2009-08-24 20:47 ` Octavian Purdila
0 siblings, 1 reply; 3+ messages in thread
From: Octavian Purdila @ 2009-08-15 0:39 UTC (permalink / raw)
To: netdev
[-- Attachment #1: Type: text/plain, Size: 2262 bytes --]
NOTE: this issue has been found, fixed and tested on an ancient 2.6.7 kernel.
This patch is a blind port of that fix, since unfortunately there is no easy
way for me to reproduce the original issue with a newer kernel. But the issue
still seems to be there.
tavi
---
There is a race condition in the time-wait sockets code that can lead
to premature termination of FIN_WAIT2 and, subsequently, to RST
generation when the FIN,ACK from the peer finally arrives:
Time TCP header
0.000000 30755 > http [SYN] Seq=0 Win=2920 Len=0 MSS=1460 TSV=282912 TSER=0
0.000008 http > 30755 [SYN, ACK] Seq=0 Ack=1 Win=2896 Len=0 MSS=1460 TSV=...
0.136899 HEAD /1b.html?n1Lg=v1 HTTP/1.0 [Packet size limited during capture]
0.136934 HTTP/1.0 200 OK [Packet size limited during capture]
0.136945 http > 30755 [FIN, ACK] Seq=187 Ack=207 Win=2690 Len=0 TSV=270521...
0.136974 30755 > http [ACK] Seq=207 Ack=187 Win=2734 Len=0 TSV=283049 TSER=...
0.177983 30755 > http [ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283089 TSER=...
0.238618 30755 > http [FIN, ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283151...
0.238625 http > 30755 [RST] Seq=188 Win=0 Len=0
Say twdr->slot = 1 and we are running inet_twdr_hangman and in this
instance inet_twdr_do_twkill_work returns 1. At that point we will
mark slot 1 and schedule inet_twdr_twkill_work. We will also make
twdr->slot = 2.
Next, a connection is closed and tcp_time_wait(TCP_FIN_WAIT2, timeo)
is called which will create a new FIN_WAIT2 time-wait socket and will
place it in the last to be reached slot, i.e. twdr->slot = 1.
At this point say inet_twdr_twkill_work will run which will start
destroying the time-wait sockets in slot 1, including the just added
TCP_FIN_WAIT2 one.
To avoid this issue we increment the slot only if all entries in the
slot have been purged.
This change may delay slot cleanup by one time-wait death row period,
but only if the worker thread did not have time to run and purge the
current slot within the next period (6 seconds with default sysctl
settings). However, on such a busy system we would probably see delays
even without this change...
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
net/ipv4/inet_timewait_sock.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
[-- Attachment #2: b36bc8257b528fc8ce5d6e1eb988459f5c2be10d.diff --]
[-- Type: text/x-patch, Size: 551 bytes --]
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index 61283f9..13f0781 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -218,8 +218,8 @@ void inet_twdr_hangman(unsigned long data)
 		/* We purged the entire slot, anything left? */
 		if (twdr->tw_count)
 			need_timer = 1;
+		twdr->slot = ((twdr->slot + 1) & (INET_TWDR_TWKILL_SLOTS - 1));
 	}
-	twdr->slot = ((twdr->slot + 1) & (INET_TWDR_TWKILL_SLOTS - 1));
 	if (need_timer)
 		mod_timer(&twdr->tw_timer, jiffies + twdr->period);
 out:
* Re: [PATCH] tcp: fix premature termination of FIN_WAIT2 time-wait sockets
2009-08-15 0:39 [PATCH] tcp: fix premature termination of FIN_WAIT2 time-wait sockets Octavian Purdila
@ 2009-08-24 20:47 ` Octavian Purdila
2009-08-29 7:00 ` David Miller
0 siblings, 1 reply; 3+ messages in thread
From: Octavian Purdila @ 2009-08-24 20:47 UTC (permalink / raw)
To: netdev
[-- Attachment #1: Type: Text/Plain, Size: 556 bytes --]
On Saturday 15 August 2009 03:39:12 Octavian Purdila wrote:
> NOTE: this issue has been found, fixed and tested on an ancient 2.6.7
> kernel. This patch is a blind port of that fix, since unfortunately there
> is no easy way for me to reproduce the original issue with a newer kernel.
> But the issue still seems to be there.
Update: I was able to reproduce the issue on a 2.6.30 debian kernel with the
attached test. It took me about 10 runs of 2-5 mins each to reproduce it
(multiple runs to keep the capture file reasonable in terms of size).
tavi
[-- Attachment #2: finwait2.c --]
[-- Type: text/x-csrc, Size: 3384 bytes --]
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>
#include <error.h>
#include <argp.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/epoll.h>
#include <sys/fcntl.h>
#include <sys/wait.h>
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <net/if.h>
#include <signal.h>
#include <sys/socket.h>
#include <netpacket/packet.h>
#include <assert.h>
#include <netinet/tcp.h>
#include <linux/unistd.h>
#include <sys/resource.h>	/* setpriority(), PRIO_PROCESS */
#include <stdint.h>		/* uint32_t */
static char doc[] = "";
static char args_doc[] = "";

static struct argp_option options[] = {
	{"server", 's', "host", 0, "server address"},
	{"port", 'p', "int", 0, "port to connect"},
	{"connections", 'c', "int", 0, "how many connections to open"},
	{0},
};

struct cl_args {
	unsigned short port, connections;
	uint32_t server;
};

struct cl_args cla = {
	.port = 5555,
	.connections = 2500,
};
static error_t parse_opt(int key, char *arg, struct argp_state *state)
{
	struct cl_args *cla = state->input;

	switch (key) {
	case 's':
	{
		struct hostent *hostinfo = gethostbyname(arg);

		if (!hostinfo) {
			fprintf(stderr, "unknown host %s\n", arg);
			return -1;
		}
		/* stored in host byte order; converted back in run_client() */
		cla->server = ntohl(((struct in_addr *)hostinfo->h_addr)->s_addr);
		break;
	}
	case 'p':
	{
		int port = atoi(arg);

		if (port <= 0 || port > 65535) {
			/* report the rejected value, not the current default */
			fprintf(stderr, "invalid port: %d\n", port);
			return -1;
		}
		cla->port = port;
		break;
	}
	case 'c':
	{
		cla->connections = atoi(arg);
		break;
	}
	case ARGP_KEY_ARG:
		break;
	default:
		return ARGP_ERR_UNKNOWN;
	}
	return 0;
}

static struct argp argp = { options, parse_opt, args_doc, doc };
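/* Client mode is selected by passing a server address. */
static inline int is_client(void)
{
	if (cla.server != INADDR_ANY)
		return 1;
	return 0;
}

/* Exit with a diagnostic if the condition x is true. */
#define FAIL(x) \
	do { \
		if (x) { \
			fprintf(stderr, "%s:%d: error: %s\n", __func__, __LINE__, \
				strerror(errno)); \
			exit(1); \
		} \
	} while (0)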
static void run_client(void)
{
	int i, socks[cla.connections];
	struct sockaddr_in name = {
		.sin_port = htons(cla.port),
		.sin_addr = { htonl(cla.server) },
		.sin_family = AF_INET
	};

	for (i = 0; i < cla.connections; i++) {
		FAIL((socks[i] = socket(PF_INET, SOCK_STREAM, 0)) < 0);
		FAIL(connect(socks[i], (struct sockaddr *)&name, sizeof(name)) < 0);
	}

	/* wait for the server to close */
	sleep(1);

	for (i = 0; i < cla.connections; i++)
		close(socks[i]);
}

static void run_server(int lsock)
{
	int i, socks[cla.connections];

	for (i = 0; i < cla.connections; i++)
		FAIL((socks[i] = accept(lsock, NULL, NULL)) < 0);

	/* close first, so the server side ends up in FIN_WAIT2 */
	for (i = 0; i < cla.connections; i++)
		close(socks[i]);

	sleep(1);
}
int main(int argc, char **argv)
{
	int lsock;

	if (argp_parse(&argp, argc, argv, 0, 0, &cla) < 0)
		return -1;

	FAIL(system("echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse") != 0);
	setpriority(PRIO_PROCESS, getpid(), -20);

	while (1) {
		static int count;

		if (is_client())
			run_client();
		else {
			/* create the listening socket on the first iteration only */
			if (!count) {
				struct sockaddr_in name = {
					.sin_port = htons(cla.port),
					.sin_family = AF_INET
				};

				FAIL((lsock = socket(PF_INET, SOCK_STREAM, 0)) < 0);
				FAIL(bind(lsock, (struct sockaddr *)&name, sizeof(name)) < 0);
				FAIL(listen(lsock, cla.connections) < 0);
			}
			run_server(lsock);
		}
		printf("iteration %d\n", ++count);
	}
	return 0;
}
* Re: [PATCH] tcp: fix premature termination of FIN_WAIT2 time-wait sockets
2009-08-24 20:47 ` Octavian Purdila
@ 2009-08-29 7:00 ` David Miller
0 siblings, 0 replies; 3+ messages in thread
From: David Miller @ 2009-08-29 7:00 UTC (permalink / raw)
To: opurdila; +Cc: netdev
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Mon, 24 Aug 2009 23:47:05 +0300
> On Saturday 15 August 2009 03:39:12 Octavian Purdila wrote:
>
>> NOTE: this issue has been found, fixed and tested on an ancient 2.6.7
>> kernel. This patch is a blind port of that fix, since unfortunately there
>> is no easy way for me to reproduce the original issue with a newer kernel.
>> But the issue still seems to be there.
>
> Update: I was able to reproduce the issue on a 2.6.30 debian kernel with the
> attached test. It took me about 10 runs of 2-5 mins each to reproduce it
> (multiple runs to keep the capture file reasonable in terms of size).
Thanks a lot for fixing this bug.
I've applied your patch to net-next-2.6