* [PATCH] tcp: fix premature termination of FIN_WAIT2 time-wait sockets
From: Octavian Purdila @ 2009-08-15  0:39 UTC
To: netdev

NOTE: this issue has been found, fixed and tested on an ancient 2.6.7
kernel. This patch is a blind port of that fix, since unfortunately there
is no easy way for me to reproduce the original issue with a newer kernel.
But the issue still seems to be there.

tavi

---

There is a race condition in the time-wait sockets code that can lead
to premature termination of FIN_WAIT2 and, subsequently, to RST
generation when the FIN,ACK from the peer finally arrives:

Time     TCP header
0.000000 30755 > http [SYN] Seq=0 Win=2920 Len=0 MSS=1460 TSV=282912 TSER=0
0.000008 http > 30755 [SYN, ACK] Seq=0 Ack=1 Win=2896 Len=0 MSS=1460 TSV=...
0.136899 HEAD /1b.html?n1Lg=v1 HTTP/1.0 [Packet size limited during capture]
0.136934 HTTP/1.0 200 OK [Packet size limited during capture]
0.136945 http > 30755 [FIN, ACK] Seq=187 Ack=207 Win=2690 Len=0 TSV=270521...
0.136974 30755 > http [ACK] Seq=207 Ack=187 Win=2734 Len=0 TSV=283049 TSER=...
0.177983 30755 > http [ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283089 TSER=...
0.238618 30755 > http [FIN, ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283151...
0.238625 http > 30755 [RST] Seq=188 Win=0 Len=0

Say twdr->slot = 1 and we are running inet_twdr_hangman, and in this
instance inet_twdr_do_twkill_work returns 1. At that point we will
mark slot 1 and schedule inet_twdr_twkill_work. We will also make
twdr->slot = 2.

Next, a connection is closed and tcp_time_wait(TCP_FIN_WAIT2, timeo)
is called, which will create a new FIN_WAIT2 time-wait socket and will
place it in the last-to-be-reached slot, i.e. slot 1 (the slot just
before the new twdr->slot = 2).

At this point, say inet_twdr_twkill_work runs. It will start
destroying the time-wait sockets in slot 1, including the just added
TCP_FIN_WAIT2 one.

To avoid this issue we increment the slot only if all entries in the
slot have been purged.

This change may delay the slot cleanup by one time-wait death row
period, but only if the worker thread did not have time to run and
purge the current slot before the next period (6 seconds with default
sysctl settings). However, on such a busy system even without this
change we would probably see delays...

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
 net/ipv4/inet_timewait_sock.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

[-- Attachment #2: b36bc8257b528fc8ce5d6e1eb988459f5c2be10d.diff --]
[-- Type: text/x-patch --]

diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index 61283f9..13f0781 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -218,8 +218,8 @@ void inet_twdr_hangman(unsigned long data)
 		/* We purged the entire slot, anything left? */
 		if (twdr->tw_count)
 			need_timer = 1;
+		twdr->slot = ((twdr->slot + 1) & (INET_TWDR_TWKILL_SLOTS - 1));
 	}
-	twdr->slot = ((twdr->slot + 1) & (INET_TWDR_TWKILL_SLOTS - 1));
 	if (need_timer)
 		mod_timer(&twdr->tw_timer, jiffies + twdr->period);
 out:

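For readers who want to step through the race described in the patch
message, here is a minimal user-space sketch. It is a toy model, not
kernel code: every toy_* name is hypothetical, the slot count of 8 is
an assumption, and placing a new socket at (slot + TOY_SLOTS - 1) is
only my reading of "the last to be reached slot" from the description.
The `deferred` flag stands in for inet_twdr_do_twkill_work returning 1.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SLOTS 8 /* assumed power of two, so "& (TOY_SLOTS - 1)" wraps */

/* Toy stand-in for the death row; not the kernel structure. */
struct toy_death_row {
	unsigned int slot;         /* next slot the timer will process */
	unsigned int pending_mask; /* slots handed over to the worker */
	int count[TOY_SLOTS];      /* time-wait sockets queued per slot */
};

/*
 * One timer tick. deferred == true models the case where the slot could
 * not be fully purged from the timer and the rest is left to the worker.
 * patched == true applies the rule from the patch: only advance the slot
 * once the current slot has been purged.
 */
static void toy_hangman(struct toy_death_row *dr, bool deferred, bool patched)
{
	if (deferred) {
		dr->pending_mask |= 1u << dr->slot;
		if (patched)
			return; /* stay on this slot until it is purged */
	} else {
		dr->count[dr->slot] = 0; /* purged directly from the timer */
	}
	dr->slot = (dr->slot + 1) & (TOY_SLOTS - 1);
}

/* Queue a socket with the maximum timeout: the last slot to be reached. */
static unsigned int toy_add_socket(struct toy_death_row *dr)
{
	unsigned int s = (dr->slot + TOY_SLOTS - 1) & (TOY_SLOTS - 1);

	assert(s < TOY_SLOTS);
	dr->count[s]++;
	return s;
}

/* Deferred worker: destroys everything in the slots marked as pending. */
static int toy_worker(struct toy_death_row *dr)
{
	int killed = 0;

	for (unsigned int i = 0; i < TOY_SLOTS; i++) {
		if (dr->pending_mask & (1u << i)) {
			killed += dr->count[i];
			dr->count[i] = 0;
		}
	}
	dr->pending_mask = 0;
	return killed;
}

/*
 * Replay the scenario from the description. Returns 1 if the freshly
 * added FIN_WAIT2 socket is destroyed by the worker right away.
 */
int toy_fresh_socket_killed(bool patched)
{
	struct toy_death_row dr = { .slot = 1 };
	unsigned int s;
	int fresh_in_pending_slot;

	dr.count[1] = 100;               /* old sockets waiting in slot 1 */
	toy_hangman(&dr, true, patched); /* timer defers slot 1 to worker */
	s = toy_add_socket(&dr);         /* connection closes meanwhile */
	fresh_in_pending_slot = (dr.pending_mask >> s) & 1;
	toy_worker(&dr);                 /* worker purges pending slots */
	return fresh_in_pending_slot && dr.count[s] == 0;
}
```

In this model toy_fresh_socket_killed(false) returns 1 (the new socket
lands in slot 1, which the worker is about to purge), while
toy_fresh_socket_killed(true) returns 0 (the slot pointer stays at 1,
so the new socket lands in slot 0 and keeps its full timeout).
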
* Re: [PATCH] tcp: fix premature termination of FIN_WAIT2 time-wait sockets
From: Octavian Purdila @ 2009-08-24 20:47 UTC
To: netdev

On Saturday 15 August 2009 03:39:12 Octavian Purdila wrote:

> NOTE: this issue has been found, fixed and tested on an ancient 2.6.7
> kernel. This patch is a blind port of that fix, since unfortunately there
> is no easy way for me to reproduce the original issue with a newer kernel.
> But the issue still seems to be there.

Update: I was able to reproduce the issue on a 2.6.30 debian kernel with the
attached test. It took me about 10 runs of 2-5 mins each to reproduce it
(multiple runs to keep the capture file reasonable in terms of size).

tavi

[-- Attachment #2: finwait2.c --]
[-- Type: text/x-csrc --]

#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sched.h>
#include <stdlib.h>
#include <stdint.h>		/* uint32_t */
#include <string.h>
#include <error.h>
#include <argp.h>
#include <sys/socket.h>
#include <sys/resource.h>	/* setpriority() */
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/epoll.h>
#include <sys/fcntl.h>
#include <sys/wait.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <netdb.h>
#include <net/if.h>
#include <signal.h>
#include <netpacket/packet.h>
#include <assert.h>
#include <netinet/tcp.h>
#include <linux/unistd.h>

static char doc[] = "";
static char args_doc[] = "";

static struct argp_option options[] = {
	{"server", 's', "host", 0, "server address"},
	{"port", 'p', "int", 0, "port to connect"},
	{"connections", 'c', "int", 0, "how many connections to open"},
	{0},
};

struct cl_args {
	unsigned short port, connections;
	uint32_t server;	/* host byte order; INADDR_ANY means server mode */
};

struct cl_args cla = {
	.port=5555,
	.connections=2500,
};

static error_t parse_opt(int key, char *arg, struct argp_state *state)
{
	struct cl_args *cla = state->input;

	switch (key) {
	case 's':
	{
		struct hostent *hostinfo =gethostbyname(arg);
		if (!hostinfo) {
			fprintf(stderr, "unknown host %s\n", arg);
			return -1;
		}
		cla->server=ntohl(((struct in_addr*)hostinfo->h_addr)->s_addr);
		break;
	}
	case 'p':
	{
		int port = atoi(arg);
		if (port <=0 || port > 65535) {
			/* report the rejected value, not the current default */
			fprintf(stderr, "invalid port: %d\n", port);
			return -1;
		}
		cla->port=port;
		break;
	}
	case 'c':
	{
		cla->connections=atoi(arg);
		break;
	}
	case ARGP_KEY_ARG:
		break;
	default:
		return ARGP_ERR_UNKNOWN;
	}

	return 0;
}

static struct argp argp = { options, parse_opt, args_doc, doc };

/* Client mode is selected by passing a server address. */
static inline int is_client(void)
{
	if (cla.server != INADDR_ANY)
		return 1;
	return 0;
}

#define FAIL(x) \
	do { \
		if (x) { \
			fprintf(stderr, "%s:%d: error: %s\n", __func__, __LINE__, \
				strerror(errno)); \
			exit(1); \
		} \
	} while (0)

static void run_client(void)
{
	int i, socks[cla.connections];
	struct sockaddr_in name = {
		.sin_port = htons (cla.port),
		.sin_addr = { htonl(cla.server) },
		.sin_family = AF_INET
	};

	for(i=0; i < cla.connections; i++) {
		FAIL((socks[i] = socket(PF_INET, SOCK_STREAM, 0)) < 0);
		FAIL(connect(socks[i], (struct sockaddr*)&name, sizeof(name)) < 0);
	}

	/* wait for the server to close first */
	sleep(1);

	for(i=0; i < cla.connections; i++)
		close(socks[i]);
}

static void run_server(int lsock)
{
	int i, socks[cla.connections];

	for(i = 0; i < cla.connections; i++)
		FAIL((socks[i] = accept(lsock, NULL, NULL)) < 0);

	/* the server closes first, so its sockets go through FIN_WAIT2 */
	for(i = 0; i < cla.connections; i++)
		close(socks[i]);

	sleep(1);
}

int main(int argc, char **argv)
{
	int lsock;

	if (argp_parse(&argp, argc, argv, 0, 0, &cla) < 0)
		return -1;

	FAIL(system("echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse") != 0);

	setpriority(PRIO_PROCESS, getpid(), -20);

	while (1) {
		static int count;

		if (is_client())
			run_client();
		else {
			/* create the listening socket on the first iteration only */
			if (!count) {
				struct sockaddr_in name = {
					.sin_port = htons (cla.port),
					.sin_family = AF_INET
				};

				FAIL((lsock = socket(PF_INET, SOCK_STREAM, 0)) < 0);
				FAIL(bind(lsock, (struct sockaddr *) &name, sizeof (name)) < 0);
				FAIL(listen(lsock, cla.connections) < 0);
			}
			run_server(lsock);
		}
		printf("iteration %d\n", ++count);
	}

	return 0;
}

* Re: [PATCH] tcp: fix premature termination of FIN_WAIT2 time-wait sockets
From: David Miller @ 2009-08-29  7:00 UTC
To: opurdila; +Cc: netdev

From: Octavian Purdila <opurdila@ixiacom.com>
Date: Mon, 24 Aug 2009 23:47:05 +0300

> On Saturday 15 August 2009 03:39:12 Octavian Purdila wrote:
>
>> NOTE: this issue has been found, fixed and tested on an ancient 2.6.7
>> kernel. This patch is a blind port of that fix, since unfortunately there
>> is no easy way for me to reproduce the original issue with a newer kernel.
>> But the issue still seems to be there.
>
> Update: I was able to reproduce the issue on a 2.6.30 debian kernel with the
> attached test. It took me about 10 runs of 2-5 mins each to reproduce it
> (multiple runs to keep the capture file reasonable in terms of size).

Thanks a lot for fixing this bug.

I've applied your patch to net-next-2.6