* tcp_tw_recycle broken?
From: Karl Pickett @ 2008-11-15 4:37 UTC
To: linux-kernel, netdev

Hey. Developing a http proxy on fedora 9 (2.6.25) and running into a
strange issue.

Having the proxy set up and tear down 6000 tcp connections a second to
the same test server ip and port,
it quickly blows up (5 seconds) due to all 30000 ephemeral ports going
to TIME_WAIT.
setting tw_recycle=1 fixed the problem, and there are never more than
a couple hundred ports in TIME_WAIT.

BUT...

Changing the load test to alternate between two test server ips, it
blows up. Connect: can't assign requested address. (note I am not
binding before hand, I tried
and binding first to port 0 made no difference - it just blows up then
during the bind).

And there are ~28K ports in TIME_WAIT. For example:

proxy_ip:30000 load_test_1:8080 TIME_WAIT
proxy_ip:30000 load_test_2:8080 TIME_WAIT
...
but most are not duplicates of the same local port.

What. The. Heck.

So short of rebuilding the kernel with time_wait as 1 second, is there
any other way not to brick my proxy?

--
Karl Pickett
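The five-second figure in the message above follows from simple arithmetic. A minimal sketch, assuming about 30000 usable ephemeral ports (the number quoted in the message), that the proxy is the side that closes first so TIME_WAIT lands on it, and the 60-second TIME_WAIT that stock Linux compiles in as TCP_TIMEWAIT_LEN. The function name is made up for illustration:

```python
def seconds_until_exhaustion(ports: int, conns_per_sec: float,
                             time_wait_sec: float = 60.0) -> float:
    """Time until every local port towards one ip:port sits in TIME_WAIT.

    Assumption: each connection parks its local port for time_wait_sec
    after closing, and nothing (tw_reuse, tw_recycle) frees it earlier.
    """
    # If ports come back at least as fast as they are consumed,
    # the pool never runs dry.
    if conns_per_sec * time_wait_sec <= ports:
        return float("inf")
    # Otherwise the pool empties before the first port is released.
    return ports / conns_per_sec


print(seconds_until_exhaustion(30000, 6000))  # 5.0
```

At 6000 connections per second the pool is gone long before the first TIME_WAIT entry expires, which matches the observed blow-up after about 5 seconds.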
* Re: tcp_tw_recycle broken?
From: Willy Tarreau @ 2008-11-15 5:57 UTC
To: Karl Pickett; +Cc: linux-kernel, netdev

On Fri, Nov 14, 2008 at 11:37:06PM -0500, Karl Pickett wrote:
> Hey. Developing a http proxy on fedora 9 (2.6.25) and running into a
> strange issue.
>
> Having the proxy set up and tear down 6000 tcp connections a second to
> the same test server ip and port,
> it quickly blows up (5 seconds) due to all 30000 ephemeral ports going
> to TIME_WAIT.
> setting tw_recycle=1 fixed the problem, and there are never more than
> a couple hundred ports in TIME_WAIT.
>
> BUT...
>
> Changing the load test to alternate between two test server ips, it
> blows up. Connect: can't assign requested address. (note I am not
> binding before hand, I tried
> and binding first to port 0 made no difference - it just blows up then
> during the bind).
>
> And there are ~28K ports in TIME_WAIT. For example:
>
> proxy_ip:30000 load_test_1:8080 TIME_WAIT
> proxy_ip:30000 load_test_2:8080 TIME_WAIT
> ...
> but most are not duplicates of the same local port.
>
>
> What. The. Heck.
>
> So short of rebuilding the kernel with time_wait as 1 second, is there
> any other way not to brick my proxy?

two things :
  - set tcp_tw_reuse to 1 too.
  - do a setsockopt(SO_REUSEADDR) before connect()

Using this, my proxy has no problem at 35K sess/s on 2.6.25. I'm not sure
if disabling either option above still works.

Hoping this helps,
Willy
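The second suggestion above, setting SO_REUSEADDR on the outgoing socket before connect(), might look like the following. This is only a sketch: the helper name and the timeout are invented here, and the first suggestion is a system-wide setting (the net.ipv4.tcp_tw_reuse sysctl) that has to be changed separately as root.

```python
import socket


def connect_with_reuseaddr(host: str, port: int,
                           timeout: float = 5.0) -> socket.socket:
    """Open a TCP connection with SO_REUSEADDR set before connect().

    Hypothetical helper for illustration; the option must be set before
    connect() because that is when the local port gets chosen.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.settimeout(timeout)
        sock.connect((host, port))
    except OSError:
        # Do not leak the descriptor if the option or the connect fails.
        sock.close()
        raise
    return sock
```

Whether the socket option makes a difference on its own is questioned later in the thread, so treat it as something to measure rather than a guaranteed fix.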
* Re: tcp_tw_recycle broken?
From: Karl Pickett @ 2008-11-15 7:29 UTC
To: linux-kernel, netdev

On Sat, Nov 15, 2008 at 12:57 AM, Willy Tarreau <w@1wt.eu> wrote:
> On Fri, Nov 14, 2008 at 11:37:06PM -0500, Karl Pickett wrote:
>> Hey. Developing a http proxy on fedora 9 (2.6.25) and running into a
>> strange issue.
>>
>> Having the proxy set up and tear down 6000 tcp connections a second to
>> the same test server ip and port,
>> it quickly blows up (5 seconds) due to all 30000 ephemeral ports going
>> to TIME_WAIT.
>> setting tw_recycle=1 fixed the problem, and there are never more than
>> a couple hundred ports in TIME_WAIT.
>>
>> BUT...
>>
>> Changing the load test to alternate between two test server ips, it
>> blows up. Connect: can't assign requested address. (note I am not
>> binding before hand, I tried
>> and binding first to port 0 made no difference - it just blows up then
>> during the bind).
>>
>> And there are ~28K ports in TIME_WAIT. For example:
>>
>> proxy_ip:30000 load_test_1:8080 TIME_WAIT
>> proxy_ip:30000 load_test_2:8080 TIME_WAIT
>> ...
>> but most are not duplicates of the same local port.
>>
>>
>> What. The. Heck.
>>
>> So short of rebuilding the kernel with time_wait as 1 second, is there
>> any other way not to brick my proxy?
>
> two things :
>   - set tcp_tw_reuse to 1 too.
>   - do a setsockopt(SO_REUSEADDR) before connect()
>
> Using this, my proxy has no problem at 35K sess/s on 2.6.25. I'm not sure
> if disabling either option above still works.
>
> Hoping this helps,
> Willy
>
>

Thanks for the help.

Well, it looks like tw_reuse is what I wanted... not tw_recycle. Based
on a python test program over loopback, tw_reuse alone solves the
problem... so_reuseaddr doesn't do anything.

And apparently the tcp code is too much for me...looking at the source
I thought tw_reuse only can happen when timestamps are enabled. But
even after disabling timestamps tw_reuse still works over loopback.
I'll have to wait until Monday to try it again in the lab. I was
trying combinations of tw_reuse and recycle, too many to remember
apparently.

May I just confirm.. is tcp_tw_reuse NOT dependent on receiving timestamps?

--
Karl Pickett
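The test program mentioned above is not shown in the thread. One way to watch how many sockets sit in TIME_WAIT per destination while such a test runs is to parse /proc/net/tcp. A rough sketch, assuming IPv4 only and a little-endian host (the kernel prints the address as a hex word in host byte order); the function names are invented here:

```python
import socket
import struct
from collections import Counter

TCP_TIME_WAIT = "06"  # state code for TIME_WAIT in /proc/net/tcp


def _decode(addr: str) -> str:
    """Turn '0100007F:1F90' into '127.0.0.1:8080' (little-endian host assumed)."""
    ip_hex, port_hex = addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return f"{ip}:{int(port_hex, 16)}"


def count_time_wait(proc_net_tcp: str) -> Counter:
    """Count TIME_WAIT sockets per remote ip:port from /proc/net/tcp text."""
    counts = Counter()
    for line in proc_net_tcp.splitlines()[1:]:  # first line is the header
        fields = line.split()
        # fields: [0] slot, [1] local address, [2] remote address, [3] state
        if len(fields) >= 4 and fields[3] == TCP_TIME_WAIT:
            counts[_decode(fields[2])] += 1
    return counts
```

On a Linux box this could be called as `count_time_wait(open("/proc/net/tcp").read())` before and after toggling a sysctl. As the follow-up in this thread shows, old TIME_WAIT entries should be left to expire between runs, or the numbers will mislead.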
* Re: tcp_tw_recycle broken?
From: Andi Kleen @ 2008-11-15 13:09 UTC
To: Karl Pickett; +Cc: linux-kernel, netdev

"Karl Pickett" <karl.pickett@gmail.com> writes:
>
> May I just confirm.. is tcp_tw_reuse NOT dependent on receiving timestamps?

The big problem is that both are incompatible with NAT. So if you
ever talk to any NATed clients don't use it.

-Andi

--
ak@linux.intel.com
* Re: tcp_tw_recycle broken?
From: Karl Pickett @ 2008-11-15 15:47 UTC
To: Andi Kleen; +Cc: linux-kernel, netdev

On Sat, Nov 15, 2008 at 8:09 AM, Andi Kleen <andi@firstfloor.org> wrote:
> "Karl Pickett" <karl.pickett@gmail.com> writes:
>>
>> May I just confirm.. is tcp_tw_reuse NOT dependent on receiving timestamps?
>
> The big problem is that both are incompatible with NAT. So if you
> ever talk to any NATed clients don't use it.
>
> -Andi
>
> --
> ak@linux.intel.com
>

Hmph. Running the test again - after getting a little sleep -
timestamps do indeed determine if tw_reuse/recycle work. I must not
have let all the tw buckets expire before changing my timestamp
settings last night.

Since
A. I don't want to rely on arbitrary web servers having timestamps
B. People say it breaks NAT for clients, and the settings are global only,

I will just set TCP_TIMEWAIT_LEN to 10 seconds and call it a day.

Sure would be nice if it was a tunable, so only the most heavily
loaded customers could set it...

--
Karl Pickett
* Re: tcp_tw_recycle broken?
From: Willy Tarreau @ 2008-11-15 15:52 UTC
To: Karl Pickett; +Cc: Andi Kleen, linux-kernel, netdev

On Sat, Nov 15, 2008 at 10:47:10AM -0500, Karl Pickett wrote:
> On Sat, Nov 15, 2008 at 8:09 AM, Andi Kleen <andi@firstfloor.org> wrote:
> > "Karl Pickett" <karl.pickett@gmail.com> writes:
> >>
> >> May I just confirm.. is tcp_tw_reuse NOT dependent on receiving timestamps?
> >
> > The big problem is that both are incompatible with NAT. So if you
> > ever talk to any NATed clients don't use it.
> >
> > -Andi
> >
> > --
> > ak@linux.intel.com
> >
>
>
> Hmph. Running the test again - after getting a little sleep -
> timestamps do indeed determine if tw_reuse/recycle work. I must not
> have let all the tw buckets expire before changing my timestamp
> settings last night.
>
> Since
> A. I don't want to rely on arbitrary web servers having timestamps
> B. People say it breaks NAT for clients, and the settings are global only,
>
> I will just set TCP_TIMEWAIT_LEN to 10 seconds and call it a day.

you should increase it a bit. I've encountered occasional issues at 15s,
but none at 20s.

> Sure would be nice if it was a tunable, so only the most heavily
> loaded customers could set it...

Indeed. other OSes (eg Solaris) ship with standard values and let us
adjust them.

Willy
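For the TIME_WAIT lengths discussed above, a back-of-the-envelope estimate of the connection rate that the port pool can sustain towards a single destination ip:port. This is only a sketch: it assumes the ~28K usable ports seen earlier in the thread, that nothing frees a port before TIME_WAIT expires, and it says nothing about the occasional issues reported at 15s, which concern correctness rather than port supply. The function name is invented:

```python
def max_sustainable_rate(ports: int, time_wait_sec: float) -> float:
    """Connections per second one destination can absorb indefinitely.

    In steady state about rate * time_wait_sec ports are parked in
    TIME_WAIT, so the rate is capped at ports / time_wait_sec.
    """
    return ports / time_wait_sec


for tw in (10, 15, 20, 60):
    rate = max_sustainable_rate(28000, tw)
    print(f"TIME_WAIT {tw:>2}s -> about {rate:.0f} conn/s per destination")
```

Under these assumptions, moving from 10s to 20s halves the sustainable rate per destination, which is the trade-off behind choosing the shortest value that still behaves well.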
[parent not found: <f9be06770811142325j79ca0831j7d5820716199811@mail.gmail.com>]
* Re: tcp_tw_recycle broken?
From: Willy Tarreau @ 2008-11-15 7:45 UTC
To: Karl Pickett; +Cc: linux-kernel, netdev

On Sat, Nov 15, 2008 at 02:25:52AM -0500, Karl Pickett wrote:
> On Sat, Nov 15, 2008 at 12:57 AM, Willy Tarreau <w@1wt.eu> wrote:
>
> > On Fri, Nov 14, 2008 at 11:37:06PM -0500, Karl Pickett wrote:
> > > Hey. Developing a http proxy on fedora 9 (2.6.25) and running into a
> > > strange issue.
> > >
> > > Having the proxy set up and tear down 6000 tcp connections a second to
> > > the same test server ip and port,
> > > it quickly blows up (5 seconds) due to all 30000 ephemeral ports going
> > > to TIME_WAIT.
> > > setting tw_recycle=1 fixed the problem, and there are never more than
> > > a couple hundred ports in TIME_WAIT.
> > >
> > > BUT...
> > >
> > > Changing the load test to alternate between two test server ips, it
> > > blows up. Connect: can't assign requested address. (note I am not
> > > binding before hand, I tried
> > > and binding first to port 0 made no difference - it just blows up then
> > > during the bind).
> > >
> > > And there are ~28K ports in TIME_WAIT. For example:
> > >
> > > proxy_ip:30000 load_test_1:8080 TIME_WAIT
> > > proxy_ip:30000 load_test_2:8080 TIME_WAIT
> > > ...
> > > but most are not duplicates of the same local port.
> > >
> > >
> > > What. The. Heck.
> > >
> > > So short of rebuilding the kernel with time_wait as 1 second, is there
> > > any other way not to brick my proxy?
> >
> > two things :
> >   - set tcp_tw_reuse to 1 too.
> >   - do a setsockopt(SO_REUSEADDR) before connect()
> >
> > Using this, my proxy has no problem at 35K sess/s on 2.6.25. I'm not sure
> > if disabling either option above still works.
> >
> > Hoping this helps,
> > Willy
> >
> >
> Well, it looks like tw_reuse is what I wanted... not tw_recycle. Based on a
> python test program over loopback, tw_reuse alone solves the problem...
> so_reuseaddr doesn't do anything. And apparently the tcp code is too much
> for me...looking at the source I thought tw_reuse only can happen when
> timestamps are enabled. But even after disabling timestamps tw_reuse still
> works over loopback.
>
> I'll have to wait until Monday to try it again in the lab.
>
> May I just confirm.. is tcp_tw_reuse NOT dependent on receiving timestamps?

I never observed any dependency between both, though the code tends to
make me think there is.

However, enabling timestamps is often needed when you're reusing TW
sockets, not because of your local system, but because of possible
intermediate systems between the client and the server, such as
firewalls which randomize sequence numbers. Not having tw_reuse
prevents ports from being reused too early. But having tw_reuse alone
often makes the client choose a source port for which a session still
exists on a middle host, with different sequence numbers, which causes
trouble. Enabling timestamps solves the problem when the other end
supports PAWS.

So in general, I would add as a rule of thumb that if you need
tw_reuse, you should also enable timestamps "just in case".

Willy
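The rule of thumb above (if tw_reuse is on, keep timestamps on as well) can be checked by reading the sysctl files. A minimal sketch, assuming the usual /proc/sys/net/ipv4 layout; the function names are invented, and tcp_tw_recycle may simply be absent because it was removed from later kernels (4.12 onwards), in which case it is reported as None:

```python
from pathlib import Path

SYSCTLS = ("tcp_tw_reuse", "tcp_tw_recycle", "tcp_timestamps")


def read_tcp_settings(base="/proc/sys/net/ipv4") -> dict:
    """Read the relevant sysctls; a missing or unreadable file maps to None."""
    settings = {}
    for name in SYSCTLS:
        try:
            settings[name] = int((Path(base) / name).read_text().split()[0])
        except (OSError, ValueError, IndexError):
            settings[name] = None
    return settings


def reuse_without_timestamps(settings: dict) -> bool:
    """True when tw_reuse is enabled but timestamps are explicitly off."""
    return bool(settings.get("tcp_tw_reuse")) and settings.get("tcp_timestamps") == 0


if __name__ == "__main__":
    current = read_tcp_settings()
    print(current)
    if reuse_without_timestamps(current):
        print("tcp_tw_reuse is on but tcp_timestamps is off")
```

The base directory is a parameter only so the check can be pointed at a copy of the files for testing; on a live system the default applies.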