From mboxrd@z Thu Jan 1 00:00:00 1970
From: Venkat Venkatsubra
Subject: Re: listen(2) backlog changes in or around Linux 3.1?
Date: Thu, 18 Oct 2012 11:53:35 -0500
Message-ID: <5080340F.3050207@oracle.com>
References: <507C4401.7050500@oracle.com> <5080279F.80008@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org
To: enh
Return-path:
Received: from acsinet15.oracle.com ([141.146.126.227]:38744 "EHLO acsinet15.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757101Ab2JRQxk (ORCPT ); Thu, 18 Oct 2012 12:53:40 -0400
In-Reply-To: <5080279F.80008@oracle.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

Correction. I don't see the client side receiving any abort/termination
notification. They all remain on ESTABLISHED state on the client side.
In tcpdump I don't see a FIN or RST coming from the server for the
aborted connections.

Venkat

On 10/18/2012 11:00 AM, Venkat Venkatsubra wrote:
> Hi Elliott,
>
> I see the same behavior with your test program.
> The connect() keeps succeeding even though accept() is not performed.
> It pauses after 4 connections for a while and then periodically keeps
> adding few (2 I think).
>
> But the server side end points are terminated too. You will see only
> the first 2 sessions on the server side.
> If you modify your test program to say read or poll the sockets you
> should get a termination notification on them I think.
>
> The behavior overall looks fine in my opinion. But it could be a
> change of behavior for your test program.
>
> Venkat
>
> On 10/16/2012 6:31 PM, enh wrote:
>> boiling things down to a short C++ program, i see that i can reproduce
>> the behavior even on 2.6 kernels. if i run this, i see 4 connections
>> immediately (3 + 1, as i'd expect)... but then about 10s later i see
>> another 2. and every few seconds after that, i see another 2.
>> i've let this run until i have hundreds of connect(2) calls that have
>> returned, despite my small listen(2) backlog and the fact that i'm not
>> accept(2)ing.
>>
>> so i guess the only thing that's changed with newer kernels is timing
>> (hell, since i only see newer kernels on newer hardware, it might just
>> be a hardware thing).
>>
>> and clearly i don't understand what the listen(2) backlog means any
>> more.
>>
>> // NOTE: the header names were lost from the archived message; the
>> // eight below are simply the ones this code needs in order to build.
>> #include <arpa/inet.h>
>> #include <errno.h>
>> #include <netinet/in.h>
>> #include <netinet/tcp.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <sys/socket.h>
>> #include <iostream>
>>
>> // Print the TCP_INFO counters for fd. On a listening socket Linux
>> // reports the current accept-queue length in tcpi_unacked and the
>> // configured maximum in tcpi_sacked.
>> void dump_ti(int fd) {
>>   tcp_info ti;
>>   socklen_t tcp_info_length = sizeof(tcp_info);
>>   // TCP_INFO is a TCP-level option, so query IPPROTO_TCP
>>   // (the message as posted passed SOL_IP here).
>>   int rc = getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &tcp_info_length);
>>   if (rc == -1) {
>>     std::cout << "getsockopt rc " << rc << ": " << strerror(errno) << "\n";
>>     return;
>>   }
>>
>>   std::cout << "ti.tcpi_unacked=" << ti.tcpi_unacked << "\n";
>>   std::cout << "ti.tcpi_sacked=" << ti.tcpi_sacked << "\n";
>> }
>>
>> // Open one more client connection and deliberately keep it open.
>> void connect_to(sockaddr_in& sa) {
>>   int s = socket(AF_INET, SOCK_STREAM, 0);
>>   if (s == -1) {
>>     abort();
>>   }
>>
>>   int rc = connect(s, (sockaddr*)&sa, sizeof(sockaddr_in));
>>   std::cout << "connect = " << rc << "\n";
>> }
>>
>> int main() {
>>   int ss = socket(AF_INET, SOCK_STREAM, 0);
>>   std::cout << "socket fd " << ss << "\n";
>>
>>   sockaddr_in sa;
>>   memset(&sa, 0, sizeof(sa));
>>   sa.sin_family = AF_INET;
>>   sa.sin_addr.s_addr = htonl(INADDR_ANY);
>>   sa.sin_port = htons(9877);
>>   int rc = bind(ss, (sockaddr*)&sa, sizeof(sa));
>>   std::cout << "bind rc " << rc << ": " << strerror(errno) << "\n";
>>   // sin_port is stored in network byte order; convert before printing.
>>   std::cout << "bind port " << ntohs(sa.sin_port) << "\n";
>>
>>   rc = listen(ss, 1);
>>   std::cout << "listen rc " << rc << ": " << strerror(errno) << "\n";
>>   dump_ti(ss);
>>
>>   // Keep connecting without ever calling accept(2).
>>   while (true) {
>>     connect_to(sa);
>>     dump_ti(ss);
>>   }
>>
>>   return 0;
>> }
>>
>>
>> On Mon, Oct 15, 2012 at 10:26 AM, enh wrote:
>>> On Mon, Oct 15, 2012 at 10:12 AM, Venkat Venkatsubra
>>> wrote:
>>>> On 10/12/2012 6:40 PM, enh wrote:
>>>>> i used to use the following hack to unit test connect timeouts: i'd
>>>>> call listen(2) on a socket and then deliberately connect (backlog + 3)
>>>>> sockets without accept(2)ing any of the connections. (why 3? because
>>>>> Stevens told me so, and experiment backed him up. see figure 4.10 in
>>>>> his UNIX Network Programming.)
>>>>>
>>>>> with "old" kernels, 2.6.35-ish to 3.0-ish, this worked great. my next
>>>>> connect(2) to the same loopback port would hang indefinitely. i could
>>>>> even unblock the connect by calling accept(2) in another thread. this
>>>>> was awesome for testing.
>>>>>
>>>>> in 3.1 on ARM, 3.2 on x86 (Ubuntu desktop), and 3.4 on ARM, this no
>>>>> longer works. it doesn't seem to be as simple as "the constant is no
>>>>> longer 3". my tests are now flaky. sometimes they work like they used
>>>>> to, and sometimes an extra connect(2) will succeed. (or, if i'm in
>>>>> non-blocking mode, my poll(2) will return with the non-blocking socket
>>>>> that's trying to connect now ready.)
>>>>>
>>>>> i'm guessing if this changed in 3.1 and is still changed in 3.4,
>>>>> whatever's changed wasn't an accident. but i haven't been able to find
>>>>> the right search terms to RTFM. i also finally got around to grepping
>>>>> the kernel for the "+ 3", but wasn't able to find that. (so i'd be
>>>>> interested to know where the old behavior came from too.)
>>>>>
>>>>> my least worst workaround at the moment is to use one of RFC5737's
>>>>> test networks, but that requires that the device have a network
>>>>> connection, otherwise my connect(2)s fail immediately with
>>>>> ENETUNREACH, which is no use to me. also, unlike my old trick, i've
>>>>> got no way to suddenly "unblock" a slow connect(2) (this is useful for
>>>>> unit testing the code that does the poll(2) part of the usual
>>>>> connect-with-timeout implementation).
>>>>> https://android-review.googlesource.com/#/c/44563/
>>>>>
>>>>> hopefully someone here can shed some light on this?
>>>>> ideally someone will have a workaround as good as my old trick. i
>>>>> realize i was relying on undocumented behavior, and i'm happy to have
>>>>> to check /proc/version and behave appropriately, but i'd really like a
>>>>> way to keep my unit tests!
>>>>>
>>>>> thanks,
>>>>> elliott
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>> Hi Elliott,
>>>>
>>>> In BSD I think the backlog used to be reset to 3/2 times that passed
>>>> by the user. So, 2 becomes 3.
>>>> Probably the 1/2 times increase was to accommodate the ones in
>>>> partial/incomplete queue.
>>>> In Linux is it possible you were getting the same behavior before the
>>>> below commit?
>>>> Since the check used to be "backlog+1" a 2 will behave as 3?
>>> i don't think so, because with <= 3.0 kernels i used to have a backlog
>>> of 1 and be able to make _4_ connections before my next connect would
>>> hang. but this > to >= change is at least something for me to
>>> investigate...
>>>
>>>> commit 8488df894d05d6fa41c2bd298c335f944bb0e401
>>>> Author: Wei Dong
>>>> Date: Fri Mar 2 12:37:26 2007 -0800
>>>>
>>>>     [NET]: Fix bugs in "Whether sock accept queue is full" checking
>>>>
>>>>     when I use linux TCP socket, and find there is a bug in function
>>>>     sk_acceptq_is_full().
>>>>
>>>>     When a new SYN comes, TCP module first checks its validation. If
>>>>     valid, send SYN,ACK to the client and add the sock to the syn hash
>>>>     table. Next time if received the valid ACK for SYN,ACK from the
>>>>     client. server will accept this connection and increase the
>>>>     sk->sk_ack_backlog -- which is done in function tcp_check_req().
>>>>     We check whether acceptq is full in function
>>>>     tcp_v4_syn_recv_sock().
>>>>
>>>>     Consider an example:
>>>>
>>>>     After listen(sockfd, 1) system call, sk->sk_max_ack_backlog is
>>>>     set to 1. As we know, sk->sk_ack_backlog is initialized to 0.
>>>>     Assuming accept() system call is not invoked now.
>>>>
>>>>     1. 1st connection comes. invoke sk_acceptq_is_full().
>>>>        sk->sk_ack_backlog=0 sk->sk_max_ack_backlog=1, function
>>>>        return 0 accept this connection.
>>>>        Increase the sk->sk_ack_backlog
>>>>     2. 2nd connection comes. invoke sk_acceptq_is_full().
>>>>        sk->sk_ack_backlog=1 sk->sk_max_ack_backlog=1, function
>>>>        return 0 accept this connection.
>>>>        Increase the sk->sk_ack_backlog
>>>>     3. 3rd connection comes. invoke sk_acceptq_is_full().
>>>>        sk->sk_ack_backlog=2 sk->sk_max_ack_backlog=1, function
>>>>        return 1. Refuse this connection.
>>>>
>>>>     I think it has bugs. after listen system call.
>>>>     sk->sk_max_ack_backlog=1 but now it can accept 2 connections.
>>>>
>>>>     Signed-off-by: Wei Dong
>>>>     Signed-off-by: David S. Miller
>>>>
>>>> Venkat