From: Vitaly Mayatskikh
Subject: Re: speed regression in udp_lib_lport_inuse()
Date: Thu, 22 Jan 2009 23:40:59 +0100
To: Eric Dumazet
Cc: Vitaly Mayatskikh, David Miller, netdev@vger.kernel.org
In-Reply-To: <4978EE03.9040207@cosmosbay.com>

At Thu, 22 Jan 2009 23:06:59 +0100,
Eric Dumazet wrote:

> > err = bind(s, (const struct sockaddr *)&sa, sizeof(sa));
>
> Bug here, if bind() returns -1 (all ports are in use)

Yeah, there was an assert() there, but the program runs into problems
very soon; I was too lazy to handle this situation correctly and just
removed it ;)

> > Thanks!
>
> Hello Vitaly, thanks for this excellent report.
>
> Yes, current code is really not good when all ports are in use:
>
> We now have to scan 28232 [1] times long chains of 220 sockets.
> That's very long (but at least the thread is preemptible).
>
> In the past (before patches), only one thread was allowed to run in
> the kernel while scanning the udp port table (we had only one global
> lock, udp_hash_lock, protecting the whole udp table).

Very true: my (older) kernel with udp_hash_lock became totally
unresponsive after running this test. .29-rc2 only became jerky, but
still works.

> This thread was faster because it was not slowed down by other threads.
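Going back to the bind() bug you pointed out: a fixed version of that
part of my test program could look roughly like this. Just a sketch,
not the actual program; the bind_ephemeral_udp() helper name and the
getsockname() readback are mine, added for illustration:

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Bind a UDP socket to an ephemeral port (port 0) and return the port
 * the kernel picked, or -1 on failure -- e.g. when the whole local
 * port range is already in use and bind() fails with EADDRINUSE. */
static int bind_ephemeral_udp(void)
{
	struct sockaddr_in sa;
	socklen_t len = sizeof(sa);
	int port;
	int s = socket(AF_INET, SOCK_DGRAM, 0);

	if (s < 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sa.sin_port = 0;	/* 0 = let the kernel choose a port */

	if (bind(s, (const struct sockaddr *)&sa, sizeof(sa)) == -1) {
		close(s);	/* don't leak the socket on failure */
		return -1;
	}

	/* read back which port the kernel actually chose */
	if (getsockname(s, (struct sockaddr *)&sa, &len) == -1) {
		close(s);
		return -1;
	}
	port = ntohs(sa.sin_port);
	close(s);
	return port;
}
```

(The socket is closed before returning, so this only checks that a
port could be allocated; the real test keeps its sockets open to fill
the port range.)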
> (But the rwlock we used was responsible for starvation of writers if
> many UDP frames were received.)
>
> One way to solve the problem could be to use the following:
>
> 1) Raise UDP_HTABLE_SIZE from 128 to 1024 to reduce average chain
>    lengths.
>
> 2) In the bind(0) algo, use RCU locking to find a possible usable
>    port. All cpus can run in parallel, without dirtying locks. Then
>    lock the found chain and recheck the port is available before
>    using it.

I think 2) is definitely better than 1), because 1) doesn't actually
fix anything, it only postpones the problem slightly.

> [1] replace 28232 by your actual /proc/sys/net/ipv4/ip_local_port_range
>     values: 61000 - 32768 = 28232
>
> I will try to code a patch before this weekend.

Cool!

> Thanks
>
> Note: I tried to use a mutex to force only one thread into the bind(0)
> code but got no real speedup. But it should help if you have an SMP
> machine, since only one cpu will be busy in bind(0).

You saved me some time, I was thinking about trying mutexes too.
Thanks :)

-- 
wbr, Vitaly
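P.S. A quick sanity check of the chain-length numbers in your mail
(the avg_chain_len() helper is mine, just to make the arithmetic
explicit):

```c
/* Default /proc/sys/net/ipv4/ip_local_port_range on that kernel:
 * 61000 - 32768 = 28232 ephemeral ports. */
enum { PORT_LO = 32768, PORT_HI = 61000 };

/* Average hash-chain length when every ephemeral port is bound,
 * for a given UDP_HTABLE_SIZE. */
static int avg_chain_len(int htable_size)
{
	return (PORT_HI - PORT_LO) / htable_size;
}

/* avg_chain_len(128)  -> 220 sockets per chain (current size)   */
/* avg_chain_len(1024) -> 27 sockets per chain (proposed size)   */
```

So 1) shrinks the chains roughly eightfold, but a full scan still has
to walk all 28232 ports when the range is exhausted, which is why 2)
looks like the real fix.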