From: Vitaly Mayatskikh <v.mayatskih@gmail.com>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Vitaly Mayatskikh <v.mayatskih@gmail.com>,
David Miller <davem@davemloft.net>,
netdev@vger.kernel.org
Subject: Re: speed regression in udp_lib_lport_inuse()
Date: Thu, 22 Jan 2009 23:40:59 +0100
Message-ID: <m3iqo7dkw4.wl%vmayatsk@redhat.com>
In-Reply-To: <4978EE03.9040207@cosmosbay.com>
At Thu, 22 Jan 2009 23:06:59 +0100, Eric Dumazet wrote:
> > err = bind(s, (const struct sockaddr*)&sa, sizeof(sa));
>
> Bug here, if bind() returns -1 (all ports are in use)
Yeah, there was an assert() there, but the program runs into problems
very soon anyway; I was too lazy to handle this case properly and just
removed it ;)
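Keeping the check would look something like this (just a sketch from
memory, not the exact test program; the loopback address, socket count
and the ulimit note are my assumptions):

/* Bind UDP sockets with port 0 until the kernel runs out of local ports.
 * The fds are deliberately kept open so the ports stay bound; you need
 * "ulimit -n" raised above the socket count for this to get anywhere. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void)
{
	struct sockaddr_in sa;
	int i;

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sa.sin_port = 0;	/* bind(0): let the kernel pick a free port */

	for (i = 0; i < 30000; i++) {
		int s = socket(AF_INET, SOCK_DGRAM, 0);

		if (s < 0) {
			perror("socket");
			return 1;
		}
		if (bind(s, (const struct sockaddr *)&sa, sizeof(sa)) < 0) {
			perror("bind");	/* -1 once all ports are in use */
			printf("ports exhausted after %d sockets\n", i);
			break;
		}
	}
	return 0;
}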
> > Thanks!
>
> Hello Vitaly, thanks for this excellent report.
>
> Yes, current code is really not good when all ports are in use :
>
> We now have to scan 28232 [1] times long chains of 220 sockets.
> That's very long (but at least the thread is preemptible)
>
> In the past (before patches), only one thread was allowed to run in kernel while scanning
> udp port table (we had only one global lock udp_hash_lock protecting the whole udp table).
Very true, my (older) kernel with udp_hash_lock just became totally
unresponsive after running this test. .29-rc2 only became jerky, but
still works.
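If I'm doing the math right, that's roughly 28232 ports * 220 sockets
per chain ~= 6.2 million list entries to walk for a single bind(0) call
once the range is full, so no wonder it crawls.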
> This thread was faster because it was not slowed down by other threads.
> (But the rwlock we used was responsible for starvations of writers if many UDP frames
> were received)
>
>
>
> One way to solve the problem could be to use following :
>
> 1) Raising UDP_HTABLE_SIZE from 128 to 1024 to reduce average chain lengths.
>
> 2) In bind(0) algo, use rcu locking to find a possible usable port. All cpus can run in //, without
> dirtying locks. Then lock the found chain and recheck port is available before using it.
I think 2 is definitely better than 1, because 1 doesn't actually fix
anything, it only postpones the problem slightly.
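If I understand point 2 correctly, it would look roughly like this
(very rough pseudo-C from me, not a patch; udp_hashslot() and
chain_has_port() are made-up names):

/* Pseudo-C sketch of the RCU-then-lock idea -- not real kernel code.
 * udp_hashslot() and chain_has_port() are invented helpers here. */
static int bind0_find_port(struct net *net, int low, int high)
{
	int port;

	for (port = low; port <= high; port++) {
		struct hslot *slot = udp_hashslot(net, port);

		/* first pass: read the chain locklessly under RCU */
		rcu_read_lock();
		if (chain_has_port(slot, port)) {
			rcu_read_unlock();
			continue;	/* port taken, try the next one */
		}
		rcu_read_unlock();

		/* looks free: lock the chain and recheck before using it */
		spin_lock_bh(&slot->lock);
		if (!chain_has_port(slot, port)) {
			/* still free -- hash the socket into the chain here */
			spin_unlock_bh(&slot->lock);
			return port;
		}
		spin_unlock_bh(&slot->lock);	/* lost the race, keep scanning */
	}
	return -EADDRINUSE;
}

That way all cpus can scan in parallel and only the final recheck
dirties a lock, if I got the idea right.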
> [1] replace 28232 by your actual /proc/sys/net/ipv4/ip_local_port_range values
> 61000 - 32768 = 28232
>
> I will try to code a patch before this week end.
Cool!
> Thanks
>
> Note : I tried to use a mutex to force only one thread in bind(0) code but got no real speedup.
> But it should help if you have a SMP machine, since only one cpu will be busy in bind(0)
>
You saved me some time, I was thinking about trying mutexes as well. Thanks :)
--
wbr, Vitaly