netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* inet_hash_connect: source port allocation
@ 2010-11-29 17:04 John Haxby
  2010-11-29 17:26 ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: John Haxby @ 2010-11-29 17:04 UTC (permalink / raw)
  To: NetDev

Hello,

Please forgive me if this is a stupid question, but is there any 
particular reason why the source port allocation in 
__inet_hash_connect() shouldn't use the same random allocation that 
inet_csk_get_port() uses?  The latter, of course, is used when bind() 
doesn't specify a source port but the implicit "bind" for a connect() 
gets its port allocated by __inet_hash_connect().

jch

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: inet_hash_connect: source port allocation
  2010-11-29 17:04 inet_hash_connect: source port allocation John Haxby
@ 2010-11-29 17:26 ` Eric Dumazet
  2010-11-29 18:29   ` John Haxby
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2010-11-29 17:26 UTC (permalink / raw)
  To: John Haxby; +Cc: NetDev

Le lundi 29 novembre 2010 à 17:04 +0000, John Haxby a écrit :
> Hello,
> 
> Please forgive me if this is a stupid question, but is there any 
> particular reason why the source port allocation in 
> __inet_hash_connect() shouldn't use the same random allocation that 
> inet_csk_get_port() uses?  The latter, of course, is used when bind() 
> doesn't specify a source port but the implicit "bind" for a connect() 
> gets its port allocated by __inet_hash_connect().
> 
> jch

autobind vs bind

bind() gives more information, like local address (if any)

autobind(), we dont know local address, it'll be chose later by routing.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: inet_hash_connect: source port allocation
  2010-11-29 17:26 ` Eric Dumazet
@ 2010-11-29 18:29   ` John Haxby
  2010-11-29 18:46     ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: John Haxby @ 2010-11-29 18:29 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: NetDev

On 29/11/10 17:26, Eric Dumazet wrote:
> Le lundi 29 novembre 2010 à 17:04 +0000, John Haxby a écrit :
>> Hello,
>>
>> Please forgive me if this is a stupid question, but is there any
>> particular reason why the source port allocation in
>> __inet_hash_connect() shouldn't use the same random allocation that
>> inet_csk_get_port() uses?  The latter, of course, is used when bind()
>> doesn't specify a source port but the implicit "bind" for a connect()
>> gets its port allocated by __inet_hash_connect().
>>
>> jch
> autobind vs bind
>
> bind() gives more information, like local address (if any)
>
> autobind(), we dont know local address, it'll be chose later by routing.

Sorry,  I think I phrased my question badly.

inet_csk_get_port() starts its search for a free port with

     smallest_rover = rover = net_random() % remaining + low;

whereas __inet_hash_connect() basically misses out that call to 
net_random() so you get a predictable port number.

Is there any good reason why that is the case?

jch



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: inet_hash_connect: source port allocation
  2010-11-29 18:29   ` John Haxby
@ 2010-11-29 18:46     ` Eric Dumazet
  2010-11-29 19:07       ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2010-11-29 18:46 UTC (permalink / raw)
  To: John Haxby; +Cc: NetDev, Stephen Hemminger

Le lundi 29 novembre 2010 à 18:29 +0000, John Haxby a écrit :

> Sorry,  I think I phrased my question badly.
> 
> inet_csk_get_port() starts its search for a free port with
> 
>      smallest_rover = rover = net_random() % remaining + low;
> 
> whereas __inet_hash_connect() basically misses out that call to 
> net_random() so you get a predictable port number.
> 
> Is there any good reason why that is the case?
> 

It seems random select was done at bind() time only in commit
6df716340da3a6f ([TCP/DCCP]: Randomize port selection)

It probably should be done in autobind too.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: inet_hash_connect: source port allocation
  2010-11-29 18:46     ` Eric Dumazet
@ 2010-11-29 19:07       ` Eric Dumazet
  2010-11-29 19:21         ` Eric Dumazet
  2010-11-29 19:38         ` Stephen Hemminger
  0 siblings, 2 replies; 7+ messages in thread
From: Eric Dumazet @ 2010-11-29 19:07 UTC (permalink / raw)
  To: John Haxby; +Cc: NetDev, Stephen Hemminger

Le lundi 29 novembre 2010 à 19:46 +0100, Eric Dumazet a écrit :
> Le lundi 29 novembre 2010 à 18:29 +0000, John Haxby a écrit :
> 
> > Sorry,  I think I phrased my question badly.
> > 
> > inet_csk_get_port() starts its search for a free port with
> > 
> >      smallest_rover = rover = net_random() % remaining + low;
> > 
> > whereas __inet_hash_connect() basically misses out that call to 
> > net_random() so you get a predictable port number.
> > 
> > Is there any good reason why that is the case?
> > 
> 
> It seems random select was done at bind() time only in commit
> 6df716340da3a6f ([TCP/DCCP]: Randomize port selection)
> 
> It probably should be done in autobind too.
> 
> 

I'll test following patch :

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 1b344f3..65c3702 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -466,20 +466,18 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 	int twrefcnt = 1;
 
 	if (!snum) {
-		int i, remaining, low, high, port;
-		static u32 hint;
-		u32 offset = hint + port_offset;
+		int remaining, low, high, port;
 		struct hlist_node *node;
 		struct inet_timewait_sock *tw = NULL;
 
 		inet_get_local_port_range(&low, &high);
 		remaining = (high - low) + 1;
+		port = net_random() % remaining + low;
 
 		local_bh_disable();
-		for (i = 1; i <= remaining; i++) {
-			port = low + (i + offset) % remaining;
+		do {
 			if (inet_is_reserved_local_port(port))
-				continue;
+				goto next_nolock;
 			head = &hinfo->bhash[inet_bhashfn(net, port,
 					hinfo->bhash_size)];
 			spin_lock(&head->lock);
@@ -510,16 +508,17 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 			tb->fastreuse = -1;
 			goto ok;
 
-		next_port:
+next_port:
 			spin_unlock(&head->lock);
-		}
+next_nolock:
+			if (++port > high)
+				port = low;
+		} while (--remaining > 0);
 		local_bh_enable();
 
 		return -EADDRNOTAVAIL;
 
 ok:
-		hint += i;
-
 		/* Head lock still held and bh's disabled */
 		inet_bind_hash(sk, tb, port);
 		if (sk_unhashed(sk)) {



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: inet_hash_connect: source port allocation
  2010-11-29 19:07       ` Eric Dumazet
@ 2010-11-29 19:21         ` Eric Dumazet
  2010-11-29 19:38         ` Stephen Hemminger
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2010-11-29 19:21 UTC (permalink / raw)
  To: John Haxby; +Cc: NetDev, Stephen Hemminger

Le lundi 29 novembre 2010 à 20:07 +0100, Eric Dumazet a écrit :
> Le lundi 29 novembre 2010 à 19:46 +0100, Eric Dumazet a écrit :
> > Le lundi 29 novembre 2010 à 18:29 +0000, John Haxby a écrit :
> > 
> > > Sorry,  I think I phrased my question badly.
> > > 
> > > inet_csk_get_port() starts its search for a free port with
> > > 
> > >      smallest_rover = rover = net_random() % remaining + low;
> > > 
> > > whereas __inet_hash_connect() basically misses out that call to 
> > > net_random() so you get a predictable port number.
> > > 
> > > Is there any good reason why that is the case?
> > > 
> > 
> > It seems random select was done at bind() time only in commit
> > 6df716340da3a6f ([TCP/DCCP]: Randomize port selection)
> > 
> > It probably should be done in autobind too.
> > 
> > 
> 
> I'll test following patch :

Oh well, forget this, there is something about inet_sk_port_offset()
using secure_ipv4_port_ephemeral()

We want to avoid reusing same port too fast.

http://www.tcpipguide.com/free/t_TCPIPClientEphemeralPortsandClientServerApplicatio-2.htm

Port is predictable only for same destination, and if no other
connections are attempted by other threads.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: inet_hash_connect: source port allocation
  2010-11-29 19:07       ` Eric Dumazet
  2010-11-29 19:21         ` Eric Dumazet
@ 2010-11-29 19:38         ` Stephen Hemminger
  1 sibling, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2010-11-29 19:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: John Haxby, NetDev

On Mon, 29 Nov 2010 20:07:35 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Le lundi 29 novembre 2010 à 19:46 +0100, Eric Dumazet a écrit :
> > Le lundi 29 novembre 2010 à 18:29 +0000, John Haxby a écrit :
> > 
> > > Sorry,  I think I phrased my question badly.
> > > 
> > > inet_csk_get_port() starts its search for a free port with
> > > 
> > >      smallest_rover = rover = net_random() % remaining + low;
> > > 
> > > whereas __inet_hash_connect() basically misses out that call to 
> > > net_random() so you get a predictable port number.
> > > 
> > > Is there any good reason why that is the case?
> > > 
> > 
> > It seems random select was done at bind() time only in commit
> > 6df716340da3a6f ([TCP/DCCP]: Randomize port selection)
> > 
> > It probably should be done in autobind too.
> > 
> > 
> 
> I'll test following patch :
> 
> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> index 1b344f3..65c3702 100644
> --- a/net/ipv4/inet_hashtables.c
> +++ b/net/ipv4/inet_hashtables.c
> @@ -466,20 +466,18 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
>  	int twrefcnt = 1;
>  
>  	if (!snum) {
> -		int i, remaining, low, high, port;
> -		static u32 hint;
> -		u32 offset = hint + port_offset;
> +		int remaining, low, high, port;
>  		struct hlist_node *node;
>  		struct inet_timewait_sock *tw = NULL;
>  
>  		inet_get_local_port_range(&low, &high);
>  		remaining = (high - low) + 1;
> +		port = net_random() % remaining + low;
>  
>  		local_bh_disable();
> -		for (i = 1; i <= remaining; i++) {
> -			port = low + (i + offset) % remaining;
> +		do {
>  			if (inet_is_reserved_local_port(port))
> -				continue;
> +				goto next_nolock;
>  			head = &hinfo->bhash[inet_bhashfn(net, port,
>  					hinfo->bhash_size)];
>  			spin_lock(&head->lock);
> @@ -510,16 +508,17 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
>  			tb->fastreuse = -1;
>  			goto ok;
>  
> -		next_port:
> +next_port:
>  			spin_unlock(&head->lock);
> -		}
> +next_nolock:
> +			if (++port > high)
> +				port = low;
> +		} while (--remaining > 0);
>  		local_bh_enable();
>  
>  		return -EADDRNOTAVAIL;
>  
>  ok:
> -		hint += i;
> -
>  		/* Head lock still held and bh's disabled */
>  		inet_bind_hash(sk, tb, port);
>  		if (sk_unhashed(sk)) {
> 

The original algorithm works better than uses if the port space is small
and being reused rapidly. Because the hint in the old algorithm is sequential ports
get used up sequentially.

You should look a the port randomization RFC. The earlier versions of the
RFC were better before the BSD guys started putting in their non-scalable
algorithms :-)


-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-11-29 19:39 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-29 17:04 inet_hash_connect: source port allocation John Haxby
2010-11-29 17:26 ` Eric Dumazet
2010-11-29 18:29   ` John Haxby
2010-11-29 18:46     ` Eric Dumazet
2010-11-29 19:07       ` Eric Dumazet
2010-11-29 19:21         ` Eric Dumazet
2010-11-29 19:38         ` Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).