netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* __inet_hash_connect port_offset parameter
@ 2015-05-27 15:25 Crestez Dan Leonard
  2015-05-27 16:18 ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Crestez Dan Leonard @ 2015-05-27 15:25 UTC (permalink / raw)
  To: netdev

Hello,

I'm confused about the port_offset parameter to __inet_hash_connect.

When allocating the local port for an outgoing TCP connection the port search looks something like this:

    static u32 hint;
    u32 offset = hint + port_offset;

    inet_get_local_port_range(net, &low, &high);
    remaining = (high - low) + 1;

    for (i = 1; i <= remaining; i++) {
        port = low + (i + offset) % remaining;
        /* check port is free */

The port_offset is calculated for v4 and v6 based on a hash of src/dst addresses, presumably in order to improve security.

I see a few issues with this:
 - The port_offset is calculated even if the local port was already assigned via bind. This wastes a few cycles.
 - Keeping the last searched port as a static variable is a bad idea on multicore cpus. Starting a lot of connections to the same target will result in lock contention in the bind hash. This is probably only visible in highly synthetic tests.
 - When doing a port search at bind() time the search starts from "prandom_32()". Is this "less secure" for port allocation? I bet most applications are not aware of this difference.

Wouldn't it be better to use the same local port search mechanism at both bind (inet_csk_get_port) and connect (__inet_hash_connect) time, based on starting from a random point? It would also make connecting slightly faster.

Regards,
Leonard

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: __inet_hash_connect port_offset parameter
  2015-05-27 15:25 __inet_hash_connect port_offset parameter Crestez Dan Leonard
@ 2015-05-27 16:18 ` Eric Dumazet
  2015-05-27 16:30   ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2015-05-27 16:18 UTC (permalink / raw)
  To: Crestez Dan Leonard; +Cc: netdev

On Wed, 2015-05-27 at 18:25 +0300, Crestez Dan Leonard wrote:
> Hello,
> 
> I'm confused about the port_offset parameter to __inet_hash_connect.
> 
> When allocating the local port for an outgoing TCP connection the port
> search looks something like this:
> 
>     static u32 hint;
>     u32 offset = hint + port_offset;
> 
>     inet_get_local_port_range(net, &low, &high);
>     remaining = (high - low) + 1;
> 
>     for (i = 1; i <= remaining; i++) {
>         port = low + (i + offset) % remaining;
>         /* check port is free */
> 
> The port_offset is calculated for v4 and v6 based on a hash of src/dst
> addresses, presumably in order to improve security.
> 
> I see a few issues with this:
>  - The port_offset is calculated even if the local port was already
> assigned via bind. This wastes a few cycles.

OK. Not a big deal I guess.

>  - Keeping the last searched port as a static variable is a bad idea
> on multicore cpus. Starting a lot of connections to the same target
> will result in lock contention in the bind hash. This is probably only
> visible in highly synthetic tests.

Not really. This is a hint only.

I have one patch adding an array of hints, but this does not change
behavior if connecting to same target.

u32 key = port_offset % HINTS_SZ;

>  - When doing a port search at bind() time the search starts from
> "prandom_32()". Is this "less secure" for port allocation? I bet most
> applications are not aware of this difference.

I am afraid we need to keep a sequential search for ephemeral port
selection.

This known behavior is described in RFC 6056

> 
> Wouldn't it be better to use the same local port search mechanism at
> both bind (inet_csk_get_port) and connect (__inet_hash_connect) time,
> based on starting from a random point? It would also make connecting
> slightly faster.

Nope, bind() and connect() have different semantics.

The randomization is only good for bind(port=0) users.

Have you seen my proposal ?

https://patchwork.ozlabs.org/patch/476002/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: __inet_hash_connect port_offset parameter
  2015-05-27 16:18 ` Eric Dumazet
@ 2015-05-27 16:30   ` Eric Dumazet
  2015-05-27 17:46     ` [PATCH net-next] tcp: connect() from bound sockets can be faster Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2015-05-27 16:30 UTC (permalink / raw)
  To: Crestez Dan Leonard; +Cc: netdev

On Wed, 2015-05-27 at 09:18 -0700, Eric Dumazet wrote:
> On Wed, 2015-05-27 at 18:25 +0300, Crestez Dan Leonard wrote:
> > Hello,
> > 
> > I'm confused about the port_offset parameter to __inet_hash_connect.
> > 
> > When allocating the local port for an outgoing TCP connection the port
> > search looks something like this:
> > 
> >     static u32 hint;
> >     u32 offset = hint + port_offset;
> > 
> >     inet_get_local_port_range(net, &low, &high);
> >     remaining = (high - low) + 1;
> > 
> >     for (i = 1; i <= remaining; i++) {
> >         port = low + (i + offset) % remaining;
> >         /* check port is free */
> > 
> > The port_offset is calculated for v4 and v6 based on a hash of src/dst
> > addresses, presumably in order to improve security.
> > 
> > I see a few issues with this:
> >  - The port_offset is calculated even if the local port was already
> > assigned via bind. This wastes a few cycles.
> 
> OK. Not a big deal I guess.

Patch for IPv4 would be :

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 185efef0f1251ba9d45fabb3ed51777a8be097a6..be4bac368b6bfb8a1eca429cce415da99adc5515 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -594,7 +594,11 @@ out:
 int inet_hash_connect(struct inet_timewait_death_row *death_row,
 		      struct sock *sk)
 {
-	return __inet_hash_connect(death_row, sk, inet_sk_port_offset(sk),
+	u32 port_offset = 0;
+
+	if (!inet_sk(sk)->inet_num)
+		port_offset = inet_sk_port_offset(sk);
+	return __inet_hash_connect(death_row, sk, port_offset,
 				   __inet_check_established);
 }
 EXPORT_SYMBOL_GPL(inet_hash_connect);

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH net-next] tcp: connect() from bound sockets can be faster
  2015-05-27 16:30   ` Eric Dumazet
@ 2015-05-27 17:46     ` Eric Dumazet
  2015-05-27 18:31       ` David Miller
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2015-05-27 17:46 UTC (permalink / raw)
  To: Crestez Dan Leonard, David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

__inet_hash_connect() does not use its third argument (port_offset)
if socket was already bound to a source port.

No need to perform useless but expensive md5 computations.

Reported-by: Crestez Dan Leonard <cdleonard@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/inet_hashtables.c  |    9 +++++++--
 net/ipv6/inet6_hashtables.c |    8 ++++++--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 185efef0f1251ba9d45fabb3ed51777a8be097a6..e36942f0725d2e2d15f8d248b19a34c3487528ef 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -394,9 +394,10 @@ not_unique:
 	return -EADDRNOTAVAIL;
 }
 
-static inline u32 inet_sk_port_offset(const struct sock *sk)
+static u32 inet_sk_port_offset(const struct sock *sk)
 {
 	const struct inet_sock *inet = inet_sk(sk);
+
 	return secure_ipv4_port_ephemeral(inet->inet_rcv_saddr,
 					  inet->inet_daddr,
 					  inet->inet_dport);
@@ -594,7 +595,11 @@ out:
 int inet_hash_connect(struct inet_timewait_death_row *death_row,
 		      struct sock *sk)
 {
-	return __inet_hash_connect(death_row, sk, inet_sk_port_offset(sk),
+	u32 port_offset = 0;
+
+	if (!inet_sk(sk)->inet_num)
+		port_offset = inet_sk_port_offset(sk);
+	return __inet_hash_connect(death_row, sk, port_offset,
 				   __inet_check_established);
 }
 EXPORT_SYMBOL_GPL(inet_hash_connect);
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index 871641bc1ed4eb5b8f5f554c9efb75f2419fe5b6..b4fd96de97e61627003eff220e10bdd05a899e28 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -257,7 +257,7 @@ not_unique:
 	return -EADDRNOTAVAIL;
 }
 
-static inline u32 inet6_sk_port_offset(const struct sock *sk)
+static u32 inet6_sk_port_offset(const struct sock *sk)
 {
 	const struct inet_sock *inet = inet_sk(sk);
 
@@ -269,7 +269,11 @@ static inline u32 inet6_sk_port_offset(const struct sock *sk)
 int inet6_hash_connect(struct inet_timewait_death_row *death_row,
 		       struct sock *sk)
 {
-	return __inet_hash_connect(death_row, sk, inet6_sk_port_offset(sk),
+	u32 port_offset = 0;
+
+	if (!inet_sk(sk)->inet_num)
+		port_offset = inet6_sk_port_offset(sk);
+	return __inet_hash_connect(death_row, sk, port_offset,
 				   __inet6_check_established);
 }
 EXPORT_SYMBOL_GPL(inet6_hash_connect);

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] tcp: connect() from bound sockets can be faster
  2015-05-27 17:46     ` [PATCH net-next] tcp: connect() from bound sockets can be faster Eric Dumazet
@ 2015-05-27 18:31       ` David Miller
  0 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2015-05-27 18:31 UTC (permalink / raw)
  To: eric.dumazet; +Cc: cdleonard, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 27 May 2015 10:46:02 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> __inet_hash_connect() does not use its third argument (port_offset)
> if socket was already bound to a source port.
> 
> No need to perform useless but expensive md5 computations.
> 
> Reported-by: Crestez Dan Leonard <cdleonard@gmail.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Looks great, applied, thanks Eric!

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-05-27 18:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-27 15:25 __inet_hash_connect port_offset parameter Crestez Dan Leonard
2015-05-27 16:18 ` Eric Dumazet
2015-05-27 16:30   ` Eric Dumazet
2015-05-27 17:46     ` [PATCH net-next] tcp: connect() from bound sockets can be faster Eric Dumazet
2015-05-27 18:31       ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).