* [PATCH net-next v1] tcp/dccp: avoid parity split for socket-local bind range
@ 2026-06-26 9:38 xuanqiang.luo
2026-06-26 23:40 ` Kuniyuki Iwashima
0 siblings, 1 reply; 3+ messages in thread
From: xuanqiang.luo @ 2026-06-26 9:38 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell, netdev
Cc: Kuniyuki Iwashima, David S . Miller, Jakub Kicinski, Paolo Abeni,
Simon Horman, luoxuanqiang
From: luoxuanqiang <luoxuanqiang@kylinos.cn>

IP_LOCAL_PORT_RANGE lets applications override the netns ephemeral port
range on a per-socket basis. __inet_hash_connect() already treats such a
range as an explicit application partition and scans it with step 1 [1].

Do the same in inet_csk_find_open_port(): when a socket-local range is set,
walk the whole selected range instead of first splitting it by parity.
Keep the existing step-2 parity behavior for sockets using the netns range,
so the default bind/connect separation remains unchanged.
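
To make the two scan orders concrete, here is a minimal user-space sketch that mirrors the loop structure of the patched function. It is a model for illustration, not kernel code: probe_order() is a hypothetical name, the offset argument stands in for the value returned by get_random_u32_below(), and the attempt_half logic, reserved-port check, and bind-bucket lookups are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the probe order in the patched
 * inet_csk_find_open_port(). Writes up to cap probed ports into out
 * and returns how many were written.
 *   high_incl:   inclusive upper bound, as in ip_local_port_range
 *   offset:      stands in for get_random_u32_below(remaining)
 *   local_ports: true when a socket-local range is in effect
 */
static size_t probe_order(unsigned int low, unsigned int high_incl,
			  unsigned int offset, bool local_ports,
			  unsigned int *out, size_t cap)
{
	unsigned int high, remaining, step, i, port;
	size_t n = 0;

	assert(high_incl >= low);
	high = high_incl + 1;		/* [low, high_incl] -> [low, high[ */
	remaining = high - low;
	step = local_ports ? 1 : 2;

	if (!local_ports && remaining > 1)
		remaining &= ~1U;	/* parity path wants an even count */
	offset %= remaining;
	if (!local_ports)
		offset |= 1U;		/* start on the opposite parity of low */

	for (;;) {
		port = low + offset;
		for (i = 0; i < remaining; i += step, port += step) {
			if (port >= high)
				port -= remaining;
			if (n < cap)
				out[n++] = port;
		}
		if (local_ports)
			break;		/* one pass already covered the range */
		offset--;
		if (offset & 1)
			break;		/* both parities have been scanned */
	}
	return n;
}
```

In this model, a socket-local range is walked once with step 1 from the starting offset and wraps around, while the netns path scans one parity and then the other. With an odd-sized range, the parity path in this model never probes the top port, because remaining is rounded down to an even count; the step-1 path probes every port in the range.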
[1] https://lore.kernel.org/r/20231214192939.1962891-3-edumazet@google.com
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: luoxuanqiang <luoxuanqiang@kylinos.cn>
---
net/ipv4/inet_connection_sock.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 56902bba54838..ad8af70c92ca3 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -323,13 +323,16 @@ inet_csk_find_open_port(const struct sock *sk, struct inet_bind_bucket **tb_ret,
struct inet_bind2_bucket *tb2;
struct inet_bind_bucket *tb;
u32 remaining, offset;
+ bool local_ports;
bool relax = false;
+ int step;
l3mdev = inet_sk_bound_l3mdev(sk);
ports_exhausted:
attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0;
other_half_scan:
- inet_sk_get_local_port_range(sk, &low, &high);
+ local_ports = inet_sk_get_local_port_range(sk, &low, &high);
+ step = local_ports ? 1 : 2;
high++; /* [32768, 60999] -> [32768, 61000[ */
if (high - low < 4)
attempt_half = 0;
@@ -342,18 +345,19 @@ inet_csk_find_open_port(const struct sock *sk, struct inet_bind_bucket **tb_ret,
low = half;
}
remaining = high - low;
- if (likely(remaining > 1))
+ if (!local_ports && remaining > 1)
remaining &= ~1U;
offset = get_random_u32_below(remaining);
/* __inet_hash_connect() favors ports having @low parity
* We do the opposite to not pollute connect() users.
*/
- offset |= 1U;
+ if (!local_ports)
+ offset |= 1U;
other_parity_scan:
port = low + offset;
- for (i = 0; i < remaining; i += 2, port += 2) {
+ for (i = 0; i < remaining; i += step, port += step) {
if (unlikely(port >= high))
port -= remaining;
if (inet_is_local_reserved_port(net, port))
@@ -384,9 +388,11 @@ inet_csk_find_open_port(const struct sock *sk, struct inet_bind_bucket **tb_ret,
cond_resched();
}
- offset--;
- if (!(offset & 1))
- goto other_parity_scan;
+ if (!local_ports) {
+ offset--;
+ if (!(offset & 1))
+ goto other_parity_scan;
+ }
if (attempt_half == 1) {
/* OK we now try the upper half of the range */
--
2.43.0
* Re: [PATCH net-next v1] tcp/dccp: avoid parity split for socket-local bind range
From: Kuniyuki Iwashima @ 2026-06-26 23:40 UTC (permalink / raw)
To: xuanqiang.luo
Cc: Eric Dumazet, Neal Cardwell, netdev, David S . Miller,
Jakub Kicinski, Paolo Abeni, Simon Horman, luoxuanqiang
On Fri, Jun 26, 2026 at 2:40 AM <xuanqiang.luo@linux.dev> wrote:
>
> From: luoxuanqiang <luoxuanqiang@kylinos.cn>
>
> IP_LOCAL_PORT_RANGE lets applications override the netns ephemeral port
> range on a per-socket basis. __inet_hash_connect() already treats such a
> range as an explicit application partition and scans it with step 1 [1].
>
> Do the same in inet_csk_find_open_port():
What's the use case for IP_LOCAL_PORT_RANGE + bind(..., 0)
without IP_BIND_ADDRESS_NO_PORT?
> when a socket-local range is set,
> walk the whole selected range instead of first splitting it by parity.
> Keep the existing step-2 parity behavior for sockets using the netns range,
> so the default bind/connect separation remains unchanged.
>
> [1] https://lore.kernel.org/r/20231214192939.1962891-3-edumazet@google.com
>
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: luoxuanqiang <luoxuanqiang@kylinos.cn>
> ---
> net/ipv4/inet_connection_sock.c | 20 +++++++++++++-------
> 1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 56902bba54838..ad8af70c92ca3 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -323,13 +323,16 @@ inet_csk_find_open_port(const struct sock *sk, struct inet_bind_bucket **tb_ret,
> struct inet_bind2_bucket *tb2;
> struct inet_bind_bucket *tb;
> u32 remaining, offset;
> + bool local_ports;
> bool relax = false;
> + int step;
>
> l3mdev = inet_sk_bound_l3mdev(sk);
> ports_exhausted:
> attempt_half = (sk->sk_reuse == SK_CAN_REUSE) ? 1 : 0;
> other_half_scan:
> - inet_sk_get_local_port_range(sk, &low, &high);
> + local_ports = inet_sk_get_local_port_range(sk, &low, &high);
> + step = local_ports ? 1 : 2;
> high++; /* [32768, 60999] -> [32768, 61000[ */
> if (high - low < 4)
> attempt_half = 0;
> @@ -342,18 +345,19 @@ inet_csk_find_open_port(const struct sock *sk, struct inet_bind_bucket **tb_ret,
> low = half;
> }
> remaining = high - low;
> - if (likely(remaining > 1))
> + if (!local_ports && remaining > 1)
> remaining &= ~1U;
>
> offset = get_random_u32_below(remaining);
> /* __inet_hash_connect() favors ports having @low parity
> * We do the opposite to not pollute connect() users.
> */
> - offset |= 1U;
> + if (!local_ports)
> + offset |= 1U;
>
> other_parity_scan:
> port = low + offset;
> - for (i = 0; i < remaining; i += 2, port += 2) {
> + for (i = 0; i < remaining; i += step, port += step) {
> if (unlikely(port >= high))
> port -= remaining;
> if (inet_is_local_reserved_port(net, port))
> @@ -384,9 +388,11 @@ inet_csk_find_open_port(const struct sock *sk, struct inet_bind_bucket **tb_ret,
> cond_resched();
> }
>
> - offset--;
> - if (!(offset & 1))
> - goto other_parity_scan;
> + if (!local_ports) {
> + offset--;
> + if (!(offset & 1))
> + goto other_parity_scan;
> + }
>
> if (attempt_half == 1) {
> /* OK we now try the upper half of the range */
> --
> 2.43.0
>
* Re: [PATCH net-next v1] tcp/dccp: avoid parity split for socket-local bind range
From: luoxuanqiang @ 2026-06-27 1:59 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: Eric Dumazet, Neal Cardwell, netdev, David S . Miller,
Jakub Kicinski, Paolo Abeni, Simon Horman, luoxuanqiang
> On Jun 27, 2026, at 07:40, Kuniyuki Iwashima <kuniyu@google.com> wrote:
>
> On Fri, Jun 26, 2026 at 2:40 AM <xuanqiang.luo@linux.dev> wrote:
>>
>> From: luoxuanqiang <luoxuanqiang@kylinos.cn>
>>
>> IP_LOCAL_PORT_RANGE lets applications override the netns ephemeral port
>> range on a per-socket basis. __inet_hash_connect() already treats such a
>> range as an explicit application partition and scans it with step 1 [1].
>>
>> Do the same in inet_csk_find_open_port():
>
> What's the use case for IP_LOCAL_PORT_RANGE + bind(..., 0)
> without IP_BIND_ADDRESS_NO_PORT?
Hi Kuniyuki,

Thanks for the question!

The use case is when an application wants to restrict ephemeral port
allocation to a socket-local IP_LOCAL_PORT_RANGE, but still needs
bind(..., 0) to allocate and reserve a local port immediately.

IP_BIND_ADDRESS_NO_PORT is useful when the application can defer port
allocation until connect(), but it changes this behavior: bind(..., 0)
does not reserve a port in that case. So it is not a replacement for
applications that need the local port before connect(), for example to
publish it to another component or set up local policy.
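
A minimal user-space sketch of that pattern follows, under stated assumptions: bind_in_range() is a hypothetical helper name, the socket is IPv4 TCP bound to loopback, and the fallback define uses the option value 51 from linux/in.h for build environments whose headers predate the option. The option value packs the low bound in the lower 16 bits and the high bound in the upper 16 bits.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IP_LOCAL_PORT_RANGE
#define IP_LOCAL_PORT_RANGE 51	/* fallback for older userspace headers */
#endif

/* Hypothetical helper: restrict the socket to [lo, hi], bind with
 * port 0 so the kernel picks and reserves a port right away, and
 * report the chosen port through *port (host byte order).
 * Returns the socket fd, or -1 on any failure, including kernels
 * that do not support IP_LOCAL_PORT_RANGE.
 */
static int bind_in_range(uint16_t lo, uint16_t hi, uint16_t *port)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	uint32_t range = ((uint32_t)hi << 16) | lo;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel choose within the range */

	if (setsockopt(fd, IPPROTO_IP, IP_LOCAL_PORT_RANGE,
		       &range, sizeof(range)) < 0 ||
	    bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}

	*port = ntohs(addr.sin_port);
	return fd;
}
```

The caller keeps the returned fd open so the port stays reserved, and can hand the reported port to another component before calling connect(). This is the step that is not available when port allocation is deferred to connect().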
This patch is also intended to keep the bind(..., 0) path consistent with
Eric's earlier change in __inet_hash_connect().

Thanks,
Xuanqiang