All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Sitnicki <jakub@cloudflare.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>,
	 "David S. Miller" <davem@davemloft.net>,
	 Jakub Kicinski <kuba@kernel.org>,
	 Neal Cardwell <ncardwell@google.com>,
	 Kuniyuki Iwashima <kuniyu@google.com>,
	netdev@vger.kernel.org,  kernel-team@cloudflare.com,
	 Lee Valentine <lvalentine@cloudflare.com>
Subject: Re: [PATCH net-next v3 2/3] tcp: Consider every port when connecting with IP_LOCAL_PORT_RANGE
Date: Thu, 17 Jul 2025 11:44:01 +0200	[thread overview]
Message-ID: <87qzyfdue6.fsf@cloudflare.com> (raw)
In-Reply-To: <00911a84-c4e3-452e-ab51-1275a43ca4b2@redhat.com> (Paolo Abeni's message of "Thu, 17 Jul 2025 11:23:36 +0200")

On Thu, Jul 17, 2025 at 11:23 AM +02, Paolo Abeni wrote:
> On 7/14/25 6:03 PM, Jakub Sitnicki wrote:
>> Solution
>> --------
>> 
>> If there is no IP address conflict with any socket bound to a given local
>> port, then from the protocol's perspective, the port can be safely shared.
>> 
>> With that in mind, modify the port search during connect(), that is
>> __inet_hash_connect, to consider all bind buckets (ports) when looking for
>> a local port for egress.
>> 
>> To achieve this, add an extra walk over bhash2 buckets for the port to
>> check for IP conflicts. The additional walk is not free, so perform it only
>> once per port - during the second phase of conflict checking, when the
>> bhash bucket is locked.
>> 
>> We enable this changed behavior only if the IP_LOCAL_PORT_RANGE socket
>> option is set. The rationale is that users are likely to care about using
>> every possible local port only when they have deliberately constrained the
>> ephemeral port range.
>
> I'm not a big fan of piggybacking additional semantic on existing
> socketopt, have you considered a new one?

That's a fair point. Though a dedicated sysctl seems more appropriate in
this case. Akin to how we have ip_autobind_reuse to enable amore
aggresive port sharing strategy on bind() side. How does that sound?

>
> At very least you will need to update the man page.
>
>
>> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
>> index d3ce6d0a514e..9d8a9c7c8274 100644
>> --- a/net/ipv4/inet_hashtables.c
>> +++ b/net/ipv4/inet_hashtables.c
>> @@ -1005,6 +1005,52 @@ EXPORT_IPV6_MOD(inet_bhash2_reset_saddr);
>>  #define INET_TABLE_PERTURB_SIZE (1 << CONFIG_INET_TABLE_PERTURB_ORDER)
>>  static u32 *table_perturb;
>>  
>> +/* True on source address conflict with another socket. False otherwise. */
>> +static inline bool check_bind2_bucket(const struct sock *sk,
>> +				      const struct inet_bind2_bucket *tb2)
>
> Please no inline function in c files.

Will fix.

Thanks for feedback.

  reply	other threads:[~2025-07-17  9:44 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-14 16:03 [PATCH net-next v3 0/3] tcp: Consider every port when connecting with IP_LOCAL_PORT_RANGE Jakub Sitnicki
2025-07-14 16:03 ` [PATCH net-next v3 1/3] tcp: Add RCU management to inet_bind2_bucket Jakub Sitnicki
2025-07-14 16:03 ` [PATCH net-next v3 2/3] tcp: Consider every port when connecting with IP_LOCAL_PORT_RANGE Jakub Sitnicki
2025-07-17  9:23   ` Paolo Abeni
2025-07-17  9:44     ` Jakub Sitnicki [this message]
2025-07-17 11:21       ` Jakub Sitnicki
2025-07-14 16:03 ` [PATCH net-next v3 3/3] selftests/net: Cover port sharing scenarios " Jakub Sitnicki
2025-07-17  9:27   ` Paolo Abeni
2025-07-17  9:44     ` Jakub Sitnicki
2025-07-17  9:34   ` Paolo Abeni
2025-07-17  9:50     ` Jakub Sitnicki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87qzyfdue6.fsf@cloudflare.com \
    --to=jakub@cloudflare.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kernel-team@cloudflare.com \
    --cc=kuba@kernel.org \
    --cc=kuniyu@google.com \
    --cc=lvalentine@cloudflare.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.