From: Fernando Fernandez Mancera <fmancera@suse.de>
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org, corbet@lwn.net,
ncardwell@google.com, kuniyu@google.com, dsahern@kernel.org,
idosch@nvidia.com, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org,
Thorsten Toepper <thorsten.toepper@sap.com>
Subject: Re: [PATCH RFC net-next] inet: add ip_retry_random_port sysctl to reduce sequential port retries
Date: Tue, 3 Feb 2026 19:02:36 +0100
Message-ID: <b20965f7-e251-4793-951e-f211d179dbba@suse.de>
In-Reply-To: <20260203175422.4620-1-fmancera@suse.de>
On 2/3/26 6:54 PM, Fernando Fernandez Mancera wrote:
> With the current port selection algorithm, ports that follow a reserved
> port or a port that has been in use for a long time are selected more
> often than others. This combines badly with cloud environments that
> block connections between the application server and the database
> server if a previous connection used the same source port. The result
> is connectivity problems between applications in cloud environments.
>
> The situation is that a source tuple is usable again once it has been
> closed for the maximum segment lifetime of two minutes, while the
> firewall still records it as existing for 60 minutes or longer. If the
> port is reused for the same target tuple before the firewall cleans up,
> the connection fails due to firewall interference, which in turn resets
> the activity timeout in the firewall's own table. We understand the
> real issue here is that these firewalls cannot cope with
> standards-compliant port reuse. But this is a workaround for such
> situations and an improvement to the distribution of selected ports.
>
> The proposed solution is to re-select a new random port within the
> remaining range instead of incrementing the port number. This behavior
> is enabled via the new sysctl option "net.ipv4.ip_retry_random_port".
>
> The test run consists of two processes, a client and a server: the
> client connects to the server in a loop and the server sends some bytes
> back. The results we got are promising:
>
> Executed test: Current algorithm
> ephemeral port range: 9000-65499
> simulated selections: 10000000
> retries during simulation: 14197718
> longest retry sequence: 5202
>
> Executed test: Proposed modified algorithm
> ephemeral port range: 9000-65499
> simulated selections: 10000000
> retries during simulation: 3976671
> longest retry sequence: 12
>
> In addition, the generated graphs show that the distribution of source
> ports is more even with the proposed patch.
>
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> Tested-by: Thorsten Toepper <thorsten.toepper@sap.com>
> ---
> .../networking/net_cachelines/netns_ipv4_sysctl.rst | 1 +
> include/net/netns/ipv4.h | 1 +
> net/ipv4/inet_hashtables.c | 7 ++++++-
> net/ipv4/sysctl_net_ipv4.c | 7 +++++++
> 4 files changed, 15 insertions(+), 1 deletion(-)
>
I just noticed I didn't add the following diffs to the patch. Please
keep them in mind, and sorry for the inconvenience.
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index bc9a01606daf..e6ae9400332c 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1610,6 +1610,17 @@ ip_local_reserved_ports - list of comma separated ranges
Default: Empty
+ip_retry_random_port - BOOLEAN
+ Randomize the selection of a new port if a reserved port is hit during
+ automatic port selection instead of incrementing the port number.
+
+ Possible values:
+
+ - 0 (disabled)
+ - 1 (enabled)
+
+ Default: 0 (disabled)
+
ip_unprivileged_port_start - INTEGER
This is a per-namespace sysctl. It defines the first
unprivileged port in the network namespace. Privileged ports
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 5eade7d9e4a2..32ca260701ba 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -828,6 +828,8 @@ static struct ctl_table ipv4_net_table[] = {
.data = &init_net.ipv4.sysctl_ip_retry_random_port,
.mode = 0644,
.proc_handler = proc_dou8vec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
},
{
.procname = "ip_local_reserved_ports",
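
For illustration only, here is a toy userspace model of the two retry
strategies described in the commit message. It is not the kernel code
from inet_hashtables.c and not the test program that produced the
numbers quoted above. Everything in it is an assumption made for the
sketch: the names simulate(), struct sim_result, BLK_START and BLK_LEN
are hypothetical, the busy ports are modelled as one static block of
1000 consecutive ports, selected ports are never marked busy, and the
random retry draws from the whole range rather than from the
"remaining" range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PORT_LOW   9000u
#define PORT_HIGH  65499u
#define RANGE      (PORT_HIGH - PORT_LOW + 1u)

/* Hypothetical busy block: 1000 consecutive offsets that are never free. */
#define BLK_START  100u
#define BLK_LEN    1000u

struct sim_result {
	unsigned long retries;		/* collisions summed over all selections */
	unsigned long longest;		/* longest retry run of a single selection */
	unsigned long hits_after_blk;	/* picks of the first free port after the block */
};

static bool busy[RANGE];
static uint32_t rng_state;

/* xorshift32: small deterministic PRNG, good enough for a toy model. */
static uint32_t rng_next(void)
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 17;
	rng_state ^= rng_state << 5;
	return rng_state;
}

/*
 * Run 'selections' port picks. On a collision with a busy port, either
 * step to the next offset (retry_random == false) or draw a fresh
 * random offset (retry_random == true).
 */
static struct sim_result simulate(bool retry_random, unsigned long selections)
{
	struct sim_result res = { 0, 0, 0 };
	unsigned long i;

	memset(busy, 0, sizeof(busy));
	for (i = 0; i < BLK_LEN; i++)
		busy[BLK_START + i] = true;
	rng_state = 2463534242u;	/* fixed non-zero seed: repeatable runs */

	for (i = 0; i < selections; i++) {
		uint32_t off = rng_next() % RANGE;
		unsigned long run = 0;

		/* Terminates because most of the range is free in this model. */
		while (busy[off]) {
			run++;
			if (retry_random)
				off = rng_next() % RANGE;
			else
				off = (off + 1u) % RANGE;
		}
		assert(!busy[off]);

		res.retries += run;
		if (run > res.longest)
			res.longest = run;
		if (off == BLK_START + BLK_LEN)
			res.hits_after_blk++;
	}
	return res;
}
```

Calling simulate(false, n) and simulate(true, n) with the same n lets
the three counters be compared side by side. In this model the
sequential variant funnels every collision inside the block onto the
first free port after it, which is the skew the commit message
describes. The absolute numbers are not comparable to the results
quoted above.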