From: Kevin Yang <yyd@google.com>
To: David Miller <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org, ncardwell@google.com, ycheng@google.com,
kerneljasonxing@gmail.com, pabeni@redhat.com,
tonylu@linux.alibaba.com, Kevin Yang <yyd@google.com>
Subject: [PATCH net-next v2 2/2] tcp: add sysctl_tcp_rto_min_us
Date: Thu, 30 May 2024 15:34:36 +0000 [thread overview]
Message-ID: <20240530153436.2202800-3-yyd@google.com> (raw)
In-Reply-To: <20240530153436.2202800-1-yyd@google.com>
Adding a sysctl knob to allow user to specify a default
rto_min at socket init time, other than using the hard
coded 200ms default rto_min.
Note that the rto_min route option has the highest precedence
for configuring this setting, followed by the TCP_BPF_RTO_MIN
socket option, followed by the tcp_rto_min_us sysctl.
Signed-off-by: Kevin Yang <yyd@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
---
Documentation/networking/ip-sysctl.rst | 13 +++++++++++++
include/net/netns/ipv4.h | 1 +
net/ipv4/sysctl_net_ipv4.c | 8 ++++++++
net/ipv4/tcp.c | 4 +++-
net/ipv4/tcp_ipv4.c | 1 +
5 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index bd50df6a5a42..6e99eccdb837 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1196,6 +1196,19 @@ tcp_pingpong_thresh - INTEGER
Default: 1
+tcp_rto_min_us - INTEGER
+ Minimal TCP retransmission timeout (in microseconds). Note that the
+ rto_min route option has the highest precedence for configuring this
+ setting, followed by the TCP_BPF_RTO_MIN socket option, followed by
+ this tcp_rto_min_us sysctl.
+
+ The recommended practice is to use a value less or equal to 200000
+ microseconds.
+
+ Possible Values: 1 - INT_MAX
+
+ Default: 200000
+
UDP variables
=============
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index c356c458b340..a91bb971f901 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -170,6 +170,7 @@ struct netns_ipv4 {
u8 sysctl_tcp_sack;
u8 sysctl_tcp_window_scaling;
u8 sysctl_tcp_timestamps;
+ int sysctl_tcp_rto_min_us;
u8 sysctl_tcp_recovery;
u8 sysctl_tcp_thin_linear_timeouts;
u8 sysctl_tcp_slow_start_after_idle;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index d7892f34a15b..bb64c0ef092d 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1503,6 +1503,14 @@ static struct ctl_table ipv4_net_table[] = {
.proc_handler = proc_dou8vec_minmax,
.extra1 = SYSCTL_ONE,
},
+ {
+ .procname = "tcp_rto_min_us",
+ .data = &init_net.ipv4.sysctl_tcp_rto_min_us,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ONE,
+ },
};
static __net_init int ipv4_sysctl_init_net(struct net *net)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5fa68e7f6ddb..fa43aaacd92b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -420,6 +420,7 @@ void tcp_init_sock(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
+ int rto_min_us;
tp->out_of_order_queue = RB_ROOT;
sk->tcp_rtx_queue = RB_ROOT;
@@ -428,7 +429,8 @@ void tcp_init_sock(struct sock *sk)
INIT_LIST_HEAD(&tp->tsorted_sent_queue);
icsk->icsk_rto = TCP_TIMEOUT_INIT;
- icsk->icsk_rto_min = TCP_RTO_MIN;
+ rto_min_us = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rto_min_us);
+ icsk->icsk_rto_min = usecs_to_jiffies(rto_min_us);
icsk->icsk_delack_max = TCP_DELACK_MAX;
tp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
minmax_reset(&tp->rtt_min, tcp_jiffies32, ~0U);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 041c7eda9abe..49a5e2c4ec18 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -3506,6 +3506,7 @@ static int __net_init tcp_sk_init(struct net *net)
net->ipv4.sysctl_tcp_shrink_window = 0;
net->ipv4.sysctl_tcp_pingpong_thresh = 1;
+ net->ipv4.sysctl_tcp_rto_min_us = jiffies_to_usecs(TCP_RTO_MIN);
return 0;
}
--
2.45.1.288.g0e0cd299f1-goog
prev parent reply other threads:[~2024-05-30 15:35 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-30 15:34 [PATCH net-next v2 0/2] tcp: add sysctl_tcp_rto_min_us Kevin Yang
2024-05-30 15:34 ` [PATCH net-next v2 1/2] tcp: derive delack_max with tcp_rto_min helper Kevin Yang
2024-06-01 13:28 ` Simon Horman
2024-06-01 14:56 ` David Laight
2024-06-03 21:34 ` Kevin Yang
2024-05-30 15:34 ` Kevin Yang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240530153436.2202800-3-yyd@google.com \
--to=yyd@google.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kerneljasonxing@gmail.com \
--cc=kuba@kernel.org \
--cc=ncardwell@google.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=tonylu@linux.alibaba.com \
--cc=ycheng@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).