From: Andi Kleen <andi@firstfloor.org>
To: Chris Snook <csnook@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>, Rick Jones <rick.jones2@hp.com>,
Netdev <netdev@vger.kernel.org>
Subject: Re: RFC: Nagle latency tuning
Date: Tue, 9 Sep 2008 21:07:37 +0200
Message-ID: <20080909190737.GB7714@one.firstfloor.org>
In-Reply-To: <48C6C300.4050102@redhat.com>
> These apps have a love/hate relationship with TCP. They'll probably love
> SCTP 5 years from now, but it's not mature enough for them yet. They do
> want to minimize all latencies,
Then they should just set TCP_NODELAY.
> and many of the apps explicitly set
> TCP_NODELAY.
That's the right thing for them.
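For reference, a minimal sketch of what "set TCP_NODELAY" means for an application. The helper names enable_nodelay() and nodelay_is_set() are made up for illustration; only the setsockopt()/getsockopt() calls and the TCP_NODELAY option itself are standard.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: disable Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_nodelay(int fd)
{
    int one = 1;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}

/* Hypothetical helper: report whether TCP_NODELAY is currently set.
 * Returns 1 if set, 0 if not set, -1 on error. */
int nodelay_is_set(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

An application would typically call enable_nodelay() right after socket() or accept(), before the first write. The option is per socket, so it has to be set on every connection that needs it.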
> The goal here is to improve latencies on the supporting apps
> that aren't quite as carefully optimized as the main message daemons
> themselves. If we can give them a knob that bounds their worst-case
> latency to 2-3 times their average latency, without risking network floods
> that won't show up in testing, they'll be much happier.
Hmm, in theory I don't see a big drawback in making these defaults sysctls,
as in this untested patch. It's probably not the right solution
for this problem, but it's there if you want to experiment. It makes both
the ato default and the delack default tunable. You'll have to restart
sockets for it to take effect.
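To put numbers on the defaults being made tunable: the patch keeps HZ/25 jiffies (or 4 jiffies when HZ < 100) as the default minimum, and TCP_DELACK_MAX stays at HZ/5. The sketch below only redoes that arithmetic in user space; the function names are hypothetical and are not part of the patch.

```c
/* Default minimum from the patch: HZ/25 when HZ >= 100, else 4 jiffies. */
unsigned delack_min_default_jiffies(unsigned hz)
{
    return hz >= 100 ? hz / 25 : 4U;
}

/* TCP_DELACK_MAX from tcp.h: HZ/5 jiffies. */
unsigned delack_max_jiffies(unsigned hz)
{
    return hz / 5;
}

/* Convert a jiffies count to milliseconds for a given HZ. */
unsigned jiffies_to_ms(unsigned jiffies, unsigned hz)
{
    return jiffies * 1000U / hz;
}
```

For HZ of 100, 250 or 1000 the minimum works out to 40 ms and the maximum to 200 ms. One thing to check before experimenting: as far as I know, proc_dointvec_jiffies, which the patch uses as the handler, exchanges values with user space in seconds, so the unit of the new sysctls may be coarser than these defaults suggest.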
-Andi
---
Make ato min and delack min tunable
This might help some programs that have problems with Nagle.
Sockets have to be restarted.
TBD documentation for the new sysctls
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Index: linux-2.6.27-rc4-misc/include/net/tcp.h
===================================================================
--- linux-2.6.27-rc4-misc.orig/include/net/tcp.h
+++ linux-2.6.27-rc4-misc/include/net/tcp.h
@@ -118,12 +118,16 @@ extern void tcp_time_wait(struct sock *s
#define TCP_DELACK_MAX ((unsigned)(HZ/5)) /* maximal time to delay before sending an ACK */
#if HZ >= 100
-#define TCP_DELACK_MIN ((unsigned)(HZ/25)) /* minimal time to delay before sending an ACK */
-#define TCP_ATO_MIN ((unsigned)(HZ/25))
+#define TCP_DELACK_MIN_DEFAULT ((unsigned)(HZ/25)) /* minimal time to delay before sending an ACK */
+#define TCP_ATO_MIN_DEFAULT ((unsigned)(HZ/25))
#else
-#define TCP_DELACK_MIN 4U
-#define TCP_ATO_MIN 4U
+#define TCP_DELACK_MIN_DEFAULT 4U
+#define TCP_ATO_MIN_DEFAULT 4U
#endif
+
+#define TCP_DELACK_MIN sysctl_tcp_delack_min
+#define TCP_ATO_MIN sysctl_tcp_ato_min
+
#define TCP_RTO_MAX ((unsigned)(120*HZ))
#define TCP_RTO_MIN ((unsigned)(HZ/5))
#define TCP_TIMEOUT_INIT ((unsigned)(3*HZ)) /* RFC 1122 initial RTO value */
@@ -236,6 +240,8 @@ extern int sysctl_tcp_base_mss;
extern int sysctl_tcp_workaround_signed_windows;
extern int sysctl_tcp_slow_start_after_idle;
extern int sysctl_tcp_max_ssthresh;
+extern int sysctl_tcp_ato_min;
+extern int sysctl_tcp_delack_min;
extern atomic_t tcp_memory_allocated;
extern atomic_t tcp_sockets_allocated;
Index: linux-2.6.27-rc4-misc/net/ipv4/sysctl_net_ipv4.c
===================================================================
--- linux-2.6.27-rc4-misc.orig/net/ipv4/sysctl_net_ipv4.c
+++ linux-2.6.27-rc4-misc/net/ipv4/sysctl_net_ipv4.c
@@ -717,6 +717,24 @@ static struct ctl_table ipv4_table[] = {
},
{
.ctl_name = CTL_UNNUMBERED,
+ .procname = "tcp_delack_min",
+ .data = &sysctl_tcp_delack_min,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec_jiffies,
+ .strategy = &sysctl_jiffies
+ },
+ {
+ .ctl_name = CTL_UNNUMBERED,
+ .procname = "tcp_ato_min",
+ .data = &sysctl_tcp_ato_min,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec_jiffies,
+ .strategy = &sysctl_jiffies
+ },
+ {
+ .ctl_name = CTL_UNNUMBERED,
.procname = "udp_mem",
.data = &sysctl_udp_mem,
.maxlen = sizeof(sysctl_udp_mem),
Index: linux-2.6.27-rc4-misc/net/ipv4/tcp_timer.c
===================================================================
--- linux-2.6.27-rc4-misc.orig/net/ipv4/tcp_timer.c
+++ linux-2.6.27-rc4-misc/net/ipv4/tcp_timer.c
@@ -29,6 +29,8 @@ int sysctl_tcp_keepalive_intvl __read_mo
int sysctl_tcp_retries1 __read_mostly = TCP_RETR1;
int sysctl_tcp_retries2 __read_mostly = TCP_RETR2;
int sysctl_tcp_orphan_retries __read_mostly;
+int sysctl_tcp_delack_min __read_mostly = TCP_DELACK_MIN_DEFAULT;
+int sysctl_tcp_ato_min __read_mostly = TCP_ATO_MIN_DEFAULT;
static void tcp_write_timer(unsigned long);
static void tcp_delack_timer(unsigned long);
Index: linux-2.6.27-rc4-misc/net/ipv4/tcp_output.c
===================================================================
--- linux-2.6.27-rc4-misc.orig/net/ipv4/tcp_output.c
+++ linux-2.6.27-rc4-misc/net/ipv4/tcp_output.c
@@ -2436,7 +2436,7 @@ void tcp_send_delayed_ack(struct sock *s
* directly.
*/
if (tp->srtt) {
- int rtt = max(tp->srtt >> 3, TCP_DELACK_MIN);
+ int rtt = max_t(unsigned, tp->srtt >> 3, TCP_DELACK_MIN);
if (rtt < max_ato)
max_ato = rtt;
--
ak@linux.intel.com
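The tcp_output.c hunk is the one place in the patch where the new minimum is actually consumed. Below is a user-space sketch of that clamp, assuming (as the `>> 3` in the hunk implies) that the smoothed RTT is stored scaled by 8. The function name clamp_max_ato() and its parameters are hypothetical. The switch from max() to max_t() is presumably needed because TCP_DELACK_MIN is now a signed int variable rather than an unsigned constant; the sketch makes the same unsigned comparison explicit.

```c
/* Sketch of the delayed-ACK clamp in tcp_send_delayed_ack().
 * max_ato:    current upper bound on the ACK timeout, in jiffies
 * srtt8:      smoothed RTT scaled by 8 (0 means no estimate yet)
 * delack_min: the tunable minimum, an int as in the patch */
unsigned clamp_max_ato(unsigned max_ato, unsigned srtt8, int delack_min)
{
    if (srtt8) {
        unsigned rtt = srtt8 >> 3;

        /* Same effect as max_t(unsigned, ...): compare as unsigned. */
        if (rtt < (unsigned)delack_min)
            rtt = (unsigned)delack_min;
        if (rtt < max_ato)
            max_ato = rtt;
    }
    return max_ato;
}
```

With max_ato = 200 and delack_min = 40, a short RTT (srtt8 = 80, i.e. 10 jiffies) yields 40, a moderate one (srtt8 = 800) yields 100, and a long one (srtt8 = 4000) leaves the bound at 200. Lowering delack_min therefore only tightens the bound on paths whose RTT is below the old minimum.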