* [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Hagen Paul Pfeifer @ 2010-08-23 19:00 UTC (permalink / raw)
To: netdev
Cc: Hagen Paul Pfeifer, David S. Miller, Eric Dumazet,
Ilpo Järvinen
The TCP quick ACK mechanism analyzes whether a connection is interactive or
not. By default the quick ACK mechanism is enabled and ACK packets are
sent immediately to raise the CWND quickly, which is sensible for
bulk data (non-interactive) flows. Interactive protocols like HTTP, SMTP
or XMPP, on the other hand, suffer from the quick ACK mechanism
because one additional packet is generated. A simple heuristic
detects whether a connection is interactive (pingpong) and, if so,
disables quick ACK. But the mechanism cannot
blindly guess whether a connection is interactive, so it must wait for at
least one return packet with payload.
For the server side this costs one additional packet (packets
number 5 and 6 could be combined):
192.168.1.35.44833 > 78.47.222.210.80: Flags [S], seq 2854340018, win 5840, options [mss 1460,sackOK,TS val 4382726 ecr 0,nop,wscale 7], length 0
78.47.222.210.80 > 192.168.1.35.44833: Flags [S.], seq 719041385, ack 2854340019, win 5792, options [mss 1452,sackOK,TS val 2606891996 ecr 4382726,nop,wscale 7], length 0
192.168.1.35.44833 > 78.47.222.210.80: Flags [.], ack 1, win 46, options [nop,nop,TS val 4382730 ecr 2606891996], length 0
192.168.1.35.44833 > 78.47.222.210.80: Flags [P.], seq 1:682, ack 1, win 46, options [nop,nop,TS val 4382730 ecr 2606891996], length 681
78.47.222.210.80 > 192.168.1.35.44833: Flags [.], ack 682, win 56, options [nop,nop,TS val 2606892002 ecr 4382730], length 0
78.47.222.210.80 > 192.168.1.35.44833: Flags [.], seq 1:1441, ack 682, win 56, options [nop,nop,TS val 2606892002 ecr 4382730], length 1440
192.168.1.35.44833 > 78.47.222.210.80: Flags [.], ack 1441, win 69, options [nop,nop,TS val 4382737 ecr 2606892002], length 0
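As a rough illustration of the server side of the trace above, here is a small userspace model in C. It is only a sketch, not kernel code: struct ack_state, ack_immediately() and server_segments_for_request() are made-up names, the quick budget of 16 is an arbitrary value, and the model assumes the application replies before the delayed ACK timer would fire. Only the initial pingpong assignment follows the patch.

```c
#include <stdbool.h>

/* Toy stand-in for the two delayed-ACK fields that matter here (hypothetical, not the kernel struct). */
struct ack_state {
	unsigned int quick;	/* remaining quick ACKs the receiver may still send */
	bool pingpong;		/* true once the flow is treated as interactive */
};

/* ACK at once only while in quick mode and not flagged as pingpong. */
static bool ack_immediately(const struct ack_state *s)
{
	return s->quick && !s->pingpong;
}

/*
 * Count the segments a server emits for one request that it answers with a
 * single data segment: either a bare ACK plus data, or data carrying the ACK.
 * Assumption: the reply is ready before the delayed ACK timer expires.
 */
static int server_segments_for_request(int sysctl_tcp_quickack)
{
	struct ack_state s = {
		.quick = 16,	/* arbitrary illustrative budget */
		.pingpong = sysctl_tcp_quickack ? false : true,	/* as in the patch */
	};
	int segments = 0;

	if (ack_immediately(&s))
		segments++;	/* bare ACK, packet 5 in the trace */
	segments++;		/* response data, packet 6; carries the ACK if it was delayed */
	return segments;
}
```

In this model server_segments_for_request(1) yields 2 (bare ACK plus data) and server_segments_for_request(0) yields 1 (ACK piggybacked on the data segment).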
This patch provides a sysctl interface for the administrator to globally
enable or disable TCP quick ACKs. Short-lived protocols like HTTP will
save a non-negligible share of packets!
Disable TCP Quick ACK:
$ echo 0 > /proc/sys/net/ipv4/tcp_quickack
Enable TCP Quick ACK:
$ echo 1 > /proc/sys/net/ipv4/tcp_quickack
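For tools that prefer not to shell out, the same toggle could be driven from C along these lines. This is a minimal sketch: write_int_sysctl() and read_int_sysctl() are hypothetical helper names, and /proc/sys/net/ipv4/tcp_quickack only exists on a kernel carrying this patch.

```c
#include <stdio.h>

/* Write an integer plus newline to a sysctl-style file; 0 on success, -1 on error. */
static int write_int_sysctl(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", value) > 0;
	if (fclose(f) != 0)	/* a failed flush also counts as an error */
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the integer back into *value; 0 on success, -1 on error. */
static int read_int_sysctl(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%d", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would pass the path from the examples above, e.g. write_int_sysctl("/proc/sys/net/ipv4/tcp_quickack", 0), and needs sufficient privileges to write the file (mode 0644 in the patch).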
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
 include/net/inet_connection_sock.h |    3 +++
 net/ipv4/sysctl_net_ipv4.c         |    7 +++++++
 net/ipv4/tcp.c                     |    2 ++
 3 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index b6d3b55..da2fbaf 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -170,9 +170,12 @@ static inline int inet_csk_ack_scheduled(const struct sock *sk)
 	return inet_csk(sk)->icsk_ack.pending & ICSK_ACK_SCHED;
 }
 
+extern int sysctl_tcp_quickack;
+
 static inline void inet_csk_delack_init(struct sock *sk)
 {
 	memset(&inet_csk(sk)->icsk_ack, 0, sizeof(inet_csk(sk)->icsk_ack));
+	inet_csk(sk)->icsk_ack.pingpong = sysctl_tcp_quickack ? 0 : 1;
 }
 
 extern void inet_csk_delete_keepalive_timer(struct sock *sk);
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index d96c1da..8923ca8 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -394,6 +394,13 @@ static struct ctl_table ipv4_table[] = {
 		.proc_handler	= proc_dointvec
 	},
 	{
+		.procname	= "tcp_quickack",
+		.data		= &sysctl_tcp_quickack,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
+	{
 		.procname	= "tcp_mem",
 		.data		= &sysctl_tcp_mem,
 		.maxlen		= sizeof(sysctl_tcp_mem),
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 176e11a..5161689 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -279,6 +279,8 @@
 
 int sysctl_tcp_fin_timeout __read_mostly = TCP_FIN_TIMEOUT;
 
+int sysctl_tcp_quickack __read_mostly = 1;
+
 struct percpu_counter tcp_orphan_count;
 EXPORT_SYMBOL_GPL(tcp_orphan_count);
 
--
1.7.2.1.95.g3d045.dirty
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Stephen Hemminger @ 2010-08-23 19:14 UTC (permalink / raw)
To: Hagen Paul Pfeifer
Cc: netdev, David S. Miller, Eric Dumazet, Ilpo Järvinen
On Mon, 23 Aug 2010 21:00:37 +0200
Hagen Paul Pfeifer <hagen@jauu.net> wrote:
> The TCP quick ACK mechanism analyze if a connection is interactive or
> not. Per default the quick ACK mechanism is enabled and ACK packets are
> triggered instantly to raise the CWND fast - which is clever for
> bulk data (non-interactive) flows. On the other hand interactive protocols
> like HTTP, SMTP or XMPP will suffer from the quick ACK mechanism
> because one additional packets is generated. A simple heuristic
> detects if a connection is interactive (pingpong) and if so,
> disable the quick ACK. But, the mechanism is not in the ability to
> blindly guess if a connection is interactive, and so it must wait for at
> least one return packet with payload.
>
> For the server side this requires one additional packet because (packet
> number 5 and 6 can be combined):
>
> 192.168.1.35.44833 > 78.47.222.210.80: Flags [S], seq 2854340018, win 5840, options [mss 1460,sackOK,TS val 4382726 ecr 0,nop,wscale 7], length 0
> 78.47.222.210.80 > 192.168.1.35.44833: Flags [S.], seq 719041385, ack 2854340019, win 5792, options [mss 1452,sackOK,TS val 2606891996 ecr 4382726,nop,wscale 7], length 0
> 192.168.1.35.44833 > 78.47.222.210.80: Flags [.], ack 1, win 46, options [nop,nop,TS val 4382730 ecr 2606891996], length 0
> 192.168.1.35.44833 > 78.47.222.210.80: Flags [P.], seq 1:682, ack 1, win 46, options [nop,nop,TS val 4382730 ecr 2606891996], length 681
> 78.47.222.210.80 > 192.168.1.35.44833: Flags [.], ack 682, win 56, options [nop,nop,TS val 2606892002 ecr 4382730], length 0
> 78.47.222.210.80 > 192.168.1.35.44833: Flags [.], seq 1:1441, ack 682, win 56, options [nop,nop,TS val 2606892002 ecr 4382730], length 1440
> 192.168.1.35.44833 > 78.47.222.210.80: Flags [.], ack 1441, win 69, options [nop,nop,TS val 4382737 ecr 2606892002], length 0
>
> This patch provides a sysctl interface for the administrator to globally
> enable or disable TCP quick ACKs. Short lived protocols like HTTP will
> save a non unimportant portion of packets!
If this is configurable (I am still not sure about having yet more
TCP knobs), it should either be per-socket or a route metric so it can
be controlled on a per-path basis.
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Hagen Paul Pfeifer @ 2010-08-23 19:57 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev, David S. Miller, Eric Dumazet, Ilpo Järvinen
* Stephen Hemminger | 2010-08-23 12:14:49 [-0700]:
>If this is configurable (still not sure about having yet more
>TCP knobs). It should either be per-socket or a route metric so it can
>be controlled on a per-path basis.
Hello Stephen,
I thought about this too. But IMHO it makes no sense, because the interactive/bulk
characteristic does not depend on the "path"; rather, it depends on the
application level. Furthermore, how should an administrator configure this on a
per-path basis? The administrator knows that he runs a web server and
SHOULD then disable quick ACK, and everything is fine; think of a typical server
setup.
The only worthwhile alternative is to disable TCP quick ACK entirely for the
_first_ server ACK, so that the standard delayed ACK is active and
the mechanism can detect whether the flow is interactive or not. The drawback is
that for bulk data flows the first ACK is "artificially" delayed. This is the
superior solution IMHO.
Anyway, I think that for most server setups these days (HTTP, SMTP, ...) the
TCP quick ACK mechanism is counterproductive. Plain bulk data transfer
protocols are rarer these days.
Hagen
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Hagen Paul Pfeifer @ 2010-08-23 20:08 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev, David S. Miller, Eric Dumazet, Ilpo Järvinen,
Tom Herbert
* Stephen Hemminger | 2010-08-23 12:14:49 [-0700]:
>If this is configurable (still not sure about having yet more
>TCP knobs). It should either be per-socket or a route metric so it can
>be controlled on a per-path basis.
BTW: quick ACKs are still configurable on a per-socket basis. But sometimes you
have no access to the code, the maintainer is busy, or whatever ...
At least Google Inc. is too busy to disable TCP quick ACKs. And I think they would
save a lot of packets! Who wants to bet? ;)
IP 192.168.1.35.45632 > 209.85.135.99.80: Flags [S], seq 4207702225, win 5840, options [mss 1460,sackOK,TS val 1669166 ecr 0,nop,wscale 7], length 0
IP 209.85.135.99.80 > 192.168.1.35.45632: Flags [S.], seq 263349349, ack 4207702226, win 5672, options [mss 1430,sackOK,TS val 397205192 ecr 1669166,nop,wscale 6], length 0
IP 192.168.1.35.45632 > 209.85.135.99.80: Flags [.], ack 1, win 46, options [nop,nop,TS val 1669168 ecr 397205192], length 0
IP 192.168.1.35.45632 > 209.85.135.99.80: Flags [P.], seq 1:925, ack 1, win 46, options [nop,nop,TS val 1669168 ecr 397205192], length 924
IP 209.85.135.99.80 > 192.168.1.35.45632: Flags [.], ack 925, win 118, options [nop,nop,TS val 397205208 ecr 1669168], length 0
IP 209.85.135.99.80 > 192.168.1.35.45632: Flags [P.], seq 1:510, ack 925, win 118, options [nop,nop,TS val 397205221 ecr 1669168], length 509
Hagen
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Eric Dumazet @ 2010-08-23 20:44 UTC (permalink / raw)
To: Hagen Paul Pfeifer; +Cc: netdev, David S. Miller, Ilpo Järvinen
Le lundi 23 août 2010 à 21:00 +0200, Hagen Paul Pfeifer a écrit :
> The TCP quick ACK mechanism analyze if a connection is interactive or
> not. Per default the quick ACK mechanism is enabled and ACK packets are
> triggered instantly to raise the CWND fast - which is clever for
> bulk data (non-interactive) flows. On the other hand interactive protocols
> like HTTP, SMTP or XMPP will suffer from the quick ACK mechanism
> because one additional packets is generated. A simple heuristic
> detects if a connection is interactive (pingpong) and if so,
> disable the quick ACK. But, the mechanism is not in the ability to
> blindly guess if a connection is interactive, and so it must wait for at
> least one return packet with payload.
>
> For the server side this requires one additional packet because (packet
> number 5 and 6 can be combined):
>
> 192.168.1.35.44833 > 78.47.222.210.80: Flags [S], seq 2854340018, win 5840, options [mss 1460,sackOK,TS val 4382726 ecr 0,nop,wscale 7], length 0
> 78.47.222.210.80 > 192.168.1.35.44833: Flags [S.], seq 719041385, ack 2854340019, win 5792, options [mss 1452,sackOK,TS val 2606891996 ecr 4382726,nop,wscale 7], length 0
> 192.168.1.35.44833 > 78.47.222.210.80: Flags [.], ack 1, win 46, options [nop,nop,TS val 4382730 ecr 2606891996], length 0
> 192.168.1.35.44833 > 78.47.222.210.80: Flags [P.], seq 1:682, ack 1, win 46, options [nop,nop,TS val 4382730 ecr 2606891996], length 681
> 78.47.222.210.80 > 192.168.1.35.44833: Flags [.], ack 682, win 56, options [nop,nop,TS val 2606892002 ecr 4382730], length 0
> 78.47.222.210.80 > 192.168.1.35.44833: Flags [.], seq 1:1441, ack 682, win 56, options [nop,nop,TS val 2606892002 ecr 4382730], length 1440
> 192.168.1.35.44833 > 78.47.222.210.80: Flags [.], ack 1441, win 69, options [nop,nop,TS val 4382737 ecr 2606892002], length 0
>
> This patch provides a sysctl interface for the administrator to globally
> enable or disable TCP quick ACKs. Short lived protocols like HTTP will
> save a non unimportant portion of packets!
>
> Disable TCP Quick ACK:
> $ echo 0 > /proc/sys/net/ipv4/tcp_quickack
>
> Enable TCP Quick ACK:
> $ echo 1 > /proc/sys/net/ipv4/tcp_quickack
>
> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> ---
> include/net/inet_connection_sock.h | 3 +++
> net/ipv4/sysctl_net_ipv4.c | 7 +++++++
> net/ipv4/tcp.c | 2 ++
> 3 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
> index b6d3b55..da2fbaf 100644
> --- a/include/net/inet_connection_sock.h
> +++ b/include/net/inet_connection_sock.h
> @@ -170,9 +170,12 @@ static inline int inet_csk_ack_scheduled(const struct sock *sk)
> return inet_csk(sk)->icsk_ack.pending & ICSK_ACK_SCHED;
> }
>
> +extern int sysctl_tcp_quickack;
> +
> static inline void inet_csk_delack_init(struct sock *sk)
> {
> memset(&inet_csk(sk)->icsk_ack, 0, sizeof(inet_csk(sk)->icsk_ack));
> + inet_csk(sk)->icsk_ack.pingpong = sysctl_tcp_quickack ? 0 : 1;
What about dccp using this function ?
> }
>
> extern void inet_csk_delete_keepalive_timer(struct sock *sk);
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index d96c1da..8923ca8 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -394,6 +394,13 @@ static struct ctl_table ipv4_table[] = {
> .proc_handler = proc_dointvec
> },
> {
> + .procname = "tcp_quickack",
> + .data = &sysctl_tcp_quickack,
> + .maxlen = sizeof(int),
> + .mode = 0644,
> + .proc_handler = proc_dointvec
> + },
> + {
> .procname = "tcp_mem",
> .data = &sysctl_tcp_mem,
> .maxlen = sizeof(sysctl_tcp_mem),
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 176e11a..5161689 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -279,6 +279,8 @@
>
> int sysctl_tcp_fin_timeout __read_mostly = TCP_FIN_TIMEOUT;
>
> +int sysctl_tcp_quickack __read_mostly = 1;
> +
> struct percpu_counter tcp_orphan_count;
> EXPORT_SYMBOL_GPL(tcp_orphan_count);
>
So here is a new undocumented setting ? hint hint...
I thought setsockopt(TCP_QUICKACK) was already available for this
optimization ?
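For reference, the existing per-socket control looks roughly like this from userspace. A minimal sketch, assuming Linux and an already created TCP socket; set_tcp_quickack() and get_tcp_quickack() are hypothetical wrapper names, while TCP_QUICKACK itself is the documented socket option. Per tcp(7) the flag is not permanent: the stack may enter or leave quickack mode again on its own, so an application that depends on it typically re-applies it.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable (non-zero) or disable (0) quick ACK mode on one TCP socket; Linux-specific. */
static int set_tcp_quickack(int fd, int enable)
{
	int val = enable ? 1 : 0;

	return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &val, sizeof(val));
}

/* Read the current setting back; returns -1 on error, otherwise the option value. */
static int get_tcp_quickack(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &val, &len) != 0)
		return -1;
	return val;
}
```

A server would call set_tcp_quickack(fd, 0) on each accepted connection, which is exactly the application change that is not always possible when the code is out of the administrator's hands.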
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Hagen Paul Pfeifer @ 2010-08-23 20:49 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, David S. Miller, Ilpo Järvinen
* Eric Dumazet | 2010-08-23 22:44:03 [+0200]:
>> static inline void inet_csk_delack_init(struct sock *sk)
>> {
>> memset(&inet_csk(sk)->icsk_ack, 0, sizeof(inet_csk(sk)->icsk_ack));
>> + inet_csk(sk)->icsk_ack.pingpong = sysctl_tcp_quickack ? 0 : 1;
>
>What about dccp using this function ?
I will check this and re-spin the patch if necessary. Thank you Eric, I missed
DCCP.
>So here is a new undocumented setting ? hint hint...
>
>I thought setsockopt(TCP_QUICKACK) was already available for this
>optimization ?
... and it is still available. ;-) See my other post answering Stephen's email.
Hagen
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Chris Snook @ 2010-08-23 21:10 UTC (permalink / raw)
To: Eric Dumazet
Cc: Hagen Paul Pfeifer, netdev, David S. Miller, Ilpo Järvinen,
acme, Stephen Hemminger
Answering two of you...
On Mon, Aug 23, 2010 at 4:44 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le lundi 23 août 2010 à 21:00 +0200, Hagen Paul Pfeifer a écrit :
>> This patch provides a sysctl interface for the administrator to globally
>> enable or disable TCP quick ACKs. Short lived protocols like HTTP will
>> save a non unimportant portion of packets!
A year and a half ago, we merged a patch to tune the delayed ack
behavior. While the proof of concept was a sysctl interface that a
relative novice to the TCP code like myself could write, the consensus
was that we already had a glut of TCP sysctls, and there was a
potential benefit to doing it on a per-route basis, so we could both
make the feature more flexible and avoid sysctl pollution by making it
a per-route tunable. I think all of the same arguments apply to this
feature, as well as the argument that it surely makes sense to be
tuning delayed ack and quick ack in the same place. I'm CCing acme,
because he wrote the final patch.
> What about dccp using this function ?
Also probably of interest to acme.
> I thought setsockopt(TCP_QUICKACK) was already available for this
> optimization ?
As with the delayed ack patch, modifying the application is not always
practical. It's extremely helpful for an administrator to be able to
turn this on with a few keystrokes when performance starts suffering,
since patching the app (or multiple apps), validating the changes,
etc. could take months or years.
-- Chris
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: David Miller @ 2010-08-23 21:21 UTC (permalink / raw)
To: hagen; +Cc: shemminger, netdev, eric.dumazet, ilpo.jarvinen, therbert
From: Hagen Paul Pfeifer <hagen@jauu.net>
Date: Mon, 23 Aug 2010 22:08:20 +0200
> At least google inc is too busy to disable tcp quick ack's. And I
> think they will save a lot of packets! Who make a bet? ;)
I think assuming that turning off quick ACKs will be a net
positive for google's traffic is a bit presumptuous, don't
you?
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Hagen Paul Pfeifer @ 2010-08-23 21:51 UTC (permalink / raw)
To: David Miller; +Cc: shemminger, netdev, eric.dumazet, ilpo.jarvinen, therbert
* David Miller | 2010-08-23 14:21:14 [-0700]:
>> At least google inc is too busy to disable tcp quick ack's. And I
>> think they will save a lot of packets! Who make a bet? ;)
>
>I think assuming that turning off quick ACKs will be a net
>positive for google's traffic is a bit presumptuous, don't
>you?
Bet? ;) The patch provides a switch to disable quick ACKs. Whether Google or
someone else uses this switch is up to them! ;)
I don't know how many packets the current average HTTP session includes, but
assuming ~20 packets, saving one packet is a benefit.
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Hagen Paul Pfeifer @ 2010-08-23 22:01 UTC (permalink / raw)
To: Chris Snook
Cc: Eric Dumazet, netdev, David S. Miller, Ilpo Järvinen, acme,
Stephen Hemminger
* Chris Snook | 2010-08-23 17:10:19 [-0400]:
>A year and a half ago, we merged a patch to tune the delayed ack
>behavior. While the proof of concept was a sysctl interface that a
>relative novice to the TCP code like myself could write, the consensus
>was that we already had a glut of TCP sysctls, and there was a
>potential benefit to doing it on a per-route basis, so we could both
>make the feature more flexible and avoid sysctl pollution by making it
>a per-route tunable. I think all of the same arguments apply to this
>feature, as well as the argument that it surely makes sense to be
>tuning delayed ack and quick ack in the same place. I'm CCing acme,
>because he wrote the final patch.
Chris, I don't support the argument for doing this on a per-path basis. Why? I
mean, it makes sense for RTT, IW and other variables. But quick ACK is an
application-level matter and makes no sense at a per-path level. And yes, there are
too many sysctl knobs, but should we restrict ourselves because of this argument?
I mean, there are some knobs which are more _special_ than this knob.
The best mechanism is to detect this automatically, but that is impossible if the
server has had no chance to reply. Hence the idea of disabling the quick ACK
mechanism for the _first_ ACK packet, analyzing the flow and categorizing it as
bulk or interactive. But this is another topic and not trivial.
Hagen
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Chris Snook @ 2010-08-23 22:04 UTC (permalink / raw)
To: Hagen Paul Pfeifer
Cc: David Miller, shemminger, netdev, eric.dumazet, ilpo.jarvinen,
therbert
On Mon, Aug 23, 2010 at 5:51 PM, Hagen Paul Pfeifer <hagen@jauu.net> wrote:
> * David Miller | 2010-08-23 14:21:14 [-0700]:
>
>>> At least google inc is too busy to disable tcp quick ack's. And I
>>> think they will save a lot of packets! Who make a bet? ;)
>>
>>I think assuming that turning off quick ACKs will be a net
>>positive for google's traffic is a bit presumptuous, don't
>>you?
>
> bet? ;) The patch provides a switch to disable quick acks. If google or
> someone else is using this switch or not is up to the audience! ;)
>
> I don't know how many packets the current average HTTP session includes, but
> assuming ~20 packets the saving of one packet is a benefit.
The user experience with HTTP is very sensitive to latency. Since the
payload packets in HTTP are large, HTTP workloads generally aren't
generating enough packets that packet count is something that needs to
be optimized. I'm sure there are exceptions, like web servers
delivering small AJAX or JSON objects, but they're the exception, not
the norm.
That said, I still like the idea of this feature, but I think it
should be a per-route tunable, for consistency with the delayed ack
patch.
-- Chris
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: David Miller @ 2010-08-23 22:16 UTC (permalink / raw)
To: hagen; +Cc: shemminger, netdev, eric.dumazet, ilpo.jarvinen, therbert
From: Hagen Paul Pfeifer <hagen@jauu.net>
Date: Mon, 23 Aug 2010 23:51:15 +0200
> I don't know how many packets the current average HTTP session
> includes, but assuming ~20 packets the saving of one packet is a
> benefit.
Assuming no packet loss.
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Chris Snook @ 2010-08-23 22:19 UTC (permalink / raw)
To: Hagen Paul Pfeifer
Cc: Eric Dumazet, netdev, David S. Miller, Ilpo Järvinen, acme,
Stephen Hemminger
On Mon, Aug 23, 2010 at 6:01 PM, Hagen Paul Pfeifer <hagen@jauu.net> wrote:
> * Chris Snook | 2010-08-23 17:10:19 [-0400]:
>
>>A year and a half ago, we merged a patch to tune the delayed ack
>>behavior. While the proof of concept was a sysctl interface that a
>>relative novice to the TCP code like myself could write, the consensus
>>was that we already had a glut of TCP sysctls, and there was a
>>potential benefit to doing it on a per-route basis, so we could both
>>make the feature more flexible and avoid sysctl pollution by making it
>>a per-route tunable. I think all of the same arguments apply to this
>>feature, as well as the argument that it surely makes sense to be
>>tuning delayed ack and quick ack in the same place. I'm CCing acme,
>>because he wrote the final patch.
>
> Chris, but I don't support the argument to do this on a per path basis. Why? I
> mean it makes sense for RTT, IW and other variables. But quick ack is at
> application level and it makes no sense at a per path level. And yes there are
> too many sysctl knobs but should we restrict ourself because of this argument?
> I mean there are some knobs which are more _special_ then this knob.
>
> The best mechanism is to automatically detect this but it is impossible if the
> server had no change to reply. Therefore the idea of disabling the quick ack
> mechanism for the _first_ ACK packet, analyze the flow and categorize to bulk
> or interactive. But this is another topic and not trivial.
Just because we've allowed stupid TCP sysctls in the past does not
mean we should continue to do so now. We recently made delayed ack a
per-route tunable, so consistency would suggest we do the same here.
Per-route tunables are more flexible, and as with the delayed ack
patch, there are use cases where that granularity gives a clear
advantage over a sysctl. For example, you may want to disable quick
ack on a high-MTU path and enable it on a low-MTU path.
If you need a hint for how to implement the per-route tunable, look
for the delayed ack patch from early 2009.
-- Chris
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: David Miller @ 2010-08-23 22:23 UTC (permalink / raw)
To: chris.snook; +Cc: hagen, eric.dumazet, netdev, ilpo.jarvinen, acme, shemminger
From: Chris Snook <chris.snook@gmail.com>
Date: Mon, 23 Aug 2010 18:19:45 -0400
> Just because we've allowed stupid TCP sysctls in the past does not
> mean we should continue to do so now. We recently made delayed ack a
> per-route tunable, so consistency would suggest we do the same here.
> Per-route tunables are more flexible, and as with the delayed ack
> patch, there are use cases where that granularity gives a clear
> advantage over a sysctl. For example, you may want to disable quick
> ack on a high-MTU path and enable it on a low-MTU path.
>
> If you need a hint for how to implement the per-route tunable, look
> for the delayed ack patch from early 2009.
I completely agree with Chris that this should be a per-route rather
than a global sysctl tunable.
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Hagen Paul Pfeifer @ 2010-08-23 22:26 UTC (permalink / raw)
To: David Miller
Cc: chris.snook, eric.dumazet, netdev, ilpo.jarvinen, acme,
shemminger
* David Miller | 2010-08-23 15:23:30 [-0700]:
>I completely agree with Chris that this should be a per-route rather
>than a global sysctl tunable.
OK, then I will follow Chris and re-spin the patch!
HGN
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Arnaldo Carvalho de Melo @ 2010-08-23 23:17 UTC (permalink / raw)
To: David Miller
Cc: chris.snook, hagen, eric.dumazet, netdev, ilpo.jarvinen,
shemminger
Em Mon, Aug 23, 2010 at 03:23:30PM -0700, David Miller escreveu:
> From: Chris Snook <chris.snook@gmail.com>
> Date: Mon, 23 Aug 2010 18:19:45 -0400
>
> > Just because we've allowed stupid TCP sysctls in the past does not
> > mean we should continue to do so now. We recently made delayed ack a
> > per-route tunable, so consistency would suggest we do the same here.
> > Per-route tunables are more flexible, and as with the delayed ack
> > patch, there are use cases where that granularity gives a clear
> > advantage over a sysctl. For example, you may want to disable quick
> > ack on a high-MTU path and enable it on a low-MTU path.
> >
> > If you need a hint for how to implement the per-route tunable, look
> > for the delayed ack patch from early 2009.
>
> I completely agree with Chris that this should be a per-route rather
> than a global sysctl tunable.
My first impression was not so strong as to participate, if every
tunable gets a knob, well, we'd be flying concordes in no time.
But even with that reaction, I thought that if a tunable would be
interesting to have, it would be a setsockopt one, which knowledgeable,
performance/latency-hungry actors would jump on as if they were really
hungry.
And yes, that knob I worked on got lost along the way, I guess I have to
think again about it and submit.
- Arnaldo
* Re: [PATCH] tcp: make TCP quick ACK behavior modifiable
From: Arnaldo Carvalho de Melo @ 2010-08-23 23:18 UTC (permalink / raw)
To: David Miller
Cc: chris.snook, hagen, eric.dumazet, netdev, ilpo.jarvinen,
shemminger
Em Mon, Aug 23, 2010 at 08:17:26PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Aug 23, 2010 at 03:23:30PM -0700, David Miller escreveu:
> > From: Chris Snook <chris.snook@gmail.com>
> > Date: Mon, 23 Aug 2010 18:19:45 -0400
> >
> > > Just because we've allowed stupid TCP sysctls in the past does not
> > > mean we should continue to do so now. We recently made delayed ack a
> > > per-route tunable, so consistency would suggest we do the same here.
> > > Per-route tunables are more flexible, and as with the delayed ack
> > > patch, there are use cases where that granularity gives a clear
> > > advantage over a sysctl. For example, you may want to disable quick
> > > ack on a high-MTU path and enable it on a low-MTU path.
> > >
> > > If you need a hint for how to implement the per-route tunable, look
> > > for the delayed ack patch from early 2009.
> >
> > I completely agree with Chris that this should be a per-route rather
> > than a global sysctl tunable.
>
> My first impression was not so strong as to participate, if every
> tunable gets a knob, well, we'd be flying concordes in no time.
Gack, s/tunable/heuristic/g
> But even with such reaction, I thought that if a tunable would be
> interesting to have would be a setsockopt one, that knowledgeable,
> performance/latency hungry actors would jump into as if they were really
> hungry.
>
> And yes, that knob I worked on got lost along the way, I guess I have to
> think again about it and submit.
>
> - Arnaldo