netdev.vger.kernel.org archive mirror
* [RFC Patch net-next] tcp: add a global sysctl to control TCP delayed ack
@ 2013-01-16 11:05 Cong Wang
  2013-01-16 12:22 ` David Laight
  0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2013-01-16 11:05 UTC (permalink / raw)
  To: netdev
  Cc: Eric Dumazet, Rick Jones, Stephen Hemminger, David S. Miller,
	Thomas Graf, David Laight, Cong Wang

According to the previous discussion, there seem to be no
reasonable heuristics.

This is similar to the TCP_QUICKACK option, but for people who can't
modify the source code and still want to control
TCP delayed ACK behavior.

Does this make sense?

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Rick Jones <rick.jones2@hp.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Cong Wang <amwang@redhat.com>

---
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 4976564..8fc96f2 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -605,6 +605,11 @@ tcp_challenge_ack_limit - INTEGER
 	in RFC 5961 (Improving TCP's Robustness to Blind In-Window Attacks)
 	Default: 100
 
+tcp_quick_ack - BOOLEAN
+	If enabled, TCP delayed ACK is disabled globally. Applications
+	can still change the quick ACK mode with the TCP_QUICKACK option.
+	Default: off
+
 UDP variables:
 
 udp_mem - vector of 3 INTEGERs: min, pressure, max
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 614af8b..0ba0c26 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -291,6 +291,7 @@ extern int sysctl_tcp_thin_dupack;
 extern int sysctl_tcp_early_retrans;
 extern int sysctl_tcp_limit_output_bytes;
 extern int sysctl_tcp_challenge_ack_limit;
+extern int sysctl_tcp_quick_ack;
 
 extern atomic_long_t tcp_memory_allocated;
 extern struct percpu_counter tcp_sockets_allocated;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index a25e1d2..9b4bb75 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -767,6 +767,13 @@ static struct ctl_table ipv4_table[] = {
 		.extra2		= &two,
 	},
 	{
+		.procname	= "tcp_quick_ack",
+		.data		= &sysctl_tcp_quick_ack,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+	{
 		.procname	= "udp_mem",
 		.data		= &sysctl_udp_mem,
 		.maxlen		= sizeof(sysctl_udp_mem),
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 0905997..3f68482 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -100,6 +100,7 @@ int sysctl_tcp_thin_dupack __read_mostly;
 int sysctl_tcp_moderate_rcvbuf __read_mostly = 1;
 int sysctl_tcp_abc __read_mostly;
 int sysctl_tcp_early_retrans __read_mostly = 2;
+int sysctl_tcp_quick_ack __read_mostly;
 
 #define FLAG_DATA		0x01 /* Incoming frame contained data.		*/
 #define FLAG_WIN_UPDATE		0x02 /* Incoming ACK was a window update.	*/
@@ -4081,7 +4082,8 @@ static void tcp_fin(struct sock *sk)
 	case TCP_ESTABLISHED:
 		/* Move to CLOSE_WAIT */
 		tcp_set_state(sk, TCP_CLOSE_WAIT);
-		inet_csk(sk)->icsk_ack.pingpong = 1;
+		if (!sysctl_tcp_quick_ack)
+			inet_csk(sk)->icsk_ack.pingpong = 1;
 		break;
 
 	case TCP_CLOSE_WAIT:
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 667a6ad..44eff34 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -174,8 +174,9 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
 	/* If it is a reply for ato after last received
 	 * packet, enter pingpong mode.
 	 */
-	if ((u32)(now - icsk->icsk_ack.lrcvtime) < icsk->icsk_ack.ato)
-		icsk->icsk_ack.pingpong = 1;
+	if ((u32)(now - icsk->icsk_ack.lrcvtime) < icsk->icsk_ack.ato &&
+	    !sysctl_tcp_quick_ack)
+		icsk->icsk_ack.pingpong = 1;
 }
 
 /* Account for an ACK we sent. */
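
The tcp_output.c hunk above can be read as a small predicate. The
following is only a userspace sketch of that condition, not kernel code:
the function name enters_pingpong() and the quick_ack parameter are
made up for illustration, with quick_ack standing in for the proposed
sysctl_tcp_quick_ack. The unsigned subtraction mirrors the (u32) cast
in the patch, so the comparison still works when the clock wraps.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model of the patched check in tcp_event_data_sent().
 * now, lrcvtime and ato are in the same (arbitrary) tick unit.
 * quick_ack is a hypothetical stand-in for the proposed sysctl.
 */
bool enters_pingpong(uint32_t now, uint32_t lrcvtime, uint32_t ato,
		     bool quick_ack)
{
	/* Reply sent within ato of the last received packet? */
	bool reply_within_ato = (uint32_t)(now - lrcvtime) < ato;

	/* With the sysctl set, pingpong mode is never entered here. */
	return reply_within_ato && !quick_ack;
}
```

With quick_ack false this behaves like the unpatched code; with
quick_ack true it always returns false, which is the whole effect of
the hunk.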


* RE: [RFC Patch net-next] tcp: add a global sysctl to control TCP delayed ack
  2013-01-16 11:05 [RFC Patch net-next] tcp: add a global sysctl to control TCP delayed ack Cong Wang
@ 2013-01-16 12:22 ` David Laight
  2013-01-17  9:21   ` Cong Wang
  2013-01-17 12:34   ` Thomas Graf
  0 siblings, 2 replies; 5+ messages in thread
From: David Laight @ 2013-01-16 12:22 UTC (permalink / raw)
  To: Cong Wang, netdev
  Cc: Eric Dumazet, Rick Jones, Stephen Hemminger, David S. Miller,
	Thomas Graf

> According to the previous discussion, there seem to be no
> reasonable heuristics.
> 
> This is similar to the TCP_QUICKACK option, but for people who can't
> modify the source code and still want to control
> TCP delayed ACK behavior.
> 
> Does this make sense?

A sysctl is a bit of a big hammer, it probably isn't necessary
to disable delayed acks on all connections.

IIRC the related problems I saw were really on the sending
side when Nagle is disabled and it is doing 'slow start'.

Globally disabling on connections that have Nagle disabled
might be a possibility - but it is the Nagle parameter
at the other end that matters.

Perhaps the sending side, after sending 4 small frames immediately,
could send 1 or 2 additional full sized frames in order to
provoke an ack (IIRC an ack is sent if there are 2 full sized
frames of data unacked).

The other problem is that 'slow start' is restarted very
aggressively - whenever there is no unacked data.
If you have a very low latency connection and aren't doing
continuous bulk transfer it is restarted for every short
burst of transmits - effectively after every received ack.
There really ought to be a moderate idle time
before 'slow start' is restarted.

	David
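
The "2 full sized frames" rule mentioned above can be sketched as a
one-line decision. This is a simplified model of the behavior as
described in this message (which is itself marked IIRC), not the
kernel's actual receive path; the function name ack_immediately() and
its parameters are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified receiver-side model: ACK at once if quick ACK mode is
 * active or at least two full-sized segments of data are unacked,
 * otherwise leave the ACK to the delayed ACK timer.
 */
bool ack_immediately(uint32_t unacked_bytes, uint32_t mss, bool quickack)
{
	return quickack || (uint64_t)unacked_bytes >= 2 * (uint64_t)mss;
}
```

Under this model, four 100-byte frames with an MSS of 1460 leave 400
bytes unacked and do not trigger an ACK, while two additional
full-sized frames do, which is the effect the suggestion above relies
on.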


* RE: [RFC Patch net-next] tcp: add a global sysctl to control TCP delayed ack
  2013-01-16 12:22 ` David Laight
@ 2013-01-17  9:21   ` Cong Wang
  2013-01-17 12:34   ` Thomas Graf
  1 sibling, 0 replies; 5+ messages in thread
From: Cong Wang @ 2013-01-17  9:21 UTC (permalink / raw)
  To: David Laight
  Cc: netdev, Eric Dumazet, Rick Jones, Stephen Hemminger,
	David S. Miller, Thomas Graf

On Wed, 2013-01-16 at 12:22 +0000, David Laight wrote:
> > According to the previous discussion, there seem to be no
> > reasonable heuristics.
> > 
> > This is similar to the TCP_QUICKACK option, but for people who can't
> > modify the source code and still want to control
> > TCP delayed ACK behavior.
> > 
> > Does this make sense?
> 
> A sysctl is a bit of a big hammer, it probably isn't necessary
> to disable delayed acks on all connections.

You mean make this sysctl per-socket? But we don't have per-socket or
per-connection sysctl for networking, do we?

> 
> IIRC the related problems I saw were really on the sending
> side when Nagle is disabled and it is doing 'slow start'.
> 
> Globally disabling on connections that have Nagle disabled
> might be a possibility - but it is the Nagle parameter
> at the other end that matters.
> 
> Perhaps the sending side, after sending 4 small frames immediately,
> could send 1 or 2 additional full sized frames in order to
> provoke an ack (IIRC an ack is sent if there are 2 full sized
> frames of data unacked).
> 
> The other problem is that 'slow start' is restarted very
> aggressively - whenever there is no unacked data.
> If you have a very low latency connection and aren't doing
> continuous bulk transfer it is restarted for every short
> burst of transmits - effectively after every received ack.
> There really ought to be a moderate idle time
> before 'slow start' is restarted.
> 

These situations are not easy at all to detect.

Thanks.


* Re: [RFC Patch net-next] tcp: add a global sysctl to control TCP delayed ack
  2013-01-16 12:22 ` David Laight
  2013-01-17  9:21   ` Cong Wang
@ 2013-01-17 12:34   ` Thomas Graf
  2013-01-17 13:25     ` David Laight
  1 sibling, 1 reply; 5+ messages in thread
From: Thomas Graf @ 2013-01-17 12:34 UTC (permalink / raw)
  To: David Laight
  Cc: Cong Wang, netdev, Eric Dumazet, Rick Jones, Stephen Hemminger,
	David S. Miller

On 01/16/13 at 12:22pm, David Laight wrote:
> A sysctl is a bit of a big hammer, it probably isn't necessary
> to disable delayed acks on all connections.
> 
> IIRC the related problems I saw were really on the sending
> side when Nagle is disabled and it is doing 'slow start'.
> 
> Globally disabling on connections that have Nagle disabled
> might be a possibility - but it is the Nagle parameter
> at the other end that matters.
> 
> Perhaps the sending side, after sending 4 small frames immediately,
> could send 1 or 2 additional full sized frames in order to
> provoke an ack (IIRC an ack is sent if there are 2 full sized
> frames of data unacked).
> 
> The other problem is that 'slow start' is restarted very
> aggressively - whenever there is no unacked data.
> If you have a very low latency connection and aren't doing
> continuous bulk transfer it is restarted for every short
> burst of transmits - effectively after every received ack.
> There really ought to be a moderate idle time
> before 'slow start' is restarted.

Not that I disagree with this fundamentally, but we already
have a socket option to enable the functionality. All this
patch does is make the same functionality available to
users who are not able to modify the application.

We can argue about making it available as route metric
exclusively though.


* RE: [RFC Patch net-next] tcp: add a global sysctl to control TCP delayed ack
  2013-01-17 12:34   ` Thomas Graf
@ 2013-01-17 13:25     ` David Laight
  0 siblings, 0 replies; 5+ messages in thread
From: David Laight @ 2013-01-17 13:25 UTC (permalink / raw)
  To: Thomas Graf
  Cc: Cong Wang, netdev, Eric Dumazet, Rick Jones, Stephen Hemminger,
	David S. Miller

> Not that I disagree with this fundamentally, but we already
> have a socket option to enable the functionality. All this
> patch does is make the same functionality available to
> users who are not able to modify the application.

My reading of TCP_QUICKACK documentation is that it is a request
to send an ack now - rather than permanently disable delayed acks.
Having to do an extra system call after every recv() call
is rather over the top.

Or did you mean some other socket option?

	David
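
For reference, the per-receive pattern objected to above would look
roughly like the sketch below. The helper names set_quickack() and
recv_quickack() are made up for illustration; only setsockopt(),
recv() and TCP_QUICKACK are real interfaces. The Linux tcp(7) man page
describes the flag as not permanent, so an application that wants
quick ACKs throughout has to keep re-arming it, at the cost of one
extra system call per receive.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Ask the kernel to switch this TCP socket to quick ACK mode. */
int set_quickack(int fd)
{
	int one = 1;

	return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &one, sizeof(one));
}

/*
 * recv() wrapper that re-arms TCP_QUICKACK after each successful
 * read, since the kernel may fall back to delayed ACKs on its own.
 * Returns the recv() result, or -1 if re-arming fails.
 */
ssize_t recv_quickack(int fd, void *buf, size_t len)
{
	ssize_t n = recv(fd, buf, len, 0);

	if (n > 0 && set_quickack(fd) < 0)
		return -1;
	return n;
}
```

A global sysctl, as proposed in the patch, would avoid both the source
change and the extra system call, at the price of applying to every
connection.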
