[net-next 13/13] tcp: disable fack by default

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Yuchung Cheng <ycheng@google.com>
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, edumazet@google.com,
	ncardwell@google.com, nanditad@google.com,
	Yuchung Cheng <ycheng@google.com>
Subject: [net-next 13/13] tcp: disable fack by default
Date: Thu, 12 Jan 2017 22:03:54 -0800	[thread overview]
Message-ID: <20170113060354.85234-14-ycheng@google.com> (raw)
In-Reply-To: <20170113060354.85234-1-ycheng@google.com>

This patch disables FACK by default as RACK is the successor of FACK
(inspired by the insights behind FACK).

FACK[1] in Linux works as follows: a packet P is deemed lost,
if packet Q of higher sequence is s/acked and P and Q are distant
by at least dupthresh number of packets in sequence space.

FACK is more aggressive than the IETF recommened recovery for SACK
(RFC3517 A Conservative Selective Acknowledgment (SACK)-based Loss
 Recovery Algorithm for TCP), because a single SACK may trigger
fast recovery. This obviously won't work well with reordering so
FACK is dynamically disabled upon detecting reordering.

RACK supersedes FACK by using time distance instead of sequence
distance. On reordering, RACK waits for a quarter of RTT receiving
a single SACK before starting recovery. (the timer can be made more
adaptive in the future by measuring reordering distance in time,
but currently RTT/4 seem to work well.) Once the recovery starts,
RACK behaves almost like FACK because it reduces the reodering
window to 1ms, so it fast retransmits quickly. In addition RACK
can detect loss retransmission as it does not care about the packet
sequences (being repeated or not), which is extremely useful when
the connection is going through a traffic policer.

Google server experiments indicate that disabling FACK after enabling
RACK has negligible impact on the overall loss recovery performance
with more reordering events detected.  But we still keep the FACK
implementation for backup if RACK has bugs that needs to be disabled.

[1] M. Mathis, J. Mahdavi, "Forward Acknowledgment: Refining
TCP Congestion Control," In Proceedings of SIGCOMM '96, August 1996.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 39ebc20ca1b2..1a34e9278c07 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -79,7 +79,7 @@
 int sysctl_tcp_timestamps __read_mostly = 1;
 int sysctl_tcp_window_scaling __read_mostly = 1;
 int sysctl_tcp_sack __read_mostly = 1;
-int sysctl_tcp_fack __read_mostly = 1;
+int sysctl_tcp_fack __read_mostly;
 int sysctl_tcp_max_reordering __read_mostly = 300;
 int sysctl_tcp_dsack __read_mostly = 1;
 int sysctl_tcp_app_win __read_mostly = 31;
@@ -2114,7 +2114,8 @@ static inline int tcp_dupack_heuristics(const struct tcp_sock *tp)
  *		dynamically measured and adjusted. This is implemented in
  *		tcp_rack_mark_lost.
  *
- *		FACK: it is the simplest heuristics. As soon as we decided
+ *		FACK (Disabled by default. Subsumbed by RACK):
+ *		It is the simplest heuristics. As soon as we decided
  *		that something is lost, we decide that _all_ not SACKed
  *		packets until the most forward SACK are lost. I.e.
  *		lost_out = fackets_out - sacked_out and left_out = fackets_out.
-- 
2.11.0.483.g087da7b7c-goog

     prev parent reply	other threads:[~2017-01-13  6:04 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-13  6:03 [net-next 00/13] RACK fast recovery Yuchung Cheng
2017-01-13  6:03 ` [net-next 01/13] tcp: new helper function for RACK loss detection Yuchung Cheng
2017-01-13  6:07   ` Yuchung Cheng
2017-01-13  6:03 ` [net-next 02/13] tcp: new helper for RACK to detect loss Yuchung Cheng
2017-01-13  6:03 ` [net-next 03/13] tcp: record most recent RTT in RACK loss detection Yuchung Cheng
2017-01-13  6:03 ` [net-next 04/13] tcp: add reordering timer " Yuchung Cheng
2017-01-13  6:03 ` [net-next 05/13] tcp: use sequence to break TS ties for " Yuchung Cheng
2017-01-13  6:03 ` [net-next 06/13] tcp: check undo conditions before detecting losses Yuchung Cheng
2017-01-13  6:03 ` [net-next 07/13] tcp: enable RACK loss detection to trigger recovery Yuchung Cheng
2017-01-13  6:03 ` [net-next 08/13] tcp: extend F-RTO to catch more spurious timeouts Yuchung Cheng
2017-01-13  6:03 ` [net-next 09/13] tcp: remove forward retransmit feature Yuchung Cheng
2017-01-13  6:03 ` [net-next 10/13] tcp: remove early retransmit Yuchung Cheng
2017-01-13  6:03 ` [net-next 11/13] tcp: remove RFC4653 NCR Yuchung Cheng
2017-01-13  6:03 ` [net-next 12/13] tcp: remove thin_dupack feature Yuchung Cheng
2017-01-13  6:03 ` Yuchung Cheng [this message]

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:39ebc20ca1b dfblob:1a34e9278c0 )
 OR (
bs:"[net-next 13/13] tcp: disable fack by default" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170113060354.85234-14-ycheng@google.com \
    --to=ycheng@google.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=nanditad@google.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).