From: Yuchung Cheng <ycheng@google.com>
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, edumazet@google.com,
ncardwell@google.com, nanditad@google.com,
Yuchung Cheng <ycheng@google.com>
Subject: [net-next 08/13] tcp: extend F-RTO to catch more spurious timeouts
Date: Thu, 12 Jan 2017 22:03:49 -0800
Message-ID: <20170113060354.85234-9-ycheng@google.com>
In-Reply-To: <20170113060354.85234-1-ycheng@google.com>
Current F-RTO reverts the cwnd reset whenever a never-retransmitted
packet is (s)acked. The timeout can be declared spurious because
the packets acknowledged by this ACK were transmitted before the
timeout, so clearly not all packets were lost and the cwnd reset
was unwarranted. This detection does not really depend on F-RTO
internals, so this patch applies it universally. On Google servers
this change detected 20% more spurious timeouts.
Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
---
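For readers who want the idea outside the kernel context, the detection
described in the commit message can be sketched roughly as follows. This
is only an illustration, not the kernel implementation: struct pkt,
pkt_retransmit() and ack_covers_original() are hypothetical names, and the
sketch assumes every packet in the array was sent before the timeout. The
ever_retrans field plays the role of TCPCB_EVER_RETRANS and the return
value plays the role of FLAG_ORIG_SACK_ACKED.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-packet state; not the kernel's struct sk_buff. */
struct pkt {
	unsigned int seq;
	bool ever_retrans;	/* set on first retransmission, never cleared */
	bool acked;		/* already cumulatively or selectively acked */
};

/* Retransmitting a packet tags it permanently. */
static void pkt_retransmit(struct pkt *p)
{
	p->ever_retrans = true;
}

/* Apply an ACK that (s)acks pkts[first..last].
 * Returns true if at least one newly acked packet was never
 * retransmitted. Such a packet must have been delivered from its
 * original transmission, so not everything sent before the timeout
 * was lost and the timeout can be treated as spurious.
 */
static bool ack_covers_original(struct pkt *pkts, size_t n,
				size_t first, size_t last)
{
	bool orig_acked = false;

	for (size_t i = first; i <= last && i < n; i++) {
		if (pkts[i].acked)
			continue;	/* only newly acked data counts */
		pkts[i].acked = true;
		if (!pkts[i].ever_retrans)
			orig_acked = true;
	}
	return orig_acked;
}
```

Because the tag is never cleared, an ACK for a retransmitted packet alone
never yields true, which is why the check stays safe even when an earlier
recovery was already underway.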
net/ipv4/tcp_input.c | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4ad75b8c4fee..9469ce384d3b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1939,7 +1939,6 @@ void tcp_enter_loss(struct sock *sk)
struct tcp_sock *tp = tcp_sk(sk);
struct net *net = sock_net(sk);
struct sk_buff *skb;
- bool new_recovery = icsk->icsk_ca_state < TCP_CA_Recovery;
bool is_reneg; /* is receiver reneging on SACKs? */
bool mark_lost;
@@ -2000,13 +1999,15 @@ void tcp_enter_loss(struct sock *sk)
tp->high_seq = tp->snd_nxt;
tcp_ecn_queue_cwr(tp);
- /* F-RTO RFC5682 sec 3.1 step 1: retransmit SND.UNA if no previous
- * loss recovery is underway except recurring timeout(s) on
- * the same SND.UNA (sec 3.2). Disable F-RTO on path MTU probing
+ /* F-RTO RFC5682 sec 3.1 step 1 mandates to disable F-RTO
+ * if a previous recovery is underway, otherwise it may incorrectly
+ * call a timeout spurious if some previously retransmitted packets
+ * are s/acked (sec 3.2). We do not apply that restriction since
+ * retransmitted skbs are permanently tagged with TCPCB_EVER_RETRANS
+ * so FLAG_ORIG_SACK_ACKED is always correct. But we do disable F-RTO
+ * on PMTU discovery to avoid sending new data.
*/
- tp->frto = sysctl_tcp_frto &&
- (new_recovery || icsk->icsk_retransmits) &&
- !inet_csk(sk)->icsk_mtup.probe_size;
+ tp->frto = sysctl_tcp_frto && !inet_csk(sk)->icsk_mtup.probe_size;
}
/* If ACK arrived pointing to a remembered SACK, it means that our
@@ -2740,14 +2741,18 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack,
tcp_try_undo_loss(sk, false))
return;
- if (tp->frto) { /* F-RTO RFC5682 sec 3.1 (sack enhanced version). */
- /* Step 3.b. A timeout is spurious if not all data are
- * lost, i.e., never-retransmitted data are (s)acked.
- */
- if ((flag & FLAG_ORIG_SACK_ACKED) &&
- tcp_try_undo_loss(sk, true))
- return;
+ /* The ACK (s)acks some never-retransmitted data meaning not all
+ * the data packets before the timeout were lost. Therefore we
+ * undo the congestion window and state. This is essentially
+ * the operation in F-RTO (RFC5682 section 3.1 step 3.b). Since
+ * a retransmitted skb is permanently marked, we can apply such an
+ * operation even if F-RTO was not used.
+ */
+ if ((flag & FLAG_ORIG_SACK_ACKED) &&
+ tcp_try_undo_loss(sk, tp->undo_marker))
+ return;
+ if (tp->frto) { /* F-RTO RFC5682 sec 3.1 (sack enhanced version). */
if (after(tp->snd_nxt, tp->high_seq)) {
if (flag & FLAG_DATA_SACKED || is_dupack)
tp->frto = 0; /* Step 3.a. loss was real */
--
2.11.0.483.g087da7b7c-goog
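The change to the F-RTO enable condition in tcp_enter_loss() can be
compared side by side in a small standalone sketch. The struct and
function names here are hypothetical and only mirror the fields the diff
reads (sysctl_tcp_frto, new_recovery, icsk_retransmits and
icsk_mtup.probe_size); this is not kernel code.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state consulted when the RTO fires. */
struct rto_state {
	bool sysctl_frto;		/* F-RTO enabled via sysctl */
	bool new_recovery;		/* no loss recovery was underway */
	unsigned int retransmits;	/* earlier timeouts on the same SND.UNA */
	unsigned int mtu_probe_size;	/* nonzero while path MTU probing */
};

/* Condition before the patch: F-RTO only for a fresh recovery or a
 * recurring timeout, and never during path MTU probing.
 */
static bool frto_enabled_old(const struct rto_state *s)
{
	return s->sysctl_frto &&
	       (s->new_recovery || s->retransmits) &&
	       !s->mtu_probe_size;
}

/* Condition after the patch: the recovery-state restriction is gone;
 * only the sysctl and path MTU probing gate F-RTO.
 */
static bool frto_enabled_new(const struct rto_state *s)
{
	return s->sysctl_frto && !s->mtu_probe_size;
}
```

The two conditions differ only for a first timeout that arrives while a
previous recovery is still in progress: the old one leaves F-RTO off, the
new one turns it on.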