* [PATCH net 0/4] tcp: RACK loss recovery bug fixes
@ 2017-12-07 19:33 Yuchung Cheng
From: Yuchung Cheng @ 2017-12-07 19:33 UTC (permalink / raw)
To: davem; +Cc: netdev, edumazet, ncardwell, priyarjha, Yuchung Cheng
This patch set has four minor bug fixes in TCP RACK loss recovery.
Yuchung Cheng (4):
tcp: correctly test congestion state in RACK
tcp: always evaluate losses in RACK upon undo
tcp: fix off-by-one bug in RACK
tcp: evaluate packet losses upon RTT change
net/ipv4/tcp_input.c | 1 +
net/ipv4/tcp_recovery.c | 28 +++++++++++++---------------
2 files changed, 14 insertions(+), 15 deletions(-)
--
2.15.1.424.g9478a66081-goog
* [PATCH net 1/4] tcp: correctly test congestion state in RACK
From: Yuchung Cheng @ 2017-12-07 19:33 UTC (permalink / raw)
To: davem; +Cc: netdev, edumazet, ncardwell, priyarjha, Yuchung Cheng
RACK does not test the loss recovery state correctly when computing
the reordering window. It assumes TCP is not in loss recovery if
lost_out is zero. But lost_out can be zero during recovery, before
tcp_rack_detect_loss() is called: an ACK may acknowledge all packets
previously marked lost without tcp_rack_detect_loss() having yet
discovered new ones. The fix is to simply test the congestion state
directly.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp_recovery.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
index d3ea89020c69..3143664902e9 100644
--- a/net/ipv4/tcp_recovery.c
+++ b/net/ipv4/tcp_recovery.c
@@ -55,7 +55,8 @@ static void tcp_rack_detect_loss(struct sock *sk, u32 *reo_timeout)
* to queuing or delayed ACKs.
*/
reo_wnd = 1000;
- if ((tp->rack.reord || !tp->lost_out) && min_rtt != ~0U) {
+ if ((tp->rack.reord || inet_csk(sk)->icsk_ca_state < TCP_CA_Recovery) &&
+ min_rtt != ~0U) {
reo_wnd = max((min_rtt >> 2) * tp->rack.reo_wnd_steps, reo_wnd);
reo_wnd = min(reo_wnd, tp->srtt_us >> 3);
}
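The effect of the fix can be sketched in plain C outside the kernel; the
function name, enum values, and flattened parameters below are illustrative
stand-ins for the kernel state, not its actual API:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for inet_csk(sk)->icsk_ca_state values. */
enum ca_state { CA_OPEN = 0, CA_RECOVERY = 3, CA_LOSS = 4 };

/* Sketch of the fixed reo_wnd computation: gate the larger reordering
 * window on the congestion state (not yet in recovery), rather than on
 * whether lost_out happens to be zero at this instant. */
uint32_t rack_reo_wnd(int reord_seen, enum ca_state state,
                      uint32_t min_rtt_us, uint32_t srtt_us,
                      uint32_t reo_wnd_steps)
{
    uint32_t reo_wnd = 1000; /* 1 ms default */

    if ((reord_seen || state < CA_RECOVERY) && min_rtt_us != ~0U) {
        uint32_t scaled = (min_rtt_us >> 2) * reo_wnd_steps;

        if (scaled > reo_wnd)
            reo_wnd = scaled;
        if (srtt_us < reo_wnd) /* never wait longer than the smoothed RTT */
            reo_wnd = srtt_us;
    }
    return reo_wnd;
}
```

With the old `!lost_out` test, an ACK arriving mid-recovery just after all
marked-lost packets were cumulatively acked would wrongly take the larger
window; testing the congestion state keeps the small 1 ms window throughout
recovery.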
--
2.15.1.424.g9478a66081-goog
* [PATCH net 2/4] tcp: always evaluate losses in RACK upon undo
From: Yuchung Cheng @ 2017-12-07 19:33 UTC (permalink / raw)
To: davem; +Cc: netdev, edumazet, ncardwell, priyarjha, Yuchung Cheng
When the sender detects a spurious retransmission, all packets
marked lost are re-marked as in-flight. However, some may still be
considered lost based on their timestamps in RACK. This patch forces
RACK to re-evaluate losses, which would previously be skipped if the
ACK did not advance the RACK timestamp.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp_input.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 734cfc8ff76e..5a240f9d208b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2326,6 +2326,7 @@ static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss)
}
tp->snd_cwnd_stamp = tcp_jiffies32;
tp->undo_marker = 0;
+ tp->rack.advanced = 1; /* Force RACK to re-exam losses */
}
static inline bool tcp_may_undo(const struct tcp_sock *tp)
--
2.15.1.424.g9478a66081-goog
* [PATCH net 3/4] tcp: fix off-by-one bug in RACK
From: Yuchung Cheng @ 2017-12-07 19:33 UTC (permalink / raw)
To: davem; +Cc: netdev, edumazet, ncardwell, priyarjha, Yuchung Cheng
RACK should mark a packet lost when the remaining wait time is zero.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp_recovery.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
index 3143664902e9..0c182303e62e 100644
--- a/net/ipv4/tcp_recovery.c
+++ b/net/ipv4/tcp_recovery.c
@@ -80,12 +80,12 @@ static void tcp_rack_detect_loss(struct sock *sk, u32 *reo_timeout)
*/
remaining = tp->rack.rtt_us + reo_wnd -
tcp_stamp_us_delta(tp->tcp_mstamp, skb->skb_mstamp);
- if (remaining < 0) {
+ if (remaining <= 0) {
tcp_rack_mark_skb_lost(sk, skb);
list_del_init(&skb->tcp_tsorted_anchor);
} else {
- /* Record maximum wait time (+1 to avoid 0) */
- *reo_timeout = max_t(u32, *reo_timeout, 1 + remaining);
+ /* Record maximum wait time */
+ *reo_timeout = max_t(u32, *reo_timeout, remaining);
}
}
}
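The off-by-one is easy to see in isolation; a minimal sketch outside the
kernel, with made-up microsecond values and an illustrative function name:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the fixed test: a packet whose remaining wait time is
 * exactly zero has already waited the full rtt + reo_wnd and should
 * be marked lost now; the old `< 0` test instead deferred it to a
 * pointless zero-length reordering timer. */
int rack_skb_is_lost(int64_t rtt_us, int64_t reo_wnd_us, int64_t elapsed_us)
{
    int64_t remaining = rtt_us + reo_wnd_us - elapsed_us;

    return remaining <= 0;
}
```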
--
2.15.1.424.g9478a66081-goog
* [PATCH net 4/4] tcp: evaluate packet losses upon RTT change
From: Yuchung Cheng @ 2017-12-07 19:33 UTC (permalink / raw)
To: davem; +Cc: netdev, edumazet, ncardwell, priyarjha, Yuchung Cheng
RACK skips an ACK unless it advances the most recently delivered
TX timestamp (rack.mstamp). But since RACK also uses the most recent
RTT to decide whether a packet is lost, it should still run loss
detection whenever the most recent RTT changes. For example, an ACK
that does not advance the timestamp but triggers a cwnd undo due to
reordering would then use the most recent (higher) RTT measurement
to detect further losses.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
net/ipv4/tcp_recovery.c | 19 ++++++++-----------
1 file changed, 8 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
index 0c182303e62e..3a81720ac0c4 100644
--- a/net/ipv4/tcp_recovery.c
+++ b/net/ipv4/tcp_recovery.c
@@ -117,13 +117,8 @@ void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
{
u32 rtt_us;
- if (tp->rack.mstamp &&
- !tcp_rack_sent_after(xmit_time, tp->rack.mstamp,
- end_seq, tp->rack.end_seq))
- return;
-
rtt_us = tcp_stamp_us_delta(tp->tcp_mstamp, xmit_time);
- if (sacked & TCPCB_RETRANS) {
+ if (rtt_us < tcp_min_rtt(tp) && (sacked & TCPCB_RETRANS)) {
/* If the sacked packet was retransmitted, it's ambiguous
* whether the retransmission or the original (or the prior
* retransmission) was sacked.
@@ -134,13 +129,15 @@ void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
* so it's at least one RTT (i.e., retransmission is at least
* an RTT later).
*/
- if (rtt_us < tcp_min_rtt(tp))
- return;
+ return;
}
- tp->rack.rtt_us = rtt_us;
- tp->rack.mstamp = xmit_time;
- tp->rack.end_seq = end_seq;
tp->rack.advanced = 1;
+ tp->rack.rtt_us = rtt_us;
+ if (tcp_rack_sent_after(xmit_time, tp->rack.mstamp,
+ end_seq, tp->rack.end_seq)) {
+ tp->rack.mstamp = xmit_time;
+ tp->rack.end_seq = end_seq;
+ }
}
/* We have waited long enough to accommodate reordering. Mark the expired
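The reordered logic can be sketched as follows; the struct, helper, and
function names are simplified stand-ins (the retransmission-ambiguity check
against min RTT is omitted), not the kernel's API:

```c
#include <assert.h>
#include <stdint.h>

struct rack {
    uint64_t mstamp;   /* xmit time of the most recently delivered packet */
    uint32_t end_seq;
    uint32_t rtt_us;
    int advanced;
};

/* Simplified tcp_rack_sent_after(): a later timestamp wins, with the
 * sequence number breaking ties (wraparound handling omitted). */
static int sent_after(uint64_t t1, uint64_t t2, uint32_t seq1, uint32_t seq2)
{
    return t1 > t2 || (t1 == t2 && seq1 > seq2);
}

/* After the patch: every usable sample records the RTT and sets
 * advanced (so loss detection runs), but mstamp/end_seq only move
 * forward for packets sent after the current reference. */
void rack_advance(struct rack *r, uint64_t now_us,
                  uint64_t xmit_time_us, uint32_t end_seq)
{
    r->advanced = 1;
    r->rtt_us = (uint32_t)(now_us - xmit_time_us);
    if (sent_after(xmit_time_us, r->mstamp, end_seq, r->end_seq)) {
        r->mstamp = xmit_time_us;
        r->end_seq = end_seq;
    }
}
```

An ACK for an older transmit no longer returns early: it leaves the
reference timestamp alone but still refreshes the RTT used by loss
detection.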
--
2.15.1.424.g9478a66081-goog
* Re: [PATCH net 0/4] tcp: RACK loss recovery bug fixes
From: David Miller @ 2017-12-08 19:14 UTC (permalink / raw)
To: ycheng; +Cc: netdev, edumazet, ncardwell, priyarjha
From: Yuchung Cheng <ycheng@google.com>
Date: Thu, 7 Dec 2017 11:33:29 -0800
> This patch set has four minor bug fixes in TCP RACK loss recovery.
Series applied, thank you.