A buggy behavior for Linux TCP Reno and HTCP

From: Wei Sun <unlcsewsun@gmail.com>
To: netdev@vger.kernel.org
Subject: A buggy behavior for Linux TCP Reno and HTCP
Date: Tue, 18 Jul 2017 16:36:08 -0500
Message-ID: <CANdGJ5Qp1MoGfeSyAnEqYeSeEyoiH8e1j_Ut3F4PnLRBnqNNZg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 928 bytes --]

Hi there,

We have found a bug that appears when using Linux TCP Reno and HTCP in
low-bandwidth or highly congested network environments.

In short, their undo functions may mistakenly double the cwnd, which
makes the sender more aggressive in a highly congested scenario.

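The detailed reason:

The current Reno undo function assumes that the loss halved the cwnd,
so it restores the cwnd to twice ssthresh. It does not handle the
corner case where ssthresh is clamped to its minimum of 2, in which
case ssthresh is no longer half of the pre-loss cwnd.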

e.g.,
                    cwnd    ssth
Initial state:      2       5
Spurious loss:      1       2
Undo:               4       5

Here the cwnd after undo (4) is twice what it was before the spurious
loss (2). Attached is a simple script to reproduce it.

HTCP has a similar problem, so we recommend storing the cwnd at loss
time in the .ssthresh implementation and restoring it in .undo_cwnd,
for both the TCP Reno and HTCP implementations.

Thanks

[-- Attachment #2: undo-2-1-4.pkt --]
[-- Type: application/octet-stream, Size: 2102 bytes --]

/***************
A simple script to trigger the bug
usage:
1. Download the packetdrill tool (https://github.com/google/packetdrill/tree/master/gtests/net/packetdrill) and then run this script:
2. sudo ./packetdrill undo-2-1-4.pkt --tolerance_usecs=500000

output:
[Before undo] cwnd: 2 ssth: 5
[Loss Detecting] cwnd: 1 ssth: 2
[After undo] cwnd: 5 ssth: 5
*****************/

+0 `sysctl -q net.ipv4.tcp_congestion_control=reno`
+0 `sysctl -q net.ipv4.tcp_sack=0`

// Establish a connection.
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0

0.100 < S 0:0(0) win 42340 <mss 1000,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <...>
0.110 < . 1:1(0) ack 1 win 257
0.120 accept(3, ..., ...) = 4

// Send 10 MSS.
0.13 write(4, ..., 10000) = 10000

0.13 > . 1:1001(1000) ack 1
0.13 > . 1001:2001(1000) ack 1
0.13 > . 2001:3001(1000) ack 1
0.13 > . 3001:4001(1000) ack 1
0.13 > . 4001:5001(1000) ack 1
0.13 > . 5001:6001(1000) ack 1
0.13 > . 6001:7001(1000) ack 1
0.13 > . 7001:8001(1000) ack 1
0.13 > . 8001:9001(1000) ack 1
0.13 > P. 9001:10001(1000) ack 1

0.4 > . 1:1001(1000) ack 1
0.7 > . 1:1001(1000) ack 1

0.9 < . 1:1(0) ack 1001 win 257
0.9 %{print "[Before undo] cwnd:", tcpi_snd_cwnd, "ssth:", tcpi_snd_ssthresh}%
1.0 > . 1001:2001(1000) ack 1
1.0 > . 2001:3001(1000) ack 1

// Get 3 dupacks.
1.300 < . 1:1(0) ack 1 win 257 <sack 2001:3001,nop,nop>
1.300 < . 1:1(0) ack 1 win 257 <sack 2001:4001,nop,nop>
1.300 < . 1:1(0) ack 1 win 257 <sack 2001:5001,nop,nop>

// We've received 3 duplicate ACKs, so we do a fast retransmit.
1.400 > . 1001:2001(1000) ack 1

1.4 %{print "[Loss Detecting] cwnd:", tcpi_snd_cwnd, "ssth:", tcpi_snd_ssthresh}%
// Apparently just reordering. Retransmit was spurious.
// Original ACKs for sequence ranges up to 10001 are all lost.
// Receiver sends DSACK for retransmitted packet.
1.4 < . 1:1(0) ack 5001 win 257 <sack 1001:2001,nop,nop>
1.401 %{print "[After undo] cwnd:", tcpi_snd_cwnd, "ssth:", tcpi_snd_ssthresh}%

+0 `sysctl -q net.ipv4.tcp_congestion_control=cubic`
