netdev.vger.kernel.org archive mirror
* Slow ramp-up for single-stream TCP throughput on 4.2 kernel.
@ 2015-10-02 23:42 Ben Greear
  2015-10-03  0:21 ` Ben Greear
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Greear @ 2015-10-02 23:42 UTC (permalink / raw)
  To: netdev, ath10k

I'm seeing something that looks more dodgy than normal.

The test case is an ath10k station uploading to an ath10k AP.

AP is always running 4.2 kernel in this case, and both systems are using
the same ath10k firmware.

I have tuned the stack:

echo 4000000 > /proc/sys/net/core/wmem_max
echo 4096 87380 50000000 > /proc/sys/net/ipv4/tcp_rmem
echo 4096 16384 50000000 > /proc/sys/net/ipv4/tcp_wmem
echo 50000000 > /proc/sys/net/core/rmem_max
echo 30000 > /proc/sys/net/core/netdev_max_backlog
echo 1024000 > /proc/sys/net/ipv4/tcp_limit_output_bytes
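For reference, the same tuning can be kept in one place as a sysctl fragment (the file path in the comment is hypothetical; the values match the echo commands above):

```shell
# Emit a hypothetical /etc/sysctl.d/99-tcp-test.conf equivalent of the
# tuning above; pipe it to a file, or to `sysctl -p -` (root required
# to actually apply it).
tcp_tuning_conf() {
    cat <<'EOF'
net.core.wmem_max = 4000000
net.ipv4.tcp_rmem = 4096 87380 50000000
net.ipv4.tcp_wmem = 4096 16384 50000000
net.core.rmem_max = 50000000
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_limit_output_bytes = 1024000
EOF
}
tcp_tuning_conf
```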


On the 3.17.8+ kernel, single stream TCP very quickly (1-2 seconds) reaches about
525Mbps upload throughput (station to AP).

But when the station machine is running the 4.2 kernel, the connection runs at
about 30Mbps for 5-10 seconds, then may ramp up to 200-300Mbps, and may plateau
at around 400Mbps after another minute or two.  Once, I saw it finally reach 500+Mbps
after about 3 minutes.

Both behaviors are repeatable in my testing.

For 4.2, I tried setting the send/receive buffers to 2MB, and
I tried leaving them at system defaults: same behavior.  I tried doubling
tcp_limit_output_bytes to 2048k, and that had no effect.

Netstat shows about 1MB of data sitting in the TX queue on both
the 3.17 and 4.2 kernels when this test is running.

If I start a 50-stream TCP test, then total throughput is 500+Mbps
on 4.2, and generally correlates well with whatever UDP can do at
that time.

A 50-stream test has virtually identical performance to the 1-stream
test on the 3.17 kernel.


For the 4.0.4+ kernel, a single stream was stuck at 30Mbps and would not budge (4.2 sometimes
does this too; perhaps it would have gone up if I had waited longer than the ~15 seconds that I did).
A 50-stream test was stuck at 420Mbps and would not improve, though it ramped to that quickly.
100 stream test ran at 560Mbps throughput, which is about the maximum TCP throughput
we normally see for ath10k over-the-air.


I'm interested to know if someone has any suggestions for things to tune in 4.2
or 4.0 that might help this, or any reason why I might be seeing this behaviour.

I'm also interested to know if anyone else sees similar behaviour.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Slow ramp-up for single-stream TCP throughput on 4.2 kernel.
  2015-10-02 23:42 Slow ramp-up for single-stream TCP throughput on 4.2 kernel Ben Greear
@ 2015-10-03  0:21 ` Ben Greear
  2015-10-03 16:29   ` Neal Cardwell
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Greear @ 2015-10-03  0:21 UTC (permalink / raw)
  To: netdev, ath10k

Gah, seems 'cubic' related.  That is the default tcp cong ctrl
I was using (same in 3.17, for that matter).

Most other congestion control algorithms vastly out-perform it.



On 10/02/2015 04:42 PM, Ben Greear wrote:
> I'm seeing something that looks more dodgy than normal.

Here's a throughput graph of single-stream TCP for each congestion control
algorithm in 4.2:

http://www.candelatech.com/downloads/tcp_cong_ctrl_ath10k_4.2.pdf

I'll re-run and annotate this for posterity's sake, but basically, I started with
'cubic', and then ran each of these in order:

[root@ben-ota-1 lanforge]# echo reno > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo bic > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo cdg > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo dctcp > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo westwood > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo highspeed > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo hybla > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo htcp > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo vegas > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo veno > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo scalable > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo lp > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo yeah > /proc/sys/net/ipv4/tcp_congestion_control
[root@ben-ota-1 lanforge]# echo illinois > /proc/sys/net/ipv4/tcp_congestion_control
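For posterity, that sweep is easy to script; this sketch just emits the echo commands rather than running them (actually writing the sysctl requires root):

```shell
# Emit one congestion-control switch command per algorithm tested above.
cc_sweep() {
    for cc in reno bic cdg dctcp westwood highspeed hybla htcp \
              vegas veno scalable lp yeah illinois; do
        printf 'echo %s > /proc/sys/net/ipv4/tcp_congestion_control\n' "$cc"
    done
}
cc_sweep
```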


The first low non-spike is cubic, then reno does OK, etc.
CDG was the next abject failure.
Vegas sucks.
Yeah has issues, but is not horrible.


Thanks,
Ben



>
> The test case is an ath10k station uploading to an ath10k AP.
>
> AP is always running 4.2 kernel in this case, and both systems are using
> the same ath10k firmware.
>
> I have tuned the stack:
>
> echo 4000000 > /proc/sys/net/core/wmem_max
> echo 4096 87380 50000000 > /proc/sys/net/ipv4/tcp_rmem
> echo 4096 16384 50000000 > /proc/sys/net/ipv4/tcp_wmem
> echo 50000000 > /proc/sys/net/core/rmem_max
> echo 30000 > /proc/sys/net/core/netdev_max_backlog
> echo 1024000 > /proc/sys/net/ipv4/tcp_limit_output_bytes
>
>
> On the 3.17.8+ kernel, single stream TCP very quickly (1-2 seconds) reaches about
> 525Mbps upload throughput (station to AP).
>
> But, when station machine is running the 4.2 kernel, the connection goes to
> about 30Mbps for 5-10 seconds, then may ramp up to 200-300Mbps, and may plateau
> at around 400Mbps after another minute or two.  Once, I saw it finally reach 500+Mbps
> after about 3 minutes.
>
> Both behaviors are repeatable in my testing.
>
> For 4.2, I tried setting the send/receive buffers to 2MB, and
> I tried leaving them at system defaults: same behavior.  I tried doubling
> tcp_limit_output_bytes to 2048k, and that had no effect.
>
> Netstat shows about 1MB of data sitting in the TX queue on both
> the 3.17 and 4.2 kernels when this test is running.
>
> If I start a 50-stream TCP test, then total throughput is 500+Mbps
> on 4.2, and generally correlates well with whatever UDP can do at
> that time.
>
> A 50-stream test has virtually identical performance to the 1-stream
> test on the 3.17 kernel.
>
>
> For the 4.0.4+ kernel, a single stream was stuck at 30Mbps and would not budge (4.2 sometimes
> does this too; perhaps it would have gone up if I had waited longer than the ~15 seconds that I did).
> A 50-stream test was stuck at 420Mbps and would not improve, though it ramped to that quickly.
> 100 stream test ran at 560Mbps throughput, which is about the maximum TCP throughput
> we normally see for ath10k over-the-air.
>
>
> I'm interested to know if someone has any suggestions for things to tune in 4.2
> or 4.0 that might help this, or any reason why I might be seeing this behaviour.
>
> I'm also interested to know if anyone else sees similar behaviour.
>
> Thanks,
> Ben
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Slow ramp-up for single-stream TCP throughput on 4.2 kernel.
  2015-10-03  0:21 ` Ben Greear
@ 2015-10-03 16:29   ` Neal Cardwell
  2015-10-03 22:46     ` Ben Greear
  0 siblings, 1 reply; 9+ messages in thread
From: Neal Cardwell @ 2015-10-03 16:29 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev, ath10k

On Fri, Oct 2, 2015 at 8:21 PM, Ben Greear <greearb@candelatech.com> wrote:
> Gah, seems 'cubic' related.  That is the default tcp cong ctrl
> I was using (same in 3.17, for that matter).

There have been recent changes to CUBIC that may account for this. If
you could repeat your test with more instrumentation, eg "nstat", that
would be very helpful.

nstat > /dev/null
# run one test
nstat

Also, if you could take a sender-side tcpdump trace of the test, that
would be very useful (default capture length, grabbing just headers,
is fine).

Thanks!

neal


* Re: Slow ramp-up for single-stream TCP throughput on 4.2 kernel.
  2015-10-03 16:29   ` Neal Cardwell
@ 2015-10-03 22:46     ` Ben Greear
  2015-10-04  1:20       ` Neal Cardwell
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Greear @ 2015-10-03 22:46 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: netdev, ath10k



On 10/03/2015 09:29 AM, Neal Cardwell wrote:
> On Fri, Oct 2, 2015 at 8:21 PM, Ben Greear <greearb@candelatech.com> wrote:
>> Gah, seems 'cubic' related.  That is the default tcp cong ctrl
>> I was using (same in 3.17, for that matter).
>
> There have been recent changes to CUBIC that may account for this. If
> you could repeat your test with more instrumentation, eg "nstat", that
> would be very helpful.
>
> nstat > /dev/null
> # run one test
> nstat
>
> Also, if you could take a sender-side tcpdump trace of the test, that
> would be very useful (default capture length, grabbing just headers,
> is fine).

Here is nstat output:

[root@ben-ota-1 ~]# nstat
#kernel
IpInReceives                    14507              0.0
IpInDelivers                    14507              0.0
IpOutRequests                   49531              0.0
TcpActiveOpens                  3                  0.0
TcpPassiveOpens                 2                  0.0
TcpInSegs                       14498              0.0
TcpOutSegs                      50269              0.0
UdpInDatagrams                  9                  0.0
UdpOutDatagrams                 1                  0.0
TcpExtDelayedACKs               43                 0.0
TcpExtDelayedACKLost            5                  0.0
TcpExtTCPHPHits                 483                0.0
TcpExtTCPPureAcks               918                0.0
TcpExtTCPHPAcks                 12758              0.0
TcpExtTCPDSACKOldSent           5                  0.0
TcpExtTCPRcvCoalesce            49                 0.0
TcpExtTCPAutoCorking            3                  0.0
TcpExtTCPOrigDataSent           49776              0.0
TcpExtTCPHystartTrainDetect     1                  0.0
TcpExtTCPHystartTrainCwnd       16                 0.0
IpExtInBcastPkts                8                  0.0
IpExtInOctets                   2934274            0.0
IpExtOutOctets                  74817312           0.0
IpExtInBcastOctets              640                0.0
IpExtInNoECTPkts                14911              0.0
[root@ben-ota-1 ~]#


And, you can find the pcap here:

http://www.candelatech.com/downloads/cubic.pcap.bz2

Let me know if you need anything else.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Slow ramp-up for single-stream TCP throughput on 4.2 kernel.
  2015-10-03 22:46     ` Ben Greear
@ 2015-10-04  1:20       ` Neal Cardwell
  2015-10-04 17:05         ` Ben Greear
  0 siblings, 1 reply; 9+ messages in thread
From: Neal Cardwell @ 2015-10-04  1:20 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev, ath10k, Eric Dumazet

On Sat, Oct 3, 2015 at 6:46 PM, Ben Greear <greearb@candelatech.com> wrote:
>
>
> On 10/03/2015 09:29 AM, Neal Cardwell wrote:
>>
>> On Fri, Oct 2, 2015 at 8:21 PM, Ben Greear <greearb@candelatech.com>
>> wrote:
>>>
>>> Gah, seems 'cubic' related.  That is the default tcp cong ctrl
>>> I was using (same in 3.17, for that matter).
>>
>>
>> There have been recent changes to CUBIC that may account for this. If
>> you could repeat your test with more instrumentation, eg "nstat", that
>> would be very helpful.
>>
>> nstat > /dev/null
>> # run one test
>> nstat
>>
>> Also, if you could take a sender-side tcpdump trace of the test, that
>> would be very useful (default capture length, grabbing just headers,
>> is fine).
>
>
> Here is nstat output:
>
> [root@ben-ota-1 ~]# nstat
> #kernel
> IpInReceives                    14507              0.0
> IpInDelivers                    14507              0.0
> IpOutRequests                   49531              0.0
> TcpActiveOpens                  3                  0.0
> TcpPassiveOpens                 2                  0.0
> TcpInSegs                       14498              0.0
> TcpOutSegs                      50269              0.0
> UdpInDatagrams                  9                  0.0
> UdpOutDatagrams                 1                  0.0
> TcpExtDelayedACKs               43                 0.0
> TcpExtDelayedACKLost            5                  0.0
> TcpExtTCPHPHits                 483                0.0
> TcpExtTCPPureAcks               918                0.0
> TcpExtTCPHPAcks                 12758              0.0
> TcpExtTCPDSACKOldSent           5                  0.0
> TcpExtTCPRcvCoalesce            49                 0.0
> TcpExtTCPAutoCorking            3                  0.0
> TcpExtTCPOrigDataSent           49776              0.0
> TcpExtTCPHystartTrainDetect     1                  0.0
> TcpExtTCPHystartTrainCwnd       16                 0.0
> IpExtInBcastPkts                8                  0.0
> IpExtInOctets                   2934274            0.0
> IpExtOutOctets                  74817312           0.0
> IpExtInBcastOctets              640                0.0
> IpExtInNoECTPkts                14911              0.0
> [root@ben-ota-1 ~]#
>
>
> And, you can find the pcap here:
>
> http://www.candelatech.com/downloads/cubic.pcap.bz2
>
> Let me know if you need anything else.

Thanks! This is very useful. It looks like the sender is sending 3
(and later 4) packets every ~1.5ms for the entirety of the trace. 3
packets per burst is usually a hint that this may be related to TSQ.

This slow-and-steady behavior triggers CUBIC's Hystart Train Detection
to enter congestion avoidance at a cwnd of 16, which probably in turn
leads to slow cwnd growth, since the sending is not cwnd-limited, but
probably TSQ-limited, so cwnd does not grow in congestion avoidance
mode. Probably most of the other congestion control modules do better
because they stay in slow-start, which has a more aggressive criterion
for growing cwnd.
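The burst pattern in the trace also roughly accounts for the ~30Mbps floor reported at the start of the thread; a back-of-the-envelope check (the frame size is an assumption, not taken from the trace):

```shell
# ~3 full-size packets every ~1.5 ms, as observed in the pcap.
pkts_per_burst=3
bytes_per_pkt=1500       # assumed full-size Ethernet frame
burst_interval_us=1500   # ~1.5 ms between bursts
# bits per microsecond equals megabits per second, so the units cancel.
tput_mbps=$(( pkts_per_burst * bytes_per_pkt * 8 / burst_interval_us ))
echo "${tput_mbps} Mbps"
```

which lands in the same ballpark as the 30Mbps plateau Ben measured.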

So this is probably at root due to the known issue with an interaction
between the ath10k driver and the following change in 3.19:

  605ad7f tcp: refine TSO autosizing

There has been a lot of discussion about how to address the
TSQ-related issues with this driver. For example, you might consider:

  https://patchwork.ozlabs.org/patch/438322/

But I am not sure of the latest status of that effort. Perhaps someone
on the ath10k list will know.

neal


* Re: Slow ramp-up for single-stream TCP throughput on 4.2 kernel.
  2015-10-04  1:20       ` Neal Cardwell
@ 2015-10-04 17:05         ` Ben Greear
  2015-10-04 17:28           ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Greear @ 2015-10-04 17:05 UTC (permalink / raw)
  To: Neal Cardwell; +Cc: netdev, ath10k, Eric Dumazet



On 10/03/2015 06:20 PM, Neal Cardwell wrote:
> On Sat, Oct 3, 2015 at 6:46 PM, Ben Greear <greearb@candelatech.com> wrote:
>>
>>
>> On 10/03/2015 09:29 AM, Neal Cardwell wrote:
>>>
>>> On Fri, Oct 2, 2015 at 8:21 PM, Ben Greear <greearb@candelatech.com>
>>> wrote:
>>>>
>>>> Gah, seems 'cubic' related.  That is the default tcp cong ctrl
>>>> I was using (same in 3.17, for that matter).
>>>
>>>
>>> There have been recent changes to CUBIC that may account for this. If
>>> you could repeat your test with more instrumentation, eg "nstat", that
>>> would be very helpful.
>>>
>>> nstat > /dev/null
>>> # run one test
>>> nstat
>>>
>>> Also, if you could take a sender-side tcpdump trace of the test, that
>>> would be very useful (default capture length, grabbing just headers,
>>> is fine).
>>
>>
>> Here is nstat output:
>>
>> [root@ben-ota-1 ~]# nstat
>> #kernel
>> IpInReceives                    14507              0.0
>> IpInDelivers                    14507              0.0
>> IpOutRequests                   49531              0.0
>> TcpActiveOpens                  3                  0.0
>> TcpPassiveOpens                 2                  0.0
>> TcpInSegs                       14498              0.0
>> TcpOutSegs                      50269              0.0
>> UdpInDatagrams                  9                  0.0
>> UdpOutDatagrams                 1                  0.0
>> TcpExtDelayedACKs               43                 0.0
>> TcpExtDelayedACKLost            5                  0.0
>> TcpExtTCPHPHits                 483                0.0
>> TcpExtTCPPureAcks               918                0.0
>> TcpExtTCPHPAcks                 12758              0.0
>> TcpExtTCPDSACKOldSent           5                  0.0
>> TcpExtTCPRcvCoalesce            49                 0.0
>> TcpExtTCPAutoCorking            3                  0.0
>> TcpExtTCPOrigDataSent           49776              0.0
>> TcpExtTCPHystartTrainDetect     1                  0.0
>> TcpExtTCPHystartTrainCwnd       16                 0.0
>> IpExtInBcastPkts                8                  0.0
>> IpExtInOctets                   2934274            0.0
>> IpExtOutOctets                  74817312           0.0
>> IpExtInBcastOctets              640                0.0
>> IpExtInNoECTPkts                14911              0.0
>> [root@ben-ota-1 ~]#
>>
>>
>> And, you can find the pcap here:
>>
>> http://www.candelatech.com/downloads/cubic.pcap.bz2
>>
>> Let me know if you need anything else.
>
> Thanks! This is very useful. It looks like the sender is sending 3
> (and later 4) packets every ~1.5ms for the entirety of the trace. 3
> packets per burst is usually a hint that this may be related to TSQ.
>
> This slow-and-steady behavior triggers CUBIC's Hystart Train Detection
> to enter congestion avoidance at a cwnd of 16, which probably in turn
> leads to slow cwnd growth, since the sending is not cwnd-limited, but
> probably TSQ-limited, so cwnd does not grow in congestion avoidance
> mode. Probably most of the other congestion control modules do better
> because they stay in slow-start, which has a more aggressive criterion
> for growing cwnd.
>
> So this is probably at root due to the known issue with an interaction
> between the ath10k driver and the following change in 3.19:
>
>    605ad7f tcp: refine TSO autosizing
>
> There has been a lot of discussion about how to address the
> TSQ-related issues with this driver. For example, you might consider:
>
>    https://patchwork.ozlabs.org/patch/438322/
>
> But I am not sure of the latest status of that effort. Perhaps someone
> on the ath10k list will know.

If the guys in that thread cannot get a patch upstream, then there is
little chance I'd be able to make a difference.

I guess I'll just stop using Cubic.  Any suggestions for another
congestion algorithm to use?  I'd prefer something that worked well
in pretty much any network condition, of course, and it has to work with
ath10k.

We can also run some tests with 1G, 10G, ath10k, ath9k, and in conjunction
with network emulators and various congestion control algorithms.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Slow ramp-up for single-stream TCP throughput on 4.2 kernel.
  2015-10-04 17:05         ` Ben Greear
@ 2015-10-04 17:28           ` Eric Dumazet
  2015-10-04 19:33             ` Yuchung Cheng
  2015-10-05 13:35             ` Ben Greear
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2015-10-04 17:28 UTC (permalink / raw)
  To: Ben Greear; +Cc: Neal Cardwell, netdev, ath10k, Eric Dumazet

On Sun, 2015-10-04 at 10:05 -0700, Ben Greear wrote:

> I guess I'll just stop using Cubic.  Any suggestions for another
> congestion algorithm to use?  I'd prefer something that worked well
> in pretty much any network condition, of course, and it has to work with
> ath10k.
> 
> We can also run some tests with 1G, 10G, ath10k, ath9k, and in conjunction
> with network emulators and various congestion control algorithms.

You could use cubic, but disable or tune HyStart, which is known to be
problematic anyway.

echo  0 >/sys/module/tcp_cubic/parameters/hystart_detect

Or try

echo 40 >/sys/module/tcp_cubic/parameters/hystart_low_window
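For scale (illustrative only; the RTT here is an assumption, not a measurement from the thread): the cwnd of 16 where Hystart exited slow start, per TcpExtTCPHystartTrainCwnd in the nstat output, is tiny next to the pipe the 3.17 runs were filling, and even a hystart_low_window of 40 stays well below it:

```shell
rate_bps=525000000   # single-stream rate the 3.17 kernel reached
rtt_ms=20            # assumed round-trip time over the air
mss=1448             # typical TCP payload per 1500-byte frame
bdp_bytes=$(( rate_bps / 8 * rtt_ms / 1000 ))
bdp_pkts=$(( bdp_bytes / mss ))
echo "BDP ~ ${bdp_bytes} bytes (~${bdp_pkts} packets)"
```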


* Re: Slow ramp-up for single-stream TCP throughput on 4.2 kernel.
  2015-10-04 17:28           ` Eric Dumazet
@ 2015-10-04 19:33             ` Yuchung Cheng
  2015-10-05 13:35             ` Ben Greear
  1 sibling, 0 replies; 9+ messages in thread
From: Yuchung Cheng @ 2015-10-04 19:33 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Ben Greear, Neal Cardwell, netdev, ath10k, Eric Dumazet

On Sun, Oct 4, 2015 at 10:28 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Sun, 2015-10-04 at 10:05 -0700, Ben Greear wrote:
>
>> I guess I'll just stop using Cubic.  Any suggestions for another
>> congestion algorithm to use?  I'd prefer something that worked well
>> in pretty much any network condition, of course, and it has to work with
>> ath10k.
>>
>> We can also run some tests with 1G, 10G, ath10k, ath9k, and in conjunction
>> with network emulators and various congestion control algorithms.
>
> You could use cubic , but disable or tune HyStart, which is known to be
> problematic anyway.
>
> echo  0 >/sys/module/tcp_cubic/parameters/hystart_detect
>
> Or try
>
> echo 40 >/sys/module/tcp_cubic/parameters/hystart_low_window
Or if you truly don't want to use cubic, Reno would be my choice because:
1) Cubic at mid-to-low BDP behaves like Reno (TCP-friendly mode)
2) Reno has several recent stretched-ACK fixes which may be useful on wireless



* Re: Slow ramp-up for single-stream TCP throughput on 4.2 kernel.
  2015-10-04 17:28           ` Eric Dumazet
  2015-10-04 19:33             ` Yuchung Cheng
@ 2015-10-05 13:35             ` Ben Greear
  1 sibling, 0 replies; 9+ messages in thread
From: Ben Greear @ 2015-10-05 13:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Neal Cardwell, netdev, ath10k, Eric Dumazet



On 10/04/2015 10:28 AM, Eric Dumazet wrote:
> On Sun, 2015-10-04 at 10:05 -0700, Ben Greear wrote:
>
>> I guess I'll just stop using Cubic.  Any suggestions for another
>> congestion algorithm to use?  I'd prefer something that worked well
>> in pretty much any network condition, of course, and it has to work with
>> ath10k.
>>
>> We can also run some tests with 1G, 10G, ath10k, ath9k, and in conjunction
>> with network emulators and various congestion control algorithms.
>
> You could use cubic , but disable or tune HyStart, which is known to be
> problematic anyway.
>
> echo  0 >/sys/module/tcp_cubic/parameters/hystart_detect

I ran some more tests this morning.  The first bump in the graph
has the setting above.  It seems to work pretty well.

The second is using RENO.  Also good, with a bit higher over-all
throughput (about 555-560Mbps of RX throughput).

Third is 'highspeed'.  It ramps up quickly, but seems a bit more jagged, and
not quite as fast as RENO.

>
> Or try
>
> echo 40 >/sys/module/tcp_cubic/parameters/hystart_low_window

The last graph is this setting (with hystart_detect set back to 3,
which was the default on my system).

This is better than stock cubic, but it still is awful compared
to the others.

http://www.candelatech.com/downloads/tcp_cong_ath10k_2.pdf

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

