* [PATCH 1/7] tcp: fix RTT for quick packets in congestion control
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
@ 2011-03-14 17:52 ` Stephen Hemminger
2011-03-14 17:52 ` [PATCH 2/7] tcp_cubic: fix comparison of jiffies Stephen Hemminger
` (8 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Stephen Hemminger @ 2011-03-14 17:52 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
[-- Attachment #1: tcp-input-rtt.patch --]
[-- Type: text/plain, Size: 967 bytes --]
In the congestion control interface, the callback for each ACK
includes an estimated round trip time in microseconds.
Some algorithms need high resolution (Vegas style), but most need only
jiffy resolution. If the RTT is not accurate (as after a retransmission),
-1 is used as a flag value.
When using coarse resolution, if the RTT is less than a jiffy
then 0 should be returned rather than no estimate. Otherwise, algorithms
that expect good ACKs to trigger slow start (like CUBIC Hystart)
will be confused.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
--- a/net/ipv4/tcp_input.c 2011-03-14 08:31:35.442834792 -0700
+++ b/net/ipv4/tcp_input.c 2011-03-14 08:31:40.078917049 -0700
@@ -3350,7 +3350,7 @@ static int tcp_clean_rtx_queue(struct so
net_invalid_timestamp()))
rtt_us = ktime_us_delta(ktime_get_real(),
last_ackt);
- else if (ca_seq_rtt > 0)
+ else if (ca_seq_rtt >= 0)
rtt_us = jiffies_to_usecs(ca_seq_rtt);
}
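The effect of changing `> 0` to `>= 0` can be sketched in userspace C. This is only an illustration: `coarse_rtt_us()`, `RTT_NONE`, and the `include_zero` switch are hypothetical names, and the jiffies-to-microseconds conversion is simplified to a multiplication that assumes HZ divides 1000000.

```c
#include <stdint.h>

/* Flag value the patch description uses for "no estimate". */
#define RTT_NONE (-1)

/*
 * Hypothetical model of the coarse (jiffy-resolution) RTT path.
 * seq_rtt is an RTT in jiffies, or negative when no valid sample exists.
 * include_zero = 0 models the old "> 0" test, 1 models the new ">= 0" test.
 */
static int32_t coarse_rtt_us(int32_t seq_rtt, uint32_t hz, int include_zero)
{
    if (seq_rtt < 0)
        return RTT_NONE;                 /* e.g. after a retransmission */
    if (seq_rtt == 0 && !include_zero)
        return RTT_NONE;                 /* old behaviour: sample dropped */
    /* Simplified conversion, assuming hz divides 1000000 evenly. */
    return (int32_t)((uint32_t)seq_rtt * (1000000u / hz));
}
```

With the old test, an ACK that arrives within the same jiffy is reported as "no estimate"; with the new test it is reported as an RTT of 0.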
* [PATCH 2/7] tcp_cubic: fix comparison of jiffies
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
2011-03-14 17:52 ` [PATCH 1/7] tcp: fix RTT for quick packets in congestion control Stephen Hemminger
@ 2011-03-14 17:52 ` Stephen Hemminger
2011-03-14 17:52 ` [PATCH 3/7] tcp_cubic: make ack train delta value a parameter Stephen Hemminger
` (7 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Stephen Hemminger @ 2011-03-14 17:52 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
[-- Attachment #1: tcp-cubic-jiffies-wrap.patch --]
[-- Type: text/plain, Size: 1021 bytes --]
Jiffies wraps around, so the correct way to compare is
to cast the difference to a signed value.
Note: cubic does not use the full jiffies value on 64-bit architectures,
because a full unsigned long would make struct bictcp too
large for the available ca_priv area.
Includes a correction from Sangtae Ha to improve ack train detection.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
--- a/net/ipv4/tcp_cubic.c 2011-03-11 09:00:06.856664687 -0800
+++ b/net/ipv4/tcp_cubic.c 2011-03-11 09:02:11.685796371 -0800
@@ -342,9 +342,11 @@ static void hystart_update(struct sock *
u32 curr_jiffies = jiffies;
/* first detection parameter - ack-train detection */
- if (curr_jiffies - ca->last_jiffies <= msecs_to_jiffies(2)) {
+ if ((s32)(curr_jiffies - ca->last_jiffies) <=
+ msecs_to_jiffies(2)) {
ca->last_jiffies = curr_jiffies;
- if (curr_jiffies - ca->round_start >= ca->delay_min>>4)
+ if ((s32) (curr_jiffies - ca->round_start) >
+ ca->delay_min >> 4)
ca->found |= HYSTART_ACK_TRAIN;
}
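A minimal userspace sketch of the signed-cast comparison on a wrapping 32-bit counter follows. The helper names `elapsed32()` and `within32()` are made up for this example and are not kernel functions.

```c
#include <stdint.h>

/*
 * Signed distance from 'then' to 'now' on a 32-bit wrapping counter.
 * The subtraction is done modulo 2^32; the cast reinterprets the result
 * as two's complement (implementation-defined before C23, but this is
 * the behaviour of mainstream compilers).
 */
static int32_t elapsed32(uint32_t now, uint32_t then)
{
    return (int32_t)(uint32_t)(now - then);
}

/* Nonzero when 'now' is no more than 'limit' ticks past 'then'. */
static int within32(uint32_t now, uint32_t then, int32_t limit)
{
    return elapsed32(now, then) <= limit;
}
```

The distance stays small and correctly signed even when the counter wraps between the two readings, and it goes negative rather than huge when `now` is slightly behind `then`.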
* [PATCH 3/7] tcp_cubic: make ack train delta value a parameter
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
2011-03-14 17:52 ` [PATCH 1/7] tcp: fix RTT for quick packets in congestion control Stephen Hemminger
2011-03-14 17:52 ` [PATCH 2/7] tcp_cubic: fix comparison of jiffies Stephen Hemminger
@ 2011-03-14 17:52 ` Stephen Hemminger
2011-03-14 17:52 ` [PATCH 4/7] tcp_cubic: fix clock dependency Stephen Hemminger
` (6 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Stephen Hemminger @ 2011-03-14 17:52 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
[-- Attachment #1: tcp-cubic-ackdelta.patch --]
[-- Type: text/plain, Size: 1432 bytes --]
Make the spacing between ACKs that indicates a train a tunable
value, like the other hystart values.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
--- a/net/ipv4/tcp_cubic.c 2011-03-14 08:19:15.697936023 -0700
+++ b/net/ipv4/tcp_cubic.c 2011-03-14 08:19:18.361944814 -0700
@@ -52,6 +52,7 @@ static int tcp_friendliness __read_mostl
static int hystart __read_mostly = 1;
static int hystart_detect __read_mostly = HYSTART_ACK_TRAIN | HYSTART_DELAY;
static int hystart_low_window __read_mostly = 16;
+static int hystart_ack_delta __read_mostly = 2;
static u32 cube_rtt_scale __read_mostly;
static u32 beta_scale __read_mostly;
@@ -75,6 +76,8 @@ MODULE_PARM_DESC(hystart_detect, "hyrbri
" 1: packet-train 2: delay 3: both packet-train and delay");
module_param(hystart_low_window, int, 0644);
MODULE_PARM_DESC(hystart_low_window, "lower bound cwnd for hybrid slow start");
+module_param(hystart_ack_delta, int, 0644);
+MODULE_PARM_DESC(hystart_ack_delta, "spacing between ack's indicating train (msecs)");
/* BIC TCP Parameters */
struct bictcp {
@@ -343,7 +346,7 @@ static void hystart_update(struct sock *
/* first detection parameter - ack-train detection */
if ((s32)(curr_jiffies - ca->last_jiffies) <=
- msecs_to_jiffies(2)) {
+ msecs_to_jiffies(hystart_ack_delta)) {
ca->last_jiffies = curr_jiffies;
if ((s32) (curr_jiffies - ca->round_start) >
ca->delay_min >> 4)
* [PATCH 4/7] tcp_cubic: fix clock dependency
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
` (2 preceding siblings ...)
2011-03-14 17:52 ` [PATCH 3/7] tcp_cubic: make ack train delta value a parameter Stephen Hemminger
@ 2011-03-14 17:52 ` Stephen Hemminger
2011-03-14 18:51 ` Eric Dumazet
2011-03-14 17:52 ` [PATCH 5/7] tcp_cubic: enable high resolution ack time if needed Stephen Hemminger
` (5 subsequent siblings)
9 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2011-03-14 17:52 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
[-- Attachment #1: tcp-cubic-minrtt.patch --]
[-- Type: text/plain, Size: 2977 bytes --]
The hystart code was written with the assumption that HZ=1000.
Replace the use of jiffies with bictcp_clock, a millisecond
real-time clock.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
--- a/net/ipv4/tcp_cubic.c 2011-03-14 08:19:18.000000000 -0700
+++ b/net/ipv4/tcp_cubic.c 2011-03-14 08:22:42.486690594 -0700
@@ -88,7 +88,7 @@ struct bictcp {
u32 last_time; /* time when updated last_cwnd */
u32 bic_origin_point;/* origin point of bic function */
u32 bic_K; /* time to origin point from the beginning of the current epoch */
- u32 delay_min; /* min delay */
+ u32 delay_min; /* min delay (msec << 3) */
u32 epoch_start; /* beginning of an epoch */
u32 ack_cnt; /* number of acks */
u32 tcp_cwnd; /* estimated tcp cwnd */
@@ -98,7 +98,7 @@ struct bictcp {
u8 found; /* the exit point is found? */
u32 round_start; /* beginning of each round */
u32 end_seq; /* end_seq of the round */
- u32 last_jiffies; /* last time when the ACK spacing is close */
+ u32 last_ack; /* last time when the ACK spacing is close */
u32 curr_rtt; /* the minimum rtt of current round */
};
@@ -119,12 +119,21 @@ static inline void bictcp_reset(struct b
ca->found = 0;
}
+static inline u32 bictcp_clock(void)
+{
+#if HZ < 1000
+ return ktime_to_ms(ktime_get_real());
+#else
+ return jiffies_to_msecs(jiffies);
+#endif
+}
+
static inline void bictcp_hystart_reset(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
struct bictcp *ca = inet_csk_ca(sk);
- ca->round_start = ca->last_jiffies = jiffies;
+ ca->round_start = ca->last_ack = bictcp_clock();
ca->end_seq = tp->snd_nxt;
ca->curr_rtt = 0;
ca->sample_cnt = 0;
@@ -239,8 +248,8 @@ static inline void bictcp_update(struct
*/
/* change the unit from HZ to bictcp_HZ */
- t = ((tcp_time_stamp + (ca->delay_min>>3) - ca->epoch_start)
- << BICTCP_HZ) / HZ;
+ t = ((tcp_time_stamp + msecs_to_jiffies(ca->delay_min>>3)
+ - ca->epoch_start) << BICTCP_HZ) / HZ;
if (t < ca->bic_K) /* t - K */
offs = ca->bic_K - t;
@@ -342,14 +351,12 @@ static void hystart_update(struct sock *
struct bictcp *ca = inet_csk_ca(sk);
if (!(ca->found & hystart_detect)) {
- u32 curr_jiffies = jiffies;
+ u32 now = bictcp_clock();
/* first detection parameter - ack-train detection */
- if ((s32)(curr_jiffies - ca->last_jiffies) <=
- msecs_to_jiffies(hystart_ack_delta)) {
- ca->last_jiffies = curr_jiffies;
- if ((s32) (curr_jiffies - ca->round_start) >
- ca->delay_min >> 4)
+ if ((s32)(now - ca->last_ack) <= hystart_ack_delta) {
+ ca->last_ack = now;
+ if ((s32)(now - ca->round_start) > ca->delay_min >> 4)
ca->found |= HYSTART_ACK_TRAIN;
}
@@ -396,7 +403,7 @@ static void bictcp_acked(struct sock *sk
if ((s32)(tcp_time_stamp - ca->epoch_start) < HZ)
return;
- delay = usecs_to_jiffies(rtt_us) << 3;
+ delay = (rtt_us << 3) / USEC_PER_MSEC;
if (delay == 0)
delay = 1;
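The millisecond arithmetic introduced by this patch can be modelled in userspace as below. This is a sketch under assumptions: `struct train_state`, `rtt_us_to_delay()`, and `ack_train_found()` are hypothetical names, and only the two expressions visible in the diff (the `(rtt_us << 3) / USEC_PER_MSEC` conversion and the ack-train test) are mirrored.

```c
#include <stdint.h>
#include <stdbool.h>

#define USEC_PER_MSEC 1000u

/*
 * Convert an RTT sample in microseconds to the (msec << 3) fixed-point
 * unit used for delay_min. The shift overflows for samples above
 * 2^29 us, which this sketch does not guard against.
 */
static uint32_t rtt_us_to_delay(uint32_t rtt_us)
{
    uint32_t delay = (rtt_us << 3) / USEC_PER_MSEC;
    return delay ? delay : 1;            /* never store 0, as in the patch */
}

/* Hypothetical subset of the per-connection state, all times in ms. */
struct train_state {
    uint32_t round_start;                /* start of the current round */
    uint32_t last_ack;                   /* last closely spaced ACK */
    uint32_t delay_min;                  /* minimum RTT, in (msec << 3) */
};

/*
 * Ack-train test: ACKs spaced at most ack_delta_ms apart extend the
 * train; the train is "found" once it is longer than delay_min >> 4,
 * which is half the minimum RTT in milliseconds.
 */
static bool ack_train_found(struct train_state *st, uint32_t now_ms,
                            int32_t ack_delta_ms)
{
    if ((int32_t)(now_ms - st->last_ack) <= ack_delta_ms) {
        st->last_ack = now_ms;
        if ((int32_t)(now_ms - st->round_start) >
            (int32_t)(st->delay_min >> 4))
            return true;
    }
    return false;
}
```

For an 11 ms RTT the stored delay is 88, so the train must last longer than 88 >> 4 = 5 ms before the detector fires, independent of HZ.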
* Re: [PATCH 4/7] tcp_cubic: fix clock dependency
2011-03-14 17:52 ` [PATCH 4/7] tcp_cubic: fix clock dependency Stephen Hemminger
@ 2011-03-14 18:51 ` Eric Dumazet
2011-03-14 21:21 ` Stephen Hemminger
0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2011-03-14 18:51 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David S. Miller, netdev
On Monday, 14 March 2011 at 10:52 -0700, Stephen Hemminger wrote:
> plain-text document attachment (tcp-cubic-minrtt.patch)
> The hystart code was written with assumption that HZ=1000.
> Replace the use of jiffies with bictcp_clock as a millisecond
> real time clock.
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
>
> --- a/net/ipv4/tcp_cubic.c 2011-03-14 08:19:18.000000000 -0700
> +++ b/net/ipv4/tcp_cubic.c 2011-03-14 08:22:42.486690594 -0700
> @@ -88,7 +88,7 @@ struct bictcp {
> u32 last_time; /* time when updated last_cwnd */
> u32 bic_origin_point;/* origin point of bic function */
> u32 bic_K; /* time to origin point from the beginning of the current epoch */
> - u32 delay_min; /* min delay */
> + u32 delay_min; /* min delay (msec << 3) */
> u32 epoch_start; /* beginning of an epoch */
> u32 ack_cnt; /* number of acks */
> u32 tcp_cwnd; /* estimated tcp cwnd */
> @@ -98,7 +98,7 @@ struct bictcp {
> u8 found; /* the exit point is found? */
> u32 round_start; /* beginning of each round */
> u32 end_seq; /* end_seq of the round */
> - u32 last_jiffies; /* last time when the ACK spacing is close */
> + u32 last_ack; /* last time when the ACK spacing is close */
> u32 curr_rtt; /* the minimum rtt of current round */
> };
>
> @@ -119,12 +119,21 @@ static inline void bictcp_reset(struct b
> ca->found = 0;
> }
>
> +static inline u32 bictcp_clock(void)
> +{
> +#if HZ < 1000
> + return ktime_to_ms(ktime_get_real());
Small point: this clock can jump if the date/time is changed.
Maybe use monotonic time (aka ktime_get_ts())?
* Re: [PATCH 4/7] tcp_cubic: fix clock dependency
2011-03-14 18:51 ` Eric Dumazet
@ 2011-03-14 21:21 ` Stephen Hemminger
2011-03-14 21:37 ` Eric Dumazet
0 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2011-03-14 21:21 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David S. Miller, netdev
On Mon, 14 Mar 2011 19:51:19 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Monday, 14 March 2011 at 10:52 -0700, Stephen Hemminger wrote:
> > plain-text document attachment (tcp-cubic-minrtt.patch)
> > The hystart code was written with assumption that HZ=1000.
> > Replace the use of jiffies with bictcp_clock as a millisecond
> > real time clock.
> >
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> > Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
> >
> > --- a/net/ipv4/tcp_cubic.c 2011-03-14 08:19:18.000000000 -0700
> > +++ b/net/ipv4/tcp_cubic.c 2011-03-14 08:22:42.486690594 -0700
> > @@ -88,7 +88,7 @@ struct bictcp {
> > u32 last_time; /* time when updated last_cwnd */
> > u32 bic_origin_point;/* origin point of bic function */
> > u32 bic_K; /* time to origin point from the beginning of the current epoch */
> > - u32 delay_min; /* min delay */
> > + u32 delay_min; /* min delay (msec << 3) */
> > u32 epoch_start; /* beginning of an epoch */
> > u32 ack_cnt; /* number of acks */
> > u32 tcp_cwnd; /* estimated tcp cwnd */
> > @@ -98,7 +98,7 @@ struct bictcp {
> > u8 found; /* the exit point is found? */
> > u32 round_start; /* beginning of each round */
> > u32 end_seq; /* end_seq of the round */
> > - u32 last_jiffies; /* last time when the ACK spacing is close */
> > + u32 last_ack; /* last time when the ACK spacing is close */
> > u32 curr_rtt; /* the minimum rtt of current round */
> > };
> >
> > @@ -119,12 +119,21 @@ static inline void bictcp_reset(struct b
> > ca->found = 0;
> > }
> >
> > +static inline u32 bictcp_clock(void)
> > +{
> > +#if HZ < 1000
> > + return ktime_to_ms(ktime_get_real());
>
> Small point : This can be changed if date/time is changed
>
> Maybe use monotonic time (aka ktime_get_ts()) ?
I chose get_real() because that is what the skb timestamp uses;
both should probably use a monotonic clock.
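A userspace analogue of the distinction discussed here, assuming POSIX `clock_gettime()` is available: `CLOCK_REALTIME` follows wall-clock changes, while `CLOCK_MONOTONIC` does not. The helpers `clock_ms()` and `monotonic_ms()` are hypothetical and only illustrate the idea; they are not the kernel interfaces named in the thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Read the given clock and truncate it to a wrapping 32-bit ms counter. */
static uint32_t clock_ms(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return 0;                        /* sketch: no real error handling */
    return (uint32_t)((uint64_t)ts.tv_sec * 1000u +
                      (uint64_t)ts.tv_nsec / 1000000u);
}

/* Millisecond clock that is not affected by date/time changes. */
static uint32_t monotonic_ms(void)
{
    return clock_ms(CLOCK_MONOTONIC);
}
```

Differences between two `monotonic_ms()` readings should still be taken with a signed 32-bit cast, since the truncated counter wraps.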
* [PATCH 5/7] tcp_cubic: enable high resolution ack time if needed
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
` (3 preceding siblings ...)
2011-03-14 17:52 ` [PATCH 4/7] tcp_cubic: fix clock dependency Stephen Hemminger
@ 2011-03-14 17:52 ` Stephen Hemminger
2011-03-14 17:52 ` [PATCH 6/7] tcp_cubic: make the delay threshold of HyStart less sensitive Stephen Hemminger
` (4 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Stephen Hemminger @ 2011-03-14 17:52 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
[-- Attachment #1: tcp-cubic-rtt-cong.patch --]
[-- Type: text/plain, Size: 726 bytes --]
This is a refined version of an earlier patch by Lucas Nussbaum.
Cubic needs RTT values in milliseconds. If HZ < 1000 then
the values will be too coarse.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
--- a/net/ipv4/tcp_cubic.c 2011-03-14 08:22:42.486690594 -0700
+++ b/net/ipv4/tcp_cubic.c 2011-03-14 08:27:24.435852847 -0700
@@ -459,6 +459,10 @@ static int __init cubictcp_register(void
/* divide by bic_scale and by constant Srtt (100ms) */
do_div(cube_factor, bic_scale * 10);
+ /* hystart needs ms clock resolution */
+ if (hystart && HZ < 1000)
+ cubictcp.flags |= TCP_CONG_RTT_STAMP;
+
return tcp_register_congestion_control(&cubictcp);
}
* [PATCH 6/7] tcp_cubic: make the delay threshold of HyStart less sensitive
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
` (4 preceding siblings ...)
2011-03-14 17:52 ` [PATCH 5/7] tcp_cubic: enable high resolution ack time if needed Stephen Hemminger
@ 2011-03-14 17:52 ` Stephen Hemminger
2011-03-14 17:52 ` [PATCH 7/7] tcp_cubic: fix low utilization of CUBIC with HyStart Stephen Hemminger
` (3 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Stephen Hemminger @ 2011-03-14 17:52 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Sangtae Ha
[-- Attachment #1: tcp-cubic-increase-delay.patch --]
[-- Type: text/plain, Size: 795 bytes --]
From: Sangtae Ha <sangtae.ha@gmail.com>
Make HyStart less sensitive to abrupt delay variations due to buffer bloat.
Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
---
net/ipv4/tcp_cubic.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
--- a/net/ipv4/tcp_cubic.c 2011-03-14 08:27:24.435852847 -0700
+++ b/net/ipv4/tcp_cubic.c 2011-03-14 08:27:29.043872578 -0700
@@ -39,7 +39,7 @@
/* Number of delay samples for detecting the increase of delay */
#define HYSTART_MIN_SAMPLES 8
-#define HYSTART_DELAY_MIN (2U<<3)
+#define HYSTART_DELAY_MIN (4U<<3)
#define HYSTART_DELAY_MAX (16U<<3)
#define HYSTART_DELAY_THRESH(x) clamp(x, HYSTART_DELAY_MIN, HYSTART_DELAY_MAX)
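To see what raising the lower bound does, here is a small userspace model of the clamp. It assumes, which this patch does not show, that the macro is applied to `delay_min >> 4` with `delay_min` in the (msec << 3) unit from patch 4; `clamp_u32()` and `delay_thresh()` are hypothetical helpers.

```c
#include <stdint.h>

#define DELAY_MIN_OLD (2u << 3)          /* 2 ms in (msec << 3) units */
#define DELAY_MIN_NEW (4u << 3)          /* 4 ms in (msec << 3) units */
#define DELAY_MAX     (16u << 3)         /* 16 ms in (msec << 3) units */

static uint32_t clamp_u32(uint32_t x, uint32_t lo, uint32_t hi)
{
    if (x < lo)
        return lo;
    if (x > hi)
        return hi;
    return x;
}

/*
 * Delay increase needed before the delay detector fires, given the
 * minimum RTT (delay_min) and the lower bound in use (lo_bound).
 */
static uint32_t delay_thresh(uint32_t delay_min, uint32_t lo_bound)
{
    return clamp_u32(delay_min >> 4, lo_bound, DELAY_MAX);
}
```

Under these assumptions, a path with an 11 ms minimum RTT (delay_min = 88) needs a 4 ms delay increase to trigger with the new bound, up from 2 ms with the old one.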
* [PATCH 7/7] tcp_cubic: fix low utilization of CUBIC with HyStart
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
` (5 preceding siblings ...)
2011-03-14 17:52 ` [PATCH 6/7] tcp_cubic: make the delay threshold of HyStart less sensitive Stephen Hemminger
@ 2011-03-14 17:52 ` Stephen Hemminger
2011-03-14 22:58 ` [PATCH 0/7] TCP CUBIC Hystart fixes David Miller
` (2 subsequent siblings)
9 siblings, 0 replies; 15+ messages in thread
From: Stephen Hemminger @ 2011-03-14 17:52 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Sangtae Ha
[-- Attachment #1: tcp-cubic-initial-growth.patch --]
[-- Type: text/plain, Size: 1077 bytes --]
From: Sangtae Ha <sangtae.ha@gmail.com>
HyStart sets the initial exit point of slow start.
Suppose that HyStart exits at 0.5BDP in a BDP network and no history exists.
If the BDP of a network is large, CUBIC's initial cwnd growth may be
too conservative to utilize the link.
With ca->cnt capped at 20, CUBIC increases the cwnd 5% per RTT in this case.
Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
---
net/ipv4/tcp_cubic.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
--- a/net/ipv4/tcp_cubic.c 2011-03-14 08:32:46.347993869 -0700
+++ b/net/ipv4/tcp_cubic.c 2011-03-14 10:57:00.846549449 -0700
@@ -270,6 +270,13 @@ static inline void bictcp_update(struct
ca->cnt = 100 * cwnd; /* very small increment*/
}
+ /*
+ * The initial growth of cubic function may be too conservative
+ * when the available bandwidth is still unknown.
+ */
+ if (ca->loss_cwnd == 0 && ca->cnt > 20)
+ ca->cnt = 20; /* increase cwnd 5% per RTT */
+
/* TCP Friendly */
if (tcp_friendliness) {
u32 scale = beta_scale;
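The relation between the `ca->cnt` cap and the per-RTT growth can be checked with a simplified model. It assumes one ACK per segment and that cwnd grows by one segment each time an ACK counter reaches `cnt`; `cwnd_after_one_rtt()` is a hypothetical helper, not kernel code.

```c
#include <stdint.h>

/*
 * Simplified model of additive increase over one RTT: count ACKs and add
 * one segment to cwnd each time the count reaches 'cnt'.
 */
static uint32_t cwnd_after_one_rtt(uint32_t cwnd, uint32_t cnt)
{
    uint32_t acks = cwnd;                /* ACKs arriving during this RTT */
    uint32_t counter = 0;

    for (uint32_t i = 0; i < acks; i++) {
        if (++counter >= cnt) {
            cwnd++;
            counter = 0;
        }
    }
    return cwnd;
}
```

In this model a cwnd of 1000 segments with cnt = 20 grows to 1050 after one RTT, which matches the 5% in the code comment; with the "very small increment" setting of 100 * cwnd it does not grow at all within one RTT.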
* Re: [PATCH 0/7] TCP CUBIC Hystart fixes
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
` (6 preceding siblings ...)
2011-03-14 17:52 ` [PATCH 7/7] tcp_cubic: fix low utilization of CUBIC with HyStart Stephen Hemminger
@ 2011-03-14 22:58 ` David Miller
2011-03-22 11:34 ` Lucas Nussbaum
2011-03-22 11:35 ` Lucas Nussbaum
9 siblings, 0 replies; 15+ messages in thread
From: David Miller @ 2011-03-14 22:58 UTC (permalink / raw)
To: shemminger; +Cc: netdev
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Mon, 14 Mar 2011 10:52:11 -0700
> This is the merge of my patches and a recent update from Sangtae.
> It addresses the problems reported by Lucas Nussbaum that Hystart causes
> poor startup performance over links with lots of buffering.
Ok, I've applied all of this to net-2.6 and did test builds with HZ={100,250,1000}
on both sparc64 and x86.
I'll let it cook for a day or two before pushing it out to Linus.
* Re: [PATCH 0/7] TCP CUBIC Hystart fixes
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
` (7 preceding siblings ...)
2011-03-14 22:58 ` [PATCH 0/7] TCP CUBIC Hystart fixes David Miller
@ 2011-03-22 11:34 ` Lucas Nussbaum
2011-03-22 11:35 ` Lucas Nussbaum
9 siblings, 0 replies; 15+ messages in thread
From: Lucas Nussbaum @ 2011-03-22 11:34 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David S. Miller, netdev
On 14/03/11 at 10:52 -0700, Stephen Hemminger wrote:
> This is the merge of my patches and a recent update from Sangtae.
> It addresses the problems reported by Lucas Nussbaum that Hystart causes
> poor startup performance over links with lots of buffering.
Hi,
I've tested the patches, and they work fine.
Here are some results (gigabit link, RTT=11ms).
Without the patches, hystart disabled:
[ASCII plot: snd_cwnd and snd_ssthresh in segments (y-axis, 0 to 2500) vs. time in seconds (x-axis, 0 to 2.5); snd_cwnd levels off at about 2000 segments]
Without the patches, hystart enabled:
[ASCII plot: snd_cwnd and snd_ssthresh in segments (y-axis, 0 to 300) vs. time in seconds (x-axis, 0 to 2.5); snd_cwnd levels off at about 250 segments]
Note how slow start ends very early (~ 230 segments), resulting in poor performance.
With the patches, hystart enabled, run 1:
[ASCII plot: snd_cwnd and snd_ssthresh in segments (y-axis, 0 to 2500) vs. time in seconds (x-axis, 0 to 4.5); snd_cwnd levels off at about 2000 segments]
There's no perceived delay increase, but also no losses. The NIC sends data at
line rate without congestion. We don't exit slow start, but that's fine.
With the patches, hystart enabled, run 2 (the most frequent situation):
[ASCII plot: snd_cwnd and snd_ssthresh in segments (y-axis, 0 to 2500) vs. time in seconds (x-axis, 0 to 4.5); snd_cwnd levels off at about 2000 segments]
Hystart detects a delay increase, so we exit slow start, but at a reasonable point.
Hystart works fine in that case (no impact on performance).
With the patches, hystart enabled, run 3:
[ASCII plot: snd_cwnd and snd_ssthresh in segments (y-axis, 0 to 2500) vs. time in seconds (x-axis, 0 to 4.5); snd_cwnd levels off at about 2000 segments, snd_ssthresh at about 1500]
Hystart causes slow start to end a bit too early, but late enough not to affect
performance significantly. Hystart behaves fine in that case too.
Tested-By: Lucas Nussbaum <lucas.nussbaum@loria.fr>
--
| Lucas Nussbaum MCF Université Nancy 2 |
| lucas.nussbaum@loria.fr LORIA / AlGorille |
| http://www.loria.fr/~lnussbau/ +33 3 54 95 86 19 |
* Re: [PATCH 0/7] TCP CUBIC Hystart fixes
2011-03-14 17:52 [PATCH 0/7] TCP CUBIC Hystart fixes Stephen Hemminger
` (8 preceding siblings ...)
2011-03-22 11:34 ` Lucas Nussbaum
@ 2011-03-22 11:35 ` Lucas Nussbaum
2011-03-22 12:05 ` David Miller
9 siblings, 1 reply; 15+ messages in thread
From: Lucas Nussbaum @ 2011-03-22 11:35 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David S. Miller, netdev
On 14/03/11 at 10:52 -0700, Stephen Hemminger wrote:
> This is the merge of my patches and a recent update from Sangtae.
> It addresses the problems reported by Lucas Nussbaum that Hystart causes
> poor startup performance over links with lots of buffering.
What do you plan to do regarding stable kernels? We should probably
either push that patch series, or disable hystart if HZ < 1000.
--
| Lucas Nussbaum MCF Université Nancy 2 |
| lucas.nussbaum@loria.fr LORIA / AlGorille |
| http://www.loria.fr/~lnussbau/ +33 3 54 95 86 19 |
* Re: [PATCH 0/7] TCP CUBIC Hystart fixes
2011-03-22 11:35 ` Lucas Nussbaum
@ 2011-03-22 12:05 ` David Miller
0 siblings, 0 replies; 15+ messages in thread
From: David Miller @ 2011-03-22 12:05 UTC (permalink / raw)
To: lucas.nussbaum; +Cc: shemminger, netdev
From: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Date: Tue, 22 Mar 2011 12:35:42 +0100
> On 14/03/11 at 10:52 -0700, Stephen Hemminger wrote:
>> This is the merge of my patches and a recent update from Sangtae.
>> It addresses the problems reported by Lucas Nussbaum that Hystart causes
>> poor startup performance over links with lots of buffering.
>
> What do you plan to do regarding stable kernels? We should probably
> either push that patch series, or disable hystart if HZ < 1000.
I think once the patch series gets some soaking time in Linus's tree
we can send it over to -stable.