* [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
@ 2012-07-20 15:02 Eric Dumazet
2012-07-20 15:07 ` Yuchung Cheng
0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2012-07-20 15:02 UTC (permalink / raw)
To: David Miller
Cc: netdev, Tom Herbert, Neal Cardwell, Yuchung Cheng,
Stephen Hemminger, John Heffner, Nandita Dukkipati
From: Eric Dumazet <edumazet@google.com>
When/if sysctl_tcp_abc > 1, we expect to increase cwnd by 2 if the
received ACK acknowledges more than 2*MSS bytes, in tcp_slow_start()
Problem is this RFC 3465 statement is not correctly coded, as
the while () loop increases snd_cwnd one by one.
Add a new variable to avoid this off-by one error.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: John Heffner <johnwheffner@gmail.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
---
v2: added John suggestion
net/ipv4/tcp_cong.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index 04dbd7a..4d4db16 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -307,6 +307,7 @@ EXPORT_SYMBOL_GPL(tcp_is_cwnd_limited);
void tcp_slow_start(struct tcp_sock *tp)
{
int cnt; /* increase in packets */
+ unsigned int delta = 0;
/* RFC3465: ABC Slow start
* Increase only after a full MSS of bytes is acked
@@ -333,9 +334,9 @@ void tcp_slow_start(struct tcp_sock *tp)
tp->snd_cwnd_cnt += cnt;
while (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
tp->snd_cwnd_cnt -= tp->snd_cwnd;
- if (tp->snd_cwnd < tp->snd_cwnd_clamp)
- tp->snd_cwnd++;
+ delta++;
}
+ tp->snd_cwnd = min(tp->snd_cwnd + delta, tp->snd_cwnd_clamp);
}
EXPORT_SYMBOL_GPL(tcp_slow_start);
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
2012-07-20 15:02 [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start() Eric Dumazet
@ 2012-07-20 15:07 ` Yuchung Cheng
2012-07-20 16:03 ` Neal Cardwell
2012-07-20 17:58 ` Neal Cardwell
0 siblings, 2 replies; 10+ messages in thread
From: Yuchung Cheng @ 2012-07-20 15:07 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, netdev, Tom Herbert, Neal Cardwell,
Stephen Hemminger, John Heffner, Nandita Dukkipati
On Fri, Jul 20, 2012 at 8:02 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> When/if sysctl_tcp_abc > 1, we expect to increase cwnd by 2 if the
> received ACK acknowledges more than 2*MSS bytes, in tcp_slow_start()
>
> Problem is this RFC 3465 statement is not correctly coded, as
> the while () loop increases snd_cwnd one by one.
>
> Add a new variable to avoid this off-by one error.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
> Cc: Tom Herbert <therbert@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Nandita Dukkipati <nanditad@google.com>
> Cc: John Heffner <johnwheffner@gmail.com>
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> ---
> v2: added John suggestion
>
> net/ipv4/tcp_cong.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
> index 04dbd7a..4d4db16 100644
> --- a/net/ipv4/tcp_cong.c
> +++ b/net/ipv4/tcp_cong.c
> @@ -307,6 +307,7 @@ EXPORT_SYMBOL_GPL(tcp_is_cwnd_limited);
> void tcp_slow_start(struct tcp_sock *tp)
> {
> int cnt; /* increase in packets */
> + unsigned int delta = 0;
>
> /* RFC3465: ABC Slow start
> * Increase only after a full MSS of bytes is acked
> @@ -333,9 +334,9 @@ void tcp_slow_start(struct tcp_sock *tp)
> tp->snd_cwnd_cnt += cnt;
> while (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
> tp->snd_cwnd_cnt -= tp->snd_cwnd;
> - if (tp->snd_cwnd < tp->snd_cwnd_clamp)
> - tp->snd_cwnd++;
> + delta++;
Nice! this also removes wasteful iteration when clamp << cwnd_cnt.
> }
> + tp->snd_cwnd = min(tp->snd_cwnd + delta, tp->snd_cwnd_clamp);
> }
> EXPORT_SYMBOL_GPL(tcp_slow_start);
>
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
2012-07-20 15:07 ` Yuchung Cheng
@ 2012-07-20 16:03 ` Neal Cardwell
2012-07-20 16:08 ` Eric Dumazet
2012-07-20 17:58 ` Neal Cardwell
1 sibling, 1 reply; 10+ messages in thread
From: Neal Cardwell @ 2012-07-20 16:03 UTC (permalink / raw)
To: Yuchung Cheng
Cc: Eric Dumazet, David Miller, netdev, Tom Herbert,
Stephen Hemminger, John Heffner, Nandita Dukkipati
On Fri, Jul 20, 2012 at 8:07 AM, Yuchung Cheng <ycheng@google.com> wrote:
> On Fri, Jul 20, 2012 at 8:02 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> tp->snd_cwnd_cnt += cnt;
>> while (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
Nice catch, Eric.
One thing that's always bothered me about the tp->snd_cwnd_cnt code is
that the slow start and congestion avoidance use different criteria
for incrementing snd_cwnd_cnt. tcp_slow_start() increments
snd_cwnd_cnt by snd_cwnd for each ACKed packet, and congestion
avoidance increases snd_cwnd_cnt by just 1 for each packet.
This means that if we exit slow start and enter congestion avoidance,
then we think we can have a "credit" for a bunch of ACKs that never
happened (up to snd_cwnd-1), so we can conceivably do our first
additive increase in congestion avoidance up to almost 1RTT too
early. Can we just get rid of the use of snd_cwnd_cnt in slow start,
and just use local variables in tcp_slow_start() rather than trying to
carry state between ACKs?
neal
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
2012-07-20 16:03 ` Neal Cardwell
@ 2012-07-20 16:08 ` Eric Dumazet
2012-07-20 16:50 ` Neal Cardwell
0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2012-07-20 16:08 UTC (permalink / raw)
To: Neal Cardwell
Cc: Yuchung Cheng, David Miller, netdev, Tom Herbert,
Stephen Hemminger, John Heffner, Nandita Dukkipati
On Fri, 2012-07-20 at 09:03 -0700, Neal Cardwell wrote:
> On Fri, Jul 20, 2012 at 8:07 AM, Yuchung Cheng <ycheng@google.com> wrote:
> > On Fri, Jul 20, 2012 at 8:02 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >> tp->snd_cwnd_cnt += cnt;
> >> while (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
>
> Nice catch, Eric.
>
> One thing that's always bothered me about the tp->snd_cwnd_cnt code is
> that the slow start and congestion avoidance use different criteria
> for incrementing snd_cwnd_cnt. tcp_slow_start() increments
> snd_cwnd_cnt by snd_cwnd for each ACKed packet, and congestion
> avoidance increases snd_cwnd_cnt by just 1 for each packet.
>
> This means that if we exit slow start and enter congestion avoidance,
> then we think we can have a "credit" for a bunch of ACKs that never
> happened (up to snd_cwnd-1), so we can conceivably do our first
> additive increase in congestion avoidance up to almost 1RTT too
> early. Can we just get rid of the use of snd_cwnd_cnt in slow start,
> and just use local variables in tcp_slow_start() rather than trying to
> carry state between ACKs?
Apparently tcp_slow_start() needs the snd_cwnd_cnt in case
"limited slow start" is used :
cnt = sysctl_tcp_max_ssthresh >> 1;
So to address your point, maybe we should clear snd_cwnd_cnt
when leaving slow start for congestion avoidance phase ?
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
2012-07-20 16:08 ` Eric Dumazet
@ 2012-07-20 16:50 ` Neal Cardwell
0 siblings, 0 replies; 10+ messages in thread
From: Neal Cardwell @ 2012-07-20 16:50 UTC (permalink / raw)
To: Eric Dumazet
Cc: Yuchung Cheng, David Miller, netdev, Tom Herbert,
Stephen Hemminger, John Heffner, Nandita Dukkipati
On Fri, Jul 20, 2012 at 9:08 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> So to address your point, maybe we should clear snd_cwnd_cnt
> when leaving slow start for congestion avoidance phase ?
Sounds good. That can be a separate commit to add the new logic to the
end of tcp_slow_start() to check to see if we've bumped into ssthresh
and reset snd_cwnd_cnt.
neal
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
2012-07-20 15:07 ` Yuchung Cheng
2012-07-20 16:03 ` Neal Cardwell
@ 2012-07-20 17:58 ` Neal Cardwell
2012-07-20 18:01 ` David Miller
1 sibling, 1 reply; 10+ messages in thread
From: Neal Cardwell @ 2012-07-20 17:58 UTC (permalink / raw)
To: Yuchung Cheng
Cc: Eric Dumazet, David Miller, netdev, Tom Herbert,
Stephen Hemminger, John Heffner, Nandita Dukkipati
On Fri, Jul 20, 2012 at 8:07 AM, Yuchung Cheng <ycheng@google.com> wrote:
> On Fri, Jul 20, 2012 at 8:02 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> From: Eric Dumazet <edumazet@google.com>
>>
>> When/if sysctl_tcp_abc > 1, we expect to increase cwnd by 2 if the
>> received ACK acknowledges more than 2*MSS bytes, in tcp_slow_start()
>>
>> Problem is this RFC 3465 statement is not correctly coded, as
>> the while () loop increases snd_cwnd one by one.
>>
>> Add a new variable to avoid this off-by one error.
>>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
neal
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
2012-07-20 17:58 ` Neal Cardwell
@ 2012-07-20 18:01 ` David Miller
0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2012-07-20 18:01 UTC (permalink / raw)
To: ncardwell
Cc: ycheng, eric.dumazet, netdev, therbert, shemminger, johnwheffner,
nanditad
From: Neal Cardwell <ncardwell@google.com>
Date: Fri, 20 Jul 2012 10:58:27 -0700
> On Fri, Jul 20, 2012 at 8:07 AM, Yuchung Cheng <ycheng@google.com> wrote:
>> On Fri, Jul 20, 2012 at 8:02 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> From: Eric Dumazet <edumazet@google.com>
>>>
>>> When/if sysctl_tcp_abc > 1, we expect to increase cwnd by 2 if the
>>> received ACK acknowledges more than 2*MSS bytes, in tcp_slow_start()
>>>
>>> Problem is this RFC 3465 statement is not correctly coded, as
>>> the while () loop increases snd_cwnd one by one.
>>>
>>> Add a new variable to avoid this off-by one error.
>>>
>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>> Acked-by: Yuchung Cheng <ycheng@google.com>
>
> Acked-by: Neal Cardwell <ncardwell@google.com>
Applied.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
@ 2012-07-20 18:01 Stephen Hemminger
2012-07-20 18:02 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2012-07-20 18:01 UTC (permalink / raw)
To: Neal Cardwell
Cc: Yuchung Cheng, Eric Dumazet, David Miller, netdev, Tom Herbert,
Stephen Hemminger, John Heffner, Nandita Dukkipati
TCP ABC was an experiment that failed. It solves a problem that does not exist on Linux. I did the implementation mostly to prove that.
Perhaps it abc should just be removed?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
2012-07-20 18:01 Stephen Hemminger
@ 2012-07-20 18:02 ` David Miller
2012-07-20 18:13 ` Neal Cardwell
0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2012-07-20 18:02 UTC (permalink / raw)
To: stephen.hemminger
Cc: ncardwell, ycheng, eric.dumazet, netdev, therbert, shemminger,
johnwheffner, nanditad
From: Stephen Hemminger <stephen.hemminger@vyatta.com>
Date: Fri, 20 Jul 2012 11:01:43 -0700
> TCP ABC was an experiment that failed. It solves a problem that does
> not exist on Linux. I did the implementation mostly to prove that.
>
> Perhaps it abc should just be removed?
This is my impression as well.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start()
2012-07-20 18:02 ` David Miller
@ 2012-07-20 18:13 ` Neal Cardwell
0 siblings, 0 replies; 10+ messages in thread
From: Neal Cardwell @ 2012-07-20 18:13 UTC (permalink / raw)
To: David Miller
Cc: stephen.hemminger, ycheng, eric.dumazet, netdev, therbert,
shemminger, johnwheffner, nanditad
On Fri, Jul 20, 2012 at 11:02 AM, David Miller <davem@davemloft.net> wrote:
> From: Stephen Hemminger <stephen.hemminger@vyatta.com>
> Date: Fri, 20 Jul 2012 11:01:43 -0700
>
>> TCP ABC was an experiment that failed. It solves a problem that does
>> not exist on Linux. I did the implementation mostly to prove that.
>>
>> Perhaps it abc should just be removed?
>
> This is my impression as well.
Removing ABC sounds good to me as well.
neal
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2012-07-20 18:13 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-07-20 15:02 [PATCH v2 net-next] tcp: fix ABC in tcp_slow_start() Eric Dumazet
2012-07-20 15:07 ` Yuchung Cheng
2012-07-20 16:03 ` Neal Cardwell
2012-07-20 16:08 ` Eric Dumazet
2012-07-20 16:50 ` Neal Cardwell
2012-07-20 17:58 ` Neal Cardwell
2012-07-20 18:01 ` David Miller
-- strict thread matches above, loose matches on Subject: below --
2012-07-20 18:01 Stephen Hemminger
2012-07-20 18:02 ` David Miller
2012-07-20 18:13 ` Neal Cardwell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox