* [PATCH net 2/5] tcp_bbr: introduce bbr_bw_to_pacing_rate() helper
2017-07-14 21:49 [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe Neal Cardwell
@ 2017-07-14 21:49 ` Neal Cardwell
2017-07-14 21:49 ` [PATCH net 3/5] tcp_bbr: introduce bbr_init_pacing_rate_from_rtt() helper Neal Cardwell
` (4 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Neal Cardwell @ 2017-07-14 21:49 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh
Introduce a helper to convert a BBR bandwidth and gain factor to a
pacing rate in bytes per second. This is a pure refactor, but is
needed for two following fixes.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
---
net/ipv4/tcp_bbr.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 743e97511dc8..29e23b851b97 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -211,6 +211,16 @@ static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain)
return rate >> BW_SCALE;
}
+/* Convert a BBR bw and gain factor to a pacing rate in bytes per second. */
+static u32 bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain)
+{
+ u64 rate = bw;
+
+ rate = bbr_rate_bytes_per_sec(sk, rate, gain);
+ rate = min_t(u64, rate, sk->sk_max_pacing_rate);
+ return rate;
+}
+
/* Pace using current bw estimate and a gain factor. In order to help drive the
* network toward lower queues while maintaining high utilization and low
* latency, the average pacing rate aims to be slightly (~1%) lower than the
@@ -220,10 +230,8 @@ static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain)
*/
static void bbr_set_pacing_rate(struct sock *sk, u32 bw, int gain)
{
- u64 rate = bw;
+ u32 rate = bbr_bw_to_pacing_rate(sk, bw, gain);
- rate = bbr_rate_bytes_per_sec(sk, rate, gain);
- rate = min_t(u64, rate, sk->sk_max_pacing_rate);
if (bbr_full_bw_reached(sk) || rate > sk->sk_pacing_rate)
sk->sk_pacing_rate = rate;
}
--
2.13.2.932.g7449e964c-goog
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net 3/5] tcp_bbr: introduce bbr_init_pacing_rate_from_rtt() helper
2017-07-14 21:49 [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe Neal Cardwell
2017-07-14 21:49 ` [PATCH net 2/5] tcp_bbr: introduce bbr_bw_to_pacing_rate() helper Neal Cardwell
@ 2017-07-14 21:49 ` Neal Cardwell
2017-07-14 21:49 ` [PATCH net 4/5] tcp_bbr: remove sk_pacing_rate=0 transient during init Neal Cardwell
` (3 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Neal Cardwell @ 2017-07-14 21:49 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh
Introduce a helper to initialize the BBR pacing rate unconditionally,
based on the current cwnd and RTT estimate. This is a pure refactor,
but is needed for two following fixes.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
---
net/ipv4/tcp_bbr.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 29e23b851b97..3276140c2506 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -221,6 +221,23 @@ static u32 bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain)
return rate;
}
+/* Initialize pacing rate to: high_gain * init_cwnd / RTT. */
+static void bbr_init_pacing_rate_from_rtt(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ u64 bw;
+ u32 rtt_us;
+
+ if (tp->srtt_us) { /* any RTT sample yet? */
+ rtt_us = max(tp->srtt_us >> 3, 1U);
+ } else { /* no RTT sample yet */
+ rtt_us = USEC_PER_MSEC; /* use nominal default RTT */
+ }
+ bw = (u64)tp->snd_cwnd * BW_UNIT;
+ do_div(bw, rtt_us);
+ sk->sk_pacing_rate = bbr_bw_to_pacing_rate(sk, bw, bbr_high_gain);
+}
+
/* Pace using current bw estimate and a gain factor. In order to help drive the
* network toward lower queues while maintaining high utilization and low
* latency, the average pacing rate aims to be slightly (~1%) lower than the
@@ -805,7 +822,6 @@ static void bbr_init(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
struct bbr *bbr = inet_csk_ca(sk);
- u64 bw;
bbr->prior_cwnd = 0;
bbr->tso_segs_goal = 0; /* default segs per skb until first ACK */
@@ -821,11 +837,8 @@ static void bbr_init(struct sock *sk)
minmax_reset(&bbr->bw, bbr->rtt_cnt, 0); /* init max bw to 0 */
- /* Initialize pacing rate to: high_gain * init_cwnd / RTT. */
- bw = (u64)tp->snd_cwnd * BW_UNIT;
- do_div(bw, (tp->srtt_us >> 3) ? : USEC_PER_MSEC);
sk->sk_pacing_rate = 0; /* force an update of sk_pacing_rate */
- bbr_set_pacing_rate(sk, bw, bbr_high_gain);
+ bbr_init_pacing_rate_from_rtt(sk);
bbr->restore_cwnd = 0;
bbr->round_start = 0;
--
2.13.2.932.g7449e964c-goog
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net 4/5] tcp_bbr: remove sk_pacing_rate=0 transient during init
2017-07-14 21:49 [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe Neal Cardwell
2017-07-14 21:49 ` [PATCH net 2/5] tcp_bbr: introduce bbr_bw_to_pacing_rate() helper Neal Cardwell
2017-07-14 21:49 ` [PATCH net 3/5] tcp_bbr: introduce bbr_init_pacing_rate_from_rtt() helper Neal Cardwell
@ 2017-07-14 21:49 ` Neal Cardwell
2017-07-14 21:49 ` [PATCH net 5/5] tcp_bbr: init pacing rate on first RTT sample Neal Cardwell
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Neal Cardwell @ 2017-07-14 21:49 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh
Fix a corner case noticed by Eric Dumazet, where BBR's setting
sk->sk_pacing_rate to 0 during initialization could theoretically
cause packets in the sending host to hang if there were packets "in
flight" in the pacing infrastructure at the time the BBR congestion
control state is initialized. This could occur if the pacing
infrastructure happened to race with bbr_init() in a way such that the
pacer read the 0 rather than the immediately following non-zero pacing
rate.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
---
net/ipv4/tcp_bbr.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 3276140c2506..42e0017f2ebc 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -837,7 +837,6 @@ static void bbr_init(struct sock *sk)
minmax_reset(&bbr->bw, bbr->rtt_cnt, 0); /* init max bw to 0 */
- sk->sk_pacing_rate = 0; /* force an update of sk_pacing_rate */
bbr_init_pacing_rate_from_rtt(sk);
bbr->restore_cwnd = 0;
--
2.13.2.932.g7449e964c-goog
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net 5/5] tcp_bbr: init pacing rate on first RTT sample
2017-07-14 21:49 [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe Neal Cardwell
` (2 preceding siblings ...)
2017-07-14 21:49 ` [PATCH net 4/5] tcp_bbr: remove sk_pacing_rate=0 transient during init Neal Cardwell
@ 2017-07-14 21:49 ` Neal Cardwell
2017-07-14 22:36 ` [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe Stephen Hemminger
2017-07-15 21:44 ` David Miller
5 siblings, 0 replies; 11+ messages in thread
From: Neal Cardwell @ 2017-07-14 21:49 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh
Fixes the following behavior: for connections that had no RTT sample
at the time of initializing congestion control, BBR was initializing
the pacing rate to a high nominal rate (based an a guess of RTT=1ms,
in case this is LAN traffic). Then BBR never adjusted the pacing rate
downward upon obtaining an actual RTT sample, if the connection never
filled the pipe (e.g. all sends were small app-limited writes()).
This fix adjusts the pacing rate upon obtaining the first RTT sample.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
---
net/ipv4/tcp_bbr.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 42e0017f2ebc..69ee877574d0 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -112,7 +112,8 @@ struct bbr {
cwnd_gain:10, /* current gain for setting cwnd */
full_bw_cnt:3, /* number of rounds without large bw gains */
cycle_idx:3, /* current index in pacing_gain cycle array */
- unused_b:6;
+ has_seen_rtt:1, /* have we seen an RTT sample yet? */
+ unused_b:5;
u32 prior_cwnd; /* prior cwnd upon entering loss recovery */
u32 full_bw; /* recent bw, to estimate if pipe is full */
};
@@ -225,11 +226,13 @@ static u32 bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain)
static void bbr_init_pacing_rate_from_rtt(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
u64 bw;
u32 rtt_us;
if (tp->srtt_us) { /* any RTT sample yet? */
rtt_us = max(tp->srtt_us >> 3, 1U);
+ bbr->has_seen_rtt = 1;
} else { /* no RTT sample yet */
rtt_us = USEC_PER_MSEC; /* use nominal default RTT */
}
@@ -247,8 +250,12 @@ static void bbr_init_pacing_rate_from_rtt(struct sock *sk)
*/
static void bbr_set_pacing_rate(struct sock *sk, u32 bw, int gain)
{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
u32 rate = bbr_bw_to_pacing_rate(sk, bw, gain);
+ if (unlikely(!bbr->has_seen_rtt && tp->srtt_us))
+ bbr_init_pacing_rate_from_rtt(sk);
if (bbr_full_bw_reached(sk) || rate > sk->sk_pacing_rate)
sk->sk_pacing_rate = rate;
}
@@ -837,6 +844,7 @@ static void bbr_init(struct sock *sk)
minmax_reset(&bbr->bw, bbr->rtt_cnt, 0); /* init max bw to 0 */
+ bbr->has_seen_rtt = 0;
bbr_init_pacing_rate_from_rtt(sk);
bbr->restore_cwnd = 0;
--
2.13.2.932.g7449e964c-goog
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
2017-07-14 21:49 [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe Neal Cardwell
` (3 preceding siblings ...)
2017-07-14 21:49 ` [PATCH net 5/5] tcp_bbr: init pacing rate on first RTT sample Neal Cardwell
@ 2017-07-14 22:36 ` Stephen Hemminger
2017-07-14 22:54 ` Neal Cardwell
2017-07-15 21:44 ` David Miller
5 siblings, 1 reply; 11+ messages in thread
From: Stephen Hemminger @ 2017-07-14 22:36 UTC (permalink / raw)
To: Neal Cardwell; +Cc: David Miller, netdev, Yuchung Cheng, Soheil Hassas Yeganeh
On Fri, 14 Jul 2017 17:49:21 -0400
Neal Cardwell <ncardwell@google.com> wrote:
> In bbr_set_pacing_rate(), which decides whether to cut the pacing
> rate, there was some code that considered exiting STARTUP to be
> equivalent to the notion of filling the pipe (i.e.,
> bbr_full_bw_reached()). Specifically, as the code was structured,
> exiting STARTUP and going into PROBE_RTT could cause us to cut the
> pacing rate down to something silly and low, based on whatever
> bandwidth samples we've had so far, when it's possible that all of
> them have been small app-limited bandwidth samples that are not
> representative of the bandwidth available in the path. (The code was
> correct at the time it was written, but the state machine changed
> without this spot being adjusted correspondingly.)
>
> Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Looks good, but net-next is closed at this time. Please resubmit later.
http://vger.kernel.org/~davem/net-next.html
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
2017-07-14 22:36 ` [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe Stephen Hemminger
@ 2017-07-14 22:54 ` Neal Cardwell
2017-07-14 23:15 ` Stephen Hemminger
0 siblings, 1 reply; 11+ messages in thread
From: Neal Cardwell @ 2017-07-14 22:54 UTC (permalink / raw)
To: Stephen Hemminger
Cc: David Miller, Netdev, Yuchung Cheng, Soheil Hassas Yeganeh
On Fri, Jul 14, 2017 at 6:36 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Fri, 14 Jul 2017 17:49:21 -0400
> Neal Cardwell <ncardwell@google.com> wrote:
>
>> In bbr_set_pacing_rate(), which decides whether to cut the pacing
>> rate, there was some code that considered exiting STARTUP to be
>> equivalent to the notion of filling the pipe (i.e.,
>> bbr_full_bw_reached()). Specifically, as the code was structured,
>> exiting STARTUP and going into PROBE_RTT could cause us to cut the
>> pacing rate down to something silly and low, based on whatever
>> bandwidth samples we've had so far, when it's possible that all of
>> them have been small app-limited bandwidth samples that are not
>> representative of the bandwidth available in the path. (The code was
>> correct at the time it was written, but the state machine changed
>> without this spot being adjusted correspondingly.)
>>
>> Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
>> Signed-off-by: Neal Cardwell <ncardwell@google.com>
>> Signed-off-by: Yuchung Cheng <ycheng@google.com>
>> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
>
> Looks good, but net-next is closed at this time. Please resubmit later.
>
> http://vger.kernel.org/~davem/net-next.html
Thanks, Stephen. We see your point on the net-next patch "tcp: adjust
tail loss probe timeout"; we'll resubmit that patch when net-next
opens. Sorry about that!
But for the tcp_bbr patch series, those are bug fixes, and we were
marking them as being for "net" with Fixes: footers in the hopes that
they could go into the "net" branch and be queued up for inclusion in
-stable releases. Are you saying that in your estimation the substance
of the fixes doesn't rise to the level of "net" material? If that is
the consensus then we can resubmit for net-next when that opens.
thanks,
neal
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
2017-07-14 22:54 ` Neal Cardwell
@ 2017-07-14 23:15 ` Stephen Hemminger
2017-07-15 0:29 ` Neal Cardwell
0 siblings, 1 reply; 11+ messages in thread
From: Stephen Hemminger @ 2017-07-14 23:15 UTC (permalink / raw)
To: Neal Cardwell; +Cc: David Miller, Netdev, Yuchung Cheng, Soheil Hassas Yeganeh
On Fri, 14 Jul 2017 18:54:02 -0400
Neal Cardwell <ncardwell@google.com> wrote:
> On Fri, Jul 14, 2017 at 6:36 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > On Fri, 14 Jul 2017 17:49:21 -0400
> > Neal Cardwell <ncardwell@google.com> wrote:
> >
> >> In bbr_set_pacing_rate(), which decides whether to cut the pacing
> >> rate, there was some code that considered exiting STARTUP to be
> >> equivalent to the notion of filling the pipe (i.e.,
> >> bbr_full_bw_reached()). Specifically, as the code was structured,
> >> exiting STARTUP and going into PROBE_RTT could cause us to cut the
> >> pacing rate down to something silly and low, based on whatever
> >> bandwidth samples we've had so far, when it's possible that all of
> >> them have been small app-limited bandwidth samples that are not
> >> representative of the bandwidth available in the path. (The code was
> >> correct at the time it was written, but the state machine changed
> >> without this spot being adjusted correspondingly.)
> >>
> >> Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
> >> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> >> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> >> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
> >
You are correct, these look more like bug fixes. I was a little concerned
that the changes would be visible but they really aren't user visible.
Should they go to stable as well?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
2017-07-14 23:15 ` Stephen Hemminger
@ 2017-07-15 0:29 ` Neal Cardwell
0 siblings, 0 replies; 11+ messages in thread
From: Neal Cardwell @ 2017-07-15 0:29 UTC (permalink / raw)
To: Stephen Hemminger
Cc: David Miller, Netdev, Yuchung Cheng, Soheil Hassas Yeganeh
On Fri, Jul 14, 2017 at 7:15 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Fri, 14 Jul 2017 18:54:02 -0400
> Neal Cardwell <ncardwell@google.com> wrote:
>
>> On Fri, Jul 14, 2017 at 6:36 PM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>> > On Fri, 14 Jul 2017 17:49:21 -0400
>> > Neal Cardwell <ncardwell@google.com> wrote:
>> >
>> >> In bbr_set_pacing_rate(), which decides whether to cut the pacing
>> >> rate, there was some code that considered exiting STARTUP to be
>> >> equivalent to the notion of filling the pipe (i.e.,
>> >> bbr_full_bw_reached()). Specifically, as the code was structured,
>> >> exiting STARTUP and going into PROBE_RTT could cause us to cut the
>> >> pacing rate down to something silly and low, based on whatever
>> >> bandwidth samples we've had so far, when it's possible that all of
>> >> them have been small app-limited bandwidth samples that are not
>> >> representative of the bandwidth available in the path. (The code was
>> >> correct at the time it was written, but the state machine changed
>> >> without this spot being adjusted correspondingly.)
>> >>
>> >> Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
>> >> Signed-off-by: Neal Cardwell <ncardwell@google.com>
>> >> Signed-off-by: Yuchung Cheng <ycheng@google.com>
>> >> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
>> >
>
> You are correct, these look more like bug fixes. I was a little concerned
> that the changes would be visible but they really aren't user visible.
Yes, exactly.
> Should they go to stable as well?
Yes, please. The intention was for this whole 5-patch BBR pacing
bug-fix series to go into "net" and into the -stable queue together.
thanks,
neal
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
2017-07-14 21:49 [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe Neal Cardwell
` (4 preceding siblings ...)
2017-07-14 22:36 ` [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe Stephen Hemminger
@ 2017-07-15 21:44 ` David Miller
2017-07-16 5:18 ` Neal Cardwell
5 siblings, 1 reply; 11+ messages in thread
From: David Miller @ 2017-07-15 21:44 UTC (permalink / raw)
To: ncardwell; +Cc: netdev, ycheng, soheil
Series applied and queued up for -stable.
Please provide a proper "[PATCH net 0/N] " header posting next time.
All patch series should have one.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
2017-07-15 21:44 ` David Miller
@ 2017-07-16 5:18 ` Neal Cardwell
0 siblings, 0 replies; 11+ messages in thread
From: Neal Cardwell @ 2017-07-16 5:18 UTC (permalink / raw)
To: David Miller; +Cc: Netdev, Yuchung Cheng, Soheil Hassas Yeganeh
On Sat, Jul 15, 2017 at 5:44 PM, David Miller <davem@davemloft.net> wrote:
>
> Series applied and queued up for -stable.
>
> Please provide a proper "[PATCH net 0/N] " header posting next time.
> All patch series should have one.
Sorry about that. Will do!
Thanks,
neal
^ permalink raw reply [flat|nested] 11+ messages in thread