* [PATCH] sch_red: fix red_calc_qavg_from_idle_time
@ 2011-11-30 21:34 Eric Dumazet
2011-11-30 21:40 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2011-11-30 21:34 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Stephen Hemminger, David Täht
Since commit a4a710c4a7490587 (pkt_sched: Change PSCHED_SHIFT from 10 to
6) it seems [G]RED are broken in red_calc_qavg_from_idle_time()
This function computes a delay in us units, but this delay is now 16
times bigger than real delay, so the final qavg result is wrong...
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
Maybe we should use native kernel time services in red ?
(ktime_get(), ktime_us_delta())
include/net/pkt_sched.h | 1 +
include/net/red.h | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index fffdc60..c719d9b 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -43,6 +43,7 @@ typedef long psched_tdiff_t;
/* Avoid doing 64 bit divide */
#define PSCHED_SHIFT 6
#define PSCHED_TICKS2NS(x) ((s64)(x) << PSCHED_SHIFT)
+#define PSCHED_TICKS2US(x) (PSCHED_TICKS2NS(x) >> 10) /* approximate 1000 by 1024 */
#define PSCHED_NS2TICKS(x) ((x) >> PSCHED_SHIFT)
#define PSCHED_TICKS_PER_SEC PSCHED_NS2TICKS(NSEC_PER_SEC)
diff --git a/include/net/red.h b/include/net/red.h
index 3319f16..a413638 100644
--- a/include/net/red.h
+++ b/include/net/red.h
@@ -175,7 +175,7 @@ static inline unsigned long red_calc_qavg_from_idle_time(struct red_parms *p)
int shift;
now = psched_get_time();
- us_idle = psched_tdiff_bounded(now, p->qidlestart, p->Scell_max);
+ us_idle = PSCHED_TICKS2US(psched_tdiff_bounded(now, p->qidlestart, p->Scell_max));
/*
* The problem: ideally, average length queue recalcultion should
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] sch_red: fix red_calc_qavg_from_idle_time
2011-11-30 21:34 [PATCH] sch_red: fix red_calc_qavg_from_idle_time Eric Dumazet
@ 2011-11-30 21:40 ` Eric Dumazet
2011-11-30 22:10 ` [PATCH V2] " Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2011-11-30 21:40 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Stephen Hemminger, David Täht
Le mercredi 30 novembre 2011 à 22:34 +0100, Eric Dumazet a écrit :
> Since commit a4a710c4a7490587 (pkt_sched: Change PSCHED_SHIFT from 10 to
> 6) it seems [G]RED are broken in red_calc_qavg_from_idle_time()
>
> This function computes a delay in us units, but this delay is now 16
> times bigger than real delay, so the final qavg result is wrong...
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> Maybe we should use native kernel time services in red ?
> (ktime_get(), ktime_us_delta())
>
> include/net/pkt_sched.h | 1 +
> include/net/red.h | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> index fffdc60..c719d9b 100644
> --- a/include/net/pkt_sched.h
> +++ b/include/net/pkt_sched.h
> @@ -43,6 +43,7 @@ typedef long psched_tdiff_t;
> /* Avoid doing 64 bit divide */
> #define PSCHED_SHIFT 6
> #define PSCHED_TICKS2NS(x) ((s64)(x) << PSCHED_SHIFT)
> +#define PSCHED_TICKS2US(x) (PSCHED_TICKS2NS(x) >> 10) /* approximate 1000 by 1024 */
> #define PSCHED_NS2TICKS(x) ((x) >> PSCHED_SHIFT)
>
> #define PSCHED_TICKS_PER_SEC PSCHED_NS2TICKS(NSEC_PER_SEC)
> diff --git a/include/net/red.h b/include/net/red.h
> index 3319f16..a413638 100644
> --- a/include/net/red.h
> +++ b/include/net/red.h
> @@ -175,7 +175,7 @@ static inline unsigned long red_calc_qavg_from_idle_time(struct red_parms *p)
> int shift;
>
> now = psched_get_time();
> - us_idle = psched_tdiff_bounded(now, p->qidlestart, p->Scell_max);
> + us_idle = PSCHED_TICKS2US(psched_tdiff_bounded(now, p->qidlestart, p->Scell_max));
Hmm, this is not correct the bound should be applied on the us result,
not on tick I presume (one tick == 64ns)
I'll test another patch.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH V2] sch_red: fix red_calc_qavg_from_idle_time
2011-11-30 21:40 ` Eric Dumazet
@ 2011-11-30 22:10 ` Eric Dumazet
2011-11-30 22:35 ` Dave Taht
2011-12-01 4:29 ` [PATCH V2] sch_red: fix red_calc_qavg_from_idle_time David Miller
0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2011-11-30 22:10 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Stephen Hemminger, David Täht
Since commit a4a710c4a7490587 (pkt_sched: Change PSCHED_SHIFT from 10 to
6) it seems RED/GRED are broken.
red_calc_qavg_from_idle_time() computes a delay in us units, but this
delay is now 16 times bigger than real delay, so the final qavg result
smaller than expected.
Use standard kernel time services since there is no need to obfuscate
them.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
include/net/red.h | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/include/net/red.h b/include/net/red.h
index 3319f16..b72a3b8 100644
--- a/include/net/red.h
+++ b/include/net/red.h
@@ -116,7 +116,7 @@ struct red_parms {
u32 qR; /* Cached random number */
unsigned long qavg; /* Average queue length: A scaled */
- psched_time_t qidlestart; /* Start of current idle period */
+ ktime_t qidlestart; /* Start of current idle period */
};
static inline u32 red_rmask(u8 Plog)
@@ -148,17 +148,17 @@ static inline void red_set_parms(struct red_parms *p,
static inline int red_is_idling(struct red_parms *p)
{
- return p->qidlestart != PSCHED_PASTPERFECT;
+ return p->qidlestart.tv64 != 0;
}
static inline void red_start_of_idle_period(struct red_parms *p)
{
- p->qidlestart = psched_get_time();
+ p->qidlestart = ktime_get();
}
static inline void red_end_of_idle_period(struct red_parms *p)
{
- p->qidlestart = PSCHED_PASTPERFECT;
+ p->qidlestart.tv64 = 0;
}
static inline void red_restart(struct red_parms *p)
@@ -170,13 +170,10 @@ static inline void red_restart(struct red_parms *p)
static inline unsigned long red_calc_qavg_from_idle_time(struct red_parms *p)
{
- psched_time_t now;
- long us_idle;
+ s64 delta = ktime_us_delta(ktime_get(), p->qidlestart);
+ long us_idle = min_t(s64, delta, p->Scell_max);
int shift;
- now = psched_get_time();
- us_idle = psched_tdiff_bounded(now, p->qidlestart, p->Scell_max);
-
/*
* The problem: ideally, average length queue recalcultion should
* be done over constant clock intervals. This is too expensive, so
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH V2] sch_red: fix red_calc_qavg_from_idle_time
2011-11-30 22:10 ` [PATCH V2] " Eric Dumazet
@ 2011-11-30 22:35 ` Dave Taht
2011-11-30 22:44 ` Eric Dumazet
2011-12-01 4:29 ` [PATCH V2] sch_red: fix red_calc_qavg_from_idle_time David Miller
1 sibling, 1 reply; 9+ messages in thread
From: Dave Taht @ 2011-11-30 22:35 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev, Stephen Hemminger
On Wed, Nov 30, 2011 at 11:10 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Since commit a4a710c4a7490587 (pkt_sched: Change PSCHED_SHIFT from 10 to
> 6) it seems RED/GRED are broken.
> red_calc_qavg_from_idle_time() computes a delay in us units, but this
> delay is now 16 times bigger than real delay, so the final qavg result
> smaller than expected.
I've been scratching my head about RED's behavior for a while.
Pre-BQL I was unable to even see red behave, so we're still ahead
on this game.
I will apply this patch to the ongoing work. Thank you.
http://www.bufferbloat.net/issues/312
One of my other problems is when I try to size red (or choke) appropriately
(or so I think) for GigE bandwidths and queue depths, it would fail
to calculate the ewma value.
--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH V2] sch_red: fix red_calc_qavg_from_idle_time
2011-11-30 22:35 ` Dave Taht
@ 2011-11-30 22:44 ` Eric Dumazet
2011-12-01 11:04 ` [PATCH iproute2] red: give a hint about burst value Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2011-11-30 22:44 UTC (permalink / raw)
To: Dave Taht; +Cc: David Miller, netdev, Stephen Hemminger
Le mercredi 30 novembre 2011 à 23:35 +0100, Dave Taht a écrit :
> One of my other problems is when I try to size red (or choke) appropriately
> (or so I think) for GigE bandwidths and queue depths, it would fail
> to calculate the ewma value.
Do you have one example of such failure ?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH V2] sch_red: fix red_calc_qavg_from_idle_time
2011-11-30 22:10 ` [PATCH V2] " Eric Dumazet
2011-11-30 22:35 ` Dave Taht
@ 2011-12-01 4:29 ` David Miller
1 sibling, 0 replies; 9+ messages in thread
From: David Miller @ 2011-12-01 4:29 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, shemminger, dave.taht
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 30 Nov 2011 23:10:53 +0100
> Since commit a4a710c4a7490587 (pkt_sched: Change PSCHED_SHIFT from 10 to
> 6) it seems RED/GRED are broken.
>
> red_calc_qavg_from_idle_time() computes a delay in us units, but this
> delay is now 16 times bigger than real delay, so the final qavg result
> smaller than expected.
>
> Use standard kernel time services since there is no need to obfuscate
> them.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied and queued up for -stable, thanks Eric.
If someone is looking for a fun project, I would like to see
psched_t obliterated completely and replaced with ktime_t across
the board.
To be honest, psched_t was created exactly because at the time we
lacked something like ktime_t.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH iproute2] red: give a hint about burst value
2011-11-30 22:44 ` Eric Dumazet
@ 2011-12-01 11:04 ` Eric Dumazet
2011-12-01 14:25 ` [PATCH iproute2] red: make burst optional Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2011-12-01 11:04 UTC (permalink / raw)
To: Dave Taht; +Cc: David Miller, netdev, Stephen Hemminger
Le mercredi 30 novembre 2011 à 23:44 +0100, Eric Dumazet a écrit :
> Le mercredi 30 novembre 2011 à 23:35 +0100, Dave Taht a écrit :
>
> > One of my other problems is when I try to size red (or choke) appropriately
> > (or so I think) for GigE bandwidths and queue depths, it would fail
> > to calculate the ewma value.
>
> Do you have one example of such failure ?
>
Oh I understand your burst value is really too small.
Stephen, can we add this helper to iproute2 ?
[PATCH iproute2] red: try to give hint about burst value
Reported-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/tc/tc_red.c b/tc/tc_red.c
index 66658ca..a65435e 100644
--- a/tc/tc_red.c
+++ b/tc/tc_red.c
@@ -56,8 +56,11 @@ int tc_red_eval_ewma(unsigned qmin, unsigned burst, unsigned avpkt)
double W = 0.5;
double a = (double)burst + 1 - (double)qmin/avpkt;
- if (a < 1.0)
+ if (a < 1.0) {
+ fprintf(stderr, "tc_red_eval_ewma() burst %u is too small ?"
+ " Try burst %u\n", burst, 1 + qmin/avpkt);
return -1;
+ }
for (wlog=1; wlog<32; wlog++, W /= 2) {
if (a <= (1 - pow(1-W, burst))/W)
return wlog;
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH iproute2] red: make burst optional
2011-12-01 11:04 ` [PATCH iproute2] red: give a hint about burst value Eric Dumazet
@ 2011-12-01 14:25 ` Eric Dumazet
2011-12-01 17:24 ` Stephen Hemminger
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2011-12-01 14:25 UTC (permalink / raw)
To: Dave Taht, Stephen Hemminger; +Cc: David Miller, netdev
Documentation advises to set burst to (min+min+max)/(3*avpkt)
Let tc do this automatically if user doesnt provide burst himself.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
tc/q_choke.c | 2 +-
tc/q_gred.c | 10 +++++++---
tc/q_red.c | 11 +++++++----
3 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/tc/q_choke.c b/tc/q_choke.c
index 04a864f..7816f62 100644
--- a/tc/q_choke.c
+++ b/tc/q_choke.c
@@ -133,7 +133,7 @@ static int choke_parse_opt(struct qdisc_util *qu, int argc, char **argv,
return -1;
}
if (wlog >= 10)
- fprintf(stderr, "CHOKE: WARNING. Burst %d seems to be to large.\n", burst);
+ fprintf(stderr, "CHOKE: WARNING. Burst %d seems to be too large.\n", burst);
opt.Wlog = wlog;
wlog = tc_red_eval_P(opt.qth_min*avpkt, opt.qth_max*avpkt, probability);
diff --git a/tc/q_gred.c b/tc/q_gred.c
index 59651d3..5fa0cc7 100644
--- a/tc/q_gred.c
+++ b/tc/q_gred.c
@@ -215,19 +215,23 @@ static int gred_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct n
if (rate == 0)
get_rate(&rate, "10Mbit");
- if (!opt.qth_min || !opt.qth_max || !burst || !opt.limit || !avpkt ||
+ if (!opt.qth_min || !opt.qth_max || !opt.limit || !avpkt ||
(opt.DP<0)) {
- fprintf(stderr, "Required parameter (min, max, burst, limit, "
+ fprintf(stderr, "Required parameter (min, max, limit, "
"avpkt, DP) is missing\n");
return -1;
}
+ if (!burst) {
+ burst = (2 * opt.qth_min + opt.qth_max) / (3 * avpkt);
+ fprintf(stderr, "GRED: set burst to %u\n", burst);
+ }
if ((wlog = tc_red_eval_ewma(opt.qth_min, burst, avpkt)) < 0) {
fprintf(stderr, "GRED: failed to calculate EWMA constant.\n");
return -1;
}
if (wlog >= 10)
- fprintf(stderr, "GRED: WARNING. Burst %d seems to be to "
+ fprintf(stderr, "GRED: WARNING. Burst %d seems to be too "
"large.\n", burst);
opt.Wlog = wlog;
if ((wlog = tc_red_eval_P(opt.qth_min, opt.qth_max, probability)) < 0) {
diff --git a/tc/q_red.c b/tc/q_red.c
index 4b1b889..6e58d7a 100644
--- a/tc/q_red.c
+++ b/tc/q_red.c
@@ -104,17 +104,20 @@ static int red_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nl
if (rate == 0)
get_rate(&rate, "10Mbit");
- if (!opt.qth_min || !opt.qth_max || !burst || !opt.limit || !avpkt) {
- fprintf(stderr, "Required parameter (min, max, burst, limit, avpkt) is missing\n");
+ if (!opt.qth_min || !opt.qth_max || !opt.limit || !avpkt) {
+ fprintf(stderr, "RED: Required parameter (min, max, limit, avpkt) is missing\n");
return -1;
}
-
+ if (!burst) {
+ burst = (2 * opt.qth_min + opt.qth_max) / (3 * avpkt);
+ fprintf(stderr, "RED: set burst to %u\n", burst);
+ }
if ((wlog = tc_red_eval_ewma(opt.qth_min, burst, avpkt)) < 0) {
fprintf(stderr, "RED: failed to calculate EWMA constant.\n");
return -1;
}
if (wlog >= 10)
- fprintf(stderr, "RED: WARNING. Burst %d seems to be to large.\n", burst);
+ fprintf(stderr, "RED: WARNING. Burst %d seems to be too large.\n", burst);
opt.Wlog = wlog;
if ((wlog = tc_red_eval_P(opt.qth_min, opt.qth_max, probability)) < 0) {
fprintf(stderr, "RED: failed to calculate probability.\n");
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH iproute2] red: make burst optional
2011-12-01 14:25 ` [PATCH iproute2] red: make burst optional Eric Dumazet
@ 2011-12-01 17:24 ` Stephen Hemminger
0 siblings, 0 replies; 9+ messages in thread
From: Stephen Hemminger @ 2011-12-01 17:24 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Dave Taht, David Miller, netdev
On Thu, 01 Dec 2011 15:25:59 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Documentation advises to set burst to (min+min+max)/(3*avpkt)
>
> Let tc do this automatically if user doesnt provide burst himself.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Both applied.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2011-12-01 17:24 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-30 21:34 [PATCH] sch_red: fix red_calc_qavg_from_idle_time Eric Dumazet
2011-11-30 21:40 ` Eric Dumazet
2011-11-30 22:10 ` [PATCH V2] " Eric Dumazet
2011-11-30 22:35 ` Dave Taht
2011-11-30 22:44 ` Eric Dumazet
2011-12-01 11:04 ` [PATCH iproute2] red: give a hint about burst value Eric Dumazet
2011-12-01 14:25 ` [PATCH iproute2] red: make burst optional Eric Dumazet
2011-12-01 17:24 ` Stephen Hemminger
2011-12-01 4:29 ` [PATCH V2] sch_red: fix red_calc_qavg_from_idle_time David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox