From: Jarek Poplawski <jarkao2@gmail.com>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Antonio Almeida <vexwek@gmail.com>,
netdev@vger.kernel.org, kaber@trash.net, davem@davemloft.net,
devik@cdi.cz
Subject: Re: HTB accuracy for high speed
Date: Mon, 18 May 2009 19:23:49 +0200
Message-ID: <20090518172349.GA2755@ami.dom.local>
In-Reply-To: <4A118F98.60101@cosmosbay.com>
On Mon, May 18, 2009 at 06:40:56PM +0200, Eric Dumazet wrote:
> Jarek Poplawski wrote:
> > On Fri, May 15, 2009 at 03:49:31PM +0100, Antonio Almeida wrote:
> > ...
> >> I also note that, for HTB rate configurations over 500Mbit/s on leaf
> >> class, when I stop the traffic, in the output of "tc -s -d class ls
> >> dev eth1" command, I see that leaf's rate (in bits/s) is growing
> >> instead of decreasing (as expected since I've stopped the traffic).
> >> Rate in pps is ok and decreases until 0pps. Rate in bits/s increases
> >> above 1000Mbit and stays there for a few minutes. After two or three
> >> minutes it becomes 0bit. The same happens for its ancestors (also for
> >> root class). Here's tc output of my leaf class for this situation:
> >>
> >> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> >> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> >> 70901b/8 mpu 0b overhead 0b level 0
> >> Sent 120267768144 bytes 242475339 pkt (dropped 62272599, overlimits 0
> >> requeues 0)
> >> rate 1074Mbit 0pps backlog 0b 0p requeues 0
> >> lended: 242475339 borrowed: 0 giants: 0
> >> tokens: 8 ctokens: 8
> >
> > This looks like a regular bug. I guess it's an overflow in
> > gen_estimator(), but I'm not sure that is the whole story. Could you
> > try the patch below? (An offset warning when patching 2.6.25 is OK)
> >
> > Thanks,
> > Jarek P.
> > ---
> >
> > net/core/gen_estimator.c | 6 +++++-
> > 1 files changed, 5 insertions(+), 1 deletions(-)
> >
> > diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> > index 9cc9f95..87f0ced 100644
> > --- a/net/core/gen_estimator.c
> > +++ b/net/core/gen_estimator.c
> > @@ -127,7 +127,11 @@ static void est_timer(unsigned long arg)
> > npackets = e->bstats->packets;
> > rate = (nbytes - e->last_bytes)<<(7 - idx);
> > e->last_bytes = nbytes;
> > - e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> > + if (rate > e->avbps)
> > + e->avbps += (rate - e->avbps) >> e->ewma_log;
> > + else
> > + e->avbps -= (e->avbps - rate) >> e->ewma_log;
> > +
> > e->rate_est->bps = (e->avbps+0xF)>>5;
> >
> > rate = (npackets - e->last_packets)<<(12 - idx);
>
> With a typical estimator "1sec 8sec", ewma_log value is 3
>
> At gigabit speeds we are indeed very close to overflow, since
> we only have 27 bits available: 134217728 bytes per second,
> or 1073741824 bits per second.
>
> So the formula:
> e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> is going to overflow.
>
> One way to avoid the overflow would be to use a smaller estimator, like "500ms 4sec"
>
> Or use a 64-bit rate & avbps; this is needed for 10Gb speeds I suppose...
Yes, I considered this too, but because of the overhead I decided, for
now, to fix the code as designed (according to the comment). But you are
probably right that we should go further, so I'm OK with your patch.
Jarek P.
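To illustrate the overflow discussed above, here is a minimal user-space
sketch, not kernel code. It assumes that long is 32 bits wide (the thread
does not state the architecture) and that the compiler converts
out-of-range values to int32_t by wrapping and shifts negative values
arithmetically, as gcc and clang do. The helper names old_avbps_step and
scaled_from_bytes are hypothetical. With the 555000Kbit rate from the
report, the scaled average exceeds 2^31, is read as negative, and grows
once traffic stops:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: one step of the old update
 *   e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
 * with a 32-bit long emulated through int32_t. Assumes wrapping
 * conversion to int32_t and arithmetic right shift of negative values.
 */
static uint32_t old_avbps_step(uint32_t avbps, uint32_t rate, int ewma_log)
{
	assert(ewma_log >= 0 && ewma_log < 32);
	/* Unsigned subtraction wraps; the cast reinterprets it as signed. */
	int32_t diff = (int32_t)(rate - avbps);
	return avbps + (uint32_t)(diff >> ewma_log);
}

/*
 * avbps holds bytes/s scaled by 2^5, as implied by
 *   e->rate_est->bps = (e->avbps + 0xF) >> 5;
 * so only 27 bits remain for the byte rate itself.
 */
static uint32_t scaled_from_bytes(uint32_t bytes_per_sec)
{
	assert(bytes_per_sec < (1u << 27));
	return bytes_per_sec << 5;
}
```

Starting from scaled_from_bytes(69375000) (555000Kbit/s) and feeding
rate = 0 with ewma_log = 3, each call raises avbps instead of lowering
it. Starting from 62500000 bytes/s (500Mbit/s), which stays below 2^31
after scaling, the average decays as expected.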
>
> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> index 9cc9f95..150e2f5 100644
> --- a/net/core/gen_estimator.c
> +++ b/net/core/gen_estimator.c
> @@ -86,9 +86,9 @@ struct gen_estimator
> spinlock_t *stats_lock;
> int ewma_log;
> u64 last_bytes;
> + u64 avbps;
> u32 last_packets;
> u32 avpps;
> - u32 avbps;
> struct rcu_head e_rcu;
> struct rb_node node;
> };
> @@ -115,6 +115,7 @@ static void est_timer(unsigned long arg)
> rcu_read_lock();
> list_for_each_entry_rcu(e, &elist[idx].list, list) {
> u64 nbytes;
> + u64 brate;
> u32 npackets;
> u32 rate;
>
> @@ -125,9 +126,9 @@ static void est_timer(unsigned long arg)
>
> nbytes = e->bstats->bytes;
> npackets = e->bstats->packets;
> - rate = (nbytes - e->last_bytes)<<(7 - idx);
> + brate = (nbytes - e->last_bytes)<<(7 - idx);
> e->last_bytes = nbytes;
> - e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> + e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
> e->rate_est->bps = (e->avbps+0xF)>>5;
>
> rate = (npackets - e->last_packets)<<(12 - idx);
>
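As a rough sketch of how the two approaches in this thread could be
combined, the following user-space C keeps avbps in 64 bits and uses the
branch form so that no signed right shift is needed. struct bps_est and
bps_est_tick are hypothetical names, not kernel identifiers, and the
assertions on idx and ewma_log are assumptions added for the sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the byte-rate fields of struct gen_estimator. */
struct bps_est {
	uint64_t last_bytes;
	uint64_t avbps;		/* bytes/s scaled by 2^5 */
	int ewma_log;
};

/*
 * One estimator tick. nbytes is the running byte counter; idx selects
 * the shift (7 - idx) applied to the byte delta, as in the quoted
 * est_timer() hunk. Returns the value that would be stored in
 * rate_est->bps.
 */
static uint32_t bps_est_tick(struct bps_est *e, uint64_t nbytes, int idx)
{
	assert(idx >= 0 && idx <= 7);
	assert(e->ewma_log >= 0 && e->ewma_log < 64);

	uint64_t brate = (nbytes - e->last_bytes) << (7 - idx);

	e->last_bytes = nbytes;
	/* Only unsigned differences are shifted, so no sign handling is needed. */
	if (brate > e->avbps)
		e->avbps += (brate - e->avbps) >> e->ewma_log;
	else
		e->avbps -= (e->avbps - brate) >> e->ewma_log;

	return (uint32_t)((e->avbps + 0xF) >> 5);
}
```

With avbps preset to 2220000000 (555000Kbit/s scaled by 2^5), ewma_log
set to 3 and no new bytes, one tick lowers avbps to 1942500000, so the
reported rate falls instead of rising.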