From: Jesper Dangaard Brouer <jdb@comx.dk>
To: Patrick McHardy <kaber@trash.net>
Cc: Jesper Dangaard Brouer <hawk@diku.dk>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Stephen Hemminger <shemminger@linux-foundation.org>
Subject: Re: [PATCH 2/2]: [NET_SCHED]: Making rate table lookups more flexible.
Date: Wed, 05 Sep 2007 15:58:18 +0200 [thread overview]
Message-ID: <1189000698.28083.22.camel@localhost.localdomain> (raw)
In-Reply-To: <46DD86F9.8000902@trash.net>
[-- Attachment #1: Type: text/plain, Size: 3532 bytes --]
On Tue, 2007-09-04 at 18:25 +0200, Patrick McHardy wrote:
> Jesper Dangaard Brouer wrote:
> > On Sun, 2007-09-02 at 23:16 +0200, Patrick McHardy wrote:
> >
> >>Jesper Dangaard Brouer wrote:
> >>
> >>>On Sun, 2 Sep 2007, Patrick McHardy wrote:
> >>>
> >>>Lets focus on the general case, where the functionality actually is
> >>>needed right away.
> >>>
> >>>In the general case:
> >>>
> >>>- The rate table needs to be aligned (cell_align=-1).
> >>> (currently, we miscalculates up to 7 bytes on every lookup)
> >>
> >>We will always do that, thats a consequence of storing the
> >>transmission times for multiples of 8b.
> >
> >
> > The issue is that we use the lower boundary for calculating the transmit
> > cost. Thus, a 15 bytes packet only have a transmit cost of 8 bytes.
>
> I believe this is something that should be fixed anyway,
> its better to overestimate than underestimate to stay
> in control of the queue.
Well, I have attached a patch that uses the upper boundry instead.
The patch uses the cell_align feature.
The patch is very simple it self, but figure out what happens the rtab
array requires a little illustration:
Illustrating the rate table array:
Legend description
rtab[x] : Array index x of rtab[x]
xmit_sz : Transmit size contained in rtab[x] (normal transmit time)
maps[a-b] : Packet sizes from a to b, will map into rtab[x]
Current/old rate table mapping (cell_log:3):
rtab[0]:=xmit_sz:0 maps[0-7]
rtab[1]:=xmit_sz:8 maps[8-15]
rtab[2]:=xmit_sz:16 maps[16-23]
rtab[3]:=xmit_sz:24 maps[24-31]
rtab[4]:=xmit_sz:32 maps[32-39]
rtab[5]:=xmit_sz:40 maps[40-47]
rtab[6]:=xmit_sz:48 maps[48-55]
New rate table mapping, with kernel cell_align support.
rtab[0]:=xmit_sz:8 maps[0-8]
rtab[1]:=xmit_sz:16 maps[9-16]
rtab[2]:=xmit_sz:24 maps[17-24]
rtab[3]:=xmit_sz:32 maps[25-32]
rtab[4]:=xmit_sz:40 maps[33-40]
rtab[5]:=xmit_sz:48 maps[41-48]
rtab[6]:=xmit_sz:56 maps[49-56]
New TC util on a kernel WITHOUT support for cell_align
rtab[0]:=xmit_sz:8 maps[0-7]
rtab[1]:=xmit_sz:16 maps[8-15]
rtab[2]:=xmit_sz:24 maps[16-23]
rtab[3]:=xmit_sz:32 maps[24-31]
rtab[4]:=xmit_sz:40 maps[32-39]
rtab[5]:=xmit_sz:48 maps[40-47]
rtab[6]:=xmit_sz:56 maps[48-55]
Notice that without the kernel cell_align feature, we are only off by
one byte. That should be acceptable, when somebody uses a new TC util
on a old kernel.
> We could additionally make the
> rate tables more finegrained (optionally).
That is actually already possible with the approach used to handle
overflow of the rate table ("TSO" large packet support). By setting
cell_log=0, and letting the overflow code handle the rest, we get a very
fingrained lookup.
> >>>- The existing tc overhead calc can be made more accurate.
> >>> (by adding overhead before doing the lookup, instead of the
> >>> current solution where the rate table is modified with its
> >>> limited resolution)
> >>
> >>Please demonstrate this with patches (one for the overhead
> >>calculation, one for the cell_align thing), then we can
> >>continue this discussion.
> >
> >
> > I have attached a patch for the overhead calculation.
Attached is a patch that uses "the cell_align thing".
> Thanks, I probably won't get to looking into this until
> after the netfilter workshop next week.
Okay, but I'll see you at the workshop, so I might bug you there ;-)
--
Med venlig hilsen / Best regards
Jesper Brouer
ComX Networks A/S
Linux Network developer
Cand. Scient Datalog / MSc.
Author of http://adsl-optimizer.dk
[-- Attachment #2: upperbound_rate_table_aligned.patch --]
[-- Type: text/x-patch, Size: 2156 bytes --]
commit 9a21e8bd56a5f057fc9f605e061c22d264ec27ef
Author: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Wed Sep 5 15:24:51 2007 +0200
[IPROUTE2]: Change the rate table calc of transmit cost to use upper bound value.
Patrick McHardy, Cite: 'its better to overestimate than underestimate
to stay in control of the queue'.
Illustrating the rate table array:
Legend description
rtab[x] : Array index x of rtab[x]
xmit_sz : Transmit size contained in rtab[x] (normally transmit time)
maps[a-b] : Packet sizes from a to b, will map into rtab[x]
Current/old rate table mapping (cell_log:3):
rtab[0]:=xmit_sz:0 maps[0-7]
rtab[1]:=xmit_sz:8 maps[8-15]
rtab[2]:=xmit_sz:16 maps[16-23]
rtab[3]:=xmit_sz:24 maps[24-31]
rtab[4]:=xmit_sz:32 maps[32-39]
rtab[5]:=xmit_sz:40 maps[40-47]
rtab[6]:=xmit_sz:48 maps[48-55]
New rate table mapping, with kernel cell_align support.
rtab[0]:=xmit_sz:8 maps[0-8]
rtab[1]:=xmit_sz:16 maps[9-16]
rtab[2]:=xmit_sz:24 maps[17-24]
rtab[3]:=xmit_sz:32 maps[25-32]
rtab[4]:=xmit_sz:40 maps[33-40]
rtab[5]:=xmit_sz:48 maps[41-48]
rtab[6]:=xmit_sz:56 maps[49-56]
New TC util on a kernel WITHOUT support for cell_align
rtab[0]:=xmit_sz:8 maps[0-7]
rtab[1]:=xmit_sz:16 maps[8-15]
rtab[2]:=xmit_sz:24 maps[16-23]
rtab[3]:=xmit_sz:32 maps[24-31]
rtab[4]:=xmit_sz:40 maps[32-39]
rtab[5]:=xmit_sz:48 maps[40-47]
rtab[6]:=xmit_sz:56 maps[48-55]
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
diff --git a/tc/tc_core.c b/tc/tc_core.c
index c713a18..752b07c 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -84,11 +84,12 @@ int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mt
cell_log++;
}
for (i=0; i<256; i++) {
- unsigned sz = (i<<cell_log);
+ unsigned sz = ((i+1)<<cell_log);
if (sz < mpu)
sz = mpu;
rtab[i] = tc_calc_xmittime(bps, sz);
}
+ r->cell_align=-1; // Due to the sz calc
r->cell_log=cell_log;
return cell_log;
}
[-- Attachment #3: cleanup_tc_calc_rtable_git.patch --]
[-- Type: text/x-patch, Size: 6028 bytes --]
commit 29044ac37e30d9662ad1bb83290a007c492ad7b2
Author: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Wed Sep 5 10:47:47 2007 +0200
[IPROUTE2]: Cleanup: tc_calc_rtable().
Change tc_calc_rtable() to take a tc_ratespec struct as an
argument. (cell_log still needs to be passed on as a parameter,
because -1 indicate that the cell_log needs to be computed by the
function.).
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
diff --git a/tc/m_police.c b/tc/m_police.c
index 5d2528b..acdfd22 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -263,22 +263,20 @@ int act_parse_police(struct action_util *a,int *argc_p, char ***argv_p, int tca_
}
if (p.rate.rate) {
- if ((Rcell_log = tc_calc_rtable(p.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
+ p.rate.mpu = mpu;
+ if (tc_calc_rtable(&p.rate, rtab, Rcell_log, mtu) < 0) {
fprintf(stderr, "TBF: failed to calculate rate table.\n");
return -1;
}
p.burst = tc_calc_xmittime(p.rate.rate, buffer);
- p.rate.cell_log = Rcell_log;
- p.rate.mpu = mpu;
}
p.mtu = mtu;
if (p.peakrate.rate) {
- if ((Pcell_log = tc_calc_rtable(p.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
+ p.peakrate.mpu = mpu;
+ if (tc_calc_rtable(&p.peakrate, ptab, Pcell_log, mtu) < 0) {
fprintf(stderr, "POLICE: failed to calculate peak rate table.\n");
return -1;
}
- p.peakrate.cell_log = Pcell_log;
- p.peakrate.mpu = mpu;
}
tail = NLMSG_TAIL(n);
diff --git a/tc/q_cbq.c b/tc/q_cbq.c
index f2b4ce8..df98312 100644
--- a/tc/q_cbq.c
+++ b/tc/q_cbq.c
@@ -137,12 +137,11 @@ static int cbq_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nl
if (allot < (avpkt*3)/2)
allot = (avpkt*3)/2;
- if ((cell_log = tc_calc_rtable(r.rate, rtab, cell_log, allot, mpu)) < 0) {
+ r.mpu = mpu;
+ if (tc_calc_rtable(&r, rtab, cell_log, allot) < 0) {
fprintf(stderr, "CBQ: failed to calculate rate table.\n");
return -1;
}
- r.cell_log = cell_log;
- r.mpu = mpu;
if (ewma_log < 0)
ewma_log = TC_CBQ_DEF_EWMA;
@@ -336,12 +335,11 @@ static int cbq_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
unsigned pktsize = wrr.allot;
if (wrr.allot < (lss.avpkt*3)/2)
wrr.allot = (lss.avpkt*3)/2;
- if ((cell_log = tc_calc_rtable(r.rate, rtab, cell_log, pktsize, mpu)) < 0) {
+ r.mpu = mpu;
+ if (tc_calc_rtable(&r, rtab, cell_log, pktsize) < 0) {
fprintf(stderr, "CBQ: failed to calculate rate table.\n");
return -1;
}
- r.cell_log = cell_log;
- r.mpu = mpu;
}
if (ewma_log < 0)
ewma_log = TC_CBQ_DEF_EWMA;
diff --git a/tc/q_htb.c b/tc/q_htb.c
index b579ebe..cca77fa 100644
--- a/tc/q_htb.c
+++ b/tc/q_htb.c
@@ -212,19 +212,17 @@ static int htb_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
opt.ceil.mpu = mpu;
opt.rate.mpu = mpu;
- if ((cell_log = tc_calc_rtable(opt.rate.rate, rtab, cell_log, mtu, mpu)) < 0) {
+ if (tc_calc_rtable(&opt.rate, rtab, cell_log, mtu) < 0) {
fprintf(stderr, "htb: failed to calculate rate table.\n");
return -1;
}
opt.buffer = tc_calc_xmittime(opt.rate.rate, buffer);
- opt.rate.cell_log = cell_log;
- if ((ccell_log = tc_calc_rtable(opt.ceil.rate, ctab, cell_log, mtu, mpu)) < 0) {
+ if (tc_calc_rtable(&opt.ceil, ctab, ccell_log, mtu) < 0) {
fprintf(stderr, "htb: failed to calculate ceil rate table.\n");
return -1;
}
opt.cbuffer = tc_calc_xmittime(opt.ceil.rate, cbuffer);
- opt.ceil.cell_log = ccell_log;
tail = NLMSG_TAIL(n);
addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
diff --git a/tc/q_tbf.c b/tc/q_tbf.c
index 1fc05f4..c7b4f0f 100644
--- a/tc/q_tbf.c
+++ b/tc/q_tbf.c
@@ -170,21 +170,20 @@ static int tbf_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nl
opt.limit = lim;
}
- if ((Rcell_log = tc_calc_rtable(opt.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
+ opt.rate.mpu = mpu;
+ if (tc_calc_rtable(&opt.rate, rtab, Rcell_log, mtu) < 0) {
fprintf(stderr, "TBF: failed to calculate rate table.\n");
return -1;
}
opt.buffer = tc_calc_xmittime(opt.rate.rate, buffer);
- opt.rate.cell_log = Rcell_log;
- opt.rate.mpu = mpu;
+
if (opt.peakrate.rate) {
- if ((Pcell_log = tc_calc_rtable(opt.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
+ opt.peakrate.mpu = mpu;
+ if (tc_calc_rtable(&opt.peakrate, ptab, Pcell_log, mtu) < 0) {
fprintf(stderr, "TBF: failed to calculate peak rate table.\n");
return -1;
}
opt.mtu = tc_calc_xmittime(opt.peakrate.rate, mtu);
- opt.peakrate.cell_log = Pcell_log;
- opt.peakrate.mpu = mpu;
}
tail = NLMSG_TAIL(n);
diff --git a/tc/tc_core.c b/tc/tc_core.c
index 1ab0ba0..c713a18 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -69,10 +69,11 @@ unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
rtab[pkt_len>>cell_log] = pkt_xmit_time
*/
-int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
- unsigned mpu)
+int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mtu)
{
int i;
+ unsigned bps = r->rate;
+ unsigned mpu = r->mpu;
if (mtu == 0)
mtu = 2047;
@@ -88,6 +89,7 @@ int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
sz = mpu;
rtab[i] = tc_calc_xmittime(bps, sz);
}
+ r->cell_log=cell_log;
return cell_log;
}
diff --git a/tc/tc_core.h b/tc/tc_core.h
index a139da6..e98a7b4 100644
--- a/tc/tc_core.h
+++ b/tc/tc_core.h
@@ -13,7 +13,7 @@ long tc_core_time2ktime(long time);
long tc_core_ktime2time(long ktime);
unsigned tc_calc_xmittime(unsigned rate, unsigned size);
unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks);
-int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu, unsigned mpu);
+int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mtu);
int tc_setup_estimator(unsigned A, unsigned time_const, struct tc_estimator *est);
prev parent reply other threads:[~2007-09-05 13:58 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-08-31 12:22 [PATCH 2/2]: [NET_SCHED]: Making rate table lookups more flexible Jesper Dangaard Brouer
2007-09-01 7:10 ` Patrick McHardy
2007-09-01 21:56 ` Jesper Dangaard Brouer
2007-09-02 14:35 ` Patrick McHardy
2007-09-02 18:56 ` Jesper Dangaard Brouer
2007-09-02 21:16 ` Patrick McHardy
2007-09-03 14:19 ` Jesper Dangaard Brouer
2007-09-04 16:25 ` Patrick McHardy
2007-09-05 13:58 ` Jesper Dangaard Brouer [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1189000698.28083.22.camel@localhost.localdomain \
--to=jdb@comx.dk \
--cc=hawk@diku.dk \
--cc=kaber@trash.net \
--cc=netdev@vger.kernel.org \
--cc=shemminger@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.