From: Jesper Dangaard Brouer <jdb@comx.dk>
To: Patrick McHardy <kaber@trash.net>
Cc: Jesper Dangaard Brouer <hawk@diku.dk>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Stephen Hemminger <shemminger@linux-foundation.org>
Subject: Re: [PATCH 2/2]: [NET_SCHED]: Making rate table lookups more flexible.
Date: Wed, 05 Sep 2007 15:58:18 +0200
Message-ID: <1189000698.28083.22.camel@localhost.localdomain>
In-Reply-To: <46DD86F9.8000902@trash.net>
[-- Attachment #1: Type: text/plain, Size: 3532 bytes --]
On Tue, 2007-09-04 at 18:25 +0200, Patrick McHardy wrote:
> Jesper Dangaard Brouer wrote:
> > On Sun, 2007-09-02 at 23:16 +0200, Patrick McHardy wrote:
> >
> >>Jesper Dangaard Brouer wrote:
> >>
> >>>On Sun, 2 Sep 2007, Patrick McHardy wrote:
> >>>
> >>>Lets focus on the general case, where the functionality actually is
> >>>needed right away.
> >>>
> >>>In the general case:
> >>>
> >>>- The rate table needs to be aligned (cell_align=-1).
> >>> (currently, we miscalculates up to 7 bytes on every lookup)
> >>
> >>We will always do that, thats a consequence of storing the
> >>transmission times for multiples of 8b.
> >
> >
> > The issue is that we use the lower boundary for calculating the transmit
> > cost. Thus, a 15 bytes packet only have a transmit cost of 8 bytes.
>
> I believe this is something that should be fixed anyway,
> its better to overestimate than underestimate to stay
> in control of the queue.
Well, I have attached a patch that uses the upper boundary instead.
The patch uses the cell_align feature.

The patch itself is very simple, but figuring out what happens to the
rtab array requires a little illustration:

Illustrating the rate table array:

Legend description
rtab[x] : Array index x of rtab[x]
xmit_sz : Transmit size contained in rtab[x] (normally a transmit time)
maps[a-b] : Packet sizes from a to b, will map into rtab[x]

Current/old rate table mapping (cell_log:3):
rtab[0]:=xmit_sz:0 maps[0-7]
rtab[1]:=xmit_sz:8 maps[8-15]
rtab[2]:=xmit_sz:16 maps[16-23]
rtab[3]:=xmit_sz:24 maps[24-31]
rtab[4]:=xmit_sz:32 maps[32-39]
rtab[5]:=xmit_sz:40 maps[40-47]
rtab[6]:=xmit_sz:48 maps[48-55]

New rate table mapping, with kernel cell_align support:
rtab[0]:=xmit_sz:8 maps[0-8]
rtab[1]:=xmit_sz:16 maps[9-16]
rtab[2]:=xmit_sz:24 maps[17-24]
rtab[3]:=xmit_sz:32 maps[25-32]
rtab[4]:=xmit_sz:40 maps[33-40]
rtab[5]:=xmit_sz:48 maps[41-48]
rtab[6]:=xmit_sz:56 maps[49-56]

New TC util on a kernel WITHOUT support for cell_align:
rtab[0]:=xmit_sz:8 maps[0-7]
rtab[1]:=xmit_sz:16 maps[8-15]
rtab[2]:=xmit_sz:24 maps[16-23]
rtab[3]:=xmit_sz:32 maps[24-31]
rtab[4]:=xmit_sz:40 maps[32-39]
rtab[5]:=xmit_sz:48 maps[40-47]
rtab[6]:=xmit_sz:56 maps[48-55]

Notice that without the kernel cell_align feature, the mapping is only
off by one byte. That should be acceptable when somebody uses a new TC
util on an old kernel.
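
To make the three mappings above concrete, here is a minimal C sketch.
It is not the patch itself: it stores packet sizes in the table (as the
illustration does) rather than transmit times, and it assumes the kernel
lookup is simply (len + cell_align) >> cell_log with negative values
clamped to slot 0. The names rtab_fill_lower, rtab_fill_upper,
rtab_lookup and rtab_show_mappings are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

#define RTAB_SLOTS 256

/* Old userspace fill: slot i holds the LOWER bound of its size range. */
void rtab_fill_lower(unsigned rtab[RTAB_SLOTS], int cell_log)
{
	assert(cell_log >= 0 && cell_log < 24);
	for (int i = 0; i < RTAB_SLOTS; i++)
		rtab[i] = (unsigned)i << cell_log;
}

/* New userspace fill: slot i holds the UPPER bound of its size range. */
void rtab_fill_upper(unsigned rtab[RTAB_SLOTS], int cell_log)
{
	assert(cell_log >= 0 && cell_log < 24);
	for (int i = 0; i < RTAB_SLOTS; i++)
		rtab[i] = (unsigned)(i + 1) << cell_log;
}

/*
 * Assumed kernel-side lookup (an illustration, not the real kernel code):
 * add cell_align to the packet length (0 on an old kernel, -1 with the
 * new feature), clamp at zero, then shift by cell_log to get the slot.
 */
unsigned rtab_lookup(const unsigned rtab[RTAB_SLOTS], unsigned len,
		     int cell_log, int cell_align)
{
	int slot = (int)len + cell_align;

	if (slot < 0)
		slot = 0;
	slot >>= cell_log;
	if (slot > RTAB_SLOTS - 1)
		slot = RTAB_SLOTS - 1;	/* large-packet overflow is ignored here */
	return rtab[slot];
}

/* Print the charged size for a few packet lengths under all three setups. */
void rtab_show_mappings(void)
{
	unsigned old_tab[RTAB_SLOTS], new_tab[RTAB_SLOTS];
	const unsigned lens[] = { 7, 8, 9, 15, 16, 17 };

	rtab_fill_lower(old_tab, 3);
	rtab_fill_upper(new_tab, 3);
	for (size_t i = 0; i < sizeof(lens) / sizeof(lens[0]); i++)
		printf("len %2u: old=%2u new+align=%2u new,no-align=%2u\n",
		       lens[i],
		       rtab_lookup(old_tab, lens[i], 3, 0),
		       rtab_lookup(new_tab, lens[i], 3, -1),
		       rtab_lookup(new_tab, lens[i], 3, 0));
}
```

Calling rtab_show_mappings() from a main() shows, for instance, that a
15 byte packet is charged 8 bytes by the old table and 16 bytes by the
new one, while an 8 byte packet is charged 16 instead of 8 when the
kernel ignores cell_align.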
> We could additionally make the
> rate tables more finegrained (optionally).
That is actually already possible with the approach used to handle
overflow of the rate table ("TSO" large packet support). By setting
cell_log=0, and letting the overflow code handle the rest, we get a very
fine-grained lookup.
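
A rough sketch of why cell_log=0 plus overflow handling gives byte
granularity. The overflow rule used here is an assumption made for this
example (charge rtab[255] once per full wrap of 256 slots, plus the
remainder slot); the real kernel code may differ. As in the
illustration above, the table stores sizes instead of transmit times,
and the function names are hypothetical.

```c
#include <assert.h>

#define RTAB_SLOTS 256

/* Upper-bound fill, storing sizes: slot i covers sizes up to (i+1)<<cell_log. */
void rtab_fill_upper(unsigned rtab[RTAB_SLOTS], int cell_log)
{
	assert(cell_log >= 0 && cell_log < 24);
	for (int i = 0; i < RTAB_SLOTS; i++)
		rtab[i] = (unsigned)(i + 1) << cell_log;
}

/*
 * Lookup with an ASSUMED overflow rule: when the slot falls past the end
 * of the table, charge the last entry once per full wrap of 256 slots
 * and add the entry for the remaining slot.
 */
unsigned rtab_lookup_overflow(const unsigned rtab[RTAB_SLOTS], unsigned len,
			      int cell_log, int cell_align)
{
	int slot = (int)len + cell_align;

	if (slot < 0)
		slot = 0;
	slot >>= cell_log;
	if (slot > RTAB_SLOTS - 1)
		return rtab[RTAB_SLOTS - 1] * (unsigned)(slot >> 8)
		       + rtab[slot & 0xFF];
	return rtab[slot];
}
```

Under these assumptions, with cell_log=0 and cell_align=-1 a 1500 byte
packet is charged exactly 1500 bytes (5 * 256 + 220), whereas with
cell_log=3 it is rounded up to 1504.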
> >>>- The existing tc overhead calc can be made more accurate.
> >>> (by adding overhead before doing the lookup, instead of the
> >>> current solution where the rate table is modified with its
> >>> limited resolution)
> >>
> >>Please demonstrate this with patches (one for the overhead
> >>calculation, one for the cell_align thing), then we can
> >>continue this discussion.
> >
> >
> > I have attached a patch for the overhead calculation.
Attached is a patch that uses "the cell_align thing".
> Thanks, I probably won't get to looking into this until
> after the netfilter workshop next week.
Okay, but I'll see you at the workshop, so I might bug you there ;-)
--
Med venlig hilsen / Best regards
Jesper Brouer
ComX Networks A/S
Linux Network developer
Cand. Scient Datalog / MSc.
Author of http://adsl-optimizer.dk
[-- Attachment #2: upperbound_rate_table_aligned.patch --]
[-- Type: text/x-patch, Size: 2156 bytes --]
commit 9a21e8bd56a5f057fc9f605e061c22d264ec27ef
Author: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Wed Sep 5 15:24:51 2007 +0200
[IPROUTE2]: Change the rate table calc of transmit cost to use upper bound value.
Patrick McHardy, Cite: 'its better to overestimate than underestimate
to stay in control of the queue'.
Illustrating the rate table array:
Legend description
rtab[x] : Array index x of rtab[x]
xmit_sz : Transmit size contained in rtab[x] (normally transmit time)
maps[a-b] : Packet sizes from a to b, will map into rtab[x]
Current/old rate table mapping (cell_log:3):
rtab[0]:=xmit_sz:0 maps[0-7]
rtab[1]:=xmit_sz:8 maps[8-15]
rtab[2]:=xmit_sz:16 maps[16-23]
rtab[3]:=xmit_sz:24 maps[24-31]
rtab[4]:=xmit_sz:32 maps[32-39]
rtab[5]:=xmit_sz:40 maps[40-47]
rtab[6]:=xmit_sz:48 maps[48-55]
New rate table mapping, with kernel cell_align support.
rtab[0]:=xmit_sz:8 maps[0-8]
rtab[1]:=xmit_sz:16 maps[9-16]
rtab[2]:=xmit_sz:24 maps[17-24]
rtab[3]:=xmit_sz:32 maps[25-32]
rtab[4]:=xmit_sz:40 maps[33-40]
rtab[5]:=xmit_sz:48 maps[41-48]
rtab[6]:=xmit_sz:56 maps[49-56]
New TC util on a kernel WITHOUT support for cell_align
rtab[0]:=xmit_sz:8 maps[0-7]
rtab[1]:=xmit_sz:16 maps[8-15]
rtab[2]:=xmit_sz:24 maps[16-23]
rtab[3]:=xmit_sz:32 maps[24-31]
rtab[4]:=xmit_sz:40 maps[32-39]
rtab[5]:=xmit_sz:48 maps[40-47]
rtab[6]:=xmit_sz:56 maps[48-55]
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
diff --git a/tc/tc_core.c b/tc/tc_core.c
index c713a18..752b07c 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -84,11 +84,12 @@ int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mt
cell_log++;
}
for (i=0; i<256; i++) {
- unsigned sz = (i<<cell_log);
+ unsigned sz = ((i+1)<<cell_log);
if (sz < mpu)
sz = mpu;
rtab[i] = tc_calc_xmittime(bps, sz);
}
+ r->cell_align=-1; // Due to the sz calc
r->cell_log=cell_log;
return cell_log;
}
[-- Attachment #3: cleanup_tc_calc_rtable_git.patch --]
[-- Type: text/x-patch, Size: 6028 bytes --]
commit 29044ac37e30d9662ad1bb83290a007c492ad7b2
Author: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Wed Sep 5 10:47:47 2007 +0200
[IPROUTE2]: Cleanup: tc_calc_rtable().
Change tc_calc_rtable() to take a tc_ratespec struct as an
argument. (cell_log still needs to be passed on as a parameter,
because -1 indicates that the cell_log needs to be computed by the
function.)
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
diff --git a/tc/m_police.c b/tc/m_police.c
index 5d2528b..acdfd22 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -263,22 +263,20 @@ int act_parse_police(struct action_util *a,int *argc_p, char ***argv_p, int tca_
}
if (p.rate.rate) {
- if ((Rcell_log = tc_calc_rtable(p.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
+ p.rate.mpu = mpu;
+ if (tc_calc_rtable(&p.rate, rtab, Rcell_log, mtu) < 0) {
fprintf(stderr, "TBF: failed to calculate rate table.\n");
return -1;
}
p.burst = tc_calc_xmittime(p.rate.rate, buffer);
- p.rate.cell_log = Rcell_log;
- p.rate.mpu = mpu;
}
p.mtu = mtu;
if (p.peakrate.rate) {
- if ((Pcell_log = tc_calc_rtable(p.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
+ p.peakrate.mpu = mpu;
+ if (tc_calc_rtable(&p.peakrate, ptab, Pcell_log, mtu) < 0) {
fprintf(stderr, "POLICE: failed to calculate peak rate table.\n");
return -1;
}
- p.peakrate.cell_log = Pcell_log;
- p.peakrate.mpu = mpu;
}
tail = NLMSG_TAIL(n);
diff --git a/tc/q_cbq.c b/tc/q_cbq.c
index f2b4ce8..df98312 100644
--- a/tc/q_cbq.c
+++ b/tc/q_cbq.c
@@ -137,12 +137,11 @@ static int cbq_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nl
if (allot < (avpkt*3)/2)
allot = (avpkt*3)/2;
- if ((cell_log = tc_calc_rtable(r.rate, rtab, cell_log, allot, mpu)) < 0) {
+ r.mpu = mpu;
+ if (tc_calc_rtable(&r, rtab, cell_log, allot) < 0) {
fprintf(stderr, "CBQ: failed to calculate rate table.\n");
return -1;
}
- r.cell_log = cell_log;
- r.mpu = mpu;
if (ewma_log < 0)
ewma_log = TC_CBQ_DEF_EWMA;
@@ -336,12 +335,11 @@ static int cbq_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
unsigned pktsize = wrr.allot;
if (wrr.allot < (lss.avpkt*3)/2)
wrr.allot = (lss.avpkt*3)/2;
- if ((cell_log = tc_calc_rtable(r.rate, rtab, cell_log, pktsize, mpu)) < 0) {
+ r.mpu = mpu;
+ if (tc_calc_rtable(&r, rtab, cell_log, pktsize) < 0) {
fprintf(stderr, "CBQ: failed to calculate rate table.\n");
return -1;
}
- r.cell_log = cell_log;
- r.mpu = mpu;
}
if (ewma_log < 0)
ewma_log = TC_CBQ_DEF_EWMA;
diff --git a/tc/q_htb.c b/tc/q_htb.c
index b579ebe..cca77fa 100644
--- a/tc/q_htb.c
+++ b/tc/q_htb.c
@@ -212,19 +212,17 @@ static int htb_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
opt.ceil.mpu = mpu;
opt.rate.mpu = mpu;
- if ((cell_log = tc_calc_rtable(opt.rate.rate, rtab, cell_log, mtu, mpu)) < 0) {
+ if (tc_calc_rtable(&opt.rate, rtab, cell_log, mtu) < 0) {
fprintf(stderr, "htb: failed to calculate rate table.\n");
return -1;
}
opt.buffer = tc_calc_xmittime(opt.rate.rate, buffer);
- opt.rate.cell_log = cell_log;
- if ((ccell_log = tc_calc_rtable(opt.ceil.rate, ctab, cell_log, mtu, mpu)) < 0) {
+ if (tc_calc_rtable(&opt.ceil, ctab, ccell_log, mtu) < 0) {
fprintf(stderr, "htb: failed to calculate ceil rate table.\n");
return -1;
}
opt.cbuffer = tc_calc_xmittime(opt.ceil.rate, cbuffer);
- opt.ceil.cell_log = ccell_log;
tail = NLMSG_TAIL(n);
addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
diff --git a/tc/q_tbf.c b/tc/q_tbf.c
index 1fc05f4..c7b4f0f 100644
--- a/tc/q_tbf.c
+++ b/tc/q_tbf.c
@@ -170,21 +170,20 @@ static int tbf_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nl
opt.limit = lim;
}
- if ((Rcell_log = tc_calc_rtable(opt.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
+ opt.rate.mpu = mpu;
+ if (tc_calc_rtable(&opt.rate, rtab, Rcell_log, mtu) < 0) {
fprintf(stderr, "TBF: failed to calculate rate table.\n");
return -1;
}
opt.buffer = tc_calc_xmittime(opt.rate.rate, buffer);
- opt.rate.cell_log = Rcell_log;
- opt.rate.mpu = mpu;
+
if (opt.peakrate.rate) {
- if ((Pcell_log = tc_calc_rtable(opt.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
+ opt.peakrate.mpu = mpu;
+ if (tc_calc_rtable(&opt.peakrate, ptab, Pcell_log, mtu) < 0) {
fprintf(stderr, "TBF: failed to calculate peak rate table.\n");
return -1;
}
opt.mtu = tc_calc_xmittime(opt.peakrate.rate, mtu);
- opt.peakrate.cell_log = Pcell_log;
- opt.peakrate.mpu = mpu;
}
tail = NLMSG_TAIL(n);
diff --git a/tc/tc_core.c b/tc/tc_core.c
index 1ab0ba0..c713a18 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -69,10 +69,11 @@ unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
rtab[pkt_len>>cell_log] = pkt_xmit_time
*/
-int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
- unsigned mpu)
+int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mtu)
{
int i;
+ unsigned bps = r->rate;
+ unsigned mpu = r->mpu;
if (mtu == 0)
mtu = 2047;
@@ -88,6 +89,7 @@ int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
sz = mpu;
rtab[i] = tc_calc_xmittime(bps, sz);
}
+ r->cell_log=cell_log;
return cell_log;
}
diff --git a/tc/tc_core.h b/tc/tc_core.h
index a139da6..e98a7b4 100644
--- a/tc/tc_core.h
+++ b/tc/tc_core.h
@@ -13,7 +13,7 @@ long tc_core_time2ktime(long time);
long tc_core_ktime2time(long ktime);
unsigned tc_calc_xmittime(unsigned rate, unsigned size);
unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks);
-int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu, unsigned mpu);
+int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mtu);
int tc_setup_estimator(unsigned A, unsigned time_const, struct tc_estimator *est);