Re: [LARTC] tbf and sfq interaction???

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Patrick McHardy <kaber@trash.net>
To: lartc@vger.kernel.org
Subject: Re: [LARTC] tbf and sfq interaction???
Date: Wed, 26 Nov 2003 17:38:02 +0000	[thread overview]
Message-ID: <marc-lartc-106986923928035@msgid-missing> (raw)
In-Reply-To: <marc-lartc-106979257116407@msgid-missing>

[-- Attachment #1: Type: text/plain, Size: 2581 bytes --]

TBF and SFQ can't be used together. The changes to make TBF classful kept
the limit inside TBF for compatibility. This means TBF calculates how much
bytes are in the leaf qdisc and drops if the limit is exceeded. This
is stupid for two reasons:

1. The limit is checked in the leaf qdisc (default: bfifo) anyways, this is
enough for backward compatibility. Turning the inner qdisc into a pfifo gives
you two limits, N bytes from the TBF and M packets from the pfifo.

2. With SFQ it is not possible for TBF to know how much bytes are in the queue
because SFQ may return NET_XMIT_CN, meaning that this or another packet has been
dropped. The result is that either the limit is not kept (underestimate) or worse,
the qdisc stops accepting packets at some point (overestimate).

So to use TBF+SFQ you need the attached (untested) patch which removes limit checking
from TBF and relies on the inner qdisc to do the right thing.


For you main question: Yes it will work as you expect. SFQ provides
fairness for single flows and TBF limits the total output rate.

Best regards,
Patrick


Jon Zeeff wrote:

>
> How do sfq and tbf interact when packets need to be dropped?
>
>
> Example:
>
> two flows are entering the linux router, flow A is 1 mbit and flow B 
> is 3 mbits (total = 4).  We use tbf to limit outgoing bandwidth to 2 
> mbits (further downstream there is a bottleneck).
>
> If we just used tbf, it would be dropping 50% of all packets - ie, 
> flow A would see about 500K
> of it's traffic getting through and flow B would see about 1.5mbits 
> getting through.
>
> This isn't "fair" - flow A should not have any of its packets dropped 
> and flow B should be
> limited to 1 mbit.    So we add sfq.
>
> What's not clear to me is if this will accomplish what I want.  If the 
> tbf algorithm occurs before
> sfq, tbf will drop 50% of all packets.  After that, sfq will have no 
> effect.  If the sfq algorithm
> is performed first, then all 4 mbits of the resulting traffic is sent 
> to tbf, then tbf will drop
> packets from both flows.
>
> It would seem that there needs to be some interaction between the 
> algorithms - tbf needs to
> signal to sqf that more packets can be sent.  But the packet dropping 
> should occur in sqf (where fairness is understood) not in tbf.
>
> So question:  will it work as I want or is a new qdisk needed?
>
> note: I actually have hundreds of flows, so I don't want to classify 
> them manually.  I also don't
> want hard allocations of bandwidth.  If there is only flow B, 2 mbits 
> of it should be allowed to pass.
> Thanks.
>
>
>


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1823 bytes --]

===== net/sched/sch_tbf.c 1.11 vs edited =====
--- 1.11/net/sched/sch_tbf.c	Tue Nov 18 01:14:58 2003
+++ edited/net/sched/sch_tbf.c	Tue Nov 18 01:45:15 2003
@@ -136,7 +136,7 @@
 	struct tbf_sched_data *q = (struct tbf_sched_data *)sch->data;
 	int ret;
 
-	if (skb->len > q->max_size || sch->stats.backlog + skb->len > q->limit) {
+	if (skb->len > q->max_size) {
 		sch->stats.drops++;
 #ifdef CONFIG_NET_CLS_POLICE
 		if (sch->reshape_fail == NULL || sch->reshape_fail(skb, sch))
@@ -152,7 +152,6 @@
 	}	
 	
 	sch->q.qlen++;
-	sch->stats.backlog += skb->len;
 	sch->stats.bytes += skb->len;
 	sch->stats.packets++;
 	return 0;
@@ -163,10 +162,8 @@
 	struct tbf_sched_data *q = (struct tbf_sched_data *)sch->data;
 	int ret;
 	
-	if ((ret = q->qdisc->ops->requeue(skb, q->qdisc)) == 0) {
+	if ((ret = q->qdisc->ops->requeue(skb, q->qdisc)) == 0)
 		sch->q.qlen++; 
-		sch->stats.backlog += skb->len;
-	}
 	
 	return ret;
 }
@@ -178,7 +175,6 @@
 	
 	if ((len = q->qdisc->ops->drop(q->qdisc)) != 0) {
 		sch->q.qlen--;
-		sch->stats.backlog -= len;
 		sch->stats.drops++;
 	}
 	return len;
@@ -224,7 +220,6 @@
 			q->t_c = now;
 			q->tokens = toks;
 			q->ptokens = ptoks;
-			sch->stats.backlog -= len;
 			sch->q.qlen--;
 			sch->flags &= ~TCQ_F_THROTTLED;
 			return skb;
@@ -253,7 +248,6 @@
 		if (q->qdisc->ops->requeue(skb, q->qdisc) != NET_XMIT_SUCCESS) {
 			/* When requeue fails skb is dropped */ 
 			sch->q.qlen--;
-			sch->stats.backlog -= len;
 			sch->stats.drops++;
 		}	
 		
@@ -269,7 +263,6 @@
 
 	qdisc_reset(q->qdisc);
 	skb_queue_purge(&sch->q);
-	sch->stats.backlog = 0;
 	PSCHED_GET_TIME(q->t_c);
 	q->tokens = q->buffer;
 	q->ptokens = q->mtu;
@@ -456,7 +449,6 @@
 	*old = xchg(&q->qdisc, new);
 	qdisc_reset(*old);
 	sch->q.qlen = 0;
-	sch->stats.backlog = 0;
 	sch_tree_unlock(sch);
 	
 	return 0;

     prev parent reply	other threads:[~2003-11-26 17:38 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-11-25 20:05 [LARTC] tbf and sfq interaction??? Jon Zeeff
2003-11-26 17:38 ` Patrick McHardy [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=marc-lartc-106986923928035@msgid-missing \
    --to=kaber@trash.net \
    --cc=lartc@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.