From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jarek Poplawski
Subject: Re: [PATCH] sch_htb: ix the deficit overflows
Date: Mon, 30 Nov 2009 11:10:20 +0000
Message-ID: <20091130111020.GA7114@ff.dom.local>
References: <4B0F8A5D.1040806@gmail.com> <20091128000401.GA3713@ami.dom.local> <412e6f7f0911292026w704a70b8yc3af2c2473e05d34@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Jamal Hadi Salim, "David S. Miller", netdev@vger.kernel.org, Martin Devera
To: Changli Gao
Return-path:
Received: from mail-bw0-f227.google.com ([209.85.218.227]:32970 "EHLO mail-bw0-f227.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752926AbZK3LKU (ORCPT ); Mon, 30 Nov 2009 06:10:20 -0500
Received: by bwz27 with SMTP id 27so2383819bwz.21 for ; Mon, 30 Nov 2009 03:10:25 -0800 (PST)
Content-Disposition: inline
In-Reply-To: <412e6f7f0911292026w704a70b8yc3af2c2473e05d34@mail.gmail.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Mon, Nov 30, 2009 at 12:26:33PM +0800, Changli Gao wrote:
> On Sat, Nov 28, 2009 at 8:04 AM, Jarek Poplawski wrote:
> > Changli Gao wrote, On 11/27/2009 09:14 AM:
> >
> >
> > This case of the quantum smaller than the packet size should be treated
> > as a broken config, so I don't think it's worth doing such a deep change
> > with additional delays and cpu cycles for all to fix it. A warning or
> > lower limit should be enough (if necessary at all).
> >
>
> I don't think this change is deep. HTB has its own lower limit for
> quantum 1000, but the MTU varies, and may be larger than that.

Users can control this with "r2q" and "quantum", and there is a hint
on quantum size in the user's guide.

> And
> if we use IMQ to shape traffic, the skb will be defragmented by
> conntrack, and its size will be larger than MTU.

IMQ is a very nice thing, but it's considered broken as well, so it
can't be the reason for changing HTB.

>
> The previous patch indeed introduces some additional CPU cycles.
> Review the new patch below please:

And this patch is very similar, except for ->peek()/dequeue().
Additional lookups are done instead of dequeuing from the first class
found, which might take quite long in some cases.

>
> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
> index 2e38d1a..d55382b 100644
> --- a/net/sched/sch_htb.c
> +++ b/net/sched/sch_htb.c
> @@ -815,6 +815,17 @@ next:
>                          goto next;
>                  }
>
> +                if (unlikely(cl->un.leaf.deficit[level] < 0)) {
> +                        cl->un.leaf.deficit[level] += cl->quantum;
> +                        htb_next_rb_node((level ? cl->parent->un.inner.ptr :
> +                                          q->ptr[0]) + prio);
> +                        cl = htb_lookup_leaf(q->row[level] + prio, prio,
> +                                             q->ptr[level] + prio,
> +                                             q->last_ptr_id[level] + prio);
> +                        start = cl;
> +                        goto next;
> +                }
> +
>                  skb = cl->un.leaf.q->dequeue(cl->un.leaf.q);
>                  if (likely(skb != NULL))
>                          break;
>
> If you think it is acceptable, I'll resubmit it for inclusion.

It's not acceptable to me, mainly because the real change done by this
patch is different from what you describe: preventing an overflow might
be simple. You change the way DRR is implemented here, and even if it's
right, it should be stated explicitly and backed by test results.
Anyway, I think you should rather seek the author's acceptance, because
the way it's done doesn't look accidental and has been heavily tested,
btw. (I added Martin to CC.)

Regards,
Jarek P.

PS: Btw, this newer version of the patch is broken with spaces.