From mboxrd@z Thu Jan  1 00:00:00 1970
From: Changli Gao <xiaosuo@gmail.com>
Subject: Re: [RFC] sched: QFQ - quick fair queue scheduler
Date: Sat, 8 Jan 2011 10:56:33 +0800
To: Stephen Hemminger
Cc: David Miller, Eric Dumazet, Fabio Checconi, netdev@vger.kernel.org, Luigi Rizzo

On Fri, Jan 7, 2011 at 11:56 AM, Stephen Hemminger wrote:
> This is an implementation of the Quick Fair Queue scheduler developed
> by Fabio Checconi and Luigi Rizzo. The same algorithm is already
> implemented in ipfw in FreeBSD. Fabio had an earlier version developed
> on Linux; I just did some cleanup and backporting of the FreeBSD version.
>
> For more information see the web page: http://info.iet.unipi.it/~luigi/qfq/
> and the Google tech talk: http://www.youtube.com/watch?v=r8vBmybeKlE
>
> This is for inspection at this point, barely tested.
>
> Signed-off-by: Stephen Hemminger
>
> ---
> Patch against net-next-2.6.
> Configuration may get patch fuzz because of testing CHOKe in the same tree.
>
>  include/linux/pkt_sched.h |   14
>  net/sched/Kconfig         |   11
>  net/sched/Makefile        |    1
>  net/sched/sch_qfq.c       | 1012 ++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 1038 insertions(+)
>
> --- a/include/linux/pkt_sched.h	2011-01-05 09:01:33.268032043 -0800
> +++ b/include/linux/pkt_sched.h	2011-01-05 23:17:20.637390255 -0800
> @@ -481,4 +481,18 @@ struct tc_drr_stats {
>  	__u32	deficit;
>  };
>
> +/* QFQ */
> +enum {
> +	TCA_QFQ_WEIGHT,
> +	TCA_QFQ_LMAX,
> +	__TCA_QFQ_MAX
> +};
> +
> +#define TCA_QFQ_MAX	(__TCA_QFQ_MAX - 1)
> +
> +struct tc_qfq_stats {
> +	__u32 weight;
> +	__u32 lmax;
> +};
> +
>  #endif
> --- a/net/sched/Kconfig	2011-01-05 09:01:33.280032462 -0800
> +++ b/net/sched/Kconfig	2011-01-05 23:17:20.637390255 -0800
> @@ -216,6 +216,17 @@ config NET_SCH_CHOKE
>  	  To compile this code as a module, choose M here: the
>  	  module will be called sch_choke.
>
> +config NET_SCH_QFQ
> +	tristate "Quick Fair Queueing Scheduler (QFQ)"
> +	help
> +	  Say Y here if you want to use the Quick Fair Queueing Scheduler (QFQ)
> +	  packet scheduling algorithm.
> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called sch_qfq.
> +
> +	  If unsure, say N.
> +
>  config NET_SCH_INGRESS
>  	tristate "Ingress Qdisc"
>  	depends on NET_CLS_ACT
> --- a/net/sched/Makefile	2011-01-05 09:01:33.284032598 -0800
> +++ b/net/sched/Makefile	2011-01-05 23:17:20.645389829 -0800
> @@ -32,6 +32,7 @@ obj-$(CONFIG_NET_SCH_MULTIQ)	+= sch_mult
>  obj-$(CONFIG_NET_SCH_ATM)	+= sch_atm.o
>  obj-$(CONFIG_NET_SCH_NETEM)	+= sch_netem.o
>  obj-$(CONFIG_NET_SCH_DRR)	+= sch_drr.o
> +obj-$(CONFIG_NET_SCH_QFQ)	+= sch_qfq.o
>  obj-$(CONFIG_NET_SCH_CHOKE)	+= sch_choke.o
>  obj-$(CONFIG_NET_CLS_U32)	+= cls_u32.o
>  obj-$(CONFIG_NET_CLS_ROUTE4)	+= cls_route.o
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ b/net/sched/sch_qfq.c	2011-01-06 12:51:28.498280327 -0800
> @@ -0,0 +1,1125 @@
> +/*
> + * net/sched/sch_qfq.c         Quick Fair Queueing Scheduler.
> + *
> + * Copyright (c) 2009 Fabio Checconi, Luigi Rizzo, and Paolo Valente.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * version 2 as published by the Free Software Foundation.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/bitops.h>
> +#include <linux/errno.h>
> +#include <linux/netdevice.h>
> +#include <linux/pkt_sched.h>
> +#include <net/sch_generic.h>
> +#include <net/pkt_sched.h>
> +#include <net/pkt_cls.h>
> +
> +/*  Quick Fair Queueing
> +    ===================
> +
> +    Sources:
> +    Fabio Checconi, Luigi Rizzo, and Paolo Valente: "QFQ: Efficient
> +    Packet Scheduling with Tight Bandwidth Distribution Guarantees",
> +    SIGCOMM 2010.
> +
> +    See also:
> +    http://retis.sssup.it/~fabio/linux/qfq/
> + */
> +
> +/*
> +
> +  Virtual time computations.
> +
> +  S, F and V are all computed in fixed point arithmetic with
> +  FRAC_BITS decimal bits.
> +
> +  QFQ_MAX_INDEX is the maximum index allowed for a group. We need
> +  one bit per index.
> +  QFQ_MAX_WSHIFT is the maximum power of two supported as a weight.
> +
> +  The layout of the bits is as below:
> +
> +                   [ MTU_SHIFT ][      FRAC_BITS    ]
> +                   [ MAX_INDEX    ][ MIN_SLOT_SHIFT ]
> +                                   ^.__grp->index = 0
> +                                   *.__grp->slot_shift
> +
> +  where MIN_SLOT_SHIFT is derived by difference from the others.
> +
> +  The max group index corresponds to Lmax/w_min, where
> +  Lmax=1<<MTU_SHIFT, w_min = 1.
> +  From this, and knowing how many groups (MAX_INDEX) we want,
> +  we can derive the shift corresponding to each group.
> +
> +  Because we often need to compute
> +	F = S + len/w_i  and  V = V + len/wsum
> +  instead of storing w_i we store the value
> +	inv_w = (1<<FRAC_BITS)/w_i
> +  so we can do F = S + len * inv_w * wsum.
> +  We use W_TOT in the formulas so we can easily move between
> +  static and adaptive weight sum.
> +
> +  The per-scheduler-instance data contain all the data structures
> +  for the scheduler: bitmaps and bucket lists.
> +
> + */
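As a quick sanity check of the fixed-point convention above, here is a
minimal userspace sketch of the two updates F = S + len*inv_w and
V += len*IWSUM. The constants are copied from the defines further down
(FRAC_BITS = 30, QFQ_MAX_WSUM = 2^17); this is an illustration only,
not part of the patch:

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		const uint64_t one_fp = 1ULL << 30;	/* ONE_FP */
		const uint64_t iwsum = one_fp >> 17;	/* ONE_FP/QFQ_MAX_WSUM */
		uint32_t inv_w = one_fp / 4;		/* class weight w_i = 4 */
		uint64_t S = 0, F, V = 0;
		unsigned int len = 1000;		/* packet length in bytes */

		F = S + (uint64_t)len * inv_w;	/* len/w_i in fixed point */
		V += (uint64_t)len * iwsum;	/* len/wsum in fixed point */
		/* F == 250 << 30, i.e. 1000/4 units of service */
		printf("F=%llu V=%llu\n", (unsigned long long)F,
		       (unsigned long long)V);
		return 0;
	}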
> +
> +/*
> + * Maximum number of consecutive slots occupied by backlogged classes
> + * inside a group.
> + */
> +#define QFQ_MAX_SLOTS	32
> +
> +/*
> + * Shifts used for class<->group mapping. We allow class weights that are
> + * in the range [1, 2^MAX_WSHIFT], and we try to map each class i to the
> + * group with the smallest index that can support the L_i / r_i configured
> + * for the class.
> + *
> + * grp->index is the index of the group; and grp->slot_shift
> + * is the shift for the corresponding (scaled) sigma_i.
> + */
> +#define QFQ_MAX_INDEX		19
> +#define QFQ_MAX_WSHIFT		16
> +
> +#define QFQ_MAX_WEIGHT		(1 << QFQ_MAX_WSHIFT)
> +#define QFQ_MAX_WSUM		(2*QFQ_MAX_WEIGHT)
> +
> +#define FRAC_BITS		30	/* fixed point arithmetic */
> +#define ONE_FP			(1UL << FRAC_BITS)
> +#define IWSUM			(ONE_FP/QFQ_MAX_WSUM)
> +
> +#define QFQ_MTU_SHIFT		11
> +#define QFQ_MIN_SLOT_SHIFT	(FRAC_BITS + QFQ_MTU_SHIFT - QFQ_MAX_INDEX)
> +
> +/*
> + * Possible group states.  These values are used as indexes for the bitmaps
> + * array of struct qfq_queue.
> + */
> +enum qfq_state { ER, IR, EB, IB, QFQ_MAX_STATE };
> +
> +struct qfq_group;
> +
> +struct qfq_class {
> +	struct Qdisc_class_common common;
> +
> +	unsigned int refcnt;
> +	unsigned int filter_cnt;
> +
> +	struct gnet_stats_basic_packed bstats;
> +	struct gnet_stats_queue qstats;
> +	struct gnet_stats_rate_est rate_est;
> +	struct Qdisc *qdisc;
> +
> +	struct qfq_class *next;	/* Link for the slot list. */
> +	u64 S, F;		/* flow timestamps (exact) */
> +
> +	/* group we belong to. In principle we would need the index,
> +	 * which is log_2(lmax/weight), but we never reference it
> +	 * directly, only the group.
> +	 */
> +	struct qfq_group *grp;
> +
> +	/* these are copied from the flowset. */
> +	u32	inv_w;		/* ONE_FP/weight */
> +	u32	lmax;		/* Max packet size for this flow. */
> +};
> +
> +struct qfq_group {
> +	uint64_t S, F;			/* group timestamps (approx). */
> +	unsigned int slot_shift;	/* Slot shift. */
> +	unsigned int index;		/* Group index. */
> +	unsigned int front;		/* Index of the front slot. */
> +	unsigned long full_slots;	/* non-empty slots */
> +
> +	/* Array of RR lists of active classes. */
> +	struct qfq_class *slots[QFQ_MAX_SLOTS];
> +};
> +
> +struct qfq_sched {
> +	struct tcf_proto *filter_list;
> +	struct Qdisc_class_hash clhash;
> +
> +	uint64_t	V;		/* Precise virtual time. */
> +	u32 wsum;			/* weight sum */
> +
> +	unsigned long bitmaps[QFQ_MAX_STATE];	    /* Group bitmaps. */
> +	struct qfq_group groups[QFQ_MAX_INDEX + 1]; /* The groups. */
> +};
> +
> +static struct qfq_class *qfq_find_class(struct Qdisc *sch, u32 classid)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct Qdisc_class_common *clc;
> +
> +	clc = qdisc_class_find(&q->clhash, classid);
> +	if (clc == NULL)
> +		return NULL;
> +	return container_of(clc, struct qfq_class, common);
> +}
> +
> +static void qfq_purge_queue(struct qfq_class *cl)
> +{
> +	unsigned int len = cl->qdisc->q.qlen;
> +
> +	qdisc_reset(cl->qdisc);
> +	qdisc_tree_decrease_qlen(cl->qdisc, len);
> +}
> +
> +static const struct nla_policy qfq_policy[TCA_QFQ_MAX + 1] = {
> +	[TCA_QFQ_WEIGHT] = { .type = NLA_U32 },
> +	[TCA_QFQ_LMAX] = { .type = NLA_U32 },
> +};
> +
> +/*
> + * Calculate a flow index, given its weight and maximum packet length.
> + * index = log_2(maxlen/weight) but we need to apply the scaling.
> + * This is used only once at flow creation.
> + */
> +static int qfq_calc_index(u32 inv_w, unsigned int maxlen)
> +{
> +	u64 slot_size = (u64)maxlen * inv_w;
> +	unsigned long size_map;
> +	int index = 0;
> +
> +	size_map = slot_size >> QFQ_MIN_SLOT_SHIFT;
> +	if (!size_map)
> +		goto out;
> +
> +	index = __fls(size_map) + 1;	/* basically a log_2 */
> +	index -= !(slot_size - (1ULL << (index + QFQ_MIN_SLOT_SHIFT - 1)));
> +
> +	if (index < 0)
> +		index = 0;
> +out:
> +	pr_debug("qfq calc_index: W = %lu, L = %u, I = %d\n",
> +		 (unsigned long) ONE_FP/inv_w, maxlen, index);
> +
> +	return index;
> +}
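A worked example of the scaling, for illustration (not part of the
patch): for the minimum weight 1 (inv_w = ONE_FP = 2^30) and the
maximum packet length 2048 (= 1 << QFQ_MTU_SHIFT), slot_size = 2^41 and
size_map = slot_size >> 22 = 2^19, since QFQ_MIN_SLOT_SHIFT =
30 + 11 - 19 = 22. Then index = __fls(2^19) + 1 = 20, and because
slot_size is an exact power of two the correction term subtracts one,
giving index 19 = QFQ_MAX_INDEX. So the largest-packet, lowest-weight
flows land in the last group, as the comment block at the top of the
file promises.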
> +
> +static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
> +			    struct nlattr **tca, unsigned long *arg)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_class *cl = (struct qfq_class *)*arg;
> +	struct nlattr *tb[TCA_QFQ_MAX + 1];
> +	u32 weight, lmax, inv_w;
> +	int i, err;
> +
> +	if (tca[TCA_OPTIONS] == NULL)
> +		return -EINVAL;
> +
> +	err = nla_parse_nested(tb, TCA_QFQ_MAX, tca[TCA_OPTIONS], qfq_policy);
> +	if (err < 0)
> +		return err;
> +
> +	if (tb[TCA_QFQ_WEIGHT]) {
> +		weight = nla_get_u32(tb[TCA_QFQ_WEIGHT]);
> +		if (!weight || weight > (1UL << QFQ_MAX_WSHIFT)) {
> +			pr_notice("qfq: invalid weight %u\n", weight);
> +			return -EINVAL;
> +		}
> +	} else
> +		weight = 1;
> +
> +	inv_w = ONE_FP / weight;
> +	weight = ONE_FP / inv_w;
> +	if (q->wsum + weight > QFQ_MAX_WSUM) {
> +		pr_notice("qfq: total weight out of range (%u + %u)\n",
> +			  weight, q->wsum);
> +		return -EINVAL;
> +	}
> +
> +	if (tb[TCA_QFQ_LMAX]) {
> +		lmax = nla_get_u32(tb[TCA_QFQ_LMAX]);
> +		if (!lmax || lmax > (1UL << QFQ_MTU_SHIFT)) {
> +			pr_notice("qfq: invalid max length %u\n", lmax);
> +			return -EINVAL;
> +		}
> +	} else
> +		lmax = 1UL << QFQ_MTU_SHIFT;
> +
> +	if (cl != NULL) {
> +		if (tca[TCA_RATE]) {
> +			err = gen_replace_estimator(&cl->bstats, &cl->rate_est,
> +						    qdisc_root_sleeping_lock(sch),
> +						    tca[TCA_RATE]);
> +			if (err)
> +				return err;
> +		}
> +
> +		sch_tree_lock(sch);
> +		if (tb[TCA_QFQ_WEIGHT]) {
> +			q->wsum = weight - ONE_FP / cl->inv_w;
> +			cl->inv_w = inv_w;
> +		}
> +		sch_tree_unlock(sch);
> +
> +		return 0;
> +	}
> +
> +	cl = kzalloc(sizeof(struct qfq_class), GFP_KERNEL);
> +	if (cl == NULL)
> +		return -ENOBUFS;
> +
> +	cl->refcnt = 1;
> +	cl->common.classid = classid;
> +	cl->lmax = lmax;
> +	cl->inv_w = inv_w;
> +	i = qfq_calc_index(cl->inv_w, cl->lmax);
> +
> +	cl->grp = &q->groups[i];
> +	q->wsum += weight;
> +
> +	cl->qdisc = qdisc_create_dflt(sch->dev_queue,
> +				      &pfifo_qdisc_ops, classid);
> +	if (cl->qdisc == NULL)
> +		cl->qdisc = &noop_qdisc;
> +
> +	if (tca[TCA_RATE]) {
> +		err = gen_new_estimator(&cl->bstats, &cl->rate_est,
> +					qdisc_root_sleeping_lock(sch),
> +					tca[TCA_RATE]);
> +		if (err) {
> +			qdisc_destroy(cl->qdisc);
> +			kfree(cl);
> +			return err;
> +		}
> +	}
> +
> +	sch_tree_lock(sch);
> +	qdisc_class_hash_insert(&q->clhash, &cl->common);
> +	sch_tree_unlock(sch);
> +
> +	qdisc_class_hash_grow(sch, &q->clhash);
> +
> +	*arg = (unsigned long)cl;
> +	return 0;
> +}
> +
> +static void qfq_destroy_class(struct Qdisc *sch, struct qfq_class *cl)
> +{
> +	struct qfq_sched *q = (struct qfq_sched *)sch;
> +
> +	if (cl->inv_w) {
> +		q->wsum -= ONE_FP / cl->inv_w;
> +		cl->inv_w = 0;
> +	}
> +
> +	gen_kill_estimator(&cl->bstats, &cl->rate_est);
> +	qdisc_destroy(cl->qdisc);
> +	kfree(cl);
> +}
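Note the round trip in the two functions above: the weight actually
charged to wsum is ONE_FP / (ONE_FP / weight), i.e. the weight as it is
representable through inv_w. For example (illustration only), a
configured weight of 3 becomes inv_w = 2^30/3 = 357913941, and
ONE_FP / inv_w recovers exactly 3, so the same quantity that was added
to wsum at creation is subtracted again in qfq_destroy_class().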
> +
> +static int qfq_delete_class(struct Qdisc *sch, unsigned long arg)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_class *cl = (struct qfq_class *)arg;
> +
> +	if (cl->filter_cnt > 0)
> +		return -EBUSY;
> +
> +	sch_tree_lock(sch);
> +
> +	qfq_purge_queue(cl);
> +	qdisc_class_hash_remove(&q->clhash, &cl->common);
> +
> +	if (--cl->refcnt == 0)
> +		qfq_destroy_class(sch, cl);
> +
> +	sch_tree_unlock(sch);
> +	return 0;
> +}
> +
> +static unsigned long qfq_get_class(struct Qdisc *sch, u32 classid)
> +{
> +	struct qfq_class *cl = qfq_find_class(sch, classid);
> +
> +	if (cl != NULL)
> +		cl->refcnt++;
> +
> +	return (unsigned long)cl;
> +}
> +
> +static void qfq_put_class(struct Qdisc *sch, unsigned long arg)
> +{
> +	struct qfq_class *cl = (struct qfq_class *)arg;
> +
> +	if (--cl->refcnt == 0)
> +		qfq_destroy_class(sch, cl);
> +}
> +
> +static struct tcf_proto **qfq_tcf_chain(struct Qdisc *sch, unsigned long cl)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +
> +	if (cl)
> +		return NULL;
> +
> +	return &q->filter_list;
> +}
> +
> +static unsigned long qfq_bind_tcf(struct Qdisc *sch, unsigned long parent,
> +				  u32 classid)
> +{
> +	struct qfq_class *cl = qfq_find_class(sch, classid);
> +
> +	if (cl != NULL)
> +		cl->filter_cnt++;
> +
> +	return (unsigned long)cl;
> +}
> +
> +static void qfq_unbind_tcf(struct Qdisc *sch, unsigned long arg)
> +{
> +	struct qfq_class *cl = (struct qfq_class *)arg;
> +
> +	cl->filter_cnt--;
> +}
> +
> +static int qfq_graft_class(struct Qdisc *sch, unsigned long arg,
> +			   struct Qdisc *new, struct Qdisc **old)
> +{
> +	struct qfq_class *cl = (struct qfq_class *)arg;
> +
> +	if (new == NULL) {
> +		new = qdisc_create_dflt(sch->dev_queue,
> +					&pfifo_qdisc_ops, cl->common.classid);
> +		if (new == NULL)
> +			new = &noop_qdisc;
> +	}
> +
> +	sch_tree_lock(sch);
> +	qfq_purge_queue(cl);
> +	*old = cl->qdisc;
> +	cl->qdisc = new;
> +	sch_tree_unlock(sch);
> +	return 0;
> +}
> +
> +static struct Qdisc *qfq_class_leaf(struct Qdisc *sch, unsigned long arg)
> +{
> +	struct qfq_class *cl = (struct qfq_class *)arg;
> +
> +	return cl->qdisc;
> +}
> +
> +static int qfq_dump_class(struct Qdisc *sch, unsigned long arg,
> +			  struct sk_buff *skb, struct tcmsg *tcm)
> +{
> +	struct qfq_class *cl = (struct qfq_class *)arg;
> +	struct nlattr *nest;
> +
> +	tcm->tcm_parent	= TC_H_ROOT;
> +	tcm->tcm_handle	= cl->common.classid;
> +	tcm->tcm_info	= cl->qdisc->handle;
> +
> +	nest = nla_nest_start(skb, TCA_OPTIONS);
> +	if (nest == NULL)
> +		goto nla_put_failure;
> +	NLA_PUT_U32(skb, TCA_QFQ_WEIGHT, ONE_FP/cl->inv_w);
> +	NLA_PUT_U32(skb, TCA_QFQ_LMAX, cl->lmax);
> +	return nla_nest_end(skb, nest);
> +
> +nla_put_failure:
> +	nla_nest_cancel(skb, nest);
> +	return -EMSGSIZE;
> +}
> +
> +static int qfq_dump_class_stats(struct Qdisc *sch, unsigned long arg,
> +				struct gnet_dump *d)
> +{
> +	struct qfq_class *cl = (struct qfq_class *)arg;
> +	struct tc_qfq_stats xstats;
> +
> +	memset(&xstats, 0, sizeof(xstats));
> +
> +	xstats.weight = ONE_FP/cl->inv_w;
> +	xstats.lmax = cl->lmax;
> +
> +	if (gnet_stats_copy_basic(d, &cl->bstats) < 0 ||
> +	    gnet_stats_copy_rate_est(d, NULL, &cl->rate_est) < 0 ||
> +	    gnet_stats_copy_queue(d, &cl->qdisc->qstats) < 0)
> +		return -1;
> +
> +	return gnet_stats_copy_app(d, &xstats, sizeof(xstats));
> +}
> +
> +static void qfq_walk(struct Qdisc *sch, struct qdisc_walker *arg)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_class *cl;
> +	struct hlist_node *n;
> +	unsigned int i;
> +
> +	if (arg->stop)
> +		return;
> +
> +	for (i = 0; i < q->clhash.hashsize; i++) {
> +		hlist_for_each_entry(cl, n, &q->clhash.hash[i], common.hnode) {
> +			if (arg->count < arg->skip) {
> +				arg->count++;
> +				continue;
> +			}
> +			if (arg->fn(sch, (unsigned long)cl, arg) < 0) {
> +				arg->stop = 1;
> +				return;
> +			}
> +			arg->count++;
> +		}
> +	}
> +}
> +
> +static struct qfq_class *qfq_classify(struct sk_buff *skb, struct Qdisc *sch,
> +				      int *qerr)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_class *cl;
> +	struct tcf_result res;
> +	int result;
> +
> +	if (TC_H_MAJ(skb->priority ^ sch->handle) == 0) {
> +		cl = qfq_find_class(sch, skb->priority);
> +		if (cl != NULL)
> +			return cl;
> +	}
> +
> +	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
> +	result = tc_classify(skb, q->filter_list, &res);
> +	if (result >= 0) {
> +#ifdef CONFIG_NET_CLS_ACT
> +		switch (result) {
> +		case TC_ACT_QUEUED:
> +		case TC_ACT_STOLEN:
> +			*qerr = NET_XMIT_SUCCESS | __NET_XMIT_STOLEN;
> +		case TC_ACT_SHOT:
> +			return NULL;
> +		}
> +#endif
> +		cl = (struct qfq_class *)res.class;
> +		if (cl == NULL)
> +			cl = qfq_find_class(sch, res.classid);
> +		return cl;
> +	}
> +
> +	return NULL;
> +}
> +
> +/* Generic comparison function, handling wraparound. */
> +static inline int qfq_gt(u64 a, u64 b)
> +{
> +	return (s64)(a - b) > 0;
> +}
> +
> +/* Round a precise timestamp to its slotted value. */
> +static inline u64 qfq_round_down(u64 ts, unsigned int shift)
> +{
> +	return ts & ~((1ULL << shift) - 1);
> +}
> +
> +/* return the pointer to the group with lowest index in the bitmap */
> +static inline struct qfq_group *qfq_ffs(struct qfq_sched *q,
> +					unsigned long bitmap)
> +{
> +	int index = __ffs(bitmap);	/* zero-based */
> +	return &q->groups[index];
> +}
> +
> +/* Calculate a mask to mimic what would be ffs_from(). */
> +static inline unsigned long mask_from(unsigned long bitmap, int from)
> +{
> +	return bitmap & ~((1UL << from) - 1);
> +}
> +
> +/*
> + * The state computation relies on ER=0, IR=1, EB=2, IB=3
> + * First compute eligibility comparing grp->S, q->V,
> + * then check if someone is blocking us and possibly add EB
> + */
> +static int qfq_calc_state(struct qfq_sched *q, const struct qfq_group *grp)
> +{
> +	/* if S > V we are not eligible */
> +	unsigned int state = qfq_gt(grp->S, q->V);
> +	unsigned long mask = mask_from(q->bitmaps[ER], grp->index);
> +	struct qfq_group *next;
> +
> +	if (mask) {
> +		next = qfq_ffs(q, mask);
> +		if (qfq_gt(grp->F, next->F))
> +			state |= EB;
> +	}
> +
> +	return state;
> +}
> +
> +
> +/*
> + * In principle
> + *	q->bitmaps[dst] |= q->bitmaps[src] & mask;
> + *	q->bitmaps[src] &= ~mask;
> + * but we should make sure that src != dst
> + */
> +static inline void qfq_move_groups(struct qfq_sched *q, unsigned long mask,
> +				   int src, int dst)
> +{
> +	q->bitmaps[dst] |= q->bitmaps[src] & mask;
> +	q->bitmaps[src] &= ~mask;
> +}
> +
> +static void qfq_unblock_groups(struct qfq_sched *q, int index, u64 old_F)
> +{
> +	unsigned long mask = mask_from(q->bitmaps[ER], index + 1);
> +	struct qfq_group *next;
> +
> +	if (mask) {
> +		next = qfq_ffs(q, mask);
> +		if (!qfq_gt(next->F, old_F))
> +			return;
> +	}
> +
> +	mask = (1UL << index) - 1;
> +	qfq_move_groups(q, mask, EB, ER);
> +	qfq_move_groups(q, mask, IB, IR);
> +}
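A small aside on qfq_gt() above: because S, F and V are free-running
64-bit fixed-point counters, ordering is decided by signed difference,
which stays correct across wraparound. For example, (s64)(2ULL - ~0ULL)
evaluates to 3 > 0, so qfq_gt(2, ~0ULL) correctly reports that
timestamp 2 comes "after" a timestamp just below the wrap point, even
though 2 < ~0ULL as an unsigned comparison. (Worked example for
illustration only.)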
> +
> +/*
> + * perhaps
> + *
> +	old_V ^= q->V;
> +	old_V >>= QFQ_MIN_SLOT_SHIFT;
> +	if (old_V) {
> +		...
> +	}
> + *
> + */
> +static void qfq_make_eligible(struct qfq_sched *q, u64 old_V)
> +{
> +	unsigned long vslot = q->V >> QFQ_MIN_SLOT_SHIFT;
> +	unsigned long old_vslot = old_V >> QFQ_MIN_SLOT_SHIFT;
> +
> +	if (vslot != old_vslot) {
> +		unsigned long mask = (1UL << fls(vslot ^ old_vslot)) - 1;
> +		qfq_move_groups(q, mask, IR, ER);
> +		qfq_move_groups(q, mask, IB, EB);
> +	}
> +}
> +
> +/*
> + * XXX we should make sure that slot becomes less than 32.
> + * This is guaranteed by the input values.
> + * roundedS is always cl->S rounded on grp->slot_shift bits.
> + */
> +static void qfq_slot_insert(struct qfq_group *grp, struct qfq_class *cl,
> +			    u64 roundedS)
> +{
> +	u64 slot = (roundedS - grp->S) >> grp->slot_shift;
> +	unsigned int i = (grp->front + slot) % QFQ_MAX_SLOTS;
> +
> +	cl->next = grp->slots[i];
> +	grp->slots[i] = cl;
> +	__set_bit(slot, &grp->full_slots);
> +}
> +
> +/*
> + * remove the entry from the slot
> + */
> +static void qfq_front_slot_remove(struct qfq_group *grp)
> +{
> +	struct qfq_class **h = &grp->slots[grp->front];
> +
> +	*h = (*h)->next;
> +	if (!*h)
> +		__clear_bit(0, &grp->full_slots);
> +}
> +
> +/*
> + * Returns the first full queue in a group. As a side effect,
> + * adjust the bucket list so the first non-empty bucket is at
> + * position 0 in full_slots.
> + */
> +static struct qfq_class *qfq_slot_scan(struct qfq_group *grp)
> +{
> +	unsigned int i;
> +
> +	pr_debug("qfq slot_scan: grp %u full %#lx\n",
> +		 grp->index, grp->full_slots);
> +
> +	if (!grp->full_slots)
> +		return NULL;
> +
> +	i = __ffs(grp->full_slots);	/* zero based */
> +	if (i > 0) {
> +		grp->front = (grp->front + i) % QFQ_MAX_SLOTS;
> +		grp->full_slots >>= i;
> +	}
> +
> +	return grp->slots[grp->front];
> +}
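The bucket ring above is compact enough that a standalone model fits in
a few lines. A rough userspace sketch of qfq_slot_scan()'s rotation
(illustration only, GCC builtins standing in for the kernel bitops):
the group keeps `front` as the ring position of logical slot 0 and
`full_slots` as a bitmap relative to `front`, so finding the first busy
slot is a bit scan plus an index rotation, with no list heads moved:

	#include <stdio.h>

	#define QFQ_MAX_SLOTS 32

	struct grp {
		unsigned int front;
		unsigned long full_slots;	/* bit i => slot (front + i) busy */
	};

	static int first_busy(struct grp *g)
	{
		unsigned int i;

		if (!g->full_slots)
			return -1;
		i = __builtin_ctzl(g->full_slots);	/* like __ffs() */
		g->front = (g->front + i) % QFQ_MAX_SLOTS;
		g->full_slots >>= i;		/* busy slot is now bit 0 */
		return g->front;
	}

	int main(void)
	{
		struct grp g = { .front = 30, .full_slots = 1UL << 3 };

		/* Busy slot is 3 past front 30, i.e. ring slot 1. */
		printf("%d\n", first_busy(&g));	/* prints 1 */
		return 0;
	}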
> +
> +/*
> + * adjust the bucket list. When the start time of a group decreases,
> + * we move the index down (modulo QFQ_MAX_SLOTS) so we don't need to
> + * move the objects. The mask of occupied slots must be shifted
> + * because we use ffs() to find the first non-empty slot.
> + * This covers decreases in the group's start time, but what about
> + * increases of the start time ?
> + * Here too we should make sure that i is less than 32
> + */
> +static void qfq_slot_rotate(struct qfq_group *grp, u64 roundedS)
> +{
> +	unsigned int i = (grp->S - roundedS) >> grp->slot_shift;
> +
> +	grp->full_slots <<= i;
> +	grp->front = (grp->front - i) % QFQ_MAX_SLOTS;
> +}
> +
> +static void qfq_update_eligible(struct qfq_sched *q, u64 old_V)
> +{
> +	struct qfq_group *grp;
> +	unsigned long ineligible;
> +
> +	ineligible = q->bitmaps[IR] | q->bitmaps[IB];
> +	if (ineligible) {
> +		if (!q->bitmaps[ER]) {
> +			grp = qfq_ffs(q, ineligible);
> +			if (qfq_gt(grp->S, q->V))
> +				q->V = grp->S;
> +		}
> +		qfq_make_eligible(q, old_V);
> +	}
> +}
> +
> +/* What is length of next packet in queue (0 if queue is empty) */
> +static unsigned int qdisc_peek_len(struct Qdisc *sch)
> +{
> +	struct sk_buff *skb;
> +
> +	skb = sch->ops->peek(sch);
> +	return skb ? qdisc_pkt_len(skb) : 0;
> +}
> +
> +/*
> + * Updates the class, returns true if also the group needs to be updated.
> + */
> +static bool qfq_update_class(struct qfq_group *grp, struct qfq_class *cl)
> +{
> +	unsigned int len = qdisc_peek_len(cl->qdisc);
> +
> +	cl->S = cl->F;
> +	if (!len)
> +		qfq_front_slot_remove(grp);	/* queue is empty */
> +	else {
> +		u64 roundedS;
> +
> +		cl->F = cl->S + (u64)len * cl->inv_w;
> +		roundedS = qfq_round_down(cl->S, grp->slot_shift);
> +		if (roundedS == grp->S)
> +			return false;
> +
> +		qfq_front_slot_remove(grp);
> +		qfq_slot_insert(grp, cl, roundedS);
> +	}
> +
> +	return true;
> +}
> +
> +static struct sk_buff *qfq_dequeue(struct Qdisc *sch)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_group *grp;
> +	struct qfq_class *cl;
> +	struct sk_buff *skb;
> +	unsigned int len;
> +	u64 old_V;
> +
> +	if (!q->bitmaps[ER])
> +		return NULL;
> +
> +	grp = qfq_ffs(q, q->bitmaps[ER]);
> +
> +	cl = grp->slots[grp->front];
> +	skb = qdisc_dequeue_peeked(cl->qdisc);
> +	if (!skb) {
> +		WARN_ONCE(1, "qfq_dequeue: non-workconserving leaf\n");
> +		return NULL;
> +	}
> +
> +	sch->q.qlen--;
> +
> +	old_V = q->V;
> +	len = qdisc_pkt_len(skb);
> +	q->V += (u64)len * IWSUM;
> +	pr_debug("qfq dequeue: len %u F %lld now %lld\n",
> +		 len, (unsigned long long) cl->F, (unsigned long long) q->V);
> +
> +	if (qfq_update_class(grp, cl)) {
> +		u64 old_F = grp->F;
> +
> +		cl = qfq_slot_scan(grp);
> +		if (!cl)
> +			__clear_bit(grp->index, &q->bitmaps[ER]);
> +		else {
> +			u64 roundedS = qfq_round_down(cl->S, grp->slot_shift);
> +			unsigned int s;
> +
> +			if (grp->S == roundedS)
> +				goto skip_unblock;
> +			grp->S = roundedS;
> +			grp->F = roundedS + (2ULL << grp->slot_shift);
> +			__clear_bit(grp->index, &q->bitmaps[ER]);
> +			s = qfq_calc_state(q, grp);
> +			__set_bit(grp->index, &q->bitmaps[s]);
> +		}
> +
> +		qfq_unblock_groups(q, grp->index, old_F);
> +	}
> +
> +skip_unblock:
> +	qfq_update_eligible(q, old_V);
> +
> +	return skb;
> +}
> +
> +/*
> + * Assign a reasonable start time for a new flow k in group i.
> + * Admissible values for \hat(F) are multiples of \sigma_i
> + * no greater than V+\sigma_i . Larger values mean that
> + * we had a wraparound so we consider the timestamp to be stale.
> + *
> + * If F is not stale and F >= V then we set S = F.
> + * Otherwise we should assign S = V, but this may violate
> + * the ordering in ER. So, if we have groups in ER, set S to
> + * the F_j of the first group j which would be blocking us.
> + * We are guaranteed not to move S backward because
> + * otherwise our group i would still be blocked.
> + */
> +static void qfq_update_start(struct qfq_sched *q, struct qfq_class *cl)
> +{
> +	unsigned long mask;
> +	uint32_t limit, roundedF;
> +	int slot_shift = cl->grp->slot_shift;
> +
> +	roundedF = qfq_round_down(cl->F, slot_shift);
> +	limit = qfq_round_down(q->V, slot_shift) + (1UL << slot_shift);
> +
> +	if (!qfq_gt(cl->F, q->V) || qfq_gt(roundedF, limit)) {
> +		/* timestamp was stale */
> +		mask = mask_from(q->bitmaps[ER], cl->grp->index);
> +		if (mask) {
> +			struct qfq_group *next = qfq_ffs(q, mask);
> +			if (qfq_gt(roundedF, next->F)) {
> +				cl->S = next->F;
> +				return;
> +			}
> +		}
> +		cl->S = q->V;
> +	} else { /* timestamp is not stale */
> +		cl->S = cl->F;
> +	}
> +}
> +
> +static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_group *grp;
> +	struct qfq_class *cl;
> +	unsigned int len;
> +	int err;
> +	u64 roundedS;
> +	int s;
> +
> +	cl = qfq_classify(skb, sch, &err);
> +	if (cl == NULL || cl->qdisc->q.qlen > 80) {
> +		if (err & __NET_XMIT_BYPASS)
> +			sch->qstats.drops++;
> +		kfree_skb(skb);
> +		return err;
> +	}
> +
> +	len = qdisc_pkt_len(skb);
> +	err = qdisc_enqueue(skb, cl->qdisc);
> +	if (unlikely(err != NET_XMIT_SUCCESS)) {
> +		if (net_xmit_drop_count(err)) {
> +			cl->qstats.drops++;
> +			sch->qstats.drops++;
> +		}
> +		return err;
> +	}
> +
> +	cl->bstats.packets += skb_is_gso(skb) ? skb_shinfo(skb)->gso_segs : 1;

Hmm, no other packet scheduler accounts packets this way. Which
approach is better? I am not sure. Note also that qstats.drops isn't
maintained per segment in this patch, so the two counters would be
inconsistent.

> +	cl->bstats.bytes += qdisc_pkt_len(skb);
> +
> +	sch->q.qlen++;
> +	sch->bstats.packets += skb_is_gso(skb) ? skb_shinfo(skb)->gso_segs : 1;
> +	sch->bstats.bytes += qdisc_pkt_len(skb);
> +
> +	if (qdisc_peek_head(sch) != skb)
> +		return err;

I suspect this is wrong. Here is the pseudocode from the paper:

	if (flow.queue.head != pkt)
		return;	/* flow already backlogged, we are done */

So the correct code should be:

	if (qdisc_peek_head(cl->qdisc) != skb)
		return err;

However, we can't assume cl->qdisc is work-conserving, so the check
should really be:

	if (cl->qdisc->q.qlen > 1)
		return err;
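For comparison, sch_drr.c answers the same "did this class just become
backlogged?" question by testing the leaf queue length after a
successful enqueue (quoted from memory, so treat the exact lines as
approximate):

	if (cl->qdisc->q.qlen == 1)
		list_add_tail(&cl->alist, &q->active);

That is, qlen == 1 means the class was idle before this packet, the
inverse of the qlen > 1 test suggested above, and it works whether or
not the leaf qdisc is work-conserving.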
> +
> +	/* If we reach this point, queue q was idle */
> +	grp = cl->grp;
> +	qfq_update_start(q, cl);
> +
> +	/* compute new finish time and rounded start. */
> +	cl->F = cl->S + (u64)qdisc_pkt_len(skb) * cl->inv_w;
> +	roundedS = qfq_round_down(cl->S, grp->slot_shift);
> +
> +	/*
> +	 * insert cl in the correct bucket.
> +	 * If cl->S >= grp->S we don't need to adjust the
> +	 * bucket list and simply go to the insertion phase.
> +	 * Otherwise grp->S is decreasing, we must make room
> +	 * in the bucket list, and also recompute the group state.
> +	 * Finally, if there were no flows in this group and nobody
> +	 * was in ER make sure to adjust V.
> +	 */
> +	if (grp->full_slots) {
> +		if (!qfq_gt(grp->S, cl->S))
> +			goto skip_update;
> +
> +		/* create a slot for this cl->S */
> +		qfq_slot_rotate(grp, roundedS);
> +		/* group was surely ineligible, remove */
> +		__clear_bit(grp->index, &q->bitmaps[IR]);
> +		__clear_bit(grp->index, &q->bitmaps[IB]);
> +	} else if (!q->bitmaps[ER] && qfq_gt(roundedS, q->V))
> +		q->V = roundedS;
> +
> +	grp->S = roundedS;
> +	grp->F = roundedS + (2ULL << grp->slot_shift);
> +	s = qfq_calc_state(q, grp);
> +	__set_bit(grp->index, &q->bitmaps[s]);
> +
> +	pr_debug("qfq enqueue: new state %d %#lx S %lld F %lld V %lld\n",
> +		 s, q->bitmaps[s],
> +		 (unsigned long long) cl->S,
> +		 (unsigned long long) cl->F,
> +		 (unsigned long long) q->V);
> +
> +skip_update:
> +	qfq_slot_insert(grp, cl, roundedS);
> +
> +	return err;
> +}
> +
> +
> +static void qfq_slot_remove(struct qfq_sched *q, struct qfq_group *grp,
> +			    struct qfq_class *cl, struct qfq_class **pprev)
> +{
> +	unsigned int i, offset;
> +	u64 roundedS;
> +
> +	roundedS = qfq_round_down(cl->S, grp->slot_shift);
> +	offset = (roundedS - grp->S) >> grp->slot_shift;
> +	i = (grp->front + offset) % QFQ_MAX_SLOTS;
> +
> +	if (!pprev) {
> +		pprev = &grp->slots[i];
> +		while (*pprev && *pprev != cl)
> +			pprev = &(*pprev)->next;
> +	}
> +
> +	*pprev = cl->next;
> +	if (!grp->slots[i])
> +		__clear_bit(offset, &grp->full_slots);
> +}
> +
> +/*
> + * called to forcibly destroy a queue.
> + * If the queue is not in the front bucket, or if it has
> + * other queues in the front bucket, we can simply remove
> + * the queue with no other side effects.
> + * Otherwise we must propagate the event up.
> + */
> +static void qfq_deactivate_class(struct qfq_sched *q, struct qfq_class *cl,
> +				 struct qfq_class **pprev)
> +{
> +	struct qfq_group *grp = cl->grp;
> +	unsigned long mask;
> +	u64 roundedS;
> +	int s;
> +
> +	cl->F = cl->S;
> +	qfq_slot_remove(q, grp, cl, pprev);
> +
> +	if (!grp->full_slots) {
> +		__clear_bit(grp->index, &q->bitmaps[IR]);
> +		__clear_bit(grp->index, &q->bitmaps[EB]);
> +		__clear_bit(grp->index, &q->bitmaps[IB]);
> +
> +		if (test_bit(grp->index, &q->bitmaps[ER]) &&
> +		    !(q->bitmaps[ER] & ~((1UL << grp->index) - 1))) {
> +			mask = q->bitmaps[ER] & ((1UL << grp->index) - 1);
> +			if (mask)
> +				mask = ~((1UL << __fls(mask)) - 1);
> +			else
> +				mask = ~0UL;
> +			qfq_move_groups(q, mask, EB, ER);
> +			qfq_move_groups(q, mask, IB, IR);
> +		}
> +		__clear_bit(grp->index, &q->bitmaps[ER]);
> +	} else if (!grp->slots[grp->front]) {
> +		cl = qfq_slot_scan(grp);
> +		roundedS = qfq_round_down(cl->S, grp->slot_shift);
> +		if (grp->S != roundedS) {
> +			__clear_bit(grp->index, &q->bitmaps[ER]);
> +			__clear_bit(grp->index, &q->bitmaps[IR]);
> +			__clear_bit(grp->index, &q->bitmaps[EB]);
> +			__clear_bit(grp->index, &q->bitmaps[IB]);
> +			grp->S = roundedS;
> +			grp->F = roundedS + (2ULL << grp->slot_shift);
> +			s = qfq_calc_state(q, grp);
> +			__set_bit(grp->index, &q->bitmaps[s]);
> +		}
> +	}
> +
> +	qfq_update_eligible(q, q->V);
> +}
> +
> +static void qfq_qlen_notify(struct Qdisc *sch, unsigned long arg)
> +{
> +	struct qfq_sched *q = (struct qfq_sched *)sch;
> +	struct qfq_class *cl = (struct qfq_class *)arg;
> +
> +	if (cl->qdisc->q.qlen == 0)
> +		qfq_deactivate_class(q, cl, NULL);
> +}
> +
> +static unsigned int qfq_drop(struct Qdisc *sch)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_group *grp;
> +	struct qfq_class *cl, **pp;
> +	unsigned int i, j, len;
> +
> +	for (i = 0; i <= QFQ_MAX_INDEX; i++) {
> +		grp = &q->groups[i];
> +		for (j = 0; j < QFQ_MAX_SLOTS; j++) {
> +			for (pp = &grp->slots[j]; *pp; pp = &(*pp)->next) {
> +				cl = *pp;
> +				if (!cl->qdisc->ops->drop)
> +					continue;
> +
> +				len = cl->qdisc->ops->drop(cl->qdisc);
> +				if (len > 0) {
> +					sch->q.qlen--;
> +					if (!cl->qdisc->q.qlen)
> +						qfq_deactivate_class(q, cl, pp);
> +
> +					return len;
> +				}
> +			}
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int qfq_init_qdisc(struct Qdisc *sch, struct nlattr *opt)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_group *grp;
> +	int i, err;
> +
> +	err = qdisc_class_hash_init(&q->clhash);
> +	if (err < 0)
> +		return err;
> +
> +	for (i = 0; i <= QFQ_MAX_INDEX; i++) {
> +		grp = &q->groups[i];
> +		grp->index = i;
> +	}
> +
> +	return 0;
> +}
> +
> +static void qfq_reset_qdisc(struct Qdisc *sch)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_group *grp;
> +	struct qfq_class *cl, **pp;
> +	struct hlist_node *n;
> +	unsigned int i, j;
> +
> +	for (i = 0; i <= QFQ_MAX_INDEX; i++) {
> +		grp = &q->groups[i];
> +		for (j = 0; j < QFQ_MAX_SLOTS; j++) {
> +			for (pp = &grp->slots[j]; *pp; pp = &(*pp)->next) {
> +				cl = *pp;
> +				if (cl->qdisc->q.qlen)
> +					qfq_deactivate_class(q, cl, pp);
> +			}
> +		}
> +	}
> +
> +	for (i = 0; i < q->clhash.hashsize; i++) {
> +		hlist_for_each_entry(cl, n, &q->clhash.hash[i], common.hnode)
> +			qdisc_reset(cl->qdisc);
> +	}
> +	sch->q.qlen = 0;
> +}
> +
> +static void qfq_destroy_qdisc(struct Qdisc *sch)
> +{
> +	struct qfq_sched *q = qdisc_priv(sch);
> +	struct qfq_class *cl;
> +	struct hlist_node *n, *next;
> +	unsigned int i;
> +
> +	tcf_destroy_chain(&q->filter_list);
> +
> +	for (i = 0; i < q->clhash.hashsize; i++) {
> +		hlist_for_each_entry_safe(cl, n, next, &q->clhash.hash[i],
> +					  common.hnode)
> +			qfq_destroy_class(sch, cl);
> +	}
> +	qdisc_class_hash_destroy(&q->clhash);
> +}
> +
> +static const struct Qdisc_class_ops qfq_class_ops = {
> +	.change		= qfq_change_class,
> +	.delete		= qfq_delete_class,
> +	.get		= qfq_get_class,
> +	.put		= qfq_put_class,
> +	.tcf_chain	= qfq_tcf_chain,
> +	.bind_tcf	= qfq_bind_tcf,
> +	.unbind_tcf	= qfq_unbind_tcf,
> +	.graft		= qfq_graft_class,
> +	.leaf		= qfq_class_leaf,
> +	.qlen_notify	= qfq_qlen_notify,
> +	.dump		= qfq_dump_class,
> +	.dump_stats	= qfq_dump_class_stats,
> +	.walk		= qfq_walk,
> +};
> +
> +static struct Qdisc_ops qfq_qdisc_ops __read_mostly = {
> +	.cl_ops		= &qfq_class_ops,
> +	.id		= "qfq",
> +	.priv_size	= sizeof(struct qfq_sched),
> +	.enqueue	= qfq_enqueue,
> +	.dequeue	= qfq_dequeue,
> +	.peek		= qdisc_peek_dequeued,
> +	.drop		= qfq_drop,
> +	.init		= qfq_init_qdisc,
> +	.reset		= qfq_reset_qdisc,
> +	.destroy	= qfq_destroy_qdisc,
> +	.owner		= THIS_MODULE,
> +};
> +
> +static int __init qfq_init(void)
> +{
> +	return register_qdisc(&qfq_qdisc_ops);
> +}
> +
> +static void __exit qfq_exit(void)
> +{
> +	unregister_qdisc(&qfq_qdisc_ops);
> +}
> +
> +module_init(qfq_init);
> +module_exit(qfq_exit);
> +MODULE_LICENSE("GPL");
> --

-- 
Regards,
Changli Gao (xiaosuo@gmail.com)