From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jarek Poplawski Subject: Re: [net-next-2.6 PATCH v5 2/2] net_sched: implement a root container qdisc sch_mqprio Date: Thu, 6 Jan 2011 16:41:15 +0100 Message-ID: <20110106154115.GA1846@del.dom.local> References: <20110104185600.13692.47967.stgit@jf-dev1-dcblab> <20110104185646.13692.68146.stgit@jf-dev1-dcblab> <20110104225936.GA2030@del.dom.local> <4D24ACA4.3000301@intel.com> <20110106003208.GA2166@del.dom.local> <4D25296D.8070902@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: "davem@davemloft.net" , "hadi@cyberus.ca" , "shemminger@vyatta.com" , "tgraf@infradead.org" , "eric.dumazet@gmail.com" , "bhutchings@solarflare.com" , "nhorman@tuxdriver.com" , "netdev@vger.kernel.org" To: John Fastabend Return-path: Received: from mail-bw0-f46.google.com ([209.85.214.46]:61732 "EHLO mail-bw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753036Ab1AFPlX (ORCPT ); Thu, 6 Jan 2011 10:41:23 -0500 Received: by bwz15 with SMTP id 15so16481392bwz.19 for ; Thu, 06 Jan 2011 07:41:22 -0800 (PST) Content-Disposition: inline In-Reply-To: <4D25296D.8070902@intel.com> Sender: netdev-owner@vger.kernel.org List-ID: On Wed, Jan 05, 2011 at 06:31:09PM -0800, John Fastabend wrote: > On 1/5/2011 4:32 PM, Jarek Poplawski wrote: > > On Wed, Jan 05, 2011 at 09:38:44AM -0800, John Fastabend wrote: > >> On 1/4/2011 2:59 PM, Jarek Poplawski wrote: > >>> On Tue, Jan 04, 2011 at 10:56:46AM -0800, John Fastabend wrote: > > ... > >>>> + > >>>> + /* Always use supplied priority mappings */ > >>>> + for (i = 0; i < TC_BITMASK + 1; i++) { > >>>> + if (netdev_set_prio_tc_map(dev, i, qopt->prio_tc_map[i])) { > >>>> + err = -EINVAL; > >>> > >>> This would probably trigger if we try qopt->num_tc == 0. Is it expected? > >> > >> netdev_set_prio_tc_map() returns 0 on sucess. This if(..) is a bit strange though. > >> > >> err = netdev_set_prio_tc_map(dev, i, qopt->prio_tc_map[i]) > >> if (err) > >> ... > >> > >> Is cleaner IMHO. > > > > Maybe. But I still wonder if qopt->num_tc == 0 is valid (by design)? > > (netdev_set_prio_tc_map() returns -EINVAL if dev->num_tc == 0) > > > > I guess I don't see how it returns -EINVAL. Although setting > qopt->num_tc = 0 is equivalent to disabling traffic classes so > it is a strange thing to do but I don't expect it should be a > problem. > > static inline > int netdev_set_prio_tc_map(struct net_device *dev, u8 prio, u8 tc) > { > if (tc >= dev->num_tc) If dev->num_tc was set to 0 this check is always true and we exit the init with error. It's not a problem if it's documented but checking it earlier seems more readable. > return -EINVAL; > > dev->prio_tc_map[prio & TC_BITMASK] = tc & TC_BITMASK; > return 0; > } > > >>>> +static unsigned long mqprio_get(struct Qdisc *sch, u32 classid) > >>>> +{ > >>>> + unsigned int ntx = TC_H_MIN(classid); > >>> > >>> We need to 'get' tc classes too, eg for individual dumps. Then we > >>> should omit them in .leaf, .graft etc. > >>> > >> > >> OK missed this. Looks like iproute2 always sets NLM_F_DUMP which works because it uses cl_ops->walk > >> > >> # tc -s class show dev eth3 classid 800b:1 > >> class mqprio 800b:1 root > >> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > >> backlog 0b 0p requeues 0 > > > > OK, then it might be only of theoretical value (for some other tools). > >>> Why dev->num_tc above, netdev_get_num_tc(dev) here, and dev->num_tc > >>> below? > >> > >> No reason just inconsistant I will use dev->num_tc. > > > > Well, this would suggest netdev_get_num_tc() is redundant. > > > > Guess it is I can remove it. If you definitely don't expect any #ifdef CONFIG_XXX for this code, both get and set num_tc could be removed. If you're not sure use the function everywhere. > > static inline > u8 netdev_get_num_tc(const struct net_device *dev) > { > return dev->num_tc; > } > > >>>> +static void mqprio_walk(struct Qdisc *sch, struct qdisc_walker *arg) > >>>> +{ > >>>> + struct net_device *dev = qdisc_dev(sch); > >>>> + unsigned long ntx; > >>>> + u8 num_tc = netdev_get_num_tc(dev); > >>>> + > >>>> + if (arg->stop) > >>>> + return; > >>>> + > >>>> + /* Walk hierarchy with a virtual class per tc */ > >>>> + arg->count = arg->skip; > >>>> + for (ntx = arg->skip; ntx < dev->num_tx_queues + num_tc; ntx++) { > >>> > >>> Should we report possibly unused/unconfigured tx_queues? > >> > >> I think it may be OK select_queue() could push skbs onto these queues and we may still want to see the statistics in this case. Although (real_num_tx_queues + num_tc) may make sense I see no reason to show queues above real_num_tx_queues. > > > > IMHO, pushing skbs to unconfigured/unwanted/disabled queues is a bug, > > and select should handle this, but probably it's a matter of taste. > > certainly a bug for queues >= real_num_tx_queues. > > > Btw, I wonder how dev->real_num_tx_queues is obeyed here. > > > > Its not. real_num_tx_queues is not handled correctly here also > real_num_tx_queues can be changed so this need to be handled as well. > This means an additional check in the options parsing. Well, let's do it. Maybe similarly to sch_multiq, with the .change option? Jarek P.