From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiri Pirko
Subject: Re: [patch net-next RFC 0/4] net: sched: allow qdiscs to share filter block instances
Date: Tue, 11 Jul 2017 14:34:12 +0200
Message-ID: <20170711123412.GB1874@nanopsycho>
References: <20170710185110.3180-1-jiri@resnulli.us> <29fbfb29-1737-1990-a6a7-b79bed7ed1fa@mojatatu.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netdev@vger.kernel.org, davem@davemloft.net, xiyou.wangcong@gmail.com, edumazet@google.com, stephen@networkplumber.org, jbenc@redhat.com, mlxsw@mellanox.com, andrew@lunn.ch, vivien.didelot@savoirfairelinux.com, f.fainelli@gmail.com, john.fastabend@gmail.com, alexander.h.duyck@intel.com, daniel@iogearbox.net, ogerlitz@mellanox.com, mrv@mojatatu.com
To: Jamal Hadi Salim
Return-path:
Received: from mail-wr0-f176.google.com ([209.85.128.176]:36284 "EHLO mail-wr0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752533AbdGKMeP (ORCPT ); Tue, 11 Jul 2017 08:34:15 -0400
Received: by mail-wr0-f176.google.com with SMTP id c11so181659588wrc.3 for ; Tue, 11 Jul 2017 05:34:15 -0700 (PDT)
Content-Disposition: inline
In-Reply-To: <29fbfb29-1737-1990-a6a7-b79bed7ed1fa@mojatatu.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

Tue, Jul 11, 2017 at 02:15:27PM CEST, jhs@mojatatu.com wrote:
>Hi Jiri,
>
>Commenting on generalities - will comment on code later:
>
>On 17-07-10 02:51 PM, Jiri Pirko wrote:
>> From: Jiri Pirko
>>
>> Currently the filters added to qdiscs are independent. So for example if you
>> have 2 netdevices and you create ingress qdisc on both and you want to add
>> identical filter rules both, you need to add them twice. This patchset
>> makes this easier and mainly saves resources allowing to share all filters
>> within a qdisc - I call it a "filter block". Also this helps to save
>> resources when we do offload to hw for example to expensive TCAM.
>>
>> So back to the example. First, we create 2 qdiscs. Both will share
>> block number 22. "22" is just an identification. If we don't pass any
>> block number, a new one will be generated by kernel:
>>
>> $ tc qdisc add dev ens7 ingress block 22
>>                                 ^^^^^^^^
>> $ tc qdisc add dev ens8 ingress block 22
>>
>
>Above makes intuitive sense.
>
>
>                                  ^^^^^^^^
>>
>> Now if we list the qdiscs, we will see the block index in the output:
>> qdisc fq_codel 0: dev ens7 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
>>  Sent 9014 bytes 99 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
>>                                               ^^^^^^^^
>>  Sent 4592 bytes 58 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>> qdisc fq_codel 0: dev ens8 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
>>  Sent 17022 bytes 307 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>   maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>>   new_flows_len 0 old_flows_len 0
>> qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
>>                                               ^^^^^^^^
>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>
>
>So does this.
>
>>
>> Now we can add filter to any of qdiscs sharing the same block:
>>
>> $ tc filter add dev ens7 parent ffff: protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop
>>
>
>So for backward compat - this also makes sense. But:
>it does make sense to create new syntax for adding
>filters and actions:
>
>tc filter add block 22 protocol ip pref 25 flower \
>    dst_ip 192.168.0.0/16 action drop

Was thinking about that. Decided to pass on this now. This should be
addressed by follow-up anyway.

>
>Coordinates of the filter block before were:
>
>, , [handle]
>
>You should be able to abuse struct tcmsg ifindex to represent block #
>as long as you set parent to be something meaningful that is
>identified "block coordinate" via TC_H_XXX (pick something safe not
>in use by ingress or egress; look at: uapi/linux/pkt_sched.h)

Not sure about this. I have to take a closer look. In general, I don't
like to abuse anything :)

>
>>
>> We will see the same output if we list filters for ens7 and ens8, including stats:
>>
>> $ tc -s filter show dev ens7 root
>> filter parent ffff: protocol ip pref 25 flower
>> filter parent ffff: protocol ip pref 25 flower handle 0x1
>>   eth_type ipv4
>>   dst_ip 192.168.1.0/24
>>         action order 1: gact action drop
>>          random type none pass val 0
>>          index 3 ref 1 bind 1 installed 10201 sec used 10150 sec
>>         Action statistics:
>>         Sent 4200 bytes 50 pkt (dropped 50, overlimits 0 requeues 0)
>>         backlog 0b 0p requeues 0
>>
>> $ tc -s filter show dev ens8 root
>> filter dev ens7 parent ffff: protocol ip pref 25 flower
>> filter dev ens7 parent ffff: protocol ip pref 25 flower handle 0x1
>>   eth_type ipv4
>>   dst_ip 192.168.1.0/24
>>         action order 1: gact action drop
>>          random type none pass val 0
>>          index 3 ref 1 bind 1 installed 10202 sec used 10152 sec
>>         Action statistics:
>>         Sent 4200 bytes 50 pkt (dropped 50, overlimits 0 requeues 0)
>>         backlog 0b 0p requeues 0
>>
>>
>> Issues:
>> - tp->q is set by the device used to add the filter. That has to be resolved.
>>   Impacts the dump (as you can see above)
>>
>
>I think you have more problems if the dump above is reality;->
>You added to ingress and this is showing egress. Howcome?

I only see that "dev x" is missing on ens7. That is the only difference.

>
>To complete the thought, dump is:
>
> tc -s filter show block 22

Understood. Again, this should be addressed in follow-up.

>
>cheers,
>jamal
>
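
For illustration only, here is a rough sketch of the tcmsg idea suggested
above: carry the block number in the ifindex field and mark the message
with a reserved parent value. Nothing here is from the patchset. The
struct is a local stand-in with the same fields as struct tcmsg, and
TC_H_BLOCK_SKETCH and the helper names are made up. The marker value is an
assumption and would have to be checked against uapi/linux/pkt_sched.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in with the same fields as struct tcmsg, so the sketch
 * builds without kernel headers. */
struct tcmsg_sketch {
	unsigned char  tcm_family;
	unsigned char  tcm__pad1;
	unsigned short tcm__pad2;
	int            tcm_ifindex;
	uint32_t       tcm_handle;
	uint32_t       tcm_parent;
	uint32_t       tcm_info;
};

/* HYPOTHETICAL marker: assumed not to collide with the ingress/egress
 * handles; not verified against uapi/linux/pkt_sched.h. */
#define TC_H_BLOCK_SKETCH 0xFFFFFFF0U

/* Address a shared filter block instead of a device. */
static inline void tcm_set_block(struct tcmsg_sketch *t, uint32_t block)
{
	assert(block != 0 && block <= INT32_MAX);
	t->tcm_ifindex = (int)block;       /* block number reuses ifindex */
	t->tcm_parent = TC_H_BLOCK_SKETCH; /* says "ifindex is a block #" */
}

/* True if the message addresses a block rather than a device. */
static inline bool tcm_is_block(const struct tcmsg_sketch *t)
{
	return t->tcm_parent == TC_H_BLOCK_SKETCH;
}

/* Block number, or 0 if the message addresses a device. */
static inline uint32_t tcm_block_index(const struct tcmsg_sketch *t)
{
	return tcm_is_block(t) ? (uint32_t)t->tcm_ifindex : 0;
}
```

With this encoding, a request for "block 22" would be built by calling
tcm_set_block(&t, 22), and the receiving side would test tcm_is_block()
before interpreting tcm_ifindex as a device index. The cost is the one
raised in the reply: ifindex gets a second meaning that depends on parent.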