From: Jiri Pirko <jiri@resnulli.us>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: netdev@vger.kernel.org, davem@davemloft.net,
xiyou.wangcong@gmail.com, edumazet@google.com,
stephen@networkplumber.org, jbenc@redhat.com, mlxsw@mellanox.com,
andrew@lunn.ch, vivien.didelot@savoirfairelinux.com,
f.fainelli@gmail.com, john.fastabend@gmail.com,
alexander.h.duyck@intel.com, daniel@iogearbox.net,
ogerlitz@mellanox.com, mrv@mojatatu.com
Subject: Re: [patch net-next RFC 0/4] net: sched: allow qdiscs to share filter block instances
Date: Tue, 11 Jul 2017 14:34:12 +0200
Message-ID: <20170711123412.GB1874@nanopsycho>
In-Reply-To: <29fbfb29-1737-1990-a6a7-b79bed7ed1fa@mojatatu.com>
Tue, Jul 11, 2017 at 02:15:27PM CEST, jhs@mojatatu.com wrote:
>Hi Jiri,
>
>Commenting on generalities - will comment on code later:
>
>On 17-07-10 02:51 PM, Jiri Pirko wrote:
>> From: Jiri Pirko <jiri@mellanox.com>
>>
>> Currently the filters added to qdiscs are independent. So for example if you
>> have 2 netdevices and you create an ingress qdisc on both and you want to add
>> identical filter rules to both, you need to add them twice. This patchset
>> makes this easier and mainly saves resources by allowing qdiscs to share all
>> of their filters - I call the shared set a "filter block". This also helps
>> to save resources when we offload to hw, for example to expensive TCAM.
>>
>> So back to the example. First, we create 2 qdiscs. Both will share
>> block number 22. "22" is just an identification. If we don't pass any
>> block number, a new one will be generated by kernel:
>>
>> $ tc qdisc add dev ens7 ingress block 22
>>                                 ^^^^^^^^
>> $ tc qdisc add dev ens8 ingress block 22
>>
>
>Above makes intuitive sense.
>
>
>>
>> Now if we list the qdiscs, we will see the block index in the output:
>> qdisc fq_codel 0: dev ens7 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
>> Sent 9014 bytes 99 pkt (dropped 0, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>> maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>> new_flows_len 0 old_flows_len 0
>> qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
>>                                               ^^^^^^^^
>> Sent 4592 bytes 58 pkt (dropped 0, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>> qdisc fq_codel 0: dev ens8 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 32Mb ecn
>> Sent 17022 bytes 307 pkt (dropped 0, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>> maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
>> new_flows_len 0 old_flows_len 0
>> qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22
>>                                               ^^^^^^^^
>> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>>
>
>So does this.
>
>>
>> Now we can add a filter to any of the qdiscs sharing the same block:
>>
>> $ tc filter add dev ens7 parent ffff: protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop
>>
>
>So for backward compat - this also makes sense. But:
>it does make sense to create new syntax for adding
>filters and actions:
>
>tc filter add block 22 protocol ip pref 25 flower \
> dst_ip 192.168.0.0/16 action drop
I was thinking about that, but decided to pass on it for now. This should
be addressed by a follow-up anyway.
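
To make the sharing semantics from the example above concrete, here is a
minimal Python sketch. It is only a toy model, not the kernel code:
FilterBlock, BlockRegistry and IngressQdisc are hypothetical names, and
starting the auto-generated block numbers at 1 is an assumption made for
the illustration.

```python
import itertools


class FilterBlock:
    """Toy model of a filter block: one rule list shared by many qdiscs."""

    def __init__(self, index):
        self.index = index
        self.filters = []  # rules are stored once per block
        self.refcnt = 0    # number of qdiscs currently using the block


class BlockRegistry:
    """Hypothetical lookup table from block number to FilterBlock."""

    def __init__(self):
        self._blocks = {}
        self._auto = itertools.count(1)  # assumed start of generated numbers

    def get(self, index=None):
        # No number passed: pick an unused one automatically.
        if index is None:
            index = next(i for i in self._auto if i not in self._blocks)
        block = self._blocks.setdefault(index, FilterBlock(index))
        block.refcnt += 1
        return block

    def put(self, block):
        # Drop one user; forget the block when the last qdisc goes away.
        block.refcnt -= 1
        if block.refcnt == 0:
            del self._blocks[block.index]


class IngressQdisc:
    def __init__(self, dev, block):
        self.dev = dev
        self.block = block

    def add_filter(self, rule):
        # Adding through any qdisc lands in the shared block.
        self.block.filters.append(rule)

    def show_filters(self):
        return list(self.block.filters)


registry = BlockRegistry()
ens7 = IngressQdisc("ens7", registry.get(22))
ens8 = IngressQdisc("ens8", registry.get(22))
ens7.add_filter("flower dst_ip 192.168.0.0/16 action drop")
print(ens8.show_filters())  # the rule added via ens7 is visible via ens8
```

The point of the model is that the rule exists once, in the block, no
matter which of the sharing qdiscs was used to add it.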
>
>Coordinates of the filter block before were:
>
><ifindex>, <parent>, [handle]
>
>You should be able to abuse struct tcmsg ifindex to represent block #
>as long as you set parent to be something meaningful that is
>identified "block coordinate" via TC_H_XXX (pick something safe not
>in use by ingress or egress; look at: uapi/linux/pkt_sched.h)
Not sure about this. I have to take a closer look. In general, I don't
like to abuse anything :)
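
For reference, this is how I read the suggestion, as a small Python sketch
that packs a struct tcmsg-shaped header. The field layout follows struct
tcmsg from the uapi headers and TC_H_INGRESS is the real value, but
TC_H_BLOCK_SKETCH is a made-up value used only for illustration (not
checked against existing TC_H_* users), and pack_target/describe_target
are hypothetical helpers.

```python
import struct

# struct tcmsg: tcm_family, two pad fields, tcm_ifindex, tcm_handle,
# tcm_parent, tcm_info
TCMSG = struct.Struct("=BBHiIII")

TC_H_INGRESS = 0xFFFFFFF1       # from uapi/linux/pkt_sched.h
TC_H_BLOCK_SKETCH = 0xFFFFFFF0  # hypothetical "block coordinate" marker


def pack_target(parent, ifindex_or_block, handle=0, info=0, family=0):
    """Build a tcmsg whose ifindex field holds an ifindex or a block number."""
    return TCMSG.pack(family, 0, 0, ifindex_or_block, handle, parent, info)


def describe_target(raw):
    """Decide how the ifindex field should be interpreted."""
    _family, _pad1, _pad2, ifindex, _handle, parent, _info = TCMSG.unpack(raw)
    if parent == TC_H_BLOCK_SKETCH:
        return ("block", ifindex)  # ifindex field reused as block number
    return ("dev", ifindex)        # classic <ifindex>, <parent> coordinates


print(describe_target(pack_target(TC_H_BLOCK_SKETCH, 22)))
print(describe_target(pack_target(TC_H_INGRESS, 3)))
```

The sketch also shows what I dislike about it: the meaning of one field
silently depends on a magic value in another.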
>
>>
>> We will see the same output if we list filters for ens7 and ens8, including stats:
>>
>> $ tc -s filter show dev ens7 root
>> filter parent ffff: protocol ip pref 25 flower
>> filter parent ffff: protocol ip pref 25 flower handle 0x1
>> eth_type ipv4
>> dst_ip 192.168.1.0/24
>> action order 1: gact action drop
>> random type none pass val 0
>> index 3 ref 1 bind 1 installed 10201 sec used 10150 sec
>> Action statistics:
>> Sent 4200 bytes 50 pkt (dropped 50, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>>
>> $ tc -s filter show dev ens8 root
>> filter dev ens7 parent ffff: protocol ip pref 25 flower
>> filter dev ens7 parent ffff: protocol ip pref 25 flower handle 0x1
>> eth_type ipv4
>> dst_ip 192.168.1.0/24
>> action order 1: gact action drop
>> random type none pass val 0
>> index 3 ref 1 bind 1 installed 10202 sec used 10152 sec
>> Action statistics:
>> Sent 4200 bytes 50 pkt (dropped 50, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>>
>>
>> Issues:
>> - tp->q is set by the device used to add the filter. That has to be resolved.
>> Impacts the dump (as you can see above)
>>
>
>I think you have more problems if the dump above is reality;->
>You added to ingress and this is showing egress.
How come? The only thing I see is that "dev x" is missing on ens7. That
is the only difference.
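
A rough Python sketch of why the two dumps differ, under the assumption
that the dump prints "dev x" only when the device recorded in the filter
differs from the device the dump was requested for. Filter and dump_line
are hypothetical names; q_dev stands in for the device behind tp->q.

```python
class Filter:
    def __init__(self, added_via_dev):
        # Stand-in for tp->q: remembers the device used to add the filter.
        self.q_dev = added_via_dev


def dump_line(filt, requested_dev):
    """Format the first line of a filter dump for the requested device."""
    prefix = "filter "
    if filt.q_dev != requested_dev:
        # The recorded device differs from the one being listed.
        prefix += f"dev {filt.q_dev} "
    return prefix + "parent ffff: protocol ip pref 25 flower"


shared = Filter("ens7")            # the rule was added through ens7
print(dump_line(shared, "ens7"))   # no "dev" shown
print(dump_line(shared, "ens8"))   # shows "dev ens7"
```

So the same filter is listed in both cases; only the device recorded at
add time leaks into the ens8 output.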
>
>To complete the thought, dump is:
>
> tc -s filter show block 22
Understood. Again, this should be addressed in a follow-up.
>
>cheers,
>jamal
>