From: Jiri Pirko <jiri@resnulli.us>
To: Petr Machata <petrm@mellanox.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
Jakub Kicinski <kuba@kernel.org>,
Eric Dumazet <eric.dumazet@gmail.com>,
Jamal Hadi Salim <jhs@mojatatu.com>,
Jiri Pirko <jiri@mellanox.com>,
Ido Schimmel <idosch@mellanox.com>
Subject: Re: [RFC PATCH net-next 0/3] TC: Introduce qevents
Date: Mon, 1 Jun 2020 15:35:24 +0200 [thread overview]
Message-ID: <20200601133524.GP2282@nanopsycho> (raw)
In-Reply-To: <87wo4wtmnx.fsf@mellanox.com>
Thu, May 28, 2020 at 11:48:50AM CEST, petrm@mellanox.com wrote:
>
>Cong Wang <xiyou.wangcong@gmail.com> writes:
>
>> On Wed, May 27, 2020 at 2:56 AM Petr Machata <petrm@mellanox.com> wrote:
>>>
>>>
>>> Cong Wang <xiyou.wangcong@gmail.com> writes:
>>>
>>> > On Tue, May 26, 2020 at 10:11 AM Petr Machata <petrm@mellanox.com> wrote:
>>> >>
>>> >> The Spectrum hardware allows execution of one of several actions as a
>>> >> result of queue management events: tail-dropping, early-dropping, marking a
>>> >> packet, or passing a configured latency threshold or buffer size. Such
>>> >> packets can be mirrored, trapped, or sampled.
>>> >>
>>> >> Modeling the action to be taken as simply a TC action is very attractive,
>>> >> but it is not obvious where to put these actions. At least with ECN marking
>>> >> one could imagine a tree of qdiscs and classifiers that effectively
>>> >> accomplishes this task, albeit in an impractically complex manner. But
>>> >> there is just no way to match on dropped-ness of a packet, let alone
>>> >> dropped-ness due to a particular reason.
>>> >>
>>> >> To allow configuring user-defined actions as a result of inner workings of
>>> >> a qdisc, this patch set introduces a concept of qevents. Those are attach
>>> >> points for TC blocks, where filters can be put that are executed as the
>>> >> packet hits well-defined points in the qdisc algorithms. The attached
>>> >> blocks can be shared, in a manner similar to clsact ingress and egress
>>> >> blocks, arbitrary classifiers with arbitrary actions can be put on them,
>>> >> etc.
>>> >
>>> > This concept does not fit well into qdisc, essentially you still want to
>>> > install filters (and actions) somewhere on qdisc, but currently all filters
>>> > are executed at enqueue, basically you want to execute them at other
>>> > pre-defined locations too, for example early drop.
>>> >
>>> > So, perhaps adding a "position" in tc filter is better? Something like:
>>> >
>>> > tc qdisc add dev x root handle 1: ... # same as before
>>> > tc filter add dev x parent 1:0 position early_drop matchall action....
>>>
>>> Position would just be a chain index.
>>
>> Why? By position, I mean a place where we _execute_ tc filters on
>> a qdisc, currently there is only "enqueue". I don't see how this is
>> close to a chain which is basically a group of filters.
>
>OK, I misunderstood what you mean.
>
>So you propose to have further division within the block? To have sort
>of namespaces within blocks or chains, where depending on the context,
>only filters in the corresponding namespace are executed?
Please take the block as a whole entity. It has one entry ->chain0
processing. The gotos to other chains should be contained within the
block.
Please don't divide the block. If you want to process the block from a
different entry point, please process it as a whole.
>
>>>
>>> > And obviously default position must be "enqueue". Makes sense?
>>>
>>> Chain 0.
>>>
>>> So if I understand the proposal correctly: a qdisc has a classification
>>> block (cl_ops->tcf_block). Qevents then reference a chain on that
>>> classification block.
>>
>> No, I am suggesting to replace your qevents with position, because
>> as I said it does not fit well there.
>>
>>>
>>> If the above is correct, I disagree that this is a better model. RED
>>> does not need to classify anything, modelling this as a classification
>>> block is wrong. It should be another block. (Also shareable, because as
>>> an operator you likely want to treat all, say, early drops the same, and
>>> therefore to add just one rule for all 128 or so of your qdiscs.)
>>
>> You can still choose not to classify anything by using matchall. No
>> one is saying you have to do classification.
>
>The point here is filter reuse, not classification.
>
>> If you want to jump to a block, you can just use a goto action like
>
>I don't think you can jump to a block. You can jump to a chain within
>the same block.
Correct.
entrypoint->
block X
chain 0
chain A
chain B
chain ...
There's no goto/jump out of the block.
>
>> normal. So your above example can be replaced with:
>>
>> # tc qdisc add dev eth0 root handle 1: \
>> red limit 500K avpkt 1K
>>
>> # tc filter add dev eth0 parent 1:0 position early_drop matchall \
>> action goto chain 10
>>
>> # tc chain add dev eth0 index 10 ...
>>
>> See the difference?
next prev parent reply other threads:[~2020-06-01 13:35 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-26 17:10 [RFC PATCH net-next 0/3] TC: Introduce qevents Petr Machata
2020-05-26 17:10 ` [RFC PATCH net-next 1/3] net: sched: Introduce helpers for qevent blocks Petr Machata
2020-05-26 17:10 ` [RFC PATCH net-next 2/3] net: sched: sch_red: Split init and change callbacks Petr Machata
2020-05-26 18:32 ` Jakub Kicinski
2020-05-27 15:23 ` Petr Machata
2020-05-26 17:10 ` [RFC PATCH net-next 3/3] net: sched: sch_red: Add qevents "early" and "mark" Petr Machata
2020-05-26 17:10 ` [RFC PATCH iproute2-next 1/4] uapi: pkt_sched: Add two new RED attributes Petr Machata
2020-05-26 17:10 ` [RFC PATCH iproute2-next 2/4] tc: Add helpers to support qevent parsing Petr Machata
2020-05-26 17:10 ` [RFC PATCH iproute2-next 3/4] man: tc: Describe qevents Petr Machata
2020-05-26 17:10 ` [RFC PATCH iproute2-next 4/4] tc: q_red: Add support for qevents "mark" and "early" Petr Machata
2020-05-27 4:09 ` [RFC PATCH net-next 0/3] TC: Introduce qevents Cong Wang
2020-05-27 9:56 ` Petr Machata
2020-05-28 4:00 ` Cong Wang
2020-05-28 9:48 ` Petr Machata
2020-05-30 4:48 ` Cong Wang
2020-05-30 8:55 ` Petr Machata
2020-06-01 20:01 ` Cong Wang
2020-06-01 13:35 ` Jiri Pirko [this message]
2020-06-01 13:40 ` Jiri Pirko
2020-06-01 19:50 ` Cong Wang
2020-06-01 22:37 ` Petr Machata
2020-06-02 6:05 ` Jiri Pirko
2020-06-03 7:05 ` Cong Wang
2020-06-03 10:08 ` Petr Machata
2020-05-27 15:19 ` Vladimir Oltean
2020-05-27 16:25 ` Petr Machata
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200601133524.GP2282@nanopsycho \
--to=jiri@resnulli.us \
--cc=eric.dumazet@gmail.com \
--cc=idosch@mellanox.com \
--cc=jhs@mojatatu.com \
--cc=jiri@mellanox.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=petrm@mellanox.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox