public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Jiri Pirko <jiri@resnulli.us>
To: Petr Machata <petrm@mellanox.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Jakub Kicinski <kuba@kernel.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Jiri Pirko <jiri@mellanox.com>,
	Ido Schimmel <idosch@mellanox.com>
Subject: Re: [RFC PATCH net-next 0/3] TC: Introduce qevents
Date: Mon, 1 Jun 2020 15:35:24 +0200	[thread overview]
Message-ID: <20200601133524.GP2282@nanopsycho> (raw)
In-Reply-To: <87wo4wtmnx.fsf@mellanox.com>

Thu, May 28, 2020 at 11:48:50AM CEST, petrm@mellanox.com wrote:
>
>Cong Wang <xiyou.wangcong@gmail.com> writes:
>
>> On Wed, May 27, 2020 at 2:56 AM Petr Machata <petrm@mellanox.com> wrote:
>>>
>>>
>>> Cong Wang <xiyou.wangcong@gmail.com> writes:
>>>
>>> > On Tue, May 26, 2020 at 10:11 AM Petr Machata <petrm@mellanox.com> wrote:
>>> >>
>>> >> The Spectrum hardware allows execution of one of several actions as a
>>> >> result of queue management events: tail-dropping, early-dropping, marking a
>>> >> packet, or passing a configured latency threshold or buffer size. Such
>>> >> packets can be mirrored, trapped, or sampled.
>>> >>
>>> >> Modeling the action to be taken as simply a TC action is very attractive,
>>> >> but it is not obvious where to put these actions. At least with ECN marking
>>> >> one could imagine a tree of qdiscs and classifiers that effectively
>>> >> accomplishes this task, albeit in an impractically complex manner. But
>>> >> there is just no way to match on dropped-ness of a packet, let alone
>>> >> dropped-ness due to a particular reason.
>>> >>
>>> >> To allow configuring user-defined actions as a result of inner workings of
>>> >> a qdisc, this patch set introduces a concept of qevents. Those are attach
>>> >> points for TC blocks, where filters can be put that are executed as the
>>> >> packet hits well-defined points in the qdisc algorithms. The attached
>>> >> blocks can be shared, in a manner similar to clsact ingress and egress
>>> >> blocks, arbitrary classifiers with arbitrary actions can be put on them,
>>> >> etc.
>>> >
>>> > This concept does not fit well into qdisc, essentially you still want to
>>> > install filters (and actions) somewhere on qdisc, but currently all filters
>>> > are executed at enqueue, basically you want to execute them at other
>>> > pre-defined locations too, for example early drop.
>>> >
>>> > So, perhaps adding a "position" in tc filter is better? Something like:
>>> >
>>> > tc qdisc add dev x root handle 1: ... # same as before
>>> > tc filter add dev x parent 1:0 position early_drop matchall action....
>>>
>>> Position would just be a chain index.
>>
>> Why? By position, I mean a place where we _execute_ tc filters on
>> a qdisc, currently there is only "enqueue". I don't see how this is
>> close to a chain which is basically a group of filters.
>
>OK, I misunderstood what you mean.
>
>So you propose to have further division within the block? To have sort
>of namespaces within blocks or chains, where depending on the context,
>only filters in the corresponding namespace are executed?

Please take the block as a whole entity. It has one entry ->chain0
processing. The gotos to other chains should be contained within the
block.

Please don't divide the block. If you want to process the block from a
different entry point, please process it as a whole.


>
>>>
>>> > And obviously default position must be "enqueue". Makes sense?
>>>
>>> Chain 0.
>>>
>>> So if I understand the proposal correctly: a qdisc has a classification
>>> block (cl_ops->tcf_block). Qevents then reference a chain on that
>>> classification block.
>>
>> No, I am suggesting to replace your qevents with position, because
>> as I said it does not fit well there.
>>
>>>
>>> If the above is correct, I disagree that this is a better model. RED
>>> does not need to classify anything, modelling this as a classification
>>> block is wrong. It should be another block. (Also shareable, because as
>>> an operator you likely want to treat all, say, early drops the same, and
>>> therefore to add just one rule for all 128 or so of your qdiscs.)
>>
>> You can still choose not to classify anything by using matchall. No
>> one is saying you have to do classification.
>
>The point here is filter reuse, not classification.
>
>> If you want to jump to a block, you can just use a goto action like
>
>I don't think you can jump to a block. You can jump to a chain within
>the same block.

Correct.

entrypoint->
   block X
     chain 0
     chain A
     chain B
     chain ...

There's no goto/jump out of the block.

>
>> normal. So your above example can be replaced with:
>>
>> # tc qdisc add dev eth0 root handle 1: \
>>         red limit 500K avpkt 1K
>>
>> # tc filter add dev eth0 parent 1:0 position early_drop matchall \
>>    action goto chain 10
>>
>> # tc chain add dev eth0 index 10 ...
>>
>> See the difference?

  parent reply	other threads:[~2020-06-01 13:35 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-26 17:10 [RFC PATCH net-next 0/3] TC: Introduce qevents Petr Machata
2020-05-26 17:10 ` [RFC PATCH net-next 1/3] net: sched: Introduce helpers for qevent blocks Petr Machata
2020-05-26 17:10 ` [RFC PATCH net-next 2/3] net: sched: sch_red: Split init and change callbacks Petr Machata
2020-05-26 18:32   ` Jakub Kicinski
2020-05-27 15:23     ` Petr Machata
2020-05-26 17:10 ` [RFC PATCH net-next 3/3] net: sched: sch_red: Add qevents "early" and "mark" Petr Machata
2020-05-26 17:10 ` [RFC PATCH iproute2-next 1/4] uapi: pkt_sched: Add two new RED attributes Petr Machata
2020-05-26 17:10   ` [RFC PATCH iproute2-next 2/4] tc: Add helpers to support qevent parsing Petr Machata
2020-05-26 17:10   ` [RFC PATCH iproute2-next 3/4] man: tc: Describe qevents Petr Machata
2020-05-26 17:10   ` [RFC PATCH iproute2-next 4/4] tc: q_red: Add support for qevents "mark" and "early" Petr Machata
2020-05-27  4:09 ` [RFC PATCH net-next 0/3] TC: Introduce qevents Cong Wang
2020-05-27  9:56   ` Petr Machata
2020-05-28  4:00     ` Cong Wang
2020-05-28  9:48       ` Petr Machata
2020-05-30  4:48         ` Cong Wang
2020-05-30  8:55           ` Petr Machata
2020-06-01 20:01             ` Cong Wang
2020-06-01 13:35         ` Jiri Pirko [this message]
2020-06-01 13:40   ` Jiri Pirko
2020-06-01 19:50     ` Cong Wang
2020-06-01 22:37       ` Petr Machata
2020-06-02  6:05       ` Jiri Pirko
2020-06-03  7:05         ` Cong Wang
2020-06-03 10:08           ` Petr Machata
2020-05-27 15:19 ` Vladimir Oltean
2020-05-27 16:25   ` Petr Machata

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200601133524.GP2282@nanopsycho \
    --to=jiri@resnulli.us \
    --cc=eric.dumazet@gmail.com \
    --cc=idosch@mellanox.com \
    --cc=jhs@mojatatu.com \
    --cc=jiri@mellanox.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=petrm@mellanox.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox