From: Petr Machata <petrm@nvidia.com>
To: <Daniel.Machon@microchip.com>
Cc: <petrm@nvidia.com>, <netdev@vger.kernel.org>, <kuba@kernel.org>,
<vinicius.gomes@intel.com>, <vladimir.oltean@nxp.com>,
<thomas.petazzoni@bootlin.com>, <Allan.Nielsen@microchip.com>,
<maxime.chevallier@bootlin.com>, <roopa@nvidia.com>
Subject: Re: Basic PCP/DEI-based queue classification
Date: Wed, 24 Aug 2022 21:36:54 +0200
Message-ID: <87k06xjplj.fsf@nvidia.com>
In-Reply-To: <YwZoGJXgx/t/Qxam@DEN-LT-70577>

<Daniel.Machon@microchip.com> writes:

>> > As I hinted earlier, we could also add an entirely new PCP interface
>> > (like with maxrate), this will give us a bit more flexibility and will
>> > not crash with anything. This approach will not give us trust for DSCP,
>> > but maybe we can disregard this and go with a PCP solution initially?
>>
>> I would like to have a line of sight to how things will be done. Not
>> everything needs to be implemented at once, but we have to understand
>> how to get there when we need to. At least for issues that we can
>> already foresee now, such as the DSCP / PCP / default ordering.
>>
>> Adding the PCP rules as a new APP selector, and then expressing the
>> ordering as a "selector policy" or whatever, IMHO takes care of this
>> nicely.
>>
>> But OK, let's talk about the "flexibility" bit that you mention: what
>> does this approach make difficult or impossible?
>
> It was merely a concern of not changing too much on something that is
> already standard. Maybe I don't quite see how the APP interface can be
> extended to accommodate pcp/dei, ingress/egress and trust. Let's
> try to break it down:
>
> - pcp/dei:
> this *could* be expressed in app->protocol and map 1:1 to the
> pcp table entries, so that 8*dei+pcp:priority. If I want to map
> pcp 3, with dei 1 to priority 2, it would be encoded 11:2.

Yep. In particular something like {sel=255, pid=11, prio=2}.

iproute2 "dcb" would obviously grow brains to let you configure this
stuff semantically, so e.g.:

    # dcb app replace dev X pcp-prio 3:3 3de:2 2:2 2de:1

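To make the encoding concrete, here is a minimal sketch in Python. It is an illustration only, not iproute2 or kernel code: the selector value 255 is just the placeholder from the example above, and the function names `pcp_dei_to_pid` and `parse_pcp_prio`, as well as the exact handling of the `3de` token syntax, are assumptions.

```python
# Placeholder selector taken from the example above; not a standardized value.
PCP_SELECTOR = 255


def pcp_dei_to_pid(pcp, dei):
    """Encode a PCP/DEI pair as 8*dei + pcp, giving a value in 0..15."""
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must be in 0..7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    return 8 * dei + pcp


def parse_pcp_prio(token):
    """Parse a token such as '3de:2' into an APP-like entry (hypothetical syntax)."""
    key, prio_text = token.split(":")
    dei = 1 if key.endswith("de") else 0
    pcp = int(key[:-2] if dei else key)
    prio = int(prio_text)
    if not 0 <= prio <= 7:
        raise ValueError("priority must be in 0..7")
    return {"sel": PCP_SELECTOR, "pid": pcp_dei_to_pid(pcp, dei), "prio": prio}


entries = [parse_pcp_prio(t) for t in "3:3 3de:2 2:2 2de:1".split()]
```

With this sketch, the token `3de:2` yields the entry {sel=255, pid=11, prio=2} mentioned above.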
> - ingress/egress:
> I guess we need a selector for each? I notice that the mellanox
> driver uses the dcb_ieee_getapp_prio_dscp_mask_map and
> dcb_ieee_getapp_dscp_prio_mask_map for priority map and priority
> rewrite map, but these seem to be the same for both ingress and
> egress to me?

Ha, I was only thinking about prioritization, not about rewrite at all.

Yeah, mlxsw uses APP rules for rewrite as well. The logic is that if the
network behind port X uses DSCP value D to express priority P, then
packets with priority P leaving that port should have DSCP value of D.

Of course it doesn't work too well, because there are 8 priorities, but
64 DSCP values. So mlxsw arbitrarily chooses the highest DSCP value.
The situation is similar with PCP, where there are 16 PCP+DEI
combinations, but only 8 priorities.

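The inversion problem can be sketched like this. The snippet is a simplified Python illustration of deriving a rewrite map from a prioritization map by picking the highest DSCP value per priority; it is not the mlxsw implementation, and the function name `rewrite_map_from_prio_map` is made up for the example.

```python
def rewrite_map_from_prio_map(dscp_to_prio):
    """Invert a DSCP->priority map, keeping the highest DSCP for each priority.

    Several DSCP values can map to the same priority, so the inverse is
    ambiguous; this sketch resolves the ambiguity by taking the maximum.
    """
    prio_to_dscp = {}
    for dscp, prio in dscp_to_prio.items():
        if prio not in prio_to_dscp or dscp > prio_to_dscp[prio]:
            prio_to_dscp[prio] = dscp
    return prio_to_dscp


# Two DSCP values (10 and 12) both express priority 1 on ingress.
ingress = {10: 1, 12: 1, 46: 5}
egress = rewrite_map_from_prio_map(ingress)
```

In this example packets with priority 1 would leave with DSCP 12, even if the user would have preferred 10, which is why a separately configurable rewrite map would be useful.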
So having a way to configure rewrite would be good. But then we are very
firmly in the extension territory. This would basically need a separate
APP-like object.
> So far only subtle changes. Now how do you see trust going in? Can you
> elaborate a little on the policy selector you mentioned?

Sure. In my mind the policy is an array that describes the order in which
APP rules are applied. "default" is implicitly last.

So "trust DSCP" has a policy of just [DSCP]. "Trust PCP" of [PCP].
"Trust DSCP, then PCP" of [DSCP, PCP]. "Trust port" (i.e. just default)
is simply []. Etc.

Individual drivers validate whether their device can implement the
policy.

I expect most devices to really just support the DSCP and PCP parts, but
this is flexible in allowing more general configuration in devices that
allow it.

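A rough Python sketch of how such a policy could be evaluated is shown here. Everything in it is an assumption made for illustration: the selector names `"dscp"` and `"pcp"`, the dictionary layout of the rules, the functions `validate_policy` and `classify`, and in particular the choice to fall through to the next selector when a packet carries a field but no rule matches its value.

```python
def validate_policy(policy, supported):
    """Driver-side check: every selector is supported and none is repeated."""
    return len(set(policy)) == len(policy) and all(s in supported for s in policy)


def classify(packet, policy, app_rules, default_prio=0):
    """Return the priority for a packet by trying selectors in policy order.

    packet:    dict of header fields present in the packet, e.g. {"pcp": 3}
    policy:    ordered list of selector names, e.g. ["dscp", "pcp"]
    app_rules: {selector: {field_value: priority}}
    The default priority is implicitly last, so an empty policy means
    "trust port".
    """
    for selector in policy:
        value = packet.get(selector)  # None if the packet lacks this field
        if value is None:
            continue
        prio = app_rules.get(selector, {}).get(value)
        if prio is not None:
            return prio
    return default_prio


rules = {"dscp": {46: 5}, "pcp": {3: 2}}
ip_tagged = {"dscp": 46, "pcp": 3}   # VLAN-tagged IP packet
non_ip_tagged = {"pcp": 3}           # VLAN-tagged non-IP packet
```

Under these assumptions, `classify(ip_tagged, ["dscp", "pcp"], rules)` picks the DSCP rule, `classify(non_ip_tagged, ["dscp", "pcp"], rules)` falls back to PCP, and an empty policy always yields the default priority.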

ABI-wise it is tempting to reuse APP to assign priority to selectors in
the same way that it currently assigns priority to field values:

    # dcb app replace dev X sel-prio dscp:2 pcp:1

But that feels like a hack. It will probably be better to have a
dedicated object for this:

    # dcb app-policy set dev X sel-order dscp pcp

This can be sliced in different ways that we can bikeshed to death
later. Does the above basically address your request?