netdev.vger.kernel.org archive mirror
* [RFC PATCH net-next 00/11] ENETC mqprio/taprio cleanup
@ 2023-01-20 14:15 Vladimir Oltean
From: Vladimir Oltean @ 2023-01-20 14:15 UTC
  To: netdev, John Fastabend
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Claudiu Manoil, Camelia Groza, Xiaoliang Yang, Gerhard Engleder,
	Vinicius Costa Gomes, Alexander Duyck, Kurt Kanzenbach,
	Ferenc Fejes, Tony Nguyen, Jesse Brandeburg, Jacob Keller

I realize that this patch set will start a flame war, but there are
things about the mqprio qdisc that I simply don't understand, so in an
attempt to explain how I think things should be done, I've made some
patches to the code. I hope the reviewers will be patient enough with me :)

I need to touch mqprio because I'm preparing a patch set for Frame
Preemption (an IEEE 802.1Q feature). A disagreement started with
Vinicius here:
https://patchwork.kernel.org/project/netdevbpf/patch/20220816222920.1952936-3-vladimir.oltean@nxp.com/#24976672

regarding how TX packet prioritization should be handled. Vinicius said
that for some Intel NICs, prioritization at the egress scheduler stage
is fundamentally attached to TX queues rather than traffic classes.

In other words, in the "popular" mqprio configuration documented by him:

$ tc qdisc replace dev $IFACE parent root handle 100 mqprio \
      num_tc 3 \
      map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
      queues 1@0 1@1 2@2 \
      hw 0

there are 3 Linux traffic classes and 4 TX queues. The TX queues are
organized in strict priority fashion, like this: TXQ 0 has highest prio
(hardware dequeue precedence for TX scheduler), TXQ 3 has lowest prio.
Packets classified by Linux to TC 2 are hashed between TXQ 2 and TXQ 3,
but the hardware gives TXQ 2 precedence over TXQ 3, and Linux
doesn't know that.
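
To make the mapping concrete, here is a minimal Python sketch (not kernel
code) that decodes the "map" and "queues" arguments of the command above.
The helper names txqs_for_tc() and txqs_for_priority() are hypothetical and
exist only for this illustration.

```python
# Values copied from the tc command above.
PRIO_TO_TC = [2, 2, 1, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]  # "map"
QUEUES = [(1, 0), (1, 1), (2, 2)]  # "queues" as (count, offset), one per TC


def txqs_for_tc(tc):
    """Return the TX queue indices that belong to a traffic class."""
    count, offset = QUEUES[tc]
    return list(range(offset, offset + count))


def txqs_for_priority(prio):
    """Return the TX queues a packet of the given priority may be hashed to."""
    return txqs_for_tc(PRIO_TO_TC[prio])


for tc in range(len(QUEUES)):
    print(f"TC {tc} -> TXQs {txqs_for_tc(tc)}")
```

TC 2 is the only class that spans two queues (TXQ 2 and TXQ 3), which is
where the per-TXQ hardware precedence becomes invisible to Linux.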

I am surprised by this fact, and this isn't how ENETC works at all.
For ENETC, we try to prioritize on TCs rather than TXQs, and TC 7 has
higher priority than TC 0. For us, groups of TXQs that map to the same
TC have the same egress scheduling priority. It is possible (and maybe
useful) to have 2 TXQs per TC (one TXQ per CPU). Patch 07/11 tries to
make that more clear.
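
As an illustration of the per-TC model described here, the following Python
sketch (not driver code) assigns every TXQ the priority of the TC it belongs
to. It assumes that a higher TC number means higher priority; the function
name per_tc_priority() and the 2-TXQs-per-TC layout are hypothetical choices
for the example.

```python
def per_tc_priority(queues):
    """Map each TXQ index to the scheduling priority of its owning TC.

    queues is a list of (count, offset) pairs, one per TC, in the same
    form as the mqprio "queues" argument. Assumption for this sketch: the
    TC number is used directly as the priority, so TXQs in one TC tie.
    """
    priority = {}
    for tc, (count, offset) in enumerate(queues):
        for txq in range(offset, offset + count):
            priority[txq] = tc
    return priority


# Hypothetical layout: 2 TCs with 2 TXQs each (for example one TXQ per CPU).
prio = per_tc_priority([(2, 0), (2, 2)])
print(prio)
```

With this layout TXQ 0 and TXQ 1 share one priority, TXQ 2 and TXQ 3 share a
higher one, and no queue outranks another queue of the same TC.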

Furthermore (and this is really the biggest point of contention),
Vinicius and I fundamentally disagree on whether the 802.1Qbv
(taprio) gate mask should be passed to the device driver per TXQ or per
TC. This is what patch 11/11 is about.
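
The difference between the two representations can be sketched in Python
(this is not the sch_taprio code). The hypothetical helper
tc_mask_to_txq_mask() expands a gate mask with one bit per TC into a mask
with one bit per TXQ, using the count@offset layout from the mqprio example
earlier in this message.

```python
def tc_mask_to_txq_mask(tc_mask, queues):
    """Expand a per-TC gate mask into a per-TXQ gate mask.

    queues is a list of (count, offset) pairs, one per TC. Every TXQ that
    belongs to a TC whose gate bit is set gets its own bit set.
    """
    txq_mask = 0
    for tc, (count, offset) in enumerate(queues):
        if tc_mask & (1 << tc):
            for txq in range(offset, offset + count):
                txq_mask |= 1 << txq
    return txq_mask


QUEUES = [(1, 0), (1, 1), (2, 2)]  # queues 1@0 1@1 2@2

# Gate open only for TC 2 -> bits for TXQ 2 and TXQ 3.
print(bin(tc_mask_to_txq_mask(0b100, QUEUES)))
```

A driver that schedules per TC can consume the 3-bit mask directly, while a
driver that schedules per TXQ needs the expanded 4-bit form.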

Again, I'm not *certain* that my opinion on this topic is correct
(and it sure is confusing to see such a different approach for Intel).
But I would appreciate any feedback.

Vladimir Oltean (11):
  net/sched: mqprio: refactor nlattr parsing to a separate function
  net/sched: mqprio: refactor offloading and unoffloading to dedicated
    functions
  net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to
    pkt_sched.h
  net/sched: mqprio: allow offloading drivers to request queue count
    validation
  net/sched: mqprio: add extack messages for queue count validation
  net: enetc: request mqprio to validate the queue counts
  net: enetc: act upon the requested mqprio queue configuration
  net/sched: taprio: pass mqprio queue configuration to ndo_setup_tc()
  net: enetc: act upon mqprio queue config in taprio offload
  net/sched: taprio: validate that gate mask does not exceed number of
    TCs
  net/sched: taprio: only calculate gate mask per TXQ for igc

 drivers/net/ethernet/freescale/enetc/enetc.c  |  67 ++--
 .../net/ethernet/freescale/enetc/enetc_qos.c  |  27 +-
 drivers/net/ethernet/intel/igc/igc_main.c     |  17 +
 include/net/pkt_cls.h                         |  10 -
 include/net/pkt_sched.h                       |  16 +
 net/sched/sch_mqprio.c                        | 298 +++++++++++-------
 net/sched/sch_taprio.c                        |  57 ++--
 7 files changed, 310 insertions(+), 182 deletions(-)

-- 
2.34.1

Thread overview: 25+ messages
2023-01-20 14:15 [RFC PATCH net-next 00/11] ENETC mqprio/taprio cleanup Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 01/11] net/sched: mqprio: refactor nlattr parsing to a separate function Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 02/11] net/sched: mqprio: refactor offloading and unoffloading to dedicated functions Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 03/11] net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.h Vladimir Oltean
2023-01-25 13:09   ` Kurt Kanzenbach
2023-01-25 13:16     ` Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 04/11] net/sched: mqprio: allow offloading drivers to request queue count validation Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 05/11] net/sched: mqprio: add extack messages for " Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 06/11] net: enetc: request mqprio to validate the queue counts Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 07/11] net: enetc: act upon the requested mqprio queue configuration Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 08/11] net/sched: taprio: pass mqprio queue configuration to ndo_setup_tc() Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 09/11] net: enetc: act upon mqprio queue config in taprio offload Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 10/11] net/sched: taprio: validate that gate mask does not exceed number of TCs Vladimir Oltean
2023-01-20 14:15 ` [RFC PATCH net-next 11/11] net/sched: taprio: only calculate gate mask per TXQ for igc Vladimir Oltean
2023-01-25  1:11   ` Vinicius Costa Gomes
2023-01-23 18:22 ` [RFC PATCH net-next 00/11] ENETC mqprio/taprio cleanup Jacob Keller
2023-01-24 14:26   ` Vladimir Oltean
2023-01-24 22:30     ` Jacob Keller
2023-01-23 21:21 ` Gerhard Engleder
2023-01-23 21:31   ` Vladimir Oltean
2023-01-23 22:20     ` Gerhard Engleder
2023-01-25  1:11 ` Vinicius Costa Gomes
2023-01-25 13:10   ` Vladimir Oltean
2023-01-25 22:47     ` Vinicius Costa Gomes
2023-01-26 20:39       ` Vladimir Oltean