All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
To: Vladimir Oltean <olteanv@gmail.com>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>,
	Oleksij Rempel <linux@rempel-privat.de>,
	netdev@vger.kernel.org, andrew@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	horms@kernel.org, nbd@nbd.name, sean.wang@mediatek.com,
	Mark-MC.Lee@mediatek.com, lorenzo.bianconi83@gmail.com
Subject: Re: [RFC net-next 0/5] Add ETS and TBF Qdisc offload for Airoha EN7581 SoC
Date: Tue, 17 Dec 2024 10:11:46 +0100	[thread overview]
Message-ID: <Z2FAUuOh4jrA0uGu@lore-desk> (raw)
In-Reply-To: <20241216231311.odozs4eki7bbagwp@skbuf>

[-- Attachment #1: Type: text/plain, Size: 11264 bytes --]

> On Mon, Dec 16, 2024 at 11:28:18PM +0100, Lorenzo Bianconi wrote:
> > > ndo_setup_tc_conduit() does not have the same instruments to offload
> > > what port_setup_tc() can offload. It is not involved in all data paths
> > > that port_setup_tc() has to handle. Please ack this. So if port_setup_tc()
> > 
> > Can you please elaborate on this? Both (->ndo_setup_tc_conduit() and
> > ->port_setup_tc()) refer to the same DSA user port (please take a look the
> > callback signature).
> 
> I'd be just repeating what I've said a few times before. Your proposed
> ndo_setup_tc_conduit() appears to be configuring conduit resources
> (QDMA, GDM) for mt7530 user port tc offload, as if it is in complete and
> exclusive control of the user port data path. But as long as there are
> packets in the user port data path that bypass those conduit QoS resources
> (like for example mt7530 switch forwards packet from one port to another,
> bypassing the GDM1 in your drawing[1]), it isn't a good model. Forget
> about ndo_setup_tc_conduit(), it isn't a good tc command to run in the
> first place. The tc command you're trying to make to do what you want is
> supposed to _also_ include the mt7530 packets forwarded from one port to
> another in its QoS mix. It applies at the _egress_ of the mt7530 port.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23020f04932701d5c8363e60756f12b43b8ed752
> 
> Let me try to add some squiggles based on your diagram, to clarify what
> is my understanding and complaint.
> 
>                ┌───────┐                                   ┌───────┐
>                │ QDMA2 │                                   │ QDMA1 │
>                └───┬───┘                                   └───┬───┘
>                    │                                           │
>            ┌───────▼───────────────────────────────────────────▼────────┐
>            │                                                            │
>            │       P5                                          P0       │
>            │                                                            │
>            │                                                            │
>            │                                                            │    ┌──────┐
>            │                                                         P3 ├────► GDM3 │
>            │                                                            │
> ┌─────┐    │                                                            │
> │ PPE ◄────┤ P4                        PSE                              │
> └─────┘    │                                                            │
>            │                                                            │    ┌──────┐
>            │                                                         P9 ├────► GDM4 │
>            │                                                            │    └──────┘
>            │                                                            │
>            │        P2                                         P1       │
>            └─────────┬─────────────────────────────────────────┬────────┘
>                      │                                         │
>                  ┌───▼──┐                                   ┌──▼───┐
>                  │ GDM2 │                                   │ GDM1 │
>                  └──────┘                                   └──┬───┘
>                                                                │
>                                                 ┌──────────────▼───────────────┐
>                                                 │            CPU port          │
>                                                 │   ┌─────────┘                │
>                                                 │   │         MT7530           │
>                                                 │   │                          │
>                                                 │   ▼         x                │
>                                                 │   ┌─────┐ ┌─┘                │
>                                                 │  lan1  lan2  lan3  lan4      │
>                                                 └───│──────────────────────────┘
>                                                     ▼
> 
> When you add an offloaded Qdisc to the egress of lan1, the expectation
> is that packets from lan2 obey it too (offloaded tc goes hand in hand
> with offloaded bridge). Whereas, by using GDM1/QDMA resources, you are
> breaking that expectation, because packets from lan2 bridged by MT7530
> don't go to GDM1 (the "x").

ack, I got your point. I was assuming to cover this case (traffic from lan2 to
lan1) maintaining the port_setup_tc() callback in dsa_user_setup_qdisc() (this
traffic is not managed by ndo_setup_tc_conduit() callback). If this approach is
not ok, I guess we will need to revisit the approach.

> 
> > > returns -EOPNOTSUPP, the entire dsa_user_setup_qdisc() should return
> > > -EOPNOTSUPP, UNLESS you install packet traps on all other offloaded data
> > > paths in the switch, such that all packets that egress the DSA user port
> > > are handled by ndo_setup_tc_conduit()'s instruments.
> > 
> > Uhm, do you mean we are changing the user expected result in this way?
> > It seems to me the only case we are actually changing is if port_setup_tc()
> > callback is not supported by the DSA switch driver while ndo_setup_tc_conduit()
> > one is supported by the mac chip. In this case the previous implementation
> > returns -EOPNOTSUPP while the proposed one does not report any error.
> > Do we really care about this case? If so, I guess we can rework
> > dsa_user_setup_qdisc().
> 
> See above, there's nothing to rework.
> 
> > > > nope, I am not saying the current Qdisc DSA infrastructure is wrong, it just
> > > > does not allow to exploit all hw capabilities available on EN7581 when the
> > > > traffic is routed from the WAN port to a given DSA switch port.
> > > 
> > > And I don't believe it should, in this way.
> > 
> > Can you please elaborate on this? IIUC it seems even Oleksij has a use-case
> > for this.
> 
> See above, I'm also waiting for Oleksij's answer but I don't expect you
> 2 to be talking about the same thing. If there's some common infrastructure
> to be shared, my understanding is it has nothing to do with ndo_setup_tc_conduit().

ack

> 
> > > We need something as the root Qdisc of the conduit which exposes its
> > > hardware capabilities. I just assumed that would be a simple (and single)
> > > ETS, you can correct me if I am wrong.
> > > 
> > > On conduit egress, what is the arbitration scheme between the traffic
> > > destined towards each DSA user port (channel, as the driver calls them)?
> > > How can this be best represented?
> > 
> > The EN7581 supports up to 32 different 'channels' (each of them support 8
> > different hw queues). You can define an ETS and/or TBF Qdisc for each channel.
> > My idea is to associate a channel to each DSA switch port, so the user can
> > define independent QoS policies for each DSA ports (e.g. shape at 100Mbps lan0,
> > apply ETS on lan1, ...) configuring the mac chip instead of the hw switch.
> > The kernel (if the traffic is not offloaded) or the PPE block (if the traffic
> > is offloaded) updates the channel and queue information in the DMA descriptor
> > (please take a look to [0] for the first case).
> 
> But you call it a MAC chip because between the GDM1 and the MT7530 there's
> an in-chip Ethernet MAC (GMII netlist), with a fixed packet rate, right?

With "mac chip" I mean the set of PSE/PPE and QDMA blocks in the diagram
(what is managed by airoha_eth driver). There is no other chip in between
of GDM1 and MT7530 switch (sorry for the confusion).

> I'm asking again, are the channels completely independent of one another,
> or are they consuming shared bandwidth in a way that with your proposal
> is just not visible? If there is a GMII between the GDM1 and the MT7530,
> how come the bandwidth between the channels is not shared in any way?

Channels are logically independent.
GDM1 is connected to the MT7530 switch via a fixed speed link (10Gbps, similar
to what we have for other MediaTek chipset, like MT7988 [0]). The fixed link speed
is higher than the sum of DSA port link speeds (on my development boards I have
4 DSA ports @ 1Gbps);

> And if there is no GMII or similar MAC interface, we need to take 100
> steps back and discuss why was the DSA model chosen for this switch, and
> not a freeform switchdev driver where the conduit is not discrete?
> 
> I'm not sure what to associate these channels with. Would it be wrong to
> think of each channel as a separate DSA conduit? Because for example there
> is API to customize the user port <-> conduit assignment.
> 
> > > IIUC, in your patch set, you expose the conduit hardware QoS capabilities
> > > as if they can be perfectly virtualized among DSA user ports, and as if
> > > each DSA user port can have its own ETS root Qdisc, completely independent
> > > of each other, as if the packets do not serialize on the conduit <-> CPU
> > > port link, and as if that is not a bottleneck. Is that really the case?
> > 
> > correct
> 
> Very interesting but will need more than one word for an explanation :)

I mean your paragraph above well describes hw architecture.

> 
> > > If so (but please explain how), maybe you really need your own root Qdisc
> > > driver, with one class per DSA user port, and those classes have ETS
> > > attached to them.
> > 
> > Can you please clarify what do you mean with 'root Qdisc driver'?
> 
> Quite literally, an implementer of struct Qdisc_ops whose parent can
> only be TC_H_ROOT. I was implying you'd have to create an abstract
> software model of the QoS capabilities of the QDMA and the GDM port such
> that we all understand them, a netlink attribute scheme for configuring
> those QoS parameters, and then a QoS offload mechanism through which
> they are communicated to compatible hardware. But let's leave that aside
> until it becomes more clear what you have.
> 

ack, fine.

Regards,
Lorenzo

[0] https://github.com/openwrt/openwrt/blob/main/target/linux/mediatek/files-6.6/arch/arm64/boot/dts/mediatek/mt7988a.dtsi#L1531

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

  reply	other threads:[~2024-12-17  9:11 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-11 15:31 [RFC net-next 0/5] Add ETS and TBF Qdisc offload for Airoha EN7581 SoC Lorenzo Bianconi
2024-12-11 15:31 ` [RFC net-next 1/5] net: airoha: Enable Tx drop capability for each Tx DMA ring Lorenzo Bianconi
2024-12-11 15:31 ` [RFC net-next 2/5] net: airoha: Introduce ndo_select_queue callback Lorenzo Bianconi
2024-12-11 15:31 ` [RFC net-next 3/5] net: dsa: Introduce ndo_setup_tc_conduit callback Lorenzo Bianconi
2024-12-11 15:31 ` [RFC net-next 4/5] net: airoha: Add sched ETS offload support Lorenzo Bianconi
2024-12-12 14:37   ` Davide Caratti
2024-12-12 17:04     ` Lorenzo Bianconi
2024-12-11 15:31 ` [RFC net-next 5/5] net: airoha: Add sched TBF " Lorenzo Bianconi
2024-12-11 15:41 ` [RFC net-next 0/5] Add ETS and TBF Qdisc offload for Airoha EN7581 SoC Vladimir Oltean
2024-12-12  9:19   ` Lorenzo Bianconi
2024-12-12 15:06     ` Vladimir Oltean
2024-12-12 17:03       ` Lorenzo Bianconi
2024-12-12 18:46         ` Vladimir Oltean
2024-12-16 12:09           ` Lorenzo Bianconi
2024-12-16 15:49             ` Vladimir Oltean
2024-12-16 18:14               ` Oleksij Rempel
2024-12-16 19:01               ` Lorenzo Bianconi
2024-12-16 19:23                 ` Oleksij Rempel
2024-12-16 21:44                   ` Lorenzo Bianconi
2024-12-17  8:46                     ` Oleksij Rempel
2024-12-16 19:46                 ` Vladimir Oltean
2024-12-16 22:28                   ` Lorenzo Bianconi
2024-12-16 23:13                     ` Vladimir Oltean
2024-12-17  9:11                       ` Lorenzo Bianconi [this message]
2024-12-17  9:30                         ` Vladimir Oltean
2024-12-17 10:01                           ` Lorenzo Bianconi
2024-12-17 10:17                             ` Vladimir Oltean
2024-12-17 10:23                               ` Oleksij Rempel
2024-12-16 23:24                 ` Andrew Lunn
2024-12-17  9:38                   ` Oleksij Rempel
2024-12-17 11:54                     ` Vladimir Oltean
2024-12-17 12:22                       ` Oleksij Rempel
2024-12-17 13:28                         ` Vladimir Oltean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z2FAUuOh4jrA0uGu@lore-desk \
    --to=lorenzo.bianconi@redhat.com \
    --cc=Mark-MC.Lee@mediatek.com \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux@rempel-privat.de \
    --cc=lorenzo.bianconi83@gmail.com \
    --cc=lorenzo@kernel.org \
    --cc=nbd@nbd.name \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=sean.wang@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.