public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net 0/4] Add some history to the DSA documentation
@ 2023-12-08 19:35 Vladimir Oltean
  2023-12-08 19:35 ` [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism Vladimir Oltean
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-08 19:35 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Florian Fainelli, Luiz Angelo Daros de Luca,
	Alvin Šipraga, Madhuri Sripada, Marcin Wojtas, Linus Walleij,
	Tobias Waldekranz, Arun Ramadoss, Russell King (Oracle),
	Jonathan Corbet

I'm not a historian by any means, but I think the only way of explaining
where DSA is today is by looking at where it's coming from.

Some mailing list discussions with Sean Nyekjaer / Madhuri Sripada /
Luiz Luca / Alvin Šipraga / Andrew Lunn over the past weeks made me
realize that we don't all have the same perspective on things. Some more
documentation could potentially help.

Vladimir Oltean (4):
  docs: net: dsa: document the tagger-owned storage mechanism
  docs: net: dsa: update platform_data documentation
  docs: net: dsa: update user MDIO bus documentation
  docs: net: dsa: replace TODO section with info about history and devel
    ideas

 Documentation/networking/dsa/dsa.rst | 302 ++++++++++++++++++++++++---
 1 file changed, 274 insertions(+), 28 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism
  2023-12-08 19:35 [PATCH net 0/4] Add some history to the DSA documentation Vladimir Oltean
@ 2023-12-08 19:35 ` Vladimir Oltean
  2023-12-08 22:15   ` Linus Walleij
                     ` (2 more replies)
  2023-12-08 19:35 ` [PATCH net 2/4] docs: net: dsa: update platform_data documentation Vladimir Oltean
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-08 19:35 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Florian Fainelli, Luiz Angelo Daros de Luca,
	Alvin Šipraga, Madhuri Sripada, Marcin Wojtas, Linus Walleij,
	Tobias Waldekranz, Arun Ramadoss, Russell King (Oracle),
	Jonathan Corbet

Introduced 2 years ago in commit dc452a471dba ("net: dsa: introduce
tagger-owned storage for private and shared data"), the tagger-owned
storage mechanism has recently sparked some discussions which denote a
general lack of developer understanding / awareness of it. There was
also a bug in the ksz switch driver which indicates the same thing.

Admittedly, it is also not obvious to see the design constraints that
led to the creation of such a complicated mechanism.

Here are some paragraphs that explain what it's about.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 Documentation/networking/dsa/dsa.rst | 59 ++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
index 7b2e69cd7ef0..0c326a42eb81 100644
--- a/Documentation/networking/dsa/dsa.rst
+++ b/Documentation/networking/dsa/dsa.rst
@@ -221,6 +221,44 @@ receive all frames regardless of the value of the MAC DA. This can be done by
 setting the ``promisc_on_conduit`` property of the ``struct dsa_device_ops``.
 Note that this assumes a DSA-unaware conduit driver, which is the norm.
 
+Separation between tagging protocol and switch drivers
+------------------------------------------------------
+
+Sometimes it is desirable to test the behavior of a given conduit interface
+with a given switch protocol, to see how it responds to checksum offloading,
+padding with tail tags, increased MTU, how the hardware parser sees DSA-tagged
+frames, etc.
+
+To achieve that, any tagging protocol driver may be used with ``dsa_loop``
+(this requires modifying the ``dsa_loop_get_protocol()`` function
+implementation). Therefore, tagging protocol drivers must not assume that they
+are used only in conjunction with a particular switch driver. Concretely, the
+tagging protocol driver should make no assumptions about the type of
+``ds->priv``, and its core functionality should only rely on the data
+structures offered by the DSA core for all switches (``struct dsa_switch``,
+``struct dsa_port`` etc).
+
+Additionally, tagging protocol drivers must not depend on symbols exported by
+any particular switch control path driver. Doing so would create a circular
+dependency, because DSA, on behalf of the switch driver, already requests the
+appropriate tagging protocol driver module to be loaded.
+
+Nonetheless, there are exceptional situations when switch-specific processing
+is required in a tagging protocol driver. In some cases the tagger needs a
+place to hold state; in other cases, the packet transmission procedure may
+involve accessing switch registers. The tagger may also be processing packets
+which are not destined for the network stack but for the switch driver's
+management logic, and thus, the switch driver should have a handler for these
+management frames.
+
+A mechanism, called tagger-owned storage (in reference to ``ds->tagger_data``),
+exists, which permits tagging protocol drivers to allocate memory for each
+switch that they connect to. Each tagging protocol driver may define its own
+contract with switch drivers as to what this data structure contains.
+Through the ``struct dsa_device_ops`` methods ``connect()`` and ``disconnect()``,
+tagging protocol drivers are given the possibility to manage the
+``ds->tagger_data`` pointer of any switch that they connect to.
+
 Conduit network devices
 -----------------------
 
@@ -624,6 +662,27 @@ Switch configuration
   case, further calls to ``get_tag_protocol`` should report the protocol in
   current use.
 
+- ``connect_tag_protocol``: optional method to notify the switch driver that a
+  tagging protocol driver has connected to this switch. Depending on the
+  contract established by the protocol given in the ``proto`` argument, the
+  tagger-owned storage (``ds->tagger_data``) may be expected to contain a
+  pointer to a data structure specific to the tagging protocol. This data
+  structure may contain function pointers to packet handlers that the switch
+  driver registers with the tagging protocol. If interested in these packets,
+  the switch driver must cast the ``ds->tagger_data`` pointer to the data type
+  established by the tagging protocol, and assign the packet handler function
+  pointers to methods that it owns. Since the memory pointed to by
+  ``ds->tagger_data`` is owned by the tagging protocol, the switch driver must
+  assume by convention that it has been allocated, and this method is only
+  provided for making initial adjustments to the contents of ``ds->tagger_data``.
+  It is also the reason why no ``disconnect_tag_protocol()`` counterpart is
+  provided. Additionally, a tagging protocol driver which makes use of
+  tagger-owned storage must not assume that the connected switch has
+  implemented the ``connect_tag_protocol()`` method (it may connect to a
+  ``dsa_loop`` switch, which does not). Therefore, a tagging protocol may
+  always rely on ``ds->tagger_data``, but it must treat the packet handlers
+  provided by the switch in this method as optional.
+
 - ``setup``: setup function for the switch, this function is responsible for setting
   up the ``dsa_switch_ops`` private structure with all it needs: register maps,
   interrupts, mutexes, locks, etc. This function is also expected to properly
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net 2/4] docs: net: dsa: update platform_data documentation
  2023-12-08 19:35 [PATCH net 0/4] Add some history to the DSA documentation Vladimir Oltean
  2023-12-08 19:35 ` [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism Vladimir Oltean
@ 2023-12-08 19:35 ` Vladimir Oltean
  2023-12-08 22:19   ` Linus Walleij
                     ` (2 more replies)
  2023-12-08 19:35 ` [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation Vladimir Oltean
  2023-12-08 19:35 ` [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas Vladimir Oltean
  3 siblings, 3 replies; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-08 19:35 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Florian Fainelli, Luiz Angelo Daros de Luca,
	Alvin Šipraga, Madhuri Sripada, Marcin Wojtas, Linus Walleij,
	Tobias Waldekranz, Arun Ramadoss, Russell King (Oracle),
	Jonathan Corbet

We were documenting a bunch of stuff which was removed in commit
93e86b3bc842 ("net: dsa: Remove legacy probing support"). There's some
further cleanup to do in struct dsa_chip_data, so rather than describing
every possible field (when maybe we should be switching to kerneldoc
format), just say what's important about it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 Documentation/networking/dsa/dsa.rst | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
index 0c326a42eb81..676c92136a0e 100644
--- a/Documentation/networking/dsa/dsa.rst
+++ b/Documentation/networking/dsa/dsa.rst
@@ -413,18 +413,17 @@ PHYs, external PHYs, or even external switches.
 Data structures
 ---------------
 
-DSA data structures are defined in ``include/net/dsa.h`` as well as
-``net/dsa/dsa_priv.h``:
-
-- ``dsa_chip_data``: platform data configuration for a given switch device,
-  this structure describes a switch device's parent device, its address, as
-  well as various properties of its ports: names/labels, and finally a routing
-  table indication (when cascading switches)
-
-- ``dsa_platform_data``: platform device configuration data which can reference
-  a collection of dsa_chip_data structures if multiple switches are cascaded,
-  the conduit network device this switch tree is attached to needs to be
-  referenced
+DSA data structures are defined in ``include/linux/platform_data/dsa.h``,
+``include/net/dsa.h`` as well as ``net/dsa/dsa_priv.h``:
+
+- ``dsa_chip_data``: platform data configuration for a given switch device.
+  Most notably, it is necessary to the DSA core because it holds a reference to
+  the conduit interface. It must be accessible through the
+  ``ds->dev->platform_data`` pointer at ``dsa_register_switch()`` time. It is
+  populated by board-specific code. The hardware switch driver may also have
+  its own portion of ``platform_data`` description. In that case,
+  ``ds->dev->platform_data`` can point to a switch-specific structure, which
+  encapsulates ``struct dsa_chip_data`` as its first element.
 
 - ``dsa_switch_tree``: structure assigned to the conduit network device under
   ``dsa_ptr``, this structure references a dsa_platform_data structure as well as
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation
  2023-12-08 19:35 [PATCH net 0/4] Add some history to the DSA documentation Vladimir Oltean
  2023-12-08 19:35 ` [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism Vladimir Oltean
  2023-12-08 19:35 ` [PATCH net 2/4] docs: net: dsa: update platform_data documentation Vladimir Oltean
@ 2023-12-08 19:35 ` Vladimir Oltean
  2023-12-08 22:22   ` Linus Walleij
                     ` (2 more replies)
  2023-12-08 19:35 ` [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas Vladimir Oltean
  3 siblings, 3 replies; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-08 19:35 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Florian Fainelli, Luiz Angelo Daros de Luca,
	Alvin Šipraga, Madhuri Sripada, Marcin Wojtas, Linus Walleij,
	Tobias Waldekranz, Arun Ramadoss, Russell King (Oracle),
	Jonathan Corbet

There are people who are trying to push the ds->user_mii_bus feature
past its sell-by date. I think part of the problem is the fact that the
documentation presents it as this great functionality.

Adapt it to 2023, where we have phy-handle to render it useless, at
least with OF.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 Documentation/networking/dsa/dsa.rst | 36 ++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
index 676c92136a0e..2cd91358421e 100644
--- a/Documentation/networking/dsa/dsa.rst
+++ b/Documentation/networking/dsa/dsa.rst
@@ -397,19 +397,41 @@ perspective::
 User MDIO bus
 -------------
 
-In order to be able to read to/from a switch PHY built into it, DSA creates an
-user MDIO bus which allows a specific switch driver to divert and intercept
-MDIO reads/writes towards specific PHY addresses. In most MDIO-connected
-switches, these functions would utilize direct or indirect PHY addressing mode
-to return standard MII registers from the switch builtin PHYs, allowing the PHY
-library and/or to return link status, link partner pages, auto-negotiation
-results, etc.
+The framework creates an MDIO bus for user ports (``ds->user_mii_bus``) when
+both methods ``ds->ops->phy_read()`` and ``ds->ops->phy_write()`` are present.
+However, this pointer may also be populated by the switch driver during the
+``ds->ops->setup()`` method, with an MDIO bus managed by the driver.
+
+Its role is to permit user ports to connect to a PHY (usually internal) when
+the more general ``phy-handle`` property is unavailable (either because the
+MDIO bus is missing from the OF description, or because probing uses
+``platform_data``).
+
+In most MDIO-connected switches, these functions would utilize direct or
+indirect PHY addressing mode to return standard MII registers from the switch
+builtin PHYs, allowing the PHY library and/or to return link status, link
+partner pages, auto-negotiation results, etc.
 
 For Ethernet switches which have both external and internal MDIO buses, the
 user MII bus can be utilized to mux/demux MDIO reads and writes towards either
 internal or external MDIO devices this switch might be connected to: internal
 PHYs, external PHYs, or even external switches.
 
+When using OF, the ``ds->user_mii_bus`` can be seen as a legacy feature, rather
+than core functionality. Since 2014, the DSA OF bindings support the
+``phy-handle`` property, which is a universal mechanism to reference a PHY,
+be it internal or external.
+
+New switch drivers are encouraged to require the more universal ``phy-handle``
+property even for user ports with internal PHYs. This allows device trees to
+interoperate with simpler variants of the drivers such as those from U-Boot,
+which do not have the (redundant) fallback logic for ``ds->user_mii_bus``.
+
+The only use case for ``ds->user_mii_bus`` in new drivers would be for probing
+on non-OF through ``platform_data``. In the distant future where this will be
+possible through software nodes, there will be no need for ``ds->user_mii_bus``
+in new drivers at all.
+
 Data structures
 ---------------
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas
  2023-12-08 19:35 [PATCH net 0/4] Add some history to the DSA documentation Vladimir Oltean
                   ` (2 preceding siblings ...)
  2023-12-08 19:35 ` [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation Vladimir Oltean
@ 2023-12-08 19:35 ` Vladimir Oltean
  2023-12-08 22:40   ` Linus Walleij
  2023-12-08 23:03   ` Florian Fainelli
  3 siblings, 2 replies; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-08 19:35 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Florian Fainelli, Luiz Angelo Daros de Luca,
	Alvin Šipraga, Madhuri Sripada, Marcin Wojtas, Linus Walleij,
	Tobias Waldekranz, Arun Ramadoss, Russell King (Oracle),
	Jonathan Corbet

It was a bit unclear to me what the TODO is about and what is even
actionable about it. I had a discussion with Florian about it at NetConf
2023, and it seems that it's about the amount of boilerplate code that
exists in switchdev drivers, and how that could be maybe made common
with DSA, through something like another library.

I think we are seeing a lot of people who are looking at DSA now,
and there is a lot of misunderstanding about why things are the way
they are, and which are the changes that would benefit the subsystem,
compared to the changes that go against DSA's past development trend.

I think what is missing is the context, which is admittedly a bit
hard to grasp considering there are 15 years of development.
Based on the git log and on the discussions where I participated,
I tried to cobble up a section about DSA's history. Here and there,
I've mentioned the limitations that I am aware of, and some possible
ways forward.

I'm also personally surprised by the amount of change in DSA over the
years, and I hope that putting things into perspective will also
encourage others to not think that it's set in stone the way it is now.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 Documentation/networking/dsa/dsa.rst | 186 +++++++++++++++++++++++++--
 1 file changed, 176 insertions(+), 10 deletions(-)

diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
index 2cd91358421e..305bb471fc02 100644
--- a/Documentation/networking/dsa/dsa.rst
+++ b/Documentation/networking/dsa/dsa.rst
@@ -1200,14 +1200,180 @@ methods must be implemented:
 - ``port_hsr_leave``: function invoked when a given switch port leaves a
   DANP/DANH and returns to normal operation as a standalone port.
 
-TODO
-====
-
-Making SWITCHDEV and DSA converge towards an unified codebase
--------------------------------------------------------------
+History and development directions
+==================================
 
-SWITCHDEV properly takes care of abstracting the networking stack with offload
-capable hardware, but does not enforce a strict switch device driver model. On
-the other DSA enforces a fairly strict device driver model, and deals with most
-of the switch specific. At some point we should envision a merger between these
-two subsystems and get the best of both worlds.
+This section gives some background context to the developers who are eager to
+make changes to the DSA core (``net/dsa/`` excepting ``tag_*.c`` files).
+
+Initially (2008), the DSA core was a platform driver for a platform device
+representing the switch tree. Support for hardware switch chips lived in
+separate modules, which were required to call the ``register_switch_driver()``
+method. The tree platform device was instantiated through ``platform_data``.
+
+Later (2013), the DSA core gained support for OF bindings. The initial bindings
+(now no longer supported) expected a compatible string of "marvell,dsa" or
+"brcm,bcm7445-switch-v4.0" for the tree, and, like the implementation of the
+DSA core, also expected that all switches in the tree are MDIO devices on an
+``mii_bus``. The initial bindings and core driver implementation first assumed
+that all switches in the tree are connected to the same MDIO bus, then in 2015,
+they were augmented such that each switch may be on its own MDIO bus.
+
+The early DSA core was more concerned with using switches as port multiplexers
+and with managing auxiliary functionality like temperature sensors
+(``CONFIG_NET_DSA_HWMON``) rather than with production-ready features.
+Bridging and bonding were handled in software. Support for hardware-accelerated
+bridging, by means of integrating with the ``switchdev`` framework, was added
+in 2015.
+
+In mid 2016, the second (and current) device tree binding for DSA was
+introduced. Here, individual switches are represented as devices in the Linux
+device model sense, on arbitrary buses, not just MDIO. The limitation of being
+able to describe a single CPU port was lifted (the driver support for multiple
+CPU ports came much later, in 2022). During this time, DSA stopped playing the
+role of a device model for switches, and ``register_switch_driver()`` was
+replaced with ``dsa_register_switch()``, the modern mechanism through which
+arbitrary devices may use DSA as a framework exclusively for integrating
+Ethernet switch IP blocks with the network stack.
+
+With the conversion to the second device tree binding, DSA's support for
+``platform_data`` (used in non-OF scenarios) was also changed to align. With
+the DSA tree no longer being a platform device, the ``platform_data`` structure
+moved to individual switch devices.
+
+Support for the initial device tree binding was subsequently removed in 2019.
+
+Probing through ``platform_data`` remains limited in functionality. The
+``ds->dev->of_node`` and ``dp->dn`` pointers are NULL, and the OF API calls
+made by drivers for discovering more complex setups fall back to the implicit
+handling. There is no way to describe multi-chip trees, or switches with
+multiple CPU ports. It is always assumed that shared ports are configured by
+the driver to the maximum supported link speed (they do not use phylink).
+User ports cannot connect to arbitrary PHYs, but are limited to
+``ds->user_mii_bus``.
+
+Many switch drivers introduced since after DSA's second OF binding were not
+designed to support probing through ``platform_data``. Most notably,
+``device_get_match_data()`` and ``of_device_get_match_data()`` return NULL with
+``platform_data``, so generally, drivers which do not have alternative
+mechanisms for this do not support ``platform_data``.
+
+Extending the ``platform_data`` support implies adding more separate code.
+An alternative worth exploring is growing DSA towards the ``fwnode`` API.
+However, not the entire OF binding should be generalized to ``fwnode``.
+The current bindings must be examined with a critical eye, and the properties
+which are no longer considered good practice (like ``label``, because ``udev``
+offers this functionality) should first be deprecated in OF, and not migrated
+to ``fwnode``.
+
+With ``fwnode`` support in the DSA framework, the ``fwnode_create_software_node()``
+API could be used as an alternative to ``platform_data``, to allow describing
+and probing switches on non-OF.
+
+DSA is used to control very complex switching chips. Some devices have a
+microprocessor, and in some cases, this microprocessor can run a variant of the
+Linux kernel. Sometimes, the switch packet I/O procedure of the internal
+microprocessor is different from the packet I/O procedure for an external host.
+The internal processor may have access to switch queues, while the external
+processor may require DSA tags. Other times, the microprocessor may also be
+connected to the switch in a DSA fashion (using an internal MAC to MAC
+connection).
+
+Since DSA is only concerned with switches where the packet I/O is handled
+by an intermediate conduit driver, this leads to the situation where it is
+recommended to have two drivers for the same switch hardware. When the queues
+are accessed directly, a separate non-DSA driver should be used, with its own
+skeleton which is integrated with ``switchdev`` on its own.
+
+In 2019, a DSA driver was added for the ``ocelot`` switch, which is a thin
+front-end over a hardware library that is also common with a ``switchdev``
+driver. While this design is encouraged for other similar cases, code
+duplication among multiple front-ends is a concern, so it may be desirable to
+extract some of DSA's core functionality into a reusable library for Ethernet
+switches. This could offer a driver-facing API similar to ``dsa_switch_ops``,
+but the aspects relating to cross-chip management, to DSA tags and to the
+conduit interface would remain DSA-specific.
+
+Traditionally, DSA switch drivers for discrete chips own the entire
+``spi_device``, ``i2c_client``, ``mdio_device`` etc. When the chip is complex
+and has multiple embedded peripherals (IRQ controller, GPIO controller, MDIO
+controller, LED controller, sensors), the handling of these peripherals is
+currently monolithic within the same device driver that also calls
+``dsa_register_switch()``.
+
+But an internal microprocessor may have a very different view of the switch
+address space, and may have discrete Linux drivers for each peripheral.
+In 2023, the ``ocelot_ext`` driver was added, which deviated from the
+traditional DSA driver architecture. Rather than probing on the entire
+``spi_device``, it created a multi-function device (MFD) top-level driver for
+it (associated with the SoC at large), and the switching IP is only one of the
+children of the MFD (it is a platform device with regmaps provided by its
+parent). The drivers for each peripheral in this switch SoC are the same when
+controlled over SPI and when controlled by the internal processor.
+
+Authors of new switch drivers that use DSA are encouraged to have a wider view
+of the peripherals on the chip that they are controlling, and to use the MFD
+model to separate the components whenever possible. The general direction for
+the DSA core is to shrink in size and to focus only on the Ethernet switching
+IP and its ports. ``CONFIG_NET_DSA_HWMON`` was removed in 2017. Adding new
+methods to ``struct dsa_switch_ops`` which are outside of DSA's core focus on
+Ethernet is strongly discouraged.
+
+DSA's support for multi-chip trees also has limitations. After converting from
+the first to the second OF binding, the switch tree stopped being a platform
+device, and its probing became implicit, and distributed among its constituent
+switch devices. There is currently a synchronization point in
+``dsa_tree_setup_routing_table()``, through which the tree setup is performed
+only once, when there is more than one switch in the tree. The first N-1
+switches will end their probing early, and the last switch will configure the
+entire tree, and thus all the other switches, in its ``dsa_register_switch()``
+calling context.
+
+Furthermore, the synchronization point works because each switch is able to
+determine, in a distributed manner, that the routing table is not complete, aka
+that there is at least one switch which has not probed. This is only possible
+because the ``link`` properties in the device tree describe the connections to
+all other cascade ports in the tree, not just to the directly connected cascade
+port. If only the latter were described, it could happen that a switch waits
+for its direct neighbors to probe before setting up the tree, but not
+necessarily for all switches in the tree (therefore, it sets up the tree too
+early).
+
+With more than 3 switches in a tree, it becomes a difficult task to write
+correct device trees which are not missing any link to the other cascade ports
+in the tree. The routing table, based on which ``dsa_routing_port()`` works, is
+directly taken from the device tree, although it could be computed through BFS
+instead. This means that the device tree writer needs to specify more than just
+the hardware description (represented by the direct cascade port connections).
+
+Simplifying the device tree bindings to require a single ``link`` phandle
+cannot be done without rethinking the distributed probing scheme. One idea is
+to reinstate the switch tree as a platform device, but this time created
+dynamically by ``dsa_register_switch()`` if the switch's tree ID is not already
+present in the system. The switch tree driver walks the device tree hop by hop,
+following the ``link`` references, to discover all the other switches, and to
+construct the full routing table. It then uses the component API to register
+itself as an aggregate driver, with each of the discovered switches as a
+component. When ``dsa_register_switch()`` completes for all component switches,
+the tree probing continues and calls ``dsa_tree_setup()``.
+
+The cross-chip management layer (``net/dsa/switch.c``) can also be improved.
+Currently ``struct dsa_switch_tree`` holds a list of ports rather than a list
+of switches, and thus, calling one function for each switch in a tree is hard.
+DSA currently uses one notifier chain per tree as a workaround for that, with
+each switch registered as a listener (``dsa_switch_event()``).
+
+It is considered bad practice to use notifiers when the emitter and the
+listener are known to each other, instead of a plain function call. Also, error
+handling with notifiers is not robust. When one switch fails mid-operation,
+there is no rollback to the previous state for switches which already completed
+the operation successfully.
+
+To untangle this situation and improve the reliability of the cross-chip
+management layer, it is necessary to split the DSA operations into ones which
+can fail, and ones which cannot fail. For example, ``port_fdb_add()`` can fail,
+whereas ``port_fdb_del()`` cannot. Then, cross-chip operations can take a
+fallible function to make forward progress, and an infallible function for
+rollback. However, it is unclear what to do in the case of ``change_mtu()``.
+It is hard to classify this operation as either fallible or infallible. It is
+also unclear how to deal with I/O access errors on the switch's management bus.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism
  2023-12-08 19:35 ` [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism Vladimir Oltean
@ 2023-12-08 22:15   ` Linus Walleij
  2023-12-08 22:32   ` Florian Fainelli
  2023-12-10 13:37   ` Alvin Šipraga
  2 siblings, 0 replies; 24+ messages in thread
From: Linus Walleij @ 2023-12-08 22:15 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Andrew Lunn, Florian Fainelli,
	Luiz Angelo Daros de Luca, Alvin Šipraga, Madhuri Sripada,
	Marcin Wojtas, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 8, 2023 at 8:36 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> Introduced 2 years ago in commit dc452a471dba ("net: dsa: introduce
> tagger-owned storage for private and shared data"), the tagger-owned
> storage mechanism has recently sparked some discussions which denote a
> general lack of developer understanding / awareness of it. There was
> also a bug in the ksz switch driver which indicates the same thing.
>
> Admittedly, it is also not obvious to see the design constraints that
> led to the creation of such a complicated mechanism.
>
> Here are some paragraphs that explain what it's about.
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Clear and to the point.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 2/4] docs: net: dsa: update platform_data documentation
  2023-12-08 19:35 ` [PATCH net 2/4] docs: net: dsa: update platform_data documentation Vladimir Oltean
@ 2023-12-08 22:19   ` Linus Walleij
  2023-12-09  0:52     ` Vladimir Oltean
  2023-12-08 22:33   ` Florian Fainelli
  2023-12-10 13:37   ` Alvin Šipraga
  2 siblings, 1 reply; 24+ messages in thread
From: Linus Walleij @ 2023-12-08 22:19 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Andrew Lunn, Florian Fainelli,
	Luiz Angelo Daros de Luca, Alvin Šipraga, Madhuri Sripada,
	Marcin Wojtas, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 8, 2023 at 8:36 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> We were documenting a bunch of stuff which was removed in commit
> 93e86b3bc842 ("net: dsa: Remove legacy probing support"). There's some
> further cleanup to do in struct dsa_chip_data, so rather than describing
> every possible field (when maybe we should be switching to kerneldoc
> format), just say what's important about it.
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
(...)
> +  ``ds->dev->platform_data`` pointer at ``dsa_register_switch()`` time. It is
> +  populated by board-specific code. The hardware switch driver may also have

I tend to avoid talking about "board-specific" since that has an embedded
tone to it, and rather say "system-specific". But DSA switches are certainly
in 99% of the cases embedded so it's definitely no big deal.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation
  2023-12-08 19:35 ` [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation Vladimir Oltean
@ 2023-12-08 22:22   ` Linus Walleij
  2023-12-08 22:36   ` Florian Fainelli
  2023-12-10 13:48   ` Alvin Šipraga
  2 siblings, 0 replies; 24+ messages in thread
From: Linus Walleij @ 2023-12-08 22:22 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Andrew Lunn, Florian Fainelli,
	Luiz Angelo Daros de Luca, Alvin Šipraga, Madhuri Sripada,
	Marcin Wojtas, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 8, 2023 at 8:36 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> There are people who are trying to push the ds->user_mii_bus feature
> past its sell-by date. I think part of the problem is the fact that the
> documentation presents it as this great functionality.
>
> Adapt it to 2023, where we have phy-handle to render it useless, at
> least with OF.
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism
  2023-12-08 19:35 ` [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism Vladimir Oltean
  2023-12-08 22:15   ` Linus Walleij
@ 2023-12-08 22:32   ` Florian Fainelli
  2023-12-10 13:37   ` Alvin Šipraga
  2 siblings, 0 replies; 24+ messages in thread
From: Florian Fainelli @ 2023-12-08 22:32 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Luiz Angelo Daros de Luca, Alvin Šipraga,
	Madhuri Sripada, Marcin Wojtas, Linus Walleij, Tobias Waldekranz,
	Arun Ramadoss, Russell King (Oracle), Jonathan Corbet

On 12/8/23 11:35, Vladimir Oltean wrote:
> Introduced 2 years ago in commit dc452a471dba ("net: dsa: introduce
> tagger-owned storage for private and shared data"), the tagger-owned
> storage mechanism has recently sparked some discussions which denote a
> general lack of developer understanding / awareness of it. There was
> also a bug in the ksz switch driver which indicates the same thing.

Link: https://lore.kernel.org/all/20231206071655.1626479-1-sean@geanix.com/

> 
> Admittedly, it is also not obvious to see the design constraints that
> led to the creation of such a complicated mechanism.
> 
> Here are some paragraphs that explain what it's about.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
-- 
Florian


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 2/4] docs: net: dsa: update platform_data documentation
  2023-12-08 19:35 ` [PATCH net 2/4] docs: net: dsa: update platform_data documentation Vladimir Oltean
  2023-12-08 22:19   ` Linus Walleij
@ 2023-12-08 22:33   ` Florian Fainelli
  2023-12-10 13:37   ` Alvin Šipraga
  2 siblings, 0 replies; 24+ messages in thread
From: Florian Fainelli @ 2023-12-08 22:33 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Luiz Angelo Daros de Luca, Alvin Šipraga,
	Madhuri Sripada, Marcin Wojtas, Linus Walleij, Tobias Waldekranz,
	Arun Ramadoss, Russell King (Oracle), Jonathan Corbet

On 12/8/23 11:35, Vladimir Oltean wrote:
> We were documenting a bunch of stuff which was removed in commit
> 93e86b3bc842 ("net: dsa: Remove legacy probing support"). There's some
> further cleanup to do in struct dsa_chip_data, so rather than describing
> every possible field (when maybe we should be switching to kerneldoc
> format), just say what's important about it.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
-- 
Florian


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation
  2023-12-08 19:35 ` [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation Vladimir Oltean
  2023-12-08 22:22   ` Linus Walleij
@ 2023-12-08 22:36   ` Florian Fainelli
  2023-12-09  1:22     ` Vladimir Oltean
  2023-12-10 13:48   ` Alvin Šipraga
  2 siblings, 1 reply; 24+ messages in thread
From: Florian Fainelli @ 2023-12-08 22:36 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Luiz Angelo Daros de Luca, Alvin Šipraga,
	Madhuri Sripada, Marcin Wojtas, Linus Walleij, Tobias Waldekranz,
	Arun Ramadoss, Russell King (Oracle), Jonathan Corbet

On 12/8/23 11:35, Vladimir Oltean wrote:
> There are people who are trying to push the ds->user_mii_bus feature
> past its sell-by date. I think part of the problem is the fact that the
> documentation presents it as this great functionality.
> 
> Adapt it to 2023, where we have phy-handle to render it useless, at
> least with OF.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>   Documentation/networking/dsa/dsa.rst | 36 ++++++++++++++++++++++------
>   1 file changed, 29 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
> index 676c92136a0e..2cd91358421e 100644
> --- a/Documentation/networking/dsa/dsa.rst
> +++ b/Documentation/networking/dsa/dsa.rst
> @@ -397,19 +397,41 @@ perspective::
>   User MDIO bus
>   -------------
>   
> -In order to be able to read to/from a switch PHY built into it, DSA creates an
> -user MDIO bus which allows a specific switch driver to divert and intercept
> -MDIO reads/writes towards specific PHY addresses. In most MDIO-connected
> -switches, these functions would utilize direct or indirect PHY addressing mode
> -to return standard MII registers from the switch builtin PHYs, allowing the PHY
> -library and/or to return link status, link partner pages, auto-negotiation
> -results, etc.
> +The framework creates an MDIO bus for user ports (``ds->user_mii_bus``) when
> +both methods ``ds->ops->phy_read()`` and ``ds->ops->phy_write()`` are present.
> +However, this pointer may also be populated by the switch driver during the
> +``ds->ops->setup()`` method, with an MDIO bus managed by the driver.
> +
> +Its role is to permit user ports to connect to a PHY (usually internal) when
> +the more general ``phy-handle`` property is unavailable (either because the
> +MDIO bus is missing from the OF description, or because probing uses
> +``platform_data``).
> +
> +In most MDIO-connected switches, these functions would utilize direct or
> +indirect PHY addressing mode to return standard MII registers from the switch
> +builtin PHYs, allowing the PHY library and/or to return link status, link
> +partner pages, auto-negotiation results, etc.

The "and/or" did not read really well with the reset of the sentence, 
maybe just drop those two words?

>   
>   For Ethernet switches which have both external and internal MDIO buses, the
>   user MII bus can be utilized to mux/demux MDIO reads and writes towards either
>   internal or external MDIO devices this switch might be connected to: internal
>   PHYs, external PHYs, or even external switches.
>   
> +When using OF, the ``ds->user_mii_bus`` can be seen as a legacy feature, rather
> +than core functionality. Since 2014, the DSA OF bindings support the
> +``phy-handle`` property, which is a universal mechanism to reference a PHY,
> +be it internal or external.
> +
> +New switch drivers are encouraged to require the more universal ``phy-handle``
> +property even for user ports with internal PHYs. This allows device trees to
> +interoperate with simpler variants of the drivers such as those from U-Boot,
> +which do not have the (redundant) fallback logic for ``ds->user_mii_bus``.
> +
> +The only use case for ``ds->user_mii_bus`` in new drivers would be for probing
> +on non-OF through ``platform_data``. In the distant future where this will be
> +possible through software nodes, there will be no need for ``ds->user_mii_bus``
> +in new drivers at all.

That works for me, with the above addressed:

Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
-- 
Florian


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas
  2023-12-08 19:35 ` [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas Vladimir Oltean
@ 2023-12-08 22:40   ` Linus Walleij
  2023-12-08 23:03   ` Florian Fainelli
  1 sibling, 0 replies; 24+ messages in thread
From: Linus Walleij @ 2023-12-08 22:40 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Andrew Lunn, Florian Fainelli,
	Luiz Angelo Daros de Luca, Alvin Šipraga, Madhuri Sripada,
	Marcin Wojtas, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 8, 2023 at 8:36 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> It was a bit unclear to me what the TODO is about and what is even
> actionable about it. I had a discussion with Florian about it at NetConf
> 2023, and it seems that it's about the amount of boilerplate code that
> exists in switchdev drivers, and how that could be maybe made common
> with DSA, through something like another library.
>
> I think we are seeing a lot of people who are looking at DSA now,
> and there is a lot of misunderstanding about why things are the way
> they are, and which are the changes that would benefit the subsystem,
> compared to the changes that go against DSA's past development trend.
>
> I think what is missing is the context, which is admittedly a bit
> hard to grasp considering there are 15 years of development.
> Based on the git log and on the discussions where I participated,
> I tried to cobble up a section about DSA's history. Here and there,
> I've mentioned the limitations that I am aware of, and some possible
> ways forward.
>
> I'm also personally surprised by the amount of change in DSA over the
> years, and I hope that putting things into perspective will also
> encourage others to not think that it's set in stone the way it is now.
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Adding documentation is always good, and the kernel definitely
looks better after this patch than before so:
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

For ease of reading I would personally restructure it a bit, by
putting in three sections:

- History
  Pure development history (for the old code maybe add examples
  of which switches brought about the changes, in the same way
  that the Ocelot driver is mentioned?)

- Policy
  Do this, don't do that. The text has a few paragraphs that read
  like so.

- Future directions
  What we want to do next, or can be expected for the future,
  again the text has a few paragraphs that read like that.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas
  2023-12-08 19:35 ` [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas Vladimir Oltean
  2023-12-08 22:40   ` Linus Walleij
@ 2023-12-08 23:03   ` Florian Fainelli
  2023-12-09  1:58     ` Vladimir Oltean
  1 sibling, 1 reply; 24+ messages in thread
From: Florian Fainelli @ 2023-12-08 23:03 UTC (permalink / raw)
  To: Vladimir Oltean, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Luiz Angelo Daros de Luca, Alvin Šipraga,
	Madhuri Sripada, Marcin Wojtas, Linus Walleij, Tobias Waldekranz,
	Arun Ramadoss, Russell King (Oracle), Jonathan Corbet

On 12/8/23 11:35, Vladimir Oltean wrote:
> It was a bit unclear to me what the TODO is about and what is even
> actionable about it. I had a discussion with Florian about it at NetConf
> 2023, and it seems that it's about the amount of boilerplate code that
> exists in switchdev drivers, and how that could be maybe made common
> with DSA, through something like another library.
> 
> I think we are seeing a lot of people who are looking at DSA now,
> and there is a lot of misunderstanding about why things are the way
> they are, and which are the changes that would benefit the subsystem,
> compared to the changes that go against DSA's past development trend.
> 
> I think what is missing is the context, which is admittedly a bit
> hard to grasp considering there are 15 years of development.
> Based on the git log and on the discussions where I participated,
> I tried to cobble up a section about DSA's history. Here and there,
> I've mentioned the limitations that I am aware of, and some possible
> ways forward.
> 
> I'm also personally surprised by the amount of change in DSA over the
> years, and I hope that putting things into perspective will also
> encourage others to not think that it's set in stone the way it is now.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

This is really great, and thanks for having put that together, it 
represents an useful timeline of changes introduces.

> ---

[snip]

> +Probing through ``platform_data`` remains limited in functionality. The
> +``ds->dev->of_node`` and ``dp->dn`` pointers are NULL, and the OF API calls
> +made by drivers for discovering more complex setups fall back to the implicit
> +handling. There is no way to describe multi-chip trees, or switches with
> +multiple CPU ports. It is always assumed that shared ports are configured by
> +the driver to the maximum supported link speed (they do not use phylink).
> +User ports cannot connect to arbitrary PHYs, but are limited to
> +``ds->user_mii_bus``.

Maybe a mention here that this implies built-in/internal PHY devices 
only, just as a way to re-iterate the limitation and to echo to the 
previous patch?

> +
> +Many switch drivers introduced since after DSA's second OF binding were not
> +designed to support probing through ``platform_data``. Most notably,
> +``device_get_match_data()`` and ``of_device_get_match_data()`` return NULL with
> +``platform_data``, so generally, drivers which do not have alternative
> +mechanisms for this do not support ``platform_data``.
> +
> +Extending the ``platform_data`` support implies adding more separate code.
> +An alternative worth exploring is growing DSA towards the ``fwnode`` API.
> +However, not the entire OF binding should be generalized to ``fwnode``.
> +The current bindings must be examined with a critical eye, and the properties
> +which are no longer considered good practice (like ``label``, because ``udev``
> +offers this functionality) should first be deprecated in OF, and not migrated
> +to ``fwnode``.
> +
> +With ``fwnode`` support in the DSA framework, the ``fwnode_create_software_node()``
> +API could be used as an alternative to ``platform_data``, to allow describing
> +and probing switches on non-OF.

Might suggest to move the 3 paragraphs towards the end because otherwise 
it might be a distraction for the reader who might think: ah that's it? 
no more technical details!? Looks like Linus made the same suggestion in 
his review.

> +
> +DSA is used to control very complex switching chips. Some devices have a
> +microprocessor, and in some cases, this microprocessor can run a variant of the
> +Linux kernel. Sometimes, the switch packet I/O procedure of the internal
> +microprocessor is different from the packet I/O procedure for an external host.
> +The internal processor may have access to switch queues, while the external
> +processor may require DSA tags. Other times, the microprocessor may also be
> +connected to the switch in a DSA fashion (using an internal MAC to MAC
> +connection).
> +
> +Since DSA is only concerned with switches where the packet I/O is handled
> +by an intermediate conduit driver, this leads to the situation where it is
> +recommended to have two drivers for the same switch hardware. 

the same switch hardware, one for each of the use cases described above.

When the queues
> +are accessed directly, a separate non-DSA driver should be used, with its own
> +skeleton which is integrated with ``switchdev`` on its own.
> +
> +In 2019, a DSA driver was added for the ``ocelot`` switch, which is a thin
> +front-end over a hardware library that is also common with a ``switchdev``
> +driver. While this design is encouraged for other similar cases, code
> +duplication among multiple front-ends is a concern, so it may be desirable to
> +extract some of DSA's core functionality into a reusable library for Ethernet
> +switches. This could offer a driver-facing API similar to ``dsa_switch_ops``,
> +but the aspects relating to cross-chip management, to DSA tags and to the
> +conduit interface would remain DSA-specific.

Yes! That was exactly the idea indeed.

> +
> +Traditionally, DSA switch drivers for discrete chips own the entire
> +``spi_device``, ``i2c_client``, ``mdio_device`` etc. When the chip is complex
> +and has multiple embedded peripherals (IRQ controller, GPIO controller, MDIO
> +controller, LED controller, sensors), the handling of these peripherals is
> +currently monolithic within the same device driver that also calls
> +``dsa_register_switch()``.
> +
> +But an internal microprocessor may have a very different view of the switch
> +address space, and may have discrete Linux drivers for each peripheral.
> +In 2023, the ``ocelot_ext`` driver was added, which deviated from the
> +traditional DSA driver architecture. Rather than probing on the entire
> +``spi_device``, it created a multi-function device (MFD) top-level driver for
> +it (associated with the SoC at large), and the switching IP is only one of the
> +children of the MFD (it is a platform device with regmaps provided by its
> +parent). The drivers for each peripheral in this switch SoC are the same when
> +controlled over SPI and when controlled by the internal processor.
> +
> +Authors of new switch drivers that use DSA are encouraged to have a wider view
> +of the peripherals on the chip that they are controlling, and to use the MFD
> +model to separate the components whenever possible. The general direction for
> +the DSA core is to shrink in size and to focus only on the Ethernet switching
> +IP and its ports. ``CONFIG_NET_DSA_HWMON`` was removed in 2017. Adding new
> +methods to ``struct dsa_switch_ops`` which are outside of DSA's core focus on
> +Ethernet is strongly discouraged.

Agreed and good idea to put that on (virtual) paper.

> +
> +DSA's support for multi-chip trees also has limitations. After converting from
> +the first to the second OF binding, the switch tree stopped being a platform
> +device, and its probing became implicit, and distributed among its constituent
> +switch devices. There is currently a synchronization point in
> +``dsa_tree_setup_routing_table()``, through which the tree setup is performed
> +only once, when there is more than one switch in the tree. The first N-1
> +switches will end their probing early, and the last switch will configure the
> +entire tree, and thus all the other switches, in its ``dsa_register_switch()``
> +calling context.
> +
> +Furthermore, the synchronization point works because each switch is able to
> +determine, in a distributed manner, that the routing table is not complete, aka
> +that there is at least one switch which has not probed. This is only possible
> +because the ``link`` properties in the device tree describe the connections to
> +all other cascade ports in the tree, not just to the directly connected cascade
> +port. If only the latter were described, it could happen that a switch waits
> +for its direct neighbors to probe before setting up the tree, but not
> +necessarily for all switches in the tree (therefore, it sets up the tree too
> +early).
> +
> +With more than 3 switches in a tree, it becomes a difficult task to write
> +correct device trees which are not missing any link to the other cascade ports
> +in the tree. The routing table, based on which ``dsa_routing_port()`` works, is
> +directly taken from the device tree, although it could be computed through BFS
> +instead. This means that the device tree writer needs to specify more than just
> +the hardware description (represented by the direct cascade port connections).
> +
> +Simplifying the device tree bindings to require a single ``link`` phandle
> +cannot be done without rethinking the distributed probing scheme. One idea is
> +to reinstate the switch tree as a platform device, but this time created
> +dynamically by ``dsa_register_switch()`` if the switch's tree ID is not already
> +present in the system. The switch tree driver walks the device tree hop by hop,
> +following the ``link`` references, to discover all the other switches, and to
> +construct the full routing table. It then uses the component API to register
> +itself as an aggregate driver, with each of the discovered switches as a
> +component. When ``dsa_register_switch()`` completes for all component switches,
> +the tree probing continues and calls ``dsa_tree_setup()``.

An interesting paragraph, but I am not sure this is such a big pain 
point that we should be spending much description of it, especially 
since it sounds like this is solved, but could be improved, but in the 
grand scheme of things, should we really be spending any time on it?

Same-vendor cascade configurations are Marvell specific, and different 
vendor cascades require distinct switch trees, therefore do not really 
fall into the cross-chip design anymore. In a nutshell, cross-chip is 
very very niche and limited.

> +
> +The cross-chip management layer (``net/dsa/switch.c``) can also be improved.
> +Currently ``struct dsa_switch_tree`` holds a list of ports rather than a list
> +of switches, and thus, calling one function for each switch in a tree is hard.
> +DSA currently uses one notifier chain per tree as a workaround for that, with
> +each switch registered as a listener (``dsa_switch_event()``).
> +
> +It is considered bad practice to use notifiers when the emitter and the
> +listener are known to each other, instead of a plain function call. Also, error
> +handling with notifiers is not robust. When one switch fails mid-operation,
> +there is no rollback to the previous state for switches which already completed
> +the operation successfully.
> +
> +To untangle this situation and improve the reliability of the cross-chip
> +management layer, it is necessary to split the DSA operations into ones which
> +can fail, and ones which cannot fail. For example, ``port_fdb_add()`` can fail,
> +whereas ``port_fdb_del()`` cannot. Then, cross-chip operations can take a
> +fallible function to make forward progress, and an infallible function for
> +rollback. However, it is unclear what to do in the case of ``change_mtu()``.
> +It is hard to classify this operation as either fallible or infallible. It is
> +also unclear how to deal with I/O access errors on the switch's management bus.

How about something like this:

I/O access errors occurring during the switch configuration should 
always be logged for debugging but are very unlikely to be recoverable 
and therefore require an investigation into the failure mechanism and 
root cause or possible work around.
-- 
Florian


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 2/4] docs: net: dsa: update platform_data documentation
  2023-12-08 22:19   ` Linus Walleij
@ 2023-12-09  0:52     ` Vladimir Oltean
  0 siblings, 0 replies; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-09  0:52 UTC (permalink / raw)
  To: Linus Walleij
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Andrew Lunn, Florian Fainelli,
	Luiz Angelo Daros de Luca, Alvin Šipraga, Madhuri Sripada,
	Marcin Wojtas, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 08, 2023 at 11:19:28PM +0100, Linus Walleij wrote:
> On Fri, Dec 8, 2023 at 8:36 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> > +  ``ds->dev->platform_data`` pointer at ``dsa_register_switch()`` time. It is
> > +  populated by board-specific code. The hardware switch driver may also have
> 
> I tend to avoid talking about "board-specific" since that has an embedded
> tone to it, and rather say "system-specific". But DSA switches are certainly
> in 99% of the cases embedded so it's definitely no big deal.

I've changed this to "system-specific".

> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Thanks for the review, I appreciate it!

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation
  2023-12-08 22:36   ` Florian Fainelli
@ 2023-12-09  1:22     ` Vladimir Oltean
  2023-12-09 21:49       ` Linus Walleij
  0 siblings, 1 reply; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-09  1:22 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Andrew Lunn, Luiz Angelo Daros de Luca,
	Alvin Šipraga, Madhuri Sripada, Marcin Wojtas,
	Linus Walleij, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 08, 2023 at 02:36:40PM -0800, Florian Fainelli wrote:
> On 12/8/23 11:35, Vladimir Oltean wrote:
> > diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
> > index 676c92136a0e..2cd91358421e 100644
> > --- a/Documentation/networking/dsa/dsa.rst
> > +++ b/Documentation/networking/dsa/dsa.rst
> > @@ -397,19 +397,41 @@ perspective::
> >   User MDIO bus
> >   -------------
> > -In order to be able to read to/from a switch PHY built into it, DSA creates an
> > -user MDIO bus which allows a specific switch driver to divert and intercept
> > -MDIO reads/writes towards specific PHY addresses. In most MDIO-connected
> > -switches, these functions would utilize direct or indirect PHY addressing mode
> > -to return standard MII registers from the switch builtin PHYs, allowing the PHY
> > -library and/or to return link status, link partner pages, auto-negotiation
> > -results, etc.
> > +The framework creates an MDIO bus for user ports (``ds->user_mii_bus``) when
> > +both methods ``ds->ops->phy_read()`` and ``ds->ops->phy_write()`` are present.
> > +However, this pointer may also be populated by the switch driver during the
> > +``ds->ops->setup()`` method, with an MDIO bus managed by the driver.
> > +
> > +Its role is to permit user ports to connect to a PHY (usually internal) when
> > +the more general ``phy-handle`` property is unavailable (either because the
> > +MDIO bus is missing from the OF description, or because probing uses
> > +``platform_data``).
> > +
> > +In most MDIO-connected switches, these functions would utilize direct or
> > +indirect PHY addressing mode to return standard MII registers from the switch
> > +builtin PHYs, allowing the PHY library and/or to return link status, link
> > +partner pages, auto-negotiation results, etc.
> 
> The "and/or" did not read really well with the reset of the sentence, maybe
> just drop those two words?

Fun fact, this is the second sentence from the existing text, moved
as-is a bit further down. Git blame on it says:
77760e94928f ("Documentation: networking: add a DSA document")

I do have the slight feeling that the paragraph is pitching sliced
bread (sorry). I did want to remove it completely, but I also wanted to
preserve a phrase about the direct/indirect thing.

How about replacing it with this?

Typically, the user MDIO bus accesses internal PHYs indirectly, by
reading and writing to the MDIO controller registers located in the
switch address space. Sometimes, especially if the switch is controlled
over MDIO by the host, its internal PHYs may also be accessible on the
same MDIO bus as the switch IP, but at a different MDIO address. In that
case, a direct access method for the internal PHYs is to implement the
MDIO access operations as diversions towards the parent MDIO bus of the
switch, at different MDIO addresses.

Conceivably, the direct access method could be extended to also target
external PHYs situated on the same MDIO bus as the switch, or on a
different MDIO bus entirely, referenced through ``platform_data``.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas
  2023-12-08 23:03   ` Florian Fainelli
@ 2023-12-09  1:58     ` Vladimir Oltean
  2023-12-11 17:29       ` Florian Fainelli
  0 siblings, 1 reply; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-09  1:58 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Andrew Lunn, Luiz Angelo Daros de Luca,
	Alvin Šipraga, Madhuri Sripada, Marcin Wojtas,
	Linus Walleij, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 08, 2023 at 03:03:35PM -0800, Florian Fainelli wrote:
> > +Probing through ``platform_data`` remains limited in functionality. The
> > +``ds->dev->of_node`` and ``dp->dn`` pointers are NULL, and the OF API calls
> > +made by drivers for discovering more complex setups fall back to the implicit
> > +handling. There is no way to describe multi-chip trees, or switches with
> > +multiple CPU ports. It is always assumed that shared ports are configured by
> > +the driver to the maximum supported link speed (they do not use phylink).
> > +User ports cannot connect to arbitrary PHYs, but are limited to
> > +``ds->user_mii_bus``.
> 
> Maybe a mention here that this implies built-in/internal PHY devices only,
> just as a way to re-iterate the limitation and to echo to the previous
> patch?

I am not fully convinced that saying "user_mii_bus can only access internal PHYs"
only would be correct. This paragraph also exists in the user MDIO bus section:

For Ethernet switches which have both external and internal MDIO buses, the
user MII bus can be utilized to mux/demux MDIO reads and writes towards either
internal or external MDIO devices this switch might be connected to: internal
PHYs, external PHYs, or even external switches.

> > +
> > +Many switch drivers introduced since after DSA's second OF binding were not
> > +designed to support probing through ``platform_data``. Most notably,
> > +``device_get_match_data()`` and ``of_device_get_match_data()`` return NULL with
> > +``platform_data``, so generally, drivers which do not have alternative
> > +mechanisms for this do not support ``platform_data``.
> > +
> > +Extending the ``platform_data`` support implies adding more separate code.
> > +An alternative worth exploring is growing DSA towards the ``fwnode`` API.
> > +However, not the entire OF binding should be generalized to ``fwnode``.
> > +The current bindings must be examined with a critical eye, and the properties
> > +which are no longer considered good practice (like ``label``, because ``udev``
> > +offers this functionality) should first be deprecated in OF, and not migrated
> > +to ``fwnode``.
> > +
> > +With ``fwnode`` support in the DSA framework, the ``fwnode_create_software_node()``
> > +API could be used as an alternative to ``platform_data``, to allow describing
> > +and probing switches on non-OF.
> 
> Might suggest to move the 3 paragraphs towards the end because otherwise it
> might be a distraction for the reader who might think: ah that's it? no more
> technical details!? Looks like Linus made the same suggestion in his review.

I think it needs even more rethinking than that. I now remembered that
we also have a "Design limitations" section where the future work can go.

It's hard to navigate through what is now a 1400 line document and not
get lost.

I'm hoping I could move the documentation of variables and methods that
now sits in "Driver development" into kdoc comments inline with the code,
to reduce the clutter a bit.

But I don't know how to tackle this. Should documentation changes go to
"net" or to "net-next"? I targeted this for "net" as a documentation-only
change set. But if I start adding kdocs, it won't be so clear-cut anymore...

> > +Simplifying the device tree bindings to require a single ``link`` phandle
> > +cannot be done without rethinking the distributed probing scheme. One idea is
> > +to reinstate the switch tree as a platform device, but this time created
> > +dynamically by ``dsa_register_switch()`` if the switch's tree ID is not already
> > +present in the system. The switch tree driver walks the device tree hop by hop,
> > +following the ``link`` references, to discover all the other switches, and to
> > +construct the full routing table. It then uses the component API to register
> > +itself as an aggregate driver, with each of the discovered switches as a
> > +component. When ``dsa_register_switch()`` completes for all component switches,
> > +the tree probing continues and calls ``dsa_tree_setup()``.
> 
> An interesting paragraph, but I am not sure this is such a big pain point
> that we should be spending much description of it, especially since it
> sounds like this is solved, but could be improved, but in the grand scheme
> of things, should we really be spending any time on it?
> 
> Same-vendor cascade configurations are Marvell specific, and different
> vendor cascades require distinct switch trees, therefore do not really fall
> into the cross-chip design anymore. In a nutshell, cross-chip is very very
> niche and limited.

Well, I've been contacted by somebody to help with a custom board with 3
daisy chained SJA1105 switches. He is doing the testing for me, and I'm
waiting for the results to come back. I'm currently waiting for an uprev
to an NXP BSP on top of 5.15 to be finalized, so that patches developed
over net-next are at least barely testable...

If you remember, the SJA1105 has these one-shot management routes which
must be installed over SPI, and they decide where the next transmitted
link-local packet goes.

Well, the driver only supports single-chip trees, as you say, because it
only programs the management route in the targeted switch. With daisy
chained switches, one needs to figure out the actual packet route from
the CPU to the leaf user port, and install a management route for every
switch along that route. Otherwise, intermediary switches won't know
what to do with the packets, and drop them.

The specific request was: "help, PTP doesn't work!"

I did solve the problem, and the documentation paragraphs above are
basically my development notes while examining the existing support
and the way in which it isn't giving me the tools I need.

I do need to send a dt-bindings patch on this topic as well. The fact
that we put all cascade links in the device tree means we don't know
which one represents the direct connection to the neighbor cascade port,
and which one is an indirect connection. We need to bake in an
assumption, like for example "the first element of the 'link' phandle
array is the direct connection". I hope retroactively doing that won't
bother the device tree maintainers too much. If it does, the problem is
intractable.

But I agree, requirements for cross-chip support are rare, and with
SJA1105 I don't even own a board where I can directly test it. I
specifically bought the Turris MOX for that.

> > +To untangle this situation and improve the reliability of the cross-chip
> > +management layer, it is necessary to split the DSA operations into ones which
> > +can fail, and ones which cannot fail. For example, ``port_fdb_add()`` can fail,
> > +whereas ``port_fdb_del()`` cannot. Then, cross-chip operations can take a
> > +fallible function to make forward progress, and an infallible function for
> > +rollback. However, it is unclear what to do in the case of ``change_mtu()``.
> > +It is hard to classify this operation as either fallible or infallible. It is
> > +also unclear how to deal with I/O access errors on the switch's management bus.
> 
> How about something like this:
> 
> I/O access errors occurring during the switch configuration should always be
> logged for debugging but are very unlikely to be recoverable and therefore
> require an investigation into the failure mechanism and root cause or
> possible work around.

Yeah, I suppose.

What do you think, has something like phy_error() been a useful mechanism
for anything? Or just a pain in the rear? Would it be useful to shut
everything down on a bus I/O error, phylib style?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation
  2023-12-09  1:22     ` Vladimir Oltean
@ 2023-12-09 21:49       ` Linus Walleij
  0 siblings, 0 replies; 24+ messages in thread
From: Linus Walleij @ 2023-12-09 21:49 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, netdev, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	Luiz Angelo Daros de Luca, Alvin Šipraga,
	Madhuri Sripada, Marcin Wojtas, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Sat, Dec 9, 2023 at 2:22 AM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> How about replacing it with this?
>
> Typically, the user MDIO bus accesses internal PHYs indirectly, by
> reading and writing to the MDIO controller registers located in the
> switch address space. Sometimes, especially if the switch is controlled
> over MDIO by the host, its internal PHYs may also be accessible on the
> same MDIO bus as the switch IP, but at a different MDIO address. In that
> case, a direct access method for the internal PHYs is to implement the
> MDIO access operations as diversions towards the parent MDIO bus of the
> switch, at different MDIO addresses.
>
> Conceivably, the direct access method could be extended to also target
> external PHYs situated on the same MDIO bus as the switch, or on a
> different MDIO bus entirely, referenced through ``platform_data``.

This is clear and simple to understand, go with it.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism
  2023-12-08 19:35 ` [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism Vladimir Oltean
  2023-12-08 22:15   ` Linus Walleij
  2023-12-08 22:32   ` Florian Fainelli
@ 2023-12-10 13:37   ` Alvin Šipraga
  2 siblings, 0 replies; 24+ messages in thread
From: Alvin Šipraga @ 2023-12-10 13:37 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev@vger.kernel.org, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Andrew Lunn, Florian Fainelli,
	Luiz Angelo Daros de Luca, Madhuri Sripada, Marcin Wojtas,
	Linus Walleij, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 08, 2023 at 09:35:15PM +0200, Vladimir Oltean wrote:
> Introduced 2 years ago in commit dc452a471dba ("net: dsa: introduce
> tagger-owned storage for private and shared data"), the tagger-owned
> storage mechanism has recently sparked some discussions which denote a
> general lack of developer understanding / awareness of it. There was
> also a bug in the ksz switch driver which indicates the same thing.
> 
> Admittedly, it is also not obvious to see the design constraints that
> led to the creation of such a complicated mechanism.
> 
> Here are some paragraphs that explain what it's about.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>

> ---
>  Documentation/networking/dsa/dsa.rst | 59 ++++++++++++++++++++++++++++
>  1 file changed, 59 insertions(+)
> 
> diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
> index 7b2e69cd7ef0..0c326a42eb81 100644
> --- a/Documentation/networking/dsa/dsa.rst
> +++ b/Documentation/networking/dsa/dsa.rst
> @@ -221,6 +221,44 @@ receive all frames regardless of the value of the MAC DA. This can be done by
>  setting the ``promisc_on_conduit`` property of the ``struct dsa_device_ops``.
>  Note that this assumes a DSA-unaware conduit driver, which is the norm.
>  
> +Separation between tagging protocol and switch drivers
> +------------------------------------------------------
> +
> +Sometimes it is desirable to test the behavior of a given conduit interface
> +with a given switch protocol, to see how it responds to checksum offloading,
> +padding with tail tags, increased MTU, how the hardware parser sees DSA-tagged
> +frames, etc.
> +
> +To achieve that, any tagging protocol driver may be used with ``dsa_loop``
> +(this requires modifying the ``dsa_loop_get_protocol()`` function
> +implementation). Therefore, tagging protocol drivers must not assume that they
> +are used only in conjunction with a particular switch driver. Concretely, the
> +tagging protocol driver should make no assumptions about the type of
> +``ds->priv``, and its core functionality should only rely on the data
> +structures offered by the DSA core for all switches (``struct dsa_switch``,
> +``struct dsa_port`` etc).
> +
> +Additionally, tagging protocol drivers must not depend on symbols exported by
> +any particular switch control path driver. Doing so would create a circular
> +dependency, because DSA, on behalf of the switch driver, already requests the
> +appropriate tagging protocol driver module to be loaded.
> +
> +Nonetheless, there are exceptional situations when switch-specific processing
> +is required in a tagging protocol driver. In some cases the tagger needs a
> +place to hold state; in other cases, the packet transmission procedure may
> +involve accessing switch registers. The tagger may also be processing packets
> +which are not destined for the network stack but for the switch driver's
> +management logic, and thus, the switch driver should have a handler for these
> +management frames.
> +
> +A mechanism, called tagger-owned storage (in reference to ``ds->tagger_data``),
> +exists, which permits tagging protocol drivers to allocate memory for each
> +switch that they connect to. Each tagging protocol driver may define its own
> +contract with switch drivers as to what this data structure contains.
> +Through the ``struct dsa_device_ops`` methods ``connect()`` and ``disconnect()``,
> +tagging protocol drivers are given the possibility to manage the
> +``ds->tagger_data`` pointer of any switch that they connect to.
> +
>  Conduit network devices
>  -----------------------
>  
> @@ -624,6 +662,27 @@ Switch configuration
>    case, further calls to ``get_tag_protocol`` should report the protocol in
>    current use.
>  
> +- ``connect_tag_protocol``: optional method to notify the switch driver that a
> +  tagging protocol driver has connected to this switch. Depending on the
> +  contract established by the protocol given in the ``proto`` argument, the
> +  tagger-owned storage (``ds->tagger_data``) may be expected to contain a
> +  pointer to a data structure specific to the tagging protocol. This data
> +  structure may contain function pointers to packet handlers that the switch
> +  driver registers with the tagging protocol. If interested in these packets,
> +  the switch driver must cast the ``ds->tagger_data`` pointer to the data type
> +  established by the tagging protocol, and assign the packet handler function
> +  pointers to methods that it owns. Since the memory pointed to by
> +  ``ds->tagger_data`` is owned by the tagging protocol, the switch driver must
> +  assume by convention that it has been allocated, and this method is only
> +  provided for making initial adjustments to the contents of ``ds->tagger_data``.
> +  It is also the reason why no ``disconnect_tag_protocol()`` counterpart is
> +  provided. Additionally, a tagging protocol driver which makes use of
> +  tagger-owned storage must not assume that the connected switch has
> +  implemented the ``connect_tag_protocol()`` method (it may connect to a
> +  ``dsa_loop`` switch, which does not). Therefore, a tagging protocol may
> +  always rely on ``ds->tagger_data``, but it must treat the packet handlers
> +  provided by the switch in this method as optional.
> +
>  - ``setup``: setup function for the switch, this function is responsible for setting
>    up the ``dsa_switch_ops`` private structure with all it needs: register maps,
>    interrupts, mutexes, locks, etc. This function is also expected to properly
> -- 
> 2.34.1
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 2/4] docs: net: dsa: update platform_data documentation
  2023-12-08 19:35 ` [PATCH net 2/4] docs: net: dsa: update platform_data documentation Vladimir Oltean
  2023-12-08 22:19   ` Linus Walleij
  2023-12-08 22:33   ` Florian Fainelli
@ 2023-12-10 13:37   ` Alvin Šipraga
  2 siblings, 0 replies; 24+ messages in thread
From: Alvin Šipraga @ 2023-12-10 13:37 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev@vger.kernel.org, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Andrew Lunn, Florian Fainelli,
	Luiz Angelo Daros de Luca, Madhuri Sripada, Marcin Wojtas,
	Linus Walleij, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 08, 2023 at 09:35:16PM +0200, Vladimir Oltean wrote:
> We were documenting a bunch of stuff which was removed in commit
> 93e86b3bc842 ("net: dsa: Remove legacy probing support"). There's some
> further cleanup to do in struct dsa_chip_data, so rather than describing
> every possible field (when maybe we should be switching to kerneldoc
> format), just say what's important about it.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>

> ---
>  Documentation/networking/dsa/dsa.rst | 23 +++++++++++------------
>  1 file changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
> index 0c326a42eb81..676c92136a0e 100644
> --- a/Documentation/networking/dsa/dsa.rst
> +++ b/Documentation/networking/dsa/dsa.rst
> @@ -413,18 +413,17 @@ PHYs, external PHYs, or even external switches.
>  Data structures
>  ---------------
>  
> -DSA data structures are defined in ``include/net/dsa.h`` as well as
> -``net/dsa/dsa_priv.h``:
> -
> -- ``dsa_chip_data``: platform data configuration for a given switch device,
> -  this structure describes a switch device's parent device, its address, as
> -  well as various properties of its ports: names/labels, and finally a routing
> -  table indication (when cascading switches)
> -
> -- ``dsa_platform_data``: platform device configuration data which can reference
> -  a collection of dsa_chip_data structures if multiple switches are cascaded,
> -  the conduit network device this switch tree is attached to needs to be
> -  referenced
> +DSA data structures are defined in ``include/linux/platform_data/dsa.h``,
> +``include/net/dsa.h`` as well as ``net/dsa/dsa_priv.h``:
> +
> +- ``dsa_chip_data``: platform data configuration for a given switch device.
> +  Most notably, it is necessary to the DSA core because it holds a reference to
> +  the conduit interface. It must be accessible through the
> +  ``ds->dev->platform_data`` pointer at ``dsa_register_switch()`` time. It is
> +  populated by board-specific code. The hardware switch driver may also have
> +  its own portion of ``platform_data`` description. In that case,
> +  ``ds->dev->platform_data`` can point to a switch-specific structure, which
> +  encapsulates ``struct dsa_chip_data`` as its first element.
>  
>  - ``dsa_switch_tree``: structure assigned to the conduit network device under
>    ``dsa_ptr``, this structure references a dsa_platform_data structure as well as
> -- 
> 2.34.1
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation
  2023-12-08 19:35 ` [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation Vladimir Oltean
  2023-12-08 22:22   ` Linus Walleij
  2023-12-08 22:36   ` Florian Fainelli
@ 2023-12-10 13:48   ` Alvin Šipraga
  2023-12-11 14:35     ` Vladimir Oltean
  2 siblings, 1 reply; 24+ messages in thread
From: Alvin Šipraga @ 2023-12-10 13:48 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev@vger.kernel.org, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Andrew Lunn, Florian Fainelli,
	Luiz Angelo Daros de Luca, Madhuri Sripada, Marcin Wojtas,
	Linus Walleij, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Fri, Dec 08, 2023 at 09:35:17PM +0200, Vladimir Oltean wrote:
> There are people who are trying to push the ds->user_mii_bus feature
> past its sell-by date. I think part of the problem is the fact that the
> documentation presents it as this great functionality.
> 
> Adapt it to 2023, where we have phy-handle to render it useless, at
> least with OF.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
>  Documentation/networking/dsa/dsa.rst | 36 ++++++++++++++++++++++------
>  1 file changed, 29 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
> index 676c92136a0e..2cd91358421e 100644
> --- a/Documentation/networking/dsa/dsa.rst
> +++ b/Documentation/networking/dsa/dsa.rst
> @@ -397,19 +397,41 @@ perspective::
>  User MDIO bus
>  -------------
>  
> -In order to be able to read to/from a switch PHY built into it, DSA creates an
> -user MDIO bus which allows a specific switch driver to divert and intercept
> -MDIO reads/writes towards specific PHY addresses. In most MDIO-connected
> -switches, these functions would utilize direct or indirect PHY addressing mode
> -to return standard MII registers from the switch builtin PHYs, allowing the PHY
> -library and/or to return link status, link partner pages, auto-negotiation
> -results, etc.
> +The framework creates an MDIO bus for user ports (``ds->user_mii_bus``) when
> +both methods ``ds->ops->phy_read()`` and ``ds->ops->phy_write()`` are present.
> +However, this pointer may also be populated by the switch driver during the
> +``ds->ops->setup()`` method, with an MDIO bus managed by the driver.
> +
> +Its role is to permit user ports to connect to a PHY (usually internal) when
> +the more general ``phy-handle`` property is unavailable (either because the
> +MDIO bus is missing from the OF description, or because probing uses
> +``platform_data``).
> +
> +In most MDIO-connected switches, these functions would utilize direct or
> +indirect PHY addressing mode to return standard MII registers from the switch
> +builtin PHYs, allowing the PHY library and/or to return link status, link
> +partner pages, auto-negotiation results, etc.
>  
>  For Ethernet switches which have both external and internal MDIO buses, the
>  user MII bus can be utilized to mux/demux MDIO reads and writes towards either
>  internal or external MDIO devices this switch might be connected to: internal
>  PHYs, external PHYs, or even external switches.
>  
> +When using OF, the ``ds->user_mii_bus`` can be seen as a legacy feature, rather
> +than core functionality. Since 2014, the DSA OF bindings support the
> +``phy-handle`` property, which is a universal mechanism to reference a PHY,
> +be it internal or external.
> +
> +New switch drivers are encouraged to require the more universal ``phy-handle``
> +property even for user ports with internal PHYs. This allows device trees to
> +interoperate with simpler variants of the drivers such as those from U-Boot,
> +which do not have the (redundant) fallback logic for ``ds->user_mii_bus``.

Considering this policy, should we not emphasize that ds->user_mii_bus
and ds->ops->phy_{read,write}() ought to be left unpopulated by new
drivers, with the remark that if a driver wants to set up an MDIO bus,
it should store the corresponding struct mii_bus pointer in its own
driver private data? Just to make things crystal clear.

Regardless I think this is good!

Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>

> +
> +The only use case for ``ds->user_mii_bus`` in new drivers would be for probing
> +on non-OF through ``platform_data``. In the distant future where this will be
> +possible through software nodes, there will be no need for ``ds->user_mii_bus``
> +in new drivers at all.
> +
>  Data structures
>  ---------------
>  
> -- 
> 2.34.1
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation
  2023-12-10 13:48   ` Alvin Šipraga
@ 2023-12-11 14:35     ` Vladimir Oltean
  2023-12-13  5:30       ` Luiz Angelo Daros de Luca
  0 siblings, 1 reply; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-11 14:35 UTC (permalink / raw)
  To: Alvin Šipraga
  Cc: netdev@vger.kernel.org, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Andrew Lunn, Florian Fainelli,
	Luiz Angelo Daros de Luca, Madhuri Sripada, Marcin Wojtas,
	Linus Walleij, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Sun, Dec 10, 2023 at 01:48:12PM +0000, Alvin Šipraga wrote:
> On Fri, Dec 08, 2023 at 09:35:17PM +0200, Vladimir Oltean wrote:
> > +When using OF, the ``ds->user_mii_bus`` can be seen as a legacy feature, rather
> > +than core functionality. Since 2014, the DSA OF bindings support the
> > +``phy-handle`` property, which is a universal mechanism to reference a PHY,
> > +be it internal or external.
> > +
> > +New switch drivers are encouraged to require the more universal ``phy-handle``
> > +property even for user ports with internal PHYs. This allows device trees to
> > +interoperate with simpler variants of the drivers such as those from U-Boot,
> > +which do not have the (redundant) fallback logic for ``ds->user_mii_bus``.
> 
> Considering this policy, should we not emphasize that ds->user_mii_bus
> and ds->ops->phy_{read,write}() ought to be left unpopulated by new
> drivers, with the remark that if a driver wants to set up an MDIO bus,
> it should store the corresponding struct mii_bus pointer in its own
> driver private data? Just to make things crystal clear.
> 
> Regardless I think this is good!
> 
> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>

I think something that makes a limited amount of sense is for DSA to
probe on OF, but not describe the MDIO controller in OF. Then, you'd
need ds->user_mii_bus. But new drivers should probably not do that
either; they should look into the MFD model and make the MDIO controller
be separate from (not a child of) the DSA switch. Then use a phy-handle
to it. So for new drivers, even this doesn't make too much sense, and
neither is it best to allocate the mii_bus from driver private code.

What makes no sense whatsoever is commit fe7324b93222 ("net: dsa:
OF-ware slave_mii_bus"). Because DSA provides ds->user_mii_bus to do
something reasonable when the MDIO controller isn't described in OF,
but this change assumes that it _is_ described in OF!

I'm not sure how and where to best put in words "let's not make DSA a
library for everything, just keep it for the switch". I'll think about
it some more.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas
  2023-12-09  1:58     ` Vladimir Oltean
@ 2023-12-11 17:29       ` Florian Fainelli
  0 siblings, 0 replies; 24+ messages in thread
From: Florian Fainelli @ 2023-12-11 17:29 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Andrew Lunn, Luiz Angelo Daros de Luca,
	Alvin Šipraga, Madhuri Sripada, Marcin Wojtas,
	Linus Walleij, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On 12/8/23 17:58, Vladimir Oltean wrote:
> On Fri, Dec 08, 2023 at 03:03:35PM -0800, Florian Fainelli wrote:
>>> +Probing through ``platform_data`` remains limited in functionality. The
>>> +``ds->dev->of_node`` and ``dp->dn`` pointers are NULL, and the OF API calls
>>> +made by drivers for discovering more complex setups fall back to the implicit
>>> +handling. There is no way to describe multi-chip trees, or switches with
>>> +multiple CPU ports. It is always assumed that shared ports are configured by
>>> +the driver to the maximum supported link speed (they do not use phylink).
>>> +User ports cannot connect to arbitrary PHYs, but are limited to
>>> +``ds->user_mii_bus``.
>>
>> Maybe a mention here that this implies built-in/internal PHY devices only,
>> just as a way to re-iterate the limitation and to echo to the previous
>> patch?
> 
> I am not fully convinced that saying "user_mii_bus can only access internal PHYs"
> only would be correct. This paragraph also exists in the user MDIO bus section:
> 
> For Ethernet switches which have both external and internal MDIO buses, the
> user MII bus can be utilized to mux/demux MDIO reads and writes towards either
> internal or external MDIO devices this switch might be connected to: internal
> PHYs, external PHYs, or even external switches.

OK, so what you have put is good enough, no need to add more that would 
only be a repeat of the user_mii_bus description (with the risk of the 
two being out of sync eventually).

> 
>>> +
>>> +Many switch drivers introduced since after DSA's second OF binding were not
>>> +designed to support probing through ``platform_data``. Most notably,
>>> +``device_get_match_data()`` and ``of_device_get_match_data()`` return NULL with
>>> +``platform_data``, so generally, drivers which do not have alternative
>>> +mechanisms for this do not support ``platform_data``.
>>> +
>>> +Extending the ``platform_data`` support implies adding more separate code.
>>> +An alternative worth exploring is growing DSA towards the ``fwnode`` API.
>>> +However, not the entire OF binding should be generalized to ``fwnode``.
>>> +The current bindings must be examined with a critical eye, and the properties
>>> +which are no longer considered good practice (like ``label``, because ``udev``
>>> +offers this functionality) should first be deprecated in OF, and not migrated
>>> +to ``fwnode``.
>>> +
>>> +With ``fwnode`` support in the DSA framework, the ``fwnode_create_software_node()``
>>> +API could be used as an alternative to ``platform_data``, to allow describing
>>> +and probing switches on non-OF.
>>
>> Might suggest to move the 3 paragraphs towards the end because otherwise it
>> might be a distraction for the reader who might think: ah that's it? no more
>> technical details!? Looks like Linus made the same suggestion in his review.
> 
> I think it needs even more rethinking than that. I now remembered that
> we also have a "Design limitations" section where the future work can go.
> 
> It's hard to navigate through what is now a 1400 line document and not
> get lost.

Agreed.

> 
> I'm hoping I could move the documentation of variables and methods that
> now sits in "Driver development" into kdoc comments inline with the code,
> to reduce the clutter a bit.
> 
> But I don't know how to tackle this. Should documentation changes go to
> "net" or to "net-next"? I targeted this for "net" as a documentation-only
> change set. But if I start adding kdocs, it won't be so clear-cut anymore...

Personal preference is that documentation changes should always target 
'net' (unless they document a 'net-next' change obviously) because 
accurate documentation is most important than anything.

> 
>>> +Simplifying the device tree bindings to require a single ``link`` phandle
>>> +cannot be done without rethinking the distributed probing scheme. One idea is
>>> +to reinstate the switch tree as a platform device, but this time created
>>> +dynamically by ``dsa_register_switch()`` if the switch's tree ID is not already
>>> +present in the system. The switch tree driver walks the device tree hop by hop,
>>> +following the ``link`` references, to discover all the other switches, and to
>>> +construct the full routing table. It then uses the component API to register
>>> +itself as an aggregate driver, with each of the discovered switches as a
>>> +component. When ``dsa_register_switch()`` completes for all component switches,
>>> +the tree probing continues and calls ``dsa_tree_setup()``.
>>
>> An interesting paragraph, but I am not sure this is such a big pain point
>> that we should be spending much description of it, especially since it
>> sounds like this is solved, but could be improved, but in the grand scheme
>> of things, should we really be spending any time on it?
>>
>> Same-vendor cascade configurations are Marvell specific, and different
>> vendor cascades require distinct switch trees, therefore do not really fall
>> into the cross-chip design anymore. In a nutshell, cross-chip is very very
>> niche and limited.
> 
> Well, I've been contacted by somebody to help with a custom board with 3
> daisy chained SJA1105 switches. He is doing the testing for me, and I'm
> waiting for the results to come back. I'm currently waiting for an uprev
> to an NXP BSP on top of 5.15 to be finalized, so that patches developed
> over net-next are at least barely testable...
> 
> If you remember, the SJA1105 has these one-shot management routes which
> must be installed over SPI, and they decide where the next transmitted
> link-local packet goes.

Yes, it's coming back now.

> 
> Well, the driver only supports single-chip trees, as you say, because it
> only programs the management route in the targeted switch. With daisy
> chained switches, one needs to figure out the actual packet route from
> the CPU to the leaf user port, and install a management route for every
> switch along that route. Otherwise, intermediary switches won't know
> what to do with the packets, and drop them.
> 
> The specific request was: "help, PTP doesn't work!"
> 
> I did solve the problem, and the documentation paragraphs above are
> basically my development notes while examining the existing support
> and the way in which it isn't giving me the tools I need.
> 
> I do need to send a dt-bindings patch on this topic as well. The fact
> that we put all cascade links in the device tree means we don't know
> which one represents the direct connection to the neighbor cascade port,
> and which one is an indirect connection. We need to bake in an
> assumption, like for example "the first element of the 'link' phandle
> array is the direct connection". I hope retroactively doing that won't
> bother the device tree maintainers too much. If it does, the problem is
> intractable.

Yes I believe this is exactly how we had intended for the "link" 
property to be used. We should have indeed encoded that more precisely 
into the property.

> 
> But I agree, requirements for cross-chip support are rare, and with
> SJA1105 I don't even own a board where I can directly test it. I
> specifically bought the Turris MOX for that.
> 
>>> +To untangle this situation and improve the reliability of the cross-chip
>>> +management layer, it is necessary to split the DSA operations into ones which
>>> +can fail, and ones which cannot fail. For example, ``port_fdb_add()`` can fail,
>>> +whereas ``port_fdb_del()`` cannot. Then, cross-chip operations can take a
>>> +fallible function to make forward progress, and an infallible function for
>>> +rollback. However, it is unclear what to do in the case of ``change_mtu()``.
>>> +It is hard to classify this operation as either fallible or infallible. It is
>>> +also unclear how to deal with I/O access errors on the switch's management bus.
>>
>> How about something like this:
>>
>> I/O access errors occurring during the switch configuration should always be
>> logged for debugging but are very unlikely to be recoverable and therefore
>> require an investigation into the failure mechanism and root cause or
>> possible work around.
> 
> Yeah, I suppose.
> 
> What do you think, has something like phy_error() been a useful mechanism
> for anything? Or just a pain in the rear? Would it be useful to shut
> everything down on a bus I/O error, phylib style?

phy_error() has been somewhat helpful in knowing whether we have a hard 
to debug, possibly silent MDIO error lurking around. Because the PHY 
library polls or reacts to interrupts updating the link status has a 
very high probability of hitting a MDIO transaction error. This is 
similar to what we see with some of our fielded reports where most 
text/data corruptions occur in scheduler code paths... because the 
scheduler is very often executed. So you know you have an error, but you 
don't know yet why.

For a switch there can be a lot of different transactions, but being 
able to know where it fails precisely could be a lead as to where the 
failure is.
-- 
Floria


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation
  2023-12-11 14:35     ` Vladimir Oltean
@ 2023-12-13  5:30       ` Luiz Angelo Daros de Luca
  2023-12-13 12:06         ` Vladimir Oltean
  0 siblings, 1 reply; 24+ messages in thread
From: Luiz Angelo Daros de Luca @ 2023-12-13  5:30 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Alvin Šipraga, netdev@vger.kernel.org, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	Florian Fainelli, Madhuri Sripada, Marcin Wojtas, Linus Walleij,
	Tobias Waldekranz, Arun Ramadoss, Russell King (Oracle),
	Jonathan Corbet

> > > +When using OF, the ``ds->user_mii_bus`` can be seen as a legacy feature, rather
> > > +than core functionality. Since 2014, the DSA OF bindings support the
> > > +``phy-handle`` property, which is a universal mechanism to reference a PHY,
> > > +be it internal or external.
> > > +
> > > +New switch drivers are encouraged to require the more universal ``phy-handle``
> > > +property even for user ports with internal PHYs. This allows device trees to
> > > +interoperate with simpler variants of the drivers such as those from U-Boot,
> > > +which do not have the (redundant) fallback logic for ``ds->user_mii_bus``.
> >
> > Considering this policy, should we not emphasize that ds->user_mii_bus
> > and ds->ops->phy_{read,write}() ought to be left unpopulated by new
> > drivers, with the remark that if a driver wants to set up an MDIO bus,
> > it should store the corresponding struct mii_bus pointer in its own
> > driver private data? Just to make things crystal clear.
> >
> > Regardless I think this is good!
> >
> > Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
>
> I think something that makes a limited amount of sense is for DSA to
> probe on OF, but not describe the MDIO controller in OF. Then, you'd
> need ds->user_mii_bus. But new drivers should probably not do that
> either; they should look into the MFD model and make the MDIO controller
> be separate from (not a child of) the DSA switch. Then use a phy-handle
> to it. So for new drivers, even this doesn't make too much sense, and
> neither is it best to allocate the mii_bus from driver private code.
>
> What makes no sense whatsoever is commit fe7324b93222 ("net: dsa:
> OF-ware slave_mii_bus"). Because DSA provides ds->user_mii_bus to do
> something reasonable when the MDIO controller isn't described in OF,
> but this change assumes that it _is_ described in OF!
>
> I'm not sure how and where to best put in words "let's not make DSA a
> library for everything, just keep it for the switch". I'll think about
> it some more.

Hello Vladimir,

Sorry for my lack of understanding but I still didn't get it.

You are telling us we should not use user_mii_bus when the user MDIO
is described in the OF. Is it only about the "generic DSA mii" or also
about the custom ones the current drivers have? If it is the latter, I
don't know how to connect the dots between phy_read/write functions
and the port.

We have some drivers that define ds->user_mii_bus (not the "generic
DSA mii") with the MDIO described in OF. Are they wrong?

Alvin asked if we should store the mii_bus internally and not in the
ds->user_mii_bus but I don't think you answered it. Is it about where
we store the bus (for dropping user_mii_bus in the future)?

You now also mention using the MFD model (shouldn't it be cited in the
docs too?) but I couldn't identify a DSA driver example that uses that
model, with mdio outside the switch. Do we have one already? Would the
OF compatible with the MDF model be something like this?

my_mfd {
    compatible "aaa";
    switch {
        compatible = "bbb";
        ports {
            port@0: {
              phy-handle = <&ethphy0>;
            }
        }
    }
    mdio {
         compatible = "ccc";
         ethphy0: ethernet-phy@0 {
         }
    }
}

And for MDIO-connected switches, something like this?

eth0 {
     mdio {
         my_mfd {
            switch{...}
            mdio{...}
         }
     }
}

Is it only a suggestion on how to write a new driver or should we
change the existing ones to fit both models?

Regards,

Luiz

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation
  2023-12-13  5:30       ` Luiz Angelo Daros de Luca
@ 2023-12-13 12:06         ` Vladimir Oltean
  0 siblings, 0 replies; 24+ messages in thread
From: Vladimir Oltean @ 2023-12-13 12:06 UTC (permalink / raw)
  To: Luiz Angelo Daros de Luca
  Cc: Alvin Šipraga, netdev@vger.kernel.org,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Florian Fainelli, Madhuri Sripada, Marcin Wojtas,
	Linus Walleij, Tobias Waldekranz, Arun Ramadoss,
	Russell King (Oracle), Jonathan Corbet

On Wed, Dec 13, 2023 at 02:30:48AM -0300, Luiz Angelo Daros de Luca wrote:
> Hello Vladimir,
> 
> Sorry for my lack of understanding but I still didn't get it.
> 
> You are telling us we should not use user_mii_bus when the user MDIO
> is described in the OF. Is it only about the "generic DSA mii" or also
> about the custom ones the current drivers have? If it is the latter, I
> don't know how to connect the dots between phy_read/write functions
> and the port.

It's about both. It has to do with the role that ds->user_mii_bus has
(see more below), not about who allocates it.

When the user_mii_bus is allocated by the driver, ds->ops->phy_read()
and ds->ops->phy_write() are not needed. They are only needed for DSA
to implement the ops of the generic user_mii_bus - see dsa_user_mii_bus_init().

> We have some drivers that define ds->user_mii_bus (not the "generic
> DSA mii") with the MDIO described in OF. Are they wrong?

This right here is the core ds->user_mii_bus functionality:

	ret = phylink_of_phy_connect(dp->pl, port_dn, phy_flags);
	if (ret == -ENODEV && ds->user_mii_bus) {
		/* We could not connect to a designated PHY or SFP, so try to
		 * use the switch internal MDIO bus instead
		 */
		ret = dsa_user_phy_connect(user_dev, dp->index, phy_flags);
	}

So, the ds->user_mii_bus is only used if we cant't do phylink_of_phy_connect(),
which follows the "phy-handle" reference.

The reasons (that I can see) why we can't do phylink_of_phy_connect() are:
(a) port_dn is NULL (probing on platform_data and not OF)
(b) port_dn exists, but has no phy-handle
(c) port_dn exists and has a phy-handle to a PHY that doesn't respond
    (wrong address, ?!)

Out of those, it only makes practical sense to design for (a) and (b).

If the ds->user_mii_bus has an OF node, it means that we are not in case
(a). We are in case (b).

Normally to be in case (b), it means that the device tree looks like this:

	switch {
		ports {
			port@0 {
				reg = <0>;
			};
		};
	};

aka port@0 is a user port with an internal PHY, not described in OF.

But to combine case (b) with the additional constraint that ds->user_mii_bus
has an OF node means to have this device tree:

	switch {
		ports {
			port@0 {
				reg = <0>;
				// no phy-handle to <&phy0>
			};
		};

		mdio {
			phy0: ethernet-phy@0 {
			};
		};
	};

Which is simply a broken device tree which should not be like that [1].
If the MDIO bus appears in OF, then _all_ its PHYs must appear in OF [2].
And if all PHYs appear in OF, then you must have a phy-handle to
reference them.

[1] There exists an exception, which is the hack added by Florian in
    commit 771089c2a485 ("net: dsa: bcm_sf2: Ensure that MDIO diversion
    is used"). There, he starts with a valid phy-handle in the device
    tree, but the DSA driver removes it. This makes phylink_of_phy_connect()
    fail, and "diverts" the code to dsa_user_phy_connect(), where the
    mii_bus read and write ops are in control of the DSA driver. Hence
    the name "diversion to ds->user_mii_bus". It's a huge hack, make no
    mistake about it.

[2] This is because __of_mdiobus_register() does this:

	/* Mask out all PHYs from auto probing.  Instead the PHYs listed in
	 * the device tree are populated after the bus has been registered */
	mdio->phy_mask = ~0;

> Alvin asked if we should store the mii_bus internally and not in the
> ds->user_mii_bus but I don't think you answered it. Is it about where
> we store the bus (for dropping user_mii_bus in the future)?

The ds->user_mii_bus is not just a dropbox, a pointer for random storage
that the DSA core generously offers... It would have had a void * type
if it was that, and DSA wouldn't have used it.

When a driver populates ds->user_mii_bus, it opts into its core functionality,
which I just described above. If you don't need it, don't use it. It's
as simple as that. Use your own private pointer to a struct mii_bus,
which stays out of dsa_user_phy_connect()'s business.

It's very unlikely that ds->user_mii_bus will be dropped though, unless
we resolve all other situations that need dsa_user_phy_connect() in some
other way. One of those situations is case (b) described above, aka
device trees with no phy-handle, which we don't want to break (for the
old drivers where that used to work).

I cannot make a blanket comment on whether all drivers that populate
ds->user_mii_bus with an OF-aware MDIO bus "are wrong". Probably out
of sheer ignorance, they connected and tangled together 2 logically
isolated code paths by using ds->user_mii_bus as dumb storage.

What was written carelessly now takes an expert to untangle. You now
have as much information as I do, so you can judge for yourself whether
the behavior given by dsa_user_phy_connect() is needed. My only ask is
to stop proliferating this monkey-see-monkey-do. If, on top of that,
we could eliminate the gratuitous uses of ds->user_mii_bus as plain
storage, that would be just great.

> 
> You now also mention using the MFD model (shouldn't it be cited in the
> docs too?) but I couldn't identify a DSA driver example that uses that
> model, with mdio outside the switch. Do we have one already? Would the
> OF compatible with the MDF model be something like this?

I did mention it already in the docs.

"But an internal microprocessor may have a very different view of the switch
address space (which is MMIO), and may have discrete Linux drivers for each
peripheral. In 2023, the ``ocelot_ext`` driver was added, which deviated from
the traditional DSA driver architecture. Rather than probing on the entire
``spi_device``, it created a multi-function device (MFD) top-level driver for
it (associated with the SoC at large), and the switching IP is only one of the
children of the MFD (it is a platform device with regmaps provided by its
parent). The drivers for each peripheral in this switch SoC are the same when
controlled over SPI and when controlled by the internal processor."

> 
> my_mfd {
>     compatible "aaa";
>     switch {
>         compatible = "bbb";
>         ports {
>             port@0: {
>               phy-handle = <&ethphy0>;
>             }
>         }
>     }
>     mdio {
>          compatible = "ccc";
>          ethphy0: ethernet-phy@0 {
>          }
>     }
> }
> 
> And for MDIO-connected switches, something like this?
> 
> eth0 {
>      mdio {
>          my_mfd {
>             switch{...}
>             mdio{...}
>          }
>      }
> }

Follow the clue given by the "ocelot_ext" reference. Search for the "mscc,vsc7512-switch"
compatible string, which leads you to the Documentation/devicetree/bindings/mfd/mscc,ocelot.yaml
which is the schema for the entire SPI-connected SoC.

> 
> Is it only a suggestion on how to write a new driver or should we
> change the existing ones to fit both models?

First and foremost it is for new drivers. Read the actual patch:

"Authors of new switch drivers that use DSA are encouraged to have a wider view
of the peripherals on the chip that they are controlling, and to use the MFD
model to separate the components whenever possible. The general direction for
the DSA core is to focus only on the Ethernet switching IP and its ports.
``CONFIG_NET_DSA_HWMON`` was removed in 2017. Adding new methods to ``struct
dsa_switch_ops`` which are outside of DSA's core focus on Ethernet is strongly
discouraged."

For changing existing drivers, on one hand I would be glad to see effort
put there, but on the other, I need to be realistic and say that I also
tried to convert the sja1105 driver to the MFD model, and it's really,
really hard to be compatible with both device tree formats. So I'm not
holding my breath that I'll ever see conversion patches.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2023-12-13 12:07 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-08 19:35 [PATCH net 0/4] Add some history to the DSA documentation Vladimir Oltean
2023-12-08 19:35 ` [PATCH net 1/4] docs: net: dsa: document the tagger-owned storage mechanism Vladimir Oltean
2023-12-08 22:15   ` Linus Walleij
2023-12-08 22:32   ` Florian Fainelli
2023-12-10 13:37   ` Alvin Šipraga
2023-12-08 19:35 ` [PATCH net 2/4] docs: net: dsa: update platform_data documentation Vladimir Oltean
2023-12-08 22:19   ` Linus Walleij
2023-12-09  0:52     ` Vladimir Oltean
2023-12-08 22:33   ` Florian Fainelli
2023-12-10 13:37   ` Alvin Šipraga
2023-12-08 19:35 ` [PATCH net 3/4] docs: net: dsa: update user MDIO bus documentation Vladimir Oltean
2023-12-08 22:22   ` Linus Walleij
2023-12-08 22:36   ` Florian Fainelli
2023-12-09  1:22     ` Vladimir Oltean
2023-12-09 21:49       ` Linus Walleij
2023-12-10 13:48   ` Alvin Šipraga
2023-12-11 14:35     ` Vladimir Oltean
2023-12-13  5:30       ` Luiz Angelo Daros de Luca
2023-12-13 12:06         ` Vladimir Oltean
2023-12-08 19:35 ` [PATCH net 4/4] docs: net: dsa: replace TODO section with info about history and devel ideas Vladimir Oltean
2023-12-08 22:40   ` Linus Walleij
2023-12-08 23:03   ` Florian Fainelli
2023-12-09  1:58     ` Vladimir Oltean
2023-12-11 17:29       ` Florian Fainelli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox