All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukasz Majewski <lukma@denx.de>
To: Vladimir Oltean <olteanv@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>,
	"David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Madalin Bucur <madalin.bucur@oss.nxp.com>,
	Nicolas Ferre <nicolas.ferre@microchip.com>,
	Joakim Zhang <qiangqing.zhang@nxp.com>,
	Florian Fainelli <f.fainelli@gmail.com>,
	netdev@vger.kernel.org, Arnd Bergmann <arnd@arndb.de>,
	Mark Einon <mark.einon@gmail.com>,
	NXP Linux Team <linux-imx@nxp.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC 2/3] net: Provide switchdev driver for NXP's More Than IP L2 switch
Date: Tue, 29 Jun 2021 14:01:04 +0200	[thread overview]
Message-ID: <20210629140104.70a3da1a@ktm> (raw)
In-Reply-To: <20210629093055.x5pvcebk4y4f6nem@skbuf>

[-- Attachment #1: Type: text/plain, Size: 8967 bytes --]

Hi Vladimir,

> On Tue, Jun 29, 2021 at 10:09:37AM +0200, Lukasz Majewski wrote:
> > Hi Vladimir,
> >  
> > > On Mon, Jun 28, 2021 at 04:13:14PM +0200, Lukasz Majewski wrote:  
> > > > > > > So before considering merging your changes, i would like
> > > > > > > to see a usable binding.
> > > > > > >
> > > > > > > I also don't remember seeing support for STP. Without
> > > > > > > that, your network has broadcast storm problems when
> > > > > > > there are loops. So i would like to see the code needed
> > > > > > > to put ports into blocking, listening, learning, and
> > > > > > > forwarding states.
> > > > > > >
> > > > > > > 	  Andrew  
> > > > >
> > > > > I cannot stress enough how important it is for us to see STP
> > > > > support and consequently the ndo_start_xmit procedure for
> > > > > switch ports.  
> > > >
> > > > Ok.
> > > >  
> > > > > Let me see if I understand correctly. When the switch is
> > > > > enabled, eth0 sends packets towards both physical switch
> > > > > ports, and eth1 sends packets towards none, but eth0 handles
> > > > > the link state of switch port 0, and eth1 handles the link
> > > > > state of switch port 1?  
> > > >
> > > > Exactly, this is how FEC driver is utilized for this switch.  
> > >
> > > This is a much bigger problem than anything which has to do with
> > > code organization. Linux does not have any sort of support for
> > > unmanaged switches.  
> >
> > My impression is similar. This switch cannot easily fit into DSA
> > (lack of appending tags)  
> 
> No, this is not why the switch does not fit the DSA model.
> DSA assumes that the master interface and the switch are two
> completely separate devices which manage themselves independently.
> Their boundary is typically at the level of a MAC-to-MAC connection,
> although vendors have sometimes blurred this line a bit in the case
> of integrated switches. But the key point is that if there are 2
> external ports going to the switch, these should be managed by the
> switch driver. But when the switch is sandwiched between the Ethernet
> controller of the "DSA master" (the DMA engine of fec0) and the DSA
> master's MAC (still owned by fec), the separation isn't quite what
> DSA expects, is it? Remember that in the case of the MTIP switch, the
> fec driver needs to put the MACs going to the switch in promiscuous
> mode such that the switch behaves as a switch and actually forwards
> packets by MAC DA instead of dropping them. So the system is much
> more tightly coupled.
> 
>  +---------------------------------------------------------------------------+
>  |
>        | | +--------------+        +--------------------+--------+
>   +------------+ | |              |        | MTIP switch        |
>    |      |            | | |   fec 1 DMA  |---x    |
>   | Port 2 |------| fec 1 MAC  | | |              |        |
>   \  /    |        |      |            | | +--------------+        |
>            \/     +--------+      +------------+ |
>      |             /\              |                   | |
> +--------------+        +--------+   /  \    +--------+
> +------------+ | |              |        |        |           |
>  |      |            | | |   fec 0 DMA  |--------| Port 0 |
> | Port 1 |------| fec 0 MAC  | | |              |        |        |
>         |        |      |            | | +--------------+
> +--------+-----------+--------+      +------------+ |
>                                                           |
> +---------------------------------------------------------------------------+
> 
> Is this DSA? I don't really think so, but you could still try to argue
> otherwise.
> 
> The opposite is also true. DSA supports switches that don't append
> tags to packets (see sja1105). This doesn't make them "less DSA",
> just more of a pain to work with.
> 
> > nor to switchdev.
> >
> > The latter is caused by two modes of operation:
> >
> > - Bypass mode (no switch) -> DMA1 and DMA0 are used
> > - Switch mode -> only DMA0 is used
> >
> >
> > Moreover, from my understanding of the CPSW - looks like it uses
> > always just a single DMA, and the switching seems to be the default
> > operation for two ethernet ports.
> >
> > The "bypass mode" from NXP's L2 switch seems to be achieved inside
> > the CPSW switch, by configuring it to not pass packets between
> > those ports.  
> 
> I don't exactly see the point you're trying to make here. At the end
> of the day, the only thing that matters is what you expose to the
> user. With no way (when the switch is enabled) for a socket opened on
> eth0 to send/receive packets coming only from the first port, and a
> socket opened on eth1 to send/receive packets coming only from the
> second port, I think this driver attempt is a pretty far cry from
> what a switch driver in Linux is expected to offer, be it modeled as
> switchdev or DSA.
> 
> > > Please try to find out if your switch is supposed to be able
> > > to be managed (run control protocols on the CPU).  
> >
> > It can support all the "normal" set of L2 switch features:
> >
> > - VLANs, lookup table (with learning), filtering and forwarding
> >   (Multicast, Broadcast, Unicast), priority queues, IP snooping,
> > etc.
> >
> > Frames for BPDU are recognized by the switch and can be used to
> > implement support for RSTP. However, this switch has a separate
> > address space (not covered and accessed by FEC address).
> >  
> > > If not, well, I
> > > don't know what to suggest.  
> >
> > For me it looks like the NXP's L2 switch shall be treated _just_ as
> > offloading IP block to accelerate switching (NXP already support
> > dpaa[2] for example).
> >
> > The idea with having it configured on demand, when:
> > ip link add name br0 type bridge; ip link set br0 up;
> > ip link set eth0 master br0;
> > ip link set eth1 master br0;
> >
> > Seems to be a reasonable one. In the above scenario it would work
> > hand by hand with FEC drivers (as those would handle PHY
> > communication setup and link up/down events).  
> 
> You seem to imply that we are suggesting something different.
> 
> > It would be welcome if the community could come up with some rough
> > idea how to proceed with this IP block support  
> 
> Ok, so what I would do if I really cared that much about mainline
> support is I would refactor the FEC driver to offer its core
> functionality to a new multi-port driver that is able to handle the
> FEC DMA interfaces, the MACs and the switch. EXPORT_SYMBOL_GPL is your
> friend.
> 
> This driver would probe on a device tree binding with 3 "reg" values:
> 1 for the fec@800f0000, 1 for the fec@800f4000 and 1 for the
> switch@800f8000. No puppet master driver which coordinates other
> drivers, just a single driver that, depending on the operating state,
> manages all the SoC resources in a way that will offer a sane and
> consistent view of the Ethernet ports.
> 
> So it will have a different .ndo_start_xmit implementation depending
> on whether the switch is bypassed or not (if you need to send a
> packet on eth1 and the switch is bypassed, you send it through the
> DMA interface of eth1, otherwise you send it through the DMA
> interface of eth0 in a way in which the switch will actually route it
> to the eth1 physical port).
> 
> Then I would implement support for BPDU RX/TX (I haven't looked at the
> documentation, but I expect that what this switch offers for control
> traffic doesn't scale at high speeds (if it does, great, then send and
> receive all your packets as control packets, to have precise port
> identification). If it doesn't, you'll need a way to treat your data
> plane packets differently from the control plane packets. For the data
> plane, you can perhaps borrow some ideas from net/dsa/tag_8021q.c, or
> even from Tobias Waldekranz's proposal to just let data plane packets
> coming from the bridge slide into the switch with no precise control
> of the destination port at all, just let the switch perform FDB
> lookups for those packets because the switch hardware FDB is supposed
> to be more or less in sync with the bridge software FDB:
> https://patchwork.kernel.org/project/netdevbpf/cover/20210426170411.1789186-1-tobias@waldekranz.com/
> 

Thanks for sketching and sharing such detailed plan. 

> > (especially that for example imx287 is used in many embedded devices
> > and is going to be in active production for next 10+ years).  
> 
> Well, I guess you have a plan then. There are still 10+ years left to
> enjoy the benefits of a proper driver design...

:-)


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2021-06-29 12:01 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-22 14:41 [RFC 0/3] net: imx: Provide support for L2 switch as switchdev accelerator Lukasz Majewski
2021-06-22 14:41 ` [RFC 1/3] ARM: dts: imx28: Add description for L2 switch on XEA board Lukasz Majewski
2021-06-22 14:45   ` Andrew Lunn
2021-06-22 20:51     ` Lukasz Majewski
2021-06-23 13:17       ` Andrew Lunn
2021-06-23 15:26         ` Lukasz Majewski
2021-06-24  0:36           ` Florian Fainelli
2021-06-24  2:19             ` Joakim Zhang
2021-06-24 11:21               ` Lukasz Majewski
2021-06-25  8:28                 ` Joakim Zhang
2021-06-25 10:18                   ` Lukasz Majewski
2021-06-24 11:03             ` Lukasz Majewski
2021-06-22 14:41 ` [RFC 2/3] net: Provide switchdev driver for NXP's More Than IP L2 switch Lukasz Majewski
2021-06-22 15:03   ` Andrew Lunn
2021-06-23 11:37     ` Lukasz Majewski
2021-06-23 20:01       ` Andrew Lunn
2021-06-24 10:53         ` Lukasz Majewski
2021-06-24 13:34           ` Andrew Lunn
2021-06-24 14:35             ` Lukasz Majewski
2021-06-24 16:11               ` Andrew Lunn
2021-06-25  9:59                 ` Lukasz Majewski
2021-06-25 14:40                   ` Andrew Lunn
2021-06-28 12:05                     ` Lukasz Majewski
2021-06-28 12:48                       ` Vladimir Oltean
2021-06-28 14:13                         ` Lukasz Majewski
2021-06-28 14:23                           ` Vladimir Oltean
2021-06-29  8:09                             ` Lukasz Majewski
2021-06-29  9:30                               ` Vladimir Oltean
2021-06-29 12:01                                 ` Lukasz Majewski [this message]
2021-06-28 13:23                       ` Andrew Lunn
2021-06-28 14:14                         ` Lukasz Majewski
2021-06-22 14:41 ` [RFC 3/3] net: imx: Adjust fec_main.c to provide support for " Lukasz Majewski
2021-06-22 15:10   ` Andrew Lunn
2021-06-23  7:48     ` Lukasz Majewski
2021-06-25 22:04 ` [RFC 0/3] net: imx: Provide support for L2 switch as switchdev accelerator Vladimir Oltean
2021-06-28  9:41   ` Lukasz Majewski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210629140104.70a3da1a@ktm \
    --to=lukma@denx.de \
    --cc=andrew@lunn.ch \
    --cc=arnd@arndb.de \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linux-imx@nxp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=madalin.bucur@oss.nxp.com \
    --cc=mark.einon@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=nicolas.ferre@microchip.com \
    --cc=olteanv@gmail.com \
    --cc=qiangqing.zhang@nxp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.