From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiri Pirko
Subject: Re: [patch net-next RFC 0/4] introduce infrastructure for support of switch chip datapath
Date: Fri, 28 Mar 2014 07:28:27 +0100
Message-ID: <20140328062827.GB2805@minipsycho.orion>
References: <20140326173536.GJ2869@minipsycho.orion>
	<20140326181436.GL2869@minipsycho.orion>
	<53334BDA.1060608@mojatatu.com>
	<20140327065616.GA2845@minipsycho.orion>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Sergey Ryazanov , Jamal Hadi Salim , Roopa Prabhu , Neil Horman ,
	Thomas Graf , netdev , David Miller , Andy Gospodarek , dborkman ,
	ogerlitz , jesse , pshelar , azhou , Ben Hutchings ,
	Stephen Hemminger , jeffrey.t.kirsher@intel.com, vyasevic ,
	Cong Wang , John Fastabend , Eric Dumazet , Scott Feldman ,
	Lennert Buytenhek , Shrijeet Mukherjee , Felix Fietkau
To: Florian Fainelli
Return-path:
Received: from mail-ee0-f52.google.com ([74.125.83.52]:33963 "EHLO
	mail-ee0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750946AbaC1G2b (ORCPT ); Fri, 28 Mar 2014 02:28:31 -0400
Received: by mail-ee0-f52.google.com with SMTP id e49so3619984eek.39
	for ; Thu, 27 Mar 2014 23:28:30 -0700 (PDT)
Content-Disposition: inline
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

Thu, Mar 27, 2014 at 10:20:06PM CET, f.fainelli@gmail.com wrote:
>2014-03-27 13:32 GMT-07:00 Sergey Ryazanov :
>> 2014-03-27 20:41 GMT+04:00 Florian Fainelli :
>>> 2014-03-27 7:10 GMT-07:00 Sergey Ryazanov :
>>>> Hi all,
>>>>
>>>> sorry for the intrusion, but let me place my 2 cents.
>>>>
>>>> 2014-03-27 10:56 GMT+04:00 Jiri Pirko :
>>>>> Wed, Mar 26, 2014 at 11:22:51PM CET, f.fainelli@gmail.com wrote:
>>>>>>2014-03-26 14:51 GMT-07:00 Jamal Hadi Salim :
>>>>>>> On 03/26/14 14:14, Jiri Pirko wrote:
>>>>>>>>
>>>>>>>> Wed, Mar 26, 2014 at 06:58:32PM CET, f.fainelli@gmail.com wrote:
>>>>>>>>>
>>>>>>>>> 2014-03-26 10:35 GMT-07:00 Jiri Pirko :
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>> You are right, sw1p0 and sw1p1 were meant to be, say LAN ports in my
>>>>>>>>> example.
>>>>>>>>>
>>>>>>>>> I think there is an implicit convention that sw1 represents the
>>>>>>>>> Ethernet switch port connected to the CPU Ethernet MAC, and that it is
>>>>>>>>> always connected, hence there is no need to create a "fake" bridge to
>>>>>>>>> link sw1 to eth0 for instance?
>>>>>>>>
>>>>>>>>
>>>>>>>> I think you are kind of mixing apples and oranges (or I might not be
>>>>>>>> understanding you correctly).
>>>>>>>> This is how I see it, sticking to the names you use in the example:
>>>>>>>>
>>>>>>>>            (sw1) (abstract place-holder netdev)
>>>>>>>>          --------
>>>>>>>>         switch chip                   CPU
>>>>>>>>   -----------------------            ------
>>>>>>>>   sw1p0 sw1p1 sw1p2 sw1p3             eth0
>>>>>>>>     |     |     |     |                |
>>>>>>>>    PHY   PHY   PHY    ------someMII-----
>>>>>>>>
>>>>>>>> You see that eth0 is the CPU part of the "connection" and sw1p3 is the
>>>>>>>> switch part (port representation).
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Florian - I am sure you explained this before; I just don't remember. Why
>>>>>>> is there need to expose eth0? It seems to me sw1p0-3 are abstracted
>>>>>>> already in the kernel and the "cpu port" is merely a control interface.
>>>>>>
>>>>>>eth0 corresponds to a CPU Ethernet MAC facing e.g: sw1p3 switch port.
>>>>>>It is a "regular" Ethernet driver connected to the switch without
>>>>>>switch-specific logic.
>>>>>>The goal is twofold:
>>>>>>
>>>>>>- allow any regular Ethernet driver to be connected to an external
>>>>>>switch via e.g: MDIO/MDC or other without specific switch knowledge
>>>>>>- accurately represent how the hardware is designed/connected
>>>>>>
>>>>>>but maybe, we can simplify and have e.g: sw1p3 and eth0 be the same interface...
>>>>>
>>>>> I believe that having both sw1p3 and eth0 is the correct way of
>>>>> modelling this. sw1p3 is an instance of the switch chip driver
>>>>> representing the actual port of a switch. eth0 is an instance of some
>>>>> other ordinary NIC driver (8139too is my favorite :))
>>>>>
>>>>> This model allows us to draw the exact picture.
>>>>> Also, when you add the described possibility to use iplink to build
>>>>> vlans, bridges whatever on the switch ports, it makes perfect sense to
>>>>> have this model.
>>>>>
>>>>> Merging sw1p3 and eth0 would cause a loss of information and confusion.
>>>>>
>>>>
>>>> The CPU switch port and the switch fabric itself should be configured
>>>> consistently with the host; first of all I mean the set of VLANs. Also it
>>>> should be mentioned that some generic knobs such as port rate and
>>>> duplex mode are meaningless for the CPU switch port, and a lot of status
>>>> information (rx/tx counters etc.) duplicates the statistics of the host
>>>> interface which is connected to the switch port.
>>>
>>> It duplicates the information when things just work fine, consider an
>>> external switch connected via RGMII to a CPU Ethernet MAC, you might
>>> want to get statistics from both sides (the switch CPU port and the
>>> CPU Ethernet MAC) to diagnose why things are not working as expected,
>>> which unfortunately happens once in a while with RGMII.
>>>
>>> If we expose both net_devices, we will be able to retrieve statistics
>>> from both sides, without resorting to ad-hoc debugging tools,
>>> but maybe this is not worth the effort.
>>>
>> I also thought about this situation. Can we use the debugfs interface
>> for these purposes?
>
>Most of the time you are interested in MIB counters for debugging
>such issues, so ethtool quickly comes in handy for this task. Since we
>will provide per-port counters, the CPU port is not different, so
>there is no reason for restricting this.

I agree, there is no need to provide a parallel API.

>
>>
>>>> So there are no reasons
>>>> to force the user to configure this port manually, and automatic
>>>> configuration of the CPU switch port without exporting it as a netdev
>>>> seems like a good approach.
>>>
>>> Well, maybe that's the answer, since we know that e.g: sw1p3 is always
>>> connected to e.g: eth0, we could create an automatic bridge between
>>> those two, this would keep the netdev exposure to user-space, but a
>>> user would not have to know about that specific detail to get things
>>> to work.
>>>
>> I would like to go further and suggest considering the netdev that is
>> connected to the CPU switch port as the master, for cases when we need
>> to perform some action on the whole switch (e.g. dump FIB).
>
>This is what the 'sw1' net_device in Jiri's proposal would do.

Except that sw1 is not the CPU port. It is just a placeholder that does not
represent any physical port/netdev.

>
>> And even name
>> switch ports, using master netdev name as prefix (e.g. eth1p0, eth1p1,
>> ..., eth1pN for ports of switch that is connected via eth1).
>
>I think the port naming using the switch abstract interface (sw1 here)
>is better because ports do belong to the switch.
>--
>Florian
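
To make the model discussed in this thread easier to follow, here is a
minimal sketch of the topology in plain Python. It is not kernel code and
not part of the patch set: the NetDev class, its "master" and "peer"
fields, and the helpers build_topology() and ports_of() are hypothetical
names chosen only for illustration. It assumes the layout from the
diagram, where sw1 is an abstract placeholder, sw1p0..sw1p3 are switch
ports, and sw1p3 faces the ordinary NIC eth0.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(eq=False)  # identity comparison only; peers reference each other
class NetDev:
    """Hypothetical stand-in for a net_device, for illustration only."""
    name: str
    kind: str                           # "switch", "switch-port" or "nic"
    master: Optional["NetDev"] = None   # switch placeholder a port belongs to
    peer: Optional["NetDev"] = None     # device at the other end of a fixed link


def build_topology() -> Tuple[NetDev, List[NetDev], NetDev]:
    # sw1 is only a placeholder for whole-switch operations;
    # it does not represent any physical port.
    sw1 = NetDev("sw1", "switch")
    ports = [NetDev(f"sw1p{i}", "switch-port", master=sw1) for i in range(4)]
    # eth0 is an ordinary NIC without switch-specific logic.
    eth0 = NetDev("eth0", "nic")
    # sw1p3 is the switch side of the MII link, eth0 is the CPU side.
    ports[3].peer = eth0
    eth0.peer = ports[3]
    return sw1, ports, eth0


def ports_of(switch: NetDev, devices: List[NetDev]) -> List[str]:
    """Names of all devices whose master is the given switch placeholder."""
    return [d.name for d in devices if d.master is switch]


if __name__ == "__main__":
    sw1, ports, eth0 = build_topology()
    print(ports_of(sw1, ports + [eth0]))
    print(f"{ports[3].name} <-> {ports[3].peer.name}")
```

Keeping sw1p3 and eth0 as two separate objects is what lets each side of
the link carry its own counters in this sketch, which is the argument made
above for not merging them.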