From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiri Pirko
Subject: Re: [patch net-next RFC 0/4] introduce infrastructure for support of switch chip datapath
Date: Wed, 26 Mar 2014 22:31:30 +0100
Message-ID: <20140326213130.GA32193@minipsycho.orion>
References: <20140325180009.GB15723@casper.infradead.org>
 <20140325193533.GF8102@hmsreliant.think-freely.org>
 <5332677F.2090404@cumulusnetworks.com>
 <5332B1FE.7080102@mojatatu.com>
 <53330639.8050403@cumulusnetworks.com>
 <20140326165934.GH2869@minipsycho.orion>
 <533312A3.5070600@cumulusnetworks.com>
 <20140326180356.GK2869@minipsycho.orion>
 <53334629.5030208@cumulusnetworks.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Jamal Hadi Salim, Florian Fainelli, Neil Horman, Thomas Graf, netdev,
 David Miller, Andy Gospodarek, dborkman, ogerlitz, jesse, pshelar, azhou,
 Ben Hutchings, Stephen Hemminger, jeffrey.t.kirsher@intel.com, vyasevic,
 Cong Wang, John Fastabend, Eric Dumazet, Scott Feldman, Lennert Buytenhek,
 Shrijeet Mukherjee
To: Roopa Prabhu
Return-path:
Received: from mail-ee0-f46.google.com ([74.125.83.46]:40564 "EHLO mail-ee0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752126AbaCZVbe (ORCPT ); Wed, 26 Mar 2014 17:31:34 -0400
Received: by mail-ee0-f46.google.com with SMTP id t10so2151105eei.33 for ; Wed, 26 Mar 2014 14:31:32 -0700 (PDT)
Content-Disposition: inline
In-Reply-To: <53334629.5030208@cumulusnetworks.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

Wed, Mar 26, 2014 at 10:27:05PM CET, roopa@cumulusnetworks.com wrote:
>On 3/26/14, 11:03 AM, Jiri Pirko wrote:
>>Wed, Mar 26, 2014 at 06:47:15PM CET, roopa@cumulusnetworks.com wrote:
>>>On 3/26/14, 9:59 AM, Jiri Pirko wrote:
>>>>Wed, Mar 26, 2014 at 05:54:17PM CET, roopa@cumulusnetworks.com wrote:
>>>>>On 3/26/14, 3:54 AM, Jamal Hadi Salim wrote:
>>>>>>On 03/26/14 01:37, Roopa Prabhu wrote:
>>>>>>>On 3/25/14, 1:11 PM, Florian Fainelli wrote:
>>>>>>>>2014-03-25 12:35 GMT-07:00 Neil Horman :
>>>>>>>Sorry about getting on this thread late and possibly in the middle.
>>>>>>>Agree on the idea of keeping the ports linked to the master switch dev
>>>>>>>(or the 'conduit' to the switch chip) via private list instead of the
>>>>>>>master-slave relationship proposed earlier.
>>>>>>>By private i mean the netdev->priv linkage to the master switch dev and
>>>>>>>not really keeping the ports from being exposed to the user.
>>>>>>>
>>>>>>>We think it's better to keep the switch ports exposed as any other netdev
>>>>>>>on linux.
>>>>>>> This approach will make the switch ports look exactly like a nic port
>>>>>>>and all tools will continue to work seamlessly. The switch port
>>>>>>>operations could internally be forwarded to the switch netdev (sw1 in
>>>>>>>the above case).
>>>>>>>
>>>>>>>example:
>>>>>>>$ip link set dev sw1p0 up
>>>>>>>$ethtool -S sw1p0
>>>>>>>
>>>>>>I like the approach. I know the above is a simple version, but i am
>>>>>>assuming you also mean i can do things like
>>>>>>ip route add ...
>>>>>>bridge fdb add ... (and if you like your brctl go ahead)
>>>>>>bonding ...
>>>>>>
>>>>>yes, exactly. We support this model on our boxes today.
>>>>>User can bond switch ports on our box in the exact same way as he/she
>>>>>would bond two nic ports.
>>>>>Our 'conduit to switch chip' reflects the corresponding lag
>>>>>configuration in the switch chip.
>>>>>Same goes for bridging, routing, acls.
>>>>So you implement bonding netlink api? Or you hook into bonding driver
>>>>itself? Can you show us the code?
>>>We use the netlink API and libnl. In our current model, our switch
>>>chip driver listens to netlink notifications and programs the switch
>>>chip. The switch chip driver uses libnl caches and libnl netlink apis
>>>to reflect the kernel state to switch chip.
>>
>>So when you configure for example bonding over 2 ports, you actually use
>>bonding driver to do that. And your userspace app listens to
>>notifications and programs the switch chip accordingly.
>>Am I close?
>yes correct.
>>
>>How about data? Is this new "bonding" interface able to assign ip to it
>>and send/receive packets?
>yes
>>
>>I'm still not sure I understand your concept. Do you have some
>>documentation for it available?
>>
>I think the only documentation available today in this area is the
>user guide and that in-turn points to native linux command manpages
>iproute2, sysfs, debian ifupdown etc.
>I will see if i can find anything else.

I meant the architecture design documentation. Linux manpages are not
that interesting to me :)

>
>thanks,
>Roopa
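
For reference, the model described in the quoted text (a userspace driver
that listens to netlink notifications and mirrors kernel state into the
switch chip) could be sketched roughly as below. This is only an
illustration under stated assumptions, not the code discussed in the
thread: the thread says the real driver uses libnl and its caches, while
this sketch swaps in a raw rtnetlink socket from the Python standard
library. The names parse_link_messages, listen and on_link_event are
hypothetical, and the callback stands in for whatever actually programs
the hardware.

```python
import socket
import struct

# Constants from <linux/rtnetlink.h> and <linux/if_link.h>.
RTM_NEWLINK = 16
RTM_DELLINK = 17
RTMGRP_LINK = 1
IFLA_IFNAME = 3
IFLA_MASTER = 10

NLMSG_HDR = struct.Struct("=IHHII")   # struct nlmsghdr
IFINFOMSG = struct.Struct("=BxHiII")  # struct ifinfomsg
RTATTR = struct.Struct("=HH")         # struct rtattr header


def align4(n):
    """Round n up to the 4-byte netlink alignment."""
    return (n + 3) & ~3


def parse_link_messages(data):
    """Yield (msg_type, ifindex, ifname, master_ifindex) per link message."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        msg_len, msg_type, _, _, _ = NLMSG_HDR.unpack_from(data, offset)
        if msg_len < NLMSG_HDR.size or offset + msg_len > len(data):
            break  # truncated or malformed buffer
        is_link = msg_type in (RTM_NEWLINK, RTM_DELLINK)
        if is_link and msg_len >= NLMSG_HDR.size + IFINFOMSG.size:
            end = offset + msg_len
            pos = offset + NLMSG_HDR.size
            _, _, ifindex, _, _ = IFINFOMSG.unpack_from(data, pos)
            pos += IFINFOMSG.size
            ifname, master = None, None
            # Walk the rtattr list that follows the ifinfomsg header.
            while pos + RTATTR.size <= end:
                rta_len, rta_type = RTATTR.unpack_from(data, pos)
                if rta_len < RTATTR.size or pos + rta_len > end:
                    break
                payload = data[pos + RTATTR.size:pos + rta_len]
                if rta_type == IFLA_IFNAME:
                    ifname = payload.rstrip(b"\0").decode()
                elif rta_type == IFLA_MASTER and len(payload) >= 4:
                    # ifindex of the bond/bridge this port is enslaved to
                    master = struct.unpack("=I", payload[:4])[0]
                pos += align4(rta_len)
            yield msg_type, ifindex, ifname, master
        offset += align4(msg_len)


def listen(on_link_event):
    """Block forever, passing each link change to a hypothetical callback."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                         socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))  # subscribe to link notifications
    while True:
        for event in parse_link_messages(sock.recv(65536)):
            on_link_event(*event)
```

With something like this, enslaving sw1p0 to a bond with the usual tools
would show up as an RTM_NEWLINK for sw1p0 carrying IFLA_MASTER, and the
callback would be the place to reflect the LAG membership in the chip.
listen() needs Linux; the parser itself works on any byte buffer.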