From: Andy Gospodarek <andy@greyhouse.net>
To: Scott Feldman <sfeldma@cumulusnetworks.com>,
Jiri Pirko <jiri@resnulli.us>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>,
Jamal Hadi Salim <jhs@mojatatu.com>,
Florian Fainelli <f.fainelli@gmail.com>,
Neil Horman <nhorman@tuxdriver.com>, Thomas Graf <tgraf@suug.ch>,
netdev <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>,
dborkman <dborkman@redhat.com>, ogerlitz <ogerlitz@mellanox.com>,
jesse <jesse@nicira.com>, pshelar <pshelar@nicira.com>,
azhou <azhou@nicira.com>, Ben Hutchings <ben@decadent.org.uk>,
Stephen Hemminger <stephen@networkplumber.org>,
jeffrey.t.kirsher@intel.com, vyasevic <vyasevic@redhat.com>,
Cong Wang <xiyou.wangcong@gmail.com>,
John Fastabend <john.r.fastabend@intel.com>,
Eric Dumazet <edumazet@google.com>,
Lennert Buytenhek <buytenh@wantstofly.org>,
Shrijeet Mukherjee <shm@cumulusnetworks.com>
Subject: Re: [patch net-next RFC 0/4] introduce infrastructure for support of switch chip datapath
Date: Wed, 02 Apr 2014 10:32:49 -0400
Message-ID: <533C1F91.6000704@greyhouse.net>
In-Reply-To: <2D65D0C2-6BBC-4968-8400-4EB60BDF887A@cumulusnetworks.com>
On 04/01/2014 03:13 PM, Scott Feldman wrote:
> On Mar 26, 2014, at 11:03 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>
>> Wed, Mar 26, 2014 at 06:47:15PM CET, roopa@cumulusnetworks.com wrote:
>>> On 3/26/14, 9:59 AM, Jiri Pirko wrote:
>>>> Wed, Mar 26, 2014 at 05:54:17PM CET, roopa@cumulusnetworks.com wrote:
>>>> So you implement bonding netlink api? Or you hook into bonding driver
>>>> itself? Can you show us the code?
>>> We use the netlink API and libnl. In our current model, our switch
>>> chip driver listens to netlink notifications and programs the switch
>>> chip. The switch chip driver uses libnl caches and libnl netlink apis
>>> to reflect the kernel state to switch chip.
>>
>> So when you configure for example bonding over 2 ports, you actually use
>> bonding driver to do that. And your userspace app listens to
>> notifications and programs the switch chip accordingly. Am I close?
>>
>> How about data? Is this new "bonding" interface able to have an IP
>> assigned to it and send/receive packets?
>>
>> I'm still not sure I understand your concept. Do you have some
>> documentation for it available?
> Actually Jiri, this is the code you and I worked on recently to
> netlink-ify bonding/slave attributes and active/inactive notification.
> You have it right: the user uses normal ip link tools and the bonding
> driver to create the bond, set attributes, and enslave switch ports.
> RTM_NEWLINK is used to program the ASIC to offload the LAG to HW.
> RTM_NEWLINK msgs contain bond attributes (mode, etc.) and the slave
> list, as well as slave status. This is enough information to program
> the ASIC. Once programmed, the ASIC offloads the data plane traffic,
> and in the case of egress, handles the LAG hash distribution. Only the
> LACP control plane traffic makes it to the bonding driver; data plane
> traffic does not make it to the bonding driver.
>
> So, not trying to sound like a smart-ass, but the documentation is the bonding driver, specifically the netlink attributes/notifications.
>
> -scott
Using netlink messages to notify drivers for these ASICs really seems
like a great way to handle things. It would obviously require some
expansion of netlink, but that seems fine.
I would prefer that ASIC vendors write initial drivers for their ASICs
such that each physical port is detected and exported as a netdev. This
would mean each *minimal* kernel driver for an ASIC would need to have
support for the following (off the top of my head):
- detect link status on an interface
- set an interface's MAC address
- configure the chip to send all frames to the CPU
- register a NAPI handler for the interfaces (depending on
  packet-buffering capabilities in the hardware)
As support for new hardware capabilities is moved from switch vendor
SDKs to their kernel drivers, those drivers can begin to listen for
netlink messages that:
- set up bonds/teams
- add ports to bridge groups
- configure port-based or MAC-based VLANs
- add unicast and multicast entries
- add and remove entries from a flow table
- ...
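
To make the bond case concrete, here is a rough userspace sketch (Python,
not kernel code, and not code from anyone in this thread) of decoding an
RTM_NEWLINK message and mirroring bond membership into a hardware LAG
table. The message-type and attribute numbers are the ones from
<linux/rtnetlink.h> and <linux/if_link.h>. The names parse_link_msg,
build_link_msg and on_link_event, and the dict standing in for the ASIC's
LAG table, are made up for illustration.

```python
import struct

# Values from <linux/rtnetlink.h> and <linux/if_link.h>.
RTM_NEWLINK = 16
RTM_DELLINK = 17
IFLA_IFNAME = 3
IFLA_MASTER = 10

NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # family, (pad), type, index, flags, change
RTATTR = struct.Struct("=HH")         # len, type


def align4(n):
    """Netlink attributes are padded to 4-byte boundaries."""
    return (n + 3) & ~3


def parse_link_msg(buf):
    """Decode one RTM_NEWLINK/RTM_DELLINK message into a small dict."""
    length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(buf, 0)
    off = NLMSGHDR.size
    _fam, _devtype, ifindex, _ifflags, _chg = IFINFOMSG.unpack_from(buf, off)
    off += IFINFOMSG.size
    info = {"type": msg_type, "ifindex": ifindex, "ifname": None, "master": None}
    while off + RTATTR.size <= length:
        rta_len, rta_type = RTATTR.unpack_from(buf, off)
        if rta_len < RTATTR.size:
            break  # malformed attribute, stop parsing
        payload = buf[off + RTATTR.size:off + rta_len]
        if rta_type == IFLA_IFNAME:
            info["ifname"] = payload.rstrip(b"\0").decode()
        elif rta_type == IFLA_MASTER:
            info["master"] = struct.unpack("=I", payload)[0]
        off += align4(rta_len)
    return info


def build_link_msg(msg_type, ifindex, ifname, master=None):
    """Hypothetical helper: hand-build a message so the sketch runs anywhere."""
    attrs = b""
    pairs = [(IFLA_IFNAME, ifname.encode() + b"\0")]
    if master is not None:
        pairs.append((IFLA_MASTER, struct.pack("=I", master)))
    for rta_type, payload in pairs:
        rta = RTATTR.pack(RTATTR.size + len(payload), rta_type) + payload
        attrs += rta.ljust(align4(len(rta)), b"\0")
    body = IFINFOMSG.pack(0, 1, ifindex, 0, 0) + attrs
    return NLMSGHDR.pack(NLMSGHDR.size + len(body), msg_type, 0, 0, 0) + body


def on_link_event(info, hw_lag_members):
    """Hypothetical hook: hw_lag_members stands in for the ASIC's LAG table."""
    for members in hw_lag_members.values():
        members.discard(info["ifindex"])  # forget any previous membership
    if info["type"] == RTM_NEWLINK and info["master"] is not None:
        hw_lag_members.setdefault(info["master"], set()).add(info["ifindex"])


lags = {}
msg = build_link_msg(RTM_NEWLINK, 3, "swp1", master=10)
on_link_event(parse_link_msg(msg), lags)
```

On a real Linux system the bytes would come from an AF_NETLINK /
NETLINK_ROUTE socket subscribed to the link multicast group rather than
from build_link_msg, and an in-kernel driver would get the same
information through kernel interfaces instead of parsing messages. The
sketch only shows that the link message already carries what is needed to
program a LAG.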
Maybe this all seems too matter-of-fact and the discussion has evolved
well beyond something this high-level, but there still seems to be
significant discussion about whether or not the ASIC should be exported
as a netdev and I'm just not seeing a compelling reason. This was my
attempt to explain why. :)