From: Thomas Graf <tgraf@suug.ch>
To: Florian Fainelli <f.fainelli@gmail.com>
Cc: Scott Feldman <sfeldma@cumulusnetworks.com>,
	"John W. Linville" <linville@tuxdriver.com>,
	Andy Gospodarek <andy@greyhouse.net>,
	Jiri Pirko <jiri@resnulli.us>,
	Roopa Prabhu <roopa@cumulusnetworks.com>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Neil Horman <nhorman@tuxdriver.com>,
	netdev <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	dborkman <dborkman@redhat.com>, ogerlitz <ogerlitz@mellanox.com>,
	jesse <jesse@nicira.com>, pshelar <pshelar@nicira.com>,
	azhou <azhou@nicira.com>, Ben Hutchings <ben@decadent.org.uk>,
	Stephen Hemminger <stephen@networkplumber.org>,
	jeffrey.t.kirsher@intel.com, vyasevic <vyasevic@redhat.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	John Fastabend <john.r.fastabend@intel.com>,
	Eric Dumazet <edumazet@google.com>,
	Lennert Buytenhek <buytenh@wantstofly.org>,
	Shrijeet Mukherjee <shm@cumulusnetworks.com>
Subject: Re: [patch net-next RFC 0/4] introduce infrastructure for support of switch chip datapath
Date: Wed, 2 Apr 2014 22:52:15 +0100
Message-ID: <20140402215215.GO11670@casper.infradead.org>
In-Reply-To: <CAGVrzcZCU+fMQ0XwNa-3j3ATtSu4rDxL=+Hcwc4nTUj89A8ZZw@mail.gmail.com>

On 04/02/14 at 09:47am, Florian Fainelli wrote:
> 2014-04-02 9:15 GMT-07:00 Scott Feldman <sfeldma@cumulusnetworks.com>:
> > On Apr 2, 2014, at 8:25 AM, John W. Linville <linville@tuxdriver.com> wrote:
> >> On Wed, Apr 02, 2014 at 10:32:49AM -0400, Andy Gospodarek wrote:
> >>> Maybe this all seems too matter-of-fact and the discussion has
> >>> evolved well beyond something this high-level, but there still seems
> >>> to be significant discussion about whether or not the ASIC should be
> >>> exported as a netdev and I'm just not seeing a compelling reason.
> >>> This was my attempt to explain why.  :)
> >>
> >> Andy and I discussed this off-line, so I am admittedly partial to
> >> the conclusions we shared as reflected above... :-)
> >>
> >> While I might be convinced that there should be _something_ to
> >> represent the switch chip for some purpose (e.g. topology mapping),
> >> I'm not at all convinced that thing should be a netdev.  I don't see
> >> where the switch chip by itself looks much like any other netdev at
> >> all, especially once you model the actual front-panel ports with
> >> their own netdevs.  I do know that having an extra "magic netdev"
> >> in the wireless space added a lot of confusion for no clear gain,
> >> leading to it later being abolished.
> >>
> >> Modeling at the switch level might make more sense from a flow
> >> management perspective?  But if those flows are managed using a netlink
> >> protocol, does it matter what sort of entity is listening and acting
> >> on those messages?  If a switch-specific interface is needed for that,
> >> we should build it rather than pretending it looks like a netdev.
> >> I also think that throwing the DSA switches in with flow-based and
> >> "Enterprise" switches may just be confusing things.
> >>
> >> I think that the opening bid should be a minimal hardware driver that
> >> models each front-panel port with a netdev and passes all traffic
> >> to/from the CPU.  Intelligence beyond that should be added on a
> >> 'can-do' basis, with individual drivers (or corresponding userland
> >> components) listening to existing netlink traffic and implementing
> >> support for existing protocols to the best of their abilities.
> >> Missing functionality in the netlink protocols or other functions
> >> (e.g. bonding, bridging, etc) can be evolved over time as we discover
> >> missing bits required for switch acceleration.
> >
> > I agree completely with your/Andy’s view.  It’s the switch port, not the switch, that needs to be modeled as a netdev.  The switch port is the abstraction that allows other existing virtual devices (bridges, bonds, vxlans, etc) to cuddle against.  Is a switch port a special netdev in some way?  At a high level, not really.  I mean, in a sense it’s just eth48 on a super NIC.  OK, there may be some advantage to setting an IFF_SWITCH_PORT on the switch port netdev, so cuddling netdevs could get a hint that their data plane might be offloaded.
> >
> > I’ve been back-and-forth on the switch netdev.  Today I’m not for it.  But I’m still searching for a reason.  At one point I thought a switch netdev would be nice in a L3 router case where we needed a router IP address to do things like OSPF unnumbered interfaces, but even in that case, we can just put the router IP on lo.  Another reason would be to use the switch netdev as a place for switch-wide settings and status.  For example,
> > ethtool -S stats on switch netdev would show switch-wide stats like ACL drops or something like that.  Maybe a switch device is modeled as a new device class?  I guess it comes down to how much is duplicated between different vendors' switch driver implementations.
> 
> I think the idea behind exposing a switch net_device is to account for
> all special cases where there is not already an existing and
> well-defined model for switch-wide events/control/information that we
> might want to have. Why a net_device? Because the switch ports will
> already be exposed as such, so mostly for consistency with the
> presented user-space interface. Whether that net_device exposes
> different child devices of different classes, e.g: MTD partitions to
> access firmware updates, SPI master/slave controller(s), MDIO
> controller(s), is yet to be defined I suppose.

Having a master net_device seemed logical to me at first, just
as it always made sense to me to have software bridges be
represented by a net_device. I agree with a lot of the concerns
though.

I see the following uses for a master net_device:
 - represent slave/master relationship and provide IFF_UP control
 - expose non port specific statistics
 - flow configuration
 - tunnel configuration
 - allow creation of virtual ports that are not backed by HW

I want to expand on the last point a bit. I specifically did not
mention IP configuration above, which is what the bridge master is
frequently used for. I absolutely like the OVS model, where multiple
internal ports can be created which hook into the network stack
and can thus be assigned IPs. The model allows separate internal
ports to be configured as different VLAN access ports, for example.
They also provide multiple AF_PACKET rx handlers, etc.

 sw1p1 -+
 sw1p2 -+       +-sw1int0 (ip=30.0.0.1) -> netif_rx()
 sw1p3 -+- sw1 -+-sw1int1 (vlan=10, ip=10.0.0.1) -> netif_rx()
 sw1p4 -+       +-sw1vxlan0 (remote_ip=20.0.0.2)

If supported by the chip, flows can be set up automatically to
feed these virtual ports and to set up encapsulation. Other chips
will require a software fallback. Some will not support it at all.
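
To make the three outcomes concrete, here is a rough userspace sketch
of the decision, not kernel code and not a proposed API. All names
(SwitchChip, VirtualPort, Placement, the feature strings) are made up
for illustration, and the rule "a port is offloaded only if the chip
handles every feature it needs" is my assumption about how a driver
might decide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Placement(Enum):
    HARDWARE = "flows installed on the chip"
    SOFTWARE = "software fallback"
    UNSUPPORTED = "not supported"


@dataclass(frozen=True)
class VirtualPort:
    """A port under the switch master that is not backed by a HW port."""
    name: str
    vlan: Optional[int] = None       # access VLAN for an internal port
    remote_ip: Optional[str] = None  # tunnel endpoint for a vxlan port

    def required_features(self) -> FrozenSet[str]:
        # Hypothetical feature names; a real driver would define its own.
        if self.remote_ip is not None:
            return frozenset({"vxlan"})
        needed = {"internal"}
        if self.vlan is not None:
            needed.add("vlan")
        return frozenset(needed)


@dataclass(frozen=True)
class SwitchChip:
    name: str
    hw_features: FrozenSet[str]  # what the chip can do with flows
    sw_features: FrozenSet[str]  # what the driver can emulate on the CPU

    def place(self, port: VirtualPort) -> Placement:
        needed = port.required_features()
        if needed <= self.hw_features:
            return Placement.HARDWARE
        if needed <= (self.hw_features | self.sw_features):
            return Placement.SOFTWARE
        return Placement.UNSUPPORTED


# The sw1 example from the diagram, on an assumed chip that offloads
# internal and VLAN access ports but has to emulate VXLAN in software.
sw1 = SwitchChip("sw1",
                 hw_features=frozenset({"internal", "vlan"}),
                 sw_features=frozenset({"vxlan"}))
ports = [
    VirtualPort("sw1int0"),
    VirtualPort("sw1int1", vlan=10),
    VirtualPort("sw1vxlan0", remote_ip="20.0.0.2"),
]
for p in ports:
    print(p.name, sw1.place(p).name)
```

The point of keeping the decision in one place is that the
user-visible model (ports under sw1) stays the same regardless of
which of the three outcomes a given chip ends up with.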


