From: John Fastabend <john.fastabend@gmail.com>
To: Jiri Pirko <jiri@resnulli.us>
Cc: netdev@vger.kernel.org, davem@davemloft.net,
nhorman@tuxdriver.com, andy@greyhouse.net, tgraf@suug.ch,
dborkman@redhat.com, ogerlitz@mellanox.com, jesse@nicira.com,
pshelar@nicira.com, azhou@nicira.com, ben@decadent.org.uk,
stephen@networkplumber.org, jeffrey.t.kirsher@intel.com,
vyasevic@redhat.com, xiyou.wangcong@gmail.com,
john.r.fastabend@intel.com, edumazet@google.com,
jhs@mojatatu.com, sfeldma@gmail.com, f.fainelli@gmail.com,
roopa@cumulusnetworks.com, linville@tuxdriver.com,
jasowang@redhat.com, ebiederm@xmission.com,
nicolas.dichtel@6wind.com, ryazanov.s.a@gmail.com,
buytenh@wantstofly.org, aviadr@mellanox.com, nbd@openwrt.org,
alexei.starovoitov@gmail.com, Neil.Jerram@metaswitch.com,
ronye@mellanox.com, simon.horman@netronome.com,
alexander.h.duyck@redhat.com, john.ronciak@intel.com,
mleitner@redhat.com, shrijeet@gmail.com,
gospo@cumulusnetworks.com, bcrl@kvac
Subject: Re: [patch net-next v2 02/10] net: introduce generic switch devices support
Date: Mon, 10 Nov 2014 13:59:38 -0800
Message-ID: <5461354A.3020906@gmail.com>
In-Reply-To: <1415530280-9190-3-git-send-email-jiri@resnulli.us>
On 11/09/2014 02:51 AM, Jiri Pirko wrote:
> The goal of this is to provide a possibility to support various switch
> chips. Drivers should implement relevant ndos to do so. Now there is
> only one ndo defined:
> - for getting physical switch id is in place.
>
> Note that user can use random port netdevice to access the switch.
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
> Documentation/networking/switchdev.txt | 59 ++++++++++++++++++++++++++++++++++
> MAINTAINERS | 7 ++++
> include/linux/netdevice.h | 10 ++++++
> include/net/switchdev.h | 30 +++++++++++++++++
> net/Kconfig | 1 +
> net/Makefile | 3 ++
> net/switchdev/Kconfig | 13 ++++++++
> net/switchdev/Makefile | 5 +++
> net/switchdev/switchdev.c | 33 +++++++++++++++++++
> 9 files changed, 161 insertions(+)
> create mode 100644 Documentation/networking/switchdev.txt
> create mode 100644 include/net/switchdev.h
> create mode 100644 net/switchdev/Kconfig
> create mode 100644 net/switchdev/Makefile
> create mode 100644 net/switchdev/switchdev.c
>
> diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
> new file mode 100644
> index 0000000..98be76c
> --- /dev/null
> +++ b/Documentation/networking/switchdev.txt
> @@ -0,0 +1,59 @@
> +Switch (and switch-ish) device drivers HOWTO
> +===========================
> +
> +Please note that the word "switch" is here used in very generic meaning.
> +This include devices supporting L2/L3 but also various flow offloading chips,
> +including switches embedded into SR-IOV NICs.
> +
> +Lets describe a topology a bit. Imagine the following example:
> +
> + +----------------------------+ +---------------+
> + | SOME switch chip | | CPU |
> + +----------------------------+ +---------------+
> + port1 port2 port3 port4 MNGMNT | PCI-E |
> + | | | | | +---------------+
> + PHY PHY | | | | NIC0 NIC1
> + | | | | | |
> + | | +- PCI-E -+ | |
> + | +------- MII -------+ |
> + +------------- MII ------------+
> +
> +In this example, there are two independent lines between the switch silicon
> +and CPU. NIC0 and NIC1 drivers are not aware of a switch presence. They are
> +separate from the switch driver. SOME switch chip is by managed by a driver
> +via PCI-E device MNGMNT. Note that MNGMNT device, NIC0 and NIC1 may be
> +connected to some other type of bus.
> +
> +Now, for the previous example show the representation in kernel:
> +
> + +----------------------------+ +---------------+
> + | SOME switch chip | | CPU |
> + +----------------------------+ +---------------+
> + sw0p0 sw0p1 sw0p2 sw0p3 MNGMNT | PCI-E |
> + | | | | | +---------------+
> + PHY PHY | | | | eth0 eth1
> + | | | | | |
> + | | +- PCI-E -+ | |
> + | +------- MII -------+ |
> + +------------- MII ------------+
> +
> +Lets call the example switch driver for SOME switch chip "SOMEswitch". This
> +driver takes care of PCI-E device MNGMNT. There is a netdevice instance sw0pX
> +created for each port of a switch. These netdevices are instances
> +of "SOMEswitch" driver. sw0pX netdevices serve as a "representation"
> +of the switch chip. eth0 and eth1 are instances of some other existing driver.
> +
> +The only difference of the switch-port netdevice from the ordinary netdevice
> +is that is implements couple more NDOs:
> +
> + ndo_sw_parent_get_id - This returns the same ID for two port netdevices
> + of the same physical switch chip. This is
> + mandatory to be implemented by all switch drivers
> + and serves the caller for recognition of a port
> + netdevice.
What is the connection between ndo_sw_parent_get_id() and
ndo_get_phys_port_id()? I'm having a bit of trouble teasing
this out.
For example, here is my ASCII art for an SR-IOV NIC,
     eth0      eth1     eth2
      |         |        |
      |         |        |
      PF        VF       VF
 +----+---------+--------+----+
 |      embedded bridge       |
 +-------------+--------------+
               |
             port
that can do switching between the various uplinks and downlinks.
In IEEE 802.1Q language the embedded bridge acts like an edge
relay. At least that seems to be the current state of the art
for SR-IOV. Edge relay just means it has a single uplink port
to the network and multiple downlinks, and also isn't required
to do learning or run loop detection protocols such as STP.
Also there are multi-function devices that look the same except
that the VFs are replaced with PFs. This seems to be a common
mode for NICs that do iSCSI offloads with storage functions.
It is not entirely clear when something counts as an embedded
bridge versus a "SOME switch chip".
My understanding is that you use ndo_sw_parent_get_id() when you
have multiple physical ports all connected to a single switch
object. When you have a single port connected to multiple PCIe
functions or queues, each representing a netdev (e.g. macvlan
offload), you use ndo_get_phys_port_id(). Just want to be sure
we are on the same page here.
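To make the distinction concrete, here is a rough user-space sketch of how
the two identifiers could be compared. It is only an illustration under
stated assumptions: struct item_id, struct netdev_info, make_id(),
same_switch() and same_phys_port() are hypothetical names invented for this
example and are not part of the patch or of any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ID_MAX_LEN 32 /* illustrative bound, not taken from the patch */

/* Hypothetical opaque ID as user space might see it for one netdev. */
struct item_id {
	unsigned char id[ID_MAX_LEN];
	size_t len; /* 0 means the driver does not report this ID */
};

/* Hypothetical per-netdev record holding both identifiers. */
struct netdev_info {
	const char *name;
	struct item_id switch_id;    /* modeled on ndo_sw_parent_get_id() */
	struct item_id phys_port_id; /* modeled on ndo_get_phys_port_id() */
};

/* Build an ID from a string, truncating to ID_MAX_LEN bytes. */
static struct item_id make_id(const char *s)
{
	struct item_id out = { .len = 0 };
	size_t n = strlen(s);

	if (n > ID_MAX_LEN)
		n = ID_MAX_LEN;
	memcpy(out.id, s, n);
	out.len = n;
	return out;
}

/* Two IDs match only if both are reported and byte-for-byte equal. */
static bool id_equal(const struct item_id *a, const struct item_id *b)
{
	assert(a && b);
	return a->len != 0 && a->len == b->len &&
	       memcmp(a->id, b->id, a->len) == 0;
}

/* Several physical ports hanging off one switch object. */
static bool same_switch(const struct netdev_info *a,
			const struct netdev_info *b)
{
	return id_equal(&a->switch_id, &b->switch_id);
}

/* Several netdevs (PF/VF, offloaded queues) behind one physical port. */
static bool same_phys_port(const struct netdev_info *a,
			   const struct netdev_info *b)
{
	return id_equal(&a->phys_port_id, &b->phys_port_id);
}
```

With this model, sw0p0 and sw0p1 from the switchdev.txt example would match
in same_switch() but not in same_phys_port(), while the PF and VF netdevs in
my SR-IOV picture would match in same_phys_port() only.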
Otherwise the patch looks good. I think we can clear the above up
with an addition to the documentation. That could go in after the
initial set and be OK with me.
IMO this patch is needed; otherwise user space is at a complete
loss trying to figure out how netdevs map to switch silicon.
You could perhaps have reused ndo_get_phys_port_id(), but then
I think user space may get confused by SR-IOV/VMDQ/etc. ports
attached to switch silicon. For my $.02, having a new distinct
identifier is cleaner.
> + ndo_sw_parent_* - Functions that serve for a manipulation of the switch
> + chip itself (it can be though of as a "parent" of the
> + port, therefore the name). They are not port-specific.
> + Caller might use arbitrary port netdevice of the same
> + switch and it will make no difference.
> + ndo_sw_port_* - Functions that serve for a port-specific manipulation.
[...]
Thanks,
John
--
John Fastabend Intel Corporation