From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiri Pirko
Subject: Re: [patch net-next v2 02/10] net: introduce generic switch devices support
Date: Fri, 21 Nov 2014 08:16:17 +0100
Message-ID: <20141121071617.GA1872@nanopsycho.orion>
References: <1415530280-9190-1-git-send-email-jiri@resnulli.us>
 <1415530280-9190-3-git-send-email-jiri@resnulli.us>
 <546C9AEA.2020209@cumulusnetworks.com>
 <20141119134645.GE1926@nanopsycho.orion>
 <546CA237.3010600@cumulusnetworks.com>
 <20141120155541.GA19003@gospo.rtplab.test>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Roopa Prabhu , netdev@vger.kernel.org, davem@davemloft.net,
 nhorman@tuxdriver.com, andy@greyhouse.net, tgraf@suug.ch,
 dborkman@redhat.com, ogerlitz@mellanox.com, jesse@nicira.com,
 pshelar@nicira.com, azhou@nicira.com, ben@decadent.org.uk,
 stephen@networkplumber.org, jeffrey.t.kirsher@intel.com,
 vyasevic@redhat.com, xiyou.wangcong@gmail.com,
 john.r.fastabend@intel.com, edumazet@google.com, jhs@mojatatu.com,
 sfeldma@gmail.com, f.fainelli@gmail.com, linville@tuxdriver.com,
 jasowang@redhat.com, ebiederm@xmission.com, nicolas.dichtel@6wind.com,
 ryazanov.s.a@gmail.com, buytenh@wantstofly.org, aviadr@mellanox.com,
 nbd@openwrt.org, alexei.starovoitov@gmail.com,
 Neil.Jerram@metaswitch.com, ronye@mellanox.com,
 simon.horman@netronome.com, alexander.h.duyck@redhat.com,
 john.ronciak@intel.com, mleitner@redhat.com, shrijeet@gmail.com,
 bcrl@kvack.org
To: Andy Gospodarek
Return-path:
Received: from mail-wi0-f172.google.com ([209.85.212.172]:50277 "EHLO
 mail-wi0-f172.google.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1750805AbaKUHQU (ORCPT );
 Fri, 21 Nov 2014 02:16:20 -0500
Received: by mail-wi0-f172.google.com with SMTP id n3so11025210wiv.17
 for ; Thu, 20 Nov 2014 23:16:19 -0800 (PST)
Content-Disposition: inline
In-Reply-To: <20141120155541.GA19003@gospo.rtplab.test>
Sender: netdev-owner@vger.kernel.org
List-ID:

Thu, Nov 20, 2014 at 04:55:41PM CET, gospo@cumulusnetworks.com wrote:
>On Wed, Nov 19,
2014 at 05:59:19AM -0800, Roopa Prabhu wrote:
>> On 11/19/14, 5:46 AM, Jiri Pirko wrote:
>> >Wed, Nov 19, 2014 at 02:28:10PM CET, roopa@cumulusnetworks.com wrote:
>> >>On 11/9/14, 2:51 AM, Jiri Pirko wrote:
>> >>>The goal of this is to provide a possibility to support various switch
>> >>>chips. Drivers should implement relevant ndos to do so. Now there is
>> >>>only one ndo defined:
>> >>>- for getting physical switch id is in place.
>> >>>
>> >>>Note that user can use random port netdevice to access the switch.
>> >>>
>> >>>Signed-off-by: Jiri Pirko
>> >>>---
>> >>> Documentation/networking/switchdev.txt | 59 ++++++++++++++++++++++++++++++++++
>> >>> MAINTAINERS                            |  7 ++++
>> >>> include/linux/netdevice.h              | 10 ++++++
>> >>> include/net/switchdev.h                | 30 +++++++++++++++++
>> >>> net/Kconfig                            |  1 +
>> >>> net/Makefile                           |  3 ++
>> >>> net/switchdev/Kconfig                  | 13 ++++++++
>> >>> net/switchdev/Makefile                 |  5 +++
>> >>> net/switchdev/switchdev.c              | 33 +++++++++++++++++++
>> >>> 9 files changed, 161 insertions(+)
>> >>> create mode 100644 Documentation/networking/switchdev.txt
>> >>> create mode 100644 include/net/switchdev.h
>> >>> create mode 100644 net/switchdev/Kconfig
>> >>> create mode 100644 net/switchdev/Makefile
>> >>> create mode 100644 net/switchdev/switchdev.c
>> >>>
>> >>>diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
>> >>>new file mode 100644
>> >>>index 0000000..98be76c
>> >>>--- /dev/null
>> >>>+++ b/Documentation/networking/switchdev.txt
>> >>>@@ -0,0 +1,59 @@
>> >>>+Switch (and switch-ish) device drivers HOWTO
>> >>>+===========================
>> >>>+
>> >>>+Please note that the word "switch" is here used in very generic meaning.
>> >>>+This include devices supporting L2/L3 but also various flow offloading chips,
>> >>>+including switches embedded into SR-IOV NICs.
>> >>>+
>> >>>+Lets describe a topology a bit. Imagine the following example:
>> >>>+
>> >>>+             +----------------------------+    +---------------+
>> >>>+             |     SOME switch chip       |    |      CPU      |
>> >>>+             +----------------------------+    +---------------+
>> >>>+             port1 port2 port3 port4 MNGMNT    |     PCI-E     |
>> >>>+               |     |     |     |     |       +---------------+
>> >>>+              PHY   PHY    |     |     |         |  NIC0 NIC1
>> >>>+                           |     |     |         |   |    |
>> >>>+                           |     |     +- PCI-E -+   |    |
>> >>>+                           |     +------- MII -------+    |
>> >>>+                           +------------- MII ------------+
>> >>>+
>> >>>+In this example, there are two independent lines between the switch silicon
>> >>>+and CPU. NIC0 and NIC1 drivers are not aware of a switch presence. They are
>> >>>+separate from the switch driver. SOME switch chip is by managed by a driver
>> >>>+via PCI-E device MNGMNT. Note that MNGMNT device, NIC0 and NIC1 may be
>> >>>+connected to some other type of bus.
>> >>>+
>> >>>+Now, for the previous example show the representation in kernel:
>> >>>+
>> >>>+             +----------------------------+    +---------------+
>> >>>+             |     SOME switch chip       |    |      CPU      |
>> >>>+             +----------------------------+    +---------------+
>> >>>+             sw0p0 sw0p1 sw0p2 sw0p3 MNGMNT    |     PCI-E     |
>> >>>+               |     |     |     |     |       +---------------+
>> >>>+              PHY   PHY    |     |     |         |  eth0 eth1
>> >>>+                           |     |     |         |   |    |
>> >>>+                           |     |     +- PCI-E -+   |    |
>> >>>+                           |     +------- MII -------+    |
>> >>>+                           +------------- MII ------------+
>> >>>+
>> >>>+Lets call the example switch driver for SOME switch chip "SOMEswitch". This
>> >>>+driver takes care of PCI-E device MNGMNT. There is a netdevice instance sw0pX
>> >>>+created for each port of a switch. These netdevices are instances
>> >>>+of "SOMEswitch" driver. sw0pX netdevices serve as a "representation"
>> >>>+of the switch chip. eth0 and eth1 are instances of some other existing driver.
>> >>>+
>> >>>+The only difference of the switch-port netdevice from the ordinary netdevice
>> >>>+is that is implements couple more NDOs:
>> >>>+
>> >>>+  ndo_sw_parent_get_id - This returns the same ID for two port netdevices
>> >>>+                         of the same physical switch chip. This is
>> >>>+                         mandatory to be implemented by all switch drivers
>> >>>+                         and serves the caller for recognition of a port
>> >>>+                         netdevice.
>> >>>+  ndo_sw_parent_* - Functions that serve for a manipulation of the switch
>> >>>+                    chip itself (it can be though of as a "parent" of the
>> >>>+                    port, therefore the name). They are not port-specific.
>> >>>+                    Caller might use arbitrary port netdevice of the same
>> >>>+                    switch and it will make no difference.
>> >>>+  ndo_sw_port_* - Functions that serve for a port-specific manipulation.
>> >>>diff --git a/MAINTAINERS b/MAINTAINERS
>> >>>index 3a41fb0..776e078 100644
>> >>>--- a/MAINTAINERS
>> >>>+++ b/MAINTAINERS
>> >>>@@ -9003,6 +9003,13 @@ F:	lib/swiotlb.c
>> >>> F:	arch/*/kernel/pci-swiotlb.c
>> >>> F:	include/linux/swiotlb.h
>> >>>+SWITCHDEV
>> >>>+M:	Jiri Pirko
>> >>>+L:	netdev@vger.kernel.org
>> >>>+S:	Supported
>> >>>+F:	net/switchdev/
>> >>>+F:	include/net/switchdev.h
>> >>>+
>> >>> SYNOPSYS ARC ARCHITECTURE
>> >>> M:	Vineet Gupta
>> >>> S:	Supported
>> >>>diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> >>>index 71922e0..97eade9 100644
>> >>>--- a/include/linux/netdevice.h
>> >>>+++ b/include/linux/netdevice.h
>> >>>@@ -1017,6 +1017,12 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
>> >>>  *	performing GSO on a packet. The device returns true if it is
>> >>>  *	able to GSO the packet, false otherwise. If the return value is
>> >>>  *	false the stack will do software GSO.
>> >>>+ *
>> >>>+ * int (*ndo_sw_parent_id_get)(struct net_device *dev,
>> >>>+ *				  struct netdev_phys_item_id *psid);
>> >>>+ *	Called to get an ID of the switch chip this port is part of.
>> >>>+ *	If driver implements this, it indicates that it represents a port
>> >>>+ *	of a switch chip.
>> >>>  */
>> >>> struct net_device_ops {
>> >>> 	int	(*ndo_init)(struct net_device *dev);
>> >>>@@ -1168,6 +1174,10 @@ struct net_device_ops {
>> >>> 	int	(*ndo_get_lock_subclass)(struct net_device *dev);
>> >>> 	bool	(*ndo_gso_check) (struct sk_buff *skb,
>> >>> 				  struct net_device *dev);
>> >>>+#ifdef CONFIG_NET_SWITCHDEV
>> >>>+	int	(*ndo_sw_parent_id_get)(struct net_device *dev,
>> >>>+					struct netdev_phys_item_id *psid);
>> >>Can we keep the name generic and not include "sw" which implies switch here
>> >>?.
>> >>I understand that it is under CONFIG_NET_SWITCHDEV but we might find use for
>> >>them in other offload scenarios in the future.
>> >>This particular ndo can be just ndo_parent_id_get().
>> >>And the others that do specific offloads can have "offload" in them if
>> >>required..?.
>> >
>> >But this is for getting parent switch id, sw should be there.
>>
>> Since we have not figured out the details or namespace for switchd ids yet,
>> its still some parent id to me.
>>
>> >If comes a
>> >time when this might be reused to something else, we change it then.
>> >This is internal api, easily changeable.
>> also, "sw" seems more "software" than "switch".
>
>I had voiced this same concern when we met and discussed this in
>Dusseldorf. Let's move to another name such as 'hw' (since we really
>are talking about hardware abstraction) or 'offload.' Just using 'sw'
>is confusing as many will not read it as switch.
>
>I'll be happy to post a fix based on your devel patches.

Np, I'll take care of it.
>
>
>>
>> >
>>
>> >>
>> >>>+#endif
>> >>> };
>> >>> /**
>> >>>diff --git a/include/net/switchdev.h b/include/net/switchdev.h
>> >>>new file mode 100644
>> >>>index 0000000..79bf9bd
>> >>>--- /dev/null
>> >>>+++ b/include/net/switchdev.h
>> >>>@@ -0,0 +1,30 @@
>> >>>+/*
>> >>>+ * include/net/switchdev.h - Switch device API
>> >>>+ * Copyright (c) 2014 Jiri Pirko
>> >>>+ *
>> >>>+ * This program is free software; you can redistribute it and/or modify
>> >>>+ * it under the terms of the GNU General Public License as published by
>> >>>+ * the Free Software Foundation; either version 2 of the License, or
>> >>>+ * (at your option) any later version.
>> >>>+ */
>> >>>+#ifndef _LINUX_SWITCHDEV_H_
>> >>>+#define _LINUX_SWITCHDEV_H_
>> >>>+
>> >>>+#include
>> >>>+
>> >>>+#ifdef CONFIG_NET_SWITCHDEV
>> >>>+
>> >>>+int netdev_sw_parent_id_get(struct net_device *dev,
>> >>>+			    struct netdev_phys_item_id *psid);
>> >>>+
>> >>>+#else
>> >>>+
>> >>>+static inline int netdev_sw_parent_id_get(struct net_device *dev,
>> >>>+					  struct netdev_phys_item_id *psid)
>> >>>+{
>> >>>+	return -EOPNOTSUPP;
>> >>>+}
>> >>>+
>> >>>+#endif
>> >>>+
>> >>>+#endif /* _LINUX_SWITCHDEV_H_ */
>> >>>diff --git a/net/Kconfig b/net/Kconfig
>> >>>index 99815b5..ff9ffc1 100644
>> >>>--- a/net/Kconfig
>> >>>+++ b/net/Kconfig
>> >>>@@ -228,6 +228,7 @@ source "net/vmw_vsock/Kconfig"
>> >>> source "net/netlink/Kconfig"
>> >>> source "net/mpls/Kconfig"
>> >>> source "net/hsr/Kconfig"
>> >>>+source "net/switchdev/Kconfig"
>> >>> config RPS
>> >>> 	boolean
>> >>>diff --git a/net/Makefile b/net/Makefile
>> >>>index 7ed1970..95fc694 100644
>> >>>--- a/net/Makefile
>> >>>+++ b/net/Makefile
>> >>>@@ -73,3 +73,6 @@ obj-$(CONFIG_OPENVSWITCH)	+= openvswitch/
>> >>> obj-$(CONFIG_VSOCKETS)	+= vmw_vsock/
>> >>> obj-$(CONFIG_NET_MPLS_GSO)	+= mpls/
>> >>> obj-$(CONFIG_HSR)		+= hsr/
>> >>>+ifneq ($(CONFIG_NET_SWITCHDEV),)
>> >>>+obj-y				+= switchdev/
>> >>>+endif
>> >>>diff --git a/net/switchdev/Kconfig b/net/switchdev/Kconfig
>> >>>new file mode 100644
>> >>>index 0000000..1557545
>> >>>--- /dev/null
>> >>>+++ b/net/switchdev/Kconfig
>> >>>@@ -0,0 +1,13 @@
>> >>>+#
>> >>>+# Configuration for Switch device support
>> >>>+#
>> >>>+
>> >>>+config NET_SWITCHDEV
>> >>>+	boolean "Switch (and switch-ish) device support (EXPERIMENTAL)"
>> >>>+	depends on INET
>> >>>+	---help---
>> >>>+	  This module provides glue between core networking code and device
>> >>>+	  drivers in order to support hardware switch chips in very generic
>> >>>+	  meaning of the word "switch". This include devices supporting L2/L3 but
>> >>>+	  also various flow offloading chips, including switches embedded into
>> >>>+	  SR-IOV NICs.
>> >>>diff --git a/net/switchdev/Makefile b/net/switchdev/Makefile
>> >>>new file mode 100644
>> >>>index 0000000..5ed63ed
>> >>>--- /dev/null
>> >>>+++ b/net/switchdev/Makefile
>> >>>@@ -0,0 +1,5 @@
>> >>>+#
>> >>>+# Makefile for the Switch device API
>> >>>+#
>> >>>+
>> >>>+obj-$(CONFIG_NET_SWITCHDEV) += switchdev.o
>> >>>diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
>> >>>new file mode 100644
>> >>>index 0000000..5010f646
>> >>>--- /dev/null
>> >>>+++ b/net/switchdev/switchdev.c
>> >>>@@ -0,0 +1,33 @@
>> >>>+/*
>> >>>+ * net/switchdev/switchdev.c - Switch device API
>> >>>+ * Copyright (c) 2014 Jiri Pirko
>> >>>+ *
>> >>>+ * This program is free software; you can redistribute it and/or modify
>> >>>+ * it under the terms of the GNU General Public License as published by
>> >>>+ * the Free Software Foundation; either version 2 of the License, or
>> >>>+ * (at your option) any later version.
>> >>>+ */
>> >>>+
>> >>>+#include
>> >>>+#include
>> >>>+#include
>> >>>+#include
>> >>>+#include
>> >>>+
>> >>>+/**
>> >>>+ * netdev_sw_parent_id_get - Get ID of a switch
>> >>>+ * @dev: port device
>> >>>+ * @psid: switch ID
>> >>>+ *
>> >>>+ * Get ID of a switch this port is part of.
>> >>>+ */
>> >>>+int netdev_sw_parent_id_get(struct net_device *dev,
>> >>>+			    struct netdev_phys_item_id *psid)
>> >>>+{
>> >>>+	const struct net_device_ops *ops = dev->netdev_ops;
>> >>>+
>> >>>+	if (!ops->ndo_sw_parent_id_get)
>> >>>+		return -EOPNOTSUPP;
>> >>>+	return ops->ndo_sw_parent_id_get(dev, psid);
>> >>>+}
>> >>>+EXPORT_SYMBOL(netdev_sw_parent_id_get);
>> >--
>> >To unsubscribe from this list: send the line "unsubscribe netdev" in
>> >the body of a message to majordomo@vger.kernel.org
>> >More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
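
For illustration only, here is a rough user-space sketch of how a caller could use the wrapper quoted above to tell whether two netdevices are ports of the same switch chip. It is not part of the patch. The struct definitions are cut-down stand-ins for the kernel ones, and someswitch_parent_id_get(), ports_share_switch(), the parent_id field and the sw0p0/sw0p1/eth0 instances are hypothetical names made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
#define MAX_PHYS_ITEM_ID_LEN 32

struct netdev_phys_item_id {
	unsigned char id[MAX_PHYS_ITEM_ID_LEN];
	unsigned char id_len;
};

struct net_device;

struct net_device_ops {
	int (*ndo_sw_parent_id_get)(struct net_device *dev,
				    struct netdev_phys_item_id *psid);
};

struct net_device {
	const char *name;
	const struct net_device_ops *netdev_ops;
	const struct netdev_phys_item_id *parent_id; /* mock-only field */
};

/* Same dispatch logic as the wrapper in the quoted patch. */
int netdev_sw_parent_id_get(struct net_device *dev,
			    struct netdev_phys_item_id *psid)
{
	const struct net_device_ops *ops = dev->netdev_ops;

	if (!ops->ndo_sw_parent_id_get)
		return -EOPNOTSUPP;
	return ops->ndo_sw_parent_id_get(dev, psid);
}

/* Hypothetical driver callback: every port reports its chip's ID. */
static int someswitch_parent_id_get(struct net_device *dev,
				    struct netdev_phys_item_id *psid)
{
	*psid = *dev->parent_id;
	return 0;
}

/* Hypothetical helper: true only if both devices report the same ID. */
bool ports_share_switch(struct net_device *a, struct net_device *b)
{
	struct netdev_phys_item_id ida, idb;

	assert(a && b);
	if (netdev_sw_parent_id_get(a, &ida) ||
	    netdev_sw_parent_id_get(b, &idb))
		return false; /* at least one is not a switch port */
	return ida.id_len == idb.id_len &&
	       !memcmp(ida.id, idb.id, ida.id_len);
}

/* Made-up example devices: two ports of one chip and an ordinary NIC. */
static const struct netdev_phys_item_id someswitch_id = {
	.id = { 0xde, 0xad, 0xbe, 0xef },
	.id_len = 4,
};

const struct net_device_ops someswitch_ops = {
	.ndo_sw_parent_id_get = someswitch_parent_id_get,
};

const struct net_device_ops plain_nic_ops = {
	.ndo_sw_parent_id_get = NULL,
};

struct net_device sw0p0 = { "sw0p0", &someswitch_ops, &someswitch_id };
struct net_device sw0p1 = { "sw0p1", &someswitch_ops, &someswitch_id };
struct net_device eth0 = { "eth0", &plain_nic_ops, NULL };
```

With these definitions, ports_share_switch(&sw0p0, &sw0p1) is true, while any comparison involving eth0 is false because the wrapper returns -EOPNOTSUPP for a device that does not implement the ndo. Callers only go through the wrapper, so a rename of the ndo would stay contained in the wrapper and the drivers.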