[net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support

netdev.vger.kernel.org archive mirror

* [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support
@ 2010-05-06  4:42 Scott Feldman
  2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 1/3] Add netdev/netlink port-profile support (was iovnl) Scott Feldman
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Scott Feldman @ 2010-05-06  4:42 UTC
  To: davem; +Cc: netdev, chrisw, arnd

(Resending to fix a bug found in testing get_vf_port_profile).

The following series adds port-profile netlink support and adds an
implementation to Cisco's enic netdev driver:

	1/3: Adds port-profile netlink RTM_SETLINK/RTM_GETLINK support, and
	     adds matching netdev ops ndo_{set|get}_vf_port_profile.

	2/3: Adds enic support for ndo_{set|get}_vf_port_profile for enic
	     dynamic devices.

	3/3: (please don't apply this 3rd patch) Enables SR-IOV support for
	     enic to illustrate support for port-profile netlink using SR-IOV-
	     compliant devices.

The SETLINK/GETLINK support follows the model for other IFLA_VF_* msgs used
for SR-IOV devices where the recipient of the netlink msg is the PF, but the
target is the VF.

The intent of this patch set is to cover both definitions of port-profile:
as defined by Cisco's enic use and as defined by the VSI Discovery Protocol
(VDP), used in VEPA implementations.  While both definitions are based on
pre-standards, the concept of a port-profile to be applied to an external
switch port on behalf of a virtual machine interface is common, as are many
of the fields defining the protocols.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>

* [net-next-2.6 V5 PATCH 1/3] Add netdev/netlink port-profile support (was iovnl)
  2010-05-06  4:42 [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support Scott Feldman
@ 2010-05-06  4:42 ` Scott Feldman
  2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 2/3] Add ndo_{set|get}_vf_port_profile op support for enic dynamic vnics Scott Feldman
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: Scott Feldman @ 2010-05-06  4:42 UTC
  To: davem; +Cc: netdev, chrisw, arnd

From: Scott Feldman <scofeldm@cisco.com>

Add new netdev ops ndo_{set|get}_vf_port_profile to allow setting of
port-profile on a netdev interface.  Extends netlink socket RTM_SETLINK/
RTM_GETLINK with new sub cmd called IFLA_VF_PORT_PROFILE (added to end of
IFLA_cmd list).

A port-profile is used to configure/enable the external switch port backing
the netdev interface, not to configure the host-facing side of the netdev.  A
port-profile is an identifier known to the switch.  How port-profiles are
installed on the switch or how available port-profiles are made known to the
host is outside the scope of this patch.

The general flow is that the port-profile is applied to a host netdev interface
using RTM_SETLINK, the receiver of the RTM_SETLINK msg (more about that later)
communicates with the switch, and the switch port backing the host netdev
interface is configured/enabled based on the settings defined by the port-
profile.  What those settings comprise, and how those settings are managed is
again outside the scope of this patch, since this patch only deals with the
first step in the flow.

Since we're using netlink sockets, the receiver of the RTM_SETLINK msg can
be in kernel- or user-space.  For the kernel-space recipient, rtnetlink.c, the
new ndo_set_vf_port_profile netdev op is called to set the port-profile.
User-space recipients can decide how they communicate the IFLA_VF_PORT_PROFILE
to the external switch.

There is an RTM_GETLINK cmd to return the port-profile setting of an
interface and to also return the status of the last port-profile.

IFLA_VF_PORT_PROFILE is modeled after the existing IFLA_VF_* cmd where a
VF number is passed in to identify the virtual function (VF) of an SR-IOV-
capable device.  In this case, the target of IFLA_VF_PORT_PROFILE msg is the
netdev physical function (PF) device.  The PF will apply the port-profile
to the VF.  IFLA_VF_PORT_PROFILE can also be used for devices that don't
adhere to SR-IOV and can apply the port-profile directly to the netdev
target.  In this case, the VF number is ignored.

Passing in a NULL port-profile is used to delete the port-profile association.
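
For illustration, a user-space sender might prepare the IFLA_VF_PORT_PROFILE
payload roughly as follows.  This is a minimal sketch, not part of the patch:
the struct is re-declared with fixed-width types so it builds without the
patched headers, the helpers pp_copy_str() and pp_fill() are hypothetical, and
treating an empty port_profile string as the "delete" request is an assumption
taken from the enic implementation in 2/3.  Wrapping the payload in an
RTM_SETLINK message is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Field sizes copied from the if_link.h hunk in this patch. */
#define IFLA_VF_PORT_PROFILE_MAX	40
#define IFLA_VF_UUID_MAX		40
#define IFLA_VF_CLIENT_NAME_MAX		40

/* User-space mirror of struct ifla_vf_port_profile (illustration only). */
struct ifla_vf_port_profile {
	uint32_t vf;
	uint32_t flags;
	uint32_t status;
	uint8_t port_profile[IFLA_VF_PORT_PROFILE_MAX];
	uint8_t mac[32];
	uint8_t host_uuid[IFLA_VF_UUID_MAX];
	uint8_t client_uuid[IFLA_VF_UUID_MAX];
	uint8_t client_name[IFLA_VF_CLIENT_NAME_MAX];
};

/*
 * Hypothetical helper: copy src into a fixed-size field.  A NULL src
 * leaves the field as it is; a string that does not fit together with
 * its terminating NUL is rejected with -1.
 */
static int pp_copy_str(uint8_t *dst, size_t dst_len, const char *src)
{
	size_t n;

	if (!src)
		return 0;
	n = strlen(src);
	if (n + 1 > dst_len)
		return -1;
	memcpy(dst, src, n + 1);
	return 0;
}

/*
 * Hypothetical helper: build the attribute payload for one VF.
 * A NULL name leaves port_profile empty, which this sketch assumes
 * means "delete the association".
 */
static int pp_fill(struct ifla_vf_port_profile *ivp, uint32_t vf,
		   const char *name, const char *client_name)
{
	assert(ivp != NULL);

	memset(ivp, 0, sizeof(*ivp));
	ivp->vf = vf;
	if (pp_copy_str(ivp->port_profile, sizeof(ivp->port_profile), name))
		return -1;
	return pp_copy_str(ivp->client_name, sizeof(ivp->client_name),
			   client_name);
}
```

A caller would populate the struct with pp_fill() and attach it as a binary
attribute of sizeof(struct ifla_vf_port_profile) bytes, matching the
NLA_BINARY policy entry added in rtnetlink.c.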

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
---
 include/linux/if_link.h   |   25 +++++++++++++++++++++++++
 include/linux/netdevice.h |   10 ++++++++++
 net/core/rtnetlink.c      |   39 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 73 insertions(+), 1 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index cfd420b..d763358 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -116,6 +116,7 @@ enum {
 	IFLA_VF_TX_RATE,	/* TX Bandwidth Allocation */
 	IFLA_VFINFO,
 	IFLA_STATS64,
+	IFLA_VF_PORT_PROFILE,
 	__IFLA_MAX
 };
 
@@ -259,4 +260,28 @@ struct ifla_vf_info {
 	__u32 qos;
 	__u32 tx_rate;
 };
+
+enum {
+	IFLA_VF_PORT_PROFILE_STATUS_UNKNOWN,
+	IFLA_VF_PORT_PROFILE_STATUS_SUCCESS,
+	IFLA_VF_PORT_PROFILE_STATUS_INPROGRESS,
+	IFLA_VF_PORT_PROFILE_STATUS_ERROR,
+};
+
+#define IFLA_VF_PORT_PROFILE_MAX	40
+#define IFLA_VF_UUID_MAX		40
+#define IFLA_VF_CLIENT_NAME_MAX		40
+
+struct ifla_vf_port_profile {
+	__u32 vf;
+	__u32 flags;
+	__u32 status;
+	__u8 port_profile[IFLA_VF_PORT_PROFILE_MAX];
+	__u8 mac[32];					/* MAX_ADDR_LEN */
+	/* UUID e.g. "CEEFD3B1-9E11-11DE-BDFD-000BAB01C0FB" */
+	__u8 host_uuid[IFLA_VF_UUID_MAX];
+	__u8 client_uuid[IFLA_VF_UUID_MAX];
+	__u8 client_name[IFLA_VF_CLIENT_NAME_MAX];	/* e.g. "vm0-eth1" */
+};
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3c5ed5f..949abdb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -696,6 +696,10 @@ struct netdev_rx_queue {
  * int (*ndo_set_vf_tx_rate)(struct net_device *dev, int vf, int rate);
  * int (*ndo_get_vf_config)(struct net_device *dev,
  *			    int vf, struct ifla_vf_info *ivf);
+ * int (*ndo_set_vf_port_profile)(struct net_device *dev, int vf,
+ *				  struct ifla_vf_port_profile *ivp);
+ * int (*ndo_get_vf_port_profile)(struct net_device *dev, int vf,
+ *				  struct ifla_vf_port_profile *ivp);
  */
 #define HAVE_NET_DEVICE_OPS
 struct net_device_ops {
@@ -744,6 +748,12 @@ struct net_device_ops {
 	int			(*ndo_get_vf_config)(struct net_device *dev,
 						     int vf,
 						     struct ifla_vf_info *ivf);
+	int			(*ndo_set_vf_port_profile)(
+					struct net_device *dev, int vf,
+					struct ifla_vf_port_profile *ivp);
+	int			(*ndo_get_vf_port_profile)(
+					struct net_device *dev, int vf,
+					struct ifla_vf_port_profile *ivp);
 #if defined(CONFIG_FCOE) || defined(CONFIG_FCOE_MODULE)
 	int			(*ndo_fcoe_enable)(struct net_device *dev);
 	int			(*ndo_fcoe_disable)(struct net_device *dev);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 78c8598..e427a70 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -747,17 +747,40 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 		goto nla_put_failure;
 	copy_rtnl_link_stats64(nla_data(attr), stats);
 
+	if (dev->dev.parent)
+		NLA_PUT_U32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent));
+
 	if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent) {
 		int i;
 		struct ifla_vf_info ivi;
 
-		NLA_PUT_U32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent));
 		for (i = 0; i < dev_num_vf(dev->dev.parent); i++) {
 			if (dev->netdev_ops->ndo_get_vf_config(dev, i, &ivi))
 				break;
 			NLA_PUT(skb, IFLA_VFINFO, sizeof(ivi), &ivi);
 		}
 	}
+
+	if (dev->netdev_ops->ndo_get_vf_port_profile && dev->dev.parent) {
+		struct ifla_vf_port_profile ivp;
+
+		if (dev_num_vf(dev->dev.parent)) {
+			int i;
+
+			for (i = 0; i < dev_num_vf(dev->dev.parent); i++) {
+				if (dev->netdev_ops->ndo_get_vf_port_profile(
+					dev, i, &ivp))
+					break;
+				NLA_PUT(skb, IFLA_VF_PORT_PROFILE,
+					sizeof(ivp), &ivp);
+			}
+		} else if (!dev->netdev_ops->ndo_get_vf_port_profile(dev,
+			0, &ivp)) {
+			NLA_PUT(skb, IFLA_VF_PORT_PROFILE,
+				sizeof(ivp), &ivp);
+		}
+	}
+
 	if (dev->rtnl_link_ops) {
 		if (rtnl_link_fill(skb, dev) < 0)
 			goto nla_put_failure;
@@ -824,6 +847,8 @@ const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 				    .len = sizeof(struct ifla_vf_vlan) },
 	[IFLA_VF_TX_RATE]	= { .type = NLA_BINARY,
 				    .len = sizeof(struct ifla_vf_tx_rate) },
+	[IFLA_VF_PORT_PROFILE]	= { .type = NLA_BINARY,
+				    .len = sizeof(struct ifla_vf_port_profile)},
 };
 EXPORT_SYMBOL(ifla_policy);
 
@@ -1028,6 +1053,18 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
 	}
 	err = 0;
 
+	if (tb[IFLA_VF_PORT_PROFILE]) {
+		struct ifla_vf_port_profile *ivp;
+		ivp = nla_data(tb[IFLA_VF_PORT_PROFILE]);
+		err = -EOPNOTSUPP;
+		if (ops->ndo_set_vf_port_profile)
+			err = ops->ndo_set_vf_port_profile(dev, ivp->vf, ivp);
+		if (err < 0)
+			goto errout;
+		modified = 1;
+	}
+	err = 0;
+
 errout:
 	if (err < 0 && modified && net_ratelimit())
 		printk(KERN_WARNING "A link change request failed with "


* [net-next-2.6 V5 PATCH 2/3] Add ndo_{set|get}_vf_port_profile op support for enic dynamic vnics
  2010-05-06  4:42 [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support Scott Feldman
  2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 1/3] Add netdev/netlink port-profile support (was iovnl) Scott Feldman
@ 2010-05-06  4:42 ` Scott Feldman
  2010-05-06 13:47   ` Arnd Bergmann
  2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 3/3] Add SR-IOV support to enic (please don't apply this patch) Scott Feldman
  2010-05-06 13:51 ` [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support Arnd Bergmann
  3 siblings, 1 reply; 20+ messages in thread
From: Scott Feldman @ 2010-05-06  4:42 UTC
  To: davem; +Cc: netdev, chrisw, arnd

From: Scott Feldman <scofeldm@cisco.com>

Add enic ndo_{set|get}_vf_port_profile ops to support setting/getting
port-profile for enic dynamic devices.  Enic dynamic devices are just like
normal enic eth devices except dynamic enics require an extra configuration
step to assign a port-profile identifier to the interface before the
interface is useable.  Once a port-profile is assigned, link comes up on the
interface and is ready for I/O.  The port-profile is used to configure the
network port assigned to the interface.  The network port configuration
includes VLAN membership, QoS policies, and port security settings typical
of a data center network.

A dynamic enic is assigned a default random mac address.  If no mac address
parameter is specified in the ndo_set_vf_port_profile op, the default random
mac address is used when assigning the port-profile.  Otherwise the mac
address specified in the op is used.
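
The MAC selection rule described above can be sketched in user space as
follows.  This is only an illustration under stated assumptions: the helpers
pp_mac_is_zero(), pp_mac_is_valid() and pp_select_mac() are hypothetical
stand-ins for the kernel's is_zero_ether_addr()/is_valid_ether_addr() checks
used in enic_set_vf_port_profile(), and "no mac address specified" is taken to
mean an all-zero address, as in the driver code below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Stand-in for is_zero_ether_addr(): true if all six octets are zero. */
static int pp_mac_is_zero(const uint8_t *a)
{
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		if (a[i])
			return 0;
	return 1;
}

/*
 * Stand-in for is_valid_ether_addr(): a usable unicast address is
 * neither all-zero nor multicast (low bit of the first octet set).
 */
static int pp_mac_is_valid(const uint8_t *a)
{
	return !(a[0] & 0x01) && !pp_mac_is_zero(a);
}

/*
 * Pick the MAC to send with the port-profile: the requested one if the
 * caller supplied it, otherwise the interface's default address.
 * Returns NULL when the chosen address is not a valid unicast address.
 */
static const uint8_t *pp_select_mac(const uint8_t *requested,
				    const uint8_t *dev_default)
{
	const uint8_t *mac;

	assert(requested != NULL && dev_default != NULL);

	mac = pp_mac_is_zero(requested) ? dev_default : requested;
	return pp_mac_is_valid(mac) ? mac : NULL;
}
```

The NULL return corresponds to the -EADDRNOTAVAIL path in the driver.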

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
---
 drivers/net/enic/Makefile    |    2 
 drivers/net/enic/enic.h      |    3 -
 drivers/net/enic/enic_main.c |  200 +++++++++++++++++++++++++++++++++++++-----
 drivers/net/enic/vnic_dev.c  |   50 +++++++++++
 drivers/net/enic/vnic_dev.h  |    3 +
 drivers/net/enic/vnic_vic.c  |   73 +++++++++++++++
 drivers/net/enic/vnic_vic.h  |   59 ++++++++++++
 7 files changed, 363 insertions(+), 27 deletions(-)

diff --git a/drivers/net/enic/Makefile b/drivers/net/enic/Makefile
index 391c3bc..e7b6c31 100644
--- a/drivers/net/enic/Makefile
+++ b/drivers/net/enic/Makefile
@@ -1,5 +1,5 @@
 obj-$(CONFIG_ENIC) := enic.o
 
 enic-y := enic_main.o vnic_cq.o vnic_intr.o vnic_wq.o \
-	enic_res.o vnic_dev.o vnic_rq.o
+	enic_res.o vnic_dev.o vnic_rq.o vnic_vic.o
 
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 5fa56f1..718033f 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -34,7 +34,7 @@
 
 #define DRV_NAME		"enic"
 #define DRV_DESCRIPTION		"Cisco VIC Ethernet NIC Driver"
-#define DRV_VERSION		"1.3.1.1"
+#define DRV_VERSION		"1.3.1.1-pp"
 #define DRV_COPYRIGHT		"Copyright 2008-2009 Cisco Systems, Inc"
 #define PFX			DRV_NAME ": "
 
@@ -95,6 +95,7 @@ struct enic {
 	u32 port_mtu;
 	u32 rx_coalesce_usecs;
 	u32 tx_coalesce_usecs;
+	struct ifla_vf_port_profile pp;
 
 	/* work queue cache line section */
 	____cacheline_aligned struct vnic_wq wq[ENIC_WQ_MAX];
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 1232887..8e5e46b 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -29,6 +29,7 @@
 #include <linux/etherdevice.h>
 #include <linux/if_ether.h>
 #include <linux/if_vlan.h>
+#include <linux/if_link.h>
 #include <linux/ethtool.h>
 #include <linux/in.h>
 #include <linux/ip.h>
@@ -40,6 +41,7 @@
 #include "vnic_dev.h"
 #include "vnic_intr.h"
 #include "vnic_stats.h"
+#include "vnic_vic.h"
 #include "enic_res.h"
 #include "enic.h"
 
@@ -49,10 +51,12 @@
 #define ENIC_DESC_MAX_SPLITS		(MAX_TSO / WQ_ENET_MAX_DESC_LEN + 1)
 
 #define PCI_DEVICE_ID_CISCO_VIC_ENET         0x0043  /* ethernet vnic */
+#define PCI_DEVICE_ID_CISCO_VIC_ENET_DYN     0x0044  /* enet dynamic vnic */
 
 /* Supported devices */
 static DEFINE_PCI_DEVICE_TABLE(enic_id_table) = {
 	{ PCI_VDEVICE(CISCO, PCI_DEVICE_ID_CISCO_VIC_ENET) },
+	{ PCI_VDEVICE(CISCO, PCI_DEVICE_ID_CISCO_VIC_ENET_DYN) },
 	{ 0, }	/* end of table */
 };
 
@@ -113,6 +117,11 @@ static const struct enic_stat enic_rx_stats[] = {
 static const unsigned int enic_n_tx_stats = ARRAY_SIZE(enic_tx_stats);
 static const unsigned int enic_n_rx_stats = ARRAY_SIZE(enic_rx_stats);
 
+static int enic_is_dynamic(struct enic *enic)
+{
+	return enic->pdev->device == PCI_DEVICE_ID_CISCO_VIC_ENET_DYN;
+}
+
 static int enic_get_settings(struct net_device *netdev,
 	struct ethtool_cmd *ecmd)
 {
@@ -810,14 +819,24 @@ static void enic_reset_mcaddrs(struct enic *enic)
 
 static int enic_set_mac_addr(struct net_device *netdev, char *addr)
 {
-	if (!is_valid_ether_addr(addr))
-		return -EADDRNOTAVAIL;
+	struct enic *enic = netdev_priv(netdev);
 
-	memcpy(netdev->dev_addr, addr, netdev->addr_len);
+	if (enic_is_dynamic(enic)) {
+		random_ether_addr(netdev->dev_addr);
+	} else {
+		if (!is_valid_ether_addr(addr))
+			return -EADDRNOTAVAIL;
+		memcpy(netdev->dev_addr, addr, netdev->addr_len);
+	}
 
 	return 0;
 }
 
+static int enic_set_mac_address(struct net_device *netdev, void *p)
+{
+	return -EOPNOTSUPP;
+}
+
 /* netif_tx_lock held, BHs disabled */
 static void enic_set_multicast_list(struct net_device *netdev)
 {
@@ -922,6 +941,131 @@ static void enic_tx_timeout(struct net_device *netdev)
 	schedule_work(&enic->reset);
 }
 
+static int enic_vnic_dev_deinit(struct enic *enic)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_deinit(enic->vdev);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+static int enic_dev_init_prov(struct enic *enic, struct vic_provinfo *vp)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_init_prov(enic->vdev,
+		(u8 *)vp, vic_provinfo_size(vp));
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+static int enic_dev_init_done(struct enic *enic, int *done, int *error)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_init_done(enic->vdev, done, error);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
+static int enic_provinfo_add_tlv_str(struct vic_provinfo *vp, u16 type,
+	u16 max_length, char *str)
+{
+	if (!str)
+		return 0;
+
+	if (strlen(str) + 1 > max_length)
+		return 0;
+
+	return vic_provinfo_add_tlv(vp, type, strlen(str) + 1, str);
+}
+
+static int enic_set_vf_port_profile(struct net_device *netdev, int vf,
+	struct ifla_vf_port_profile *ivp)
+{
+	struct enic *enic = netdev_priv(netdev);
+	struct vic_provinfo *vp;
+	u8 oui[3] = VIC_PROVINFO_CISCO_OUI;
+	u8 *mac = ivp->mac;
+	int err;
+
+	if (!enic_is_dynamic(enic))
+		return -EOPNOTSUPP;
+
+	memset(&enic->pp, 0, sizeof(enic->pp));
+
+	enic_vnic_dev_deinit(enic);
+
+	if (strlen(ivp->port_profile) == 0)
+		return 0;
+
+	if (is_zero_ether_addr(mac))
+		mac = netdev->dev_addr;
+
+	if (!is_valid_ether_addr(mac))
+		return -EADDRNOTAVAIL;
+
+	vp = vic_provinfo_alloc(GFP_KERNEL, oui, VIC_PROVINFO_LINUX_TYPE);
+	if (!vp)
+		return -ENOMEM;
+
+	enic_provinfo_add_tlv_str(vp, VIC_LINUX_PROV_TLV_PORT_PROFILE_NAME_STR,
+		IFLA_VF_PORT_PROFILE_MAX, ivp->port_profile);
+	vic_provinfo_add_tlv(vp, VIC_LINUX_PROV_TLV_CLIENT_MAC_ADDR,
+		ETH_ALEN, mac);
+	enic_provinfo_add_tlv_str(vp, VIC_LINUX_PROV_TLV_HOST_UUID_STR,
+		IFLA_VF_UUID_MAX, ivp->host_uuid);
+	enic_provinfo_add_tlv_str(vp, VIC_LINUX_PROV_TLV_CLIENT_UUID_STR,
+		IFLA_VF_UUID_MAX, ivp->client_uuid);
+	enic_provinfo_add_tlv_str(vp, VIC_LINUX_PROV_TLV_CLIENT_NAME_STR,
+		IFLA_VF_CLIENT_NAME_MAX, ivp->client_name);
+
+	err = enic_dev_init_prov(enic, vp);
+	if (err)
+		goto err_out;
+
+	memcpy(&enic->pp, ivp, sizeof(enic->pp));
+
+err_out:
+	vic_provinfo_free(vp);
+
+	return err;
+}
+
+static int enic_get_vf_port_profile(struct net_device *netdev, int vf,
+	struct ifla_vf_port_profile *ivp)
+{
+	struct enic *enic = netdev_priv(netdev);
+	int err, error, done;
+
+	if (!enic_is_dynamic(enic))
+		return -EOPNOTSUPP;
+
+	enic->pp.status = IFLA_VF_PORT_PROFILE_STATUS_UNKNOWN;
+
+	err = enic_dev_init_done(enic, &done, &error);
+
+	if (err || error)
+		enic->pp.status = IFLA_VF_PORT_PROFILE_STATUS_ERROR;
+
+	if (!done)
+		enic->pp.status = IFLA_VF_PORT_PROFILE_STATUS_INPROGRESS;
+
+	if (!error)
+		enic->pp.status = IFLA_VF_PORT_PROFILE_STATUS_SUCCESS;
+
+	memcpy(ivp, &enic->pp, sizeof(enic->pp));
+
+	return 0;
+}
+
 static void enic_free_rq_buf(struct vnic_rq *rq, struct vnic_rq_buf *buf)
 {
 	struct enic *enic = vnic_dev_priv(rq->vdev);
@@ -1440,10 +1584,12 @@ static int enic_open(struct net_device *netdev)
 	for (i = 0; i < enic->rq_count; i++)
 		vnic_rq_enable(&enic->rq[i]);
 
-	spin_lock(&enic->devcmd_lock);
-	enic_add_station_addr(enic);
-	spin_unlock(&enic->devcmd_lock);
-	enic_set_multicast_list(netdev);
+	if (!enic_is_dynamic(enic)) {
+		spin_lock(&enic->devcmd_lock);
+		enic_add_station_addr(enic);
+		spin_unlock(&enic->devcmd_lock);
+		enic_set_multicast_list(netdev);
+	}
 
 	netif_wake_queue(netdev);
 	napi_enable(&enic->napi);
@@ -1775,20 +1921,22 @@ static void enic_clear_intr_mode(struct enic *enic)
 }
 
 static const struct net_device_ops enic_netdev_ops = {
-	.ndo_open		= enic_open,
-	.ndo_stop		= enic_stop,
-	.ndo_start_xmit		= enic_hard_start_xmit,
-	.ndo_get_stats		= enic_get_stats,
-	.ndo_validate_addr	= eth_validate_addr,
-	.ndo_set_mac_address 	= eth_mac_addr,
-	.ndo_set_multicast_list	= enic_set_multicast_list,
-	.ndo_change_mtu		= enic_change_mtu,
-	.ndo_vlan_rx_register	= enic_vlan_rx_register,
-	.ndo_vlan_rx_add_vid	= enic_vlan_rx_add_vid,
-	.ndo_vlan_rx_kill_vid	= enic_vlan_rx_kill_vid,
-	.ndo_tx_timeout		= enic_tx_timeout,
+	.ndo_open			= enic_open,
+	.ndo_stop			= enic_stop,
+	.ndo_start_xmit			= enic_hard_start_xmit,
+	.ndo_get_stats			= enic_get_stats,
+	.ndo_validate_addr		= eth_validate_addr,
+	.ndo_set_multicast_list		= enic_set_multicast_list,
+	.ndo_set_mac_address		= enic_set_mac_address,
+	.ndo_change_mtu			= enic_change_mtu,
+	.ndo_vlan_rx_register		= enic_vlan_rx_register,
+	.ndo_vlan_rx_add_vid		= enic_vlan_rx_add_vid,
+	.ndo_vlan_rx_kill_vid		= enic_vlan_rx_kill_vid,
+	.ndo_tx_timeout			= enic_tx_timeout,
+	.ndo_set_vf_port_profile	= enic_set_vf_port_profile,
+	.ndo_get_vf_port_profile	= enic_get_vf_port_profile,
 #ifdef CONFIG_NET_POLL_CONTROLLER
-	.ndo_poll_controller	= enic_poll_controller,
+	.ndo_poll_controller		= enic_poll_controller,
 #endif
 };
 
@@ -2010,11 +2158,13 @@ static int __devinit enic_probe(struct pci_dev *pdev,
 
 	netif_carrier_off(netdev);
 
-	err = vnic_dev_init(enic->vdev, 0);
-	if (err) {
-		printk(KERN_ERR PFX
-			"vNIC dev init failed, aborting.\n");
-		goto err_out_dev_close;
+	if (!enic_is_dynamic(enic)) {
+		err = vnic_dev_init(enic->vdev, 0);
+		if (err) {
+			printk(KERN_ERR PFX
+				"vNIC dev init failed, aborting.\n");
+			goto err_out_dev_close;
+		}
 	}
 
 	err = enic_dev_init(enic);
diff --git a/drivers/net/enic/vnic_dev.c b/drivers/net/enic/vnic_dev.c
index d43a9d4..e351b0f 100644
--- a/drivers/net/enic/vnic_dev.c
+++ b/drivers/net/enic/vnic_dev.c
@@ -682,6 +682,56 @@ int vnic_dev_init(struct vnic_dev *vdev, int arg)
 	return r;
 }
 
+int vnic_dev_init_done(struct vnic_dev *vdev, int *done, int *err)
+{
+	u64 a0 = 0, a1 = 0;
+	int wait = 1000;
+	int ret;
+
+	*done = 0;
+
+	ret = vnic_dev_cmd(vdev, CMD_INIT_STATUS, &a0, &a1, wait);
+	if (ret)
+		return ret;
+
+	*done = (a0 == 0);
+
+	*err = (a0 == 0) ? a1 : 0;
+
+	return 0;
+}
+
+int vnic_dev_init_prov(struct vnic_dev *vdev, u8 *buf, u32 len)
+{
+	u64 a0, a1 = len;
+	int wait = 1000;
+	u64 prov_pa;
+	void *prov_buf;
+	int ret;
+
+	prov_buf = pci_alloc_consistent(vdev->pdev, len, &prov_pa);
+	if (!prov_buf)
+		return -ENOMEM;
+
+	memcpy(prov_buf, buf, len);
+
+	a0 = prov_pa;
+
+	ret = vnic_dev_cmd(vdev, CMD_INIT_PROV_INFO, &a0, &a1, wait);
+
+	pci_free_consistent(vdev->pdev, len, prov_buf, prov_pa);
+
+	return ret;
+}
+
+int vnic_dev_deinit(struct vnic_dev *vdev)
+{
+	u64 a0 = 0, a1 = 0;
+	int wait = 1000;
+
+	return vnic_dev_cmd(vdev, CMD_DEINIT, &a0, &a1, wait);
+}
+
 int vnic_dev_link_status(struct vnic_dev *vdev)
 {
 	if (vdev->linkstatus)
diff --git a/drivers/net/enic/vnic_dev.h b/drivers/net/enic/vnic_dev.h
index f5be640..27f5a5a 100644
--- a/drivers/net/enic/vnic_dev.h
+++ b/drivers/net/enic/vnic_dev.h
@@ -124,6 +124,9 @@ int vnic_dev_disable(struct vnic_dev *vdev);
 int vnic_dev_open(struct vnic_dev *vdev, int arg);
 int vnic_dev_open_done(struct vnic_dev *vdev, int *done);
 int vnic_dev_init(struct vnic_dev *vdev, int arg);
+int vnic_dev_init_done(struct vnic_dev *vdev, int *done, int *err);
+int vnic_dev_init_prov(struct vnic_dev *vdev, u8 *buf, u32 len);
+int vnic_dev_deinit(struct vnic_dev *vdev);
 int vnic_dev_soft_reset(struct vnic_dev *vdev, int arg);
 int vnic_dev_soft_reset_done(struct vnic_dev *vdev, int *done);
 void vnic_dev_set_intr_mode(struct vnic_dev *vdev,
diff --git a/drivers/net/enic/vnic_vic.c b/drivers/net/enic/vnic_vic.c
new file mode 100644
index 0000000..d769772
--- /dev/null
+++ b/drivers/net/enic/vnic_vic.c
@@ -0,0 +1,73 @@
+/*
+ * Copyright 2010 Cisco Systems, Inc.  All rights reserved.
+ *
+ * This program is free software; you may redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+
+#include "vnic_vic.h"
+
+struct vic_provinfo *vic_provinfo_alloc(gfp_t flags, u8 *oui, u8 type)
+{
+	struct vic_provinfo *vp = kzalloc(VIC_PROVINFO_MAX_DATA, flags);
+
+	if (!vp || !oui)
+		return NULL;
+
+	memcpy(vp->oui, oui, sizeof(vp->oui));
+	vp->type = type;
+	vp->length = htonl(sizeof(vp->num_tlvs));
+
+	return vp;
+}
+
+void vic_provinfo_free(struct vic_provinfo *vp)
+{
+	kfree(vp);
+}
+
+int vic_provinfo_add_tlv(struct vic_provinfo *vp, u16 type, u16 length,
+	void *value)
+{
+	struct vic_provinfo_tlv *tlv;
+
+	if (!vp || !value)
+		return -EINVAL;
+
+	if (ntohl(vp->length) + sizeof(*tlv) + length >
+		VIC_PROVINFO_MAX_TLV_DATA)
+		return -ENOMEM;
+
+	tlv = (struct vic_provinfo_tlv *)((u8 *)vp->tlv +
+		ntohl(vp->length) - sizeof(vp->num_tlvs));
+
+	tlv->type = htons(type);
+	tlv->length = htons(length);
+	memcpy(tlv->value, value, length);
+
+	vp->num_tlvs = htonl(ntohl(vp->num_tlvs) + 1);
+	vp->length = htonl(ntohl(vp->length) + sizeof(*tlv) + length);
+
+	return 0;
+}
+
+size_t vic_provinfo_size(struct vic_provinfo *vp)
+{
+	return vp ?  ntohl(vp->length) + sizeof(*vp) - sizeof(vp->num_tlvs) : 0;
+}
diff --git a/drivers/net/enic/vnic_vic.h b/drivers/net/enic/vnic_vic.h
new file mode 100644
index 0000000..085c2a2
--- /dev/null
+++ b/drivers/net/enic/vnic_vic.h
@@ -0,0 +1,59 @@
+/*
+ * Copyright 2010 Cisco Systems, Inc.  All rights reserved.
+ *
+ * This program is free software; you may redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef _VNIC_VIC_H_
+#define _VNIC_VIC_H_
+
+/* Note: All integer fields in NETWORK byte order */
+
+/* Note: String field lengths include null char */
+
+#define VIC_PROVINFO_CISCO_OUI		{ 0x00, 0x00, 0x0c }
+#define VIC_PROVINFO_LINUX_TYPE		0x2
+
+enum vic_linux_prov_tlv_type {
+	VIC_LINUX_PROV_TLV_PORT_PROFILE_NAME_STR = 0,
+	VIC_LINUX_PROV_TLV_CLIENT_MAC_ADDR = 1,			/* u8[6] */
+	VIC_LINUX_PROV_TLV_CLIENT_NAME_STR = 2,
+	VIC_LINUX_PROV_TLV_HOST_UUID_STR = 8,
+	VIC_LINUX_PROV_TLV_CLIENT_UUID_STR = 9,
+};
+
+struct vic_provinfo {
+	u8 oui[3];		/* OUI of data provider */
+	u8 type;		/* provider-specific type */
+	u32 length;		/* length of data below */
+	u32 num_tlvs;		/* number of tlvs */
+	struct vic_provinfo_tlv {
+		u16 type;
+		u16 length;
+		u8 value[0];
+	} tlv[0];
+} __attribute__ ((packed));
+
+#define VIC_PROVINFO_MAX_DATA		1385
+#define VIC_PROVINFO_MAX_TLV_DATA (VIC_PROVINFO_MAX_DATA - \
+	sizeof(struct vic_provinfo))
+
+struct vic_provinfo *vic_provinfo_alloc(gfp_t flags, u8 *oui, u8 type);
+void vic_provinfo_free(struct vic_provinfo *vp);
+int vic_provinfo_add_tlv(struct vic_provinfo *vp, u16 type, u16 length,
+	void *value);
+size_t vic_provinfo_size(struct vic_provinfo *vp);
+
+#endif	/* _VNIC_VIC_H_ */
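
As a reading aid for the vic_provinfo layout above, here is a user-space
sketch of the same wire format: a 12-byte header (3-byte OUI, 1-byte type,
32-bit length, 32-bit TLV count) followed by TLVs, all integers in network
byte order.  It is not the driver code.  The names prov_buf, prov_init() and
prov_add_tlv() are hypothetical, and the bounds check here is simply against
the total buffer size rather than the driver's VIC_PROVINFO_MAX_TLV_DATA test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROV_HDR_LEN	12	/* oui[3] + type + length + num_tlvs */
#define PROV_TLV_HDR	4	/* u16 type + u16 length */
#define PROV_MAX_DATA	1385	/* VIC_PROVINFO_MAX_DATA in the patch */

struct prov_buf {
	uint8_t data[PROV_MAX_DATA];
	size_t used;		/* bytes used, header included */
	uint32_t num_tlvs;
};

static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)(v & 0xff);
}

/* The length field counts num_tlvs plus all TLV bytes: used - 8. */
static void prov_sync_header(struct prov_buf *pb)
{
	put_be32(pb->data + 4, (uint32_t)(pb->used - 8));
	put_be32(pb->data + 8, pb->num_tlvs);
}

static void prov_init(struct prov_buf *pb, const uint8_t oui[3], uint8_t type)
{
	memset(pb, 0, sizeof(*pb));
	memcpy(pb->data, oui, 3);
	pb->data[3] = type;
	pb->used = PROV_HDR_LEN;
	prov_sync_header(pb);
}

/* Append one TLV; returns -1 if it would not fit in the buffer. */
static int prov_add_tlv(struct prov_buf *pb, uint16_t type,
			const void *value, uint16_t length)
{
	uint8_t *p;

	assert(pb != NULL && value != NULL);

	if (pb->used + PROV_TLV_HDR + (size_t)length > sizeof(pb->data))
		return -1;

	p = pb->data + pb->used;
	put_be16(p, type);
	put_be16(p + 2, length);
	memcpy(p + PROV_TLV_HDR, value, length);

	pb->used += PROV_TLV_HDR + (size_t)length;
	pb->num_tlvs++;
	prov_sync_header(pb);
	return 0;
}
```

String TLVs would be added with a length that includes the terminating NUL,
matching the note in vnic_vic.h.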


* Re: [net-next-2.6 V5 PATCH 2/3] Add ndo_{set|get}_vf_port_profile op support for enic dynamic vnics
  2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 2/3] Add ndo_{set|get}_vf_port_profile op support for enic dynamic vnics Scott Feldman
@ 2010-05-06 13:47   ` Arnd Bergmann
  2010-05-06 16:25     ` Scott Feldman
  0 siblings, 1 reply; 20+ messages in thread
From: Arnd Bergmann @ 2010-05-06 13:47 UTC
  To: Scott Feldman; +Cc: davem, netdev, chrisw

On Thursday 06 May 2010, Scott Feldman wrote:
> @@ -810,14 +819,24 @@ static void enic_reset_mcaddrs(struct enic *enic)
>  
>  static int enic_set_mac_addr(struct net_device *netdev, char *addr)
>  {
> -       if (!is_valid_ether_addr(addr))
> -               return -EADDRNOTAVAIL;
> +       struct enic *enic = netdev_priv(netdev);
>  
> -       memcpy(netdev->dev_addr, addr, netdev->addr_len);
> +       if (enic_is_dynamic(enic)) {
> +               random_ether_addr(netdev->dev_addr);
> +       } else {
> +               if (!is_valid_ether_addr(addr))
> +                       return -EADDRNOTAVAIL;
> +               memcpy(netdev->dev_addr, addr, netdev->addr_len);
> +       }
>  
>         return 0;
>  }
>  
> +static int enic_set_mac_address(struct net_device *netdev, void *p)
> +{
> +       return -EOPNOTSUPP;
> +}
> +
>  /* netif_tx_lock held, BHs disabled */
>  static void enic_set_multicast_list(struct net_device *netdev)
>  {

This looks funny. So you just ignore the address that gets passed to
enic_set_mac_addr for dynamic interfaces and instead set a random
address?

	Arnd

* Re: [net-next-2.6 V5 PATCH 2/3] Add ndo_{set|get}_vf_port_profile op support for enic dynamic vnics
  2010-05-06 13:47   ` Arnd Bergmann
@ 2010-05-06 16:25     ` Scott Feldman
  2010-05-06 16:45       ` Arnd Bergmann
  0 siblings, 1 reply; 20+ messages in thread
From: Scott Feldman @ 2010-05-06 16:25 UTC
  To: Arnd Bergmann; +Cc: davem, netdev, chrisw

On 5/6/10 6:47 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:

> On Thursday 06 May 2010, Scott Feldman wrote:
>> @@ -810,14 +819,24 @@ static void enic_reset_mcaddrs(struct enic *enic)
>>  
>>  static int enic_set_mac_addr(struct net_device *netdev, char *addr)
>>  {
>> -       if (!is_valid_ether_addr(addr))
>> -               return -EADDRNOTAVAIL;
>> +       struct enic *enic = netdev_priv(netdev);
>>  
>> -       memcpy(netdev->dev_addr, addr, netdev->addr_len);
>> +       if (enic_is_dynamic(enic)) {
>> +               random_ether_addr(netdev->dev_addr);
>> +       } else {
>> +               if (!is_valid_ether_addr(addr))
>> +                       return -EADDRNOTAVAIL;
>> +               memcpy(netdev->dev_addr, addr, netdev->addr_len);
>> +       }
>>  
>>         return 0;
>>  }
>>  
>> +static int enic_set_mac_address(struct net_device *netdev, void *p)
>> +{
>> +       return -EOPNOTSUPP;
>> +}
>> +
>>  /* netif_tx_lock held, BHs disabled */
>>  static void enic_set_multicast_list(struct net_device *netdev)
>>  {
> 
> This looks funny. So you just ignore the address that gets passed to
> enic_set_mac_addr for dynamic interfaces and instead set a random
> address?

Dynamic enics have all-zero mac address on init, so we assign a random mac
addr to the interface.  This would seem less funny:

    if (enic_is_dynamic(enic) && is_zero_ether_addr(addr))
        random_ether_addr(netdev->dev_addr);
    else
        ...

I'll make that change and resubmit with your VDP additions if you like.

-scott


* Re: [net-next-2.6 V5 PATCH 2/3] Add ndo_{set|get}_vf_port_profile op support for enic dynamic vnics
  2010-05-06 16:25     ` Scott Feldman
@ 2010-05-06 16:45       ` Arnd Bergmann
  0 siblings, 0 replies; 20+ messages in thread
From: Arnd Bergmann @ 2010-05-06 16:45 UTC
  To: Scott Feldman; +Cc: davem, netdev, chrisw

On Thursday 06 May 2010, Scott Feldman wrote:
> Dynamic enics have all-zero mac address on init, so we assign a random mac
> addr to the interface.  This would seem less funny:
> 
>     if (enic_is_dynamic(enic) && is_zero_ether_addr(addr))
>         random_ether_addr(netdev->dev_addr);
>     else
>         ...
> 
> I'll make that change and resubmit with your VDP additions if you like.

The change is ok, but what I think would be more helpful is a code comment
with your above sentence.

	Arnd
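
For readers following the thread, the change discussed above (random address
only when a dynamic enic still has an all-zero MAC, plus the explanatory
comment) might look roughly like this.  It is a user-space sketch under
assumptions, not the resubmitted patch: sketch_set_mac_addr() and its helpers
are hypothetical, the is_dynamic flag replaces enic_is_dynamic(), and rand()
stands in for the kernel's random_ether_addr().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

static int sketch_mac_is_zero(const uint8_t *a)
{
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		if (a[i])
			return 0;
	return 1;
}

/* Valid unicast address: not multicast and not all-zero. */
static int sketch_mac_is_valid(const uint8_t *a)
{
	return !(a[0] & 0x01) && !sketch_mac_is_zero(a);
}

/* Random unicast, locally administered address (rand() for brevity). */
static void sketch_random_mac(uint8_t *addr)
{
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		addr[i] = (uint8_t)(rand() & 0xff);
	addr[0] &= 0xfe;	/* clear multicast bit */
	addr[0] |= 0x02;	/* set locally administered bit */
}

static int sketch_set_mac_addr(uint8_t *dev_addr, const uint8_t *addr,
			       int is_dynamic)
{
	assert(dev_addr != NULL && addr != NULL);

	/*
	 * Dynamic enics have an all-zero mac address on init, so
	 * assign a random mac addr to the interface in that case.
	 */
	if (is_dynamic && sketch_mac_is_zero(addr)) {
		sketch_random_mac(dev_addr);
		return 0;
	}

	if (!sketch_mac_is_valid(addr))
		return -EADDRNOTAVAIL;

	memcpy(dev_addr, addr, ETH_ALEN);
	return 0;
}
```

With this shape, a dynamic enic that is handed a real address keeps it, and a
static enic still rejects an all-zero address.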

* [net-next-2.6 V5 PATCH 3/3] Add SR-IOV support to enic (please don't apply this patch)
  2010-05-06  4:42 [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support Scott Feldman
  2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 1/3] Add netdev/netlink port-profile support (was iovnl) Scott Feldman
  2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 2/3] Add ndo_{set|get}_vf_port_profile op support for enic dynamic vnics Scott Feldman
@ 2010-05-06  4:42 ` Scott Feldman
  2010-05-06 13:51 ` [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support Arnd Bergmann
  3 siblings, 0 replies; 20+ messages in thread
From: Scott Feldman @ 2010-05-06  4:42 UTC
  To: davem; +Cc: netdev, chrisw, arnd

From: Scott Feldman <scofeldm@cisco.com>

This patch is to illustrate how port-profiles will be assigned to VFs in
a compliant SR-IOV enic device.  Here the VF devices are dynamic enics and
the PF device is a "static" enic device.  Only the PF device responds to
ndo_vf_{set|get}_port_profile to set/get the port-profile on a VF.  It's
not possible to set a port-profile on a PF since PFs have an immutable port
assignment on the external switch, established when the PF was provisioned.

The same driver (enic) is used for both PF and VF devices.  The PF
enables N VFs based on a PF configuration parameter assigned
when the PF is provisioned.

While this patch is functionally complete, we (Cisco) need to do more testing
before we can claim full SR-IOV support in Linux, so we ask that this patch
not be applied at this time.  It is provided with this patch set for
illustrative purposes only to show how the port-profile netlink API would
be used for a SR-IOV compliant device that supports port-profiles.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
---
 drivers/net/enic/enic.h      |    5 +-
 drivers/net/enic/enic_main.c |   96 +++++++++++++++++++++++++++++++-----------
 drivers/net/enic/enic_res.c  |    3 +
 drivers/net/enic/vnic_dev.c  |   12 +++--
 drivers/net/enic/vnic_dev.h  |    6 +--
 drivers/net/enic/vnic_enet.h |    1 
 6 files changed, 86 insertions(+), 37 deletions(-)

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 718033f..4d00e5e 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -34,7 +34,7 @@
 
 #define DRV_NAME		"enic"
 #define DRV_DESCRIPTION		"Cisco VIC Ethernet NIC Driver"
-#define DRV_VERSION		"1.3.1.1-pp"
+#define DRV_VERSION		"1.3.1.1-sr-iov"
 #define DRV_COPYRIGHT		"Copyright 2008-2009 Cisco Systems, Inc"
 #define PFX			DRV_NAME ": "
 
@@ -95,7 +95,8 @@ struct enic {
 	u32 port_mtu;
 	u32 rx_coalesce_usecs;
 	u32 tx_coalesce_usecs;
-	struct ifla_vf_port_profile pp;
+	struct ifla_vf_port_profile *pp;
+	unsigned int vf_count;
 
 	/* work queue cache line section */
 	____cacheline_aligned struct vnic_wq wq[ENIC_WQ_MAX];
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 8e5e46b..1488431 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -941,35 +941,36 @@ static void enic_tx_timeout(struct net_device *netdev)
 	schedule_work(&enic->reset);
 }
 
-static int enic_vnic_dev_deinit(struct enic *enic)
+static int enic_vnic_dev_deinit(struct enic *enic, int vf)
 {
 	int err;
 
 	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_deinit(enic->vdev);
+	err = vnic_dev_deinit(enic->vdev, vf);
 	spin_unlock(&enic->devcmd_lock);
 
 	return err;
 }
 
-static int enic_dev_init_prov(struct enic *enic, struct vic_provinfo *vp)
+static int enic_dev_init_prov(struct enic *enic, int vf,
+	struct vic_provinfo *vp)
 {
 	int err;
 
 	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_init_prov(enic->vdev,
+	err = vnic_dev_init_prov(enic->vdev, vf,
 		(u8 *)vp, vic_provinfo_size(vp));
 	spin_unlock(&enic->devcmd_lock);
 
 	return err;
 }
 
-static int enic_dev_init_done(struct enic *enic, int *done, int *error)
+static int enic_dev_init_done(struct enic *enic, int vf, int *done, int *error)
 {
 	int err;
 
 	spin_lock(&enic->devcmd_lock);
-	err = vnic_dev_init_done(enic->vdev, done, error);
+	err = vnic_dev_init_done(enic->vdev, vf, done, error);
 	spin_unlock(&enic->devcmd_lock);
 
 	return err;
@@ -993,23 +994,22 @@ static int enic_set_vf_port_profile(struct net_device *netdev, int vf,
 	struct enic *enic = netdev_priv(netdev);
 	struct vic_provinfo *vp;
 	u8 oui[3] = VIC_PROVINFO_CISCO_OUI;
-	u8 *mac = ivp->mac;
 	int err;
 
-	if (!enic_is_dynamic(enic))
+	if (enic_is_dynamic(enic))
 		return -EOPNOTSUPP;
 
-	memset(&enic->pp, 0, sizeof(enic->pp));
+	if (vf < 0 || vf >= enic->vf_count)
+		return -EOPNOTSUPP;
+
+	memset(&enic->pp[vf], 0, sizeof(enic->pp[vf]));
 
-	enic_vnic_dev_deinit(enic);
+	enic_vnic_dev_deinit(enic, vf);
 
 	if (strlen(ivp->port_profile) == 0)
 		return 0;
 
-	if (is_zero_ether_addr(mac))
-		mac = netdev->dev_addr;
-
-	if (!is_valid_ether_addr(mac))
+	if (!is_valid_ether_addr(ivp->mac))
 		return -EADDRNOTAVAIL;
 
 	vp = vic_provinfo_alloc(GFP_KERNEL, oui, VIC_PROVINFO_LINUX_TYPE);
@@ -1019,7 +1019,7 @@ static int enic_set_vf_port_profile(struct net_device *netdev, int vf,
 	enic_provinfo_add_tlv_str(vp, VIC_LINUX_PROV_TLV_PORT_PROFILE_NAME_STR,
 		IFLA_VF_PORT_PROFILE_MAX, ivp->port_profile);
 	vic_provinfo_add_tlv(vp, VIC_LINUX_PROV_TLV_CLIENT_MAC_ADDR,
-		ETH_ALEN, mac);
+		ETH_ALEN, ivp->mac);
 	enic_provinfo_add_tlv_str(vp, VIC_LINUX_PROV_TLV_HOST_UUID_STR,
 		IFLA_VF_UUID_MAX, ivp->host_uuid);
 	enic_provinfo_add_tlv_str(vp, VIC_LINUX_PROV_TLV_CLIENT_UUID_STR,
@@ -1027,11 +1027,11 @@ static int enic_set_vf_port_profile(struct net_device *netdev, int vf,
 	enic_provinfo_add_tlv_str(vp, VIC_LINUX_PROV_TLV_CLIENT_NAME_STR,
 		IFLA_VF_CLIENT_NAME_MAX, ivp->client_name);
 
-	err = enic_dev_init_prov(enic, vp);
+	err = enic_dev_init_prov(enic, vf, vp);
 	if (err)
 		goto err_out;
 
-	memcpy(&enic->pp, ivp, sizeof(enic->pp));
+	memcpy(&enic->pp[vf], ivp, sizeof(enic->pp[vf]));
 
 err_out:
 	vic_provinfo_free(vp);
@@ -1045,23 +1045,26 @@ static int enic_get_vf_port_profile(struct net_device *netdev, int vf,
 	struct enic *enic = netdev_priv(netdev);
 	int err, error, done;
 
-	if (!enic_is_dynamic(enic))
+	if (enic_is_dynamic(enic))
+		return -EOPNOTSUPP;
+
+	if (vf < 0 || vf >= enic->vf_count)
 		return -EOPNOTSUPP;
 
-	enic->pp.status = IFLA_VF_PORT_PROFILE_STATUS_UNKNOWN;
+	enic->pp[vf].status = IFLA_VF_PORT_PROFILE_STATUS_UNKNOWN;
 
-	err = enic_dev_init_done(enic, &done, &error);
+	err = enic_dev_init_done(enic, vf, &done, &error);
 
 	if (err || error)
-		enic->pp.status = IFLA_VF_PORT_PROFILE_STATUS_ERROR;
+		enic->pp[vf].status = IFLA_VF_PORT_PROFILE_STATUS_ERROR;
 
 	if (!done)
-		enic->pp.status = IFLA_VF_PORT_PROFILE_STATUS_INPROGRESS;
+		enic->pp[vf].status = IFLA_VF_PORT_PROFILE_STATUS_INPROGRESS;
 
 	if (!error)
-		enic->pp.status = IFLA_VF_PORT_PROFILE_STATUS_SUCCESS;
+		enic->pp[vf].status = IFLA_VF_PORT_PROFILE_STATUS_SUCCESS;
 
-	memcpy(ivp, &enic->pp, sizeof(enic->pp));
+	memcpy(ivp, &enic->pp[vf], sizeof(enic->pp[vf]));
 
 	return 0;
 }
@@ -2023,6 +2026,37 @@ err_out_free_vnic_resources:
 	return err;
 }
 
+static int enic_enable_vfs(struct enic *enic)
+{
+	int err;
+
+	enic->vf_count = enic->config.vf_count;
+
+	enic->pp = kzalloc(enic->vf_count  *
+		sizeof(struct ifla_vf_port_profile), GFP_KERNEL);
+	if (!enic->pp)
+		return -ENOMEM;
+
+	if (enic->pdev->is_physfn && enic->vf_count > 0) {
+
+		err = pci_enable_sriov(enic->pdev, enic->vf_count);
+		if (err) {
+			kfree(enic->pp);
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+static void enic_disable_vfs(struct enic *enic)
+{
+	if (enic->pdev->is_physfn && enic->vf_count > 0)
+		pci_disable_sriov(enic->pdev);
+	kfree(enic->pp);
+	enic->vf_count = 0;
+}
+
 static void enic_iounmap(struct enic *enic)
 {
 	unsigned int i;
@@ -2174,6 +2208,13 @@ static int __devinit enic_probe(struct pci_dev *pdev,
 		goto err_out_dev_close;
 	}
 
+	err = enic_enable_vfs(enic);
+	if (err) {
+		printk(KERN_ERR PFX
+			"SR-IOV VF enable failed, aborting.\n");
+		goto err_out_dev_deinit;
+	}
+
 	/* Setup notification timer, HW reset task, and locks
 	 */
 
@@ -2198,7 +2239,7 @@ static int __devinit enic_probe(struct pci_dev *pdev,
 	if (err) {
 		printk(KERN_ERR PFX
 			"Invalid MAC address, aborting.\n");
-		goto err_out_dev_deinit;
+		goto err_out_disable_vfs;
 	}
 
 	enic->tx_coalesce_usecs = enic->config.intr_timer_usec;
@@ -2234,11 +2275,13 @@ static int __devinit enic_probe(struct pci_dev *pdev,
 	if (err) {
 		printk(KERN_ERR PFX
 			"Cannot register net device, aborting.\n");
-		goto err_out_dev_deinit;
+		goto err_out_disable_vfs;
 	}
 
 	return 0;
 
+err_out_disable_vfs:
+	enic_disable_vfs(enic);
 err_out_dev_deinit:
 	enic_dev_deinit(enic);
 err_out_dev_close:
@@ -2267,6 +2310,7 @@ static void __devexit enic_remove(struct pci_dev *pdev)
 
 		flush_scheduled_work();
 		unregister_netdev(netdev);
+		enic_disable_vfs(enic);
 		enic_dev_deinit(enic);
 		vnic_dev_close(enic->vdev);
 		vnic_dev_unregister(enic->vdev);
diff --git a/drivers/net/enic/enic_res.c b/drivers/net/enic/enic_res.c
index 02839bf..dfb37f2 100644
--- a/drivers/net/enic/enic_res.c
+++ b/drivers/net/enic/enic_res.c
@@ -69,6 +69,7 @@ int enic_get_vnic_config(struct enic *enic)
 	GET_CONFIG(intr_timer_type);
 	GET_CONFIG(intr_mode);
 	GET_CONFIG(intr_timer_usec);
+	GET_CONFIG(vf_count);
 
 	c->wq_desc_count =
 		min_t(u32, ENIC_MAX_WQ_DESCS,
@@ -99,6 +100,8 @@ int enic_get_vnic_config(struct enic *enic)
 		c->mtu, ENIC_SETTING(enic, TXCSUM),
 		ENIC_SETTING(enic, RXCSUM), ENIC_SETTING(enic, TSO),
 		ENIC_SETTING(enic, LRO), c->intr_timer_usec);
+	if (c->vf_count)
+		printk(KERN_INFO PFX "vNIC SR-IOV VF count %d\n", c->vf_count);
 
 	return 0;
 }
diff --git a/drivers/net/enic/vnic_dev.c b/drivers/net/enic/vnic_dev.c
index e351b0f..261d5f0 100644
--- a/drivers/net/enic/vnic_dev.c
+++ b/drivers/net/enic/vnic_dev.c
@@ -682,9 +682,9 @@ int vnic_dev_init(struct vnic_dev *vdev, int arg)
 	return r;
 }
 
-int vnic_dev_init_done(struct vnic_dev *vdev, int *done, int *err)
+int vnic_dev_init_done(struct vnic_dev *vdev, u16 vf, int *done, int *err)
 {
-	u64 a0 = 0, a1 = 0;
+	u64 a0 = vf, a1 = 0;
 	int wait = 1000;
 	int ret;
 
@@ -701,9 +701,9 @@ int vnic_dev_init_done(struct vnic_dev *vdev, int *done, int *err)
 	return 0;
 }
 
-int vnic_dev_init_prov(struct vnic_dev *vdev, u8 *buf, u32 len)
+int vnic_dev_init_prov(struct vnic_dev *vdev, u16 vf, u8 *buf, u32 len)
 {
-	u64 a0, a1 = len;
+	u64 a0, a1 = (u64)len | ((u64)vf << 32);
 	int wait = 1000;
 	u64 prov_pa;
 	void *prov_buf;
@@ -724,9 +724,9 @@ int vnic_dev_init_prov(struct vnic_dev *vdev, u8 *buf, u32 len)
 	return ret;
 }
 
-int vnic_dev_deinit(struct vnic_dev *vdev)
+int vnic_dev_deinit(struct vnic_dev *vdev, u16 vf)
 {
-	u64 a0 = 0, a1 = 0;
+	u64 a0 = vf, a1 = 0;
 	int wait = 1000;
 
 	return vnic_dev_cmd(vdev, CMD_DEINIT, &a0, &a1, wait);
diff --git a/drivers/net/enic/vnic_dev.h b/drivers/net/enic/vnic_dev.h
index 27f5a5a..d508187 100644
--- a/drivers/net/enic/vnic_dev.h
+++ b/drivers/net/enic/vnic_dev.h
@@ -124,9 +124,9 @@ int vnic_dev_disable(struct vnic_dev *vdev);
 int vnic_dev_open(struct vnic_dev *vdev, int arg);
 int vnic_dev_open_done(struct vnic_dev *vdev, int *done);
 int vnic_dev_init(struct vnic_dev *vdev, int arg);
-int vnic_dev_init_done(struct vnic_dev *vdev, int *done, int *err);
-int vnic_dev_init_prov(struct vnic_dev *vdev, u8 *buf, u32 len);
-int vnic_dev_deinit(struct vnic_dev *vdev);
+int vnic_dev_init_done(struct vnic_dev *vdev, u16 vf, int *done, int *err);
+int vnic_dev_init_prov(struct vnic_dev *vdev, u16 vf, u8 *buf, u32 len);
+int vnic_dev_deinit(struct vnic_dev *vdev, u16 vf);
 int vnic_dev_soft_reset(struct vnic_dev *vdev, int arg);
 int vnic_dev_soft_reset_done(struct vnic_dev *vdev, int *done);
 void vnic_dev_set_intr_mode(struct vnic_dev *vdev,
diff --git a/drivers/net/enic/vnic_enet.h b/drivers/net/enic/vnic_enet.h
index 8eeb675..466a7b3 100644
--- a/drivers/net/enic/vnic_enet.h
+++ b/drivers/net/enic/vnic_enet.h
@@ -35,6 +35,7 @@ struct vnic_enet_config {
 	u8 intr_mode;
 	char devname[16];
 	u32 intr_timer_usec;
+	u16 vf_count;
 };
 
 #define VENETF_TSO		0x1	/* TSO enabled */

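For readers following the vnic_dev_init_prov() change above, here is a
minimal user-space sketch of how the VF index shares the 64-bit a1
argument with the buffer length.  Only the packing expression is taken
from the patch; the helper names and the decode side are hypothetical
(the real decoding would happen in adapter firmware, which is not shown
in this thread).

```c
#include <assert.h>
#include <stdint.h>

/* Pack as in the patched vnic_dev_init_prov(): length low, VF index high. */
static uint64_t prov_arg_pack(uint32_t len, uint16_t vf)
{
	return (uint64_t)len | ((uint64_t)vf << 32);
}

/* Hypothetical decode helpers, assumed to mirror what the firmware does. */
static uint32_t prov_arg_len(uint64_t a1)
{
	return (uint32_t)(a1 & 0xffffffffu);
}

static uint16_t prov_arg_vf(uint64_t a1)
{
	return (uint16_t)(a1 >> 32);
}

/* Round-trip check: what goes in must come back out. */
static void prov_arg_selfcheck(void)
{
	uint64_t a1 = prov_arg_pack(256, 3);

	assert(prov_arg_len(a1) == 256);
	assert(prov_arg_vf(a1) == 3);
}
```

Because len is a u32, it can never spill into the upper half, so the two
fields do not interfere with each other.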

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support
  2010-05-06  4:42 [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support Scott Feldman
                   ` (2 preceding siblings ...)
  2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 3/3] Add SR-IOV support to enic (please don't apply this patch) Scott Feldman
@ 2010-05-06 13:51 ` Arnd Bergmann
  2010-05-06 16:19   ` Scott Feldman
  3 siblings, 1 reply; 20+ messages in thread
From: Arnd Bergmann @ 2010-05-06 13:51 UTC (permalink / raw)
  To: Scott Feldman; +Cc: davem, netdev, chrisw

On Thursday 06 May 2010, Scott Feldman wrote:
> The intent of this patch set is to cover both definitions of port-profile
> as defined by Cisco's enic use and as defined by VSI Discovery Protocol (VDP),
> used in VEPA implementations.  While both definitions are based on pre-
> standards, the concept of a port-profile to be applied to an external switch
> port on behalf of a virtual machine interface is common, as well as many
> of the fields defining the protocols.

The description either no longer matches the patches, or you did not make the
changes that were needed based on our last discussion.

What happened to the base-device argument that you were planning to pass?

The fields that I mentioned are needed for VDP (associate/pre-associate/disassociate-flag,
VLAN ID, etc) are not there. I assume that means we should use a different
data structure for VDP, but then your description above should be updated
to state that this is no longer common for the two.

I'll follow up with a draft for VDP based on your definitions.

	Arnd

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support
  2010-05-06 13:51 ` [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support Arnd Bergmann
@ 2010-05-06 16:19   ` Scott Feldman
  2010-05-06 16:42     ` Arnd Bergmann
  0 siblings, 1 reply; 20+ messages in thread
From: Scott Feldman @ 2010-05-06 16:19 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: davem, netdev, chrisw

On 5/6/10 6:51 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:

> On Thursday 06 May 2010, Scott Feldman wrote:
>> The intent of this patch set is to cover both definitions of port-profile
>> as defined by Cisco's enic use and as defined by VSI Discovery Protocol (VDP),
>> used in VEPA implementations.  While both definitions are based on pre-
>> standards, the concept of a port-profile to be applied to an external switch
>> port on behalf of a virtual machine interface is common, as well as many
>> of the fields defining the protocols.
> 
> The description either no longer matches the patches, or you did not make
> the changes that were needed based on our last discussion.
> 
> What happened to the base-device argument that you were planning to pass?

Using the IFLA_VF_* model works better for us where the recipient of the
netlink msg is the PF but the msg is to be applied to the VF.  The third
patch illustrates how this fits nicely with SR-IOV devices.  The PF is the
base device.
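To illustrate the model in user-space terms (a sketch only, not code from
the patch set): the request is delivered to the PF, and the vf index
selects which VF it applies to.  The struct, the MAX_VFS bound, and the
-EINVAL case for over-long names are assumptions made for this example;
the range check returning -EOPNOTSUPP and the empty name clearing the
profile follow what patch 3/3 does.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define PROFILE_MAX	40	/* mirrors IFLA_VF_PORT_PROFILE_MAX */
#define MAX_VFS		8	/* arbitrary bound for this sketch */

/* Hypothetical stand-in for the PF's private state. */
struct pf_state {
	unsigned int vf_count;
	char profile[MAX_VFS][PROFILE_MAX];
};

/* The request arrives at the PF; 'vf' selects the VF it applies to. */
static int pf_set_vf_profile(struct pf_state *pf, int vf, const char *name)
{
	assert(pf->vf_count <= MAX_VFS);

	if (vf < 0 || (unsigned int)vf >= pf->vf_count)
		return -EOPNOTSUPP;
	if (strlen(name) >= PROFILE_MAX)
		return -EINVAL;	/* assumption: reject names that do not fit */

	/* An empty name simply leaves the slot cleared. */
	memset(pf->profile[vf], 0, PROFILE_MAX);
	strcpy(pf->profile[vf], name);
	return 0;
}
```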
 
> The fields that I mentioned are needed for VDP
> (associate/pre-associate/disassociate-flag,
> VLAN ID, etc) are not there. I assume that means we should use a different
> data structure for VDP, but then your description above should be updated
> to state that this is no longer common for the two.
> 
> I'll follow up with a draft for VDP based on your definitions.

I tried to accommodate space for VDP, but was hoping you could add the
definitions on top of what I had since you're more familiar with VDP and can
do the testing.

Also, I wasn't sure if you could use the existing IFLA_VF_VLAN msg to apply
the VLAN ID or if you wanted VLAN ID also added to IFLA_VF_PORT_PROFILE.

-scott


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support
  2010-05-06 16:19   ` Scott Feldman
@ 2010-05-06 16:42     ` Arnd Bergmann
  2010-05-08 23:20       ` [PATCH] virtif: initial interface extensions Arnd Bergmann
  0 siblings, 1 reply; 20+ messages in thread
From: Arnd Bergmann @ 2010-05-06 16:42 UTC (permalink / raw)
  To: Scott Feldman; +Cc: davem, netdev, chrisw

On Thursday 06 May 2010, Scott Feldman wrote:
> On 5/6/10 6:51 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:
> 
> > On Thursday 06 May 2010, Scott Feldman wrote:
> >> The intent of this patch set is to cover both definitions of port-profile
> >> as defined by Cisco's enic use and as defined by VSI Discovery Protocol (VDP),
> >> used in VEPA implementations.  While both definitions are based on pre-
> >> standards, the concept of a port-profile to be applied to an external switch
> >> port on behalf of a virtual machine interface is common, as well as many
> >> of the fields defining the protocols.
> > 
> > The description either no longer matches the patches, or you did not make
> > the changes that were needed based on our last discussion.
> > 
> > What happened to the base-device argument that you were planning to pass?
> 
> Using the IFLA_VF_* model works better for us where the recipient of the
> netlink msg is the PF but the msg is to be applied to the VF.  The third
> patch illustrates how this fits nicely with SR-IOV devices.  The PF is the
> base device.

Ah, got it. I did not notice that you had put a vf field in there.
It now makes a lot more sense to me, and is more in line with what
we need for VDP.

It does however make me wonder how this could be implemented for
a software-only implementation of your protocol that does not refer
to vf numbers. One way would be to define the 'vf' field as implementation
specific and just use the ifindex in this case, which would also work
in case of network namespaces. Alternatively, it could use whatever
tag you use in your wire protocol (e.g. an S-VID).

Both are a bit of a stretch, but I see no technical problems with them.

> > The fields that I mentioned are needed for VDP
> > (associate/pre-associate/disassociate-flag,
> > VLAN ID, etc) are not there. I assume that means we should use a different
> > data structure for VDP, but then your description above should be updated
> > to state that this is no longer common for the two.
> > 
> > I'll follow up with a draft for VDP based on your definitions.
> 
> I tried to accommodate space for VDP, but was hoping you could add the
> definitions on top of what I had since you're more familiar with VDP and can
> do the testing.
> 
> Also, I wasn't sure if you could use the existing IFLA_VF_VLAN msg to apply
> the VLAN ID or if you wanted VLAN ID also added to IFLA_VF_PORT_PROFILE.

The IFLA_VF_VLAN would not work well here because of the issue we discussed
before that I think we need to keep device setup separate from the protocol
exchange. IFLA_VF_VLAN configures the VLAN, while we need to tell the switch
about the configuration.

One (new) point that came up today is that your protocol is actually much
more closely related to the 'CDCP' protocol in 802.1Qbg than to 'VDP'.
I'll also try to make sure that we cover this case as well. It should
also be possible to do VDP over a dynamic enic VF and have multiple guests
using macvtap on that function, and there will probably be adapters that
need to use IFLA_VF_PORT_PROFILE (or another set) as the interface between
libvirt and the adapter firmware for doing CDCP.

To give some background, CDCP is an LLDP extension that is used to create
virtual channels between a physical NIC and the phys bridge on the other side,
using S-VLAN tagging. You can either assign one of these channels to a
guest directly (similar to what enic does), or use VDP on the channel
to connect multiple guests using a bridge device or macvtap in the same
way that we also do VDP on the physical device in the absence of CDCP.

	Arnd

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] virtif: initial interface extensions
  2010-05-06 16:42     ` Arnd Bergmann
@ 2010-05-08 23:20       ` Arnd Bergmann
  2010-05-10 15:37         ` Stefan Berger
  0 siblings, 1 reply; 20+ messages in thread
From: Arnd Bergmann @ 2010-05-08 23:20 UTC (permalink / raw)
  To: Scott Feldman; +Cc: davem, netdev, chrisw

Building on the work of Scott Feldman, this extends the netlink interface
to deal with not only port profiles but also CDCP multichannel and VDP
VSI registration.

The protocols are split apart into separate netlink attributes for
a cleaner separation. A device can have multiple IFLA_VIRTIF attributes,
each pointing to one of the slaves that get registered using one of
the protocols. In case of VDP, each IFLA_VIRTIF attribute needs
both an IFLA_VIRTIF_VSI and one or more IFLA_VIRTIF_VSI_MAC_VLAN
attributes.
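As a rough worked example of the VDP case (a user-space sketch, not part
of the patch), the following computes how much space one nested
IFLA_VIRTIF attribute would take when it carries one VSI attribute plus
n MAC/VLAN attributes.  The payload layouts are copied from the
if_link.h hunk below; the 4-byte attribute header and 4-byte padding are
the usual netlink attribute rules.  The sketch_* names are made up for
this example, and it assumes the attributes really are nested under
IFLA_VIRTIF as described.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Payload layouts copied from the patched if_link.h (byte-sized fields). */
struct sketch_vsi {
	uint8_t mode_request;
	uint8_t mode_response;
	uint8_t vsi_mgr_id;
	uint8_t vsi_type_id[3];
	uint8_t vsi_type_version;
	uint8_t vsi_instance[16];
};

struct sketch_vsi_mac_vlan {
	uint8_t mac[6];
	uint8_t vlan_id[2];
};

#define SKETCH_NLA_HDRLEN 4	/* struct nlattr: 16-bit length + 16-bit type */

/* Header plus payload, padded up to a 4-byte boundary. */
static size_t sketch_nla_total(size_t payload)
{
	return (SKETCH_NLA_HDRLEN + payload + 3) & ~(size_t)3;
}

/* One nest header, one VSI attribute, n_mac_vlan MAC/VLAN attributes. */
static size_t sketch_virtif_vdp_size(unsigned int n_mac_vlan)
{
	assert(n_mac_vlan >= 1);	/* the description asks for one or more */

	return SKETCH_NLA_HDRLEN
		+ sketch_nla_total(sizeof(struct sketch_vsi))
		+ n_mac_vlan * sketch_nla_total(sizeof(struct sketch_vsi_mac_vlan));
}
```

With these layouts a single MAC/VLAN pair comes to 44 bytes, and each
additional pair adds 12.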

The VF number is split out because it only makes sense when the
implementation is in the device driver or firmware, but not for
software-only implementations like we do for the initial VDP code
using LLDPAD. If an IFLA_VIRTIF_VF attribute is given, the kernel
takes care of the association through the device driver for that
VF, otherwise we do it in user space. This should work for each
of the three protocols.

This code is a first rough prototype, completely untested and
very likely buggy. This is also the first time I'm trying
to deal with netlink in the kernel, so chances are that I've
misunderstood something in a major way.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/if_link.h   |   81 ++++++++++++++++++++++++++++------
 include/linux/netdevice.h |    8 ++--
 net/core/rtnetlink.c      |  106 ++++++++++++++++++++++++++++++++++-----------
 3 files changed, 152 insertions(+), 43 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index d763358..675d190 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -116,7 +116,7 @@ enum {
 	IFLA_VF_TX_RATE,	/* TX Bandwidth Allocation */
 	IFLA_VFINFO,
 	IFLA_STATS64,
-	IFLA_VF_PORT_PROFILE,
+	IFLA_VIRTIF,
 	__IFLA_MAX
 };
 
@@ -262,26 +262,79 @@ struct ifla_vf_info {
 };
 
 enum {
-	IFLA_VF_PORT_PROFILE_STATUS_UNKNOWN,
-	IFLA_VF_PORT_PROFILE_STATUS_SUCCESS,
-	IFLA_VF_PORT_PROFILE_STATUS_INPROGRESS,
-	IFLA_VF_PORT_PROFILE_STATUS_ERROR,
+	IFLA_VIRTIF_UNSPEC,
+	IFLA_VIRTIF_VF,
+	IFLA_VIRTIF_PORT_PROFILE, /* Cisco enic */
+	IFLA_VIRTIF_CHANNEL,	  /* 802.1Qbg CDCP */
+	IFLA_VIRTIF_VSI,	  /* 802.1Qbg VDP */
+	IFLA_VIRTIF_VSI_MAC_VLAN,
+	__IFLA_VIRTIF_MAX,
 };
 
-#define IFLA_VF_PORT_PROFILE_MAX	40
-#define IFLA_VF_UUID_MAX		40
-#define IFLA_VF_CLIENT_NAME_MAX		40
+#define IFLA_VIRTIF_MAX (__IFLA_VIRTIF_MAX - 1)
 
-struct ifla_vf_port_profile {
-	__u32 vf;
+enum {
+	VIRTIF_PORT_PROFILE_STATUS_UNKNOWN,
+	VIRTIF_PORT_PROFILE_STATUS_SUCCESS,
+	VIRTIF_PORT_PROFILE_STATUS_INPROGRESS,
+	VIRTIF_PORT_PROFILE_STATUS_ERROR,
+};
+
+#define VIRTIF_PORT_PROFILE_MAX	40
+#define VIRTIF_UUID_MAX		40
+#define VIRTIF_CLIENT_NAME_MAX	40
+
+struct ifla_virtif_port_profile {
 	__u32 flags;
 	__u32 status;
-	__u8 port_profile[IFLA_VF_PORT_PROFILE_MAX];
+	__u8 port_profile[VIRTIF_PORT_PROFILE_MAX];
 	__u8 mac[32];					/* MAX_ADDR_LEN */
 	/* UUID e.g. "CEEFD3B1-9E11-11DE-BDFD-000BAB01C0FB" */
-	__u8 host_uuid[IFLA_VF_UUID_MAX];
-	__u8 client_uuid[IFLA_VF_UUID_MAX];
-	__u8 client_name[IFLA_VF_CLIENT_NAME_MAX];	/* e.g. "vm0-eth1" */
+	__u8 host_uuid[VIRTIF_UUID_MAX];
+	__u8 client_uuid[VIRTIF_UUID_MAX];
+	__u8 client_name[VIRTIF_CLIENT_NAME_MAX];	/* e.g. "vm0-eth1" */
+};
+
+/*
+ * CDCP and VDP come from the 802.1Qbg standard, these definitions are taken
+ * from the respective TLV definitions in there. Adapters implementing CDCP
+ * or VDP can encapsulate the structures in LLDP/ECP headers and send them
+ * to the switch.
+ */
+struct ifla_virtif_channel {
+	__u16 scid;	/* S-Channel ID */
+	__u16 svid;	/* S-VLAN ID */
+};
+
+enum {
+	VIRTIF_VDP_REQUEST_PREASSOCIATE = 0,
+	VIRTIF_VDP_REQUEST_PREASSOCIATE_RR,
+	VIRTIF_VDP_REQUEST_ASSOCIATE,
+	VIRTIF_VDP_REQUEST_DISASSOCIATE,
+};
+
+enum {
+	VIRTIF_VDP_RESPONSE_SUCCESS = 0,
+	VIRTIF_VDP_RESPONSE_INVALID_FORMAT,
+	VIRTIF_VDP_RESPONSE_INSUFFICIENT_RESOURCES,
+	VIRTIF_VDP_RESPONSE_UNUSED_VTID,
+	VIRTIF_VDP_RESPONSE_VTID_VIOLATION,
+	VIRTIF_VDP_RESPONSE_VTID_VERSION_VIOLATION,
+	VIRTIF_VDP_RESPONSE_OUT_OF_SYNC,
+};
+
+struct ifla_virtif_vsi {
+	__u8 mode_request;
+	__u8 mode_response;
+	__u8 vsi_mgr_id;	/* these three define the policy */
+	__u8 vsi_type_id[3]; 	/*   for the guest */
+	__u8 vsi_type_version;
+	__u8 vsi_instance[16];	/* identifies the guest */
+};
+
+struct ifla_virtif_vsi_mac_vlan {
+	__u8 mac[6];
+	__u8 vlan_id[2];
 };
 
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f5b0be5..d549a3d 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -687,9 +687,9 @@ struct netdev_rx_queue {
  * int (*ndo_get_vf_config)(struct net_device *dev,
  *			    int vf, struct ifla_vf_info *ivf);
  * int (*ndo_set_vf_port_profile)(struct net_device *dev, int vf,
- *				  struct ifla_vf_port_profile *ivp);
+ *				  struct ifla_virtif_port_profile *ivp);
  * int (*ndo_get_vf_port_profile)(struct net_device *dev, int vf,
- *				  struct ifla_vf_port_profile *ivp);
+ *				  struct ifla_virtif_port_profile *ivp);
  */
 #define HAVE_NET_DEVICE_OPS
 struct net_device_ops {
@@ -741,10 +741,10 @@ struct net_device_ops {
 						     struct ifla_vf_info *ivf);
 	int			(*ndo_set_vf_port_profile)(
 					struct net_device *dev, int vf,
-					struct ifla_vf_port_profile *ivp);
+					struct ifla_virtif_port_profile *ivp);
 	int			(*ndo_get_vf_port_profile)(
 					struct net_device *dev, int vf,
-					struct ifla_vf_port_profile *ivp);
+					struct ifla_virtif_port_profile *ivp);
 #if defined(CONFIG_FCOE) || defined(CONFIG_FCOE_MODULE)
 	int			(*ndo_fcoe_enable)(struct net_device *dev);
 	int			(*ndo_fcoe_disable)(struct net_device *dev);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index d2ef45b..cf5e328 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -653,6 +653,22 @@ static inline int rtnl_vfinfo_size(const struct net_device *dev)
 		return 0;
 }
 
+static size_t rtnl_virtif_get_size(const struct net_device *dev)
+{
+	size_t total = 0;
+
+	if (dev->netdev_ops->ndo_get_vf_port_profile) {
+		total = dev_num_vf(dev->dev.parent) *
+			(nla_total_size(sizeof(struct ifla_virtif_port_profile))
+			 + nla_total_size(4)); /* IFLA_VIRTIF_VF */
+	}
+
+	/* fill in IFLA_VIRTIF_CHANNEL and IFLA_VIRTIF_VSI when needed */
+
+	return total;
+}
+
+
 static inline size_t if_nlmsg_size(const struct net_device *dev)
 {
 	return NLMSG_ALIGN(sizeof(struct ifinfomsg))
@@ -674,6 +690,38 @@ static inline size_t if_nlmsg_size(const struct net_device *dev)
 	       + nla_total_size(4) /* IFLA_NUM_VF */
 	       + nla_total_size(rtnl_vfinfo_size(dev)) /* IFLA_VFINFO */
-	       + rtnl_link_get_size(dev); /* IFLA_LINKINFO */
+	       + rtnl_link_get_size(dev) /* IFLA_LINKINFO */
+	       + rtnl_virtif_get_size(dev); /* IFLA_VIRTIF */
+}
+
+static int rtnl_virtif_fill_ifinfo(struct sk_buff *skb, struct net_device *dev)
+{
+	if (dev->netdev_ops->ndo_get_vf_port_profile && dev->dev.parent) {
+		struct ifla_virtif_port_profile ivp;
+
+		if (dev_num_vf(dev->dev.parent)) {
+			int i;
+
+			for (i = 0; i < dev_num_vf(dev->dev.parent); i++) {
+				if (dev->netdev_ops->ndo_get_vf_port_profile(
+					dev, i, &ivp))
+					break;
+				NLA_PUT_U32(skb, IFLA_VIRTIF_VF, i);
+				NLA_PUT(skb, IFLA_VIRTIF_PORT_PROFILE,
+					sizeof(ivp), &ivp);
+			}
+		}
+#if 0 /* Not sure how we should do this -arnd */
+		 else if (!dev->netdev_ops->ndo_get_vf_port_profile(dev,
+			0, &ivp)) {
+			NLA_PUT(skb, IFLA_VIRTIF_PORT_PROFILE,
+				sizeof(ivp), &ivp);
+		}
+#endif
+	}
+	return 0;
+
+nla_put_failure:
+	return -EMSGSIZE;
 }
 
 static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
@@ -761,25 +809,8 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 		}
 	}
 
-	if (dev->netdev_ops->ndo_get_vf_port_profile && dev->dev.parent) {
-		struct ifla_vf_port_profile ivp;
-
-		if (dev_num_vf(dev->dev.parent)) {
-			int i;
-
-			for (i = 0; i < dev_num_vf(dev->dev.parent); i++) {
-				if (dev->netdev_ops->ndo_get_vf_port_profile(
-					dev, i, &ivp))
-					break;
-				NLA_PUT(skb, IFLA_VF_PORT_PROFILE,
-					sizeof(ivp), &ivp);
-			}
-		} else if (!dev->netdev_ops->ndo_get_vf_port_profile(dev,
-			0, &ivp)) {
-			NLA_PUT(skb, IFLA_VF_PORT_PROFILE,
-				sizeof(ivp), &ivp);
-		}
-	}
+	if (rtnl_virtif_fill_ifinfo(skb, dev))
+		goto nla_put_failure;
 
 	if (dev->rtnl_link_ops) {
 		if (rtnl_link_fill(skb, dev) < 0)
@@ -847,8 +878,7 @@ const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 				    .len = sizeof(struct ifla_vf_vlan) },
 	[IFLA_VF_TX_RATE]	= { .type = NLA_BINARY,
 				    .len = sizeof(struct ifla_vf_tx_rate) },
-	[IFLA_VF_PORT_PROFILE]	= { .type = NLA_BINARY,
-				    .len = sizeof(struct ifla_vf_port_profile)},
+	[IFLA_VIRTIF]		= { .type = NLA_NESTED },
 };
 EXPORT_SYMBOL(ifla_policy);
 
@@ -857,6 +887,18 @@ static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
 	[IFLA_INFO_DATA]	= { .type = NLA_NESTED },
 };
 
+static const struct nla_policy ifla_virtif_policy[IFLA_VIRTIF_MAX+1] = {
+	[IFLA_VIRTIF_VF]	   = { .type = NLA_U32 },
+	[IFLA_VIRTIF_PORT_PROFILE] = { .type = NLA_BINARY,
+				       .len = sizeof(struct ifla_virtif_port_profile) },
+	[IFLA_VIRTIF_CHANNEL]	   = { .type = NLA_BINARY,
+				       .len = sizeof(struct ifla_virtif_channel) },
+	[IFLA_VIRTIF_VSI]	   = { .type = NLA_BINARY,
+				       .len = sizeof(struct ifla_virtif_vsi) },
+	[IFLA_VIRTIF_VSI_MAC_VLAN] = { .type = NLA_BINARY,
+				       .len = sizeof(struct ifla_virtif_vsi_mac_vlan) },
+};
+
 struct net *rtnl_link_get_net(struct net *src_net, struct nlattr *tb[])
 {
 	struct net *net;
@@ -1053,16 +1095,30 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
 	}
 	err = 0;
 
-	if (tb[IFLA_VF_PORT_PROFILE]) {
-		struct ifla_vf_port_profile *ivp;
-		ivp = nla_data(tb[IFLA_VF_PORT_PROFILE]);
+	if (tb[IFLA_VIRTIF]) {
+		struct ifla_virtif_port_profile *ivp;
+		struct nlattr *virtif[IFLA_VIRTIF_MAX+1];
+		u32 vf;
+
+		err = nla_parse_nested(virtif, IFLA_VIRTIF_MAX,
+				       tb[IFLA_VIRTIF], ifla_virtif_policy);
+		if (err < 0)
+			return err;
+
+		if (!virtif[IFLA_VIRTIF_VF] || !virtif[IFLA_VIRTIF_PORT_PROFILE])
+			goto novirtif; /* IFLA_VIRTIF may be directed at user space */
+
+		vf = nla_get_u32(virtif[IFLA_VIRTIF_VF]);
+		ivp = nla_data(virtif[IFLA_VIRTIF_PORT_PROFILE]);
+
 		err = -EOPNOTSUPP;
 		if (ops->ndo_set_vf_port_profile)
-			err = ops->ndo_set_vf_port_profile(dev, ivp->vf, ivp);
+			err = ops->ndo_set_vf_port_profile(dev, vf, ivp);
 		if (err < 0)
 			goto errout;
 		modified = 1;
 	}
+novirtif:
 	err = 0;
 
 errout:
-- 
1.6.3.3

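As a reading aid for the do_setlink() hunk above, here is a small
user-space sketch of the routing decision it makes: the driver op is
only called when both the VF number and the port-profile payload are
present in the nested attribute, and anything else is left for user
space (e.g. LLDPAD).  The enum and function names are invented for this
example; only the attribute order and the "both present" condition come
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Attribute numbers in the same order as the patched if_link.h enum. */
enum {
	SK_VIRTIF_UNSPEC,
	SK_VIRTIF_VF,
	SK_VIRTIF_PORT_PROFILE,
	SK_VIRTIF_CHANNEL,
	SK_VIRTIF_VSI,
	SK_VIRTIF_VSI_MAC_VLAN,
	SK_VIRTIF_COUNT,
};

enum sk_virtif_target {
	SK_VIRTIF_TO_DRIVER,	/* kernel calls ndo_set_vf_port_profile */
	SK_VIRTIF_TO_USERSPACE,	/* kernel does nothing; a daemon may act */
};

/* present[i] is true when attribute i was found inside IFLA_VIRTIF. */
static enum sk_virtif_target sk_virtif_route(const bool *present)
{
	assert(present != NULL);

	if (present[SK_VIRTIF_VF] && present[SK_VIRTIF_PORT_PROFILE])
		return SK_VIRTIF_TO_DRIVER;
	return SK_VIRTIF_TO_USERSPACE;
}
```

Note that with this rule a message carrying a port profile but no VF
number is silently skipped by the kernel rather than rejected.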

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] virtif: initial interface extensions
  2010-05-08 23:20       ` [PATCH] virtif: initial interface extensions Arnd Bergmann
@ 2010-05-10 15:37         ` Stefan Berger
  2010-05-10 18:56           ` Scott Feldman
  0 siblings, 1 reply; 20+ messages in thread
From: Stefan Berger @ 2010-05-10 15:37 UTC (permalink / raw)
  To: netdev

Arnd Bergmann <arnd <at> arndb.de> writes:

[...]

> +	if (tb[IFLA_VIRTIF]) {
> +		struct ifla_virtif_port_profile *ivp;
> +		struct nlattr *virtif[IFLA_VIRTIF_MAX+1];
> +		u32 vf;
> +
> +		err = nla_parse_nested(virtif, IFLA_VIRTIF_MAX,
> +				       tb[IFLA_VIRTIF], ifla_virtif_policy);
> +		if (err < 0)
> +			return err;
> +
> +		if (!virtif[IFLA_VIRTIF_VF] || !virtif[IFLA_VIRTIF_PORT_PROFILE])
> +			goto novirtif; /* IFLA_VIRTIF may be directed at user space */

In what case would the IFLA_VIRTIF_PORT_PROFILE be provided? Would libvirt for
example need to be aware of whether the Ethernet device can handle the setup
protocol via its firmware and in this case provide the port profile parameter
and in other cases provide other parameters? Certainly the user or upper layer
management software would have to know it when creating the domain XML and in
fact different types of parameters were needed. Obviously we should have one
common set of (XML) parameters that go into the netlink message and that can be
handled by the kernel driver if the firmware knows how to handle it or by
LLDPAD. Libvirt would send the parameters via netlink message to trigger the
setup protocol and the message may be received by kernel and LLDPAD. From what I
can see LLDPAD also may need a way to probe the kernel driver whether it handled
the setup protocol via firmware on a given interface, which may or may not be
true for all interfaces, but may be necessary to avoid triggering the setup
protocol twice.

   Stefan


* Re: [PATCH] virtif: initial interface extensions
  2010-05-10 15:37         ` Stefan Berger
@ 2010-05-10 18:56           ` Scott Feldman
  2010-05-10 21:46             ` Arnd Bergmann
  0 siblings, 1 reply; 20+ messages in thread
From: Scott Feldman @ 2010-05-10 18:56 UTC (permalink / raw)
  To: Stefan Berger, netdev

On 5/10/10 8:37 AM, "Stefan Berger" <stefanb@us.ibm.com> wrote:

> Arnd Bergmann <arnd <at> arndb.de> writes:
> 
> [...]
> 
>> + if (tb[IFLA_VIRTIF]) {
>> +  struct ifla_virtif_port_profile *ivp;
>> +  struct nlattr *virtif[IFLA_VIRTIF_MAX+1];
>> +  u32 vf;
>> +
>> +  err = nla_parse_nested(virtif, IFLA_VIRTIF_MAX,
>> +           tb[IFLA_VIRTIF], ifla_virtif_policy);
>> +  if (err < 0)
>> +   return err;
>> +
>> +  if (!virtif[IFLA_VIRTIF_VF] || !virtif[IFLA_VIRTIF_PORT_PROFILE])
>> +   goto novirtif; /* IFLA_VIRTIF may be directed at user space */
> 
> 
> In what case would the IFLA_VIRTIF_PORT_PROFILE be provided? Would libvirt for
> example need to be aware of whether the Ethernet device can handle the setup
> protocol via its firmware and in this case provide the port profile parameter
> and in other cases provide other parameters? Certainly the user or upper layer
> management software would have to know it when creating the domain XML and in
> fact different types of parameters were needed.

> Obviously we should have one
> common set of (XML) parameters that go into the netlink message and that can
> be handled by the kernel driver if the firmware knows how to handle it or by
> LLDPAD. 

With Arnd's latest additions, we have a single netlink msg, but the
parameter sets are disjoint between VDP/CDCP and what we need for the kernel
driver.  So that means the sender (libvirt in this case) needs to know about
both setups to send a single netlink msg.  An alternative is to have two
netlink msgs, one for each setup.  That still requires the sender to know
about two setups.  

> Libvirt would send the parameters via netlink message to trigger the
> setup protocol and the message may be received by kernel and LLDPAD.
 
That was the original idea behind having libvirt send the netlink msg using
multicast.



* Re: [PATCH] virtif: initial interface extensions
  2010-05-10 18:56           ` Scott Feldman
@ 2010-05-10 21:46             ` Arnd Bergmann
  2010-05-10 23:51               ` Stefan Berger
                                 ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Arnd Bergmann @ 2010-05-10 21:46 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Stefan Berger, netdev, Arnd Bergmann

On Monday 10 May 2010 20:56:39 Scott Feldman wrote:
> On 5/10/10 8:37 AM, "Stefan Berger" <stefanb@us.ibm.com> wrote:
> > In what case would the IFLA_VIRTIF_PORT_PROFILE be provided? Would libvirt for
> > example need to be aware of whether the Ethernet device can handle the setup
> > protocol via its firmware and in this case provide the port profile parameter
> > and in other cases provide other parameters? Certainly the user or upper layer
> > management software would have to know it when creating the domain XML and in
> > fact different types of parameters were needed.
> 
> > Obviously we should have one
> > common set of (XML) parameters that go into the netlink message and that can
> > be handled by the kernel driver if the firmware knows how to handle it or by
> > LLDPAD. 
> 
> With Arnd's latest additions, we have a single netlink msg, but the
> parameter sets are disjoint between VDP/CDCP and what we need for the kernel
> driver.  So that means the sender (libvirt in this case) needs to know about
> both setups to send a single netlink msg.  An alternative is to have two
> netlink msgs, one for each setup.  That still requires the sender to know
> about two setups.

There are two separate issues here. The first one is whether we're doing
the association in the device driver or in user space. The assumption here
is that if it's in the device driver, there will be a VF number to identify
the channel, while in user space that is not needed.
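As a minimal sketch of that assumption (not code from the patch): the
check in the quoted hunk hands the request to the driver only when both a
VF number and a port profile are present, and otherwise leaves it for user
space.  The struct, enum and function names below are hypothetical
stand-ins for the parsed IFLA_VIRTIF attributes.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a parsed IFLA_VIRTIF request. */
struct virtif_request {
	bool has_vf;            /* IFLA_VIRTIF_VF attribute was present */
	bool has_port_profile;  /* IFLA_VIRTIF_PORT_PROFILE attribute was present */
	unsigned int vf;        /* VF number, only meaningful if has_vf is set */
};

enum virtif_handler {
	VIRTIF_HANDLER_DRIVER,      /* association done by driver/firmware */
	VIRTIF_HANDLER_USER_SPACE,  /* left for a daemon such as LLDPAD */
};

/*
 * Mirrors the "goto novirtif" test in the quoted patch: without both
 * attributes the kernel does nothing and the message may still be
 * consumed in user space.
 */
enum virtif_handler virtif_pick_handler(const struct virtif_request *req)
{
	if (!req->has_vf || !req->has_port_profile)
		return VIRTIF_HANDLER_USER_SPACE;
	return VIRTIF_HANDLER_DRIVER;
}
```

A request carrying only VDP parameters, with no VF number, would fall
through to the user-space case under this sketch.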

The other question is which protocol we're using. There are as far as I
can tell five options:

1. enic device driver
2. VDP
3. CDCP
4. CDCP + VDP
5. enic + VDP

The first two are the most interesting for now, since Linux cannot do
S-VLANs yet, and they are required for CDCP. However, each of these options
could theoretically be done in the kernel (plus firmware) or in user space.

If it's done in user space, the VF number is meaningless, because the setup
of the software device is also done from software, but instead you need to
take care of creating the software device with the correct parameters, e.g.
a macvtap device connected to a VLAN interface using the numbers you pass
in the VDP protocol.

Right now, we're not planning to do the protocol that enic uses in LLDPAD,
because it's not publicly released. Similarly, there are no adapters that
do VDP in firmware, but both these cases should be covered by the protocol
and it would be good if libvirt could handle them.

Stefan, can you just define the XML in a way that matches the netlink
definition? What you need is something like

1. VF number (optional, signifies that 2/3 are done in firmware)
2. Lower-level protocol
  2.1. CDCP
     2.1.1 SVID
     2.1.2 SCID
  2.2. enic
     2.2.1 port profile name
     2.2.2 ...
3. VDP
  3.1 VSI type/version/provider
  3.2 UUID
  3.3 MAC/VLAN

You need to have 2. or 3. or both, and 2.1/2.2 are mutually exclusive.
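The combination rules above can be sketched as a small validity check.
This is only an illustration under stated assumptions: the flag struct and
function names are hypothetical and do not appear in any patch in this
thread.

```c
#include <stdbool.h>

/* Hypothetical flags saying which parts of the parameter list are filled in. */
struct virtif_params {
	bool has_vf;    /* 1.  VF number (optional) */
	bool has_cdcp;  /* 2.1 CDCP: SVID/SCID */
	bool has_enic;  /* 2.2 enic: port profile name */
	bool has_vdp;   /* 3.  VDP: VSI type/version/provider, UUID, MAC/VLAN */
};

/* "2. or 3. or both", with 2.1 and 2.2 mutually exclusive. */
bool virtif_params_valid(const struct virtif_params *p)
{
	if (p->has_cdcp && p->has_enic)
		return false;
	return p->has_cdcp || p->has_enic || p->has_vdp;
}

/* A VF number signifies that 2/3 are carried out in firmware. */
bool virtif_params_in_firmware(const struct virtif_params *p)
{
	return p->has_vf;
}
```

With these rules, "enic + VDP" and "CDCP + VDP" pass the check, while a
request naming both CDCP and enic, or naming none of 2.1, 2.2 and 3, is
rejected.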

	Arnd


* Re: [PATCH] virtif: initial interface extensions
  2010-05-10 21:46             ` Arnd Bergmann
@ 2010-05-10 23:51               ` Stefan Berger
  2010-05-11  0:25               ` Scott Feldman
       [not found]               ` <OFFE8F5F70.5C07C656-ON8525771F.00787A71-8525771F.007FCDFC@us.ibm.com>
  2 siblings, 0 replies; 20+ messages in thread
From: Stefan Berger @ 2010-05-10 23:51 UTC (permalink / raw)
  To: netdev

Arnd Bergmann <arnd@arndb.de> wrote on 05/10/2010 05:46:37 PM:


> The other question is which protocol we're using. There are as far as I
> can tell five options:
> 
> 1. enic device driver
> 2. VDP
> 3. CDCP
> 4. CDCP + VDP
> 5. enic + VDP
> 
> The first two are the most interesting for now, since Linux cannot do
> S-VLANs yet, and they are required for CDCP. However, each of these options
> could theoretically be done in the kernel (plus firmware) or in user space.
> 
> If it's done in user space, the VF number is meaningless, because the setup
> of the software device is also done from software, but instead you need to
> take care of creating the software device with the correct parameters, e.g.
> a macvtap device connected to a VLAN interface using the numbers you pass
> in the VDP protocol.
> 
> Right now, we're not planning to do the protocol that enic uses in LLDPAD,
> because it's not publicly released. Similarly, there are no adapters that
> do VDP in firmware, but both these cases should be covered by the protocol
> and it would be good if libvirt could handle them.
> 
> Stefan, can you just define the XML in a way that matches the netlink
> definition? What you need is something like
> 
> 1. VF number (optional, signifies that 2/3 are done in firmware) 

Shouldn't we be able to query that number via netlink, starting with the
macvtap device and then following the trail to the root, trying to find
a VF number on the way?

> 2. Lower-level protocol
>   2.1. CDCP
>      2.1.1 SVID
>      2.1.2 SCID 

Will these be queryable via netlink as well later on, just not today?
Vivek tells me svid is vlan, so that could be found out from the kernel. 

So if we want to only support 1 and 2 for now, I'd rather skip them for now. 

>   2.2. enic
>      2.2.1 port profile name 

as proposed on libvirt mailing list 

>      2.2.2 ...
> 3. VDP
>   3.1 VSI type/version/provider 

as proposed on libvirt mailing list 

>   3.2 UUID 

we have a couple of UUIDs, which one? 


>   3.3 MAC/VLAN 

MAC: available from libvirt 
VLAN: can be found by querying every interface for its VLAN ID while
following the path towards the root device.


    Stefan 



* Re: [PATCH] virtif: initial interface extensions
  2010-05-10 21:46             ` Arnd Bergmann
  2010-05-10 23:51               ` Stefan Berger
@ 2010-05-11  0:25               ` Scott Feldman
  2010-05-11 12:59                 ` Arnd Bergmann
  2010-05-11 17:15                 ` Vivek Kashyap
       [not found]               ` <OFFE8F5F70.5C07C656-ON8525771F.00787A71-8525771F.007FCDFC@us.ibm.com>
  2 siblings, 2 replies; 20+ messages in thread
From: Scott Feldman @ 2010-05-11  0:25 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Stefan Berger, netdev

On 5/10/10 2:46 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:

> On Monday 10 May 2010 20:56:39 Scott Feldman wrote:
>> With Arnd's latest additions, we have a single netlink msg, but the
>> parameter sets are disjoint between VDP/CDCP and what we need for the kernel
>> driver.  So that means the sender (libvirt in this case) needs to know about
>> both setups to send a single netlink msg.  An alternative is to have two
>> netlink msgs, one for each setup.  That still requires the sender to know
>> about two setups.
> 
> There are two separate issues here. The first one is whether we're doing
> the association in the device driver or in user space. The assumption here
> is that if it's in the device driver, there will be a VF number to identify
> the channel, while in user space that is not needed.
> 
> The other question is which protocol we're using. There are as far as I
> can tell five options:
> 
> 1. enic device driver
> 2. VDP
> 3. CDCP
> 4. CDCP + VDP
> 5. enic + VDP
> 
> The first two are the most interesting for now, since Linux cannot do
> S-VLANs yet, and they are required for CDCP. However, each of these options
> could theoretically be done in the kernel (plus firmware) or in user space.
> 
> If it's done in user space, the VF number is meaningless, because the setup
> of the software device is also done from software, but instead you need to
> take care of creating the software device with the correct parameters, e.g.
> a macvtap device connected to a VLAN interface using the numbers you pass
> in the VDP protocol.
> 
> Right now, we're not planning to do the protocol that enic uses in LLDPAD,
> because it's not publicly released. Similarly, there are no adapters that
> do VDP in firmware, but both these cases should be covered by the protocol
> and it would be good if libvirt could handle them.
> 
> Stefan, can you just define the XML in a way that matches the netlink
> definition? What you need is something like
> 
> 1. VF number (optional, signifies that 2/3 are done in firmware)
> 2. Lower-level protocol
>   2.1. CDCP
>      2.1.1 SVID
>      2.1.2 SCID
>   2.2. enic
>      2.2.1 port profile name
>      2.2.2 ...
> 3. VDP
>   3.1 VSI type/version/provider
>   3.2 UUID
>   3.3 MAC/VLAN
> 
> You need to have 2. or 3. or both, and 2.1/2.2 are mutually exclusive.

I don't think this is going in the right direction.  We're taking a
pretty simple concept, a port-profile used to configure the virtual port
backing a VM i/f, and turning it into something that's trying to munge
disjoint protocols based on pre-standard work into a single API.  It's
forcing all those
pre-standard protocol details into the API, into the XML, and into the mgmt
software (libvirt), and into the admin's lap.

I want the API to pass a port-profile name plus other information associated
with the VM i/f to some mgmt object which can set up the virtual port backing
the VM i/f (and unset it).  Using netlink for the API lets that object be in
user- or kernel-space software, hardware or firmware.  How the object sets
up the virtual port based on the passed port-profile is beyond the scope of
the API.

My last port-profile patch is this API.  It gives us this:

    1) single netlink msg for kernel- and user-space
    2) single parameter set from sender's perspective (libvirt)
    3) single XML representation of parameters
    4) single code path in kernel and libvirt
    5) (potential) cross-vendor-switch VM migration
    6) admin-friendly port-profile names
    7) allows pre-standard (802.1Qbg/bh) details to change
       without bogging down the API

What I proposed on the libvirt list is to maintain a mapping database from
port-profile to vendor-specific or protocol-specific parameters.  Using VDP
as an example, a port-profile would resolve to the VDP tuple:

     port-profile: "joes-garage"   --->  VSI Manager ID: 15
                                         VSI Type ID: 12345
                                         VSI Type ID Ver: 1
                                         other VSI settings (preassociate)

How the mapping database is maintained is beyond the scope of the API.

The port-profile string is the unifying concept.  This is the common ground
and the only way to be protocol-independent in the API.
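A minimal sketch of such a lookup, assuming the mapping database is
nothing more than a static table: the struct and function names are
hypothetical, and the single entry reuses the "joes-garage" example above.
How a real database would be stored and maintained is left open, as stated.

```c
#include <stddef.h>
#include <string.h>

/* VDP tuple a port-profile name resolves to (field names are illustrative). */
struct vdp_tuple {
	unsigned int vsi_manager_id;
	unsigned int vsi_type_id;
	unsigned int vsi_type_id_version;
};

/* One row of the hypothetical mapping database. */
struct profile_mapping {
	const char *port_profile;
	struct vdp_tuple vdp;
};

static const struct profile_mapping profile_db[] = {
	{ "joes-garage", { 15, 12345, 1 } },
};

/* Returns the VDP tuple for a port-profile name, or NULL if it is unknown. */
const struct vdp_tuple *profile_lookup(const char *port_profile)
{
	size_t i;

	for (i = 0; i < sizeof(profile_db) / sizeof(profile_db[0]); i++) {
		if (strcmp(profile_db[i].port_profile, port_profile) == 0)
			return &profile_db[i].vdp;
	}
	return NULL;
}
```

The sender only ever handles the string; whoever receives the netlink msg
does the lookup and speaks the protocol-specific details.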

-scott



* Re: [PATCH] virtif: initial interface extensions
  2010-05-11  0:25               ` Scott Feldman
@ 2010-05-11 12:59                 ` Arnd Bergmann
  2010-05-11 17:15                 ` Vivek Kashyap
  1 sibling, 0 replies; 20+ messages in thread
From: Arnd Bergmann @ 2010-05-11 12:59 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Stefan Berger, netdev, chrisw, davem

On Tuesday 11 May 2010, Scott Feldman wrote:
> On 5/10/10 2:46 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:
>
> > Stefan, can you just define the XML in a way that matches the netlink
> > definition? What you need is something like
> > 
> > 1. VF number (optional, signifies that 2/3 are done in firmware)
> > 2. Lower-level protocol
> >   2.1. CDCP
> >      2.1.1 SVID
> >      2.1.2 SCID
> >   2.2. enic
> >      2.2.1 port profile name
> >      2.2.2 ...
> > 3. VDP
> >   3.1 VSI type/version/provider
> >   3.2 UUID
> >   3.3 MAC/VLAN
> > 
> > You need to have 2. or 3. or both, and 2.1/2.2 are mutually exclusive.
> 
> I don't think this is going in the right direction.  We're taking a
> pretty simple concept, a port-profile used to configure the virtual port
> backing a VM i/f, and turning it into something that's trying to munge
> disjoint protocols based on pre-standard work into a single API.  It's
> forcing all those
> pre-standard protocol details into the API, into the XML, and into the mgmt
> software (libvirt), and into the admin's lap.

No. I agree that the port-profile concept is simple and that the complexity
comes from trying to merge the Linux interface with what we do for VDP.
It would be much more sensible IMHO to just unify port-profiles and CDCP
in the kernel interface, because they are conceptually more similar and
both rather simple (note that there is no S-VLAN implementation in Linux,
so CDCP is not there yet either).

If we layer VDP on top of the two and do it always in user space (LLDPAD
with lldptool), things are much simpler on the interface side.

The focus of VDP is to manage migration, which is something that
enic doesn't do (or doesn't need to do) and most of the complexity
in there comes from this.

> I want the API to pass a port-profile name plus other information associated
> with the VM i/f to some mgmt object which can set up the virtual port backing
> the VM i/f (and unset it).  Using netlink for the API lets that object be in
> user- or kernel-space software, hardware or firmware.  How the object sets
> up the virtual port based on the passed port-profile is beyond the scope of
> the API.
> 
> My last port-profile patch is this API.  It gives us this:
> 
>     1) single netlink msg for kernel- and user-space

Except for the VF argument, which is kernel-only, and it doesn't work
if user space needs additional information that only the firmware has
in your case (something like a virtual channel ID).

>     2) single parameter set from sender's perspective (libvirt)
>     3) single XML representation of parameters
>     4) single code path in kernel and libvirt

>     5) (potential) cross-vendor-switch VM migration

You cannot do live migration between Cisco switches and those implementing
802.1Qbg, because of the differences between port profiles and VSI types,
e.g. how they handle VLANs or handover during migration.
Migration between 802.1Qbg capable switches is covered by the standard.

>     6) admin-friendly port-profile names

>     7) allows pre-standard (802.1Qbg/bh) details to change
>        without bogging down the API

We're basically nailing down the API right here. As soon as this is
supported in Linux, it will be the standard, so the details won't
be able to change any more.

> What I proposed on the libvirt list is to maintain a mapping database from
> port-profile to vendor-specific or protocol-specific parameters.  Using VDP
> as an example, a port-profile would resolve to the VDP tuple:
> 
>      port-profile: "joes-garage"   --->  VSI Manager ID: 15
>                                          VSI Type ID: 12345
>                                          VSI Type ID Ver: 1
>                                          other VSI settings (preassociate)

I agree that the port profile name matches the VSI manager/type/version ID
to a large degree and that it would be nice to unify these, but I don't think
there is a point given all the other differences.

There is no way around the preassociate/associate/disassociate messages
being part of the API, because otherwise you cannot do seamless migration
across multiple switches. 

Also, the primary key on VDP is the VSI UUID, which needs to be the same
on the target host after migration, while in your case the switch never
identifies the guest itself, only the port profile.

If I have understood you correctly, the primary key identifying a port
on enic is something that is only visible to the switch and nic firmware,
but never to software, so you identify the guest by its VF number on the
user API.

	Arnd


* Re: [PATCH] virtif: initial interface extensions
  2010-05-11  0:25               ` Scott Feldman
  2010-05-11 12:59                 ` Arnd Bergmann
@ 2010-05-11 17:15                 ` Vivek Kashyap
  1 sibling, 0 replies; 20+ messages in thread
From: Vivek Kashyap @ 2010-05-11 17:15 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Arnd Bergmann, Stefan Berger, netdev

>
> I don't think this is going in the right direction.  We're taking a
> pretty simple concept, a port-profile used to configure the virtual port
> backing a VM i/f, and turning it into something that's trying to munge
> disjoint protocols based on pre-standard work into a single API.  It's
> forcing all those

The multi-channel setup and VDP can live together, and are designed to:
  - multiple channels per link (set up by the bridge) -- CDCP
  - association of a VSI (virtual interface) to bridge ports -- VDP

However, their implementation can be pursued separately.

> pre-standard protocol details into the API, into the XML, and into the mgmt
> software (libvirt), and into the admin's lap.
>
> I want the API to pass a port-profile name plus other information associated

The problem with defining a 'name' as the lookup key to another database is
that one then requires additional management mechanisms to describe,
maintain, and map to the real values needed. It lands in the admin's lap
anyway, by a different path.


> with the VM i/f to some mgmt object which can set up the virtual port backing
> the VM i/f (and unset it).  Using netlink for the API lets that object be in
> user- or kernel-space software, hardware or firmware.  How the object sets
> up the virtual port based on the passed port-profile is beyond the scope of
> the API.

Agreed.  See Stefan's last patch in libvirt: it converges 802.1Qbh/port-profile
and 802.1Qbg quite well. The netlink message can be deciphered by the
recipient and provides the advantages below.  (Or, one could use a direct
unix domain socket to communicate with a daemon as well.)


>
> My last port-profile patch is this API.  It gives us this:
>
>    1) single netlink msg for kernel- and user-space
>    2) single parameter set from sender's perspective (libvirt)
>    3) single XML representation of parameters
>    4) single code path in kernel and libvirt
>    5) (potential) cross-vendor-switch VM migration
>    6) admin-friendly port-profile names
>    7) allows pre-standard (802.1Qbg/bh) details to change
>       without bogging down the API
>
> What I proposed on the libvirt list is to maintain a mapping database from
> port-profile to vendor-specific or protocol-specific parameters.  Using VDP
> as an example, a port-profile would resolve to the VDP tuple:
>
>     port-profile: "joes-garage"   --->  VSI Manager ID: 15
>                                         VSI Type ID: 12345
>                                         VSI Type ID Ver: 1
>                                         other VSI settings (preassociate)
>
> How the mapping database is maintained is beyond the scope of the API.

Yes, but that will add another mechanism to set the mappings. The
XML posted (on libvirt) defines the VSI values and avoids this
additional management. It also allows for a port-profile name for 802.1Qbh.

thanks
 	Vivek




[parent not found: <OFFE8F5F70.5C07C656-ON8525771F.00787A71-8525771F.007FCDFC@us.ibm.com>]

* Re: [PATCH] virtif: initial interface extensions
       [not found]               ` <OFFE8F5F70.5C07C656-ON8525771F.00787A71-8525771F.007FCDFC@us.ibm.com>
@ 2010-05-11 12:25                 ` Arnd Bergmann
       [not found]                   ` <OF2E2B37D4.51A81D74-ON85257720.0045FA96-85257720.004C5403@us.ibm.com>
  0 siblings, 1 reply; 20+ messages in thread
From: Arnd Bergmann @ 2010-05-11 12:25 UTC (permalink / raw)
  To: Stefan Berger; +Cc: netdev, Scott Feldman

On Tuesday 11 May 2010, Stefan Berger wrote:
> Arnd Bergmann <arnd@arndb.de> wrote on 05/10/2010 05:46:37 PM:
> 
> > Stefan, can you just define the XML in a way that matches the netlink
> > definition? What you need is something like
> > 
> > 1. VF number (optional, signifies that 2/3 are done in firmware)
> 
> Shouldn't we be able to query that number via netlink, starting with the
> macvtap device and then following the trail to the root, trying to find
> a VF number on the way?

No. If we have a macvtap device, there is no VF number. The VF number
should be known to libvirt in those cases where instead of creating a
macvtap device, it assigns a VF of an SR-IOV adapter to the guest.

> > 2. Lower-level protocol
> >   2.1. CDCP
> >      2.1.1 SVID
> >      2.1.2 SCID
> 
> Will these be queryable via netlink as well later on, just not today?
> Vivek tells me svid is vlan, so that could be found out from the kernel.
> 
> So if we want to only support 1 and 2 for now, I'd rather skip them for 
> now.

svid is almost vlan (hence S-VLAN), but slightly different and is not
currently supported by the kernel. Again, if the implementation is done in
firmware, libvirt needs to set the same S-VLAN ID when setting up the
VF and when associating it to the switch.

When it's done in software, we need to create the device (or have
it created in advance), so you either know it or can query it as
you describe.

You don't need to support it yet in libvirt, but the definition should
be done in a way that leaves the option open to add it later.

> >      2.2.2 ...
> > 3. VDP
> >   3.1 VSI type/version/provider
> 
> as proposed on libvirt mailing list
> 
> >   3.2 UUID
> 
> we have a couple of UUIDs, which one?

This is a UUID that describes the VSI to the switch. It needs to be
unique in the migration domain. For a guest that has multiple
macvtap interfaces, you either need to have a single UUID and
put all MAC/VLAN pairs into the same netlink message with this
UUID, or have one UUID per device. 
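
The first option (one UUID carrying every MAC/VLAN pair of the guest) can
be sketched as a data structure.  This is an illustration only: the type
names, the helper and the limit of four pairs are assumptions, not part of
any netlink definition discussed here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VSI_MAX_PAIRS 4  /* arbitrary limit chosen for this sketch */

/* One MAC/VLAN pair belonging to a guest interface. */
struct mac_vlan {
	uint8_t mac[6];
	uint16_t vlan;
};

/* A VSI instance: one UUID, unique in the migration domain, plus its pairs. */
struct vsi_instance {
	uint8_t uuid[16];
	size_t n_pairs;
	struct mac_vlan pairs[VSI_MAX_PAIRS];
};

/* Appends a MAC/VLAN pair; returns 0 on success, -1 if the table is full. */
int vsi_add_pair(struct vsi_instance *vsi, const uint8_t mac[6], uint16_t vlan)
{
	if (vsi->n_pairs >= VSI_MAX_PAIRS)
		return -1;
	memcpy(vsi->pairs[vsi->n_pairs].mac, mac, 6);
	vsi->pairs[vsi->n_pairs].vlan = vlan;
	vsi->n_pairs++;
	return 0;
}
```

The second option, one UUID per device, corresponds to using one such
instance per macvtap interface with a single pair in each.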

> >   3.3 MAC/VLAN
> 
> MAC: available from libvirt
> VLAN: can be found by querying every interface for its VLAN ID while
> following the path towards the root device.

Yes, in case of macvtap.

	Arnd


[parent not found: <OF2E2B37D4.51A81D74-ON85257720.0045FA96-85257720.004C5403@us.ibm.com>]

* Re: [PATCH] virtif: initial interface extensions
       [not found]                   ` <OF2E2B37D4.51A81D74-ON85257720.0045FA96-85257720.004C5403@us.ibm.com>
@ 2010-05-11 14:22                     ` Arnd Bergmann
  0 siblings, 0 replies; 20+ messages in thread
From: Arnd Bergmann @ 2010-05-11 14:22 UTC (permalink / raw)
  To: Stefan Berger, Chris Wright; +Cc: netdev, Scott Feldman

On Tuesday 11 May 2010, Stefan Berger wrote:
> Arnd Bergmann <arnd@arndb.de> wrote on 05/11/2010 08:25:27 AM:
> > netdev, Scott Feldman
> > On Tuesday 11 May 2010, Stefan Berger wrote:
> > > Arnd Bergmann <arnd@arndb.de> wrote on 05/10/2010 05:46:37 PM:
> > No. If we have a macvtap device, there is no VF number. The VF number
> > should be known to libvirt in those cases where instead of creating a
> > macvtap device, it assigns a VF of an SR-IOV adapter to the guest.
> 
> The only interface type that currently supports the vsi parameters is the
> 'direct' type of interface which directly maps into macvtap. That's the 
> only one that would currently let you run the setup protocol with the
> switch. Regular tap devices created through other interface types 
> (bridge, network) do not support these parameters and hence you cannot
> run the protocol with the switch. I never tried passthrough but I 
> believe libvirt is not aware of what it is passing through nor do we
> currently support the parameters for passthrough devices.

Ok. I believe we will at least have to add the same kind of setup
to bridged devices as well, not just macvtap.

For SR-IOV with device assignment, I'm not sure. This will be more
important when adapters show up that actually support VEPA in hardware
and don't have their own switch, but even for those with an integrated
switch, it would be nice if we could use VDP correctly.

> > svid is almost vlan (hence S-VLAN), but slightly different and is not
> > currently supported by the kernel. Again, if the implementation is done in
> > firmware, libvirt needs to set the same S-VLAN ID when setting up the
> > VF and when associating it to the switch.
> 
> The netlink messages go into the kernel and I suppose the driver should be
> able to find out what the S-VLAN ID is that it needs to use, no?

Possibly yes, but that will depend on how the firmware does this. It
may also be possible that adapters implement this similarly to enic,
which does not expose the S-VLAN ID at all and uses the VF
number as the identifier.

Maybe we should leave out the CDCP stuff for now, until we start seeing
hardware for it.

> > This is a UUID that describes the VSI to the switch. It needs to be
> > unique in the migration domain. For a guest that has multiple
> > macvtap interfaces, you either need to have a single UUID and
> > put all MAC/VLAN pairs into the same netlink message with this
> > UUID, or have one UUID per device. 
> 
> In that case it's the instanceID as proposed in this XML here:
> 
>    <interface type='direct'>
>       <source dev='static' mode='vepa'/>
>       <model type='virtio'/>
>       <vsi managerid='12' typeid='0x123456' typeidversion='1'
>            instanceid='fa9b7fff-b0a0-4893-8e0e-beef4ff18f8f' />
>       <filterref filter='clean-traffic'/>
>    </interface>

Yes.

	Arnd


end of thread, other threads:[~2010-05-11 17:15 UTC | newest]

Thread overview: 20+ messages
2010-05-06  4:42 [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support Scott Feldman
2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 1/3] Add netdev/netlink port-profile support (was iovnl) Scott Feldman
2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 2/3] Add ndo_{set|get}_vf_port_profile op support for enic dynamic vnics Scott Feldman
2010-05-06 13:47   ` Arnd Bergmann
2010-05-06 16:25     ` Scott Feldman
2010-05-06 16:45       ` Arnd Bergmann
2010-05-06  4:42 ` [net-next-2.6 V5 PATCH 3/3] Add SR-IOV support to enic (please don't apply this patch) Scott Feldman
2010-05-06 13:51 ` [net-next-2.6 V5 PATCH 0/3] Add port-profile netlink support Arnd Bergmann
2010-05-06 16:19   ` Scott Feldman
2010-05-06 16:42     ` Arnd Bergmann
2010-05-08 23:20       ` [PATCH] virtif: initial interface extensions Arnd Bergmann
2010-05-10 15:37         ` Stefan Berger
2010-05-10 18:56           ` Scott Feldman
2010-05-10 21:46             ` Arnd Bergmann
2010-05-10 23:51               ` Stefan Berger
2010-05-11  0:25               ` Scott Feldman
2010-05-11 12:59                 ` Arnd Bergmann
2010-05-11 17:15                 ` Vivek Kashyap
     [not found]               ` <OFFE8F5F70.5C07C656-ON8525771F.00787A71-8525771F.007FCDFC@us.ibm.com>
2010-05-11 12:25                 ` Arnd Bergmann
     [not found]                   ` <OF2E2B37D4.51A81D74-ON85257720.0045FA96-85257720.004C5403@us.ibm.com>
2010-05-11 14:22                     ` Arnd Bergmann
