Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v6 1/8] SUNRPC: introduce helpers for reference counted rpcbind clients
From: Stanislav Kinsbursky @ 2011-10-25 12:41 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-nfs@vger.kernel.org, Pavel Emelianov, neilb@suse.de,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	bfields@fieldses.org, davem@davemloft.net, devel@openvz.org
In-Reply-To: <1319541369.2716.1.camel@lade.trondhjem.org>

25.10.2011 15:16, Trond Myklebust пишет:
> On Tue, 2011-10-25 at 14:16 +0300, Stanislav Kinsbursky wrote:
>> v6:
>> 1) added write memory barrier to rpcb_set_local to make sure, that rpcbind
>> clients become valid before rpcb_users assignment
>> 2) explicitly set rpcb_users to 1 instead of incrementing it (looks clearer from
>> my pow).
>>
>> v5: fixed races with rpcb_users in rpcb_get_local()
>>
>> This helpers will be used for dynamical creation and destruction of rpcbind
>> clients.
>> Variable rpcb_users is actually a counter of lauched RPC services. If rpcbind
>> clients has been created already, then we just increase rpcb_users.
>>
>> Signed-off-by: Stanislav Kinsbursky<skinsbursky@parallels.com>
>>
>> ---
>>   net/sunrpc/rpcb_clnt.c |   54 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 files changed, 54 insertions(+), 0 deletions(-)
>>
>> diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
>> index e45d2fb..9fcdb42 100644
>> --- a/net/sunrpc/rpcb_clnt.c
>> +++ b/net/sunrpc/rpcb_clnt.c
>> @@ -114,6 +114,9 @@ static struct rpc_program	rpcb_program;
>>   static struct rpc_clnt *	rpcb_local_clnt;
>>   static struct rpc_clnt *	rpcb_local_clnt4;
>>
>> +DEFINE_SPINLOCK(rpcb_clnt_lock);
>> +unsigned int			rpcb_users;
>> +
>>   struct rpcbind_args {
>>   	struct rpc_xprt *	r_xprt;
>>
>> @@ -161,6 +164,57 @@ static void rpcb_map_release(void *data)
>>   	kfree(map);
>>   }
>>
>> +static int rpcb_get_local(void)
>> +{
>> +	int cnt;
>> +
>> +	spin_lock(&rpcb_clnt_lock);
>> +	if (rpcb_users)
>> +		rpcb_users++;
>> +	cnt = rpcb_users;
>> +	spin_unlock(&rpcb_clnt_lock);
>> +
>> +	return cnt;
>> +}
>> +
>> +void rpcb_put_local(void)
>> +{
>> +	struct rpc_clnt *clnt = rpcb_local_clnt;
>> +	struct rpc_clnt *clnt4 = rpcb_local_clnt4;
>> +	int shutdown;
>> +
>> +	spin_lock(&rpcb_clnt_lock);
>> +	if (--rpcb_users == 0) {
>> +		rpcb_local_clnt = NULL;
>> +		rpcb_local_clnt4 = NULL;
>> +	}
>> +	shutdown = !rpcb_users;
>> +	spin_unlock(&rpcb_clnt_lock);
>> +
>> +	if (shutdown) {
>> +		/*
>> +		 * cleanup_rpcb_clnt - remove xprtsock's sysctls, unregister
>> +		 */
>> +		if (clnt4)
>> +			rpc_shutdown_client(clnt4);
>> +		if (clnt)
>> +			rpc_shutdown_client(clnt);
>> +	}
>> +	return;
>
> I'm removing this before applying...
>
Sorry, but I don't understand what exactly you are removing, and why?

>> +}
>> +
>


-- 
Best regards,
Stanislav Kinsbursky

^ permalink raw reply

* Re: [PATCH v6 1/8] SUNRPC: introduce helpers for reference counted rpcbind clients
From: Trond Myklebust @ 2011-10-25 12:45 UTC (permalink / raw)
  To: Stanislav Kinsbursky
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Pavel Emelianov, neilb-l3A5Bk7waGM@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org,
	devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
In-Reply-To: <4EA6AE87.5000309-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>

On Tue, 2011-10-25 at 16:41 +0400, Stanislav Kinsbursky wrote: 
> 25.10.2011 15:16, Trond Myklebust пишет:
> > On Tue, 2011-10-25 at 14:16 +0300, Stanislav Kinsbursky wrote:
> >> v6:
> >> 1) added write memory barrier to rpcb_set_local to make sure, that rpcbind
> >> clients become valid before rpcb_users assignment
> >> 2) explicitly set rpcb_users to 1 instead of incrementing it (looks clearer from
> >> my pow).
> >>
> >> v5: fixed races with rpcb_users in rpcb_get_local()
> >>
> >> This helpers will be used for dynamical creation and destruction of rpcbind
> >> clients.
> >> Variable rpcb_users is actually a counter of lauched RPC services. If rpcbind
> >> clients has been created already, then we just increase rpcb_users.
> >>
> >> Signed-off-by: Stanislav Kinsbursky<skinsbursky-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
> >>
> >> ---
> >>   net/sunrpc/rpcb_clnt.c |   54 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>   1 files changed, 54 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
> >> index e45d2fb..9fcdb42 100644
> >> --- a/net/sunrpc/rpcb_clnt.c
> >> +++ b/net/sunrpc/rpcb_clnt.c
> >> @@ -114,6 +114,9 @@ static struct rpc_program	rpcb_program;
> >>   static struct rpc_clnt *	rpcb_local_clnt;
> >>   static struct rpc_clnt *	rpcb_local_clnt4;
> >>
> >> +DEFINE_SPINLOCK(rpcb_clnt_lock);
> >> +unsigned int			rpcb_users;
> >> +
> >>   struct rpcbind_args {
> >>   	struct rpc_xprt *	r_xprt;
> >>
> >> @@ -161,6 +164,57 @@ static void rpcb_map_release(void *data)
> >>   	kfree(map);
> >>   }
> >>
> >> +static int rpcb_get_local(void)
> >> +{
> >> +	int cnt;
> >> +
> >> +	spin_lock(&rpcb_clnt_lock);
> >> +	if (rpcb_users)
> >> +		rpcb_users++;
> >> +	cnt = rpcb_users;
> >> +	spin_unlock(&rpcb_clnt_lock);
> >> +
> >> +	return cnt;
> >> +}
> >> +
> >> +void rpcb_put_local(void)
> >> +{
> >> +	struct rpc_clnt *clnt = rpcb_local_clnt;
> >> +	struct rpc_clnt *clnt4 = rpcb_local_clnt4;
> >> +	int shutdown;
> >> +
> >> +	spin_lock(&rpcb_clnt_lock);
> >> +	if (--rpcb_users == 0) {
> >> +		rpcb_local_clnt = NULL;
> >> +		rpcb_local_clnt4 = NULL;
> >> +	}
> >> +	shutdown = !rpcb_users;
> >> +	spin_unlock(&rpcb_clnt_lock);
> >> +
> >> +	if (shutdown) {
> >> +		/*
> >> +		 * cleanup_rpcb_clnt - remove xprtsock's sysctls, unregister
> >> +		 */
> >> +		if (clnt4)
> >> +			rpc_shutdown_client(clnt4);
> >> +		if (clnt)
> >> +			rpc_shutdown_client(clnt);
> >> +	}
> >> +	return;
> >
> > I'm removing this before applying...
> >
> Sorry, but I don't understand what exactly you are removing, and why?

The empty 'return' at the end of a void function: it is 100%
redundant...

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org
www.netapp.com

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [patch net-next V5] net: introduce ethernet teaming device
From: Jiri Pirko @ 2011-10-25 13:03 UTC (permalink / raw)
  To: netdev
  Cc: davem, eric.dumazet, bhutchings, shemminger, fubar, andy, tgraf,
	ebiederm, mirqus, kaber, greearb, jesse, fbl, benjamin.poirier,
	jzupka

This patch introduces new network device called team. It supposes to be
very fast, simple, userspace-driven alternative to existing bonding
driver.

Userspace library called libteam with couple of demo apps is available
here:
https://github.com/jpirko/libteam
Note it's still in its dipers atm.

team<->libteam use generic netlink for communication. That and rtnl
suppose to be the only way to configure team device, no sysfs etc.

Python binding basis for libteam was recently introduced (some need
still need to be done on it though). Daemon providing arpmon/miimon
active-backup functionality will be introduced shortly.
All what's necessary is already implemented in kernel team driver.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>

v4->v5:
	- team_change_mtu() uses team->lock while travesing though port
	  list
	- mac address changes are moved completely to jurisdiction of
	  userspace daemon. This way the daemon can do FOM1, FOM2 and
	  possibly other weird things with mac addresses.
	  Only round-robin mode sets up all ports to bond's address then
	  enslaved.
	- Extended Kconfig text

v3->v4:
	- remove redundant synchronize_rcu from __team_change_mode()
	- revert "set and clear of mode_ops happens per pointer, not per
	  byte"
	- extend comment of function __team_change_mode()

v2->v3:
	- team_change_mtu() uses rcu version of list traversal to unwind
	- set and clear of mode_ops happens per pointer, not per byte
	- port hashlist changed to be embedded into team structure
	- error branch in team_port_enter() does cleanup now
	- fixed rtln->rtnl

v1->v2:
	- modes are made as modules. Makes team more modular and
	  extendable.
	- several commenters' nitpicks found on v1 were fixed
	- several other bugs were fixed.
	- note I ignored Eric's comment about roundrobin port selector
	  as Eric's way may be easily implemented as another mode (mode
	  "random") in future.
---
 Documentation/networking/team.txt         |    2 +
 MAINTAINERS                               |    7 +
 drivers/net/Kconfig                       |    2 +
 drivers/net/Makefile                      |    1 +
 drivers/net/team/Kconfig                  |   43 +
 drivers/net/team/Makefile                 |    7 +
 drivers/net/team/team.c                   | 1555 +++++++++++++++++++++++++++++
 drivers/net/team/team_mode_activebackup.c |  137 +++
 drivers/net/team/team_mode_roundrobin.c   |  107 ++
 include/linux/Kbuild                      |    1 +
 include/linux/if.h                        |    1 +
 include/linux/if_team.h                   |  230 +++++
 12 files changed, 2093 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/networking/team.txt
 create mode 100644 drivers/net/team/Kconfig
 create mode 100644 drivers/net/team/Makefile
 create mode 100644 drivers/net/team/team.c
 create mode 100644 drivers/net/team/team_mode_activebackup.c
 create mode 100644 drivers/net/team/team_mode_roundrobin.c
 create mode 100644 include/linux/if_team.h

diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
new file mode 100644
index 0000000..5a01368
--- /dev/null
+++ b/Documentation/networking/team.txt
@@ -0,0 +1,2 @@
+Team devices are driven from userspace via libteam library which is here:
+	https://github.com/jpirko/libteam
diff --git a/MAINTAINERS b/MAINTAINERS
index 5008b08..c33400d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6372,6 +6372,13 @@ W:	http://tcp-lp-mod.sourceforge.net/
 S:	Maintained
 F:	net/ipv4/tcp_lp.c
 
+TEAM DRIVER
+M:	Jiri Pirko <jpirko@redhat.com>
+L:	netdev@vger.kernel.org
+S:	Supported
+F:	drivers/net/team/
+F:	include/linux/if_team.h
+
 TEGRA SUPPORT
 M:	Colin Cross <ccross@android.com>
 M:	Erik Gilling <konkers@android.com>
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 583f66c..b3020be 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -125,6 +125,8 @@ config IFB
 	  'ifb1' etc.
 	  Look at the iproute2 documentation directory for usage etc
 
+source "drivers/net/team/Kconfig"
+
 config MACVLAN
 	tristate "MAC-VLAN support (EXPERIMENTAL)"
 	depends on EXPERIMENTAL
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index fa877cd..4e4ebfe 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_NET) += Space.o loopback.o
 obj-$(CONFIG_NETCONSOLE) += netconsole.o
 obj-$(CONFIG_PHYLIB) += phy/
 obj-$(CONFIG_RIONET) += rionet.o
+obj-$(CONFIG_NET_TEAM) += team/
 obj-$(CONFIG_TUN) += tun.o
 obj-$(CONFIG_VETH) += veth.o
 obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
diff --git a/drivers/net/team/Kconfig b/drivers/net/team/Kconfig
new file mode 100644
index 0000000..248a144
--- /dev/null
+++ b/drivers/net/team/Kconfig
@@ -0,0 +1,43 @@
+menuconfig NET_TEAM
+	tristate "Ethernet team driver support (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	---help---
+	  This allows one to create virtual interfaces that teams together
+	  multiple ethernet devices.
+
+	  Team devices can be added using the "ip" command from the
+	  iproute2 package:
+
+	  "ip link add link [ address MAC ] [ NAME ] type team"
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called team.
+
+if NET_TEAM
+
+config NET_TEAM_MODE_ROUNDROBIN
+	tristate "Round-robin mode support"
+	depends on NET_TEAM
+	---help---
+	  Basic mode where port used for transmitting packets is selected in
+	  round-robin fashion using packet counter.
+
+	  All added ports are setup to have bond's mac address.
+
+	  To compile this team mode as a module, choose M here: the module
+	  will be called team_mode_roundrobin.
+
+config NET_TEAM_MODE_ACTIVEBACKUP
+	tristate "Active-backup mode support"
+	depends on NET_TEAM
+	---help---
+	  Only one port is active at a time and the rest of ports are used
+	  for backup.
+
+	  Mac addresses of ports are not modified. Userspace is responsible
+	  to do so.
+
+	  To compile this team mode as a module, choose M here: the module
+	  will be called team_mode_activebackup.
+
+endif # NET_TEAM
diff --git a/drivers/net/team/Makefile b/drivers/net/team/Makefile
new file mode 100644
index 0000000..85f2028
--- /dev/null
+++ b/drivers/net/team/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the network team driver
+#
+
+obj-$(CONFIG_NET_TEAM) += team.o
+obj-$(CONFIG_NET_TEAM_MODE_ROUNDROBIN) += team_mode_roundrobin.o
+obj-$(CONFIG_NET_TEAM_MODE_ACTIVEBACKUP) += team_mode_activebackup.o
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
new file mode 100644
index 0000000..0334ee8
--- /dev/null
+++ b/drivers/net/team/team.c
@@ -0,0 +1,1555 @@
+/*
+ * net/drivers/team/team.c - Network team device driver
+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/rcupdate.h>
+#include <linux/errno.h>
+#include <linux/ctype.h>
+#include <linux/notifier.h>
+#include <linux/netdevice.h>
+#include <linux/if_arp.h>
+#include <linux/socket.h>
+#include <linux/etherdevice.h>
+#include <linux/rtnetlink.h>
+#include <net/rtnetlink.h>
+#include <net/genetlink.h>
+#include <net/netlink.h>
+#include <linux/if_team.h>
+
+#define DRV_NAME "team"
+
+
+/**********
+ * Helpers
+ **********/
+
+#define team_port_exists(dev) (dev->priv_flags & IFF_TEAM_PORT)
+
+static struct team_port *team_port_get_rcu(const struct net_device *dev)
+{
+	struct team_port *port = rcu_dereference(dev->rx_handler_data);
+
+	return team_port_exists(dev) ? port : NULL;
+}
+
+static struct team_port *team_port_get_rtnl(const struct net_device *dev)
+{
+	struct team_port *port = rtnl_dereference(dev->rx_handler_data);
+
+	return team_port_exists(dev) ? port : NULL;
+}
+
+/*
+ * Since the ability to change mac address for open port device is tested in
+ * team_port_add, this function can be called without control of return value
+ */
+static int __set_port_mac(struct net_device *port_dev,
+			  const unsigned char *dev_addr)
+{
+	struct sockaddr addr;
+
+	memcpy(addr.sa_data, dev_addr, ETH_ALEN);
+	addr.sa_family = ARPHRD_ETHER;
+	return dev_set_mac_address(port_dev, &addr);
+}
+
+int team_port_set_orig_mac(struct team_port *port)
+{
+	return __set_port_mac(port->dev, port->orig.dev_addr);
+}
+
+int team_port_set_team_mac(struct team_port *port)
+{
+	return __set_port_mac(port->dev, port->team->dev->dev_addr);
+}
+EXPORT_SYMBOL(team_port_set_team_mac);
+
+
+/*******************
+ * Options handling
+ *******************/
+
+void team_options_register(struct team *team, struct team_option *option,
+			   size_t option_count)
+{
+	int i;
+
+	for (i = 0; i < option_count; i++, option++)
+		list_add_tail(&option->list, &team->option_list);
+}
+EXPORT_SYMBOL(team_options_register);
+
+static void __team_options_change_check(struct team *team,
+					struct team_option *changed_option);
+
+static void __team_options_unregister(struct team *team,
+				      struct team_option *option,
+				      size_t option_count)
+{
+	int i;
+
+	for (i = 0; i < option_count; i++, option++)
+		list_del(&option->list);
+}
+
+void team_options_unregister(struct team *team, struct team_option *option,
+			     size_t option_count)
+{
+	__team_options_unregister(team, option, option_count);
+	__team_options_change_check(team, NULL);
+}
+EXPORT_SYMBOL(team_options_unregister);
+
+static int team_option_get(struct team *team, struct team_option *option,
+			   void *arg)
+{
+	return option->getter(team, arg);
+}
+
+static int team_option_set(struct team *team, struct team_option *option,
+			   void *arg)
+{
+	int err;
+
+	err = option->setter(team, arg);
+	if (err)
+		return err;
+
+	__team_options_change_check(team, option);
+	return err;
+}
+
+/****************
+ * Mode handling
+ ****************/
+
+static LIST_HEAD(mode_list);
+static DEFINE_SPINLOCK(mode_list_lock);
+
+static struct team_mode *__find_mode(const char *kind)
+{
+	struct team_mode *mode;
+
+	list_for_each_entry(mode, &mode_list, list) {
+		if (strcmp(mode->kind, kind) == 0)
+			return mode;
+	}
+	return NULL;
+}
+
+static bool is_good_mode_name(const char *name)
+{
+	while (*name != '\0') {
+		if (!isalpha(*name) && !isdigit(*name) && *name != '_')
+			return false;
+		name++;
+	}
+	return true;
+}
+
+int team_mode_register(struct team_mode *mode)
+{
+	int err = 0;
+
+	if (!is_good_mode_name(mode->kind) ||
+	    mode->priv_size > TEAM_MODE_PRIV_SIZE)
+		return -EINVAL;
+	spin_lock(&mode_list_lock);
+	if (__find_mode(mode->kind)) {
+		err = -EEXIST;
+		goto unlock;
+	}
+	list_add_tail(&mode->list, &mode_list);
+unlock:
+	spin_unlock(&mode_list_lock);
+	return err;
+}
+EXPORT_SYMBOL(team_mode_register);
+
+int team_mode_unregister(struct team_mode *mode)
+{
+	spin_lock(&mode_list_lock);
+	list_del_init(&mode->list);
+	spin_unlock(&mode_list_lock);
+	return 0;
+}
+EXPORT_SYMBOL(team_mode_unregister);
+
+static struct team_mode *team_mode_get(const char *kind)
+{
+	struct team_mode *mode;
+
+	spin_lock(&mode_list_lock);
+	mode = __find_mode(kind);
+	if (!mode) {
+		spin_unlock(&mode_list_lock);
+		request_module("team-mode-%s", kind);
+		spin_lock(&mode_list_lock);
+		mode = __find_mode(kind);
+	}
+	if (mode)
+		if (!try_module_get(mode->owner))
+			mode = NULL;
+
+	spin_unlock(&mode_list_lock);
+	return mode;
+}
+
+static void team_mode_put(const char *kind)
+{
+	struct team_mode *mode;
+
+	spin_lock(&mode_list_lock);
+	mode = __find_mode(kind);
+	BUG_ON(!mode);
+	module_put(mode->owner);
+	spin_unlock(&mode_list_lock);
+}
+
+/*
+ * We can benefit from the fact that it's ensured no port is present
+ * at the time of mode change. Therefore no packets are in fly so there's no
+ * need to set mode operations in any special way.
+ */
+static int __team_change_mode(struct team *team,
+			      const struct team_mode *new_mode)
+{
+	/* Check if mode was previously set and do cleanup if so */
+	if (team->mode_kind) {
+		void (*exit_op)(struct team *team) = team->mode_ops.exit;
+
+		/* Clear ops area so no callback is called any longer */
+		memset(&team->mode_ops, 0, sizeof(struct team_mode_ops));
+
+		if (exit_op)
+			exit_op(team);
+		team_mode_put(team->mode_kind);
+		team->mode_kind = NULL;
+		/* zero private data area */
+		memset(&team->mode_priv, 0,
+		       sizeof(struct team) - offsetof(struct team, mode_priv));
+	}
+
+	if (!new_mode)
+		return 0;
+
+	if (new_mode->ops->init) {
+		int err;
+
+		err = new_mode->ops->init(team);
+		if (err)
+			return err;
+	}
+
+	team->mode_kind = new_mode->kind;
+	memcpy(&team->mode_ops, new_mode->ops, sizeof(struct team_mode_ops));
+
+	return 0;
+}
+
+static int team_change_mode(struct team *team, const char *kind)
+{
+	struct team_mode *new_mode;
+	struct net_device *dev = team->dev;
+	int err;
+
+	if (!list_empty(&team->port_list)) {
+		netdev_err(dev, "No ports can be present during mode change\n");
+		return -EBUSY;
+	}
+
+	if (team->mode_kind && strcmp(team->mode_kind, kind) == 0) {
+		netdev_err(dev, "Unable to change to the same mode the team is in\n");
+		return -EINVAL;
+	}
+
+	new_mode = team_mode_get(kind);
+	if (!new_mode) {
+		netdev_err(dev, "Mode \"%s\" not found\n", kind);
+		return -EINVAL;
+	}
+
+	err = __team_change_mode(team, new_mode);
+	if (err) {
+		netdev_err(dev, "Failed to change to mode \"%s\"\n", kind);
+		team_mode_put(kind);
+		return err;
+	}
+
+	netdev_info(dev, "Mode changed to \"%s\"\n", kind);
+	return 0;
+}
+
+
+/************************
+ * Rx path frame handler
+ ************************/
+
+/* note: already called with rcu_read_lock */
+static rx_handler_result_t team_handle_frame(struct sk_buff **pskb)
+{
+	struct sk_buff *skb = *pskb;
+	struct team_port *port;
+	struct team *team;
+	rx_handler_result_t res = RX_HANDLER_ANOTHER;
+
+	skb = skb_share_check(skb, GFP_ATOMIC);
+	if (!skb)
+		return RX_HANDLER_CONSUMED;
+
+	*pskb = skb;
+
+	port = team_port_get_rcu(skb->dev);
+	team = port->team;
+
+	if (team->mode_ops.receive)
+		res = team->mode_ops.receive(team, port, skb);
+
+	if (res == RX_HANDLER_ANOTHER) {
+		struct team_pcpu_stats *pcpu_stats;
+
+		pcpu_stats = this_cpu_ptr(team->pcpu_stats);
+		u64_stats_update_begin(&pcpu_stats->syncp);
+		pcpu_stats->rx_packets++;
+		pcpu_stats->rx_bytes += skb->len;
+		if (skb->pkt_type == PACKET_MULTICAST)
+			pcpu_stats->rx_multicast++;
+		u64_stats_update_end(&pcpu_stats->syncp);
+
+		skb->dev = team->dev;
+	} else {
+		this_cpu_inc(team->pcpu_stats->rx_dropped);
+	}
+
+	return res;
+}
+
+
+/****************
+ * Port handling
+ ****************/
+
+static bool team_port_find(const struct team *team,
+			   const struct team_port *port)
+{
+	struct team_port *cur;
+
+	list_for_each_entry(cur, &team->port_list, list)
+		if (cur == port)
+			return true;
+	return false;
+}
+
+/*
+ * Add/delete port to the team port list. Write guarded by rtnl_lock.
+ * Takes care of correct port->index setup (might be racy).
+ */
+static void team_port_list_add_port(struct team *team,
+				    struct team_port *port)
+{
+	port->index = team->port_count++;
+	hlist_add_head_rcu(&port->hlist,
+			   team_port_index_hash(team, port->index));
+	list_add_tail_rcu(&port->list, &team->port_list);
+}
+
+static void __reconstruct_port_hlist(struct team *team, int rm_index)
+{
+	int i;
+	struct team_port *port;
+
+	for (i = rm_index + 1; i < team->port_count; i++) {
+		port = team_get_port_by_index_rcu(team, i);
+		hlist_del_rcu(&port->hlist);
+		port->index--;
+		hlist_add_head_rcu(&port->hlist,
+				   team_port_index_hash(team, port->index));
+	}
+}
+
+static void team_port_list_del_port(struct team *team,
+				   struct team_port *port)
+{
+	int rm_index = port->index;
+
+	hlist_del_rcu(&port->hlist);
+	list_del_rcu(&port->list);
+	__reconstruct_port_hlist(team, rm_index);
+	team->port_count--;
+}
+
+#define TEAM_VLAN_FEATURES (NETIF_F_ALL_CSUM | NETIF_F_SG | \
+			    NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
+			    NETIF_F_HIGHDMA | NETIF_F_LRO)
+
+static void __team_compute_features(struct team *team)
+{
+	struct team_port *port;
+	u32 vlan_features = TEAM_VLAN_FEATURES;
+	unsigned short max_hard_header_len = ETH_HLEN;
+
+	list_for_each_entry(port, &team->port_list, list) {
+		vlan_features = netdev_increment_features(vlan_features,
+					port->dev->vlan_features,
+					TEAM_VLAN_FEATURES);
+
+		if (port->dev->hard_header_len > max_hard_header_len)
+			max_hard_header_len = port->dev->hard_header_len;
+	}
+
+	team->dev->vlan_features = vlan_features;
+	team->dev->hard_header_len = max_hard_header_len;
+
+	netdev_change_features(team->dev);
+}
+
+static void team_compute_features(struct team *team)
+{
+	spin_lock(&team->lock);
+	__team_compute_features(team);
+	spin_unlock(&team->lock);
+}
+
+static int team_port_enter(struct team *team, struct team_port *port)
+{
+	int err = 0;
+
+	dev_hold(team->dev);
+	port->dev->priv_flags |= IFF_TEAM_PORT;
+	if (team->mode_ops.port_enter) {
+		err = team->mode_ops.port_enter(team, port);
+		if (err) {
+			netdev_err(team->dev, "Device %s failed to enter team mode\n",
+				   port->dev->name);
+			goto err_port_enter;
+		}
+	}
+
+	return 0;
+
+err_port_enter:
+	port->dev->priv_flags &= ~IFF_TEAM_PORT;
+	dev_put(team->dev);
+
+	return err;
+}
+
+static void team_port_leave(struct team *team, struct team_port *port)
+{
+	if (team->mode_ops.port_leave)
+		team->mode_ops.port_leave(team, port);
+	port->dev->priv_flags &= ~IFF_TEAM_PORT;
+	dev_put(team->dev);
+}
+
+static void __team_port_change_check(struct team_port *port, bool linkup);
+
+static int team_port_add(struct team *team, struct net_device *port_dev)
+{
+	struct net_device *dev = team->dev;
+	struct team_port *port;
+	char *portname = port_dev->name;
+	int err;
+
+	if (port_dev->flags & IFF_LOOPBACK ||
+	    port_dev->type != ARPHRD_ETHER) {
+		netdev_err(dev, "Device %s is of an unsupported type\n",
+			   portname);
+		return -EINVAL;
+	}
+
+	if (team_port_exists(port_dev)) {
+		netdev_err(dev, "Device %s is already a port "
+				"of a team device\n", portname);
+		return -EBUSY;
+	}
+
+	if (port_dev->flags & IFF_UP) {
+		netdev_err(dev, "Device %s is up. Set it down before adding it as a team port\n",
+			   portname);
+		return -EBUSY;
+	}
+
+	port = kzalloc(sizeof(struct team_port), GFP_KERNEL);
+	if (!port)
+		return -ENOMEM;
+
+	port->dev = port_dev;
+	port->team = team;
+
+	port->orig.mtu = port_dev->mtu;
+	err = dev_set_mtu(port_dev, dev->mtu);
+	if (err) {
+		netdev_dbg(dev, "Error %d calling dev_set_mtu\n", err);
+		goto err_set_mtu;
+	}
+
+	memcpy(port->orig.dev_addr, port_dev->dev_addr, ETH_ALEN);
+
+	err = team_port_enter(team, port);
+	if (err) {
+		netdev_err(dev, "Device %s failed to enter team mode\n",
+			   portname);
+		goto err_port_enter;
+	}
+
+	err = dev_open(port_dev);
+	if (err) {
+		netdev_dbg(dev, "Device %s opening failed\n",
+			   portname);
+		goto err_dev_open;
+	}
+
+	err = netdev_set_master(port_dev, dev);
+	if (err) {
+		netdev_err(dev, "Device %s failed to set master\n", portname);
+		goto err_set_master;
+	}
+
+	err = netdev_rx_handler_register(port_dev, team_handle_frame,
+					 port);
+	if (err) {
+		netdev_err(dev, "Device %s failed to register rx_handler\n",
+			   portname);
+		goto err_handler_register;
+	}
+
+	team_port_list_add_port(team, port);
+	__team_compute_features(team);
+	__team_port_change_check(port, !!netif_carrier_ok(port_dev));
+
+	netdev_info(dev, "Port device %s added\n", portname);
+
+	return 0;
+
+err_handler_register:
+	netdev_set_master(port_dev, NULL);
+
+err_set_master:
+	dev_close(port_dev);
+
+err_dev_open:
+	team_port_leave(team, port);
+	team_port_set_orig_mac(port);
+
+err_port_enter:
+	dev_set_mtu(port_dev, port->orig.mtu);
+
+err_set_mtu:
+	kfree(port);
+
+	return err;
+}
+
+static int team_port_del(struct team *team, struct net_device *port_dev)
+{
+	struct net_device *dev = team->dev;
+	struct team_port *port;
+	char *portname = port_dev->name;
+
+	port = team_port_get_rtnl(port_dev);
+	if (!port || !team_port_find(team, port)) {
+		netdev_err(dev, "Device %s does not act as a port of this team\n",
+			   portname);
+		return -ENOENT;
+	}
+
+	__team_port_change_check(port, false);
+	team_port_list_del_port(team, port);
+	netdev_rx_handler_unregister(port_dev);
+	netdev_set_master(port_dev, NULL);
+	dev_close(port_dev);
+	team_port_leave(team, port);
+	team_port_set_orig_mac(port);
+	dev_set_mtu(port_dev, port->orig.mtu);
+	synchronize_rcu();
+	kfree(port);
+	netdev_info(dev, "Port device %s removed\n", portname);
+	__team_compute_features(team);
+
+	return 0;
+}
+
+
+/*****************
+ * Net device ops
+ *****************/
+
+static const char team_no_mode_kind[] = "*NOMODE*";
+
+static int team_mode_option_get(struct team *team, void *arg)
+{
+	const char **str = arg;
+
+	*str = team->mode_kind ? team->mode_kind : team_no_mode_kind;
+	return 0;
+}
+
+static int team_mode_option_set(struct team *team, void *arg)
+{
+	const char **str = arg;
+
+	return team_change_mode(team, *str);
+}
+
+static struct team_option team_options[] = {
+	{
+		.name = "mode",
+		.type = TEAM_OPTION_TYPE_STRING,
+		.getter = team_mode_option_get,
+		.setter = team_mode_option_set,
+	},
+};
+
+static int team_init(struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+	int i;
+
+	team->dev = dev;
+	spin_lock_init(&team->lock);
+
+	team->pcpu_stats = alloc_percpu(struct team_pcpu_stats);
+	if (!team->pcpu_stats)
+		return -ENOMEM;
+
+	for (i = 0; i < TEAM_PORT_HASHENTRIES; i++)
+		INIT_HLIST_HEAD(&team->port_hlist[i]);
+	INIT_LIST_HEAD(&team->port_list);
+
+	INIT_LIST_HEAD(&team->option_list);
+	team_options_register(team, team_options, ARRAY_SIZE(team_options));
+	netif_carrier_off(dev);
+
+	return 0;
+}
+
+static void team_uninit(struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	struct team_port *tmp;
+
+	spin_lock(&team->lock);
+	list_for_each_entry_safe(port, tmp, &team->port_list, list)
+		team_port_del(team, port->dev);
+
+	__team_change_mode(team, NULL); /* cleanup */
+	__team_options_unregister(team, team_options, ARRAY_SIZE(team_options));
+	spin_unlock(&team->lock);
+}
+
+static void team_destructor(struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+
+	free_percpu(team->pcpu_stats);
+	free_netdev(dev);
+}
+
+static int team_open(struct net_device *dev)
+{
+	netif_carrier_on(dev);
+	return 0;
+}
+
+static int team_close(struct net_device *dev)
+{
+	netif_carrier_off(dev);
+	return 0;
+}
+
+/*
+ * note: already called with rcu_read_lock
+ */
+static netdev_tx_t team_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+	bool tx_success = false;
+	unsigned int len = skb->len;
+
+	/*
+	 * Ensure transmit function is called only in case there is at least
+	 * one port present.
+	 */
+	if (likely(!list_empty(&team->port_list) && team->mode_ops.transmit))
+		tx_success = team->mode_ops.transmit(team, skb);
+	if (tx_success) {
+		struct team_pcpu_stats *pcpu_stats;
+
+		pcpu_stats = this_cpu_ptr(team->pcpu_stats);
+		u64_stats_update_begin(&pcpu_stats->syncp);
+		pcpu_stats->tx_packets++;
+		pcpu_stats->tx_bytes += len;
+		u64_stats_update_end(&pcpu_stats->syncp);
+	} else {
+		this_cpu_inc(team->pcpu_stats->tx_dropped);
+	}
+
+	return NETDEV_TX_OK;
+}
+
+static void team_change_rx_flags(struct net_device *dev, int change)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	int inc;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		if (change & IFF_PROMISC) {
+			inc = dev->flags & IFF_PROMISC ? 1 : -1;
+			dev_set_promiscuity(port->dev, inc);
+		}
+		if (change & IFF_ALLMULTI) {
+			inc = dev->flags & IFF_ALLMULTI ? 1 : -1;
+			dev_set_allmulti(port->dev, inc);
+		}
+	}
+	rcu_read_unlock();
+}
+
+static void team_set_rx_mode(struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		dev_uc_sync(port->dev, dev);
+		dev_mc_sync(port->dev, dev);
+	}
+	rcu_read_unlock();
+}
+
+static int team_set_mac_address(struct net_device *dev, void *p)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	struct sockaddr *addr = p;
+
+	memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list)
+		if (team->mode_ops.port_change_mac)
+			team->mode_ops.port_change_mac(team, port);
+	rcu_read_unlock();
+	return 0;
+}
+
+static int team_change_mtu(struct net_device *dev, int new_mtu)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	int err;
+
+	/*
+	 * Alhough this is reader, it's guarded by team lock. It's not possible
+	 * to traverse list in reverse under rcu_read_lock
+	 */
+	spin_lock(&team->lock);
+	list_for_each_entry(port, &team->port_list, list) {
+		err = dev_set_mtu(port->dev, new_mtu);
+		if (err) {
+			netdev_err(dev, "Device %s failed to change mtu",
+				   port->dev->name);
+			goto unwind;
+		}
+	}
+	spin_unlock(&team->lock);
+
+	dev->mtu = new_mtu;
+
+	return 0;
+
+unwind:
+	list_for_each_entry_continue_reverse(port, &team->port_list, list)
+		dev_set_mtu(port->dev, dev->mtu);
+	spin_unlock(&team->lock);
+
+	return err;
+}
+
+static struct rtnl_link_stats64 *
+team_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_pcpu_stats *p;
+	u64 rx_packets, rx_bytes, rx_multicast, tx_packets, tx_bytes;
+	u32 rx_dropped = 0, tx_dropped = 0;
+	unsigned int start;
+	int i;
+
+	for_each_possible_cpu(i) {
+		p = per_cpu_ptr(team->pcpu_stats, i);
+		do {
+			start = u64_stats_fetch_begin_bh(&p->syncp);
+			rx_packets	= p->rx_packets;
+			rx_bytes	= p->rx_bytes;
+			rx_multicast	= p->rx_multicast;
+			tx_packets	= p->tx_packets;
+			tx_bytes	= p->tx_bytes;
+		} while (u64_stats_fetch_retry_bh(&p->syncp, start));
+
+		stats->rx_packets	+= rx_packets;
+		stats->rx_bytes		+= rx_bytes;
+		stats->multicast	+= rx_multicast;
+		stats->tx_packets	+= tx_packets;
+		stats->tx_bytes		+= tx_bytes;
+		/*
+		 * rx_dropped & tx_dropped are u32, updated
+		 * without syncp protection.
+		 */
+		rx_dropped	+= p->rx_dropped;
+		tx_dropped	+= p->tx_dropped;
+	}
+	stats->rx_dropped	= rx_dropped;
+	stats->tx_dropped	= tx_dropped;
+	return stats;
+}
+
+static void team_vlan_rx_add_vid(struct net_device *dev, uint16_t vid)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		const struct net_device_ops *ops = port->dev->netdev_ops;
+
+		ops->ndo_vlan_rx_add_vid(port->dev, vid);
+	}
+	rcu_read_unlock();
+}
+
+static void team_vlan_rx_kill_vid(struct net_device *dev, uint16_t vid)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		const struct net_device_ops *ops = port->dev->netdev_ops;
+
+		ops->ndo_vlan_rx_kill_vid(port->dev, vid);
+	}
+	rcu_read_unlock();
+}
+
+static int team_add_slave(struct net_device *dev, struct net_device *port_dev)
+{
+	struct team *team = netdev_priv(dev);
+	int err;
+
+	spin_lock(&team->lock);
+	err = team_port_add(team, port_dev);
+	spin_unlock(&team->lock);
+	return err;
+}
+
+static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
+{
+	struct team *team = netdev_priv(dev);
+	int err;
+
+	spin_lock(&team->lock);
+	err = team_port_del(team, port_dev);
+	spin_unlock(&team->lock);
+	return err;
+}
+
+static const struct net_device_ops team_netdev_ops = {
+	.ndo_init		= team_init,
+	.ndo_uninit		= team_uninit,
+	.ndo_open		= team_open,
+	.ndo_stop		= team_close,
+	.ndo_start_xmit		= team_xmit,
+	.ndo_change_rx_flags	= team_change_rx_flags,
+	.ndo_set_rx_mode	= team_set_rx_mode,
+	.ndo_set_mac_address	= team_set_mac_address,
+	.ndo_change_mtu		= team_change_mtu,
+	.ndo_get_stats64	= team_get_stats64,
+	.ndo_vlan_rx_add_vid	= team_vlan_rx_add_vid,
+	.ndo_vlan_rx_kill_vid	= team_vlan_rx_kill_vid,
+	.ndo_add_slave		= team_add_slave,
+	.ndo_del_slave		= team_del_slave,
+};
+
+
+/***********************
+ * rt netlink interface
+ ***********************/
+
+static void team_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+
+	dev->netdev_ops = &team_netdev_ops;
+	dev->destructor	= team_destructor;
+	dev->tx_queue_len = 0;
+	dev->flags |= IFF_MULTICAST;
+	dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
+
+	/*
+	 * Indicate we support unicast address filtering. That way core won't
+	 * bring us to promisc mode in case a unicast addr is added.
+	 * Let this up to underlay drivers.
+	 */
+	dev->priv_flags |= IFF_UNICAST_FLT;
+
+	dev->features |= NETIF_F_LLTX;
+	dev->features |= NETIF_F_GRO;
+	dev->hw_features = NETIF_F_HW_VLAN_TX |
+			   NETIF_F_HW_VLAN_RX |
+			   NETIF_F_HW_VLAN_FILTER;
+
+	dev->features |= dev->hw_features;
+}
+
+static int team_newlink(struct net *src_net, struct net_device *dev,
+			struct nlattr *tb[], struct nlattr *data[])
+{
+	int err;
+
+	if (tb[IFLA_ADDRESS] == NULL)
+		random_ether_addr(dev->dev_addr);
+
+	err = register_netdevice(dev);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static int team_validate(struct nlattr *tb[], struct nlattr *data[])
+{
+	if (tb[IFLA_ADDRESS]) {
+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+			return -EINVAL;
+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+			return -EADDRNOTAVAIL;
+	}
+	return 0;
+}
+
+static struct rtnl_link_ops team_link_ops __read_mostly = {
+	.kind		= DRV_NAME,
+	.priv_size	= sizeof(struct team),
+	.setup		= team_setup,
+	.newlink	= team_newlink,
+	.validate	= team_validate,
+};
+
+
+/***********************************
+ * Generic netlink custom interface
+ ***********************************/
+
+static struct genl_family team_nl_family = {
+	.id		= GENL_ID_GENERATE,
+	.name		= TEAM_GENL_NAME,
+	.version	= TEAM_GENL_VERSION,
+	.maxattr	= TEAM_ATTR_MAX,
+	.netnsok	= true,
+};
+
+static const struct nla_policy team_nl_policy[TEAM_ATTR_MAX + 1] = {
+	[TEAM_ATTR_UNSPEC]			= { .type = NLA_UNSPEC, },
+	[TEAM_ATTR_TEAM_IFINDEX]		= { .type = NLA_U32 },
+	[TEAM_ATTR_LIST_OPTION]			= { .type = NLA_NESTED },
+	[TEAM_ATTR_LIST_PORT]			= { .type = NLA_NESTED },
+};
+
+static const struct nla_policy
+team_nl_option_policy[TEAM_ATTR_OPTION_MAX + 1] = {
+	[TEAM_ATTR_OPTION_UNSPEC]		= { .type = NLA_UNSPEC, },
+	[TEAM_ATTR_OPTION_NAME] = {
+		.type = NLA_STRING,
+		.len = TEAM_STRING_MAX_LEN,
+	},
+	[TEAM_ATTR_OPTION_CHANGED]		= { .type = NLA_FLAG },
+	[TEAM_ATTR_OPTION_TYPE]			= { .type = NLA_U8 },
+	[TEAM_ATTR_OPTION_DATA] = {
+		.type = NLA_BINARY,
+		.len = TEAM_STRING_MAX_LEN,
+	},
+};
+
+static int team_nl_cmd_noop(struct sk_buff *skb, struct genl_info *info)
+{
+	struct sk_buff *msg;
+	void *hdr;
+	int err;
+
+	msg = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(msg, info->snd_pid, info->snd_seq,
+			  &team_nl_family, 0, TEAM_CMD_NOOP);
+	if (IS_ERR(hdr)) {
+		err = PTR_ERR(hdr);
+		goto err_msg_put;
+	}
+
+	genlmsg_end(msg, hdr);
+
+	return genlmsg_unicast(genl_info_net(info), msg, info->snd_pid);
+
+err_msg_put:
+	nlmsg_free(msg);
+
+	return err;
+}
+
+/*
+ * Netlink cmd functions should be locked by following two functions.
+ * To ensure team_uninit would not be called in between, hold rcu_read_lock
+ * all the time.
+ */
+static struct team *team_nl_team_get(struct genl_info *info)
+{
+	struct net *net = genl_info_net(info);
+	int ifindex;
+	struct net_device *dev;
+	struct team *team;
+
+	if (!info->attrs[TEAM_ATTR_TEAM_IFINDEX])
+		return NULL;
+
+	ifindex = nla_get_u32(info->attrs[TEAM_ATTR_TEAM_IFINDEX]);
+	rcu_read_lock();
+	dev = dev_get_by_index_rcu(net, ifindex);
+	if (!dev || dev->netdev_ops != &team_netdev_ops) {
+		rcu_read_unlock();
+		return NULL;
+	}
+
+	team = netdev_priv(dev);
+	spin_lock(&team->lock);
+	return team;
+}
+
+static void team_nl_team_put(struct team *team)
+{
+	spin_unlock(&team->lock);
+	rcu_read_unlock();
+}
+
+static int team_nl_send_generic(struct genl_info *info, struct team *team,
+				int (*fill_func)(struct sk_buff *skb,
+						 struct genl_info *info,
+						 int flags, struct team *team))
+{
+	struct sk_buff *skb;
+	int err;
+
+	skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	err = fill_func(skb, info, NLM_F_ACK, team);
+	if (err < 0)
+		goto err_fill;
+
+	err = genlmsg_unicast(genl_info_net(info), skb, info->snd_pid);
+	return err;
+
+err_fill:
+	nlmsg_free(skb);
+	return err;
+}
+
+static int team_nl_fill_options_get_changed(struct sk_buff *skb,
+					    u32 pid, u32 seq, int flags,
+					    struct team *team,
+					    struct team_option *changed_option)
+{
+	struct nlattr *option_list;
+	void *hdr;
+	struct team_option *option;
+
+	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
+			  TEAM_CMD_OPTIONS_GET);
+	if (IS_ERR(hdr))
+		return PTR_ERR(hdr);
+
+	NLA_PUT_U32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex);
+	option_list = nla_nest_start(skb, TEAM_ATTR_LIST_OPTION);
+	if (!option_list)
+		return -EMSGSIZE;
+
+	list_for_each_entry(option, &team->option_list, list) {
+		struct nlattr *option_item;
+		long arg;
+
+		option_item = nla_nest_start(skb, TEAM_ATTR_ITEM_OPTION);
+		if (!option_item)
+			goto nla_put_failure;
+		NLA_PUT_STRING(skb, TEAM_ATTR_OPTION_NAME, option->name);
+		if (option == changed_option)
+			NLA_PUT_FLAG(skb, TEAM_ATTR_OPTION_CHANGED);
+		switch (option->type) {
+		case TEAM_OPTION_TYPE_U32:
+			NLA_PUT_U8(skb, TEAM_ATTR_OPTION_TYPE, NLA_U32);
+			team_option_get(team, option, &arg);
+			NLA_PUT_U32(skb, TEAM_ATTR_OPTION_DATA, arg);
+			break;
+		case TEAM_OPTION_TYPE_STRING:
+			NLA_PUT_U8(skb, TEAM_ATTR_OPTION_TYPE, NLA_STRING);
+			team_option_get(team, option, &arg);
+			NLA_PUT_STRING(skb, TEAM_ATTR_OPTION_DATA,
+				       (char *) arg);
+			break;
+		default:
+			BUG();
+		}
+		nla_nest_end(skb, option_item);
+	}
+
+	nla_nest_end(skb, option_list);
+	return genlmsg_end(skb, hdr);
+
+nla_put_failure:
+	genlmsg_cancel(skb, hdr);
+	return -EMSGSIZE;
+}
+
+static int team_nl_fill_options_get(struct sk_buff *skb,
+				    struct genl_info *info, int flags,
+				    struct team *team)
+{
+	return team_nl_fill_options_get_changed(skb, info->snd_pid,
+						info->snd_seq, NLM_F_ACK,
+						team, NULL);
+}
+
+static int team_nl_cmd_options_get(struct sk_buff *skb, struct genl_info *info)
+{
+	struct team *team;
+	int err;
+
+	team = team_nl_team_get(info);
+	if (!team)
+		return -EINVAL;
+
+	err = team_nl_send_generic(info, team, team_nl_fill_options_get);
+
+	team_nl_team_put(team);
+
+	return err;
+}
+
+static int team_nl_cmd_options_set(struct sk_buff *skb, struct genl_info *info)
+{
+	struct team *team;
+	int err = 0;
+	int i;
+	struct nlattr *nl_option;
+
+	team = team_nl_team_get(info);
+	if (!team)
+		return -EINVAL;
+
+	err = -EINVAL;
+	if (!info->attrs[TEAM_ATTR_LIST_OPTION]) {
+		err = -EINVAL;
+		goto team_put;
+	}
+
+	nla_for_each_nested(nl_option, info->attrs[TEAM_ATTR_LIST_OPTION], i) {
+		struct nlattr *mode_attrs[TEAM_ATTR_OPTION_MAX + 1];
+		enum team_option_type opt_type;
+		struct team_option *option;
+		char *opt_name;
+		bool opt_found = false;
+
+		if (nla_type(nl_option) != TEAM_ATTR_ITEM_OPTION) {
+			err = -EINVAL;
+			goto team_put;
+		}
+		err = nla_parse_nested(mode_attrs, TEAM_ATTR_OPTION_MAX,
+				       nl_option, team_nl_option_policy);
+		if (err)
+			goto team_put;
+		if (!mode_attrs[TEAM_ATTR_OPTION_NAME] ||
+		    !mode_attrs[TEAM_ATTR_OPTION_TYPE] ||
+		    !mode_attrs[TEAM_ATTR_OPTION_DATA]) {
+			err = -EINVAL;
+			goto team_put;
+		}
+		switch (nla_get_u8(mode_attrs[TEAM_ATTR_OPTION_TYPE])) {
+		case NLA_U32:
+			opt_type = TEAM_OPTION_TYPE_U32;
+			break;
+		case NLA_STRING:
+			opt_type = TEAM_OPTION_TYPE_STRING;
+			break;
+		default:
+			goto team_put;
+		}
+
+		opt_name = nla_data(mode_attrs[TEAM_ATTR_OPTION_NAME]);
+		list_for_each_entry(option, &team->option_list, list) {
+			long arg;
+			struct nlattr *opt_data_attr;
+
+			if (option->type != opt_type ||
+			    strcmp(option->name, opt_name))
+				continue;
+			opt_found = true;
+			opt_data_attr = mode_attrs[TEAM_ATTR_OPTION_DATA];
+			switch (opt_type) {
+			case TEAM_OPTION_TYPE_U32:
+				arg = nla_get_u32(opt_data_attr);
+				break;
+			case TEAM_OPTION_TYPE_STRING:
+				arg = (long) nla_data(opt_data_attr);
+				break;
+			default:
+				BUG();
+			}
+			err = team_option_set(team, option, &arg);
+			if (err)
+				goto team_put;
+		}
+		if (!opt_found) {
+			err = -ENOENT;
+			goto team_put;
+		}
+	}
+
+team_put:
+	team_nl_team_put(team);
+
+	return err;
+}
+
+static int team_nl_fill_port_list_get_changed(struct sk_buff *skb,
+					      u32 pid, u32 seq, int flags,
+					      struct team *team,
+					      struct team_port *changed_port)
+{
+	struct nlattr *port_list;
+	void *hdr;
+	struct team_port *port;
+
+	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
+			  TEAM_CMD_PORT_LIST_GET);
+	if (IS_ERR(hdr))
+		return PTR_ERR(hdr);
+
+	NLA_PUT_U32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex);
+	port_list = nla_nest_start(skb, TEAM_ATTR_LIST_PORT);
+	if (!port_list)
+		return -EMSGSIZE;
+
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		struct nlattr *port_item;
+
+		port_item = nla_nest_start(skb, TEAM_ATTR_ITEM_PORT);
+		if (!port_item)
+			goto nla_put_failure;
+		NLA_PUT_U32(skb, TEAM_ATTR_PORT_IFINDEX, port->dev->ifindex);
+		if (port == changed_port)
+			NLA_PUT_FLAG(skb, TEAM_ATTR_PORT_CHANGED);
+		if (port->linkup)
+			NLA_PUT_FLAG(skb, TEAM_ATTR_PORT_LINKUP);
+		NLA_PUT_U32(skb, TEAM_ATTR_PORT_SPEED, port->speed);
+		NLA_PUT_U8(skb, TEAM_ATTR_PORT_DUPLEX, port->duplex);
+		nla_nest_end(skb, port_item);
+	}
+
+	nla_nest_end(skb, port_list);
+	return genlmsg_end(skb, hdr);
+
+nla_put_failure:
+	genlmsg_cancel(skb, hdr);
+	return -EMSGSIZE;
+}
+
+static int team_nl_fill_port_list_get(struct sk_buff *skb,
+				      struct genl_info *info, int flags,
+				      struct team *team)
+{
+	return team_nl_fill_port_list_get_changed(skb, info->snd_pid,
+						  info->snd_seq, NLM_F_ACK,
+						  team, NULL);
+}
+
+static int team_nl_cmd_port_list_get(struct sk_buff *skb,
+				     struct genl_info *info)
+{
+	struct team *team;
+	int err;
+
+	team = team_nl_team_get(info);
+	if (!team)
+		return -EINVAL;
+
+	err = team_nl_send_generic(info, team, team_nl_fill_port_list_get);
+
+	team_nl_team_put(team);
+
+	return err;
+}
+
+static struct genl_ops team_nl_ops[] = {
+	{
+		.cmd = TEAM_CMD_NOOP,
+		.doit = team_nl_cmd_noop,
+		.policy = team_nl_policy,
+	},
+	{
+		.cmd = TEAM_CMD_OPTIONS_SET,
+		.doit = team_nl_cmd_options_set,
+		.policy = team_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = TEAM_CMD_OPTIONS_GET,
+		.doit = team_nl_cmd_options_get,
+		.policy = team_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = TEAM_CMD_PORT_LIST_GET,
+		.doit = team_nl_cmd_port_list_get,
+		.policy = team_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+};
+
+static struct genl_multicast_group team_change_event_mcgrp = {
+	.name = TEAM_GENL_CHANGE_EVENT_MC_GRP_NAME,
+};
+
+static int team_nl_send_event_options_get(struct team *team,
+					  struct team_option *changed_option)
+{
+	struct sk_buff *skb;
+	int err;
+	struct net *net = dev_net(team->dev);
+
+	skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	err = team_nl_fill_options_get_changed(skb, 0, 0, 0, team,
+					       changed_option);
+	if (err < 0)
+		goto err_fill;
+
+	err = genlmsg_multicast_netns(net, skb, 0, team_change_event_mcgrp.id,
+				      GFP_KERNEL);
+	return err;
+
+err_fill:
+	nlmsg_free(skb);
+	return err;
+}
+
+static int team_nl_send_event_port_list_get(struct team_port *port)
+{
+	struct sk_buff *skb;
+	int err;
+	struct net *net = dev_net(port->team->dev);
+
+	skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	err = team_nl_fill_port_list_get_changed(skb, 0, 0, 0,
+						 port->team, port);
+	if (err < 0)
+		goto err_fill;
+
+	err = genlmsg_multicast_netns(net, skb, 0, team_change_event_mcgrp.id,
+				      GFP_KERNEL);
+	return err;
+
+err_fill:
+	nlmsg_free(skb);
+	return err;
+}
+
+static int team_nl_init(void)
+{
+	int err;
+
+	err = genl_register_family_with_ops(&team_nl_family, team_nl_ops,
+					    ARRAY_SIZE(team_nl_ops));
+	if (err)
+		return err;
+
+	err = genl_register_mc_group(&team_nl_family, &team_change_event_mcgrp);
+	if (err)
+		goto err_change_event_grp_reg;
+
+	return 0;
+
+err_change_event_grp_reg:
+	genl_unregister_family(&team_nl_family);
+
+	return err;
+}
+
+static void team_nl_fini(void)
+{
+	genl_unregister_family(&team_nl_family);
+}
+
+
+/******************
+ * Change checkers
+ ******************/
+
+static void __team_options_change_check(struct team *team,
+					struct team_option *changed_option)
+{
+	int err;
+
+	err = team_nl_send_event_options_get(team, changed_option);
+	if (err)
+		netdev_warn(team->dev, "Failed to send options change via netlink\n");
+}
+
+/* rtnl lock is held */
+static void __team_port_change_check(struct team_port *port, bool linkup)
+{
+	int err;
+
+	if (port->linkup == linkup)
+		return;
+
+	port->linkup = linkup;
+	if (linkup) {
+		struct ethtool_cmd ecmd;
+
+		err = __ethtool_get_settings(port->dev, &ecmd);
+		if (!err) {
+			port->speed = ethtool_cmd_speed(&ecmd);
+			port->duplex = ecmd.duplex;
+			goto send_event;
+		}
+	}
+	port->speed = 0;
+	port->duplex = 0;
+
+send_event:
+	err = team_nl_send_event_port_list_get(port);
+	if (err)
+		netdev_warn(port->team->dev, "Failed to send port change of device %s via netlink\n",
+			    port->dev->name);
+
+}
+
+static void team_port_change_check(struct team_port *port, bool linkup)
+{
+	struct team *team = port->team;
+
+	spin_lock(&team->lock);
+	__team_port_change_check(port, linkup);
+	spin_unlock(&team->lock);
+}
+
+/************************************
+ * Net device notifier event handler
+ ************************************/
+
+static int team_device_event(struct notifier_block *unused,
+			     unsigned long event, void *ptr)
+{
+	struct net_device *dev = (struct net_device *) ptr;
+	struct team_port *port;
+
+	port = team_port_get_rtnl(dev);
+	if (!port)
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_UP:
+		if (netif_carrier_ok(dev))
+			team_port_change_check(port, true);
+	case NETDEV_DOWN:
+		team_port_change_check(port, false);
+	case NETDEV_CHANGE:
+		if (netif_running(port->dev))
+			team_port_change_check(port,
+					       !!netif_carrier_ok(port->dev));
+		break;
+	case NETDEV_UNREGISTER:
+		team_del_slave(port->team->dev, dev);
+		break;
+	case NETDEV_FEAT_CHANGE:
+		team_compute_features(port->team);
+		break;
+	case NETDEV_CHANGEMTU:
+		/* Forbid to change mtu of underlaying device */
+		return NOTIFY_BAD;
+	case NETDEV_PRE_TYPE_CHANGE:
+		/* Forbid to change type of underlaying device */
+		return NOTIFY_BAD;
+	}
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block team_notifier_block __read_mostly = {
+	.notifier_call = team_device_event,
+};
+
+
+/***********************
+ * Module init and exit
+ ***********************/
+
+static int __init team_module_init(void)
+{
+	int err;
+
+	register_netdevice_notifier(&team_notifier_block);
+
+	err = rtnl_link_register(&team_link_ops);
+	if (err)
+		goto err_rtnl_reg;
+
+	err = team_nl_init();
+	if (err)
+		goto err_nl_init;
+
+	return 0;
+
+err_nl_init:
+	rtnl_link_unregister(&team_link_ops);
+
+err_rtnl_reg:
+	unregister_netdevice_notifier(&team_notifier_block);
+
+	return err;
+}
+
+static void __exit team_module_exit(void)
+{
+	team_nl_fini();
+	rtnl_link_unregister(&team_link_ops);
+	unregister_netdevice_notifier(&team_notifier_block);
+}
+
+module_init(team_module_init);
+module_exit(team_module_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
+MODULE_DESCRIPTION("Ethernet team device driver");
+MODULE_ALIAS_RTNL_LINK(DRV_NAME);
diff --git a/drivers/net/team/team_mode_activebackup.c b/drivers/net/team/team_mode_activebackup.c
new file mode 100644
index 0000000..b939a2c
--- /dev/null
+++ b/drivers/net/team/team_mode_activebackup.c
@@ -0,0 +1,137 @@
+/*
+ * net/drivers/team/team_mode_activebackup.c - Active-backup mode for team
+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/errno.h>
+#include <linux/netdevice.h>
+#include <net/rtnetlink.h>
+#include <linux/if_team.h>
+
+struct ab_priv {
+	struct team_port __rcu *active_port;
+};
+
+static struct ab_priv *ab_priv(struct team *team)
+{
+	return (struct ab_priv *) &team->mode_priv;
+}
+
+static rx_handler_result_t ab_receive(struct team *team, struct team_port *port,
+				      struct sk_buff *skb) {
+	struct team_port *active_port;
+
+	active_port = rcu_dereference(ab_priv(team)->active_port);
+	if (active_port != port)
+		return RX_HANDLER_EXACT;
+	return RX_HANDLER_ANOTHER;
+}
+
+static bool ab_transmit(struct team *team, struct sk_buff *skb)
+{
+	struct team_port *active_port;
+
+	active_port = rcu_dereference(ab_priv(team)->active_port);
+	if (unlikely(!active_port))
+		goto drop;
+	skb->dev = active_port->dev;
+	if (dev_queue_xmit(skb))
+		return false;
+	return true;
+
+drop:
+	dev_kfree_skb(skb);
+	return false;
+}
+
+static void ab_port_leave(struct team *team, struct team_port *port)
+{
+	if (ab_priv(team)->active_port == port)
+		rcu_assign_pointer(ab_priv(team)->active_port, NULL);
+}
+
+static int ab_active_port_get(struct team *team, void *arg)
+{
+	u32 *ifindex = arg;
+
+	*ifindex = 0;
+	if (ab_priv(team)->active_port)
+		*ifindex = ab_priv(team)->active_port->dev->ifindex;
+	return 0;
+}
+
+static int ab_active_port_set(struct team *team, void *arg)
+{
+	u32 *ifindex = arg;
+	struct team_port *port;
+
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		if (port->dev->ifindex == *ifindex) {
+			rcu_assign_pointer(ab_priv(team)->active_port, port);
+			return 0;
+		}
+	}
+	return -ENOENT;
+}
+
+static struct team_option ab_options[] = {
+	{
+		.name = "activeport",
+		.type = TEAM_OPTION_TYPE_U32,
+		.getter = ab_active_port_get,
+		.setter = ab_active_port_set,
+	},
+};
+
+int ab_init(struct team *team)
+{
+	team_options_register(team, ab_options, ARRAY_SIZE(ab_options));
+	return 0;
+}
+
+void ab_exit(struct team *team)
+{
+	team_options_unregister(team, ab_options, ARRAY_SIZE(ab_options));
+}
+
+static const struct team_mode_ops ab_mode_ops = {
+	.init			= ab_init,
+	.exit			= ab_exit,
+	.receive		= ab_receive,
+	.transmit		= ab_transmit,
+	.port_leave		= ab_port_leave,
+};
+
+static struct team_mode ab_mode = {
+	.kind		= "activebackup",
+	.owner		= THIS_MODULE,
+	.priv_size	= sizeof(struct ab_priv),
+	.ops		= &ab_mode_ops,
+};
+
+static int __init ab_init_module(void)
+{
+	return team_mode_register(&ab_mode);
+}
+
+static void __exit ab_cleanup_module(void)
+{
+	team_mode_unregister(&ab_mode);
+}
+
+module_init(ab_init_module);
+module_exit(ab_cleanup_module);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
+MODULE_DESCRIPTION("Active-backup mode for team");
+MODULE_ALIAS("team-mode-activebackup");
diff --git a/drivers/net/team/team_mode_roundrobin.c b/drivers/net/team/team_mode_roundrobin.c
new file mode 100644
index 0000000..0374052
--- /dev/null
+++ b/drivers/net/team/team_mode_roundrobin.c
@@ -0,0 +1,107 @@
+/*
+ * net/drivers/team/team_mode_roundrobin.c - Round-robin mode for team
+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/errno.h>
+#include <linux/netdevice.h>
+#include <linux/if_team.h>
+
+struct rr_priv {
+	unsigned int sent_packets;
+};
+
+static struct rr_priv *rr_priv(struct team *team)
+{
+	return (struct rr_priv *) &team->mode_priv;
+}
+
+static struct team_port *__get_first_port_up(struct team *team,
+					     struct team_port *port)
+{
+	struct team_port *cur;
+
+	if (port->linkup)
+		return port;
+	cur = port;
+	list_for_each_entry_continue_rcu(cur, &team->port_list, list)
+		if (cur->linkup)
+			return cur;
+	list_for_each_entry_rcu(cur, &team->port_list, list) {
+		if (cur == port)
+			break;
+		if (cur->linkup)
+			return cur;
+	}
+	return NULL;
+}
+
+static bool rr_transmit(struct team *team, struct sk_buff *skb)
+{
+	struct team_port *port;
+	int port_index;
+
+	port_index = rr_priv(team)->sent_packets++ % team->port_count;
+	port = team_get_port_by_index_rcu(team, port_index);
+	port = __get_first_port_up(team, port);
+	if (unlikely(!port))
+		goto drop;
+	skb->dev = port->dev;
+	if (dev_queue_xmit(skb))
+		return false;
+	return true;
+
+drop:
+	dev_kfree_skb(skb);
+	return false;
+}
+
+static int rr_port_enter(struct team *team, struct team_port *port)
+{
+	return team_port_set_team_mac(port);
+}
+
+static void rr_port_change_mac(struct team *team, struct team_port *port)
+{
+	team_port_set_team_mac(port);
+}
+
+static const struct team_mode_ops rr_mode_ops = {
+	.transmit		= rr_transmit,
+	.port_enter		= rr_port_enter,
+	.port_change_mac	= rr_port_change_mac,
+};
+
+static struct team_mode rr_mode = {
+	.kind		= "roundrobin",
+	.owner		= THIS_MODULE,
+	.priv_size	= sizeof(struct rr_priv),
+	.ops		= &rr_mode_ops,
+};
+
+static int __init rr_init_module(void)
+{
+	return team_mode_register(&rr_mode);
+}
+
+static void __exit rr_cleanup_module(void)
+{
+	team_mode_unregister(&rr_mode);
+}
+
+module_init(rr_init_module);
+module_exit(rr_cleanup_module);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
+MODULE_DESCRIPTION("Round-robin mode for team");
+MODULE_ALIAS("team-mode-roundrobin");
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index 619b565..0b091b3 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -185,6 +185,7 @@ header-y += if_pppol2tp.h
 header-y += if_pppox.h
 header-y += if_slip.h
 header-y += if_strip.h
+header-y += if_team.h
 header-y += if_tr.h
 header-y += if_tun.h
 header-y += if_tunnel.h
diff --git a/include/linux/if.h b/include/linux/if.h
index db20bd4..06b6ef6 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -79,6 +79,7 @@
 #define IFF_TX_SKB_SHARING	0x10000	/* The interface supports sharing
 					 * skbs on transmit */
 #define IFF_UNICAST_FLT	0x20000		/* Supports unicast filtering	*/
+#define IFF_TEAM_PORT	0x40000		/* device used as team port */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002
diff --git a/include/linux/if_team.h b/include/linux/if_team.h
new file mode 100644
index 0000000..e7a1e19
--- /dev/null
+++ b/include/linux/if_team.h
@@ -0,0 +1,230 @@
+/*
+ * include/linux/if_team.h - Network team device driver header
+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _LINUX_IF_TEAM_H_
+#define _LINUX_IF_TEAM_H_
+
+#ifdef __KERNEL__
+
+struct team_pcpu_stats {
+	u64			rx_packets;
+	u64			rx_bytes;
+	u64			rx_multicast;
+	u64			tx_packets;
+	u64			tx_bytes;
+	struct u64_stats_sync	syncp;
+	u32			rx_dropped;
+	u32			tx_dropped;
+};
+
+struct team;
+
+struct team_port {
+	struct net_device *dev;
+	struct hlist_node hlist; /* node in hash list */
+	struct list_head list; /* node in ordinary list */
+	struct team *team;
+	int index;
+
+	/*
+	 * A place for storing original values of the device before it
+	 * become a port.
+	 */
+	struct {
+		unsigned char dev_addr[MAX_ADDR_LEN];
+		unsigned int mtu;
+	} orig;
+
+	bool linkup;
+	u32 speed;
+	u8 duplex;
+
+	struct rcu_head rcu;
+};
+
+struct team_mode_ops {
+	int (*init)(struct team *team);
+	void (*exit)(struct team *team);
+	rx_handler_result_t (*receive)(struct team *team,
+				       struct team_port *port,
+				       struct sk_buff *skb);
+	bool (*transmit)(struct team *team, struct sk_buff *skb);
+	int (*port_enter)(struct team *team, struct team_port *port);
+	void (*port_leave)(struct team *team, struct team_port *port);
+	void (*port_change_mac)(struct team *team, struct team_port *port);
+};
+
+enum team_option_type {
+	TEAM_OPTION_TYPE_U32,
+	TEAM_OPTION_TYPE_STRING,
+};
+
+struct team_option {
+	struct list_head list;
+	const char *name;
+	enum team_option_type type;
+	int (*getter)(struct team *team, void *arg);
+	int (*setter)(struct team *team, void *arg);
+};
+
+struct team_mode {
+	struct list_head list;
+	const char *kind;
+	struct module *owner;
+	size_t priv_size;
+	const struct team_mode_ops *ops;
+};
+
+#define TEAM_PORT_HASHBITS 4
+#define TEAM_PORT_HASHENTRIES (1 << TEAM_PORT_HASHBITS)
+
+#define TEAM_MODE_PRIV_LONGS 4
+#define TEAM_MODE_PRIV_SIZE (sizeof(long) * TEAM_MODE_PRIV_LONGS)
+
+struct team {
+	struct net_device *dev; /* associated netdevice */
+	struct team_pcpu_stats __percpu *pcpu_stats;
+
+	spinlock_t lock; /* used for overall locking, e.g. port lists write */
+
+	/*
+	 * port lists with port count
+	 */
+	int port_count;
+	struct hlist_head port_hlist[TEAM_PORT_HASHENTRIES];
+	struct list_head port_list;
+
+	struct list_head option_list;
+
+	const char *mode_kind;
+	struct team_mode_ops mode_ops;
+	long mode_priv[TEAM_MODE_PRIV_LONGS];
+};
+
+static inline struct hlist_head *team_port_index_hash(struct team *team,
+						      int port_index)
+{
+	return &team->port_hlist[port_index & (TEAM_PORT_HASHENTRIES - 1)];
+}
+
+static inline struct team_port *team_get_port_by_index_rcu(struct team *team,
+							   int port_index)
+{
+	struct hlist_node *p;
+	struct team_port *port;
+	struct hlist_head *head = team_port_index_hash(team, port_index);
+
+	hlist_for_each_entry_rcu(port, p, head, hlist)
+		if (port->index == port_index)
+			return port;
+	return NULL;
+}
+
+extern int team_port_set_team_mac(struct team_port *port);
+extern void team_options_register(struct team *team,
+				  struct team_option *option,
+				  size_t option_count);
+extern void team_options_unregister(struct team *team,
+				    struct team_option *option,
+				    size_t option_count);
+extern int team_mode_register(struct team_mode *mode);
+extern int team_mode_unregister(struct team_mode *mode);
+
+#endif /* __KERNEL__ */
+
+#define TEAM_STRING_MAX_LEN 32
+
+/**********************************
+ * NETLINK_GENERIC netlink family.
+ **********************************/
+
+enum {
+	TEAM_CMD_NOOP,
+	TEAM_CMD_OPTIONS_SET,
+	TEAM_CMD_OPTIONS_GET,
+	TEAM_CMD_PORT_LIST_GET,
+
+	__TEAM_CMD_MAX,
+	TEAM_CMD_MAX = (__TEAM_CMD_MAX - 1),
+};
+
+enum {
+	TEAM_ATTR_UNSPEC,
+	TEAM_ATTR_TEAM_IFINDEX,		/* u32 */
+	TEAM_ATTR_LIST_OPTION,		/* nest */
+	TEAM_ATTR_LIST_PORT,		/* nest */
+
+	__TEAM_ATTR_MAX,
+	TEAM_ATTR_MAX = __TEAM_ATTR_MAX - 1,
+};
+
+/* Nested layout of get/set msg:
+ *
+ *	[TEAM_ATTR_LIST_OPTION]
+ *		[TEAM_ATTR_ITEM_OPTION]
+ *			[TEAM_ATTR_OPTION_*], ...
+ *		[TEAM_ATTR_ITEM_OPTION]
+ *			[TEAM_ATTR_OPTION_*], ...
+ *		...
+ *	[TEAM_ATTR_LIST_PORT]
+ *		[TEAM_ATTR_ITEM_PORT]
+ *			[TEAM_ATTR_PORT_*], ...
+ *		[TEAM_ATTR_ITEM_PORT]
+ *			[TEAM_ATTR_PORT_*], ...
+ *		...
+ */
+
+enum {
+	TEAM_ATTR_ITEM_OPTION_UNSPEC,
+	TEAM_ATTR_ITEM_OPTION,		/* nest */
+
+	__TEAM_ATTR_ITEM_OPTION_MAX,
+	TEAM_ATTR_ITEM_OPTION_MAX = __TEAM_ATTR_ITEM_OPTION_MAX - 1,
+};
+
+enum {
+	TEAM_ATTR_OPTION_UNSPEC,
+	TEAM_ATTR_OPTION_NAME,		/* string */
+	TEAM_ATTR_OPTION_CHANGED,	/* flag */
+	TEAM_ATTR_OPTION_TYPE,		/* u8 */
+	TEAM_ATTR_OPTION_DATA,		/* dynamic */
+
+	__TEAM_ATTR_OPTION_MAX,
+	TEAM_ATTR_OPTION_MAX = __TEAM_ATTR_OPTION_MAX - 1,
+};
+
+enum {
+	TEAM_ATTR_ITEM_PORT_UNSPEC,
+	TEAM_ATTR_ITEM_PORT,		/* nest */
+
+	__TEAM_ATTR_ITEM_PORT_MAX,
+	TEAM_ATTR_ITEM_PORT_MAX = __TEAM_ATTR_ITEM_PORT_MAX - 1,
+};
+
+enum {
+	TEAM_ATTR_PORT_UNSPEC,
+	TEAM_ATTR_PORT_IFINDEX,		/* u32 */
+	TEAM_ATTR_PORT_CHANGED,		/* flag */
+	TEAM_ATTR_PORT_LINKUP,		/* flag */
+	TEAM_ATTR_PORT_SPEED,		/* u32 */
+	TEAM_ATTR_PORT_DUPLEX,		/* u8 */
+
+	__TEAM_ATTR_PORT_MAX,
+	TEAM_ATTR_PORT_MAX = __TEAM_ATTR_PORT_MAX - 1,
+};
+
+/*
+ * NETLINK_GENERIC related info
+ */
+#define TEAM_GENL_NAME "team"
+#define TEAM_GENL_VERSION 0x1
+#define TEAM_GENL_CHANGE_EVENT_MC_GRP_NAME "change_event"
+
+#endif /* _LINUX_IF_TEAM_H_ */
-- 
1.7.6

^ permalink raw reply related

* Re: [PATCH] net: make bonding slaves honour master's skb->priority
From: Flavio Leitner @ 2011-10-25 13:08 UTC (permalink / raw)
  To: Maciej Żenczykowski; +Cc: Maciej Żenczykowski, netdev
In-Reply-To: <1319519056-21677-1-git-send-email-zenczykowski@gmail.com>

On Mon, 24 Oct 2011 22:04:16 -0700
Maciej Żenczykowski <zenczykowski@gmail.com> wrote:

> From: Maciej Żenczykowski <maze@google.com>
> 
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> ---
>  drivers/net/bonding/bond_main.c |    1 -
>  1 files changed, 0 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c
> b/drivers/net/bonding/bond_main.c index 41430ba..2cbc2f8 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -395,7 +395,6 @@ int bond_dev_queue_xmit(struct bonding *bond,
> struct sk_buff *skb, struct net_device *slave_dev)
>  {
>  	skb->dev = slave_dev;
> -	skb->priority = 1;
>  
>  	skb->queue_mapping = bond_queue_mapping(skb);
>  

I tripped on it few times and never understood why we
were fixing the priority at this point.
thanks,

Acked-by: Flavio Leitner <fbl@redhat.com>
 

^ permalink raw reply

* Re: [GIT] Networking
From: Linus Torvalds @ 2011-10-25 13:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, Greg KH, Mikulas Patocka, akpm, netdev,
	linux-kernel
In-Reply-To: <m1wrbtb4rj.fsf@fess.ebiederm.org>

On Tue, Oct 25, 2011 at 3:05 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>
> The meat of my change seems to have gotten lost in the conflict
> resolution.  Fixes in the patch below.

Duh. I was thinking about that when doing the merge, but then never
did the fixups. My bad.

Thanks, applied,

                Linus

^ permalink raw reply

* Re: [GIT] Networking
From: Greg KH @ 2011-10-25 13:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, Eric W. Biederman, Mikulas Patocka, akpm, netdev,
	linux-kernel
In-Reply-To: <CA+55aFw1GVvszqoC_f0RAvG5t1xj0CSYLhLU=y0gQ+_54Gsomw@mail.gmail.com>

On Tue, Oct 25, 2011 at 01:46:11PM +0200, Linus Torvalds wrote:
> Anyway, after that rant about really bad practices, let me say that I
> did fix up the conflict and I think it's right. But I won't guarantee
> it, so please check the changes to fs/sysfs/dir.c.

I think it looks ok, I've booted the merge result, and am typing and
sending this from the new kernel, and it hasn't crashed yet :)

greg k-h

^ permalink raw reply

* Re: [patch net-next V4] net: introduce ethernet teaming device
From: Flavio Leitner @ 2011-10-25 13:22 UTC (permalink / raw)
  To: Michał Mirosław
  Cc: Jiri Pirko, netdev, davem, eric.dumazet, bhutchings, shemminger,
	fubar, andy, tgraf, ebiederm, kaber, greearb, jesse,
	benjamin.poirier, jzupka
In-Reply-To: <CAHXqBFLzuxaCW-q5X066zM3E_t9KHO1uTF6WD3Je2u0EifaSSw@mail.gmail.com>

On Mon, 24 Oct 2011 19:22:36 +0200
Michał Mirosław <mirqus@gmail.com> wrote:

> 2011/10/24 Jiri Pirko <jpirko@redhat.com>:
> > This patch introduces new network device called team. It supposes
> > to be very fast, simple, userspace-driven alternative to existing
> > bonding driver.
> [...]
> >  drivers/net/team/team.c                   | 1573
> > +++++++++++++++++++++++++++++
> > drivers/net/team/team_mode_activebackup.c |  152 +++
> > drivers/net/team/team_mode_roundrobin.c   |  107 ++
> 
> I think this mode-modularity is overkill. One mode will compile to at
> most a few hundred bytes of code+data, but will use at least 10 times
> that to get loaded and tracked properly. How often/how many more modes
> you anticipate to be introduced? You could just keep the modular
> design but drop the kernel module separation and maybe have modes
> conditionally compiled (for those from the embedded world squeezing
> every byte).
> 

Please keep the modes separated.  Security and embedded folks will
thank you.  If you are concerned with the overhead which still should
be small and doesn't impact performance, you can compile as built-in.
The good thing is that you can choose the best option for you while
doing the opposite doesn't offer that.

Also, if in the future we need to add module parameters, the namespace
is clean.

I would ask to keep it as the work is done already.

fbl

^ permalink raw reply

* Re: [PATCH v6 1/8] SUNRPC: introduce helpers for reference counted rpcbind clients
From: Stanislav Kinsbursky @ 2011-10-25 13:25 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-nfs@vger.kernel.org, Pavel Emelianov, neilb@suse.de,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	bfields@fieldses.org, davem@davemloft.net, devel@openvz.org
In-Reply-To: <1319546750.2716.5.camel@lade.trondhjem.org>

25.10.2011 16:45, Trond Myklebust пишет:
>
> The empty 'return' at the end of a void function: it is 100%
> redundant...
>

O, yes, of course. Sorry.


-- 
Best regards,
Stanislav Kinsbursky

^ permalink raw reply

* [PATCH] net: ucc_geth, increase no. of HW RX descriptors
From: Joakim Tjernlund @ 2011-10-25 13:17 UTC (permalink / raw)
  To: Anton Vorontsov, netdev; +Cc: Joakim Tjernlund

In a busy network we see ucc_geth is dropping RX pkgs every now
and then. Increase the first RX queues HW descriptors from
16 to 32 to deal with this.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
---
 drivers/net/ucc_geth.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index fa00935..04e22ff 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -134,7 +134,7 @@ static struct ucc_geth_info ugeth_primary_info = {
 			TX_BD_RING_LEN},
 
 	.bdRingLenRx = {
-			RX_BD_RING_LEN,
+			RX_BD_RING_LEN*2,
 			RX_BD_RING_LEN,
 			RX_BD_RING_LEN,
 			RX_BD_RING_LEN,
-- 
1.7.3.4

^ permalink raw reply related

* Re: [patch net-next V5] net: introduce ethernet teaming device
From: Eric Dumazet @ 2011-10-25 13:27 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, bhutchings, shemminger, fubar, andy, tgraf,
	ebiederm, mirqus, kaber, greearb, jesse, fbl, benjamin.poirier,
	jzupka
In-Reply-To: <1319547821-1045-1-git-send-email-jpirko@redhat.com>

Le mardi 25 octobre 2011 à 15:03 +0200, Jiri Pirko a écrit :
> This patch introduces new network device called team. It supposes to be
> very fast, simple, userspace-driven alternative to existing bonding
> driver.
> 
> Userspace library called libteam with couple of demo apps is available
> here:
> https://github.com/jpirko/libteam
> Note it's still in its dipers atm.
> 
> team<->libteam use generic netlink for communication. That and rtnl
> suppose to be the only way to configure team device, no sysfs etc.
> 
> Python binding basis for libteam was recently introduced (some need
> still need to be done on it though). Daemon providing arpmon/miimon
> active-backup functionality will be introduced shortly.
> All what's necessary is already implemented in kernel team driver.
> 
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
> 
> v4->v5:
> 	- team_change_mtu() uses team->lock while travesing though port
> 	  list
> 	- mac address changes are moved completely to jurisdiction of
> 	  userspace daemon. This way the daemon can do FOM1, FOM2 and
> 	  possibly other weird things with mac addresses.
> 	  Only round-robin mode sets up all ports to bond's address then
> 	  enslaved.
> 	- Extended Kconfig text


Following is not called under rcu, but rtnl or team spinlock held.

Therefore, team_get_port_by_index_rcu() is not the right thing.

+static void __reconstruct_port_hlist(struct team *team, int rm_index)
+{
+       int i;
+       struct team_port *port;
+
+       for (i = rm_index + 1; i < team->port_count; i++) {
+               port = team_get_port_by_index_rcu(team, i);
+               hlist_del_rcu(&port->hlist);
+               port->index--;
+               hlist_add_head_rcu(&port->hlist,
+                                  team_port_index_hash(team, port->index));
+       }
+}
+

In fact, I claim most of your rcu_read_lock() in management side are
bogus and obfuscate code.

RCU is an exact science, not a commodity.

When RCU is used, rcu_read_lock()/rcu_read_unlock() are used by readers,
not managers :

They should use a spin/mutex(rtnl usually in network land)
and normal reads, no need for rcu_something

Only writes must take care of concurrent readers (aka rcu_assign_pointer())

For example:

team_nl_fill_port_list_get_changed() should not use
	list_for_each_entry_rcu(port, &team->port_list, list) {
but a regular
	list_for_each_entry(port, &team->port_list, list) {


team_nl_team_get() should not play with RCU either.
	It can use __dev_get_by_index() instead of dev_get_by_index_rcu()

Comment in front of team_nl_team_get() is bogus as well :

/*
 * Netlink cmd functions should be locked by following two functions.
 * To ensure team_uninit would not be called in between, hold rcu_read_lock
 * all the time.
 */

How can holding rcu_read_lock() can prevent another cpu doing whatever he wants ?

It seems you believe rcu_read_lock() is a read_lock(), but it isnt.

Using right API is essential to get appropriate LOCKDEP semantic and
code maintainability.

^ permalink raw reply

* linux-next: build failure after merge of the net-next tree
From: Stephen Rothwell @ 2011-10-25 13:55 UTC (permalink / raw)
  To: David Miller, netdev; +Cc: linux-next, linux-kernel, Eric Dumazet, Linus

[-- Attachment #1: Type: text/plain, Size: 534 bytes --]

Hi all,

After merging the net-next tree, today's linux-next build (powerpc
pseries_defconfig) failed like this:

drivers/net/ethernet/ibm/ehea/ehea_main.c: In function 'write_swqe2_data':
drivers/net/ethernet/ibm/ehea/ehea_main.c:1692: error: implicit declaration of function 'frag_size'

Presumably caused by commit 9e903e085262 ("net: add skb frag size
accessors").  I guess it should have been "skb_frag_size()".
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* [PATCH v2 2/4] SUNRPC: use virtualized rpcbind internals instead of static ones
From: Stanislav Kinsbursky @ 2011-10-25 14:58 UTC (permalink / raw)
  To: Trond.Myklebust
  Cc: linux-nfs, xemul, neilb, netdev, linux-kernel, bfields, davem,
	devel
In-Reply-To: <20111025135637.5286.89632.stgit@localhost6.localdomain6>

This patch makes rpcbind logic works in network namespace context. IOW each
network namespace will have it's own unique rpcbind internals (clients and
friends) which is required for registering svc services per network namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>

---
 net/sunrpc/rpcb_clnt.c |   64 ++++++++++++++++++++++++++----------------------
 1 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index c2054ae..18eba8e 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -23,12 +23,15 @@
 #include <linux/errno.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
+#include <linux/nsproxy.h>
 #include <net/ipv6.h>
 
 #include <linux/sunrpc/clnt.h>
 #include <linux/sunrpc/sched.h>
 #include <linux/sunrpc/xprtsock.h>
 
+#include "netns.h"
+
 #ifdef RPC_DEBUG
 # define RPCDBG_FACILITY	RPCDBG_BIND
 #endif
@@ -111,12 +114,6 @@ static void			rpcb_getport_done(struct rpc_task *, void *);
 static void			rpcb_map_release(void *data);
 static struct rpc_program	rpcb_program;
 
-static struct rpc_clnt *	rpcb_local_clnt;
-static struct rpc_clnt *	rpcb_local_clnt4;
-
-DEFINE_SPINLOCK(rpcb_clnt_lock);
-unsigned int			rpcb_users;
-
 struct rpcbind_args {
 	struct rpc_xprt *	r_xprt;
 
@@ -167,29 +164,31 @@ static void rpcb_map_release(void *data)
 static int rpcb_get_local(void)
 {
 	int cnt;
+	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
 
-	spin_lock(&rpcb_clnt_lock);
-	if (rpcb_users)
-		rpcb_users++;
-	cnt = rpcb_users;
-	spin_unlock(&rpcb_clnt_lock);
+	spin_lock(&sn->rpcb_clnt_lock);
+	if (sn->rpcb_users)
+		sn->rpcb_users++;
+	cnt = sn->rpcb_users;
+	spin_unlock(&sn->rpcb_clnt_lock);
 
 	return cnt;
 }
 
 void rpcb_put_local(void)
 {
-	struct rpc_clnt *clnt = rpcb_local_clnt;
-	struct rpc_clnt *clnt4 = rpcb_local_clnt4;
+	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
+	struct rpc_clnt *clnt = sn->rpcb_local_clnt;
+	struct rpc_clnt *clnt4 = sn->rpcb_local_clnt4;
 	int shutdown;
 
-	spin_lock(&rpcb_clnt_lock);
-	if (--rpcb_users == 0) {
-		rpcb_local_clnt = NULL;
-		rpcb_local_clnt4 = NULL;
+	spin_lock(&sn->rpcb_clnt_lock);
+	if (--sn->rpcb_users == 0) {
+		sn->rpcb_local_clnt = NULL;
+		sn->rpcb_local_clnt4 = NULL;
 	}
-	shutdown = !rpcb_users;
-	spin_unlock(&rpcb_clnt_lock);
+	shutdown = !sn->rpcb_users;
+	spin_unlock(&sn->rpcb_clnt_lock);
 
 	if (shutdown) {
 		/*
@@ -205,14 +204,16 @@ void rpcb_put_local(void)
 
 static void rpcb_set_local(struct rpc_clnt *clnt, struct rpc_clnt *clnt4)
 {
+	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
+
 	/* Protected by rpcb_create_local_mutex */
-	rpcb_local_clnt = clnt;
-	rpcb_local_clnt4 = clnt4;
+	sn->rpcb_local_clnt = clnt;
+	sn->rpcb_local_clnt4 = clnt4;
 	smp_wmb(); 
-	rpcb_users = 1;
+	sn->rpcb_users = 1;
 	dprintk("RPC:       created new rpcb local clients (rpcb_local_clnt: "
-			"%p, rpcb_local_clnt4: %p)\n", rpcb_local_clnt,
-			rpcb_local_clnt4);
+			"%p, rpcb_local_clnt4: %p)\n", sn->rpcb_local_clnt,
+			sn->rpcb_local_clnt4);
 }
 
 /*
@@ -432,6 +433,7 @@ int rpcb_register(u32 prog, u32 vers, int prot, unsigned short port)
 	struct rpc_message msg = {
 		.rpc_argp	= &map,
 	};
+	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
 
 	dprintk("RPC:       %sregistering (%u, %u, %d, %u) with local "
 			"rpcbind\n", (port ? "" : "un"),
@@ -441,7 +443,7 @@ int rpcb_register(u32 prog, u32 vers, int prot, unsigned short port)
 	if (port)
 		msg.rpc_proc = &rpcb_procedures2[RPCBPROC_SET];
 
-	return rpcb_register_call(rpcb_local_clnt, &msg);
+	return rpcb_register_call(sn->rpcb_local_clnt, &msg);
 }
 
 /*
@@ -454,6 +456,7 @@ static int rpcb_register_inet4(const struct sockaddr *sap,
 	struct rpcbind_args *map = msg->rpc_argp;
 	unsigned short port = ntohs(sin->sin_port);
 	int result;
+	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
 
 	map->r_addr = rpc_sockaddr2uaddr(sap);
 
@@ -466,7 +469,7 @@ static int rpcb_register_inet4(const struct sockaddr *sap,
 	if (port)
 		msg->rpc_proc = &rpcb_procedures4[RPCBPROC_SET];
 
-	result = rpcb_register_call(rpcb_local_clnt4, msg);
+	result = rpcb_register_call(sn->rpcb_local_clnt4, msg);
 	kfree(map->r_addr);
 	return result;
 }
@@ -481,6 +484,7 @@ static int rpcb_register_inet6(const struct sockaddr *sap,
 	struct rpcbind_args *map = msg->rpc_argp;
 	unsigned short port = ntohs(sin6->sin6_port);
 	int result;
+	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
 
 	map->r_addr = rpc_sockaddr2uaddr(sap);
 
@@ -493,7 +497,7 @@ static int rpcb_register_inet6(const struct sockaddr *sap,
 	if (port)
 		msg->rpc_proc = &rpcb_procedures4[RPCBPROC_SET];
 
-	result = rpcb_register_call(rpcb_local_clnt4, msg);
+	result = rpcb_register_call(sn->rpcb_local_clnt4, msg);
 	kfree(map->r_addr);
 	return result;
 }
@@ -501,6 +505,7 @@ static int rpcb_register_inet6(const struct sockaddr *sap,
 static int rpcb_unregister_all_protofamilies(struct rpc_message *msg)
 {
 	struct rpcbind_args *map = msg->rpc_argp;
+	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
 
 	dprintk("RPC:       unregistering [%u, %u, '%s'] with "
 		"local rpcbind\n",
@@ -509,7 +514,7 @@ static int rpcb_unregister_all_protofamilies(struct rpc_message *msg)
 	map->r_addr = "";
 	msg->rpc_proc = &rpcb_procedures4[RPCBPROC_UNSET];
 
-	return rpcb_register_call(rpcb_local_clnt4, msg);
+	return rpcb_register_call(sn->rpcb_local_clnt4, msg);
 }
 
 /**
@@ -567,8 +572,9 @@ int rpcb_v4_register(const u32 program, const u32 version,
 	struct rpc_message msg = {
 		.rpc_argp	= &map,
 	};
+	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
 
-	if (rpcb_local_clnt4 == NULL)
+	if (sn->rpcb_local_clnt4 == NULL)
 		return -EPROTONOSUPPORT;
 
 	if (address == NULL)

^ permalink raw reply related

* Re: linux-next: build failure after merge of the net-next tree
From: Eric Dumazet @ 2011-10-25 14:16 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: David Miller, netdev, linux-next, linux-kernel, Linus
In-Reply-To: <20111026005530.a4bd779e1a8182ea57dbf99f@canb.auug.org.au>

Le mercredi 26 octobre 2011 à 00:55 +1100, Stephen Rothwell a écrit :
> Hi all,
> 
> After merging the net-next tree, today's linux-next build (powerpc
> pseries_defconfig) failed like this:
> 
> drivers/net/ethernet/ibm/ehea/ehea_main.c: In function 'write_swqe2_data':
> drivers/net/ethernet/ibm/ehea/ehea_main.c:1692: error: implicit declaration of function 'frag_size'
> 
> Presumably caused by commit 9e903e085262 ("net: add skb frag size
> accessors").  I guess it should have been "skb_frag_size()".

My bad, sorry

[PATCH] ehea: fix skb_frag_size typo

commit 9e903e085262 (net: add skb frag size accessors) introduced a typo
in ehea driver.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/ethernet/ibm/ehea/ehea_main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ehea/ehea_main.c b/drivers/net/ethernet/ibm/ehea/ehea_main.c
index 0d4d4f6..37b70f7 100644
--- a/drivers/net/ethernet/ibm/ehea/ehea_main.c
+++ b/drivers/net/ethernet/ibm/ehea/ehea_main.c
@@ -1689,7 +1689,7 @@ static inline void write_swqe2_data(struct sk_buff *skb, struct net_device *dev,
 			sgentry = &sg_list[i - sg1entry_contains_frag_data];
 
 			sgentry->l_key = lkey;
-			sgentry->len = frag_size(frag);
+			sgentry->len = skb_frag_size(frag);
 			sgentry->vaddr = ehea_map_vaddr(skb_frag_address(frag));
 			swqe->descriptors++;
 		}

^ permalink raw reply related

* Re: [PATCH v2 0/4] Series short description
From: Stanislav Kinsbursky @ 2011-10-25 14:20 UTC (permalink / raw)
  To: Trond.Myklebust@netapp.com
  Cc: linux-nfs@vger.kernel.org, Pavel Emelianov, neilb@suse.de,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	bfields@fieldses.org, davem@davemloft.net, devel@openvz.org
In-Reply-To: <20111025135637.5286.89632.stgit@localhost6.localdomain6>

Sorry for dummy series description.
Here is the right one:

"[PATCH v2 0/4] SUNRPC: rcbind clients virtualization"

-- 
Best regards,
Stanislav Kinsbursky

^ permalink raw reply

* [PATCH v2 0/4] Series short description
From: Stanislav Kinsbursky @ 2011-10-25 14:57 UTC (permalink / raw)
  To: Trond.Myklebust
  Cc: linux-nfs, xemul, neilb, netdev, linux-kernel, bfields, davem,
	devel

This patch-set was created in context of clone of git branch:
git://git.linux-nfs.org/projects/trondmy/nfs-2.6.git
and rebased on tag "v3.1".

This patch-set virtualizes rpcbind clients per network namespace context. IOW,
each network namespace will have its own pair of rpcbind clients (if the would
be created by request).

Note:
1) this patch-set depends on "SUNRPC: make rpcbind clients allocated and
destroyed dynamically" patch-set which has been send earlier.
2) init_net pointer is still used instead of current->nsproxy->net_ns,
because I'm not sure yet about how to virtualize services.
I.e. NFS callback services will be per netns. NFSd service will be per netns
too from my pow. But Lockd can be per netns or one for all.
And also we have NFSd file system, which is not virtualized yet.

The following series consists of:

---

Stanislav Kinsbursky (4):
      SUNRPC: rpcbind clients internals virtualization
      SUNRPC: use virtualized rpcbind internals instead of static ones
      SUNRPC: optimize net_ns dereferencing in rpcbind creation calls
      SUNRPC: optimize net_ns dereferencing in rpcbind registering calls


 net/sunrpc/netns.h     |    5 ++
 net/sunrpc/rpcb_clnt.c |  103 ++++++++++++++++++++++++++----------------------
 2 files changed, 61 insertions(+), 47 deletions(-)

-- 
Signature

^ permalink raw reply

* [PATCH v2 1/4] SUNRPC: rpcbind clients internals virtualization
From: Stanislav Kinsbursky @ 2011-10-25 14:57 UTC (permalink / raw)
  To: Trond.Myklebust
  Cc: linux-nfs, xemul, neilb, netdev, linux-kernel, bfields, davem,
	devel
In-Reply-To: <20111025135637.5286.89632.stgit@localhost6.localdomain6>

This patch moves static rpcbind internals to sunrpc part of network namespace
context. This will allow to create rcpbind clients per network namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>

---
 net/sunrpc/netns.h |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
index d013bf2..83eede3 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -9,6 +9,11 @@ struct cache_detail;
 struct sunrpc_net {
 	struct proc_dir_entry *proc_net_rpc;
 	struct cache_detail *ip_map_cache;
+
+	struct rpc_clnt *rpcb_local_clnt;
+	struct rpc_clnt *rpcb_local_clnt4;
+	spinlock_t rpcb_clnt_lock;
+	unsigned int rpcb_users;
 };
 
 extern int sunrpc_net_id;

^ permalink raw reply related

* [PATCH v2 3/4] SUNRPC: optimize net_ns dereferencing in rpcbind creation calls
From: Stanislav Kinsbursky @ 2011-10-25 14:58 UTC (permalink / raw)
  To: Trond.Myklebust
  Cc: linux-nfs, xemul, neilb, netdev, linux-kernel, bfields, davem,
	devel
In-Reply-To: <20111025135637.5286.89632.stgit@localhost6.localdomain6>

Static rpcbind creation functions can be parametrized by network namespace
pointer, calculated only once, instead of using init_net pointer (or taking it
from current when virtualization will be comleted) in many places.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>

---
 net/sunrpc/rpcb_clnt.c |   35 +++++++++++++++++++----------------
 1 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 18eba8e..ef37c55 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -161,10 +161,10 @@ static void rpcb_map_release(void *data)
 	kfree(map);
 }
 
-static int rpcb_get_local(void)
+static int rpcb_get_local(struct net *net)
 {
 	int cnt;
-	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
+	struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
 
 	spin_lock(&sn->rpcb_clnt_lock);
 	if (sn->rpcb_users)
@@ -202,9 +202,10 @@ void rpcb_put_local(void)
 	return;
 }
 
-static void rpcb_set_local(struct rpc_clnt *clnt, struct rpc_clnt *clnt4)
+static void rpcb_set_local(struct net *net, struct rpc_clnt *clnt,
+			struct rpc_clnt *clnt4)
 {
-	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
+	struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
 
 	/* Protected by rpcb_create_local_mutex */
 	sn->rpcb_local_clnt = clnt;
@@ -212,22 +213,23 @@ static void rpcb_set_local(struct rpc_clnt *clnt, struct rpc_clnt *clnt4)
 	smp_wmb(); 
 	sn->rpcb_users = 1;
 	dprintk("RPC:       created new rpcb local clients (rpcb_local_clnt: "
-			"%p, rpcb_local_clnt4: %p)\n", sn->rpcb_local_clnt,
-			sn->rpcb_local_clnt4);
+			"%p, rpcb_local_clnt4: %p) for net %p%s\n",
+			sn->rpcb_local_clnt, sn->rpcb_local_clnt4,
+			net, (net == &init_net) ? " (init_net)" : "");
 }
 
 /*
  * Returns zero on success, otherwise a negative errno value
  * is returned.
  */
-static int rpcb_create_local_unix(void)
+static int rpcb_create_local_unix(struct net *net)
 {
 	static const struct sockaddr_un rpcb_localaddr_rpcbind = {
 		.sun_family		= AF_LOCAL,
 		.sun_path		= RPCBIND_SOCK_PATHNAME,
 	};
 	struct rpc_create_args args = {
-		.net		= &init_net,
+		.net		= net,
 		.protocol	= XPRT_TRANSPORT_LOCAL,
 		.address	= (struct sockaddr *)&rpcb_localaddr_rpcbind,
 		.addrsize	= sizeof(rpcb_localaddr_rpcbind),
@@ -260,7 +262,7 @@ static int rpcb_create_local_unix(void)
 		clnt4 = NULL;
 	}
 
-	rpcb_set_local(clnt, clnt4);
+	rpcb_set_local(net, clnt, clnt4);
 
 out:
 	return result;
@@ -270,7 +272,7 @@ out:
  * Returns zero on success, otherwise a negative errno value
  * is returned.
  */
-static int rpcb_create_local_net(void)
+static int rpcb_create_local_net(struct net *net)
 {
 	static const struct sockaddr_in rpcb_inaddr_loopback = {
 		.sin_family		= AF_INET,
@@ -278,7 +280,7 @@ static int rpcb_create_local_net(void)
 		.sin_port		= htons(RPCBIND_PORT),
 	};
 	struct rpc_create_args args = {
-		.net		= &init_net,
+		.net		= net,
 		.protocol	= XPRT_TRANSPORT_TCP,
 		.address	= (struct sockaddr *)&rpcb_inaddr_loopback,
 		.addrsize	= sizeof(rpcb_inaddr_loopback),
@@ -312,7 +314,7 @@ static int rpcb_create_local_net(void)
 		clnt4 = NULL;
 	}
 
-	rpcb_set_local(clnt, clnt4);
+	rpcb_set_local(net, clnt, clnt4);
 
 out:
 	return result;
@@ -326,16 +328,17 @@ int rpcb_create_local(void)
 {
 	static DEFINE_MUTEX(rpcb_create_local_mutex);
 	int result = 0;
+	struct net *net = &init_net;
 
-	if (rpcb_get_local())
+	if (rpcb_get_local(net))
 		return result;
 
 	mutex_lock(&rpcb_create_local_mutex);
-	if (rpcb_get_local())
+	if (rpcb_get_local(net))
 		goto out;
 
-	if (rpcb_create_local_unix() != 0)
-		result = rpcb_create_local_net();
+	if (rpcb_create_local_unix(net) != 0)
+		result = rpcb_create_local_net(net);
 
 out:
 	mutex_unlock(&rpcb_create_local_mutex);

^ permalink raw reply related

* [PATCH v2 4/4] SUNRPC: optimize net_ns dereferencing in rpcbind registering calls
From: Stanislav Kinsbursky @ 2011-10-25 14:58 UTC (permalink / raw)
  To: Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, xemul-bzQdu9zFT3WakBO8gow8eQ,
	neilb-l3A5Bk7waGM, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	bfields-uC3wQj2KruNg9hUCZPvPmw, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	devel-GEFAQzZX7r8dnm+yROfE0A
In-Reply-To: <20111025135637.5286.89632.stgit-bi+AKbBUZKagILUCTcTcHdKyNwTtLsGr@public.gmane.org>

Static rpcbind registering functions can be parametrized by network namespace
pointer, calculated only once, instead of using init_net pointer (or taking it
from current when virtualization will be comleted) in many places.

Signed-off-by: Stanislav Kinsbursky <skinsbursky-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>

---
 net/sunrpc/rpcb_clnt.c |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index ef37c55..92ed74f 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -452,14 +452,14 @@ int rpcb_register(u32 prog, u32 vers, int prot, unsigned short port)
 /*
  * Fill in AF_INET family-specific arguments to register
  */
-static int rpcb_register_inet4(const struct sockaddr *sap,
+static int rpcb_register_inet4(struct sunrpc_net *sn,
+			       const struct sockaddr *sap,
 			       struct rpc_message *msg)
 {
 	const struct sockaddr_in *sin = (const struct sockaddr_in *)sap;
 	struct rpcbind_args *map = msg->rpc_argp;
 	unsigned short port = ntohs(sin->sin_port);
 	int result;
-	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
 
 	map->r_addr = rpc_sockaddr2uaddr(sap);
 
@@ -480,14 +480,14 @@ static int rpcb_register_inet4(const struct sockaddr *sap,
 /*
  * Fill in AF_INET6 family-specific arguments to register
  */
-static int rpcb_register_inet6(const struct sockaddr *sap,
+static int rpcb_register_inet6(struct sunrpc_net *sn,
+			       const struct sockaddr *sap,
 			       struct rpc_message *msg)
 {
 	const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sap;
 	struct rpcbind_args *map = msg->rpc_argp;
 	unsigned short port = ntohs(sin6->sin6_port);
 	int result;
-	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
 
 	map->r_addr = rpc_sockaddr2uaddr(sap);
 
@@ -505,10 +505,10 @@ static int rpcb_register_inet6(const struct sockaddr *sap,
 	return result;
 }
 
-static int rpcb_unregister_all_protofamilies(struct rpc_message *msg)
+static int rpcb_unregister_all_protofamilies(struct sunrpc_net *sn,
+					     struct rpc_message *msg)
 {
 	struct rpcbind_args *map = msg->rpc_argp;
-	struct sunrpc_net *sn = net_generic(&init_net, sunrpc_net_id);
 
 	dprintk("RPC:       unregistering [%u, %u, '%s'] with "
 		"local rpcbind\n",
@@ -581,13 +581,13 @@ int rpcb_v4_register(const u32 program, const u32 version,
 		return -EPROTONOSUPPORT;
 
 	if (address == NULL)
-		return rpcb_unregister_all_protofamilies(&msg);
+		return rpcb_unregister_all_protofamilies(sn, &msg);
 
 	switch (address->sa_family) {
 	case AF_INET:
-		return rpcb_register_inet4(address, &msg);
+		return rpcb_register_inet4(sn, address, &msg);
 	case AF_INET6:
-		return rpcb_register_inet6(address, &msg);
+		return rpcb_register_inet6(sn, address, &msg);
 	}
 
 	return -EAFNOSUPPORT;

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [RFC v2 PATCH 5/4 PATCH] virtio-net: send gratuitous packet when needed
From: Michael S. Tsirkin @ 2011-10-25 15:41 UTC (permalink / raw)
  To: Jason Wang
  Cc: aliguori, kvm, quintela, jan.kiszka, Rusty Russell, qemu-devel,
	blauwirbel, netdev, pbonzini
In-Reply-To: <4EA62401.5000602@redhat.com>

On Tue, Oct 25, 2011 at 10:50:41AM +0800, Jason Wang wrote:
> On 10/24/2011 01:25 PM, Michael S. Tsirkin wrote:
> > On Mon, Oct 24, 2011 at 02:54:59PM +1030, Rusty Russell wrote:
> >> On Sat, 22 Oct 2011 13:43:11 +0800, Jason Wang <jasowang@redhat.com> wrote:
> >>> This make let virtio-net driver can send gratituous packet by a new
> >>> config bit - VIRTIO_NET_S_ANNOUNCE in each config update
> >>> interrupt. When this bit is set by backend, the driver would schedule
> >>> a workqueue to send gratituous packet through NETDEV_NOTIFY_PEERS.
> >>>
> >>> This feature is negotiated through bit VIRTIO_NET_F_GUEST_ANNOUNCE.
> >>>
> >>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>
> >> This seems like a huge layering violation.  Imagine this in real
> >> hardware, for example.
> > 
> > commits 06c4648d46d1b757d6b9591a86810be79818b60c
> > and 99606477a5888b0ead0284fecb13417b1da8e3af
> > document the need for this:
> > 
> > NETDEV_NOTIFY_PEERS notifier indicates that a device moved to a 
> > different physical link.
> > 	and
> > In real hardware such notifications are only
> > generated when the device comes up or the address changes.
> > 
> > So hypervisor could get the same behaviour by sending link up/down
> > events, this is just an optimization so guest won't do
> > unecessary stuff like try to reconfigure an IP address.
> > 
> > 
> > Maybe LOCATION_CHANGE would be a better name?
> > 
> 
> ANNOUNCE_SELF?

It would be nice to formulate what kind of event
are we notifying the guest about.
The announce part of it is really up to the guest, isn't it?

> > 
> >> There may be a good reason why virtual devices might want this kind of
> >> reconfiguration cheat, which is unnecessary for normal machines,
> > 
> > I think yes, the difference with real hardware is guest can change
> > location without link getting dropped.
> > FWIW, Xen seems to use this capability too.
> 
> So does ms netvsc.
> 
> > 
> >> but
> >> it'd have to be spelled out clearly in the spec to justify it...
> >>
> >> Cheers,
> >> Rusty.
> > 
> > Agree, and I'd like to see the spec too. The interface seems
> > to involve the guest clearing the status bit when it detects
> > an event?
> 
> I would describe this in spec. The interface need guest to clear the
> status bit, this would let the back-end know it has finished the work as
> we may need to send the gratuitous packets many times.
> 
> > 
> > Also - how does it interact with the link up event?
> > We probably don't want to schedule this when we detect
> > a link status change or during initialization, as
> > this patch seems to do? What if link goes down
> > while the work is running? Is that OK?
> > 
> 
> Looks like there's are duplications if guest enable arp_notify vm is
> started,

How hard would it be to avoid these duplicates?

> but we need to handle the situation that resuming a stopped
> virtual machine.
> 
> For the link down race, I don't see any real issue, either dropping or
> queued.

For example, you do
        unregister_netdev(vi->dev);
        cancel_work_sync(&vi->announce);

which looks scary as announce seems to use the netdev.

-- 
MST

^ permalink raw reply

* Re: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode
From: Michael S. Tsirkin @ 2011-10-25 15:46 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: netdev, sri, dragos.tatulea, arnd, kvm, davem, mchan, dwang2,
	shemminger, eric.dumazet, kaber, benve, Rose, Gregory V
In-Reply-To: <CACAF939.37AAD%roprabhu@cisco.com>

On Mon, Oct 24, 2011 at 11:15:05AM -0700, Roopa Prabhu wrote:
> >> ... And also I dont know of any hw
> >> that provides an interface to set hw filters on a per queue basis.
> > 
> > VMDq hardware would support this, no?
> > 
> Am not really sure. This patch uses netdev to pass filters to hw. And I
> don't see any netdev infrastructure that would support per queue filters.

Sure. I was only saying that as far as I understand,
VMDq hardware does support this functionality.
Right, Greg?

-- 
MST

^ permalink raw reply

* RE: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode
From: Rose, Gregory V @ 2011-10-25 15:59 UTC (permalink / raw)
  To: Michael S. Tsirkin, Roopa Prabhu
  Cc: netdev@vger.kernel.org, sri@us.ibm.com, dragos.tatulea@gmail.com,
	arnd@arndb.de, kvm@vger.kernel.org, davem@davemloft.net,
	mchan@broadcom.com, dwang2@cisco.com, shemminger@vyatta.com,
	eric.dumazet@gmail.com, kaber@trash.net, benve@cisco.com
In-Reply-To: <20111025154617.GB22553@redhat.com>

> -----Original Message-----
> From: Michael S. Tsirkin [mailto:mst@redhat.com]
> Sent: Tuesday, October 25, 2011 8:46 AM
> To: Roopa Prabhu
> Cc: netdev@vger.kernel.org; sri@us.ibm.com; dragos.tatulea@gmail.com;
> arnd@arndb.de; kvm@vger.kernel.org; davem@davemloft.net;
> mchan@broadcom.com; dwang2@cisco.com; shemminger@vyatta.com;
> eric.dumazet@gmail.com; kaber@trash.net; benve@cisco.com; Rose, Gregory V
> Subject: Re: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address
> filtering support for passthru mode
> 
> On Mon, Oct 24, 2011 at 11:15:05AM -0700, Roopa Prabhu wrote:
> > >> ... And also I dont know of any hw
> > >> that provides an interface to set hw filters on a per queue basis.
> > >
> > > VMDq hardware would support this, no?
> > >
> > Am not really sure. This patch uses netdev to pass filters to hw. And I
> > don't see any netdev infrastructure that would support per queue
> filters.
> 
> Sure. I was only saying that as far as I understand,
> VMDq hardware does support this functionality.
> Right, Greg?

In the case of Intel HW yes.  I'll refrain from speaking for other HW vendors although I'm guessing it would be true in their cases also.  YMMV, caveat emptor, etc. etc.

- Greg

> 
> --
> MST

^ permalink raw reply

* Re: [RFC PATCH 11/17] sh_eth: Don't unnecessarily reset the PHY
From: Moffett, Kyle D @ 2011-10-25 16:00 UTC (permalink / raw)
  To: Yoshihiro Shimoda
  Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	David S. Miller, Nobuhiro Iwamatsu, Kuninori Morimoto
In-Reply-To: <4EA69972.4090004@renesas.com>

On Oct 25, 2011, at 07:11, Yoshihiro Shimoda wrote:
> 2011/10/21 6:00, Kyle Moffett wrote:
>> Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
>> ---
>> drivers/net/sh_eth.c |   19 +------------------
>> 1 files changed, 1 insertions(+), 18 deletions(-)
>> 
>> diff --git a/drivers/net/sh_eth.c b/drivers/net/sh_eth.c
>> index 1c1666e..7ef4378 100644
>> --- a/drivers/net/sh_eth.c
>> +++ b/drivers/net/sh_eth.c
>> @@ -1235,23 +1235,6 @@ static int sh_eth_phy_init(struct net_device *ndev)
>> 	return 0;
>> }
>> 
>> -/* PHY control start function */
>> -static int sh_eth_phy_start(struct net_device *ndev)
>> -{
>> -	struct sh_eth_private *mdp = netdev_priv(ndev);
>> -	int ret;
>> -
>> -	ret = sh_eth_phy_init(ndev);
>> -	if (ret)
>> -		return ret;
>> -
>> -	/* reset phy - this also wakes it from PDOWN */
>> -	phy_write(mdp->phydev, MII_BMCR, BMCR_RESET);
>> -	phy_start(mdp->phydev);
> 
> I think that the driver needs the phy_start().
> So, I removed the phy_write() only. Then, the driver could work correctly.
> 
> Therefore, I would to remove the following code only.
> Do you think about this?
> 
>> -	/* reset phy - this also wakes it from PDOWN */
>> -	phy_write(mdp->phydev, MII_BMCR, BMCR_RESET);

You're right, sorry about that.  I think I mis-merged this when I rebased
from a much earlier commit.  The simple removal of the BMCR_RESET write
should achieve what I was intending.

Cheers,
Kyle Moffett

--
Curious about my work on the Debian powerpcspe port?
I'm keeping a blog here: http://pureperl.blogspot.com/

^ permalink raw reply

* Re: [PATCH] net: make bonding slaves honour master's skb->priority
From: Maciej Żenczykowski @ 2011-10-25 16:25 UTC (permalink / raw)
  To: Flavio Leitner; +Cc: netdev
In-Reply-To: <20111025110808.27db8487@asterix.rh>

> I tripped on it few times and never understood why we
> were fixing the priority at this point.
> thanks,

It looks like ancient code, going all the way back to something like
Linux 2.4.14 (if I recall my git blame session correctly).
I can't think of a good reason for this...

- Maciej

^ permalink raw reply

* Re: [PATCH] net: make bonding slaves honour master's skb->priority
From: Jay Vosburgh @ 2011-10-25 16:45 UTC (permalink / raw)
  To: =?UTF-8?Q?Maciej_=C5=BBenczykowski?=; +Cc: Flavio Leitner, netdev
In-Reply-To: <CANP3RGeMbVFf6oL3+9i8h43q1ycdzcA47rNfzeyLWpi8M8ZV5w@mail.gmail.com>

Maciej Żenczykowski <zenczykowski@gmail.com> wrote:

>> I tripped on it few times and never understood why we
>> were fixing the priority at this point.
>> thanks,
>
>It looks like ancient code, going all the way back to something like
>Linux 2.4.14 (if I recall my git blame session correctly).
>I can't think of a good reason for this...

	I believe this line goes back to the beginning, prior to git or
bk.  The skb->priority probably meant something different than it does
today.

	In any event, I'm fine with the code change.  There's still no
actual log message, but if Davem is ok with that I'm ok with it, given
the trivial nature of the change.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* drivers/net/ethernet/apple
From: Geert Uytterhoeven @ 2011-10-25 20:19 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: Benjamin Herrenschmidt, netdev, linux-kernel

Hi Jeff,

drivers/net/ethernet/apple/mac89x0.c is a driver for the Crystal Semiconductor
(Now Cirrus Logic) CS89[02]0, so it belongs in drivers/net/ethernet/cirrus,
next to cs89x0.c.

And according to drivers/net/ethernet/apple/mace.h, "mace" is the
"Am79C940 MACE (Medium Access Control for Ethernet)", so mace and
macmace should be in drivers/net/ethernet/amd/.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox